LLM Reasoning Relevance: 9/10

Mi:dm K 2.5 Pro

KT Tech innovation Group

arXiv: 2603.18788v1 Published: 2026-03-19 Updated: 2026-03-19


AI Summary

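Mi:dm K 2.5 Pro is a 32B-parameter Korean-language LLM optimized for complex enterprise scenarios, with an emphasis on strong reasoning capabilities.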

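Key Contributions

Optimization for Korean-language and domain-specific scenarios
A high-quality data foundation built with methods such as AST analysis for code and gap-filling synthesis for mathematics
A multi-stage training pipeline that includes Reasoning SFT, model merging, and asynchronous reinforcement learning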
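
As a rough illustration of what AST-based curation of code data can look like, the sketch below parses a Python snippet with the standard-library `ast` module and keeps it only if it parses and has some minimum structure. The function names, the chosen signals, and the `min_nodes` threshold are assumptions made for this example; the paper's actual pipeline is not described in the abstract.

```python
import ast
from typing import Optional


def ast_quality_signals(source: str) -> Optional[dict]:
    """Return simple structural signals, or None if the snippet does not parse."""
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return None
    nodes = list(ast.walk(tree))
    functions = [
        n for n in nodes
        if isinstance(n, (ast.FunctionDef, ast.AsyncFunctionDef))
    ]
    documented = [f for f in functions if ast.get_docstring(f)]
    return {
        "node_count": len(nodes),
        "function_count": len(functions),
        "documented_functions": len(documented),
    }


def keep_sample(source: str, min_nodes: int = 10) -> bool:
    """Hypothetical filter rule: the snippet must parse and be non-trivial."""
    signals = ast_quality_signals(source)
    return signals is not None and signals["node_count"] >= min_nodes
```

A filter of this kind rejects unparseable fragments outright and leaves finer quality judgments to later stages, such as the LLM-based quality evaluator mentioned in the abstract.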

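Methodology

The approach builds a high-quality data foundation, scales the model during pre-training with Depth Upscaling (DuS), and improves reasoning and conversational ability through a multi-stage training pipeline.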
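
To make the idea of depth upscaling concrete, here is a minimal sketch of the generic layer-duplication scheme: two copies of a layer stack are joined after dropping a few layers at the seam. This is not the paper's layer-predictor-based DuS, whose details are not given in the abstract; the function name, the `overlap_drop` parameter, and the use of plain Python objects in place of transformer layers are all assumptions for illustration.

```python
import copy


def depth_upscale(layers: list, overlap_drop: int) -> list:
    """Build a deeper stack from two copies of `layers` (generic scheme).

    The first copy loses its last `overlap_drop` layers and the second
    copy loses its first `overlap_drop` layers, so the result has
    2 * (len(layers) - overlap_drop) layers.
    """
    if not 0 <= overlap_drop < len(layers):
        raise ValueError("overlap_drop must be in [0, len(layers))")
    head = layers[: len(layers) - overlap_drop]
    # Deep-copy the second half so the duplicated layers can be trained independently.
    tail = copy.deepcopy(layers[overlap_drop:])
    return head + tail
```

For example, a 32-layer stack with `overlap_drop=8` yields 48 layers: layers 0 to 23 followed by copies of layers 8 to 31. The upscaled model is then typically trained further so the duplicated layers can diverge.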

Original Abstract

The evolving LLM landscape requires capabilities beyond simple text generation, prioritizing multi-step reasoning, long-context understanding, and agentic workflows. This shift challenges existing models in enterprise environments, especially in Korean-language and domain-specific scenarios where scaling is insufficient. We introduce Mi:dm K 2.5 Pro, a 32B parameter flagship LLM designed to address enterprise-grade complexity through reasoning-focused optimization. Our methodology builds a robust data foundation via a quality-centric curation pipeline utilizing abstract syntax tree (AST) analysis for code, gap-filling synthesis for mathematics, and an LLM-based quality evaluator. Pre-training scales the model via layer-predictor-based Depth Upscaling (DuS) and a progressive strategy supporting a 128K token context window. Post-training introduces a specialized multi-stage pipeline, including Reasoning SFT, model merging, and asynchronous reinforcement learning (RL), to develop complex problem-solving skills. "Fusion Training" then rebalances these capabilities with conversational fluency, consistent response styling, and reliable tool-use. The evaluations show that Mi:dm K 2.5 Pro achieves competitive performance against leading global and domestic models. In addition, it sets state-of-the-art results on Korean-specific benchmarks, showcasing deep linguistic and cultural understanding. Finally, Responsible AI evaluations validate safety against attacks, ensuring a secure profile for deployment with a balance of harmlessness and responsiveness.
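
The abstract lists model merging as one post-training step without naming the merging method. One common and simple variant is a weighted average of checkpoint parameters, sketched below with plain Python lists standing in for tensors. The function name and the checkpoint layout are hypothetical, and the paper may use a different merging technique.

```python
def merge_checkpoints(checkpoints: list, weights: list) -> dict:
    """Weighted average of parameters across checkpoints (linear merging).

    Each checkpoint maps a parameter name to a flat list of floats.
    Weights are normalized, so they do not need to sum to 1.
    """
    if not checkpoints or len(checkpoints) != len(weights):
        raise ValueError("need one weight per checkpoint")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    names = checkpoints[0].keys()
    if any(ckpt.keys() != names for ckpt in checkpoints):
        raise ValueError("checkpoints must share the same parameter names")
    merged = {}
    for name in names:
        size = len(checkpoints[0][name])
        merged[name] = [
            sum(w * ckpt[name][i] for ckpt, w in zip(checkpoints, weights)) / total
            for i in range(size)
        ]
    return merged
```

With equal weights this reduces to a plain parameter average; unequal weights let one checkpoint (for example, a reasoning-focused one) dominate the merged model.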

arXiv Categories

cs.CL cs.AI
