Agent Tuning & Optimization (Relevance: 7/10)

Merge and Conquer: Instructing Multilingual Models by Adding Target Language Weights

Eneko Valero, Maria Ribalta i Albado, Oscar Sainz, Naiara Perez, German Rigau
arXiv: 2603.28263v1 Published: 2026-03-30 Updated: 2026-03-30

AI Summary

Transfers language knowledge to an instruction-tuned LLM via model merging, without requiring language-specific instruction data or repeated fine-tuning.

Main Contributions

  • Proposes model merging as a lightweight method for adapting LLMs to low-resource languages
  • Validates the effectiveness of model merging for transferring language knowledge and instruction-following behavior
  • Demonstrates the potential of model merging for building multilingual LLMs

Methodology

Merges an instruction-tuned LLM with a language-specific base model, then evaluates instruction-following and multilingual capabilities on Iberian languages; a sketch of the merging idea follows.
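
The digest does not reproduce the paper's exact merging recipe. The sketch below shows one common formulation, a task-arithmetic-style merge, in which the weight delta between a language-specific base model and the original base model is added to the instructed checkpoint. The helper name, the `alpha` coefficient, and all model identifiers are hypothetical, not taken from the paper.

```python
import torch
from transformers import AutoModelForCausalLM

def add_language_delta(instruct_id: str, base_id: str, lang_id: str,
                       alpha: float = 1.0):
    """Sketch: instructed weights + alpha * (language base - original base)."""
    instruct = AutoModelForCausalLM.from_pretrained(instruct_id, torch_dtype=torch.float32)
    base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float32).state_dict()
    lang = AutoModelForCausalLM.from_pretrained(lang_id, torch_dtype=torch.float32).state_dict()

    merged = {}
    for name, w in instruct.state_dict().items():
        if name in base and name in lang:
            # The delta is assumed to encode target-language knowledge.
            merged[name] = w + alpha * (lang[name] - base[name])
        else:
            # Keep tensors unique to the instructed checkpoint unchanged.
            merged[name] = w
    instruct.load_state_dict(merged)
    return instruct

# Illustrative usage (placeholder model names, e.g. a Basque-adapted base):
# merged = add_language_delta("org/Instruct-7B", "org/Base-7B", "org/Base-7B-eu")
```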

Original Abstract

Large Language Models (LLMs) remain heavily centered on English, with limited performance in low-resource languages. Existing adaptation approaches, such as continual pre-training, demand significant computational resources. In the case of instructed models, high-quality instruction data is also required, both of which are often inaccessible for low-resource language communities. Under these constraints, model merging offers a lightweight alternative, but its potential in low-resource contexts has not been systematically explored. In this work, we explore whether it is possible to transfer language knowledge to an instruction-tuned LLM by merging it with a language-specific base model, thereby eliminating the need for language-specific instructions and repeated fine-tuning processes whenever stronger instructed variants become available. Through experiments covering four Iberian languages (Basque, Catalan, Galician, and Spanish) and two model families, we show that merging enables effective instruction-following behavior in new languages and even supports multilingual capability through the combination of multiple language-specific models. Our results indicate that model merging is a viable and efficient alternative to traditional adaptation methods for low-resource languages, achieving competitive performance while greatly reducing computational cost.
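
The abstract also reports that combining multiple language-specific models yields multilingual capability. Under the same task-arithmetic assumption as above (not the paper's confirmed algorithm), one way to realize this is a weighted sum of per-language deltas; the function name, weights, and model identifiers below are illustrative.

```python
import torch
from transformers import AutoModelForCausalLM

def merge_multilingual(instruct_id, base_id, lang_ids, weights=None):
    """Sketch: add a weighted sum of several language deltas to one instructed model."""
    weights = weights if weights is not None else [1.0 / len(lang_ids)] * len(lang_ids)
    instruct = AutoModelForCausalLM.from_pretrained(instruct_id, torch_dtype=torch.float32)
    base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float32).state_dict()
    langs = [AutoModelForCausalLM.from_pretrained(m, torch_dtype=torch.float32).state_dict()
             for m in lang_ids]

    merged = {}
    for name, w in instruct.state_dict().items():
        # Accumulate one scaled delta per language-specific base model.
        delta = torch.zeros_like(w)
        for lang, a in zip(langs, weights):
            if name in base and name in lang:
                delta += a * (lang[name] - base[name])
        merged[name] = w + delta
    instruct.load_state_dict(merged)
    return instruct

# Illustrative usage with placeholder Basque/Catalan/Galician/Spanish bases:
# merged = merge_multilingual("org/Instruct-7B", "org/Base-7B",
#                             ["org/Base-7B-eu", "org/Base-7B-ca",
#                              "org/Base-7B-gl", "org/Base-7B-es"])
```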

Tags

Model Merging  Low-Resource Languages  Instruction Tuning  Multilingual Models

arXiv Categories

cs.CL cs.AI