Agent Tuning & Optimization Relevance: 7/10

Exploring different approaches to customize language models for domain-specific text-to-code generation

Luís Freire, Fernanda A. Andaló, Nicki Skafte Detlefsen
arXiv: 2603.16526v1 Published: 2026-03-17 Updated: 2026-03-17

AI Summary

The paper investigates three approaches to customizing small LLMs for domain-specific code generation using synthetic datasets, and analyzes their respective trade-offs.

Key Contributions

  • Evaluated three customization strategies: few-shot prompting, RAG, and LoRA
  • Constructed datasets of programming exercises across three domains in the Python ecosystem
  • Compared the trade-offs between domain relevance and benchmark accuracy across the methods
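As a concrete illustration of the first strategy, few-shot prompting prepends solved task/code pairs to the model's input. A minimal sketch, with hypothetical exemplars and a made-up prompt template (the paper's actual exemplars and formatting are not shown here):

```python
# Hypothetical domain exemplars; the paper's real dataset entries differ.
EXEMPLARS = [
    {
        "task": "Load an image in grayscale with OpenCV.",
        "code": 'import cv2\nimg = cv2.imread("photo.png", cv2.IMREAD_GRAYSCALE)',
    },
    {
        "task": "Split a dataset into train and test sets with scikit-learn.",
        "code": "from sklearn.model_selection import train_test_split\n"
                "X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2)",
    },
]

def build_few_shot_prompt(task: str) -> str:
    """Concatenate exemplar task/code pairs ahead of the new task,
    leaving the final code section empty for the model to complete."""
    parts = []
    for ex in EXEMPLARS:
        parts.append(f"### Task\n{ex['task']}\n### Code\n{ex['code']}\n")
    parts.append(f"### Task\n{task}\n### Code\n")
    return "\n".join(parts)

prompt = build_few_shot_prompt("Resize an image to 224x224 with OpenCV.")
```

The prompt string would then be sent to the model; no weights are modified, which is why the abstract describes this approach as flexible and cost-effective.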

Methodology

Using synthetic datasets, the authors evaluate few-shot prompting, RAG, and LoRA on domain-specific code generation tasks, measuring performance with both benchmark-based and similarity-based metrics.
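The RAG strategy adds a retrieval step before generation: relevant domain snippets are fetched from a corpus and placed in the prompt as context. A toy sketch using token-overlap scoring (the paper does not specify its retriever or similarity function; a real system would typically use embedding similarity):

```python
# Minimal illustrative retriever: rank corpus documents by how many
# tokens they share with the query, then build a context-augmented prompt.
def tokenize(text: str) -> set:
    return set(text.lower().split())

def retrieve(query: str, corpus: list, k: int = 1) -> list:
    """Return the k documents with the highest token overlap with the query."""
    scored = sorted(corpus,
                    key=lambda d: len(tokenize(d) & tokenize(query)),
                    reverse=True)
    return scored[:k]

# Hypothetical documentation snippets standing in for the retrieval corpus.
corpus = [
    "cv2.resize(img, (w, h)) resizes an image with OpenCV.",
    "train_test_split splits arrays into train and test subsets.",
    "LogisticRegression fits a linear classifier in scikit-learn.",
]
query = "How do I resize an image with OpenCV?"
context = retrieve(query, corpus)[0]
prompt = f"Context:\n{context}\n\nQuestion: {query}\nAnswer in code:"
```

Like few-shot prompting, this leaves the model's weights untouched; only the input changes, which keeps the approach cheap but limits its effect on benchmark accuracy.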

Original Abstract

Large language models (LLMs) have demonstrated strong capabilities in generating executable code from natural language descriptions. However, general-purpose models often struggle in specialized programming contexts where domain-specific libraries, APIs, or conventions must be used. Customizing smaller open-source models offers a cost-effective alternative to relying on large proprietary systems. In this work, we investigate how smaller language models can be adapted for domain-specific code generation using synthetic datasets. We construct datasets of programming exercises across three domains within the Python ecosystem: general Python programming, Scikit-learn machine learning workflows, and OpenCV-based computer vision tasks. Using these datasets, we evaluate three customization strategies: few-shot prompting, retrieval-augmented generation (RAG), and parameter-efficient fine-tuning using Low-Rank Adaptation (LoRA). Performance is evaluated using both benchmark-based metrics and similarity-based metrics that measure alignment with domain-specific code. Our results show that prompting-based approaches such as few-shot learning and RAG can improve domain relevance in a cost-effective manner, although their impact on benchmark accuracy is limited. In contrast, LoRA-based fine-tuning consistently achieves higher accuracy and stronger domain alignment across most tasks. These findings highlight practical trade-offs between flexibility, computational cost, and performance when adapting smaller language models for specialized programming tasks.
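The parameter efficiency that makes LoRA a cost-effective fine-tuning option comes from training two low-rank factors instead of a full weight update: the effective weight becomes W + B @ A with B of shape d x r and A of shape r x k. A back-of-the-envelope sketch in pure Python (dimensions are made up for illustration; the paper's actual LoRA configuration is not given here):

```python
# Parameter count comparison for one d x k weight matrix.
d, k, r = 1024, 1024, 8

full_params = d * k        # trainable params under full fine-tuning
lora_params = r * (d + k)  # trainable params for a rank-r LoRA adapter

def matmul(B, A):
    """Naive matrix product forming the low-rank update delta = B @ A."""
    inner, cols = len(A), len(A[0])
    return [[sum(B[i][t] * A[t][j] for t in range(inner)) for j in range(cols)]
            for i in range(len(B))]

# Tiny 2x2 check with rank 1: the update has rank <= r by construction.
B = [[1.0], [2.0]]
A = [[3.0, 4.0]]
delta = matmul(B, A)  # [[3.0, 4.0], [6.0, 8.0]]
```

Here the rank-8 adapter trains roughly 64x fewer parameters than full fine-tuning of the same matrix, which is consistent with the abstract's framing of LoRA as a cost/performance trade-off rather than a free win.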

Tags

language model, code generation, fine-tuning, domain adaptation

arXiv Category

cs.AI