LoRA-MME: Multi-Model Ensemble of LoRA-Tuned Encoders for Code Comment Classification
AI Summary
LoRA-MME fine-tunes multiple code encoders with LoRA and ensembles them for code comment classification, improving accuracy at the cost of inference efficiency.
Key Contributions
- Proposes LoRA-MME, a multi-model ensemble architecture
- Uses a PEFT method to reduce memory overhead
- Performs multi-label code comment classification on Java, Python, and Pharo
Methodology
Encoders such as UniXcoder and CodeBERT are fine-tuned independently with LoRA, and their predictions are combined via a learned weighted ensemble strategy.
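The LoRA idea behind this fine-tuning can be sketched numerically: the pretrained weight matrix stays frozen, and only a low-rank update is trained. A minimal sketch (dimensions and scaling are illustrative, not taken from the paper):

```python
import numpy as np

# LoRA (Low-Rank Adaptation) sketch: instead of updating the full weight
# matrix W (d_out x d_in), train two small matrices A (r x d_in) and
# B (d_out x r) with rank r << min(d_out, d_in). The effective weight is
# W + (alpha / r) * B @ A, so only A and B carry gradients.
rng = np.random.default_rng(0)
d_out, d_in, r, alpha = 768, 768, 8, 16  # hypothetical sizes

W = rng.standard_normal((d_out, d_in))     # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01  # trainable down-projection
B = np.zeros((d_out, r))                   # trainable up-projection, init 0

W_eff = W + (alpha / r) * (B @ A)

# With B initialized to zero, W_eff == W at the start, so fine-tuning
# begins exactly from the pretrained model's behavior.
trainable = A.size + B.size
full = W.size
print(f"trainable params: {trainable} of {full} "
      f"({100 * trainable / full:.1f}%)")
```

With rank 8 on a 768x768 matrix, the adapter trains only about 2% of the parameters of the full layer, which is the memory saving that makes fine-tuning four separate encoders feasible.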
Original Abstract
Code comment classification is a critical task for automated software documentation and analysis. In the context of the NLBSE'26 Tool Competition, we present **LoRA-MME**, a Multi-Model Ensemble architecture utilizing Parameter-Efficient Fine-Tuning (PEFT). Our approach addresses the multi-label classification challenge across Java, Python, and Pharo by combining the strengths of four distinct transformer encoders: UniXcoder, CodeBERT, GraphCodeBERT, and CodeBERTa. By independently fine-tuning these models using Low-Rank Adaptation (LoRA) and aggregating their predictions via a learned weighted ensemble strategy, we maximize classification performance without the memory overhead of full model fine-tuning. Our tool achieved an **F1 Weighted score of 0.7906** and a **Macro F1 of 0.6867** on the test set. However, the computational cost of the ensemble resulted in a final submission score of 41.20%, highlighting the trade-off between semantic accuracy and inference efficiency.
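The weighted ensemble step described in the abstract can be sketched as follows. All numbers here are illustrative placeholders; the paper's actual weight-learning procedure and label set are not given in this summary:

```python
import numpy as np

# Learned weighted ensemble for multi-label classification (sketch).
# Each of the four encoders emits a probability per label; the ensemble
# combines them with softmax-normalized weights, then thresholds at 0.5.
probs = np.array([
    [0.9, 0.2, 0.6],   # e.g. UniXcoder   (hypothetical outputs)
    [0.8, 0.3, 0.4],   # e.g. CodeBERT
    [0.7, 0.1, 0.55],  # e.g. GraphCodeBERT
    [0.6, 0.4, 0.5],   # e.g. CodeBERTa
])
raw_weights = np.array([1.2, 0.8, 1.0, 0.5])  # learned scores (illustrative)
weights = np.exp(raw_weights) / np.exp(raw_weights).sum()  # sum to 1

ensemble = weights @ probs               # weighted average per label
labels = (ensemble >= 0.5).astype(int)   # multi-label decision
print(ensemble.round(3), labels)
```

Note the efficiency trade-off the abstract reports: every input must pass through all four encoders at inference time, which is why the ensemble's accuracy gain came with a low submission (efficiency) score.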