An Evaluation of Context Length Extrapolation in Long Code via Positional Embeddings and Efficient Attention
AI Summary
This study examines how to extend LLMs to long code contexts, focusing on improvements to positional encodings and attention mechanisms.
Key Contributions
- Evaluated methods for extending context length on long code
- Explored improvements to positional encodings
- Analyzed strategies for optimizing attention mechanisms
Methodology
The study adopts zero-shot, inference-time methods that modify positional encodings and optimize attention mechanisms to enable long code completion.
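One representative zero-shot, inference-time technique in this family is linear position interpolation on rotary position embeddings (RoPE): positions beyond the trained context are rescaled into the trained range instead of extrapolated. The sketch below is an illustration of that general idea, not the paper's specific method; the function names (`rope_angles`, `apply_rope`) and the 4096→8192 lengths are assumptions for demonstration.

```python
import numpy as np

def rope_angles(positions, dim, base=10000.0, scale=1.0):
    """RoPE rotation angles; scale > 1 compresses positions (linear
    position interpolation), keeping them inside the trained range."""
    inv_freq = 1.0 / (base ** (np.arange(0, dim, 2) / dim))
    return np.outer(positions / scale, inv_freq)

def apply_rope(x, positions, scale=1.0):
    """Rotate query/key vectors x of shape (seq, dim) by their angles."""
    ang = rope_angles(positions, x.shape[1], scale=scale)
    cos, sin = np.cos(ang), np.sin(ang)
    x1, x2 = x[:, 0::2], x[:, 1::2]
    out = np.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin
    out[:, 1::2] = x1 * sin + x2 * cos
    return out

# Hypothetical lengths: with scale = extended_len / trained_len,
# position 8190 maps to 4095, i.e. back inside the trained range,
# so attention scores stay in-distribution at long range.
trained_len, extended_len = 4096, 8192
scale = extended_len / trained_len  # 2.0
q = np.ones((1, 64))
rot_scaled = apply_rope(q, np.array([8190]), scale=scale)
rot_plain = apply_rope(q, np.array([4095]), scale=1.0)
print(np.allclose(rot_scaled, rot_plain))
```

The final check shows the key property: the scaled embedding of a far position coincides with the unscaled embedding of a position the model saw during training, which is why no fine-tuning is required.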
Original Abstract
The rapid advancement of large language models (LLMs) has led to a significant increase in automated tools in software engineering, capable of performing various code-related tasks such as code generation, completion, and translation. Despite these advancements, their effectiveness is constrained by fixed context lengths, limiting their ability to generalize across long, domain-specific code sequences. To address this challenge, we investigate zero-shot, inference-only methods aimed at improving position encodings and optimizing attention mechanisms. Our goal is to provide a thorough analysis of current approaches that facilitate context length extrapolation in code, particularly in the context of long code completion tasks.