LLM Reasoning relevance: 8/10

An Evaluation of Context Length Extrapolation in Long Code via Positional Embeddings and Efficient Attention

Madhusudan Ghosh, Rishabh Gupta
arXiv: 2602.21800v1 Published: 2026-02-25 Updated: 2026-02-25

AI Summary

Studies how to extend LLMs to long code contexts, with a focus on improving positional encodings and optimizing attention mechanisms.

Key Contributions

  • Evaluates methods for context length extrapolation on long code
  • Explores improvements to positional encodings
  • Analyzes strategies for optimizing attention mechanisms

Methodology

The study adopts zero-shot, inference-time methods that improve positional encodings and optimize attention mechanisms to enable long code completion.
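As an illustration of the zero-shot, inference-time idea (a sketch, not the paper's specific method): one widely used positional-encoding adjustment is linear position interpolation for rotary embeddings (RoPE), which rescales positions so that inputs longer than the training window map back into the position range the model was trained on. All names below are illustrative.

```python
# Hypothetical sketch: RoPE with linear position interpolation,
# a common zero-shot technique for context length extrapolation.
import numpy as np

def rope_angles(positions, dim, base=10000.0, scale=1.0):
    """Rotary-embedding angles; scale < 1 compresses positions so that
    sequences longer than the training window stay in-distribution."""
    inv_freq = 1.0 / (base ** (np.arange(0, dim, 2) / dim))
    return np.outer(positions * scale, inv_freq)  # shape (seq_len, dim/2)

def apply_rope(x, scale=1.0):
    """Rotate query/key vectors x of shape (seq_len, dim) in 2D pairs."""
    seq_len, dim = x.shape
    ang = rope_angles(np.arange(seq_len), dim, scale=scale)
    cos, sin = np.cos(ang), np.sin(ang)
    x1, x2 = x[:, 0::2], x[:, 1::2]
    out = np.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin
    out[:, 1::2] = x1 * sin + x2 * cos
    return out

# With scale = trained_len / target_len, position 8191 is mapped into
# the [0, 4096) position range the model saw during training.
q = np.random.randn(8192, 64)
q_interp = apply_rope(q, scale=4096 / 8192)
```

Because the change is purely a rescaling of the position indices at inference time, no fine-tuning or weight modification is required, which is what makes such methods "zero-shot".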

Original Abstract

The rapid advancement of large language models (LLMs) has led to a significant increase in automated tools in software engineering, capable of performing various code-related tasks such as code generation, completion, and translation. Despite these advancements, their effectiveness is constrained by fixed context lengths, limiting their ability to generalize across long, domain-specific code sequences. To address this challenge, we investigate zero-shot, inference-only methods aimed at improving position encodings and optimizing attention mechanisms. Our goal is to provide a thorough analysis of current approaches that facilitate context length extrapolation in code, particularly in the context of long code completion tasks.
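On the attention-optimization side the abstract mentions, one representative efficient-attention scheme (an illustrative sketch, not necessarily what the paper evaluates) is a causal sliding-window mask, which caps each token's attention span so cost grows linearly rather than quadratically with sequence length:

```python
# Hypothetical sketch: causal sliding-window attention mask, one common
# way to keep attention cost linear in sequence length for long inputs.
import numpy as np

def sliding_window_mask(seq_len, window):
    """True where query i may attend key j: j <= i and i - j < window."""
    i = np.arange(seq_len)[:, None]
    j = np.arange(seq_len)[None, :]
    return (j <= i) & (i - j < window)

mask = sliding_window_mask(6, window=3)
# Each query row allows at most `window` keys, so per-token attention
# cost is O(window) instead of O(seq_len).
```

Like position interpolation, the mask is applied at inference time only, so it fits the zero-shot setting the paper studies.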

Tags

LLM · Code Generation · Context Length Extrapolation

arXiv Categories

cs.SE cs.AI