LLM Reasoning Relevance: 7/10

Avey-B

Devang Acharya, Mohammad Hammoud

arXiv: 2602.15814v1 Published: 2026-02-17 Updated: 2026-02-17

AI Summary

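An encoder-only reformulation of the Avey model that outperforms four widely used Transformer-based encoders on token-classification and information-retrieval benchmarks and scales more efficiently to long contexts.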

Main Contributions

Encoder-only reformulation of the Avey model
Decoupled static and dynamic parameterizations
Stability-oriented normalization
Neural compression

Methodology

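The authors adapt the autoregressive, attention-free Avey model to the encoder-only paradigm, introducing decoupled static and dynamic parameterizations, stability-oriented normalization, and neural compression, and evaluate the result on token-classification and information-retrieval benchmarks.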

Original Abstract

Compact pretrained bidirectional encoders remain the backbone of industrial NLP under tight compute and memory budgets. Their effectiveness stems from self-attention's ability to deliver high-quality bidirectional contextualization with sequence-level parallelism, as popularized by BERT-style architectures. Recently, Avey was introduced as an autoregressive, attention-free alternative that naturally admits an encoder-only adaptation. In this paper, we reformulate Avey for the encoder-only paradigm and propose several innovations to its architecture, including decoupled static and dynamic parameterizations, stability-oriented normalization, and neural compression. Results show that this reformulated architecture compares favorably to four widely used Transformer-based encoders, consistently outperforming them on standard token-classification and information-retrieval benchmarks while scaling more efficiently to long contexts.

arXiv Categories

cs.CL cs.AI
