LLM Reasoning relevance: 8/10

LLM Regression with a Latent Iterative State Head

Yiheng Su, Matthew Lease
arXiv: 2604.01206v1 Published: 2026-04-01 Updated: 2026-04-01

AI Summary

RELISH introduces a lightweight iterative state head for text regression that outperforms existing methods while remaining highly parameter-efficient.

Key Contributions

  • Proposes RELISH, a novel, lightweight architecture for text regression
  • Predicts scalar values via iterative refinement of a latent state
  • Experiments show RELISH outperforms existing baselines with high parameter efficiency

Methodology

A learned latent state is iteratively refined via cross-attention over token-level representations from a frozen LLM; the final state is then mapped to a scalar prediction by a linear regressor.
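The refinement loop described above can be sketched in plain Python. This is an illustrative approximation only: the single-head dot-product attention, the residual state update, and the function name `relish_sketch` are assumptions for exposition, not the paper's exact parameterization (which the abstract does not specify in detail).

```python
import math
import random

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    z = sum(exps)
    return [e / z for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def relish_sketch(tokens, d, n_iter=3, seed=0):
    """Hypothetical sketch of a latent-iterative-state regression head.

    tokens : list of token-level representation vectors (each length d),
             standing in for frozen LLM hidden states.
    Returns a scalar point estimate.
    """
    rng = random.Random(seed)
    # Learned latent state and linear regressor (random stand-ins here;
    # in training these would be the head's only trainable parameters).
    state = [rng.gauss(0.0, 1.0) for _ in range(d)]
    w = [rng.gauss(0.0, 1.0) for _ in range(d)]
    b = 0.0
    for _ in range(n_iter):
        # Cross-attention: latent state queries the token representations.
        scores = [dot(state, h) / math.sqrt(d) for h in tokens]
        attn = softmax(scores)
        context = [sum(a * h[j] for a, h in zip(attn, tokens))
                   for j in range(d)]
        # Residual update of the latent state (one plausible choice).
        state = [s + c for s, c in zip(state, context)]
    # Map the final state to a point estimate with a linear regressor.
    return dot(w, state) + b

# Example usage with toy token representations:
toy_tokens = [[0.1 * i + 0.01 * j for j in range(8)] for i in range(5)]
prediction = relish_sketch(toy_tokens, d=8)
```

Because the cross-attention queries grow only with the head's latent dimension rather than the backbone, a head like this stays small regardless of LLM size, consistent with the 3.4-3.7M trainable-parameter figure reported in the abstract.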

Original Abstract

We present RELISH (REgression with a Latent Iterative State Head), a novel, lightweight architecture designed for text regression with large language models. Rather than decoding numeric targets as text or aggregating multiple generated outputs, RELISH predicts scalar values directly from frozen LLM representations by iteratively refining a learned latent state through cross-attention over token-level representations, and then mapping the final state to a point estimate with a linear regressor. Across five datasets, four LLM backbones, and two LLM training regimes, RELISH consistently outperforms prior baselines from all three major LLM regression families, including autoregressive decoding, regression-aware inference, and existing predictive head methods. Despite these gains, RELISH remains highly parameter-efficient, requiring only 3.4-3.7M trainable parameters across frozen LLM backbones (only 0.01-0.04% additional overhead), far less than LoRA-based alternatives that grow with model size (0.26-0.42%).

Tags

LLM Regression Text Regression Parameter-efficient

arXiv Categories

cs.CL cs.LG