A Closer Look into LLMs for Table Understanding
AI Summary
This paper investigates the internal mechanisms of LLMs for table understanding and analyzes how behavior differs across model types.
Main Contributions
- Characterized how LLM attention over tabular data evolves across layers
- Analyzed the effective layer depth of different types of LLMs on table tasks
- Explored the expert activation patterns of MoE models in table understanding
Methodology
An empirical analysis of 16 LLMs, evaluated along four dimensions: attention dynamics, effective layer depth, expert activation, and input design.
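The attention-dynamics dimension can be made concrete with a simple per-layer metric: what fraction of a query token's attention mass lands on table-cell tokens. The sketch below is illustrative, not the paper's exact implementation — the attention tensor is synthetic, and which positions count as "table cells" is an assumption.

```python
import numpy as np

def table_attention_fraction(attn, table_mask):
    """Per-layer fraction of the final query token's attention mass
    that falls on table-cell positions.

    attn:       (n_layers, n_heads, seq, seq) attention weights (rows sum to 1)
    table_mask: (seq,) boolean, True at table-cell token positions
    """
    # Attention from the final query position, averaged over heads.
    last_q = attn[:, :, -1, :].mean(axis=1)      # (n_layers, seq)
    # Sum the mass that lands on table-cell tokens.
    return last_q[:, table_mask].sum(axis=1)     # (n_layers,)

# Synthetic demo: 4 layers, 2 heads, 10 tokens; positions 3-7 play the
# role of table cells (hypothetical layout, not from the paper).
rng = np.random.default_rng(0)
raw = rng.random((4, 2, 10, 10))
attn = raw / raw.sum(axis=-1, keepdims=True)     # normalize rows to sum to 1
mask = np.zeros(10, dtype=bool)
mask[3:8] = True
frac = table_attention_fraction(attn, mask)
```

With real models, the same metric would be computed from attention maps returned by the model (e.g. via `output_attentions=True` in Hugging Face Transformers), plotted against layer index to reveal the scan/localize/amplify phases.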
Original Abstract
Despite the success of Large Language Models (LLMs) in table understanding, their internal mechanisms remain unclear. In this paper, we conduct an empirical study on 16 LLMs, covering general LLMs, specialist tabular LLMs, and Mixture-of-Experts (MoE) models, to explore how LLMs understand tabular data and perform downstream tasks. Our analysis focuses on 4 dimensions including the attention dynamics, the effective layer depth, the expert activation, and the impacts of input designs. Key findings include: (1) LLMs follow a three-phase attention pattern -- early layers scan the table broadly, middle layers localize relevant cells, and late layers amplify their contributions; (2) tabular tasks require deeper layers than math reasoning to reach stable predictions; (3) MoE models activate table-specific experts in middle layers, with early and late layers sharing general-purpose experts; (4) Chain-of-Thought prompting increases table attention, further enhanced by table-tuning. We hope these findings and insights can facilitate interpretability and future research on table-related tasks.
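Finding (3) — table-specific experts concentrated in middle layers — boils down to counting, per layer, how often each expert appears in the router's top-k for table tokens. Below is a minimal sketch over synthetic router logits; the layer/expert counts and `top_k=2` routing are hypothetical placeholders, not values taken from the paper.

```python
import numpy as np

def expert_activation_counts(router_logits, top_k=2):
    """Count how often each expert is selected into the router's top-k.

    router_logits: (n_layers, n_tokens, n_experts)
    returns:       (n_layers, n_experts) integer activation counts
    """
    n_layers, _, n_experts = router_logits.shape
    # Indices of the top-k experts chosen for every token at every layer.
    topk = np.argsort(router_logits, axis=-1)[..., -top_k:]  # (L, T, k)
    counts = np.zeros((n_layers, n_experts), dtype=int)
    for layer in range(n_layers):
        idx, c = np.unique(topk[layer], return_counts=True)
        counts[layer, idx] = c
    return counts

# Synthetic demo: 3 layers, 5 tokens, 8 experts.
rng = np.random.default_rng(1)
logits = rng.normal(size=(3, 5, 8))
counts = expert_activation_counts(logits)
```

Comparing such count histograms between table tokens and plain-text tokens, layer by layer, is one way to surface the middle-layer specialization the abstract describes; with real MoE checkpoints the router logits can typically be obtained from the model's routing outputs.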