LLM Reasoning relevance: 8/10

Beyond Length: Context-Aware Expansion and Independence as Developmentally Sensitive Evaluation in Child Utterances

Jiyun Chun, Eric Fosler-Lussier, Michael White, Andrew Perrault
arXiv: 2602.05392v1 Published: 2026-02-05 Updated: 2026-02-05

AI Summary

Proposes a context-aware framework for evaluating children's language that focuses on Expansion and Independence, outperforming traditional length-based metrics.

Key Contributions

  • Introduces Expansion and Independence as two new dimensions for evaluating child language
  • Develops an LLM-based framework that evaluates child utterances automatically
  • Validates the framework, showing agreement with human judgments and predictive value

Methodology

An LLM first classifies the type of the preceding adult utterance, then scores the child's response along the two axes of Expansion and Independence; the framework is validated experimentally.
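The two-stage pipeline can be sketched as follows. This is a minimal illustration, not the paper's implementation: the utterance-type labels, the 1–5 score range, and the keyword-based scoring heuristics are all assumptions standing in for the actual LLM prompts and rubric.

```python
def classify_adult_utterance(utterance: str) -> str:
    """Stage 1: classify the Previous Adult Utterance Type.
    A toy heuristic standing in for the LLM classifier; the
    label set here is an assumption."""
    lowered = utterance.lower()
    if lowered.startswith(("what", "why", "how", "who", "where", "when")):
        return "wh-question"
    if lowered.rstrip().endswith("?"):
        return "yes-no-question"
    return "statement"

def score_child_response(adult_type: str, child_utterance: str) -> dict:
    """Stage 2: score the child's response on Expansion and
    Independence, 1-5 (range is an assumption). Toy proxies:
    causal/contrastive connectives signal Expansion; lexical
    variety signals Independence."""
    words = child_utterance.lower().split()
    connectives = {"because", "but", "so", "although"}
    expansion = 1 + min(4, 2 * sum(w in connectives for w in words) + (len(words) > 5))
    independence = 1 + min(4, len(set(words)) // 3)
    return {"adult_type": adult_type, "expansion": expansion, "independence": independence}

adult = "Why did the tower fall down?"
child = "Because I put too many blocks on top, so it got wobbly."
result = score_child_response(classify_adult_utterance(adult), child)
print(result)
```

In the real framework both stages are LLM calls conditioned on the dialogue context; the staging matters because the same child response can merit different scores depending on what kind of adult utterance it answers.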

Original Abstract

Evaluating the quality of children's utterances in adult-child dialogue remains challenging due to insufficient context-sensitive metrics. Common proxies such as Mean Length of Utterance (MLU), lexical diversity (vocd-D), and readability indices (Flesch-Kincaid Grade Level, Gunning Fog Index) are dominated by length and ignore conversational context, missing aspects of response quality such as reasoning depth, topic maintenance, and discourse planning. We introduce an LLM-as-a-judge framework that first classifies the Previous Adult Utterance Type and then scores the child's response along two axes: Expansion (contextual elaboration and inferential depth) and Independence (the child's contribution to advancing the discourse). These axes reflect fundamental dimensions in child language development, where Expansion captures elaboration, clause combining, and causal and contrastive connectives. Independence captures initiative, topic control, decreasing reliance on adult scaffolding through growing self-regulation, and audience design. We establish developmental validity by showing age-related patterns and demonstrate predictive value by improving age estimation over common baselines. We further confirm semantic sensitivity by detecting differences tied to discourse relations. Our metrics align with human judgments, enabling large-scale evaluation. This shifts child utterance assessment from simply measuring length to evaluating how meaningfully the child's speech contributes to and advances the conversation within its context.
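The abstract's point about length-dominated proxies can be made concrete with MLU. The sketch below (word-based MLU; the classic metric counts morphemes) shows two responses with identical MLU, one of which maintains the topic while the other ignores it entirely, which is exactly the contextual information a length metric cannot see. The example utterances are illustrative.

```python
def mlu_words(utterances: list[str]) -> float:
    """Mean Length of Utterance in words: total words divided by
    number of utterances (a common simplification of morpheme-based MLU)."""
    total_words = sum(len(u.split()) for u in utterances)
    return total_words / len(utterances)

# Adult prompt (context): "Why did the tower fall down?"
on_topic = ["Because the blocks were too heavy on top."]   # 8 words, topic-maintaining
off_topic = ["I want to eat a red apple now."]             # 8 words, topic-ignoring
print(mlu_words(on_topic), mlu_words(off_topic))
```

Both responses score 8.0, so MLU cannot distinguish them; the Expansion/Independence axes are designed to capture precisely that difference.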

Tags

Child Language · Language Assessment · LLM · Context-Aware

arXiv Categories

cs.CL cs.AI