LLM Reasoning relevance: 6/10

Towards Empowering Consumers through Sentence-level Readability Scoring in German ESG Reports

Benjamin Josef Schüßler, Jakob Prange
arXiv: 2603.29861v1 Published: 2026-03-31 Updated: 2026-03-31

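AI Summary

This paper studies the readability of German ESG reports. Using crowdsourced annotations and a comparison of scoring methods, it finds that a small finetuned transformer predicts human readability ratings with the lowest error.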

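Main Contributions

  • Extended an existing sentence-level dataset of German ESG reports with crowdsourced readability annotations
  • Evaluated several readability scoring methods on German ESG reports, measuring prediction error and correlation with human rankings
  • Found that a small finetuned transformer predicts human readability with the lowest error, while LLM prompting shows potential for separating clear from hard-to-read sentences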

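Methodology

The authors collect human readability annotations through crowdsourcing, score the same sentences with LLM prompting and finetuned transformer models, and compare the model scores against the human annotations.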
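
The comparison against human annotations can be illustrated with a minimal sketch. It assumes mean absolute error as the prediction-error measure and Spearman rank correlation as the correlation with human rankings; the abstract names neither metric explicitly, so both are assumptions. All function names and all scores below are hypothetical toy values, not results from the paper.

```python
from statistics import mean


def mean_absolute_error(pred, gold):
    """Average absolute gap between model scores and human ratings."""
    return mean(abs(p - g) for p, g in zip(pred, gold))


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(pred, gold):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(pred), average_ranks(gold))


def ensemble_average(model_predictions):
    """Per-sentence mean over the predictions of several models."""
    return [mean(scores) for scores in zip(*model_predictions)]


# Hypothetical toy data: one score per sentence (made up for illustration).
human = [1.0, 2.0, 3.0, 4.0]
model_a = [1.5, 2.5, 2.0, 4.5]
model_b = [0.5, 1.5, 4.0, 3.5]
ensemble = ensemble_average([model_a, model_b])

for name, pred in [("model_a", model_a), ("model_b", model_b), ("ensemble", ensemble)]:
    print(name, mean_absolute_error(pred, human), spearman(pred, human))
```

In this toy setup the two models err in opposite directions, so their average lands closer to the human ratings than either model alone. That mirrors the abstract's remark that averaging can help slightly, at the cost of running every model at inference time.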

Original Abstract

With the ever-growing urgency of sustainability in the economy and society, and the massive stream of information that comes with it, consumers need reliable access to that information. To address this need, companies began publishing so called Environmental, Social, and Governance (ESG) reports, both voluntarily and forced by law. To serve the public, these reports must be addressed not only to financial experts but also to non-expert audiences. But are they written clearly enough? In this work, we extend an existing sentence-level dataset of German ESG reports with crowdsourced readability annotations. We find that, in general, native speakers perceive sentences in ESG reports as easy to read, but also that readability is subjective. We apply various readability scoring methods and evaluate them regarding their prediction error and correlation with human rankings. Our analysis shows that, while LLM prompting has potential for distinguishing clear from hard-to-read sentences, a small finetuned transformer predicts human readability with the lowest error. Averaging predictions of multiple models can slightly improve the performance at the cost of slower inference.

Tags

ESG Readability NLP Transformer German

arXiv Categories

cs.CL cs.AI