Agent Tuning & Optimization Relevance: 5/10

Grammatical Error Correction Evaluation by Optimally Transporting Edit Representation

Takumi Goto, Yusuke Sakai, Taro Watanabe
arXiv: 2602.05419v1 Published: 2026-02-05 Updated: 2026-02-05

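AI Summary

Proposes UOT-ERRANT, a grammatical error correction (GEC) evaluation metric based on unbalanced optimal transport, which improves both evaluation performance and interpretability.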

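Main Contributions

  • Proposes the edit vector, a vector representation of a single edit.
  • Introduces UOT-ERRANT, a GEC evaluation metric based on unbalanced optimal transport.
  • Shows experimentally that UOT-ERRANT improves evaluation performance, particularly in the +Fluency domain, and is highly interpretable.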

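Methodology

Computes distances between the edit vectors of the hypothesis and those of the reference, then uses unbalanced optimal transport to align the edits; the resulting transport is used to evaluate GEC system output.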
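
The alignment step can be illustrated with a minimal sketch. It is not the authors' implementation: the edit vectors are toy 2-D placeholders, the cost is plain Euclidean distance, and the solver is the generic entropic unbalanced Sinkhorn scaling with KL-relaxed marginals. The function name `unbalanced_sinkhorn` and the parameters `eps` and `rho` are assumptions made for illustration only.

```python
import numpy as np


def unbalanced_sinkhorn(cost, a, b, eps=0.1, rho=1.0, n_iters=200):
    """Generic entropic unbalanced OT (KL-relaxed marginals), textbook scaling form.

    cost: (n_hyp, n_ref) pairwise cost matrix
    a, b: non-negative masses on hypothesis and reference edits
    eps:  entropic regularization strength (assumed value)
    rho:  marginal relaxation strength (assumed value)
    """
    K = np.exp(-cost / eps)            # Gibbs kernel
    exponent = rho / (rho + eps)       # damping that relaxes the marginal constraints
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iters):
        u = (a / (K @ v)) ** exponent
        v = (b / (K.T @ u)) ** exponent
    return u[:, None] * K * v[None, :]  # transport plan = soft edit alignment


# Toy placeholder "edit vectors": two hypothesis edits, three reference edits.
# The third reference edit has no close counterpart in the hypothesis.
hyp_edits = np.array([[1.0, 0.0], [0.0, 1.0]])
ref_edits = np.array([[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]])

# Pairwise Euclidean distances between hypothesis and reference edit vectors.
cost = np.linalg.norm(hyp_edits[:, None, :] - ref_edits[None, :, :], axis=-1)
a = np.ones(len(hyp_edits))
b = np.ones(len(ref_edits))

plan = unbalanced_sinkhorn(cost, a, b)
print(np.round(plan, 3))
```

Because the marginals are only softly enforced, the unmatched reference edit receives very little mass instead of being forced onto a distant hypothesis edit. Each row of `plan` can be read as a soft alignment of one hypothesis edit to the reference edits.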

Original Abstract

Automatic evaluation in grammatical error correction (GEC) is crucial for selecting the best-performing systems. Currently, reference-based metrics are a popular choice, which basically measure the similarity between hypothesis and reference sentences. However, similarity measures based on embeddings, such as BERTScore, are often ineffective, since many words in the source sentences remain unchanged in both the hypothesis and the reference. This study focuses on edits specifically designed for GEC, i.e., ERRANT, and computes similarity measured over the edits from the source sentence. To this end, we propose edit vector, a representation for an edit, and introduce a new metric, UOT-ERRANT, which transports these edit vectors from hypothesis to reference using unbalanced optimal transport. Experiments with SEEDA meta-evaluation show that UOT-ERRANT improves evaluation performance, particularly in the +Fluency domain where many edits occur. Moreover, our method is highly interpretable because the transport plan can be interpreted as a soft edit alignment, making UOT-ERRANT a useful metric for both system ranking and analyzing GEC systems. Our code is available from https://github.com/gotutiyan/uot-errant.

Tags

Grammatical Error Correction, Evaluation Metric, Optimal Transport, NLP

arXiv Categories

cs.CL