Multimodal Learning Relevance: 8/10

Using Multimodal and Language-Agnostic Sentence Embeddings for Abstractive Summarization

Chaimae Chellaf, Salima Mdhaffar, Yannick Estève, Stéphane Huet
arXiv: 2603.08282v1 Published: 2026-03-09 Updated: 2026-03-09

AI Summary

SBARThez leverages multimodal, multilingual embeddings and named entity injection to improve the accuracy and conciseness of generated summaries.

Key Contributions

  • Proposes the SBARThez framework, which supports cross-lingual summarization and multimodal input
  • Introduces a Named Entity Injection mechanism to improve the factual consistency of generated summaries
  • Uses multimodal, multilingual pretrained models such as LaBSE, SONAR, and BGE-M3 to produce sentence embeddings

Methodology

A BART-based French model consumes multimodal, multilingual sentence embeddings and, combined with Named Entity Injection, generates abstractive summaries.
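The abstract describes Named Entity Injection as appending tokenized named entities to the decoder input. A minimal sketch of that idea, assuming made-up token IDs and a hypothetical separator token (the paper's actual vocabulary, entity ordering, and placement relative to the summary prefix are not specified in the abstract):

```python
# Sketch of Named Entity Injection: token IDs of named entities extracted
# from the source are concatenated onto the decoder input, so the decoder
# is conditioned on the entities while generating the summary.
# All token IDs below are illustrative, not from the paper's tokenizer.

def inject_named_entities(decoder_input_ids, entity_token_ids, sep_id):
    """Prepend each entity's token IDs (followed by sep_id) to the decoder input."""
    injected = []
    for ids in entity_token_ids:          # one list of token IDs per entity
        injected.extend(ids)
        injected.append(sep_id)           # mark the entity boundary
    return injected + decoder_input_ids   # entities precede the summary prefix

# Toy example: two entities, then a BOS-led decoder input.
entities = [[101, 102], [103]]            # e.g. "Jacques Chirac", "Paris"
decoder_input = [0, 57, 58, 59]           # BOS followed by summary tokens
print(inject_named_entities(decoder_input, entities, sep_id=2))
# [101, 102, 2, 103, 2, 0, 57, 58, 59]
```

Whether the entities are placed before or after the summary tokens is a design choice not settled by the abstract; the sketch prepends them as a prompt-like prefix.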

Original Abstract

Abstractive summarization aims to generate concise summaries by creating new sentences, allowing for flexible rephrasing. However, this approach can be vulnerable to inaccuracies, particularly "hallucinations" where the model introduces non-existent information. In this paper, we leverage the use of multimodal and multilingual sentence embeddings derived from pretrained models such as LaBSE, SONAR, and BGE-M3, and feed them into a modified BART-based French model. A Named Entity Injection mechanism that appends tokenized named entities to the decoder input is introduced, in order to improve the factual consistency of the generated summary. Our novel framework, SBARThez, is applicable to both text and speech inputs and supports cross-lingual summarization; it shows competitive performance relative to token-level baselines, especially for low-resource languages, while generating more concise and abstract summaries.

Tags

Summarization  Multimodal Learning  Cross-lingual  Natural Language Processing  Named Entity Recognition

arXiv Category

cs.CL