LLM Reasoning Relevance: 8/10

Towards Robust Speech Deepfake Detection via Human-Inspired Reasoning

Artem Dvirniak, Evgeny Kushnir, Dmitrii Tarasov, Artem Iudin, Oleg Kiriukhin, Mikhail Pautov, Dmitrii Korzh, Oleg Y. Rogov

arXiv: 2603.10725v1 Published: 2026-03-11 Updated: 2026-03-11

AI Summary

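Proposes HIR-SDD, which combines large audio language models with human-inspired reasoning to improve the robustness and interpretability of speech deepfake detection.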

Main Contributions

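Proposes the HIR-SDD framework
Combines large audio language models with human-inspired reasoning
Builds a human-annotated dataset for chain-of-thought reasoning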

Methodology

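Trains a large audio language model on a human-annotated dataset to perform chain-of-thought reasoning, yielding interpretable deepfake detection results.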

Original Abstract

The modern generative audio models can be used by an adversary in an unlawful manner, specifically, to impersonate other people to gain access to private information. To mitigate this issue, speech deepfake detection (SDD) methods started to evolve. Unfortunately, current SDD methods generally suffer from the lack of generalization to new audio domains and generators. More than that, they lack interpretability, especially human-like reasoning that would naturally explain the attribution of a given audio to the bona fide or spoof class and provide human-perceptible cues. In this paper, we propose HIR-SDD, a novel SDD framework that combines the strengths of Large Audio Language Models (LALMs) with the chain-of-thought reasoning derived from the novel proposed human-annotated dataset. Experimental evaluation demonstrates both the effectiveness of the proposed method and its ability to provide reasonable justifications for predictions.

arXiv Categories

cs.SD cs.AI
