Adaptive Evidence Weighting for Audio-Spatiotemporal Fusion
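AI Summary
The paper proposes FINCH, a framework that adaptively fuses audio and spatiotemporal information to improve bioacoustic classification performance.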
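Key Contributions
- Proposes the FINCH framework for adaptively fusing audio and spatiotemporal evidence.
- Introduces a per-sample gating function that estimates the reliability of contextual information.
- Outperforms fixed-weight fusion and audio-only baselines, even when contextual information is weak on its own.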
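Methodology
FINCH learns a per-sample gating function that, based on uncertainty and informativeness statistics, adaptively adjusts the weights of the audio and spatiotemporal evidence before fusing them.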
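The gated log-linear fusion described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`gate`, `fuse`), the two entropy features, the sigmoid gate, the cap `w_max`, and the hand-set parameters `theta` and `bias` are all hypothetical. In the paper the gate is learned, and its exact features and parameterization are not given in this summary.

```python
import numpy as np

def softmax(z, axis=-1):
    # Numerically stable softmax over the class axis.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def entropy(p, axis=-1, eps=1e-12):
    # Shannon entropy in nats; used here as an uncertainty statistic.
    return -(p * np.log(p + eps)).sum(axis=axis)

def gate(p_audio, p_ctx, theta, bias, w_max=1.0):
    # Hypothetical per-sample gate: a sigmoid over two normalized entropy
    # features, scaled so the context weight stays within (0, w_max).
    k = p_audio.shape[-1]
    feats = np.stack([entropy(p_audio), entropy(p_ctx)], axis=-1) / np.log(k)
    return w_max / (1.0 + np.exp(-(feats @ theta + bias)))

def fuse(p_audio, p_ctx, w, eps=1e-12):
    # Log-linear fusion: log p_fused is proportional to
    # log p_audio + w * log p_ctx. With w = 0 this reduces to audio-only.
    logits = np.log(p_audio + eps) + w[..., None] * np.log(p_ctx + eps)
    return softmax(logits)

# Toy inputs (made up for illustration): two samples, three species.
p_audio = np.array([[0.50, 0.30, 0.20],    # fairly confident audio
                    [0.34, 0.33, 0.33]])   # nearly uninformative audio
p_ctx = np.array([[0.10, 0.80, 0.10],
                  [0.05, 0.90, 0.05]])

# Hand-set gate parameters (assumption): trust context more when the
# audio prediction is uncertain and the context prediction is sharp.
theta = np.array([2.0, -2.0])
bias = 0.0

w = gate(p_audio, p_ctx, theta, bias, w_max=1.0)
fused = fuse(p_audio, p_ctx, w)
audio_only = fuse(p_audio, p_ctx, np.zeros(2))
```

Capping the weight at `w_max` is one simple way to bound how far contextual evidence can move the prediction, and a zero weight recovers the audio classifier, which mirrors the audio-only fallback the abstract describes.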
Original Abstract
Many machine learning systems have access to multiple sources of evidence for the same prediction target, yet these sources often differ in reliability and informativeness across inputs. In bioacoustic classification, species identity may be inferred both from the acoustic signal and from spatiotemporal context such as location and season; while Bayesian inference motivates multiplicative evidence combination, in practice we typically only have access to discriminative predictors rather than calibrated generative models. We introduce Fusion under INdependent Conditional Hypotheses (FINCH), an adaptive log-linear evidence fusion framework that integrates a pre-trained audio classifier with a structured spatiotemporal predictor. FINCH learns a per-sample gating function that estimates the reliability of contextual information from uncertainty and informativeness statistics. The resulting fusion family contains the audio-only classifier as a special case and explicitly bounds the influence of contextual evidence, yielding a risk-contained hypothesis class with an interpretable audio-only fallback. Across benchmarks, FINCH consistently outperforms fixed-weight fusion and audio-only baselines, improving robustness and error trade-offs even when contextual information is weak in isolation. We achieve state-of-the-art performance on CBI and competitive or improved performance on several subsets of BirdSet using a lightweight, interpretable, evidence-based approach. Code is available: anonymous-repository (https://anonymous.4open.science/r/birdnoise-85CD/README.md)