Multimodal Learning Relevance: 9/10

Adaptive Debiasing Tsallis Entropy for Test-Time Adaptation

Xiangyu Wu, Dongming Jiang, Feng Yu, Yueying Tian, Jiaqi Tang, Qing-Guo Chen, Yang Yang, Jianfeng Lu
arXiv: 2602.11743v1 Published: 2026-02-12 Updated: 2026-02-12

AI Summary

Proposes Adaptive Debiasing Tsallis Entropy (ADTE) for test-time adaptation, to address the bias of CLIP models on imbalanced data.

Key Contributions

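  • Finds that Tsallis Entropy (TE) is better suited to characterizing biased distributions
  • Proposes Adaptive Debiasing Tsallis Entropy (ADTE), which adapts through a class-specific parameter q^l
  • ADTE outperforms existing SOTA methods on multiple image classification tasks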
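
For reference, the standard textbook definition of Tsallis entropy and its Shannon limit can be written as below. The symbols p_k (predicted probability of class k) and K (number of classes) are notation chosen here for illustration and are not taken from the paper.

```latex
S_q(p) = \frac{1}{q-1}\left(1 - \sum_{k=1}^{K} p_k^{\,q}\right),
\qquad
\lim_{q \to 1} S_q(p) = -\sum_{k=1}^{K} p_k \log p_k
```

Setting q = 1 therefore recovers Shannon Entropy, which is why TE can be treated as a generalization of SE.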

Methodology

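Derives a class-specific parameter q^l for each category by normalizing the label bias estimated from test instances, uses it to adaptively adjust the Tsallis entropy, and combines this with a label adjustment strategy.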
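
A minimal Python sketch of the idea follows. It is not the authors' implementation. `tsallis_entropy` is the standard definition, but `q_from_bias` (min-max normalization into a fixed interval, including the direction of the mapping) and `classwise_tsallis_entropy` (one q per class term) are hypothetical choices made here only to illustrate what a class-specific q^l could look like; the paper's exact formulas may differ.

```python
import numpy as np


def shannon_entropy(p, eps=1e-12):
    """Shannon entropy of a probability vector."""
    p = np.asarray(p, dtype=float)
    return float(-np.sum(p * np.log(p + eps)))


def tsallis_entropy(p, q, eps=1e-12):
    """Standard Tsallis entropy; falls back to Shannon entropy at q = 1."""
    p = np.asarray(p, dtype=float)
    if np.isclose(q, 1.0):
        return shannon_entropy(p, eps)
    return float((1.0 - np.sum(p ** q)) / (q - 1.0))


def q_from_bias(bias, q_min=0.5, q_max=0.99):
    """ASSUMPTION: map an estimated per-class bias to q by min-max
    normalization into [q_min, q_max]. The interval and the direction of
    the mapping are illustrative, not taken from the paper."""
    bias = np.asarray(bias, dtype=float)
    span = bias.max() - bias.min()
    if span == 0:
        return np.full_like(bias, 0.5 * (q_min + q_max))
    return q_min + (bias - bias.min()) / span * (q_max - q_min)


def classwise_tsallis_entropy(p, q_per_class, eps=1e-12):
    """ASSUMPTION: a per-class variant, sum_k (p_k - p_k^{q_k}) / (q_k - 1).
    With one shared q it equals the standard Tsallis entropy."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q_per_class, dtype=float)
    near_one = np.isclose(q, 1.0)
    safe_q = np.where(near_one, 2.0, q)  # avoid division by zero
    tsallis_terms = (p - p ** safe_q) / (safe_q - 1.0)
    shannon_terms = -p * np.log(p + eps)
    return float(np.sum(np.where(near_one, shannon_terms, tsallis_terms)))


if __name__ == "__main__":
    probs = np.array([0.7, 0.2, 0.1])
    est_bias = np.array([0.6, 0.3, 0.1])  # hypothetical running estimate
    q_l = q_from_bias(est_bias)
    print("q per class:", q_l)
    print("Shannon:", shannon_entropy(probs))
    print("Tsallis (q=0.5):", tsallis_entropy(probs, 0.5))
    print("Class-wise Tsallis:", classwise_tsallis_entropy(probs, q_l))
```

In this sketch the per-class form reduces to the ordinary Tsallis entropy whenever all q values are equal, so it can be swapped in wherever a scalar entropy score is expected.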

Original Abstract

Mainstream Test-Time Adaptation (TTA) methods for adapting vision-language models, e.g., CLIP, typically rely on Shannon Entropy (SE) at test time to measure prediction uncertainty and inconsistency. However, since CLIP has a built-in bias from pretraining on highly imbalanced web-crawled data, SE inevitably results in producing biased estimates of uncertainty entropy. To address this issue, we notably find and demonstrate that Tsallis Entropy (TE), a generalized form of SE, is naturally suited for characterizing biased distributions by introducing a non-extensive parameter q, with the performance of SE serving as a lower bound for TE. Building upon this, we generalize TE into Adaptive Debiasing Tsallis Entropy (ADTE) for TTA, customizing a class-specific parameter q^l derived by normalizing the estimated label bias from continuously incoming test instances, for each category. This adaptive approach allows ADTE to accurately select high-confidence views and seamlessly integrate with a label adjustment strategy to enhance adaptation, without introducing distribution-specific hyperparameter tuning. Besides, our investigation reveals that both TE and ADTE can serve as direct, advanced alternatives to SE in TTA, without any other modifications. Experimental results show that ADTE outperforms state-of-the-art methods on ImageNet and its five variants, and achieves the highest average performance on 10 cross-domain benchmarks, regardless of the model architecture or text prompts used. Our code is available at https://github.com/Jinx630/ADTE.
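
The abstract mentions using entropy to select high-confidence views. A common pattern in entropy-based TTA is to score each augmented view of a test image by its prediction entropy and keep only the lowest-entropy fraction. The sketch below illustrates that pattern with Tsallis entropy as the score; the function names, the `keep_fraction` parameter, and the default `q` are assumptions for illustration, not values or APIs from the paper or its repository.

```python
import numpy as np


def tsallis_entropy_rows(probs, q, eps=1e-12):
    """Tsallis entropy of each row of a (n_views, n_classes) array."""
    probs = np.asarray(probs, dtype=float)
    if np.isclose(q, 1.0):
        # Shannon entropy is the q -> 1 limit.
        return -np.sum(probs * np.log(probs + eps), axis=1)
    return (1.0 - np.sum(probs ** q, axis=1)) / (q - 1.0)


def select_confident_views(probs, q=0.5, keep_fraction=0.5):
    """Keep the lowest-entropy views and average their predictions.

    ASSUMPTION: `keep_fraction` and `q` are illustrative defaults.
    Returns (indices of kept views, averaged probability vector).
    """
    probs = np.asarray(probs, dtype=float)
    entropies = tsallis_entropy_rows(probs, q)
    n_keep = max(1, int(round(keep_fraction * len(probs))))
    keep_idx = np.argsort(entropies)[:n_keep]
    return keep_idx, probs[keep_idx].mean(axis=0)


if __name__ == "__main__":
    # Four hypothetical augmented views over three classes.
    view_probs = np.array([
        [0.90, 0.05, 0.05],
        [0.34, 0.33, 0.33],
        [0.80, 0.10, 0.10],
        [0.40, 0.30, 0.30],
    ])
    kept, averaged = select_confident_views(view_probs)
    print("kept views:", kept)
    print("averaged prediction:", averaged)
```

Because the score is a plain function of the probability vector, replacing Shannon entropy with Tsallis entropy in a pipeline of this shape only requires changing the scoring function.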

Tags

Test-Time Adaptation Entropy Bias Vision-Language Model

arXiv Categories

cs.CV