Multimodal Learning Relevance: 7/10

Learning Perceptual Representations for Gaming NR-VQA with Multi-Task FR Signals

Yu-Chih Chen, Michael Wang, Chieh-Dun Wen, Kai-Siang Ma, Avinab Saha, Li-Heng Chen, Alan Bovik
arXiv: 2602.11903v1 Published: 2026-02-12 Updated: 2026-02-12

AI Summary

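Proposes a multi-task learning framework that uses full-reference (FR) metrics as supervisory signals to improve no-reference video quality assessment (NR-VQA) for gaming videos.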

Key Contributions

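  • Proposes MTL-VQA, a multi-task learning framework supervised by FR metrics
  • Adaptive task weighting across the FR objectives
  • Performance competitive with state-of-the-art NR-VQA methods on gaming video datasets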

Methodology

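A network is pretrained with multi-task learning on FR metric targets, which needs no human labels, so that it learns perceptually relevant features. The learned representations are then transferred to the NR-VQA task.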
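
The summary does not say which adaptive weighting scheme MTL-VQA uses. As an illustration only, the sketch below shows one common option, homoscedastic uncertainty weighting, in which each task i has a learnable log-variance s_i and the combined loss is the sum of exp(-s_i) * L_i + s_i. The function names and the two example FR task losses are hypothetical and not taken from the paper.

```python
import math


def uncertainty_weighted_loss(task_losses, log_vars):
    """Combine per-task losses as sum_i exp(-s_i) * L_i + s_i.

    task_losses: one scalar loss per FR objective (hypothetical values here).
    log_vars: one log-variance s_i per task; in real training these would be
              learnable parameters updated together with the network.
    """
    if len(task_losses) != len(log_vars):
        raise ValueError("need exactly one log-variance per task loss")
    return sum(math.exp(-s) * loss + s for loss, s in zip(task_losses, log_vars))


def effective_weights(log_vars):
    """Return the multiplicative weight exp(-s_i) applied to each task loss."""
    return [math.exp(-s) for s in log_vars]


# Two hypothetical FR regression losses. The second task has the larger
# log-variance, so its loss is down-weighted (weight 0.5 instead of 1.0).
losses = [1.0, 2.0]
log_vars = [0.0, math.log(2.0)]
total = uncertainty_weighted_loss(losses, log_vars)
weights = effective_weights(log_vars)
```

In this scheme the additive s_i term penalizes large log-variances, so the optimizer cannot drive every task weight to zero. Whether MTL-VQA uses this or a different rule has to be checked against the paper itself.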

Original Abstract

No-reference video quality assessment (NR-VQA) for gaming videos is challenging due to limited human-rated datasets and unique content characteristics including fast motion, stylized graphics, and compression artifacts. We present MTL-VQA, a multi-task learning framework that uses full-reference metrics as supervisory signals to learn perceptually meaningful features without human labels for pretraining. By jointly optimizing multiple full-reference (FR) objectives with adaptive task weighting, our approach learns shared representations that transfer effectively to NR-VQA. Experiments on gaming video datasets show MTL-VQA achieves performance competitive with state-of-the-art NR-VQA methods across both MOS-supervised and label-efficient/self-supervised settings.

Tags

NR-VQA, Video Quality Assessment, Multi-Task Learning, Gaming Videos, Full-Reference Metrics

arXiv Categories

eess.IV cs.CV cs.MM