Multimodal Learning  Relevance: 7/10

Retrieval-Augmented Gaussian Avatars: Improving Expression Generalization

Matan Levy, Gavriel Habib, Issar Tzachor, Dvir Samuel, Rami Ben-Ari, Nir Darshan, Or Litany, Dani Lischinski
arXiv: 2603.08645v1 Published: 2026-03-09 Updated: 2026-03-09

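AI Summary

Proposes RAF (Retrieval-Augmented Faces), a retrieval-based training-time augmentation that improves the expression generalization of template-free head avatars.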

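Key Contributions

  • Proposes RAF, a retrieval-augmented method for training template-free head avatars
  • Widens expression coverage by retrieving nearest-neighbor expression features
  • Improves avatar expression fidelity in both self-driving and cross-driving scenarios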

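Methodology

Builds an expression bank. During training, a subset of the subject's original expression features is replaced with nearest-neighbor expression features retrieved from the bank, while the model still reconstructs the subject's original frames. This makes the model more robust to expression variation.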
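
The replacement step described above can be sketched roughly as follows. This is an illustrative sketch, not the authors' implementation: the function names, the `replace_prob` parameter, the use of plain tuples as expression features, and the Euclidean distance metric are all assumptions, since the summary does not specify how features are encoded or how neighbors are ranked. Reconstruction targets (the subject's original frames) are left untouched and are outside the scope of the sketch.

```python
import math
import random


def nearest_neighbor(query, bank):
    """Return the bank entry closest to `query` (Euclidean distance, an assumed metric)."""
    return min(bank, key=lambda v: math.dist(query, v))


def augment_expressions(features, bank, replace_prob=0.5, rng=None):
    """Swap a random subset of per-frame expression features for bank neighbors.

    `features` and `bank` are sequences of equal-length numeric tuples
    (a hypothetical stand-in for learned expression features).
    Returns (augmented_features, replaced_mask).
    """
    rng = rng or random.Random()
    augmented, mask = [], []
    for feat in features:
        swap = rng.random() < replace_prob  # hypothetical per-frame replacement rule
        augmented.append(nearest_neighbor(feat, bank) if swap else feat)
        mask.append(swap)
    return augmented, mask


# Toy usage: a 3-entry bank and two frames of 2-D "expression features".
bank = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
features = [(0.9, 1.2), (4.0, 4.5)]
augmented, mask = augment_expressions(features, bank, replace_prob=1.0)
```

With `replace_prob=1.0` every frame is swapped for its nearest bank entry, and with `replace_prob=0.0` the features pass through unchanged. A training loop would feed `augmented` to the deformation model as the conditioning signal while computing the loss against the original frames.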

Original Abstract

Template-free animatable head avatars can achieve high visual fidelity by learning expression-dependent facial deformation directly from a subject's capture, avoiding parametric face templates and hand-designed blendshape spaces. However, since learned deformation is supervised only by the expressions observed for a single identity, these models suffer from limited expression coverage and often struggle when driven by motions that deviate from the training distribution. We introduce RAF (Retrieval-Augmented Faces), a simple training-time augmentation designed for template-free head avatars that learn deformation from data. RAF constructs a large unlabeled expression bank and, during training, replaces a subset of the subject's expression features with nearest-neighbor expressions retrieved from this bank while still reconstructing the subject's original frames. This exposes the deformation field to a broader range of expression conditions, encouraging stronger identity-expression decoupling and improving robustness to expression distribution shift without requiring paired cross-identity data, additional annotations, or architectural changes. We further analyze how retrieval augmentation increases expression diversity and validate retrieval quality with a user study showing that retrieved neighbors are perceptually closer in expression and pose. Experiments on the NeRSemble benchmark demonstrate that RAF consistently improves expression fidelity over the baseline, in both self-driving and cross-driving scenarios.

Tags

Head avatars, expression generalization, retrieval augmentation, NeRSemble

arXiv Categories

cs.CV cs.GR cs.LG