Multimodal Learning Relevance: 8/10

A$^2$BFR: Attribute-Aware Blind Face Restoration

Chenxin Zhu, Yushun Fang, Lu Liu, Shibo Yin, Xiaohong Liu, Xiaoyun Zhang, Qiang Hu, Guangtao Zhai
arXiv: 2603.29423v1 Published: 2026-03-31 Updated: 2026-03-31

AI Summary

A$^2$BFR achieves high-fidelity, controllable blind face restoration through attribute-aware learning and semantic dual-training.

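Key Contributions

  • Proposes the A$^2$BFR framework, which unifies high-fidelity reconstruction with prompt-controllable generation
  • Introduces attribute-aware learning, which supervises the denoising latents using facial attribute embeddings
  • Proposes semantic dual-training to strengthen prompt controllability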
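
The abstract does not state the loss functions behind the last two contributions, so the following is only a rough sketch of how such objectives could look. It assumes a cosine-alignment loss for attribute-aware learning and a hinge-style margin loss for semantic dual-training over a pair of samples that differ in one attribute. The function names, the `margin` value, and both loss forms are hypothetical and are not taken from the paper.

```python
import numpy as np

def cosine_similarity(a, b, eps=1e-8):
    """Row-wise cosine similarity between two (batch, dim) arrays."""
    a_norm = a / (np.linalg.norm(a, axis=-1, keepdims=True) + eps)
    b_norm = b / (np.linalg.norm(b, axis=-1, keepdims=True) + eps)
    return np.sum(a_norm * b_norm, axis=-1)

def attribute_alignment_loss(latent_feat, attr_embed):
    """Assumed form: 1 - cosine similarity, averaged over the batch.

    latent_feat: features projected from the denoising latents (hypothetical).
    attr_embed:  embeddings from an attribute-aware encoder (hypothetical).
    """
    return float(np.mean(1.0 - cosine_similarity(latent_feat, attr_embed)))

def dual_training_loss(feat_a, feat_b, attr_a, attr_b, margin=0.2):
    """Assumed hinge form for a pair (a, b) that differs in one attribute.

    Each feature should match its own attribute embedding better than the
    paired sample's embedding, by at least `margin` on average.
    """
    pos = cosine_similarity(feat_a, attr_a) + cosine_similarity(feat_b, attr_b)
    neg = cosine_similarity(feat_a, attr_b) + cosine_similarity(feat_b, attr_a)
    return float(np.mean(np.maximum(0.0, margin - (pos - neg) / 2.0)))
```

In this sketch the alignment term pulls the restored latents toward the target attributes, while the margin term penalizes a pair whose features are confused with each other's attributes. How the paper balances such terms against the restoration objective is not stated in the abstract.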

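Methodology

Built on a Diffusion Transformer with unified image-text cross-modal attention, the method combines attribute-aware learning and semantic dual-training to perform blind face restoration.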
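
One common way to realize unified image-text attention in a Diffusion Transformer is to concatenate all token streams into one sequence and run self-attention over it. The sketch below illustrates that idea with a single head in NumPy. The concatenation design, the token shapes, and the names `joint_attention`, `noisy_tokens`, `degraded_tokens`, and `text_tokens` are assumptions for illustration. The abstract only says that the denoising trajectory is jointly conditioned on degraded inputs and textual prompts.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)

def joint_attention(noisy_tokens, degraded_tokens, text_tokens, w_q, w_k, w_v):
    """Single-head self-attention over one concatenated token sequence.

    Each token array has shape (n_i, dim); each weight matrix is (dim, dim).
    Returns the updated noisy-latent tokens and the full attention matrix.
    """
    # Assumed design: all modalities share one sequence, so every noisy-latent
    # token can attend to degraded-image tokens and prompt tokens alike.
    seq = np.concatenate([noisy_tokens, degraded_tokens, text_tokens], axis=0)
    q, k, v = seq @ w_q, seq @ w_k, seq @ w_v
    weights = softmax(q @ k.T / np.sqrt(q.shape[-1]))
    out = weights @ v
    # Only the noisy-latent positions are carried on to the denoising step.
    return out[: noisy_tokens.shape[0]], weights
```

A practical model would use multiple heads, per-modality projections, and positional information. This version keeps only the part that shows how image and text conditions can share one attention map.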

Original Abstract

Blind face restoration (BFR) aims to recover high-quality facial images from degraded inputs, yet its inherently ill-posed nature leads to ambiguous and uncontrollable solutions. Recent diffusion-based BFR methods improve perceptual quality but remain uncontrollable, whereas text-guided face editing enables attribute manipulation without reliable restoration. To address these issues, we propose A$^2$BFR, an attribute-aware blind face restoration framework that unifies high-fidelity reconstruction with prompt-controllable generation. Built upon a Diffusion Transformer backbone with unified image-text cross-modal attention, A$^2$BFR jointly conditions the denoising trajectory on both degraded inputs and textual prompts. To inject semantic priors, we introduce attribute-aware learning, which supervises denoising latents using facial attribute embeddings extracted by an attribute-aware encoder. To further enhance prompt controllability, we introduce semantic dual-training, which leverages the pairwise attribute variations in our newly curated AttrFace-90K dataset to enforce attribute discrimination while preserving fidelity. Extensive experiments demonstrate that A$^2$BFR achieves state-of-the-art performance in both restoration fidelity and instruction adherence, outperforming diffusion-based BFR baselines by -0.0467 LPIPS and +52.58% attribute accuracy, while enabling fine-grained, prompt-controllable restoration even under severe degradations.

Tags

blind face restoration, diffusion models, attribute control, image generation

arXiv Categories

cs.CV