Large-scale Score-based Variational Posterior Inference for Bayesian Deep Neural Networks
AI Summary
Proposes a scalable score-based variational posterior inference method for Bayesian deep neural networks, applicable to large-scale models.
Key Contributions
- Proposes a new scalable variational inference method
- Combines a score matching loss with a proximal penalty term
- Scales to large neural networks such as Vision Transformers
Methodology
By combining the score matching loss with a proximal penalty term at each iteration, the method avoids reparametrized sampling and admits noisy yet unbiased mini-batch scores computed through stochastic gradients.
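The idea can be illustrated on a toy problem. The sketch below is not the paper's algorithm, only a minimal illustration of the two ingredients named above: a score matching loss between the variational score and the target score, plus a proximal penalty tying each update to the previous iterate. Sampling from the *previous* variational density is what removes the need for reparametrized gradients. All names, the 1-D Gaussian family, and the target N(0, 1) "posterior" are assumptions made for this example.

```python
import numpy as np

def score_q(theta, mu, v):
    # Score of the Gaussian variational density q = N(mu, v):
    # grad_theta log q(theta) = -(theta - mu) / v
    return -(theta - mu) / v

def score_target(theta):
    # Score of a toy stand-in "posterior" N(0, 1); in practice this
    # would be a noisy mini-batch estimate of grad log p(theta, D).
    return -theta

rng = np.random.default_rng(0)
mu, v = 2.0, 4.0     # initial variational mean and variance
lam, lr = 0.1, 0.05  # proximal weight, gradient step size

for _ in range(50):                # outer proximal iterations
    mu_old, v_old = mu, v
    # Draw samples from the PREVIOUS q: the current parameters (mu, v)
    # never enter the sampler, so no reparametrization trick is needed.
    theta = mu_old + np.sqrt(v_old) * rng.standard_normal(2000)
    for _ in range(50):            # inner gradient steps on the objective
        d = score_q(theta, mu, v) - score_target(theta)
        # Gradients of mean[d^2] + lam * ||(mu, v) - (mu_old, v_old)||^2
        g_mu = np.mean(2 * d / v) + 2 * lam * (mu - mu_old)
        g_v = np.mean(2 * d * (theta - mu) / v**2) + 2 * lam * (v - v_old)
        mu -= lr * g_mu
        v = max(v - lr * g_v, 1e-3)  # keep the variance positive

print(mu, v)  # mu ≈ 0, v ≈ 1 (the moments of the target)
```

Because the score matching term vanishes exactly when the variational score matches the target score, the iterates settle at the target's parameters; the proximal term only damps each update toward the previous iterate, mirroring the iterative scheme described above.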
Original Abstract
Bayesian (deep) neural networks (BNN) are often more attractive than the mainstream point-estimate vanilla deep learning in various aspects including uncertainty quantification, robustness to noise, resistance to overfitting, and more. Variational inference (VI) is one of the most widely adopted approximate inference methods. Whereas the ELBO-based variational free energy method is a dominant choice in the literature, in this paper we introduce a score-based alternative for BNN variational inference. Although quite a few score-based variational inference methods have been proposed in the community, most are not adequate for large-scale BNNs for various computational and technical reasons. We propose a novel scalable VI method whose learning objective combines the score matching loss and a proximal penalty term at each iteration, which helps our method avoid reparametrized sampling and allows for noisy unbiased mini-batch scores through stochastic gradients. This in turn makes our method scalable to large-scale neural networks including Vision Transformers, and allows for richer variational density families. On several benchmarks including visual recognition and time-series forecasting with large-scale deep networks, we empirically show the effectiveness of our approach.