On the Equivalence of Random Network Distillation, Deep Ensembles, and Bayesian Inference
AI Summary
The paper establishes a theoretical equivalence between random network distillation (RND), deep ensembles, and Bayesian inference.
Main Contributions
- Proves that RND's squared self-predictive error is equivalent to the predictive variance of a deep ensemble (a minimal sketch follows this list).
- Shows that, by constructing a specific RND target function, the RND error distribution can be made to mirror the centered posterior predictive distribution of Bayesian inference.
- Building on this equivalence, devises a posterior sampling algorithm that generates i.i.d. samples from the Bayesian posterior predictive distribution.
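To make the first claim concrete, here is a minimal NumPy sketch (not the paper's code) that emulates the lazy/NTK training regime with fixed random ReLU features and a trained linear readout. Averaged over random targets, the squared RND error and the variance of a lazily trained deep ensemble are Monte Carlo estimates of the same quantity, so the two curves should nearly coincide. All sizes, seeds, and helper names (`features`, `fit`) are illustrative assumptions.

```python
import numpy as np

# Finite-width stand-in for the NTK/lazy regime: fixed random ReLU features,
# only a linear readout is trained (ridge regression).
rng = np.random.default_rng(0)
d, width, n_train, n_repeats = 1, 2048, 16, 300
W1 = rng.normal(size=(width, d))
b1 = rng.normal(size=width)

def features(X):
    return np.maximum(X @ W1.T + b1, 0.0) / np.sqrt(width)

X_train = rng.uniform(-1.0, 1.0, size=(n_train, d))
X_query = np.linspace(-2.0, 2.0, 200)[:, None]
Phi_tr, Phi_q = features(X_train), features(X_query)

# Ridge fit of the readout to labels y on the training set (dual form).
K_inv = np.linalg.inv(Phi_tr @ Phi_tr.T + 1e-8 * np.eye(n_train))
fit = lambda y: Phi_tr.T @ (K_inv @ y)

# RND: a predictor readout (initialized at zero in this sketch) is fitted to
# a fixed random target readout on X_train; the uncertainty signal is the
# squared error at the query points, averaged here over random targets.
rnd_sq_err = np.zeros(len(X_query))
for s in range(n_repeats):
    w_tgt = np.random.default_rng(100 + s).normal(size=width)
    w_prd = fit(Phi_tr @ w_tgt)          # predictor matches target on the data
    rnd_sq_err += (Phi_q @ (w_tgt - w_prd)) ** 2
rnd_sq_err /= n_repeats

# Deep ensemble: independently initialized readouts, each trained (lazily)
# to fit the same labels; uncertainty is the variance across members.
y_train = np.sin(3.0 * X_train[:, 0])
preds = []
for s in range(n_repeats):
    w0 = np.random.default_rng(500 + s).normal(size=width)
    w = w0 + fit(y_train - Phi_tr @ w0)  # lazy training from init w0
    preds.append(Phi_q @ w)
ens_var = np.var(np.stack(preds), axis=0)

# Both curves estimate the same quantity, so the correlation should be ~1.
print("corr:", np.corrcoef(rnd_sq_err, ens_var)[0, 1])
```

One design note on the sketch: the predictor readout starts at zero, so the match in scale is exact; starting it at a random initialization only rescales the RND error, not its shape over inputs.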
Methodology
Analyzes RND within the neural tangent kernel (NTK) framework in the limit of infinite network width.
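For background, infinite-width analyses of this kind typically rest on the standard NTK result (Jacot et al., 2018; Lee et al., 2019) that, under gradient flow on the squared loss, a wide network trained to convergence is linear in its random initialization:

```latex
% Standard infinite-width, gradient-flow, squared-loss limit:
f_\infty(x) = f_0(x) + \Theta(x, X)\,\Theta(X, X)^{-1}\bigl(y - f_0(X)\bigr)
```

Here Θ is the NTK, X and y are the training inputs and labels, and f₀ is the network at initialization. In RND the labels are themselves outputs of a random target network, so the prediction error at a query point becomes a linear functional of a random function; this is the structural fact that allows its moments to be matched to ensemble variances and to Gaussian process posteriors.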
Original Abstract
Uncertainty quantification is central to safe and efficient deployments of deep learning models, yet many computationally practical methods lack rigorous theoretical motivation. Random network distillation (RND) is a lightweight technique that measures novelty via prediction errors against a fixed random target. While empirically effective, it has remained unclear what uncertainties RND measures and how its estimates relate to other approaches, e.g. Bayesian inference or deep ensembles. This paper establishes these missing theoretical connections by analyzing RND within the neural tangent kernel framework in the limit of infinite network width. Our analysis reveals two central findings in this limit: (1) The uncertainty signal from RND -- its squared self-predictive error -- is equivalent to the predictive variance of a deep ensemble. (2) By constructing a specific RND target function, we show that the RND error distribution can be made to mirror the centered posterior predictive distribution of Bayesian inference with wide neural networks. Based on this equivalence, we moreover devise a posterior sampling algorithm that generates i.i.d. samples from an exact Bayesian posterior predictive distribution using this modified *Bayesian RND* model. Collectively, our findings provide a unified theoretical perspective that places RND within the principled frameworks of deep ensembles and Bayesian inference, and offer new avenues for efficient yet theoretically grounded uncertainty quantification methods.
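The posterior sampling claim targets the Bayesian posterior predictive of infinitely wide networks, which in the kernel picture is a Gaussian process posterior. The paper's specific Bayesian RND target construction is not reproduced here; purely as a reference point for what "i.i.d. samples from the posterior predictive" means in that GP reading, below is a minimal sketch of drawing such samples directly. The RBF kernel `k` and the noise level `sigma2` are illustrative stand-ins (the NNGP kernel of the actual architecture would play this role).

```python
import numpy as np

def k(A, B):
    # Illustrative stand-in kernel (RBF) for 1-D inputs.
    return np.exp(-0.5 * ((A[:, None, 0] - B[None, :, 0]) ** 2))

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(16, 1))   # training inputs
y = np.sin(3.0 * X[:, 0])                  # training labels
Xq = np.linspace(-2.0, 2.0, 100)[:, None]  # query points
sigma2 = 1e-4                              # observation-noise variance

# Standard GP posterior predictive: mean and covariance at the query points.
K = k(X, X) + sigma2 * np.eye(len(X))
Kq = k(Xq, X)
mean = Kq @ np.linalg.solve(K, y)
cov = k(Xq, Xq) - Kq @ np.linalg.solve(K, Kq.T)

# i.i.d. posterior predictive samples; each plays the role of one draw that a
# Bayesian-RND-style model would produce around the posterior mean.
samples = rng.multivariate_normal(mean, cov + 1e-9 * np.eye(len(Xq)), size=5)
print(samples.shape)  # (5, 100)
```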