The Rules-and-Facts Model for Simultaneous Generalization and Memorization in Neural Networks
AI Summary
Proposes the Rules-and-Facts (RAF) model for studying the generalization and memorization capabilities of neural networks.
Main Contributions
- Proposes the RAF model, which simplifies the theoretical analysis of generalization and memorization
- Quantifies how overparameterization enables rule learning and memorization to be achieved simultaneously
- Reveals how regularization and the choice of kernel control the allocation of capacity between rule learning and memorization
Methodology
The RAF model is constructed by combining the teacher-student framework from the statistical physics of learning with Gardner-style capacity analysis, and conditions are derived under which a learner can simultaneously learn the underlying rule and memorize the facts.
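Stated in equations, the data model can be sketched as follows (this restates only what the abstract says; the teacher vector $\mathbf{w}^{*}$, teacher nonlinearity $\sigma$, training-set size $P$, and index sets $\mathcal{R}$, $\mathcal{F}$ are our assumed notation, not the paper's):

$$
y_\mu =
\begin{cases}
\sigma\left(\mathbf{w}^{*} \cdot \mathbf{x}_\mu\right), & \mu \in \mathcal{R} \text{ (rule examples, } |\mathcal{R}| = (1-\varepsilon)P\text{)},\\
\text{random label}, & \mu \in \mathcal{F} \text{ (unstructured facts, } |\mathcal{F}| = \varepsilon P\text{)}.
\end{cases}
$$

The learner fits all $P$ examples; rule recovery is judged by agreement with the teacher on fresh inputs, while memorization is judged by the fit on the $\varepsilon P$ random-label facts.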
Original Abstract
A key capability of modern neural networks is their capacity to simultaneously learn underlying rules and memorize specific facts or exceptions. Yet, theoretical understanding of this dual capability remains limited. We introduce the Rules-and-Facts (RAF) model, a minimal solvable setting that enables precise characterization of this phenomenon by bridging two classical lines of work in the statistical physics of learning: the teacher-student framework for generalization and Gardner-style capacity analysis for memorization. In the RAF model, a fraction $1 - \varepsilon$ of training labels is generated by a structured teacher rule, while a fraction $\varepsilon$ consists of unstructured facts with random labels. We characterize when the learner can simultaneously recover the underlying rule - allowing generalization to new data - and memorize the unstructured examples. Our results quantify how overparameterization enables the simultaneous realization of these two objectives: sufficient excess capacity supports memorization, while regularization and the choice of kernel or nonlinearity control the allocation of capacity between rule learning and memorization. The RAF model provides a theoretical foundation for understanding how modern neural networks can infer structure while storing rare or non-compressible information.
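The setup described in the abstract is easy to simulate. The sketch below is our own minimal numerical illustration, not the paper's solvable analysis: the teacher is taken to be a random linear sign rule, the learner is kernel ridge regression with an RBF kernel, and all parameter values (d, P, eps, alpha, gamma) are illustrative assumptions.

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge

# Minimal sketch of the RAF setup; teacher/learner choices are assumptions.
rng = np.random.default_rng(0)
d, P, eps = 50, 500, 0.1          # input dim, training set size, fact fraction

# Teacher rule: a random linear sign rule (our assumed teacher).
w_star = rng.standard_normal(d) / np.sqrt(d)

# Training inputs: the first (1 - eps) * P examples get rule labels,
# the remaining eps * P are unstructured facts with random labels.
X = rng.standard_normal((P, d))
n_rule = int((1 - eps) * P)
y = np.sign(X @ w_star)
y[n_rule:] = rng.choice([-1.0, 1.0], size=P - n_rule)

# Overparameterized learner: RBF kernel ridge with a small ridge penalty.
model = KernelRidge(kernel="rbf", gamma=1.0 / d, alpha=1e-3)
model.fit(X, y)

# Memorization: sign agreement on the randomly labeled facts.
mem = np.mean(np.sign(model.predict(X[n_rule:])) == y[n_rule:])

# Generalization: sign agreement with the teacher rule on fresh data.
X_test = rng.standard_normal((2000, d))
gen = np.mean(np.sign(model.predict(X_test)) == np.sign(X_test @ w_star))

print(f"fact memorization: {mem:.2f}, rule generalization: {gen:.2f}")
```

Varying eps and alpha in this sketch mirrors the trade-off the abstract describes: a larger ridge penalty sacrifices fact memorization for rule learning, while excess capacity allows both at once.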