Trust via Reputation of Conviction
AI Summary
The paper proposes a conviction-based reputation system, establishing a verifiable foundation for trust in AI.
Main Contributions
- Proposes a conviction-based method for measuring reputation
- Argues that conviction is the principled basis for trust
- Builds a conviction-grounded reputation framework and analyzes its behavior
Methodology
Claims and sources are formalized mathematically, and a method for computing reputation is derived from this formulation.
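The summary does not reproduce the paper's formulas, but the abstract's phrase "expected weighted signed conviction over a realm of claims" suggests a sketch like the following; all symbols here are illustrative notation of my own, not the paper's:

```latex
\[
R(s) \;=\; \mathbb{E}_{c \sim \mathcal{C}}\!\left[\, w(c)\,\bigl(2\,\kappa_s(c) - 1\bigr) \right],
\]
```

where \(\mathcal{C}\) is the realm of claims, \(w(c)\) a claim weight, and \(\kappa_s(c) \in [0,1]\) the conviction, i.e. the likelihood that source \(s\)'s stance on claim \(c\) is vindicated by independent consensus. The map \(2\kappa - 1\) is one plausible convention for turning conviction into a signed score in \([-1, 1]\): a stance certain to be vindicated contributes \(+1\), one certain to be refuted contributes \(-1\).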
Original Abstract
The question of *knowledge*, *truth* and *trust* is explored via a mathematical formulation of claims and sources. We define truth as the reproducibly perceived subset of knowledge, formalize sources as having both generative and discriminative roles, and develop a framework for reputation grounded in *conviction*, the likelihood that a source's stance is vindicated by independent consensus. We argue that conviction, rather than correctness or faithfulness, is the principled basis for trust: it is regime-independent, rewards genuine contribution, and demands the transparent and self-sufficient perceptions that make external verification possible. We formalize reputation as the expected weighted signed conviction over a realm of claims, characterize its behavior across source-claim regimes, and identify continuous verification as both a theoretical necessity and a practical mechanism through which reputation accrues. The framework is applied to AI agents, which are identified as capable but error-prone sources for whom verifiable conviction and continuously accrued reputation constitute the only robust foundation for trust.
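As a concrete illustration of "expected weighted signed conviction over a realm of claims," here is a minimal Python sketch. The names (`Claim`, `reputation`) and the mapping of conviction to a signed score via 2κ − 1 are my own assumptions for illustration, not the paper's definitions:

```python
from dataclasses import dataclass

@dataclass
class Claim:
    weight: float      # importance weight w(c) of the claim (hypothetical)
    conviction: float  # likelihood in [0, 1] that the source's stance
                       # on this claim is vindicated by independent consensus

def reputation(claims):
    """Weighted expectation of signed conviction over a realm of claims.

    The signed conviction 2*kappa - 1 maps [0, 1] to [-1, 1]: a stance
    certain to be vindicated scores +1, one certain to be refuted -1.
    This mapping is an illustrative convention, not the paper's formula.
    """
    total_weight = sum(c.weight for c in claims)
    if total_weight == 0:
        return 0.0
    signed = sum(c.weight * (2 * c.conviction - 1) for c in claims)
    return signed / total_weight
```

For example, a realm with one strongly vindicated stance and one weakly held, likely-refuted stance yields a moderately positive reputation: `reputation([Claim(1.0, 0.9), Claim(0.5, 0.4)])` gives about 0.47.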