Beyond Preset Identities: How Agents Form Stances and Boundaries in Generative Societies
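AI Summary
This paper studies how agents in generative societies form stances, negotiate identities, and reconstruct boundaries.
Main Contributions
- Proposes a mixed-methods framework that combines virtual ethnography with quantitative socio-cognitive profiling
- Defines three new metrics: Innate Value Bias (IVB), Persuasion Sensitivity, and Trust-Action Decoupling (TAD)
- Exposes the fragility of static prompt engineering and provides a methodological and quantitative foundation for dynamic alignment in human-agent hybrid societies
Methodology
The study uses computational virtual ethnography: researchers are embedded in generative multi-agent communities, where they run controlled intervention experiments and trace how collective cognition evolves.
Original Abstract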
While large language models simulate social behaviors, their capacity for stable stance formation and identity negotiation during complex interventions remains unclear. To overcome the limitations of static evaluations, this paper proposes a novel mixed-methods framework combining computational virtual ethnography with quantitative socio-cognitive profiling. By embedding human researchers into generative multi-agent communities, controlled discursive interventions are conducted to trace the evolution of collective cognition. To rigorously measure how agents internalize and react to these interventions, this paper formalizes three new metrics: Innate Value Bias (IVB), Persuasion Sensitivity, and Trust-Action Decoupling (TAD). Across multiple representative models, agents exhibit endogenous stances that override preset identities, consistently demonstrating an innate progressive bias (IVB > 0). When aligned with these stances, rational persuasion successfully shifts 90% of neutral agents while maintaining high trust. In contrast, conflicting emotional provocations induce a paradoxical 40.0% TAD rate in advanced models, which hypocritically alter their stances despite reporting low trust. Smaller models, by contrast, maintain a 0% TAD rate, strictly requiring trust for behavioral shifts. Furthermore, guided by shared stances, agents use language interactions to actively dismantle assigned power hierarchies and reconstruct self-organized community boundaries. These findings expose the fragility of static prompt engineering, providing a methodological and quantitative foundation for dynamic alignment in human-agent hybrid societies. The official code is available at: https://github.com/armihia/CMASE-Endogenous-Stances
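The abstract describes Trust-Action Decoupling as agents changing their stance even though they report low trust, but it does not give the formal definition. The sketch below is one plausible reading, not the paper's formula: it counts an agent as decoupled when its stance moves by at least a threshold while its self-reported trust stays below a cutoff, and divides by all agents exposed to the intervention. The `AgentOutcome` record, the stance and trust scales, both thresholds, and the choice of denominator are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class AgentOutcome:
    """Hypothetical per-agent record; field names and scales are assumptions."""
    stance_before: float  # stance score before the intervention
    stance_after: float   # stance score after the intervention
    trust: float          # self-reported trust in the intervening speaker, 0..1


def tad_rate(
    outcomes: Sequence[AgentOutcome],
    shift_threshold: float = 0.5,  # assumed minimum change that counts as a shift
    trust_threshold: float = 0.5,  # assumed cutoff below which trust is "low"
) -> float:
    """Share of agents that shifted stance while reporting low trust.

    Assumption: the denominator is every agent exposed to the intervention.
    """
    if not outcomes:
        return 0.0
    decoupled = sum(
        1
        for o in outcomes
        if abs(o.stance_after - o.stance_before) >= shift_threshold
        and o.trust < trust_threshold
    )
    return decoupled / len(outcomes)


# Toy data, invented for illustration only (not results from the paper).
toy = [
    AgentOutcome(0.0, 1.0, 0.2),  # shifted, low trust: decoupled
    AgentOutcome(0.0, 1.0, 0.1),  # shifted, low trust: decoupled
    AgentOutcome(0.0, 1.0, 0.9),  # shifted, high trust: not decoupled
    AgentOutcome(0.0, 0.0, 0.2),  # no shift, low trust: not decoupled
    AgentOutcome(0.0, 0.1, 0.8),  # no meaningful shift: not decoupled
]
print(tad_rate(toy))
```

Under this reading, a model whose agents only shift when trust is high scores 0, which matches the behavior the abstract reports for smaller models. A different denominator (for example, only agents that shifted) would change the number, so the paper and the official repository remain the reference for the actual definition.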