LangMARL: Natural Language Multi-Agent Reinforcement Learning
AI Summary
LangMARL brings MARL-style credit assignment and policy gradients into the language space, improving the performance of LLM agents on multi-agent tasks.
Main Contributions
- Proposes agent-level language credit assignment
- Pioneers policy gradient evolution in language space
- Summarizes task-relevant causal relations from replayed trajectories to provide dense feedback
Methodology
LangMARL improves cooperative policy learning for LLM agents through three components: language credit assignment, policy gradient evolution in language space, and trajectory summarization of causal relations, which together provide dense feedback under sparse rewards.
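The components above can be sketched in code. This is a speculative illustration, not the paper's actual implementation: the `Agent` class, the `critic`/`optimizer` callables (stand-ins for LLM calls), and all prompts are hypothetical. The idea it illustrates is that each agent's "policy" is a natural-language prompt, a critic LLM attributes the global outcome to each agent's actions (credit assignment), and an optimizer LLM rewrites the policy prompt using that textual feedback (the language-space analogue of a gradient step).

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class Agent:
    name: str
    policy_prompt: str                  # the agent's "policy" lives in language space
    feedback: List[str] = field(default_factory=list)

def assign_language_credit(trajectory: List[Dict], outcome: str,
                           critic: Callable[[str], str],
                           agents: List[Agent]) -> None:
    """Agent-level language credit assignment (sketch): ask a critic LLM to
    attribute the coarse global outcome to each agent's own actions."""
    for agent in agents:
        actions = [s["action"] for s in trajectory if s["agent"] == agent.name]
        prompt = (f"Global outcome: {outcome}\n"
                  f"Actions taken by {agent.name}: {actions}\n"
                  f"Explain how this agent's actions contributed to the outcome.")
        agent.feedback.append(critic(prompt))

def language_gradient_step(agent: Agent,
                           optimizer: Callable[[str], str]) -> None:
    """Language-space analogue of a policy gradient update: rewrite the policy
    prompt in the direction suggested by accumulated textual feedback."""
    prompt = (f"Current policy: {agent.policy_prompt}\n"
              f"Feedback: {agent.feedback}\n"
              f"Rewrite the policy to address the identified issues.")
    agent.policy_prompt = optimizer(prompt)
```

In a real system, `critic` and `optimizer` would wrap LLM API calls; here they are left abstract so the control flow of "credit, then update" is visible on its own.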
Original Abstract
Large language model (LLM) agents struggle to autonomously evolve coordination strategies in dynamic environments, largely because coarse global outcomes obscure the causal signals needed for local policy refinement. We identify this bottleneck as a multi-agent credit assignment problem, which has long been studied in classical multi-agent reinforcement learning (MARL) but remains underaddressed in LLM-based systems. Building on this observation, we propose LangMARL, a framework that brings credit assignment and policy gradient evolution from cooperative MARL into the language space. LangMARL introduces agent-level language credit assignment, pioneers gradient evolution in language space for policy improvement, and summarizes task-relevant causal relations from replayed trajectories to provide dense feedback and improve convergence under sparse rewards. Extensive experiments across diverse cooperative multi-agent tasks demonstrate improved sample efficiency, interpretability, and strong generalization.