LLM Reasoning relevance: 7/10

Text-to-Stage: Spatial Layouts from Long-form Narratives

Jefferson Hernandez, Swarnadeep Saha, Chenxi Whitehouse, Sanjeel Parekh, Calvin Murdock, Yuliang Li, W. Owen Brimijoin, Vamsi Krishna Ithapu, Ishwarya Ananthabhotla
arXiv: 2603.17832v1 Published: 2026-03-18 Updated: 2026-03-18

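AI Summary

The paper studies how well a language model can infer stage layouts from text, and proposes a training recipe and an evaluation method for the task.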

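Main Contributions

  • Proposes a method for generating stage layouts from unstructured text
  • Designs a dramaturgy-inspired, verifiable evaluation suite
  • Introduces a training strategy that combines rejection SFT with RL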

Methodology

The model is optimized by combining rejection SFT, using Best-of-N sampling, with RL from verifiable rewards via GRPO.
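
The two ingredients named above can be illustrated with a minimal Python sketch. It is not the authors' implementation: `speaker_coverage_reward` is a hypothetical stand-in for a verifiable reward, `best_of_n` and its `min_reward` threshold are illustrative names and values, and the advantage formula is the group-normalized form commonly associated with GRPO (reward minus group mean, divided by group standard deviation). The paper's actual rewards and hyperparameters are not given in this summary.

```python
import random
import statistics
from typing import Callable, List, Optional, Sequence, Tuple


def speaker_coverage_reward(layout_text: str, speakers: Sequence[str]) -> float:
    """Hypothetical verifiable reward: fraction of expected speakers named in the layout."""
    if not speakers:
        return 0.0
    return sum(name in layout_text for name in speakers) / len(speakers)


def best_of_n(
    prompt: str,
    sample_fn: Callable[[str], str],
    reward_fn: Callable[[str], float],
    n: int = 8,
    min_reward: float = 0.5,
) -> Optional[Tuple[str, float]]:
    """Draw n candidates and keep the highest-reward one.

    Returns None when even the best candidate scores below min_reward,
    so the prompt contributes no example to the SFT set (the rejection step).
    """
    scored = [(reward_fn(c), c) for c in (sample_fn(prompt) for _ in range(n))]
    best_reward, best = max(scored, key=lambda rc: rc[0])
    if best_reward < min_reward:
        return None
    return best, best_reward


def group_relative_advantages(rewards: Sequence[float], eps: float = 1e-8) -> List[float]:
    """Normalize rewards within one sampled group: (r - mean) / (std + eps)."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)  # population std; an assumption of this sketch
    return [(r - mean) / (std + eps) for r in rewards]


if __name__ == "__main__":
    rng = random.Random(0)
    pool = ["Alice stands left", "Alice left, Bob right", "An empty stage"]
    speakers = ["Alice", "Bob"]

    # Stage 1 (rejection SFT data): keep only the best sampled layout per prompt.
    kept = best_of_n(
        "narrative excerpt",
        sample_fn=lambda _prompt: rng.choice(pool),  # toy sampler, not a real model
        reward_fn=lambda c: speaker_coverage_reward(c, speakers),
        n=4,
    )
    print("kept for SFT:", kept)

    # Stage 2 (RL signal): score a group of samples relative to each other.
    group_rewards = [speaker_coverage_reward(c, speakers) for c in pool]
    print("advantages:", group_relative_advantages(group_rewards))
```

In this sketch the same reward function serves both stages: it filters samples when building the SFT set and supplies the per-group signal for the policy update. That is one natural reading of "verifiable rewards", not a confirmed detail of the paper.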

Original Abstract

In this work, we probe the ability of a language model to demonstrate spatial reasoning from unstructured text, mimicking human capabilities and automating a process that benefits many downstream media applications. Concretely, we study the narrative-to-play task: inferring stage-play layouts (scenes, speaker positions, movements, and room types) from text that lacks explicit spatial, positional, or relational cues. We then introduce a dramaturgy-inspired deterministic evaluation suite and, finally, a training and inference recipe that combines rejection SFT using Best-of-N sampling with RL from verifiable rewards via GRPO. Experiments on a text-only corpus of classical English literature demonstrate improvements over vanilla models across multiple metrics (character attribution, spatial plausibility, and movement economy), as well as alignment with an LLM-as-a-judge and subjective human preferences.

Tags

Language models, Spatial reasoning, Stage layout, Reinforcement learning

arXiv Categories

cs.CL cs.AI cs.LG