Multimodal Learning relevance: 8/10

Persistent Story World Simulation with Continuous Character Customization

Jinlu Zhang, Qiyun Wang, Baoxiang Du, Jiayi Ji, Jing He, Rongsheng Zhang, Tangjie Lv, Xiaoshuai Sun, Rongrong Ji
arXiv: 2603.16285v1 Published: 2026-03-17 Updated: 2026-03-17

AI Summary

EverTale enables persistent story-world simulation through continuous character customization, improving character consistency and visual storytelling quality.

Key Contributions

  • Propose the All-in-One-World Character Integrator
  • Introduce an MLLM-based Character Quality Gate
  • Design a Character-Aware Region-Focus Sampling strategy

Methodology

Combines LoRA, an MLLM judge, and region-focus sampling to enable continuous character customization and high-quality multi-character story generation.
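The "region-focus" idea can be illustrated with a minimal, hypothetical blending step. The paper's actual sampling strategy operates inside the diffusion process; this NumPy sketch only shows the local-vs-global harmonization, and the function name, box format, and `weight` parameter are all assumptions for illustration:

```python
import numpy as np

# Hypothetical illustration of harmonizing local character-specific detail
# with global scene context: inside each character's bounding box, blend a
# character-specific prediction into the global prediction. The boxes,
# `weight`, and function name are illustrative, not from the paper.

def region_focus_blend(global_pred, char_preds, char_boxes, weight=0.7):
    """Blend each per-character prediction into `global_pred` inside its
    (y0, y1, x0, x1) box; `weight` sets the local emphasis."""
    out = global_pred.copy()
    for pred, (y0, y1, x0, x1) in zip(char_preds, char_boxes):
        out[y0:y1, x0:x1] = (
            weight * pred[y0:y1, x0:x1] + (1.0 - weight) * out[y0:y1, x0:x1]
        )
    return out
```

Outside every box the global prediction is untouched, so only character regions are re-weighted toward their identity-specific details.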

Original Abstract

Story visualization has gained increasing attention in computer vision. However, current methods often fail to achieve a synergy between accurate character customization, semantic alignment, and continuous integration of new identities. To tackle this challenge, in this paper we present EverTale, a story world simulator for continuous story character customization. We first propose an All-in-One-World Character Integrator to achieve continuous character adaptation within a unified LoRA module, eliminating the need for the per-character optimization modules of previous methods. Then, we incorporate a Character Quality Gate via MLLM-as-Judge to ensure the fidelity of each character adaptation process through chain-of-thought reasoning, determining whether the model can proceed to the next character or requires additional training on the current one. We also introduce a Character-Aware Region-Focus Sampling strategy to address the identity degradation and layout conflicts in existing multi-character visual storytelling, ensuring natural multi-character generation by harmonizing local character-specific details with the global scene context at higher efficiency. Experimental results show that our EverTale achieves superior performance against a wide range of compared methods on both single- and multi-character story visualization. Code will be made available.
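The Character Quality Gate described in the abstract is essentially a training control loop: adapt a character, have an MLLM judge its fidelity, and either advance to the next character or keep training the current one. A minimal sketch of that loop follows; all names and numbers (`judge_fidelity`, `PASS_THRESHOLD`, `MAX_ROUNDS`) are assumptions, not from the paper:

```python
# Hypothetical sketch of the Character Quality Gate loop: after each
# adaptation round, an MLLM-as-Judge scores fidelity and the loop either
# proceeds to the next character or keeps training the current one.

PASS_THRESHOLD = 80  # assumed fidelity score (0-100) needed to pass the gate
MAX_ROUNDS = 3       # assumed cap on training rounds per character

def judge_fidelity(character: str, round_idx: int) -> int:
    # Stand-in for the chain-of-thought MLLM judge; the score improves
    # with each round so this sketch terminates deterministically.
    return 50 + 20 * round_idx

def train_with_quality_gate(characters):
    results = {}
    for name in characters:
        for round_idx in range(1, MAX_ROUNDS + 1):
            # A real system would update the shared LoRA module here.
            score = judge_fidelity(name, round_idx)
            if score >= PASS_THRESHOLD:
                break  # gate passed: move on to the next character
        results[name] = (round_idx, score)
    return results
```

The gate lets a single shared adapter absorb characters sequentially while guaranteeing each one reaches a fidelity bar before the next is introduced.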

Tags

Story Visualization · Character Customization · Multimodal Learning · MLLM · LoRA

arXiv Category

cs.CV