Agent Tuning & Optimization Relevance: 8/10

PlotTwist: A Creative Plot Generation Framework with Small Language Models

Abhinav Thorat, Ravi Kolla, Jyotin Goel, Niranjan Pedanekar
arXiv: 2603.16410v1 Published: 2026-03-17 Updated: 2026-03-17

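AI Summary

PlotTwist combines a structured framework with preference alignment so that small language models can generate high-quality story plots.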

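Key Contributions

  • Proposes the PlotTwist framework, which decomposes plot generation into three specialized components
  • Designs a novel Positive-Negative Prompting strategy for training the reward model
  • Aligns a Mixture-of-Experts (MoE) plot generator via Direct Preference Optimization (DPO)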
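
For readers unfamiliar with DPO, the objective for a single preference pair can be sketched as follows. This is the standard published DPO loss, not code from the PlotTwist paper; the function name `dpo_loss`, the default `beta=0.1`, and the use of sequence-level log-probabilities as plain floats are assumptions made for illustration.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) pair.

    Inputs are sequence-level log-probabilities under the policy being
    trained and under a frozen reference model. beta is an assumed
    default, not a value taken from the paper.
    """
    # How much more the policy favors each response than the reference does.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_logratio - rejected_logratio)

    # -log(sigmoid(margin)) computed in a numerically stable way.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

When the policy and the reference agree, the loss equals log 2; it falls as the policy shifts probability toward the chosen plot and away from the rejected one.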

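Methodology

High-quality plot generation is achieved through three modules: reward-model rating, Mixture-of-Experts generation, and agentic evaluation.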
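
One way the reward-model module could feed the generator's alignment step is by turning aspect ratings into preference pairs and keeping only those with a clear score gap. The sketch below is hypothetical: the summary does not say how "high-confidence" pairs are selected, so averaging the five aspect scores, the `min_margin` threshold, and the function name `build_preference_pairs` are all assumptions.

```python
from itertools import combinations
from statistics import mean


def build_preference_pairs(candidates, min_margin=1.0):
    """Build (chosen, rejected) pairs from rated plots for one premise.

    candidates: list of (plot_text, aspect_scores), where aspect_scores
    holds one rating per narrative quality dimension.
    min_margin: assumed confidence threshold on the mean-score gap.
    """
    scored = [(text, mean(scores)) for text, scores in candidates]
    pairs = []
    for (text_a, score_a), (text_b, score_b) in combinations(scored, 2):
        gap = abs(score_a - score_b)
        if gap < min_margin:
            continue  # ratings too close: treat as low-confidence and skip
        if score_a > score_b:
            chosen, rejected = text_a, text_b
        else:
            chosen, rejected = text_b, text_a
        pairs.append({"chosen": chosen, "rejected": rejected, "margin": gap})
    return pairs
```

For example, with three candidate plots whose mean ratings are 5.0, 3.0, and 4.5 and `min_margin=1.0`, the 5.0 vs 4.5 comparison is dropped and the other two pairs are kept.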

Original Abstract

Creative plot generation presents a fundamental challenge for language models: transforming a concise premise into a coherent narrative that sustains global structure, character development, and emotional resonance. Although recent Large Language Models (LLMs) demonstrate strong fluency across general-purpose tasks, they typically require preference alignment to perform well on specialized domains such as creative plot generation. However, conducting such alignment at the scale of frontier LLMs is computationally prohibitive, significantly limiting accessibility and practical deployment. To address this, we present PlotTwist, a structured framework that enables Small Language Models (SLMs) with $\leq$ 5B active parameters to generate high-quality, premise-conditioned plots competitive with frontier systems up to $200\times$ larger. Our approach decomposes generation into three specialized components: (1) an Aspect Rating Reward Model trained via a novel Positive-Negative prompting strategy to deliver structured narratives across five Narrative Quality Dimensions (NQDs); (2) a Mixture-of-Experts (MoE) plot generator aligned via Direct Preference Optimization on high-confidence preference pairs; and (3) an Agentic Evaluation module that emulates human critical judgment for unbiased post-hoc assessment. Extensive experiments demonstrate that PlotTwist consistently outperforms frontier models across multiple NQDs despite substantially tighter capacity constraints. Further validation confirms strong sensitivity to narrative quality, as the framework reliably distinguishes plots derived from critically acclaimed versus widely panned screenplays. Together, these results establish structured, preference-based alignment as a resource-efficient approach to high-quality creative plot generation.

Tags

Story Generation, Small Language Models, Preference Alignment, Reward Model

arXiv Categories

cs.CL cs.AI