LLM Reasoning Relevance: 8/10

Implicit Patterns in LLM-Based Binary Analysis

Qiang Li, XiangRui Zhang, Haining Wang
arXiv: 2603.19138v1 Published: 2026-03-19 Updated: 2026-03-19

AI Summary

Studies how implicit token-level patterns organize the exploration process in LLM-based binary analysis.

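Key Contributions

  • First large-scale, trace-level study of implicit patterns in LLM-based binary analysis
  • Identifies four dominant patterns: early pruning, path-dependent lock-in, targeted backtracking, and knowledge-guided prioritization
  • Systematically characterizes LLM-driven binary analysis, laying a foundation for more reliable analysis systems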

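Methodology

Analyzes reasoning traces from 521 binaries, totaling 99,563 reasoning steps, to identify and characterize the implicit token-level patterns that arise during LLM reasoning.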
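
As a quick sanity check on scale, the two figures above imply an average trace length in the low hundreds of steps, which matches the abstract's phrase "hundreds of reasoning steps". The sketch below only derives that mean. The paper summary does not report a per-binary distribution, so the function name and the assumption that a simple average is meaningful here are illustrative, not taken from the paper.

```python
# Figures quoted in the summary and abstract.
NUM_BINARIES = 521
NUM_REASONING_STEPS = 99_563


def mean_steps_per_binary(total_steps: int, num_binaries: int) -> float:
    """Return the arithmetic mean of reasoning steps per binary.

    Hypothetical helper for illustration; the source reports only the totals.
    """
    if num_binaries <= 0:
        raise ValueError("num_binaries must be positive")
    return total_steps / num_binaries


mean_steps = mean_steps_per_binary(NUM_REASONING_STEPS, NUM_BINARIES)
print(f"{mean_steps:.1f} reasoning steps per binary on average")
```

This is a derived figure of roughly 191 steps per binary. Individual traces may be much shorter or longer.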

Original Abstract

Binary vulnerability analysis is increasingly performed by LLM-based agents in an iterative, multi-pass manner, with the model as the core decision-maker. However, how such systems organize exploration over hundreds of reasoning steps remains poorly understood, due to limited context windows and implicit token-level behaviors. We present the first large-scale, trace-level study showing that multi-pass LLM reasoning gives rise to structured, token-level implicit patterns. Analyzing 521 binaries with 99,563 reasoning steps, we identify four dominant patterns: early pruning, path-dependent lock-in, targeted backtracking, and knowledge-guided prioritization that emerge implicitly from reasoning traces. These token-level implicit patterns serve as an abstraction of LLM reasoning: instead of explicit control-flow or predefined heuristics, exploration is organized through implicit decisions regulating path selection, commitment, and revision. Our analysis shows these patterns form a stable, structured system with distinct temporal roles and measurable characteristics. Our results provide the first systematic characterization of LLM-driven binary analysis and a foundation for more reliable analysis systems.

Tags

LLM, Binary Analysis, Vulnerability Analysis, Implicit Patterns, Reasoning Process

arXiv Categories

cs.AI cs.CR cs.SE