Agent Tuning & Optimization (relevance: 8/10)

CoFEH: LLM-driven Feature Engineering Empowered by Collaborative Bayesian Hyperparameter Optimization

Beicheng Xu, Keyao Ding, Wei Liu, Yupeng Lu, Bin Cui
arXiv: 2602.09851v1 Published: 2026-02-10 Updated: 2026-02-10

AI Summary

CoFEH proposes an LLM-based feature engineering framework that achieves end-to-end AutoML through collaborative Bayesian hyperparameter optimization.

Key Contributions

  • Proposes the CoFEH framework, which couples LLM-driven feature engineering with Bayesian optimization
  • Introduces Tree of Thought (ToT) to explore flexible feature engineering pipelines
  • Proposes a dynamic optimizer selector and a mutual conditioning mechanism that enable information sharing between the LLM and BO
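To make the ToT-based pipeline exploration concrete, here is a minimal, self-contained sketch of a beam-style Tree-of-Thought search over FE pipelines. The operator names, the toy `score` function, and the beam width are illustrative assumptions, not details from the paper; a real system would score candidates with validation performance and use an LLM to propose expansions.

```python
import random

random.seed(1)

# Hypothetical candidate FE operators (illustrative only).
OPS = ["log_transform", "poly_features", "target_encode", "drop_low_var"]

def score(pipeline):
    """Toy stand-in for validation performance of an FE pipeline."""
    base = 0.5 + 0.08 * ("log_transform" in pipeline)
    # Mildly penalize long pipelines; add noise to break ties.
    return base - 0.03 * max(0, len(pipeline) - 2) + random.uniform(-0.01, 0.01)

def tot_search(depth=3, beam=2):
    """Expand partial pipelines level by level, keeping the top-`beam` nodes."""
    frontier = [[]]                       # root thought: empty pipeline
    best = (score([]), [])
    for _ in range(depth):
        # Branch: append each unused operator to each frontier pipeline.
        children = [p + [op] for p in frontier for op in OPS if op not in p]
        scored = sorted(((score(p), p) for p in children), reverse=True)
        frontier = [p for _, p in scored[:beam]]   # prune to beam width
        if scored and scored[0][0] > best[0]:
            best = scored[0]
    return best

best_score, best_pipeline = tot_search()
print(best_score, best_pipeline)
```

Beam pruning is one simple instantiation of ToT's explore-and-prune idea; the paper's LLM-driven variant would generate and evaluate "thoughts" (pipeline steps) semantically rather than from a fixed operator list.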

Methodology

CoFEH combines the semantic reasoning of LLMs with Bayesian optimization, using dynamic scheduling and a mutual conditioning mechanism to jointly optimize feature engineering and hyperparameter optimization.
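The interleaved FE/HPO loop can be sketched as follows. This is a minimal mock-up under loud assumptions: the LLM proposer and the BO module are stubbed with random sampling, the objective is a toy function, and the dynamic optimizer selector is reduced to simple alternation. The shared `history` passed to both proposers illustrates the mutual conditioning idea (each optimizer sees the other's past decisions).

```python
import random

random.seed(0)

def propose_fe_pipeline(history):
    """Stub for the LLM-driven FE optimizer (ToT search in the paper)."""
    ops = ["log_transform", "poly_features", "target_encode", "drop_low_var"]
    return random.sample(ops, k=random.randint(1, 3))

def propose_hyperparams(history):
    """Stub for the BO module; random search stands in for a GP surrogate."""
    return {"lr": 10 ** random.uniform(-4, -1), "depth": random.randint(2, 8)}

def evaluate(pipeline, hp):
    """Toy objective coupling FE and HPO choices (illustrative only)."""
    score = 0.5 + 0.1 * ("log_transform" in pipeline) - 0.02 * abs(hp["depth"] - 5)
    return score + random.uniform(-0.01, 0.01)

def cofeh_loop(budget=10):
    history = []
    pipeline, hp = ["log_transform"], {"lr": 0.01, "depth": 5}
    best = (evaluate(pipeline, hp), pipeline, hp)
    for step in range(budget):
        # Dynamic optimizer selector: the paper schedules FE vs. HPO steps
        # adaptively; plain alternation stands in for that here.
        if step % 2 == 0:
            # Mutual conditioning: the FE proposer sees past HPO outcomes.
            pipeline = propose_fe_pipeline(history)
        else:
            # ...and the HPO proposer sees past FE outcomes.
            hp = propose_hyperparams(history)
        s = evaluate(pipeline, hp)
        history.append((pipeline, hp, s))
        if s > best[0]:
            best = (s, pipeline, hp)
    return best

best_score, best_pipe, best_hp = cofeh_loop()
print(best_score, best_pipe, best_hp)
```

The key structural point this illustrates is joint rather than "FE-then-HPO" optimization: both search spaces are revisited throughout the budget, so FE-HPO interactions can be exploited.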

Original Abstract

Feature Engineering (FE) is pivotal in automated machine learning (AutoML) but remains a bottleneck for traditional methods, which treat it as a black-box search, operating within rigid, predefined search spaces and lacking domain awareness. While Large Language Models (LLMs) offer a promising alternative by leveraging semantic reasoning to generate unbounded operators, existing methods fail to construct free-form FE pipelines, remaining confined to isolated subtasks such as feature generation. Most importantly, they are rarely optimized jointly with hyperparameter optimization (HPO) of the ML model, leading to greedy "FE-then-HPO" workflows that cannot capture strong FE-HPO interactions. In this paper, we present CoFEH, a collaborative framework that interleaves LLM-based FE and Bayesian HPO for robust end-to-end AutoML. CoFEH uses an LLM-driven FE optimizer powered by Tree of Thought (ToT) to explore flexible FE pipelines, a Bayesian optimization (BO) module to solve HPO, and a dynamic optimizer selector that realizes interleaved optimization by adaptively scheduling FE and HPO steps. Crucially, we introduce a mutual conditioning mechanism that shares context between LLM and BO, enabling mutually informed decisions. Experiments show that CoFEH not only outperforms traditional and LLM-based FE baselines, but also achieves superior end-to-end performance under joint optimization.

Tags

AutoML Feature Engineering Large Language Models Bayesian Optimization

arXiv Categories

cs.LG