AI Agents Relevance: 9/10

Agent Skill Framework: Perspectives on the Potential of Small Language Models in Industrial Environments

Yangjie Xu, Lujun Li, Lama Sleem, Niccolo Gentile, Yewei Song, Yiqun Wang, Siming Ji, Wenbo Wu, Radu State
arXiv: 2602.16653v1 Published: 2026-02-18 Updated: 2026-02-18

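AI Summary

Examines how much the Agent Skill framework improves the performance of small language models, with a focus on its potential in industrial settings.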

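Main Contributions

  • Formally defines the Agent Skill process
  • Systematically evaluates language models of different sizes across multiple use cases
  • Characterizes the capabilities and constraints of Agent Skills in SLM settings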

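Methodology

Defines the Agent Skill process mathematically, then evaluates language models of different sizes on open-source tasks and a real-world dataset.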
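
To make the skill-selection step concrete, here is a minimal sketch of a two-stage flow: a skill is first chosen from short descriptions, and only then are its full instructions added to the context. It is not the paper's formal definition, which this summary does not reproduce. The names `Skill`, `keyword_selector`, `build_context`, and the two example skills are all hypothetical, and the word-overlap selector is only a toy stand-in for the language model that would make this choice in practice.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Skill:
    """Hypothetical skill record (not taken from the paper)."""
    name: str
    description: str   # short summary shown during selection
    instructions: str  # full body, loaded only after the skill is chosen


def keyword_selector(query: str, skills: List[Skill]) -> Optional[Skill]:
    """Toy stand-in for the model's selection step: pick by word overlap."""
    query_words = set(query.lower().split())
    best, best_score = None, 0
    for skill in skills:
        score = len(query_words & set(skill.description.lower().split()))
        if score > best_score:
            best, best_score = skill, score
    return best  # None when no description shares a word with the query


def build_context(
    query: str,
    skills: List[Skill],
    selector: Callable[[str, List[Skill]], Optional[Skill]] = keyword_selector,
) -> str:
    """Stage 1: select a skill. Stage 2: load its instructions into the prompt."""
    chosen = selector(query, skills)
    if chosen is None:
        return f"Task: {query}"
    return f"Skill: {chosen.name}\n{chosen.instructions}\n\nTask: {query}"


# Illustrative skills only; the paper's actual skills are not listed here.
SKILLS = [
    Skill(
        name="claims-triage",
        description="classify an insurance claim by type and urgency",
        instructions="1. Read the claim. 2. Assign a type. 3. Assign an urgency.",
    ),
    Skill(
        name="sql-report",
        description="write a sql query for a sales report",
        instructions="1. Identify the tables. 2. Write the query.",
    ),
]

if __name__ == "__main__":
    print(build_context("classify this insurance claim", SKILLS))
```

Replacing `keyword_selector` with a call to a language model gives the setting the abstract describes, where the reliability of that selection step is what tiny models reportedly struggle with.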

Original Abstract

The Agent Skill framework, now widely and officially supported by major players such as GitHub Copilot, LangChain, and OpenAI, performs especially well with proprietary models by improving context engineering, reducing hallucinations, and boosting task accuracy. Based on these observations, an investigation is conducted to determine whether the Agent Skill paradigm provides similar benefits to small language models (SLMs). This question matters in industrial scenarios where continuous reliance on public APIs is infeasible due to data-security and budget constraints, and where SLMs often show limited generalization in highly customized scenarios. This work introduces a formal mathematical definition of the Agent Skill process, followed by a systematic evaluation of language models of varying sizes across multiple use cases. The evaluation encompasses two open-source tasks and a real-world insurance claims data set. The results show that tiny models struggle with reliable skill selection, while moderately sized SLMs (approximately 12B-30B parameters) benefit substantially from the Agent Skill approach. Moreover, code-specialized variants at around 80B parameters achieve performance comparable to closed-source baselines while improving GPU efficiency. Collectively, these findings provide a comprehensive and nuanced characterization of the capabilities and constraints of the framework, while providing actionable insights for the effective deployment of Agent Skills in SLM-centered environments.

Tags

Agent Skill, Small Language Models, Industrial Applications

arXiv Category

cs.AI