AI Agents relevance: 8/10

MobileKernelBench: Can LLMs Write Efficient Kernels for Mobile Devices?

Xingze Zou, Jing Wang, Yuhua Zheng, Xueyi Chen, Haolei Bai, Lingcheng Kong, Syed A. R. Abu-Bakar, Zhaode Wang, Chengfei Lv, Haoji Hu, Huan Wang
arXiv: 2603.11935v1 发布: 2026-03-12 更新: 2026-03-12

AI Summary

This paper studies the ability of LLMs to generate efficient kernels for mobile devices and proposes MoKA, a multi-agent system that improves kernel generation.

Key Contributions

  • Proposes MobileKernelBench, a benchmark framework for evaluating LLM-generated mobile kernels
  • Reveals the limitations of current LLMs in mobile kernel generation, such as high compilation failure rates and limited performance gains
  • Proposes MoKA, a multi-agent system that significantly improves kernel compilation success rates and performance

Methodology

Builds the MobileKernelBench benchmark to evaluate LLM kernel generation on MNN, and designs the MoKA multi-agent system to optimize it.

Original Abstract

Large language models (LLMs) have demonstrated remarkable capabilities in code generation, yet their potential for generating kernels specifically for mobile devices remains largely unexplored. In this work, we extend the scope of automated kernel generation to the mobile domain to investigate the central question: Can LLMs write efficient kernels for mobile devices? To enable systematic investigation, we introduce MobileKernelBench, a comprehensive evaluation framework comprising a benchmark prioritizing operator diversity and cross-framework interoperability, coupled with an automated pipeline that bridges the host-device gap for on-device verification. Leveraging this framework, we conduct extensive evaluation on the CPU backend of Mobile Neural Network (MNN), revealing that current LLMs struggle with the engineering complexity and data scarcity inherent to mobile frameworks; standard models and even fine-tuned variants exhibit high compilation failure rates (over 54%) and negligible performance gains due to hallucinations and a lack of domain-specific grounding. To overcome these limitations, we propose the Mobile Kernel Agent (MoKA), a multi-agent system equipped with repository-aware reasoning and a plan-and-execute paradigm. Validated on MobileKernelBench, MoKA achieves state-of-the-art performance, boosting compilation success to 93.7% and enabling 27.4% of generated kernels to deliver measurable speedups over native libraries.
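The abstract's headline metrics (93.7% compilation success, 27.4% of kernels with measurable speedups over native libraries) amount to simple aggregations over per-kernel outcomes from the on-device verification pipeline. The sketch below shows how such metrics could be computed; the `KernelResult` fields and `summarize` helper are illustrative assumptions, not the paper's actual harness.

```python
from dataclasses import dataclass

@dataclass
class KernelResult:
    """Hypothetical per-kernel outcome from the on-device eval pipeline."""
    compiled: bool                 # did the generated kernel compile?
    correct: bool = False          # did it pass on-device verification?
    llm_latency_ms: float = 0.0    # measured latency of the generated kernel
    native_latency_ms: float = 0.0 # latency of the native MNN implementation

def summarize(results: list[KernelResult]) -> dict[str, float]:
    """Aggregate benchmark-style metrics: compilation success rate and the
    fraction of kernels that are both correct and faster than native."""
    n = len(results)
    compiled = sum(r.compiled for r in results)
    speedups = sum(
        1 for r in results
        if r.compiled and r.correct and r.llm_latency_ms < r.native_latency_ms
    )
    return {"compile_rate": compiled / n, "speedup_rate": speedups / n}
```

A kernel only counts toward the speedup rate if it clears all three gates (compiles, verifies correct on-device, and beats the native latency), which is why the speedup rate is necessarily bounded by the compilation rate.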

Tags

LLM · code generation · mobile devices · kernel optimization · multi-agent

arXiv Categories

cs.LG cs.AI