AI Agents 相关度: 7/10

Channel-Adaptive Edge AI: Maximizing Inference Throughput by Adapting Computational Complexity to Channel States

Jierui Zhang, Jianhao Huang, Kaibin Huang

arXiv: 2603.03146v1 发布: 2026-03-03 更新: 2026-03-03

下载 PDF arXiv 页面

AI 摘要

提出了一种信道自适应AI算法，通过调整计算复杂度来最大化边缘推理吞吐量。

主要贡献

提出了端到端推理精度的可追踪分析模型
设计了信道自适应AI算法，最大化边缘处理速率
联合优化传输端特征压缩和接收端模型复杂度

方法论

使用MvM分布表征高维特征分布，推导出精度与量化比特宽度和模型深度的闭式表达式，解决EPR最大化问题。

原文摘要

\emph{Integrated communication and computation} (IC$^2$) has emerged as a new paradigm for enabling efficient edge inference in sixth-generation (6G) networks. However, the design of IC$^2$ technologies is hindered by the lack of a tractable theoretical framework for characterizing \emph{end-to-end} (E2E) inference performance. The metric is highly complicated as it needs to account for both channel distortion and artificial intelligence (AI) model architecture and computational complexity. In this work, we address this challenge by developing a tractable analytical model for E2E inference accuracy and leveraging it to design a \emph{channel-adaptive AI} algorithm that maximizes inference throughput, referred to as the edge processing rate (EPR), under latency and accuracy constraints. Specifically, we consider an edge inference system in which a server deploys a backbone model with early exit, which enables flexible computational complexity, to perform inference on data features transmitted by a mobile device. The proposed accuracy model characterizes high-dimensional feature distributions in the angular domain using a Mixture of von Mises (MvM) distribution. This leads to a desired closed-form expression for inference accuracy as a function of quantization bit-width and model traversal depth, which represents channel distortion and computational complexity, respectively. Building upon this accuracy model, we formulate and solve the EPR maximization problem under joint latency and accuracy constraints, leading to a channel-adaptive AI algorithm that achieves full IC$^2$ integration. The proposed algorithm jointly adapts transmit-side feature compression and receive-side model complexity according to channel conditions to maximize overall efficiency and inference throughput. Experimental results demonstrate its superior performance as compared with fixed-complexity counterparts.

arXiv 分类

cs.IT cs.AI cs.LG cs.NI

AI 摘要

主要贡献

方法论

原文摘要

标签

arXiv 分类