Multimodal Learning Relevance: 5/10

FSMC-Pose: Frequency and Spatial Fusion with Multiscale Self-calibration for Cattle Mounting Pose Estimation

Fangjing Li, Zhihai Wang, Xinxin Ding, Haiyang Liu, Ronghua Gao, Rong Wang, Yao Zhu, Ming Jin
arXiv: 2603.16596v1 Published: 2026-03-17 Updated: 2026-03-17

AI Summary

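FSMC-Pose improves the accuracy of cattle mounting pose estimation in complex environments through frequency-spatial fusion and multiscale self-calibration.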

Key Contributions

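  • Proposes CattleMountNet, a lightweight frequency-spatial fusion backbone that separates cattle from cluttered backgrounds
  • Designs SC2Head, a multiscale self-calibration head that reduces structural misalignment caused by inter-animal overlap
  • Builds the MOUNT-Cattle dataset, which contains 1176 mounting instances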
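
As a generic illustration of what "frequency-spatial fusion" can mean, the sketch below adds a high-pass-filtered copy of a small grid (the frequency branch) back onto the original grid (the spatial branch). This is not the authors' SFEBlock, whose internals are not described in this digest; the functions `dft2`, `high_pass`, and `fuse`, and the parameters `alpha` and `cutoff`, are hypothetical names chosen for the example.

```python
import cmath
import math


def dft2(grid, inverse=False):
    """Naive 2D discrete Fourier transform of a square grid (list of lists)."""
    n = len(grid)
    sign = 1 if inverse else -1
    out = [[0j] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0j
            for y in range(n):
                for x in range(n):
                    angle = sign * 2 * math.pi * (u * y + v * x) / n
                    total += grid[y][x] * cmath.exp(1j * angle)
            # Only the inverse transform is normalized by n * n.
            out[u][v] = total / (n * n) if inverse else total
    return out


def high_pass(grid, cutoff=1):
    """Frequency branch: zero every component within `cutoff` of the DC term."""
    n = len(grid)
    spectrum = dft2(grid)
    for u in range(n):
        for v in range(n):
            # Wrapped distance, so index n - 1 counts as frequency 1.
            fu, fv = min(u, n - u), min(v, n - v)
            if math.hypot(fu, fv) <= cutoff:
                spectrum[u][v] = 0j
    restored = dft2(spectrum, inverse=True)
    return [[value.real for value in row] for row in restored]


def fuse(grid, alpha=0.5, cutoff=1):
    """Fusion (assumed form): spatial input plus alpha times the high-frequency detail."""
    detail = high_pass(grid, cutoff)
    return [
        [s + alpha * d for s, d in zip(spatial_row, detail_row)]
        for spatial_row, detail_row in zip(grid, detail)
    ]


# A vertical edge: the fused grid exaggerates the step between the two halves.
edge = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
fused = fuse(edge, alpha=0.5, cutoff=0)
```

A flat region has no high-frequency content, so it passes through `fuse` unchanged, while edges such as an animal's outline are amplified. A learned block would replace the fixed mask and the scalar `alpha` with trainable weights.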

Methodology

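FSMC-Pose uses a top-down framework that pairs CattleMountNet for feature extraction with SC2Head for pose calibration. It is trained on MOUNT-Cattle combined with the public NWAFU-Cattle dataset.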
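
A top-down pose pipeline, in general, first detects each animal's bounding box, then estimates keypoints inside a resized crop of that box, and finally maps the keypoints back to full-image coordinates. The sketch below shows only that control flow and coordinate mapping. The `detect` and `estimate` callables, the `Box` type, and the 192x256 crop size are placeholders assumed for illustration; the digest does not state which detector or input resolution FSMC-Pose uses.

```python
from dataclasses import dataclass


@dataclass
class Box:
    x: float  # left edge in image pixels
    y: float  # top edge in image pixels
    w: float  # box width in image pixels
    h: float  # box height in image pixels


def crop_to_image(keypoints, box, crop_size):
    """Map (x, y) keypoints from resized-crop pixels back to full-image pixels."""
    crop_w, crop_h = crop_size
    scale_x, scale_y = box.w / crop_w, box.h / crop_h
    return [(box.x + kx * scale_x, box.y + ky * scale_y) for kx, ky in keypoints]


def top_down_pose(image, detect, estimate, crop_size=(192, 256)):
    """Run detection once, then pose estimation once per detected box."""
    poses = []
    for box in detect(image):
        crop_keypoints = estimate(image, box, crop_size)
        poses.append(crop_to_image(crop_keypoints, box, crop_size))
    return poses


# Stub stages standing in for a real detector and a real pose network.
def fake_detect(image):
    return [Box(x=100.0, y=50.0, w=96.0, h=128.0)]


def fake_estimate(image, box, crop_size):
    # Pretend the network placed one keypoint at the centre of the crop.
    return [(crop_size[0] / 2, crop_size[1] / 2)]


poses = top_down_pose(image=None, detect=fake_detect, estimate=fake_estimate)
```

Because each crop holds one target animal, the per-box stage is where overlap between animals causes trouble, which is the case the summary says SC2Head is meant to address.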

Original Abstract

Mounting posture is an important visual indicator of estrus in dairy cattle. However, achieving reliable mounting pose estimation in real-world environments remains challenging due to cluttered backgrounds and frequent inter-animal occlusion. We present FSMC-Pose, a top-down framework that integrates a lightweight frequency-spatial fusion backbone, CattleMountNet, and a multiscale self-calibration head, SC2Head. Specifically, we design two algorithmic components for CattleMountNet: the Spatial Frequency Enhancement Block (SFEBlock) and the Receptive Aggregation Block (RABlock). SFEBlock separates cattle from cluttered backgrounds, while RABlock captures multiscale contextual information. The Spatial-Channel Self-Calibration Head (SC2Head) attends to spatial and channel dependencies and introduces a self-calibration branch to mitigate structural misalignment under inter-animal overlap. We construct a mounting dataset, MOUNT-Cattle, covering 1176 mounting instances, which follows the COCO format and supports drop-in training across pose estimation models. Using a comprehensive dataset that combines MOUNT-Cattle with the public NWAFU-Cattle dataset, FSMC-Pose achieves higher accuracy than strong baselines, with markedly lower computational and parameter costs, while maintaining real-time inference on commodity GPUs. Extensive experiments and qualitative analyses show that FSMC-Pose effectively captures and estimates cattle mounting pose in complex and cluttered environments. Dataset and code are available at https://github.com/elianafang/FSMC-Pose.
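
The abstract notes that MOUNT-Cattle follows the COCO format. In COCO keypoint annotations, each instance stores a flat `keypoints` list of `(x, y, v)` triples, where `v` is 0 for unlabeled, 1 for labeled but not visible, and 2 for labeled and visible, and `num_keypoints` counts the triples with `v > 0`. The sketch below parses one such record. The keypoint names, file name, and coordinates are invented for illustration and are not taken from MOUNT-Cattle.

```python
import json

# Hypothetical keypoint names; the real MOUNT-Cattle schema may differ.
KEYPOINT_NAMES = ["nose", "withers", "tail_base"]

coco_text = json.dumps({
    "images": [
        {"id": 1, "file_name": "barn_0001.jpg", "width": 1920, "height": 1080}
    ],
    "categories": [{
        "id": 1,
        "name": "cattle",
        "keypoints": KEYPOINT_NAMES,
        "skeleton": [[1, 2], [2, 3]],  # 1-based index pairs into "keypoints"
    }],
    "annotations": [{
        "id": 10,
        "image_id": 1,
        "category_id": 1,
        "bbox": [400.0, 300.0, 600.0, 380.0],  # [x, y, width, height]
        "area": 228000.0,
        "iscrowd": 0,
        "keypoints": [420, 350, 2, 640, 320, 1, 0, 0, 0],
        "num_keypoints": 2,
    }],
})


def labeled_keypoints(annotation, names):
    """Return {name: (x, y, v)} for every keypoint with visibility v > 0."""
    flat = annotation["keypoints"]
    result = {}
    for index, name in enumerate(names):
        x, y, v = flat[3 * index: 3 * index + 3]
        if v > 0:
            result[name] = (x, y, v)
    return result


dataset = json.loads(coco_text)
names = dataset["categories"][0]["keypoints"]
for ann in dataset["annotations"]:
    points = labeled_keypoints(ann, names)
    # Sanity check: the labeled count should match the stored field.
    assert len(points) == ann["num_keypoints"]
```

Keeping to this layout is what lets a dataset be swapped into existing pose-estimation training code that already reads COCO keypoint files.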

Tags

Pose Estimation, Animal Pose Estimation, Frequency-Spatial Fusion, Self-Calibration

arXiv Categories

cs.CV cs.AI