GeoSeg: Training-Free Reasoning-Driven Segmentation in Remote Sensing Imagery
AI Summary
GeoSeg proposes a training-free, reasoning-driven segmentation framework for remote sensing imagery that achieves accurate segmentation without any annotated data.
Key Contributions
- Proposes the GeoSeg framework, enabling zero-shot segmentation of remote sensing imagery
- Introduces bias-aware coordinate refinement to correct systematic localization bias
- Proposes a dual-route prompting mechanism that fuses semantic information with spatial cues
- Constructs GeoSeg-Bench, a benchmark test suite
Methodology
GeoSeg uses an MLLM for reasoning over the query, then combines bias-aware coordinate refinement with the dual-route prompting mechanism to segment the target in the remote sensing image.
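The bias-aware coordinate refinement step can be sketched in miniature: an MLLM's predicted bounding box is corrected by a systematic shift estimated from reference pairs, then clamped to the image bounds. This is a minimal illustration assuming a simple mean-offset correction on box centers; the function names and the exact form of the correction are assumptions, not the paper's method.

```python
# Hypothetical sketch of bias-aware coordinate refinement.
# Assumption: the grounding bias is a systematic (dx, dy) shift of box
# centers, estimated from a few predicted/reference box pairs. The real
# GeoSeg correction may differ.

def estimate_bias(pred_boxes, ref_boxes):
    """Mean (dx, dy) shift between predicted and reference box centers."""
    dxs, dys = [], []
    for (px1, py1, px2, py2), (rx1, ry1, rx2, ry2) in zip(pred_boxes, ref_boxes):
        dxs.append(((rx1 + rx2) - (px1 + px2)) / 2)  # center-x offset
        dys.append(((ry1 + ry2) - (py1 + py2)) / 2)  # center-y offset
    n = len(dxs)
    return sum(dxs) / n, sum(dys) / n

def refine_box(box, bias, width, height):
    """Shift a box by the estimated bias and clamp it to the image bounds."""
    dx, dy = bias
    x1, y1, x2, y2 = box
    clamp = lambda v, hi: max(0.0, min(v, hi))
    return (clamp(x1 + dx, width), clamp(y1 + dy, height),
            clamp(x2 + dx, width), clamp(y2 + dy, height))

bias = estimate_bias([(10, 10, 30, 30)], [(14, 12, 34, 32)])  # (4.0, 2.0)
refined = refine_box((50, 50, 80, 80), bias, 256, 256)
print(refined)  # → (54.0, 52.0, 84.0, 82.0)
```

The refined box (or points derived from it) could then serve as the spatial prompt for a downstream mask generator, alongside the semantic route of the dual-route prompting mechanism.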
Original Abstract
Recent advances in MLLMs are reframing segmentation from fixed-category prediction to instruction-grounded localization. While reasoning-based segmentation has progressed rapidly in natural scenes, remote sensing lacks a generalizable solution due to the prohibitive cost of reasoning-oriented data and domain-specific challenges like overhead viewpoints. We present GeoSeg, a zero-shot, training-free framework that bypasses the supervision bottleneck for reasoning-driven remote sensing segmentation. GeoSeg couples MLLM reasoning with precise localization via: (i) bias-aware coordinate refinement to correct systematic grounding shifts and (ii) a dual-route prompting mechanism to fuse semantic intent with fine-grained spatial cues. We also introduce GeoSeg-Bench, a diagnostic benchmark of 810 image–query pairs with hierarchical difficulty levels. Experiments show that GeoSeg consistently outperforms all baselines, with extensive ablations confirming the effectiveness and necessity of each component.