SynPo: Boosting Training-Free Few-Shot Medical Segmentation via High-Quality Negative Prompts

1Xiamen University, 2Peking University, 3Kuaishou Inc.
MICCAI 2025 Early Accept

*Corresponding Author
Challenges in Few-Shot Medical Image Segmentation

Challenges in Training-Free Few-Shot Segmentation. (1) The general pipeline of point-promptable segmentation models for training-free few-shot segmentation. (2) Different Confidence Maps (C. Map) vs. Ground Truth. In the confidence map generated from DINOv2 features, irrelevant regions on the right are mistakenly identified as "similar", while the confidence map from SAM-ViT features shows less distinct value separation. Our synergy confidence map leverages the strengths of both while mitigating their respective weaknesses. (3) Pilot Experiment. Placing negative prompts outside the anatomical region, even with identical positive prompt locations, yields worse segmentation performance than placing them inside.

Abstract

The advent of Large Vision Models (LVMs) offers new opportunities for few-shot medical image segmentation. However, existing training-free methods based on LVMs fail to effectively utilize negative prompts, leading to poor performance on low-contrast medical images. To address this issue, we propose SynPo, a training-free few-shot method based on LVMs (e.g., SAM), with the core insight of improving the quality of negative prompts. To select point prompts from a more reliable confidence map, we design a novel Confidence Map Synergy Module that combines the strengths of DINOv2 and SAM. Based on this confidence map, we select the top-k pixels as the positive point set and choose the negative point set using a Gaussian distribution, followed by independent K-means clustering on each set. These selected points then serve as high-quality prompts for SAM to obtain the segmentation results. Extensive experiments demonstrate that SynPo achieves performance comparable to state-of-the-art training-based few-shot methods.
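As a rough illustration of this point-selection recipe, the sketch below picks the top-k pixels of a synergy map as positive candidates, draws negative candidates from pixels whose confidence falls near an assumed negative Gaussian N(mu_neg, sigma_neg), and reduces each set to cluster centers with K-means. The hyperparameters (k, the cluster count, the one-sigma band) and the helper names are illustrative assumptions, not the paper's exact settings.

# A minimal sketch of top-k positive / Gaussian negative point selection.
# Hyperparameters and the one-sigma negative band are assumptions for
# illustration; SynPo's exact thresholds may differ.
import numpy as np
from sklearn.cluster import KMeans

def select_points(syn_map, mu_neg, sigma_neg, k=200, n_clusters=5):
    """Return (positive, negative) prompt coordinates from a (H, W) synergy map.

    Assumes both candidate sets are non-empty; mu_neg and sigma_neg come from
    the modeled negative confidence distribution P_neg.
    """
    h, w = syn_map.shape
    flat = syn_map.ravel()

    # Positives: the k highest-confidence pixels.
    pos_idx = np.argsort(flat)[-k:]

    # Negatives: pixels within one std of the negative-confidence mean,
    # i.e. "confusable" background rather than trivially dissimilar regions.
    neg_idx = np.flatnonzero(np.abs(flat - mu_neg) < sigma_neg)

    def cluster_centers(idx):
        # Reduce a candidate set to a few representative (x, y) points.
        ys, xs = np.unravel_index(idx, (h, w))
        coords = np.stack([xs, ys], axis=1).astype(float)
        km = KMeans(n_clusters=min(n_clusters, len(coords)), n_init=10).fit(coords)
        return km.cluster_centers_

    return cluster_centers(pos_idx), cluster_centers(neg_idx)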

Pipeline of SynPo

SynPo, as shown in the figure below, consists of three key components: the Confidence Map Synergy Module (CMSM), the Point Selection Module (PSM), and the Noise-aware Refine Module (NRM). Given a support-query pair, SynPo first extracts zero-shot visual features using pre-trained vision models (SAM-ViT and DINOv2). In the CMSM, the feature maps, together with the support mask \( \mathcal{M}_S \in \mathbb{R}^{H \times W} \), are used to compute the synergy map \( SynMap \in \mathbb{R}^{H \times W} \) and to model the negative confidence distribution \( P_{neg} \), both of which guide prompt generation. In the PSM, the pixels of the synergy map are sorted by confidence score into a ranked list, which, together with the confidence distribution, determines the selection of point prompts. Finally, the generated point prompts and the query image \( I_Q \) are fed into SAM to predict the coarse mask \( \mathcal{M}_{coarse} \in \mathbb{R}^{H \times W} \), which the NRM then refines.
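The synergy-map computation in the CMSM can be sketched as a masked-prototype cosine similarity per backbone, followed by a weighted blend. The min-max normalization and the fusion weight alpha below are assumptions for illustration; the paper's exact formulation may differ.

# A minimal sketch of confidence-map synergy, assuming the SAM-ViT and DINOv2
# feature maps have already been extracted and resized to a common H x W grid.
# The fusion weight `alpha` and the min-max normalization are illustrative
# assumptions, not the paper's exact formulation.
import torch
import torch.nn.functional as F

def confidence_map(feat_q, feat_s, mask_s):
    """Cosine similarity between each query pixel and the masked support prototype.

    feat_q, feat_s: (C, H, W) feature maps; mask_s: (H, W) binary support mask.
    """
    proto = (feat_s * mask_s).sum(dim=(1, 2)) / mask_s.sum().clamp(min=1)  # (C,)
    sim = F.cosine_similarity(feat_q, proto[:, None, None], dim=0)         # (H, W)
    return (sim - sim.min()) / (sim.max() - sim.min() + 1e-8)              # scale to [0, 1]

def synergy_map(feat_q_sam, feat_s_sam, feat_q_dino, feat_s_dino, mask_s, alpha=0.5):
    """Blend the SAM-ViT and DINOv2 confidence maps into one synergy map."""
    c_sam = confidence_map(feat_q_sam, feat_s_sam, mask_s)
    c_dino = confidence_map(feat_q_dino, feat_s_dino, mask_s)
    return alpha * c_dino + (1 - alpha) * c_sam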

SynPo Pipeline Illustration

(1) Overview of the SynPo Architecture. SynPo processes a support-query pair through three key modules: the Confidence Map Synergy Module (CMSM), the Point Selection Module (PSM), and the Noise-aware Refine Module (NRM). Frozen vision models (SAM-ViT and DINOv2) extract visual features, and a frozen prompt-based segmentation model (e.g., SAM) performs mask prediction. (2) Illustration of Confidence Map Synergy. The synergy map and confidence statistics guide the PSM in selecting positive and negative point prompts through ranking and filtering. (3) Point Selection Module Diagram. The selected prompts drive an initial segmentation via SAM, which the NRM then refines into a more accurate mask prediction.
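Once the point prompts are selected, feeding them to a promptable segmentation model follows the public segment_anything API: positive points get label 1 and negative points label 0. The snippet below is a sketch; pos_pts, neg_pts, and query_image are placeholders, and the NRM refinement stage is omitted.

# Sketch of the SAM prompting step using the public `segment_anything` API.
# `pos_pts`, `neg_pts`, and `query_image` are placeholders for the outputs of
# the PSM and the query loader; NRM refinement is not shown.
import numpy as np
from segment_anything import sam_model_registry, SamPredictor

sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
predictor = SamPredictor(sam)
predictor.set_image(query_image)  # query image as an (H, W, 3) uint8 RGB array

point_coords = np.concatenate([pos_pts, neg_pts], axis=0)   # (N, 2) in (x, y) order
point_labels = np.concatenate([np.ones(len(pos_pts)),       # 1 = positive prompt
                               np.zeros(len(neg_pts))])     # 0 = negative prompt

masks, scores, _ = predictor.predict(point_coords=point_coords,
                                     point_labels=point_labels,
                                     multimask_output=False)
coarse_mask = masks[0]  # the coarse mask, to be refined by the NRM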

Experimental Results

Poster

BibTeX


@misc{liu2025synpo,
  title={SynPo: Boosting Training-Free Few-Shot Medical Segmentation via High-Quality Negative Prompts},
  author={Yufei Liu and Haoke Xiao and Jiaxing Chai and Yongcun Zhang and Rong Wang and Zijie Meng and Zhiming Luo},
  year={2025},
  eprint={2506.15153},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2506.15153},
}