Latent diffusion for medical image segmentation: End-to-end learning for fast sampling and accuracy

Fahim Ahmed Zaman; Mathews Jacob; Amanda Chang; Kan Liu; Milan Sonka; Xiaodong Wu

doi:10.1016/j.bspc.2025.109380

Journal article

Peer reviewed

Biomedical signal processing and control, Vol.114, 109380

04/01/2026

Abstract

Diffusion Probabilistic Models (DPMs) suffer from inefficient inference due to their slow sampling and high memory consumption, which restricts their utility in 3D/4D medical image segmentation. We introduce a latent diffusion framework (LDSeg) for medical image segmentation, where the conditional distribution of the labels is learned in an end-to-end fashion. The latent formulation not only ensures accurate image segmentation for multiple interacting objects but also addresses the fundamental issues of traditional DPM-based segmentation methods: (1) high memory consumption, (2) time-consuming sampling process, and (3) discrete labels, restricting the direct application of denoising score matching that is designed for continuous domain signals. The main contribution is the end-to-end training strategy, which trains all four modules of LDSeg by minimizing a single loss function. This approach enables robust representation learning in the latent space related to segmentation features, ensuring significantly faster sampling from the posterior distribution for segmentation generation in the inference phase. Our experiments demonstrate that LDSeg achieves state-of-the-art segmentation accuracy on three medical image datasets with different imaging modalities. It is seen to outperform deterministic models significantly, while the computational complexity is comparable. In addition, the proposed model is significantly more robust to noise in the image, compared to traditional deterministic segmentation models. The code is available at https://github.com/FahimZaman/LDSeg.git.

• First end-to-end trained latent diffusion model for medical image segmentation.
• Enables faster sampling and lower complexity than traditional diffusion methods.
• Continuous label embeddings in latent space allow direct use of diffusion theory.
• Latent training reduces memory and boosts robustness to noisy inputs.
• Supports multi-class 2D/3D segmentation with uncertainty estimation.
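
The abstract's point about discrete labels can be illustrated with a small sketch. It is not taken from the LDSeg repository: it only shows, under assumed settings, how an integer class label might be mapped to a continuous latent vector so that the standard Gaussian forward diffusion step, z_t = sqrt(abar_t) * z_0 + sqrt(1 - abar_t) * eps, can be applied to it. The function names, the random linear projection standing in for a learned label encoder, and the linear noise schedule are all hypothetical choices made for illustration.

```python
import math
import random


def one_hot(label, num_classes):
    """Encode an integer class label as a one-hot vector."""
    return [1.0 if k == label else 0.0 for k in range(num_classes)]


def embed(one_hot_vec, weights):
    """Project a one-hot vector to a continuous latent.

    Hypothetical stand-in for a learned label encoder; here it is
    just a fixed linear map (one weight row per latent dimension).
    """
    return [sum(w * x for w, x in zip(row, one_hot_vec)) for row in weights]


def alpha_bar(t, betas):
    """Cumulative product of (1 - beta_s) over the first t steps."""
    prod = 1.0
    for beta in betas[:t]:
        prod *= 1.0 - beta
    return prod


def forward_diffuse(z0, t, betas, rng):
    """Sample z_t from q(z_t | z_0) = N(sqrt(abar_t) z_0, (1 - abar_t) I)."""
    a = alpha_bar(t, betas)
    eps = [rng.gauss(0.0, 1.0) for _ in z0]
    zt = [math.sqrt(a) * z + math.sqrt(1.0 - a) * e for z, e in zip(z0, eps)]
    return zt, eps


# Assumed toy settings: 3 classes, 4 latent dimensions, 1000 steps,
# and a linear beta schedule from 1e-4 to 0.02.
rng = random.Random(0)
num_classes, latent_dim, T = 3, 4, 1000
betas = [1e-4 + (0.02 - 1e-4) * i / (T - 1) for i in range(T)]
weights = [[rng.uniform(-1.0, 1.0) for _ in range(num_classes)]
           for _ in range(latent_dim)]

z0 = embed(one_hot(2, num_classes), weights)   # continuous label latent
zt, eps = forward_diffuse(z0, 500, betas, rng)  # noised latent at t = 500
```

Because z_0 is continuous, a denoiser can be trained to predict eps from z_t in the usual way; how LDSeg actually builds and trains its four modules is described in the paper and its code, not here.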

Diffusion in latent space

Diffusion probabilistic model

Medical image segmentation

Details

Title: Latent diffusion for medical image segmentation: End-to-end learning for fast sampling and accuracy
Creators: Fahim Ahmed Zaman - University of Iowa
Mathews Jacob - Department of Electrical and Computer Engineering, University of Virginia, VA 22904, USA
Amanda Chang - University of Iowa
Kan Liu - Washington University in St. Louis School of Medicine
Milan Sonka - University of Iowa
Xiaodong Wu - Department of Electrical and Computer Engineering, The University of Iowa, Iowa City, IA 52242, USA
Resource Type: Journal article
Publication Details: Biomedical signal processing and control, Vol.114, 109380
DOI: 10.1016/j.bspc.2025.109380
ISSN: 1746-8094
eISSN: 1746-8108
Publisher: Elsevier Ltd
Grant note: National Institutes of Health: R01HL171624, R01AG067078, R01EB019961
Acknowledgments: This research was supported in part by National Institutes of Health Grants R01HL171624, R01AG067078 and R01EB019961.
Language: English
Date published: 04/01/2026
Academic Unit: Roy J. Carver Department of Biomedical Engineering; Electrical and Computer Engineering; Iowa Technology Institute; Radiation Oncology; The Iowa Institute for Biomedical Imaging; Fraternal Order of Eagles Diabetes Research Center; Injury Prevention Research Center; Ophthalmology and Visual Sciences
Record Identifier: 9985090653602771
