Journal article
Latent diffusion for medical image segmentation: End-to-end learning for fast sampling and accuracy
Biomedical signal processing and control, Vol.114, 109380
04/01/2026
DOI: 10.1016/j.bspc.2025.109380
Abstract
Diffusion Probabilistic Models (DPMs) suffer from inefficient inference due to their slow sampling and high memory consumption, which restricts their utility in 3D/4D medical image segmentation. We introduce a latent diffusion framework (LDSeg) for medical image segmentation, where the conditional distribution of the labels is learned in an end-to-end fashion. The latent formulation not only ensures accurate image segmentation for multiple interacting objects but also addresses the fundamental issues of traditional DPM-based segmentation methods: (1) high memory consumption, (2) time-consuming sampling process, and (3) discrete labels, restricting the direct application of denoising score matching that is designed for continuous domain signals. The main contribution is the end-to-end training strategy, which trains all four modules of LDSeg by minimizing a single loss function. This approach enables robust representation learning in the latent space related to segmentation features, ensuring significantly faster sampling from the posterior distribution for segmentation generation in the inference phase. Our experiments demonstrate that LDSeg achieves state-of-the-art segmentation accuracy on three medical image datasets with different imaging modalities. It is seen to outperform deterministic models significantly, while the computational complexity is comparable. In addition, the proposed model is significantly more robust to noise in the image, compared to traditional deterministic segmentation models. The code is available at https://github.com/FahimZaman/LDSeg.git.
• First end-to-end trained latent diffusion model for medical image segmentation.
• Enables faster sampling and lower complexity than traditional diffusion methods.
• Continuous label embeddings in latent space allow direct use of diffusion theory.
• Latent training reduces memory and boosts robustness to noisy inputs.
• Supports multi-class 2D/3D segmentation with uncertainty estimation.
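The abstract notes that discrete labels prevent the direct use of denoising score matching, which is designed for continuous signals, and that LDSeg works in a latent space instead. As a rough illustration only, the sketch below applies the standard DDPM forward (noising) step to a continuous latent array. It is not taken from the LDSeg repository: the function names, the linear noise schedule, the step count, and the random array standing in for an encoded label map are all assumptions made for this example.

```python
import numpy as np


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear noise schedule (an assumed, commonly used choice)."""
    return np.linspace(beta_start, beta_end, num_steps)


def forward_diffuse(z0, t, alpha_bar, rng):
    """Standard DDPM forward step on a continuous latent z0.

    Samples z_t = sqrt(alpha_bar_t) * z0 + sqrt(1 - alpha_bar_t) * eps,
    with eps drawn from a standard normal. Returns (z_t, eps).
    """
    eps = rng.standard_normal(z0.shape)
    zt = np.sqrt(alpha_bar[t]) * z0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return zt, eps


rng = np.random.default_rng(0)
betas = linear_beta_schedule(1000)      # hypothetical number of steps
alpha_bar = np.cumprod(1.0 - betas)     # cumulative signal retention

# Hypothetical stand-in for a label map encoded into a continuous latent;
# a real model would obtain this from a learned label encoder.
z0 = rng.standard_normal((4, 8, 8))

zt, eps = forward_diffuse(z0, 500, alpha_bar, rng)
```

Because `z0` is real-valued, adding Gaussian noise is well defined at every step, which is the property the highlights attribute to continuous label embeddings. A denoiser would then be trained to predict `eps` from `zt`, the step index, and the image, which this sketch does not cover.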
Details
- Title: Subtitle
- Latent diffusion for medical image segmentation: End-to-end learning for fast sampling and accuracy
- Creators
- Fahim Ahmed Zaman - University of Iowa
- Mathews Jacob - Department of Electrical and Computer Engineering, University of Virginia, VA 22904, USA
- Amanda Chang - University of Iowa
- Kan Liu - Washington University in St. Louis School of Medicine
- Milan Sonka - University of Iowa
- Xiaodong Wu - Department of Electrical and Computer Engineering, The University of Iowa, Iowa City, IA 52242, USA
- Resource Type
- Journal article
- Publication Details
- Biomedical signal processing and control, Vol.114, 109380
- DOI
- 10.1016/j.bspc.2025.109380
- ISSN
- 1746-8094
- eISSN
- 1746-8108
- Publisher
- Elsevier Ltd
- Grant note
- National Institutes of Health: R01HL171624, R01AG067078, R01EB019961
Acknowledgments: This research was supported in part by National Institutes of Health Grants R01HL171624, R01AG067078, and R01EB019961.
- Language
- English
- Date published
- 04/01/2026
- Academic Unit
- Roy J. Carver Department of Biomedical Engineering; Electrical and Computer Engineering; Iowa Technology Institute; Radiation Oncology; The Iowa Institute for Biomedical Imaging; Fraternal Order of Eagles Diabetes Research Center; Injury Prevention Research Center; Ophthalmology and Visual Sciences
- Record Identifier
- 9985090653602771