Disentangled Contrastive Representation Learning for Zero-Shot Biomedical Text Classification
Conference proceeding

Ratri Mukherjee, Shailesh Dahal and Kishlay Jha
Proceedings (IEEE International Conference on Data Mining), pp.1425-1434
11/12/2025
DOI: 10.1109/ICDM65498.2025.00152

Abstract

Zero-shot biomedical text classification requires accurate assignment of biomedical text (e.g., a scientific abstract) to previously unseen labels (or concepts). Existing methods often struggle to generalize to novel concepts such as new diseases, drugs, and genes. To address this limitation, we propose a framework that combines feature disentanglement with contrastive learning. It separates each abstract into a content representation relevant for classification and a variance representation. This disentanglement ensures that the content features are invariant to the writing style, improving generalization to unseen labels. A contrastive learning strategy further structures the latent space by encouraging semantic clustering and separation of categories. Moreover, we model intra-class variance as a shared distribution across labels to enable variational data augmentation, enhancing robustness. The framework uses a domain-specific biomedical language model for feature extraction and fixed label anchors for semantic alignment. In extensive experiments on the largest available biomedical corpus, the framework achieves superior performance on zero-shot multi-label classification tasks by learning discriminative and style-invariant representations.
Semantics; Accuracy; Biological system modeling; biomedical multi-label text classification; Contrastive learning; Multi label classification; Robustness; Text categorization; Training; Writing; Zero shot learning
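
The abstract describes the approach only at a high level, so the following is a minimal illustrative sketch of the general idea, not the authors' implementation. It assumes hypothetical linear heads (`W_content`, `W_variance`) that split an encoder embedding into content and variance parts, an InfoNCE-style multi-label loss that pulls content vectors toward fixed label anchors, and cosine similarity for scoring unseen labels. Random arrays stand in for the biomedical language model's output and for the label anchors; every function name and parameter here is an assumption made for illustration.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def split_content_variance(h, W_content, W_variance):
    """Project encoder embeddings into two separate latent vectors.

    The two linear heads are hypothetical stand-ins for a disentangling module.
    """
    return h @ W_content, h @ W_variance

def anchor_contrastive_loss(content, anchors, labels, temperature=0.1):
    """InfoNCE-style multi-label loss against fixed label anchors (assumed form).

    content: (batch, d) content vectors
    anchors: (num_labels, d) fixed label anchor vectors
    labels:  (batch, num_labels) multi-hot matrix with entries in {0, 1}
    """
    sims = l2_normalize(content) @ l2_normalize(anchors).T / temperature
    sims = sims - sims.max(axis=1, keepdims=True)  # shift for numerical stability
    log_probs = sims - np.log(np.exp(sims).sum(axis=1, keepdims=True))
    positives = np.maximum(labels.sum(axis=1), 1)  # avoid dividing by zero
    return float(-((labels * log_probs).sum(axis=1) / positives).mean())

def zero_shot_scores(content, unseen_anchors):
    """Score each document against anchors of labels not seen in training."""
    return l2_normalize(content) @ l2_normalize(unseen_anchors).T

# Toy data: random arrays replace real encoder outputs and label anchors.
rng = np.random.default_rng(0)
h = rng.normal(size=(4, 16))               # 4 documents, 16-dim embeddings
W_content = rng.normal(size=(16, 8))       # hypothetical content head
W_variance = rng.normal(size=(16, 8))      # hypothetical variance head
seen_anchors = rng.normal(size=(5, 8))     # anchors for 5 training labels
unseen_anchors = rng.normal(size=(3, 8))   # anchors for 3 unseen labels
labels = (rng.random((4, 5)) < 0.4).astype(float)

content, variance = split_content_variance(h, W_content, W_variance)
loss = anchor_contrastive_loss(content, seen_anchors, labels)
scores = zero_shot_scores(content, unseen_anchors)
print("training loss:", loss)
print("zero-shot score matrix shape:", scores.shape)
```

In this sketch only the content vector is compared with the anchors, which mirrors the abstract's point that classification should rely on style-invariant content features; how the variance vector is modeled and used for augmentation is left out because the abstract does not specify it.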
