Boosting Clinical Outcome Prediction with Context-Aware Feature Imputation and Disentanglement
Conference proceeding

Lei Gong, Aidong Zhang and Kishlay Jha
Proceedings (IEEE International Conference on Data Mining), pp. 1234-1243
11/12/2025
DOI: 10.1109/ICDM65498.2025.00132

Abstract

Accurate prediction of patient outcomes from electronic health records (EHRs) is a fundamental data mining task with practical benefits for clinical decision support and healthcare resource allocation. Over the past few years, with the advent of large language models (LLMs), there has been increasing interest in training LLMs on EHR clinical notes to improve outcome prediction. Despite significant advances, existing approaches have two limitations. First, they largely model clinical notes as flat token sequences and overlook their intrinsic semi-structured organization into sections (e.g., History of Present Illness and Physical Exam). Second, most of them ignore the missing data that is prevalent in real-world EHR clinical notes. To address these challenges, we propose a novel approach that leverages the inherent structure of clinical notes both to impute missing sections and to learn the robust feature representations needed for outcome prediction. In particular, we propose a context-aware section imputation strategy that uses multi-head attention to infer the representations of missing sections from inter-section dependencies within the clinical note. Moreover, to learn disentangled feature representations, we impose orthogonality constraints across the section embeddings. Extensive experiments on multiple benchmark datasets for clinical outcome prediction show that the proposed approach achieves consistent improvements over strong baseline algorithms. The code has been released on GitHub at https://github.com/LeiGong0125Carrot/Strucure-Awared-Clinical-Note-Processing/tree/ICDM-2025
Data Mining; Electronic Health Records; clinical notes; clinical outcome prediction; Electronic medical records; Imputation; imputing data; Medical services; Organizations; Prediction algorithms; Redundancy; Resource management; Software development management; Training
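
The abstract names two mechanisms: attention-based imputation of missing section representations, and an orthogonality constraint across section embeddings. The sketch below is one possible reading of those ideas, not the authors' released implementation. It assumes NumPy, and the function names, the `query_tokens` placeholder queries, and the cosine-based penalty are all hypothetical choices made for illustration. To keep it short, the learned query/key/value projections of a full multi-head attention layer are omitted, so each head attends over a slice of the raw embeddings.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax; entries equal to -inf receive weight 0."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def impute_missing_sections(sections, present, query_tokens, num_heads):
    """Fill in missing section embeddings by attending over present ones.

    sections:     (S, D) array, one embedding per section of a note.
    present:      (S,) boolean mask, False where a section is missing.
    query_tokens: (S, D) array of per-section placeholder queries
                  (hypothetical; in a trained model these would be learned).
    num_heads:    number of attention heads; must divide D.
    """
    S, D = sections.shape
    if D % num_heads != 0:
        raise ValueError("embedding size must be divisible by num_heads")
    if not present.any():
        raise ValueError("at least one section must be present")
    d_h = D // num_heads

    # Zero out missing rows so placeholder values never leak into the output.
    kv = np.where(present[:, None], sections, 0.0)

    # Split the embedding dimension into heads: (H, S, d_h).
    q = query_tokens.reshape(S, num_heads, d_h).transpose(1, 0, 2)
    k = kv.reshape(S, num_heads, d_h).transpose(1, 0, 2)
    v = k

    # Scaled dot-product scores, with missing sections masked out as keys.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_h)        # (H, S, S)
    scores = np.where(present[None, None, :], scores, -np.inf)
    weights = softmax(scores, axis=-1)

    # Merge the heads back to (S, D) and keep observed sections unchanged.
    attended = (weights @ v).transpose(1, 0, 2).reshape(S, D)
    return np.where(present[:, None], sections, attended)


def orthogonality_penalty(sections):
    """Sum of squared pairwise cosine similarities between distinct sections.

    The value is 0 when all section embeddings are mutually orthogonal.
    """
    norms = np.linalg.norm(sections, axis=1, keepdims=True)
    unit = sections / np.clip(norms, 1e-12, None)
    gram = unit @ unit.T
    off_diagonal = gram - np.diag(np.diag(gram))
    return float((off_diagonal ** 2).sum())
```

In a training loop, a penalty of this kind would typically be added to the outcome-prediction loss with a weighting coefficient, so that section embeddings are pushed toward encoding non-redundant information. How the paper weights or formulates the constraint is not stated in this record.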