Conference proceeding
Boosting Clinical Outcome Prediction with Context-Aware Feature Imputation and Disentanglement
Proceedings (IEEE International Conference on Data Mining), pp.1234-1243
11/12/2025
DOI: 10.1109/ICDM65498.2025.00132
Abstract
Accurate prediction of patient outcomes from electronic health records (EHRs) is a fundamental task in data mining, with practical benefits for clinical decision support and healthcare resource allocation. Over the past few years, with the advent of large language models (LLMs), there has been increasing interest in training LLMs on EHR clinical notes to improve outcome prediction. Despite significant advances, existing approaches have two limitations. First, they largely model clinical notes as flat token sequences and overlook their intrinsic semi-structured organization into sections (e.g., History of Present Illness and Physical Exam). Second, most of them ignore the missing data that is prevalent in real-world EHR clinical notes. To address these challenges, we propose a novel approach that leverages the inherent structure of clinical notes to impute missing sections and learn the robust feature representations needed for outcome prediction. In particular, we propose a context-aware section imputation strategy that uses multi-head attention to infer missing section representations from inter-section dependencies within the clinical note. Moreover, to learn disentangled feature representations, we impose orthogonality constraints across the section embeddings. Extensive experiments on multiple benchmark datasets for clinical outcome prediction show that the proposed approach achieves consistent improvements over strong baseline algorithms. The code has been released on GitHub at https://github.com/LeiGong0125Carrot/Strucure-Awared-Clinical-Note-Processing/tree/ICDM-2025
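The two ideas named in the abstract, attention-based imputation of a missing section and an orthogonality constraint across section embeddings, can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that). The function names, the use of a per-section query vector, the omission of learned projection matrices, and the squared-cosine form of the penalty are all assumptions made here for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def impute_section(query, observed, num_heads=2):
    """Estimate a missing section embedding from the observed ones.

    Assumption: `query` is a (hypothetical) learned vector for the missing
    section type. For brevity there are no learned projections; each head
    simply attends over its own slice of the embedding dimensions.
    """
    dim = len(query)
    if dim % num_heads != 0:
        raise ValueError("embedding size must be divisible by num_heads")
    if not observed:
        raise ValueError("need at least one observed section")
    head_dim = dim // num_heads
    imputed = []
    for h in range(num_heads):
        lo, hi = h * head_dim, (h + 1) * head_dim
        q = query[lo:hi]
        keys = [emb[lo:hi] for emb in observed]
        # Scaled dot-product attention scores for this head.
        scores = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(head_dim)
            for k in keys
        ]
        weights = softmax(scores)
        # The head output is the attention-weighted average of the slices.
        for j in range(head_dim):
            imputed.append(sum(w * k[j] for w, k in zip(weights, keys)))
    return imputed


def orthogonality_penalty(embeddings):
    """Sum of squared cosine similarities between distinct embeddings.

    Assumption: one possible form of an orthogonality constraint. It is zero
    when all section embeddings are mutually orthogonal.
    """
    normed = []
    for emb in embeddings:
        norm = math.sqrt(sum(x * x for x in emb)) or 1.0
        normed.append([x / norm for x in emb])
    penalty = 0.0
    for i in range(len(normed)):
        for j in range(i + 1, len(normed)):
            cos = sum(a * b for a, b in zip(normed[i], normed[j]))
            penalty += cos * cos
    return penalty
```

In a trained model the penalty would typically be added to the prediction loss with a weighting coefficient, and the attention would use learned query, key, and value projections; both details are omitted here.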
Details
- Title: Subtitle
- Boosting Clinical Outcome Prediction with Context-Aware Feature Imputation and Disentanglement
- Creators
- Lei Gong - University of Virginia
- Aidong Zhang - University of Virginia
- Kishlay Jha - University of Iowa
- Resource Type
- Conference proceeding
- Publication Details
- Proceedings (IEEE International Conference on Data Mining), pp.1234-1243
- DOI
- 10.1109/ICDM65498.2025.00132
- eISSN
- 2374-8486
- Publisher
- IEEE
- Grant note
- R01LM014012-01A1, IIS-2106913, BIO2313865, SCH-2500341, SCH-2500344 / National Science Foundation (10.13039/100000001); NIH (10.13039/100000002)
- Language
- English
- Date published
- 11/12/2025
- Academic Unit
- Electrical and Computer Engineering
- Record Identifier
- 9985141959202771