Conference proceeding
MedSkim: Denoised Health Risk Prediction via Skimming Medical Claims Data
2022 IEEE International Conference on Data Mining (ICDM), Vol.2022-, pp.81-90
11/2022
DOI: 10.1109/ICDM54844.2022.00018
Abstract
Health risk prediction is a challenging task that aims to predict whether patients will suffer from a certain disease or condition in the near future based on their historical EHR data. Although existing approaches achieve strong performance, none of them explicitly deals with the noise present in EHR data. In this paper, we hypothesize that automatically removing noise from EHR data should help models further improve their performance. Correspondingly, we propose a novel model named MedSkim, which automatically rules out irrelevant visits and codes by skimming through the EHR data. In particular, the proposed model has a code selection module that makes a skipping decision for each individual diagnosis code and then removes the target-irrelevant ones. A backward probing RNN (BPRNN) is designed to process the EHR data in reverse order and provide coarse-grained representation learning for visits. In addition, a forward skipping RNN (FSRNN) is proposed to read the EHR data in the forward direction and dynamically select important visits and codes based on the results of the previous two modules. Finally, the risk prediction module uses the output hidden states from the FSRNN to generate the final representation and make predictions. We also design an extra regularization term based on the skip rate of the model and combine it with the standard cross-entropy loss to train the model in an end-to-end setting. Experimental results show that MedSkim achieves the best performance on three real-world datasets compared with state-of-the-art baselines in terms of PR-AUC, F1, and Cohen's Kappa. Moreover, the ablation study and case study confirm that the proposed MedSkim is reasonable and effective for removing noise from EHR data. The source code of the proposed MedSkim is available at https://github.com/SH-Src/MedSkim
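The abstract describes a training objective that combines the standard cross-entropy loss with a regularization term based on the model's skip rate, but it does not give the exact form of that term. The following is a minimal sketch under stated assumptions, not the authors' implementation: the squared gap to a target skip rate, and the names `skip_rate`, `combined_loss`, `target_rate`, and `lam`, are hypothetical choices made for illustration only.

```python
import math


def binary_cross_entropy(probs, labels, eps=1e-12):
    """Mean binary cross-entropy over a batch of predicted probabilities."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(probs)


def skip_rate(skip_mask):
    """Fraction of items (visits or codes) flagged as skipped.

    skip_mask holds one list per patient; each entry is 1 (skipped) or 0 (kept).
    """
    flat = [s for patient in skip_mask for s in patient]
    return sum(flat) / len(flat)


def combined_loss(probs, labels, skip_mask, target_rate=0.5, lam=0.1):
    """Cross-entropy plus a skip-rate penalty.

    ASSUMPTION: the penalty is the squared gap between the observed skip rate
    and a target rate, weighted by lam. The abstract does not specify this.
    """
    reg = (skip_rate(skip_mask) - target_rate) ** 2
    return binary_cross_entropy(probs, labels) + lam * reg
```

In this sketch, `lam` trades prediction accuracy against how aggressively items are skipped; setting `lam=0` recovers plain cross-entropy training.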
Details
- Title: Subtitle
- MedSkim: Denoised Health Risk Prediction via Skimming Medical Claims Data
- Creators
- Suhan Cui - Pennsylvania State University; Junyu Luo - Pennsylvania State University; Muchao Ye - Pennsylvania State University; Jiaqi Wang - Pennsylvania State University; Ting Wang - Pennsylvania State University; Fenglong Ma - Pennsylvania State University
- Resource Type
- Conference proceeding
- Publication Details
- 2022 IEEE International Conference on Data Mining (ICDM), Vol.2022-, pp.81-90
- Publisher
- IEEE
- DOI
- 10.1109/ICDM54844.2022.00018
- ISSN
- 1550-4786
- eISSN
- 2374-8486
- Grant note
- National Institutes of Health (10.13039/100000002); National Science Foundation (10.13039/100000001)
- Language
- English
- Date published
- 11/2022
- Academic Unit
- Computer Science
- Record Identifier
- 9984696585102771