Symptom-BERT: Enhancing Cancer Symptom Detection in EHR Clinical Notes
Journal article   Peer reviewed

Nahid Zeinali, Alaa Albashayreh, Weiguo Fan and Stephanie Gilbertson White
Journal of pain and symptom management, Vol.68(2), pp.190-198.e1
05/22/2024
DOI: 10.1016/j.jpainsymman.2024.05.015
PMCID: PMC12433187
PMID: 38789092
URL: https://pmc.ncbi.nlm.nih.gov/articles/PMC12433187/pdf/nihms-2105313.pdf
Open Access

Abstract

Extracting cancer symptom documentation allows clinicians to develop highly individualized symptom prediction algorithms to deliver symptom management care. Leveraging advanced language models to detect symptom data in clinical narratives can significantly enhance this process. This study uses a pre-trained large language model to detect and extract cancer symptoms in clinical notes. We developed a pre-trained language model to identify cancer symptoms in clinical notes based on a clinical corpus from the Enterprise Data Warehouse for Research at a healthcare system in the Midwestern United States. This study was conducted in 4 phases: (1) pre-training a Bio-Clinical BERT model on 1 million unlabeled clinical documents, (2) fine-tuning Symptom-BERT for detecting 13 cancer symptom groups within 1112 annotated clinical notes, (3) generating 180 synthetic clinical notes using ChatGPT-4 for external validation, and (4) comparing the internal and external performance of Symptom-BERT against a non-pre-trained version and six other BERT implementations. The Symptom-BERT model effectively detected cancer symptoms in clinical notes. It achieved a micro-averaged F1-score of 0.933 and an AUC of 0.929 internally, and 0.831 and 0.834, respectively, externally. Our analysis shows that physical symptoms, like Pruritus, are typically identified with higher performance than psychological symptoms, such as Anxiety. This study underscores the transformative potential of specialized pre-training on domain-specific data in boosting the performance of language models for medical applications. The Symptom-BERT model's exceptional efficacy in detecting cancer symptoms heralds a groundbreaking stride in patient-centered AI technologies, offering a promising path to elevate symptom management and cultivate superior patient self-care outcomes.
Keywords: Cancer symptoms; Large language model; Multiclassification; Natural language processing
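
The abstract reports a micro-averaged F1-score across symptom groups. As a rough illustration of what that metric measures, the sketch below pools true positives, false positives, and false negatives over every (note, symptom group) pair before computing F1. The function name `micro_f1` and the toy labels are hypothetical and are not taken from the article; the authors' actual evaluation code is not shown in this record.

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1 for multi-label data.

    Each row is one note; each column is a 0/1 indicator for one
    symptom group. Counts are pooled over all rows and columns.
    """
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t == 1 and p == 1:
                tp += 1      # symptom present and detected
            elif t == 0 and p == 1:
                fp += 1      # symptom absent but predicted
            elif t == 1 and p == 0:
                fn += 1      # symptom present but missed
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy data (assumption): 3 notes, 4 hypothetical symptom groups.
y_true = [[1, 0, 1, 0], [0, 1, 0, 0], [1, 1, 0, 1]]
y_pred = [[1, 0, 0, 0], [0, 1, 1, 0], [1, 1, 0, 1]]
score = micro_f1(y_true, y_pred)
print(round(score, 3))
```

Because counts are pooled before averaging, frequent symptom groups weigh more heavily in the score than rare ones, which is one reason per-symptom results (such as the Pruritus versus Anxiety contrast in the abstract) can differ from the pooled figure.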
