Knowledge-Guided Efficient Representation Learning for Biomedical Domain
Conference proceeding

Kishlay Jha, Guangxu Xun, Nan Du and Aidong Zhang
KDD '21: Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, pp. 3077-3085
01/01/2021
DOI: 10.1145/3447548.3467118

Abstract

Pre-trained concept representations are essential to many biomedical text mining and natural language processing tasks. As such, various representation learning approaches have been proposed in the literature. More recently, contextualized embedding approaches (e.g., BERT-based models) that capture the implicit semantics of concepts at a granular level have significantly outperformed the conventional word embedding approaches (e.g., Word2Vec- and GloVe-based models). Despite these significant accuracy gains, such approaches are often computationally expensive and memory-inefficient. To address this issue, we propose a new representation learning approach that efficiently adapts the concept representations to newly available data. Specifically, the proposed approach develops a knowledge-guided continual learning strategy wherein the accurate and stable context information present in human-curated knowledge bases is exploited to continually identify and retrain the representations of those concepts whose corpus-based context has evolved coherently over time. Unlike previous studies, which mainly leverage curated knowledge to improve the accuracy of embedding models, the proposed research explores the usefulness of semantic knowledge for improving the training efficiency of embedding models. Comprehensive experiments under various efficiency constraints demonstrate that the proposed approach significantly improves the computational performance of biomedical word embedding models.
Keywords: system; continual learning; biomedical domain; corpus; representation learning
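
The selective retraining idea described in the abstract can be illustrated with a rough sketch. This is not the authors' algorithm: the co-occurrence window, the cosine-based drift score, the threshold, and the rule that drift only counts when the new context contains a knowledge-base neighbour are all assumptions made for illustration, and every function and parameter name below is hypothetical.

```python
from collections import Counter
from math import sqrt


def context_counts(sentences, window=2):
    """Map each token to a Counter of tokens seen within `window` positions of it."""
    contexts = {}
    for tokens in sentences:
        for i, target in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            neighbours = tokens[lo:i] + tokens[i + 1:hi]
            contexts.setdefault(target, Counter()).update(neighbours)
    return contexts


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def select_concepts_to_retrain(old_sentences, new_sentences, kb_neighbours,
                               drift_threshold=0.5):
    """Return the knowledge-base concepts whose corpus context appears to have shifted.

    ASSUMPTION (not from the paper): a concept is selected when
      1. it occurs in the new data,
      2. its new context differs enough from its old context
         (1 - cosine similarity > drift_threshold), or it was never seen before, and
      3. at least one of its curated knowledge-base neighbours occurs in the new
         context, which stands in here for the drift being "coherent".
    """
    old_ctx = context_counts(old_sentences)
    new_ctx = context_counts(new_sentences)
    selected = []
    for concept, neighbours in kb_neighbours.items():
        new_counts = new_ctx.get(concept)
        if new_counts is None:
            continue  # no new evidence for this concept, keep its old representation
        kb_supported = any(n in new_counts for n in neighbours)
        old_counts = old_ctx.get(concept)
        drifted = old_counts is None or 1.0 - cosine(old_counts, new_counts) > drift_threshold
        if drifted and kb_supported:
            selected.append(concept)
    return sorted(selected)


if __name__ == "__main__":
    # Toy data, invented for this example only.
    old = [["aspirin", "treats", "headache"], ["insulin", "treats", "diabetes"]]
    new = [["aspirin", "prevents", "stroke"], ["insulin", "treats", "diabetes"]]
    kb = {"aspirin": {"stroke", "headache"}, "insulin": {"diabetes"}}
    print(select_concepts_to_retrain(old, new, kb))
```

Only the selected concepts would then be passed to whatever embedding trainer is in use, while the remaining representations are left untouched; in a scheme like this, skipping the unchanged concepts is what would save computation.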
