Conference proceeding
Augmenting word embeddings through external knowledge-base for biomedical application
2017 IEEE International Conference on Big Data (Big Data), Vol.2018-, pp.1965-1974
12/2017
DOI: 10.1109/BigData.2017.8258142
Abstract
Technological advancements in the biomedical domain have led to a tremendous growth of unstructured data, primarily as a result of the increased publication of findings. At the same time, a corresponding interest in the Natural Language Processing (NLP) community in developing scalable methodologies that exploit such massive unlabeled corpora for unsupervised language processing has created new opportunities for developing semantically sensitive models. Among them, word embeddings have garnered significant attention due to their capability to capture implicit semantics. However, such data-driven models are largely agnostic of the rich explicit semantic knowledge available in the biomedical domain in the form of vocabularies and ontologies. This is problematic because it leads to a poor representation of words with little local context, and the effect is acute in the biomedical domain. In this paper, we propose a novel model (MeSH2Vec) that jointly exploits both contextual information and available explicit semantic knowledge to learn externally augmented word embeddings. Compared with existing approaches, the proposed methodology is more dexterous in its ability to handle relationships between indirectly related concepts. The 13% improvement in correlation with expert judgments, shown in experiments on biomedical concept similarity and relatedness tasks, validates the effectiveness of the proposed approach and demonstrates the importance of incorporating human-curated knowledge in the process of generating word embeddings.
Details
- Title: Subtitle
- Augmenting word embeddings through external knowledge-base for biomedical application
- Creators
- Kishlay Jha - University at Buffalo, State University of New York
- Guangxu Xun - University at Buffalo, State University of New York
- Vishrawas Gopalakrishnan - University at Buffalo, State University of New York
- Aidong Zhang - University at Buffalo, State University of New York
- Resource Type
- Conference proceeding
- Publication Details
- 2017 IEEE International Conference on Big Data (Big Data), Vol.2018-, pp.1965-1974
- DOI
- 10.1109/BigData.2017.8258142
- Publisher
- IEEE
- Language
- English
- Date published
- 12/2017
- Academic Unit
- Electrical and Computer Engineering
- Record Identifier
- 9984294927102771