Conference proceeding
InterHG: an Interpretable and Accurate Model for Hypothesis Generation
2021 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp.1552-1557
12/09/2021
DOI: 10.1109/BIBM52615.2021.9669740
Abstract
Hypothesis generation, which tries to identify implicit associations between two concepts, has attracted much attention due to its ability to link key concepts scattered across different articles and to suggest plausible new hypotheses. Among existing approaches for hypothesis generation, matrix factorization based methods have achieved state-of-the-art performance. However, matrix factorization based methods suffer from the following limitations: 1) Bridge concepts are determined only as a post-hoc analysis of matrix factorization results; 2) The embeddings of concepts produced by matrix factorization cannot be explained, and thus it is hard to understand whether the concepts are linked in a semantically meaningful way. To overcome these limitations, we propose an interpretable and accurate hypothesis generation model (InterHG), which improves both accuracy and interpretability compared with existing methods. First, we propose to explicitly model the relationship between bridge concepts and given concept pairs, and conduct tensor factorization to identify link concepts. This reduces information loss and improves accuracy compared with post-hoc approaches. Second, we leverage the description of categories in the tensor factorization, which can output each concept embedding as a weighted combination of known categories. With this meaningful embedding representation, medical researchers are able to check the correctness of the suggested link concepts for a given concept pair. We conduct experiments based on MeSH terms (a controlled vocabulary of biomedical concepts) extracted from the MEDLINE corpus and category information obtained from UMLS (a comprehensive biomedical concept database). Results demonstrate that the proposed InterHG is highly accurate and produces meaningful embeddings for explanations.
Details
- Title: Subtitle
- InterHG: an Interpretable and Accurate Model for Hypothesis Generation
- Creators
- Haoyu Wang - Purdue University System; Xuan Wang - University of Illinois Urbana-Champaign; Yaqing Wang - Purdue University System; Guangxu Xun - University of Virginia; Kishlay Jha - University of Virginia; Jing Gao - Purdue University System
- Resource Type
- Conference proceeding
- Publication Details
- 2021 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp.1552-1557
- DOI
- 10.1109/BIBM52615.2021.9669740
- Publisher
- IEEE
- Grant note
- National Science Foundation (10.13039/100000001)
- Language
- English
- Date published
- 12/09/2021
- Academic Unit
- Electrical and Computer Engineering
- Record Identifier
- 9984295023202771