Book chapter
Exploiting Ontology Structure and Patterns of Annotation to Mine Significant Associations between Pairs of Controlled Vocabulary Terms
Data Integration in the Life Sciences, pp.44-60
Lecture Notes in Computer Science, Springer Berlin Heidelberg
2008
DOI: 10.1007/978-3-540-69828-9_6
Abstract
There is significant knowledge captured through annotations on the life sciences Web. In past research, we developed a methodology that uses support and confidence metrics from association rule mining to mine the association bridge (of termlinks) between pairs of controlled vocabulary (CV) terms across two ontologies. That (naive) approach did not exploit two sources of knowledge: the implicit knowledge captured via the hierarchical is-a structure of ontologies, and patterns of annotation in datasets that may impact the distribution of parent/child or sibling CV terms. In this research, we consider this knowledge. We aggregate termlinks over the child CV terms (siblings) of a parent CV term and use them as additional evidence to boost the support and confidence scores of the parent CV term's associations. A weight factor (α) reflects the contribution from the child CV terms; its value can be varied to reflect the variance of confidence values among the sibling CV terms of a parent CV term. We illustrate the benefits of exploiting this knowledge through experimental evaluation.
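The boosting idea in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the chapter's actual formulas: the abstract does not give the exact definitions, so the sketch assumes the standard association-rule support and confidence, and assumes that child termlink counts are simply added to the parent's counts after scaling by α. The names `boosted_scores`, `termlinks`, and `children` are hypothetical.

```python
from collections import Counter


def support_confidence(pair_count, source_count, total):
    """Standard association-rule metrics for a rule source -> target."""
    support = pair_count / total if total else 0.0
    confidence = pair_count / source_count if source_count else 0.0
    return support, confidence


def boosted_scores(termlinks, children, parent, target, alpha):
    """Score the association (parent, target), adding child evidence.

    termlinks: list of (term_a, term_b) pairs, one per observed termlink.
    children:  dict mapping a parent CV term to its child CV terms (is-a).
    alpha:     weight on the children's contribution (assumed linear form).
    """
    pair_counts = Counter(termlinks)
    source_counts = Counter(a for a, _ in termlinks)
    total = len(termlinks)
    kids = children.get(parent, [])

    # Assumption: child counts are scaled by alpha and added to the parent's.
    pair = pair_counts[(parent, target)] + alpha * sum(
        pair_counts[(c, target)] for c in kids
    )
    source = source_counts[parent] + alpha * sum(source_counts[c] for c in kids)
    return support_confidence(pair, source, total)


# Toy data (made up for illustration only).
links = [("P", "T"), ("C1", "T"), ("C1", "T"), ("C2", "U")]
hierarchy = {"P": ["C1", "C2"]}
naive = boosted_scores(links, hierarchy, "P", "T", alpha=0.0)
boosted = boosted_scores(links, hierarchy, "P", "T", alpha=0.5)
```

With `alpha=0.0` the sketch reduces to the naive scores for the parent term alone; raising `alpha` lets termlinks annotated at the child level count as partial evidence for the parent.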
Details
- Title: Subtitle
- Exploiting Ontology Structure and Patterns of Annotation to Mine Significant Associations between Pairs of Controlled Vocabulary Terms
- Creators
- Woei-Jyh Lee - University of Maryland, College Park, USA
- Louiqa Raschid - University of Maryland, College Park, USA
- Hassan Sayyadi - University of Maryland, College Park, USA
- Padmini Srinivasan - The University of Iowa, Iowa City, USA
- Resource Type
- Book chapter
- Publication Details
- Data Integration in the Life Sciences, pp.44-60
- Publisher
- Springer Berlin Heidelberg; Berlin, Heidelberg
- Series
- Lecture Notes in Computer Science
- DOI
- 10.1007/978-3-540-69828-9_6
- eISSN
- 1611-3349
- ISSN
- 0302-9743
- Language
- English
- Date published
- 2008
- Academic Unit
- Business Analytics; Nursing; Computer Science
- Record Identifier
- 9984003179902771