Journal article
Clustering Genes Using Heterogeneous Data Sources
International journal of knowledge discovery in bioinformatics, Vol.1(2), pp.12-28
04/2010
DOI: 10.4018/jkdb.2010040102
Abstract
Clustering of gene expression data is a standard exploratory technique used to identify closely related genes. Many other sources of data are also likely to be of great assistance in the analysis of gene expression data. These data provide a means to begin elucidating the large-scale modular organization of the cell. The authors consider the challenging task of developing exploratory analytical techniques to deal with multiple complete and incomplete information sources. The Multi-Source Clustering (MSC) algorithm developed here performs clustering with multiple, but complete, sources of data. To deal with incomplete data sources, the authors adopted the MPCK-means clustering algorithm to perform exploratory analysis on one complete source and other potentially incomplete sources provided in the form of constraints. This paper presents a new clustering algorithm, MSC, to perform exploratory analysis using two or more diverse but complete data sources; studies the effectiveness of constraint sets and the robustness of the constrained clustering algorithm using multiple sources of incomplete biological data; and incorporates such incomplete data into the constrained clustering algorithm in the form of constraint sets.
Details
- Title: Subtitle
- Clustering Genes Using Heterogeneous Data Sources
- Creators
- Erliang Zeng - University of Notre Dame, USA
- Chengyong Yang - Life Technologies Inc., USA
- Tao Li - Florida International University, USA
- Giri Narasimhan - Florida International University, USA
- Resource Type
- Journal article
- Publication Details
- International journal of knowledge discovery in bioinformatics, Vol.1(2), pp.12-28
- DOI
- 10.4018/jkdb.2010040102
- ISSN
- 1947-9115
- eISSN
- 1947-9123
- Language
- English
- Date published
- 04/2010
- Academic Unit
- Dental Research; Preventive and Community Dentistry; Roy J. Carver Department of Biomedical Engineering; Biostatistics; Iowa Neuroscience Institute
- Record Identifier
- 9984065370002771