Journal article
pySAPC, a python package for sparse affinity propagation clustering: Application to odontogenesis whole genome time series gene-expression data
Biochimica et biophysica acta. General subjects, Vol.1860(11), pp.2613-2618
11/2016
DOI: 10.1016/j.bbagen.2016.06.008
PMID: 27288587
Abstract
Developmental dental anomalies are common forms of congenital defects. The molecular mechanisms of dental anomalies are poorly understood. Systematic approaches such as clustering genes based on similar expression patterns could identify novel genes involved in dental anomalies and provide a framework for understanding molecular regulatory mechanisms of these genes during tooth development (odontogenesis).
A Python package (pySAPC) implementing the sparse affinity propagation clustering algorithm for large datasets was developed. Whole-genome pairwise similarity was calculated from expression patterns across 45 microarrays covering several stages of odontogenesis.
pySAPC identified 743 gene clusters based on expression pattern similarity during mouse tooth development. Three clusters are significantly enriched for genes associated with dental anomalies (with FDR <0.1). The three clusters of genes have distinct expression patterns during odontogenesis.
Clustering genes based on similar expression profiles recovered several known regulatory relationships for genes involved in odontogenesis, as well as many novel genes that may be involved with the same genetic pathways as genes that have already been shown to contribute to dental defects.
By using a sparse similarity matrix, pySAPC uses much less memory and CPU time than the original affinity propagation program, which uses a full similarity matrix. This Python package will be useful for many applications where datasets are too large for a full similarity matrix. This article is part of a Special Issue entitled “System Genetics”, Guest Editors: Dr. Yudong Cai and Dr. Tao Huang.
• A sparse similarity matrix can save substantial memory and CPU time in affinity propagation clustering
• pySAPC is memory- and computation-efficient and can handle large datasets
• Gene clustering helps us understand the molecular mechanisms of dental anomalies
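Illustrative note (not part of the article): the memory argument in the abstract can be sketched in a few lines of Python. The example below is a minimal sketch under stated assumptions, not pySAPC's actual interface. The function name `sparse_similarity`, the Pearson-correlation measure, the 0.8 threshold, and the synthetic data are all hypothetical choices made for illustration. It keeps only strong pairwise similarities in a SciPy CSR matrix and compares the storage cost with a full matrix.

```python
import numpy as np
from scipy import sparse


def sparse_similarity(expr, threshold=0.8):
    """Hypothetical helper: keep only Pearson correlations >= threshold.

    expr is a (genes x samples) array. For brevity the correlations are
    computed densely first; a large-scale version would work in blocks.
    """
    corr = np.corrcoef(expr)        # rows are treated as variables (genes)
    np.fill_diagonal(corr, 0.0)     # drop self-similarity
    corr[corr < threshold] = 0.0    # discard weak or negative similarities
    return sparse.csr_matrix(corr)  # only the nonzero entries are stored


# Synthetic data (assumption): 300 genes x 45 samples, three noisy patterns
rng = np.random.default_rng(0)
base = rng.normal(size=(3, 45))
expr = np.repeat(base, 100, axis=0) + 0.3 * rng.normal(size=(300, 45))

sim = sparse_similarity(expr, threshold=0.8)

n_genes = expr.shape[0]
dense_bytes = n_genes ** 2 * 8  # full float64 similarity matrix
sparse_bytes = sim.data.nbytes + sim.indices.nbytes + sim.indptr.nbytes

print(f"stored pairs: {sim.nnz} of {n_genes ** 2}")
print(f"dense: {dense_bytes} bytes, sparse: {sparse_bytes} bytes")
```

The saving grows with the number of genes, because the full matrix scales with the square of the gene count while the sparse matrix scales with the number of retained pairs.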
Details
- Title: Subtitle
- pySAPC, a python package for sparse affinity propagation clustering: Application to odontogenesis whole genome time series gene-expression data
- Creators
- Huojun Cao - Iowa Institute for Oral Health Research, College of Dentistry, The University of Iowa, Iowa City, IA 52244, USA
- Brad A Amendt - Iowa Institute for Oral Health Research, College of Dentistry, The University of Iowa, Iowa City, IA 52244, USA
- Resource Type
- Journal article
- Publication Details
- Biochimica et biophysica acta. General subjects, Vol.1860(11), pp.2613-2618
- DOI
- 10.1016/j.bbagen.2016.06.008
- PMID
- 27288587
- NLM abbreviation
- Biochim Biophys Acta Gen Subj
- ISSN
- 0304-4165
- eISSN
- 1872-8006
- Publisher
- Elsevier B.V.
- Grant note
- DOI: 10.13039/100000002, name: National Institutes of Health, award: DE13941; name: College of Dentistry, The University of Iowa
- Language
- English
- Date published
- 11/2016
- Academic Unit
- Orthodontics; Anatomy and Cell Biology; Endodontics; Craniofacial Anomalies Research Center; Dental Research
- Record Identifier
- 9984025349202771