Multi-granularity Parallel Computing in a Genome-Scale Molecular Evolution Application
Journal article   Peer reviewed

Jesse D Walters, Thomas B Bair, Terry A Braun, Todd E Scheetz, John P Robinson and Thomas L Casavant
The Journal of Supercomputing, Vol. 5698, pp. 49-59
01/01/2009
DOI: 10.1007/978-3-642-03275-2_6
PMCID: PMC3155254
PMID: 21841894

Abstract

Previously [1], we reported a coarse-grained parallel computational approach to identifying rare molecular evolutionary events often referred to as horizontal gene transfers. Very high degrees of parallelism (up to 65x speedup on 4,096 processors) were reported, yet the overall execution time for a realistic problem size was still on the order of 12 days. With the availability of large numbers of compute clusters, as well as genomic sequences from more than 2,000 species containing as many as 35,000 genes each, and trillions of sequence nucleotides in all, we demonstrated the computational feasibility of a method to examine “clusters” of genes using phylogenetic tree similarity as a distance metric. A full serial solution to this problem requires years of CPU time, yet makes only modest interprocess communication (IPC) and memory demands; thus, it is an ideal candidate for a grid computing approach involving low-cost compute nodes. This paper describes a multi-granularity parallel solution that exploits multi-core shared-memory nodes to address fine-grained aspects of the tree-clustering phase of our previous deployment of XenoCluster 1.0. In addition to benchmarking results that show up to 80% speedup efficiency on 8 CPU cores, we report on the biological accuracy and relevance of our results compared to a reported set of known xenologs in yeast.
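
To make the efficiency figure concrete, the short sketch below converts it into an equivalent speedup. It assumes the conventional definition of parallel efficiency (speedup divided by the number of cores); the abstract does not state which definition the authors used, and the function name is hypothetical, chosen only for illustration.

```python
def speedup_from_efficiency(efficiency: float, cores: int) -> float:
    """Return the speedup implied by a parallel efficiency on `cores` cores.

    Assumes efficiency = speedup / cores (the conventional definition);
    the abstract does not confirm that this is the definition used.
    """
    if cores <= 0:
        raise ValueError("cores must be positive")
    return efficiency * cores


# Figures taken from the abstract: up to 80% efficiency on 8 CPU cores.
implied_speedup = speedup_from_efficiency(0.80, 8)
print(f"Implied speedup on 8 cores: {implied_speedup:.1f}x")
```

Under that assumption, 80% efficiency on 8 cores corresponds to a speedup of roughly 6.4x over a single core for the tree-clustering phase.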
