A machine learning framework for trait based genomics
Conference proceeding

Wei Zhang, Erliang Zeng, Dan Liu, S Jones and S Emrich
2012 IEEE 2nd International Conference on Computational Advances in Bio and Medical Sciences (ICCABS), pp. 1-6
02/2012
DOI: 10.1109/ICCABS.2012.6182648

Abstract

Microbial communities perform many important ecological functions across a wide range of natural and man-made environments. Recently, the utility of trait-based approaches for studying microbial communities has been recognized. The increasing availability of whole-genome sequences provides an opportunity to explore the genetic foundations of a variety of functional traits. In this paper, we propose a machine learning framework that quantitatively links genotype to functional traits. Genes from bacterial genomes belonging to different functional trait groups are grouped into Clusters of Orthologous Groups (COGs), which are used as features. The TF-IDF technique from the text mining domain is then applied to transform the data so that it reflects both the abundance and the importance of each COG. After TF-IDF processing, COGs are ranked using feature selection methods to identify their relevance to the functional trait of interest. We focus on a binary functional trait in this paper but plan to extend the approach to continuous functional traits in the future. Experimental results demonstrate that genes related to a functional trait can be detected using our method.
Bioinformatics; Ecology; Genomics; Machine Learning; Functional Trait; Communities; Microbial communities; Ortholog; Support vector machines; Feature Selection; Microorganisms; Accuracy; Sequencing
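
The pipeline described in the abstract (COG counts per genome, TF-IDF weighting, then ranking COGs against a binary trait) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the toy count matrix, the placeholder COG names, the smoothed IDF formula, and the ranking score (absolute difference in mean TF-IDF between the two trait groups), which stands in for the feature selection methods the paper uses but the abstract does not name.

```python
import math

def tfidf(counts):
    """Weight a genome-by-COG count matrix with TF-IDF.

    counts: list of rows, one per genome; each row lists COG copy counts.
    Genomes play the role of documents and COGs the role of terms.
    """
    n_genomes = len(counts)
    n_cogs = len(counts[0])
    # Document frequency: number of genomes containing each COG at least once.
    df = [sum(1 for row in counts if row[j] > 0) for j in range(n_cogs)]
    # Smoothed IDF (an assumption): avoids division by zero for absent COGs.
    idf = [math.log((1 + n_genomes) / (1 + d)) + 1.0 for d in df]
    weighted = []
    for row in counts:
        total = sum(row)
        # Term frequency: each COG's share of the genome's total COG count.
        weighted.append(
            [(c / total if total else 0.0) * idf[j] for j, c in enumerate(row)]
        )
    return weighted

def rank_cogs(weighted, labels, cog_ids):
    """Rank COGs by how strongly their TF-IDF weight separates the two groups.

    The score is a simple stand-in, not the paper's feature selection method.
    Both trait groups (label 1 and label 0) must be non-empty.
    """
    pos = [r for r, y in zip(weighted, labels) if y == 1]
    neg = [r for r, y in zip(weighted, labels) if y == 0]
    scores = {}
    for j, cog in enumerate(cog_ids):
        mean_pos = sum(r[j] for r in pos) / len(pos)
        mean_neg = sum(r[j] for r in neg) / len(neg)
        scores[cog] = abs(mean_pos - mean_neg)
    return sorted(cog_ids, key=lambda c: scores[c], reverse=True)

# Hypothetical toy data: four genomes, three placeholder COGs.
cog_ids = ["COG_A", "COG_B", "COG_C"]
counts = [
    [3, 1, 0],  # genome with the trait
    [4, 1, 0],  # genome with the trait
    [0, 1, 2],  # genome without the trait
    [0, 1, 3],  # genome without the trait
]
labels = [1, 1, 0, 0]

ranking = rank_cogs(tfidf(counts), labels, cog_ids)
print(ranking)
```

In this toy matrix, the placeholder COG_B occurs once in every genome, so it receives the lowest IDF and shows almost no difference between the groups; it ranks last. That mirrors the motivation for borrowing TF-IDF from text mining: COGs present in all genomes carry little information about the trait.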
