Conference proceeding
Mining biological databases for candidate disease genes
Proceedings of SPIE, Vol.4528(1), pp.169-180
Commercial Applications for High-Performance Computing
07/27/2001
DOI: 10.1117/12.434869
Abstract
The publicly funded effort to determine the complete nucleotide sequence of the human genome, the Human Genome Project (HGP), has so far produced more than 93% of the 3 billion nucleotides of the human genome in a preliminary "draft" format. In addition, several valuable sources of information have been developed as direct and indirect results of the HGP. These include the sequencing of model organisms (rat, mouse, fly, and others), gene discovery projects (ESTs and full-length), and new technologies such as expression analysis and resources (micro-arrays or gene chips). These resources are invaluable for researchers identifying the functional genes of the genome that transcribe and translate into the transcriptome and proteome, both of which potentially contain orders of magnitude more complexity than the genome itself. Preliminary analyses of these data identified approximately 30,000 to 40,000 human "genes." However, the bulk of the effort still remains: to identify the functional and structural elements contained within the transcriptome and proteome, and to associate function in the transcriptome and proteome with genes. A fortuitous consequence of the HGP is the existence of hundreds of databases of biological information that may contain data relevant to the identification of disease-causing genes. The task of mining these databases for information on candidate genes is a commercial application of enormous potential. We are developing a system to acquire and mine data from specific databases to aid our efforts to identify disease genes. A high-speed cluster of Linux workstations is used to analyze sequence and perform distributed sequence alignments as part of our data mining and processing. This system has been used to mine GeneMap99 sequences within specific genomic intervals to identify potential candidate disease genes associated with Bardet-Biedl Syndrome (BBS).
Details
- Title
- Mining biological databases for candidate disease genes
- Creators
- Terry A Braun - University of Iowa; Todd Scheetz - University of Iowa; Gregg L Webster - University of Iowa; Thomas L Casavant - University of Iowa
- Resource Type
- Conference proceeding
- Publication Details
- Proceedings of SPIE, Vol.4528(1), pp.169-180
- Conference
- Commercial Applications for High-Performance Computing
- DOI
- 10.1117/12.434869
- ISSN
- 0277-786X
- Language
- English
- Date published
- 07/27/2001
- Academic Unit
- Electrical and Computer Engineering; Roy J. Carver Department of Biomedical Engineering; Ophthalmology and Visual Sciences
- Record Identifier
- 9984197121202771