Journal article
Using Family Data as a Verification Standard to Evaluate Copy Number Variation Calling Strategies for Genetic Association Studies
Genetic epidemiology, Vol.36(3), pp.253-262
04/2012
DOI: 10.1002/gepi.21618
PMCID: PMC3696390
PMID: 22714937
Abstract
A major concern for all copy number variation (CNV) detection algorithms is their reliability and repeatability. However, it is difficult to evaluate the reliability of CNV‐calling strategies due to the lack of gold‐standard data that would tell us which CNVs are real. We propose that if CNVs are called in duplicate samples, or inherited from parent to child, then these can be considered validated CNVs. We used two large family‐based genome‐wide association study (GWAS) datasets from the GENEVA consortium to look at concordance rates of CNV calls between duplicate samples, parent‐child pairs, and unrelated pairs. Our goal was to make recommendations for ways to filter and use CNV calls in GWAS datasets that do not include family data. We used PennCNV as our primary CNV‐calling algorithm, and tested CNV calls using different datasets and marker sets, and with various filters on CNVs and samples. Using the Illumina core HumanHap550 single nucleotide polymorphism (SNP) set, we saw duplicate concordance rates of approximately 55% and parent‐child transmission rates of approximately 28% in our datasets. GC model adjustment and sample quality filtering had little effect on these reliability measures. Stratification on CNV size and DNA sample type did have some effect. Overall, our results show that it is probably not possible to find a CNV‐calling strategy (including filtering and algorithm) that will give us a set of “reliable” CNV calls using current chip technologies. But if we understand the error process, we can still use CNV calls appropriately in genetic association studies.
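The abstract reports concordance rates without stating how two CNV calls are judged to be the same. The following is a minimal sketch of one way such a rate could be computed. The matching rule (same chromosome, same direction, at least 50% reciprocal overlap), the `CNV` record, and all function names are assumptions made for illustration; they are not taken from the article or from PennCNV.

```python
from typing import NamedTuple, Sequence


class CNV(NamedTuple):
    """Hypothetical CNV call record (not PennCNV's output format)."""
    chrom: str
    start: int        # first base of the call, inclusive
    end: int          # last base of the call, inclusive
    copy_number: int  # estimated copy number; 2 is assumed to be normal


def covered_fraction(a: CNV, b: CNV) -> float:
    """Fraction of call `a` that is covered by call `b`."""
    if a.chrom != b.chrom:
        return 0.0
    shared = min(a.end, b.end) - max(a.start, b.start) + 1
    return max(shared, 0) / (a.end - a.start + 1)


def same_direction(a: CNV, b: CNV) -> bool:
    """True if both calls are deletions or both are duplications."""
    return (a.copy_number < 2) == (b.copy_number < 2)


def has_match(call: CNV, others: Sequence[CNV], min_overlap: float = 0.5) -> bool:
    """Assumed rule: same direction and reciprocal overlap >= min_overlap."""
    return any(
        same_direction(call, other)
        and covered_fraction(call, other) >= min_overlap
        and covered_fraction(other, call) >= min_overlap
        for other in others
    )


def concordance_rate(calls_a: Sequence[CNV], calls_b: Sequence[CNV]) -> float:
    """Share of all calls in either sample that have a match in the other."""
    total = len(calls_a) + len(calls_b)
    if total == 0:
        raise ValueError("concordance is undefined when neither sample has calls")
    matched = sum(has_match(c, calls_b) for c in calls_a)
    matched += sum(has_match(c, calls_a) for c in calls_b)
    return matched / total


# Toy data: one shared deletion on chromosome 1, one unmatched duplication.
sample_1 = [CNV("1", 100, 199, 1), CNV("2", 500, 599, 3)]
sample_2 = [CNV("1", 120, 219, 1)]
rate = concordance_rate(sample_1, sample_2)
```

Under the same assumed rule, a parent-child transmission rate could be approximated by applying `has_match` to each parental call against the child's calls, and the unrelated-pair rate would serve as the baseline for chance agreement.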
Details
- Title
- Using Family Data as a Verification Standard to Evaluate Copy Number Variation Calling Strategies for Genetic Association Studies
- Creators
- Xiaojing Zheng - University of Pittsburgh; John R Shaffer - University of Pittsburgh; Caitlin P McHugh - University of Washington; Cathy C Laurie - University of Washington; Bjarke Feenstra - Statens Serum Institut; Mads Melbye - Statens Serum Institut; Jeffrey C Murray - University of Iowa; Mary L Marazita - University of Pittsburgh; Eleanor Feingold - University of Pittsburgh
- Resource Type
- Journal article
- Publication Details
- Genetic epidemiology, Vol.36(3), pp.253-262
- DOI
- 10.1002/gepi.21618
- PMID
- 22714937
- PMCID
- PMC3696390
- NLM abbreviation
- Genet Epidemiol
- ISSN
- 0741-0395
- eISSN
- 1098-2272
- Number of pages
- 10
- Grant note
- NIDCR (R01‐DE 014899); NIDCR (R01‐DE09551; R01‐DE12101)
- Language
- English
- Date published
- 04/2012
- Academic Unit
- Anatomy and Cell Biology; Stead Family Department of Pediatrics; Epidemiology; Pediatric Dentistry; Craniofacial Anomalies Research Center; Dental Research
- Record Identifier
- 9984025467302771