Effective design and analysis of systems genetics studies.
Details
  • Author: Kang, Hyun Min
  • Degree: Doctor
  • Year: 2009
  • Advisors: Eskin, Eleazar (advisor); Pevzner, Pavel (advisor); Bafna, Vineet (committee member); Dasgupta, Sanjoy (committee member); Ideker, Trey (committee member); Schork, Nicholas J. (committee member)
  • Institution: University of California
  • Department: Computer Science and Engineering
  • ISBN: 9781109164831
  • CBH: 3356332
  • Country: USA
  • Language: English
  • FileSize: 11284046
  • Pages: 241
Abstract
Systems genetics studies, which aim to unravel the genetic basis of complex traits, have become one of the most promising research areas with the advance of high-throughput biotechnologies. This thesis presents several computational and statistical challenges in the effective design and analysis of systems genetics studies, along with novel methodological advances and corresponding results in several specific contexts. First, I present an extensive haplotype analysis of a recently collected catalogue of genetic variation among inbred mouse strains, which revealed the contribution from ancestral subspecies, the haplotype block structure, and the complex history of each genomic segment among the inbred mouse strains. In addition, I accurately imputed the uncollected genotypes in the resource by developing a novel and efficient genotype imputation method that adaptively learns parameters from data using an Expectation-Maximization (EM) algorithm. The method is demonstrated to outperform previous methods on both mouse and human data. Statistical analyses in systems genetics studies are often confounded by unmodeled factors such as heterogeneous sample structure. Recent studies suggested that mixed models correct for sample structure in association mapping, but the available methods are too computationally costly to be applied in genome-wide association mapping. I developed Efficient Mixed Model Association (EMMA), which takes advantage of the invariant structure of eigenvectors in applying mixed models to association mapping and thereby increases computational efficiency by several orders of magnitude. The method was shown to successfully reduce inflated false positives in in silico genome-wide association mapping of inbred mouse strains involving hundreds of thousands of markers.
I further extend EMMA to accommodate even larger-scale genome-wide association mapping in humans, typically involving several thousand or more individuals, and demonstrate that the method consistently eliminates the significant over-dispersion of test statistics across multiple human data sets. The method has also been employed to correct for a different type of confounding effect in expression studies. I developed a novel mixed-model method that corrects for the spurious associations and trans-regulatory bands caused by systematic confounding effects, using the intersample correlation of expression measurements. Finally, for the design of association studies using inbred strains, I propose a novel trait mapping strategy using the hybrid mouse diversity panel (HMDP). By integrating classical inbred strains and multiple sets of recombinant inbred strains while precisely accounting for the sample structure using high-density markers with EMMA, the proposed design is shown to identify previously known associations with much greater power and precision than previous approaches.
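
Illustrative note (not part of the thesis record): the "invariant structure of eigenvectors" mentioned in the abstract can be sketched as follows. If the kinship matrix K has the eigendecomposition K = U diag(s) Uᵀ, then K + δI = U diag(s + δ) Uᵀ for every variance ratio δ, so a single decomposition can be reused across all candidate δ values. The Python sketch below is a simplified maximum-likelihood variant written for illustration only. It is not the thesis's implementation, and the function names (`profile_loglik`, `fit_mixed_model`), the grid search over δ, and the model parameterization y ~ N(Xβ, σg²(K + δI)) are assumptions made here.

```python
import numpy as np

def profile_loglik(delta, s, y_rot, X_rot):
    """ML log-likelihood at a given delta = sigma_e^2 / sigma_g^2.

    s     : eigenvalues of the kinship matrix K
    y_rot : U.T @ y, phenotype rotated by the eigenvectors of K
    X_rot : U.T @ X, covariates rotated the same way
    After rotation the covariance is diagonal, so each evaluation
    needs no further matrix decomposition of size n x n.
    """
    n = y_rot.shape[0]
    w = 1.0 / (s + delta)                      # diagonal of (K + delta*I)^-1 in the rotated basis
    XtWX = X_rot.T @ (w[:, None] * X_rot)
    XtWy = X_rot.T @ (w * y_rot)
    beta = np.linalg.solve(XtWX, XtWy)         # weighted least squares estimate
    resid = y_rot - X_rot @ beta
    sigma_g2 = np.sum(w * resid ** 2) / n      # ML estimate of the genetic variance
    ll = -0.5 * (n * np.log(2 * np.pi * sigma_g2)
                 + np.sum(np.log(s + delta)) + n)
    return ll, beta, sigma_g2

def fit_mixed_model(y, X, K, deltas=None):
    """Grid search over delta, reusing one eigendecomposition of K (hypothetical helper)."""
    if deltas is None:
        deltas = np.logspace(-5, 5, 101)       # assumed search grid
    s, U = np.linalg.eigh(K)                   # K = U diag(s) U^T, computed once
    s = np.clip(s, 0.0, None)                  # guard against tiny negative eigenvalues
    y_rot, X_rot = U.T @ y, U.T @ X
    results = [profile_loglik(d, s, y_rot, X_rot) + (d,) for d in deltas]
    ll, beta, sigma_g2, delta = max(results, key=lambda t: t[0])
    return {"loglik": ll, "beta": beta, "sigma_g2": sigma_g2, "delta": delta}
```

In this sketch the expensive step, the eigendecomposition, is performed once per kinship matrix, and each candidate δ afterwards costs only a weighted least squares fit on the rotated data. A practical implementation would typically use restricted maximum likelihood and a numerical optimizer rather than a fixed grid.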
