Abstract
Large-scale genome sequencing has uncovered many rare variants, which occur at such low frequencies in the human population that there is often insufficient statistical power for downstream population-genetic analyses. The Intensification approach uses genomic coordinates and the modular structure of repeat protein domains to amplify signals of selection derived from population-scale genome sequencing and from conventional interspecies conservation. By combining conserved positions in the motif multiple sequence alignment (motif-MSA) with the amplified population-genetic signals, Intensification can identify important, strongly conserved positions in repeat domains and protein structures. We provide an online resource (http://intensification.gersteinlab.org) and illustrate the approach through a case study of the tetratricopeptide repeat.