Mining DNA sequences to predict sites which mutations cause genetic diseases
Abstract
Single nucleotide polymorphism (SNP) analysis has become a crossroads of bioinformatics and medicine. We have developed a data mining system, rSNP_Guide (http://wwwmgs.bionet.nsc.ru/mgs/systems/rsnp/), to discover regulatory sites in DNA sequences whose mutations could cause genetic diseases. In the first step, we estimate the ability of each protein under consideration to bind to the genomic DNA whose alteration by mutations is associated with the genetic disease under study. In the second step, we formalize the disease-associated experimental data on the SNP-related alterations in DNA binding to an unknown protein. In the third step, we apply fuzzy clustering to all known proteins examined in order to identify the one whose specific site is altered by the mutations in a way consistent with the binding of the unknown protein experimentally associated with the genetic disease. In the fourth step, we predict the known protein whose binding site is (i) present on the DNA and (ii) altered by the mutations associated with the genetic disease. Finally, in the last step, we estimate the robustness of this prediction. rSNP_Guide has been tested on SNPs with known relationships between regulatory site alterations and genetic disease penetrance. In addition, novel SNP-related regulatory sites associated with genetic disease penetrance were discovered and then successfully confirmed experimentally.
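
The core idea of the first four steps can be illustrated with a minimal sketch. It is not the rSNP_Guide implementation: the abstract does not specify the scoring method, so the sketch assumes a simple position weight matrix (PWM) score as the binding estimate and replaces the fuzzy clustering with a plain sign-agreement count. All names (`consensus_pwm`, `best_site_score`, `rank_proteins`), the two toy proteins, and the sequences are hypothetical.

```python
from typing import Dict, List, Tuple

# One {base: weight} column per site position (assumed representation).
Pwm = List[Dict[str, float]]


def consensus_pwm(consensus: str) -> Pwm:
    """Toy PWM: +1 for the consensus base, -1 for any other base."""
    return [{b: (1.0 if b == c else -1.0) for b in "ACGT"} for c in consensus]


def best_site_score(pwm: Pwm, seq: str) -> float:
    """Highest PWM score over all windows of seq (stand-in binding estimate).

    Assumes seq is at least as long as the PWM.
    """
    width = len(pwm)
    return max(
        sum(col[base] for col, base in zip(pwm, seq[i:i + width]))
        for i in range(len(seq) - width + 1)
    )


def sign(x: float) -> int:
    return (x > 0) - (x < 0)


def rank_proteins(
    pwms: Dict[str, Pwm],
    wild: str,
    mutants: Dict[str, str],
    observed: Dict[str, int],
) -> List[Tuple[str, float]]:
    """Rank known proteins by agreement with the experimental data.

    observed[a] is +1, 0 or -1: the measured change in binding of the
    unknown protein to mutant allele a relative to the wild type.
    Agreement is the fraction of alleles whose predicted score change
    has the same sign as the observed change.
    """
    results = []
    for name, pwm in pwms.items():
        base = best_site_score(pwm, wild)
        hits = sum(
            sign(best_site_score(pwm, mutants[a]) - base) == observed[a]
            for a in observed
        )
        results.append((name, hits / len(observed)))
    return sorted(results, key=lambda r: r[1], reverse=True)


# Hypothetical data: one mutant allele with experimentally decreased binding.
pwms = {
    "TATA_binder": consensus_pwm("TATA"),
    "GGCC_binder": consensus_pwm("GGCC"),
}
ranking = rank_proteins(
    pwms,
    wild="CCTATACC",
    mutants={"m1": "CCTACACC"},
    observed={"m1": -1},
)
```

Here `ranking` lists the toy TATA-preferring protein first, because the mutation lowers its best site score in the same direction as the observed loss of binding. The robustness estimate of the last step is not covered by this sketch.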
