Sequence and structure similarity search in biological and XML databases.
Details
  • Author:Aghili, S. Alireza
  • Degree:Doctor
  • Year:2005
  • Advisor:Agrawal, Divyakant
  • Institution:University of California
  • Major:Computer Science
  • ISBN:0542476614
  • CBH:3202701
  • Country:USA
  • Language:English
  • FileSize:1041705
  • Pages:170
Abstract
The unprecedented growth of the Internet and of biological databases has introduced challenging and complex data formats, and has thereby furnished unique collaborative venues for scientists of various disciplines. Such complex databases include (1) XML (eXtensible Markup Language) databases, (2) DNA and protein sequence and structure databases, (3) microarray gene expressions, (4) biomedical images, and (5) sensor data stream and time series databases. Given a source query pattern and a target database, a similarity search (range query or top-k) seeks to identify those records of the database that match the given query. The problem of similarity search in biological and textual databases has received substantial attention in the past decade. Numerous filtration and indexing techniques have been proposed to address scalability issues and mitigate the curse of dimensionality. However, complex applications demand special customization based on the inherent and underlying dynamics of the data. In this work, we study the integration of various transformation and shape summarization techniques on biological sequence and protein structure data, as well as path encoding in tree-structured XML data, for more efficient similarity search query processing.
