A review on compressed pattern matching
Abstract
Compressed pattern matching (CPM) is the task of locating all occurrences of a pattern (or a set of patterns) within compressed text. The pattern itself may or may not be compressed. CPM is especially useful for handling large volumes of data, particularly over a network. Its applications include computational biology, where it helps find similar trends in DNA sequences, as well as network intrusion detection and big data analytics. Most existing solutions match the pattern directly against uncompressed text, which requires a lot of space and time when handling big data. Researchers have proposed many efficient compression schemes, but very few methods exist for pattern matching over compressed text. With data sizes growing exponentially, CPM has become an increasingly desirable capability. This paper presents a critical review of recent CPM techniques. The techniques covered are word-based Huffman codes, word-based tagged codes, and wavelet-tree-based indexing. We present a comparative analysis of these techniques and highlight their advantages and disadvantages.
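To make the idea of matching directly over compressed text concrete, the following is a minimal sketch of a byte-oriented, word-based tagged code. It is an illustrative simplification and not a scheme taken from the reviewed paper: the function names (`encode_rank`, `build_codebook`, `compress`, `find_word`) and the specific layout (7 payload bits per byte, high bit set only on the first byte of each codeword) are assumptions chosen for clarity. Because the tag bit marks where every codeword starts, a pattern word can be compressed once and then located with an ordinary byte search, without decompressing the text.

```python
from collections import Counter


def encode_rank(rank):
    """Encode a frequency rank as base-128 digits; tag the first byte (assumed layout)."""
    digits = [rank & 0x7F]
    rank >>= 7
    while rank:
        digits.append(rank & 0x7F)
        rank >>= 7
    digits.reverse()
    digits[0] |= 0x80  # high bit set only on the first byte of a codeword
    return bytes(digits)


def build_codebook(words):
    """Give more frequent words smaller ranks, hence shorter codewords."""
    ranked = [w for w, _ in Counter(words).most_common()]
    return {w: encode_rank(i) for i, w in enumerate(ranked)}


def compress(words, codebook):
    """Concatenate the codewords of all words."""
    return b"".join(codebook[w] for w in words)


def find_word(compressed, codeword):
    """Return word positions where `codeword` occurs, scanning compressed bytes only."""
    hits = []
    start = compressed.find(codeword)
    while start != -1:
        end = start + len(codeword)
        # Accept only if the match ends on a codeword boundary: the next byte
        # must be tagged (start of a new codeword) or the data must end.
        if end == len(compressed) or compressed[end] & 0x80:
            # The word position is the number of tagged bytes before the match.
            hits.append(sum(1 for b in compressed[:start] if b & 0x80))
        start = compressed.find(codeword, start + 1)
    return hits


if __name__ == "__main__":
    text = "the cat saw the dog and the cat".split()
    book = build_codebook(text)
    data = compress(text, book)
    print(find_word(data, book["cat"]))
```

The boundary check matters because, in this simplified layout, a short codeword can be a prefix of a longer one; the tag bit on the following byte is what rules out such false matches.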
