Longest common extensions in trees
详细信息    查看全文
文摘
The longest common extension (LCE) of two indices in a string is the length of the longest identical substrings starting at these two indices. The LCE problem asks to preprocess a string into a compact data structure that supports fast LCE queries.

In this paper we generalize the LCE problem to trees and suggest a few applications of LCE in trees to tries and XML databases. Given a labeled and rooted tree T of size n, the goal is to preprocess T into a compact data structure that support the following LCE queries between subpaths and subtrees in T  . Let 0729X&_mathId=si1.gif&_user=111111111&_pii=S030439751500729X&_rdoc=1&_issn=03043975&md5=fc97eaab3f9740b81839f8efa395fc13" title="Click to view the MathML source">v1, 0729X&_mathId=si2.gif&_user=111111111&_pii=S030439751500729X&_rdoc=1&_issn=03043975&md5=a2c86b6d4ef6211ffb252846fd194321" title="Click to view the MathML source">v2, 0729X&_mathId=si3.gif&_user=111111111&_pii=S030439751500729X&_rdoc=1&_issn=03043975&md5=6dd59e588f299db7c7cd6125f3ccea2d" title="Click to view the MathML source">w1, and 0729X&_mathId=si4.gif&_user=111111111&_pii=S030439751500729X&_rdoc=1&_issn=03043975&md5=8675e556e8c3241a554d3f181617e283" title="Click to view the MathML source">w2 be nodes of T   such that 0729X&_mathId=si3.gif&_user=111111111&_pii=S030439751500729X&_rdoc=1&_issn=03043975&md5=6dd59e588f299db7c7cd6125f3ccea2d" title="Click to view the MathML source">w1 and 0729X&_mathId=si4.gif&_user=111111111&_pii=S030439751500729X&_rdoc=1&_issn=03043975&md5=8675e556e8c3241a554d3f181617e283" title="Click to view the MathML source">w2 are descendants of 0729X&_mathId=si1.gif&_user=111111111&_pii=S030439751500729X&_rdoc=1&_issn=03043975&md5=fc97eaab3f9740b81839f8efa395fc13" title="Click to view the MathML source">v1 and 0729X&_mathId=si2.gif&_user=111111111&_pii=S030439751500729X&_rdoc=1&_issn=03043975&md5=a2c86b6d4ef6211ffb252846fd194321" title="Click to view the MathML source">v2 respectively.

0729X&_mathId=si28.gif&_user=111111111&_pii=S030439751500729X&_rdoc=1&_issn=03043975&md5=1b2b6d85822ed995789e4c6572bd0896" title="Click to view the MathML source">LCEPP(v1,w1,v2,w2): (path–path LCE) return the longest common prefix of the paths 0729X&_mathId=si6.gif&_user=111111111&_pii=S030439751500729X&_rdoc=1&_issn=03043975&md5=60d7f9fead5f3b3565325a7b388414d6" title="Click to view the MathML source">v1↝w1 and 0729X&_mathId=si26.gif&_user=111111111&_pii=S030439751500729X&_rdoc=1&_issn=03043975&md5=379f13981acfe9de125a9aed525c53ec" title="Click to view the MathML source">v2↝w2.

0729X&_mathId=si29.gif&_user=111111111&_pii=S030439751500729X&_rdoc=1&_issn=03043975&md5=994f783883c43a43e753b0dfc9e6c2a2" title="Click to view the MathML source">LCEPT(v1,w1,v2): (path–tree LCE) return maximal path–path LCE of the path 0729X&_mathId=si6.gif&_user=111111111&_pii=S030439751500729X&_rdoc=1&_issn=03043975&md5=60d7f9fead5f3b3565325a7b388414d6" title="Click to view the MathML source">v1↝w1 and any path from 0729X&_mathId=si2.gif&_user=111111111&_pii=S030439751500729X&_rdoc=1&_issn=03043975&md5=a2c86b6d4ef6211ffb252846fd194321" title="Click to view the MathML source">v2 to a descendant leaf.

0729X&_mathId=si10.gif&_user=111111111&_pii=S030439751500729X&_rdoc=1&_issn=03043975&md5=b84b2af051f16c8e4ea2c929523a2592" title="Click to view the MathML source">LCETT(v1,v2): (tree–tree LCE) return a maximal path–path LCE of any pair of paths from 0729X&_mathId=si1.gif&_user=111111111&_pii=S030439751500729X&_rdoc=1&_issn=03043975&md5=fc97eaab3f9740b81839f8efa395fc13" title="Click to view the MathML source">v1 and 0729X&_mathId=si2.gif&_user=111111111&_pii=S030439751500729X&_rdoc=1&_issn=03043975&md5=a2c86b6d4ef6211ffb252846fd194321" title="Click to view the MathML source">v2 to descendant leaves.

We present the first non-trivial bounds for supporting these queries. For 0729X&_mathId=si11.gif&_user=111111111&_pii=S030439751500729X&_rdoc=1&_issn=03043975&md5=7161b90004416376da3c5a56b61e9468" title="Click to view the MathML source">LCEPP queries, we present a linear-space solution with 0729X&_mathId=si12.gif&_user=111111111&_pii=S030439751500729X&_rdoc=1&_issn=03043975&md5=1d87bc9ef40ab1d7833ad599d0e52441" title="Click to view the MathML source">O(log⁡n) query time. For 0729X&_mathId=si13.gif&_user=111111111&_pii=S030439751500729X&_rdoc=1&_issn=03043975&md5=21fc24ed60d548cc06d519380eeb9301" title="Click to view the MathML source">LCEPT queries, we present a linear-space solution with 0729X&_mathId=si14.gif&_user=111111111&_pii=S030439751500729X&_rdoc=1&_issn=03043975&md5=055f9d9b6a8680c60bbccc91c2d92e83" title="Click to view the MathML source">O((log⁡log⁡n)2) query time, and complement this with a lower bound showing that any path–tree LCE structure of size 0729X&_mathId=si15.gif&_user=111111111&_pii=S030439751500729X&_rdoc=1&_issn=03043975&md5=67a0ee713ea725054b4b0c0ce7263902" title="Click to view the MathML source">O(npolylog(n)) must necessarily use 0729X&_mathId=si16.gif&_user=111111111&_pii=S030439751500729X&_rdoc=1&_issn=03043975&md5=444014296148e4a6a65503c0af79c532" title="Click to view the MathML source">Ω(log⁡log⁡n) time to answer queries. For 0729X&_mathId=si17.gif&_user=111111111&_pii=S030439751500729X&_rdoc=1&_issn=03043975&md5=10f3a8adc572263ac40c502b0eb696da" title="Click to view the MathML source">LCETT queries, we present a time-space trade-off, that given any parameter τ  , 0729X&_mathId=si18.gif&_user=111111111&_pii=S030439751500729X&_rdoc=1&_issn=03043975&md5=e903d3a341952f010369f4003c86f4dc" title="Click to view the MathML source">1≤τ≤n, leads to an 0729X&_mathId=si19.gif&_user=111111111&_pii=S030439751500729X&_rdoc=1&_issn=03043975&md5=cdcc34be080a2978f5b739304532c86c" title="Click to view the MathML source">O(nτ) space and 0729X&_mathId=si20.gif&_user=111111111&_pii=S030439751500729X&_rdoc=1&_issn=03043975&md5=bfdd54c33faee7be72887283021572ee" title="Click to view the MathML source">O(n/τ) query-time solution (all of these bounds hold on a standard unit-cost RAM model with logarithmic word size). This is complemented with a reduction from the set intersection problem implying that a fast linear space solution is not likely to exist.

© 2004-2018 中国地质图书馆版权所有 京ICP备05064691号 京公网安备11010802017129号

地址:北京市海淀区学院路29号 邮编:100083

电话:办公室:(+86 10)66554848;文献借阅、咨询服务、科技查新:66554700