Repetition-free longest common subsequence of random sequences
详细信息    查看全文
文摘
A repetition-free Longest Common Subsequence (LCS) of two sequences 3297&_mathId=si2.gif&_user=111111111&_pii=S0166218X15003297&_rdoc=1&_issn=0166218X&md5=65445691807ecdc069a0e225e35a3640" title="Click to view the MathML source">x and 3297&_mathId=si3.gif&_user=111111111&_pii=S0166218X15003297&_rdoc=1&_issn=0166218X&md5=575a2d571ad4a4c6ffc03f4abd4b34d1" title="Click to view the MathML source">y is an LCS of 3297&_mathId=si2.gif&_user=111111111&_pii=S0166218X15003297&_rdoc=1&_issn=0166218X&md5=65445691807ecdc069a0e225e35a3640" title="Click to view the MathML source">x and 3297&_mathId=si3.gif&_user=111111111&_pii=S0166218X15003297&_rdoc=1&_issn=0166218X&md5=575a2d571ad4a4c6ffc03f4abd4b34d1" title="Click to view the MathML source">y where each symbol may appear at most once. Let 3297&_mathId=si6.gif&_user=111111111&_pii=S0166218X15003297&_rdoc=1&_issn=0166218X&md5=7db9e273b92d6c8c9616ad37a7d2f9aa" title="Click to view the MathML source">R denote the length of a repetition-free LCS of two sequences of 3297&_mathId=si7.gif&_user=111111111&_pii=S0166218X15003297&_rdoc=1&_issn=0166218X&md5=a6723ad66b160acabad27ebdf505f070" title="Click to view the MathML source">n symbols each one chosen randomly, uniformly, and independently over a 3297&_mathId=si8.gif&_user=111111111&_pii=S0166218X15003297&_rdoc=1&_issn=0166218X&md5=a961bde9fefe52318bad16bb798bb7de" title="Click to view the MathML source">k-ary alphabet. We study the asymptotic, in 3297&_mathId=si7.gif&_user=111111111&_pii=S0166218X15003297&_rdoc=1&_issn=0166218X&md5=a6723ad66b160acabad27ebdf505f070" title="Click to view the MathML source">n and 3297&_mathId=si8.gif&_user=111111111&_pii=S0166218X15003297&_rdoc=1&_issn=0166218X&md5=a961bde9fefe52318bad16bb798bb7de" title="Click to view the MathML source">k, behavior of 3297&_mathId=si6.gif&_user=111111111&_pii=S0166218X15003297&_rdoc=1&_issn=0166218X&md5=7db9e273b92d6c8c9616ad37a7d2f9aa" title="Click to view the MathML source">R and establish that there are three distinct regimes, depending on the relative speed of growth of 3297&_mathId=si7.gif&_user=111111111&_pii=S0166218X15003297&_rdoc=1&_issn=0166218X&md5=a6723ad66b160acabad27ebdf505f070" title="Click to view the MathML source">n and 3297&_mathId=si8.gif&_user=111111111&_pii=S0166218X15003297&_rdoc=1&_issn=0166218X&md5=a961bde9fefe52318bad16bb798bb7de" title="Click to view the MathML source">k. For each regime we establish the limiting behavior of 3297&_mathId=si6.gif&_user=111111111&_pii=S0166218X15003297&_rdoc=1&_issn=0166218X&md5=7db9e273b92d6c8c9616ad37a7d2f9aa" title="Click to view the MathML source">R. In fact, we do more, since we actually establish tail bounds for large deviations of 3297&_mathId=si6.gif&_user=111111111&_pii=S0166218X15003297&_rdoc=1&_issn=0166218X&md5=7db9e273b92d6c8c9616ad37a7d2f9aa" title="Click to view the MathML source">R from its limiting behavior.

Our study is motivated by the so called exemplar model proposed by Sankoff (1999) and the related similarity measure introduced by Adi et al. (2010). A natural question that arises in this context, which as we show is related to long standing open problems in the area of probabilistic combinatorics, is to understand the asymptotic, in 3297&_mathId=si7.gif&_user=111111111&_pii=S0166218X15003297&_rdoc=1&_issn=0166218X&md5=a6723ad66b160acabad27ebdf505f070" title="Click to view the MathML source">n and 3297&_mathId=si8.gif&_user=111111111&_pii=S0166218X15003297&_rdoc=1&_issn=0166218X&md5=a961bde9fefe52318bad16bb798bb7de" title="Click to view the MathML source">k, behavior of parameter 3297&_mathId=si6.gif&_user=111111111&_pii=S0166218X15003297&_rdoc=1&_issn=0166218X&md5=7db9e273b92d6c8c9616ad37a7d2f9aa" title="Click to view the MathML source">R.

© 2004-2018 中国地质图书馆版权所有 京ICP备05064691号 京公网安备11010802017129号

地址:北京市海淀区学院路29号 邮编:100083

电话:办公室:(+86 10)66554848;文献借阅、咨询服务、科技查新:66554700