A repetition-free Longest Common Subsequence (LCS) of two sequences
3297&_mathId=si2.gif&_user=111111111&_pii=S0166218X15003297&_rdoc=1&_issn=0166218X&md5=65445691807ecdc069a0e225e35a3640" title="Click to view the MathML source">x and
3297&_mathId=si3.gif&_user=111111111&_pii=S0166218X15003297&_rdoc=1&_issn=0166218X&md5=575a2d571ad4a4c6ffc03f4abd4b34d1" title="Click to view the MathML source">y is an LCS of
3297&_mathId=si2.gif&_user=111111111&_pii=S0166218X15003297&_rdoc=1&_issn=0166218X&md5=65445691807ecdc069a0e225e35a3640" title="Click to view the MathML source">x and
3297&_mathId=si3.gif&_user=111111111&_pii=S0166218X15003297&_rdoc=1&_issn=0166218X&md5=575a2d571ad4a4c6ffc03f4abd4b34d1" title="Click to view the MathML source">y where each symbol may appear at most once. Let
3297&_mathId=si6.gif&_user=111111111&_pii=S0166218X15003297&_rdoc=1&_issn=0166218X&md5=7db9e273b92d6c8c9616ad37a7d2f9aa" title="Click to view the MathML source">R denote the length of a repetition-free LCS of two sequences of
3297&_mathId=si7.gif&_user=111111111&_pii=S0166218X15003297&_rdoc=1&_issn=0166218X&md5=a6723ad66b160acabad27ebdf505f070" title="Click to view the MathML source">n symbols each one chosen randomly, uniformly, and independently over a
3297&_mathId=si8.gif&_user=111111111&_pii=S0166218X15003297&_rdoc=1&_issn=0166218X&md5=a961bde9fefe52318bad16bb798bb7de" title="Click to view the MathML source">k-ary alphabet. We study the asymptotic, in
3297&_mathId=si7.gif&_user=111111111&_pii=S0166218X15003297&_rdoc=1&_issn=0166218X&md5=a6723ad66b160acabad27ebdf505f070" title="Click to view the MathML source">n and
3297&_mathId=si8.gif&_user=111111111&_pii=S0166218X15003297&_rdoc=1&_issn=0166218X&md5=a961bde9fefe52318bad16bb798bb7de" title="Click to view the MathML source">k, behavior of
3297&_mathId=si6.gif&_user=111111111&_pii=S0166218X15003297&_rdoc=1&_issn=0166218X&md5=7db9e273b92d6c8c9616ad37a7d2f9aa" title="Click to view the MathML source">R and establish that there are three distinct regimes, depending on the relative speed of growth of
3297&_mathId=si7.gif&_user=111111111&_pii=S0166218X15003297&_rdoc=1&_issn=0166218X&md5=a6723ad66b160acabad27ebdf505f070" title="Click to view the MathML source">n and
3297&_mathId=si8.gif&_user=111111111&_pii=S0166218X15003297&_rdoc=1&_issn=0166218X&md5=a961bde9fefe52318bad16bb798bb7de" title="Click to view the MathML source">k. For each regime we establish the limiting behavior of
3297&_mathId=si6.gif&_user=111111111&_pii=S0166218X15003297&_rdoc=1&_issn=0166218X&md5=7db9e273b92d6c8c9616ad37a7d2f9aa" title="Click to view the MathML source">R. In fact, we do more, since we actually establish tail bounds for large deviations of
3297&_mathId=si6.gif&_user=111111111&_pii=S0166218X15003297&_rdoc=1&_issn=0166218X&md5=7db9e273b92d6c8c9616ad37a7d2f9aa" title="Click to view the MathML source">R from its limiting behavior.
Our study is motivated by the so called exemplar model proposed by Sankoff (1999) and the related similarity measure introduced by Adi et al. (2010). A natural question that arises in this context, which as we show is related to long standing open problems in the area of probabilistic combinatorics, is to understand the asymptotic, in 3297&_mathId=si7.gif&_user=111111111&_pii=S0166218X15003297&_rdoc=1&_issn=0166218X&md5=a6723ad66b160acabad27ebdf505f070" title="Click to view the MathML source">n and 3297&_mathId=si8.gif&_user=111111111&_pii=S0166218X15003297&_rdoc=1&_issn=0166218X&md5=a961bde9fefe52318bad16bb798bb7de" title="Click to view the MathML source">k, behavior of parameter 3297&_mathId=si6.gif&_user=111111111&_pii=S0166218X15003297&_rdoc=1&_issn=0166218X&md5=7db9e273b92d6c8c9616ad37a7d2f9aa" title="Click to view the MathML source">R.