Национальный цифровой ресурс Руконт - межотраслевая электронная библиотека (ЭБС) на базе технологии Контекстум (всего произведений: 635212)
Контекстум
Руконтекст антиплагиат система
0   0
Первый авторZnamenskij
Страниц4
ID581756
АннотацияThe expected value E of the longest common subsequence of letters in two random words is considered as a function of the α = |A| of alphabet and of words lengths m and n. It is assumed that each letter independently appears at any position with equal probability. A simple expression for E(α, m, n) and its empirical proof are presented for fixed α and m + n. High accuracy of the formula in a wide range of values is confirmed by numerical simulations
УДК004.412
Znamenskij, SergejV. A FORMULA FOR THE MEAN LENGTH OF THE LONGEST COMMON SUBSEQUENCE / SergejV. Znamenskij // Журнал Сибирского федерального университета. Математика и физика. Journal of Siberian Federal University, Mathematics & Physics .— 2017 .— №1 .— С. 71-74 .— URL: https://rucont.ru/efd/581756 (дата обращения: 10.05.2024)

Предпросмотр (выдержки из произведения)

Mathematics & Physics 2017, 10(1), 71–74 УДК 004.412 A Formula for the Mean Length of the Longest Common Subsequence Sergej V. Znamenskij∗ Ailamazyan Program Systems Institute of RAS Peter the First, 4, Veskovo village, Pereslavl area, Yaroslavl region, 152021 Russia Received 10.10.2016, received in revised form 10.11.2016, accepted 20.12.2016 The expected value E of the longest common subsequence of letters in two random words is considered as a function of the α = |A| of alphabet and of words lengths m and n. <...> It is assumed that each letter independently appears at any position with equal probability. <...> A simple expression for E(α,m,n) and its empirical proof are presented for fixed α and m+ n. <...> High accuracy of the formula in a wide range of values is confirmed by numerical simulations. <...> Introduction The random words of lengths m and n in the alphabet α are also reffered as random symbol sequences. <...> We consider the letter appearance in different positions of words as equally probable and independent events. <...> So for those random sequences the expected value of the longest common subsequence length is a function of E(m,n,α), which reflects the similarity of the original words. <...> However, both the use of mathematical apparatus as in [2, 3] and numerical modeling (usually with special algorithms) [4, 5, 6] succeeded to clarify situation only in special cases m = n or α = 2 (see [7]). proof). <...> Computer calculations E for small m,n, in [9] have identified a similar relation for the α = 4. <...> Even for α = 2 the asymptotic on m n became clear only resently [8] (just now without detailed The work is intended to the detection and empirical proof of this relation with except of huge α and small m+n cases. ∗svz@latex.pereslavl.ru ⃝ Siberian Federal University. <...> All rights reserved c – 71 – Sergej V. Znamenskij A Formula for the Mean Length of the Longest Common Subsequence Fig. 1. <...> Evaluation Direct evaluation for huge LCS lengths is impossible due to known square complexity of algorithm. <...> Therefore 6 fixed values of m+n and 10 for α were selected and for 6Ч10 series of 32 triplets (m,n,α) their expected values of LSS lengths were calculated as sample means <...>