长度分布悬殊的序列比对法
详细信息    查看全文 | 推荐本文 |
  • 英文篇名:Sequence Alignment Method with Large Length Distribution
  • 作者:陆成刚
  • 英文作者:LU Cheng-gang;Institute of Science,Zhejiang University of Technology;
  • 关键词:动态时间规整 ; 时间序列相异(似)性 ; 累计均值 ; 均值DTW ; 动态规划 ; 长度分布悬殊
  • 英文关键词:DTW;;time series dissimilarity(similarity);;accumulated averaging;;averaging DTW;;dynamic programming;;disparity in length
  • 中文刊名:XXWX
  • 英文刊名:Journal of Chinese Computer Systems
  • 机构:浙江工业大学理学院;
  • 出版日期:2019-01-15
  • 出版单位:小型微型计算机系统
  • 年:2019
  • 期:v.40
  • 基金:浙江省基础公益研究计划项目(2019-LCHG)资助
  • 语种:中文;
  • 页:XXWX201901032
  • 页数:7
  • CN:01
  • ISSN:21-1106/TP
  • 分类号:171-177
摘要
经典的动态时间规整算法(Dynamic Time Warping,DTW)对长度相差剧烈的两个序列之间的相异(似)性度量会引入长度的影响,这种影响在多对序列之间匹配时造成度量数值上的失衡从而导致DTW度量相似性的失效.该文分析了这种失效性的来源,提出祛除长度差异影响的均值DTW法.和传统DTW法相比,均值DTW是其非平凡的推广,因而无法直接使用基于动态规划原理的算法.基于该文给出的动态规划的匹配路径数目计算,强调了枚举法施加于均值DTW的不可行性.在引入均值的累计计算技巧后,将均值DTW纳入传统的基于动态规划原理的算法框架,实现了等价于DTW计算复杂性、双倍于DTW存储占用的均值DTW算法.该算法相比于DTW在电离辐射时间序列的聚类实验中将算法结果的精度指标至少提升了50%以上.该文的创新性在于:1)首次发现了DTW可能失效的长度差异因素; 2)提出了均值DTW的概念; 3)首次提出了动态路径数目的定量计算方法; 4)给出均值DTW的加窗限制版本.
        The classic Dynamic Time Warping( DTW) method introduces the effect of length on the difference( similarity) between two sequences with severe length differences. This effect can cause measurement problems when matching between multiple pairs of sequences. The numerical imbalance leads to the failure of the DTW metric similarity. This paper analyzes the source of this failure and proposes the mean DTW method to eliminate the influence of length difference. Compared with the traditional DTW method,the mean DTW is a non-trivial promotion,so it is impossible to directly use the algorithm based on the dynamic programming principle. Based on the calculation of the number of all matching paths for dynamic programming given in this paper,the infeasibility of the enumeration method applied to the mean DTW is emphasized. After introducing the cumulative calculation technique of the mean,the mean DTW is included in the traditional algorithm framework based on the dynamic programming principle,and the mean DTW algorithm equivalent to DTW computational complexity and double the DTW storage occupancy is realized. Compared with DTW,the algorithm improves the accuracy of the algorithm results by at least 50% in the clustering experiment of ionizing radiation time series. The innovation of this paper is: 1) the first time to find the difference in length of DTW possible failure; 2) the concept of mean DTW is proposed; 3) the quantitative calculation method of the number of dynamic paths is proposed for the first time; 4) the addition of the mean DTW is given,windowlimited version.
引文
[1] Hiroaki Sakoe,Seibi Chiba. Dynamic programming algorithm optimization for spoken word recognition[J]. IEEE Trans,Acoustic Speech and Signal Processing(TASSP),1978,26(1):43-49.
    [2] Berndt D J,James Clifford. Using dynamic time warping to find patterns in time series[C]. AAAI 1994 Workshop on Knowledge Discovery in Database(KDD-1994),AAAI Technical Report WS-94-03,1994:359-370.
    [3] Yamauchi T,Xiao K,Bowman C,et al. Dynamic time warping:a single dry electrode EEG study in a self-paced learning task[C].International Conference on Affective Computing and Intelligent Interaction,IEEE,2015:56-62.
    [4] Rakthanmanon,Thanawin. Addressing big data time series:mining trillions of time series subsequences under dynamic time warping[J]. ACMTransactions on Knowledge Discovery from Data(TKDD),2013,7(3):1-31.
    [5] Cao D,Liu J. Research on dynamic time warping multivariate time series similarity matching based on shape feature and inclination angle[J]. Journal of Cloud Computing,2016,5(1):11.
    [6] Thomas Prtzlich,Jonathan Driedger,Meinard Müller. Memory-restricted multiscale dynamic time warping[C]. Proceedings of the IEEE International Conference on Acoustics,Speech,and Signal Processing(ICASSP),2016:569-573.
    [7] Omer Gold,Micha Sharir. Dynamic time warping and geometric edit distance:breaking the quadratic barrier[C]. International Colloquium on Automata,Languages,and Programming(ICALP-2017),25:1-14.
    [8] Li Jun-kui,Wang Yuan-zhen. Early abandon to accelerate exactdynamic time warping[J]. The International Arab Journal of Information Technology(AJIT),2009,6(2):144-152.
    [9] Morse M,Patel J. An efficient and accurate method for evaluating time series similarity[C]. ACMSpecial Interest Group on Management of Data 2007(SIGMOD-2007),Beijing,China,June2007:101-123.
    [10] Al-Naymat G,Chawla S,Taheri J. SparseDTW:a novel approach to speed up dynamic time warping[C]. The 2009 Australasian Data Mining(ADM),Melbourne,Australia,ACMDigital Library,2009,101:117-127.
    [11] Stan Salvador,Philip Chan. Fast DTW:toward accurate dynamic time warping in linear time and space[C]. KDD Workshop on Mining Temporal and Sequential Data(MTSA),2004:70-80.
    [12] Zhou F,Torre F D. Generalized time warping for multi-modal alignment of human motion[C]. IEEE Conference on Computer Vision and Pattern Recognition-2012(CVPR-2012),2012:1-8.
    [13] Eamonn J. Keogh,Michael J. Pazzani. Derivative dynamic time warping[C]. In First SIAMInternational Conference on Data Mining(SDM'2001),2001:234-265.
    [14] Petitjean F O. Ketterlin A. Gan9arski. A global averaging method for dynamic time warping,with applications to clustering[J]. Pattern Recognition(PR),2011,44(3):678-690.
    [15] Petitjean F O,Gan9arski. Summarizing a set of time series by averaging:from Steiner sequence to compact multiple alignment[J].Theoretical Computer Science(TCS),2012,414(1):76-88.
    [16] Lu Chenggang. A front-end solution to VAD,SNR,and AGC for speech recognition[C]. Information Science and Engineering,International Conference on(ICISE-2009),2009:445-447.

© 2004-2018 中国地质图书馆版权所有 京ICP备05064691号 京公网安备11010802017129号

地址:北京市海淀区学院路29号 邮编:100083

电话:办公室:(+86 10)66554848;文献借阅、咨询服务、科技查新:66554700