Design of an SVM-Based On-line Handwriting Classifier
Abstract
On-line handwriting recognition has been studied for decades, yet its accuracy still falls short of expectations. In recent years, however, the rise of touchscreen-only devices such as smartphones and tablets, together with advances in MEMS and graphics processing technology, has made handwritten input increasingly popular with users. At the same time, new research and application areas have emerged, including in-air (spatial) handwriting, gesture recognition, signature verification, mathematical formula recognition, and chemical symbol recognition. Together these developments have drawn growing attention to on-line handwriting recognition research.
     The support vector machine (SVM) is a pattern recognition algorithm that has developed rapidly over the past decade or so. It is grounded in statistical learning theory, kernel function theory, and generalization theory, and it uses the structural risk minimization (SRM) principle to compute the optimal separating hyperplane, which gives it a more solid theoretical foundation than other pattern recognition algorithms. SVMs have been widely studied and applied, with good results, in pattern recognition tasks such as speech recognition, gene detection, and handwriting recognition, and in related areas such as state prediction and curve fitting.
     This thesis studies the Gaussian Dynamic Time Warping (GDTW) kernel, the most successfully applied kernel function in SVM-based on-line handwriting classifier design, and two of its shortcomings. First, it was proposed for speech recognition and a range of other pattern recognition fields, so when it is applied to on-line handwriting recognition its performance advantage over other pattern recognition algorithms is not pronounced. Second, its computational complexity is high, which makes recognition slow. To address these two problems, the thesis examines the characteristics of the feature vectors used in on-line handwriting recognition, proposes a method for optimizing the GDTW kernel, and explores how different ways of computing the alignment path length affect recognition performance.
     To demonstrate the effectiveness of the proposed method, the thesis designs an SVM on-line handwriting classifier based on the optimized GDTW kernel and runs recognition experiments on two on-line handwriting databases: UJIpenchars2, whose samples are of higher quality, and UNIPEN, whose samples are of lower quality. The results show that the proposed method effectively reduces the number of support vectors and improves recognition efficiency.
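As a rough illustration of the kind of kernel the abstract refers to, the sketch below computes a GDTW-style kernel value in plain Python. It assumes the commonly cited form K(a, b) = exp(-gamma * D(a, b)), where D is the accumulated squared Euclidean cost along the optimal DTW path divided by a path-length term. The function names, the `gamma` default, and the three `normalize` options are hypothetical choices made for this example. They are not the thesis's optimized kernel, nor its specific path-length variants.

```python
import math


def dtw_alignment(a, b):
    """Return (total_cost, path_length) for the minimum-cost DTW alignment.

    a, b: non-empty sequences of equal-dimension feature vectors,
    e.g. (x, y) pen coordinates. The local cost is the squared Euclidean
    distance and the allowed steps are (i-1, j), (i, j-1), (i-1, j-1).
    """
    if not a or not b:
        raise ValueError("both sequences must be non-empty")
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j]: minimal accumulated cost aligning a[:i+1] with b[:j+1]
    cost = [[inf] * m for _ in range(n)]
    # steps[i][j]: number of cells on the path that achieves cost[i][j]
    steps = [[0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = sum((p - q) ** 2 for p, q in zip(a[i], b[j]))
            if i == 0 and j == 0:
                cost[i][j], steps[i][j] = d, 1
                continue
            candidates = []
            if i > 0:
                candidates.append((cost[i - 1][j], steps[i - 1][j]))
            if j > 0:
                candidates.append((cost[i][j - 1], steps[i][j - 1]))
            if i > 0 and j > 0:
                candidates.append((cost[i - 1][j - 1], steps[i - 1][j - 1]))
            # Ties in cost are broken in favour of the shorter path.
            best_cost, best_steps = min(candidates)
            cost[i][j] = best_cost + d
            steps[i][j] = best_steps + 1
    return cost[-1][-1], steps[-1][-1]


def gdtw_kernel(a, b, gamma=1.0, normalize="path"):
    """GDTW-style kernel value exp(-gamma * total_cost / denominator).

    normalize (hypothetical options for this sketch):
      "path": divide by the number of cells on the alignment path
      "sum":  divide by len(a) + len(b)
      "none": no normalization
    """
    total, length = dtw_alignment(a, b)
    if normalize == "path":
        denom = length
    elif normalize == "sum":
        denom = len(a) + len(b)
    elif normalize == "none":
        denom = 1
    else:
        raise ValueError("unknown normalize option: %r" % normalize)
    return math.exp(-gamma * total / denom)


if __name__ == "__main__":
    stroke_a = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
    stroke_b = [(0.0, 0.0), (2.0, 2.0)]
    for mode in ("path", "sum", "none"):
        print(mode, gdtw_kernel(stroke_a, stroke_b, gamma=1.0, normalize=mode))
```

Note that this sketch minimizes the total cost first and divides by the path length afterwards, which is not the same as minimizing the average cost directly. A function of this shape could be used to fill a precomputed kernel matrix for an SVM trainer; that step is omitted here.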
