Research on Off-line Handwritten Chinese Character Recognition Based on Bandelets
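Abstract
Research on off-line handwritten Chinese character recognition is important for automating Chinese character information processing and for developing intelligent input for the next generation of computers. Off-line handwritten Chinese character recognition has both significant theoretical research value and broad market prospects.
     Building on an analysis of the history and current state of handwritten Chinese character recognition research, this thesis introduces the second-generation bandelet transform into the field of character recognition and proposes a bandelet-based feature extraction approach. The main contributions are as follows:
     1) Off-line signature verification relies solely on the static information in the signature image, while the dynamic information of the writing process is almost entirely lost, which makes it a difficult problem. Targeting the characteristics of off-line handwritten signature recognition, this thesis proposes a feature selection method based on the bandelet transform. The method combines traditional structural features with statistical features, applies the K-L transform to reduce the dimensionality of the extracted feature vectors, and finally uses a support vector machine to distinguish genuine signatures from forgeries. Experimental results show that the algorithm achieves a high recognition rate on the test samples.
     2) The bandelet transform not only inherits the main properties of the wavelet transform (multiscale analysis and time-frequency localization) but is also highly directional and anisotropic. It is an edge-based image transform that adaptively tracks the directions of geometric regularity in an image, yielding a "truly" sparse representation of the image. Because similar characters are rich in directional features, a similar-character recognition method based on the bandelet transform is proposed. The algorithm effectively captures stroke density and direction features in the image and faithfully reflects the structural characteristics of the characters; classification is performed by a support vector machine, which benefits the recognition of similar handwritten characters. Experimental results show that the pattern recognition method combining the bandelet transform with support vector machines achieves a good recognition rate.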
Research on off-line handwritten Chinese character recognition is important for the automation of Chinese character information processing and for intelligent input on new-generation computers. Machine recognition of off-line handwritten Chinese characters has not only important theoretical value but also broad market prospects.
     Based on a study of current OCR systems and related technologies, this thesis presents methods for off-line handwritten signature verification and for the recognition of similar handwritten Chinese characters, both based on the second-generation bandelet transform. The main work is as follows.
     Off-line handwritten signature verification (HSV) is hard because it depends only on the static information in the image, while the dynamic information of the writing process has almost completely disappeared. Addressing the characteristics of off-line HSV, this thesis proposes a method based on the bandelet transform. It combines structural features with statistical features, and the extracted feature vectors are compressed by the K-L transform. Finally, genuine and forged signatures are distinguished by a support vector machine (SVM). The experimental results demonstrate that the algorithm reaches a satisfactory recognition rate on the test samples.
     The bandelet transform not only has the main characteristics of wavelets, namely multiscale analysis and time-frequency localization, but is also highly directional and anisotropic. The bandelet transform, which provides a new kind of image representation, makes full use of intrinsic geometric regularity and provides an optimal representation for geometrically regular images. To exploit the abundant stroke directions of similar characters, a new recognition method based on the bandelet transform and support vector machines is proposed. This method faithfully expresses the structural features of characters and increases the recognition rate for similar characters. Experimental results show that the method is effective.
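The last two stages of the pipeline described in the abstract (K-L transform for dimensionality reduction, then SVM classification) can be illustrated with a minimal sketch. It is not the thesis's implementation. The bandelet feature extraction is not reproduced here, so synthetic 16-dimensional vectors stand in for the extracted features. The classifier is a plain linear SVM trained by subgradient descent on the hinge loss, chosen only to keep the example self-contained. All function names, the feature dimension, and the hyperparameters are assumptions made for illustration.

```python
import numpy as np


def kl_transform(features, n_components):
    """Fit a Karhunen-Loeve transform and project the data.

    Returns the projected features, the training mean, and the basis
    (leading eigenvectors of the covariance matrix).
    """
    mean = features.mean(axis=0)
    centered = features - mean
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)        # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    basis = eigvecs[:, order]                     # columns = top eigenvectors
    return centered @ basis, mean, basis


def train_linear_svm(x, y, lam=0.01, epochs=200, lr=0.1):
    """Linear SVM via full-batch subgradient descent on the hinge loss.

    Labels y must be +1 or -1. lam, epochs and lr are illustrative values.
    """
    w = np.zeros(x.shape[1])
    b = 0.0
    n = len(y)
    for _ in range(epochs):
        margins = y * (x @ w + b)
        active = margins < 1                      # samples violating the margin
        grad_w = lam * w - (y[active][:, None] * x[active]).sum(axis=0) / n
        grad_b = -y[active].sum() / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def predict(x, w, b):
    """Return +1 or -1 for each row of x."""
    return np.where(x @ w + b >= 0, 1, -1)


# Synthetic stand-in for extracted feature vectors (assumption, not real data):
# +1 plays the role of "genuine", -1 the role of "forged".
rng = np.random.default_rng(0)
genuine = rng.normal(loc=2.0, scale=0.5, size=(40, 16))
forged = rng.normal(loc=-2.0, scale=0.5, size=(40, 16))
features = np.vstack([genuine, forged])
labels = np.array([1] * 40 + [-1] * 40)

# Alternate samples between a training half and a test half.
train, test = slice(0, None, 2), slice(1, None, 2)

# Fit the K-L transform on training data only, then reuse it for the test data.
z_train, mean, basis = kl_transform(features[train], n_components=4)
z_test = (features[test] - mean) @ basis

w, b = train_linear_svm(z_train, labels[train])
accuracy = float((predict(z_test, w, b) == labels[test]).mean())
print(z_train.shape, accuracy)
```

The K-L basis is fitted on the training half only and then applied to the test half, so the reported accuracy is not influenced by test data. In practice a kernel SVM from an established library would likely replace the hand-written linear trainer.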