Research on OCR Recognition Methods for Constrained Handwritten Characters
Abstract
This dissertation studies the recognition of constrained handwritten characters: the 10 Arabic numerals and the 52 upper-case and lower-case English letters, 62 characters in total. The CC-OCR system developed here covers the whole process, from scanning the characters to recognizing them on the computer.
     The dissertation proposes and implements a multilevel classification method based on feature coding. In the first level, a sufficient number of effective features are extracted from each character and encoded, and the character is classified by the resulting code. Characters that the first level still cannot tell apart pass to a second level, which separates them by template matching. The emphasis of the method is on the first level. Experimental results show that this feature-coding-based multilevel classification method is feasible and effective.
     In the preprocessing stage, the system preprocesses the character bitmap, which lays a sound foundation for the later feature extraction and recognition. For the first level, the dissertation proposes several features: an edge-table extremum-difference feature, a left-edge-table discontinuity feature, an improved width feature, and a feature that averages the crossing counts over different local regions, chosen according to the characters to be distinguished, and compares the average with a threshold. Combined with some existing features, these give the first level good classification ability.
     The hardware consists of a scanner and a computer. The software is written in C and VC++ 6.0.
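The two-level scheme described in the abstract can be illustrated with a minimal sketch. It is not the author's C/VC++ code: Python is used for brevity, and every name in it (`parse`, `crossing_count`, `feature_code`, `TwoLevelClassifier`) is hypothetical. The feature code here uses only crossing counts on three scan lines, standing in for the richer feature set the dissertation proposes, and the second level uses a plain pixel-difference (Hamming) distance as an assumed form of template matching.

```python
from collections import defaultdict


def parse(rows):
    """Turn strings such as "01110" into a binary bitmap (list of int lists)."""
    return [[int(c) for c in row] for row in rows]


def crossing_count(row):
    """Count background-to-ink transitions along one scan line."""
    count, prev = 0, 0
    for pixel in row:
        if pixel and not prev:
            count += 1
        prev = pixel
    return count


def feature_code(bitmap):
    """Assumed level-1 code: crossing counts at 1/4, 1/2 and 3/4 of the height."""
    h = len(bitmap)
    return tuple(crossing_count(bitmap[i]) for i in (h // 4, h // 2, (3 * h) // 4))


def hamming(a, b):
    """Number of differing pixels between two bitmaps of equal size."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


class TwoLevelClassifier:
    def __init__(self):
        # feature code -> list of (label, template bitmap)
        self.buckets = defaultdict(list)

    def add_template(self, label, bitmap):
        self.buckets[feature_code(bitmap)].append((label, bitmap))

    def classify(self, bitmap):
        # Level 1: look up the candidates that share the feature code.
        candidates = self.buckets.get(feature_code(bitmap), [])
        if not candidates:
            return None  # unknown code: reject
        if len({label for label, _ in candidates}) == 1:
            return candidates[0][0]  # the code alone identifies the character
        # Level 2: template matching among the remaining candidates.
        return min(candidates, key=lambda c: hamming(bitmap, c[1]))[0]
```

In this toy setting, 5x5 bitmaps of "O" and "U" share the same crossing-count code, so only the second level separates them, while a character with a unique code is decided in the first level. A richer first-level code leaves fewer such collisions, which is consistent with the abstract placing the emphasis on the first level.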
