Hierarchical Dynamic Chinese Character Database Based on Global Affine Transformation
Abstract
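Static Chinese fonts have evolved from bitmap fonts through vector fonts to curve-outline fonts, and storage requirements have fallen substantially along the way. Microsoft's TrueType fonts and Adobe's PostScript font families use curve-outline techniques and also render glyphs attractively. However, because the number of Chinese characters is so large, these fonts still have limitations in Chinese information processing applications.
     All mature Chinese fonts today are static: they carry no stroke-order information and cannot simulate the process of writing a character. In addition, the large number of characters and the variability of their glyph shapes have limited the development of dynamic Chinese character database technology.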
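     Against this background, this paper designs and implements a new hierarchical dynamic Chinese character database. The main work includes:
     (1) Two basic component libraries are built: a stroke library and a radical library. The radical library is rebuilt from the older radical library of reference [7].
     (2) A semi-automatic method is designed for splitting characters into components when the radical library changes.
     (3) Global affine transformation is applied to the construction of the hierarchical Chinese character database, which automates the construction. Explicit formulas for the affine transformation parameters are derived, and experiments show how each kind of preprocessing (skeleton, contour, feature points, barycenter) affects the global affine transformation.
     (4) Structural similarity is used to evaluate the simulated characters, because it assesses the result more objectively and effectively. A method for computing the structural similarity of binary images is derived.
     (5) Hierarchical character database technology is applied to the dynamic character database to build a hierarchical dynamic Chinese character database, which makes it feasible to implement a dynamic Chinese character database on embedded devices.
     (6) A hierarchical dynamic Chinese character database based on the stroke and radical libraries, covering the two fonts “楷体_GB2312” and “仿宋_GB2312”, is built and implemented on Borland C++Builder 6.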
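     By reusing components, the hierarchical Chinese character database greatly reduces the storage needed for character images. The global affine transformation method applied in this paper fully automates the construction of the hierarchical database. Using hierarchical techniques in the dynamic database substantially reduces the storage of the dynamic database. This work is an important step toward making the hierarchical dynamic Chinese character database practical.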
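     Item (3) above concerns estimating the parameters of an affine map that places a library component into a target character. The thesis's own parameter formulas are not reproduced on this page, so the following is only a minimal sketch of one common way to obtain such parameters: an ordinary least-squares fit of x' = A x + b. It assumes the point pairs are already matched, which the thesis's method may not require. The names `solve3`, `fit_affine` and `apply_affine` are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def solve3(m: List[List[float]], r: List[float]) -> List[float]:
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    n = 3
    a = [row[:] + [rhs] for row, rhs in zip(m, r)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(a[i][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("degenerate point set; affine map is not unique")
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(col + 1, n):
            f = a[row][col] / a[col][col]
            for k in range(col, n + 1):
                a[row][k] -= f * a[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = a[i][n] - sum(a[i][k] * x[k] for k in range(i + 1, n))
        x[i] = s / a[i][i]
    return x


def fit_affine(
    src: Sequence[Point], dst: Sequence[Point]
) -> Tuple[List[List[float]], List[float]]:
    """Return (A, b) minimising sum |A p + b - q|^2 over matched pairs (p, q).

    Assumption: src[i] corresponds to dst[i]; this is an illustration,
    not the parameter derivation given in the thesis.
    """
    if len(src) != len(dst) or len(src) < 3:
        raise ValueError("need at least three matched point pairs")
    m = [[0.0] * 3 for _ in range(3)]  # normal-equation matrix
    ru = [0.0] * 3                     # right-hand side for the x' row
    rv = [0.0] * 3                     # right-hand side for the y' row
    for (x, y), (u, v) in zip(src, dst):
        h = (x, y, 1.0)                # homogeneous source point
        for i in range(3):
            for j in range(3):
                m[i][j] += h[i] * h[j]
            ru[i] += h[i] * u
            rv[i] += h[i] * v
    a11, a12, b1 = solve3(m, ru)
    a21, a22, b2 = solve3(m, rv)
    return [[a11, a12], [a21, a22]], [b1, b2]


def apply_affine(A: List[List[float]], b: List[float], p: Point) -> Point:
    """Map a point p through x' = A x + b."""
    x, y = p
    return (A[0][0] * x + A[0][1] * y + b[0],
            A[1][0] * x + A[1][1] * y + b[1])
```

     In a hierarchical database of this kind, a fitted (A, b) pair could be stored per component instance in place of the component's pixels or outline, which is where the storage saving from component reuse would come from.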
Static Chinese font technology has developed from bitmap fonts through vector fonts to curve-contour fonts, greatly reducing the storage that Chinese fonts require. Curve-contour fonts, represented by Microsoft's TrueType and Adobe's PostScript, also display very well. However, the huge number of Chinese characters still limits the use of these fonts.
     At present, the well-known Chinese fonts are all static and share the same shortcoming: they contain no temporal (stroke-order) information and cannot show how a character is written correctly. The development of the Dynamic Chinese Character Database (DCCD) has been limited by the huge number of Chinese characters and the variability of their glyphs.
     Against this background, a new Hierarchical Dynamic Chinese Character Database (HDCCD) is constructed and implemented in this paper. The main work includes:
     (1) Two kinds of component database, a stroke database and a radical database, are constructed. The radical database is rebuilt from the old radical database proposed in reference [7].
     (2) A method for semi-automatically splitting Chinese characters when the radical database changes is proposed.
     (3) Global Affine Transformation (GAT) is used to construct the Hierarchical Chinese Character Database (HCCD), which makes the construction of the HCCD automatic. Explicit expressions for the affine transformation parameters are derived. Experiments show the effect on GAT of different kinds of preprocessing: skeleton, contour, feature points and barycenter.
     (4) Structural Similarity (SSIM) is applied to judge the quality of the simulated Chinese characters; it gives a more objective and more effective judgement. A method for computing the SSIM of binary images is derived.
     (5) The HDCCD is implemented by applying HCCD technology to the DCCD, which makes it possible to use the DCCD on embedded systems.
     (6) An HDCCD containing the “KaiStyle_GB2312” and “FangSongStyle_GB2312” fonts, based on the stroke and radical databases, is implemented on Borland C++ Builder 6.
     The HCCD greatly reduces the storage of the Chinese character image database by reusing components. The GAT applied in this paper allows the HCCD to be constructed automatically. Applying HCCD technology substantially reduces the storage of the DCCD. This work makes the HDCCD more practical.
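     Item (4) above scores a simulated character against its reference with structural similarity. The thesis's own binary-image derivation is not given on this page. As a hedged illustration only, the sketch below evaluates the standard SSIM formula once over the whole image, using the fact that for a 0/1 image the mean is the foreground fraction p and the population variance is p(1 - p). The function name `binary_ssim`, the single global window, and the constants (the conventional (0.01 L)^2 and (0.03 L)^2 with L = 1) are assumptions.

```python
from typing import Sequence


def binary_ssim(
    x: Sequence[Sequence[int]],
    y: Sequence[Sequence[int]],
    c1: float = 1e-4,  # (0.01 * L)^2 with dynamic range L = 1 (assumed)
    c2: float = 9e-4,  # (0.03 * L)^2 with dynamic range L = 1 (assumed)
) -> float:
    """Global SSIM of two equally sized 0/1 images.

    Illustrative only: one window covering the whole image and
    population (divide-by-N) statistics, not the thesis's derivation.
    """
    if len(x) != len(y):
        raise ValueError("images must have the same number of rows")
    n = n_x = n_y = n_xy = 0
    for row_x, row_y in zip(x, y):
        if len(row_x) != len(row_y):
            raise ValueError("rows must have the same length")
        for px, py in zip(row_x, row_y):
            n += 1
            n_x += px        # foreground count in x
            n_y += py        # foreground count in y
            n_xy += px & py  # pixels set in both images
    if n == 0:
        raise ValueError("images must not be empty")
    mu_x = n_x / n                   # mean of a binary image = foreground fraction
    mu_y = n_y / n
    var_x = mu_x * (1.0 - mu_x)      # variance of a 0/1 variable
    var_y = mu_y * (1.0 - mu_y)
    cov = n_xy / n - mu_x * mu_y     # covariance from the joint count
    num = (2.0 * mu_x * mu_y + c1) * (2.0 * cov + c2)
    den = (mu_x * mu_x + mu_y * mu_y + c1) * (var_x + var_y + c2)
    return num / den
```

     Because every statistic reduces to three pixel counts, the score needs one pass over the two images and no floating-point image buffers, which may matter on the embedded targets the abstract mentions.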
References
[1] 许嘉璐.中文信息处理的现状和发展方向.《未成集——论新时期语言文字工作》:语文出版社, 2000:85-98
    [2] 徐雨明.PostScript 曲线字库的结构和直接显示方法.电脑编程技巧与维护, 1997(11):71-76
    [3] 景年社.字库技术及其应用概述.《印刷技术》, 2002, 10(19):41-50
    [4] Candy L. K. Yiu, Wai Wong.Chinese character synthesis using METAPOST.Proceedings of TUG 2003.2003, 24(1):85-93
    [5] Jungpil Shin, Kazunori Suzuki.Interactive System for Handwritten-Style Font Generation.Computer and Information Technology, IEEE.2004:94-100
    [6] EJ Jakubiak, RN Perry, SF Frisken.An Improved Representation for Stroke-based Fonts.SIGGRAPH, 2006
    [7] 冯万仁, 金连文.“基于部件复用的分级汉字字库的构想与实现”.计算机应用, 2006, 26( 3)
    [8] Lars Borin.Where will the Standards for Intelligent Computer-Assisted Language Learning Come from?.Workshop Proceedings.International Standards of Terminology and Language Resources Management.2002:61–68
    [9] Language Processing and Intelligent Computer-Assisted Language.The MiLCA-project.http://milca.sfs.uni-tuebingen.de
    [10] 潘志庚, 马小虎, 石教英.动态汉字库自动生成算法.自动化学报, 1996, 22(5):561-567
    [11] 姚鸿滨, 邹敏, 柳德钟.《怎样写汉字》软件的教学实验及应用.江苏无锡:无锡教育学院
    [12] 快乐汉字软件网 http://www.zh2002.com
    [13] H.C. LAM, W.W. KI, A.L.S. CHUNG, P.Y. KO.Experience in Designing Databases for Learning Chinese Characters.International Journal of Computer Processing of Oriental Languages, 2000, 13(4):351–375
    [14] H.C. Lam, W.W. Ki, N. Law, A.L.S. Chung, P.Y. Ko.The Design of CALL software for Learning Chinese Characters.The 4th Global Chinese Conference on Computers in Education, 2000, 1:55-63
    [15] H.C. Lam, W.W. Ki, N. Law, A.L.S. Chung, P.Y. Ko.Designing CALL for learning Chinese characters.Journal of Computer Assisted Learning, 2001, 17(1):115-128
    [16] 陈东明, 金连文.“基于骨架自动跟踪的动态汉字数据库的设计与实现”.第五届中国计算机图形学大会论文集( China graph’2004).2004 年 9 月:457-461
    [17] Rafael C.Gonzalez, Richard E.Woods, Steven L.Eddins.《Digital Image Processing Using MATLAB》.北京电子工业出版社, 2004 年 5 月:453-454
    [18] 章毓晋.《图象处理与分析》.清华大学出版社, 1999:224
    [19] 怀王群.二值图象的细化.无锡轻工大学学报, 2001 年, 20(3):316
    [20] 唐良瑞.图象处理实用教程.化学工业出版社, 2001 年:145
    [21] Mao-Jiun J. Wang, Wen-Yen Wu, Liang-Kai Huang, Der-Meei Wang.Corner detection using bending value.Pattern Recognition Letters, 1995, 16(6):575-583
    [22] H.C. Liu, M.D. Srinath.Corner detection from chain-code.Pattern Recognition, 1990:51-68
    [23] T.Wakahara and K.Odaka.Adaptive Normalization of Handwritten Characters Using Global/Local Affine Transformation.IEEE Trans. Pattern Anal. Machine Intell.1998, 20: 1332-1341
    [24] James D.Foley, Andries van Dam, Steven K.Feiner, John F.Hughes, Richard L.Phillips.《Introduction to Computer Graphics》.董士海, 唐泽圣, 李华, 吴恩华, 汪国平.北京机械工业出版社, 2004:233-237
    [25] 段华伟, 黄灵阁.计算机文字处理技术现状.《印刷质量与标准化》, 2004, 5(1) :37-41
    [26] 刁宝成, 焦永和.《计算机图形学》.高等教育出版社, 1999:22-35
    [27] 马小虎, 潘志庚.高质量 Bezier 曲线描述轮廓库自动生成算法.《自动化学报》, 1994, 20(1):121-125
    [28] 王汝传, 邹北骥.《计算机图形学》.北京人民邮电出版社, 2002:121-125
    [29] Yamada K, Tsukumo J.Consideration on stability of Gabor feature extraction and character recognition application.IEICE Tech.Rep.1993:75-82
    [30] Zhou Wang.A Universal Image Quality Index.IEEE SIGNAL PROCESSING LETTERS, 2002, 9(3)
    [31] 丁晓青, 郭繁复.汉字识别技术的发展
    [32] 郭军, 马跃, 盛立东, 钟义信.发展中的文字识别理论与技术.电子学报, 1995, 23 (10):184-185
    [33] 杨俊, 赵荣椿, 任金昌.手写体汉字识别的回顾与展望.中国体视学与图像分析, 1998, 3(1):55
    [34] Bovik A C, Clark M, Geisler W S.Multichannel texture analysis using localized spatial filters.IEEE Trans. Pattern Analysis and Machine Intelligence.1990, 12(1):55-73
    [35] Yamada K, Tsukumo J.Consideration on stability of Gabor feature extraction and character recognition application.IEICE Tech.Rep.1993:75-82
    [36] 覃剑钊, 金连文, 高学, 韦岗.基于 Gabor 滤波器的手写汉字特征提取方法的研究.第 12 届全国神经网络学术大会
    [37] 金连文.手写体汉字识别的研究.华南理工大学博士学位论文.1996
    [38] 高学.基于运动图像的手写汉字识别研究.华南理工大学博士学位论文.2003
