Research on Script Identification Techniques Based on Texture Features of Document Images
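Abstract
With the rapid development of network communication and information processing technologies, document images have become an important source of information. As exchanges between countries grow more frequent, text in many languages and scripts must be recognized and processed, so automatic script identification is essential for extracting information from document images effectively. This dissertation studies script identification based on the texture features of document images. The main contributions are as follows:
     1. The features of document images, texture features in particular, are analyzed. The development history and current state of research on script identification are reviewed, and the existing achievements and open problems are identified.
     2. A script identification algorithm based on the multiwavelet transform is proposed. The energies of the sub-images produced by multiwavelet decomposition of a document image serve as features, and an SVM performs the classification. Experimental results show that the algorithm clearly outperforms the wavelet-based method and, in particular, is more robust to changes in font and text formatting.
     3. Texture descriptors currently used for script identification lack robustness to text-line skew. By studying how the steerable pyramid sub-band energies of the texture primitives that make up characters are distributed, the steerable pyramid energy statistics of a document image are reordered in feature space, yielding an identification algorithm that is robust to text-line skew. Experiments on document images in ten scripts at various skew angles show that the algorithm achieves a high identification rate and is strongly robust to skew.
     4. Because character strokes are strongly directional and character edges carry important texture information, script identification algorithms based on multiscale geometric analysis are proposed. Document images are decomposed with the Contourlet and complex Contourlet transforms, and sub-band energy features are extracted; in addition, the marginal distribution of the Contourlet sub-band coefficients is modeled with a generalized Gaussian distribution, and the model parameters are extracted as features. An SVM is used as the classifier. Experiments on document images in fifteen scripts show that the proposed algorithms improve the ability to distinguish scripts that look alike.
     5. A hierarchical script identification method is proposed. Fourteen scripts are identified in two stages: the first stage uses gray-level projection of text lines for coarse classification, and the second uses a texture-feature-based algorithm for fine classification. The method is efficient and accumulates little error; the identification algorithm can be chosen to suit the characteristics of the scripts, and the depth of the hierarchy can be set by application requirements, which gives the method considerable practical value.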
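The abstract does not spell out how the steerable pyramid energy features in item 3 are reordered. The sketch below shows one plausible reading, offered as an assumption and not as the dissertation's actual rule: treat the orientation with the largest total energy as the dominant one and circularly shift every scale's orientation energies so that it comes first. The function name `realign_orientation_energies` and the `energies[scale][orientation]` layout are hypothetical, and the sketch assumes the sub-band energies have already been computed. A skew equal to a whole number of orientation steps then maps to the same feature vector; for intermediate angles the match is only approximate.

```python
from typing import List, Sequence


def realign_orientation_energies(energies: Sequence[Sequence[float]]) -> List[float]:
    """Reorder sub-band energies so the dominant orientation comes first.

    energies[s][k] is the energy of the sub-band at scale s and orientation k
    (hypothetical layout; every scale must have the same number of orientations).
    """
    if not energies or not energies[0]:
        raise ValueError("need at least one scale and one orientation")
    n_orient = len(energies[0])
    if any(len(row) != n_orient for row in energies):
        raise ValueError("every scale needs the same number of orientations")

    # Total energy per orientation, summed over all scales.
    totals = [sum(row[k] for row in energies) for k in range(n_orient)]
    dominant = max(range(n_orient), key=totals.__getitem__)

    # Circularly shift each scale by the same amount, then flatten.
    features: List[float] = []
    for row in energies:
        row = list(row)
        features.extend(row[dominant:] + row[:dominant])
    return features


if __name__ == "__main__":
    upright = [[9.0, 1.0, 2.0, 3.0], [8.0, 1.0, 1.0, 2.0]]
    # The same texture after a skew of one orientation step (synthetic data).
    skewed = [[3.0, 9.0, 1.0, 2.0], [2.0, 8.0, 1.0, 1.0]]
    print(realign_orientation_energies(upright))
    print(realign_orientation_energies(skewed))
```

Both calls in the usage example yield the same vector, which can then be passed to a classifier such as an SVM.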
With the rapid development of network communication and information processing technology, document images have become an important source of information. As communication among countries becomes more frequent, many languages and scripts need to be identified and processed. Script identification is therefore significant for extracting information from document images effectively. This dissertation focuses on script identification based on texture features of document images. The main work is as follows:
     1. The features of document images, especially texture features, are studied in depth. The development history and current state of research in script identification are introduced. The results achieved so far and the difficulties still faced are pointed out.
     2. A script identification algorithm based on the multiwavelet transform is proposed. The energies of the sub-images after multiwavelet decomposition are used as features, and an SVM is used as the classifier. Experimental results confirm that the proposed algorithm performs markedly better than the one based on the wavelet transform. It is especially robust to changes in character font and format.
     3. Most current texture feature extraction algorithms for script identification do not adapt to text-line skew. To obtain features robust to rotation, the texture units that make up characters are decomposed by the Steerable Pyramid, and the energy features of the sub-bands are studied in depth. An algorithm robust to text-line skew is proposed by realigning the energy statistical features. Experiments are performed on an image database containing ten scripts at different skew angles. The results confirm that the algorithm identifies scripts accurately while remaining robust to text-line skew.
     4. To exploit the directionality of character strokes and the rich texture information in character edges, algorithms based on multi-scale geometric analysis are proposed. Document images are decomposed by the Contourlet and complex Contourlet transforms, and the energy features of the sub-bands are extracted. In addition, the marginal distributions of the Contourlet sub-band coefficients are modeled by a generalized Gaussian model, and the model parameters are used as features. An SVM is used as the classifier. Experiments on an image database containing fifteen scripts confirm that the proposed algorithms improve identification performance on scripts with similar visual features.
     5. A method that identifies scripts in stages is proposed. Fourteen scripts are identified in two steps: a text-line projection algorithm is used in the first step for coarse identification, and a texture-feature-based algorithm is used in the second step for fine identification. This method is efficient and accumulates little error. It is practical because the algorithm can be selected according to the characteristics of the scripts and the number of steps according to application requirements.