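Study on Key Technologies of Scene Text Recognition

Original title: 场景文本识别关键技术研究
Author: Yin Fang (尹芳)
Degree level: Doctoral
Discipline: Computer Application Technology
Chinese keywords: 场景文本识别; 文本提取; 变形校正; Gabor小波变换特征; 典型相关分析
English keywords: scene text recognition; text extraction; distortion correction; Gabor wavelet transform feature; canonical correlation analysis
Degree year: 2012
Supervisor: Chen Deyun (陈德运)
Discipline code: 081203
Degree-granting institution: Harbin University of Science and Technology (哈尔滨理工大学)
Submission date: 2012-09-01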

Abstract

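Scene images contain rich text information that greatly helps people capture and understand the content and meaning of the image, so text plays a very important role in acquiring the visual information of the image in which it appears. If computers could automatically recognize the text in scene images, and this capability were applied to fields such as assistive navigation for the blind, driverless navigation, security, and crisis prevention and handling, it would bring great convenience to people's work and daily life.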
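     Scene text recognition differs markedly from traditional optical character recognition (OCR), mainly because scene text images differ from traditional scanned documents. Scene text images are acquired mainly with digital cameras, video cameras and similar devices. They exhibit inconsistent color, uneven brightness, complex and variable backgrounds and strong noise, and the text may be deformed, blurred, incomplete or have broken strokes. These interfering factors make scene text recognition very difficult. To address these problems, this thesis studies several key technologies of scene text recognition: text extraction from complex backgrounds, correction of text deformation in natural scenes, and recognition of single characters in scene text.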
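     To address the complex composition of scene text backgrounds, which degrades recognition, the characteristics of scene text images are analyzed and a preprocessing step is applied before recognition to extract the text from the complex background. On this basis, a spectral clustering text extraction method based on the normalized cut is proposed. First, a similarity weight function is built from the characteristics of text images. Then, according to the color distribution of scene text, the color space is quantized by color histogram to obtain a limited number of pixel sets of different colors, and the similarity matrix is constructed with the quantized color levels as units, combined with the texture features and spatial distribution of the pixels. Finally, the image is segmented by spectral clustering under the normalized cut criterion. The method uses the quantized color sets as the vertices of the graph partition, which simplifies the weighted graph model, significantly reduces the computational complexity of spectral clustering, and improves the applicability of spectral clustering to image segmentation. Experiments on the ICDAR 2009 and ICDAR 2003 competition test sets and on a large number of other text images show that the method has good text extraction performance.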
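     To address the skew and perspective deformation of scene text caused by tilt of the text carrier itself or of the camera viewpoint during acquisition, a deformation correction method based on mathematical morphology is proposed. Morphological operations extract feature points, with different morphological factors chosen for different deformation cases. Clustering and nearest-neighbor methods then fit the text baseline from the cluster information of the feature points, and the random sample consensus (RANSAC) algorithm computes the baseline position to obtain the deformation parameters. Finally, a projective transformation completes the deformation correction of the text image. Experimental results show that the method can correct scene text with a certain degree of deformation and thereby improve the recognition accuracy of a text recognition system. It has a clear advantage over other methods particularly for scene text with few lines.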
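     To address blurred writing, broken strokes and strong noise in scene text, an improved and highly robust method for extracting Gabor wavelet features is proposed. The method first selects and classifies filter orientations on the basis of the basic Gabor wavelet transform, then extracts Gabor features with the orientation-selective wavelet transform and combines them with a histogram to obtain a combined feature for recognition. A series of comparative experiments shows that the combined feature has good classification ability for low-quality character images such as those with blurred strokes.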
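     In pursuit of a high-performance scene text recognition system, a text recognition method based on background correlation analysis is proposed. Considering the relationship between text and its background in a scene, the method uses canonical correlation analysis (CCA) to mine the correlation between background and text, and extracts the canonical correlation features between character images and background images as character classification features. Tests on a scene text sample set gave satisfactory results, and the experimental data show that the canonical correlation features significantly improve scene text recognition performance, which indicates the effectiveness of this classification feature. The method goes beyond traditional recognition methods, which consider only the features of the text itself, by making full use of the information surrounding the text in the image, and represents a new breakthrough in research on scene text recognition methods. The results also indicate that using background information outside the characters to assist recognition is a topic worth further study, and it offers a new research direction for building high-performance scene text recognition systems.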

Abstract (English version)

Images of natural scenes contain rich text information that helps people capture and understand the content and meaning of the image to a large extent, so text in natural scenes plays an important role in acquiring visual information from images. If computers can automatically recognize the text in natural scene images, and this is applied to assistive navigation for the blind, unmanned navigation, security, crisis prevention and handling, and other fields, work and daily life will become more convenient.
     Scene text recognition differs essentially from traditional optical character recognition (OCR), mainly because scene text is captured with digital cameras or video cameras rather than scanned. The images therefore show inconsistent color, uneven brightness, complicated backgrounds and strong noise, and the text may be deformed, of low resolution, or have broken strokes. These factors make scene text recognition difficult. This thesis studies scene text recognition and investigates its key technical issues.
     To address the complex backgrounds of scene text, which degrade recognition, the characteristics of scene text images and the state of research on text segmentation are analyzed, and image preprocessing is adopted as the first step to separate the text from the complex background. On this basis, an improved text image segmentation method based on spectral clustering is proposed. First, a similarity function is established from the characteristics of text images. Then, according to the color distribution of scene images, the color space is quantized using the color histogram to obtain a limited number of pixel sets, one per color, and the affinity matrix is constructed over the quantized levels. Finally, spectral clustering under the normalized cut (Ncut) criterion segments the image. Using the quantized color sets as graph vertices simplifies the weighted graph model, so the computational complexity of spectral clustering is reduced significantly and its applicability to image segmentation is improved. Experiments on the ICDAR 2003 and ICDAR 2009 competition test images and on many other text images show that the proposed method performs well on text segmentation.
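
The idea of clustering quantized colors instead of individual pixels can be sketched as follows. This is only an illustration under stated assumptions, not the thesis's implementation: the uniform RGB quantization, the Gaussian color affinity (the thesis also uses texture and spatial distribution), the parameter `sigma`, and the function names are all hypothetical choices made for this example.

```python
import numpy as np

def quantize_colors(pixels, levels=4):
    """Map RGB pixels (N x 3, values 0..255) to coarse color bins.

    Assumption: uniform quantization with `levels` bins per channel.
    Returns bin centers, pixel counts per bin, and each pixel's bin index.
    """
    step = 256 // levels
    bins = np.minimum(pixels.astype(np.int64) // step, levels - 1)
    keys = bins[:, 0] * levels * levels + bins[:, 1] * levels + bins[:, 2]
    uniq, inverse, counts = np.unique(keys, return_inverse=True, return_counts=True)
    centers = np.stack(
        [uniq // (levels * levels), (uniq // levels) % levels, uniq % levels], axis=1
    )
    return (centers + 0.5) * step, counts, inverse

def ncut_bipartition(centers, counts, sigma=60.0):
    """Split the color bins into two groups with a normalized-cut relaxation.

    The graph has one vertex per occupied color bin, so its size depends on
    the number of colors rather than the number of pixels.
    """
    d2 = ((centers[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    # Edge weight between bins is scaled by how many pixels each bin holds.
    W = np.exp(-d2 / (2.0 * sigma**2)) * np.outer(counts, counts)
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    L_sym = np.eye(len(counts)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L_sym)          # eigenvalues in ascending order
    fiedler = vecs[:, 1] * d_inv_sqrt        # second generalized eigenvector
    return fiedler > 0                       # simple sign split (an assumption)
```

A per-pixel label is then obtained by indexing the bin labels with the pixel-to-bin map, for example `ncut_bipartition(centers, counts)[inverse]`. Which of the two groups is text still has to be decided by a separate rule.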
     An effective correction method is presented for the skew and perspective distortion of scene text caused by tilt of the text carrier itself or of the camera view. Mathematical morphology is used to select morphological factors for the various kinds of distortion and to extract feature points. Clustering and nearest-neighbor methods then use the cluster information to fit the text baseline, and RANSAC (Random Sample Consensus) locates the baseline so that the distortion parameters can be obtained. Finally, a projective transformation completes the distortion correction of the text image. Experiments show that the method corrects text image distortion effectively and improves the text recognition rate significantly, with a particular advantage for scene text with only a few lines.
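
The RANSAC step for locating a baseline can be illustrated in isolation. The sketch below assumes the feature points are already available as (x, y) pairs. The iteration count, the inlier tolerance and the function names are hypothetical, and the morphological feature extraction and the final image transformation are not shown.

```python
import math
import random

def ransac_line(points, n_iters=200, inlier_tol=1.5, seed=0):
    """Fit a line a*x + b*y + c = 0 (with a^2 + b^2 = 1) to 2-D points.

    Repeatedly picks two points, builds the line through them, and keeps
    the line supported by the most points within `inlier_tol`.
    """
    rng = random.Random(seed)
    best_line, best_inliers = None, []
    for _ in range(n_iters):
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        a, b = y2 - y1, x1 - x2
        norm = math.hypot(a, b)
        if norm == 0:                      # identical points, no line defined
            continue
        a, b = a / norm, b / norm
        c = -(a * x1 + b * y1)
        inliers = [(x, y) for x, y in points if abs(a * x + b * y + c) <= inlier_tol]
        if len(inliers) > len(best_inliers):
            best_line, best_inliers = (a, b, c), inliers
    return best_line, best_inliers

def skew_angle_degrees(line):
    """Angle of the line against the x-axis, folded into (-90, 90]."""
    a, b, _ = line
    angle = math.degrees(math.atan2(-a, b))   # direction vector is (b, -a)
    if angle > 90:
        angle -= 180
    elif angle <= -90:
        angle += 180
    return angle
```

Because only the consensus set decides the result, stray feature points from background clutter do not pull the estimated baseline the way they would in a plain least-squares fit.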
     To address the low resolution, poor quality and serious noise of scene text, this thesis uses the highly robust Gabor wavelet transform feature as the classification feature. The original Gabor wavelet transform is further improved by pre-classification of filter direction and by fusing the resulting features with a histogram. A series of comparative experiments shows that the proposed features have good classification ability for low-quality character images with blurred strokes.
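
A generic orientation-selective Gabor descriptor gives a feel for this kind of feature. The abstract does not specify the direction pre-classification rule or how the histogram is built, so everything below (four orientations, kernel size, wavelength, grid pooling of response magnitudes, circular FFT convolution, the function names) is an assumption made for illustration rather than the thesis's method.

```python
import numpy as np

def gabor_kernel(size, wavelength, theta, sigma):
    """Complex Gabor kernel whose carrier wave runs along direction `theta`."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
    kernel = envelope * np.exp(1j * 2.0 * np.pi * xr / wavelength)
    return kernel - kernel.mean()            # zero sum: flat regions respond weakly

def gabor_orientation_features(image, n_orient=4, size=9, wavelength=4.0,
                               sigma=2.0, grid=2):
    """Pool Gabor response magnitudes per orientation over a grid of cells.

    Output layout: for each orientation in turn, grid*grid cell averages.
    The vector is L2-normalized.
    """
    img = image.astype(float)
    h, w = img.shape
    spectrum = np.fft.fft2(img)
    feats = []
    for k in range(n_orient):
        theta = k * np.pi / n_orient
        kern = gabor_kernel(size, wavelength, theta, sigma)
        # Circular convolution through the FFT (kernel zero-padded to image size).
        resp = np.abs(np.fft.ifft2(spectrum * np.fft.fft2(kern, s=(h, w))))
        for rows in np.array_split(resp, grid, axis=0):
            for cell in np.array_split(rows, grid, axis=1):
                feats.append(cell.mean())
    feats = np.array(feats)
    return feats / (np.linalg.norm(feats) + 1e-12)
```

On an image of vertical stripes whose period matches `wavelength`, the entries for `theta = 0` dominate those for `theta = pi/2`, which is the orientation selectivity that makes such features useful for stroke-based characters.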
     A text recognition method based on canonical correlation analysis (CCA) of the background is proposed in pursuit of a high-performance scene text recognition system. Exploiting the correlation between scene text and its background, the method uses CCA to mine that correlation and extracts the CCA feature as the classification feature for characters. The method obtained satisfactory results, and the experimental data show that the CCA feature significantly improves scene text recognition performance. It goes beyond the limitation of traditional recognition approaches, which only consider the characteristics of the text itself, and makes full use of the information surrounding the text in the image, which is a new breakthrough in scene text recognition. The experimental results also show that using background information to assist recognition is a worthy subject for further research and provides a new research idea for achieving a high-performance scene text recognition system.
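
For readers unfamiliar with CCA, the following sketch shows the standard computation on two feature sets, here imagined as character features `X` and background features `Y` measured on the same samples. The regularization term `reg`, the function name, and the way the features would be fed to a classifier are assumptions for this example. The abstract does not describe the thesis's feature construction in detail.

```python
import numpy as np

def cca(X, Y, n_components=2, reg=1e-6):
    """Canonical correlation analysis via whitening and SVD.

    X: (n, p) array, Y: (n, q) array, rows are paired samples.
    Returns projections Wx (p, k), Wy (q, k) and the k canonical correlations.
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    n = X.shape[0]
    # Small ridge term keeps the covariance matrices invertible.
    Sxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)

    def inv_sqrt(S):
        vals, vecs = np.linalg.eigh(S)
        return vecs @ np.diag(1.0 / np.sqrt(vals)) @ vecs.T

    Sxx_is, Syy_is = inv_sqrt(Sxx), inv_sqrt(Syy)
    # Singular values of the whitened cross-covariance are the correlations.
    U, s, Vt = np.linalg.svd(Sxx_is @ Sxy @ Syy_is)
    Wx = Sxx_is @ U[:, :n_components]
    Wy = Syy_is @ Vt.T[:, :n_components]
    return Wx, Wy, s[:n_components]
```

Projecting centered features with `Wx` and `Wy` yields pairs of variables that are maximally correlated across the two views. One plausible use, again an assumption, is to concatenate or sum these projections to form the classification feature.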

