Research on Methods for Locating Characters in Images
Abstract
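With the development of multimedia technology, digital images (and video) are used ever more widely in every field, and retrieving the needed information from images (video) has become a pressing requirement. The character information in an image (video) reflects, to some extent, part of its content, so automatically locating the character regions in an image and extracting that text is a key step.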
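     Reference [12] proposes that the texture formed by character edges in an image is directional, running horizontally, vertically, or diagonally. That method first extracts directional information from the horizontal and vertical textures of the characters, then marks candidate character regions according to a threshold for each direction, removes noise by morphological filtering, and finally uses the average diagonal energy of the diagonal texture to decide whether a region is a character region. Reference [12] applies this model to data compressed with DCT-based coding, with good results. This thesis redefines the energy in that algorithm using inter-block information and improves the original algorithm with an adaptive dynamic threshold. Comparative experiments show that the improved method locates characters in images more accurately than the original method and reduces missed detections to some degree. This thesis also extends the model to wavelet analysis for character localization. It analyzes the characteristics of edges in each direction after the wavelet transform and uses wavelet analysis, which has good time-frequency localization and variable-scale properties, to extract edge sub-images at different spatial resolutions in the horizontal, vertical, and diagonal directions. Regions that satisfy the horizontal and vertical energy thresholds are merged and denoised, and the diagonal energy threshold, that is, the high-high-frequency energy, is then used as the criterion for deciding whether a region is a true character region. In the experiments the correct detection rate reached 93.7%, with a low miss rate of 6.3% and a false detection rate of between ten and twenty percent.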
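     The model can be used to locate characters in spatial-domain images and also applies to compressed data coded with the wavelet transform or with DCT-based techniques. Extensive experiments confirm that the improved model achieves high accuracy.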
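The DCT-domain idea described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the abstract does not give the energy formula, the inter-block definition, or the threshold rule, so the sums of absolute AC coefficients, the 3x3 neighbourhood average, and the mean-plus-k-standard-deviations threshold below are all assumptions, and every function name is hypothetical. In a real compressed-domain setting the coefficients would be read from the bit-stream rather than recomputed as done here.

```python
import math

N = 8  # block size of JPEG-style DCT coding


def dct2(block):
    """Orthonormal 2-D DCT-II of an N x N block given as a list of rows."""
    def alpha(k):
        return math.sqrt((1.0 if k == 0 else 2.0) / N)

    coef = [[0.0] * N for _ in range(N)]
    for u in range(N):        # frequency along the vertical axis
        for v in range(N):    # frequency along the horizontal axis
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            coef[u][v] = alpha(u) * alpha(v) * s
    return coef


def directional_energy(coef):
    """Assumed energy measure: sums of absolute AC coefficients.

    Returns (e_h, e_v, e_d): e_h responds to intensity changes along a row,
    e_v to changes along a column, and e_d to changes in both directions.
    """
    e_h = sum(abs(coef[0][v]) for v in range(1, N))
    e_v = sum(abs(coef[u][0]) for u in range(1, N))
    e_d = sum(abs(coef[k][k]) for k in range(1, N))
    return e_h, e_v, e_d


def neighbour_average(grid):
    """Hypothetical inter-block energy: mean over each block's 3x3 neighbourhood."""
    rows, cols = len(grid), len(grid[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            vals = [grid[i][j]
                    for i in range(max(0, r - 1), min(rows, r + 2))
                    for j in range(max(0, c - 1), min(cols, c + 2))]
            out[r][c] = sum(vals) / len(vals)
    return out


def adaptive_threshold(values, k=1.0):
    """Hypothetical dynamic threshold: mean + k * standard deviation."""
    mean = sum(values) / len(values)
    var = sum((x - mean) ** 2 for x in values) / len(values)
    return mean + k * math.sqrt(var)
```

A block would then be a candidate when its smoothed energy exceeds the threshold computed over all blocks of the image, so the cut-off follows the image content instead of being fixed in advance.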
With the development of multimedia, digital images (video) are applied more and more widely in every field, and people now need to be able to find the information they want among many images (video). The character information in an image reflects an important part of its content. Automatic image indexing based on the semantically meaningful characters in an image is one of the important research areas of image indexing technology.
    Reference [12] observes that the texture of character edges is directional: horizontal, vertical, and slanted. Because these distinguishing texture features (the horizontal, vertical, or slanted lines in a character) can be extracted directly, the character regions are segmented from their background quickly, and the image noise that arises during processing can be removed by a morphological filter. With this method, compressed bit-streams encoded by a DCT-based algorithm can be processed directly to locate the character regions in the images, and only a very small amount of decoding is required. As a result, there is less data to process, processing is faster, and less computer memory is needed. This thesis presents a new energy definition that incorporates neighboring blocks, replacing the one in Reference [12], and an adaptive dynamic threshold in place of the fixed threshold. Experiments show that the amended method performs better than that of Reference [12]. At the same time, the method can be combined with the wavelet transform to locate characters. By multi-resolution analysis and pyramid decomposition, edge components with different spatial resolutions and different directions can be acquired, among which the detail components have the most distinctive texture features representing the object region. Further morphological operations then greatly reduce the useless information, and the final text region is obtained.
    Experiments show that this model achieves a high correct localization rate when applied in both the spatial domain and the wavelet-transform domain.
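The wavelet-domain steps can be sketched in the same hedged way. The abstract does not name the wavelet, the structuring element, or the thresholds, so the following Python sketch assumes a one-level Haar transform, treats a cell as a candidate when either its horizontal or its vertical detail exceeds a threshold, and denoises with a morphological opening using a 1x3 horizontal structuring element. All function names and parameters are hypothetical.

```python
def haar2d(img):
    """One-level 2-D Haar transform of an image with even height and width.

    Returns (approx, horiz, vert, diag). horiz responds to horizontal edges
    (top row vs bottom row), vert to vertical edges (left vs right column),
    and diag to diagonal detail.
    """
    approx, horiz, vert, diag = [], [], [], []
    for r in range(0, len(img), 2):
        a_row, h_row, v_row, d_row = [], [], [], []
        for c in range(0, len(img[0]), 2):
            p00, p01 = img[r][c], img[r][c + 1]
            p10, p11 = img[r + 1][c], img[r + 1][c + 1]
            a_row.append((p00 + p01 + p10 + p11) / 4.0)
            h_row.append((p00 + p01 - p10 - p11) / 4.0)
            v_row.append((p00 - p01 + p10 - p11) / 4.0)
            d_row.append((p00 - p01 - p10 + p11) / 4.0)
        approx.append(a_row)
        horiz.append(h_row)
        vert.append(v_row)
        diag.append(d_row)
    return approx, horiz, vert, diag


def candidate_mask(horiz, vert, t_h, t_v):
    """Mark cells whose horizontal or vertical detail exceeds its threshold."""
    return [[1 if abs(h) > t_h or abs(v) > t_v else 0
             for h, v in zip(h_row, v_row)]
            for h_row, v_row in zip(horiz, vert)]


def erode(mask):
    """Keep a cell only if it and both horizontal neighbours are set."""
    cols = len(mask[0])
    return [[1 if all(0 <= c + k < cols and row[c + k] for k in (-1, 0, 1)) else 0
             for c in range(cols)] for row in mask]


def dilate(mask):
    """Set a cell if it or either horizontal neighbour is set."""
    cols = len(mask[0])
    return [[1 if any(0 <= c + k < cols and row[c + k] for k in (-1, 0, 1)) else 0
             for c in range(cols)] for row in mask]


def opening(mask):
    """Morphological opening (erosion, then dilation) removes isolated cells."""
    return dilate(erode(mask))


def mean_abs(band, mask):
    """Mean absolute coefficient of `band` over the cells selected by `mask`."""
    vals = [abs(x) for b_row, m_row in zip(band, mask)
            for x, m in zip(b_row, m_row) if m]
    return sum(vals) / len(vals) if vals else 0.0
```

A region that survives the opening would be accepted as text only if `mean_abs(diag, region_mask)` exceeds a diagonal-energy threshold, which mirrors the final verification step the abstract describes.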
References
【1】黄学军,邢爱风,等.基于内容的图像检索系统.电子技术应用,29(11),2003:29-32
    【2】石军,常义林.图像检索技术综述.西安电子科技大学学报(自然科学版),30(4),2003:486-491
    【3】V. Wu, R. Manmatha, E. M. Riseman. Finding text in images, in Proc. 2nd ACM Int. Conf. Digital Libraries, Philadelphia, PA, July 1997, 23-26
    【4】V. Wu, R. Manmatha, E. M. Riseman. TextFinder: An Automatic System to Detect and Recognize Text in Images. IEEE Trans. on Pattern Analysis and Machine Intelligence, 21(11), 1999:1224-1229
    【5】Xian-Sheng Hua, Xiang-Rong Chen, Liu Wenyin, Hong-Jiang Zhang. Automatic Location of Text in Video Frames. 3rd Intl Workshop on Multimedia Information Retrieval, October 5, 2001, Ottawa, Canada
    【6】马小勇,谢萍,等.视频帧中提取文字区域的算法.计算机工程.29(9),2003:155-157
    【7】JianMing Hu, Jie Xi, LiDe Wu. Automatic Detection And Verification of Text Regions In News Video Frames. International Journal of Pattern Recognition and Artificial Intelligence, 16(2), 2002:257-271
    【8】欧国斌,张利,等.视频信号中实时字幕信息的提取方法.清华大学学报(自然科学版),42(7),2002:869-872
    【9】Toshio Sato, Takeo Kanade, Ellen K. Hughes, Michael A. Smith, Shin'ichi Satoh. Video OCR: Indexing Digital News Libraries by Recognition of Superimposed Captions. Multimedia Syst. 7(5), 1999:385-395
    【10】H. Li, D. Doermann and O. Kia. Automatic text detection and tracking in digital video, IEEE Trans. Image Process. 9(1), 2000:147-15
    【11】A. K. Jain and B. Yu. Automatic text location in images and video frames. Pattern Recognition. 31(12), 1998:2055-2076
    【12】黄祥林,沈兰荪.基于DCT压缩域的图像字符定位.中国图像图形学报,7(1),2002:22-26
    【13】Yu Zhong, Hongjiang Zhang, Anil K. Jain. Automatic Caption Localization in Compressed Video. IEEE Transactions on Pattern Analysis and Machine Intelligence. 31(12),2000:2055-2076
    【14】David Crandall, Rangachar Kasturi. Robust Detection of Stylized Text Events in Digital Video. Proceedings of International Conference on Document Analysis and Recognition 2001:865-869
    【15】Yi Zhang, Tat-Seng Chua. Detection of Text Captions in Compressed Domain Video. Proceedings of ACM Multimedia 2000 Workshops (Multimedia Information Retrieval), California, USA, Nov. 2000, 201-204
    【16】王辰,老松杨,胡晓峰等.视频中的文字探测.小型微型计算机系统,23(4),2002:478-481
    【17】李朝晖,王秀峰.影视字幕文字识别的研究.计算机工程.28(3),2002:175-176
    【18】万罡,周洞汝,等.数字视频中文字分割算法的研究.计算机工程与研究.39(2),2003:103-105
    【19】蔡波,周洞汝,等.数字视频中字幕检测及提取的研究和实现.计算机辅助设计与图形学学报.15(7),2003:898-903
    【20】李朝晖,余英林.基于小波形态学的文本自动检测.计算机工程与研究.39(14),2003:119-120
    【21】黄晓东,周源华.用小波变换及颜色聚类提取的视频图像内中文字幕.计算机工程.29(1),2003:4-44,135
    【22】胡广书.数字信号处理——理论、算法与实现.北京:清华大学出版社.1999
    【23】刘艳,李宏东.DCT域图像处理和特征提取技术.中国图象图形学报.8A(2),2003:121-128
    【24】李晓华,沈兰荪.基于压缩域的图像检索技术.计算机学报,26(9),2003:1051-1059
    【25】吴冬升,吴乐南.基于JPEG的图像检索综述.计算机应用与软件.20(3),2003:1-2
    【26】谭雁英,董志信.图像边缘的DCT频谱特征的分析.西北工业大学学报.13(2),1995:298-302
    【27】Stephane Mallat.信号处理的小波导引.北京:机械工业出版社.2002
    【28】胡昌华,张军波,等.基于MATLAB的系统分析与设计.西安:西安电子科技大学出版社.2001
    【29】杨福生.小波变换的工程分析与应用.北京:科学出版社2001
    【30】崔锦泰.小波分析导论[M],程正兴译.西安:西安交通大学出版社.1992
    【31】程正兴.小波分析算法与应用[M].西安:西安交通大学出版社.1998
    【32】李建平,小波分析与信号处理——理论、应用及其软件实现[M],重庆:重庆出版社,1997
    【33】秦前清,杨宗凯,实用小波分析[M].西安:西安电子科技大学出版社.1994
    【34】杨文杰,刘浩学,等.边界探测的小波变换方法.中国图象图形学报4(1),1999:38-40
    【35】孙家广,等.计算机图形学.北京:清华大学出版社.1998
    【36】David F.Rogers,等.计算机图形学算法基础.北京:机械工业出版社.2002
    【37】何斌,王运坚,等.数字图像处理.北京:人民邮电出版,2001
    【38】赵荣椿,等.数字图像导论.西安:西北工业大学出版社.1995
