Research on Segmentation and Skew Detection Methods for Multi-Region Images
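Abstract
In today's information society, computers have entered every field and the Internet is increasingly widespread. People rely more and more on computers to obtain information, and a large share of processing work has moved onto computers. How to convert traditional paper text into electronic text has therefore become a topic of interest. Daily life and work involve handling large numbers of documents, including not only text-only documents but also documents that mix text and graphics and image files, so the need to enter documents into a computer quickly and accurately has become pressing.
     This thesis studies segmentation and skew detection methods for multi-region images. It surveys the commonly used text-image segmentation algorithms and describes the strengths and weaknesses of each. Text-image processing algorithms fall broadly into two classes: geometric analysis and texture analysis. Geometric analysis can be further divided into top-down, bottom-up, and hybrid methods. The thesis describes in detail two top-down segmentation algorithms, the run-length smoothing algorithm and the projection profile algorithm, and two bottom-up methods, the neighborhood line density method and connected component analysis. It also lists several common image segmentation algorithms.
     Building on these basic segmentation methods, the thesis proposes an improved projection profile algorithm for multi-region images. The algorithm addresses the fact that the ordinary projection profile algorithm cannot segment complex multi-region images that contain skew. The image is first binarized, and the erosion and dilation operations of mathematical morphology are used to reduce noise. The improved projection profile algorithm is then applied to the result: even when there are no valley points along the X and Y axes, it finds cut points from the distribution of the image pixels, cuts the image into small blocks, and applies projection analysis to each block, repeating this process until every region of the image has been separated.
     Document skew detection methods fall broadly into five classes: methods based on the Hough transform, methods based on cross-correlation, methods based on projection, methods based on the Fourier transform, and K-nearest-neighbor clustering. Of these, Fourier-transform methods are computationally very expensive and are therefore seldom used.
     Document images usually suffer some loss when scanned into a computer, and their edges are quite irregular. Finding the image contour with an ordinary edge extraction method adds a great deal of unnecessary computation. To address the heavy computation of typical skew detection algorithms, the thesis proposes a simple way of finding the edges: there is no need to locate the edge contour of the document image precisely, only the region that contains the image, which is its enclosing rectangle, that is, the bounding box. The thesis introduces a genetic algorithm (GA) to detect the skew angle. The method uses the area of the bounding box as the fitness value and needs only the top, bottom, left, and right coordinates of the image, which greatly reduces the computation. Experimental results show that the algorithm detects the skew angle with high accuracy.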
In the modern information society, computer technology reaches into every area of our lives. The Internet is increasingly widespread, we depend on computers for information more than ever before, and a great deal of work has shifted onto computers. How to convert traditional paper documents into electronic text has therefore become a topic of interest. Daily life and work involve large numbers of documents, including not only text-only files but also images and mixed text-and-graphics files, so entering them into a computer quickly and accurately has become an urgent requirement.
     The main purpose of this thesis is to study algorithms for page segmentation and skew detection of multi-region document images. The thesis surveys the common page segmentation algorithms and describes the advantages and disadvantages of each. Page segmentation methods generally fall into two types: structural analysis and texture analysis. Structural analysis includes top-down, bottom-up, and hybrid approaches. The thesis presents two top-down methods, run-length smoothing and projection profile cut, and two bottom-up methods, neighborhood line density and connected component analysis. It also lists several algorithms commonly used in image segmentation.
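As an illustration of the run-length smoothing idea named above, the following is a minimal sketch in its commonly described form, not the thesis's implementation. It assumes a binary image stored as a list of rows with 1 for foreground, fills only background runs bounded by foreground on both sides, and combines the horizontal and vertical passes with a logical AND. The function names and thresholds are hypothetical.

```python
def rlsa_row(row, threshold):
    """Fill background runs of length <= threshold that lie between two foreground pixels."""
    out = list(row)
    last_fg = None  # index of the most recent foreground pixel seen so far
    for i, v in enumerate(row):
        if v:
            if last_fg is not None and 0 < i - last_fg - 1 <= threshold:
                for j in range(last_fg + 1, i):
                    out[j] = 1
            last_fg = i
    return out


def rlsa(image, h_threshold, v_threshold):
    """Smooth rows and columns separately, then AND the two results."""
    horiz = [rlsa_row(r, h_threshold) for r in image]
    # Smooth each column by treating it as a row, then transpose back.
    cols = [rlsa_row(list(c), v_threshold) for c in zip(*image)]
    vert = [list(r) for r in zip(*cols)]
    return [[a & b for a, b in zip(hr, vr)] for hr, vr in zip(horiz, vert)]
```

With a threshold of 2, a row such as `[1, 0, 0, 1, 0, 0, 0, 0, 1, 0]` has its first gap filled while the longer gap is left open, so nearby characters merge into blocks and distant regions stay apart.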
     Building on these algorithms, the thesis presents an improved projection profile cut algorithm. It addresses the problem that the ordinary projection profile cut algorithm cannot handle complicated documents containing skewed multi-regions. First, the image is binarized and then denoised with the erosion and dilation operations of mathematical morphology. The improved algorithm is then applied: even when the projections onto the X-axis and Y-axis contain no valley points, it finds cut points from the distribution of the image pixels. The image is cut into small blocks at these points, and the same projection analysis is repeated on each block until all regions are separated.
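For orientation, here is a sketch of the ordinary recursive projection profile cut that the improved algorithm starts from. The abstract does not specify how the improved method picks cut points when no valley exists, so that step is not reproduced here. The sketch assumes a binary image as a list of rows (1 for foreground) and half-open blocks `(top, bottom, left, right)`; the function names and the `min_gap` parameter are hypothetical.

```python
def projection(image, top, bottom, left, right, axis):
    """Foreground count per row (axis=0) or per column (axis=1) inside a block."""
    if axis == 0:
        return [sum(image[y][left:right]) for y in range(top, bottom)]
    return [sum(image[y][x] for y in range(top, bottom)) for x in range(left, right)]


def runs_of_content(profile, min_gap):
    """Split a profile into (start, end) spans separated by >= min_gap empty bins."""
    spans, start, gap = [], None, 0
    for i, v in enumerate(profile):
        if v > 0:
            if start is None:
                start = i
            elif gap >= min_gap:
                spans.append((start, i - gap))  # close the span before the valley
                start = i
            gap = 0
        elif start is not None:
            gap += 1
    if start is not None:
        spans.append((start, len(profile) - gap))  # trim trailing empty bins
    return spans


def xy_cut(image, block, min_gap=2, axis=0, stalled=False):
    """Alternate horizontal and vertical cuts until no valley remains in either direction."""
    top, bottom, left, right = block
    spans = runs_of_content(projection(image, top, bottom, left, right, axis), min_gap)
    if not spans:
        return []  # block contains no foreground
    if axis == 0:
        children = [(top + s, top + e, left, right) for s, e in spans]
    else:
        children = [(top, bottom, left + s, left + e) for s, e in spans]
    if len(children) == 1:
        if stalled:
            return children  # no cut possible along either axis: a leaf region
        return xy_cut(image, children[0], min_gap, 1 - axis, stalled=True)
    regions = []
    for child in children:
        regions.extend(xy_cut(image, child, min_gap, 1 - axis))
    return regions


def segment(image, min_gap=2):
    """Segment a whole image, starting with a cut along the rows."""
    return xy_cut(image, (0, len(image), 0, len(image[0])), min_gap)
```

This plain version depends on empty valleys in the projections, which is exactly what skewed or tightly packed regions fail to provide.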
     Skew estimation methods can be classified into five general categories: Hough transform, cross-correlation, projection profile, Fourier transform, and K-nearest-neighbor clustering. Of these, Fourier-transform methods are rarely used because of their high computational cost.
     Document images inevitably suffer some loss during scanning, and their edges are irregular. Finding the contour with an ordinary edge detection method adds a great deal of unnecessary computation. The thesis proposes a simple method for locating the image that does not need to find the edges accurately, only the area that contains the image. This area is the bounding box. The thesis uses a genetic algorithm (GA) to detect the skew angle. The method uses the area of the bounding box as its fitness function, for which only four extreme coordinates (top, bottom, left, and right) need to be found. This greatly reduces the amount of computation. Experimental results show that the proposed algorithm detects skew angles with high accuracy.
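The bounding-box fitness can be sketched as follows. This is a toy illustration under stated assumptions, not the thesis's implementation: the abstract gives no GA details, so the population size, truncation selection with elitism, blend crossover, Gaussian mutation, the search range of -45 to 45 degrees, and all function names are assumptions. The page is modeled as a set of foreground points, and the search looks for the rotation that minimizes the bounding-box area.

```python
import math
import random


def rotate(points, angle_deg):
    """Rotate (x, y) points about the origin by angle_deg degrees."""
    c = math.cos(math.radians(angle_deg))
    s = math.sin(math.radians(angle_deg))
    return [(x * c - y * s, x * s + y * c) for x, y in points]


def bbox_area(points, angle_deg):
    """Bounding-box area after rotation; only the four extreme coordinates are needed."""
    rotated = rotate(points, angle_deg)
    xs = [p[0] for p in rotated]
    ys = [p[1] for p in rotated]
    return (max(xs) - min(xs)) * (max(ys) - min(ys))


def estimate_correction_angle(points, rng, pop_size=40, generations=60,
                              max_angle=45.0, sigma=1.0):
    """Toy GA: find the rotation (in degrees) that minimizes the bounding-box area."""
    pop = [rng.uniform(-max_angle, max_angle) for _ in range(pop_size)]
    n_elite = pop_size // 4
    for _ in range(generations):
        pop.sort(key=lambda a: bbox_area(points, a))  # smaller area means fitter
        elite = pop[:n_elite]                          # survivors are carried over unchanged
        children = []
        while len(children) < pop_size - n_elite:
            p1, p2 = rng.choice(elite), rng.choice(elite)
            w = rng.random()
            # Blend crossover of two elite angles plus Gaussian mutation.
            child = w * p1 + (1 - w) * p2 + rng.gauss(0.0, sigma)
            children.append(max(-max_angle, min(max_angle, child)))
        pop = elite + children
    return min(pop, key=lambda a: bbox_area(points, a))


# Synthetic page: a 60 x 40 grid of foreground points, skewed by 7 degrees.
page = [(x, y) for x in range(0, 61, 10) for y in range(0, 41, 10)]
skewed = rotate(page, 7.0)
correction = estimate_correction_angle(skewed, random.Random(0))
```

The returned value is the correction to apply, so the estimated skew is its negative. For a rectangular page the bounding-box area is smallest when the page is axis-aligned, which is why the area can serve as the fitness value.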
