Research and Application of Document Image Paragraph Segmentation


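English title: Research and Application of Document Image Paragraph Segmentation
Author: Zhao Na (赵娜)
Degree level: Master's
Discipline: Management Science and Engineering
Chinese keywords: 文档图像分割 ; 段落分割 ; 行分割 ; Mumford-Shah模型 ; 水平集
English keywords: document image segmentation ; paragraph segmentation ; text line segmentation ; Mumford-Shah model ; level set
Degree year: 2010
Advisor: Wang Xichang (王希常)
Discipline code: 1201
Degree-granting institution: Shandong Normal University (山东师范大学)
Thesis submission date: 2010-05-26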

Abstract

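With the continuing development of computer technology, storage capacity has grown substantially, and more and more paper documents are captured by digital input devices and stored as images for use by downstream processing systems. Image analysis techniques that convert these stored document images into electronic documents that can be searched, analyzed, and reused have attracted wide attention. Document image segmentation is a key step in moving from document image processing to image analysis, and it occupies an important place in image engineering.
     Document image segmentation is a basic yet very important stage of image processing, with substantial practical and theoretical value, and it is also an important step in computer vision. Segmentation quality directly affects the accuracy of subsequent classification and recognition. Many segmentation algorithms have been proposed for printed document images, and some achieve good results, but important problems remain open for handwritten document images. Among current algorithms for handwritten documents, some operate only on skew-corrected images and some target a specific language; a general algorithm that segments all kinds of images uniformly is lacking. Most document image segmentation algorithms are also sensitive to changes in the topology of handwritten characters, whereas geometric active contour models, so far applied mainly to medical images, handle topological changes naturally.
     Active contour models make full use of high-level information and are top-down processing models. They provide a unified theoretical basis for a range of vision problems such as contour extraction, stereo matching, and object tracking, and they have been widely applied in human-computer interaction, image registration, object tracking, and many other fields. Level set image segmentation based on the Mumford-Shah model is a particularly effective method: because it relies on the global information of homogeneous regions in the image, it improves segmentation accuracy and gives the method a degree of robustness to noise.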
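     The abstract does not state the functional itself. As a hedged illustration of what "global information of homogeneous regions" means, the piecewise-constant simplification of the Mumford-Shah model is usually written in the Chan-Vese form below. The symbols ($u_0$ for the image, $C$ for the contour, $c_1$ and $c_2$ for the region means, and the weights $\mu$, $\nu$, $\lambda_1$, $\lambda_2$) follow the standard literature and are assumptions here, not notation taken from this record.

```latex
\begin{aligned}
E(c_1, c_2, C) ={}& \mu \,\mathrm{Length}(C) + \nu \,\mathrm{Area}(\mathrm{inside}(C)) \\
 &+ \lambda_1 \int_{\mathrm{inside}(C)} \lvert u_0(x,y) - c_1 \rvert^2 \,dx\,dy
  + \lambda_2 \int_{\mathrm{outside}(C)} \lvert u_0(x,y) - c_2 \rvert^2 \,dx\,dy, \\
c_1 ={}& \operatorname{mean}\bigl(u_0 \text{ inside } C\bigr), \qquad
c_2 = \operatorname{mean}\bigl(u_0 \text{ outside } C\bigr).
\end{aligned}
```

     Both fitting terms integrate over whole regions rather than sampling gradients along the contour, which is why a few noisy pixels have little effect on where the minimizing contour settles.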
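     This thesis first introduces the origin and development of image segmentation based on active contour models, then describes in detail the basic theory of level sets and image segmentation based on the Mumford-Shah model. Finally, considering the characteristics of scanned document images and the required segmentation output, it adopts the C-V image segmentation method, which is based on a simplified Mumford-Shah model, to extract the contours of text lines and natural paragraphs, and it analyzes the strengths and weaknesses of the algorithm. The algorithm flow is described in detail. Traditional level set methods require costly re-initialization to keep the level set function a signed distance function (SDF) during iteration, which slows segmentation considerably. To increase speed, the experiments draw on Chunming Li's level set method without re-initialization, which speeds up segmentation while still producing a valid result. Several hundred experiments show that the algorithm achieves good segmentation in about 60 seconds and within 10 iterations, and that it is independent of language and applicable to document images in any language.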
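     The combination described above, a C-V data term plus a penalty that keeps the level set function close to a signed distance function without re-initialization, can be sketched roughly as follows. This is a minimal NumPy illustration and not the thesis's implementation: the function names, parameter values, rectangular initialization, and the synthetic page with two solid "text line" bands are all assumptions made for the example, and the iteration count is not tuned to reproduce the 10-iteration figure reported in the abstract.

```python
import numpy as np


def heaviside(phi, eps=1.0):
    """Smoothed Heaviside step, as commonly used with the C-V model."""
    return 0.5 * (1.0 + (2.0 / np.pi) * np.arctan(phi / eps))


def dirac(phi, eps=1.0):
    """Derivative of the smoothed Heaviside (a smoothed Dirac delta)."""
    return (eps / np.pi) / (eps ** 2 + phi ** 2)


def curvature(phi, tiny=1e-8):
    """div(grad(phi) / |grad(phi)|) computed with central differences."""
    gy, gx = np.gradient(phi)
    norm = np.sqrt(gx ** 2 + gy ** 2) + tiny
    return np.gradient(gy / norm, axis=0) + np.gradient(gx / norm, axis=1)


def laplacian(phi):
    """Five-point Laplacian with replicated borders."""
    p = np.pad(phi, 1, mode="edge")
    return p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:] - 4.0 * phi


def chan_vese(image, n_iter=100, dt=5.0, penalty_weight=0.04,
              length_weight=0.05, lam1=1.0, lam2=1.0, c0=2.0, margin=5):
    """Piecewise-constant C-V evolution with a distance-regularizing penalty.

    All default values are illustrative assumptions. Returns a boolean mask
    that is True where the final level set function is positive.
    """
    u0 = np.asarray(image, dtype=float)
    # Binary step initialization: +c0 inside a rectangle, -c0 outside.
    # The penalty term below is what lets us skip a signed-distance init.
    phi = -c0 * np.ones_like(u0)
    phi[margin:-margin, margin:-margin] = c0
    for _ in range(n_iter):
        h = heaviside(phi)
        c1 = (u0 * h).sum() / h.sum()                  # mean where phi > 0
        c2 = (u0 * (1.0 - h)).sum() / (1.0 - h).sum()  # mean where phi < 0
        kappa = curvature(phi)
        data = -lam1 * (u0 - c1) ** 2 + lam2 * (u0 - c2) ** 2
        # Penalty term (laplacian - curvature) drives |grad(phi)| toward 1.
        phi = phi + dt * (penalty_weight * (laplacian(phi) - kappa)
                          + dirac(phi) * (length_weight * kappa + data))
    return phi > 0


def synthetic_page(shape=(40, 60)):
    """White page (1.0) with two dark bands (0.0) standing in for text lines."""
    truth = np.zeros(shape, dtype=bool)
    for top in (8, 24):
        truth[top:top + 7, 5:55] = True
    page = np.ones(shape)
    page[truth] = 0.0
    return page, truth


page, truth = synthetic_page()
mask = chan_vese(page)
print("pixel agreement with ground truth:", (mask == truth).mean())
```

     The product `penalty_weight * dt` is kept below 0.25 so that the explicit diffusion step stays stable; real handwritten pages would first need smoothing or downsampling so that each text line behaves like a roughly homogeneous region.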
With the development of computer technology, the storage capacity of computers has grown many times over, and more and more documents are captured by digital input devices and stored in computers as bitmaps. There is growing interest in technology that converts these document images into a retrievable and editable form, and document image analysis has emerged to address these tasks.
     Document segmentation is a basic and central technology in document processing and computer vision, and it has great academic and practical significance. The quality of the segmentation strongly influences the subsequent recognition and interpretation. Although many document segmentation methods have been developed and applied successfully to machine-printed documents, the processing of handwritten documents remains an open research field. No universal method for processing all kinds of images has yet been proposed. Most current document segmentation methods assume that the document images are reasonably straight, and some depend on a specific language. Most proposals are sensitive to topological changes in handwritten documents, whereas geometric methods based on the active contour model are not.
     The active contour model is a top-down process that uses prior knowledge and provides a theoretically uniform framework for a series of problems such as contour extraction, stereo matching, and object tracking. It has therefore been applied successfully to image segmentation, medical image processing, human-computer interaction, and many other research and practical fields. Level set methods based on the Mumford-Shah model are an important class of deformable-model methods. Because they depend on global information about homogeneous regions in the image, they segment images more quickly and precisely.
     The thesis introduces the background of the active contour model and then explains the foundations of the level set method and of image segmentation based on the Mumford-Shah model. Given the characteristics of document images, the author proposes that the piecewise-constant approximation of the Mumford-Shah model is well suited to paragraph segmentation and text line segmentation. Traditional level set methods must perform costly re-initialization so that the level set function stays close to a signed distance function and the image can be segmented effectively; staying close to a signed distance function also requires a small time step, which slows the evolution. The thesis therefore adopts Chunming Li's level set method without re-initialization. The experiments indicate that the typical edges of the sample images are picked up in no more than 10 iterations with the proposed method. Segmentation tests on various kinds of handwritten documents show that the proposed method is fast and general.

