Real-Time Document Classification Based on Deep CNN and Extreme Learning Machine
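  • English title: REAL-TIME DOCUMENT CLASSIFICATION BASED ON DEEP CNN AND EXTREME LEARNING MACHINE
  • Authors: 闫河; 王鹏; 董莺艳; 罗成; 李焕
  • Authors (English): Yan He; Wang Peng; Dong Yingyan; Luo Cheng; Li Huan; College of Computer Science, Chongqing University of Technology; Artificial Intelligence College, Chongqing University of Technology
  • Keywords: Document image classification; CNN; Transfer learning
  • Keywords (English): Document image classification; CNN; Transfer learning
  • Journal code: JYRJ
  • Journal (English): Computer Applications and Software
  • Institutions: College of Computer Science and Engineering, Chongqing University of Technology; Liangjiang Artificial Intelligence College, Chongqing University of Technology
  • Publication date: 2019-03-12
  • Publisher: Computer Applications and Software (计算机应用与软件)
  • Year: 2019
  • Issue: v.36
  • Funding: National Natural Science Foundation of China, General Program (61173184); Chongqing Natural Science Foundation Project (cstc2018jcyjAX0694)
  • Language: Chinese
  • Pages: JYRJ201903033
  • Page count: 6
  • CN: 03
  • ISSN: 31-1260/TP
  • Classification number: 180-185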
Abstract
This paper proposes a method for real-time training and testing of document image classifiers. In practical applications, both the accuracy and the efficiency of training are critical for document image recognition. Existing deep learning methods do not meet this requirement, because training and fine-tuning deep network architectures takes a long time. To address this, the paper proposes a two-stage approach based on computer vision: in the first stage, a deep network is trained to serve as a feature extractor; in the second stage, an extreme learning machine (ELM) performs the classification. The method outperforms current state-of-the-art approaches based on deep learning, reaching a final accuracy of 83.45% on the Tobacco-3482 dataset. Compared with previous methods based on convolutional neural networks (CNN), this is a relative error reduction of 26%. Training the ELM takes only 1.156 s, and predicting all 2,482 images takes 3.083 s in total. The method is therefore suitable for large-scale real-time applications.
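The second stage described in the abstract can be illustrated with a minimal ELM sketch. This is not the authors' implementation: the class name ELMClassifier, the hidden-layer size, the sigmoid activation, the ridge term, and the synthetic features standing in for CNN outputs are all assumptions made for illustration. It shows why ELM training is fast: the hidden weights are random and fixed, so only the output weights are fitted, in closed form.

```python
import numpy as np


class ELMClassifier:
    """Hypothetical single-hidden-layer ELM: random fixed hidden layer,
    output weights solved in closed form (no iterative training)."""

    def __init__(self, n_hidden=64, reg=1e-3, seed=0):
        self.n_hidden = n_hidden      # assumed hidden-layer size
        self.reg = reg                # assumed ridge regularization strength
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Sigmoid activation of the fixed random projection.
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit(self, X, y):
        n_classes = int(y.max()) + 1
        # Hidden weights and biases are drawn once and never updated.
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        H = self._hidden(X)
        T = np.eye(n_classes)[y]      # one-hot targets
        # Regularized least squares: beta = (H^T H + reg * I)^-1 H^T T
        A = H.T @ H + self.reg * np.eye(self.n_hidden)
        self.beta = np.linalg.solve(A, H.T @ T)
        return self

    def predict(self, X):
        return np.argmax(self._hidden(X) @ self.beta, axis=1)


# Synthetic stand-in for CNN features: three well-separated clusters.
data_rng = np.random.default_rng(1)
centers = data_rng.normal(scale=2.0, size=(3, 16))
y_train = data_rng.integers(0, 3, size=300)
y_test = data_rng.integers(0, 3, size=150)
X_train = centers[y_train] + data_rng.normal(scale=0.3, size=(300, 16))
X_test = centers[y_test] + data_rng.normal(scale=0.3, size=(150, 16))

clf = ELMClassifier().fit(X_train, y_train)
accuracy = float(np.mean(clf.predict(X_test) == y_test))
print(f"held-out accuracy on synthetic features: {accuracy:.3f}")
```

In a real pipeline, X_train and X_test would be activations taken from a late layer of the trained deep network. Which layer to use, and how many hidden units the ELM needs, are not stated in this record.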
