Abstract
Character image recognition has significant research and application value. To handle complex handwritten character image recognition tasks, we design a convolutional neural network model for handwritten Chinese character recognition based on the VGG architecture. The model is trained on a subset of the HWDB1.1 dataset made up of the 100 most commonly used Chinese characters, with Batch Normalization and other methods applied to optimize and accelerate training. Experimental results show that the model converges quickly and, within a limited number of training iterations, reaches a recognition accuracy of 96.77% on the test set.
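The abstract names Batch Normalization as the main training aid but does not show the computation. The following is a minimal sketch of the training-mode forward pass as generally described by Ioffe and Szegedy, written in plain Python. It is not the authors' implementation: the function name `batch_norm`, the scalar `gamma` and `beta`, the `eps` value, and the sample activations are all assumptions chosen for illustration. In a real network the scale and shift are learned per feature.

```python
import math

def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize each feature over the mini-batch (training-mode forward pass).

    batch: list of samples, each a list of floats of equal length.
    gamma, beta: scale and shift; fixed scalars here (an assumption),
                 whereas a real network learns one pair per feature.
    """
    n = len(batch)
    dims = len(batch[0])
    # Per-feature mean and biased variance across the mini-batch.
    means = [sum(x[j] for x in batch) / n for j in range(dims)]
    variances = [
        sum((x[j] - means[j]) ** 2 for x in batch) / n for j in range(dims)
    ]
    # Normalize each feature, then apply scale and shift.
    return [
        [
            gamma * (x[j] - means[j]) / math.sqrt(variances[j] + eps) + beta
            for j in range(dims)
        ]
        for x in batch
    ]

# Hypothetical activations: two features on very different scales.
activations = [[1.0, 200.0], [2.0, 400.0], [3.0, 600.0]]
normalized = batch_norm(activations)
```

After normalization both features have approximately zero mean and unit variance across the mini-batch, regardless of their original scale. Keeping layer inputs on a common scale in this way is the property usually credited with speeding up convergence.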