Public Opinion Classification of Heterogeneous Data Based on CNN and LSTM

English title: Public Opinion Classification of Heterogeneous Data Based on CNN and LSTM
Authors: 黑富郁; 王景中; 赵林浩
Authors (English): HEI Fu-Yu; WANG Jing-Zhong; ZHAO Lin-Hao (School of Computer, North China University of Technology)
Keywords: heterogeneous data; neural network; CNN; LSTM; feature extraction; feature fusion; public opinion classification
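Journal code: XTYY
Journal (English): Computer Systems & Applications
Institution: School of Computer, North China University of Technology
Publication date: 2019-06-15
Publisher: 计算机系统应用 (Computer Systems & Applications)
Year: 2019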
Volume: 28
Language: Chinese
Record ID: XTYY201906021
Page count: 7
Issue: 06
CN: 11-2854/TP
Pages: 143-149

Abstract

With the growth of the Internet, online public opinion data is increasing explosively and its data types are becoming more complex. Different kinds of network data combine with one another into a complex structure that carries the information jointly, so a single type of data (image, text, speech, etc.) is increasingly unable to express that information completely. For network information that contains several types of data, this paper proposes a new public opinion classification model: neural networks learn the features of each data type separately, and the features are fused before classification. In the experiments, LSTM and CNN networks extract text and image features respectively, and the two sets of features are fused and then classified. The results show that classifying after fusing features from multiple data types classifies online public opinion data better and improves classification accuracy.
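
The fuse-then-classify step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the features are fused or which classifier is used, so everything here is an assumption: fusion by simple concatenation, a single linear layer with softmax as the classifier, and random vectors standing in for the LSTM and CNN outputs. The names `fuse`, `classify`, and all sizes are hypothetical and are not taken from the paper.

```python
import math
import random


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(text_features, image_features):
    """Fuse two modality feature vectors by concatenation (assumed strategy)."""
    return list(text_features) + list(image_features)


def classify(fused, weights, biases):
    """Apply one linear layer followed by softmax to the fused vector."""
    scores = [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weights, biases)
    ]
    return softmax(scores)


# Hypothetical stand-ins for the encoder outputs: in the paper's setup these
# would come from an LSTM (text) and a CNN (image); here they are random.
rng = random.Random(0)
text_features = [rng.uniform(-1, 1) for _ in range(4)]   # assumed LSTM output size
image_features = [rng.uniform(-1, 1) for _ in range(6)]  # assumed CNN output size

num_classes = 3  # assumed number of opinion classes
fused = fuse(text_features, image_features)

# Untrained, randomly initialised classifier weights (illustration only).
weights = [[rng.uniform(-0.5, 0.5) for _ in fused] for _ in range(num_classes)]
biases = [0.0] * num_classes

probs = classify(fused, weights, biases)
predicted = max(range(num_classes), key=lambda k: probs[k])
```

In a real system the two encoders and the classifier would be trained jointly or in stages; the sketch only shows how the two feature vectors are combined into a single input for one classifier.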

