Short Text Feature Extraction and Classification Based on Serial–Parallel Convolutional Gated Recurrent Neural Network
  • Chinese Title: 基于串并行卷积门阀循环神经网络的短文本特征提取与分类
  • Authors: TANG Xianlun; LIN Wenxing; DU Yiming; WANG Ting
  • Affiliations: School of Automation, Chongqing University of Posts and Telecommunications; School of Computer Science and Technology, Chongqing University of Posts and Telecommunications
  • Keywords: feature representation; short text classification; recurrent neural network; gated recurrent unit
  • Journal: Advanced Engineering Sciences (工程科学与技术); CNKI journal code: SCLH
  • Publication date: 2019-07-08
  • Year: 2019; Volume: 51; Issue: 04
  • Pages: 129-136 (8 pages)
  • CN: 51-1773/TB
  • Funding: National Natural Science Foundation of China (61673079); Chongqing Basic Science and Frontier Technology Research Project (cstc2016jcyjA1919)
  • Language: Chinese
  • CNKI citation code: SCLH201904015
Abstract
Short texts contain few features and provide limited information, and conventional convolutional neural networks (CNN) and recurrent neural networks (RNN) represent their features insufficiently. To address these problems, a text classification model based on a serial–parallel convolutional gated recurrent neural network was proposed for sentence feature representation and short text classification. The pooling operation was removed from the convolutional layers so that the sequential structure and position information of the text data are retained; a serial–parallel convolution structure extracts multi-granularity feature combinations of words together with local context information, which serve as the input of the RNN. Gated recurrent units (GRU) form the RNN, which uses the sequential information of the text to generate a sentence vector representation; this representation is fed into a classifier with an additive margin, which guides the network to learn discriminative features and performs the short text classification. Experiments were conducted on the TREC, MR, and Subj short text classification datasets: the influence of hyper-parameter selection and of the convolutional layer structure on classification accuracy was analyzed, and the proposed model was compared with common text classification models. The results show that removing the pooling operation and using smaller convolution kernels for serial–parallel convolution improves classification accuracy under the multi-feature representation. Compared with a GRU model of the same parameter scale, the proposed model improves classification accuracy by 2.00%, 1.23%, and 1.08% on the three datasets, respectively; compared with a CNN model of the same parameter scale, it improves accuracy by 1.60%, 1.57%, and 0.80%. It also remains the best performer against common models such as Text–CNN, G–Dropout, and F–Dropout. The experiments therefore indicate that the proposed model improves classification accuracy and is applicable to practical short text classification scenarios.
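The authors' implementation is not part of this record. As a rough, hypothetical sketch of the pipeline the abstract describes (parallel small-kernel convolutions without pooling, a serial convolution stage, a GRU, and an additive-margin softmax classifier), the following PyTorch code illustrates the data flow only; all layer sizes, kernel widths, the class count (6, as in TREC's coarse labels), and the scale/margin values s and m are illustrative assumptions, not the paper's settings.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SerialParallelConvGRU(nn.Module):
    """Sketch: parallel convolutions (no pooling, length-preserving) feed a
    serial convolution; per-timestep features go through a GRU; the final
    hidden state is scored with cosine logits for an additive-margin softmax."""
    def __init__(self, vocab_size, embed_dim=300, conv_channels=100,
                 hidden_dim=128, num_classes=6):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # Parallel branch: small kernels; padding keeps the sequence length,
        # so word order and position information are preserved (no pooling).
        self.conv1 = nn.Conv1d(embed_dim, conv_channels, kernel_size=1)
        self.conv2 = nn.Conv1d(embed_dim, conv_channels, kernel_size=3, padding=1)
        # Serial stage: mixes the concatenated branch outputs into local
        # context features that the RNN consumes timestep by timestep.
        self.conv_serial = nn.Conv1d(2 * conv_channels, conv_channels,
                                     kernel_size=3, padding=1)
        self.gru = nn.GRU(conv_channels, hidden_dim, batch_first=True)
        # Class weight vectors for the additive-margin classifier.
        self.weight = nn.Parameter(torch.randn(num_classes, hidden_dim))

    def forward(self, tokens):                      # tokens: (B, T) int ids
        x = self.embed(tokens).transpose(1, 2)      # (B, E, T)
        p = torch.cat([F.relu(self.conv1(x)),
                       F.relu(self.conv2(x))], dim=1)
        s = F.relu(self.conv_serial(p))             # (B, C, T), length kept
        _, h = self.gru(s.transpose(1, 2))          # h: (1, B, H)
        # Cosine logits, in the style of AM-softmax (reference [17]).
        feat = F.normalize(h.squeeze(0), dim=1)
        w = F.normalize(self.weight, dim=1)
        return feat @ w.t()                         # (B, num_classes)

def am_softmax_loss(cos_logits, labels, s=30.0, m=0.35):
    """Additive-margin softmax: subtract margin m from the target-class
    cosine, scale by s, then apply ordinary cross-entropy."""
    onehot = F.one_hot(labels, cos_logits.size(1)).float()
    return F.cross_entropy(s * (cos_logits - m * onehot), labels)

# Shape check on random data (hypothetical sizes):
model = SerialParallelConvGRU(vocab_size=10000)
logits = model(torch.randint(0, 10000, (4, 20)))
loss = am_softmax_loss(logits, torch.tensor([0, 1, 2, 3]))

Because every convolution here is length-preserving and pooling is omitted, each GRU timestep still corresponds to one word position, which is the property the abstract credits for retaining the text's sequential structure.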
References
[1] Diab D M, El Hindi K M. Using differential evolution for fine tuning naïve Bayesian classifiers and its application for text classification[J]. Applied Soft Computing, 2017, 54: 183-199.
[2] Zhang Wen, Tang Xijin, Yoshida T. TESC: An approach to TExt classification using semi-supervised clustering[J]. Knowledge-Based Systems, 2015, 75: 152-160.
[3] Vieira A S, Borrajo L, Iglesias E L. Improving the text classification using clustering and a novel HMM to reduce the dimensionality[J]. Computer Methods and Programs in Biomedicine, 2016, 136: 119-130.
[4] Wang Yisen, Xia Shutao, Wu Jia. A less-greedy two-term Tsallis entropy information metric approach for decision tree classification[J]. Knowledge-Based Systems, 2017, 120: 34-42.
[5] de Boom C, van Canneyt S, Demeester T, et al. Representation learning for very short texts using weighted word embedding aggregation[J]. Pattern Recognition Letters, 2016, 80: 150-156.
[6] Kim Y. Convolutional neural networks for sentence classification[C]//Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing. Doha: ACL, 2014: 1746-1751.
[7] Sun Songtao, He Yanxiang. Multi-label emotion classification for microblog based on CNN feature space[J]. Advanced Engineering Sciences, 2017, 49(3): 162-169.
[8] Er M J, Zhang Yong, Wang Ning, et al. Attention pooling-based convolutional neural network for sentence modelling[J]. Information Sciences, 2016, 373: 388-403.
[9] Lu Chi, Huang Heyan, Jian Ping, et al. A P-LSTM neural network for sentiment classification[M]//Advances in Knowledge Discovery and Data Mining. Cham: Springer, 2017: 524-533.
[10] Zhou Yujin, Xu Bo, Xu Jiaming, et al. Compositional recurrent neural networks for Chinese short text classification[C]//Proceedings of the 2016 IEEE/WIC/ACM International Conference on Web Intelligence (WI). Omaha: IEEE, 2016: 137-144.
[11] Yang Zichao, Yang Diyi, Dyer C, et al. Hierarchical attention networks for document classification[C]//Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics. San Diego: NAACL, 2016: 1480-1489.
[12] Wang Xingyou, Jiang Weijie, Luo Zhiyong. Combination of convolutional and recurrent neural network for sentiment analysis of short texts[C]//Proceedings of the 26th International Conference on Computational Linguistics. Osaka: ACM, 2016: 2428-2437.
[13] Xie Jinbao, Hou Yongjin, Kang Shouqiang, et al. Multi-feature fusion based on semantic understanding attention neural network for Chinese text categorization[J]. Journal of Electronics & Information Technology, 2018, 40(5): 1258-1265.
[14] Li Linchuan, Wu Zhiyong, Xu Mingxing, et al. Combining CNN and BLSTM to extract textual and acoustic features for recognizing stances in mandarin ideological debate competition[C]//Proceedings of the 17th Annual Conference of the International Speech Communication Association. San Francisco: ISCA, 2016: 1392-1396.
[15] Cho K, van Merriënboer B, Gulcehre C, et al. Learning phrase representations using RNN encoder-decoder for statistical machine translation[C]//Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing. Doha: ACL, 2014: 1724-1734.
[16] Liu Weiyang, Wen Yandong, Yu Zhiding, et al. Large-margin softmax loss for convolutional neural networks[C]//Proceedings of the 33rd International Conference on Machine Learning. New York: ACM, 2016: 507-516.
[17] Wang Feng, Cheng Jian, Liu Weiyang, et al. Additive margin softmax for face verification[J]. IEEE Signal Processing Letters, 2018, 25(7): 926-930.
[18] Zhang Dongwen, Xu Hua, Su Zengcai, et al. Chinese comments sentiment classification based on word2vec and SVMperf[J]. Expert Systems with Applications, 2015, 42(4): 1857-1863.
[19] Pennington J, Socher R, Manning C D. GloVe: Global vectors for word representation[C]//Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing. Doha: ACL, 2014: 1532-1543.
[20] Wang S I, Manning C D. Fast dropout training[C]//Proceedings of the 30th International Conference on Machine Learning. Atlanta: ACM, 2013: 118-126.
[21] Kalchbrenner N, Grefenstette E, Blunsom P. A convolutional neural network for modelling sentences[C]//Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics. Baltimore: ACL, 2014: 655-665.
