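Chinese Rhetorical Question Identification with a Convolutional Neural Network Incorporating Rhetorical-Question Features (融合反问特征的卷积神经网络的中文反问句识别)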
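  • English title: Feature Enhanced CNN for Rhetorical Questions Identification
  • Authors: 文治; 李旸; 王素格; 廖健; 陈鑫
  • Authors (English): WEN Zhi; LI Yang; WANG Suge; LIAO Jian; CHEN Xin; School of Computer & Information Technology, Shanxi University; MOE Key Laboratory of Computational Intelligence and Chinese Information Processing, Shanxi University
  • Keywords (Chinese): 反问句; 卷积神经网络; 特征; 情感分析
  • Keywords (English): rhetorical questions; convolutional neural network; feature; sentiment analysis
  • Journal title (Chinese): MESS
  • Journal title (English): Journal of Chinese Information Processing
  • Affiliations: School of Computer & Information Technology, Shanxi University; MOE Key Laboratory of Computational Intelligence and Chinese Information Processing, Shanxi University
  • Publication date: 2019-01-15
  • Publisher: 中文信息学报 (Journal of Chinese Information Processing)
  • Year: 2019
  • Issue: v.33
  • Funding: National Natural Science Foundation of China (61632011, 61573231, 61672331, 61603229)
  • Language: Chinese
  • Pages: MESS201901011
  • Page count: 9
  • CN: 01
  • ISSN: 11-2325/N
  • Classification number: 73-81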
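Abstract
A rhetorical question is a form of expression that carries strong emotional coloring, and identifying it automatically improves the overall effectiveness of implicit sentiment analysis. To address the identification of Chinese rhetorical questions, this paper analyzes their sentence-pattern characteristics, builds their sentence structure into the construction of a convolutional neural network, and proposes a rhetorical question identification method based on a convolutional neural network fused with sentence structure. First, feature words and sequence patterns of rhetorical questions with confidence above 70% are used to pre-filter a large unlabeled microblog corpus, yielding a large number of pseudo rhetorical questions. Next, multiple convolution kernels are applied separately to the sentence's word vectors and to the rhetorical-question features, producing sentence-level semantic features and rhetorical-word features, which jointly generate the sentence representation. Finally, a softmax classifier assigns each sentence to a class. Experimental results show that the method identifies rhetorical questions in microblogs with precision, recall, and F1 of 89.5%, 84.2%, and 86.7%, respectively.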
The rhetorical question is a form of expression that carries strong emotion. To identify rhetorical questions automatically, this paper proposes a CNN method that incorporates the sentence structure of rhetorical questions. First, candidate rhetorical questions are selected from microblogs according to feature words and sequence patterns (confidence > 70%). Then, multiple convolution kernels extract features from the word vectors and from the rhetorical-question features to generate sentence representations. Finally, a softmax classifier is used to classify the sentences. The experimental results show that the proposed method achieves 89.5%, 84.2%, and 86.7% in terms of precision, recall, and F-measure, respectively.
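The pipeline outlined in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not define confidence or say how the two feature sets are combined. The sketch assumes that a feature word's confidence is the share of labeled sentences containing it that are rhetorical questions, and that the two pooled feature vectors are simply concatenated before the softmax layer. All function names, dimensions, and toy data are hypothetical.

```python
import numpy as np

def feature_confidence(labeled, feature):
    """Assumed definition: fraction of sentences containing `feature` that are rhetorical."""
    hits = [is_rq for tokens, is_rq in labeled if feature in tokens]
    return sum(hits) / len(hits) if hits else 0.0

def select_features(labeled, candidates, threshold=0.7):
    """Keep candidate feature words whose confidence exceeds the threshold."""
    return [f for f in candidates if feature_confidence(labeled, f) > threshold]

def conv_max_pool(x, kernels):
    """For each kernel: valid convolution over time, ReLU, then max over time.
    x has shape (seq_len, dim); each kernel has shape (width, dim)."""
    pooled = []
    for k in kernels:
        width = k.shape[0]
        scores = [np.sum(x[i:i + width] * k) for i in range(x.shape[0] - width + 1)]
        pooled.append(max(0.0, max(scores)))  # max(ReLU(s)) equals max(0, max(s))
    return np.array(pooled)

def softmax(z):
    e = np.exp(z - np.max(z))  # subtract the max for numerical stability
    return e / e.sum()

def classify(word_vecs, feat_vecs, word_kernels, feat_kernels, W, b):
    """Concatenate both pooled feature vectors (an assumption) and apply softmax."""
    sent = np.concatenate([conv_max_pool(word_vecs, word_kernels),
                           conv_max_pool(feat_vecs, feat_kernels)])
    return softmax(W @ sent + b)

# Toy labeled sentences (tokens, is_rhetorical); purely illustrative.
labeled = [(["难道", "不", "对", "吗"], True),
           (["难道", "你", "忘", "了"], True),
           (["你", "来", "吗"], False),
           (["他", "走", "了", "吗"], False)]
selected = select_features(labeled, ["难道", "吗"])

# Random stand-ins for learned embeddings and parameters.
rng = np.random.default_rng(0)
word_vecs = rng.normal(size=(6, 8))   # 6 tokens, 8-dim word vectors
feat_vecs = rng.normal(size=(6, 4))   # 6 tokens, 4-dim feature embeddings
word_kernels = [rng.normal(size=(w, 8)) for w in (2, 3)]
feat_kernels = [rng.normal(size=(w, 4)) for w in (2, 3)]
W, b = rng.normal(size=(2, 4)), np.zeros(2)
probs = classify(word_vecs, feat_vecs, word_kernels, feat_kernels, W, b)
```

In this toy data, "难道" occurs only in rhetorical sentences (confidence 1.0) and passes the 0.7 threshold, while "吗" occurs in one rhetorical sentence out of three (about 0.33) and is filtered out. `probs` is a two-class probability vector; with random parameters its values carry no meaning beyond showing the shape of the computation.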