Abstract
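Argumentation mining can be divided into three subtasks: argument component boundary detection, argument component type identification, and argument relation extraction. Most existing work models the subtasks separately and ignores the correlation information among them, which results in low performance. Other work models the three subtasks jointly with a pipeline; because a pipeline still treats each subtask independently and trains a separate model for each, it suffers from error propagation and produces redundant information during training. This paper therefore proposes an argumentation mining method based on multi-task iterative learning. The method learns the three subtasks jointly and in parallel. First, a deep convolutional neural network (CNN) and a highway network produce shallow shared-parameter representations of the text at the character and word levels. These representations are then fed into a bidirectional long short-term memory recurrent neural network (Bi-LSTM), which is trained on the three subtasks simultaneously using the correlation information among them; this both avoids error propagation and overcomes the generation of redundant information. Finally, the Bi-LSTM outputs of the three tasks are concatenated and used as the input to the next iteration to improve model performance. Experiments use the student essay dataset released by the UKP Lab in Germany. Compared with the best current baseline, the method improves accuracy by 2.74% and the "F1(100%)" and "F1(50%)" metrics by 1.05% and 1.19%, respectively, which validates its effectiveness.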
Argumentation mining has recently become a hot topic in data mining and natural language processing. Its main task is the automatic identification of argumentative structures in persuasive essays, which helps people better understand massive amounts of text. A persuasive essay usually consists of a series of argument components. Argument components are generally classified as claims or premises, and the relationships between them are commonly classified as support or attack. Argumentation mining typically contains three consecutive subtasks: (1) argument component boundary detection (ACBD task), which separates argument components from non-argumentative text units and identifies their boundaries; (2) argument component identification (ACI task), which classifies argument components into types such as claims or premises; and (3) argument component relation identification (RI task), which identifies the relationship type between argument components, such as support or attack. Researchers have recently proposed a series of argumentation mining models and achieved considerable improvements. However, most existing approaches model each subtask separately and ignore the correlation information among the three subtasks, resulting in low performance. Some approaches instead use pipeline methods to model the three subtasks jointly. Pipeline methods still treat each subtask independently and train a separate model for each, which can lead to error propagation and to redundant information in the training process. More specifically, errors in the argument component boundary detection module degrade the subsequent argument component classification, and errors in argument component classification in turn degrade argument component relation identification.
To solve these problems, we propose a multi-task iterative learning method for argumentation mining. It assumes that the tags predicted for one task can be useful features for the other tasks, and it learns the three subtasks jointly and in parallel. First, we obtain shallow shared parameters at the character and word levels of the text by using a deep Convolutional Neural Network (CNN) and a highway network. Then, a bidirectional LSTM (Bi-LSTM) neural network is trained to solve the three subtasks at the same time, which avoids error propagation. During training, the correlation information among the subtasks is used to overcome the generation of redundant information. Finally, the outputs of the three subtasks are concatenated and used as the input for the next iteration to improve performance. Multi-Task Learning (MTL) is an important machine learning mechanism that improves generalization by learning a task together with other related tasks. Our MTL-based model can iteratively and explicitly use the predicted tag distributions of each task. Experimental results on the student essay dataset published by the UKP Lab in Germany show that, compared to the state-of-the-art models, our model improves accuracy by 2.74%, "F1(100%)" by 1.05%, and "F1(50%)" by 1.19%, which verifies the validity of our model. The results also show that multi-task learning performs better than single-task learning.
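The iterative scheme described above can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: the Bi-LSTM task networks are replaced here by hypothetical linear-plus-softmax heads, the weights are random rather than trained, and the label counts in `TASK_SIZES` are assumptions chosen only for illustration. What it shows is the data flow: a highway layer produces a shared representation, and at each iteration the tag distributions predicted by all three tasks are concatenated with that representation to form the next iteration's input.

```python
import math
import random

# Hypothetical label counts per task (assumptions, not taken from the paper).
TASK_SIZES = {"ACBD": 3, "ACI": 2, "RI": 2}


def matvec(w, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def highway(x, w_h, w_t, b_t):
    """Highway layer: y = t * relu(W_h x) + (1 - t) * x, t = sigmoid(W_t x + b_t)."""
    h = [max(0.0, v) for v in matvec(w_h, x)]
    t = [1.0 / (1.0 + math.exp(-(v + b_t))) for v in matvec(w_t, x)]
    return [ti * hi + (1.0 - ti) * xi for ti, hi, xi in zip(t, h, x)]


def run_iterations(shared, heads, n_iter):
    """Feed each task's predicted tag distribution back as input features."""
    # Start from uniform tag distributions for every task.
    dists = {task: [1.0 / k] * k for task, k in TASK_SIZES.items()}
    for _ in range(n_iter):
        # Concatenate the shared representation with all tasks' distributions.
        features = shared + [p for task in TASK_SIZES for p in dists[task]]
        # Every task head sees the same features; all tasks update together.
        dists = {task: softmax(matvec(heads[task], features)) for task in TASK_SIZES}
    return dists


def random_matrix(rows, cols, rng):
    """Random weights standing in for trained parameters."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
dim = 4  # hypothetical size of one token's shared representation
token = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
shared = highway(token, random_matrix(dim, dim, rng), random_matrix(dim, dim, rng), -1.0)
feat_dim = dim + sum(TASK_SIZES.values())
heads = {task: random_matrix(k, feat_dim, rng) for task, k in TASK_SIZES.items()}
result = run_iterations(shared, heads, n_iter=3)
```

Because every head reads the concatenated distributions of all tasks, a boundary prediction can inform the type and relation predictions within the same model, rather than being passed down a pipeline as a hard decision.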
(1)https://essayforum.com/