Multi-task learning model for legal judgment predictions with charge keywords
Authors: LIU Zonglin; ZHANG Meishan; ZHEN Ranran; GONG Zuoquan; YU Nan; FU Guohong
Keywords: legal judgment prediction; multi-task learning; charge keywords
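Journal (Chinese abbreviation): QHXB
Journal (English): Journal of Tsinghua University (Science and Technology)
Affiliations: School of Computer Science and Technology, Heilongjiang University; School of Information, Guizhou University of Finance and Economics
Publication date: 2019-04-10 16:26
Publisher: Journal of Tsinghua University (Science and Technology)
Year: 2019
Issue: v.59
Funding: National Natural Science Foundation of China (61672211, 61602160, U1836222); Natural Science Foundation of Heilongjiang Province (F2016036)
Language: Chinese
Page: QHXB201907001
Page count: 8
CN: 07
ISSN: 11-2223/N
Classification code: 4-11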

Abstract

As an emerging smart-court technology, legal judgment prediction (LJP) based on case description texts is attracting growing attention from the natural language processing community. Charge prediction and law article recommendation are two important LJP sub-tasks that are closely related and influence each other. However, they are usually handled separately as independent tasks. Both sub-tasks also face the problem of easily confused charges. To address these issues, this paper proposes a multi-task learning model that jointly models charge prediction and law article recommendation. Statistical methods are used to extract indicative charge keywords from case descriptions that help distinguish easily confused charges, and these keywords are integrated into the multi-task learning model. Experiments on the CAIL2018 legal dataset show that the multi-task learning model with charge keyword information effectively resolves the confusing-charge problem and significantly improves performance on both charge prediction and law article recommendation.
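
The abstract states only that charge keywords are extracted with statistical techniques and does not name the statistic. As a rough illustration of what such a step could look like, the sketch below scores terms with a TF-IDF-style weight in which every charge is treated as one pooled "document", so terms shared by easily confused charges score zero and are dropped. The function name `charge_keywords`, the scoring formula, and the toy English tokens are all assumptions made for illustration; they are not the authors' method or data, and the multi-task model itself is not covered.

```python
import math
from collections import Counter, defaultdict


def charge_keywords(cases, top_k=2):
    """Return {charge: [keywords]} from (tokens, charge) pairs.

    Hypothetical helper: a TF-IDF-style score over charges, assumed here
    only as one possible "statistical technique".
    """
    # Pool term counts per charge, so each charge acts as one "document".
    term_counts = defaultdict(Counter)
    for tokens, charge in cases:
        term_counts[charge].update(tokens)

    # Count in how many distinct charges each term occurs.
    charge_freq = Counter()
    for counts in term_counts.values():
        charge_freq.update(counts.keys())

    n_charges = len(term_counts)
    keywords = {}
    for charge, counts in term_counts.items():
        total = sum(counts.values())
        scores = {
            term: (count / total) * math.log(n_charges / charge_freq[term])
            for term, count in counts.items()
        }
        # Keep only terms with a positive score (terms that occur under
        # every charge get log(1) = 0), ranked by score and then
        # alphabetically so the output is deterministic.
        ranked = sorted(
            (t for t in scores if scores[t] > 0),
            key=lambda t: (-scores[t], t),
        )
        keywords[charge] = ranked[:top_k]
    return keywords


# Toy, pre-tokenized case descriptions (invented for this example).
cases = [
    (["took", "wallet", "secretly", "victim"], "theft"),
    (["took", "phone", "secretly", "victim"], "theft"),
    (["took", "wallet", "violence", "victim"], "robbery"),
    (["took", "phone", "threat", "violence", "victim"], "robbery"),
]
result = charge_keywords(cases)
print(result)
```

In this toy input, words such as "took" and "victim" occur under both charges and are discarded, while the remaining terms are the ones that separate the two charges. Real case descriptions in Chinese would first need word segmentation, which is assumed to have been done before this step.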
