Abstract
Terrorist attacks occur frequently worldwide, so research on their early warning, prevention, and control is necessary. Using the Fragile States Index and the Global Terrorism Database (GTD) for 2006-2016, six machine learning models were applied to predict, by regression, the risk of terrorist attacks faced by countries around the world. The results show that the Random Forest, K-Nearest Neighbors, and Decision Tree models perform best, with coefficients of determination (R²) of 0.75, 0.74, and 0.67, respectively. The Random Forest predictions are generally consistent with the actual situation, and are especially accurate for the Middle East and Central Asia, where terrorist attacks are frequent. According to the feature importance ranking, Security Apparatus, Public Services, Human Rights and Rule of Law, and Group Grievance contribute most strongly to the predictions.
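The pipeline described above (random forest regression on country-level indicators, evaluated by R², followed by a feature importance ranking) can be sketched as follows. This is a minimal illustration, not the paper's actual code: the data here are synthetic stand-ins for the twelve Fragile States Index indicators, and the target is a hypothetical risk score, so only the workflow, not the reported R² values, is reproduced.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)
# Synthetic stand-in: 500 country-year samples x 12 FSI-style indicators
X = rng.random((500, 12))
# Hypothetical risk target driven mainly by the first three indicators
y = 3 * X[:, 0] + 2 * X[:, 1] + X[:, 2] + rng.normal(0, 0.1, 500)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Random forest regression, as in the paper's best-performing model
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Goodness of fit on held-out data (coefficient of determination R^2)
r2 = r2_score(y_test, model.predict(X_test))

# Rank indicators by impurity-based feature importance,
# analogous to the paper's importance ranking of FSI indicators
ranking = np.argsort(model.feature_importances_)[::-1]
```

On real data the same ranking step would surface which indicators (e.g. Security Apparatus, Public Services) drive the predictions; here the synthetic informative features simply come out on top.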