Abstract
As living standards rise, environmental pollution has become increasingly serious, and effective analysis of air quality data is important for improving the environment. However, air quality monitoring data can contain anomalous records caused by instrument failures or human error. The author therefore applies the isolation forest algorithm to detect anomalous records in air quality data, so that subsequent air quality analysis is more reliable, and uses historical air quality data from Zhengzhou as a case study. The results show that the isolation forest algorithm can accurately detect the anomalous data.
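To illustrate the idea behind the isolation forest algorithm, the following is a minimal pure-Python sketch, not the author's implementation. It builds random isolation trees, measures how quickly each record is isolated, and converts the average path length into an anomaly score between 0 and 1, where higher means more anomalous. The readings are synthetic values standing in for two pollutant concentrations (they are not the Zhengzhou data), and the function names, the tree count, the subsample size, and the two injected outliers are all assumptions chosen for illustration.

```python
import math
import random

EULER_GAMMA = 0.5772156649


def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def build_tree(rows, depth, max_depth, rng):
    """Split rows on a random feature at a random threshold until isolated."""
    if depth >= max_depth or len(rows) <= 1:
        return {"size": len(rows)}
    feature = rng.randrange(len(rows[0]))
    lo = min(r[feature] for r in rows)
    hi = max(r[feature] for r in rows)
    if lo == hi:  # simplification: stop instead of trying another feature
        return {"size": len(rows)}
    split = rng.uniform(lo, hi)
    left = [r for r in rows if r[feature] < split]
    right = [r for r in rows if r[feature] >= split]
    return {
        "feature": feature,
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(row, node):
    """Depth at which row reaches a leaf, plus a correction for unsplit leaf rows."""
    depth = 0
    while "size" not in node:
        side = "left" if row[node["feature"]] < node["split"] else "right"
        node = node[side]
        depth += 1
    return depth + c_factor(node["size"])


def anomaly_scores(rows, n_trees=100, sample_size=128, seed=0):
    """Return one score in (0, 1) per row; assumes at least two rows."""
    rng = random.Random(seed)
    sample_size = min(sample_size, len(rows))
    max_depth = math.ceil(math.log2(sample_size))
    trees = [
        build_tree(rng.sample(rows, sample_size), 0, max_depth, rng)
        for _ in range(n_trees)
    ]
    norm = c_factor(sample_size)
    scores = []
    for row in rows:
        mean_path = sum(path_length(row, t) for t in trees) / n_trees
        scores.append(2.0 ** (-mean_path / norm))
    return scores


# Synthetic (hypothetical) readings for two pollutant concentrations.
data_rng = random.Random(42)
readings = [
    (data_rng.gauss(60.0, 15.0), data_rng.gauss(100.0, 20.0)) for _ in range(200)
]
# Two injected outliers (indices 200 and 201), mimicking instrument faults.
readings += [(950.0, 20.0), (0.0, 1800.0)]

scores = anomaly_scores(readings)
top_two = sorted(range(len(readings)), key=lambda i: scores[i], reverse=True)[:2]
print(sorted(top_two))
```

Records that lie far from the bulk of the data are separated after only a few random splits, so their average path length is short and their score is high. In practice a maintained library implementation would normally be preferred over a hand-written sketch like this one.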