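Research on Information Analysis Methods Based on Data Mining Technology: Taking Container Shipping Price Forecasting as an Example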
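  • English title (as published): Research on Information Analysis Method Based on Data Mining Technology: Taking Container Shipping Price Forecast as an Example
  • Authors: Wang Zhanping (王战平); Feng Yangwen (冯扬文); Zhu Chenliang (朱宸良)
  • Authors (English form): WANG Zhan-ping; FENG Yang-wen; ZHU Chen-liang (School of Information Management, Central China Normal University)
  • Keywords: price forecasting; data mining; time series; model; algorithm
  • English keywords (as published): information forecasting; data mining; time series; model algorithm
  • Chinese journal title (abbreviated): QBKX
  • English journal title: Information Science
  • Institution: School of Information Management, Central China Normal University
  • Publication date: 2019-07-01
  • Publisher: Information Science (情报科学)
  • Year: 2019
  • Volume and issue: v.37; No.335
  • Language: Chinese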
  • Record ID: QBKX201907011
  • Page count: 7
  • Issue number: 07
  • CN: 22-1264/G2
  • Pages: 67-73
Abstract
【Purpose/significance】Conventional information analysis methods are challenged by massive data sets with many types and features. For massive data sets made up of multiple groups of time series, and for information analysis whose goal is prediction, this paper proposes a prediction model based on data mining technology. The model improves prediction accuracy in a big data environment, and the approach is intended to serve as a reference for information analysis and intelligence forecasting in other fields.【Method/process】Taking container shipping price forecasting as an example, the paper proposes a container shipping price prediction model. An adaptive grid search strategy is designed to determine the hyperparameter combination of the data mining algorithm efficiently and accurately. An evaluation method based on a time-series hold-out is proposed to reduce the generalization error of data mining results on multi-group time series data sets such as container freight rates. To handle the large volume of freight rate data, the GBDT algorithm is given a parallel computing design and an optimization strategy that computes the loss function iteratively over pre-sorted data, which improves its computational efficiency in a big data environment.【Result/conclusion】Simulation results for the model and algorithm show that, on a traditional time series problem, the prediction model based on data mining achieves better results than traditional time series methods.
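The abstract names a time-series hold-out evaluation for multi-group data but does not give its details. The following is only a minimal sketch of one common reading of that idea: within each group, the most recent observations are held out for testing, so no future value leaks into training. The function name `time_series_holdout`, the `test_fraction` parameter, and the `route_A`/`route_B` sample records are hypothetical illustrations, not taken from the paper.

```python
from collections import defaultdict


def time_series_holdout(records, test_fraction=0.2):
    """Split (group, time, value) records chronologically per group.

    Hypothetical helper: the latest observations of each group go to the
    test set, and everything earlier goes to the training set.
    """
    by_group = defaultdict(list)
    for group, t, value in records:
        by_group[group].append((t, value))

    train, test = [], []
    for group, series in by_group.items():
        series.sort(key=lambda item: item[0])  # chronological order
        # Hold out at least one point, but never the whole series;
        # a one-point series goes entirely to training.
        n_test = max(1, int(round(len(series) * test_fraction)))
        n_test = min(n_test, len(series) - 1)
        cut = len(series) - n_test
        train += [(group, t, v) for t, v in series[:cut]]
        test += [(group, t, v) for t, v in series[cut:]]
    return train, test


# Made-up freight rates for two hypothetical routes, five periods each.
records = [
    ("route_A", 1, 100.0), ("route_A", 2, 110.0), ("route_A", 3, 105.0),
    ("route_A", 4, 120.0), ("route_A", 5, 125.0),
    ("route_B", 1, 300.0), ("route_B", 2, 310.0), ("route_B", 3, 290.0),
    ("route_B", 4, 305.0), ("route_B", 5, 315.0),
]
train, test = time_series_holdout(records, test_fraction=0.4)
```

Splitting per group rather than shuffling all rows keeps every test observation later in time than the training observations of the same group, which is the property a hold-out for forecasting needs.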
