Single Model Screening for Combination Forecast Considering Model Correlation (考虑模型相关性的组合预测过程中单项模型筛选研究)
  • Title (English): Single Model Screening for Combination Forecast Considering Model Correlation
  • Authors: 吴登生 (WU Dengsheng); 李建平 (LI Jianping); 孙晓蕾 (SUN Xiaolei)
  • Authors (English): WU Dengsheng; LI Jianping; SUN Xiaolei; Institute of Policy and Management, Chinese Academy of Sciences
  • Keywords: combination forecast; model screening; model correlation; fuzzy measure; software cost estimation
  • Keywords (English): combination forecast; model screening; model correlation; fuzzy measure; software effort estimation
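  • Chinese journal title: STYS
  • English journal title: Journal of Systems Science and Mathematical Sciences
  • Institution: Institute of Policy and Management, Chinese Academy of Sciences
  • Publication date: 2017-02-15
  • Publisher: Journal of Systems Science and Mathematical Sciences (系统科学与数学)
  • Year: 2017
  • Issue: v.37
  • Funding: National Natural Science Foundation of China (71201156, 71571179, 71425002); Youth Innovation Promotion Association of the Chinese Academy of Sciences (2013112)
  • Language: Chinese
  • Pages: STYS201702012
  • Page count: 11
  • CN: 02
  • ISSN: 11-2019/O1
  • Classification number: 131-141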
Abstract
Screening the single models that enter a combination forecast is difficult because conventional screening criteria cannot capture the correlation between models. This paper uses fuzzy measures and fuzzy integrals to characterize that correlation and to aggregate the single-model forecasts, and on that basis proposes a single-model screening method for combination forecasting that accounts for model correlation. A 2-additive fuzzy measure characterizes the correlation between models, and the Choquet integral, taken with respect to that measure, aggregates the single-model forecasts into the combined forecast. Within this process, the Shapley value and the interaction index, both defined from the fuzzy measure, are used to screen the single models. To validate the method, the paper applies it to software cost estimation in software engineering, using five data-driven models as the single models: case-based reasoning (CBR), ordinary least squares regression (OLS), support vector regression (SVR), classification and regression trees (CART), and artificial neural networks (ANN). The widely used Desharnais dataset serves as the test data. The empirical results show that the screening method is effective: the combination model built after screening improves the accuracy of software cost estimation. The results also show that the most important model in the combination (the one with the largest Shapley value) is not the most accurate single model; that is, a model's importance has no necessary connection with its estimation accuracy. This indicates that traditional combination forecasting, which selects member models by their individual accuracy, has a certain deficiency.
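The aggregation step described in the abstract can be illustrated with a minimal sketch. It assumes the 2-additive fuzzy measure is already given in its Shapley/interaction form and uses the standard identity for that case, C(x) = Σ_i φ_i x_i − ½ Σ_{i<j} I_ij |x_i − x_j|. The function name `choquet_2additive`, its argument layout, and the numbers in the usage example are hypothetical and are not taken from the paper; how the paper identifies the measure from data, and its exact screening rule, are not stated in this record and are not reproduced here.

```python
from itertools import combinations


def choquet_2additive(forecasts, shapley, interaction):
    """Choquet integral w.r.t. a 2-additive fuzzy measure (illustrative sketch).

    forecasts   -- list of single-model forecasts x_i
    shapley     -- list of Shapley values phi_i (importance of model i), summing to 1
    interaction -- dict {(i, j): I_ij} with i < j; missing pairs are treated as 0
                   (I_ij > 0: models i and j are redundant; I_ij < 0: complementary)
    """
    n = len(forecasts)
    if len(shapley) != n:
        raise ValueError("need one Shapley value per forecast")
    if abs(sum(shapley) - 1.0) > 1e-9:
        raise ValueError("Shapley values must sum to 1")

    def pair(i, j):
        # Look up I_ij regardless of the order in which i and j are given.
        return interaction.get((min(i, j), max(i, j)), 0.0)

    # Monotonicity of a 2-additive measure: phi_i - 0.5 * sum_j |I_ij| >= 0.
    for i in range(n):
        slack = shapley[i] - 0.5 * sum(abs(pair(i, j)) for j in range(n) if j != i)
        if slack < -1e-12:
            raise ValueError(f"measure is not monotone at model {i}")

    # Weighted mean by importance, corrected by pairwise interaction.
    linear = sum(p * x for p, x in zip(shapley, forecasts))
    correction = 0.5 * sum(
        pair(i, j) * abs(forecasts[i] - forecasts[j])
        for i, j in combinations(range(n), 2)
    )
    return linear - correction


# Hypothetical usage: three single-model effort estimates.
combined = choquet_2additive(
    forecasts=[100.0, 120.0, 90.0],
    shapley=[0.5, 0.3, 0.2],
    interaction={(0, 1): 0.2, (0, 2): -0.1},
)
print(combined)
```

With all interaction indices set to zero, the function reduces to an ordinary weighted average with the Shapley values as weights, which is the sense in which the fuzzy-measure approach generalizes a linear combination forecast.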
