Abstract
Existing reputation measurement models suffer from overly coarse granularity and incomplete coverage of the relevant dimensions. To address these problems, this paper proposes an online service reputation measurement method based on semi-supervised learning. First, reputation measurement is modeled as a service classification problem: a training set of services is labeled manually and used to train decision tree classifiers. Then, following the Tri-training algorithm, the trained classifiers label the services in the unlabeled service set; the newly labeled services are added to the training set together with their labels, the classifiers are retrained, and the retrained classifiers are used to classify services. To counter overfitting and improve generalization, the model is further improved in two ways: pruning, and building a semi-supervised random forest by increasing the number of classifiers and sampling the decision attributes. The resulting classifiers are then used to classify services and thereby measure their reputation. Experiments verify the effectiveness and efficiency of the proposed method.
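The self-labeling loop described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation. It assumes depth-1 decision stumps in place of full decision trees, omits the error-rate conditions of the original Tri-training algorithm, and omits the pruning and random forest extensions. The feature names (average rating, on-time rate), the labels, and all function names are hypothetical.

```python
import random
from collections import Counter

def majority(labels):
    """Return the most frequent label."""
    return Counter(labels).most_common(1)[0][0]

def train_stump(X, y):
    """Fit a depth-1 decision tree (stump) by minimizing training errors."""
    best = None
    for f in range(len(X[0])):
        values = sorted({row[f] for row in X})
        # Candidate thresholds are midpoints between consecutive values.
        for t in ((a + b) / 2 for a, b in zip(values, values[1:])):
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            if not left or not right:
                continue
            l_lab, r_lab = majority(left), majority(right)
            errors = (sum(lab != l_lab for lab in left)
                      + sum(lab != r_lab for lab in right))
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    if best is None:  # no usable split: fall back to a constant predictor
        constant = majority(y)
        return lambda row: constant
    _, f, t, l_lab, r_lab = best
    return lambda row: l_lab if row[f] <= t else r_lab

def bootstrap(X, y, rng):
    """Sample with replacement, retrying until every class is present."""
    while True:
        idx = [rng.randrange(len(X)) for _ in X]
        if len({y[i] for i in idx}) == len(set(y)):
            return [X[i] for i in idx], [y[i] for i in idx]

def tri_train(X_lab, y_lab, X_unlab, rounds=3, seed=0):
    """Simplified Tri-training: each classifier is retrained on the labeled
    set plus the unlabeled services on which the other two agree."""
    rng = random.Random(seed)
    models = [train_stump(*bootstrap(X_lab, y_lab, rng)) for _ in range(3)]
    for _ in range(rounds):
        retrained = []
        for i in range(3):
            j, k = [m for m in range(3) if m != i]
            X_new, y_new = list(X_lab), list(y_lab)
            for row in X_unlab:
                if models[j](row) == models[k](row):
                    X_new.append(row)
                    y_new.append(models[j](row))
            retrained.append(train_stump(X_new, y_new))
        models = retrained
    return models

def predict(models, row):
    """Reputation class of a service by majority vote of the classifiers."""
    return majority([m(row) for m in models])

# Hypothetical toy data: [average rating, on-time rate] per service.
X_lab = [[4.8, 0.97], [4.5, 0.95], [4.6, 0.90],
         [2.1, 0.60], [1.8, 0.55], [2.5, 0.70]]
y_lab = ["high", "high", "high", "low", "low", "low"]
X_unlab = [[4.9, 0.99], [4.2, 0.88], [1.5, 0.50],
           [2.8, 0.65], [4.4, 0.93], [2.0, 0.58]]

models = tri_train(X_lab, y_lab, X_unlab)
print(predict(models, [4.7, 0.96]), predict(models, [1.9, 0.50]))
```

Requiring agreement between the other two classifiers before a pseudo-label is accepted is what limits the noise added to each training set. A fuller version would also check the classifiers' estimated error rates before accepting new labels, as the original Tri-training algorithm does.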
1 http://www.amazon.cn/
2 http://www.ebay.com/