Abstract
Within a single application domain, data collected at different times, at different places, or by different devices do not necessarily form a complete data domain. To address the problem of transferring knowledge between two such domains, a theorem is proposed: the difference between the probability distributions of two domains from the same field can be expressed through the centers of their minimum enclosing balls, and the upper bound of this difference is independent of the radii. Based on this theorem, a fast center-calibration domain adaptation algorithm, center calibration-core sets support vector data description (CC-CSVDD), is proposed for large-scale domain adaptation by modifying the original support vector domain description (SVDD) algorithm. The algorithm is validated on artificial datasets and on the KDD CUP 99 intrusion detection dataset. Experimental results show that the proposed algorithm performs well.
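The role of the minimum enclosing ball (MEB) centers can be illustrated with a minimal sketch. It is not the CC-CSVDD algorithm itself: it assumes plain Euclidean space rather than a kernel feature space, and it approximates each center with a simple farthest-point iteration in the style of Badoiu and Clarkson. The function name `approx_meb_center`, the synthetic Gaussian data, and the iteration count are all hypothetical choices made for illustration.

```python
import numpy as np


def approx_meb_center(points, n_iter=1000):
    """Approximate the minimum enclosing ball of `points` (hypothetical helper).

    Farthest-point iteration: at step k, move the current center toward
    the farthest point by a fraction 1 / (k + 1).
    """
    center = points[0].astype(float)
    for k in range(1, n_iter + 1):
        dists = np.linalg.norm(points - center, axis=1)
        farthest = points[np.argmax(dists)]
        center += (farthest - center) / (k + 1)
    radius = np.linalg.norm(points - center, axis=1).max()
    return center, radius


# Two synthetic "domains" from the same field: same shape, shifted location.
# These data are an assumption for illustration, not the paper's datasets.
rng = np.random.default_rng(0)
source = rng.normal(loc=[0.0, 0.0], scale=1.0, size=(500, 2))
target = rng.normal(loc=[3.0, 0.0], scale=1.0, size=(500, 2))

c_src, r_src = approx_meb_center(source)
c_tgt, r_tgt = approx_meb_center(target)

# The distance between the two centers is the quantity the theorem
# relates to the distribution difference; the radii are reported only
# for comparison.
center_shift = np.linalg.norm(c_src - c_tgt)
print("center shift:", center_shift)
print("radii:", r_src, r_tgt)
```

In this sketch, shifting one domain moves its MEB center by roughly the same amount, while the radii stay comparable. This matches the intuition behind calibrating the center rather than the radius.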