Abstract
Common correlation coefficients measure the strength of linear or nonlinear association between variables. Building on the biased estimator (HSIC_0) of the Hilbert-Schmidt Independence Criterion (HSIC), this paper proposes a method for measuring the nonlinear correlation between the classes into which a data set is partitioned by its class labels. Experiments were run on six real data sets of different types using the linear, polynomial, RBF, and sigmoid kernel functions. The results indicate that the method is feasible.
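As a rough illustration of the quantity involved, the biased HSIC estimator is commonly written as HSIC_0 = (n-1)^{-2} tr(KHLH), where K and L are the kernel (Gram) matrices of the two samples and H = I - (1/n)11^T is the centering matrix. The sketch below computes it for two classes with each of the four kernels named in the abstract. It is a minimal sketch under stated assumptions, not the paper's implementation: the function names, the kernel parameters (`degree`, `gamma`, `coef0`), and above all the way samples from the two classes are paired (truncating both classes to the same size and pairing rows by index) are hypothetical choices that the abstract does not specify.

```python
import math


def linear_kernel(x, y):
    return sum(a * b for a, b in zip(x, y))


def polynomial_kernel(x, y, degree=2, coef0=1.0):
    # degree and coef0 are illustrative defaults, not values from the paper
    return (linear_kernel(x, y) + coef0) ** degree


def rbf_kernel(x, y, gamma=1.0):
    # gamma is an illustrative default, not a value from the paper
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def sigmoid_kernel(x, y, gamma=0.1, coef0=0.0):
    # gamma and coef0 are illustrative defaults, not values from the paper
    return math.tanh(gamma * linear_kernel(x, y) + coef0)


def gram_matrix(samples, kernel):
    """Kernel matrix with entries kernel(samples[i], samples[j])."""
    return [[kernel(a, b) for b in samples] for a in samples]


def center(K):
    """Return H K H, where H = I - (1/n) 11^T."""
    n = len(K)
    row_means = [sum(row) / n for row in K]
    col_means = [sum(K[i][j] for i in range(n)) / n for j in range(n)]
    grand_mean = sum(row_means) / n
    return [
        [K[i][j] - row_means[i] - col_means[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def hsic0(class_a, class_b, kernel=rbf_kernel):
    """Biased HSIC estimate (n-1)^-2 * tr(K H L H) between two classes.

    Assumption: both classes are truncated to the same number of rows
    and paired by index; the paper may pair samples differently.
    """
    n = min(len(class_a), len(class_b))
    if n < 2:
        raise ValueError("each class needs at least two samples")
    Kc = center(gram_matrix(class_a[:n], kernel))
    L = gram_matrix(class_b[:n], kernel)
    # tr(K H L H) equals tr((H K H) L) by the cyclic property of the trace
    trace = sum(Kc[i][j] * L[j][i] for i in range(n) for j in range(n))
    return trace / (n - 1) ** 2


if __name__ == "__main__":
    # Toy data standing in for two classes of a labelled data set
    class_a = [[0.0, 1.0], [1.0, 0.5], [2.0, 0.0], [3.0, 1.5]]
    class_b = [[0.1, 0.9], [0.9, 0.6], [2.2, 0.1], [2.9, 1.4]]
    kernels = (linear_kernel, polynomial_kernel, rbf_kernel, sigmoid_kernel)
    for kernel in kernels:
        print(kernel.__name__, hsic0(class_a, class_b, kernel))
```

Plain Python lists keep the sketch free of dependencies. With NumPy the same quantity can be written as `np.trace(K @ H @ L @ H) / (n - 1) ** 2`. Values obtained with different kernels are on different scales, so they are best compared across class pairs under one kernel rather than across kernels.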