L2-loss Large-scale Linear Nonparallel Support Vector Ordinal Regression
  • Authors: SHI Yong; LI Pei-Jia; WANG Hua-Dong
  • Affiliations: Research Center on Fictitious Economy & Data Science, Chinese Academy of Sciences; School of Computer and Control Engineering, University of Chinese Academy of Sciences; School of Mathematical Sciences, University of Chinese Academy of Sciences; Key Laboratory of Big Data Mining and Knowledge Management, Chinese Academy of Sciences
  • Keywords: ordinal regression; SVM; trust region Newton method; dual coordinate descent method
  • Journal: Acta Automatica Sinica (自动化学报)
  • Online publication date: 2018-02-06
  • Year: 2019; Volume: 45; Issue: 03; Pages: 63-75 (13 pages)
  • CN: 11-2109/TP
  • Funding: Supported by the National Natural Science Foundation of China (71110107026, 71331005, 91546201)
  • Language: Chinese
Abstract

Ordinal regression is a multi-class classification problem in which the labels carry a natural ordering; it arises widely in information retrieval, recommendation systems, and sentiment analysis. With the growth of the internet and mobile communication, traditional ordinal regression algorithms often underperform on data that are large-scale, high-dimensional, and sparse. The nonparallel support vector ordinal regression model is highly adaptable and outperforms other SVM-based methods. Building on that model, this paper proposes a large-scale linear nonparallel support vector ordinal regression model with L2 loss: the linear formulation scales to large data, and the L2 loss penalizes a sample more heavily the further its prediction deviates from the true label. In addition, two solvers are derived from different views of the model, a trust region Newton method and a dual coordinate descent (DCD) method, and their performance is compared. To verify the model's effectiveness, experiments are conducted on numerous datasets; the results show that the proposed model performs best overall, and the variant solved by DCD achieves state-of-the-art test performance.
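The abstract credits the best results to the dual coordinate descent solver. As background, the following is a minimal sketch of DCD for a binary L2-loss linear SVM in the style of Hsieh et al. [11], which the paper's ordinal model builds on; it is an illustrative simplification, not the authors' released implementation (their code is linked in the footnotes below).

```python
import numpy as np

def dcd_l2svm(X, y, C=1.0, max_iter=200, tol=1e-6):
    """Dual coordinate descent (DCD) for a binary L2-loss linear SVM.
    Sketch of the base solver from Hsieh et al. (2008); the paper's
    model extends this idea to the ordinal setting. Labels y in {-1, +1}."""
    n, d = X.shape
    alpha = np.zeros(n)
    w = np.zeros(d)
    D = 1.0 / (2.0 * C)             # L2 loss adds 1/(2C) on the dual diagonal
    Qii = (X ** 2).sum(axis=1) + D  # diagonal of the dual Hessian Q-bar
    for _ in range(max_iter):
        max_pg = 0.0
        for i in range(n):          # a random permutation is used in practice
            G = y[i] * w.dot(X[i]) - 1.0 + D * alpha[i]  # dual gradient
            PG = min(G, 0.0) if alpha[i] == 0.0 else G   # projected gradient
            max_pg = max(max_pg, abs(PG))
            if abs(PG) > 1e-12:
                a_old = alpha[i]
                alpha[i] = max(a_old - G / Qii[i], 0.0)  # closed-form 1-D update
                w += (alpha[i] - a_old) * y[i] * X[i]    # maintain w = sum a_i y_i x_i
        if max_pg < tol:
            break
    return w
```

Predictions are `sign(X @ w)`. Note that with L2 loss the dual variables are only lower-bounded at zero (no upper bound C, unlike L1 loss), which is why the update clips only at zero.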
References
1 Nakov P, Ritter A, Rosenthal S, Sebastiani F, Stoyanov V. SemEval-2016 task 4: sentiment analysis in Twitter. In: Proceedings of the 10th International Workshop on Semantic Evaluation. San Diego, CA, USA: ACL, 2016. 1-18
    2 Dikkers H, Rothkrantz L. Support vector machines in ordinal classification: an application to corporate credit scoring. Neural Network World, 2005, 15(6): 491
    3 Chang K Y, Chen C S, Hung Y P. Ordinal hyperplanes ranker with cost sensitivities for age estimation. In: Proceedings of the 2011 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Providence, RI, USA: IEEE, 2011. 585-592
    4 Gutierrez P A, Perez-Ortiz M, Sánchez-Monedero J, Fernandez-Navarro F, Hervas-Martinez C. Ordinal regression methods: survey and experimental study. IEEE Transactions on Knowledge and Data Engineering, 2016, 28(1): 127-146
    5 Zhang Xue-Gong. Introduction to statistical learning theory and support vector machines. Acta Automatica Sinica, 2000, 26(1): 32-42
    6 Chu W, Keerthi S S. Support vector ordinal regression. Neural Computation, 2007, 19(3): 792-815
    7 Lin H T, Li L. Reduction from cost-sensitive ordinal ranking to weighted binary classification. Neural Computation, 2012, 24(5): 1329-1367
    8 Perez-Ortiz M, Gutierrez P A, Hervas-Martinez C. Projection-based ensemble learning for ordinal regression. IEEE Transactions on Cybernetics, 2014, 44(5): 681-694
    9 Chang K W, Hsieh C J, Lin C J. Coordinate descent method for large-scale L2-loss linear support vector machines. The Journal of Machine Learning Research, 2008, 9: 1369-1398
    10 Wang H D, Shi Y, Niu L F, Tian Y J. Nonparallel support vector ordinal regression. IEEE Transactions on Cybernetics, 2017, 47(10): 3306-3317
    11 Hsieh C J, Chang K W, Lin C J, Keerthi S S, Sundararajan S. A dual coordinate descent method for large-scale linear SVM. In: Proceedings of the 25th International Conference on Machine Learning. New York, USA: ACM, 2008. 408-415
    12 Ho C H, Lin C J. Large-scale linear support vector regression. The Journal of Machine Learning Research, 2012, 13: 3323-3348
    13 Lin C J, Moré J J. Newton's method for large bound-constrained optimization problems. SIAM Journal on Optimization, 1999, 9(4): 1100-1127
    14 Lin C J, Weng R C, Keerthi S S. Trust region Newton method for logistic regression. The Journal of Machine Learning Research, 2008, 9: 627-650
    15 Hsia C Y, Zhu Y, Lin C J. A study on trust region update rules in Newton methods for large-scale linear classification. In: Proceedings of the 9th Asian Conference on Machine Learning (ACML). Seoul, South Korea: ACML, 2017
    16 Chiang W L, Lee M C, Lin C J. Parallel dual coordinate descent method for large-scale linear classification in multi-core environments. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. San Francisco, USA: ACM, 2016. 1485-1494
    17 Yuan G X, Chang K W, Hsieh C J, Lin C J. A comparison of optimization methods and software for large-scale l1-regularized linear classification. The Journal of Machine Learning Research, 2010, 11: 3183-3234
    18 Tseng P, Yun S. A coordinate gradient descent method for nonsmooth separable minimization. Mathematical Programming, 2009, 117(1-2): 387-423
    19 Joachims T. Making Large-scale SVM Learning Practical. Technical Report, SFB 475: Komplexitätsreduktion in Multivariaten Datenstrukturen, Universität Dortmund, Germany, 1998
    20 Wang H N, Lu Y, Zhai C X. Latent aspect rating analysis on review text data: a rating regression approach. In: Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Washington, DC, USA: ACM, 2010. 783-792
    21 Pang B, Lee L. Seeing stars: exploiting class relationships for sentiment categorization with respect to rating scales. In: Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics. Ann Arbor, USA: ACL, 2005. 115-124
    22 McAuley J, Targett C, Shi Q F, van den Hengel A. Image-based recommendations on styles and substitutes. In: Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval. Santiago, Chile: ACM, 2015. 43-52
    23 McAuley J, Pandey R, Leskovec J. Inferring networks of substitutable and complementary products. In: Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Sydney, Australia: ACM, 2015. 785-794
    24 Tang D Y, Qin B, Liu T. Document modeling with gated recurrent neural network for sentiment classification. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. Lisbon, Portugal: ACL, 2015. 1422-1432
    25 Diao Q M, Qiu M T, Wu C Y, Smola A J, Jiang J, Wang C. Jointly modeling aspects, ratings and sentiments for movie recommendation (JMARS). In: Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York, USA: ACM, 2014. 193-202
    1 The algorithm code has been uploaded to https://github.com/huadong2014/LinearNPSVOR/.
    2 The dataset is taken from http://www.cs.virginia.edu/~hw5x/dataset.html
    3 http://nlp.stanford.edu/sentiment/
    4 scale dataset v1.0: http://www.cs.cornell.edu/people/pabo/movie-review-data/
    5 http://ai.stanford.edu/~amaas/data/sentiment/
    6 http://sifaka.cs.uiuc.edu/~wang296/Data/index.html
    7 Amazon product reviews datasets: http://jmcauley.ucsd.edu/data/amazon/
    8 Note that, because no large-scale solver has yet been proposed specifically for ordinal regression and existing RedSVM solvers handle only the nonlinear model, the linear RedSVM in this paper is implemented by applying the DCD algorithm of [7] to the linear version of RedSVM.
