A second-order method for strongly convex \(\ell

A second-order method for strongly convex $\ell _1$ -regularization problems

详细信息查看全文

作者：Kimon Fountoulakis ; Jacek Gondzio
关键词：$$\ell _1$$ ℓ1 ; Regularization ; Strongly convex optimization ; Second ; order methods ; Iteration complexity ; Newton conjugate ; gradients method ; 68W40 ; 65K05 ; 90C06 ; 90C25 ; 90C30 ; 90C51
刊名：Mathematical Programming
出版年：2016
出版时间：March 2016
年：2016
卷：156
期：1-2
页码：189-219
全文大小：864 KB
参考文献：1.Acar, R., Vogel, C.R.: Analysis of bounded variation penalty methods for ill-posed problems. Inverse Problems 10, 1217–1229 (1994)CrossRef MathSciNet MATH
2.Beck, A., Teboulle, M.: A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci. 2(1), 183–202 (2009)CrossRef MathSciNet MATH
3.Becker, S.: CoSaMP and OMP for Sparse Recovery. http://www.mathworks.co.uk/matlabcentral/fileexchange/32402-cosamp-and-omp-for-sparse-recovery (2012)
4.Becker, S.R., Bobin, J., Candès, E.J.: Nesta: a fast and accurate first-order method for sparse recovery. SIAM J. Imaging Sci. 4(1), 1–39 (2011)CrossRef MathSciNet MATH
5.Becker, S.R., Candés, E.J., Grant, M.C.: Templates for convex cone problems with applications to sparse signal recovery. Math. Program. Comput. 3(3), 165–218, (2011). http://tfocs.stanford.edu
6.Boyd, S., Vandenberghe, L.: Convex Optimization. Cambridge University Press, New York (2004)CrossRef MATH
7.Chan, R.H., Chan, T.F., Zhou, H.M.: Advanced signal processing algorithms. In: Luk F.T. (ed) Proceedings of the International Society of Photo-Optical Instrumentation Engineers, SPIE, pp. 314–325 (1995)
8.Chan, T.F., Golub, G.H., Mulet, P.: A nonlinear primal-dual method for total variation-based image restoration. SIAM J. Sci. Comput. 20(6), 1964–1977 (1999)CrossRef MathSciNet MATH
9.Chang, C.-C., Lin, C.-J.: LIBSVM: a library for support vector machines. ACM Trans. Intell. Syst. Technol. 2, 27 (2011). http://www.csie.ntu.edu.tw/~cjlin/libsvm
10.Chang, K.-W., Hsieh, C.-J., Lin, C.-J.: Coordinate descent method for large-scale $\ell _2$ -loss linear support vector machines. J. Mach. Learn. Res. 9, 1369–1398 (2008)MathSciNet MATH
11.Hartley, R.I., Zisserman, A.: Multiple View Geometry in Computer Vision, 2nd edn. Cambridge University Press, Cambridge (2004). ISBN: 0521540518CrossRef MATH
12.Hsieh, C.-J., Chang, K.-W., Lin, C.-J., Keerthi, S.S., Sundararajan, S.: A dual coordinate descent method for large-scale linear SVM. In: Proceedings of the 25th International Conference on Machine Learning, ICML 2008, pp. 408–415 (2008)
13.Hsu, C.-W., Chang, C.-C., Lin, C.-J.: A practical guide to support vector classification. In: Technical report, Department of Computer Science, National Taiwan University (2010)
14.Keerthi, S.S., DeCoste, D.: A modified finite newton method for fast solution of large scale linear svms. J. Mach. Learn. Res. 6, 341–361 (2005)MathSciNet MATH
15.Kelly, C.T.: Iterative Methods for Linear and Nonlinear Equations. SIAM, Philadelphia (1995)CrossRef
16.Lewis, D.D., Yiming, Yang, Rose, T.G., Li, F.: RCV1: a new benchmark collection for text categorization research. J. Mach. Learn. Res. 5, 361–397 (2004)
17.McCallum, A.: Real-sim: real vs. simulated data for binary classification problem. http://www.cs.umass.edu/~mccallum/code-data.html
18.Needell, D., Tropp, J.A.: Cosamp: iterative signal recovery from incomplete and inaccurate samples. Appl. Comput. Harmonic Anal. 26(3), 301–321 (2009)CrossRef MathSciNet MATH
19.Renegar, J.: A Mathematical View of Interior-Point Methods in Convex Optimization. In: MOS-SIAM Series on Optimization, Cornell University, Ithaca, New York (2001)
20.Richtárik, P., Takáč, M.: Iteration complexity of randomized block-coordinate descent methods for minimizing a composite function. Math. Program 144(1–2), 1–38 (2014)CrossRef MathSciNet MATH
21.Richtárik, P., Takáč, M.: Parallel coordinate descent methods for big data optimization. In: Technical report, School of Mathematics, Edinburgh University, 2012. https://code.google.com/p/ac-dc/
22.Shalev-Shwartz, S., Tewari, A.: Stochastic methods for $\ell _1$ -regularized loss minimization. J. Mach. Learn. Res. 12(4), 1865–1892 (2011)MathSciNet MATH
23.Shewchuk, J.R.: An introduction to the conjugate gradient method without the agonizing pain. In: Technical report, Carnegie Mellon University Pittsburgh, PA, USA, (1994)
24.Sra, S., Nowozin, S., Wright, S.J.: Optimization for Machine Learning. MIT Press, Cambridge (2011)
25.Tibshirani, R.: Regression shrinkage and selection via the lasso. J. R. Stat. Soc. 58(1), 267–288 (1996)MathSciNet MATH
26.Tseng, P.: Convergence of a block coordinate descent method for nondifferentiable minimization. J. Optim. Theor. Appl. 109(3), 475–494 (2001)CrossRef MATH
27.Tseng, P.: Efficiency of coordinate descent methods on huge-scale optimization problems. SIAM J. Optim. 22, 341–362 (2012)CrossRef MathSciNet
28.Tseng, P., Yun, S.: A coordinate gradient descent method for nonsmooth separable minimization. Math. Program. Ser. B 117, 387–423 (2009)CrossRef MathSciNet MATH
29.Webb, S., Caverlee, J., Pu, C.: Introducing the webb spam corpus: using email spam to identify web spam automatically. In: Proceedings of the Third Conference on Email and Anti-Spam (CEAS), (2006)
30.Wright, S.J.: Accelerated block-coordinate relaxation for regularized optimization. SIAM J. Optim. 22(1), 159–186 (2012)CrossRef MathSciNet MATH
31.Wu, T.T., Lange, K.: Coordinate descent algorithms for lasso penalized regression. Ann. Appl. Stat. 2(1), 224–244 (2008)CrossRef MathSciNet MATH
32.Yu, H.-F., Lo, H.-Y., Hsieh, H.-P., Lou, J.-K., McKenzie, T.G., Chou, J.-W., Chung, P.-H., Ho, C.-H., Chang, C.-F., Wei, Y.-H., Weng, J.-Y., Yan, E.-S., Chang, C.-W., Kuo, T.-T., Lo, Y.-C., Chang, P.-T., Po, C., Wang, C.-Y., Huang, Y.-H., Hung, C.-W., Ruan, Y.-X., Lin, Y.-S., Lin, S.-D., Lin, H.-T., Lin, C.-J.: Feature engineering and classifier ensemble for kdd cup 2010. In: JMLR Workshop and Conference Proceedings, (2011)
33.Yuan, G.X., Ho, C.H., Lin, C.J.: Recent advances of large-scale linear classification. Proc. IEEE 100(9), 2584–2603 (2012)CrossRef
作者单位：Kimon Fountoulakis (1)
Jacek Gondzio (1)

1. School of Mathematics and Maxwell Institute, The University of Edinburgh, Peter Guthrie Tait Road, Edinburgh, EH9 3FD, UK
刊物类别：Mathematics and Statistics
刊物主题：Mathematics
Calculus of Variations and Optimal Control
Mathematics of Computing
Numerical Analysis
Combinatorics
Mathematical and Computational Physics
Mathematical Methods in Physics
出版者：Springer Berlin / Heidelberg
ISSN：1436-4646

文摘

In this paper a robust second-order method is developed for the solution of strongly convex $\ell _1$-regularized problems. The main aim is to make the proposed method as inexpensive as possible, while even difficult problems can be efficiently solved. The proposed approach is a primal-dual Newton conjugate gradients (pdNCG) method. Convergence properties of pdNCG are studied and worst-case iteration complexity is established. Numerical results are presented on synthetic sparse least-squares problems and real world machine learning problems. Keywords $\ell _1$-Regularization Strongly convex optimization Second-order methods Iteration complexity Newton conjugate-gradients method

地址：北京市海淀区学院路29号邮编：100083

电话：办公室：(+86 10)66554848；文献借阅、咨询服务、科技查新：66554700