参考文献:1.Acar, R., Vogel, C.R.: Analysis of bounded variation penalty methods for ill-posed problems. Inverse Problems 10, 1217–1229 (1994)CrossRef MathSciNet MATH 2.Beck, A., Teboulle, M.: A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci. 2(1), 183–202 (2009)CrossRef MathSciNet MATH 3.Becker, S.: CoSaMP and OMP for Sparse Recovery. http://www.mathworks.co.uk/matlabcentral/fileexchange/32402-cosamp-and-omp-for-sparse-recovery (2012) 4.Becker, S.R., Bobin, J., Candès, E.J.: Nesta: a fast and accurate first-order method for sparse recovery. SIAM J. Imaging Sci. 4(1), 1–39 (2011)CrossRef MathSciNet MATH 5.Becker, S.R., Candés, E.J., Grant, M.C.: Templates for convex cone problems with applications to sparse signal recovery. Math. Program. Comput. 3(3), 165–218, (2011). http://tfocs.stanford.edu 6.Boyd, S., Vandenberghe, L.: Convex Optimization. Cambridge University Press, New York (2004)CrossRef MATH 7.Chan, R.H., Chan, T.F., Zhou, H.M.: Advanced signal processing algorithms. In: Luk F.T. (ed) Proceedings of the International Society of Photo-Optical Instrumentation Engineers, SPIE, pp. 314–325 (1995) 8.Chan, T.F., Golub, G.H., Mulet, P.: A nonlinear primal-dual method for total variation-based image restoration. SIAM J. Sci. Comput. 20(6), 1964–1977 (1999)CrossRef MathSciNet MATH 9.Chang, C.-C., Lin, C.-J.: LIBSVM: a library for support vector machines. ACM Trans. Intell. Syst. Technol. 2, 27 (2011). http://www.csie.ntu.edu.tw/~cjlin/libsvm 10.Chang, K.-W., Hsieh, C.-J., Lin, C.-J.: Coordinate descent method for large-scale \(\ell _2\) -loss linear support vector machines. J. Mach. Learn. Res. 9, 1369–1398 (2008)MathSciNet MATH 11.Hartley, R.I., Zisserman, A.: Multiple View Geometry in Computer Vision, 2nd edn. Cambridge University Press, Cambridge (2004). ISBN: 0521540518CrossRef MATH 12.Hsieh, C.-J., Chang, K.-W., Lin, C.-J., Keerthi, S.S., Sundararajan, S.: A dual coordinate descent method for large-scale linear SVM. In: Proceedings of the 25th International Conference on Machine Learning, ICML 2008, pp. 408–415 (2008) 13.Hsu, C.-W., Chang, C.-C., Lin, C.-J.: A practical guide to support vector classification. In: Technical report, Department of Computer Science, National Taiwan University (2010) 14.Keerthi, S.S., DeCoste, D.: A modified finite newton method for fast solution of large scale linear svms. J. Mach. Learn. Res. 6, 341–361 (2005)MathSciNet MATH 15.Kelly, C.T.: Iterative Methods for Linear and Nonlinear Equations. SIAM, Philadelphia (1995)CrossRef 16.Lewis, D.D., Yiming, Yang, Rose, T.G., Li, F.: RCV1: a new benchmark collection for text categorization research. J. Mach. Learn. Res. 5, 361–397 (2004) 17.McCallum, A.: Real-sim: real vs. simulated data for binary classification problem. http://www.cs.umass.edu/~mccallum/code-data.html 18.Needell, D., Tropp, J.A.: Cosamp: iterative signal recovery from incomplete and inaccurate samples. Appl. Comput. Harmonic Anal. 26(3), 301–321 (2009)CrossRef MathSciNet MATH 19.Renegar, J.: A Mathematical View of Interior-Point Methods in Convex Optimization. In: MOS-SIAM Series on Optimization, Cornell University, Ithaca, New York (2001) 20.Richtárik, P., Takáč, M.: Iteration complexity of randomized block-coordinate descent methods for minimizing a composite function. Math. Program 144(1–2), 1–38 (2014)CrossRef MathSciNet MATH 21.Richtárik, P., Takáč, M.: Parallel coordinate descent methods for big data optimization. In: Technical report, School of Mathematics, Edinburgh University, 2012. https://code.google.com/p/ac-dc/ 22.Shalev-Shwartz, S., Tewari, A.: Stochastic methods for \(\ell _1\) -regularized loss minimization. J. Mach. Learn. Res. 12(4), 1865–1892 (2011)MathSciNet MATH 23.Shewchuk, J.R.: An introduction to the conjugate gradient method without the agonizing pain. In: Technical report, Carnegie Mellon University Pittsburgh, PA, USA, (1994) 24.Sra, S., Nowozin, S., Wright, S.J.: Optimization for Machine Learning. MIT Press, Cambridge (2011) 25.Tibshirani, R.: Regression shrinkage and selection via the lasso. J. R. Stat. Soc. 58(1), 267–288 (1996)MathSciNet MATH 26.Tseng, P.: Convergence of a block coordinate descent method for nondifferentiable minimization. J. Optim. Theor. Appl. 109(3), 475–494 (2001)CrossRef MATH 27.Tseng, P.: Efficiency of coordinate descent methods on huge-scale optimization problems. SIAM J. Optim. 22, 341–362 (2012)CrossRef MathSciNet 28.Tseng, P., Yun, S.: A coordinate gradient descent method for nonsmooth separable minimization. Math. Program. Ser. B 117, 387–423 (2009)CrossRef MathSciNet MATH 29.Webb, S., Caverlee, J., Pu, C.: Introducing the webb spam corpus: using email spam to identify web spam automatically. In: Proceedings of the Third Conference on Email and Anti-Spam (CEAS), (2006) 30.Wright, S.J.: Accelerated block-coordinate relaxation for regularized optimization. SIAM J. Optim. 22(1), 159–186 (2012)CrossRef MathSciNet MATH 31.Wu, T.T., Lange, K.: Coordinate descent algorithms for lasso penalized regression. Ann. Appl. Stat. 2(1), 224–244 (2008)CrossRef MathSciNet MATH 32.Yu, H.-F., Lo, H.-Y., Hsieh, H.-P., Lou, J.-K., McKenzie, T.G., Chou, J.-W., Chung, P.-H., Ho, C.-H., Chang, C.-F., Wei, Y.-H., Weng, J.-Y., Yan, E.-S., Chang, C.-W., Kuo, T.-T., Lo, Y.-C., Chang, P.-T., Po, C., Wang, C.-Y., Huang, Y.-H., Hung, C.-W., Ruan, Y.-X., Lin, Y.-S., Lin, S.-D., Lin, H.-T., Lin, C.-J.: Feature engineering and classifier ensemble for kdd cup 2010. In: JMLR Workshop and Conference Proceedings, (2011) 33.Yuan, G.X., Ho, C.H., Lin, C.J.: Recent advances of large-scale linear classification. Proc. IEEE 100(9), 2584–2603 (2012)CrossRef
作者单位:Kimon Fountoulakis (1) Jacek Gondzio (1)
1. School of Mathematics and Maxwell Institute, The University of Edinburgh, Peter Guthrie Tait Road, Edinburgh, EH9 3FD, UK
刊物类别:Mathematics and Statistics
刊物主题:Mathematics Calculus of Variations and Optimal Control Mathematics of Computing Numerical Analysis Combinatorics Mathematical and Computational Physics Mathematical Methods in Physics
出版者:Springer Berlin / Heidelberg
ISSN:1436-4646
文摘
In this paper a robust second-order method is developed for the solution of strongly convex \(\ell _1\)-regularized problems. The main aim is to make the proposed method as inexpensive as possible, while even difficult problems can be efficiently solved. The proposed approach is a primal-dual Newton conjugate gradients (pdNCG) method. Convergence properties of pdNCG are studied and worst-case iteration complexity is established. Numerical results are presented on synthetic sparse least-squares problems and real world machine learning problems. Keywords \(\ell _1\)-Regularization Strongly convex optimization Second-order methods Iteration complexity Newton conjugate-gradients method