Augmented Leverage Score Sampling with Bounds
详细信息    查看全文
  • 关键词:Subset selection ; Low ; rank matrix approximation ; Leverage scores ; Spectral analysis
  • 刊名:Lecture Notes in Computer Science
  • 出版年:2016
  • 出版时间:2016
  • 年:2016
  • 卷:9852
  • 期:1
  • 页码:543-558
  • 全文大小:1,561 KB
  • 参考文献:1.Boutsidis, C., Drineas, P., Magdon-Ismail, M.: Near-optimal column-based matrix reconstruction. SIAM J. Comput. 10598(i), 1–27 (2014). http://​epubs.​siam.​org/​doi/​abs/​10.​1137/​12086755X MathSciNet MATH
    2.Brouwer, A.E., Haemers, W.H.: Spectra of Graphs. Springer, New York (2011). https://​dx.​doi.​org/​10.​1007/​978-1-4614-1939-6
    3.Chan, T.F., Hansen, P.C.: Some applications of the rank revealing QR factorization. SIAM J. Sci. Comput. 13(3), 727–741 (1992)MathSciNet CrossRef MATH
    4.Drineas, P., Mahoney, M.W.: On the Nystrom method for approximating a gram matrix for improved kernel-based learning (Extended abstract). In: Proceedings of Learning Theory, vol. 3559, pp. 323–337 (2005). Go to ISI://WOS:000230769100022
    5.Drineas, P., Mahoney, M., Muthukrishnan, S.: Relative-error CUR matrix decompositions. SIAM J. Matrix Anal. Appl. 30(2), 844–881 (2008)MathSciNet CrossRef MATH
    6.Efron, B., Hastie, T., Johnstone, I., Tibshirani, R.: Least angle regression. Ann. Stat. 32(2), 407–499 (2004)MathSciNet CrossRef MATH
    7.Gittens, A.: The spectral norm error of the naive Nystrom extension. arXiv preprint, pp. 1–9 (2011). http://​arxiv.​org/​pdf/​1110.​5305 , http://​arxiv.​org/​abs/​1110.​5305
    8.Gittens, A., Mahoney, M.W.: Revisiting the Nystrom method for improved large-scale machine learning. ICML 28, 1–45 (2013)MATH
    9.Golub, G.: Numerical methods for solving linear least squares problems. Numer. Math. 7(1), 206–216 (1965)MathSciNet CrossRef MATH
    10.Klimt, B., Yang, Y.: Introducing the Enron corpus. In: CEAS (2004)
    11.Leskovec, J., Lang, K.J., Dasgupta, A., Mahoney, M.W.: Community structure in large networks: natural cluster sizes and the absence of large well-defined clusters. Internet Math. 6(1), 29–123 (2009)MathSciNet CrossRef MATH
    12.Lichman, M.: UCI machine learning repository (2013). http://​archive.​ics.​uci.​edu/​ml
    13.Mahoney, M.W.: Algorithmic and Statistical Perspectives on Large-Scale Data Analysis, p. 33 (2010). http://​arxiv.​org/​abs/​1010.​1609
    14.Mahoney, M.M.W.: Randomized algorithms for matrices and data. arXiv preprint arXiv:​1104.​5557 , p. 49 (2011)
    15.Papailiopoulos, D., Kyrillidis, A., Boutsidis, C.: Provable deterministic leverage score sampling. arXiv preprint, pp. 997–1006 (2014). http://​dl.​acm.​org/​citation.​cfm?​doid=​2623330.​2623698
    16.Paul, S., Magdon-Ismail, M., Drineas, P.: Column Selection via Adaptive Sampling, pp. 1–9 (2015). http://​arxiv.​org/​abs/​1510.​04149
  • 作者单位:Daniel J. Perry (17)
    Ross T. Whitaker (17)

    17. SCI Institute, University of Utah, 72 S. Central Campus Drive, Salt Lake City, UT, 84112, USA
  • 丛书名:Machine Learning and Knowledge Discovery in Databases
  • ISBN:978-3-319-46227-1
  • 刊物类别:Computer Science
  • 刊物主题:Artificial Intelligence and Robotics
    Computer Communication Networks
    Software Engineering
    Data Encryption
    Database Management
    Computation by Abstract Devices
    Algorithm Analysis and Problem Complexity
  • 出版者:Springer Berlin / Heidelberg
  • ISSN:1611-3349
  • 卷排序:9852
文摘
We introduce a modification to the well studied leverage score sampling algorithm which takes into account data scale, called the augmented leverage score, and introduce an initial error bound proof for the case of deterministic sampling – which to our knowledge is the first bound for this augmented leverage score. We discuss the implications of the error bounds proof and present an empirical evaluation of the proposed augmented leverage score performance on the column subsample selection problem (CSSP) as compared to the traditional leverage score and other methods in both a deterministic and probabilistic sampling paradigm. We show that the augmentation of the leverage score improves the empirical performance on CSSP significantly for many kinds of data.

© 2004-2018 中国地质图书馆版权所有 京ICP备05064691号 京公网安备11010802017129号

地址:北京市海淀区学院路29号 邮编:100083

电话:办公室:(+86 10)66554848;文献借阅、咨询服务、科技查新:66554700