Incremental Manifold Learning Regular Optimization Algorithm on Tangent Space and Feature Space Alignment
  • Authors: Tan Chao; Ji Genlin; Zhao Bin
  • Keywords: high-dimensional streaming big data; adaptive incremental feature extraction; feature space alignment; regularization optimization
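  • Journal code: SJCJ
  • Journal: Journal of Data Acquisition and Processing
  • Affiliations: School of Computer Science and Engineering, Southeast University; School of Computer Science and Technology, Nanjing Normal University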
  • Publication date: 2017-11-15
  • Publisher: Journal of Data Acquisition and Processing
  • Year: 2017
  • Issue: v.32; No.146
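  • Funding: National Natural Science Foundation of China (41471371, 61702270); Natural Science Foundation of the Jiangsu Higher Education Institutions (15KJB520022); China Postdoctoral Science Foundation (2017M621592)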
  • Language: Chinese
  • Pages: SJCJ201706009
  • Page count: 12
  • CN: 06
  • ISSN: 32-1367/TN
  • Classification number: 77-88
Abstract
The emergence and growth of high-dimensional streaming big data pose many challenges for traditional machine learning and data mining algorithms. Exploiting the fact that such data arrive as a stream, this paper first builds an adaptive incremental feature extraction model. Then, for noisy environments, it builds an incremental manifold learning model based on feature space alignment to address the small-sample-size problem. Finally, it constructs a regularization optimization framework for manifold learning that addresses the dimensionality reduction error arising during feature extraction from high-dimensional data streams, and obtains the final optimal solution. Experimental results show that the proposed framework satisfies three evaluation criteria for manifold learning algorithms: stability, improvement, and a learning curve that rises quickly to a relatively stable level. It thereby achieves efficient learning on high-dimensional data streams.
