Implementation of a MapReduce-Based Parallel PLS Algorithm for Process Monitoring (基于MapReduce的并行PLS过程监控算法实现)
  • English title: Implementation of parallel PLS algorithm of process monitoring using MapReduce
  • Authors: 王德政; 张益农; 杨帆
  • Authors (English): WANG Dezheng; ZHANG Yinong; YANG Fan
  • Keywords: cloud computing; process monitoring; MapReduce; partial least squares (PLS); parallel algorithm
  • Journal code: JSGG
  • Journal: Computer Engineering and Applications (计算机工程与应用)
  • Affiliations: Beijing Key Laboratory of Information Service Engineering, Beijing Union University; Department of Automation, Tsinghua University
  • Publication date: 2018-04-03 14:26
  • Publisher: Computer Engineering and Applications
  • Year: 2018
  • Issue: v.54; No.919
  • Funding: National Natural Science Foundation of China (No.61433001); Importation and Development of High-Caliber Talents Project of Beijing Municipal Institutions (No.CIT&TCD20150314)
  • Language: Chinese
  • Page: JSGG201824011
  • Page count: 6
  • CN: 24
  • Classification number: 66-70+180
Abstract
Partial least squares (PLS) is one of the multivariate statistical process monitoring methods most commonly used in modern industrial processes. However, running PLS regression on large-scale industrial process data on a single PC is computation-intensive and time-consuming. To address this, a parallel PLS algorithm based on the MapReduce framework is proposed on the Hadoop cloud platform. Guided by time-complexity considerations, the cross-validation step of PLS is the part that is parallelized. The algorithm is validated by regression analysis of data from the Tennessee Eastman process simulation platform, run on a fully distributed three-node Hadoop cluster built from three ordinary PCs. The experimental results show that, for PLS analysis of large-scale industrial process data, the proposed algorithm preserves accuracy while significantly reducing modeling time, that its speedup grows approximately linearly with the number of PCs, and that it can be scaled up easily.
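The idea of parallelizing the cross-validation step can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a single response variable (PLS1 fitted with NIPALS), uses the minimum prediction error sum of squares (PRESS) as a simple stand-in for the paper's cross-validity criterion, and simulates the map and reduce phases in plain Python rather than on Hadoop. All function names (`pls1_coefficients`, `map_fold`, `reduce_press`, `parallel_cv`) are hypothetical.

```python
import numpy as np
from collections import defaultdict


def pls1_coefficients(X, y, n_components):
    """Fit PLS1 with NIPALS on centered data; return (x_mean, y_mean, B)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean           # working residuals
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xr.T @ yr                         # weight vector
        norm = np.linalg.norm(w)
        if norm < 1e-12:                      # nothing left to explain
            break
        w = w / norm
        t = Xr @ w                            # score vector
        tt = t @ t
        p = Xr.T @ t / tt                     # X loading
        c = yr @ t / tt                       # y loading
        Xr = Xr - np.outer(t, p)              # deflate X
        yr = yr - c * t                       # deflate y
        W.append(w)
        P.append(p)
        q.append(c)
    if not W:
        return x_mean, y_mean, np.zeros(X.shape[1])
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    B = W @ np.linalg.solve(P.T @ W, q)       # regression coefficients
    return x_mean, y_mean, B


def map_fold(task):
    """Map step: one (fold, component count) pair -> (component count, partial PRESS)."""
    X, y, folds, k, a = task
    test = folds[k]
    train = np.concatenate([f for j, f in enumerate(folds) if j != k])
    x_mean, y_mean, B = pls1_coefficients(X[train], y[train], a)
    residual = y[test] - ((X[test] - x_mean) @ B + y_mean)
    return a, float(residual @ residual)


def reduce_press(pairs):
    """Reduce step: sum the partial PRESS values that share a component count."""
    press = defaultdict(float)
    for a, partial in pairs:
        press[a] += partial
    return dict(press)


def parallel_cv(X, y, max_components, n_folds=5, seed=0, mapper=map):
    """Pick the component count with the smallest cross-validated PRESS."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    tasks = [(X, y, folds, k, a)
             for k in range(n_folds)
             for a in range(1, max_components + 1)]
    press = reduce_press(mapper(map_fold, tasks))
    best = min(press, key=press.get)
    return best, press
```

Each `(fold, component count)` task is independent of the others, which is what makes this step a natural fit for the map phase. In this sketch the `mapper` argument defaults to the built-in sequential `map`; passing the `map` method of a `multiprocessing.Pool` would spread the same tasks over local processes, whereas on a Hadoop cluster each task would instead run as a separate map task on a worker node.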
