The Development and Challenges of Industrial Big Data Analysis Technology (工业大数据分析技术的发展及其面临的挑战)
  • English title: The Development and Challenges of Industrial Big Data Analysis Technology
  • Authors: HE Wentao (何文韬); SHAO Cheng (邵诚)
  • Keywords: information technology; intelligent; intelligent manufacturing; industrial big data; analytic technique
  • Journal code (Chinese journal title field): XXYK
  • Journal (English name): Information and Control
  • Affiliation: Institute of Advanced Control Technology, Dalian University of Technology (大连理工大学先进控制技术研究所)
  • Publication date: 2018-07-02 10:43
  • Publisher: Information and Control (信息与控制)
  • Year: 2018
  • Issue: v.47
  • Funding: National High Technology Research and Development Program of China (2014AA041802-2)
  • Language: Chinese
  • Page: XXYK201804003
  • Number of pages: 13
  • CN: 04
  • ISSN: 21-1138/TP
  • Classification number: 18-30
Abstract
The rapid development of information technology and the widespread adoption of the Internet have prompted countries around the world to launch "re-industrialization" strategies one after another. The Internet's strong support for data transmission, management software, and information applications has opened the door to a range of technologies, including the Internet of Things. A global industrial revolution led by intelligent manufacturing, which uses emerging technologies to raise the level of industrial intelligence and improve enterprise competitiveness, is now on the agenda, and industrial intelligence will be key to the future industrial system. Industrial big data analysis technologies and applications built on the industrial Internet will become a key factor in advancing intelligent manufacturing and improving manufacturing productivity and competitiveness, and they form an important basis for making production processes, process management, and manufacturing modes intelligent. This paper analyzes and discusses the technologies involved in industrial big data analysis, including data storage and management, data processing, and visualization, and also examines the research and application prospects of industrial big data analysis technology and the challenges it faces.
