Business Intelligence Data Management Based on the Corporate Information Factory
Abstract
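As data volumes surge both inside and outside the enterprise, people face a growing number of data management problems, and frequent data management incidents underline how important data management is. Like any other productive activity, data management can be made more efficient and more valuable with suitable tools and techniques. Business intelligence (BI) is one such data management tool. Its core is to extract useful data from an enterprise's operational systems and cleanse it to ensure correctness; the data then passes through the ETL process (extraction, transformation and load) and is merged into a target data warehouse, yielding a global view of enterprise data. BI tools analyze and process the data on the basis of this global view, and the results are presented to managers to support their decision-making.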
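The pipeline described above (extract, cleanse, transform, load, then analyze) can be illustrated with a minimal sketch. It is not taken from the thesis: the sample rows, the `fact_orders` table and every function name are hypothetical, and an in-memory SQLite database stands in for the data warehouse.

```python
import sqlite3

# Hypothetical rows "extracted" from an operational system (illustrative only).
SOURCE_ROWS = [
    {"order_id": "1001", "customer": " Alice ", "amount": "120.50"},
    {"order_id": "1002", "customer": "Bob", "amount": "n/a"},   # invalid amount
    {"order_id": "1003", "customer": "", "amount": "75"},       # missing customer
    {"order_id": "1004", "customer": "alice", "amount": "30.00"},
]


def extract():
    """Extraction: read raw rows from the (simulated) operational system."""
    return list(SOURCE_ROWS)


def clean(rows):
    """Cleansing: drop rows that fail basic correctness checks."""
    good = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # amount is not numeric
        if not row["customer"].strip():
            continue  # customer name is empty
        good.append({**row, "amount": amount})
    return good


def transform(rows):
    """Transformation: convert types and normalize the customer name."""
    return [
        (int(r["order_id"]), r["customer"].strip().title(), r["amount"])
        for r in rows
    ]


def load(conn, records):
    """Load: write the conformed records into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fact_orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO fact_orders VALUES (?, ?, ?)", records)
    conn.commit()


def run_etl(conn):
    load(conn, transform(clean(extract())))


def revenue_by_customer(conn):
    """Analysis on the global view: total order amount per customer."""
    return conn.execute(
        "SELECT customer, SUM(amount) FROM fact_orders "
        "GROUP BY customer ORDER BY customer"
    ).fetchall()


if __name__ == "__main__":
    warehouse = sqlite3.connect(":memory:")
    run_etl(warehouse)
    print(revenue_by_customer(warehouse))
```

In this sketch cleansing happens before transformation, following the order given in the abstract; a real pipeline would also log rejected rows instead of silently dropping them.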
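     However, BI theory and tools have their limitations. In BI data management, a limited field of view leaves some stages of the full data management life cycle missing, so that data management activity is one-sided and incomplete. The resulting data services are unconvincing in terms of credibility and efficiency, which opens the door to a steady stream of data problems. The Corporate Information Factory (CIF) is a logical architecture for the whole data management process, and studying BI data management on the basis of the CIF can resolve these difficulties to some extent. BI aims to give managers reliable decision support, and the quality of the data service is the key factor in whether BI-based decisions succeed or fail.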
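     Data management problems arise constantly, and the BI data management measures proposed in this study are generally applicable. They comprise measures at the theoretical level, namely a data ecosystem (a data feedback flow and a data quality management framework), and measures at the implementation level, namely data governance policies and the design of data management workflows. The data quality management framework covers data lineage analysis (data assessment), data integration (ECTL), data standardization, data matching and consolidation, and continuous monitoring.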
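Three of the framework steps named above (standardization, matching and consolidation, and a monitoring metric) can be sketched as follows. This is only an illustration under stated assumptions, not the framework defined in the thesis: the records, field names and functions are hypothetical, and matching on an exactly equal standardized name is a deliberate simplification of real record matching.

```python
import re
from collections import defaultdict

# Hypothetical customer records from two source systems (illustrative only).
RECORDS = [
    {"source": "crm", "name": "ACME  Corp.", "phone": "(555) 0100", "email": None},
    {"source": "erp", "name": "acme corp", "phone": "555-0100",
     "email": "info@acme.example"},
    {"source": "crm", "name": "Globex Ltd", "phone": None,
     "email": "sales@globex.example"},
]


def standardize(record):
    """Standardization: lowercase names, strip punctuation, keep phone digits."""
    name = re.sub(r"[^a-z0-9 ]", "", record["name"].lower())
    name = " ".join(name.split())  # collapse repeated whitespace
    phone = re.sub(r"\D", "", record["phone"]) if record["phone"] else None
    return {**record, "name": name, "phone": phone}


def consolidate(records):
    """Matching and consolidation: group by standardized name, merge fields."""
    groups = defaultdict(list)
    for rec in map(standardize, records):
        groups[rec["name"]].append(rec)  # assumption: equal name means same entity
    merged = []
    for name, group in groups.items():
        golden = {"name": name, "sources": sorted({r["source"] for r in group})}
        for field in ("phone", "email"):
            # Take the first non-empty value found across the matched records.
            golden[field] = next((r[field] for r in group if r[field]), None)
        merged.append(golden)
    return merged


def completeness(records, field):
    """Monitoring metric: fraction of records with a non-empty value for field."""
    if not records:
        return 0.0
    return sum(1 for r in records if r[field]) / len(records)


if __name__ == "__main__":
    golden_records = consolidate(RECORDS)
    for rec in golden_records:
        print(rec)
    print("phone completeness:", completeness(golden_records, "phone"))
```

Tracking a metric such as `completeness` over successive loads is one simple way to approximate the continuous monitoring step; which metrics and thresholds to use would depend on the governance policy in force.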
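     Continuous advances in science and technology keep BI technologies in a constant state of upgrading. As BI moved from the emergence of the theory, through its formal proposal, to its development and refinement, data storage technology, databases, data warehouses, data marts, online analytical processing, data mining and information visualization advanced step by step alongside these three stages. The theories of the Corporate Information Factory (CIF) and business intelligence (BI) and these technologies for collecting, processing and analyzing data complement and influence one another: theory sets the tone and direction for technical development, while continual technical improvement creates the conditions for validating and refining the theory. Data processing resembles other forms of human productive labor in that a powerful tool such as BI can raise its efficiency and usefulness. Meanwhile, the emergence and steady improvement of visualization technology has given the commercial value chain of BI and BI data management a complete "closed loop", running from "data integration" through "data analysis" and "data mining" to "data presentation".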
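     BI data management products are appearing in large numbers, each with its own characteristics and its own share of its target market. Current products are nevertheless far from perfect, and both practical application needs and the needs of research call urgently for their improvement.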
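     This study attempts to solve problems in BI data management practice at the theoretical level (the data ecosystem and the data quality framework) and at the implementation level (data governance policies and workflow design). With the rapid development of the relevant technologies, in particular data warehousing, ETL, OLAP, data mining and information visualization, the technical basis for optimization is already in place. CIF-based BI data management can draw on these technical results and propose corresponding improvements at the theoretical level, thereby achieving the goal of providing efficient, trustworthy data services for BI decision-making.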
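     Modern enterprises face increasingly fierce commercial competition, and the value of data as one of an enterprise's key assets is becoming ever more evident. Efficient, trustworthy data services help an enterprise secure its place in this competition. Establishing a CIF-based BI data management mechanism within the enterprise, and implementing reforms that combine the theoretical level (data ecosystem and data management framework) with the implementation level (data governance policies and data management workflows), clearly improved data management at the large electronic technology company examined in the case study. It at least achieved the expected goal of giving management more efficient and more trustworthy data services for business decisions than before: proactively monitoring and cleansing data for all applications so that data stays clean; enabling business units to share responsibility for data quality and data governance; and achieving better business outcomes with trustworthy enterprise data.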
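     By summarizing the market positioning and functional features of the current mainstream commercial BI data management products, this study analyzes the status quo and identifies its shortcomings. It also briefly introduces development trends and research hotspots in BI data management, such as the currently influential topic of big data management.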
With the surge in the volume of data, people face numerous data management (DM) problems, which highlight the importance of data management. Business Intelligence (BI) is a data management technology that extracts useful data from enterprise operational systems and cleans it to ensure its correctness. The data is then merged into a data warehouse through the extraction, transformation and load (ETL) process. BI tools analyze and process the data, and the results are presented to managers to support their decision-making.
     However, BI theory and tools have their limitations. A limited field of view leaves stages of the data management life cycle missing, which causes defects in the data services. Basing BI data management on the Corporate Information Factory (CIF) can address these issues to some extent.
     This paper puts forward generally applicable measures for BI data management (BIDM), including a data ecosystem (a data feedback flow and a data quality management framework), data governance policies, and the design of data management workflows.
     With the continuous development of science and technology, business intelligence technology is constantly being updated. Data storage technology, databases, data warehouses, data marts, online analytical processing, data mining and information visualization techniques have advanced step by step alongside the development of BI theory. The theories of the Corporate Information Factory (CIF) and Business Intelligence (BI) and these techniques are therefore complementary and influence each other.
     BIDM products keep emerging, each with its own characteristics. However, they are not perfect with respect to the needs of either practical application or research.
     With the help of the data ecosystem, this study builds a data quality framework, data management policies and workflows in order to provide efficient and reliable data services for CIF-based business intelligence decision-making.
     This study attempts to solve the problems in business intelligence data management practice at both the theoretical level and the implementation level.
     Modern enterprises face increasingly fierce competition, and the value of data is becoming more and more important.
     Efficient and reliable data services help enterprises develop amid fierce competition and achieve the expected goals: (1) active monitoring and cleaning of data for all applications; (2) business units sharing responsibility for data quality control and data management; (3) better results achieved with the help of credible data.
     This study summarizes and analyzes the market positioning and functional characteristics of existing BIDM products, and analyzes and forecasts the development trends of BIDM, with a brief introduction to big data management issues.
