Research on Data Warehouse Construction and Key ETL Technologies for Multi-Type Data Sources
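Abstract
Building and applying data warehouses is an essential step in enterprise informatization. Over the past decade, a large number of data warehouse systems of varying scale have emerged worldwide to serve data integration, management, and decision support. The types of data sources feeding data warehouses have also become increasingly diverse. In particular, the emergence of real-time data sources such as Web data and text data poses new challenges for data warehouse construction and ETL. Data warehouse technology faces several pressing problems: how to build a complete data warehouse architecture that accommodates multiple types of data sources; how to implement the ETL process efficiently at each layer of the architecture; how to guarantee real-time ETL; and how to improve the data warehouse access control model.
     Addressing the characteristics of multi-type data sources, this dissertation first analyzes the requirements of existing data warehouses and the kinds of data sources. Taking the national marine data warehouse system as an example, it uses a two-stage ETL process, local ETL followed by global ETL, and derives a data warehouse architecture for multi-type data sources comprising an extraction layer, an archive layer, a summary layer, a warehouse layer, and an application layer. The design rationale and role of each layer are discussed in detail. On this basis, the dissertation studies several key problems in each layer.
     The extraction and archive layers extract and archive data. The ETL software at these layers extracts data from the data sources and loads it into the archive database, and is therefore called local ETL. The dissertation focuses on local ETL techniques for three kinds of data sources: unstructured Web pages, semi-structured text, and structured relational databases. First, for local ETL of unstructured Web page sources, it proposes a method for collecting and storing Web pages that is more efficient than traditional approaches. A page is divided into several regions according to its layout, and these regions serve as the units of change detection, storage, and processing.
     Second, for local ETL of semi-structured text sources, the dissertation concentrates on semi-structured, non-self-describing scientific text data and proposes a method for relationalizing text data, converting the text model to an object model and then to a relational model. Ensuring the efficiency and security of relationalization is also a focus of this work.
     Third, for local ETL of structured relational database sources, the dissertation analyzes and summarizes the main factors affecting ETL engine performance and proposes a new ETL method based on a distributed database. It also proposes a metadata-driven ETL method to overcome the shortcomings of existing ETL tools and hand coding. Based on the E-LT approach, a metadata-driven ETL tool is implemented in SQL and its execution performance is tested in detail.
     The summary and warehouse layers integrate data from the archive areas of the individual data sources into the data warehouse; this ETL process is called global ETL. Because of the data warehouse's real-time requirements, multi-source global ETL must not only solve data integration but also schedule ETL in real time or near real time. The dissertation proposes triggering the ETL process and allocating resources according to the integration's own rules, which addresses both the scheduling of global ETL and its contention with other data warehouse applications for warehouse resources. Because real-time ETL holds warehouse resources exclusively while it runs, applications are temporarily unable to connect to the data warehouse and are effectively offline. The dissertation designs a client framework that supports offline operation, making short offline periods transparent to client users. This offline client framework is an environment-aware software framework with a degree of generality.
     The application layer mainly comprises query and retrieval, OLAP, data mining, and similar applications, together with their access control systems. Data warehouse applications, and the data warehouse itself, need a sound access control mechanism. The dissertation proposes two access control models. The role- and context-based access control model extends the classical role-based access control model and suits data warehouse applications and any end-user-facing software system. The purpose-based access control model suits systems that serve application software, such as database systems and data warehouse systems. Building on the latter, the dissertation further studies an algorithm for mining hierarchical relationships among purposes.
     In summary, the dissertation proposes a data warehouse architecture and layering for multi-type data sources and, based on that architecture, analyzes and studies the key problems of each layer. All proposed models and algorithms are accompanied by implementation methods or have been applied in real projects; theoretical analysis and experiments demonstrate the feasibility and effectiveness of the proposed methods and techniques. The research as a whole centers on the design and implementation of data warehouses and ETL processes, ensures real-time, flexible, and efficient data flow and access in the data warehouse system, and offers some guidance for building data warehouses and implementing ETL.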
Creating and applying data warehouses is an essential step in enterprise informatization. Over the past decade, many data warehouse systems of different scales have appeared to address the integration and management of historical data and to support decision making. The data sources of data warehouses have become increasingly varied. In particular, new real-time data sources such as Web and textual data bring new challenges to data warehousing and ETL. Data warehouse technology faces several pressing problems: how to build a complete data warehouse architecture that adapts to various data sources; how to implement an efficient ETL process at each layer of the data warehouse system; how to guarantee real-time or near-real-time ETL; and how to improve the access control model of the data warehouse.
     Focusing on the characteristics of multi-type data sources, this dissertation first analyzes the requirements of existing data warehouses and the categories of data sources, and divides the overall ETL process into two stages, local ETL and global ETL. Taking the national marine data warehouse system as an example, it proposes a data warehouse architecture oriented to various data sources, comprising an extraction layer, archive layer, summary layer, warehouse layer, and application layer, and describes the design and function of each layer in detail. On this basis, the key techniques of each layer are studied.
     The extraction and archive layers extract and archive data. The ETL software at these layers extracts data from the various data sources into the archive database, so it is called local ETL. This dissertation studies local ETL for three kinds of data sources: unstructured Web pages, semi-structured text, and structured relational databases. First, for unstructured Web pages, a more efficient approach to collecting and storing pages is proposed. The approach divides a Web page into blocks according to its layout and treats these blocks as the units of version comparison, incremental storage, and later processing.
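As a rough illustration of block-level change detection and incremental storage, the sketch below hashes each layout block and stores only the blocks whose content differs from the last stored version. It is not the dissertation's algorithm: the abstract does not say how pages are segmented or compared, so the pre-segmented `blocks` dictionary, the SHA-256 comparison, and the `BlockArchive` class are all assumptions made for this example. Removed blocks are not tracked.

```python
import hashlib


def digest(text):
    """Return a SHA-256 hex digest used to compare block contents."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


class BlockArchive:
    """Hypothetical incremental store keyed by layout block id."""

    def __init__(self):
        self.latest = {}   # block id -> digest of the most recently stored content
        self.deltas = []   # (version, {block id: text}) holding changed blocks only

    def store(self, version, blocks):
        """Store only new or changed blocks; return their ids, sorted."""
        changed = {
            block_id: text
            for block_id, text in blocks.items()
            if self.latest.get(block_id) != digest(text)
        }
        self.deltas.append((version, changed))
        for block_id, text in changed.items():
            self.latest[block_id] = digest(text)
        return sorted(changed)


# Two crawls of the same page: only the "body" block differs in the second.
archive = BlockArchive()
archive.store(1, {"header": "Ocean News", "body": "Tide report A"})
changed = archive.store(2, {"header": "Ocean News", "body": "Tide report B"})
```

Storing per-block deltas means an unchanged navigation bar or header is written once, however often the page is re-crawled.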
     Second, for semi-structured textual data sources, the dissertation studies non-self-describing, semi-structured scientific data and proposes an approach to relationalizing textual data that converts the text model to an object model and then to a relational model. The efficiency and security of the conversion are also addressed.
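The text-to-object-to-relation chain can be sketched as follows. Because the data is non-self-describing, the meaning of each column has to come from an external format description. The `FIELD_SPEC` list, the `Observation` class, the sample line format, and the use of SQLite are all invented for this example and are not taken from the dissertation.

```python
import sqlite3
from dataclasses import dataclass, astuple

# Assumed external format description: field name and converter, in column order.
FIELD_SPEC = [("station", str), ("depth_m", float), ("temperature_c", float)]


@dataclass
class Observation:
    """Object model for one line of the (hypothetical) text format."""
    station: str
    depth_m: float
    temperature_c: float


def parse_line(line):
    """Text model -> object model: split on whitespace, convert per FIELD_SPEC."""
    parts = line.split()
    if len(parts) != len(FIELD_SPEC):
        raise ValueError(f"expected {len(FIELD_SPEC)} fields, got {len(parts)}")
    values = {name: conv(raw) for (name, conv), raw in zip(FIELD_SPEC, parts)}
    return Observation(**values)


def save(conn, observations):
    """Object model -> relational model: one row per object."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS observation "
        "(station TEXT, depth_m REAL, temperature_c REAL)"
    )
    conn.executemany(
        "INSERT INTO observation VALUES (?, ?, ?)",
        [astuple(o) for o in observations],
    )


conn = sqlite3.connect(":memory:")
objs = [parse_line(line) for line in ["S01 10 18.5", "S01 50 12.25"]]
save(conn, objs)
```

Keeping the object model between the two steps lets validation and type conversion happen once, before any row reaches the database.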
     Third, for structured relational database sources, the factors affecting ETL performance are summarized and a new ETL approach based on a distributed database system is proposed. Furthermore, a metadata-driven ETL approach is proposed to give the ETL tool better flexibility, extensibility, and operability. Based on these approaches, a SQL-based, metadata-driven ETL tool is implemented and tested to demonstrate its efficiency.
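One way to picture a SQL-based, metadata-driven ETL step is a mapping record that is turned into a single `INSERT ... SELECT` statement, so that the transformation runs inside the database engine. The sketch below assumes a made-up metadata layout (`MAPPING`), made-up table names, and SQLite as the engine; the dissertation's actual metadata model is not described in the abstract.

```python
import sqlite3

# Hypothetical mapping metadata: target column -> SQL expression over the source.
MAPPING = {
    "source": "src_sample",
    "target": "dw_sample",
    "columns": {
        "station": "UPPER(station)",
        "temp_c": "(temp_f - 32) * 5.0 / 9.0",
    },
    "filter": "temp_f IS NOT NULL",
}


def build_sql(mapping):
    """Generate one INSERT ... SELECT statement from the mapping metadata."""
    cols = ", ".join(mapping["columns"])
    exprs = ", ".join(mapping["columns"].values())
    sql = (
        f"INSERT INTO {mapping['target']} ({cols}) "
        f"SELECT {exprs} FROM {mapping['source']}"
    )
    if mapping.get("filter"):
        sql += f" WHERE {mapping['filter']}"
    return sql


def run(conn, mapping):
    """Execute the generated statement; the database does the transformation."""
    conn.execute(build_sql(mapping))
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE src_sample (station TEXT, temp_f REAL)")
conn.execute("CREATE TABLE dw_sample (station TEXT, temp_c REAL)")
conn.executemany(
    "INSERT INTO src_sample VALUES (?, ?)",
    [("s01", 212.0), ("s02", None), ("s03", 32.0)],
)
run(conn, MAPPING)
```

Changing a transformation then means editing the metadata rather than the code. Note that identifiers and expressions are interpolated into the SQL text, so in this sketch the metadata must come from a trusted source.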
     The summary and warehouse layers integrate data from the various sources in the archive layer into the warehouse layer; this ETL process is called global ETL. Given real-time requirements, global ETL faces not only data integration issues but also the problem of scheduling ETL in real time or near real time. To decide when global ETL runs and to resolve its competition with other applications for data warehouse resources, a new scheduling approach for real-time ETL is proposed, which triggers the ETL process and assigns resources according to the integration rules. Because real-time ETL uses all resources exclusively while executing, running applications temporarily lose their connection to the data warehouse. To keep end users unaware of this intermittent connectivity, a client framework supporting occasional connectivity is designed. The offline client framework is an environment-aware smart software framework with a degree of generality.
     The application layer of the data warehouse includes query, search, OLAP, and data mining applications, and should also include a well-organized access control mechanism. Both the applications and the data warehouse itself need a sound access control mechanism. Two access control models are proposed in this dissertation. The role- and context-based access control model extends the classical role-based access control (RBAC) model; it suits data warehouse applications and all user-oriented applications. The other is a purpose-based access control model, which suits databases, data warehouse systems, and other application-oriented systems. Furthermore, building on the latter model, an algorithm for mining hierarchical relationships among purposes is studied.
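The idea of extending RBAC with context can be illustrated by attaching a context predicate to each role permission, so that a request is granted only if the user holds a permitted role and the current context satisfies the predicate. This is a minimal sketch under that assumption; the `RoleContextPolicy` class, the context keys, and the example roles are hypothetical and do not reproduce the model defined in the dissertation.

```python
from typing import Callable, Dict, Set, Tuple

Context = Dict[str, object]
Rule = Callable[[Context], bool]


class RoleContextPolicy:
    """Hypothetical RBAC policy whose permissions carry a context condition."""

    def __init__(self):
        self.user_roles: Dict[str, Set[str]] = {}
        self.rules: Dict[Tuple[str, str, str], Rule] = {}

    def assign(self, user, role):
        """Give a user a role."""
        self.user_roles.setdefault(user, set()).add(role)

    def permit(self, role, action, resource, when=lambda ctx: True):
        """Allow role to perform action on resource when the predicate holds."""
        self.rules[(role, action, resource)] = when

    def check(self, user, action, resource, ctx):
        """Grant access if any of the user's roles has a rule satisfied by ctx."""
        for role in self.user_roles.get(user, ()):
            rule = self.rules.get((role, action, resource))
            if rule is not None and rule(ctx):
                return True
        return False


policy = RoleContextPolicy()
policy.assign("alice", "analyst")
policy.permit(
    "analyst", "query", "sales_cube",
    when=lambda ctx: ctx.get("network") == "intranet",
)
allowed = policy.check("alice", "query", "sales_cube", {"network": "intranet"})
```

With the default predicate the policy behaves like plain RBAC, which is what makes this an extension and not a replacement.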
     In conclusion, this dissertation proposes a data warehouse architecture oriented to various data sources, together with its layers. Based on the architecture, the key techniques of each layer are analyzed and studied. All the proposed approaches and models have been implemented or applied in practical projects, and their feasibility and effectiveness have been demonstrated by theoretical analysis and experiments. The research as a whole focuses on the design and implementation of data warehousing and its ETL processes, and ensures timely, flexible, and efficient data flow and data access in the data warehouse system. This work offers guidance for building data warehouses and implementing ETL systems.
