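Web Integration (WI), the integration of Internet information resources, is a broad and highly interdisciplinary emerging research field. It is closely related to databases, artificial intelligence, information systems, and other disciplines, and it has also brought new research topics to them. Although research on Web information access and data integration has been under way for some time, in different directions and with methods drawn from different disciplines, no systematic method or set of conclusions has yet emerged, and some problems remain unsolved. This thesis identifies the data model, knowledge representation and processing, and practicality and automated processing capability as the key issues of a WI system. On this basis, it studies WI comprehensively and obtains the following results: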
     (6) Implemented WISK (Web Integration Service Kits), a software toolkit for WI systems, and presented an example of a real application developed with it.
The integration of information on the WWW, or Web Integration (WI) for short, is a broad, interdisciplinary, and emerging research area that is closely related to fields such as databases, AI, and information systems. WI has also brought many new problems to those fields. Although WI has been studied for some time from various directions and with various methods, no systematic method or set of conclusions is yet available, and some problems remain unsolved. This thesis identifies three key issues for any WI system: a unified data model, knowledge representation and processing, and practicality and automation. Based on these issues, the thesis studies WI as a whole and obtains the following results:
    (1) Ontology-guided techniques and an architecture for WI systems.
    (2) The Deductive Object Model with Semi-structured Features (DOMSF), proposed as the unified data model for WI systems. With rich data types and high flexibility, DOMSF fits the requirements of the WI environment well. Its support for rule deduction makes it more powerful and expressive, so it can represent complex relations between objects, not only inheritance.
    (3) An extension of the semantics of ontology that integrates it with the DOMSF data model and, on that basis, an ontology representation language named ORL. ORL is highly expressive and supports both syntactic and semantic interoperability.
    (4) A layered architecture for accessing the dynamic web, based on object methodology. In this model, pages are treated as templates, and the data on the web as a web of objects. A source description language, TDL, was then designed. TDL combines HTML structure patterns with text patterns, which makes it better suited to dynamic web content and to frequently changing data sources.
    (5) Algorithms for ontology-based information retrieval and optimized data storage on top of an RDBMS, presented together with proofs of their soundness and completeness.
    (6) Finally, building on the above, WISK, a toolkit for rapidly building WI systems, and a prototype application developed with it, an Integrated Product Catalog for E-commerce. Both are described in this thesis.
    Web Integration has promising application prospects. Possible domains include e-business, intelligent information retrieval, digital libraries, web data mining, and enterprise knowledge portals, all of which stand to benefit greatly from WI. It is no exaggeration to say that Web Integration will be one of the key techniques of the next generation of the web and of the new era of e-service applications.
