A Semantic Keyword Patent Retrieval Method for Innovative Product Design
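Abstract
In recent years, innovative design has become increasingly important in the field of product design. A key problem in innovative product design is knowledge acquisition, that is, how to extract knowledge that meets design requirements from massive amounts of information. For example, both the fuzzy front-end stage of TRIZ-based innovative conceptual design and the stage of solving for principle solutions or domain solutions require a large amount of relevant knowledge. An important source of such knowledge today is patents, and the usual method is keyword search. Because that search is not semantics-based, however, retrieval quality still leaves room for improvement. This thesis focuses on semantics-based keyword retrieval in order to improve the quality and efficiency of retrieval, which has both theoretical significance and practical value.
     This work first studies text preprocessing methods for patents and builds a patent database for semantic keyword retrieval, then studies a semantic keyword retrieval method, and on that basis gives an application example. The main work is as follows:
     (1) A text preprocessing method for patent data is presented. Patents downloaded from the website of the State Intellectual Property Office are converted in format and stored: HTML text is converted with JTidy, and PDF text is converted with commercial software (Adobe Acrobat Professional 8.0). The results are then stored, through database modeling, in the patent database built by our research group, preparing the data for retrieval from that database by keywords or semantic keywords.
     (2) A semantics-based keyword retrieval method is presented. It serves two purposes. To address the problem that patent texts carry no keywords, the method automatically extracts keywords from a given patent text on the basis of semantic understanding. To address the limited scope of ordinary keyword search, semantic keywords are used to widen the range of relevant patents retrieved. Finally, a semantic keyword retrieval module was implemented in software.
     (3) A patent retrieval example, concerning knowledge for the innovative scheme design of snow and ice removal vehicles, illustrates the application of the proposed semantic keyword retrieval method.
     The above research contributes to patent retrieval for knowledge acquisition in innovative design. The results are incorporated as a module into the "Innovation-Oriented Patent Knowledge Retrieval System" developed by our research group, which uses semantic keywords to retrieve the required knowledge from the patent database and provides designers with a reference for scheme design in innovative design.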
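The preprocessing step in item (1) can be sketched roughly as follows. This is only a minimal illustration under stated assumptions: it swaps in Python's standard `html.parser` for JTidy and an in-memory SQLite database for the research group's patent database, whose schema the abstract does not describe. The names `TextExtractor`, `html_to_text`, `store_patent` and the `patents` table are hypothetical, and PDF conversion is left out.

```python
import sqlite3
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the content of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style elements.
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self):
        return " ".join(self._chunks)


def html_to_text(html):
    """Reduce an HTML patent page to plain text (stand-in for the JTidy step)."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


def store_patent(conn, patent_id, title, html):
    """Store the cleaned text in a hypothetical `patents` table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS patents ("
        "patent_id TEXT PRIMARY KEY, title TEXT, body TEXT)"
    )
    conn.execute(
        "INSERT OR REPLACE INTO patents VALUES (?, ?, ?)",
        (patent_id, title, html_to_text(html)),
    )
    conn.commit()
```

Joining text chunks with spaces suits English pages. For Chinese patent text a real pipeline would more likely join without separators and run word segmentation afterwards.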
In recent decades, innovative design has become increasingly important in the field of mechanical product design. One of the key problems of innovative design is knowledge acquisition, that is, how to obtain information that meets the design requirements from massive amounts of information. For example, in the fuzzy front-end stage of TRIZ-based innovative conceptual design, obtaining the principle solution and the domain solution requires a large amount of relevant knowledge. At present, such knowledge is mainly acquired from patents by keyword search. However, the scope of the search is limited by the absence of semantic search. This paper focuses on semantics-based keyword search in order to improve the quality and efficiency of retrieval, and has both theoretical significance and practical reference value.
     This paper first studies a method for extracting text information from patents and builds a database that can be searched by semantic keywords, then studies a retrieval method based on semantic keywords, and finally gives application examples on that basis. The work covers the following aspects:
     (1) This paper proposes a text preprocessing method for patents downloaded from the State Intellectual Property Office website. It converts HTML documents with JTidy and PDF documents with commercial software. The data are then stored in the database by a program, so that the patent database can be searched by keywords or semantic keywords and is prepared for further extension with additional patents.
     (2) This paper presents a semantics-based keyword retrieval method, which serves two purposes. To deal with the problem that some patents have no keywords, the approach automatically extracts the keywords of a patent from statistics computed on the basis of semantic understanding. To deal with the limited scope of keyword retrieval, semantic keywords are employed to extend the coverage of patent retrieval.
     (3) A patent search example serving the innovative design of snow removal vehicles is used to validate the patent text information extraction method and the semantic keyword retrieval method proposed in this paper.
     The above research benefits patent retrieval for knowledge acquisition oriented to innovative design, and its results form part of "The patent retrieval system oriented to innovative design" developed by our research group. The approach employs semantic keywords to obtain the required knowledge by searching the patent database, giving designers a reference for scheme design in innovative design.
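The two uses of semantic keywords described in item (2) can be illustrated with a toy sketch. The abstract does not say which semantic resource or statistics the thesis relies on, so everything here is an assumption: the hand-written `THESAURUS`, the function names `extract_keywords`, `expand_query` and `search`, and the simple frequency-based scoring are all hypothetical, and English tokens stand in for segmented Chinese words.

```python
import re
from collections import Counter

# Hypothetical toy thesaurus: concept -> terms treated as semantically related.
# A real system would draw these relations from a semantic lexicon.
THESAURUS = {
    "deicing": {"deicing", "defrosting", "thawing"},
    "snow": {"snow", "ice", "slush"},
    "vehicle": {"vehicle", "truck", "sweeper"},
}


def tokenize(text):
    """Lower-case the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def extract_keywords(text, top_k=3, thesaurus=THESAURUS):
    """Rank concepts by the pooled frequency of their related terms."""
    term_to_concept = {
        term: concept for concept, terms in thesaurus.items() for term in terms
    }
    counts = Counter(
        term_to_concept[tok] for tok in tokenize(text) if tok in term_to_concept
    )
    return [concept for concept, _ in counts.most_common(top_k)]


def expand_query(keywords, thesaurus=THESAURUS):
    """Add every term that shares a thesaurus entry with a query keyword."""
    expanded = set()
    for keyword in keywords:
        keyword = keyword.lower()
        expanded.add(keyword)
        for terms in thesaurus.values():
            if keyword in terms:
                expanded |= terms
    return expanded


def search(documents, keywords, thesaurus=THESAURUS):
    """Return ids of matching documents, most expanded-term hits first."""
    terms = expand_query(keywords, thesaurus)
    scored = []
    for doc_id, text in documents.items():
        hits = sum(1 for tok in tokenize(text) if tok in terms)
        if hits:
            scored.append((hits, doc_id))
    return [doc_id for _, doc_id in sorted(scored, key=lambda p: (-p[0], p[1]))]
```

With the documents `{"A": "A truck that melts ice on roads.", "B": "A snow sweeper removes snow."}`, a literal search for "snow" would find only B, whereas the expanded query also reaches A through the related term "ice". This is the kind of widened coverage the abstract attributes to semantic keywords.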
