Research and Implementation of a Component Retrieval Method Based on Faceted Description
Abstract
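Software component technology is the core technology for achieving software reuse. Its basic idea is to create reusable software components and use them to develop new application software. Component-based software development (CBSD) can effectively reduce development cost and improve both development efficiency and software quality. As research on and practice of CBSD have deepened, the number of components has kept growing and component libraries have kept expanding, so providing reusers with an effective component retrieval method has become a core problem that software reuse urgently needs to solve. Facet-based component retrieval has been widely studied and applied in the software reuse community, and existing methods draw on XML, tree matching, ontologies, and other related techniques. Shortcomings remain, however: for example, query statements are not parsed and component matching is computed imprecisely, and both issues need further exploration.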
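     Building on the existing faceted classification and description of components, this paper gives a new component description model. To address the long retrieval times caused by the large number of components in a library, and the inability of existing component encodings to meet requirements, it gives a new term encoding strategy. A component term index is built on this term encoding and used to preprocess the components in the library, which improves retrieval efficiency.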
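The abstract does not spell out the term encoding itself, so the following is only a minimal sketch of the indexing idea, assuming a plain inverted index keyed by the raw term string. The library contents, `build_term_index`, and `candidates` are hypothetical names chosen for illustration and are not taken from the paper.

```python
from collections import defaultdict

# Hypothetical component library: component id -> {facet name: [terms]}.
COMPONENTS = {
    "c1": {"function": ["sort", "search"], "platform": ["linux"]},
    "c2": {"function": ["search"], "platform": ["windows"]},
    "c3": {"function": ["compress"], "platform": ["linux"]},
}

def build_term_index(components):
    """Preprocess the library once: map each term to the ids that use it."""
    index = defaultdict(set)
    for comp_id, facets in components.items():
        for terms in facets.values():
            for term in terms:
                index[term].add(comp_id)
    return index

def candidates(index, query_terms):
    """Return ids of components that contain at least one query term."""
    found = set()
    for term in query_terms:
        # .get avoids creating empty entries for unknown terms.
        found |= index.get(term, set())
    return found

index = build_term_index(COMPONENTS)
print(sorted(candidates(index, ["search", "linux"])))
```

Because the index is built ahead of time, a query only touches the components that share a term with it; the more expensive structural matching can then run on that reduced candidate set.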
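     By analyzing the particular characteristics of natural language parsing in component retrieval, together with existing Chinese word segmentation methods, the paper gives a forward character-by-character maximum matching method. It parses a query statement into component terms that the component library can recognize, and the component term index is then used to find the components that contain these terms.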
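As an illustration of the segmentation step, here is a sketch of standard forward maximum matching, grown one character at a time. It assumes the vocabulary is simply the set of terms known to the component library and that unmatched characters are skipped; the paper's exact variant may differ. The function name and the sample vocabulary are hypothetical.

```python
def forward_max_match(text, vocabulary):
    """Scan left to right, always taking the longest vocabulary term."""
    max_len = max((len(term) for term in vocabulary), default=0)
    terms = []
    i = 0
    while i < len(text):
        match = None
        # Extend the candidate character by character; keep the longest hit.
        for j in range(i + 1, min(len(text), i + max_len) + 1):
            if text[i:j] in vocabulary:
                match = text[i:j]
        if match is None:
            i += 1  # no library term starts here, so skip this character
        else:
            terms.append(match)
            i += len(match)
    return terms

# Hypothetical term vocabulary: component, retrieval, component retrieval,
# sorting, algorithm.
VOCABULARY = {"构件", "检索", "构件检索", "排序", "算法"}

# Query: "find component retrieval for sorting algorithms".
print(forward_max_match("查找排序算法的构件检索", VOCABULARY))
```

Characters that begin no known term are dropped, so only terms the library can recognize are passed on to the index lookup.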
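     Given the characteristics of components with faceted descriptions, existing methods that directly apply the three tree matching models to compute the match between components are not very accurate. The paper therefore gives a new tree matching model, tree inclusion matching, improves the calculation of matching cost on that basis, and introduces the concept of matching degree to describe how well components match, so that matching between components can be analyzed from multiple angles.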
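The abstract defines neither the tree inclusion model nor the matching degree formula, so the sketch below is a deliberately simplified stand-in rather than the paper's method. It assumes a facet tree is a `(label, children)` tuple, counts a query path as matched when its labels appear in order along some root-to-leaf path of the component tree (intermediate nodes may be skipped), and takes the matching degree to be the fraction of matched query paths. All names and the sample trees are hypothetical.

```python
def root_to_leaf_paths(tree):
    """List every root-to-leaf label path of a (label, children) tree."""
    label, children = tree
    if not children:
        return [[label]]
    return [[label] + path
            for child in children
            for path in root_to_leaf_paths(child)]

def is_subsequence(short, long):
    """True if the labels of `short` occur in `long` in the same order."""
    remaining = iter(long)
    return all(label in remaining for label in short)

def matching_degree(query, component):
    """Fraction of query paths embedded in some component path (assumed definition)."""
    query_paths = root_to_leaf_paths(query)
    component_paths = root_to_leaf_paths(component)
    hits = sum(
        any(is_subsequence(q, c) for c in component_paths)
        for q in query_paths
    )
    return hits / len(query_paths)

# Hypothetical facet trees for one component and one query.
COMPONENT = ("component", [
    ("function", [("sort", [])]),
    ("platform", [("os", [("linux", [])])]),
])
QUERY = ("component", [
    ("function", [("sort", [])]),
    ("platform", [("linux", [])]),
    ("language", [("java", [])]),
])

print(matching_degree(QUERY, COMPONENT))
```

Here two of the three query paths are embedded in the component tree, so the sketch reports a degree of about 0.67. A graded score of this kind, unlike a yes/no match, is what makes it possible to rank candidates and apply a threshold.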
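     Finally, experiments analyze recall and precision at different matching degrees and select a suitable matching degree threshold. Recall and precision at this threshold are compared with those of existing component retrieval methods that use spatial encoding and multiple tree matching models, and the retrieval time of the proposed method is analyzed to verify its effectiveness. The experimental results show that the proposed method effectively improves precision and retrieval efficiency while maintaining high recall.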
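The threshold analysis can be sketched with the standard definitions of precision (relevant retrieved over retrieved) and recall (relevant retrieved over relevant). The scores and relevance judgments below are made-up illustrative values, not results from the paper, and `precision_recall` is a hypothetical helper.

```python
def precision_recall(scores, relevant, threshold):
    """Retrieve components whose matching degree reaches the threshold."""
    retrieved = {comp for comp, score in scores.items() if score >= threshold}
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Illustrative matching degrees and relevance judgments (assumed values).
SCORES = {"c1": 0.9, "c2": 0.7, "c3": 0.4, "c4": 0.2}
RELEVANT = {"c1", "c2", "c4"}

for threshold in (0.3, 0.5, 0.8):
    p, r = precision_recall(SCORES, RELEVANT, threshold)
    print(f"threshold={threshold:.1f} precision={p:.2f} recall={r:.2f}")
```

Raising the threshold generally trades recall for precision, which is why a threshold has to be chosen experimentally rather than fixed in advance.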
