A Domain-Oriented Semantic Clustering Method for Web Services
  • English title (as published): Web Service Semantic Clustering Method Oriented Domain
  • Authors: ZHAO Yi (赵一); LI Zhao (李昭); CHEN Peng (陈鹏); HE Jing-sha (何泾沙); HE Ke-qing (何克清)
  • Affiliations: School of Computer Science, Wuhan University; College of Computer and Information, China Three Gorges University
  • Keywords: semantic latent Dirichlet allocation; Word2vec; Web services clustering
  • Journal (Chinese title): XXWX
  • Journal (English title): Journal of Chinese Computer Systems
  • Publication date: 2019-01-15
  • Publisher: 小型微型计算机系统 (Journal of Chinese Computer Systems)
  • Year: 2019
  • Issue: v.40
  • Funding: National Key Research and Development Program of China (2016YFC0802500, 2016YFB0800403); National Natural Science Foundation of China (61562073); China Three Gorges University Talent Special Fund (8000303)
  • Language: Chinese
  • Pages: XXWX201901016
  • Page count: 8
  • CN: 01
  • ISSN: 21-1106/TP
  • Classification code: 83-90
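Abstract
At present, most Web services published on the Internet are described in natural language, and this unstructured form of description makes automatic analysis and processing by machines very difficult. How to improve the efficiency and precision of service discovery has become one of the research hotspots in service computing. Service clustering is an important supporting technology for service discovery: clustering and organizing semantically similar services helps improve discovery results. Current service clustering techniques mainly apply models such as LDA (latent Dirichlet allocation) and K-means within a single domain, and these methods still have limitations. For example, they do not fully exploit the semantic relations between words to reduce dimensionality, so service discovery results are less than ideal. To address this problem, this paper uses a neural network model (the word2vec model) to obtain a synonym table from service descriptions and to generate a domain feature word set, so as to reduce the dimensionality of service feature vectors as far as possible. On this basis, it proposes the S-LDA (Semantic Latent Dirichlet Allocation) model to cluster services within the same domain, and thereby builds a domain-oriented Web service clustering framework (Domain Semantic aided Web Service Clustering, DSWSC). Experiments on a service data set published on the ProgrammableWeb website show that, compared with methods such as LDA and K-means, the proposed method achieves clear improvements in entropy, clustering purity, and F-measure, which helps improve the accuracy of service search.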
Currently, most Web services published on the Internet are described in natural language, and this kind of unstructured description makes automatic analysis and processing difficult. How to improve the efficiency and accuracy of service discovery has become a hot topic in the field of service computing. Service clustering is an important fundamental technology for service discovery. Clustering and organizing semantically similar services helps improve the effectiveness of service discovery. Current service clustering technology mainly adopts the LDA (Latent Dirichlet Allocation) and K-means models. These methods still have some limitations when used for service clustering; e.g., they are unable to reduce dimensionality by using lexical semantic relations. To solve this problem, this paper first creates synonyms for service descriptions with a neural network model (the word2vec model), and then uses a decision tree classifier to classify service domains. Afterwards, an improved S-LDA (Semantic Latent Dirichlet Allocation) model is proposed to cluster semantically similar services. In this way, a domain-oriented service semantic clustering method (DSWSC) is proposed. Experiments conducted on the service data set published on ProgrammableWeb show that our approach outperforms the LDA and K-means methods in entropy, clustering purity, and F-measure, which can help improve the accuracy of service discovery.
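The abstract reports results in terms of entropy and clustering purity. As a rough illustration of what these two measures capture, the sketch below uses their common textbook definitions; the record does not give the paper's exact formulas, so these definitions are an assumption. The function names and the toy data are hypothetical.

```python
import math
from collections import Counter, defaultdict


def _class_counts(cluster_ids, true_labels):
    """Group items by cluster and count the reference labels inside each cluster."""
    if len(cluster_ids) != len(true_labels):
        raise ValueError("cluster_ids and true_labels must have the same length")
    counts = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, true_labels):
        counts[cluster][label] += 1
    return counts


def purity(cluster_ids, true_labels):
    """Fraction of items that carry the majority label of their cluster (higher is better)."""
    counts = _class_counts(cluster_ids, true_labels)
    return sum(max(c.values()) for c in counts.values()) / len(true_labels)


def entropy(cluster_ids, true_labels):
    """Size-weighted mean of the per-cluster label entropy in bits (lower is better)."""
    counts = _class_counts(cluster_ids, true_labels)
    total = len(true_labels)
    result = 0.0
    for c in counts.values():
        size = sum(c.values())
        # Shannon entropy of the label distribution inside this cluster.
        h = -sum((k / size) * math.log2(k / size) for k in c.values())
        result += (size / total) * h
    return result


# Hypothetical toy data: six services, two clusters, two reference domains.
cluster_ids = [0, 0, 0, 1, 1, 1]
true_labels = ["mapping", "mapping", "social", "social", "social", "social"]
print(purity(cluster_ids, true_labels))
print(entropy(cluster_ids, true_labels))
```

On this toy input, five of the six services match their cluster's majority domain, so purity is 5/6, and the weighted entropy is about 0.459 bits. A clustering that matches the reference domains exactly gives purity 1 and entropy 0.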
