Application of an Improved DBSCAN Clustering Algorithm in Social Tagging
  • English title: Clustering Social Tags with Improved DBSCAN Algorithm
  • Authors: 熊回香 (Xiong Huixiang); 叶佳鑫 (Ye Jiaxin); 蒋武轩 (Jiang Wuxuan)
  • Keywords: DBSCAN; Tag Clustering; User Clustering; Tag Expansion
  • Journal code: XDTQ
  • Journal: Data Analysis and Knowledge Discovery
  • Institution: School of Information Management, Central China Normal University
  • Publication date: 2018-12-25
  • Publisher: Data Analysis and Knowledge Discovery (数据分析与知识发现)
  • Year: 2018
  • Issue: v.2; No.24
  • Funding: An outcome of the National Social Science Fund of China project "Research on Mining Semantic Relations Between Tags in Folksonomy" (Grant No. 12BTQ038)
  • Language: Chinese
  • Record ID: XDTQ201812009
  • Page count: 12
  • CN: 10-1478/G2
  • Pages: 81-92
Abstract
[Objective] To improve the DBSCAN algorithm and verify its feasibility and effectiveness in social tagging. [Methods] Drawing on the characteristics of social tagging, we analyze how often each tag is used to annotate a resource and the total number of times each tag occurs, and mine the relationship between tags and resources to improve the DBSCAN clustering algorithm. Building on the improved algorithm, we implement tag clustering, user clustering, and the expansion of user tags. [Results] In comparative experiments on data from Douban Movies, the improved DBSCAN algorithm, when applied to social tagging, raises the ratio of the correlation among objects within a cluster to the correlation between clusters, so clustering quality improves. [Limitations] The choice of data used to build the vectors is limited: the sample data represent user and resource features only at a fairly coarse level and were not mined in depth. [Conclusions] By analyzing the characteristics of social tagging, this paper improves the DBSCAN algorithm, raises its effectiveness, and offers a new line of thinking for further improvements.
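The abstract names two ingredients without giving formulas: tag vectors built from how often a tag annotates each resource and how often the tag occurs overall, and an evaluation based on the ratio of within-cluster to between-cluster correlation. The sketch below is one plausible reading, not the authors' implementation. The weighting (per-resource count divided by the tag's total count), the use of cosine similarity, the parameter values, and all function names and sample data are assumptions made for illustration; the clustering step is textbook DBSCAN rather than the paper's modified version.

```python
import math
from collections import defaultdict

def tag_vectors(annotations):
    """One sparse vector per tag. ASSUMED weighting:
    count(tag, resource) / total occurrences of tag."""
    counts = defaultdict(lambda: defaultdict(int))
    for tag, resource in annotations:
        counts[tag][resource] += 1
    return {
        tag: {r: c / sum(per_res.values()) for r, c in per_res.items()}
        for tag, per_res in counts.items()
    }

def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * v.get(r, 0.0) for r, w in u.items())
    norm = math.sqrt(sum(w * w for w in u.values())) * \
           math.sqrt(sum(w * w for w in v.values()))
    return dot / norm if norm else 0.0

def dbscan(vectors, eps, min_pts):
    """Textbook DBSCAN with cosine distance (1 - similarity).
    Returns {tag: cluster_id}; -1 marks noise."""
    def neighbours(p):
        return [q for q in vectors
                if 1.0 - cosine(vectors[p], vectors[q]) <= eps]

    labels, cluster = {}, -1
    for p in vectors:
        if p in labels:
            continue
        seeds = neighbours(p)
        if len(seeds) < min_pts:      # not a core point: provisional noise
            labels[p] = -1
            continue
        cluster += 1
        labels[p] = cluster
        queue = [q for q in seeds if q != p]
        while queue:
            q = queue.pop()
            if labels.get(q) == -1:   # noise reachable from a core point
                labels[q] = cluster   # becomes a border point
            if q in labels:
                continue
            labels[q] = cluster
            q_seeds = neighbours(q)
            if len(q_seeds) >= min_pts:
                queue.extend(q_seeds)  # q is core: keep expanding
    return labels

def cohesion_ratio(vectors, labels):
    """Mean within-cluster similarity divided by mean between-cluster
    similarity, ignoring noise. Higher means tighter, better separated."""
    intra, inter = [], []
    tags = [t for t in vectors if labels[t] != -1]
    for i, a in enumerate(tags):
        for b in tags[i + 1:]:
            sim = cosine(vectors[a], vectors[b])
            (intra if labels[a] == labels[b] else inter).append(sim)
    if not intra or not inter:
        return None
    mean_inter = sum(inter) / len(inter)
    return (sum(intra) / len(intra)) / mean_inter if mean_inter else math.inf

# Hypothetical (tag, movie) annotations, invented for this example.
annotations = [
    ("sci-fi", "m1"), ("sci-fi", "m1"), ("sci-fi", "m2"),
    ("space", "m1"), ("space", "m2"),
    ("aliens", "m1"), ("aliens", "m2"), ("aliens", "m2"), ("aliens", "m3"),
    ("romance", "m3"), ("romance", "m4"),
    ("love", "m3"), ("love", "m3"), ("love", "m4"),
    ("tearjerker", "m3"), ("tearjerker", "m4"),
    ("classic", "m1"), ("classic", "m3"),
]
vectors = tag_vectors(annotations)
labels = dbscan(vectors, eps=0.3, min_pts=3)
ratio = cohesion_ratio(vectors, labels)
```

With this toy data the science-fiction tags and the romance tags form two clusters, and "classic", which is split evenly across both groups of movies, is left as noise. The same functions would apply to user clustering if each user were represented by a vector over tags or resources instead.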
