Do second-order similarities provide added-value in a hybrid approach?
  • Authors: Bart Thijs (1)
    Edgar Schiebel (2)
    Wolfgang Glänzel (1) (3)
  • Keywords: Bibliographic coupling; Text mining; Hybrid clustering; Similarity measures; Public health
  • Journal: Scientometrics
  • Publication date: September 2013
  • Volume: 96
  • Issue: 3
  • Pages: 667-677
  • Full-text size: 352KB
  • Author affiliations:

    1. Centre for R&D Monitoring (ECOOM) and Department of MSI, KU Leuven, Leuven, Belgium
    2. AIT Austrian Institute of Technology GmbH, Vienna, Austria
    3. Department of Science Policy and Scientometrics, LHAS, Budapest, Hungary
  • ISSN:1588-2861
Abstract
Recent studies on first- and second-order similarities have shown that the latter outperform the former as input for document clustering or partitioning applications. First-order similarities based on bibliographic coupling or on lexical approaches come with specific methodological issues, such as sparse matrices and sensitivity to spelling variants or context differences. Second-order similarities were proposed to tackle these problems and to take the lexical context into account. A hybrid combination of both types of similarities has also proved to be an important improvement, integrating the strengths of the two approaches and diminishing their weaknesses. In this paper we extend the notion of second-order similarity by applying it in the context of the hybrid approach. We conclude that there is no added value for the clearly defined clusters but that the second-order similarity can provide an additional viewpoint for the more general clusters.
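
The distinction between first-order, second-order, and hybrid similarities can be illustrated with a minimal Python sketch. This is not the authors' implementation: the cosine measure, the inclusion of the diagonal in the second-order step, the linear weighting used for the hybrid, and the toy matrices `refs` and `terms` are all assumptions chosen for illustration only.

```python
import numpy as np

def cosine_matrix(X):
    """Pairwise cosine similarity between the rows of X."""
    X = np.asarray(X, dtype=float)
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    norms[norms == 0] = 1.0          # avoid division by zero for empty rows
    U = X / norms
    return U @ U.T

def second_order(S):
    """Second-order similarity: compare documents by their similarity profiles.

    Assumption: each row of the first-order matrix S (diagonal included)
    is treated as a feature vector and compared with cosine again.
    """
    return cosine_matrix(S)

def hybrid(S_citation, S_text, weight=0.5):
    """Hypothetical hybrid: a simple weighted linear combination."""
    return weight * S_citation + (1.0 - weight) * S_text

# Toy data (assumed): 3 documents x 4 cited references, 3 documents x 3 terms.
refs = np.array([[1, 1, 0, 0],
                 [0, 1, 1, 0],
                 [0, 0, 1, 1]])
terms = np.array([[1, 0, 1],
                  [1, 1, 0],
                  [0, 1, 1]])

first_cit = cosine_matrix(refs)      # bibliographic coupling (first order)
first_txt = cosine_matrix(terms)     # lexical similarity (first order)
second_cit = second_order(first_cit)
hybrid_first = hybrid(first_cit, first_txt)
hybrid_second = hybrid(second_order(first_cit), second_order(first_txt))
```

In this toy case documents 0 and 2 share no cited references, so their first-order coupling is zero, yet both are coupled with document 1, which gives them a non-zero second-order similarity. This is the sense in which second-order measures reduce the sparsity of the similarity matrix.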
