Analysis of Authors' Collaboration Patterns in Co-authorship Networks
Abstract
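Rapid advances in science and technology and the steady deepening of scientific research make it difficult for researchers to complete a project or paper alone. Dividing the work and discussing problems together can raise both the quality of research results and the efficiency of research. Academic papers, the main form in which research results appear, reflect this: almost all of them are co-authored by several people, single-author papers are rare, and the trend is becoming more pronounced. At the same time, many steps in science and technology management, such as selecting review experts and formulating science and technology policy, involve evaluating experts and scholars. Such evaluation should therefore not look only at individual attributes such as professional title or the number and quality of publications; it should also examine scholars' research collaboration behavior from the perspective of the co-authorship relationship.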
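     Against this background, this thesis takes the co-authorship network as its basis (a co-authorship network consists of nodes and edges, where a node represents an author and an edge indicates that two authors have published a paper together) and focuses on the collaboration patterns of authors in such networks. These patterns are divided into three aspects: centrality, the distribution pattern across research directions, and author roles based on network structure. Author centrality is further divided into two parts: eigenvector centrality that takes collaboration strength into account, and extensity centrality. For the data set, papers from the proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining were selected, and the ACM SIGKDD co-authorship network was constructed from them.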
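The network definition given here (authors as nodes, an edge between two authors who have published together) can be sketched in a few lines. The example is a minimal illustration, not the thesis's own code: the function name `build_coauthorship_network` and the toy paper list are assumptions, and real proceedings data would need author-name disambiguation first.

```python
from collections import Counter
from itertools import combinations

def build_coauthorship_network(papers):
    """Return (paper_counts, joint_counts) for a list of author lists."""
    paper_counts = Counter()   # author -> number of papers
    joint_counts = Counter()   # frozenset({a, b}) -> number of joint papers
    for authors in papers:
        unique = sorted(set(authors))          # ignore repeated names on one paper
        paper_counts.update(unique)
        for a, b in combinations(unique, 2):   # every author pair on the paper
            joint_counts[frozenset((a, b))] += 1
    return paper_counts, joint_counts

# Hypothetical toy data: each inner list is the author list of one paper.
papers = [
    ["A", "B", "C"],
    ["A", "B"],
    ["C", "D"],
    ["E"],
]
paper_counts, joint_counts = build_coauthorship_network(papers)

# Degree centrality: number of distinct co-authors of each author.
degree = Counter()
for pair in joint_counts:
    for author in pair:
        degree[author] += 1
```

Keeping the joint-paper counts alongside the edges matters for the later steps, because the collaboration strength between two authors depends on how often they published together, not only on whether they did.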
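     The thesis applies Salton's measure, which is used to measure the strength of collaboration between two regions, to the co-authorship network in order to measure the collaboration strength between authors. This strength is then introduced into eigenvector centrality, and the eigenvector centrality obtained with and without collaboration strength is compared. The results show that introducing Salton-based collaboration strength does change the computed eigenvector centrality. The correlation between strength-weighted eigenvector centrality and degree centrality is also analyzed.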
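Salton's measure is commonly written as s_ij = c_ij / sqrt(c_i * c_j), where c_ij is the number of papers two authors wrote together and c_i, c_j are their individual paper counts. The sketch below assumes that form and a plain power iteration for eigenvector centrality; the abstract does not give the thesis's exact formulation, and the function names and toy counts are hypothetical.

```python
import math

def salton_strength(joint, papers_i, papers_j):
    """Salton (cosine) strength: joint papers over the geometric mean of paper counts."""
    return joint / math.sqrt(papers_i * papers_j)

def eigenvector_centrality(nodes, weights, iterations=500, tol=1e-12):
    """Power iteration for the leading eigenvector of a symmetric weighted graph.

    `weights` maps (u, v) pairs to non-negative edge weights. The graph is
    assumed to be connected; otherwise the result depends on the start vector.
    """
    neighbors = {n: {} for n in nodes}
    for (u, v), w in weights.items():
        neighbors[u][v] = w
        neighbors[v][u] = w
    score = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Iterate with A + I instead of A: same eigenvectors, but no
        # oscillation on bipartite graphs.
        new = {n: score[n] + sum(w * score[m] for m, w in neighbors[n].items())
               for n in nodes}
        norm = math.sqrt(sum(x * x for x in new.values()))
        new = {n: x / norm for n, x in new.items()}
        converged = max(abs(new[n] - score[n]) for n in nodes) < tol
        score = new
        if converged:
            break
    return score

# Toy data (hypothetical): paper counts per author and joint paper counts.
paper_counts = {"A": 2, "B": 2, "C": 2, "D": 1}
joint_counts = {("A", "B"): 2, ("A", "C"): 1, ("B", "C"): 1, ("C", "D"): 1}

strengths = {(u, v): salton_strength(c, paper_counts[u], paper_counts[v])
             for (u, v), c in joint_counts.items()}
unit = {pair: 1.0 for pair in joint_counts}   # the same edges without strength

nodes = sorted(paper_counts)
weighted = eigenvector_centrality(nodes, strengths)
unweighted = eigenvector_centrality(nodes, unit)
```

On this toy network the ranking changes: with unit weights C, who has the most co-authors, ranks above A, while with Salton weights A and B, who wrote both of their papers together, rank above C.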
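     Authors' collaboration patterns also include the extensity (breadth) of their collaborative relationships. Building on this idea, a new centrality that measures how actively an author collaborates is proposed: extensity centrality. Its computation is based on Shannon's entropy and uses both the Salton-based collaboration strength and the subgroups defined by lambda sets. The results show that authors with high extensity centrality also have high betweenness centrality, whereas the converse does not necessarily hold. The correlation between extensity centrality and degree centrality is also analyzed.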
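The abstract names the ingredients of extensity centrality (Shannon's entropy, Salton-based strength, lambda-set subgroups) but not the formula. As one plausible reading, and only an assumption, the sketch below computes the Shannon entropy of how an author's total collaboration strength is spread over the subgroups of their partners. Subgroup labels are passed in directly; computing lambda sets is not shown, and `extensity_entropy` is a hypothetical name.

```python
import math

def extensity_entropy(author, strengths, group_of):
    """Entropy of an author's collaboration strength across partner subgroups.

    `strengths` maps (u, v) pairs to collaboration strength;
    `group_of` maps each author to a subgroup label (assumed given).
    """
    per_group = {}
    for (u, v), s in strengths.items():
        if author == u:
            partner = v
        elif author == v:
            partner = u
        else:
            continue
        label = group_of[partner]
        per_group[label] = per_group.get(label, 0.0) + s
    total = sum(per_group.values())
    if total == 0:
        return 0.0   # no collaborators, no spread
    shares = [s / total for s in per_group.values() if s > 0]
    return sum(-p * math.log(p) for p in shares)

# Hypothetical strengths and subgroup labels.
strengths = {("A", "B"): 1.0, ("A", "C"): 0.5, ("A", "D"): 0.5}
group_of = {"A": "g1", "B": "g1", "C": "g1", "D": "g2"}

h_a = extensity_entropy("A", strengths, group_of)   # strength split 0.75 / 0.25
h_b = extensity_entropy("B", strengths, group_of)   # all strength in one subgroup
```

Under this reading, an author whose collaboration is concentrated in one subgroup scores zero, and the score grows as strength is spread more evenly over more subgroups.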
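     The distribution pattern of authors' research directions is further divided into three finer aspects: the number of an author's research directions, the heterogeneity of those directions, and the evenness of their distribution. After methods for computing heterogeneity and evenness are proposed, the analysis focuses on the correlation between the distribution pattern of research directions and both degree centrality and extensity centrality. The authors are also classified.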
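The abstract does not state the formulas for heterogeneity and evenness. The sketch below uses two standard diversity indices as stand-ins (Gini-Simpson for heterogeneity, Pielou's normalized entropy for evenness). These choices, the function name, and the sample counts are assumptions rather than the thesis's definitions.

```python
import math

def direction_profile(counts):
    """Summarize how an author's papers spread over research directions.

    `counts` maps a direction label to the number of the author's papers in it.
    Returns (number of directions, Gini-Simpson heterogeneity, Pielou evenness).
    """
    positive = [c for c in counts.values() if c > 0]
    total = sum(positive)
    if total == 0:
        return 0, 0.0, 0.0
    shares = [c / total for c in positive]
    number = len(shares)
    heterogeneity = 1.0 - sum(p * p for p in shares)
    entropy = -sum(p * math.log(p) for p in shares)
    # Pielou's index is undefined for a single direction; 0.0 is used here.
    evenness = entropy / math.log(number) if number > 1 else 0.0
    return number, heterogeneity, evenness

# Hypothetical authors with the same number of directions but different spread.
balanced = direction_profile({"classification": 4, "clustering": 4})
skewed = direction_profile({"classification": 9, "clustering": 1})
```

Both sample authors work in two directions, but the second concentrates nine of ten papers in one of them, so both the heterogeneity and the evenness come out lower.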
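     Finally, the thesis applies automorphic equivalence to analyze author roles based on network structure and, on that basis, abstracts several typical roles from them. The results show that when automorphic equivalence is used for role analysis in a co-authorship network, one cannot rely only on the strictly exact results, because doing so tends to overlook the roles of some authors in relatively important positions. Authors' roles in the network also compensate for the shortcomings of degree centrality.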
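Two nodes are automorphically equivalent when some automorphism of the graph (a relabeling of nodes that preserves every edge) maps one onto the other. For a very small graph this can be checked by brute force. The sketch below is illustrative only: it tries every permutation, so its cost is factorial in the number of nodes and it is unusable on a real co-authorship network. The function name is hypothetical.

```python
from itertools import permutations

def automorphic_classes(nodes, edges):
    """Group the nodes of a small undirected graph into automorphism orbits."""
    nodes = list(nodes)
    edge_set = {frozenset(e) for e in edges}
    orbit = {n: {n} for n in nodes}
    for perm in permutations(nodes):
        mapping = dict(zip(nodes, perm))
        # A bijection on nodes that sends every edge to an edge is an automorphism.
        if all(frozenset((mapping[u], mapping[v])) in edge_set for u, v in edges):
            for n in nodes:
                orbit[n].add(mapping[n])
    classes = {frozenset(members) for members in orbit.values()}
    return sorted(sorted(c) for c in classes)

# Triangle A-B-C with an extra author D attached to C (hypothetical example).
roles = automorphic_classes(
    ["A", "B", "C", "D"],
    [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")],
)

# A four-author chain: the two ends share a role, as do the two middle authors.
chain_roles = automorphic_classes(
    ["a", "b", "c", "d"],
    [("a", "b"), ("b", "c"), ("c", "d")],
)
```

In the first example A and B are interchangeable and share a role, while C and D each occupy a position of their own. Exact equivalence is this strict by construction, which is consistent with the abstract's remark that relying only on the exact results can miss authors in important positions.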
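     This research not only enriches the theoretical results on co-authorship networks but also offers a useful reference for science and technology management and for the formulation of science and technology policy.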
The rapid development of technology and the deepening of scientific research make it hard for researchers to complete projects and papers alone. Collaboration can improve research output and efficiency. Papers, the primary product of research, are almost always written by several authors, and this phenomenon is becoming more and more prominent. On the other hand, many parts of technology management, such as the choice of experts to review projects and the formulation of science policy, involve the evaluation of experts. Therefore, when evaluating experts, we should also attach importance to their collaboration in scientific research.
     Against this background, the present dissertation makes the patterns of collaboration the focus of its research. It divides the patterns of collaboration into authors' centralities, the patterns of distribution across different subjects, and the roles based on the structure of the co-authorship network. The centralities are in turn divided into the eigenvector centrality based on collaborative strength and the extensity centrality based on collaborative strength. As the data set, we choose the papers in the proceedings of the ACM SIGKDD international conference. From this data set we construct a network termed the co-authorship network of ACM SIGKDD.
     We apply Salton's measure, which is used to measure the collaborative strength between two regions, to the measurement of collaborative strength between two authors, and then apply this strength to the eigenvector centrality. We also analyze how the eigenvector centrality differs when this collaborative strength is considered and when it is not. The results indicate that the collaborative strength does make a difference. In addition, the correlation between the eigenvector centrality based on collaborative strength and the degree is analyzed.
     Authors' patterns of collaboration also include the extensity of authors' collaborative relationships. Based on this extensity, we propose a new centrality that measures how active an author is in collaboration, namely the extensity centrality. This new centrality is based on Shannon's entropy, the collaborative strength, and the communities defined by lambda sets. The results indicate that authors with high extensity centrality also have high betweenness, but the converse is not necessarily true. In addition, the correlation between the extensity centrality and the degree is analyzed.
     As for the patterns of the distribution of subjects, we divide them into the number of an author's subjects, the heterogeneity of subjects, and the evenness of the distribution of subjects. After proposing methods for computing the heterogeneity and the evenness, we analyze the correlation between the patterns of the distribution of subjects and both the degree and the extensity centrality. In addition, we classify the authors.
     Finally, we apply automorphic equivalence analysis to the analysis of authors' roles in co-authorship networks, and we extract some representative roles from these roles. The results indicate that when automorphic equivalence analysis is applied to authors' roles, we cannot focus only on the exact results, because they omit some important roles. In addition, authors' roles in co-authorship networks can offset the shortcomings of the degree.
     The research in this dissertation will not only enrich the academic results but also support technology management and the formulation of science policy.
