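Research on Several Key Technologies for Anonymous Publishing of Sensitive Data in Digital Libraries (数字图书馆敏感数据匿名发布若干关键技术研究)

English title: Research on Publishing Sensitive Data Based on Anonymity in Digital Libraries
Author: 骆永成 (Luo Yongcheng)
Degree level: Doctoral
Discipline: Control Theory and Control Engineering
Chinese keywords: 数字图书馆 ; 数据发布 ; k-匿名 ; 个性化匿名 ; 身份保留 ; 二分图 ; 安全分组 ; 聚类
English keywords: digital library ; data publishing ; k-anonymity ; personalized anonymity ; identity-reserved ; bipartite graph ; safety grouping ; clustering
Degree year: 2011
Advisor: 乐嘉锦 (Le Jiajin)
Discipline code: 081101
Degree-granting institution: Donghua University (东华大学)
Thesis submission date: 2011-07-01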

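Abstract

With the continuous development of information technology, the resources of digital libraries are growing richer and their services keep innovating, while user privacy issues are becoming increasingly prominent. Anonymous data publishing techniques, which serve data sharing and analysis for a variety of applications, offer good applicability, generality, and practicality on the one hand and respect user privacy on the other; they help digital library application data to be fully used and shared, and thereby support the library's services. However, digital library application data has domain-specific characteristics, and both privacy protection demands and data forms are diverse. After studying and analyzing existing anonymity models and anonymization techniques, this thesis points out that commonly used anonymous data publishing techniques are insufficient to solve the privacy protection problems that arise in the various scenarios of sensitive data publishing in digital libraries. The thesis therefore studies several key techniques for the anonymous publishing of sensitive data in digital libraries. Its main work is as follows: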
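     (1) Research on an application-oriented framework for anonymous publishing of sensitive data
     To address the challenges currently facing privacy protection for sensitive data, the thesis proposes a data publishing architecture adapted to application requirements: an application-oriented framework for anonymous publishing of sensitive data based on domain knowledge. It gives a preliminary introduction to the framework's modules and also presents a personalized, adaptive privacy-preserving data publishing algorithm. The framework uses an adaptive mechanism intended to satisfy both the requirements of different data applications and the different privacy protection requirements of data owners. The adaptive publishing algorithm combines generalization of the quasi-identifier (QI) attributes with generalization of the sensitive attributes (SA) to obtain an anonymized table that conforms to the anonymous publishing principle, so that the information loss of the published data is reduced while privacy requirements are met; that is, the precision of the published data is kept as high as possible.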
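     (2) Research on generalization-based personalized anonymous data publishing
     Building on recent developments in anonymity models, the thesis proposes a personalized sensitive data publishing model applicable to digital libraries, the (P,α,k)-anonymity model, together with a generalization-based anonymization algorithm. Taking the perspective of individuals and of sensitive attribute values, the model considers both the privacy demands of special library users and the general privacy requirements of ordinary users. The thesis first reviews related work, models the personalized privacy constraint parameters on the basis of an analysis of existing personalized anonymity principles, and proposes the (P,α,k)-anonymity model. It then proposes a generalization-based heuristic algorithm, TopDown-LA, and describes the local recoding and specialization techniques it uses, which ensure that the algorithm obtains a minimal k-generalization and maximize the precision of the anonymized table; the complexity and correctness of the algorithm are also analyzed. Finally, experiments on real data verify the feasibility of this heuristic personalized anonymization algorithm. The algorithm publishes anonymized data that satisfies personalized privacy requirements, incurs less information loss than Basic Incognito and Mondrian, and performs well.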
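     (3) Research on identity-reserved anonymous data publishing
     The thesis proposes three concrete identity-reserved anonymization principles and focuses on the implementation of two publishing methods: clustering-based anonymous publishing and the lossy decomposition method IDAnatomy. In most cases, analysis of digital library application data requires that the published data retain user identity while still respecting each user's individual privacy requirements. For this situation, the thesis first considers that a single individual usually corresponds to multiple records in digital library application data, analyzes how users' sensitive data can be compromised in that case, and proposes three concrete identity-reserved anonymization principles. It then presents the clustering-based (P,α,β)-clustering algorithm, which anonymizes data using a weighted hierarchical distance measure of information loss, and analyzes its complexity. It also presents the lossy decomposition publishing method IDAnatomy, which publishes the quasi-identifier attributes and the sensitive attributes of the original relation as two separate relations and relies on the lossy join between them to protect private data; a basic IDAnatomy algorithm is given that ensures the published data meets privacy and utility requirements. Finally, the existing anonymization methods and the identity-reserved methods are compared experimentally from several aspects to test the effectiveness of the methods.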
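     (4) Research on techniques for publishing sensitive graph data
     The thesis proposes a new clustering-based safety grouping strategy for graphs and two anonymous publishing algorithms that implement it in different ways. It first analyzes the privacy protection problem in publishing data on complex interactions between individuals in digital libraries, and models and quantifies graph attacks as incremental knowledge queries based on background knowledge. Then, after establishing a bipartite graph model and related definitions, it gives a preliminary discussion of anonymized data integration and data anonymization for graphs, and introduces basic anonymous publishing methods for bipartite graphs such as simple anonymization, enumeration, and partitioning. Drawing on recent research results, it then proposes a new graph clustering safety grouping strategy to improve the utility of published bipartite graph data, and compares two implementation strategies: the CKG algorithm, which clusters first and then groups, and the KGC algorithm, which clusters while grouping. Two key issues are analyzed in particular: the information loss of graph generalization and the description of the super-nodes formed by clustering and grouping. Finally, experiments show that the anonymization method based on the clustering safety grouping strategy protects the privacy of individuals in the graph while improving the utility of the anonymized graph data to some extent.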
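     The thesis studies several key data publishing techniques for common application scenarios in digital libraries and offers feasible solutions. Each proposed algorithm is given a detailed performance analysis and is evaluated in detail on real data sets from a running digital library or on synthetic data sets. The experiments and the performance analysis show that, compared with related algorithms, the proposed algorithms have good performance and good adaptability.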
With the continuous development of information technology, the resources and services of digital libraries are becoming increasingly rich, and at the same time the issue of user privacy is becoming increasingly prominent. Anonymization techniques for privacy-preserving data publishing, applied to data sharing and data analysis, offer good applicability, versatility, and practicality while respecting users' privacy; this is conducive to the full use and sharing of application data and thus supports the library's services. However, application data in digital libraries has domain-specific characteristics, namely the diversity of privacy protection demands and of data forms. After analyzing existing anonymity models and anonymization techniques, the thesis points out that current anonymous data publishing techniques cannot solve the privacy problems of sensitive data released under the various scenarios found in digital libraries. It therefore studies several key techniques of anonymous data publishing for sensitive data in digital libraries. The main work is as follows:
     (1) Research on an application-oriented sensitive data publishing framework based on domain knowledge
     To address the current challenges of sensitive data protection, a data publishing architecture based on domain knowledge is proposed to meet application requirements, and the modules of the framework are introduced. An adaptive, personalized data publishing algorithm is also given. The framework uses an adaptive mechanism so that it can meet both the needs of different data applications and the differing privacy protection needs of data owners. The adaptive data publishing algorithm jointly generalizes the quasi-identifier attributes and the sensitive attributes to obtain an anonymized table that meets the privacy protection requirement while reducing information loss, that is, while keeping the released data as accurate as possible.
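As a rough illustration of what generalizing quasi-identifier attributes to reach k-anonymity means, the following minimal sketch coarsens two attributes and then checks that every quasi-identifier combination occurs at least k times. It is not the thesis's adaptive algorithm: the record layout, the attribute names (`age`, `zip`, `topic`), and the helper functions are all hypothetical and chosen only for this example, and sensitive-attribute generalization is not shown.

```python
from collections import Counter


def generalize_age(age, width=10):
    """Map an exact age to a range label such as '20-29'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def generalize_zip(zip_code, keep=3):
    """Keep the first `keep` characters and mask the rest with '*'."""
    return zip_code[:keep] + "*" * (len(zip_code) - keep)


def is_k_anonymous(rows, qi_keys, k):
    """Return True if every combination of QI values occurs at least k times."""
    groups = Counter(tuple(row[key] for key in qi_keys) for row in rows)
    return all(count >= k for count in groups.values())


# Hypothetical loan records: 'age' and 'zip' are treated as quasi-identifiers,
# 'topic' as the sensitive attribute. The values are made up for illustration.
records = [
    {"age": 23, "zip": "200051", "topic": "oncology"},
    {"age": 27, "zip": "200052", "topic": "law"},
    {"age": 34, "zip": "200433", "topic": "psychology"},
    {"age": 38, "zip": "200438", "topic": "oncology"},
]

# Coarsen the quasi-identifiers; the sensitive attribute is left unchanged here.
generalized = [
    {
        "age": generalize_age(r["age"]),
        "zip": generalize_zip(r["zip"]),
        "topic": r["topic"],
    }
    for r in records
]

print(is_k_anonymous(records, ["age", "zip"], 2))
print(is_k_anonymous(generalized, ["age", "zip"], 2))
```

The wider the age ranges and the more zip digits are masked, the larger the groups become and the more information is lost, which is the trade-off the framework tries to balance.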
     (2) Research on generalization-based personalized anonymous data publishing
     Building on the latest developments in anonymity models, this thesis puts forward a personalized data publishing model for releasing sensitive data in digital libraries, the (P,α,k)-anonymity model, together with a generalization-based algorithm. Approaching the problem from the perspective of individuals and sensitive attribute values, the model takes into account both the privacy of special users and the privacy of the general public. First, after introducing related work and several existing personalized anonymity principles, the thesis models the personalized privacy constraints with several parameters and proposes the (P,α,k)-anonymity model. Second, a generalization-based heuristic algorithm, TopDown-LA, is proposed. The local recoding and specialization techniques used in the algorithm are explained; they ensure that the algorithm obtains a minimal k-generalization and maximizes the accuracy of the anonymized table. The complexity and correctness of the algorithm are also analyzed. Finally, experiments on real data verify the feasibility of this heuristic algorithm: it meets personalized privacy protection needs, loses less information than Basic Incognito and Mondrian, and has good execution performance.
     (3) Research on identity-reserved data publishing
     This thesis introduces three specific identity-reserved anonymity principles and focuses on two data publishing methods: clustering-based anonymization and a lossy decomposition method, IDAnatomy. In most cases, analysis of released digital library data needs to preserve user identity while also respecting each user's individual privacy needs. For such cases, the thesis first considers data in which multiple records correspond to a single individual, analyzes how sensitive data can be violated in that setting, and puts forward three specific identity-reserved anonymity principles. It then describes the clustering-based algorithm, which uses a weighted hierarchical distance to assess information loss, and analyzes its complexity. It also introduces the lossy decomposition method IDAnatomy, which releases the quasi-identifier attributes and the sensitive attributes of the original relation as two separate tables and relies on the lossy join between them to protect privacy; the algorithm ensures that the requirements of privacy and utility are met. Finally, experiments compare the original anonymization methods with the identity-reserved methods in several respects to test their validity.
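The idea of publishing quasi-identifier and sensitive attributes as two tables linked only by a lossy join can be sketched as follows. This is a minimal illustration under strong assumptions, not the IDAnatomy algorithm itself: the function name `anatomize`, the fixed-size grouping by record position, and the sample records are all hypothetical, and the sketch does not enforce any diversity or identity-reserved constraint on the groups.

```python
from collections import Counter


def anatomize(rows, qi_keys, sa_key, group_size):
    """Split rows into a QI table and a sensitive table linked only by a group id.

    Grouping by position in fixed-size chunks is an assumption made for brevity.
    """
    qi_table = []
    counts = {}
    for index, row in enumerate(rows):
        gid = index // group_size
        qi_row = {key: row[key] for key in qi_keys}
        qi_row["group"] = gid
        qi_table.append(qi_row)
        counts.setdefault(gid, Counter())[row[sa_key]] += 1
    sa_table = [
        {"group": gid, sa_key: value, "count": n}
        for gid, counter in counts.items()
        for value, n in counter.items()
    ]
    return qi_table, sa_table


# Hypothetical records; 'topic' plays the role of the sensitive attribute.
records = [
    {"age": 23, "zip": "200051", "topic": "oncology"},
    {"age": 27, "zip": "200052", "topic": "law"},
    {"age": 34, "zip": "200433", "topic": "psychology"},
    {"age": 38, "zip": "200438", "topic": "oncology"},
]

qi_table, sa_table = anatomize(records, ["age", "zip"], "topic", group_size=2)

# Joining the two published tables on 'group' pairs every QI row with every
# sensitive value of its group, so the exact original association is lost.
joined = [(q, s) for q in qi_table for s in sa_table if q["group"] == s["group"]]
print(len(records), len(joined))
```

In this sketch the quasi-identifier values are published without generalization, and an observer who joins the tables can only narrow each record down to the sensitive values present in its group.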
     (4) Research on graph data publishing
     This thesis presents a new clustering-based safety grouping strategy for graph data and two anonymous data publishing algorithms that implement it differently. First, it analyzes the privacy protection issues in publishing complex interaction data in digital libraries, and models graph attacks as incremental knowledge queries based on background knowledge. Second, after establishing a bipartite graph model and related definitions, it discusses anonymized data integration and data anonymization for graphs, and introduces several bipartite graph publishing methods, such as simple anonymization, the list approach, and the partitioning approach. Then, drawing on the latest research results, it introduces a new clustering-based safety grouping strategy to improve the utility of the released bipartite graph, and compares the implementation strategies of the CKG and KGC algorithms. In doing so, it highlights two issues: the information loss of graph generalization and the description of super-nodes. Finally, the experiments show that the clustering-based safety grouping strategy provides privacy protection for individuals and increases the utility of the anonymized graph data to some extent.
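To make the notion of a safety grouping on a bipartite graph more concrete, the sketch below uses the condition usually attached to safe groupings in the literature on bipartite graph anonymization: no two nodes placed in the same group share a neighbor. It is a plain greedy procedure written for illustration, not the CKG or KGC algorithm, and it involves no clustering step; the reader/book graph, the function names, and the choice of k as a maximum group size are assumptions of this example.

```python
def greedy_safe_grouping(adjacency, k):
    """Place each node in the first group that is not full and shares no neighbor with it.

    `adjacency` maps a node on one side of the bipartite graph to the set of its
    neighbors on the other side. Groups hold at most k nodes; a greedy pass may
    leave some groups smaller than k.
    """
    groups = []  # each entry: {"members": [...], "neighbors": set(...)}
    for node in sorted(adjacency):
        neighbors = adjacency[node]
        for group in groups:
            if len(group["members"]) < k and not (group["neighbors"] & neighbors):
                group["members"].append(node)
                group["neighbors"] |= neighbors
                break
        else:
            # No existing group can take the node safely, so open a new one.
            groups.append({"members": [node], "neighbors": set(neighbors)})
    return [group["members"] for group in groups]


def is_safe(groups, adjacency):
    """Check that no two nodes in the same group have a common neighbor."""
    for members in groups:
        seen = set()
        for node in members:
            if seen & adjacency[node]:
                return False
            seen |= adjacency[node]
    return True


# Hypothetical reader-to-book borrowing graph.
borrowed = {
    "u1": {"b1", "b2"},
    "u2": {"b2", "b3"},
    "u3": {"b4"},
    "u4": {"b1"},
}

groups = greedy_safe_grouping(borrowed, k=2)
print(groups, is_safe(groups, borrowed))
```

Here readers u1 and u2 cannot share a group because both borrowed b2, so the greedy pass pairs u1 with u3 and u2 with u4.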
     In this thesis, each proposed algorithm is given a detailed performance analysis and is run on real data sets from digital libraries or on synthetic data sets. The experimental results and the performance analysis show that, compared with related algorithms, the proposed methods have good performance and good adaptability.

