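Research on Incomplete Information System Data Mining

Original title: 不完备信息系统的数据挖掘研究
Author: Tian Hong (田宏)
Degree level: Doctoral
Discipline: Computer Application Technology
Keywords: Incomplete information systems; Data mining; Rough sets; Privacy preserving
Degree year: 2010
Supervisor: Wang Xiukun (王秀坤)
Discipline code: 081203
Degree-granting institution: Dalian University of Technology
Thesis submission date: 2009-08-30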

Abstract

Because some data are missing or real data cannot be obtained, data mining often has to work with incomplete information systems, that is, information systems in which some attribute values of some objects are unknown or in which the true data cannot be acquired. Rough set theory is a mathematical theory for characterizing uncertain and vague data; it can effectively analyze and process imprecise, inconsistent, and incomplete information and discover the knowledge hidden in it. Taking incomplete information systems as its object and data mining and knowledge discovery as its goal, this dissertation studies a generalized rough set theory based on the weak fuzzy similarity relation, a rough set model based on the valued similarity relation, and privacy-preserving data mining algorithms for incomplete information systems. The specific work is as follows:
     1. Extensions of rough set theory to incomplete information systems are currently the theoretical foundation for data mining in such systems. Rough sets based on the tolerance relation treat a null value as equal to any known attribute value. Rough sets based on the similarity relation treat a null value as nonexistent and ignore it. Rough sets based on the limited tolerance relation treat null values as existing and comparable, but they exclude from the tolerance relation the case in which two objects whose values are not all null have no known attribute value in common. To address these problems, this dissertation proposes a generalized rough set model based on the weak fuzzy similarity relation. The study shows that, without altering the information in the original system, the model characterizes the true information of objects in an incomplete information system more objectively, and it proves that the weak fuzzy similarity relation is a more general binary relation.
     2. Knowledge discovery in incomplete information systems based on the tolerance relation and the similarity relation is studied. Rough set models under these two relations cannot precisely describe differences in how similar objects are, so knowledge cannot be discovered precisely. To address this problem, the dissertation proposes a knowledge discovery method for incomplete information systems under a rough set model based on the similarity relation of attribute values (the valued similarity relation). The method computes the degree of similarity between the attribute values of objects and thereby determines each object's upper and lower approximations with respect to a concept set. If the user chooses a suitable similarity threshold, the sets of objects that meet the threshold can be found by computing the upper and lower approximations, and the knowledge rules that satisfy the conditions can then be determined precisely. Experimental results indicate that the method is effective for knowledge discovery in incomplete information systems.
     3. Privacy-preserving data mining algorithms for incomplete information systems are studied: the MASK algorithm, based on random transformation; the PARD algorithm, based on an attribute transition probability matrix; and the RRPH algorithm, based on randomized response with partial hiding. After a detailed analysis of these algorithms and their limitations, the dissertation proposes an efficient privacy-preserving association rule mining algorithm, PRRPM, a partial randomized response method based on a transition probability matrix. Theoretical analysis and experimental results indicate that PRRPM has advantages in privacy, accuracy, complexity, and applicability.
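To make the contrast between the tolerance relation and the similarity relation in item 1 concrete, the following is a minimal sketch, not the dissertation's own model. It assumes the standard textbook definitions (a null matches anything under tolerance; nulls are ignored on one side under similarity) and computes lower and upper approximations under the tolerance relation only. The toy table, the object names, and all function names are hypothetical.

```python
from typing import Dict, Optional, Set, Tuple

Row = Tuple[Optional[str], ...]

# Hypothetical toy table; None marks a missing (null) attribute value.
TABLE: Dict[str, Row] = {
    "x1": ("high", "yes"),
    "x2": ("high", None),
    "x3": ("low", "no"),
    "x4": (None, "no"),
}


def tolerant(u: Row, v: Row) -> bool:
    """Tolerance relation: a null is treated as equal to any value (symmetric)."""
    return all(a is None or b is None or a == b for a, b in zip(u, v))


def similar(u: Row, v: Row) -> bool:
    """Similarity relation: u is similar to v if v agrees with u on every
    attribute that u knows; nulls in u are ignored (not symmetric)."""
    return all(a is None or a == b for a, b in zip(u, v))


def tolerance_class(table: Dict[str, Row], x: str) -> Set[str]:
    """All objects that are tolerant with object x."""
    return {y for y in table if tolerant(table[x], table[y])}


def approximations(table: Dict[str, Row], concept: Set[str]):
    """Lower and upper approximations of a concept under the tolerance relation."""
    lower = {x for x in table if tolerance_class(table, x) <= concept}
    upper = {x for x in table if tolerance_class(table, x) & concept}
    return lower, upper


lower, upper = approximations(TABLE, {"x1", "x2"})
print(sorted(lower), sorted(upper))
```

In this toy table, x2 falls outside the lower approximation of {x1, x2} because its null value also makes it tolerant with x4. This is the kind of over-generous matching that the valued and weak fuzzy similarity relations in the abstract aim to refine.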

