Frequent Itemset Hiding Algorithm Using Frequent Pattern Tree Approach.
Details
  • Author: Alnatsheh, Rami
  • Degree: Doctor
  • Year: 2012
  • Advisor and committee: Mukherjee, Sumitra (advisor); Laszlo, Michael J. (committee member); Sun, Junping (committee member)
  • Institution: Nova Southeastern University
  • Department: Computer Information Systems
  • ISBN: 9781267844927
  • CBH: 3548774
  • Country: USA
  • Language: English
  • FileSize: 2050779
  • Pages: 89
Abstract
A problem that has been the focus of much recent research in privacy-preserving data mining is the frequent itemset hiding (FIH) problem. Identifying itemsets that appear together frequently in customer transactions is a common task in association rule mining. Organizations that share data with business partners may consider some of the frequent itemsets sensitive and aim to hide such sensitive itemsets by removing items from certain transactions. Since such modifications adversely affect the utility of the database for data mining applications, the goal is to remove as few items as possible. Since the frequent itemset hiding problem is NP-hard and practical instances of this problem are too large to be solved optimally, there is a need for heuristic methods that provide good solutions. This dissertation developed a new method called Min_Items_Removed, using the Frequent Pattern Tree (FP-Tree), that outperforms extant methods for the FIH problem. The FP-Tree enables the compression of large databases into significantly smaller data structures. As a result of this compression, a search may be performed with increased speed and efficiency. To evaluate the effectiveness and performance of the Min_Items_Removed algorithm, eight experiments were conducted. The results showed that the Min_Items_Removed algorithm yields better quality solutions than extant methods in terms of minimizing the number of removed items. In addition, the results showed that the newly introduced metric (normalized number of leaves) is a very good indicator of the problem size or difficulty of the problem instance that is independent of the number of sensitive itemsets.
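
To make the problem statement concrete, the following is a minimal Python sketch of a naive greedy baseline for frequent itemset hiding. It is not the Min_Items_Removed algorithm and does not use an FP-Tree; it only illustrates the task of pushing the support of every sensitive itemset below a threshold while counting removed items. The names `support`, `greedy_hide`, and `min_support` are hypothetical and chosen for illustration.

```python
from typing import FrozenSet, Iterable, List, Set, Tuple


def support(db: List[Set[str]], itemset: FrozenSet[str]) -> int:
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for transaction in db if itemset <= transaction)


def greedy_hide(
    db: Iterable[Iterable[str]],
    sensitive: Iterable[Iterable[str]],
    min_support: int,
) -> Tuple[List[Set[str]], int]:
    """Remove items until every sensitive itemset has support < min_support.

    Illustrative greedy heuristic (an assumption, not the dissertation's
    method): at each step, delete the single (transaction, item) pair that
    breaks the largest number of still-frequent sensitive itemsets.
    Returns the sanitized copy of the database and the number of removals.
    """
    if min_support < 1:
        raise ValueError("min_support must be at least 1")
    clean = [set(t) for t in db]  # work on a copy; the input is not modified
    targets = [frozenset(s) for s in sensitive]
    if any(not s for s in targets):
        raise ValueError("sensitive itemsets must be non-empty")

    removed = 0
    while True:
        # Sensitive itemsets that are still frequent and must be hidden.
        frequent = [s for s in targets if support(clean, s) >= min_support]
        if not frequent:
            return clean, removed
        best, best_gain = None, 0
        for idx, transaction in enumerate(clean):
            # Frequent sensitive itemsets that this transaction supports.
            supported = [s for s in frequent if s <= transaction]
            for item in sorted(set().union(*supported)):
                gain = sum(1 for s in supported if item in s)
                if gain > best_gain:
                    best, best_gain = (idx, item), gain
        idx, item = best
        clean[idx].discard(item)
        removed += 1


if __name__ == "__main__":
    toy_db = [{"a", "b", "c"}, {"a", "b"}, {"a", "b", "d"}, {"b", "c"}]
    sanitized, count = greedy_hide(toy_db, [{"a", "b"}], min_support=2)
    print(sanitized, count)
```

Each greedy step rescans the whole database, which is only workable for toy inputs. The abstract's point is that a compressed structure such as the FP-Tree makes this kind of search practical on large databases.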
