High performance methods for frequent pattern mining
Details
  • Author:Vu, Lan, Ph.D.
  • Degree:Ph.D.
  • Year:2014
  • Keywords:Database ; Frequent pattern mining ; GPGPU ; High perfor
  • Advisor:Alaghband, Gita
  • Institution:University of Colorado
  • Department:Computer Science
  • Major:Computer science
  • ISBN:9781321412291
  • CBH:3667246
  • Country:USA
  • Language:English
  • FileSize:6971643
  • Pages:191
Abstract
The current Big Data era is generating tremendous amounts of data in most fields, such as business, social media, engineering, and medicine. The demand to process and handle the resulting "big data" has led to the need for fast data mining methods to develop powerful and versatile analysis tools that can turn data into useful knowledge. Frequent pattern mining (FPM) is an important task in data mining with numerous applications, such as recommendation systems, consumer market analysis, web mining, and network intrusion detection. We develop efficient high-performance FPM methods for large-scale databases on different computing platforms, including personal computers (PCs), multi-core multi-socket servers, clusters, and graphics processing units (GPUs). At the core of our research is a novel self-adaptive approach that runs fast and efficiently on both sparse and dense databases and outperforms its sequential counterparts. This approach applies multiple mining strategies and dynamically switches among them based on the data characteristics detected at runtime. The research results include two sequential FPM methods (i.e., FEM and DFEM) and three parallel ones (i.e., ShaFEM, SDFEM, and CGMM). These methods can be used to develop powerful and scalable mining tools for big data analysis. We have tested, analysed, and demonstrated their efficacy on selected representative real databases publicly available at the Frequent Itemset Mining Implementations Repository.
