Parallel data mining techniques on Graphics Processing Unit with Compute Unified Device Architecture (CUDA)
Authors: Liheng Jian, Cheng Wang, Ying Liu, Shenshen Liang, Weidong Yi, Yong Shi
Affiliation: 1. School of Information Science and Engineering, Graduate University of Chinese Academy of Sciences, Beijing, China
2. Agilent Technologies Co. Ltd., Beijing, China
3. Research Center on Fictitious Economy and Data Science, Chinese Academy of Sciences, Beijing, China
4. University of Nebraska at Omaha, Omaha, USA
Abstract: Recent developments in Graphics Processing Units (GPUs) have enabled inexpensive high-performance computing for general-purpose applications. The Compute Unified Device Architecture (CUDA) programming model provides programmers with C-like APIs to better exploit the parallel power of the GPU. Data mining is widely used and has significant applications in various domains. However, current data mining toolkits cannot meet the speed requirements of applications with large-scale databases. In this paper, we propose three techniques to speed up fundamental problems in data mining algorithms on the CUDA platform: a scalable thread scheduling scheme for irregular patterns, a parallel distributed top-k scheme, and a parallel high dimension reduction scheme. They play a key role in our CUDA-based implementations of three representative data mining algorithms: CU-Apriori, CU-KNN, and CU-K-means. These parallel implementations significantly outperform other state-of-the-art implementations on an HP xw8600 workstation with a Tesla C1060 GPU and a quad-core Intel Xeon CPU. Our results show that the GPU + CUDA parallel architecture is feasible and promising for data mining applications.
Keywords:
This article is indexed in SpringerLink and other databases.
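
The abstract names a parallel distributed top-k scheme but does not describe how it works. As a rough illustration only, the sketch below shows one common way to split a top-k selection into independent pieces: each partition picks its own k smallest values, and a final pass selects the k smallest among those candidates. This is an assumption for illustration, not the authors' CUDA kernel; it runs sequentially on the CPU, and the names `local_top_k`, `distributed_top_k`, and `num_parts` are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: the k smallest values of data[begin, end), ascending.
std::vector<float> local_top_k(const std::vector<float>& data,
                               std::size_t begin, std::size_t end,
                               std::size_t k) {
    std::vector<float> part(data.begin() + static_cast<std::ptrdiff_t>(begin),
                            data.begin() + static_cast<std::ptrdiff_t>(end));
    const std::size_t keep = std::min(k, part.size());
    std::partial_sort(part.begin(),
                      part.begin() + static_cast<std::ptrdiff_t>(keep),
                      part.end());
    part.resize(keep);
    return part;
}

// Hypothetical two-phase top-k (assumed scheme, not taken from the paper):
// phase 1 selects k candidates per partition, phase 2 merges the candidates.
std::vector<float> distributed_top_k(const std::vector<float>& data,
                                     std::size_t k, std::size_t num_parts) {
    if (num_parts == 0) num_parts = 1;
    const std::size_t chunk = (data.size() + num_parts - 1) / num_parts;

    std::vector<float> candidates;
    for (std::size_t p = 0; p < num_parts; ++p) {
        const std::size_t begin = std::min(p * chunk, data.size());
        const std::size_t end = std::min(begin + chunk, data.size());
        // Each iteration is independent of the others, so on a GPU this
        // work could be assigned to separate thread blocks.
        const std::vector<float> local = local_top_k(data, begin, end, k);
        candidates.insert(candidates.end(), local.begin(), local.end());
    }

    // Phase 2: the global k smallest must be among the per-partition winners.
    const std::size_t keep = std::min(k, candidates.size());
    std::partial_sort(candidates.begin(),
                      candidates.begin() + static_cast<std::ptrdiff_t>(keep),
                      candidates.end());
    candidates.resize(keep);
    return candidates;
}
```

The two-phase split is correct because any value among the global k smallest is also among the k smallest of its own partition, so the merge step only needs to examine at most `k * num_parts` candidates. In a k-nearest-neighbor setting the values would be distances from a query point to the reference points.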