
Design and Implementation of an FPGA-Based Fine-Grained Parallel K-means Algorithm Accelerator

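Cite this article:	NI Shi-ce, DOU Yong, LEI Yuan-wu, ZHAO Jian-xun. Design and Implementation of an FPGA-Based Fine-Grained Parallel K-means Algorithm Accelerator[J]. Computer Engineering & Science, 2009, 31(Z1). DOI: 10.3969/j.issn.1007-130X.2009.A1.020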

Authors:	NI Shi-ce, DOU Yong, LEI Yuan-wu, ZHAO Jian-xun

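Affiliations:	1. School of Computer Science, National University of Defense Technology, Changsha, Hunan 410073, China; 2. Academy of Armored Forces Engineering, Beijing 100072, China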

Funding:	Supported by the National Natural Science Foundation of China

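Abstract:	Based on an in-depth analysis of the computational characteristics of the K-means algorithm, this paper proposes and implements a fine-grained parallel floating-point K-means algorithm on an FPGA platform. The design adopts a task partitioning strategy for parallel processing across an array of multiple PEs, which balances the load among the processing elements. It uses a data-driven pipeline to hide off-chip memory accesses and introduces a master-slave multi-PE parallel computing array based on a systolic array structure. Four PEs were successfully integrated on a single FPGA (XC5VLX330). Experimental results show that the proposed K-means accelerator architecture scales well. In our tests, the implementation achieved a 15x speedup over a single-processor program running on a 2.66 GHz Pentium 4.
Keywords:	K-means algorithm; FPGA; hardware accelerator; floating-point implementation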
Fine-Grained Parallel K-means Clustering Algorithm on FPGA

NI Shi-ce, DOU Yong, LEI Yuan-wu, ZHAO Jian-xun. Fine-Grained Parallel K-means Clustering Algorithm on FPGA[J]. Computer Engineering & Science, 2009, 31(Z1). DOI: 10.3969/j.issn.1007-130X.2009.A1.020

Authors:	NI Shi-ce, DOU Yong, LEI Yuan-wu, ZHAO Jian-xun

Abstract:	We propose a systolic array structure, consisting of one master PE and multiple slave PEs, for a fine-grained hardware implementation of K-means clustering on FPGA. We partition tasks by rows and assign them to PEs so that the load is balanced. We exploit data reuse schemes to reduce the need to load data from external memory. To our knowledge, our implementation with 4 PEs on an XC5VLX330 is the only FPGA accelerator that implements the complete K-means clustering algorithm. The experimental results show a speedup of more than 15x over the Cluster 3.0 software running on a PC with a 2.66 GHz Pentium 4 CPU.

Keywords:	FPGA
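
The abstract describes partitioning the data by rows across several PEs and combining their results. As a rough software analogy only, the Python sketch below runs one K-means iteration with the rows split into contiguous blocks, one block per simulated PE, and then merges the per-block partial sums. It is not the authors' hardware design: it does not model the systolic array, the data-driven pipeline, or the master/slave roles, and the names `partition_rows`, `pe_partial`, and `kmeans_step` are hypothetical.

```python
def partition_rows(n_rows, n_pe):
    """Split row indices into n_pe contiguous blocks of nearly equal size."""
    base, extra = divmod(n_rows, n_pe)
    blocks, start = [], 0
    for pe in range(n_pe):
        size = base + (1 if pe < extra else 0)
        blocks.append(range(start, start + size))
        start += size
    return blocks


def pe_partial(points, rows, centroids):
    """Work of one simulated PE: label its rows and accumulate partial sums."""
    k, dim = len(centroids), len(centroids[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    labels = {}
    for i in rows:
        p = points[i]
        # Nearest centroid by squared Euclidean distance (ties go to the lowest index).
        best = min(
            range(k),
            key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
        )
        labels[i] = best
        counts[best] += 1
        for d in range(dim):
            sums[best][d] += p[d]
    return sums, counts, labels


def kmeans_step(points, centroids, n_pe=4):
    """One K-means iteration: per-block partial results, then a merge step."""
    k, dim = len(centroids), len(centroids[0])
    total = [[0.0] * dim for _ in range(k)]
    count = [0] * k
    labels = [0] * len(points)
    for rows in partition_rows(len(points), n_pe):
        sums, counts, part = pe_partial(points, rows, centroids)
        for c in range(k):
            count[c] += counts[c]
            for d in range(dim):
                total[c][d] += sums[c][d]
        for i, lab in part.items():
            labels[i] = lab
    # An empty cluster keeps its previous centroid (an assumption of this sketch).
    new_centroids = [
        [total[c][d] / count[c] for d in range(dim)] if count[c] else list(centroids[c])
        for c in range(k)
    ]
    return new_centroids, labels
```

Calling `kmeans_step(points, centroids, n_pe=4)` repeatedly until the labels stop changing gives the usual K-means loop. Because each block only contributes sums and counts, the result does not depend on how many blocks the rows are split into.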
This article is indexed in Wanfang Data and other databases.
