
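Optimization of GPU-based Eigenface Algorithm
Cite this article: LI Fan, YAN Xing, ZHANG Xiao-yu. Optimization of GPU-based Eigenface Algorithm [J]. Computer Science, 2021, 48(4): 197-204.
Authors: LI Fan  YAN Xing  ZHANG Xiao-yu
Affiliations: Network and Experimental Teaching Center, Xinjiang University of Finance and Economics, Urumqi 830012; School of Information Management, Xinjiang University of Finance and Economics, Urumqi 830012
Funding: Young Doctoral Fund of Xinjiang University of Finance and Economics; National Natural Science Foundation of China; Xinjiang Social Science Fund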
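Abstract: The eigenface algorithm is one of the commonly used face recognition methods based on facial representation. When the amount of training data is large, both the training module and the test module are very time-consuming. To address this, the CUDA parallel computing architecture is used to accelerate the eigenface algorithm on the GPU. Because the benefit of GPU parallel computing depends on the hardware specifications, on the complexity and parallelizability of the algorithm itself, and on how the developer parallelizes the program for the GPU, this paper first proposes GPU acceleration for the training-phase steps of computing the mean, zero-mean centering, and normalizing the eigenfaces, and for the test-phase steps of projecting onto the eigenface space and computing the Euclidean distance. Second, different parallelization methods are applied to each of these steps and their performance is evaluated. Experimental results show that, with the amount of face training data ranging from 320 to 1920, every computational step is clearly accelerated. Compared with an Intel i7-5960X, a GTX1060 graphics card achieves an average speedup of about 71.7 times in the training module and about 34.1 times in the test module.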

Keywords: Face recognition  Eigenface  GPU parallel computing  Rotary operation  Kernel function
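
The three training-phase steps named in the abstract (computing the mean, zero-mean centering, and normalizing the eigenfaces) can be illustrated with a minimal CPU-side sketch in plain Python. This is not the paper's CUDA implementation: the function names, the toy 4-pixel "faces", and the sample vector passed to `normalize` are all assumptions made for illustration, and the eigen-decomposition step is deliberately left out. Each output element depends only on its own pixel column or its own vector, which is the kind of independence a GPU kernel could exploit.

```python
import math

def mean_face(faces):
    """Average each pixel position over all training faces."""
    n = len(faces)
    return [sum(column) / n for column in zip(*faces)]

def zero_mean(faces, mean):
    """Subtract the mean face from every training face."""
    return [[p - m for p, m in zip(face, mean)] for face in faces]

def normalize(vectors):
    """Scale each vector (e.g. a candidate eigenface) to unit Euclidean length."""
    result = []
    for v in vectors:
        norm = math.sqrt(sum(x * x for x in v))
        # Leave an all-zero vector unchanged to avoid dividing by zero.
        result.append([x / norm for x in v] if norm > 0 else list(v))
    return result

# Hypothetical toy data: three "faces" of four pixels each.
faces = [[1.0, 2.0, 3.0, 4.0],
         [3.0, 2.0, 1.0, 0.0],
         [2.0, 5.0, 2.0, 2.0]]

mean = mean_face(faces)
centered = zero_mean(faces, mean)
unit = normalize([[3.0, 4.0, 0.0, 0.0]])  # illustrative vector, not a real eigenface
```

After centering, every pixel column sums to zero, which is a quick sanity check for any parallel version of these steps.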

Optimization of GPU-based Eigenface Algorithm
LI Fan, YAN Xing, ZHANG Xiao-yu. Optimization of GPU-based Eigenface Algorithm [J]. Computer Science, 2021, 48(4): 197-204.
Authors: LI Fan  YAN Xing  ZHANG Xiao-yu
Affiliation: Network and Experimental Teaching Center, Xinjiang University of Finance and Economics, Urumqi 830012, China; School of Information Management, Xinjiang University of Finance and Economics, Urumqi 830012, China
Abstract: The eigenface algorithm is one of the commonly used face recognition methods based on facial representation. When the amount of training data is large, both the training and the test modules are very time-consuming. For this reason, the CUDA parallel computing architecture is used to implement a GPU-accelerated eigenface algorithm. The effect of GPU parallel computing depends on the hardware specifications, on the complexity and parallelizability of the algorithm itself, and on the parallelization method the developer uses on the GPU. Therefore, this paper first proposes using the GPU to accelerate the calculation of the average value, the zero-mean step, and the normalization of the eigenfaces in the training phase, as well as the projection onto the eigenface space and the calculation of the Euclidean distance in the test phase. Second, different parallelization methods are applied to the corresponding calculation steps and their performance is evaluated. Experimental results show that, for face training data in the range of 320 to 1920, the acceleration of each calculation step is significant. Compared with an Intel i7-5960X, the GTX1060 graphics card achieves an average speedup of about 71.7 times in the training module and about 34.1 times in the test module.
Keywords:Face recognition  Eigenface  GPU parallel computing  Rotary operation  Kernel function
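
The two test-phase steps named in the abstract (projection onto the eigenface space and the Euclidean distance calculation) can be sketched on the CPU in plain Python as follows. This is only an illustration under stated assumptions, not the authors' CUDA kernels: the function names, the toy data, and the two-vector orthonormal basis standing in for real eigenfaces are all hypothetical. Each projection weight and each distance is computed independently of the others, which is what makes these steps candidates for data-parallel execution.

```python
import math

def project(face, mean, eigenfaces):
    """Center a face and take its dot product with every eigenface."""
    centered = [p - m for p, m in zip(face, mean)]
    return [sum(e * c for e, c in zip(ef, centered)) for ef in eigenfaces]

def euclidean(a, b):
    """Euclidean distance between two weight vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest(probe_weights, train_weights):
    """Return (index, distance) of the closest training weight vector."""
    dists = [euclidean(probe_weights, w) for w in train_weights]
    best = min(range(len(dists)), key=dists.__getitem__)
    return best, dists[best]

# Hypothetical toy data: 4-pixel faces and a 2-vector orthonormal basis.
mean = [2.0, 3.0, 2.0, 2.0]
eigenfaces = [[1.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 0.0, 0.0]]
train = [[1.0, 2.0, 3.0, 4.0],
         [3.0, 2.0, 1.0, 0.0],
         [2.0, 5.0, 2.0, 2.0]]

train_weights = [project(f, mean, eigenfaces) for f in train]
probe = [2.0, 4.5, 2.0, 2.0]
match_index, match_dist = nearest(project(probe, mean, eigenfaces), train_weights)
```

With this toy data the probe is closest to the third training face (index 2), at a distance of 0.5 in the two-dimensional weight space.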
This article is indexed in VIP, Wanfang Data, and other databases.