Similar Documents
19 similar documents found (search time: 328 ms)
1.
To address the low accuracy of static gesture recognition in unconstrained environments, this paper proposes a deep neural network that incorporates a Grayscale Image of Hand Skeleton (GHS), built from hand keypoints and the connections between them. The network takes the GHS image and the RGB image as input; the backbone is YOLOv3, extended with a dilated-convolution residual module. After the GHS and RGB features are fused, an SE module rescales the features of each channel, and the RReLU activation function replaces Leaky ReLU. Using the hand keypoints and their connectivity strengthens the hand image features, enlarges inter-class differences between gestures, and reduces the influence of the unconstrained environment, thereby improving recognition accuracy. Experiments show that the proposed method achieves the highest average accuracy among the compared methods: 99.68% on the Microsoft Kinect & Leap Motion dataset and 99.8% on the Creative Senz3D dataset.
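A minimal PyTorch sketch of the squeeze-and-excitation (SE) channel re-weighting step mentioned in the abstract, with RReLU swapped in for Leaky ReLU; the reduction ratio and tensor sizes are illustrative assumptions, not values from the paper.

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-excitation style channel re-weighting, as applied after
    fusing the GHS and RGB feature maps (reduction ratio is an assumption)."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)            # squeeze: global spatial average
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.RReLU(),                                # abstract swaps Leaky ReLU for RReLU
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),                              # per-channel scaling weights in (0, 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w                                   # excitation: rescale each channel

# Example: re-weight a fused 256-channel feature map
fused = torch.randn(2, 256, 13, 13)
print(SEBlock(256)(fused).shape)                       # torch.Size([2, 256, 13, 13])
```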

2.
To address the low accuracy and weak robustness of dynamic gesture recognition in complex environments, a multimodal-fusion dynamic gesture recognition algorithm, TF-MG, is proposed. TF-MG combines depth information with 3D hand skeleton information, extracts the corresponding features with two separate networks, and then fuses the extracted features and feeds them into a classification network to recognize dynamic gestures. For the depth information, the motion history image method compresses the motion trajectory into a single frame, from which features are extracted with MobileNetV2. For the 3D hand skeleton information, DeepGRU, composed of gated recurrent units, extracts the skeleton features. Experimental results on the DHG-14/28 dataset show 93.29% accuracy on the 14-gesture task and 92.25% on the 28-gesture task, higher than the other algorithms compared.
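A small NumPy sketch of the motion-history-image step that compresses a depth clip into a single frame before MobileNetV2 feature extraction; the motion threshold and decay constant are assumed values, not the paper's settings.

```python
import numpy as np

def motion_history_image(frames: np.ndarray, thresh: float = 0.05, tau: int = 30) -> np.ndarray:
    """Compress a depth clip (T, H, W) into a single motion-history image:
    recently moving pixels are bright, older motion fades out.
    Threshold and tau are illustrative values, not taken from the paper."""
    mhi = np.zeros(frames.shape[1:], dtype=np.float32)
    for prev, curr in zip(frames[:-1], frames[1:]):
        motion = np.abs(curr.astype(np.float32) - prev.astype(np.float32)) > thresh
        mhi = np.where(motion, tau, np.maximum(mhi - 1, 0))
    return mhi / tau  # normalise to [0, 1] before feeding the 2D CNN

clip = np.random.rand(16, 120, 160)       # placeholder depth clip
print(motion_history_image(clip).shape)   # (120, 160)
```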

3.
To address insufficient feature extraction and low multi-gesture accuracy in end-to-end surface electromyography (sEMG) gesture recognition, a multi-stream convolutional sEMG gesture recognition network with an attention mechanism is proposed. The model slides a window over the multi-channel time-domain sEMG to generate sEMG sub-images and uses a multi-stream convolutional neural network to fully extract the semantic features of each acquisition channel, which are then aggregated into rich multi-channel gesture semantics. Attention maps over the semantic features are computed along both the temporal and the feature-channel dimensions to strengthen useful features and suppress useless ones, further improving multi-gesture recognition accuracy. The model is trained and tested on the Ninapro dataset and compared with mainstream sEMG gesture recognition models; the results show better recognition accuracy, demonstrating the model's effectiveness.
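A brief NumPy sketch of the sliding-window step that turns a multi-channel time-domain sEMG recording into "sub-images" for the multi-stream CNN; window length and stride are assumptions.

```python
import numpy as np

def semg_windows(signal: np.ndarray, win: int = 200, stride: int = 100) -> np.ndarray:
    """Slice a multi-channel sEMG recording (channels, samples) into overlapping
    windows ("sEMG sub-images"). Window and stride are not the paper's settings."""
    channels, samples = signal.shape
    starts = range(0, samples - win + 1, stride)
    return np.stack([signal[:, s:s + win] for s in starts])  # (n_windows, channels, win)

rec = np.random.randn(12, 2000)          # placeholder 12-channel recording
print(semg_windows(rec).shape)           # (19, 12, 200)
```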

4.
To address the difficulty existing dynamic gesture recognition methods have in accurately matching the spatiotemporal features of long sequences, a spatiotemporally consistent gesture recognition method based on wide residual networks and bidirectional long short-term memory (LSTM) networks is proposed. A pretrained 3D convolutional neural network first extracts short-term features from the spatial and temporal dimensions of the video simultaneously; these are then decoded by a bidirectional spatial LSTM into long-term spatiotemporal feature units, which serve as the input of the residual network. To verify the effectiveness of the algorithm, a new multimodal gesture dataset was built with a Kinect sensor; experiments on three public gesture recognition datasets, SLVM, Montalbano and SKIG, show that the proposed method performs well, with recognition accuracy exceeding the best published results.
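A hedged PyTorch sketch of a bidirectional-LSTM aggregation over per-clip features from a pretrained 3D CNN; feature dimension, hidden size and class count are illustrative assumptions.

```python
import torch
import torch.nn as nn

class BiLSTMHead(nn.Module):
    """Bidirectional LSTM that aggregates per-clip features produced by a
    pretrained 3D CNN into a gesture prediction (dimensions are assumptions)."""
    def __init__(self, feat_dim: int = 512, hidden: int = 256, n_classes: int = 20):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True, bidirectional=True)
        self.fc = nn.Linear(2 * hidden, n_classes)     # forward + backward states

    def forward(self, clip_feats: torch.Tensor) -> torch.Tensor:
        out, _ = self.lstm(clip_feats)                 # (B, T, 2*hidden)
        return self.fc(out.mean(dim=1))                # average over time, then classify

feats = torch.randn(4, 10, 512)                        # 4 videos, 10 clips each
print(BiLSTMHead()(feats).shape)                       # torch.Size([4, 20])
```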

5.
兰红  何璠  张蒲芬 《计算机应用研究》2021,38(12):3791-3795,3825
Existing skeleton-based action recognition mostly adopts a two-stream framework and has shortcomings in how temporal, spatial and channel features are extracted; an ADGCN is therefore proposed for skeleton-based action recognition. The skeleton data are first modeled, with joints, bones, and the motion information of joints and bones each fed into an individual stream of a multi-stream framework. The input data are then passed to the proposed directed graph convolutional network to extract dependencies between joints and bones, and the proposed spatio-temporal-channel attention network (STCN) strengthens the temporal, spatial and channel information of key joints in each layer. Finally, the information of the four streams is combined by weighted averaging to compute recognition accuracy and output the predicted action. The model is trained and validated on two large datasets, NTU-RGB+D and Kinetics-Skeleton. Compared with the baseline DGNN (directed graph neural network), accuracy on the NTU-RGB+D cross-subject (CS) and cross-view (CV) benchmarks improves by 2.43% and 1.2% respectively, and top-1 and top-5 accuracy on Kinetics-Skeleton improves by 0.7% and 0.9%. The proposed ADGCN effectively improves skeleton-based action recognition, with gains on both large datasets.
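A generic PyTorch sketch of one graph-convolution layer over a skeleton adjacency matrix, X' = D^-1(A + I)XW; this is a plain undirected layer for illustration, not the paper's directed-graph variant, and the joint count and feature sizes are assumptions.

```python
import torch
import torch.nn as nn

class GraphConv(nn.Module):
    """One basic graph-convolution layer over the skeleton graph."""
    def __init__(self, in_feats: int, out_feats: int, adj: torch.Tensor):
        super().__init__()
        adj = adj + torch.eye(adj.size(0))              # add self-loops
        self.register_buffer("norm_adj", adj / adj.sum(dim=1, keepdim=True))
        self.weight = nn.Linear(in_feats, out_feats, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, joints, in_feats) -> (batch, joints, out_feats)
        return self.weight(torch.einsum("jk,bkf->bjf", self.norm_adj, x))

adj = torch.zeros(25, 25)
adj[0, 1] = adj[1, 0] = 1                               # toy edge in a 25-joint skeleton
layer = GraphConv(3, 64, adj)
print(layer(torch.randn(8, 25, 3)).shape)               # torch.Size([8, 25, 64])
```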

6.
Dynamic gesture recognition based on hand skeletons is a research hotspot in computer vision and human-computer interaction. The joints involved in gestures are more densely distributed in space and more strongly correlated. To address the complex spatial features and slow recognition speed of current skeleton-based dynamic gesture recognition, an attention-guided spatial graph convolutional simple recurrent unit (ASGC-SRU) network is proposed. First, spatial graph convolution is embedded into the gate structure of the SRU so that the SRU, with its highly parallel computation, can model both the temporal and the spatial information of complex gestures. Then a finger-joint attention guidance module is introduced so that more important finger joints receive more attention. Finally, an attention-enhanced spatial graph dropout (ASD) regularization method is introduced to alleviate overfitting. Extensive experiments on the widely used dynamic gesture datasets SHREC'17 and DHG 14/28 show that the proposed method achieves high recognition accuracy while maintaining good computational efficiency.

7.
Vision-based dynamic gesture recognition is easily affected by illumination, background and changes in hand shape. Building on an analysis of the spatial context of human gestures, a dynamic gesture model based on human skeleton and body-part contour features is first established, and a deep neural network built from convolutional pose machines and single-shot multibox detectors extracts the skeleton and part-contour features. A long short-term memory network is then introduced to extract the temporal features of the skeleton and of the left-hand, right-hand and head contours in dynamic gestures, which are then classified. On this basis, a dynamic gesture recognizer fusing spatial context and temporal features (GRSCTFF) is designed, trained and evaluated on a video corpus of traffic police command gestures. Experiments show that the system recognizes dynamic traffic police command gestures quickly and accurately, reaching 94.12% accuracy, and is robust to changes in illumination, background and hand shape.

8.
For recognizing manipulation actions in dynamic, complex scenes, an action recognition framework based on hand-gesture feature fusion is proposed. The framework contains an RGB video feature extraction module, a hand feature extraction module and an action classification module. The RGB video feature extraction module mainly uses an I3D network to extract the temporal and spatial features of the RGB video; the hand feature extraction module uses a Mask R-CNN network to extract the operator's hand features; the action classification module fuses the two kinds of features and feeds them to a classifier. On the EPIC-Kitchens dataset, the proposed method recognizes grasping gestures with 89.63% accuracy and composite actions with 74.67% accuracy.
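A minimal PyTorch sketch of the late-fusion classification step: video features (e.g., from I3D) and hand features (e.g., pooled from Mask R-CNN detections) are concatenated and classified; all dimensions and the class count are illustrative assumptions.

```python
import torch
import torch.nn as nn

class FusionClassifier(nn.Module):
    """Late fusion of RGB-video features and hand features by concatenation,
    followed by a small classification head; sizes are illustrative."""
    def __init__(self, video_dim: int = 1024, hand_dim: int = 256, n_actions: int = 64):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(video_dim + hand_dim, 512),
            nn.ReLU(),
            nn.Dropout(0.5),
            nn.Linear(512, n_actions),
        )

    def forward(self, video_feat: torch.Tensor, hand_feat: torch.Tensor) -> torch.Tensor:
        return self.head(torch.cat([video_feat, hand_feat], dim=1))  # concatenate, then classify

print(FusionClassifier()(torch.randn(8, 1024), torch.randn(8, 256)).shape)  # torch.Size([8, 64])
```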

9.
In recent years, research on continuous sign language recognition has mainly focused on the RGB modality and has made notable progress on both real-world and laboratory-collected datasets. However, processing the RGB modality demands high computing power, whereas the skeleton keypoint modality, with its lower input complexity, is faster to process but weaker in recognition performance. To combine the advantages of both, a cross-modal knowledge distillation method based on aligning temporally related information (Temporally Related Knowledge Distillation, TRKD) is proposed. An RGB-modality neural network serves as the teacher to guide a skeleton-keypoint student network, enabling fast and accurate continuous sign language recognition. Since the teacher's understanding of sign language context is well worth learning, a graph convolutional network with prior information and adaptive learning is proposed to extract temporally related features from both modalities, and teaching is carried out through feature alignment. Introducing learnable parameters into the teacher network during alignment causes the teacher's supervisory information to be lost; to solve this, TRKD adopts contrastive learning from self-supervised learning to provide supervision, aligning the teacher and student networks on the temporally related features. Multiple distillation tasks are organized on the Phoenix-2014 sign language dataset to verify the proposed method's ...
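A hedged sketch of a contrastive (InfoNCE-style) alignment loss between teacher and student temporal features, in the spirit of the contrastive supervision described above; this is a generic formulation, not the exact TRKD loss, and the feature sizes are assumptions.

```python
import torch
import torch.nn.functional as F

def contrastive_alignment(teacher_feat: torch.Tensor,
                          student_feat: torch.Tensor,
                          temperature: float = 0.07) -> torch.Tensor:
    """Pull the student's (skeleton) features toward the teacher's (RGB) features
    at matching time steps and push apart mismatched steps."""
    t = F.normalize(teacher_feat, dim=1)
    s = F.normalize(student_feat, dim=1)
    logits = s @ t.t() / temperature              # (N, N) similarity matrix
    targets = torch.arange(logits.size(0))        # positives lie on the diagonal
    return F.cross_entropy(logits, targets)

loss = contrastive_alignment(torch.randn(32, 512), torch.randn(32, 512))
print(loss.item())
```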

10.
To effectively eliminate visual problems caused by background, illumination and other interference in sign language recognition, low-redundancy skeleton data are used to represent sign language information, and an end-to-end continuous sign language recognition model is designed. First, hand-shape and trajectory features are extracted within and between frames, which effectively reduces the dispersion of the raw samples. Second, a series of parallel two-branch residual networks optimizes and fuses the hand-shape and trajectory features into spatiotemporal feature sequences. Finally, an attention-based encoder-decoder network maps the spatiotemporal feature sequences to the translated text. A sign language dataset based on 3D hand skeleton data, LMSLR, was collected with a Leap Motion device. Experimental results on the LMSLR dataset and the public CSL dataset show that, compared with most video-based models, the model achieves higher accuracy with less computation.
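A minimal sketch of dot-product attention inside an encoder-decoder: at each decoding step the spatiotemporal feature sequence is re-weighted by its similarity to the decoder state and summed into a context vector; a generic formulation, not necessarily the paper's exact attention.

```python
import torch
import torch.nn.functional as F

def attention_context(decoder_state: torch.Tensor,
                      encoder_outputs: torch.Tensor) -> torch.Tensor:
    """decoder_state: (B, D); encoder_outputs: (B, T, D) -> context (B, D)."""
    scores = torch.bmm(encoder_outputs, decoder_state.unsqueeze(2)).squeeze(2)  # (B, T)
    weights = F.softmax(scores, dim=1)                                          # attention weights
    return torch.bmm(weights.unsqueeze(1), encoder_outputs).squeeze(1)          # weighted sum

ctx = attention_context(torch.randn(2, 256), torch.randn(2, 40, 256))
print(ctx.shape)   # torch.Size([2, 256])
```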

11.
Cemil  Ming C.   《Neurocomputing》2007,70(16-18):2891
Sign language (SL), which is a highly visual-spatial, linguistically complete, and natural language, is the main mode of communication among deaf people. Described in this paper are two different American Sign Language (ASL) word recognition systems developed using artificial neural networks (ANN) to translate ASL words into English. Feature vectors of signing words taken at five time instants were used in the first system, while histograms of feature vectors of signing words were used in the second system. The systems use a sensory glove, Cyberglove™, and a Flock of Birds® 3-D motion tracker to extract the gesture features. The finger joint angle data obtained from strain gauges in the sensory glove define the hand shape, and the data from the tracker describe the trajectory of hand movement. In both systems, the data from these devices were processed by two neural networks: a velocity network and a word recognition network. The velocity network uses hand speed to determine the duration of words. Signs are defined by feature vectors such as hand shape, hand location, orientation, movement, bounding box, and distance. The second network was used as a classifier to convert ASL signs into words based on these features or their histograms. We trained and tested our ANN models with 60 ASL words using different numbers of samples, and compared the two methods with each other. Our test results show that the recognition accuracy of these two systems is 92% and 95%, respectively.

12.
Convolutional neural networks are widely used in gesture recognition, but existing networks represent features insufficiently, which leads to low recognition accuracy. A lightweight static gesture recognition algorithm, r-mobilenetv2, is proposed. Channel attention and spatial attention are connected in series, and the feature maps they output are added linearly through a skip connection, yielding a new attention mechanism. A one-dimensional convolution adjusts the channel dimension of the low-level features, which are matched to the upsampled high-level features in both the spatial and channel dimensions and added linearly; after a convolution, the result is concatenated with the high-level features along the channel dimension, achieving feature fusion. The proposed attention mechanism and feature fusion are combined and applied to an improved lightweight MobileNetV2, giving the r-mobilenetv2 algorithm. Experimental results show that, compared with MobileNetV2, r-mobilenetv2 has 27% fewer parameters and an error rate that is 1.82 percentage points lower.

13.
Gesture recognition based on Fourier descriptors against complex backgrounds   Total citations: 5; self-citations: 1; citations by others: 5
刘寅  滕晓龙  刘重庆 《计算机仿真》2005,22(12):158-161
Hand gestures are one of the most widely used modes of communication in daily life. Owing to applications in human-computer interfaces and virtual reality environments, gesture recognition has attracted increasingly broad attention. In current monocular-vision gesture recognition, however, hand segmentation either requires a simple background or requires the user to wear a cumbersome data glove. This paper combines motion information with a skin-color model based on the KL transform to segment hands against complex backgrounds, achieving much better segmentation than the traditional RGB skin-color model in such settings. After preprocessing the segmented hand regions, a normalized Fourier descriptor is used for feature extraction, which is more accurate than the traditional Fourier descriptor. Finally, a conventional three-layer BP network serves as the pattern recognizer; recognition rates reach 95.9% on the gesture training set and 95% on the test set.
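A short NumPy sketch of normalized Fourier descriptors for a closed hand contour, using one common normalization (drop the DC term, discard phase, divide by the first harmonic); the paper's exact normalization may differ.

```python
import numpy as np

def fourier_descriptors(contour: np.ndarray, n_coeffs: int = 32) -> np.ndarray:
    """Translation-, scale- and rotation-invariant Fourier descriptors of a
    closed contour given as (N, 2) boundary points."""
    z = contour[:, 0] + 1j * contour[:, 1]       # complex boundary signal
    coeffs = np.fft.fft(z)
    coeffs[0] = 0                                # drop DC term -> translation invariance
    mags = np.abs(coeffs)                        # discard phase -> rotation/start-point invariance
    mags = mags / (mags[1] + 1e-8)               # divide by first harmonic -> scale invariance
    return mags[1:n_coeffs + 1]

theta = np.linspace(0, 2 * np.pi, 128, endpoint=False)
circle = np.stack([np.cos(theta), np.sin(theta)], axis=1)
print(fourier_descriptors(circle)[:3])           # first harmonic is 1.0 by construction
```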

14.
Hand gestures are a natural way for human-robot interaction. Vision-based dynamic hand gesture recognition has become a hot research topic due to its various applications. This paper presents a novel deep learning network for hand gesture recognition. The network integrates several well-proved modules to learn both short-term and long-term features from video inputs while avoiding intensive computation. To learn short-term features, each video input is segmented into a fixed number of frame groups. A frame is randomly selected from each group and represented as an RGB image as well as an optical flow snapshot. These two entities are fused and fed into a convolutional neural network (ConvNet) for feature extraction; the ConvNets for all groups share parameters. To learn long-term features, outputs from all ConvNets are fed into a long short-term memory (LSTM) network, which predicts the final classification result. The new model has been tested with two popular hand gesture datasets, namely the Jester dataset and the Nvidia dataset. Compared with other models, our model produced very competitive results. The robustness of the new model has also been proved with an augmented dataset with enhanced diversity of hand gestures.
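A small Python sketch of the frame-group sampling described above: the video is split into a fixed number of groups and one frame index is drawn at random from each; the group count is an assumption.

```python
import random

def sample_frame_indices(n_frames: int, n_groups: int = 8) -> list[int]:
    """Split a video of n_frames into n_groups equal segments and randomly
    pick one frame index per segment (group count is illustrative)."""
    bounds = [round(i * n_frames / n_groups) for i in range(n_groups + 1)]
    return [random.randrange(lo, max(lo + 1, hi)) for lo, hi in zip(bounds, bounds[1:])]

print(sample_frame_indices(90))   # e.g. [4, 13, 27, 36, 47, 58, 70, 82]
```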

15.
In static gesture recognition, traditional hand-crafted feature extraction is time-consuming and labor-intensive and yields low recognition rates, while existing convolutional neural networks that rely on a single kernel size do not extract features sufficiently. A dual-channel convolutional neural network model is therefore proposed. The input gesture image passes through two independent channels for feature extraction; the two channels use convolution kernels of different sizes and thus capture features at different scales, which are fused in the fully connected layer and finally classified by a softmax classifier. Experiments on the Thomas Moeslund and Jochen Triesch gesture databases show that the model improves static gesture recognition accuracy and strengthens the generalization ability of the convolutional neural network.
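A hedged PyTorch sketch of a dual-channel CNN whose two independent branches use different kernel sizes and are fused at the fully connected layer; channel widths, kernel sizes and class count are illustrative, not the paper's configuration.

```python
import torch
import torch.nn as nn

class DualChannelCNN(nn.Module):
    """Two independent convolution branches with different kernel sizes,
    fused in the fully connected layer."""
    def __init__(self, n_classes: int = 24):
        super().__init__()
        def branch(k):
            return nn.Sequential(
                nn.Conv2d(1, 16, k, padding=k // 2), nn.ReLU(), nn.MaxPool2d(2),
                nn.Conv2d(16, 32, k, padding=k // 2), nn.ReLU(), nn.AdaptiveAvgPool2d(4),
            )
        self.small = branch(3)                 # fine-scale features
        self.large = branch(7)                 # coarse-scale features
        self.fc = nn.Linear(2 * 32 * 4 * 4, n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        f = torch.cat([self.small(x).flatten(1), self.large(x).flatten(1)], dim=1)
        return self.fc(f)                      # fuse, then softmax is applied in the loss

print(DualChannelCNN()(torch.randn(2, 1, 64, 64)).shape)   # torch.Size([2, 24])
```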

16.
Hand gesture recognition has been intensively applied in various human-computer interaction (HCI) systems. Different hand gesture recognition methods have been developed based on particular features, e.g., gesture trajectories and acceleration signals. However, it has been noticed that the limitation of either feature can lead to flaws in an HCI system. In this paper, to overcome the limitations but combine the merits of both features, we propose a novel feature fusion approach for 3D hand gesture recognition. In our approach, gesture trajectories are represented by the intersection numbers with randomly generated line segments on their 2D principal planes, and acceleration signals are represented by the coefficients of the discrete cosine transform (DCT). Then, a hidden space shared by the two features is learned using penalized maximum likelihood estimation (MLE). An iterative algorithm, composed of two steps per iteration, is derived for this penalized MLE, in which the first step solves a standard least-squares problem and the second step solves a Sylvester equation. We tested our hand gesture recognition approach on different hand gesture sets. Results confirm the effectiveness of the feature fusion method.
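The second step of each iteration solves a Sylvester equation AX + XB = Q, which SciPy can solve directly; A, B and Q below are random placeholders standing in for the matrices produced by the least-squares step.

```python
import numpy as np
from scipy.linalg import solve_sylvester

# Solve the Sylvester equation A X + X B = Q with placeholder matrices.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
B = rng.standard_normal((3, 3))
Q = rng.standard_normal((5, 3))

X = solve_sylvester(A, B, Q)
print(np.allclose(A @ X + X @ B, Q))   # True: X satisfies the equation
```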

17.
The role of gesture recognition is significant in areas like human-computer interaction, sign language, virtual reality, machine vision, etc. Among the various gestures of the human body, hand gestures play a major role in communicating nonverbally with the computer. As a hand gesture is a continuous pattern with respect to time, the hidden Markov model (HMM) is found to be the most suitable pattern recognition tool, which can be modeled using the hand gesture parameters. The HMM takes the speeded-up robust features of the hand gesture and uses them to train and test the system. Conventionally, the Viterbi algorithm has been used for the training process in HMM by discovering the shortest decoded path in the state diagram. The recursiveness of the Viterbi algorithm leads to computational complexity during the execution process. In order to reduce the complexity, a state sequence analysis approach is proposed for training the hand gesture model, which provides a better recognition rate and accuracy than the Viterbi algorithm. The performance of the proposed approach is explored in the context of pattern recognition with the Cambridge hand gesture data set.
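For reference, a NumPy sketch of the standard Viterbi decoding that the proposed state sequence analysis is compared against; the toy HMM parameters are random placeholders.

```python
import numpy as np

def viterbi(log_pi, log_A, log_B, obs):
    """Most likely HMM state path for an observation sequence.
    log_pi: (S,) initial log-probs, log_A: (S, S) transition log-probs,
    log_B: (S, O) emission log-probs, obs: list of observation indices."""
    S, T = len(log_pi), len(obs)
    delta = np.zeros((T, S))
    psi = np.zeros((T, S), dtype=int)
    delta[0] = log_pi + log_B[:, obs[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_A          # (prev state, next state)
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_B[:, obs[t]]
    path = [int(delta[-1].argmax())]
    for t in range(T - 1, 0, -1):                        # backtrack
        path.append(int(psi[t, path[-1]]))
    return path[::-1]

rng = np.random.default_rng(1)
log_pi = np.log(np.full(3, 1 / 3))
log_A = np.log(rng.dirichlet(np.ones(3), size=3))
log_B = np.log(rng.dirichlet(np.ones(4), size=3))
print(viterbi(log_pi, log_A, log_B, [0, 2, 1, 3, 3]))
```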

18.
An American Sign Language (ASL) recognition system is being developed using artificial neural networks (ANNs) to translate ASL words into English. The system uses a sensory glove called the Cyberglove™ and a Flock of Birds® 3-D motion tracker to extract the gesture features. The finger joint angle data obtained from strain gauges in the sensory glove define the hand shape, while the data from the tracker describe the trajectory of hand movements. The data from these devices are processed by a velocity network with noise reduction and feature extraction and by a word recognition network. Some global and local features are extracted for each ASL word, and a neural network is used as a classifier of this feature vector. Our goal is to continuously recognize ASL signs using these devices in real time. We trained and tested the ANN model for 50 ASL words with different numbers of samples for every word. The test results show that our feature vector extraction method and neural networks can be used successfully for isolated word recognition. This system is flexible and open for future extension.

19.
To improve the performance of accelerometer-based dynamic gesture recognition and enhance the extensibility of the system, a method that effectively combines a machine learning model with template matching is proposed. Gestures are divided into basic gestures and complex gestures, where a complex gesture can be segmented into a sequence of basic gestures. Effective features are extracted according to the characteristics of gesture motion, a random forest model is trained on basic-gesture samples, and the model is then used to classify and predict basic gesture sequences. The predictions are Johnson-encoded and then matched for similarity against standard template sequences. Experimental results show a 99.75% recognition rate for basic gestures and 100% for complex gestures. The algorithm guarantees recognition accuracy while improving the extensibility of the system.
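A hedged scikit-learn sketch of the basic-gesture stage: simple per-axis statistics of accelerometer windows feed a random forest classifier. The feature set, window length and class count are assumptions, and the Johnson-encoding/template-matching stage is not shown.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def accel_features(window: np.ndarray) -> np.ndarray:
    """Per-axis statistics of a 3-axis accelerometer window (N, 3);
    illustrative features, not the paper's exact feature set."""
    return np.concatenate([window.mean(0), window.std(0),
                           window.min(0), window.max(0)])

# Train a random forest on placeholder basic-gesture windows, then predict.
rng = np.random.default_rng(0)
X = np.stack([accel_features(rng.standard_normal((50, 3))) for _ in range(200)])
y = rng.integers(0, 6, size=200)              # 6 basic gesture classes (assumed)
clf = RandomForestClassifier(n_estimators=100).fit(X, y)
print(clf.predict(X[:3]))
```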
