Similar Articles
 20 similar articles found (search time: 0 ms)
1.
Emotion recognition is a hot research topic in modern intelligent systems, with pervasive applications in autonomous vehicles, remote medical services, and human–computer interaction (HCI). Traditional speech emotion recognition algorithms generalize poorly because they assume that training and testing data come from the same domain and share the same distribution. In practice, however, speech data are acquired from different devices and recording environments, and may therefore differ significantly in language, emotion types, and labels. To address this problem, we propose a bimodal fusion algorithm for speech emotion recognition in which facial expression and speech information are optimally fused. We first combine a CNN and an RNN for facial emotion recognition. We then use MFCCs to convert the speech signal into images, so that an LSTM and a CNN can recognize speech emotion. Finally, a weighted decision fusion method fuses the facial expression and speech predictions. Comprehensive experimental results demonstrate that bimodal emotion recognition outperforms its uni-modal counterparts.
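The weighted decision fusion step described above can be sketched as a convex combination of the two modalities' class probabilities. This is a minimal illustration: the fusion weight and the three-class layout are assumptions, not values from the paper.

```python
import numpy as np

def weighted_decision_fusion(p_face, p_speech, w_face=0.6):
    """Fuse per-class probabilities from the facial and speech branches.

    w_face is a hypothetical modality weight; the abstract does not
    specify how the weights are chosen."""
    fused = w_face * np.asarray(p_face) + (1.0 - w_face) * np.asarray(p_speech)
    return int(np.argmax(fused)), fused

# e.g. three emotion classes (illustrative probabilities from each branch)
cls, fused = weighted_decision_fusion([0.1, 0.7, 0.2], [0.2, 0.3, 0.5])
```

Because both inputs are probability vectors and the weights sum to one, the fused vector remains a valid distribution over classes.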

2.
The rapid development of 3D acquisition devices has greatly advanced research on 3D data technology, and research results on 3D facial expression recognition, which uses 3D face data as its carrier, continue to emerge. 3D facial expression recognition largely overcomes the pose and illumination problems of 2D recognition. This paper systematically surveys 3D expression recognition technology, summarizing and analyzing its key techniques: expression feature extraction, expression coding and classification, and expression databases; it also offers suggestions for future research. 3D facial expression recognition basically meets accuracy requirements, but its real-time performance needs further optimization. The content provides guidance for research in this field.

3.
Single-sample face recognition based on LBP features and a Bayesian model
To address single-sample face recognition, a difficult problem within face recognition, this paper proposes a method based on local binary pattern (LBP) histogram features and a Bayesian model. First, prior information on the similarity of LBP histogram features between same-class and different-class sample pairs is learned on an independent training set, and the class-conditional probability density functions of the two cases are estimated. During recognition, the LBP histogram similarity of an image pair is used to compute the posterior prob…
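The LBP-histogram similarity that feeds the Bayesian model can be sketched as follows. This is a simplified illustration: a basic 8-neighbour LBP without interpolation, and histogram intersection as the similarity measure; the paper's exact operator variant and similarity are not specified in the abstract.

```python
import numpy as np

def lbp_image(gray):
    # basic 8-neighbour LBP without interpolation (illustrative variant)
    g = np.asarray(gray, dtype=float)
    c = g[1:-1, 1:-1]
    codes = np.zeros(c.shape, dtype=int)
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dy, dx) in enumerate(offsets):
        n = g[1 + dy:g.shape[0] - 1 + dy, 1 + dx:g.shape[1] - 1 + dx]
        codes |= (n >= c).astype(int) << bit
    return codes

def lbp_histogram(gray, bins=256):
    h, _ = np.histogram(lbp_image(gray), bins=bins, range=(0, bins))
    return h / max(h.sum(), 1)

def hist_similarity(h1, h2):
    # histogram intersection: the scalar similarity fed to the Bayesian model
    return float(np.minimum(h1, h2).sum())

h = lbp_histogram(np.arange(25).reshape(5, 5))
```

In the full method, such similarities from pairs of training images are used to estimate the same-class and different-class conditional densities offline.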

4.
To counter the severe impact of illumination, pose, and expression on face recognition rates, a face feature extraction and recognition algorithm combining Cartesian differential invariants (CDI) and local binary patterns (LBP) is proposed. First, Gaussian derivative operators extract the differential structures of a face image, and these structures are combined into an irreducible Cartesian differential invariant set. Next, the LBP feature of each component of the CDI set is computed, and the LBP features of all components are concatenated to form the face image's feature. Finally, the extracted local descriptors and a support vector machine (SVM) classifier perform face classification and recognition. Experimental analysis shows that CDI-based LBP features are highly invariant to changes in face position, pose, illumination, and expression. The algorithm achieves recognition rates of 98.5% on the ORL face database and 98.89% on Yale.
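The differential-structure step can be sketched by combining image derivatives into a few low-order rotation invariants. Note the simplification: `np.gradient` is a finite-difference stand-in for the true Gaussian derivative operators the paper uses, and the three invariants shown are a hypothetical subset of the full irreducible set.

```python
import numpy as np

def differential_invariants(img):
    """Low-order rotation invariants built from image derivatives.

    np.gradient is a finite-difference stand-in for the Gaussian
    derivative operators used in the paper."""
    L = np.asarray(img, dtype=float)
    Ly, Lx = np.gradient(L)        # first-order derivatives (rows, cols)
    Lyy, Lyx = np.gradient(Ly)     # second-order derivatives
    Lxy, Lxx = np.gradient(Lx)
    grad_mag2 = Lx**2 + Ly**2      # squared gradient magnitude
    laplacian = Lxx + Lyy          # trace of the Hessian
    det_hess = Lxx * Lyy - Lxy**2  # determinant of the Hessian
    return np.stack([grad_mag2, laplacian, det_hess])

# on a planar ramp, the gradient is constant and curvature terms vanish
inv = differential_invariants(np.outer(np.arange(6), np.ones(6)))
```

Each invariant component would then be encoded with LBP and concatenated, per the pipeline above.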

5.
To extract decisive features from gesture images and reduce the information redundancy of existing gesture recognition methods, we propose a new multi-scale feature extraction module, densely connected Res2Net (DC-Res2Net), and design a feature fusion attention module (FFA). First, building on the multi-scale residual network Res2Net, DC-Res2Net uses channel grouping to extract fine-grained multi-scale features, with dense connections adopted to strengthen features at different scales. We then apply a selective kernel network (SK-Net) to enhance the representation of effective features. Afterwards, the FFA removes redundant information by fusing low-level location features with high-level semantic features. Finally, experiments on the OUHANDS, ASL, and NUS-II datasets validate our method. The results demonstrate the superiority of DC-Res2Net and FFA, which extract more decisive features and remove redundant information while maintaining high recognition accuracy and low computational complexity.

6.
To address the inability of (2D)²PCA to preserve certain important local features, a block-wise bidirectional two-dimensional principal component analysis method fusing local features is proposed. First, the image is partitioned into non-overlapping sub-blocks, each containing important local information, and (2D)²PCA extracts features from each sub-block and projects them into a feature subspace. Then, a classifier is designed for each sub-block to decide, within a confidence range, which class the test sample belongs to. Finally, face classification is completed according to the sum of the class confidences over all sub-blocks. Experiments on four face recognition databases show that the proposed method achieves higher recognition accuracy than several other face recognition algorithms.
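The final decision rule above, summing per-block confidences per class, can be sketched as below. The class count and confidence values are illustrative, not taken from the paper.

```python
import numpy as np

def fuse_block_decisions(block_votes, n_classes):
    """block_votes: (predicted_class, confidence) pairs, one from each
    sub-block's classifier; the class with the largest summed confidence wins."""
    totals = np.zeros(n_classes)
    for cls, conf in block_votes:
        totals[cls] += conf
    return int(np.argmax(totals))

# three sub-blocks voting over 4 identities (illustrative numbers):
# identity 0 is chosen because 0.9 + 0.5 exceeds identity 2's 0.6
winner = fuse_block_decisions([(0, 0.9), (2, 0.6), (0, 0.5)], n_classes=4)
```

Summing confidences rather than taking a majority vote lets a few highly confident sub-blocks (e.g. around the eyes) outweigh many uncertain ones.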

7.
8.
This paper presents a fully automatic, low-cost system for generating animatable, statically multi-textured avatars of real people captured with several standard cameras. The system features a novel technique for generating view-independent texture atlases computed from the original images, and two proposals for improving the quality of the facial region of the 3D mesh: a purely passive one implying no additional cost, and another based on active techniques such as structured-light projection.

9.
With the rapid development of three-dimensional (3D) vision technology and the increasing application of 3D objects, there is an urgent need for 3D object recognition in computer vision, virtual reality, and artificial intelligence robotics. View-based methods project 3D objects into two-dimensional (2D) images from different viewpoints and apply convolutional neural networks (CNN) to model the projected views. Although these methods achieve excellent recognition performance, they lack sufficient information interaction between the features of different views. Inspired by the recent success of the vision transformer (ViT) in image recognition, we propose a hybrid network that uses a CNN to extract multi-scale local information from each view and a transformer to capture the relevance of that multi-scale information across views. To verify the effectiveness of our multi-view convolutional vision transformer (MVCVT), we conduct experiments on two public benchmarks, ModelNet40 and ModelNet10, and compare it with several state-of-the-art methods. The results show that MVCVT delivers competitive performance in 3D object recognition.

10.
Dense 3D reconstruction is required for robots to navigate safely or perform advanced tasks. Accurate image depth and pose are the basis of 3D reconstruction, but the resolution of depth maps obtained by LIDAR and RGB-D cameras is limited, and traditional pose-calculation methods are not accurate enough. In addition, using every image for dense reconstruction inflates the point clouds and the amount of computation. To address these issues, we propose a 3D reconstruction system. Specifically, we propose a depth network with contour and gradient attention, used to complete and correct depth maps into high-resolution, high-quality ones. We then propose a pose-estimation method that fuses traditional algorithms with deep learning to obtain accurate localization results. Finally, we autonomously select keyframes to reduce their number, and perform surfel-based geometric reconstruction to build the dense 3D environment. On the TUM RGB-D, ICL-NUIM, and KITTI datasets, our method significantly improves the quality of the depth maps, the localization results, and the 3D reconstruction, while also accelerating reconstruction.

11.
To compensate for the insufficiency of any single feature in 3D face recognition, a recognition algorithm fusing holistic and local information is adopted to improve the recognition rate. First, the preprocessed 3D point cloud is fitted with multi-level B-spline surfaces to obtain an accurate facial surface fitting function; the control points are mapped to a depth image, and the mid-sagittal contour through the nose tip and horizontal contours are extracted according to the surface function and physiological landmarks. Next, two-dimensional principal component analysis (2D-PCA) extracts holistic information from the depth image, while an improved ICP algorithm matches the contours as local information. Finally, a weighted-sum rule fuses the information at the decision level. Recognition tests on the CASIA 3D face database show that the algorithm clearly outperforms recognition with any single feature, is robust to pose, and does not increase algorithmic complexity.

12.
Research on emotion recognition from electroencephalogram (EEG) signals often ignores the related information between brain electrode channels and the contextual emotional information in the EEG signal, which may contain important characteristics of emotional states. To address these defects, a spatiotemporal emotion recognition method based on a 3-dimensional (3D) time-frequency-domain feature matrix is proposed. Specifically, the extracted time-frequency-domain EEG features are first arranged into a 3D matrix according to the actual positions on the cerebral cortex. The input 3D matrix is then processed successively by a multivariate convolutional neural network (MVCNN) and a long short-term memory network (LSTM) to classify the emotional state. Evaluated on the DEAP dataset, the method achieves accuracies of 87.58% and 88.50% on the arousal and valence dimensions in binary classification, and 84.58% in four-class classification. The results show that the 3D matrix representation captures emotional information more reasonably than a two-dimensional (2D) one, and that the MVCNN and LSTM exploit the spatial information of the electrode channels and the temporal context of the EEG signal, respectively.
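The 3D-matrix construction, placing each channel's per-band features at its scalp position, might look like the sketch below. The 3×3 grid and electrode subset are assumptions for illustration; DEAP records 32 channels, which the paper maps to their actual cortical positions.

```python
import numpy as np

# Hypothetical 3x3 scalp grid for a few electrodes (the paper maps all
# 32 DEAP channels to their actual positions on the cerebral cortex)
LAYOUT = {"Fp1": (0, 0), "Fp2": (0, 2), "Cz": (1, 1), "O1": (2, 0), "O2": (2, 2)}

def to_3d_matrix(band_features, grid=(3, 3)):
    """band_features: electrode name -> per-frequency-band feature vector.
    Returns an array of shape (n_bands, H, W); unoccupied cells stay zero."""
    n_bands = len(next(iter(band_features.values())))
    m = np.zeros((n_bands,) + grid)
    for ch, feats in band_features.items():
        r, c = LAYOUT[ch]
        m[:, r, c] = feats
    return m

# two frequency bands for two electrodes (illustrative values)
m = to_3d_matrix({"Fp1": [1.0, 2.0], "Cz": [3.0, 4.0]})
```

The resulting stack of per-band spatial maps is what the MVCNN consumes, with the LSTM handling the sequence of such matrices over time.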

13.
A general object recognition algorithm based on feature-level fusion of 2D and 3D SIFT
LI Xinde, LIU Miaomiao, XU Yefan, LUO Chaomin. Acta Electronica Sinica, 2015, 43(11): 2277-2283
Choosing features that properly represent inter-class differences and intra-class commonality is crucial for general object recognition. Building on 2D SIFT (Scale Invariant Feature Transform), this paper proposes a 3D SIFT descriptor for point cloud models and, in turn, a general object recognition algorithm based on feature-level fusion of 2D and 3D SIFT. 2D and 3D SIFT descriptors are extracted from an object's 2D image and 3D point cloud respectively; a bag-of-words (BoW) model converts them into feature vectors, which are fused at the feature level to describe the object; a supervised support vector machine (SVM) classifier then performs recognition and outputs the final result. Experiments verify the benefits of the proposed algorithm.
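The BoW quantization and feature-level fusion can be sketched as follows. The codebook sizes, descriptor dimensions, and nearest-neighbour assignment are illustrative assumptions; the paper's codebooks would be learned (e.g. by clustering) from training descriptors.

```python
import numpy as np

def bow_histogram(descriptors, codebook):
    # assign each local descriptor to its nearest visual word,
    # then count word occurrences and L1-normalize
    d = np.linalg.norm(descriptors[:, None, :] - codebook[None, :, :], axis=2)
    h = np.bincount(d.argmin(axis=1), minlength=len(codebook)).astype(float)
    return h / max(h.sum(), 1.0)

def fused_vector(desc2d, desc3d, cb2d, cb3d):
    # feature-level fusion: concatenate the 2D and 3D BoW histograms
    return np.concatenate([bow_histogram(desc2d, cb2d), bow_histogram(desc3d, cb3d)])

# random stand-ins: 40 2D descriptors (dim 8), 30 3D descriptors (dim 5),
# with hypothetical codebooks of 16 and 12 visual words
rng = np.random.default_rng(0)
v = fused_vector(rng.normal(size=(40, 8)), rng.normal(size=(30, 5)),
                 rng.normal(size=(16, 8)), rng.normal(size=(12, 5)))
```

The fused vector (here length 16 + 12 = 28) is what the SVM classifier would be trained on.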

14.
Under weak-light or no-light conditions, the recognition accuracy of mature 2D face recognition technology drops sharply. This paper proposes a face recognition algorithm based on matching 3D face data against 2D face images. First, 3D face data is reconstructed from the 2D faces in the database with the 3DMM algorithm, and a face depth image is obtained through orthogonal projection. Then, the mean curvature map of the face depth image is used to augment the depth-image data. Finally, an improved residual neural network based on the depth image and curvature is designed to compare the scanned face with the faces in the database. Tested on the 3D face data of three public face datasets (Texas 3DFRD, FRGC v2.0, and Lock3DFace), the proposed method achieves recognition accuracies of 84.25%, 83.39%, and 78.24%, respectively.
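The mean-curvature-map step can be sketched as the mean curvature of the depth surface z = d(y, x), computed from finite-difference derivatives. This is a textbook formula applied as an illustration; any smoothing or normalization the paper applies beforehand is not shown.

```python
import numpy as np

def mean_curvature(depth):
    """Mean curvature H of the surface z = depth(y, x),
    using finite-difference derivatives as a simple approximation."""
    z = np.asarray(depth, dtype=float)
    zy, zx = np.gradient(z)        # first derivatives (rows, cols)
    zyy, zyx = np.gradient(zy)     # second derivatives
    zxy, zxx = np.gradient(zx)
    num = (1 + zx**2) * zyy - 2 * zx * zy * zxy + (1 + zy**2) * zxx
    den = 2 * (1 + zx**2 + zy**2) ** 1.5
    return num / den

# sanity check: a planar depth patch has zero mean curvature everywhere
H = mean_curvature(np.fromfunction(lambda y, x: 0.5 * x + 0.2 * y, (8, 8)))
```

Feeding both the depth image and its curvature map to the network gives the classifier a shading-independent description of local surface shape.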

15.
16.
3D face modeling is a research hotspot in computer vision and computer graphics. This paper first analyzes the background, significance, and research status of 3D face modeling technology; then discusses the advantages and disadvantages of various 3D face modeling techniques; and finally details the application areas of 3D face modeling and looks ahead to future directions.

17.
Facial expression recognition plays an important role in human–computer interaction and other artificial intelligence fields, yet current research ignores the semantic information of the face. This paper proposes a facial expression recognition network fusing local semantics with global information, composed of two branches: a local semantic region extraction branch and a local–global feature fusion branch. First, a semantic segmentation network is trained on a face parsing dataset to obtain face semantic parses, which are transferred to the facial expression datasets. Regions meaningful for expression recognition and their semantic features are obtained from the parses, and the local semantic features are fused with global features to construct semantic local features. Finally, the semantic local features and global features are fused into a global semantic composite feature, which a classifier assigns to one of seven basic expressions. A partial-layer unfreezing training strategy is also proposed; it makes the semantic features more suitable for expression recognition and reduces semantic redundancy. Average recognition accuracies of 93.81% and 88.78% are achieved on the public JAFFE and KDEF datasets, outperforming current deep learning and traditional methods. The results demonstrate that the proposed network fusing local semantics and global information describes expression information well.

18.
This paper proposes a simple and discriminative framework, using a graphical model and 3D geometry, to understand the diversity of urban scenes with varying viewpoints. Our algorithm constructs a conditional random field (CRF) network over over-segmented superpixels and learns an appearance model from different sets of features for the specific classes of interest. We also introduce a training algorithm that learns a model of the edge potential between superpixel areas based on their feature differences. The proposed algorithm gives competitive and visually pleasing results for urban scene segmentation; inference with the trained network improves class-labeling performance compared to using the appearance model alone.

19.
LIN Sen, SHANG Peng. Journal of Optoelectronics·Laser, 2024, 35(5): 536-543
To address the low recognition rates of three-dimensional (3D) palmprint recognition caused by noise interference and by ignoring adjacent depth information, a 3D palmprint recognition method fusing latent texture and surface consistency is proposed. First, an energy local edge binary code (ELEBC) extracts latent texture-orientation information from the energy map and suppresses noise. Then, the mean block pattern surface type (MBST) captures surface consistency. Finally, principal component analysis (PCA) reduces the feature dimensionality, and decision-level fusion yields the final recognition result. Experiments on the Hong Kong Polytechnic University 3D palmprint database show a correct recognition rate of up to 99.71%, an advantage over other recent algorithms, with classification time kept under 0.5 s. The method thus achieves good recognition while meeting real-time requirements, and has practical value.

20.
Multi-view 3D reconstruction relies on the texture features of the target surface and is prone to data holes in low-texture regions. Fusing the polarization information of light reflected from the target enables complete reconstruction under different illumination: polarization parameters yield the surface normals, from which the target's depth map is reconstructed. However, using polarization information alone suffers from azimuth ambiguity and zenith-angle bias, causing distorted or even missing depth results. For objects with low-texture regions…
