Found 20 similar documents (search time: 15 ms)
1.
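To address the difficulty of extracting discriminative emotional features in speech emotion recognition, a dual-channel feature-fusion speech representation method is proposed that combines a convolutional neural network with a vision transformer. One channel uses convolution modules built on an inverted bottleneck structure and adopts a transformer-like training strategy to extract local spectral features. The other channel uses an improved vision transformer to extract global sequence features; in it, a convolutional neural network extracts features from the whole spectrogram directly, replacing the patch-splitting step, so that temporal information is captured better. The extracted features are then fused to obtain strongly discriminative emotional features, which are fed to a Softmax classifier to produce the recognition result. In experiments on the EMO-DB and CASIA databases, the proposed model reached average accuracies of 94.24% and 93.05%, respectively, and outperformed the other models in comparative experiments, which demonstrates the effectiveness of the method.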
2.
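An experimental study of how negative facial expressions affect facial identity recognition  Total citations: 1, self-citations: 0, citations by others: 1
To verify the interference effect of negative facial expressions on facial identity recognition, two Garner-paradigm experiments were designed using pictures of Chinese facial expressions as materials. Experiment 1 replicated earlier studies with angry and happy expression pictures; Experiment 2 used angry and sad expression pictures. The results showed that angry and happy expressions did not affect facial identity recognition, whereas angry and sad expressions did. This indicates that negative expressions can affect facial identity recognition and supports the view that expression and identity are not processed independently. The result also addresses the difficulty earlier studies had in detecting an effect of expression on identity recognition.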
3.
4.
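Objective: To study an objective method for measuring user satisfaction based on facial expression recognition. Methods: Using two in-vehicle information systems as test platforms and facial expression recognition combined with a BP neural network as the technical approach, an experiment was designed in which users interacted with each of the two systems. A mapping model between users' facial expressions and their subjective satisfaction was built and used to predict user satisfaction. The model's predictions were compared with the users' subjective scale ratings to assess its predictive ability and verify the feasibility of the measurement method. Conclusion: The overall mean squared error between the model's predicted satisfaction and the users' subjective satisfaction was 0.165, so the model predicts accurately within a small error range. By recognizing users' facial expressions during human-computer interaction with a product, the model can measure user satisfaction with the product effectively and objectively.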
5.
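To address the communication barrier between deaf-mute people and hearing people, a method is proposed for converting sign language into Mandarin-Tibetan bilingual emotional speech by incorporating facial expressions. First, a deep belief network model extracts features from gesture images, and a deep neural network model extracts expression features from facial information. Next, support vector machines are trained separately on the gesture features and the facial expression features and used for classification; the recognized gesture and facial expression information yield the gesture text and the corresponding emotion label, respectively. In parallel, an emotional speech synthesis system based on hidden Markov models is built from a Mandarin emotional training corpus using speaker-adaptive training. Finally, the recognized gesture text and emotion label are used to convert the gestures and facial expressions into emotional speech in Mandarin or Tibetan. Objective evaluation shows a recognition rate of 92.8% for static gestures, and facial expression recognition rates of 94.6% and 80.3% on the Extended Cohn-Kanade database and the Japanese Female Facial Expression (JAFFE) database, respectively. Subjective evaluation shows that the converted emotional speech received an average subjective emotion rating of 4.0 points. When the Pleasure-Arousal-Dominance (PAD) three-dimensional emotion model was used to rate both the facial expressions and the synthesized emotional speech, the two sets of PAD values were highly similar, indicating that the synthesized emotional speech can convey the emotion of the facial expression.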
6.
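Rolling bearing fault recognition based on an improved one-dimensional convolutional neural network  Total citations: 1, self-citations: 0, citations by others: 1
Fault recognition for rolling bearings is important for preventing faults in rotating machinery from worsening and for keeping such systems running safely. To address the large parameter counts and low recognition efficiency of existing intelligent diagnosis models, a rolling bearing fault recognition method based on an improved one-dimensional convolutional neural network (FRICNN–1D) is proposed. A 1×1 convolution kernel is introduced to strengthen the nonlinear expressive power of the one-dimensional convolutional neural network, and a global average pooling layer replaces the fully connected layer of a conventional convolutional neural network (CNN), which reduces the number of model parameters and the amount of computation and helps prevent overfitting. Test results show that the method accurately recognizes the different fault states of rolling bearings and has potential for practical engineering use.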
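The two ingredients named in this abstract, a 1×1 convolution and global average pooling in place of a fully connected layer, can be illustrated with a minimal pure-Python sketch. It is not the authors' FRICNN–1D network: the layer arrangement (a 1×1 convolution that maps to one channel per class, followed by pooling), the function names, and the sizes in the usage note are assumptions made only for illustration.

```python
def pointwise_conv1d(x, weights, bias):
    """1x1 convolution on a 1D feature map.

    x is a list of channels, each a list of length L.
    weights[o][c] mixes input channel c into output channel o at every
    position independently, so no neighboring samples are involved.
    """
    length = len(x[0])
    out = []
    for w_row, b in zip(weights, bias):
        out.append([
            sum(w * x[c][t] for c, w in enumerate(w_row)) + b
            for t in range(length)
        ])
    return out


def global_average_pool(x):
    """Collapse each channel to its mean, giving one value per channel."""
    return [sum(channel) / len(channel) for channel in x]


def head_parameter_counts(channels, length, classes):
    """Compare a flatten + fully connected head with a 1x1-conv + GAP head.

    Assumed layout (hypothetical): the GAP head first maps `channels` to
    `classes` with a 1x1 convolution and then averages over the length axis.
    """
    fully_connected = channels * length * classes + classes
    conv_gap = channels * classes + classes
    return fully_connected, conv_gap
```

With an assumed final feature map of 64 channels by 128 samples and 10 fault classes, `head_parameter_counts(64, 128, 10)` gives 81,930 parameters for the fully connected head against 650 for the pooled head, which is the kind of reduction the abstract refers to.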
7.
Nianbin Wang, Ming He, Jianguo Sun, Hongbin Wang, Lianke Zhou, Ci Chu, Lei Chen 《Computers, Materials & Continua》2019,58(1):169-181
Underwater target recognition is a key technology for underwater acoustic countermeasures. How to classify and recognize underwater targets from their radiated noise has long been a hot topic in the field of underwater acoustic signals. In this paper, a deep learning model is applied to underwater target recognition. Improved anti-noise Power-Normalized Cepstral Coefficients (ia-PNCC) are proposed, based on PNCC applied to underwater noise. Multitaper and normalized Gammatone filter banks are applied to improve the anti-noise capacity. The method is combined with a convolutional neural network to recognize the underwater target. Experimental results show that the acoustic feature produced by ia-PNCC has lower noise and is well suited to underwater target recognition using a convolutional neural network. Compared with combining a convolutional neural network with a single acoustic feature, such as MFCC (Mel-scale Frequency Cepstral Coefficients) or LPCC (Linear Prediction Cepstral Coefficients), combining ia-PNCC with a convolutional neural network offers better accuracy for underwater target recognition.
8.
A lifelog is a digital record of an individual’s daily life. It collects, records, and archives a large amount of unstructured data; therefore, techniques are required to organize and summarize those data for easy retrieval. Lifelogging has been utilized for diverse applications including healthcare, self-tracking, and entertainment, among others. With regard to image-based lifelogging, even though most users prefer to present photos with facial expressions that allow us to infer their emotions, there have been few studies on lifelogging techniques that focus upon users’ emotions. In this paper, we develop a system that extracts users’ own photos from their smartphones and configures their lifelogs with a focus on their emotions. We design an emotion classifier based on convolutional neural networks (CNN) to predict the users’ emotions. To train the model, we create a new dataset by collecting facial images from the CelebFaces Attributes (CelebA) dataset and labeling their facial emotion expressions, and by integrating parts of the Radboud Faces Database (RaFD). Our dataset consists of 4,715 high-resolution images. We propose the Representative Emotional Data Extraction Scheme (REDES) to select representative photos based on inferring users’ emotions from their facial expressions. In addition, we develop a system that allows users to easily configure diaries for a special day and summarize their lifelogs. Our experimental results show that our method is able to effectively incorporate emotions into lifelogs, allowing an enriched experience.
9.
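To address the low fault diagnosis accuracy caused by differing feature distributions between source-domain and target-domain data for wind turbine bearing faults, a transfer diagnosis method for wind turbine bearing faults based on an improved residual neural network is proposed. The method sets the convolution and pooling kernels to sizes suited to convolving one-dimensional vibration signals, so that bearing fault features are extracted directly from the vibration signal. Batch normalization and instance normalization are used together in the one-dimensional residual network to further strengthen the model's feature extraction ability. During training, a new loss function is built from the multi-kernel maximum mean discrepancy between source-domain and target-domain data, improving the model's transfer learning and classification ability on datasets with different distributions. The method was validated on experimental data from faulty bearings. The results show that, even under the combined influence of variable bearing speed and noise in the fault vibration signals, the method still extracts the important features of bearing faults and achieves transfer diagnosis and accurate classification of bearing faults across operating conditions. This provides a reference for developing intelligent fault diagnosis of rotating machinery in complex environments.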
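The multi-kernel maximum mean discrepancy (MK-MMD) term mentioned in this abstract can be sketched as follows. This is a generic, biased MMD² estimate with an equally weighted mixture of Gaussian kernels; the bandwidths, the equal weighting, the `trade_off` factor and all function names are assumptions, since the abstract does not say how the authors combine the kernels or weight the loss.

```python
import math


def gaussian_kernel(a, b, bandwidth):
    """Gaussian (RBF) kernel between two feature vectors."""
    squared_distance = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-squared_distance / (2.0 * bandwidth ** 2))


def multi_kernel(a, b, bandwidths):
    """Equally weighted mixture of Gaussian kernels (assumed weighting)."""
    return sum(gaussian_kernel(a, b, s) for s in bandwidths) / len(bandwidths)


def mk_mmd_squared(source, target, bandwidths=(0.5, 1.0, 2.0)):
    """Biased estimate of the squared MMD between two sets of feature vectors."""
    def mean_kernel(xs, ys):
        total = sum(multi_kernel(x, y, bandwidths) for x in xs for y in ys)
        return total / (len(xs) * len(ys))

    return (mean_kernel(source, source)
            + mean_kernel(target, target)
            - 2.0 * mean_kernel(source, target))


def total_loss(classification_loss, source_features, target_features, trade_off=1.0):
    """Classification loss plus a domain-discrepancy penalty (hypothetical weighting)."""
    return classification_loss + trade_off * mk_mmd_squared(source_features, target_features)
```

The estimate is zero when the two feature sets coincide and grows as the source and target features drift apart, so minimizing `total_loss` pushes the network toward features that look alike in both domains.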
10.
11.
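To address the problem that 3D-CNNs extract spatiotemporal features from video well but demand heavy computation and memory, this paper designs an efficient 3D convolution block to replace the computationally expensive 3×3×3 convolution layer, and on that basis proposes a dense residual network with fused 3D convolution blocks (3D-EDRNs) for human action recognition. The efficient 3D convolution block combines a 1×3×3 convolution layer, which captures the spatial features of the video, with a 3×1×1 convolution layer, which captures its temporal features. The efficient 3...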
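The saving from splitting a 3×3×3 convolution into a 1×3×3 spatial layer followed by a 3×1×1 temporal layer can be checked with simple weight counting. The sketch below counts weights only and ignores biases; the channel widths, including the intermediate width `mid_ch`, are hypothetical because the abstract does not give them.

```python
def conv3d_params(in_ch, out_ch, kt, kh, kw):
    """Number of weights in a 3D convolution with a kt x kh x kw kernel (no bias)."""
    return in_ch * out_ch * kt * kh * kw


def full_block_params(in_ch, out_ch):
    """A single 3x3x3 convolution layer."""
    return conv3d_params(in_ch, out_ch, 3, 3, 3)


def factorized_block_params(in_ch, mid_ch, out_ch):
    """A 1x3x3 spatial convolution followed by a 3x1x1 temporal convolution."""
    spatial = conv3d_params(in_ch, mid_ch, 1, 3, 3)
    temporal = conv3d_params(mid_ch, out_ch, 3, 1, 1)
    return spatial + temporal
```

For an assumed width of 64 channels throughout, the full layer has 110,592 weights and the factorized pair has 49,152, a ratio of 12/27, or roughly 44%.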
12.
Fawaz Waselallah Alsaade, Theyazn H. H. Aldhyani, Mosleh Hmoud Al-Adhaileh 《Computers, Materials & Continua》2021,68(1):805-819
The COVID-19 pandemic poses an additional serious public health threat due to little or no pre-existing human immunity, and developing a system to identify COVID-19 in its early stages will save millions of lives. This study applied support vector machine (SVM), k-nearest neighbor (K-NN) and deep learning convolutional neural network (CNN) algorithms to classify and detect COVID-19 using chest X-ray radiographs. To test the proposed system, chest X-ray radiographs and CT images were collected from different standard databases, which contained 95 normal images, 140 COVID-19 images and 10 SARS images. Two scenarios were considered to develop a system for predicting COVID-19. In the first scenario, the Gaussian filter was applied to remove noise from the chest X-ray radiograph images, and then the adaptive region growing technique was used to segment the region of interest from the chest X-ray radiographs. After segmentation, a hybrid feature extraction composed of 2D-DWT and gray level co-occurrence matrix was utilized to extract the features significant for detecting COVID-19. These features were processed using SVM and K-NN. In the second scenario, a CNN transfer model (ResNet 50) was used to detect COVID-19. The system was examined and evaluated through multiclass statistical analysis, and the empirical results of the analysis found significant values of 97.14%, 99.34%, 99.26%, 99.26% and 99.40% for accuracy, specificity, sensitivity, recall and AUC, respectively. Thus, the CNN model showed significant success; it achieved optimal accuracy, effectiveness and robustness for detecting COVID-19.
13.
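To address the single feature extraction approach and low classification accuracy of traditional bird sound recognition algorithms, a bird sound recognition method combining a convolutional neural network with a Transformer network is proposed. The method takes into account both local feature learning and the modeling of global contextual dependencies. Short Time Fourier Transform (STFT) spectrogram features are extracted from the raw bird sound audio signal and fed to a Convolutional Neural Network (CNN) to extract local spectral feature information. In parallel, the log-Mel features of the bird sound signal and their first-order and second-order differences are extracted and combined into a Mel Frequency Cepstrum Coefficient (MFCC) hybrid feature vector, which is fed to a Transformer network to obtain global sequence feature information. Finally, the extracted features are fused to obtain richer bird sound feature parameters, and a Softmax classifier produces the recognition result. In experiments on the Birdsdata and xeno-canto bird sound datasets, the average recognition accuracy reached 97.81% and 89.47%, respectively. The experimental results show that the method achieves higher recognition accuracy than other existing bird sound recognition models.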
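The first-order and second-order differences mentioned in this abstract are commonly computed with a regression formula over neighboring frames. The sketch below uses that common formula with edge padding; the window width of 2 and the function names are assumptions, and the abstract does not state which delta formula the authors used.

```python
def delta(frames, width=2):
    """Regression-based delta features.

    frames is a list of per-frame feature vectors. For each frame t and
    coefficient d this computes
        sum_{n=1..width} n * (c[t+n] - c[t-n]) / (2 * sum_{n=1..width} n^2),
    repeating the first and last frames at the edges.
    """
    denominator = 2 * sum(n * n for n in range(1, width + 1))
    last = len(frames) - 1
    out = []
    for t in range(len(frames)):
        row = []
        for d in range(len(frames[0])):
            acc = 0.0
            for n in range(1, width + 1):
                ahead = frames[min(t + n, last)][d]
                behind = frames[max(t - n, 0)][d]
                acc += n * (ahead - behind)
            row.append(acc / denominator)
        out.append(row)
    return out


def stack_with_deltas(frames, width=2):
    """Concatenate static features with their first- and second-order deltas."""
    first = delta(frames, width)
    second = delta(first, width)
    return [static + d1 + d2 for static, d1, d2 in zip(frames, first, second)]
```

Each output frame is three times as long as the input frame, which is how a static feature vector and its two differences end up in one hybrid vector.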
14.
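To address noise suppression in seismic exploration, a convolutional neural network model suited to classifying and recognizing seismic wavelets is constructed. First, the activation function, convolution kernel size, normalization layers and other components of the convolutional neural network model are designed. Next, the constructed network extracts features from the time-frequency spectrograms of the seismic signals. Finally, different types of noisy seismic signals are classified and recognized. Experimental results show that the model achieves high classification and recognition rates and good resistance to interference,...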
15.
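Compared with other biometric recognition technologies such as iris, fingerprint and gait recognition, face recognition has distinctive advantages: it is natural, convenient and user-friendly. It has therefore attracted wide attention from both academia and industry. In recent years, driven by deep learning, face recognition has made breakthrough progress and remains fairly robust to external interference such as expression, pose, illumination and occlusion. In particular, deep-learning-based face recognition is now widely applied in security, finance, education, transportation, new retail and other fields. As face recognition becomes increasingly mainstream, there is an urgent need for survey-style, accessible literature that summarizes its basic principles and methods. To that end, this paper first briefly reviews the development of face recognition, then introduces deep-learning-based face recognition from five aspects: face preprocessing, deep feature learning, feature matching, face datasets and evaluation criteria. Finally, it points out future trends in face recognition technology.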
16.
To generate realistic three-dimensional animation of a virtual character, capturing real facial expressions is the primary task. Due to diverse facial expressions and complex backgrounds, facial landmarks recognized by existing strategies suffer from deviations and low accuracy. Therefore, a method for facial expression capture based on a two-stage neural network is proposed in this paper, which takes advantage of improved multi-task cascaded convolutional networks (MTCNN) and a high-resolution network. Firstly, the convolution operation of the traditional MTCNN is improved. The face information in the input image is quickly filtered by feature fusion in the first stage, and Octave Convolution is introduced in place of the original convolution in the second stage to enhance the feature extraction ability of the network, which further rejects a large number of false candidates. The model outputs more accurate facial candidate windows for better landmark recognition and locates the faces. Then the images cropped after face detection are input into the high-resolution network. Multi-scale feature fusion is realized by parallel connection of multi-resolution streams, and rich high-resolution heatmaps of facial landmarks are obtained. Finally, the changes of the recognized facial landmarks are tracked in real time. The expression parameters are extracted and transmitted to the Unity3D engine to drive the virtual character's face, which realizes synchronous facial expression animation. Extensive experimental results obtained on the WFLW database demonstrate the superiority of the proposed method in terms of accuracy and robustness, especially for diverse expressions and complex backgrounds. The method can accurately capture facial expressions and generate three-dimensional animation effects, making online entertainment and social interaction more immersive in shared virtual space.
17.
18.
Siying Li; Pan Xia; Yonggang Zou; Lidong Du; Zhenfeng Li; Peng Wang; Xianxiang Chen; Yundai Chen; Yajun Shi; Zhen Fang 《International Journal of Imaging Systems and Technology》2024,34(2):e23047
Dementia-associated disorders cause damage to the brains of patients and bring huge burdens to individuals and families. Electroencephalogram (EEG) monitoring is friendly to patients because it is low-cost, non-invasive, and objective. Event-related potential (ERP) is a component of EEG that has huge potential for evaluating the cognitive function of the brain. In this study, we recorded the ERP from patients with dementia and healthy people, then proposed an ERP-based deep learning method to realize dementia recognition via the model Dementia-Unet (D-Unet). To improve the performance of the model, on the basis of the decoder and the primary classifier, we added new structures including a symmetric decoder and two auxiliary outputs. One of the auxiliary outputs was input reconstruction, and the other was designed to perform the same task as the primary classifier. The results of four-fold cross-validation demonstrated that the two auxiliary outputs improved the performance of the model effectively. When compared with some other machine learning methods and deep learning methods, our model obtained the best performance with an accuracy of 0.815, a precision of 0.829, a recall of 0.797, and an f1-score of 0.812. In addition, we used a complex training strategy with all outputs involved but a simple testing strategy with only the primary classifier working, to keep high performance while cutting down the complexity burden during testing.
19.
In semiconductor manufacturing, wafer testing is performed to ensure the performance of each product after wafer fabrication. The wafer map is used to visualize the color-coded wafer test results based on their locations. The defects on the wafer map may be randomly distributed or form clustered patterns. The various clustered defect patterns are usually caused by assignable faults. Identifying the patterns is thus important because it provides valuable hints for root-cause diagnosis. Solving the problems helps improve the manufacturing processes and reduce costs. In this study, we present a novel convolutional neural network (CNN)–based method to automatically recognize the defect pattern on wafer maps. Our method uses polar mapping before the training of the CNN to transform the circular wafer map into a matrix, which can be processed within the CNN architecture. This procedure also reduces the input size and handles variations in wafer sizes and die sizes. To eliminate the effects of rotation, we apply data augmentation in the training of the CNN. Experiments using a real-world dataset demonstrate the effectiveness and superiority of our method.
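The polar-mapping step described in this abstract can be sketched as resampling the circular wafer map on a fixed grid of radii and angles, so that every wafer yields a matrix of the same shape regardless of its die count. The sketch assumes a square input grid, nearest-neighbor sampling, and bin counts of 8 by 16; none of these details come from the abstract, and the function name is hypothetical.

```python
import math


def polar_map(wafer, radial_bins=8, angular_bins=16, fill=0):
    """Resample a square wafer map onto a radial_bins x angular_bins matrix.

    Row r holds samples taken on a circle whose radius grows with r;
    column a holds samples taken at angle 2*pi*a/angular_bins.
    Samples that fall outside the grid are set to `fill`.
    """
    size = len(wafer)
    center = (size - 1) / 2.0
    max_radius = size / 2.0
    out = []
    for r in range(radial_bins):
        radius = (r + 0.5) / radial_bins * max_radius
        row = []
        for a in range(angular_bins):
            theta = 2.0 * math.pi * a / angular_bins
            y = int(round(center + radius * math.sin(theta)))
            x = int(round(center + radius * math.cos(theta)))
            inside = 0 <= y < size and 0 <= x < size
            row.append(wafer[y][x] if inside else fill)
        out.append(row)
    return out
```

In this representation a defect cluster at the wafer center fills the first rows, while an edge ring would fill the last rows, and a rotation of the wafer becomes a shift along the column axis.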
20.
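Objective: To solve the problems of the MICROⅡ control system of the GDX1 packaging machine, including high maintenance cost, a high failure rate, poor system openness, difficult data acquisition, and constraints on the integration of industrialization and informatization, an IPC control system for the GDX1 packaging machine was designed. Methods: The GDX1 packaging machine control system was retrofitted with an IPC-based distributed electrical control system to achieve IPC control that combines distributed acquisition, centralized management and control, and data exchange. Results: After the retrofit, the control system uses an EtherCAT bus and high-speed processing modules to achieve 100M high-speed data transmission; the multi-core, multi-thread processing of TWINCAT3 increased the data acquisition rate by 715%; and PID-based temperature control of the heat sealer keeps the real-time temperature deviation within -3 to 3 °C. Conclusion: The improved IPC control system accurately acquires detection signals, safety signals and other signals point to point, reduces the probability of fault shutdowns caused by wiring contacts, improves the stability of equipment operation and the effective operating rate of the equipment, and ensures the accuracy and stability of the workshop's underlying informatization data. It completes the integration of industrialization and informatization and makes the equipment intelligent. The system and technology can be extended to all packaging equipment in the industry.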
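The PID temperature control mentioned in the results can be illustrated with a small simulation. This is a generic textbook PID loop with output clamping, not the controller running on the GDX1 machine: the gains, the 150 °C setpoint, and the first-order heater model (`heater_gain`, `tau`, `ambient`) are all hypothetical values chosen only so that the example runs.

```python
class PID:
    """Discrete PID controller with output clamping and a simple anti-windup rule."""

    def __init__(self, kp, ki, kd, out_min=0.0, out_max=1.0):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.out_min, self.out_max = out_min, out_max
        self.integral = 0.0
        self.prev_error = None

    def update(self, setpoint, measured, dt):
        error = setpoint - measured
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        integral = self.integral + error * dt
        raw = self.kp * error + self.ki * integral + self.kd * derivative
        output = min(self.out_max, max(self.out_min, raw))
        # Anti-windup: only keep the new integral when the output is not clamped.
        if output == raw:
            self.integral = integral
        self.prev_error = error
        return output


def simulate_sealer(setpoint=150.0, ambient=25.0, steps=3000, dt=0.1):
    """Run the PID loop against a hypothetical first-order heater model."""
    heater_gain = 200.0  # assumed temperature rise in deg C at full power
    tau = 20.0           # assumed thermal time constant in seconds
    pid = PID(kp=0.05, ki=0.01, kd=0.02)
    temperature = ambient
    for _ in range(steps):
        power = pid.update(setpoint, temperature, dt)
        temperature += dt * (heater_gain * power - (temperature - ambient)) / tau
    return temperature
```

Calling `simulate_sealer()` runs 300 simulated seconds and returns the final temperature, which can be compared with the setpoint to see whether the loop stays inside a band such as the ±3 °C quoted above.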