Similar Articles
20 similar articles found; search took 15 ms.
1.
Ling Hefei, Wu Jiyang, Huang Junrui, Chen Jiazhong, Li Ping. Multimedia Tools and Applications, 2020, 79(9-10): 5595-5616
Multimedia Tools and Applications - Discriminative feature embedding is of essential importance in the field of large scale face recognition. In this paper, we propose an attention-based...

2.
Tao Pan, Fu Zhongliang, Zhu Kai, Wang Lili. Journal of Computer Applications, 2017, 37(5): 1434-1438
A method is proposed for automatically recognizing standard echocardiographic views with a deep convolutional neural network, and the model's effectiveness is analyzed through visualization. Since fully connected layers hold most of a network's parameters, a spatial pyramid mean-pooling layer is introduced to replace them, capturing more spatial structure while greatly reducing the parameter count and the risk of overfitting; class saliency maps bring an attention-like mechanism into the visualization process. The robustness and effectiveness of the deep CNN model are explained through the case study of standard echocardiographic view recognition. Visualization experiments on echocardiograms show that the evidence behind the improved model's recognition decisions matches the cues physicians use to classify standard views, demonstrating the method's effectiveness and practicality.
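The spatial pyramid mean pooling described in this abstract can be sketched as follows. This is an illustrative NumPy implementation, not the authors' code; the pyramid levels and feature-map shape are assumptions chosen to show that the output length is fixed regardless of input spatial size.

```python
import numpy as np

def spatial_pyramid_mean_pool(feature_map, levels=(1, 2, 4)):
    """Pool a C x H x W feature map into a fixed-length vector.

    For each pyramid level n, the map is divided into an n x n grid
    and each cell is mean-pooled, so the output length is
    C * sum(n*n for n in levels) regardless of H and W -- which is
    why such a layer can replace a fully connected layer.
    """
    c, h, w = feature_map.shape
    pooled = []
    for n in levels:
        # Integer cell boundaries cover the whole map even when n does
        # not divide h or w evenly.
        ys = np.linspace(0, h, n + 1, dtype=int)
        xs = np.linspace(0, w, n + 1, dtype=int)
        for i in range(n):
            for j in range(n):
                cell = feature_map[:, ys[i]:ys[i + 1], xs[j]:xs[j + 1]]
                pooled.append(cell.mean(axis=(1, 2)))
    return np.concatenate(pooled)

fmap = np.random.rand(8, 13, 17)   # arbitrary spatial size
vec = spatial_pyramid_mean_pool(fmap)
print(vec.shape)                   # 8 channels * (1 + 4 + 16) cells = (168,)
```

Because the vector length depends only on the channel count and the pyramid levels, the classifier head no longer ties the model to one input resolution.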

3.
Multimedia Tools and Applications - In recent years, several technologies have been utilized to bridge the communication gap between persons who have hearing or speaking impairments and those who...

4.
In this paper, we propose a sensitive convolutional neural network which incorporates a sensitivity term in the cost function of a Convolutional Neural Network (CNN) to emphasize the slight variations and high-frequency components in highly blurred input image samples. The proposed cost function has a sensitivity part in which the conventional error is divided by the derivative of the activation function, and the total error is then minimized by gradient descent during learning. Due to the proposed sensitivity term, data samples at the decision boundaries fall more often on the middle band, or high-gradient part, of the activation function. This highlights the slight changes in highly blurred input images, enabling better feature extraction and resulting in better generalization and improved classification performance on such images. To study the effect of the proposed sensitivity term, experiments were performed on a face recognition task with a small dataset of facial images captured at different long standoffs in both night-time and day-time modalities.
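A toy sketch of the sensitivity mechanism this abstract describes: the conventional error divided by the activation's derivative. The sigmoid activation, the epsilon stabilizer, and the example values are my assumptions for illustration, not details taken from the paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sensitive_error(y_pred, y_true, z, eps=1e-3):
    """Conventional error scaled by 1 / f'(z).

    Units whose pre-activation z lies on a flat (saturated) part of
    the sigmoid have a small derivative, so their error is amplified
    relative to the plain error -- the stated mechanism for
    emphasizing slight variations in blurred inputs. eps keeps the
    division numerically stable.
    """
    deriv = sigmoid(z) * (1.0 - sigmoid(z))
    return (y_pred - y_true) / (deriv + eps)

z = np.array([0.0, 4.0])                   # mid-band unit vs saturated unit
err = sensitive_error(sigmoid(z), np.array([1.0, 1.0]), z)
print(err)   # the saturated unit's error is amplified far more
```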

5.
To overcome interference from illumination, pose, color, and other noise in face recognition, an improved CNN architecture is proposed that combines the strengths of convolutional and Siamese neural networks. The structure consists of two convolutional neural networks with shared weights and is trained with the discriminative deep metric learning (DDML) algorithm. The convolutional structure effectively removes external noise, the weight-sharing structure automatically extracts common features during nonlinear dimensionality reduction, and DDML increases the discriminability of the extracted features. Experiments on the ORL, YaleB, and AR face databases show that, compared with PCA, CNN, and other algorithms, the method is more stable and improves the recognition rate by nearly 5 percentage points.

6.
To address modulation recognition under non-cooperative communication conditions, a new automatic modulation recognition method based on deep neural networks is proposed. The received signal is preprocessed to generate a constellation diagram, whose shape is fed to a deep convolutional neural network; the trained model then classifies the modulated signal. Compared with earlier methods, this approach lets the CNN automatically learn the constellation features of various digital modulations, avoiding the drawbacks of difficult feature extraction, poor generality, and weak noise robustness; the processing pipeline is simple and insensitive to constellation deformation. Simulation experiments on three typical digital modulations, 4QAM, 16QAM, and 64QAM, show recognition accuracy above 95% when the signal-to-noise ratio exceeds 4, confirming the method's effectiveness.
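The preprocessing step this abstract relies on, turning received symbols into a constellation image for the CNN, can be sketched with a 2-D histogram. The 16QAM symbol generation, noise level, image size, and normalization are illustrative assumptions, not the paper's exact pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)

def constellation_image(symbols, bins=32, extent=1.5):
    """Rasterize complex baseband symbols into a bins x bins image.

    The received symbols are binned on the I/Q plane; the resulting
    2-D histogram captures the constellation's shape and can serve
    as a grayscale CNN input.
    """
    edges = np.linspace(-extent, extent, bins + 1)
    img, _, _ = np.histogram2d(symbols.real, symbols.imag, bins=(edges, edges))
    return img / img.max()   # normalize to [0, 1]

# Hypothetical 16QAM symbols with additive Gaussian noise.
levels = np.array([-3.0, -1.0, 1.0, 3.0]) / 3.0
i = rng.choice(levels, 1000)
q = rng.choice(levels, 1000)
noise = 0.03 * (rng.standard_normal(1000) + 1j * rng.standard_normal(1000))
rx = (i + 1j * q) + noise
img = constellation_image(rx)
print(img.shape)   # (32, 32)
```

Feeding the image rather than raw samples is what makes the classifier insensitive to symbol ordering and, to a degree, to constellation deformation.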

7.
To address the limitations of existing human action recognition methods, such as requiring fixed-length video inputs and underusing spatiotemporal information, a deep neural network combining a spatiotemporal pyramid with an attention mechanism is proposed: a 3D-CNN containing a spatiotemporal pyramid is combined with an LSTM augmented with spatiotemporal attention, enabling multi-scale processing of video segments and full use of the complex spatiotemporal structure of actions. RGB images and optical flow fields serve as the spatial and temporal inputs, while motion and appearance features fused after pyramid pooling serve as the fusion-domain input; a decision-fusion strategy produces the final recognition result. The model achieves 94.2% and 70.5% accuracy on the UCF101 and HMDB51 datasets, respectively, showing that the improved network attains high accuracy on video-based human action recognition.

8.
To address the complex feature extraction and low recognition rates of existing action recognition algorithms, a network structure combining batch normalization with the GoogLeNet model is proposed, carrying the batch-normalization idea from image classification over to action recognition as a training improvement: the network inputs of the video training samples are normalized per mini-batch. RGB images feed the spatial network and optical flow fields feed the temporal network; the two streams are then fused to obtain the final recognition result. The method achieves 93.50% and 68.32% accuracy on the UCF101 and HMDB51 datasets, respectively, showing that the improved architecture attains high recognition accuracy on video-based human action recognition.
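The mini-batch normalization transform this abstract applies can be sketched in a few lines. This is the standard batch-normalization formula in NumPy with scalar scale and shift; the batch shape and distribution are illustrative assumptions.

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize a mini-batch per feature, then scale and shift.

    x has shape (batch, features). Each feature column is centered
    and scaled to unit variance over the mini-batch, then an
    affine transform (gamma, beta) restores representational power.
    """
    mu = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mu) / np.sqrt(var + eps)
    return gamma * x_hat + beta

batch = np.random.default_rng(1).normal(5.0, 3.0, size=(64, 10))
out = batch_norm(batch)
print(out.mean(), out.std())   # approximately 0 and 1
```

In a real network this runs inside the training graph and running statistics are kept for inference; the sketch shows only the per-batch transform.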

9.
10.

Deep learning models have attained great success for an extensive range of computer vision applications including image and video classification. However, the complex architecture of the most recently developed networks imposes certain memory and computational resource limitations, especially for human action recognition applications. Unsupervised deep convolutional neural networks such as PCANet can alleviate these limitations and hence significantly reduce the computational complexity of the whole recognition system. In this work, instead of using a 3D convolutional neural network architecture to learn temporal features of video actions, the unsupervised convolutional PCANet model is extended into PCANet-TOP, which effectively learns spatiotemporal features from Three Orthogonal Planes (TOP). For each video sequence, spatial frames (XY) and temporal planes (XT and YT) are utilized to train three different PCANet models. The learned features are then fused, after reducing their dimensionality with whitening PCA, to obtain a spatiotemporal feature representation of the action video. Finally, a Support Vector Machine (SVM) classifier is applied for action classification. The proposed method is evaluated on four well-known benchmark datasets, namely the Weizmann, KTH, UCF Sports, and YouTube action datasets. The recognition results show that the proposed PCANet-TOP provides discriminative and complementary features using three orthogonal planes and achieves promising results comparable with state-of-the-art methods.
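The Three Orthogonal Planes decomposition described above can be sketched with plain array transposes. The video shape is an illustrative assumption; each view would feed one of the three PCANet models.

```python
import numpy as np

def orthogonal_planes(video):
    """Extract the three orthogonal-plane views of a T x H x W volume.

    XY planes are the ordinary frames; XT and YT slice the volume at
    a fixed row / column respectively, exposing temporal texture.
    """
    xy = video                       # T frames of shape (H, W)
    xt = video.transpose(1, 0, 2)    # H slices of shape (T, W)
    yt = video.transpose(2, 0, 1)    # W slices of shape (T, H)
    return xy, xt, yt

video = np.zeros((16, 24, 32))       # T=16, H=24, W=32
xy, xt, yt = orthogonal_planes(video)
print(xy.shape, xt.shape, yt.shape)  # (16, 24, 32) (24, 16, 32) (32, 16, 24)
```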


11.
Yu Rujie, Yang Zhen, Xiong Huilin. Journal of Computer Applications, 2017, 37(6): 1702-1707
For the concrete task of detecting and recognizing aircraft in large satellite images of military airfields, a real-time target detection and recognition framework is built that applies deep convolutional neural networks to this task. First, detection is cast as a regression problem over spatially independent bounding boxes, and a 24-layer convolutional network predicts the boxes; an image classification network then classifies the detected patches. Traditional detection and recognition algorithms struggle with time efficiency on large images, whereas the CNN-based approach exploits the computing hardware and greatly shortens processing time. Tested on a self-collected dataset matching the application scenario, detection runs at an average of 5.765 s per image, reaching 79.2% precision at a 65.1% recall operating point; the classification network runs at an average of 0.972 s per image with a 13% top-1 error rate. The framework offers a new approach to aircraft detection and recognition in large satellite images of military airfields while preserving both real-time performance and accuracy.

12.
A vehicle model recognition method based on deep convolutional neural networks
To address the heavy computation and complex feature extraction of existing vehicle model recognition methods, a recognition method based on a deep convolutional neural network is proposed. Drawing on deep learning, the classic convolutional neural network is improved into a deep network composed of multiple convolutional and subsampling layers. Classification results on five vehicle models show that the method clearly outperforms traditional approaches in recognition rate. The experiments also examine how the number of layers, the convolution kernel size, and the feature dimensionality affect the network's performance and recognition rate.

13.
Objective: To tackle the difficulty of designing deep convolutional network structures for SAR (synthetic aperture radar) target recognition, a deep CNN structure suited to SAR target recognition is designed based on an analysis of how convolution kernel width affects classification performance. Method: First, the influence of kernel width on SAR image target classification is analyzed using two-dimensional random convolution features and an extreme learning machine (a neural network with a single hidden layer). Based on this analysis, the convolutional layers for spatial feature extraction use multiple kernels of different widths to extract multi-scale local features of the target, yielding a deep model structure suited to SAR image target recognition. Finally, after augmenting the training samples of the MSTAR (moving and stationary target acquisition and recognition) dataset, the training hyperparameters are set and the model is trained and its classification performance validated. Results: Experiments show that for SAR images with strong speckle noise, wider kernels extract local target features better. Because the proposed model extracts multi-scale local features from the input image, its classification results on 10 target classes (covering both standard and variant configurations) approach or exceed the best published results, with overall accuracies of 98.39% and 97.69%, validating the proposed structure. Conclusion: Because SAR imaging differs from visible-light imaging, larger convolution kernels should be used to extract spatial features for classification, and optimizing the deep model's design improves SAR image target recognition accuracy.

14.
Multimedia Tools and Applications - This paper addresses the demand for an intelligent and rapid classification system of skin cancer using contemporary highly-efficient deep convolutional neural...

15.
Multi-font Chinese character recognition with convolutional neural networks
Objective: Multi-font Chinese character recognition has broad application prospects in automatic Chinese text processing and intelligent input, and is an important topic in pattern recognition. With the recent emergence of deep learning, CNN-based Chinese character recognition has made breakthrough progress in both methods and performance. Existing methods, however, require large numbers of samples, long training times, and difficult hyperparameter tuning, making it hard to reach optimal results on large character sets. Method: For unoccluded printed and handwritten character images, an end-to-end deep convolutional network is proposed. Excluding auxiliary layers, the network consists of 3 convolutional layers, 2 pooling layers, 1 fully connected layer, and a softmax regression layer. To address sample scarcity, a data augmentation scheme combining ripple distortion, translation, rotation, and scaling is proposed. To ease the difficulty of tuning deep network parameters and shorten training, batch normalization of the samples and fine-tuning with a combination of optimization methods are adopted. Results: The deep model recognizes the 3,755 level-1 characters of the national standard set with a final accuracy of 98.336%. Comparison experiments verify each technique's contribution to the final model: data augmentation, the mixed optimization scheme, and batch normalization improve test-set accuracy by 8.0%, 0.3%, and 1.4%, respectively. Conclusion: Compared with published methods that combine handcrafted features with convolutional networks, the approach removes the manual feature-engineering workload; compared with classic CNNs, it extracts stronger features, achieves a higher recognition rate, and trains faster.
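Two of the augmentations this abstract names, translation and ripple distortion, can be sketched on a synthetic glyph. These are simplified stand-ins written for illustration (the paper's exact warping parameters are not given here); rotation and scaling would be added analogously.

```python
import numpy as np

def translate(img, dy, dx, fill=0):
    """Shift a glyph image by (dy, dx) pixels, padding exposed pixels."""
    out = np.full_like(img, fill)
    h, w = img.shape
    ys, xs = slice(max(dy, 0), h + min(dy, 0)), slice(max(dx, 0), w + min(dx, 0))
    yd, xd = slice(max(-dy, 0), h + min(-dy, 0)), slice(max(-dx, 0), w + min(-dx, 0))
    out[ys, xs] = img[yd, xd]
    return out

def ripple(img, amp=1.5, period=8.0):
    """Sine-wave row displacement: a simple stand-in for ripple distortion."""
    out = np.zeros_like(img)
    for y in range(img.shape[0]):
        shift = int(round(amp * np.sin(2 * np.pi * y / period)))
        out[y] = np.roll(img[y], shift)
    return out

glyph = np.zeros((32, 32))
glyph[8:24, 8:24] = 1.0                      # synthetic square "character"
augmented = [translate(glyph, 2, -3), ripple(glyph)]
print([a.shape for a in augmented])          # both stay (32, 32)
```

Each transform produces a new training sample with the same label, which is how the augmentation multiplies the effective sample count.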

16.
Genre is an abstract feature, but it is still considered one of the important characteristics of music, and genre recognition forms an essential component of a large number of commercial music applications. Most existing music genre recognition algorithms are based on manual feature extraction techniques, with the extracted features used to build a classifier that identifies the genre. However, in many cases a set of features giving excellent accuracy fails to explain the underlying characteristics of music genres, and features that perform satisfactorily on one dataset often fail to do so on others. Hence, each dataset mostly requires manual selection of appropriate acoustic features to achieve adequate performance. In this paper, we propose a genre recognition algorithm that uses almost no handcrafted features. The convolutional recurrent neural network-based model proposed in this study is trained on mel-spectrograms extracted from 3-s audio clips taken from the GTZAN dataset. The proposed model achieves an accuracy of 85.36% on 10-class genre classification. The same model has been trained and tested on 10 genres of the MagnaTagATune dataset, which has 18,476 clips of 29-s duration, yielding an accuracy of 86.06%. The experimental results suggest that the proposed architecture with mel-spectrogram input is capable of consistent performance across different datasets.

17.
He Xueying, Han Zhongyi, Wei Benzheng. Journal of Computer Applications, 2018, 38(11): 3236-3240
Skin disease classification currently faces two main problems. First, with many disease categories, high inter-class similarity of lesion appearance, and large intra-class variation, especially for pigmented skin diseases, recognition is difficult. Second, existing recognition models have design limitations, and accuracy can still be improved. Using VGG19 as the base architecture, a structured deep convolutional neural network (CNN) was therefore trained to classify pigmented skin diseases automatically. First, data augmentation (cropping, flipping, mirroring) preprocesses the data; then a model pretrained on ImageNet is transferred to the augmented dataset for fine-tuning. During training, weights on the softmax loss increase the cost of misclassifying minority classes, mitigating the class imbalance in the dataset and raising the recognition rate. Experiments with the PyTorch deep learning framework on the ISIC2017 dataset show a recognition rate and sensitivity of 71.34% and 70.01%, respectively, improvements of 2.84 and 11.68 percentage points over the unweighted loss, indicating an effective approach to skin disease classification.
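The class-weighted softmax loss this abstract uses against imbalance can be sketched as follows. This is the generic weighted cross-entropy formula in NumPy, not the paper's code; the logits, labels, and weight values are illustrative assumptions.

```python
import numpy as np

def weighted_cross_entropy(logits, labels, class_weights):
    """Softmax cross-entropy with per-class weights.

    Giving minority classes a larger weight increases the loss
    incurred by their misclassifications, pushing the model to pay
    more attention to rare classes during training.
    """
    z = logits - logits.max(axis=1, keepdims=True)      # numerical stability
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    per_sample = -log_probs[np.arange(len(labels)), labels]
    w = class_weights[labels]
    return (w * per_sample).sum() / w.sum()

logits = np.array([[2.0, 0.5, 0.1],
                   [0.2, 1.5, 0.3]])
labels = np.array([0, 2])                   # second sample: minority class, misclassified
weights = np.array([1.0, 1.0, 3.0])         # up-weight the rare class 2
print(weighted_cross_entropy(logits, labels, weights))
```

With the minority sample misclassified, the weighted loss exceeds the uniform-weight loss, which is exactly the gradient pressure the weighting is meant to create.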

18.
A parallel convolutional neural network algorithm for yak face recognition based on transfer learning
Chen Zhengtao, Huang Can, Yang Bo, Zhao Li, Liao Yong. Journal of Computer Applications, 2021, 41(5): 1332-1336
Precise management in yak farming requires identifying individual yaks, and yak face recognition is a feasible means of doing so. Existing neural-network-based yak face recognition algorithms, however, suffer from the large number of features in yak face datasets and long network training times. Borrowing from transfer learning and combining the Visual Geometry Group network (VGG) with a convolutional neural network (CNN), a Parallel-CNN algorithm is therefore proposed to recognize yak facial information. First, a pretrained VGG16 network performs transfer learning on the yak face image data and extracts initial facial features; then features from different layers are reshaped and fed into the Parallel-CNN for a second round of feature extraction; finally, two separate fully connected layers classify the yak face images. Experiments show that Parallel-CNN can recognize yak faces across different angles, illumination, and poses, reaching 91.2% accuracy on a test set of 90,000 face images of 300 yaks. The algorithm can identify yaks precisely, helping yak farms achieve intelligent management.

19.
Cao Jiuwen, Cao Min, Wang Jianzhong, Yin Chun, Wang Danping, Vidal Pierre-Paul. Multimedia Tools and Applications, 2019, 78(20): 29021-29041
Multimedia Tools and Applications - Urban noise recognition plays a vital role in city management and safety operation, especially in recent smart city engineering. Existing studies on urban...

20.

Facial expressions are essential in community-based interactions and in the analysis of emotional behaviour. The automatic identification of faces is a motivating topic for researchers because of its numerous applications, such as health care, video conferencing, and cognitive science. In computer vision, automatically detecting facial expressions from face images remains a very challenging problem. An innovative methodology for the recognition of facial expressions is introduced in the presented work, described in the following stages. First, an input image is taken from a facial expression database and pre-processed with high frequency emphasis (HFE) filtering and modified histogram equalization (MHE). After image enhancement, the Viola-Jones (VJ) framework is used to detect the face in the image, and the face region is cropped using the face coordinates. Then, several effective features are extracted: shape information from an enhanced histogram of gradients (EHOG feature), intensity variation via mean, standard deviation, and skewness, facial movement variation via facial action coding (FAC), texture via a weighted patch-based local binary pattern (WLBP), and spatial information via an entropy-based spatial feature. Subsequently, the dimensionality of the features is reduced by selecting the most relevant features with a Residual Network (ResNet). Finally, an extended wavelet deep convolutional neural network (EWDCNN) classifier uses the extracted features and detects the facial expression as one of the sad, happy, anger, fear, disgust, surprise, and neutral classes. The implementation platform used in the work is Python. The presented technique is tested on three datasets: JAFFE, CK+, and Oulu-CASIA.



Copyright © Beijing Qinyun Technology Development Co., Ltd. 京ICP备09084417号