Similar Literature
 20 similar records found (search time: 78 ms)
1.
To further improve the accuracy of facial expression recognition, a deep convolutional neural network fusing global and local features (GL-DCNN) is proposed. The algorithm consists of two improved convolutional neural network branches, a global branch and a local branch, which extract global and local features respectively; the features of the two branches are fused by weighted combination, and the fused features are used for classification. First, global features are extracted: the global branch is based on transfer learning and uses an improved VGG19 network model…
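The weighted fusion of global and local feature vectors described in this abstract can be sketched as follows. This is a minimal illustration, not the paper's code; the fusion weight `alpha` and the L2 normalization are assumptions.

```python
import numpy as np

def fuse_features(global_feat, local_feat, alpha=0.6):
    """Weighted fusion of global and local feature vectors.

    `alpha` is a hypothetical fusion weight; the excerpt does not
    state the value used in the paper. Both vectors are L2-normalized
    (an assumption) so neither branch dominates by scale alone.
    """
    g = global_feat / (np.linalg.norm(global_feat) + 1e-8)
    l = local_feat / (np.linalg.norm(local_feat) + 1e-8)
    return alpha * g + (1.0 - alpha) * l

# The fused vector would then be passed to a classifier head.
fused = fuse_features(np.ones(4), np.zeros(4))
```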

2.
To address the problems that existing pneumonia medical-image recognition studies ignore global features in shallow networks, leading to incomplete feature extraction, and that model sizes are large, a lightweight model based on CNN and attention mechanisms is proposed to improve the efficiency of pneumonia-type recognition. A lightweight model structure reduces the parameter count; enlarged convolution kernels together with efficient channel attention and self-attention address the loss of important information and the inability to extract low-level global information; a dual-branch design extracts local and global information in parallel, with multi-scale channel attention improving the quality of their fusion; and the CLAHE algorithm is used to enhance the raw data. Experimental results show that, while remaining lightweight, the model improves accuracy, sensitivity, and specificity over the original model by 2.59%, 3.1%, and 1.38% respectively, outperforms other state-of-the-art classification models, and is more practical.
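The efficient channel attention mentioned above (global pooling followed by a 1-D convolution across channels, with no dimensionality reduction) can be sketched as follows. This is a numpy illustration with placeholder averaging weights, not the paper's trained module.

```python
import numpy as np

def eca(x, k=3):
    """Efficient-channel-attention sketch: global average pool, then a
    1-D convolution across the channel axis (no dimensionality
    reduction), then sigmoid re-weighting of the feature map.

    The kernel weights here are a uniform placeholder; in a trained
    network they would be learned parameters.
    """
    b, c, h, w = x.shape
    pooled = x.mean(axis=(2, 3))                 # (B, C) channel descriptor
    kernel = np.full(k, 1.0 / k)                 # placeholder conv weights
    attn = np.empty_like(pooled)
    for i in range(b):
        attn[i] = np.convolve(pooled[i], kernel, mode="same")
    attn = 1.0 / (1.0 + np.exp(-attn))           # sigmoid gate per channel
    return x * attn[:, :, None, None]            # re-weight channels
```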

3.
Objective: Computer-based chest X-ray disease detection and classification currently suffer from high misdiagnosis rates and low accuracy. Building on a pretrained vision Transformer (ViT) model, this work applies transfer learning to assist chest X-ray diagnosis and to improve diagnostic accuracy and efficiency. Method: A ViT model with a convolutional neural network (CNN), pretrained on a very large natural-image dataset, is selected; the model structure is fine-tuned, the backbone is initialized with the pretrained ViT parameters, and the model is transferred to chest X-ray datasets and retrained for multi-label disease classification. Results: On the IU X-Ray dataset, the mean AUC (area under ROC curve) of the ViT model before and after transfer learning was compared. The pretrained ViT model achieved a mean AUC of 0.774, an improvement of 0.208 over training without transfer learning. Ablation experiments on the model structure and data preprocessing were conducted, and the attention in ViT was visualized, further validating the model's effectiveness. Finally, the fine-tuned ViT model trained on the Chest X-Ray14 and CheXpert datasets achieved mean AUCs of 0.839 and 0.806 respectively, improvements of 0.014–0.03… over the compared methods.
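The evaluation metric above, mean per-label AUC for multi-label classification, can be sketched with a rank-based (Mann-Whitney) AUC. This is a minimal numpy illustration that assumes no tied scores; it is not the paper's evaluation code.

```python
import numpy as np

def auc_binary(scores, labels):
    """Rank-based AUC for one label (assumes no tied scores)."""
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    pos = labels == 1
    n_pos, n_neg = pos.sum(), (~pos).sum()
    return (ranks[pos].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def mean_auc(score_matrix, label_matrix):
    """Mean per-label AUC across all disease labels (columns)."""
    return float(np.mean([auc_binary(score_matrix[:, j], label_matrix[:, j])
                          for j in range(label_matrix.shape[1])]))
```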

4.
Objective: Highly conformal radiotherapy is a common cancer treatment whose effectiveness depends on precisely delineating the anatomy of the cancerous tissue and of multiple surrounding organs at risk (OAR); high-precision automatic multi-organ segmentation of 3D images is therefore of great significance. 3D medical image segmentation methods combining vision Transformers (ViT) and convolutional neural networks (CNN) have shown rich application advantages. However, such methods often ignore information interaction within a scale and across scales, limiting the extraction and fusion of CNN and ViT features. This paper proposes an end-to-end multi-organ segmentation network, LoGoFUNet (local-global-features fusion UNet), to address these deficiencies. Method: First, for single-organ segmentation, a LoGoF (local-global-features fusion) encoder that extracts and fuses CNN and ViT features in parallel at the same scale is proposed, and an end-to-end multi-scale 3D medical image segmentation network M0 is built. In addition, considering the relationships within and between organs, the method designs and introduces, on top of the M0 network, a multi-scale interaction (multi-scale interacti…

5.
Objective: Pathological examination is the gold standard for diagnosing breast cancer and determining tumor type. Deep neural networks are widely used for diagnosis on breast pathology whole-slide images and have made clear progress, but most existing work simply cuts the whole slide into small patches and processes each patch independently, ignoring the spatial information between them. To this end, a breast histopathology whole-slide classification method fusing spatial-correlation features is proposed. Method: First, a convolutional neural network predicts on the pathology image patches and extracts representative deep features for each patch; then feature fusion aggregates the features of each patch and its surrounding patches into a block descriptor with spatial correlation; finally, the most suspicious block descriptors in the whole-slide image are passed to a recurrent neural network to predict the final slide-level classification. Results: A carefully annotated breast pathology whole-slide dataset was built, and benign/malignant binary classification experiments were conducted on it, comparing against three whole-slide classification methods. The results show that the proposed method achieves a classification accuracy of 96.3%, 1.9% higher than the method that ignores spatial correlation, and 8.8% and 1.3% higher than the heat-map-feature-based method and the spatial-plus-random-forest-based method respectively. Conclusion: The proposed breast histopathology whole-slide recognition method integrates spatial-correlation feature fusion and RNN classification into a unified model, helping to improve image recognition accuracy and providing efficient assistance for pathological image diagnosis…

6.
Objective: Deepfake video detection is a hot research topic in computer vision. Convolutional neural networks and Vision Transformers (ViT) are both basic building blocks of deepfake detection models; each has its advantages, but both suffer from long training and testing times and a marked accuracy drop in cross-compression scenarios. Considering the respective strengths and weaknesses of the two model types, and the applicability of features from different domains to detection, an efficient joint CNN (convolutional neural network) and Transformer model is proposed. Method: Spatial-domain and frequency-domain feature extraction branches based on EfficientNet are designed to enrich the single-branch feature representation, and are connected to the Transformer encoder and cross-attention structures to model global inter-region feature correlations. To counter the accuracy drop of deepfake detection models in cross-compression and cross-dataset scenarios, an attention mechanism and embedding scheme are designed and combined with a data augmentation strategy to improve robustness. Results: On the four datasets of FaceForensics++, the method was compared with nine other methods across compression rates; in cross-compression-rate experiments, detection accuracy on Deepfake, Face2Face, and NeuralTextures forgeries reached 90.35%, 71.79%…

7.
Objective: To address the limitations of hand-designed feature extraction and fusion in traditional infrared and visible image fusion methods, as well as the inability of convolutional neural network (CNN) based methods to effectively extract global contextual information and their insufficient fusion, this paper proposes an end-to-end unsupervised image fusion network based on a vision Transformer and a grouped progressive fusion strategy. Method: First, a multi-head transposed attention module, which computes self-attention along the channel dimension, is combined with a channel attention module to form the vision Transformer; the transposed attention module avoids the quadratic growth of self-attention computation with the number of pixels, and the channel attention strengthens salient features. Next, the CNN and the designed vision Transformer are placed in parallel to form a local-global feature extraction module that captures the local detail and global context of the source images, making the extracted features both general and global. Furthermore, to avoid information loss during fusion, features are fused by grouping them and constructing a progressive residual structure. Finally, the fused image is obtained by decoding the fused features. Results: Experiments on the TNO and RoadScene datasets compare against six methods. Subjectively, the method effectively fuses the complementary information of infrared and visible images, producing high-quality fused images. Objective quantitative analysis…

8.
Current small-object detection algorithms mainly work by designing various feature-fusion modules, making it hard to balance detection performance against model complexity. Moreover, compared with regular objects, small objects carry little information and their features are hard to extract. To overcome these two problems, a channel attention module with a local cross-channel interaction strategy and no dimensionality reduction is adopted to associate information across channels, learning inter-channel feature correlations by assigning a weight to each channel's features. An improved feature-fusion module is also added so that the network can perform multi-scale object detection using both low-level and high-level features, improving the accuracy of small-object detection, which relies mainly on low-level features. The backbone is ResNet, chosen for its strong feature representation and speed, which captures more network features while ensuring convergence. Focal Loss is used as the loss function to reduce the weight of easily classified samples so that training focuses more on hard-to-classify samples. The framework achieves an mAP of 82.7% on the VOC dataset and 86.8% on an aerial-photography dataset.
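The Focal Loss mentioned above down-weights easy examples so training focuses on hard ones. Below is a minimal numpy sketch of the standard binary form; the `gamma` and `alpha` defaults are the commonly used values, not necessarily the paper's settings.

```python
import numpy as np

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss: (1 - p_t)^gamma scales down the loss of
    well-classified samples. `gamma`/`alpha` use common defaults;
    the excerpt does not state the paper's exact values."""
    p = np.clip(p, 1e-7, 1 - 1e-7)
    pt = np.where(y == 1, p, 1 - p)          # probability of the true class
    a = np.where(y == 1, alpha, 1 - alpha)   # class-balance weight
    return -np.mean(a * (1 - pt) ** gamma * np.log(pt))
```

With `gamma=0` and `alpha=0.5` this reduces to half the ordinary cross-entropy, which is a quick way to sanity-check an implementation.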

9.
Arrhythmia detection is an important part of cardiovascular disease diagnosis, and its automatic classification has significant clinical value. To improve the accuracy of arrhythmia classification, a CNN+Attention deep learning model combining a one-dimensional convolutional neural network (CNN) and an attention mechanism is proposed; the CNN extracts one-dimensional time-domain features of the ECG signal. Since the time-domain features of a one-dimensional ECG signal have limited representational power, the short-time Fourier transform (STFT) is used to map the signal into the time-frequency domain, attention extracts global dependencies in the time-frequency domain, and the time-domain and time-frequency-domain features are fused to classify five types of ECG signals. The model was validated on the MIT-BIH dataset, achieving average classification accuracy, precision, recall, sensitivity, and F1-score on the five signal types of 99.72%, 98.55%, 99.46%, 99.90%, and 99.00% respectively. Comparison with existing state-of-the-art methods confirms the model's strong performance.
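The STFT step that maps the 1-D ECG signal into the time-frequency domain can be sketched as follows. This is a minimal numpy implementation; the window length and hop size are illustrative choices, not the paper's settings.

```python
import numpy as np

def stft_magnitude(signal, win=64, hop=32):
    """Magnitude STFT: slide a Hann window over the 1-D signal and
    take the FFT of each frame, yielding a (freq, time) map that an
    attention module could consume as a 2-D feature."""
    window = np.hanning(win)
    frames = [signal[s:s + win] * window
              for s in range(0, len(signal) - win + 1, hop)]
    return np.abs(np.fft.rfft(np.array(frames), axis=1)).T  # (freq, time)
```

For a pure sinusoid at 0.125 cycles/sample, the energy concentrates in FFT bin 8 of a 64-sample window (0.125 × 64), which makes a convenient correctness check.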

10.
In image semantic segmentation, when a convolutional neural network extracts features from the image, segmentation accuracy suffers because the network does not make effective use of the feature information between its levels. A hierarchical feature-fusion semantic segmentation method is proposed: convolutional structures extract, level by level, shallow low-level features containing pixel-level information and deep semantic features containing image-level information, further mining the feature information between levels to fully exploit the shallow low-level and deep semantic features…

11.
Computer-aided Diagnosis (CADx) technology can substantially aid in the early detection and diagnosis of breast cancers. However, the overall performance of a CADx system is tied, to a large extent, to the accuracy with which tumors can be segmented in a mammogram. This implies that the segmentation of mammograms is a critical step in the diagnosis of benign and malignant tumors. In this paper, we develop an enhanced mammography CADx system with an emphasis on the segmentation step. In particular, we present two hybrid algorithms based upon region-based, contour-based and clustering segmentation techniques to recognize benign and malignant breast tumors. In the first algorithm, in order to obtain the most accurate final segmented tumor, the initial segmented image required by the level set is provided by one of spatial fuzzy clustering (SFC), improved region growing (RG), or a cellular neural network (CNN). In the second algorithm, all of the parameters which control the level set are obtained from a dynamic training procedure combining either a genetic algorithm (GA) with an artificial neural network (ANN), or a memetic algorithm (MA) with an ANN. After segmenting tumors using one of the proposed hybrid methods, intensity, shape and texture features are extracted from the tumors, and the appropriate features are then selected by another GA. Finally, to classify tumors as benign or malignant, different classifiers such as ANN, random forest, naïve Bayes, support vector machine (SVM), and K-nearest neighbor (KNN) are used. Experimental results confirm the efficiency of the proposed methods in terms of sensitivity, specificity, accuracy and area under the ROC curve (AUC) for the classification of breast tumors. It was concluded that RG and GA in the adaptive RG-LS method produce a more accurate primary boundary of tumors and appropriate parameters for the level set technique in segmentation and, subsequently, in classification.

12.

Building on the spectral-residual method for obtaining a global-feature saliency map, the local feature representation power of the wavelet transform in both time and frequency domains is exploited: wavelet transforms of the image's different feature channels remove redundant information from each feature map, yielding the locally salient parts of the image. The saliency maps obtained by the two analyses are then fused to obtain the final salient regions, and a visual-attention transfer mechanism outlines the salient object in the original image. Experimental analysis shows that the improved method raises the accuracy of salient-object detection.

13.
A Fisherfaces method fusing global and local features is proposed. Under the Fisher linear criterion, the most discriminative features are extracted from the image's global and local features. The weighted Euclidean distance between a test sample and the training set is computed, and the class of the test sample is decided by the nearest-neighbour rule. Comparative experiments on the ORL face database demonstrate the method's superiority.
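The weighted Euclidean distance plus nearest-neighbour decision described above can be sketched as follows. This is a minimal illustration; in the paper the feature weights would come from the Fisher criterion, whereas here they are an explicit input.

```python
import numpy as np

def weighted_nn_classify(x, train_feats, train_labels, weights):
    """Weighted Euclidean distance from `x` to every training sample,
    then nearest-neighbour label. `weights` scales each feature
    dimension, e.g. to balance global vs. local feature components."""
    d = np.sqrt(((train_feats - x) ** 2 * weights).sum(axis=1))
    return train_labels[np.argmin(d)]
```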

14.
15.
Traditional iris recognition systems convert the iris image to a polar coordinate system and normalize it, achieving rotation invariance by shifting feature vectors. To reduce the complexity of traditional iris recognition, an iris recognition method fusing local and global feature extraction is proposed that requires no normalization of the preprocessed iris image. The method first applies a non-tensor-product wavelet directly to the segmented iris image to extract global features, then uses SIFT to extract local features from selected regions, and finally applies different weights to the local and global iris features in a similarity-distance test. The results show a correct recognition rate of 99.065% at an equal error rate of 0.935%; good recognition performance is obtained without normalizing the iris image.

16.
17.
Pattern Analysis and Applications - At present, deep learning has made great progress in the field of glyph modeling. However, existing methods of font generation have some problems, such as…

18.
Recently, several no-reference image quality assessment (NR-IQA) metrics have been developed for the quality evaluation of screen content images (SCIs). However, most of them are opinion-aware methods, which are limited by the subjective opinion scores of the training data. Hence, in this paper, we propose a novel opinion-unaware method to predict the quality of SCIs without any prior information. First, a union feature is proposed by simultaneously considering the local and global visual characteristics of the human visual system. Specifically, a local structural feature is extracted from the rough and smooth regions of SCIs by leveraging a sparse representation model. As a supplement, a global feature is obtained by combining the luminance statistical feature and local binary pattern (LBP) feature of entire SCIs. Second, to remove the limitation of subjective opinion scores, a new large-scale training dataset containing 80,000 distorted SCIs is constructed, and the quality labels of those distorted SCIs are derived by an advanced full-reference IQA metric. Third, a regression model between image features and image quality labels is learned from the training dataset using a learning-based framework; the quality scores of test SCIs can then be predicted by the pre-trained regression model. Experimental results on the two largest SCI-oriented databases show that the proposed method is superior to state-of-the-art NR-IQA metrics.

19.
This paper presents a novel level set method for complex image segmentation, where local statistical analysis and a global similarity measurement are both incorporated into the construction of the energy functional. The intensity statistical analysis is performed on local circular regions centered at each pixel, so that the local energy term is constructed in a piecewise constant way. Meanwhile, the Bhattacharyya coefficient is utilized to measure the similarity between the probability distribution functions of intensities inside and outside the evolving contour; the global energy term is formulated by minimizing the Bhattacharyya coefficient. To avoid the time-consuming re-initialization step, a penalty energy term associated with a new double-well potential is constructed to maintain the signed distance property of the level set function. Experiments and comparisons with four popular models on synthetic and real images demonstrate that our method is efficient and robust for segmenting noisy images, images with intensity inhomogeneity, texture images and multiphase images.
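The Bhattacharyya coefficient used above as the global similarity measure can be computed as follows over discrete intensity histograms. This is a minimal sketch of the coefficient itself, not of the full energy functional.

```python
import numpy as np

def bhattacharyya_coefficient(p, q):
    """Bhattacharyya coefficient between two discrete distributions
    (e.g. intensity histograms inside/outside the evolving contour).
    Equals 1 for identical distributions and 0 for disjoint ones;
    minimizing it drives the two regions apart."""
    p = p / p.sum()
    q = q / q.sum()
    return float(np.sum(np.sqrt(p * q)))
```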

20.
In this paper, global and local prosodic features extracted at the sentence, word and syllable levels are proposed for speech emotion or affect recognition. Duration, pitch, and energy values are used to represent the prosodic information for recognizing emotions from speech. Global prosodic features represent gross statistics such as the mean, minimum, maximum, standard deviation, and slope of the prosodic contours, while local prosodic features represent the temporal dynamics of the prosody. Global and local prosodic features are analyzed separately and in combination at different levels for the recognition of emotions. We also explore words and syllables at different positions (initial, middle, and final) separately, to analyze their contribution to emotion recognition. All studies are carried out using the simulated Telugu emotion speech corpus (IITKGP-SESC), and the results are compared with those on the internationally known Berlin emotion speech corpus (Emo-DB). Support vector machines are used to develop the emotion recognition models. The results indicate that recognition performance using local prosodic features is better than that of global prosodic features. Words in the final position of sentences and syllables in the final position of words exhibit more emotion-discriminative information than words and syllables in the other positions.
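The global prosodic statistics listed above (mean, minimum, maximum, standard deviation, and slope of a contour) can be sketched as follows. This is illustrative only; the contour extraction and modeling details in the paper differ.

```python
import numpy as np

def global_prosodic_features(contour):
    """Gross statistics of a prosodic contour (e.g. pitch or energy
    over time): mean, min, max, standard deviation, and the slope of
    a first-order linear fit."""
    t = np.arange(len(contour))
    slope = np.polyfit(t, contour, 1)[0]   # slope of the linear trend
    return np.array([contour.mean(), contour.min(),
                     contour.max(), contour.std(), slope])
```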


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号