Similar literature
 Found 20 similar articles; search time 15 ms
1.
赵倩  周冬明  杨浩  王长城  李淼 《红外与激光工程》2022,51(10):20220018-1-20220018-13
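To address non-uniform image blur caused by camera shake, fast motion of the photographed object, and low shutter speed, a deblurring algorithm is proposed that combines multi-scale feature fusion with a multi-input multi-output encoder-decoder. First, a multi-scale feature extraction module extracts the initial features of the smaller-scale blurred images; this module uses dilated convolution to obtain a larger receptive field with fewer parameters. Second, a feature attention module adaptively learns the useful information in features of different scales; it uses the features of the small-scale images to generate attention maps, which effectively reduces redundant features. Finally, a multi-scale progressive feature fusion module gradually fuses features of different scales so that their information is complementary. Compared with earlier multi-scale methods that stack multiple sub-networks, the proposed method extracts multi-scale features with a single network, which reduces training difficulty. To evaluate the deblurring performance and generalization of the network, the algorithm was tested on the benchmark datasets GoPro and HIDE and on the real-world dataset RealBlur. On GoPro and HIDE the peak signal-to-noise ratio is 31.73 dB and 29.39 dB and the structural similarity is 0.951 and 0.923, respectively, both higher than current advanced deblurring algorithms, and the method also achieves the best results on RealBlur. The experimental results show that the proposed algorithm deblurs more thoroughly than existing algorithms, effectively restores edge contours and texture details, and improves the robustness of subsequent high-level computer vision tasks.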
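
The abstract's remark that dilated convolution enlarges the receptive field without adding parameters can be checked with a little arithmetic. The sketch below is only an illustration under stated assumptions: the function names and the example layer stacks are hypothetical and are not taken from the paper, and all layers are assumed to have stride 1.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span covered by one dilated kernel along one axis."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def stacked_receptive_field(layers) -> int:
    """Receptive field of stacked stride-1 conv layers.

    layers: iterable of (kernel_size, dilation) pairs (hypothetical configuration).
    """
    field = 1
    for kernel_size, dilation in layers:
        field += effective_kernel_size(kernel_size, dilation) - 1
    return field


def conv_weight_count(kernel_size: int, in_channels: int, out_channels: int) -> int:
    """Weights in a 2D conv layer (bias ignored); independent of dilation."""
    return kernel_size * kernel_size * in_channels * out_channels


plain = stacked_receptive_field([(3, 1), (3, 1), (3, 1)])
dilated = stacked_receptive_field([(3, 1), (3, 2), (3, 4)])
print(plain, dilated)
```

Both stacks use three 3×3 layers and therefore the same number of weights; only the dilation rates differ.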

2.
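To address the variable size and shape of breast tumors, their blurred boundaries, and the severe class imbalance between foreground and background, this paper proposes a multi-scale residual dual-domain attention fusion network. The network uses multi-scale residual blocks built from multi-scale convolutions as its basic building block; by extracting multi-scale features and optimizing the gradient propagation path, it improves the ability to recognize targets of different sizes. Dual-domain attention units are also integrated to improve edge recognition and boundary preservation. In addition, a hybrid adaptive-weight loss function is proposed to improve the optimization direction of the network and mitigate the extreme imbalance between positive and negative samples. Experimental results show that the proposed method reaches a mean Dice similarity coefficient (Dice) of 0.8063, which is 5.3% higher than the U-shaped network (UNet), with 73.36% fewer parameters, giving better segmentation performance.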
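
For readers unfamiliar with the Dice similarity coefficient reported above, a minimal sketch of the standard definition on binary masks follows. The function name and the flat-list mask representation are assumptions made for illustration; the paper's own evaluation code is not shown in the abstract.

```python
def dice_coefficient(pred, target) -> float:
    """Dice = 2|A ∩ B| / (|A| + |B|) for flat binary masks (values 0 or 1)."""
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    if total == 0:
        # Both masks are empty: treat as perfect agreement (a common convention).
        return 1.0
    return 2.0 * intersection / total


predicted_mask = [1, 1, 0, 0]
reference_mask = [1, 0, 0, 0]
print(dice_coefficient(predicted_mask, reference_mask))
```

Because the score only counts foreground pixels, it is less dominated by a large background than plain pixel accuracy, which is why it is commonly used when foreground and background are heavily imbalanced.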

3.
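Remote sensing images are rich in content. When extracting their features, general deep models are easily disturbed by complex backgrounds, extract key features poorly, and have difficulty expressing the spatial information of the image. This paper proposes a deep convolutional neural network based on multi-scale pooling and a norm attention mechanism, which adaptively weights salient features at the channel level and the spatial level. First, in the multi-scale pooling channel attention module, following the idea of spatial pyramid pooling, the feature map of each channel is subjected to different...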

4.
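Inverse synthetic aperture radar target recognition based on histogram statistics   Total citations: 1; self-citations: 0; citations by others: 1
A recognition technique based on Gabor local binary patterns, originally used for face recognition, is applied to radar target recognition from inverse synthetic aperture radar (ISAR) images; the algorithm is improved and achieves good recognition results. A Gabor wavelet transform is applied to the ISAR image to extract Gabor magnitude maps at different scales and orientations. The magnitude maps are then divided into small sub-regions, and multi-scale local binary patterns are used to extract spatially enhanced histograms as features. Finally, in a feature space that uses the χ² statistic as the dissimilarity measure, a nearest-neighbor classifier classifies five classes of targets. Comparison with several typical existing ISAR target recognition methods shows that the method is feasible and effective and can significantly improve the recognition rate.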
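
The final classification step, nearest-neighbor matching under a χ² dissimilarity between histograms, can be sketched as follows. This is a generic illustration, not the authors' code: the function names, the toy histograms, and the particular χ² form used (without a factor of 1/2, which some definitions include) are assumptions.

```python
def chi_square_distance(hist_a, hist_b) -> float:
    """Chi-square dissimilarity: sum of (a - b)^2 / (a + b) over non-empty bins."""
    return sum(
        (a - b) ** 2 / (a + b)
        for a, b in zip(hist_a, hist_b)
        if a + b > 0
    )


def nearest_neighbor(query_hist, gallery):
    """Return the label of the gallery histogram closest to the query.

    gallery: list of (label, histogram) pairs (hypothetical training set).
    """
    label, _ = min(gallery, key=lambda item: chi_square_distance(query_hist, item[1]))
    return label


gallery = [("A", [4, 0, 0]), ("B", [0, 2, 2])]
print(nearest_neighbor([3, 1, 0], gallery))
```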

5.
Driver distraction has become a global issue, causing a dramatic increase in road accidents and casualties. However, recognizing distracted driving actions remains a challenging task in computer vision, since the inter-class variations between different driver action categories are quite subtle. To overcome this difficulty, this paper proposes a novel deep learning based approach that extracts fine-grained feature representations for image-based driver action recognition. Specifically, we improve the existing convolutional neural network in two respects: (1) we employ a multi-scale convolutional block whose kernels have different sizes and receptive fields to generate hierarchical feature maps, and adopt a maximum selection unit to adaptively combine multi-scale information; (2) we incorporate an attention mechanism to learn pixel saliency and channel saliency between convolutional features, guiding the network to intensify local detail information and suppress global background information. In the experiments, we evaluate the designed architecture on multiple driver action datasets. The quantitative results show that the proposed multi-scale attention convolutional neural network (MSA-CNN) achieves state-of-the-art performance in image-based driver action recognition.

6.
In recent years, stereo cameras have been widely used in various fields. Due to the limited resolution of real equipment, stereo image super-resolution (SR) is an important and active topic. Recent studies have shown that deep network structures directly affect feature expression and extraction and thus influence the final results. In this paper, we propose a multi-atrous residual attention stereo super-resolution network (MRANet) with parallax extraction and strong discriminative ability. Specifically, we propose a multi-scale atrous residual attention (MARA) block that obtains receptive fields of different scales through multi-scale atrous convolution and then combines them with attention mechanisms to extract more diverse and meaningful information. Moreover, we propose a stereo feature fusion unit for stereo parallax extraction and for single-viewpoint feature refinement and integration. Experiments on benchmark datasets show that MRANet achieves state-of-the-art performance in terms of quantitative metrics and visual quality compared with several SR methods.

7.
周薇娜  刘露 《电信科学》2022,38(10):67-78
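Ship detection plays an important role in tasks such as military reconnaissance, maritime target tracking, and maritime traffic control. However, because ship shapes and scales vary and the sea-surface background is complex, detecting multi-scale ships on complex sea surfaces remains a challenge. To address this, an improved YOLOv4 method based on multi-layer information interactive fusion and an attention mechanism is proposed. The method builds a bidirectional fine-grained feature pyramid mainly through a multi-layer information interactive fusion (MLIF) module and a multi-attention receptive field (MARF) module. The MLIF module fuses features of different scales: it not only concatenates deep high-level semantic features but also reshapes the rich features of shallower layers. MARF consists of a receptive field block (RFB) and an attention mechanism module, and effectively emphasizes important features while suppressing redundant ones. To further evaluate the proposed method, experiments were conducted on the Singapore maritime dataset (SMD). The results show that the method effectively handles multi-scale ship detection in complex marine environments while also meeting real-time requirements.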

8.
刘亚灵  郭敏  马苗 《光电子.激光》2021,32(12):1271-1277
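To address the limitation of applying attention only in the time-frequency dimension in sound event detection, and the insufficient feature extraction caused by using a single type of convolutional layer, this paper proposes a convolutional recurrent neural network (CRNN) model based on multi-scale attention feature fusion to improve sound event detection performance. First, a multi-scale attention module is proposed that attends at multiple scales to local time-frequency units and global channel features, improving the model's feature selection ability. Second, a multi-scale feature fusion method is proposed that fuses multi-scale attention features rich in contextual information, improving the model's feature representation ability. Finally, a bidirectional gated recurrent layer models temporal dependencies, and a fully connected layer classifies sound events frame by frame. In addition, a data balancing technique is used to further improve generalization. Experimental results on a subset of AudioSet show that, compared with CRNN, the proposed network reduces the error rate (ER) on the evaluation set by 11% and raises the F1 score (F1) by 8.3%, effectively improving sound event detection performance.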
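
The two reported metrics, error rate (ER) and F1, are commonly computed per segment in sound event detection. The sketch below follows the widely used segment-based definitions (substitutions, deletions, and insertions counted per segment); whether the paper uses exactly this variant is an assumption, and the function name and toy labels are hypothetical.

```python
def segment_based_metrics(reference, predicted):
    """Return (f1, error_rate) for per-segment sets of active event labels.

    reference, predicted: lists of sets, one set of labels per time segment.
    """
    tp = fp = fn = 0
    substitutions = deletions = insertions = 0
    reference_events = 0
    for ref, pred in zip(reference, predicted):
        seg_fp = len(pred - ref)  # predicted but not in the reference
        seg_fn = len(ref - pred)  # in the reference but missed
        tp += len(ref & pred)
        fp += seg_fp
        fn += seg_fn
        # A false positive paired with a false negative counts as one substitution.
        substitutions += min(seg_fp, seg_fn)
        deletions += max(0, seg_fn - seg_fp)
        insertions += max(0, seg_fp - seg_fn)
        reference_events += len(ref)
    denominator = 2 * tp + fp + fn
    f1 = 2 * tp / denominator if denominator else 0.0
    error_rate = (
        (substitutions + deletions + insertions) / reference_events
        if reference_events
        else 0.0
    )
    return f1, error_rate


reference = [{"dog"}, {"dog", "car"}, set()]
predicted = [{"dog"}, {"car", "speech"}, {"speech"}]
print(segment_based_metrics(reference, predicted))
```

Note that ER is an error measure (lower is better) and can exceed 1 when a system produces many insertions, whereas F1 lies between 0 and 1 (higher is better).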

9.
With the rapid development of three-dimensional (3D) vision technology and the increasing application of 3D objects, there is an urgent need for 3D object recognition in computer vision, virtual reality, and artificial intelligence robotics. View-based methods project 3D objects into two-dimensional (2D) images from different viewpoints and apply convolutional neural networks (CNNs) to model the projected views. Although these methods achieve excellent recognition performance, they lack sufficient information interaction between the features of different views. Inspired by the recent success of the vision transformer (ViT) in image recognition, we propose a hybrid network that uses a CNN to extract multi-scale local information from each view and a transformer to capture the relevance of multi-scale information between different views. To verify the effectiveness of our multi-view convolutional vision transformer (MVCVT), we conduct experiments on two public benchmarks, ModelNet40 and ModelNet10, and compare the results with those of several state-of-the-art methods. The final results show that MVCVT achieves competitive performance in 3D object recognition.

10.
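To address the low segmentation accuracy caused by the multi-scale nature of lesions in cerebral hemorrhage CT images, this paper proposes an image segmentation model based on an improved U-shaped neural network (AU-Net+). First, the model uses the encoder of U-Net to encode the features of the cerebral hemorrhage CT image and applies the proposed residual octave convolution (ROC) block to the skip connections of the U-shaped network, so that features of different levels fuse better. Second, for the fused features, hybrid attention...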

11.
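To improve the accuracy of pedestrian attribute recognition, a pedestrian attribute recognition algorithm based on a multi-scale attention network is proposed. To improve feature representation and attribute discrimination, a top-down feature pyramid and an attention module are first added to the residual network ResNet50; the top-down feature pyramid is built from the visual features extracted bottom-up. Then, features of different scales in the feature pyramid are fused, and the channel attention of each layer's features is given a different weight. Finally, the model's loss function is improved to reduce the effect of data imbalance on the attribute recognition rate. Experimental results on the RAP and PA-100K datasets show that, compared with existing algorithms, the proposed algorithm achieves better mean accuracy, accuracy, and F1 for pedestrian attribute recognition.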

12.
Representing contextual features at multiple scales is important for RGB-D SOD. Recently, as advances in backbone convolutional neural networks (CNNs) have revealed stronger multi-scale representation ability, many methods have achieved promising performance. However, most of them represent multi-scale features in a layer-wise manner, which ignores the fine-grained global contextual cues within a single layer. In this paper, we propose a novel global contextual exploration network (GCENet) to explore the performance gain of multi-scale contextual features in a fine-grained manner. Concretely, a cross-modal contextual feature module (CCFM) is proposed to represent multi-scale contextual features at a single fine-grained level, which enlarges the range of receptive fields for each network layer. Furthermore, we design a multi-scale feature decoder (MFD) that integrates the fused features from CCFM in a top-down way. Extensive experiments on five benchmark datasets demonstrate that the proposed GCENet outperforms other state-of-the-art (SOTA) RGB-D SOD methods.

13.
薛茹  宋焕生 《电视技术》2014,38(7):188-191,206,182
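Building on the traditional HOG object recognition method, an object detection method is proposed that extracts HOG features after Gabor filter fusion. To make the information captured by the HOG features more effective, the target image is first preprocessed with Gabor filters: the image's Gabor features are fused across scales and orientations to form a single Gabor image. To effectively extract the global texture and contour information of the Gabor image, the image is divided into overlapping blocks of equal size, and statistics are computed for each block. Finally, a RealAdaboost cascade learns from target and non-target samples and classifies the test sequences. The results show that the gradient-based Gabor preprocessing improves target feature extraction. Compared with the traditional HOG object recognition method, the proposed method detects markedly better when the target image is disturbed (by occlusion, overlap, and so on).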

14.
Face anti-spoofing helps a face recognition system judge whether a detected face is real or fake. Traditional face anti-spoofing methods use hand-crafted features to describe the difference between live and fraudulent faces, but these hand-crafted features do not generalize to the variations found in unconstrained environments. Convolutional neural networks (CNNs) achieve considerable results in detecting face spoofing. However, most existing neural network-based methods simply extract single-scale features from single-modal data, ignoring multi-scale and multi-modal information. To address this problem, a novel face anti-spoofing method based on multi-modal and multi-scale feature fusion (MMFF) is proposed. Specifically, a residual network (ResNet-34) is first adopted to extract features of different scales from each modality; these features are then fused by a feature pyramid network (FPN); finally, a squeeze-and-excitation fusion (SEF) module and a self-attention network (SAN) are combined to fuse features from different modalities for classification. Experiments on the CASIA-SURF dataset show that the MMFF-based method achieves better performance than most existing methods.

15.
A smoky vehicle, one that emits visible black exhaust from its exhaust pipe, is a representative heavily polluting vehicle. This paper presents an intelligent smoky vehicle detection method based on multi-scale block Tamura features. In this method, the ViBe background subtraction algorithm is adopted to detect vehicle objects. We propose multi-scale block Tamura features and use these features to distinguish smoky vehicle images from non-smoky vehicle images. More specifically, the region at the back of the vehicle is divided into 1 × 2 blocks. For each block, a multi-scale strategy based on Gaussian kernels with different standard deviations is used to extract features and exploit information at different scales. Finally, a back-propagation neural network classifier is trained and used for classification. The method can automatically detect smoky vehicles by analyzing road surveillance videos. The experimental results show that the proposed framework performs better than common smoke and fire detection methods, and that the proposed multi-scale block Tamura features obtain higher detection accuracy than common Tamura features.
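
The multi-scale strategy based on Gaussian kernels with different standard deviations can be illustrated with a small smoothing sketch. This is an assumption-laden toy, not the paper's pipeline: it works on a 1D signal rather than an image block, the sigma values, the 3-sigma truncation radius, and all function names are hypothetical, and no Tamura features are computed.

```python
import math


def gaussian_kernel(sigma: float):
    """Normalized 1D Gaussian kernel truncated at roughly 3 sigma (assumed radius)."""
    radius = max(1, int(3 * sigma + 0.5))
    weights = [math.exp(-(x * x) / (2 * sigma * sigma)) for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(signal, kernel):
    """Filter a 1D signal with the kernel, replicating the edge samples."""
    radius = len(kernel) // 2
    last = len(signal) - 1
    smoothed = []
    for i in range(len(signal)):
        acc = 0.0
        for j, weight in enumerate(kernel):
            index = min(max(i + j - radius, 0), last)
            acc += weight * signal[index]
        smoothed.append(acc)
    return smoothed


def multi_scale(signal, sigmas=(1.0, 2.0, 4.0)):
    """One smoothed copy of the signal per standard deviation."""
    return {sigma: smooth(signal, gaussian_kernel(sigma)) for sigma in sigmas}


scales = multi_scale([0, 0, 0, 10, 0, 0, 0])
for sigma, values in scales.items():
    print(sigma, [round(v, 2) for v in values])
```

Larger standard deviations spread a sharp peak over more samples, so features computed from each smoothed copy describe structure at a coarser scale.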

16.
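The key to person re-identification lies in extracting pedestrian features, and convolutional neural networks have strong feature extraction and representation ability. Because different features can be observed at different scales, this paper proposes a person re-identification method based on the fusion of multi-scale and attention networks (MSAN). The method samples features at different depths of the network and fuses the sampled features to predict the pedestrian's identity. Feature maps at different depths have different expressive power, which lets the network learn finer-grained features of the pedestrian. Attention modules are also embedded in the residual network so that the network focuses more on key information, strengthening its feature learning ability. The proposed method reaches rank-1 accuracy of 95.3%, 89.8%, and 82.2% on the Market1501, DukeMTMC-reID, and MSMT17_V1 datasets, respectively. Experiments show that the method makes full use of information from different network depths and of the key information it attends to, giving the model strong discriminative ability, and that the mean accuracy of the proposed model is better than that of most advanced algorithms.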
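
Rank-1 accuracy, the headline metric above, is the fraction of query images whose closest gallery image shows the same person. The sketch below is a simplified illustration: it uses squared Euclidean distance on toy feature vectors, the names are hypothetical, and it omits protocol details that real benchmarks apply (for example, excluding gallery images from the same camera as the query).

```python
def rank1_accuracy(queries, gallery) -> float:
    """Fraction of queries whose nearest gallery feature has the same identity.

    queries, gallery: lists of (person_id, feature_vector) pairs.
    """
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    hits = 0
    for query_id, query_feature in queries:
        best_id, _ = min(gallery, key=lambda item: squared_distance(query_feature, item[1]))
        if best_id == query_id:
            hits += 1
    return hits / len(queries)


gallery = [(1, [0.0, 0.0]), (2, [1.0, 1.0])]
queries = [(1, [0.1, 0.0]), (2, [0.9, 1.0]), (2, [0.2, 0.1])]
print(rank1_accuracy(queries, gallery))
```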

17.
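To further improve the separation performance of existing blind source separation algorithms, this paper proposes a full-scale skip connection model based on Wave-U-Net. First, to address the loss of signal features during downsampling in Wave-U-Net, the model adds convolution operations to the skip connections; by connecting feature maps of different time scales, it effectively combines the shallow and deep features of the signal and improves separation performance. To address the choice of the optimal depth of Wave-U-Net and the excessive number of parameters in the full-scale skip connection model, the paper further proposes a multi-scale skip connection model. In this model, Wave-U-Nets of different depths are embedded in place of the convolution operations in the skip connections, which reduces model parameters at the cost of some separation performance; the model shares downsampling blocks to reduce training time and the influence of the choice of optimal depth. Simulation experiments show that, compared with other baseline models, the two proposed models significantly improve signal separation performance, with gains of nearly 3 to 4 dB in SDR, SIR, and SAR.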

18.
In recent years, deep learning has been successfully applied to medical image segmentation. However, as networks grow deeper, consecutive downsampling operations lead to greater loss of spatial information. In addition, limited data and diverse targets make medical image segmentation more difficult. To address these issues, we propose a multi-path connected network (MCNet) for medical segmentation problems. It integrates multiple paths generated by pyramid pooling into the encoding phase to preserve semantic information and spatial details. We use a multi-scale feature extractor block (MFE block) in the encoder to obtain large, multi-scale receptive fields. We evaluated MCNet on three medical datasets with different image modalities. The experimental results show that our method achieves better performance than state-of-the-art approaches. Our model has strong feature learning ability and is robust in capturing targets of different scales. It achieves satisfactory results while using only 0.98 million (M) parameters.

19.
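The accuracy of retinal vessel segmentation has an important influence on the early diagnosis of ophthalmic diseases and diabetes. Because existing methods segment microvessels and lesion regions poorly, this paper proposes a segmentation model that strengthens the extraction of vessel features. In the encoder, the model introduces a multi-scale feature extraction residual module (MFE-residual) and multi-level residual atrous convolution layers to expand the receptive field, learn multi-level image features, and improve the model's use of vessel information. A lightweight attention mechanism and a multi-channel attention module are integrated into the downsampling and short-connection parts, respectively, to improve the model's ability to recognize vessels and reduce the likelihood of mis-segmentation. Experiments on two public datasets, DRIVE and STARE, verify the segmentation ability of the improved model. The accuracy on the two datasets is 0.9652 and 0.9715 and the sensitivity is 0.8205 and 0.8256, respectively; compared with other algorithms, the segmentation performance is better.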
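
The accuracy and sensitivity figures quoted above follow from pixel-level confusion counts. A minimal sketch of the standard definitions is given below; the function name and the flat binary-mask representation are assumptions for illustration, and the paper's evaluation protocol (for example, any field-of-view masking) is not reproduced.

```python
def accuracy_and_sensitivity(pred, target):
    """Pixel accuracy and sensitivity (recall on vessel pixels) for flat binary masks."""
    tp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 1)
    tn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 0)
    fn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 1)
    accuracy = (tp + tn) / len(target)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, sensitivity


predicted_mask = [1, 0, 1, 0, 0]
reference_mask = [1, 1, 0, 0, 0]
print(accuracy_and_sensitivity(predicted_mask, reference_mask))
```

Because vessel pixels are a small share of a retinal image, accuracy can be high even when thin vessels are missed; sensitivity exposes that gap, which is why both are usually reported together.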

20.
The efficiency of deep image compression has improved in recent years. However, to fully exploit context information when compressing image objects of different scales and shapes, a more adaptive geometric structure of the inputs should be considered. In this paper, we introduce deformable convolution and its spatial attention extension into the deep image compression task to fully exploit context information. Specifically, a novel deep image compression network with Multi-Scale Deformable Convolution and Spatial Attention, named MS-DCSA, is proposed to better extract compact and efficient latent representations and to reconstruct higher-quality images. First, multi-scale deformable convolution is presented to provide multi-scale receptive fields for learning the spatial sampling offsets in deformable operations. Subsequently, a multi-scale deformable spatial attention module is developed to generate attention masks that re-weight the extracted features according to their importance. In addition, the multi-scale deformable convolution is used to design the up- and down-sampling modules. Extensive experiments demonstrate that the proposed MS-DCSA network achieves improved performance on both the PSNR and MS-SSIM quality metrics compared with conventional and competing deep image compression methods.
