
1.

《Journal of Visual Communication and Image Representation》2021

At present, the dominant convolutional neural network (CNN) approach to super-resolution (SR) is to deepen the network with skip connections in order to improve the model's nonlinear expressive power. However, such networks are also harder to train and to bring to convergence. To train a smaller but better-performing SR model, this paper constructs a novel image SR network with multiple attention mechanisms (MAMSR), which includes a channel attention mechanism and a spatial attention mechanism. By learning the relationships between the channels of the feature map and between the pixels at each position of the feature map, the network enhances its feature representation and brings the reconstructed image closer to the real image. Experiments on public datasets show that the network surpasses some current state-of-the-art algorithms in PSNR, SSIM, and visual quality.
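The abstract reports results in PSNR without defining it. As a reminder of what that metric measures, here is a minimal sketch of the standard PSNR formula, assuming 8-bit images stored as nested lists; the function name `psnr` and the sample pixel values are illustrative and do not come from the paper.

```python
import math

def psnr(reference, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized 2D images."""
    ref = [p for row in reference for p in row]
    rec = [p for row in reconstructed for p in row]
    if len(ref) != len(rec):
        raise ValueError("images must have the same number of pixels")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(ref, rec)) / len(ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)

# Hypothetical 2x2 ground-truth patch and its reconstruction.
hr = [[52, 55], [61, 59]]
sr = [[54, 55], [60, 58]]
print(round(psnr(hr, sr), 2))
```

A higher value means the reconstruction is closer to the reference; identical images give an infinite PSNR.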

2.

《Journal of Visual Communication and Image Representation》2023

Convolutional neural network (CNN) based methods have recently achieved extraordinary performance in single image super-resolution (SISR) tasks. However, most existing CNN-based approaches increase the model’s depth by stacking massive kernel convolutions, bringing expensive computational costs and limiting their application on mobile devices with limited resources. Furthermore, large kernel convolutions are rarely used in lightweight super-resolution designs. To alleviate these problems, we propose a multi-scale convolutional attention network (MCAN), a lightweight and efficient network for SISR. Specifically, a multi-scale convolutional attention (MCA) module is designed to aggregate the spatial information of different large receptive fields. Since the contextual information of an image has strong local correlation, we design a local feature enhancement unit (LFEU) to further enhance local feature extraction. Extensive experimental results show that the proposed MCAN achieves better performance with lower model complexity than other state-of-the-art lightweight methods.

3.

Attention-based multi-scale full-scene surveillance object detection method


张德祥王俊袁培成《电子与信息学报》2022,44(9):3249-3257

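To address the problem that target features are indistinct in complex urban surveillance scenes because of large variations in target size, target occlusion, and weather effects, this paper proposes a multi-scale full-scene surveillance object detection method based on an attention mechanism. A multi-scale detection network structure based on the Yolov5s model is designed to improve the network's adaptability to changes in target size. In addition, an attention-based feature extraction module is constructed that learns channel-level weights for the features, which enhances target features, suppresses background features, and improves the network's feature extraction ability. The initial anchor box sizes for the full-scene surveillance dataset are computed with the K-means clustering algorithm, which accelerates model convergence and improves detection accuracy. On the COCO dataset, compared with the baseline network, the mean average precision (mAP) improves by 3.7% and mAP₅₀ by 4.7%, with a model inference time of only 3.8 ms. On the full-scene surveillance dataset, mAP₅₀ reaches 89.6% and surveillance video is processed at 154 fps, meeting the real-time detection requirements of surveillance sites.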
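The anchor initialization step can be illustrated with a minimal sketch of K-means over ground-truth box sizes. This is not the paper's code: the function `kmeans_anchors`, the deterministic initialization, the plain Euclidean distance, and the sample boxes are all assumptions made for illustration (YOLO-style pipelines often use an IoU-based distance instead).

```python
def kmeans_anchors(boxes, k, iterations=50):
    """Cluster (width, height) pairs into k anchor sizes with plain k-means."""
    # Deterministic initialisation: pick k boxes evenly spaced by area.
    ordered = sorted(boxes, key=lambda wh: wh[0] * wh[1])
    step = len(ordered) / k
    centers = [ordered[int(i * step + step / 2)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for w, h in boxes:
            # Assign each box to the nearest centre (squared Euclidean distance).
            nearest = min(
                range(k),
                key=lambda i: (w - centers[i][0]) ** 2 + (h - centers[i][1]) ** 2,
            )
            clusters[nearest].append((w, h))
        new_centers = []
        for i, members in enumerate(clusters):
            if members:
                new_centers.append((
                    sum(w for w, _ in members) / len(members),
                    sum(h for _, h in members) / len(members),
                ))
            else:
                new_centers.append(centers[i])  # keep an empty cluster's centre
        if new_centers == centers:
            break  # assignments no longer change
        centers = new_centers
    # Return anchors ordered from smallest to largest area.
    return sorted(centers, key=lambda wh: wh[0] * wh[1])

# Hypothetical box sizes in pixels: three small, three medium, three large.
boxes = [(10, 12), (12, 10), (11, 11), (50, 60), (55, 58), (52, 62),
         (120, 200), (130, 190), (125, 210)]
anchors = kmeans_anchors(boxes, k=3)
print(anchors)
```

Each returned pair is the mean width and height of one cluster, so anchors start close to the box shapes that actually occur in the dataset.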

4.

《Journal of Visual Communication and Image Representation》2023


5.

Attention-based multi-scale underwater image enhancement network


方明刘小晗付飞蚺《电子与信息学报》2021,43(12):3513-3521

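Underwater images often suffer from color degradation and blurred details caused by the absorption and scattering of light, which in turn affects underwater vision tasks. This paper synthesizes a dataset closer to real underwater images using an underwater imaging model and designs an attention-based multi-scale underwater image enhancement network in an end-to-end manner. Pixel and channel attention mechanisms are introduced into the network, and a multi-scale feature extraction module is designed to extract features at different levels at the start of the network; the output is obtained after the features pass through convolutional layers with skip connections and attention modules. Experimental results on multiple datasets show that the method performs well on both synthetic and real underwater images, and restores image color and texture details better than existing methods.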

6.

MSIANet: multi-scale interactive attention crowd counting network


张世辉赵维勃王磊王威李群鹏《电子与信息学报》2023,45(6):2236-2245

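Factors such as scale variation, occlusion, and complex backgrounds make crowd counting in congested scenes a challenging task. To cope with scale variation in crowd images and with the scale limitations and feature similarity problems of existing multi-column networks, this paper proposes a Multi-Scale Interactive Attention crowd counting Network (MSIANet). First, a multi-scale attention module is designed that uses four branches with different receptive fields to extract features at different scales and lets the scale features extracted by the branches interact; an attention mechanism is also used to limit the feature similarity problem of multi-column networks. Second, building on the multi-scale attention module, a semantic information fusion module is designed that lets semantic information from different levels of the backbone network interact and stacks multi-scale attention modules hierarchically to make full use of multi-level semantic information. Finally, the multi-scale interactive attention crowd counting network is built from the multi-scale attention module and the semantic information fusion module; it makes full use of multi-level semantic information and multi-scale information to generate high-quality crowd density maps. Experimental results show that, compared with existing representative crowd counting methods, the proposed MSIANet effectively improves the accuracy and robustness of crowd counting.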

7.

Facial expression recognition based on a lightweight ResNet network with an attention mechanism

赵晓;杨晨;王若男;李玥辰《液晶与显示》2023,38(11):1503-1510

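To address the large model size and low accuracy of the ResNet18 network model in facial expression recognition, a lightweight ResNet network model based on an attention mechanism (Multi-Scale CBAM Lightweight ResNet, MCLResNet) is proposed, which recognizes facial expressions with fewer parameters and higher accuracy. First, ResNet18 is used as the backbone network to extract features, and grouped convolution is introduced to reduce the number of parameters of ResNet18; an inverted residual structure is used to increase network depth, improving image feature extraction. Second, the shared fully connected layer in the channel attention module of CBAM (Convolutional Block Attention Module) is replaced with a 1×3 convolution module, which effectively reduces the loss of channel information; a multi-scale convolution module is added to the CBAM spatial attention module to obtain spatial feature information at different scales. Finally, the CBAM module with multi-scale spatial feature fusion (Multi-Scale CBAM, MSCBAM) is added to the lightweight ResNet model, which effectively increases the model's feature representation ability; in addition, a fully connected layer is added to the output layer of the network model with MSCBAM to increase the nonlinear representation of the model at output. Experimental results on the FER2013 and CK+ datasets show that the proposed model has 82.58% fewer parameters than ResNet18 while achieving good recognition accuracy.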
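Why grouped convolution cuts the parameter count can be seen from a short calculation. The sketch below uses the standard parameter formula for a 2D convolution; the layer size (64 to 64 channels, 3×3 kernel) and `groups=4` are illustrative assumptions and are not taken from the paper, whose 82.58% figure results from its full architecture.

```python
def conv_params(in_channels, out_channels, kernel_size, groups=1, bias=False):
    """Number of learnable parameters in a 2D convolution layer."""
    if in_channels % groups or out_channels % groups:
        raise ValueError("channels must be divisible by groups")
    # Each output channel only sees in_channels / groups input channels.
    weights = (in_channels // groups) * out_channels * kernel_size * kernel_size
    return weights + (out_channels if bias else 0)

standard = conv_params(64, 64, 3)            # ordinary convolution
grouped = conv_params(64, 64, 3, groups=4)   # same layer split into 4 groups
reduction = 1 - grouped / standard

print(standard, grouped, f"{reduction:.0%}")
```

With `g` groups, the weight count of a layer drops by a factor of `g`, at the cost of removing the connections between channels that belong to different groups.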

8.

《Journal of Visual Communication and Image Representation》2021


9.

《Journal of Visual Communication and Image Representation》2020

Hyperspectral imagery has been widely used in military and civilian research fields such as crop yield estimation, mineral exploration, and military target detection. However, because of limited imaging equipment and the complex imaging environment, the spatial resolution of hyperspectral images is still relatively low, which limits their application. Studying the data characteristics of hyperspectral images in depth and improving their spatial resolution is therefore an important prerequisite for their accurate interpretation and wide application. The purpose of this paper is to perform super-resolution of hyperspectral images quickly and accurately while maintaining their spectral characteristics, so that the spectral separability of the substrate in the original image remains unchanged after super-resolution processing. The paper first learns the mapping between the spectral difference of the low-resolution hyperspectral image and the spectral difference of the corresponding high-resolution hyperspectral image using a multi-scale convolutional neural network, and then applies this mapping to the input low-resolution hyperspectral image to obtain the corresponding high-resolution spectral difference. A spatial constraint is then imposed using the image reconstructed from the spectral difference: the low-resolution hyperspectral image generated from the reconstructed image is required to be spatially close to the input low-resolution hyperspectral image, so that the whole process forms a closed loop in which the low-resolution hyperspectral image generates a high-resolution hyperspectral image, which is then mapped back to a low-resolution hyperspectral image. This design further enhances the super-resolution performance of the algorithm.
The experimental results show that the CNN-based hyperspectral image super-resolution method improves the spatial information of the input image, and the super-resolution performance of the model is above 90%, while the spectral information is well maintained.

10.

Infrared small target detection model fusing multi-scale fractal attention


谷雨张宏宇孙仕成《电子与信息学报》2023,45(8):3002-3011

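To improve the performance of small target detection in infrared images, this paper combines the prior knowledge of traditional methods with the feature learning ability of deep learning methods and designs an end-to-end infrared small target detection model that fuses multi-scale fractal attention. First, based on an analysis of multi-scale fractal features suitable for detecting dim and small targets in infrared images, a procedure for accelerating their computation with deep learning operators is given. Second, a convolutional neural network (CNN) is designed to learn a metric that yields the target saliency distribution map, and, combining a feature pyramid attention module with a pyramid pooling downsampling module, an attention module based on multi-scale fractal features is proposed. When it is embedded into an infrared target semantic segmentation model, an asymmetric contextual fusion mechanism is adopted to improve the fusion of shallow and deep features, and an asymmetric pyramid non-local module is used to obtain global attention, improving infrared small target detection performance. Finally, the performance of the proposed algorithm is verified on the single-frame infrared small target (SIRST) dataset: the intersection over union (IoU) and normalized intersection over union (nIoU) of the proposed model reach 77.4% and 76.1%, respectively, outperforming currently known methods. Transfer experiments further verify the effectiveness of the proposed model. Because it effectively combines the advantages of traditional and deep learning methods, the proposed model is suitable for infrared small target detection in complex environments.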
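The two reported metrics differ only in where the averaging happens. The sketch below assumes a commonly used convention in infrared small target segmentation: IoU accumulates intersections and unions over the whole dataset, while nIoU averages per-image IoU values. The function names and the tiny masks are illustrative, and the abstract does not state the exact definitions used.

```python
def intersection_and_union(pred, target):
    """Count intersecting and united foreground pixels of two binary masks."""
    inter = union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    return inter, union

def iou(preds, targets):
    """Dataset-level IoU: intersections and unions are summed over all images."""
    pairs = [intersection_and_union(p, t) for p, t in zip(preds, targets)]
    total_union = sum(u for _, u in pairs)
    return sum(i for i, _ in pairs) / total_union if total_union else 1.0

def niou(preds, targets):
    """Normalized IoU: the IoU is computed per image and then averaged."""
    scores = []
    for p, t in zip(preds, targets):
        inter, union = intersection_and_union(p, t)
        scores.append(inter / union if union else 1.0)
    return sum(scores) / len(scores)

# Two hypothetical 2x2 predictions and their ground-truth masks.
preds = [[[1, 1], [0, 0]], [[0, 1], [1, 1]]]
targets = [[[1, 0], [0, 0]], [[0, 1], [1, 1]]]
print(iou(preds, targets), niou(preds, targets))
```

Because nIoU weights every image equally, one poorly segmented image with a tiny target lowers it more than it lowers the dataset-level IoU.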

11.

《Journal of Visual Communication and Image Representation》2021

Dense depth completion is essential for autonomous driving and robotic navigation. Existing methods focus on attaining higher accuracy of the estimated depth, which comes at the price of increased complexity and cannot be applied well in real-time systems. In this paper, a coarse-to-fine and lightweight network (S&CNet) is proposed for dense depth completion to reduce the computational complexity with a negligible sacrifice in accuracy. A dual-stream attention module (S&C enhancer) is proposed according to a new finding on deep neural network-based depth completion; it efficiently captures both the spatial-wise and channel-wise global-range information of the extracted features. It is then plugged in between the encoder and decoder of the coarse estimation network to improve performance. Experiments on the KITTI dataset demonstrate that the proposed approach achieves competitive results with respect to state-of-the-art works at almost four times the speed. The S&C enhancer can also be easily plugged into other existing works to boost their performance significantly with negligible additional computation.

12.

《Journal of Visual Communication and Image Representation》2022

Attention modules embedded in deep networks mediate the selection of informative regions for object recognition. In addition, the combination of features learned from different branches of a network can enhance the discriminative power of these features. However, fusing features with inconsistent scales is a less-studied problem. In this paper, we first propose a multi-scale channel attention network with an adaptive feature fusion strategy (MSCAN-AFF) for face recognition (FR), which fuses the relevant feature channels and improves the network’s representational power. In FR, face alignment is performed independently prior to recognition, which requires the efficient localization of facial landmarks, which might be unavailable in uncontrolled scenarios such as low-resolution and occlusion. Therefore, we propose utilizing our MSCAN-AFF to guide the Spatial Transformer Network (MSCAN-STN) to align feature maps learned from an unaligned training set in an end-to-end manner. Experiments on benchmark datasets demonstrate the effectiveness of our proposed MSCAN-AFF and MSCAN-STN.

13.

Semi-supervised defect detection method for motor magnetic tiles


夏兴华李欣宇韩忠华《移动信息》2024,46(9):334-337

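At present, most dataset samples carry no labels, and manually labeling the samples in a dataset is costly. To address this problem, this paper proposes a semi-supervised detection model based on an improved DCGAN. First, the SE attention mechanism is introduced, which strengthens useful features and suppresses useless ones by learning the relationships between channels, improving the network's learning ability. Then, residual modules are fused with the SE attention mechanism and applied to the generator and discriminator, which reduces vanishing gradients and degradation, effectively improves the quality of generated images, and produces more realistic sample data. Finally, the loss function is optimized to avoid vanishing or exploding gradients during training, making the model easier to converge and further improving the quality of generated samples. Experimental results show that the improved DCGAN semi-supervised detection model reduces the dependence on fully labeled datasets and lowers labor costs.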
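How SE (squeeze-and-excitation) attention reweights channels can be sketched in a few lines. The version below is a dependency-free illustration of the forward pass only, with biases omitted; the function `se_attention`, the toy feature map, and the hand-picked weights are assumptions and do not reproduce the paper's model, where the weights would be learned.

```python
import math

def se_attention(feature_map, w_reduce, w_expand):
    """Apply squeeze-and-excitation channel attention to a C x H x W feature map.

    feature_map: list of C channels, each a list of rows.
    w_reduce:    (C // r) x C weights of the first fully connected layer.
    w_expand:    C x (C // r) weights of the second fully connected layer.
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
                for ch in feature_map]
    # Excitation: FC -> ReLU -> FC -> sigmoid yields a weight in (0, 1) per channel.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed)))
              for row in w_reduce]
    weights = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
               for row in w_expand]
    # Scale: multiply every channel by its weight.
    scaled = [[[v * a for v in row] for row in ch]
              for ch, a in zip(feature_map, weights)]
    return scaled, weights

# Toy example: 2 channels of size 2x2, reduction ratio r = 2.
feature_map = [[[1.0, 3.0], [5.0, 7.0]], [[0.0, 0.0], [2.0, 2.0]]]
w_reduce = [[0.5, -1.0]]      # hypothetical weights, 1 x 2
w_expand = [[2.0], [-2.0]]    # hypothetical weights, 2 x 1
scaled, weights = se_attention(feature_map, w_reduce, w_expand)
print(weights)
```

With these toy weights the first channel receives a weight near 1 and the second a weight near 0, which is the "strengthen useful features, suppress useless ones" behavior the abstract describes.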

14.

N. Goldberg A. Feuer G. C. Goodwin 《Journal of Visual Communication and Image Representation》2003,14(4):508-525

It has been known for some time that temporal dependence (motion) plays a key role in the super-resolution (SR) reconstruction of a single frame (or sequence of frames). While the impact of global time-invariant translations is relatively well known, the general motion case has not been studied in detail. In this paper, we discuss SR reconstruction for both motion models from a frequency-domain point of view. A noniterative algorithm for SR reconstruction is presented using spatio-temporal filtering. The concepts of motion-compensated windows and sinc interpolation kernels are utilized, resulting in a finite impulse response (FIR) filter realization. In the simulations, we assume a priori knowledge of the motion (optical flow), which is commonly done throughout much of the SR reconstruction literature. The proposed process is localized in nature, and this enables the selective reconstruction of desired parts of a particular frame or sequence of frames.

15.

《Journal of Visual Communication and Image Representation》2022

Image deraining is a significant problem because it ensures the visual quality of the images that computer vision systems rely on. However, because rain streak features and global information are captured insufficiently, current image deraining methods often suffer from residual rain streaks and image blurring. In this paper, we propose a Multi-receptive Field Aggregation Network (MRFAN) to restore a cleaner rain-free image. Specifically, we construct a Multi-receptive Field Feature Extraction Block (MFEB) to capture rain features with different receptive fields. In MFEB, we design a Self-supervised Block (SSB) and an Aggregation Block (AGB). SSB makes the network adaptively focus on the critical rain features and rain-covered areas. AGB effectively aggregates and redistributes the multi-scale features to help the network model rain streaks better. Experiments show that our method achieves better results on both synthetic datasets and real-world rainy images.

16.

Super-resolution mosaicing from MPEG compressed video

P. Krämer O. Hadar J. Benois-Pineau J.-P. Domenger 《Signal Processing: Image Communication》2007,22(10):845-865

In this paper we address the problem of mosaic construction from MPEG 1/2 compressed video for the purpose of video browsing. State-of-the-art mosaicing methods work on raw video, but most video content is available in compressed form such as MPEG 1/2. Applying these methods to compressed video requires full decoding, which is very costly. The resulting mosaic is in general too large to display on the screen and is thus inappropriate for the purpose of video browsing. Therefore, we directly extract very low-resolution frames from MPEG 1/2 compressed video for the mosaic construction and then apply a super-resolution (SR) method based on iterative backprojections in order to increase the mosaic resolution and its visual quality. Global motion to be used in the SR method for aligning and warping the frames is estimated from motion information contained in the compressed stream. We also use the estimated global motion in the blur estimation and in the choice of the degradation model used for the restoration in the SR algorithm. The method for SR mosaic construction from MPEG 1/2 compressed video that we present in this paper is less costly than mosaic construction from fully decoded video. Furthermore, the resulting mosaic size is more appropriate for the purpose of video browsing.

17.

Occluded pedestrian detection method based on a joint feature-channel and spatial attention mechanism


陈勇刘曦刘焕淋《电子与信息学报》2020,42(6):1486-1493

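Occlusion is one of the main causes of missed detections in pedestrian detection and adversely affects detector performance. To strengthen the detector's ability to detect occluded pedestrians, this paper proposes a single-stage pedestrian detection method based on a feature-guided attention mechanism. First, a feature-guided attention module is designed that preserves the spatial information of the feature map while maintaining the correlations between feature channels, guiding the model to focus on the visible regions of occluded targets. Then, shallow and deep features are fused through the attention module to extract high-level semantic features of pedestrians. Finally, pedestrian detection is treated as a high-level semantic feature detection problem: pedestrian positions and scales are predicted in the form of activation maps and the final predicted bounding boxes are generated, avoiding the extra parameter settings required by anchor-based prediction. The proposed method is tested on the CityPersons dataset, and cross-dataset experiments are carried out on the Caltech dataset. The results show that the method detects occluded targets more accurately than the other compared algorithms. It also achieves a fast detection speed, striking a balance between detection accuracy and speed.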


18.

《Journal of Visual Communication and Image Representation》2022

Recently, very deep convolutional neural networks (CNNs) have shown strong ability in single image super-resolution (SISR) and have obtained remarkable performance. However, most existing CNN-based SISR methods rarely make explicit use of the high-frequency information of the image to assist reconstruction, which makes the reconstructed image look blurred. To address this problem, a novel contour-enhanced image super-resolution method based on a High and Low Frequency Fusion Network (HLFN) is proposed in this paper. Specifically, a contour learning subnetwork is designed to learn the high-frequency information, which helps it better learn the texture of the image. To reduce the redundancy of the contour information learned by the contour learning subnetwork during fusion, a spatial channel attention block (SCAB) is introduced, which adaptively selects the required high-frequency information. Moreover, a contour loss is designed and used together with the ℓ₁ loss to optimize the network jointly. Comprehensive experiments demonstrate the superiority of HLFN over state-of-the-art SISR methods.

19.

《Journal of Visual Communication and Image Representation》2020

With the tremendous success of visual question answering (VQA) tasks, visual attention mechanisms have become an indispensable part of VQA models. However, these attention-based methods do not consider any relationship among regions, which is crucial for the model's thorough understanding of the image. We propose local relation networks (LRNs) that generate context-aware image features for each image region, which contain information on its relationships with the other image regions. Furthermore, we propose a multilevel attention mechanism to combine semantic information from the LRNs and the original image regions, rendering the decision of the model more reasonable. With these two measures, we improve the region representation and achieve a better attention effect and VQA performance. We conduct numerous experiments on the COCO-QA dataset and the largest VQA v2.0 benchmark dataset. Our model achieves competitive results, and visual demonstrations confirm the effectiveness of the proposed LRNs and multilevel attention mechanism.

20.

Hyperspectral image classification based on a multi-scale hybrid convolutional network

杨云周瑶陈佳宁《液晶与显示》2023,38(3):368-377

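To address the uneven distribution of hyperspectral image data, insufficient extraction of spatial-spectral features, and network degradation as the number of layers increases, a hyperspectral image classification method based on a multi-scale hybrid convolutional network is proposed. First, principal component analysis is used to reduce the dimensionality of the hyperspectral data. Next, neighborhood extraction takes the pixels within a neighborhood as one sample to supply the corresponding spatial information. Then, a multi-scale hybrid convolutional network extracts features from the preprocessed samples, and a mixed-domain attention mechanism is added to strengthen useful information in the spatial and spectral dimensions. Finally, a Softmax classifier assigns a class to each pixel sample. Experiments on the Indian Pines and Pavia University hyperspectral datasets show that the overall accuracy, average accuracy, and Kappa coefficient of the proposed model reach 0.9879, 0.9833, and 0.9862 on the former and 0.9990, 0.9969, and 0.9986 on the latter. The algorithm extracts the feature information of hyperspectral images more fully and achieves better classification results than other classification methods.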
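The three reported scores can all be derived from a confusion matrix. The sketch below uses the standard definitions of overall accuracy, average (per-class) accuracy, and Cohen's Kappa; the function name and the 2×2 matrix are illustrative and are not data from the paper.

```python
def classification_scores(confusion):
    """Return overall accuracy, average accuracy and Cohen's kappa.

    confusion[i][j] counts samples of true class i predicted as class j.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n_classes))
    overall = correct / total
    # Average accuracy: mean of the per-class recalls.
    average = sum(confusion[i][i] / sum(confusion[i])
                  for i in range(n_classes)) / n_classes
    # Agreement expected by chance, from the row and column marginals.
    col_sums = [sum(row[j] for row in confusion) for j in range(n_classes)]
    expected = sum(sum(confusion[i]) * col_sums[i]
                   for i in range(n_classes)) / total ** 2
    kappa = (overall - expected) / (1 - expected)
    return overall, average, kappa

# Hypothetical two-class confusion matrix with 100 samples.
confusion = [[45, 5],
             [10, 40]]
oa, aa, kappa = classification_scores(confusion)
print(oa, aa, kappa)
```

Kappa is lower than the overall accuracy here because it discounts the agreement a classifier would reach by chance, which matters on datasets with unevenly distributed classes.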