20 similar documents found; search time: 0 ms
1.
Recently, very deep convolutional neural networks (CNNs) have shown strong ability in single image super-resolution (SISR) and have obtained remarkable performance. However, most existing CNN-based SISR methods rarely make explicit use of the high-frequency information of the image to assist reconstruction, which makes the reconstructed image look blurred. To address this problem, this paper proposes a novel contour-enhanced High and Low Frequency Fusion Network (HLFN) for image super-resolution. Specifically, a contour learning subnetwork is designed to learn the high-frequency information, which captures image texture better. To reduce the redundancy of the contour information learned by this subnetwork during fusion, a spatial channel attention block (SCAB) is introduced, which adaptively selects the required high-frequency information. Moreover, a contour loss is designed and used together with the overall loss to jointly optimize the network. Comprehensive experiments demonstrate the superiority of HLFN over state-of-the-art SISR methods.
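The abstract does not specify how the contour loss is computed. As a rough illustration only, the sketch below pairs a pixel-wise L1 term with an L1 distance between Laplacian high-frequency maps; the Laplacian extractor, the weighting, and the function names are assumptions, not the authors' implementation.

```python
import numpy as np

def laplacian_contour(img):
    """Extract a high-frequency (contour) map with a 3x3 Laplacian kernel."""
    k = np.array([[0, 1, 0], [1, -4, 1], [0, 1, 0]], dtype=float)
    h, w = img.shape
    pad = np.pad(img, 1, mode="edge")
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(pad[i:i + 3, j:j + 3] * k)
    return out

def hlfn_style_loss(sr, hr, weight=0.1):
    """Hypothetical HLFN-style objective: pixel L1 loss plus a contour loss
    computed on the high-frequency maps of the two images."""
    pixel = np.mean(np.abs(sr - hr))
    contour = np.mean(np.abs(laplacian_contour(sr) - laplacian_contour(hr)))
    return pixel + weight * contour
```

The relative weight of the contour term would in practice be tuned on a validation set.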
2.
Saliency prediction precision has improved rapidly with the development of deep learning, but inference is slow because networks keep getting deeper. Hence, this paper proposes a fast saliency prediction model. Concretely, a siamese network backbone based on a tailored EfficientNetV2 accelerates inference while maintaining high performance, and a shared-parameters strategy further curbs parameter growth. Furthermore, we add multi-channel activation maps to optimize fine features across different channels and low-level visual features, which improves the interpretability of the model. Extensive experiments show that the proposed model achieves competitive performance on standard benchmark datasets and demonstrate the effectiveness of our method in striking a balance between prediction accuracy and inference speed. Moreover, the small model size allows our method to be deployed on edge devices. The code is available at: https://github.com/lscumt/fast-fixation-prediction.
3.
To improve the performance of person re-identification, this paper proposes a hierarchical pedestrian-attribute recognition network based on attention models. Compared with existing algorithms, the model has three advantages. First, in the feature-extraction stage, an attention model is designed for pedestrian-attribute recognition, extracting attribute information together with its degree of saliency. Second, in the recognition stage, the attention model recognizes attributes hierarchically according to their saliency and the amount of information they carry. Third, correlations between attributes are analyzed, and the recognition strategy at each level is adjusted according to the results of the previous level, which improves the recognition accuracy of small-target attributes and, in turn, of person re-identification. Experimental results show that, compared with existing methods, the proposed model effectively improves rank-1 accuracy, reaching 93.1% on the Market1501 dataset and 81.7% on the DukeMTMC dataset.
4.
Person re-identification (ReID) is an intelligent video-surveillance technology that retrieves the same person across different cameras. The task is extremely challenging due to changes in person pose, camera viewpoint, and occlusion. In recent years, deep-learning-based person ReID has received widespread attention owing to the rapid development and excellent performance of deep learning. In this paper, we first divide deep-learning-based person ReID approaches into seven types: models fusing hand-crafted features, representation learning models, metric learning models, part-based models, video-based models, GAN-based models, and unsupervised models, and we give a brief overview of each. We then introduce commonly used datasets, compare the performance of recent algorithms on image and video datasets, and analyze the advantages and disadvantages of the various methods. Finally, we summarize possible future research directions for person ReID.
5.
Saliency prediction on RGB-D images is an underexplored and challenging task in computer vision. We propose a channel-wise attention and contextual interaction asymmetric network for RGB-D saliency prediction. In the proposed network, a common feature extractor provides cross-modal complementarity between the RGB image and the corresponding depth map. In addition, we introduce a four-stream feature-interaction module that fully leverages multiscale and cross-modal features for extracting contextual information. Moreover, we propose a channel-wise attention module to highlight the feature representation of salient regions. Finally, we refine coarse maps through a corresponding refinement block. Experimental results show that the proposed network achieves performance comparable to state-of-the-art saliency prediction methods on two representative datasets.
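The channel-wise attention module is not detailed in the abstract; the following is a generic squeeze-and-excitation-style NumPy sketch of how per-channel reweighting of salient features can work. The layer shapes, weight names, and reduction ratio are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat, w1, w2):
    """SE-style channel attention: squeeze (global average pool per channel),
    excite (two small dense layers), then rescale each channel.
    feat: (C, H, W); w1: (C, C//r); w2: (C//r, C) with reduction ratio r."""
    squeeze = feat.mean(axis=(1, 2))           # (C,) channel descriptors
    hidden = np.maximum(squeeze @ w1, 0.0)     # ReLU bottleneck
    scale = sigmoid(hidden @ w2)               # (C,) weights in (0, 1)
    return feat * scale[:, None, None]         # reweight channel maps
```

In a trained network `w1` and `w2` are learned; here they are just placeholders.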
6.
Convolutional neural network (CNN) based methods have recently achieved extraordinary performance in single image super-resolution (SISR) tasks. However, most existing CNN-based approaches increase model depth by stacking massive kernel convolutions, incurring expensive computational costs and limiting their application on resource-constrained mobile devices. Furthermore, large-kernel convolutions are rarely used in lightweight super-resolution designs. To alleviate these problems, we propose a multi-scale convolutional attention network (MCAN), a lightweight and efficient network for SISR. Specifically, a multi-scale convolutional attention (MCA) module is designed to aggregate the spatial information of different large receptive fields. Since the contextual information of an image has strong local correlation, we design a local feature enhancement unit (LFEU) to further strengthen local feature extraction. Extensive experimental results show that our proposed MCAN achieves better performance with lower model complexity than other state-of-the-art lightweight methods.
7.
Low-light image enhancement is a challenging task because enhancing image brightness and reducing image degradation must be considered simultaneously. Although existing deep-learning-based methods improve the visibility of low-light images, many of them tend to lose details or sacrifice naturalness. To address these issues, we present a multi-stage network for low-light image enhancement, which consists of three sub-networks. More specifically, inspired by the Retinex theory and the bilateral grid technique, we first design a reflectance and illumination decomposition network that efficiently decomposes an image into reflectance and illumination maps. To increase brightness while preserving edge information, we then devise an attention-guided illumination adjustment network. The reflectance and the adjusted illumination maps are fused and refined by adversarial learning to reduce image degradation and improve image naturalness. Experiments are conducted on our rebuilt SICE low-light image dataset, which consists of 1380 real paired images, and on the public LOL dataset, which has 500 real paired images and 1000 synthetic paired images. Experimental results show that the proposed method outperforms state-of-the-art methods both quantitatively and qualitatively.
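As background for the Retinex-based decomposition mentioned above, here is a minimal NumPy sketch that splits an image into a smooth illumination map and a reflectance map. The box-blur illumination estimator stands in for the paper's learned decomposition network and is purely illustrative.

```python
import numpy as np

def box_blur(img, k=5):
    """Simple box filter, used here as a stand-in illumination estimator."""
    pad = k // 2
    p = np.pad(img, pad, mode="edge")
    h, w = img.shape
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = p[i:i + k, j:j + k].mean()
    return out

def retinex_decompose(img, eps=1e-6):
    """Retinex model: img ≈ reflectance * illumination, with illumination
    taken as a smoothed version of the image."""
    illumination = box_blur(img)
    reflectance = img / (illumination + eps)   # eps avoids division by zero
    return reflectance, illumination
```

Brightness can then be adjusted on the illumination map alone (e.g., by a gamma curve) before recombining, which is the role of the paper's adjustment subnetwork.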
8.
Application of Human Visual Attention Mechanisms in Target Detection (total citations: 22; self-citations: 1; citations by others: 22)
Based on human visual perception theory, and building on two representative visual attention models, this paper proposes a target detection algorithm for automatic target recognition that combines a pre-attentive mechanism with a bottom-up control strategy and an attentive mechanism with a top-down control strategy. Starting from the input image, Gabor operators build a multi-scale, multi-orientation, multi-channel representation; full-wave rectification and contrast gain control across channels then yield multi-scale, multi-orientation feature maps, whose linear combination forms the saliency map. Experimental results on ship target detection using only the bottom-up strategy show that the targets are clearly enhanced in the saliency map, which facilitates detection.
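The bottom-up pipeline described above (multi-orientation Gabor filtering, full-wave rectification, linear combination into a saliency map) can be sketched directly in NumPy. The kernel parameters are illustrative and the contrast gain-control step is simplified to a plain average; this is a reconstruction of the general idea, not the authors' code.

```python
import numpy as np

def gabor_kernel(size, theta, lam=4.0, sigma=2.0):
    """Real-valued Gabor kernel at orientation theta."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    return np.exp(-(xr**2 + yr**2) / (2 * sigma**2)) * np.cos(2 * np.pi * xr / lam)

def convolve2d(img, k):
    """Naive same-size 2-D convolution with edge padding."""
    kh = k.shape[0] // 2
    p = np.pad(img, kh, mode="edge")
    h, w = img.shape
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(p[i:i + k.shape[0], j:j + k.shape[1]] * k)
    return out

def bottom_up_saliency(img, orientations=4, size=7):
    """Full-wave rectified multi-orientation Gabor responses,
    linearly combined into a single saliency map."""
    maps = []
    for n in range(orientations):
        theta = n * np.pi / orientations
        resp = np.abs(convolve2d(img, gabor_kernel(size, theta)))  # full-wave rectification
        maps.append(resp)
    return np.mean(maps, axis=0)
```

A full implementation would repeat this at several scales and normalize each channel's contrast before combining, as the abstract describes.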
9.
To solve the problem of low sign-language recognition rates under small-sample conditions, a simple and effective static gesture recognition method based on an attention mechanism is proposed. The proposed method enhances the features of both the details and the subject of the gesture image, and its input depends on the intermediate feature map generated by the original network. Moreover, the proposed convolutional module is a lightweight general module that can be seamlessly integrated into any CNN (convolutional neural network) architecture and achieves significant performance gains with minimal overhead. Experiments on two different datasets show that the proposed method is effective: it improves the sign-language recognition accuracy of the benchmark model, surpassing existing methods.
10.
We considered the prediction of drivers' cognitive states related to driving performance using EEG signals. We proposed a novel channel-wise convolutional neural network (CCNN) whose architecture accounts for the unique characteristics of EEG data. We also discussed CCNN-R, a CCNN variant that uses a Restricted Boltzmann Machine in place of the convolutional filter, and derived the detailed algorithm. To test the performance of CCNN and CCNN-R, we assembled a large EEG dataset from three studies of driver fatigue, including samples from 37 subjects. Using this dataset, we investigated CCNN and CCNN-R on both raw EEG data and Independent Component Analysis (ICA) decompositions. We tested both within-subject and cross-subject prediction, and the results showed that CCNN and CCNN-R achieve robust, improved performance over conventional DNNs and CNNs as well as other non-deep-learning algorithms.
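A channel-wise convolution in this sense applies temporal filtering to each EEG channel independently rather than mixing channels spatially. A minimal sketch, assuming a single shared 1-D filter (the actual network learns many filters plus subsequent layers):

```python
import numpy as np

def channel_wise_conv(eeg, kernel):
    """Apply one shared 1-D temporal filter independently to every EEG
    channel, as in a channel-wise convolution layer.
    eeg: (channels, samples); kernel: (k,); returns (channels, samples-k+1)."""
    c, t = eeg.shape
    k = len(kernel)
    out = np.zeros((c, t - k + 1))
    for ch in range(c):
        out[ch] = np.convolve(eeg[ch], kernel, mode="valid")
    return out
```

Because channels are never mixed, the layer respects the per-electrode structure of EEG recordings, which is the design motivation the abstract alludes to.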
11.
Efficient modulation-recognition methods can improve communication efficiency and drive further development of the communications industry. This paper designs a residual attention network with a front-end LSTM encoder that markedly improves recognition accuracy. The first part presents the basic network structure: the backbone uses residual blocks, a front-end LSTM layer encodes the data sequence, and an attention mechanism with soft-threshold denoising is adopted. The second part uses an open-source dataset containing 24 modulation schemes, designs several experiments to compare network performance, and investigates how data quality, modulation scheme, network depth, and other factors affect recognition performance. Experimental results show that the proposed network performs well, with accuracy remaining above 95% on high-SNR data.
12.
Spatial–temporal information is easy to obtain in practical surveillance scenes, but it is often neglected by most current person re-identification (ReID) methods. Employing spatial–temporal information as a constraint has been verified to benefit ReID; however, there has been no effective modeling of pedestrian movement patterns. In this paper, we present a ReID framework with internal and external spatial–temporal constraints, termed IESC-ReID. A novel residual spatial attention module builds a spatial–temporal constraint and increases robustness to partial occlusion and camera-viewpoint changes. A Laplace-based spatial–temporal constraint is also introduced to eliminate irrelevant gallery images gathered by the internal learning network. IESC-ReID constrains attention within the functioning range of the channel space and uses additional spatial–temporal constraints to further refine the results. Intensive experiments show that these constraints consistently improve performance, and extensive results on numerous publicly available datasets show that the proposed method outperforms several state-of-the-art ReID algorithms. Our code is publicly available at https://github.com/jiaming-wang/IESC.
13.
To address the low detection accuracy, slow convergence, and heavy computation of current panoramic-image saliency detection methods, this paper proposes a U-shaped network based on robust vision Transformers and multiple attention mechanisms (URMNet). The model uses spherical convolution to extract multi-scale features of panoramic images, mitigating the distortion introduced by equirectangular projection. A robust vision Transformer module extracts the saliency information contained in feature maps at four scales, and convolutional embedding is used to reduce feature-map resolution and enhance the model's robustness. A multi-attention module selectively fuses multi-dimensional attention according to the relationship between spatial and channel attention. Finally, multi-level features are fused step by step to form the panoramic saliency map. A latitude-weighted loss function gives the model faster convergence. Experiments on two public datasets show that, owing to the robust vision Transformer and multi-attention modules, the proposed model outperforms six other state-of-the-art methods and further improves panoramic saliency detection accuracy.
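The latitude-weighted loss exploits the fact that equirectangular projection oversamples rows near the poles. A plausible NumPy sketch weights each pixel row by the cosine of its latitude; the exact weighting and base loss used in the paper may differ.

```python
import numpy as np

def latitude_weighted_loss(pred, target):
    """MSE weighted by cos(latitude): rows near the equator of an
    equirectangular panorama count more than heavily stretched polar rows.
    pred, target: (h, w) saliency maps."""
    h, w = pred.shape
    # row centers mapped to latitudes in (-pi/2, pi/2)
    lat = (np.arange(h) + 0.5) / h * np.pi - np.pi / 2
    wgt = np.cos(lat)[:, None]                  # (h, 1), broadcast over columns
    return np.sum(wgt * (pred - target) ** 2) / (np.sum(wgt) * w)
```

Down-weighting polar rows keeps the optimizer from spending capacity on regions that are artifacts of the projection rather than of the scene.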
14.
Tracking-by-detection (TBD) is a significant framework for visual object tracking. However, current trackers are usually updated online by random sampling from a probability distribution, and the performance of learning-based TBD trackers is limited by the lack of discriminative features, especially when the background is full of semantic distractors. We propose an attention-driven data augmentation method in which a residual attention mechanism is integrated into the TBD tracking network as a supplementary reference for identifying discriminative image features. A mask-generating network simulates changes in target appearance to obtain positive samples, combining attention information with image features to identify discriminative features. In addition, we propose a hard-negative mining method that searches for semantic distractors using the response of the attention module. Experiments on the OTB2015, UAV123, and LaSOT benchmarks show that this method achieves competitive performance in terms of accuracy and robustness.
15.
17.
Existing deraining methods based on convolutional neural networks (CNNs) have achieved great success, but remaining rain streaks can still degrade images drastically. In this work, we propose an end-to-end multi-scale context information and attention network, called MSCIANet. The proposed network consists of multi-scale feature extraction (MSFE) and multi-receptive-field feature extraction (MRFFE). Firstly, the MSFE picks up rain-streak features at different scales and propagates deep features of the two layers across stages via skip connections. Secondly, the MRFFE refines background details through an attention mechanism and depthwise separable convolutions with different receptive fields and scales. Finally, fusing the outputs of the two subnetworks reconstructs the clean background image. Extensive experimental results show that the proposed network performs well on the deraining task on both synthetic and real-world datasets. A demo is available at https://github.com/CoderLi365/MSCIANet.
18.
Light is absorbed and scattered underwater, so underwater images suffer from color casts, blur, and occlusion, which hinder underwater vision tasks. Traditional image enhancement methods use histogram equalization, gamma correction, and white balance, each of which enhances underwater images reasonably well; however, little work has studied the complementarity and correlation of fusing the three. This paper therefore proposes an underwater image enhancement network based on a multi-branch hybrid attention mechanism. First, a multi-branch feature-extraction module extracts contrast, brightness, and color features through histogram-equalization, gamma-correction, and white-balance branches. The features of the three branches are then fused to strengthen their complementarity. Finally, a hybrid attention learning module deeply mines the correlation matrices of the three branches in contrast, brightness, and color, and skip connections are introduced to strengthen the image output. Experiments on several datasets show that the method effectively removes color casts and blur from underwater images and improves their brightness.
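The three traditional branches named above are all standard operations. For reference, here are minimal NumPy versions of the gamma-correction, gray-world white-balance, and histogram-equalization steps; the parameter values are illustrative, and the paper's learned branches differ.

```python
import numpy as np

def gamma_correct(img, gamma=0.7):
    """Brightness branch: gamma correction on an image in [0, 1]."""
    return np.clip(img, 0, 1) ** gamma

def gray_world_white_balance(img):
    """Color branch (gray-world assumption): scale each RGB channel so its
    mean matches the global mean. img: (h, w, 3) in [0, 1]."""
    means = img.mean(axis=(0, 1))              # per-channel means, shape (3,)
    scale = means.mean() / (means + 1e-6)
    return np.clip(img * scale, 0, 1)

def hist_equalize(channel, bins=256):
    """Contrast branch: histogram equalization of one channel in [0, 1],
    remapping each value through the empirical CDF."""
    hist, edges = np.histogram(channel, bins=bins, range=(0.0, 1.0))
    cdf = hist.cumsum() / channel.size
    return np.interp(channel.ravel(), edges[:-1], cdf).reshape(channel.shape)
```

In the paper these hand-crafted outputs are only the branch inputs; the fusion and attention over them are learned.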
19.
This paper proposes AMEA-GAN, an attention-mechanism-enhanced, cycle-consistency-based generative adversarial network for single image dehazing that follows the mechanism of the human retina and to a great extent preserves the color authenticity of enhanced images. To address the color distortion and fog artifacts that most dehazing methods produce on real-world images, we draw on human visual neurons, using an attention mechanism modeled on the retina's horizontal and amacrine cells to improve the structure of the generative adversarial network. With the proposed attention mechanism, haze removal becomes more natural and leaves no artifacts, especially in dense-fog areas. We also use an improved symmetrical structure of FUNIE-GAN to improve visual color perception, i.e., the color authenticity of the enhanced image, producing a better visual effect. Experimental results show that our model generates satisfactory results: the output of AMEA-GAN bears a strong sense of reality. Compared with state-of-the-art methods, AMEA-GAN not only dehazes images taken in daytime scenes but also enhances nighttime images and even optical remote-sensing imagery.
20.
Underwater images often suffer color degradation and blurred details because of light absorption and scattering, which hinders underwater vision tasks. This paper synthesizes a dataset closer to real underwater images via an underwater imaging model and designs, in an end-to-end manner, an attention-based multi-scale underwater image enhancement network. Pixel and channel attention mechanisms are introduced into the network, and a multi-scale feature-extraction module extracts features at different levels at the start of the network; the output is obtained after passing through convolutional layers with skip connections and attention modules. Experiments on several datasets show that the method performs well on both synthetic and real underwater images and restores image color and texture detail better than existing methods.