Similar Documents
20 similar documents found (search time: 15 ms)
1.
    
Designing efficient deep neural networks has attracted great interest in image super-resolution (SR). However, exploring diverse network structures is computationally expensive. More importantly, each layer in a network has a distinct role, which calls for a specialized structure. In this work, we present a novel neural architecture search (NAS) algorithm that efficiently explores layer-wise structures. Specifically, we construct a supernet that allows flexibility in choosing the number of channels and the per-channel activation functions according to the role of each layer. The search runs efficiently via channel pruning, since gradient descent jointly optimizes the Mult-Adds and the accuracy of the searched models. We estimate the model Mult-Adds in a differentiable manner using relaxations in the backward pass. The searched model, named FGNAS, outperforms state-of-the-art NAS-based SR methods by a large margin.
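The abstract says the Mult-Adds cost is made differentiable through relaxations so it can be optimized jointly with accuracy; the paper's exact relaxation is not given here, so the following is a minimal hypothetical sketch of one way to do this with soft channel-selection probabilities. The function name `expected_mult_adds` and the candidate channel counts are illustrative, not from the paper.

```python
import torch
import torch.nn.functional as F

def expected_mult_adds(logits_in, logits_out, kernel_size, out_h, out_w):
    """Differentiable estimate of a conv layer's Mult-Adds.

    logits_in / logits_out: unnormalized scores over candidate channel
    counts for the layer's input and output (hypothetical relaxation).
    """
    # Candidate channel counts, e.g. 16, 32, 48, 64 (illustrative).
    candidates_in = torch.tensor([16., 32., 48., 64.])
    candidates_out = torch.tensor([16., 32., 48., 64.])
    # Soft selection: expected number of channels under a softmax relaxation.
    exp_c_in = (F.softmax(logits_in, dim=0) * candidates_in).sum()
    exp_c_out = (F.softmax(logits_out, dim=0) * candidates_out).sum()
    # Mult-Adds of a conv layer: Cin * Cout * k * k * H_out * W_out.
    return exp_c_in * exp_c_out * kernel_size ** 2 * out_h * out_w

# A cost term like this can be added to the task loss so that gradient
# descent trades accuracy against model complexity:
logits_in = torch.zeros(4, requires_grad=True)
logits_out = torch.zeros(4, requires_grad=True)
cost = expected_mult_adds(logits_in, logits_out, kernel_size=3, out_h=64, out_w=64)
cost.backward()   # gradients flow into the channel-selection logits
```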

2.
    
Existing deraining methods based on convolutional neural networks (CNNs) have achieved great success, but the rain streaks they leave behind can still degrade images drastically. In this work, we propose an end-to-end multi-scale context information and attention network, called MSCIANet. The proposed network consists of a multi-scale feature extraction (MSFE) subnetwork and a multi-receptive-field feature extraction (MRFFE) subnetwork. First, the MSFE picks up features of rain streaks at different scales and propagates deep features across stages through skip connections between the two layers. Second, the MRFFE refines details of the background via an attention mechanism and depthwise separable convolutions with different receptive fields at different scales. Finally, the outputs of the two subnetworks are fused to reconstruct the clean background image. Extensive experimental results show that the proposed network performs well on the deraining task on both synthetic and real-world datasets. A demo is available at https://github.com/CoderLi365/MSCIANet.
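The MRFFE branch is described as using depthwise separable convolutions with different receptive fields; the exact block is not reproduced in the abstract, so below is a generic hedged sketch of a depthwise separable convolution whose dilation rate controls the receptive field. Module and argument names are illustrative.

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """Depthwise conv (per-channel spatial filter) followed by a 1x1
    pointwise conv; the dilation rate enlarges the receptive field."""
    def __init__(self, channels, dilation=1):
        super().__init__()
        self.depthwise = nn.Conv2d(channels, channels, kernel_size=3,
                                   padding=dilation, dilation=dilation,
                                   groups=channels, bias=False)
        self.pointwise = nn.Conv2d(channels, channels, kernel_size=1, bias=False)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(self.pointwise(self.depthwise(x)))

# Three branches with growing receptive fields, as one might combine
# in a multi-receptive-field block (illustrative fusion only):
x = torch.randn(1, 32, 64, 64)
branches = [DepthwiseSeparableConv(32, dilation=d) for d in (1, 2, 4)]
fused = sum(b(x) for b in branches)   # simple additive fusion
```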

3.
    
While some denoising methods based on deep learning achieve superior results on synthetic noise, they are far from able to handle photographs corrupted by realistic noise. Denoising real-world noisy images poses greater challenges because the noise source is more complicated than in the synthetic case. To address this issue, we propose a novel network comprising a noise estimation module and a removal module (NERNet). The noise estimation module automatically estimates a noise level map from the information extracted by a symmetric dilated block and a pyramid feature fusion block. The removal module then removes the noise from the noisy input with the help of the estimated noise level map. A dilation selective block with an attention mechanism in the removal module not only adaptively fuses features from convolution layers with different dilation rates but also aggregates global and local information, which helps preserve more details and textures. Experiments on two synthetic-noise datasets and three realistic-noise datasets show that NERNet achieves competitive results compared with other state-of-the-art methods.
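The two-module design (estimate a noise level map, then remove noise conditioned on it) can be summarized with a short hedged sketch; the real NERNet blocks (symmetric dilated block, pyramid feature fusion, dilation selective block) are more elaborate than the placeholder layers shown here.

```python
import torch
import torch.nn as nn

def conv_block(c_in, c_out):
    return nn.Sequential(nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU(inplace=True))

class TwoStageDenoiser(nn.Module):
    """Hypothetical two-stage pipeline: a noise-estimation subnetwork predicts
    a per-pixel noise level map, and a removal subnetwork takes the noisy
    image concatenated with that map and predicts the clean image."""
    def __init__(self, feat=32):
        super().__init__()
        self.estimator = nn.Sequential(conv_block(3, feat), conv_block(feat, feat),
                                       nn.Conv2d(feat, 1, 3, padding=1))
        self.remover = nn.Sequential(conv_block(4, feat), conv_block(feat, feat),
                                     nn.Conv2d(feat, 3, 3, padding=1))

    def forward(self, noisy):
        level_map = self.estimator(noisy)                 # (N, 1, H, W)
        cond = torch.cat([noisy, level_map], dim=1)       # condition removal on it
        return self.remover(cond), level_map

denoised, level = TwoStageDenoiser()(torch.randn(2, 3, 64, 64))
```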

4.
喻翔 《移动信息》2024,46(2):135-139
An efficient modulation recognition method can improve communication efficiency and promote further development of the communication industry. This paper designs a residual attention network with a front-end LSTM encoder, which markedly improves recognition accuracy. The first part presents the basic structure of the network: the backbone uses a residual structure, a preceding LSTM layer encodes the data sequence, and an attention mechanism with soft-threshold denoising is adopted. The second part uses an open-source dataset containing 24 modulation schemes as the data source, designs several experiments to compare the network's performance, and investigates how data quality, modulation scheme, network depth, and other factors affect recognition performance. The experimental results show that the proposed network performs well, with accuracy exceeding 95% on high-SNR data.
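The "attention with soft-threshold denoising" described above resembles the shrinkage idea of learning a per-channel threshold and applying soft thresholding to features; the sketch below is a generic hedged illustration of that idea, not the paper's exact block.

```python
import torch
import torch.nn as nn

class SoftThresholdAttention(nn.Module):
    """Learns a per-channel threshold tau from globally pooled features and
    applies soft thresholding sign(x) * max(|x| - tau, 0) to suppress noise."""
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):                       # x: (N, C, T) feature sequence
        abs_mean = x.abs().mean(dim=-1)         # (N, C) global average of |x|
        tau = abs_mean * self.fc(abs_mean)      # threshold scaled into (0, abs_mean)
        tau = tau.unsqueeze(-1)                 # (N, C, 1)
        return torch.sign(x) * torch.clamp(x.abs() - tau, min=0.0)

out = SoftThresholdAttention(64)(torch.randn(8, 64, 128))
```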

5.
Person re-identification (ReID) is an intelligent video surveillance technology that retrieves the same person across different cameras. The task is extremely challenging due to changes in person pose, differing camera views, and occlusion. In recent years, person ReID based on deep learning has received widespread attention owing to the rapid development and excellent performance of deep learning. In this paper, we first divide deep-learning-based person ReID approaches into seven types: models fusing hand-crafted features, representation learning models, metric learning models, part-based deep models, video-based models, GAN-based models, and unsupervised models, and give a brief overview of each type. We then introduce commonly used datasets, compare the performance of recent algorithms on image and video datasets, and analyze the advantages and disadvantages of the various methods. Finally, we summarize possible future research directions for person ReID technology.

6.
    
The latest deep neural networks for medical segmentation typically rely on transposed convolutional filters for spatial restoration and atrous convolutional filters for larger receptive fields, which leads to dilution and inconsistency of visual semantics. To address these issues, we propose a novel attentional up-concatenation structure that builds an auxiliary path for direct access to multi-level features. In addition, we employ a new structural loss to bring better morphological awareness and reduce the segmentation flaws caused by semantic inconsistencies. Thorough experiments on the challenging optic cup/disc segmentation, cellular segmentation, and lung segmentation tasks were performed to evaluate the proposed methods. Ablation analysis demonstrated the effectiveness of the model's components and illustrated its efficiency. The proposed methods achieved the best performance and speed compared to state-of-the-art models on three tasks across seven public datasets, including DRISHTI-GS, RIM-r3, REFUGE, MESSIDOR, TNBC, GlaS and LUNA.
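As a rough hedged illustration of "attentional up-concatenation" (upsampling a decoder feature, weighting a lower-level feature with an attention map, and concatenating them), the following sketch uses generic placeholder layers; it is not the authors' exact structure.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionalUpConcat(nn.Module):
    """Upsample the deep feature, compute a spatial attention map from the
    (upsampled deep, shallow) pair, gate the shallow feature with it, and
    concatenate the two -- a generic attention-gated skip connection."""
    def __init__(self, deep_ch, shallow_ch):
        super().__init__()
        self.attn = nn.Sequential(
            nn.Conv2d(deep_ch + shallow_ch, 1, kernel_size=1), nn.Sigmoid())

    def forward(self, deep, shallow):
        up = F.interpolate(deep, size=shallow.shape[-2:], mode='bilinear',
                           align_corners=False)
        gate = self.attn(torch.cat([up, shallow], dim=1))   # (N, 1, H, W)
        return torch.cat([up, gate * shallow], dim=1)

fused = AttentionalUpConcat(128, 64)(torch.randn(1, 128, 16, 16),
                                     torch.randn(1, 64, 32, 32))
```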

7.
    
The power of convolutional neural networks (CNNs) has demonstrated irreplaceable advantages in super-resolution. However, many CNN-based methods need large models to achieve superior performance, making them difficult to deploy in practice under limited memory footprints. To balance model complexity and performance efficiently, we propose a multi-scale attention network (MSAN) that cascades multiple multi-scale attention blocks (MSAB), each of which integrates a multi-scale cross block (MSCB) and a multi-path wide-activated attention block (MWAB). Specifically, MSCB hierarchically connects three parallel convolutions with different dilation rates to aggregate feature knowledge at different levels and scales. MWAB then splits the channel features from MSCB into three portions to further improve performance. Rather than being treated equally and independently, each portion is responsible for a specific function, enabling internal communication among channels. Experimental results show that MSAN outperforms most state-of-the-art methods with relatively few parameters and Mult-Adds.
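The channel-split idea in MWAB (dividing features into portions that are processed differently and then recombined) can be sketched as below; the actual per-portion operations are not specified in this abstract, so the three branches here are placeholders.

```python
import torch
import torch.nn as nn

class ChannelSplitBlock(nn.Module):
    """Split channels into three portions, give each its own (placeholder)
    transform, then concatenate -- a hedged sketch of per-portion processing."""
    def __init__(self, channels):
        super().__init__()
        c = channels // 3
        self.branch_a = nn.Conv2d(c, c, 3, padding=1)              # local detail
        self.branch_b = nn.Conv2d(c, c, 3, padding=2, dilation=2)  # wider context
        self.branch_c = nn.Sequential(                             # channel gating
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels - 2 * c, channels - 2 * c, 1),
            nn.Sigmoid())

    def forward(self, x):
        c = x.shape[1] // 3
        a, b, rest = x[:, :c], x[:, c:2 * c], x[:, 2 * c:]
        return torch.cat([self.branch_a(a), self.branch_b(b),
                          rest * self.branch_c(rest)], dim=1)

y = ChannelSplitBlock(48)(torch.randn(1, 48, 32, 32))
```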

8.
    
Compared with traditional image denoising methods, convolutional neural networks (CNNs) offer better denoising performance, but an important issue remains unresolved: the residual image, obtained by learning the difference between noisy and clean image pairs, contains abundant image detail, which leads to a serious loss of detail in the denoised image. In this paper, in order to relearn the lost detail information, a mathematical model is derived from a minimization problem and an end-to-end detail-retaining CNN (DRCNN) is proposed. Unlike most CNN-based denoising methods, DRCNN focuses not only on image denoising but also on the integrity of high-frequency image content. DRCNN needs fewer parameters and less storage space, and therefore generalizes better. Moreover, DRCNN can adapt to different image restoration tasks such as blind image denoising, single image super-resolution (SISR), blind deblurring, and image inpainting. Extensive experiments show that DRCNN outperforms several classic and recent methods.
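To make the "residual image" formulation in this abstract concrete, here is a minimal residual-learning denoiser in the DnCNN style (a baseline sketch, not DRCNN itself): the network predicts the residual between the noisy and clean image, and the abstract's point is that this residual also carries fine detail that plain subtraction can discard.

```python
import torch
import torch.nn as nn

class ResidualDenoiser(nn.Module):
    """Baseline residual-learning denoiser: the network predicts the residual
    (mostly noise, but also lost detail) and subtracts it from the input."""
    def __init__(self, feat=64, depth=5):
        super().__init__()
        layers = [nn.Conv2d(3, feat, 3, padding=1), nn.ReLU(inplace=True)]
        for _ in range(depth - 2):
            layers += [nn.Conv2d(feat, feat, 3, padding=1), nn.ReLU(inplace=True)]
        layers += [nn.Conv2d(feat, 3, 3, padding=1)]
        self.body = nn.Sequential(*layers)

    def forward(self, noisy):
        residual = self.body(noisy)       # predicted residual image
        return noisy - residual           # denoised estimate

clean_est = ResidualDenoiser()(torch.randn(1, 3, 64, 64))
```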

9.
    
The application of convolutional neural networks (CNNs) to removing additive white Gaussian noise (AWGN) from images has attracted considerable attention with the rapid development of deep learning in recent years. However, little work has addressed the removal of multiplicative speckle noise. Moreover, most existing speckle removal algorithms are traditional methods built on human prior knowledge, meaning their parameters must be set manually, whereas deep learning methods show clear advantages in image feature extraction. Multiplicative speckle noise is very common in real-life images, especially medical images. In this paper, a novel neural network structure is proposed to recover images corrupted by speckle noise. The proposed method consists of three subnetworks: one produces a rough estimate of the clean image, another estimates the noise, and the last is an information fusion network based on U-Net and several convolutional layers. Unlike existing speckle denoising models based on image statistics, the proposed model handles speckle denoising at different noise levels with a single end-to-end trainable model. Extensive experimental results on several test datasets clearly demonstrate the superior performance of the proposed network over the state of the art in terms of quantitative metrics and visual quality.
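Multiplicative speckle noise (as opposed to additive Gaussian noise) multiplies the clean image by a random field. A minimal illustration of simulating it with gamma-distributed noise, a common speckle model, is shown below; the distribution and parameters are illustrative and not taken from the paper.

```python
import numpy as np

def add_speckle(clean, looks=4, rng=None):
    """Simulate multiplicative speckle: y = x * n, with n drawn from a
    Gamma(looks, 1/looks) distribution (mean 1), a common speckle model."""
    rng = np.random.default_rng() if rng is None else rng
    noise = rng.gamma(shape=looks, scale=1.0 / looks, size=clean.shape)
    return clean * noise

clean = np.random.rand(64, 64)          # stand-in clean image in [0, 1]
noisy = add_speckle(clean, looks=4)
```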

10.
    
With the rapid development of the mobile Internet and digital technology, people are increasingly keen to share pictures on social networks, and the number of online pictures has exploded. Retrieving similar images from large-scale collections has long been a hot issue in image retrieval, and the choice of image features largely determines retrieval performance. Compared with traditional feature extraction methods, convolutional neural networks (CNNs), with their many hidden layers, have more complex structures and stronger feature learning and expression ability. Because global CNN features cannot effectively describe local details in image retrieval tasks, a strategy of aggregating low-level CNN feature maps to generate local features is proposed: high-level CNN features carry more semantic information, low-level features attend to local details, and the model's increasingly abstract representations from low to high layers are exploited accordingly. This paper presents a probabilistic semantic retrieval algorithm, proposes a CNN-based probabilistic semantic hash retrieval method, and designs a new end-to-end supervised learning framework that learns semantic features and hash features simultaneously to achieve fast image retrieval. Using the convolutional network, the error rate on the test set is reduced to 14.41%. On three open image libraries, namely Oxford, Holidays, and ImageNet, the performance of traditional SIFT-based retrieval algorithms and other CNN-based image retrieval algorithms is compared and analyzed. The experimental results show that the proposed algorithm is superior to the other compared algorithms in terms of overall retrieval effectiveness and retrieval time.
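The paper's probabilistic semantic hash method is not reproduced in the abstract; as a generic hedged sketch of joint semantic-and-hash learning, the head below projects CNN features to relaxed codes during training and to binary codes for retrieval. All names and dimensions are illustrative.

```python
import torch
import torch.nn as nn

class HashHead(nn.Module):
    """Generic deep-hashing head: project CNN features to k real values,
    squash with tanh during training, and take the sign at retrieval time."""
    def __init__(self, feat_dim=512, bits=48, num_classes=10):
        super().__init__()
        self.hash_fc = nn.Linear(feat_dim, bits)
        self.cls_fc = nn.Linear(bits, num_classes)   # joint semantic supervision

    def forward(self, features):
        h = torch.tanh(self.hash_fc(features))       # relaxed codes in (-1, 1)
        return h, self.cls_fc(h)

    @torch.no_grad()
    def binary_codes(self, features):
        return torch.sign(self.hash_fc(features))    # {-1, +1} codes for retrieval

head = HashHead()
codes, logits = head(torch.randn(4, 512))
# Retrieval compares binary codes by Hamming distance:
db = head.binary_codes(torch.randn(100, 512))
query = head.binary_codes(torch.randn(1, 512))
hamming = (db != query).sum(dim=1)                   # smaller = more similar
```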

11.
    
Recently, very deep convolutional neural networks (CNNs) have shown strong ability in single image super-resolution (SISR) and obtained remarkable performance. However, most existing CNN-based SISR methods rarely make explicit use of the high-frequency information in the image to assist reconstruction, so the reconstructed image tends to look blurred. To address this problem, a novel contour-enhanced image super-resolution method based on a High and Low Frequency Fusion Network (HLFN) is proposed in this paper. Specifically, a contour learning subnetwork is designed to learn the high-frequency information, which better captures the texture of the image. To reduce the redundancy of the contour information learned by the contour learning subnetwork during fusion, a spatial channel attention block (SCAB) is introduced, which adaptively selects the required high-frequency information. Moreover, a contour loss is designed and used together with the L1 loss to optimize the network jointly. Comprehensive experiments demonstrate the superiority of HLFN over state-of-the-art SISR methods.
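As a hedged sketch of combining a reconstruction loss with a contour loss, the snippet below uses a fixed Laplacian filter as a stand-in contour extractor and adds an L1 penalty on the contour maps; the paper's actual contour definition and weighting may differ.

```python
import torch
import torch.nn.functional as F

# Fixed 3x3 Laplacian kernel used here as a stand-in high-frequency/contour
# extractor; the paper's contour definition may differ.
LAPLACIAN = torch.tensor([[0., 1., 0.], [1., -4., 1.], [0., 1., 0.]]).view(1, 1, 3, 3)

def contour(img):
    c = img.shape[1]
    kernel = LAPLACIAN.repeat(c, 1, 1, 1).to(img.device)
    return F.conv2d(img, kernel, padding=1, groups=c)

def hlfn_style_loss(sr, hr, weight=0.1):
    """L1 reconstruction loss plus a weighted L1 loss on contour maps."""
    return F.l1_loss(sr, hr) + weight * F.l1_loss(contour(sr), contour(hr))

loss = hlfn_style_loss(torch.rand(2, 3, 64, 64), torch.rand(2, 3, 64, 64))
```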

12.
    
In this paper, the CNN feature representation of an image is used to hide a secret image inside a cover image. The style of the cover image conceals the content of the secret image, and a Neural Style Transfer (NST) algorithm produces a stego image that resembles the cover image while also containing the semantic content of the secret image. The main technical contribution is to hide the content of the secret image within the intermediate hidden-layer style features of the cover image, which is the first of its kind in the current state of the art. To recover the secret image from the stego image, destylization is performed with conditional generative adversarial networks (GANs) using Residual in Residual Dense Blocks (RRDBs). Further, stego images from different layer combinations of content and style features are obtained and evaluated. Evaluation is based on the visual similarity and quality loss between the cover-stego pair and the secret-reconstructed secret pair of images. Experiments show that the proposed algorithm achieves a 43.95 dB Peak Signal-to-Noise Ratio (PSNR), 0.995 Structural Similarity Index (SSIM), and 0.993 Visual Information Fidelity (VIF) on the ImageNet dataset, and is more robust against StegExpose than traditional methods.
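The reported figures are standard fidelity metrics. For reference, PSNR between a cover-stego pair can be computed as below; the dataset-specific values quoted above come from the paper, not from this snippet, and the sample images here are random stand-ins.

```python
import numpy as np

def psnr(reference, test, max_val=255.0):
    """Peak Signal-to-Noise Ratio in dB: 10 * log10(MAX^2 / MSE)."""
    mse = np.mean((reference.astype(np.float64) - test.astype(np.float64)) ** 2)
    if mse == 0:
        return float('inf')
    return 10.0 * np.log10(max_val ** 2 / mse)

cover = np.random.randint(0, 256, (256, 256, 3), dtype=np.uint8)
stego = np.clip(cover + np.random.randint(-2, 3, cover.shape), 0, 255).astype(np.uint8)
print(psnr(cover, stego))
```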

13.
Owing to limitations of devices and fabrication processes, infrared images have lower resolution than visible-light images and suffer from blurred detail and texture. To address this, this paper proposes an infrared image super-resolution reconstruction method based on a deep convolutional neural network (CNN). The method improves the residual module, reducing the influence of the activation function on the information flow while deepening the network, so that the original information of the low-resolution infrared image is fully exploited. An efficient channel attention mechanism is combined with a channel-spatial attention module so that more feature information is selectively captured during reconstruction, helping to reconstruct the high-frequency details of infrared images more accurately. Experimental results show that the peak signal-to-noise ratio (PSNR) of infrared images reconstructed by the proposed method is better than that of the traditional bicubic interpolation method and the CNN-based SRResNet, EDSR, and RCAN models. At scale factors of ×2 and ×4, the average PSNR of the reconstructed images is 4.57 dB and 3.37 dB higher, respectively, than with traditional bicubic interpolation.
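The "efficient channel attention" mentioned above is commonly implemented as in the public ECA-Net formulation: global average pooling followed by a 1-D convolution across channels and a sigmoid gate. The sketch below follows that public formulation and is not taken from the paper itself.

```python
import torch
import torch.nn as nn

class EfficientChannelAttention(nn.Module):
    """ECA-style channel attention: pool spatially, run a 1-D conv across the
    channel dimension (local cross-channel interaction), and gate with sigmoid."""
    def __init__(self, k_size=3):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.conv = nn.Conv1d(1, 1, kernel_size=k_size, padding=k_size // 2, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):                                   # x: (N, C, H, W)
        y = self.pool(x)                                    # (N, C, 1, 1)
        y = self.conv(y.squeeze(-1).transpose(1, 2))        # (N, 1, C)
        y = self.sigmoid(y.transpose(1, 2).unsqueeze(-1))   # (N, C, 1, 1)
        return x * y

out = EfficientChannelAttention()(torch.randn(1, 64, 32, 32))
```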

14.
    
A screen content image (SCI) is a composite image containing textual and pictorial regions, which poses many difficulties for image quality assessment (IQA). Large SCIs are divided into image patches to increase the number of training samples for CNN-based IQA, and this brings two problems: (1) the local quality of each image patch does not equal the subjective differential mean opinion score (DMOS) of the entire image; (2) different image patches are not equally important for quality assessment. In this paper, we propose a novel no-reference (NR) IQA model based on a convolutional neural network (CNN) for assessing the perceptual quality of SCIs. Our model addresses the two problems with two corresponding strategies. First, to imitate the behavior of full-reference (FR) CNN-based models, a CNN-based model is designed for both FR and NR IQA, and the performance of the NR-IQA part improves when the image patch scores predicted by the FR-IQA part are adopted as the ground truth to train the NR-IQA part. Second, the patch qualities of one entire SCI are fused into the SCI quality with an adaptive weighting method that takes into account the content of the different image patches. Experimental results verify that our model outperforms all tested NR IQA methods and most FR IQA methods on the screen content image quality assessment database (SIQAD). In cross-database evaluation, the proposed method outperforms the existing NR IQA method by at least 2.4 percent in PLCC and 2.8 percent in SRCC, demonstrating high generalization ability and effectiveness.
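Fusing per-patch quality predictions into one image-level score with content-dependent weights amounts to a weighted average; the minimal sketch below illustrates that idea, while the specific weighting rule used in the paper may differ.

```python
import torch

def fuse_patch_scores(patch_scores, patch_weights, eps=1e-8):
    """Fuse per-patch quality predictions into one image score with
    content-dependent weights (weighted average)."""
    w = patch_weights.clamp(min=0)
    return (patch_scores * w).sum() / (w.sum() + eps)

# e.g. 16 patches from one SCI, each with a predicted score and a weight
scores = torch.rand(16) * 100          # hypothetical DMOS-scale predictions
weights = torch.rand(16)               # e.g. derived from patch content
image_quality = fuse_patch_scores(scores, weights)
```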

15.
    
In this work, a framework that can automatically create cartoon images with low computational resources and small training datasets is proposed. The proposed system performs region segmentation and learns a region relationship tree from each training image. The segmented regions are clustered automatically with an enhanced clustering mechanism that requires no prior knowledge of the number of clusters. According to the topology represented by the region relationship tree and the clustering results, the regions are reassembled to create new images. A swarm intelligence optimization procedure coordinates the regions to optimized sizes and positions in the created image, and rigid deformation using moving least squares is applied to the regions to generate more variety. Compared with methods based on Generative Adversarial Networks, the proposed framework creates better images with limited computational resources and a very small number of training samples.

16.
Bone age (BA) is one of the important indicators for assessing whether a child's growth and development are normal. The CHN scoring method, based on the Chinese wrist bone development standard, is currently one of the most commonly used methods for bone age assessment (BAA) of Chinese children. However, in the CHN method the developmental indicators of some reference-bone atlas stages span a wide range, so experts judge the developmental stage subjectively from personal experience, which affects assessment accuracy; when deep learning is used to assess the developmental stages of such atlases, the results become correspondingly random. Based on more than 20,000 expert-assessed X-ray images of children's hands and wrists, this paper builds on the CHN scoring method by delineating new maturity indicators between reference-bone standard atlas stages whose adjacent developmental stages are widely separated, producing a refined atlas, and uses the analytic hierarchy process to assign corresponding maturity scores, thereby improving the accuracy of bone age evaluation. On the basis of AlexNet, Harris features and a convolutional attention module are fused to assess the developmental stage of each reference bone. On a self-built dataset with an age distribution of 5-11 years, the bone age obtained with the optimized CHN method reaches accuracies of 94.6% and 99.13% at tolerances of 0.5 years and 1 year, respectively. The experimental results show that the proposed method can distinguish the development of children's wrist bones more finely, substantially improve the accuracy of bone age assessment, and assist clinical application.

17.
    
We considered the prediction of drivers' cognitive states related to driving performance using EEG signals. We proposed a novel channel-wise convolutional neural network (CCNN) whose architecture accounts for the unique characteristics of EEG data. We also discussed CCNN-R, a CCNN variant that uses a Restricted Boltzmann Machine in place of the convolutional filter, and derived the detailed algorithm. To test the performance of CCNN and CCNN-R, we assembled a large EEG dataset from three studies of driver fatigue that includes samples from 37 subjects. Using this dataset, we investigated CCNN and CCNN-R on raw EEG data and on Independent Component Analysis (ICA) decompositions. We tested both within-subject and cross-subject prediction, and the results showed that CCNN and CCNN-R achieved robust and improved performance over conventional DNNs and CNNs as well as other non-deep-learning algorithms.
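The exact CCNN architecture is not described in this abstract; the sketch below shows one plausible reading of "channel-wise" convolution on EEG, filtering each EEG channel independently along time with a (1, k) kernel. It is an assumption-laden illustration, not the authors' model.

```python
import torch
import torch.nn as nn

class ChannelWiseConv(nn.Module):
    """Hedged sketch: treat EEG as (batch, 1, channels, time) and convolve
    only along time with a (1, k) kernel, so each EEG channel is filtered
    independently -- one plausible reading of 'channel-wise' convolution."""
    def __init__(self, n_filters=8, k=11):
        super().__init__()
        self.conv = nn.Conv2d(1, n_filters, kernel_size=(1, k), padding=(0, k // 2))
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(self.conv(x))

eeg = torch.randn(4, 1, 30, 250)     # 30 EEG channels, 250 time samples
features = ChannelWiseConv()(eeg)    # (4, 8, 30, 250)
```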

18.
    
The human visual system can rapidly identify and redirect attention to important visual information in highly complex scenes such as human crowds. Saliency prediction for crowd scenes uses computer vision techniques to imitate the human visual system and predict which areas in a crowd scene may attract human attention. However, identifying which factors attract attention is challenging due to the high complexity of crowd scenes. In this work, we propose Multiscale DenseNet — Dilated and Attention (MSDense-DAt), a convolutional neural network (CNN) that uses self-attention to integrate the effect of knowledge-driven gaze in the human visual system and identify salient areas in crowd scenes. Our method combines several state-of-the-art deep learning components to deal with the high complexity of crowd images: a multiscale DenseNet for multiscale deep feature extraction, self-attention, and dilated convolution. The effectiveness of each component of the CNN architecture is evaluated by comparing different combinations, and the proposed method is further evaluated at different crowd density levels to appraise the effect of crowd density on model performance.

19.
    
Recently, inferring users' personality traits from social media has attracted extensive attention. Existing studies have shown that users' personality traits can be inferred from the images they prefer. However, since users' image preferences are often affected by multiple factors, some liked images do not effectively reflect their personality traits. To handle this issue, this paper proposes a personality modeling approach based on image aesthetic attribute-aware graph representation learning, which leverages aesthetic attributes to select the liked images that are consistent with users' personality traits. Specifically, we first utilize a Convolutional Neural Network (CNN) to train an aesthetic attribute prediction module. Then, attribute-aware graph representation learning is introduced to refine images with similar aesthetic attributes from users' liked images. Finally, the aesthetic attributes of all refined images are combined to predict personality traits through a Multi-Layer Perceptron (MLP). Experimental results and visual analysis show that the proposed method is superior to state-of-the-art personality modeling methods.

20.
    
Single image dehazing is a critical image pre-processing step for many practical vision systems. Most existing dehazing methods rely on various hand-crafted priors or on supervised training with synthetic hazy image information (such as the haze-free image, transmission map, and atmospheric light). However, the assumptions behind hand-crafted priors are easily violated, and collecting realistic transmission maps and atmospheric light is impractical. In this paper, we propose a novel weakly supervised network based on a multi-level multi-scale block. The proposed network reduces the constraints on the training data and automatically estimates the transmission map, the atmospheric light, and an intermediate haze-free image without using any real transmission map or atmospheric light as supervision. Moreover, the estimated intermediate haze-free image helps generate an accurate transmission map and atmospheric light by embedding the physical model, which enables reliable restoration of the final haze-free image. In particular, our network can also be fine-tuned on real-world data, which improves dehazing performance on real-world datasets. Quantitative and qualitative experimental results demonstrate that the proposed method performs on par with supervised methods.
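The "physical model" embedded in such networks is the standard atmospheric scattering model I(x) = J(x)t(x) + A(1 − t(x)). For reference, inverting it once the transmission map t and atmospheric light A are estimated looks like the hedged sketch below; the array shapes and the transmission clamp are illustrative, not taken from the paper.

```python
import numpy as np

def recover_haze_free(hazy, transmission, atmospheric_light, t_min=0.1):
    """Invert the atmospheric scattering model I = J*t + A*(1 - t):
    J = (I - A) / max(t, t_min) + A. The clamp t_min avoids amplifying
    noise where the estimated transmission is close to zero."""
    t = np.clip(transmission, t_min, 1.0)[..., None]     # (H, W, 1)
    return (hazy - atmospheric_light) / t + atmospheric_light

hazy = np.random.rand(64, 64, 3)            # stand-ins for the network's outputs
transmission = np.random.rand(64, 64)
A = np.array([0.8, 0.8, 0.8])
dehazed = recover_haze_free(hazy, transmission, A)
```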
