Similar Articles
20 similar articles found
1.
With the rapid development of the mobile Internet and digital technology, people are increasingly keen to share pictures on social networks, and the number of online images has exploded. How to retrieve similar images from a large-scale collection has long been a central problem in image retrieval, and the choice of image features largely determines retrieval performance. Compared with traditional feature extraction methods, convolutional neural networks (CNNs), with their many hidden layers, have a more complex structure and a stronger capacity for feature learning and expression. Observing that global CNN features cannot effectively describe local details in retrieval tasks, a strategy of aggregating low-level CNN feature maps to generate local features is proposed: the high-level features of a CNN carry more semantic information, while the low-level features capture local details, so the increasingly abstract hierarchy of the CNN from low to high layers can be exploited. This paper presents a probabilistic semantic hash retrieval method based on CNNs and designs a new end-to-end supervised learning framework that learns semantic features and hash features simultaneously to achieve fast image retrieval. With the convolutional network, the error rate on the test set is reduced to 14.41%. On three public image collections, namely Oxford, Holidays, and ImageNet, the performance of traditional SIFT-based retrieval algorithms and other CNN-based retrieval algorithms is compared and analyzed. Experimental results show that the proposed algorithm is superior to the compared algorithms in both overall retrieval quality and retrieval time.
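A minimal sketch of the low-level aggregation idea, assuming a recent torchvision with a pretrained VGG-style backbone; the layer cut, grid size, and sum-pooling choice are illustrative, not the paper's exact design:

```python
import torch
import torchvision.models as models

# Hypothetical illustration: turn a low-level CNN feature map into
# region-level local descriptors by sum-pooling a fixed spatial grid.
backbone = models.vgg16(weights=models.VGG16_Weights.DEFAULT).features[:16]  # up to conv3_3
backbone.eval()

def local_descriptors(image, grid=4):
    """image: (1, 3, H, W) float tensor; returns (grid*grid, C) descriptors."""
    with torch.no_grad():
        fmap = backbone(image)                      # (1, C, h, w) low-level feature map
    _, c, h, w = fmap.shape
    gh, gw = h // grid, w // grid
    descs = []
    for i in range(grid):
        for j in range(grid):
            cell = fmap[0, :, i*gh:(i+1)*gh, j*gw:(j+1)*gw]
            d = cell.sum(dim=(1, 2))                # sum-pool each grid cell
            descs.append(d / (d.norm() + 1e-8))     # L2-normalize the descriptor
    return torch.stack(descs)

x = torch.rand(1, 3, 224, 224)       # stand-in image
print(local_descriptors(x).shape)    # torch.Size([16, 256])
```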

2.
Convolutional neural network (CNN) based methods have recently achieved extraordinary performance on single image super-resolution (SISR) tasks. However, most existing CNN-based approaches increase model depth by stacking many kernel convolutions, incurring expensive computational costs that limit deployment on resource-constrained mobile devices; moreover, large-kernel convolutions are rarely used in lightweight super-resolution designs. To alleviate these problems, we propose a multi-scale convolutional attention network (MCAN), a lightweight and efficient network for SISR. Specifically, a multi-scale convolutional attention (MCA) module is designed to aggregate spatial information over different large receptive fields. Since the contextual information of an image has strong local correlation, we also design a local feature enhancement unit (LFEU) to further strengthen local feature extraction. Extensive experimental results show that the proposed MCAN achieves better performance with lower model complexity than other state-of-the-art lightweight methods.
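A plausible sketch of a multi-scale convolutional attention module, assuming depthwise convolutions at several large kernel sizes whose summed response gates the input; the branch kernel sizes are assumptions, not the paper's values:

```python
import torch
import torch.nn as nn

class MCA(nn.Module):
    """Sketch of multi-scale convolutional attention: depthwise convolutions
    with several large kernels are summed and used as a gating map."""
    def __init__(self, channels, kernel_sizes=(5, 7, 9)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv2d(channels, channels, k, padding=k // 2, groups=channels)
            for k in kernel_sizes
        ])
        self.proj = nn.Conv2d(channels, channels, 1)  # mix channels after aggregation

    def forward(self, x):
        attn = sum(branch(x) for branch in self.branches)  # multi-scale responses
        return self.proj(attn) * x                         # gate the input features

x = torch.rand(1, 48, 64, 64)
print(MCA(48)(x).shape)  # torch.Size([1, 48, 64, 64])
```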

3.
Compared with traditional image denoising methods, convolutional neural networks (CNNs) achieve better denoising performance, but an important issue remains unresolved: the residual image learned from the difference between noisy and clean image pairs contains abundant image detail, so fine detail is seriously lost in the denoised image. In this paper, to relearn the lost detail information, a mathematical model is derived from a minimization problem and an end-to-end detail-retaining CNN (DRCNN) is proposed. Unlike most CNN-based denoising methods, DRCNN focuses not only on denoising but also on preserving high-frequency image content. DRCNN requires fewer parameters and less storage space, and therefore generalizes better. Moreover, DRCNN can adapt to different image restoration tasks such as blind image denoising, single image super-resolution (SISR), blind deblurring, and image inpainting. Extensive experiments show that DRCNN outperforms several classical and recent methods.
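A minimal residual-learning sketch of the underlying idea: the network predicts the residual (noise) and the clean estimate is the input minus that residual. This is a generic DnCNN-style stack, not DRCNN's derived model; depth and width are assumptions:

```python
import torch
import torch.nn as nn

class ResidualDenoiser(nn.Module):
    """Generic residual-learning denoiser: predict the noise image, then
    subtract it, so detail lives in whatever is *not* removed."""
    def __init__(self, channels=64, depth=8):
        super().__init__()
        layers = [nn.Conv2d(1, channels, 3, padding=1), nn.ReLU(inplace=True)]
        for _ in range(depth - 2):
            layers += [nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True)]
        layers += [nn.Conv2d(channels, 1, 3, padding=1)]
        self.body = nn.Sequential(*layers)

    def forward(self, noisy):
        residual = self.body(noisy)   # predicted noise
        return noisy - residual       # clean estimate

y = torch.rand(1, 1, 64, 64)
print(ResidualDenoiser()(y).shape)  # torch.Size([1, 1, 64, 64])
```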

4.
For stereo matching based on patch comparison using convolutional neural networks (CNNs), the matching cost estimation depends heavily on the network structure, and patch comparison is time-consuming for traditional CNNs. Accordingly, we propose a stereo matching method based on a novel shrinking residual CNN, which consists of convolutional layers and skip-connection layers, with fully connected layers whose size decreases progressively. Firstly, a layer-by-layer shrinking-size model is adopted for the fully connected layers to greatly increase running speed. Secondly, the convolutional layers and the residual structure are fused to improve patch comparison. Finally, the loss function is redesigned to give higher weights to hard-to-classify examples than the standard cross-entropy loss does. Experimental results on KITTI2012 and KITTI2015 demonstrate that the proposed method improves speed while maintaining high accuracy.
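One common way to weight hard examples above plain cross-entropy is a focal-style modulation; the sketch below uses that form as a stand-in, since the abstract does not give the exact weighting:

```python
import torch
import torch.nn.functional as F

def hard_example_weighted_ce(logits, targets, gamma=2.0):
    """Cross-entropy re-weighted so hard examples dominate: the per-example
    loss is scaled by (1 - p_t)^gamma, which shrinks toward zero for easy,
    confidently classified examples."""
    ce = F.cross_entropy(logits, targets, reduction="none")  # equals -log p_t
    p_t = torch.exp(-ce)                                     # prob of true class
    return ((1.0 - p_t) ** gamma * ce).mean()

logits = torch.randn(8, 2)   # e.g., match / no-match patch scores
targets = torch.randint(0, 2, (8,))
print(hard_example_weighted_ce(logits, targets).item())
```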

5.
Underwater image processing plays an important role in fields such as submarine terrain scanning, submarine communication cable laying, underwater vehicles, and underwater search and rescue. However, acquiring underwater images involves many difficulties. Specifically, water selectively absorbs part of the light passing through it, causing color degradation in underwater images; at the same time, floating particles scatter the light to some degree, leading to serious problems such as blurred details and low contrast. Therefore, using image processing technology to restore the true appearance of underwater scenes has high practical value. To address these problems, this paper combines a color correction method with a deblurring network to improve underwater image quality. Firstly, to address the insufficient number and diversity of underwater image samples, a network combining depth-image reconstruction with underwater image generation is proposed to simulate underwater images via style transfer. Secondly, for the problem of color distortion, we propose a dynamic-threshold color correction method based on global image information combined with the attenuation law of light propagating in water. Finally, to remove the blur caused by scattering and further improve overall clarity, the color-corrected image is reconstructed by a multi-scale recursive convolutional neural network. Experimental results show that images closer to the underwater style can be generated with shorter training time. Compared with several recent underwater image processing methods, the proposed method has clear advantages across multiple underwater scenes, restoring color, removing blur, and enhancing detail.
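A rough stand-in for the dynamic-threshold color correction step, assuming per-channel percentile stretching; the thresholds and the percentile rule are illustrative, not the paper's method:

```python
import numpy as np

def percentile_color_correct(img, low=1.0, high=99.0):
    """Stretch each RGB channel between per-channel intensity percentiles,
    reducing the color cast caused by wavelength-dependent absorption."""
    img = img.astype(np.float32)
    out = np.empty_like(img)
    for c in range(3):
        lo, hi = np.percentile(img[..., c], [low, high])  # dynamic per-channel bounds
        out[..., c] = np.clip((img[..., c] - lo) / max(hi - lo, 1e-6), 0.0, 1.0)
    return (out * 255.0).astype(np.uint8)

underwater = np.random.randint(0, 256, (120, 160, 3), dtype=np.uint8)  # stand-in image
print(percentile_color_correct(underwater).shape)  # (120, 160, 3)
```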

6.
Designing efficient deep neural networks has attracted great interest in image super-resolution (SR). However, exploring diverse network structures is computationally expensive, and, more importantly, each layer in a network plays a distinct role that calls for a specialized structure. In this work, we present a novel neural architecture search (NAS) algorithm that efficiently explores layer-wise structures. Specifically, we construct a supernet that allows flexibility in choosing the number of channels and the per-channel activation functions according to the role of each layer. The search runs efficiently via channel pruning, since gradient descent jointly optimizes the Mult-Adds and the accuracy of the searched models; the model Mult-Adds are estimated in a differentiable manner using relaxations in the backward pass. The searched model, named FGNAS, outperforms state-of-the-art NAS-based SR methods by a large margin.
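A hedged sketch of estimating Mult-Adds differentiably via relaxed channel gates; the sigmoid-gate formulation below is a generic relaxation, not necessarily FGNAS's exact one:

```python
import torch

def expected_mult_adds(gates_in, gates_out, k, h, w):
    """Differentiable Mult-Adds estimate for one conv layer: relaxed channel
    gates (sigmoids of architecture parameters) give expected channel counts,
    so the cost term can be optimized jointly with accuracy by gradient descent."""
    c_in = torch.sigmoid(gates_in).sum()     # expected active input channels
    c_out = torch.sigmoid(gates_out).sum()   # expected active output channels
    return c_in * c_out * k * k * h * w

alpha_in = torch.zeros(64, requires_grad=True)    # architecture parameters
alpha_out = torch.zeros(128, requires_grad=True)
cost = expected_mult_adds(alpha_in, alpha_out, k=3, h=32, w=32)
cost.backward()                                   # gradients flow to the gates
print(f"{cost.item():.3e}")
```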

7.
Existing deraining methods based on convolutional neural networks (CNNs) have achieved great success, but residual rain streaks can still degrade images drastically. In this work, we propose an end-to-end multi-scale context information and attention network, called MSCIANet. The proposed network consists of a multi-scale feature extraction (MSFE) subnetwork and a multi-receptive-field feature extraction (MRFFE) subnetwork. Firstly, the MSFE picks up rain-streak features at different scales and propagates deep features of the two layers across stages via skip connections. Secondly, the MRFFE refines background details through an attention mechanism and depthwise separable convolutions with different receptive fields at different scales. Finally, fusing the outputs of the two subnetworks reconstructs the clean background image. Extensive experimental results show that the proposed network performs well on deraining tasks over both synthetic and real-world datasets. A demo is available at https://github.com/CoderLi365/MSCIANet.

8.
Existing block-shift based filtering algorithms for deblocking struggle to achieve good smoothing performance and low computational complexity simultaneously, owing to their fixed block size and small shifting range. In this paper, we propose to integrate quadtree (QT) decomposition with block-shift filtering for deblocking. The QT decomposition easily locates uniform regions and determines suitable block sizes for them, and the variable block sizes it generates make the subsequent block-shift filtering computationally cheap. In addition, shift filtering with large blocks gives better deblocking results because the smoothing range of a large block spans beyond the conventional 8 × 8 block size. Furthermore, we extend the proposed QT-based block-shifting algorithm to deringing JPEG2000-coded images. Experimental results show the superior performance of the proposed algorithms.
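A small sketch of variance-based quadtree decomposition, where a uniform block stays whole and a non-uniform block splits into quadrants; the variance threshold, minimum block size, and square power-of-two input are assumptions:

```python
import numpy as np

def quadtree_blocks(img, x=0, y=0, size=None, var_thresh=25.0, min_size=8):
    """Return (x, y, size) tuples: large blocks over uniform regions,
    small blocks where the content varies."""
    if size is None:
        size = img.shape[0]          # assumes a square, power-of-two image
    block = img[y:y + size, x:x + size]
    if size <= min_size or block.var() <= var_thresh:
        return [(x, y, size)]        # uniform region: one large shift-filter block
    half = size // 2
    blocks = []
    for dy in (0, half):
        for dx in (0, half):
            blocks += quadtree_blocks(img, x + dx, y + dy, half, var_thresh, min_size)
    return blocks

img = np.random.rand(64, 64) * 255
print(len(quadtree_blocks(img)))     # number of variable-size blocks
```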

9.
While some deep-learning denoising methods achieve superior results on synthetic noise, they fall short on photographs corrupted by realistic noise. Denoising real-world noisy images faces greater challenges because the noise source is more complicated than synthetic noise. To address this issue, we propose a novel network comprising a noise estimation module and a removal module (NERNet). The noise estimation module automatically estimates a noise level map from information extracted by a symmetric dilated block and a pyramid feature fusion block. The removal module then removes the noise from the noisy input with the help of the estimated noise level map. A dilation selective block with an attention mechanism in the removal module adaptively fuses features from convolution layers with different dilation rates and aggregates global and local information, which benefits the preservation of details and textures. Experiments on two synthetic-noise datasets and three realistic-noise datasets show that NERNet achieves competitive results compared with other state-of-the-art methods.

10.
In this paper, the CNN feature representation of an image is used to hide a secret image inside a cover image. The style of the cover image conceals the content of the secret image, producing a stego image via a Neural Style Transfer (NST) algorithm; the stego image resembles the cover image while still containing the semantic content of the secret image. The main technical contribution is hiding the content of the secret image in the intermediate hidden-layer style features of the cover image, the first approach of its kind in the current state of the art. To recover the secret image from the stego image, destylization is performed with conditional generative adversarial networks (GANs) using Residual-in-Residual Dense Blocks (RRDBs). Further, stego images from different layer combinations of content and style features are obtained and evaluated, based on the visual similarity and quality loss between the cover–stego pair and the secret–reconstructed-secret pair of images. Experiments show that the proposed algorithm achieves 43.95 dB Peak Signal-to-Noise Ratio (PSNR), 0.995 Structural Similarity Index (SSIM), and 0.993 Visual Information Fidelity (VIF) on the ImageNet dataset, and is more robust against StegExpose than traditional methods.
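For reference, the PSNR figure quoted above can be computed with the standard definition, using a peak value of 255 for 8-bit images:

```python
import numpy as np

def psnr(reference, test, peak=255.0):
    """Peak Signal-to-Noise Ratio in dB: PSNR = 10*log10(peak^2 / MSE)."""
    mse = np.mean((reference.astype(np.float64) - test.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")          # identical images
    return 10.0 * np.log10(peak ** 2 / mse)

cover = np.random.randint(0, 256, (64, 64, 3))
stego = np.clip(cover + np.random.randint(-3, 4, cover.shape), 0, 255)
print(f"{psnr(cover, stego):.2f} dB")
```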

11.
The underlying task in fine-grained image recognition is to capture both inter-class and intra-class discriminative features. Existing methods generally use auxiliary data to guide the network, or a complex network comprising multiple sub-networks. These have two significant drawbacks: (1) using auxiliary data such as bounding boxes requires expert knowledge and expensive data annotation; (2) using multiple sub-networks makes the architecture complex and requires complicated or multi-step training. We propose an end-to-end Spatial Self-Attention Network (SSANet) comprising a spatial self-attention module (SSA) and a self-attention distillation (Self-AD) technique. The SSA encodes contextual information into local features, improving intra-class representation; the Self-AD then distills knowledge from the SSA into a primary feature map, obtaining inter-class representation. Accumulating the classification losses from these two modules enables the network to learn both inter-class and intra-class features in a single training step. Experimental findings demonstrate that SSANet is effective and achieves competitive performance.

12.
The semantic segmentation of low-altitude, high-resolution urban scene images taken by UAVs plays an important role in city management. However, such images exhibit inter-class homogeneity and intra-class heterogeneity, and segmenting them quickly and accurately remains challenging. In this paper, we propose a novel double-branch network. To counter inter-class homogeneity, a boundary flow module is designed to enhance the flow of latent semantic information between the two branches by imposing boundary constraints between classes. To alleviate intra-class heterogeneity, a context extraction module based on adaptive dynamic fusion is designed, which effectively captures long-range feature relationships with very few parameters. Experiments on two typical datasets show that our approach achieves the best balance between accuracy and speed: 65.8% mIoU on the UAVid test set, 74.1% mIoU on the UDD validation set, and 60 FPS on an NVIDIA TITAN Xp.

13.
The application of convolutional neural networks (CNNs) to removing additive white Gaussian noise (AWGN) from images has attracted considerable attention with the rapid development of deep learning in recent years, but multiplicative speckle noise removal has rarely been addressed. Moreover, most existing speckle removal algorithms are traditional methods built on hand-crafted priors, meaning their parameters must be set manually, whereas deep learning methods show clear advantages in image feature extraction. Multiplicative speckle noise is very common in real-life images, especially medical images. In this paper, a novel neural network structure is proposed to recover images corrupted by speckle noise. The proposed method consists of three subnetworks: a rough clean-image estimation subnetwork, a noise estimation subnetwork, and an information fusion network based on U-Net and several convolutional layers. Unlike existing speckle denoising models based on image statistics, the proposed model handles speckle denoising at different noise levels with a single end-to-end trainable model. Extensive experimental results on several test datasets clearly demonstrate the superior performance of the proposed network over the state of the art in both quantitative metrics and visual quality.
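A sketch of the multiplicative speckle model being removed, y = x · n; the unit-mean Gamma noise is a standard speckle assumption (e.g., for SAR and ultrasound imagery), not taken from the paper:

```python
import numpy as np

def add_speckle(clean, looks=4, rng=None):
    """Multiplicative speckle: y = x * n, with n ~ Gamma of unit mean
    (shape = number of looks), so noise strength scales with intensity."""
    rng = rng if rng is not None else np.random.default_rng(0)
    noise = rng.gamma(shape=looks, scale=1.0 / looks, size=clean.shape)  # mean 1
    return clean * noise

x = np.full((4, 4), 100.0)      # constant clean patch
y = add_speckle(x)
print(y.round(1))               # fluctuates multiplicatively around 100
```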

14.
The latest deep neural networks for medical image segmentation typically utilize transposed convolutional filters for spatial restoration and atrous convolutional filters for larger receptive fields, which leads to dilution and inconsistency of visual semantics. To address these issues, we propose a novel attentional up-concatenation structure that builds an auxiliary path for direct access to multi-level features. In addition, we employ a new structural loss to bring better morphological awareness and reduce the segmentation flaws caused by semantic inconsistencies. Thorough experiments on challenging optic cup/disc segmentation, cellular segmentation, and lung segmentation tasks were performed to evaluate the proposed methods, and ablation analysis demonstrated the effectiveness and efficiency of the model's components. The proposed methods achieved the best performance and speed compared with state-of-the-art models across the three tasks on seven public datasets: DRISHTI-GS, RIM-r3, REFUGE, MESSIDOR, TNBC, GlaS, and LUNA.

15.
For efficient use of cellular communication channels, we propose a neural computation model for image coding. Through constant-time unsupervised learning, the model approximates optimal pattern clustering from training example images via a memory adaptation process and builds a compression codebook in its synaptic weight matrix. This neural codebook can be distributed to both ends of a transmission channel for fast codec operations on general images: transmission consists merely of the indices of the codebook entries that best match the patterns in the image to be transmitted, and these indices can be further compressed by a classical entropy coding method for even greater reduction. Other advantages of the model are low training time complexity, high utilization of neurons, robust pattern clustering, and simple computation; a VLSI implementation also suits the intrinsically parallel nature of neural networks. Our compression results are competitive with JPEG and wavelet methods. We also report the general codebook's cross-compression results, filtering effects from special training methods, and learning enhancement techniques for obtaining a compact codebook that yields both high compression and high picture quality.
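A minimal sketch of the codebook transmission scheme: only nearest-codeword indices cross the channel, and both ends share the codebook. The random codebook here is a stand-in for the learned synaptic-weight codebook:

```python
import numpy as np

def vq_encode(patches, codebook):
    """Send only the index of the best-matching codeword per patch."""
    # squared distances between every patch and every codeword
    d = ((patches[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)                  # indices to transmit

def vq_decode(indices, codebook):
    return codebook[indices]                 # receiver looks up the same codebook

rng = np.random.default_rng(0)
codebook = rng.random((256, 16))             # 256 codewords for 4x4 patches
patches = rng.random((100, 16))
idx = vq_encode(patches, codebook)
recon = vq_decode(idx, codebook)
print(idx.dtype, recon.shape)                # int64 indices, (100, 16) reconstruction
```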

16.
A screen content image (SCI) is a composite of textual and pictorial regions, which creates many difficulties for image quality assessment (IQA). Large SCIs are divided into patches to increase the number of training samples for CNN-based IQA models, which raises two problems: (1) the local quality of each patch does not equal the subjective differential mean opinion score (DMOS) of the entire image; and (2) different patches are not equally important for quality assessment. In this paper, we propose a novel no-reference (NR) IQA model based on a convolutional neural network (CNN) for assessing the perceptual quality of SCIs, with two designs addressing these problems. First, to imitate the behavior of full-reference (FR) CNN-based models, a CNN model is designed for both FR and NR IQA, and the NR-IQA part improves when the patch scores predicted by the FR-IQA part are adopted as ground truth for training it. Second, the patch qualities of an entire SCI are fused into the SCI quality with an adaptive weighting method that accounts for differing patch contents. Experimental results verify that our model outperforms all tested NR IQA methods and most FR IQA methods on the screen content image quality assessment database (SIQAD). In cross-database evaluation, the proposed method outperforms the existing NR IQA method by at least 2.4 percent in PLCC and 2.8 percent in SRCC, showing high generalization ability and effectiveness.
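The fusion step reduces to a weighted average of patch scores; a sketch with stand-in weights follows (the paper derives the weights adaptively from patch content, which is not reproduced here):

```python
import numpy as np

def fuse_patch_scores(scores, weights):
    """Weighted fusion of per-patch quality scores into one image score:
    patches judged more important contribute more."""
    weights = np.asarray(weights, dtype=np.float64)
    weights = weights / weights.sum()        # normalize importance weights
    return float(np.dot(weights, scores))

patch_scores = [62.0, 55.0, 71.0, 40.0]      # predicted quality per patch
patch_weights = [1.0, 0.5, 2.0, 0.8]         # e.g., textual regions weighted higher
print(fuse_patch_scores(patch_scores, patch_weights))
```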

17.
In recent years, removing rain streaks from a single image has been a significant issue for outdoor vision tasks. In this paper, we propose a novel recursive residual atrous spatial pyramid pooling network that directly recovers the clear image from a rainy image. Specifically, we adopt a residual atrous spatial pyramid pooling (ResASPP) module, constructed by alternately cascading a ResASPP block with a residual block, to exploit multi-scale rain information. Besides, taking the dependencies of deep features across stages into consideration, a recurrent layer is introduced into ResASPP to model the coarse-to-fine multi-stage process: at each stage of the recursive network, the stage-wise output is concatenated with the original rainy image and fed into the next stage. Furthermore, the negative SSIM loss and a perceptual loss are employed to train the proposed network. Extensive experiments on both synthetic and real-world rainy datasets demonstrate that the proposed method outperforms state-of-the-art deraining methods.
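A simplified negative-SSIM loss sketch using whole-image statistics; real SSIM implementations use local Gaussian windows, so this global version is an approximation for illustration only:

```python
import torch

def negative_ssim_loss(x, y, c1=0.01**2, c2=0.03**2):
    """Maximizing SSIM is done by minimizing its negative; here the SSIM
    terms are computed from global means, variances, and covariance."""
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(unbiased=False), y.var(unbiased=False)
    cov = ((x - mu_x) * (y - mu_y)).mean()
    ssim = ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / \
           ((mu_x**2 + mu_y**2 + c1) * (var_x + var_y + c2))
    return -ssim

pred = torch.rand(1, 3, 64, 64)
target = torch.rand(1, 3, 64, 64)
print(negative_ssim_loss(pred, target).item())
```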

18.
Single image dehazing is a critical image pre-processing step for many practical vision systems. Most existing dehazing methods solve the problem using various hand-crafted priors or by supervised training on synthetic hazy image information (such as the haze-free image, transmission map, and atmospheric light). However, the assumptions behind hand-crafted priors are easily violated, and collecting realistic transmission maps and atmospheric light is impractical. In this paper, we propose a novel weakly supervised network based on a multi-level multi-scale block. The proposed network reduces the constraints on the training data and automatically estimates the transmission map, the atmospheric light, and an intermediate haze-free image without using any realistic transmission map or atmospheric light as supervision. Moreover, the estimated intermediate haze-free image helps generate an accurate transmission map and atmospheric light by embedding the physical model, yielding reliable restoration of the final haze-free image. In particular, the network can also be fine-tuned on real-world data, which improves dehazing performance on real-world datasets. Quantitative and qualitative experimental results demonstrate that the proposed method performs on par with supervised methods.
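The embedded physical model is the standard atmospheric scattering equation I = J·t + A(1 − t); a sketch of inverting it given estimates of the transmission t and atmospheric light A (the clamp floor is a common numerical safeguard, assumed here):

```python
import numpy as np

def dehaze(hazy, transmission, airlight, t_min=0.1):
    """Recover the haze-free image as J = (I - A) / max(t, t_min) + A."""
    t = np.maximum(transmission, t_min)      # avoid division blow-up in thick haze
    return (hazy - airlight) / t[..., None] + airlight

hazy = np.random.rand(32, 32, 3)
t = np.full((32, 32), 0.6)                   # stand-in transmission map
A = np.array([0.9, 0.9, 0.9])                # stand-in atmospheric light
print(dehaze(hazy, t, A).shape)              # (32, 32, 3)
```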

19.
Image steganalysis based on convolutional neural networks (CNNs) has attracted great attention. However, existing networks pay little attention to regional features with complex texture, which weakens their discriminative learning ability. In this paper, we describe a new CNN designed to focus on useful features and improve detection accuracy for spatial-domain steganalysis. The proposed model consists of three modules: a noise extraction module, a noise analysis module, and a classification module. A channel attention mechanism is used in the noise extraction and analysis modules, realized by embedding the SE (Squeeze-and-Excitation) module into the residual block. We then use convolutional pooling instead of average pooling to aggregate features. Experimental results show that the detection accuracy of the proposed model is significantly better than that of existing models such as SRNet, Zhu-Net, and GBRAS-Net, and that our model has better generalization ability, which is critical for practical application.
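The SE module referenced above is a standard channel attention block; a sketch of it as embedded in a residual block might look like this (the reduction ratio of 16 is the usual default, assumed here):

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-Excitation channel attention: global average pooling
    squeezes each channel to a scalar, and a small bottleneck MLP produces
    per-channel gates in (0, 1) that reweight the feature map."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        s = x.mean(dim=(2, 3))               # squeeze: (B, C)
        g = self.fc(s).view(b, c, 1, 1)      # excitation: per-channel gates
        return x * g                          # reweight feature channels

x = torch.rand(2, 64, 32, 32)
print(SEBlock(64)(x).shape)  # torch.Size([2, 64, 32, 32])
```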

20.
In recent years, deep learning has been successfully applied to medical image segmentation. However, as networks extend deeper, consecutive downsampling operations lose more and more spatial information; in addition, limited data and diverse targets increase the difficulty of medical image segmentation. To address these issues, we propose a multi-path connected network (MCNet) for medical segmentation problems. It integrates multiple paths generated by pyramid pooling into the encoding phase to preserve semantic information and spatial details, and utilizes a multi-scale feature extractor block (MFE block) in the encoder to obtain large, multi-scale receptive fields. We evaluated MCNet on three medical datasets with different image modalities. The experimental results show that our method outperforms state-of-the-art approaches: the model has strong feature learning ability, is robust in capturing targets at different scales, and achieves satisfactory results while using only 0.98 million (M) parameters.
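A plausible form of the pyramid-pooling multi-path idea, in the style of PSPNet's pyramid pooling module; the bin sizes and projection are assumptions, not MCNet's exact configuration:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PyramidPooling(nn.Module):
    """Pool the feature map at several grid sizes, project each pooled map,
    and upsample back so multi-scale context joins the original features."""
    def __init__(self, channels, bins=(1, 2, 4)):
        super().__init__()
        self.paths = nn.ModuleList([
            nn.Sequential(nn.AdaptiveAvgPool2d(b), nn.Conv2d(channels, channels, 1))
            for b in bins
        ])

    def forward(self, x):
        h, w = x.shape[2:]
        outs = [x] + [F.interpolate(p(x), size=(h, w), mode="bilinear",
                                    align_corners=False) for p in self.paths]
        return torch.cat(outs, dim=1)        # (B, C*(1+len(bins)), H, W)

x = torch.rand(1, 32, 16, 16)
print(PyramidPooling(32)(x).shape)  # torch.Size([1, 128, 16, 16])
```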
