Similar articles
20 similar articles found (search time: 31 ms)
1.
This paper proposes an incremental face annotation framework for sharing and publishing photographs that contain faces on a large-scale web platform, such as a social network service with millions of users. Unlike the conventional face recognition setting addressed by most existing work, the image databases accessed by the large pool of users can be huge and frequently updated. A reasonable way to annotate such huge databases efficiently is to adapt the model parameters when new data arrives, without retraining the model from scratch. In this work, we are particularly interested in the following data-increment issues: (i) the huge number of images added at each instant, (ii) the large number of users joining the web each day, and (iii) the large number of classification systems added in each period. We propose an efficient recursive estimation method to handle these issues. Our experiments on several databases show that the proposed method achieves almost constant execution time with accuracy comparable to state-of-the-art incremental versions of principal component analysis, linear discriminant analysis and support vector machines.
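The abstract's core idea, folding new data into existing model parameters rather than retraining, can be illustrated with the generic recursive mean update; this is a minimal sketch of recursive estimation in general, not the authors' exact update rule.

```python
# Sketch: recursive update of a running class mean as new samples arrive.
# Generic form: mu_n = mu_{n-1} + (x_n - mu_{n-1}) / n  -- O(d) per sample.
def update_mean(mu, n, x):
    """Fold one new feature vector x into the running mean mu of n samples."""
    return [m + (xi - m) / (n + 1) for m, xi in zip(mu, x)], n + 1

# Toy stream of face feature vectors for one identity.
stream = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
mu, n = [0.0, 0.0], 0
for x in stream:
    mu, n = update_mean(mu, n, x)  # no retraining over past samples
# mu now equals the batch mean of the stream, computed incrementally
```

The same pattern extends to covariance and scatter matrices, which is what incremental PCA/LDA variants maintain.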

2.
With the rapid development of social networks and computer technologies, we are constantly confronted with high-dimensional multimedia data, and organizing such large amounts of data manually is time-consuming and unrealistic. Most existing methods are not suitable for large-scale data because their Laplacian matrices depend on the training data. Moreover, a given multimedia sample is usually associated with multiple labels, which are inherently correlated with each other. Although traditional methods can handle this by translating the problem into several single-label problems, they ignore the correlations among different labels. In this paper, we propose a novel semi-supervised feature selection method and apply it to multimedia annotation. Both labeled and unlabeled samples are fully utilized without graph construction, and the information shared among multiple labels is uncovered simultaneously. We apply the proposed algorithm to both web page and image annotation. Experimental results demonstrate the effectiveness of our method.

3.
A key limitation of many existing visual tracking methods is that they are built on low-level visual features and have limited power to predict data semantics. To effectively bridge the semantic gap of visual data in tracking with little supervision, we propose a tracking method that constructs a robust object appearance model by learning and transferring mid-level image representations using a deep network, namely Network in Network (NIN). First, we design a simple yet effective method to transfer the mid-level features learned by NIN on source tasks with large-scale training data to tracking tasks with limited training data. Then, to address the drifting problem, we simultaneously use samples collected in the initial and the most recent frames. Finally, a heuristic scheme is used to decide whether to update the object appearance model. Extensive experiments show the robustness of our method.

4.
柯逍, 邹嘉伟, 杜明智, 周铭柯. 《电子学报》(Acta Electronica Sinica), 2017, 45(12): 2925-2935
To address the long training times and sensitivity to low-frequency keywords of traditional image annotation models, this paper proposes an automatic image annotation model based on Monte Carlo dataset balancing and a robust incremental extreme learning machine (ELM). The model first performs automatic segmentation on the training images of a public image library, selects seed annotation words for the resulting segments, and forms training sets for the different categories using a proposed image feature matching algorithm based on a composite distance. Because the amounts of data for different annotation words in public databases vary widely, a Monte Carlo dataset balancing algorithm is proposed to bring the data scales of the annotation words roughly into line. Then, to overcome the limitations of single feature descriptors, a multi-scale feature fusion algorithm is proposed for effective feature extraction from the images of different annotation words. Finally, to address the randomness of hidden-layer nodes and the uniform input weights of the traditional extreme learning machine, a robust incremental ELM is proposed, improving the accuracy of the discriminative model. Experimental results on public datasets show that the model can annotate images automatically in a very short time, is robust to low-frequency keywords, and outperforms most popular automatic image annotation models on average recall, average precision, F-measure and other metrics.
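The balancing step can be pictured as resampling each keyword's image list to a common target size. This is a hedged sketch of the general Monte Carlo resampling idea only; the `balance` function, its `target` parameter and the toy keyword lists are illustrative, not the paper's algorithm.

```python
import random

# Sketch: resample each keyword's image list (with replacement) to a common
# target size, so label frequencies roughly match across keywords.
def balance(datasets, target, rng=random.Random(0)):
    return {kw: [rng.choice(imgs) for _ in range(target)]
            for kw, imgs in datasets.items()}

# Toy imbalanced dataset: "sky" is frequent, "tiger" is a low-frequency word.
datasets = {"sky": ["s1", "s2", "s3", "s4"], "tiger": ["t1"]}
balanced = balance(datasets, target=3)
# every keyword now contributes exactly `target` (possibly repeated) samples
```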

5.
This paper presents a generalized relevance model for automatic image annotation that learns the correlations between images and annotation keywords. Unlike previous relevance models, which can only propagate keywords from the training images to the test images, the proposed model can additionally propagate keywords among the test images. We also give a convergence analysis of the iterative algorithm inspired by the proposed model. Moreover, to estimate the joint probability of observing an image with possible annotation keywords, we define the inter-image relations by proposing a new spatial Markov kernel based on 2D Markov models. The main advantage of our spatial Markov kernel is that intra-image context can be exploited for automatic image annotation, in contrast to traditional bag-of-words methods. Experiments on two standard image databases demonstrate that the proposed model outperforms state-of-the-art annotation models.
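The extra propagation among test images can be sketched with generic iterative score diffusion over an image-affinity matrix. This is an assumption-laden toy (the affinity matrix `W`, blend weight `alpha` and iteration count are illustrative), not the authors' relevance model or its convergence proof.

```python
# Sketch: diffuse keyword scores s0 over a row-normalized affinity matrix W,
# blending propagated scores with the initial ones via alpha.
def propagate(W, s0, alpha=0.5, iters=50):
    P = [[w / (sum(row) or 1.0) for w in row] for row in W]  # row-normalize
    s = list(s0)
    for _ in range(iters):
        s = [alpha * sum(P[i][j] * s[j] for j in range(len(s)))
             + (1 - alpha) * s0[i] for i in range(len(s))]
    return s

W = [[0.0, 1.0], [1.0, 0.0]]   # two mutually similar test images
s0 = [1.0, 0.0]                # only image 0 initially carries the keyword
s = propagate(W, s0)
# image 1 acquires a nonzero keyword score purely via propagation
```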

6.
Classifying microstructure images of steel with deep learning requires a large annotated training set. To address the low efficiency of manual annotation, this paper proposes a new semi-supervised learning method that combines a self-organizing incremental neural network with a graph convolutional neural network. First, transfer learning is used to obtain feature vectors for the image samples. Next, a self-organizing incremental neural network with a connection-weight strategy (WSOINN) learns the feature data to obtain its topological graph structure, and a win-count criterion is introduced to manually label a small number of nodes. A graph convolutional network (GCN) is then built to mine the latent relations among graph nodes, with Dropout used to improve generalization, and the remaining nodes are labeled automatically to obtain classification results for all micrographs. On metallographic data collected from a state key laboratory, automatic classification accuracy was compared under different manual annotation ratios: with only 12% of the labeling effort of the traditional model, the new model reaches a classification accuracy of 91%.

7.
Convolutional neural network (CNN) based methods have recently achieved extraordinary performance on single image super-resolution (SISR) tasks. However, most existing CNN-based approaches increase model depth by stacking massive kernel convolutions, which incurs expensive computational costs and limits their application on resource-constrained mobile devices. Furthermore, large-kernel convolutions are rarely used in lightweight super-resolution designs. To alleviate these problems, we propose a multi-scale convolutional attention network (MCAN), a lightweight and efficient network for SISR. Specifically, a multi-scale convolutional attention (MCA) module is designed to aggregate spatial information from different large receptive fields. Since the contextual information of an image has strong local correlation, we design a local feature enhancement unit (LFEU) to further strengthen local feature extraction. Extensive experimental results show that the proposed MCAN achieves better performance with lower model complexity than other state-of-the-art lightweight methods.

8.
方晨, 郭渊博, 王娜, 甄帅辉, 唐国栋. 《电子学报》(Acta Electronica Sinica), 2000, 48(10): 1983-1992
The rapid development of machine learning has made it one of the most effective tools in data mining, but training these algorithms typically requires large amounts of user data, exposing users to serious privacy risks. Because of the complexity of data statistics and the richness of data semantics, traditional privacy-preserving data publishing methods often over-sanitize the original data, leaving it with too little utility for data mining tasks. This paper therefore proposes a differentially private data publishing method based on generative adversarial networks (GANs): carefully designed noise is added to the gradients during GAN training to achieve differential privacy, ensuring the GAN can generate an unlimited amount of synthetic data that matches the statistical properties of the source data without leaking privacy. To address the low synthetic-data quality and slow convergence of existing methods of this kind, several optimization strategies are designed to flexibly adjust the allocation of the privacy budget and reduce the overall noise scale, and the synthetic data is proven theoretically to strictly satisfy differential privacy. Experimental comparisons with existing methods on public datasets show that the proposed method generates higher-quality privacy-preserving data more efficiently and is suitable for a variety of data analysis tasks.
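The noisy-gradient mechanism the abstract builds on can be sketched in the standard DP-SGD style: clip each per-example gradient to norm `C`, average, then add Gaussian noise scaled by `C * sigma`. This is a generic sketch of that mechanism; the paper's budget-allocation and noise-reduction strategies are not reproduced, and `C`/`sigma` here are illustrative.

```python
import random

# Sketch: privatize one gradient step in DP-SGD fashion.
def privatize(grads, C=1.0, sigma=1.0, rng=random.Random(0)):
    clipped = []
    for g in grads:
        norm = sum(x * x for x in g) ** 0.5
        scale = min(1.0, C / norm) if norm > 0 else 1.0  # clip to norm C
        clipped.append([x * scale for x in g])
    d = len(grads[0])
    avg = [sum(g[k] for g in clipped) / len(grads) for k in range(d)]
    # Gaussian noise calibrated to the clipping bound.
    return [a + rng.gauss(0.0, sigma * C / len(grads)) for a in avg]

# sigma=0 switches noise off so the clipping effect is visible:
# [3,4] has norm 5 and is scaled down to [0.6, 0.8] before averaging.
g = privatize([[3.0, 4.0], [0.0, 0.0]], C=1.0, sigma=0.0)
```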

9.
2D image-based 3D model retrieval has become a hot topic in recent years, but current methods are limited in two respects. First, they are mostly based on supervised learning, which limits their application because manual annotation is time-consuming and costly. Second, mainstream methods narrow the discrepancy between the 2D and 3D domains mainly through image-level alignment, which can introduce additional noise during image transformation and harm the cross-domain effect. We therefore propose Wasserstein distance feature alignment learning (WDFAL) for this retrieval task. First, we describe 3D models through a series of virtual views and use CNNs to extract features. Second, we design a domain critic network based on the Wasserstein distance to narrow the discrepancy between the two domains. Compared with image-level alignment, reducing the domain gap through feature-level distribution alignment avoids introducing additional noise. Finally, we extract visual features from the 2D and 3D domains and calculate their similarity using the Euclidean distance. Extensive experiments validate the superiority of the WDFAL method.
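The final retrieval step described above, ranking 3D models by Euclidean distance in the aligned feature space, is straightforward; the sketch below assumes toy 2D-element feature vectors and invented model names purely for illustration.

```python
# Sketch: rank 3D models by Euclidean distance from an aligned 2D query feature.
def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

query = [0.1, 0.9]                              # aligned 2D image feature (toy)
models = {"chair": [0.0, 1.0], "car": [1.0, 0.0]}  # aligned 3D model features
ranked = sorted(models, key=lambda m: euclidean(query, models[m]))
# ranked[0] is the model closest to the query in feature space
```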

10.
Images on the web have become one of the most important sources of information for users; however, the large number of results returned by image search engines makes it difficult to find the intended images. Image search result clustering (ISRC) is one solution to this problem. Current ISRC-based methods use textual and visual features separately to present clustering results. In this paper, we propose a new ISRC method called Incremental-Annotations-based Image Search with Clustering (IAISC), which uses annotations as textual features and a category model as visual features. IAISC provides clustering results based on both semantic meaning and visual cues; moreover, through its iterative structure, a user can find the intended image easily. Experimental results show that our method has high precision: the average precision rate is 73.4%, rising to 96.5% when the user drills down to the intended images in the last round. Regarding efficiency, our system is one and a half times as efficient as previous systems.

11.
Automatic image annotation plays a key role in retrieval over large collections of digital images: it converts an image's visual features into annotation keywords, making use and retrieval far more convenient. This paper studies automatic semantic image annotation, and designs and implements a Matlab-based automatic annotation system that extracts an image's color and texture features, measures similarity against already-annotated images, and outputs semantic keywords for the image.

12.
The application of convolutional neural networks (CNNs) to removing additive white Gaussian noise (AWGN) from images has attracted considerable attention with the rapid development of deep learning in recent years. However, multiplicative speckle noise removal has rarely been addressed, and most existing speckle removal algorithms are traditional methods built on human prior knowledge, meaning their parameters must be set manually. Deep learning methods now show clear advantages in image feature extraction, and multiplicative speckle noise is very common in real-life images, especially medical images. In this paper, a novel neural network structure is proposed to recover images corrupted by speckle noise. The proposed method consists of three subnetworks: a rough clean-image estimation subnetwork, a noise estimation subnetwork, and an information fusion network based on U-Net and several convolutional layers. Unlike existing speckle denoising models based on image statistics, the proposed model handles speckle denoising at different noise levels with a single end-to-end trainable model. Extensive experimental results on several test datasets clearly demonstrate the superior performance of the proposed network over the state of the art in terms of quantitative metrics and visual quality.
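The role of the noise-estimation subnetwork follows from the multiplicative model the speckle literature assumes, observed = clean × noise: given a noise estimate, a fusion stage can in principle invert it. The sketch below assumes an idealized perfect noise estimate just to show the arithmetic; the paper's subnetworks learn this end to end.

```python
# Sketch of the multiplicative speckle model: observed = clean * noise,
# so a perfect noise estimate n_hat recovers clean = observed / n_hat.
clean_true = [0.5, 0.8]
speckle = [1.2, 0.9]
observed = [c * n for c, n in zip(clean_true, speckle)]  # corrupted pixels
n_hat = list(speckle)            # idealized output of a noise-estimation stage
clean = [o / n for o, n in zip(observed, n_hat)]
```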

13.
Because of the semantic gap between low-level visual features and high-level image semantics, the performance of many existing content-based image annotation algorithms is unsatisfactory. To bridge this gap and improve annotation performance, this paper proposes a novel automatic image annotation (AIA) approach using a neighborhood set (NS) based on an image distance metric learning (IDML) algorithm. With IDML, the learned image distance effectively measures the distance between images for the AIA task, so the neighborhood set of each image is easily obtained. By introducing the NS, the proposed approach can predict all possible labels of an uncaptioned image. The experimental results confirm that introducing the NS based on IDML improves the efficiency of AIA approaches and achieves better annotation performance than existing AIA approaches.
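Label prediction from a neighborhood set can be sketched as aggregating the labels of the k nearest annotated images. In this sketch plain Euclidean distance stands in for the learned IDML metric, and the database and labels are invented toy data.

```python
# Sketch: predict labels for an untagged image from its neighborhood set.
def neighborhood_labels(query, db, k=2):
    dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    neighbors = sorted(db, key=lambda item: dist(query, item[0]))[:k]
    votes = {}
    for _, labs in neighbors:
        for lab in labs:
            votes[lab] = votes.get(lab, 0) + 1
    return sorted(votes, key=votes.get, reverse=True)  # most-voted first

db = [([0.0, 0.0], ["sky", "sea"]),
      ([0.1, 0.0], ["sky"]),
      ([5.0, 5.0], ["car"])]
labels = neighborhood_labels([0.05, 0.0], db, k=2)
# "sky" wins: it is voted for by both of the two nearest images
```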

14.
Image shadow detection and removal can effectively recover image information lost due to shadows, which helps improve the accuracy of object detection, segmentation and tracking. To handle shadows of different scales in an image, and the inconsistency between a de-shadowed area and the originally unshadowed area, the proposed method uses multi-scale and global features (MSGF), combining a non-local network with a dense dilated convolution pyramid pooling network. In addition, to address the inaccurate detection of weak shadows and complex shadow shapes in existing methods, a direction feature (DF) module is adopted to enhance the features of shadow areas, thereby improving shadow segmentation accuracy. Building on these two components, an end-to-end shadow detection and removal network, SDRNet, is proposed. SDRNet performs both tasks in a unified network with shared features, without additional computation. Experimental results on the two public datasets ISTD and SBU demonstrate that the proposed method achieves more than 10% improvement in the BER index for shadow detection and the RMSE index for shadow removal, showing that SDRNet with the MSGF and DF modules achieves the best results among existing methods.

15.
何菁, 陈胜. 《电子科技》(Electronic Science and Technology), 2016, 29(7): 85
To address the manual intervention and low accuracy of existing image segmentation methods, a new two-step segmentation scheme is adopted. The scheme uses a pattern-recognition technique based on artificial neural networks, namely massive training of artificial neural networks: structures within different sub-regions of the lung field are segmented, and the trained large-scale networks suppress bony structures such as ribs and clavicles in standard chest radiographs. This is combined with a region-based active contour model, the Snake model, to correctly segment images with non-uniform brightness. The results were compared with images segmented manually by medical staff and graded by radiologists: the original images averaged 20 points, while the improved segmentation method averaged as high as 34 points.

16.
This paper presents an automated video analysis framework for the detection of colonic polyps in optical colonoscopy. Our proposed framework departs from previous methods in that we include both spatial frame-based analysis and temporal video analysis using time-course image sequences. We also provide a video quality assessment scheme with two measures of frame quality. We extract colon-specific anatomical features from different image regions using a windowing approach for intraframe spatial analysis, describing anatomical features with an eigentissue model. We apply a conditional random field to model interframe dependences in tissue types and to handle variations in imaging conditions and modalities. We validate our method by comparing our polyp detection results against colonoscopy reports from physicians. Our method shows promising preliminary results and strong invariance when applied to both white-light and narrow-band video. Our proposed video analysis system can provide objective diagnostic support to physicians by locating polyps during colon cancer screening exams. Furthermore, it can serve as a cost-effective video annotation solution for the large backlog of existing colonoscopy videos.

17.
The bag of visual words (BOW) model is an efficient image representation technique for image categorization and annotation tasks. Building good visual vocabularies from automatically extracted image feature vectors produces discriminative visual words, which can improve the accuracy of image categorization. Most approaches that use the BOW model for categorizing images ignore useful information that can be obtained from image classes when building visual vocabularies. Moreover, most BOW models use intensity features extracted from local regions and disregard colour information, an important characteristic of any natural scene image. In this paper, we show that integrating visual vocabularies generated from each image category improves the BOW image representation and improves accuracy in natural scene image classification. We use a keypoint density-based weighting method to combine the BOW representation with image colour information on a spatial pyramid layout. In addition, we show that visual vocabularies generated from the training images of one scene image dataset can plausibly represent another scene image dataset in the same domain, reducing the time and effort needed to build new visual vocabularies. The proposed approach is evaluated on three well-known scene classification datasets with 6, 8 and 15 scene categories, respectively, using 10-fold cross-validation. The experimental results, using support vector machines with the histogram intersection kernel, show that the proposed approach outperforms baseline methods such as Gist features, rgbSIFT features and different configurations of the BOW model.
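The histogram intersection kernel mentioned above has a one-line definition, K(h, g) = Σᵢ min(hᵢ, gᵢ): similar visual-word histograms yield a large kernel value. A minimal sketch with toy normalized histograms:

```python
# Sketch: histogram intersection kernel over BOW visual-word histograms.
def hist_intersection(h, g):
    return sum(min(a, b) for a, b in zip(h, g))

h1 = [0.5, 0.3, 0.2]   # normalized 3-word histograms (toy)
h2 = [0.4, 0.4, 0.2]   # similar distribution to h1
h3 = [0.0, 0.0, 1.0]   # concentrated on a different visual word
k12 = hist_intersection(h1, h2)  # high overlap -> large kernel value
k13 = hist_intersection(h1, h3)  # low overlap  -> small kernel value
```

Plugged into an SVM as a precomputed kernel, this is exactly the similarity measure the experiments rely on.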

18.
Exploring context information for visual recognition has recently received significant research attention. This paper proposes a novel and highly efficient approach, named semantic diffusion, to utilize semantic context for large-scale image and video annotation. Starting from the initial annotation of a large number of semantic concepts (categories), obtained by either machine learning or manual tagging, the proposed approach refines the results using a graph diffusion technique, which recovers the consistency and smoothness of the annotations over a semantic graph. Unlike existing graph-based learning methods that model relations among data samples, the semantic graph captures context by treating the concepts as nodes and the concept affinities as the weights of edges. In particular, our approach is capable of simultaneously improving annotation accuracy and adapting the concept affinities to new test data. The adaptation provides a means to handle the domain change between training and test data that often occurs in practice. Extensive experiments are conducted to improve concept annotation results on Flickr images and TV program videos. Results show consistent and significant performance gains (10+% on both image and video data sets). Source code for the proposed algorithms is available online.
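The concept-graph diffusion can be sketched as iteratively smoothing each concept's score toward the scores of its strongly related concepts. The affinity matrix, the "car"/"road" relation and the blend weight `lam` below are invented for illustration; the paper's affinity adaptation is not reproduced.

```python
# Sketch: graph diffusion over a concept graph (nodes = concepts,
# edge weights = concept affinities), smoothing scores toward neighbors.
def diffuse(aff, scores, lam=0.5, iters=10):
    n = len(scores)
    P = [[aff[i][j] / (sum(aff[i]) or 1.0) for j in range(n)] for i in range(n)]
    s = list(scores)
    for _ in range(iters):
        s = [(1 - lam) * s[i] + lam * sum(P[i][j] * s[j] for j in range(n))
             for i in range(n)]
    return s

concepts = ["car", "road", "beach"]
aff = [[0.0, 1.0, 0.0],
       [1.0, 0.0, 0.0],
       [0.0, 0.0, 0.0]]          # "car" and "road" are related; "beach" is not
s = diffuse(aff, [1.0, 0.0, 0.0])
# "road" gains score from the related "car"; "beach" stays at zero
```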

19.
To address the shortcomings of traditional despeckling methods for medical ultrasound images, this paper proposes a despeckling method based on an adaptive multi-exposure fusion framework and a feed-forward convolutional neural network. First, an ultrasound image training dataset is built; then a multi-exposure fusion framework with an adaptive enhancement factor is proposed to enhance the images for effective feature extraction; finally, a despeckling model is trained and the despeckled images are obtained. Experimental results show that, compared with existing methods, the proposed method removes speckle noise from medical ultrasound images more effectively while preserving more image detail.
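One simple way to picture the multi-exposure idea is to generate several gamma-adjusted "exposures" of an input and fuse them, lifting dark regions with gamma < 1 before fusion. This sketch uses plain averaging over fixed gammas; the paper's adaptive enhancement factor and CNN fusion are not reproduced, and the gamma values are illustrative.

```python
# Sketch: build gamma-adjusted exposures of a pixel row in [0, 1] and fuse
# them by averaging. Dark pixels are brightened by the gamma < 1 exposure.
def fuse_exposures(pixels, gammas=(0.5, 1.0, 2.0)):
    exposures = [[p ** g for p in pixels] for g in gammas]
    return [sum(e[i] for e in exposures) / len(gammas)
            for i in range(len(pixels))]

fused = fuse_exposures([0.0, 0.25, 1.0])
# the mid-tone 0.25 is lifted above its original value by the fusion
```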

20.
Deep-neural-network methods for automatic content analysis and target recognition in multi-source imagery have made continual breakthroughs in recent years and are being widely deployed in intelligent security, computer-aided diagnosis of medical images, autonomous driving and other fields. However, the adversarial vulnerability of deep neural networks poses serious security risks for deployment in safety-sensitive domains. An effective way to improve adversarial robustness is to retrain the network on adversarial examples that maximize the network loss, but existing adversarial training requires class labels to generate the adversarial examples and greatly degrades generalization on clean data. This paper proposes a method for improving the adversarial robustness of deep neural networks based on self-supervised contrastive learning, making full use of the abundant unlabeled data to improve prediction stability and generalization in adversarial settings. A siamese architecture maximizes the similarity of multiple hidden-layer representations between training samples and their unsupervised adversarial examples, strengthening the model's intrinsic robustness. The proposed method can be used to improve the robustness of pre-trained models, and can also be combined with adversarial training to maximize "pre-train + fine-tune" robustness. Experimental results on a remote-sensing scene classification dataset demonstrate the effectiveness and flexibility of the proposed method.
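The core of the siamese objective described above is maximizing representation similarity between a sample and its adversarial view, commonly expressed as minimizing one minus their cosine similarity. The toy representation vectors below are illustrative; the multi-layer and contrastive batch machinery of the paper is not reproduced.

```python
# Sketch: similarity loss between a clean representation and the
# representation of its adversarially perturbed view.
def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(x * x for x in b) ** 0.5
    return dot / (na * nb)

clean = [1.0, 0.0]        # hidden representation of a training sample (toy)
adv = [0.9, 0.1]          # representation of its unsupervised adversarial view
loss = 1.0 - cosine(clean, adv)   # minimized during training: pulls views together
```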


Copyright © 北京勤云科技发展有限公司 (Beijing Qinyun Technology Development Co., Ltd.)  京ICP备09084417号