Similar literature
1.
Subspace and similarity metric learning are important issues for image and video analysis in both computer vision and multimedia. Many real-world applications, such as image clustering/labeling and video indexing/retrieval, involve both dimensionality reduction of the feature space and learning of a feature matching metric. However, the information lost in dimensionality reduction may degrade the accuracy of similarity matching. In practice, these conflicting requirements for efficient feature representation and accurate similarity matching need to be balanced. In the style of "Thinking Globally and Fitting Locally", we develop Locally Embedded Analysis (LEA) based solutions for visual data clustering and retrieval. LEA reveals the essential low-dimensional manifold structure of the data by preserving local nearest-neighbor affinity, and it allows a linear subspace embedding by solving a graph-embedded eigenvalue decomposition problem. A visual data clustering algorithm, Locally Embedded Clustering (LEC), and a local similarity metric learning algorithm for robust video retrieval, Locally Adaptive Retrieval (LAR), are both built on the LEA approach, with variations in local affinity graph modeling. For large database applications, instead of learning a global metric, we localize the metric learning space using a kd-tree partition, restricting it to the localities identified by the indexing process. Simulation results demonstrate that the proposed solutions are effective in both accuracy and speed.
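
The first step described above, preserving local nearest-neighbor affinity, can be illustrated with a small affinity-graph builder. This is only a sketch: the function name `knn_affinity`, the heat-kernel weight, and the `sigma` parameter are assumptions made for illustration, since the abstract does not state how LEA weights its graph, and the eigenvalue decomposition step is not shown.

```python
import math

def knn_affinity(points, k, sigma=1.0):
    """Build a symmetric k-nearest-neighbour affinity matrix.

    The heat-kernel weight exp(-d^2 / (2 * sigma^2)) is an illustrative
    assumption; the abstract does not specify the weighting scheme.
    """
    n = len(points)
    # Pairwise squared Euclidean distances.
    d2 = [[sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
           for j in range(n)] for i in range(n)]
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Indices of the k closest points, excluding i itself.
        neighbours = sorted((j for j in range(n) if j != i),
                            key=lambda j: d2[i][j])[:k]
        for j in neighbours:
            w = math.exp(-d2[i][j] / (2.0 * sigma ** 2))
            # Symmetrise: keep an edge if either endpoint selects the other.
            W[i][j] = w
            W[j][i] = w
    return W

# Two well-separated pairs of points (made-up data).
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
W = knn_affinity(points, k=1)
```

With `k=1`, each point is linked only to its closest neighbour, so the two distant pairs stay disconnected in `W`. A graph-embedding method would then derive a projection from a matrix of this kind.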

2.

Deep learning models have achieved great success in a wide range of computer vision applications, including image and video classification. However, the complex architecture of the most recently developed networks imposes memory and computational resource limitations, especially for human action recognition applications. Unsupervised deep convolutional neural networks such as PCANet can alleviate these limitations and hence significantly reduce the computational complexity of the whole recognition system. In this work, instead of using a 3D convolutional neural network architecture to learn temporal features of video actions, the unsupervised convolutional PCANet model is extended into PCANet-TOP, which learns spatiotemporal features from Three Orthogonal Planes (TOP). For each video sequence, spatial frames (XY) and temporal planes (XT and YT) are used to train three different PCANet models. The learned features are then fused, after their dimensionality is reduced using whitening PCA, to obtain a spatiotemporal feature representation of the action video. Finally, a Support Vector Machine (SVM) classifier is applied for action classification. The proposed method is evaluated on four well-known benchmark datasets, namely the Weizmann, KTH, UCF Sports, and YouTube action datasets. The recognition results show that the proposed PCANet-TOP provides discriminative and complementary features using three orthogonal planes and achieves promising results comparable with state-of-the-art methods.
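
As a rough illustration of what the three orthogonal planes are, the sketch below slices a video volume into one XY, one XT, and one YT plane. The `video[t][y][x]` nested-list layout and the function name `orthogonal_planes` are assumptions chosen for this example; the PCANet training itself is not shown.

```python
def orthogonal_planes(video, t, y, x):
    """Return the XY, XT and YT slices through voxel (t, y, x).

    `video` is a nested list indexed as video[t][y][x], a hypothetical
    layout chosen for this sketch.
    """
    xy = [row[:] for row in video[t]]                # spatial frame at time t
    xt = [frame[y][:] for frame in video]            # row y traced over time
    yt = [[frame[r][x] for r in range(len(frame))]   # column x traced over time
          for frame in video]
    return xy, xt, yt

# A toy 2-frame, 2x3 video whose voxel value encodes its (t, y, x) index.
video = [[[100 * t + 10 * y + x for x in range(3)] for y in range(2)]
         for t in range(2)]
xy, xt, yt = orthogonal_planes(video, t=1, y=0, x=2)
```

In a pipeline like the one described, many such planes would be collected per video and fed to three separate models, one per plane orientation.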


3.
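Objective: Fine-grained image retrieval is a current research focus in fine-grained image analysis and computer vision. Taking shoe images as an example, traditional methods extract only coarse-grained features and lack key semantic attributes, so they struggle to distinguish subtle differences between parts and cannot be used effectively for fine-grained retrieval. Because most shoe image retrieval relies on simple styles and is therefore inefficient, a fine-grained shoe image retrieval method that combines part detection with a semantic network is proposed. Methods: Using an annotated training set of shoe images, part detection is performed on the input query shoe image. A semantic network is then trained on the part-detected shoe images and the defined semantic attributes to extract feature vectors for the query and training images, and principal component analysis is applied for dimensionality reduction. Finally, metric learning is performed on the feature vectors of each candidate image in the training set and the query image, and retrieval results are output in descending order of matching score. Results: In experiments on the UT-Zap50K dataset, compared with four methods that currently give good retrieval performance, retrieval precision improves by nearly 6%. The method also achieves higher retrieval accuracy than SHOE-CNN (semantic hierarchy of attribute convolutional neural network), a retrieval method for the same task. Conclusion: To address the low accuracy of shoe image retrieval caused by traditional image features lacking fine visual description, a fine-grained shoe image retrieval method is proposed that improves both the precision and the accuracy of shoe image retrieval and meets practical application needs well.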

4.
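Objective: With the growth of large-scale image surveillance and video data in public security and the development of intelligent transportation, vehicle retrieval has extremely important application value. To address the low level of automation and intelligence in existing vehicle retrieval and the difficulty of obtaining accurate results, a vehicle retrieval method based on multi-task segmented compact features is proposed, which exploits the diversity and correlation of basic vehicle information to achieve real-time retrieval. Methods: First, to use the relationships between related tasks to improve retrieval precision and refine image features, a multi-task deep convolutional network is constructed that learns hash codes for different vehicle attributes in segments, combining image semantics with image representation, and a minimized image code is used to make the learned attribute features more robust. Then, a feature pyramid network is used to extract instance features of vehicle images, and the extracted features are retrieved with a locality-sensitive hashing re-ranking method. Finally, for the special case in which no image of the query vehicle is available, a cross-modal auxiliary retrieval method is used. Results: The proposed method outperforms current mainstream retrieval methods on three public datasets, reaching a retrieval precision of 0.966 on CompCars and raising retrieval precision to 0.862 on VehicleID. Conclusion: The proposed method obtains both minimized image codes and image instance features, and it can also perform cross-modal retrieval when no target query image is available. Comparative experiments verify its effectiveness.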

5.
Content-Based Image Retrieval Based on ROI Detection and Relevance Feedback   Cited by: 3 (self-citations: 0, citations by others: 3)
Content-based image retrieval is an important research topic in computer vision. We present a new method that combines region of interest (ROI) detection and relevance feedback. The ROI-based approach describes image content more accurately than global features do, and relevance feedback makes the system adaptive to subjective human perception. The feedback information is used to discover the subjective ROI perception of a particular user, and it is further employed to recompute the features associated with ROIs according to the updated personalized ROI preference. A fast computation technique is proposed to avoid repeating ROI detection for images in the database. It directly estimates the features of the ROIs, which makes the query process fast and efficient. To illustrate the overall approach, we use color saliency and wavelet feature saliency to determine the ROIs. Normalized projections are selected to represent the shape features associated with the ROIs. Experimental results show that the proposed system performs better than approaches based on global features and region-based techniques without feedback.

6.
Classifying HEp-2 fluorescence patterns in Indirect Immunofluorescence (IIF) HEp-2 cell imaging is important for the differential diagnosis of autoimmune diseases. The current technique, based on human visual inspection, is time-consuming, subjective, and dependent on the operator's experience. Automating this process may overcome these limitations, making IIF faster and more reliable. This work proposes a classification approach based on Subclass Discriminant Analysis (SDA), a dimensionality reduction technique that provides an effective representation of the cells in the feature space and copes with the high within-class variance typical of HEp-2 cell patterns. To characterize the fluorescence patterns adequately, we investigate the individual and combined contributions of several image attributes, showing that the integration of morphological, global, and local textural features is best suited for this purpose. The proposed approach classifies staining patterns with an accuracy of about 90%.

7.
8.
9.

With the rapid growth of remotely sensed imagery, there is high demand for effective and efficient image retrieval tools to manage and exploit such data. In this letter, we present a novel content-based remote sensing image retrieval (RSIR) method based on a triplet deep metric learning convolutional neural network (CNN). By constructing a triplet network with a metric learning objective function, we extract representative image features in a semantic space in which images from the same class are close to each other while those from different classes are far apart. In such a semantic space, simple measures such as Euclidean distance can be used directly to compare the similarity of images and effectively retrieve images of the same class. We also investigate one supervised and one unsupervised learning method for reducing the dimensionality of the learned semantic features. We present comprehensive experimental results on two public RSIR datasets and show that our method significantly outperforms the state of the art.
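
The idea of pulling same-class images together and pushing different-class images apart is commonly expressed as a triplet loss. The sketch below uses the standard hinge form with plain Euclidean distance; the margin value and the toy 2-D embeddings are assumptions, and the abstract does not say whether the authors use this exact variant (some implementations use squared distances instead).

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss: max(0, d(a, p) - d(a, n) + margin)."""
    return max(0.0,
               euclidean(anchor, positive) - euclidean(anchor, negative) + margin)

# Embeddings are made-up 2-D vectors for illustration.
satisfied = triplet_loss((0.0, 0.0), (0.0, 1.0), (3.0, 4.0))  # 1 - 5 + 1 < 0
violated = triplet_loss((0.0, 0.0), (3.0, 4.0), (0.0, 1.0))   # 5 - 1 + 1 = 5
```

The loss is zero once the negative is farther from the anchor than the positive by at least the margin, which is why retrieval in the learned space can fall back on a simple distance ranking.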

10.
Goyal, Neha; Kumar, Nitin; Kapil. Multimedia Tools and Applications, 2022, 81(22): 32243-32264

Automated plant recognition based on leaf images is a challenging task for researchers in several fields. The task requires distinguishing features derived from leaf images in order to assign a class label to a leaf image, and the literature offers several methods for extracting such features. In this paper, we propose a novel automated framework for leaf identification. The framework works in multiple phases: pre-processing, feature extraction, and classification using a bagging approach. Initially, leaf images are pre-processed using image processing operations such as boundary extraction and cropping. In the feature extraction phase, popular nature-inspired optimization algorithms, namely Spider Monkey Optimization (SMO), Particle Swarm Optimization (PSO), and Gray Wolf Optimization (GWO), are used to reduce the dimensionality of the features. In the last phase, a leaf image is classified by multiple classifiers, and the outputs of these classifiers are combined by majority voting. The effectiveness of the proposed framework is established from experimental results on three datasets: Flavia, Swedish, and a self-collected set of leaf images. On all the datasets, the classification accuracy of the proposed method is better than that of the individual classifiers. Furthermore, the classification accuracy of the proposed approach is comparable to that of a deep learning based method on the Flavia dataset.
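
The final combination step, majority voting over several classifiers, can be sketched in a few lines. The function name `majority_vote`, the tie-breaking rule, and the species labels are assumptions for illustration; the abstract does not say how ties are resolved.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the most common label; ties go to the label seen first."""
    return Counter(predictions).most_common(1)[0][0]

# Hypothetical outputs of three classifiers for one leaf image.
votes = ["maple", "oak", "maple"]
winner = majority_vote(votes)
```

Here two of the three hypothetical classifiers agree, so their label wins even though one classifier disagrees.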


11.
Image retrieval using nonlinear manifold embedding   Cited by: 1 (self-citations: 0, citations by others: 1)
Can Jun Xiaofei Chun Jiajun. Neurocomputing, 2009, 72(16-18): 3922
The huge number of images on the Web has given rise to content-based image retrieval (CBIR), because text-based search techniques cannot meet the need to retrieve Web images precisely. However, CBIR has a fundamental flaw: the semantic gap between high-level semantic concepts and low-level visual features. Relevance feedback has therefore been introduced into CBIR to learn the subjective needs of users. In practical applications, however, the limited amount of user feedback is usually overwhelmed by the high dimensionality of the visual feature space. To address this issue, this paper proposes a novel semi-supervised learning method for dimensionality reduction, kernel maximum margin projection (KMMP), based on our previous work on maximum margin projection (MMP). Unlike traditional dimensionality reduction algorithms such as principal component analysis (PCA) and linear discriminant analysis (LDA), which see only the global Euclidean structure, KMMP is designed to discover the local manifold structure. After projecting the images into a lower-dimensional subspace, KMMP significantly improves image retrieval performance. Experimental results on the Corel image database demonstrate the effectiveness of the proposed nonlinear algorithm.

12.
Yu, Tan; Meng, Jingjing; Fang, Chen; Jin, Hailin; Yuan, Junsong. International Journal of Computer Vision, 2020, 128(8-9): 2325-2343

Product quantization has been widely used in fast image retrieval because it encodes high-dimensional visual features effectively. By constructing an approximation function, we extend hard-assignment quantization to soft-assignment quantization. Because soft-assignment quantization is differentiable, the product quantization operation can be integrated as a layer in a convolutional neural network, giving the proposed product quantization network (PQN). Meanwhile, by extending the triplet loss to an asymmetric triplet loss, we directly optimize the retrieval accuracy of the learned representation under an asymmetric similarity measurement. With PQN, we can learn a discriminative and compact image representation in an end-to-end manner, which in turn enables fast and accurate image retrieval. By revisiting residual quantization, we further extend PQN to the residual product quantization network (RPQN). Benefiting from the residual learning triggered by residual quantization, RPQN achieves higher accuracy than PQN at the same computational cost. Moreover, we extend PQN to the temporal product quantization network (TPQN), which exploits temporal consistency in videos to speed up video retrieval. It integrates frame-wise feature learning, frame-wise feature aggregation, and video-level feature quantization in a single neural network. Comprehensive experiments on multiple public benchmark datasets demonstrate the state-of-the-art performance of the proposed PQN, RPQN, and TPQN in fast image and video retrieval.
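
To make the hard-versus-soft distinction concrete, the sketch below replaces "pick the nearest codeword" with a softmax-weighted average of codewords. The squared-distance similarity, the sharpness parameter `alpha`, and the function name `soft_quantize` are assumptions for illustration and may differ from the approximation function used in the paper. In product quantization this would be applied to each sub-vector separately, which is not shown here.

```python
import math

def soft_quantize(x, codewords, alpha):
    """Softmax-weighted average of codewords.

    Weights are proportional to exp(-alpha * ||x - c||^2). This similarity
    is an assumption for illustration; as alpha grows, the result
    approaches the nearest codeword (hard assignment).
    """
    d2 = [sum((a - b) ** 2 for a, b in zip(x, c)) for c in codewords]
    m = min(d2)  # subtract the minimum for numerical stability
    weights = [math.exp(-alpha * (d - m)) for d in d2]
    total = sum(weights)
    return [sum(w * c[i] for w, c in zip(weights, codewords)) / total
            for i in range(len(x))]

codewords = [(0.0, 0.0), (1.0, 1.0)]  # a made-up two-entry codebook
soft = soft_quantize((0.2, 0.2), codewords, alpha=1.0)
hard_like = soft_quantize((0.2, 0.2), codewords, alpha=1000.0)
```

With a small `alpha` the output is a blend of both codewords, and with a large `alpha` it collapses onto the nearest one. Because the blend varies smoothly with the input, gradients can flow through it, which is the property that lets quantization sit inside a network as a layer.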


13.
Many problems in information processing, such as face recognition, image/text retrieval, and data visualization, involve some form of dimensionality reduction. Typical linear dimensionality reduction algorithms include principal component analysis (PCA), random projection, and locality-preserving projection (LPP). These techniques are generally unsupervised, which allows them to model data in the absence of labels or categories. In this paper, we propose a semi-supervised subspace learning algorithm for image retrieval. In a relevance feedback-driven image retrieval system, user-provided information can be used to better describe the intrinsic semantic relationships between images. Our algorithm is fundamentally based on LPP, extended to incorporate the user's relevance feedback. As feedback accumulates, we ultimately obtain a semantic subspace in which different semantic classes are best separated and retrieval performance is enhanced. We compared the proposed algorithm with PCA and standard LPP. Experimental results on a large collection of images show the effectiveness and efficiency of the proposed algorithm.

14.
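Ethnic costume images feature clothing styles, accessories, and patterns in different ethnic styles, which leads to low accuracy in fine-grained retrieval of ethnic costume images. This paper therefore proposes a global-local feature extraction method for fine-grained ethnic costume image retrieval. First, based on custom semantic annotations of ethnic costumes, region detection is performed on the input image to obtain foreground, style, pattern, and accessory images. Then a multi-branch global-local feature extraction model is built on a fully convolutional network structure, and for different regions...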

15.
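Objective: Clothing retrieval plays an important role in promoting and selling clothing online. However, current clothing retrieval algorithms cannot accurately retrieve clothing that lacks a text description, and their accuracy on cross-scenario, multi-label clothing images in particular still needs improvement. To address the large variation among cross-scenario multi-label clothing images and the excessively high dimensionality of convolutional neural network output features, a clothing retrieval algorithm based on deep multi-label parsing and hashing is proposed. Methods: First, conditional random fields are added to an FCN (fully convolutional network) to post-process the FCN output, giving an end-to-end network with FCN coarse segmentation followed by CRF (conditional random field) fine segmentation that performs pixel-level semantic recognition. Second, to suit cross-scenario clothing retrieval, the CCP (Clothing Co-Parsing) dataset is adjusted and a Consumer-to-Shop dataset is constructed. To counter the semantic drift that tends to occur during retrieval, a multi-task learning network is used to train a clothing classification model and a clothing similarity model separately. Results: Comparative clothing parsing experiments on the Consumer-to-Shop dataset show that adding CRFs as post-processing clearly improves parsing. In a comparison with three mainstream retrieval algorithms, the proposed method achieves good retrieval performance even when using hash features: its top-5 accuracy is 1.31% higher than that of WTBI (where to buy it) and 0.21% higher than that of DARN (dual attribute-aware ranking network). Conclusion: To address poor cross-scenario performance and low retrieval efficiency in clothing retrieval, a fast multi-object clothing retrieval method based on pixel-level semantic segmentation and hash coding is proposed. Compared with other retrieval methods, it has some advantage in multi-object, multi-label clothing retrieval, and it reduces storage space and improves retrieval efficiency while maintaining retrieval quality.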
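
Retrieval with hash features usually comes down to ranking database items by Hamming distance to the query code, which is why it saves storage and time. The sketch below shows that ranking step only; the 8-bit codes, item names, and function names are hypothetical, and the abstract does not describe the authors' actual code length or search procedure.

```python
def hamming(a, b):
    """Number of differing bits between two integer-encoded hash codes."""
    return bin(a ^ b).count("1")

def rank_by_hamming(query, database):
    """Return database keys sorted by Hamming distance to the query code."""
    return sorted(database, key=lambda name: hamming(query, database[name]))

# Hypothetical 8-bit codes; real systems typically use longer codes.
database = {"coat": 0b10110010, "dress": 0b10110011, "shoe": 0b01001101}
ranking = rank_by_hamming(0b10110011, database)
```

Each comparison is a single XOR plus a bit count, so a compact binary code can be searched far faster than a high-dimensional floating-point feature vector.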

16.
This paper presents a novel application of advanced machine learning techniques to Mars terrain image classification. Fuzzy-rough feature selection (FRFS) is adapted and then employed in conjunction with Support Vector Machines (SVMs) to construct image classifiers. These techniques are integrated to address problems in space engineering where the images are large in scale, belong to many classes, and have diverse representational properties. The adapted FRFS allows low-dimensionality feature sets to be induced from feature patterns of much higher dimensionality. To evaluate the proposed work, image classifiers based on K-Nearest Neighbours (KNNs) and decision trees (DTREEs), as well as feature selection based on information gain rank (IGR), are also investigated as possible alternatives to the underlying machine learning techniques adopted. The results of systematic comparative studies demonstrate that, in general, feature selection improves the performance of classifiers intended for use in high-dimensional domains. In particular, the proposed approach helps to increase classification accuracy while improving classification efficiency by requiring considerably fewer features. This is evident in that the resulting SVM-based classifiers that use FRFS-selected features generally outperform KNN and DTREE based classifiers and those that use IGR-returned features. The work is therefore shown to have great potential for on-board or ground-based image classification in future Mars rover missions.
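
Information gain, the quantity behind the IGR baseline mentioned above, measures how much knowing a feature reduces uncertainty about the class. The sketch below computes it for discrete features; the terrain labels and feature values are invented for illustration, continuous image features would first need discretizing, and FRFS itself is not shown.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(feature_values, labels):
    """Reduction in label entropy after splitting on a discrete feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

# Invented terrain labels and two candidate features.
labels = ["rock", "rock", "sand", "sand"]
informative = information_gain(["rough", "rough", "smooth", "smooth"], labels)
uninformative = information_gain(["dark", "light", "dark", "light"], labels)
```

Ranking features by this score and keeping the top few is one simple way to shrink a feature set. The first toy feature separates the classes perfectly and scores highest, while the second carries no class information.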

17.
Image retrieval system based on deep learning   Cited by: 2 (self-citations: 0, citations by others: 2)
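The key technologies of a content-based image retrieval system are the extraction of effective image features and the similarity matching strategy. In the past, content-based image retrieval systems relied mainly on low-level visual features and could not produce satisfactory results, so despite great effort, content-based image retrieval remains a challenge in computer vision. The biggest problem in such systems is the "semantic gap": the difference between the similarity a machine derives from low-level visual features and the similarity a person derives from high-level semantic features. Traditional content-based image retrieval systems learn image features only from low-level visual features and cannot effectively bridge the semantic gap. In recent years, the rapid development of deep learning has offered hope. Deep learning originated in research on artificial neural networks. It combines low-level features into more abstract high-level representations of attribute categories or features in order to discover the distribution patterns of data, which other algorithms cannot achieve. Inspired by the great success of deep learning in computer vision, speech recognition, natural language processing, image and video analysis, multimedia, and many other fields, this paper applies deep learning to content-based image retrieval to address the semantic gap.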

18.
Adopting an effective model to access desired images is essential given the huge number of digital images now available. This paper introduces an accurate and rapid model for content-based image retrieval that depends on a new matching strategy. The proposed model comprises four major phases: feature extraction, dimensionality reduction, an ANN classifier, and a matching strategy. The feature extraction phase extracts color and texture features called the color co-occurrence matrix (CCM) and the difference between pixels of scan pattern (DBPSP), respectively. Integrating multiple features can overcome the problems of a single feature, but the system then runs slowly, mainly because of the high dimensionality of the feature space. The dimensionality reduction technique therefore selects the effective features that jointly have the largest dependency on the target class and minimal redundancy among themselves. These features reduce the calculation work and the computation time of the retrieval process. The artificial neural network (ANN) in the proposed model serves as a classifier: the selected features of the query image are its input, and its output is the class, among multiple classes, that is most similar to the query image. In addition, the proposed model presents an effective feature matching strategy that uses the minimum area between two vectors to compute the similarity between a query image and the images in the determined class. Finally, the results presented in this paper demonstrate that the proposed model provides accurate retrieval results and improves performance with significantly less computation time than other models.

19.
We propose a complementary relevance feedback-based content-based image retrieval (CBIR) system. The system exploits the synergy between short-term and long-term learning techniques to improve retrieval performance. Specifically, we construct an adaptive semantic repository in long-term learning to store retrieval patterns from historical query sessions. We then extract high-level semantic features from the semantic repository and integrate low-level visual features with these high-level semantic features in short-term learning to represent the query effectively within a single retrieval session. The high-level semantic features are dynamically updated based on users' query concepts and therefore represent an image's semantic concept more accurately. Our extensive experimental results demonstrate that the proposed system outperforms seven state-of-the-art peer systems in terms of retrieval precision and storage space on a large-scale image database.

20.
Content-based image retrieval technology   Cited by: 13 (self-citations: 0, citations by others: 13)
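With the growing number of digital images, content-based image retrieval has become a problem that image users and managers urgently need solved, and in recent years researchers from many countries have entered the field. To give readers an overview of the current state of the field and to encourage further research, this paper first outlines the origin, development, and key technologies of content-based image retrieval. It then introduces the principles and algorithms of feature extraction (covering both low-level and semantic features), similarity computation, and relevance feedback. Finally, it points out how content-based image retrieval differs from computer vision, and it analyzes current problems, research priorities, and future directions.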
