Similar Documents
 20 similar documents found (search time: 140 ms)
1.
Traditional similar-image retrieval mostly relies on features such as color, texture, and scene, lacking local feature extraction and overlooking the influence of local features on retrieval, so retrieval performance is poor. This paper adopts the visual BOW (Bag of Words) model, extracts scale-invariant SIFT features to form visual-word vectors, and designs a similar-image classification and retrieval method based on visual words, achieving good classification and retrieval results.
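The SIFT + bag-of-visual-words pipeline the abstract describes can be sketched in a few lines. This is a minimal illustration with random stand-in descriptors (real SIFT extraction and the paper's parameters are not reproduced here), assuming a plain k-means vocabulary:

```python
import numpy as np

def build_vocabulary(descriptors, k, iters=20, seed=0):
    """Simple k-means over local descriptors to form k visual words."""
    rng = np.random.default_rng(seed)
    centers = descriptors[rng.choice(len(descriptors), k, replace=False)]
    for _ in range(iters):
        # assign each descriptor to its nearest center
        d = ((descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(1)
        for j in range(k):
            pts = descriptors[labels == j]
            if len(pts):
                centers[j] = pts.mean(0)
    return centers

def bow_histogram(descriptors, centers):
    """Quantize one image's descriptors into a normalized word histogram."""
    d = ((descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    hist = np.bincount(d.argmin(1), minlength=len(centers)).astype(float)
    return hist / hist.sum()

# Toy demo: 128-D "SIFT-like" descriptors drawn at random.
rng = np.random.default_rng(1)
desc = rng.normal(size=(200, 128))
vocab = build_vocabulary(desc, k=8)
h = bow_histogram(desc, vocab)
```

The resulting histograms can then be compared with any histogram distance for retrieval, or fed to a classifier.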

2.
Scene recognition is a fundamental task in computer vision research. Unlike image classification, it must jointly consider a scene's background information, local scene features, and object features, so classic convolutional neural networks perform poorly on it. To address this, the paper proposes a global and local scene representation based on deep convolutional features. The method transforms the convolutional features of each scene image to generate a comprehensive feature representation. Using CAM...

3.
曹晔 《电子学报》2019,47(4):832-836
As an important research direction in computer vision analysis, image classification depends heavily on how image features are represented. To classify images better, this paper proposes a Neural Gas based Locality-constrained Sparse Coding (NGLSC) algorithm for image classification. A locality-ranking adapter serving as a distance-regularization constraint has already been applied in Neural Gas (NG) vector quantization, aiming to remedy the shortcomings of K-means clustering through soft competitive learning. In the sparse coding stage the algorithm admits a closed-form solution. Moreover, dictionary updates are generally driven by the error term of the objective function, a strategy several classic algorithms adopt. Using the ORL and COIL20 databases, the proposed algorithm is compared with the existing Locality-constrained Linear Coding (LLC) and Metaface Learning (MFL) methods. Experimental results show that the proposed algorithm reaches over 95% classification accuracy, offering a valuable approach to image classification in computer vision.
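The closed-form locality-constrained coding step of the LLC baseline mentioned above can be sketched as follows; the dictionary size, neighborhood size k, and regularizer below are illustrative choices, not values from the paper:

```python
import numpy as np

def llc_code(x, B, k=5, lam=1e-4):
    """LLC-style coding of one feature x (d,) over dictionary B (n_atoms, d):
    solve a small regularized least-squares over the k nearest atoms."""
    dist = ((B - x) ** 2).sum(1)
    idx = np.argsort(dist)[:k]          # k nearest dictionary atoms
    Z = B[idx] - x                      # shift atoms to the feature
    C = Z @ Z.T + lam * np.eye(k)       # regularized local covariance
    w = np.linalg.solve(C, np.ones(k))  # closed-form solution
    w /= w.sum()                        # shift-invariance constraint: sum(w)=1
    code = np.zeros(len(B))
    code[idx] = w
    return code

rng = np.random.default_rng(0)
B = rng.normal(size=(32, 16))           # toy dictionary of 32 atoms
x = rng.normal(size=16)
c = llc_code(x, B)
```

Because only a k x k system is solved per feature, coding stays cheap even for large dictionaries.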

4.
张家辉  谢毓湘  郭延明 《信号处理》2020,36(11):1804-1810
Scene image classification is a popular direction in machine vision; scene images are rich in content and conceptually complex. Existing deep-network-based scene classification algorithms usually improve recognition by modifying the network structure or augmenting the data, but they neglect the relationship between scene elements and object elements in an image. Building on an analysis of existing deep-network scene classification techniques, this paper proposes a scene classification algorithm that makes local features salient. The algorithm combines the characteristics of local scene features and local object features, exploits the complementary relationship between the two, and optimizes each separately to obtain a more discriminative scene descriptor. It achieves a test accuracy of 88.88% on the MIT Indoor67 dataset, validating its effectiveness.

5.
A Bayesian network local semantic modeling method for natural scene classification
This paper proposes a local semantic modeling method based on Bayesian networks. The network structure covers the directional properties of region neighborhoods and the adjacency relations between region semantics. On top of this local semantic model, a semantic description of scene images is built to achieve natural scene classification. The Bayesian network parameters are learned from an annotated training set of image samples. For an image to be classified, the model fuses each region's features with information from its adjacent regions to infer the region's semantic probability; network iteration then converges to the region semantic labels and probabilities of the whole image, from which a global description is formed and scene classification performed. The method exploits the contextual relations between objects within a scene, compensating for local semantic modeling that uses low-level features alone. Experiments on a dataset of six classes of natural scene images show that the proposed local semantic modeling and image description method is effective.

6.
Targeting the characteristics of space-object images, this paper proposes a classification method based on local invariant features. The method first extracts local invariant features from each image and builds global visual patterns with a Gaussian mixture model (GMM); local features are then matched to visual patterns by maximum a posteriori probability to construct a co-occurrence matrix over the training images, and a probabilistic latent semantic analysis (PLSA) model yields a latent-class representation as a second-level image description; finally, an SVM performs classification. Experimental results verify the effectiveness of the scheme.

7.
For remote sensing image scene classification, a method based on SURF and PLSA is proposed. The method first extracts local features with the SURF algorithm, then clusters the features with K-means to build a visual vocabulary, yielding a bag-of-visual-words description of each image. Probabilistic latent semantic analysis (PLSA) then extracts latent semantic features from the images, and a support vector machine (SVM) classifier completes the scene classification task. Experiments on 21 scene classes show that the method effectively improves scene classification accuracy for remote sensing images.
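The PLSA step in this pipeline can be illustrated with a toy EM implementation on a word-document count matrix; the topic count and iteration budget here are arbitrary stand-ins, not the paper's settings:

```python
import numpy as np

def plsa(N, n_topics, iters=50, seed=0):
    """Basic PLSA via EM on a document-word count matrix N (docs, words).
    Returns P(z|d) (the latent semantic features) and P(w|z)."""
    rng = np.random.default_rng(seed)
    n_docs, n_words = N.shape
    p_z_d = rng.random((n_docs, n_topics)); p_z_d /= p_z_d.sum(1, keepdims=True)
    p_w_z = rng.random((n_topics, n_words)); p_w_z /= p_w_z.sum(1, keepdims=True)
    for _ in range(iters):
        # E-step: responsibilities P(z|d,w) ∝ P(z|d) P(w|z)
        R = p_z_d[:, :, None] * p_w_z[None, :, :]   # (docs, topics, words)
        R /= R.sum(1, keepdims=True) + 1e-12
        # M-step: re-estimate from expected counts
        nz = N[:, None, :] * R
        p_w_z = nz.sum(0); p_w_z /= p_w_z.sum(1, keepdims=True)
        p_z_d = nz.sum(2); p_z_d /= p_z_d.sum(1, keepdims=True)
    return p_z_d, p_w_z

# Toy demo: 6 images described by counts over a 12-word visual vocabulary.
rng = np.random.default_rng(1)
N = rng.integers(1, 5, size=(6, 12)).astype(float)
p_z_d, p_w_z = plsa(N, n_topics=3)
```

The rows of P(z|d) then serve as low-dimensional semantic features for the SVM.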

8.
Much recent work addresses no-reference blur image quality assessment, yet many methods ignore the influence of image content on the result. Pure-background images without salient objects and images containing salient objects against a background should be assessed differently: following the human visual attention mechanism, the former emphasize overall image blur while the latter emphasize local detail blur. Overall blur refers to the sharpness of the image content as a whole; local detail blur refers to local sharpness at different positions; together they tie visual saliency and image content together. To this end, a no-reference blur image quality assessment method based on salient-object classification is proposed. A saliency-detection-based object classification algorithm first categorizes the image under assessment; local and global blur features are then extracted according to the category, and the two features are fused into the final quality score. Experiments show that the algorithm not only achieves the best results on the BLUR database but also performs well on the LIVE, CSIQ, and TID2013 databases, demonstrating strong robustness and excellent statistical performance across the databases.
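One common way to realize the global/local sharpness split described above is Laplacian variance, computed over the whole image and per patch; this is a generic sketch of the idea, not the paper's exact feature set:

```python
import numpy as np

def laplacian(img):
    """4-neighbour Laplacian response over the valid interior region."""
    return (img[:-2, 1:-1] + img[2:, 1:-1] + img[1:-1, :-2]
            + img[1:-1, 2:] - 4 * img[1:-1, 1:-1])

def global_sharpness(img):
    """Variance of the Laplacian over the whole image (overall blur cue)."""
    return laplacian(img).var()

def local_sharpness(img, patch=8):
    """Laplacian variance per non-overlapping patch: a local blur map."""
    h, w = img.shape
    return np.array([[global_sharpness(img[i:i + patch, j:j + patch])
                      for j in range(0, w - patch + 1, patch)]
                     for i in range(0, h - patch + 1, patch)])

# Toy demo: a 2x2 box-average of a sharp image should score lower.
rng = np.random.default_rng(0)
sharp = rng.random((32, 32))
blurred = (sharp + np.roll(sharp, 1, 0) + np.roll(sharp, 1, 1)
           + np.roll(sharp, (1, 1), (0, 1))) / 4.0
```

A real method would weight the local map by a saliency mask before pooling.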

9.
黄鸿  徐科杰  石光耀 《电子学报》2020,48(9):1824-1833
High-resolution remote sensing images are rich in ground-object information but have complex scene composition. Current hand-crafted feature extraction cannot meet the demands of complex scene classification, and although unsupervised feature learning can mine the intrinsic structure of local image patches, features of a single type and scale struggle to represent the complex remote sensing scenes found in practice, limiting classification performance. To address this, a multi-scale, multi-feature remote sensing scene classification method is proposed. The algorithm first designs an improved spectral-clustering unsupervised feature (iUFL-SC) to characterize the intrinsic structure of image patches, then densely samples three kinds of multi-scale local patch features per scene (iUFL-SC, LBP, and SIFT) and obtains a mid-level scene representation via the bag of visual words (BoVW) for a more accurate and detailed description. Classification is finally performed with a histogram intersection kernel SVM (HIKSVM). Experiments on the UC Merced and WHU-RS19 datasets show that the method extracts discriminative features and effectively improves classification performance.
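The histogram intersection kernel behind HIKSVM has a direct definition; a minimal version over L1-normalized BoVW histograms (which can be passed to any SVM that accepts precomputed kernels):

```python
import numpy as np

def hik(X, Y):
    """Histogram intersection kernel: K[i, j] = sum_k min(X[i, k], Y[j, k])."""
    return np.minimum(X[:, None, :], Y[None, :, :]).sum(-1)

# Toy demo: 5 images, 16-word vocabulary, L1-normalized histograms.
rng = np.random.default_rng(0)
H = rng.random((5, 16))
H /= H.sum(1, keepdims=True)
K = hik(H, H)
```

For L1-normalized inputs the kernel is bounded by 1, with K[i, i] = 1 exactly.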

10.
Existing deep-learning no-reference image quality assessment methods either lack semantic relevance or place heavy demands on model training. This paper therefore proposes a no-reference method based on semantic feature tokenization and the Transformer. A deep convolutional neural network first extracts high-level semantic features, which are mapped into visual feature tokens; Transformer self-attention then models the relations among tokens to extract global image features, while a shallow neural network extracts low-level local features that capture low-level distortion. Global and local image information are finally combined to predict image quality. To verify accuracy and robustness, experiments were run on five mainstream image quality assessment datasets and one underwater image quality dataset, using PLCC and SROCC as metrics and comparing against 15 traditional and deep-learning no-reference methods. The results show that with a small parameter footprint (about 1.56 MB) the method performs strongly across datasets, in particular raising SROCC to 0.958 on the multiply-distorted LIVE-MD dataset, demonstrating accurate quality assessment even under complex distortion and a network structure fit for practical application.
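The token-relation step can be sketched as single-head scaled dot-product self-attention over visual feature tokens; the dimensions and random weights below are placeholders, not the paper's architecture:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(tokens, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention over feature tokens."""
    Q, K, V = tokens @ Wq, tokens @ Wk, tokens @ Wv
    A = softmax(Q @ K.T / np.sqrt(K.shape[-1]))  # token-to-token relations
    return A @ V                                 # globally mixed features

# Toy demo: 10 visual tokens of dimension 16.
rng = np.random.default_rng(0)
d = 16
tokens = rng.normal(size=(10, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
out = self_attention(tokens, Wq, Wk, Wv)
```

A full Transformer block would add multiple heads, residual connections, and layer normalization around this core.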

11.
Given the superior performance of biological vision models, a support vector machine (SVM) object classification algorithm based on biological visual features is proposed. The biological vision model is built on Gabor filtering; the resulting features have strong representational power but very high dimensionality, so a linear SVM, which trains quickly and classifies accurately, is chosen to classify them, with good results. The model extracts object features invariant to position and scale, the linear SVM handles the high-dimensional biological visual features, and bootstrap sampling validates the algorithm's effectiveness.
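A Gabor filter bank of the kind underlying such biological vision models can be generated directly; the kernel size, wavelength, and bandwidth below are illustrative, not the paper's settings:

```python
import numpy as np

def gabor_kernel(size, theta, lam, sigma, gamma=0.5):
    """Real-valued Gabor kernel at orientation theta and wavelength lam."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)     # rotated coordinates
    yr = -x * np.sin(theta) + y * np.cos(theta)
    return (np.exp(-(xr**2 + (gamma * yr)**2) / (2 * sigma**2))
            * np.cos(2 * np.pi * xr / lam))

# A small 4-orientation bank, as used in the first (S1) layer of such models.
bank = [gabor_kernel(11, t, lam=6.0, sigma=3.0)
        for t in np.linspace(0, np.pi, 4, endpoint=False)]
```

Convolving an image with each kernel and max-pooling the responses yields the position- and scale-tolerant features that the linear SVM then classifies.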

12.
Fingerprint classification based on learned features
In this paper, we present a fingerprint classification approach based on a novel feature-learning algorithm. Unlike current research on fingerprint classification, which generally uses well-defined, meaningful features, our approach is based on Genetic Programming (GP), which learns to discover composite operators and features evolved from combinations of primitive image processing operations. Our experimental results show that the approach can find good composite operators to effectively extract useful features. Using a Bayesian classifier, without rejecting any fingerprints from the NIST-4 database, the correct rates for 4- and 5-class classification are 93.3% and 91.6%, respectively, which compare favorably with other published research and are among the best results published to date.

13.
We address the problem of visual classification with multiple features and/or multiple instances. Motivated by the recent success of multitask joint covariate selection, we formulate this problem as a multitask joint sparse representation model that combines the strength of multiple features and/or instances for recognition. A joint sparsity-inducing norm is utilized to enforce class-level joint sparsity patterns among the multiple representation vectors. The proposed model can be efficiently optimized by a proximal gradient method. Furthermore, we extend our method to the setup where features are described by kernel matrices. We then investigate two applications of our method to visual classification: 1) fusing multiple kernel features for object categorization and 2) robust face recognition in video with an ensemble of query images. Extensive experiments on challenging real-world data sets demonstrate that the proposed method is competitive with the state-of-the-art methods in the respective applications.
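Class-level joint sparsity of this kind is typically enforced with an L2,1 norm, whose proximal operator (the core step of the proximal gradient method the abstract mentions) is row-wise soft thresholding; a minimal sketch, with the threshold chosen for illustration only:

```python
import numpy as np

def prox_l21(W, t):
    """Proximal operator of t * ||W||_{2,1}: shrink each row of W toward
    zero, zeroing whole rows -- the joint sparsity pattern across tasks."""
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    scale = np.maximum(1.0 - t / np.maximum(norms, 1e-12), 0.0)
    return W * scale

# Row 0 (norm 5) survives shrunk to norm 4; row 1 (norm 0.5) is zeroed out.
W = np.array([[3.0, 4.0],
              [0.3, 0.4]])
W2 = prox_l21(W, 1.0)
```

In the full algorithm this operator is applied after each gradient step on the data-fit term, so coefficient rows are selected or discarded jointly across all features/instances.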

14.
This paper presents a new learning algorithm for audiovisual fusion and demonstrates its application to video classification for a film database. The proposed system utilizes perceptual features to characterize the content of movie clips. These features are extracted from different modalities and fused through a machine learning process. More specifically, to capture spatio-temporal information, adaptive video indexing is adopted to extract the visual feature, and a statistical model based on Laplacian mixtures is utilized to extract the audio feature. These features are fused at the late fusion stage and input to a support vector machine (SVM) to learn semantic concepts from a given video database. Based on our experimental results, the proposed system implementing the SVM-based fusion technique achieves high classification accuracy when applied to a large-volume database containing Hollywood movies.

15.
16.
In order to improve the visual appearance of defogged aerial images, a novel defogging algorithm based on a conditional generative adversarial network is proposed in this work. More specifically, the training process is carried out through an end-to-end trainable deep neural network. In detail, we upgrade the traditional adversarial loss function by incorporating an L1-regularized gradient to encode a rich set of detailed visual information inside each aerial image. In practice, to the best of our knowledge, existing image quality assessment algorithms may exhibit deviation and supersaturation distortion on aerial images. To alleviate this problem, we leverage a random forest classification model to learn the mapping between aerial image features and quality ranking results, thereby transforming defogged image quality assessment into a classification problem. Comprehensive experimental results on our compiled fogged aerial image quality data set clearly demonstrate the effectiveness of the proposed algorithm.

17.
胡正平  涂潇蕾 《信号处理》2011,27(10):1536-1542
For scene classification, the traditional bag-of-words model contains no image context and ignores class differences among image features. This paper proposes a scene classification method combining multi-directional context features with a spatial pyramid model. The method first partitions each image into a uniform grid and extracts scale-invariant (SIFT) features; each local image patch is combined with its spatially adjacent regions in three directions to form three kinds of context features. The context features of each training class are then clustered into visual words, which are concatenated into the final vocabulary, giving each image a visual-word histogram. Finally, spatial pyramid matching produces pyramid histograms and an SVM classifier performs classification. By organically combining patch similarity in the feature domain with contextual relations in the spatial domain, and keeping classes distinct, the method builds a more discriminative vocabulary. Experiments on a general scene image library show better classification performance than traditional methods.
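The spatial pyramid matching step can be sketched as concatenated per-cell word histograms over successively finer grids; the pyramid depth and vocabulary size below are illustrative:

```python
import numpy as np

def spatial_pyramid(points, labels, k, levels=2, size=1.0):
    """Concatenate visual-word histograms over a 2^l x 2^l grid at each
    level l. points: (n, 2) keypoint coords in [0, size); labels: word
    index per keypoint; k: vocabulary size."""
    feats = []
    for lvl in range(levels + 1):
        g = 2 ** lvl
        cell = np.minimum((points / size * g).astype(int), g - 1)
        for cy in range(g):
            for cx in range(g):
                m = (cell[:, 0] == cy) & (cell[:, 1] == cx)
                feats.append(np.bincount(labels[m], minlength=k))
    return np.concatenate(feats).astype(float)

# Toy demo: 50 keypoints, 8-word vocabulary, levels 0..2 (1+4+16 cells).
rng = np.random.default_rng(0)
pts = rng.random((50, 2))
words = rng.integers(0, 8, 50)
f = spatial_pyramid(pts, words, k=8)
```

In the full SPM scheme the per-level histograms are additionally weighted before the kernel SVM is applied; that weighting is omitted here for brevity.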

18.
In this paper, we propose a sector-wise JPEG fragment classification approach to classify normal and erroneous JPEG data fragments with a minimum size of 512 bytes per fragment. Our method processes each read-in 512-byte sector using DCT coefficient analysis methods to extract features of visual inconsistencies. The classification is conducted before the inverse DCT and can be performed simultaneously with JPEG decoding. The contributions of this work are twofold: (1) a sector-wise JPEG erroneous fragment classification approach is proposed; (2) new DCT coefficient analysis methods are introduced for image content analysis. Testing results on a variety of erroneous fragmented and normal JPEG files prove the strength of this operator for forensic analysis, data recovery, and the classification and detection of abnormal fragment inconsistencies. Furthermore, the results show that the proposed DCT coefficient analysis methods are efficient and practical in terms of classification accuracy. In our experiment, the proposed approach yields a false positive rate of 0.32% and a true positive rate of 96.1% for erroneous JPEG fragment classification.
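The 8x8 block DCT whose coefficients such methods analyze can be written as an orthonormal matrix transform; as a sanity check, a uniform block concentrates all energy in the DC coefficient, exactly the kind of regularity coefficient analysis exploits:

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis matrix, as used on 8x8 JPEG blocks."""
    k, i = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    D = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    D[0] /= np.sqrt(2)   # DC row scaled for orthonormality
    return D

def block_dct(block):
    """2-D DCT of one square pixel block; coefficient (0,0) is the DC term."""
    D = dct_matrix(block.shape[0])
    return D @ block @ D.T

flat = np.full((8, 8), 100.0)   # uniform block: all energy in DC
C = block_dct(flat)
```

A fragment classifier would compute statistics over such coefficient blocks (e.g. AC energy distributions across sector boundaries) rather than inverting the DCT.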

19.
Media aesthetic assessment is a key technique in computer vision, widely applied in computer game rendering and video/image classification. Video aesthetic assessment algorithms based on fusing low-level and high-level features have achieved impressive performance, outperforming photo- and motion-based algorithms; however, these methods focus only on the aesthetic features of single frames while ignoring the inherent relationship between adjacent frames. We therefore propose a novel video aesthetic assessment framework in which structural cues among frames are well encoded. Our method consists of two components: aesthetic feature extraction and structure correlation construction. More specifically, we incorporate both low-level and high-level visual features to construct aesthetic features, extracting salient regions for content understanding. Subsequently, we develop a structure correlation-based algorithm to evaluate the relationship among adjacent frames, where frames with similar structural properties should have a strong correlation coefficient. Afterwards, a kernel multi-SVM is trained for video classification and high-aesthetic video selection. Comprehensive experiments demonstrate the effectiveness of our method.

20.
A system for scene-oriented hierarchical classification of blurry and noisy images is proposed. It attempts to simulate important features of human visual perception. The underlying approach is based on three strategies: extraction of essential signatures captured from a global context, simulating the global pathway; highlight detection based on local conspicuous features of the reconstructed image, simulating the local pathway; and hierarchical classification of the extracted features using probabilistic techniques. The hierarchical classification techniques use input from both the local and global pathways. Visual context is exploited by combining Gabor filtering with principal component analysis. In parallel, a pseudo-restoration process is applied together with an affine-invariant approach to improve the accuracy of local conspicuous feature detection. Subsequently, the local conspicuous features and the global essential signature are combined and clustered by a Monte Carlo approach. Finally, the clustered features are fed to a self-organizing tree algorithm to generate the final hierarchical classification results. Selected representative results of a comprehensive experimental evaluation validate the proposed system.

