Similar Literature
 Found 20 similar articles; search took 15 ms
1.
This paper presents a novel approach based on contextual Bayesian networks (CBN) for natural scene modeling and classification. The structure of the CBN is derived from domain knowledge, and its parameters are learned from training images. For test images, hybrid streams of semantic features of image content and spatial information are fed into the CBN-based inference engine, which can incorporate domain knowledge and handle multiple pieces of input evidence, producing a category label for the entire image. We demonstrate the promise of this approach for natural scene classification by comparing it with several state-of-the-art approaches.

2.
Automatic image orientation detection for natural images is a useful yet challenging research topic. Humans use scene context and semantic object recognition to identify the correct image orientation. However, it is difficult for a computer to perform the task in the same way because current object recognition algorithms are extremely limited in scope and robustness. As a result, existing orientation detection methods have been built on low-level vision features such as the spatial distributions of color and texture, and discrepant detection rates have been reported for these methods in the literature. We have developed a probabilistic approach to image orientation detection via confidence-based integration of low-level and semantic cues within a Bayesian framework. Our current accuracy is 90 percent for unconstrained consumer photos, which is impressive given the findings of a recent psychophysical study. The proposed framework is an attempt to bridge the gap between computer and human vision systems and is applicable to other problems involving semantic scene content understanding.
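The confidence-based integration of cues described above can be illustrated with a minimal sketch. It is not the authors' model: the four candidate orientations, the conditional independence of cues, and the use of a confidence value as an exponent on each cue's likelihood are all assumptions made here for illustration, and `fuse_cues` is a hypothetical name.

```python
import math

# Assumed candidate orientations in degrees (not specified in the abstract).
ORIENTATIONS = (0, 90, 180, 270)

def fuse_cues(prior, cues):
    """Return a posterior over orientations from a prior and a list of cues.

    Each cue is a (likelihood, confidence) pair, where likelihood maps an
    orientation to P(cue | orientation) and confidence lies in [0, 1].
    Assumption: cues are conditionally independent given the orientation,
    and confidence tempers a cue (0 ignores it, 1 trusts it fully).
    """
    log_post = {o: math.log(prior[o]) for o in ORIENTATIONS}
    for likelihood, confidence in cues:
        for o in ORIENTATIONS:
            log_post[o] += confidence * math.log(likelihood[o])
    # Normalize in a numerically stable way.
    peak = max(log_post.values())
    unnorm = {o: math.exp(v - peak) for o, v in log_post.items()}
    total = sum(unnorm.values())
    return {o: u / total for o, u in unnorm.items()}
```

With a uniform prior, a weakly trusted low-level cue that favors 0 degrees and a fully trusted semantic cue that favors 90 degrees, the posterior peaks at 90 degrees; a cue with confidence 0 leaves the prior unchanged.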

3.
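In content-based image retrieval and computer vision research, how to connect low-level visual features with high-level semantic information, that is, how to effectively extract the semantic concepts an image expresses from its low-level features, is one of the most closely watched open problems. The problem becomes harder still when an image contains multiple semantic concepts. In this paper, we propose a semantic concept annotation method based on frequent patterns of low-level image feature values. We implement a set of effective pattern mining algorithms tailored to the characteristics of image blocks and design an algorithm for generating annotation rules. Experiments on authoritative real-world datasets show that our method outperforms several earlier algorithms when annotating images that contain multiple semantic concepts.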

4.
Image classification is an essential task in content-based image retrieval. However, due to the semantic gap between low-level visual features and high-level semantic concepts, and the diversification of Web images, the performance of traditional classification approaches is far from users' expectations. In an attempt to reduce the semantic gap and satisfy the urgent requirements for dimensionality reduction, high-quality retrieval results, and batch-based processing, we propose a hierarchical image manifold with novel distance measures for calculation. Assuming that the images in an image set describe the same or similar object but have various scenes, we formulate two kinds of manifolds, object manifold and scene manifold, at different levels of semantic granularity. Object manifold is developed for object-level classification using an algorithm named extended locally linear embedding (ELLE) based on intra- and inter-object difference measures. Scene manifold is built for scene-level classification using an algorithm named locally linear submanifold extraction (LLSE) by combining linear perturbation and region growing. Experimental results show that our method is effective in improving the performance of classifying Web images.

5.
While people compare images using semantic concepts, computers compare images using low-level visual features that sometimes have little to do with these semantics. To reduce the gap between the high-level semantics of visual objects and the low-level features extracted from them, we develop in this paper a framework for learning pseudo metrics (LPM) using neural networks for semantic image classification and retrieval. Performance analysis and comparative studies on an image database show that the LPM has potential applications in multimedia information processing.

6.
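A survey of semantic video retrieval   Total citations: 4 (self-citations: 1, citations by others: 4)
Video content retrieval is an active research direction in multimedia applications, and most existing content retrieval techniques are based on low-level features. These non-semantic low-level features are hard to interpret and far removed from the high-level semantic concepts in human thinking, which seriously impairs the usability of video content retrieval systems. The semantic gap between low-level features and high-level semantic concepts is difficult to bridge. How to cross this gap and retrieve video content with semantic concepts is currently the most challenging research direction in content-based video retrieval. This paper introduces the background against which semantic video retrieval emerged, analyzes the causes of the semantic gap, and surveys the main existing methods that attempt to bridge it. It reviews the strengths and weaknesses of the related techniques and discusses possible future research directions for each method, as well as likely near-term and long-term technical breakthroughs in semantic video retrieval.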

7.
Semantic analysis of soccer video using dynamic Bayesian network   Total citations: 3 (self-citations: 0, citations by others: 3)
Video semantic analysis is formulated on the basis of low-level image features and high-level knowledge encoded in abstract, nongeometric representations. This paper introduces a semantic analysis system based on Bayesian networks (BN) and dynamic Bayesian networks (DBN). It is validated in the particular domain of soccer game videos. Using BN/DBN, it can identify special events in soccer games such as goals, corner kicks, penalty kicks, and cards. The video analyzer extracts the low-level evidence, whereas the semantic analyzer uses BN/DBN to interpret the high-level semantics. Unlike previous shot-based semantic analysis approaches, the proposed semantic analysis is frame-based: for each input frame, it provides the current semantics of the event nodes as well as the hidden nodes. Another contribution is that the BN and DBN are generated automatically by the training process rather than determined ad hoc. The last contribution is a so-called temporal intervening network that improves the accuracy of the semantic output.

8.
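To address the mismatch problem in cross-domain medical image analysis, an unsupervised domain adaptation framework based on adversarial learning (UAL-DAF) is proposed. Specifically, the framework narrows the appearance-level and semantic-level differences between cross-domain medical images through an appearance transfer module (ATM) and a semantic transfer module (STM) combined with a conditional generative adversarial network, respectively. In challenging medical image segmentation experiments, the results are significantly better than those of existing methods. The framework can therefore extract appearance-level and semantic-level information for domain-adaptive knowledge and achieve a collaborative fusion of domain knowledge.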

9.
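To overcome the semantic gap between low-level image features and high-level semantics, and to reduce the dependence of top-down saliency detection methods on priors for specific objects, a saliency detection method based on high-level color semantic features is proposed. Structured color features are first extracted from the color image, and color naming is performed within a multiple kernel learning framework to obtain the color semantic name of each pixel. The distribution of color semantic names in the image is then used to compute high-level color semantic features, which are fused with low-level Gist features. A saliency classifier is trained with a linear support vector machine to achieve pixel-level saliency detection. Experimental results show that the algorithm detects human visual fixation points more accurately and that, compared with traditional low-level color features, the proposed color semantic features yield better saliency detection results.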
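As a rough illustration of the "color name distribution" idea above, the sketch below assigns each pixel a color name and builds a normalized histogram that is then concatenated with a low-level descriptor. It deliberately swaps in a nearest-prototype rule for the multiple kernel learning step described in the abstract; the prototype RGB values, the five-name vocabulary, and all function names are hypothetical.

```python
from collections import Counter

# Hypothetical RGB prototypes for a tiny color-name vocabulary.
COLOR_PROTOTYPES = {
    "red": (255, 0, 0),
    "green": (0, 255, 0),
    "blue": (0, 0, 255),
    "white": (255, 255, 255),
    "black": (0, 0, 0),
}
COLOR_NAMES = sorted(COLOR_PROTOTYPES)  # fixed bin order for the histogram

def name_pixel(rgb):
    """Return the color name whose prototype is nearest in squared RGB distance."""
    return min(
        COLOR_NAMES,
        key=lambda n: sum((a - b) ** 2 for a, b in zip(rgb, COLOR_PROTOTYPES[n])),
    )

def color_name_histogram(pixels):
    """Normalized distribution of color names over a list of RGB pixels."""
    counts = Counter(name_pixel(p) for p in pixels)
    return [counts[n] / len(pixels) for n in COLOR_NAMES]

def fuse_features(color_hist, low_level):
    """Fuse by simple concatenation (an assumed fusion rule)."""
    return list(color_hist) + list(low_level)
```

The fused vector could then be passed to any linear classifier; the abstract uses a linear support vector machine for that step.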

10.
Semantic image segmentation aims to partition an image into non-overlapping regions and assign a pre-defined object class label to each region. In this paper, a semantic method combining low-level features and high-level contextual cues is proposed to segment natural scene images. The proposed method first takes the gist representation of an image as its global feature. The image is then over-segmented into many super-pixels, and histogram representations of these super-pixels are used as local features. In addition, co-occurrence and spatial layout relations among object classes are exploited as contextual cues. Finally, the features and cues are integrated into an inference framework based on a conditional random field by defining specific potential terms and introducing weighting functions. The proposed method has been compared with state-of-the-art methods on the MSRC database, and the experimental results show its effectiveness.

11.
This paper presents the S-Chart framework, an approach for semantic image interpretation of line charts, and the InteliStrata system, an application for semantic interpretation of gamma ray profiles. The S-Chart framework is structured as a set of knowledge models and algorithms that can be instantiated to accomplish chart interpretation in a wide range of domains. The knowledge models are represented at three semantic levels and apply the concept of symbol grounding to map the representation primitives between the levels. The interpretation algorithms carry out the interaction between the high-level symbolic reasoning and the low-level signal processing. To demonstrate the applicability of the S-Chart framework, we developed the InteliStrata system, an application in geology for the semantic interpretation of gamma ray profiles. Using this application, we interpreted the charts of two gamma ray profiles captured in petroleum exploration wells, indicating the position of stratigraphic sequences and maximum flooding surfaces. The results were compared with the interpretation produced by an experienced geologist from the same input data. The system produced interpretations that were compatible with the geologist's interpretation of the data. Our framework has the advantage of allowing existing domain ontologies to be integrated with domain-independent visual knowledge models, and it can ground domain concepts in low-level data.

12.
Multimedia data mining refers to pattern discovery, rule extraction, and knowledge acquisition from multimedia databases. Two typical tasks in multimedia data mining are visual data classification and clustering in terms of semantics. The performance of such classification or clustering systems is often unsatisfactory because low-level features are used for image representation and because improper similarity metrics are used to measure the closeness between multimedia objects. This paper considers the problem of modeling similarity for semantic image clustering. A collection of semantic images and feed-forward neural networks are used to approximate a characteristic function of equivalence classes, which is termed a learning pseudo metric (LPM). Empirical criteria for evaluating the goodness of the LPM are established. An LPM-based k-means rule is then employed for semantic image clustering, where two impurity indices, classification performance, and robustness are used for performance evaluation. An artificial image database with 11 semantics is employed for our simulation studies. Results demonstrate the merits and usefulness of the proposed techniques for multimedia data mining.

13.
Due to the rapid growth of digital technologies, huge volumes of image data are now created and shared on social media sites. User-provided tags attached to each social image are widely recognized as a bridge across the semantic gap between low-level image features and high-level concepts. Hence, combining images with their corresponding tags is useful for intelligent retrieval systems, which are designed to gain a high-level understanding of images and facilitate semantic search. However, user-provided tags are usually incomplete and noisy in practice, which may degrade retrieval performance. To tackle this problem, we present a novel retrieval framework that automatically associates visual content with textual tags and enables effective image search. To this end, we first propose a probabilistic topic model learned on social images to discover latent topics from the co-occurrence of tags and image features. Moreover, our topic model is built by exploiting expert knowledge about the correlation between tags and visual content and about the relationship among image features, formulated in terms of spatial location and color distribution. The discovered topics then help to predict the missing tags of an unseen image as well as those of partially labeled images in the database. These predicted tags can greatly facilitate a reliable measure of semantic similarity between the query and database images. We therefore further present a scoring scheme that estimates the similarity by fusing textual tags and visual representation. Extensive experiments conducted on three benchmark datasets show that our topic model provides accurate annotation despite the noise and incompleteness of tags. Using our generalized scoring scheme, which is particularly advantageous for many types of queries, the proposed approach also outperforms state-of-the-art approaches in terms of retrieval accuracy.

14.
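Objective: Face frontalization is currently a hot topic in computer vision. Existing methods place high demands on training data, such as precise registration of input and output images and complete facial prior information. Such data are costly to collect and the usable datasets are small, so applying existing methods directly to real, uncontrolled scenes rarely achieves satisfactory performance. To address these problems, a method is proposed for frontal reconstruction of face images from arbitrary viewpoints that does not depend on image registration or prior information. Method: A face encoding network with two input paths is first proposed; the paths learn the visual representation and the semantic representation of the input face, respectively, and together they form a more complete face representation model. A decoding network with multi-category representation fusion is then built, which fuses the two kinds of representation with the visual representation as the basis and the semantic representation as guidance; decoding the fused information yields the final frontalized face reconstruction. Results: Performance is first evaluated against eight relatively advanced methods on the Multi-PIE (multi-pose, illumination and expression) dataset. Quantitative and qualitative experimental results show that the proposed method outperforms the compared methods in both objective metrics and visual quality. In addition, compared with the current high-performing flow-based feature warping model (FFWM) ...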

15.
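Semantic analysis is both a focus and a difficulty of high-level cognition in image understanding, and it faces two key problems: the semantic gap between images and text, and the ambiguity of textual descriptions. Taking the semantic representation of image ontologies as its core, and building on a summary of image semantic features and context representations, this paper comprehensively describes three strategies for image semantic processing: generative methods, discriminative methods, and syntactic description methods. It summarizes objective benchmarks and evaluation methods for semantic vocabularies, and finally points out directions for the development of image semantic understanding.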

16.
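In recent years, with the continued development of deep learning, a variety of deep-learning-based semantic segmentation algorithms have emerged, yet the vast majority cannot achieve both fast inference and high segmentation accuracy. To address this problem, a real-time semantic segmentation framework, the multi-channel deep weighted aggregation network (MCDWA_Net), is proposed. The method first introduces a multi-channel design and builds a three-channel semantic representation model whose channels extract three complementary kinds of semantic information from the image: the low-level semantic channel outputs local features such as object edges, colors, and structures; the auxiliary semantic channel extracts transitional information between low-level and high-level semantics and provides multi-layer feedback to the high-level semantic channel; and the high-level semantic channel captures contextual logical relationships and category semantics. Next, a weighted aggregation module for the three kinds of semantic features is designed to output a more complete global semantic description. Finally, an enhanced training mechanism is introduced to strengthen features during the training stage and thereby improve training speed. Experimental results show that the proposed method achieves both fast inference and high segmentation accuracy in complex scenes, balancing segmentation speed and accuracy.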

17.
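To address the loss of semantics in keyword-based Web image retrieval, an ontology-based method is used to describe the semantic features of Web images, and a Web image retrieval model based on agents and semantic features is constructed. The model describes the semantic features of Web images with a domain ontology, and multiple agent modules cooperate through a division of labor to perform Web image retrieval that satisfies user requests. Simulation experiments on images provided by Corel verify that the model resolves the semantic loss in keyword-based Web image retrieval models and improves the speed and accuracy of Web image retrieval.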

18.
Grouping images into semantically meaningful categories using low-level visual features is a challenging and important problem in content-based image retrieval. Based on these groupings, effective indices can be built for an image database. In this paper, we show how a specific high-level classification problem (city images vs. landscapes) can be solved from relatively simple low-level features geared for the particular classes. We have developed a procedure to qualitatively measure the saliency of a feature for a classification problem based on the plot of the intra-class and inter-class distance distributions. We use this approach to determine the discriminative power of the following features: color histogram, color coherence vector, DCT coefficient, edge direction histogram, and edge direction coherence vector. We determine that the edge direction-based features have the most discriminative power for the classification problem of interest here. A weighted k-NN classifier is used for the classification, which results in an accuracy of 93.9% when evaluated on an image database of 2716 images using the leave-one-out method. This approach has been extended to further classify 528 landscape images into forests, mountains, and sunset/sunrise classes. First, the input images are classified as sunset/sunrise images vs. forest & mountain images (94.5% accuracy), and then the forest & mountain images are classified as forest images or mountain images (91.7% accuracy). We are currently identifying further semantic classes to assign to images as well as extracting low-level features that are salient for these classes. Our final goal is to combine multiple 2-class classifiers into a single hierarchical classifier.
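The evaluation protocol mentioned above, a weighted k-NN classifier scored with leave-one-out, can be sketched as follows. The abstract does not say how neighbors are weighted or which distance is used, so inverse-distance weighting and Euclidean distance are assumptions here, and the function names are hypothetical.

```python
import math
from collections import defaultdict

def weighted_knn_predict(train, query, k=3):
    """Predict a label for query from train, a list of (features, label) pairs.

    Assumption: each of the k nearest neighbors votes with weight
    1 / distance (Euclidean), so closer neighbors count for more.
    """
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = defaultdict(float)
    for features, label in neighbors:
        votes[label] += 1.0 / (math.dist(features, query) + 1e-9)
    return max(votes, key=votes.get)

def leave_one_out_accuracy(data, k=3):
    """Hold out each sample in turn, classify it from the rest, and average."""
    correct = 0
    for i, (features, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        if weighted_knn_predict(rest, features, k) == label:
            correct += 1
    return correct / len(data)
```

In a setting like the one described, each feature vector would be something like an edge direction histogram, and a hierarchical classifier could chain several such 2-class predictors.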

19.
20.
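Associating low-level image features with high-level semantics based on SVM   Total citations: 4 (self-citations: 0, citations by others: 4)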
成洁  石跃祥 《计算机应用研究》2006,23(9):250-252,255
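In content-based image retrieval, to address the gap between the low-level visual features and the high-level semantic features of images, a semantic association method based on support vector machines (SVM) is proposed. Through analysis of low-level image features, color and shape feature vectors (221 dimensions) are extracted and used as the input vectors of the SVM, which learns the image classes and establishes the association between low-level image features and high-level semantics. The method is applied to retrieval over several typical semantic categories, such as birds, flowers, oceans, and buildings. Experimental results show that the method can adapt to image retrieval for different users and improves retrieval performance.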
