Similar Documents
 20 similar documents retrieved (search time: 578 ms)
1.
2.
A Multi-Factor Spatial Scene Similarity Matching Model and Its Application (Cited by: 1; self-citations: 1; others: 0)
We propose a spatial scene similarity matching model that jointly considers multiple factors, including the area and attributes of spatial objects and the topological and directional relations between them. Scene semantics are first modeled with a "gradual transformation model" and attributed relational graphs; a scene matching model is then built, and the proposed similarity measures are used to evaluate the similarity between scenes. The model characterizes the semantics of spatial scenes, and geographic scenes in particular, fairly objectively, achieves good matching results, and shows promise for intelligent query and retrieval of spatial data.
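To make the multi-factor idea concrete, here is a small, hypothetical Python sketch (not the paper's model): each scene is a tiny attributed relational graph whose nodes carry an area and a category and whose edges carry topological and directional relations, and the similarity score is a weighted mix of node and edge agreement. All weights, categories and relation names are invented for illustration.

```python
# Hypothetical multi-factor scene similarity (illustrative only, weights are made up).
def node_similarity(a, b):
    """Combine area and attribute (category) similarity for two objects."""
    area_sim = min(a["area"], b["area"]) / max(a["area"], b["area"])
    attr_sim = 1.0 if a["category"] == b["category"] else 0.0
    return 0.5 * area_sim + 0.5 * attr_sim

def edge_similarity(e, f):
    """Compare topological (e.g. 'disjoint', 'meets') and directional relations."""
    topo_sim = 1.0 if e["topology"] == f["topology"] else 0.0
    dir_sim = 1.0 if e["direction"] == f["direction"] else 0.0
    return 0.5 * topo_sim + 0.5 * dir_sim

def scene_similarity(scene_a, scene_b, w_node=0.6, w_edge=0.4):
    nodes_a, edges_a = scene_a
    nodes_b, edges_b = scene_b
    # Greedy one-to-one node matching by best pairwise similarity.
    matched, used_b = {}, set()
    for i, a in enumerate(nodes_a):
        best = max(((j, node_similarity(a, b)) for j, b in enumerate(nodes_b)
                    if j not in used_b), key=lambda t: t[1], default=None)
        if best:
            matched[i] = best[0]
            used_b.add(best[0])
    if not matched:
        return 0.0
    node_score = sum(node_similarity(nodes_a[i], nodes_b[j])
                     for i, j in matched.items()) / len(matched)
    # Compare relations only between pairs of matched objects.
    edge_scores = [edge_similarity(edges_a[(i1, i2)], edges_b[(matched[i1], matched[i2])])
                   for i1, i2 in edges_a
                   if i1 in matched and i2 in matched and (matched[i1], matched[i2]) in edges_b]
    edge_score = sum(edge_scores) / len(edge_scores) if edge_scores else 0.0
    return w_node * node_score + w_edge * edge_score

scene1 = ([{"area": 120.0, "category": "lake"}, {"area": 40.0, "category": "forest"}],
          {(0, 1): {"topology": "disjoint", "direction": "north"}})
scene2 = ([{"area": 100.0, "category": "lake"}, {"area": 60.0, "category": "forest"}],
          {(0, 1): {"topology": "disjoint", "direction": "northeast"}})
print(scene_similarity(scene1, scene2))
```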

3.
Spatial PACT is a feature representation for scene instance and category recognition. It extends PACT (principal component analysis of census transform histograms) with the spatial pyramid, a recent framework for semantic scene recognition, and achieves higher recognition rates than existing algorithms. To improve both the robustness and the efficiency of semantic scene recognition, we propose a new recognition method that introduces latent step-edge templates into spatial PACT, improving efficiency with almost no loss of recognition accuracy. Color features are also incorporated to obtain a representation with stronger semantic discriminative power. Experimental results show that the algorithm is computationally efficient, achieves high recognition rates, and offers strong semantic recognition.
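As an illustration of the underlying CENTRIST/PACT machinery, the following Python sketch computes an 8-bit census transform, pools 256-bin histograms over spatial-pyramid cells and compresses the concatenated descriptor with PCA. It is a generic re-implementation of that idea on random data, not the proposed method with latent step-edge templates or color features.

```python
import numpy as np
from sklearn.decomposition import PCA

def census_transform(img):
    """8-bit census code per interior pixel (img is a 2-D grayscale array)."""
    c = img[1:-1, 1:-1]
    shifts = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]
    code = np.zeros_like(c, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(shifts):
        neighbour = img[1 + dy:img.shape[0] - 1 + dy, 1 + dx:img.shape[1] - 1 + dx]
        code |= (neighbour >= c).astype(np.uint8) << bit
    return code

def pyramid_histogram(img, levels=2):
    """Concatenate 256-bin census histograms over a spatial pyramid."""
    code = census_transform(img)
    feats = []
    for level in range(levels + 1):
        cells = 2 ** level
        for ys in np.array_split(np.arange(code.shape[0]), cells):
            for xs in np.array_split(np.arange(code.shape[1]), cells):
                block = code[np.ix_(ys, xs)]
                hist = np.bincount(block.ravel(), minlength=256).astype(float)
                feats.append(hist / max(hist.sum(), 1.0))
    return np.concatenate(feats)

# Toy usage: descriptors for a batch of random "images", then PCA compression.
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(20, 64, 64)).astype(np.float32)
X = np.stack([pyramid_histogram(im) for im in images])
X_small = PCA(n_components=10).fit_transform(X)   # compact spatial-PACT-style feature
print(X.shape, X_small.shape)
```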

4.
5.
6.
Scene Parsing Using Region-Based Generative Models (Cited by: 1; self-citations: 0; others: 1)
Semantic scene classification is a challenging problem in computer vision. In contrast to the common approach of using low-level features computed from the whole scene, we propose "scene parsing" utilizing semantic object detectors (e.g., sky, foliage, and pavement) and region-based scene-configuration models. Because semantic detectors are faulty in practice, it is critical to develop a region-based generative model of outdoor scenes based on characteristic objects in the scene and spatial relationships between them. Since a fully connected scene configuration model is intractable, we chose to model pairwise relationships between regions and estimate scene probabilities using loopy belief propagation on a factor graph. We demonstrate the promise of this approach on a set of over 2000 outdoor photographs, comparing it with existing discriminative approaches and those using low-level features.
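The pairwise inference step can be illustrated with a minimal sum-product loopy belief propagation loop over a pairwise model, a simplified stand-in for the paper's factor graph. The labels, detector scores and the "above" compatibility table below are toy values.

```python
import numpy as np

LABELS = ["sky", "foliage", "pavement"]

def loopy_bp(unary, edges, pairwise, n_iters=20):
    """unary: {node: np.array(K)}, edges: list of (i, j), pairwise: {(i, j): KxK array}."""
    msgs = {(i, j): np.ones(len(LABELS)) / len(LABELS)
            for a, b in edges for i, j in [(a, b), (b, a)]}
    neighbours = {n: [] for n in unary}
    for a, b in edges:
        neighbours[a].append(b)
        neighbours[b].append(a)
    for _ in range(n_iters):
        new_msgs = {}
        for (i, j) in msgs:
            psi = pairwise[(i, j)] if (i, j) in pairwise else pairwise[(j, i)].T
            others = [msgs[(k, i)] for k in neighbours[i] if k != j]
            incoming = np.prod(others, axis=0) if others else np.ones(len(LABELS))
            m = psi.T @ (unary[i] * incoming)      # sum-product message i -> j
            new_msgs[(i, j)] = m / m.sum()
        msgs = new_msgs
    beliefs = {}
    for n in unary:
        b = unary[n] * np.prod([msgs[(k, n)] for k in neighbours[n]], axis=0)
        beliefs[n] = b / b.sum()
    return beliefs

# Toy scene: region 0 looks like sky, region 1 is ambiguous, region 2 looks like pavement.
unary = {0: np.array([0.7, 0.2, 0.1]),
         1: np.array([0.4, 0.4, 0.2]),
         2: np.array([0.1, 0.2, 0.7])}
# "Above" relation: sky sits comfortably above anything, pavement rarely above sky.
above = np.array([[0.40, 0.40, 0.40],
                  [0.20, 0.30, 0.30],
                  [0.05, 0.20, 0.30]])
pairwise = {(0, 1): above, (1, 2): above}
edges = [(0, 1), (1, 2)]
for node, belief in loopy_bp(unary, edges, pairwise).items():
    print(node, dict(zip(LABELS, np.round(belief, 3))))
```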

7.
Mobile robotics has achieved notable progress; however, to increase the complexity of the tasks that mobile robots can perform in natural environments, we need to provide them with a greater semantic understanding of their surroundings. In particular, identifying indoor scenes, such as an Office or a Kitchen, is a highly valuable perceptual ability for an indoor mobile robot, and in this paper we propose a new technique to achieve this goal. As a distinguishing feature, we use common objects, such as Doors or furniture, as a key intermediate representation to recognize indoor scenes. We frame our method as a generative probabilistic hierarchical model, where we use object category classifiers to associate low-level visual features to objects, and contextual relations to associate objects to scenes. The inherent semantic interpretation of common objects allows us to use rich sources of online data to populate the probabilistic terms of our model. In contrast to alternative computer vision based methods, we boost performance by exploiting the embedded and dynamic nature of a mobile robot. In particular, we increase detection accuracy and efficiency by using a 3D range sensor that allows us to implement a focus of attention mechanism based on geometric and structural information. Furthermore, we use concepts from information theory to propose an adaptive scheme that limits computational load by selectively guiding the search for informative objects. The operation of this scheme is facilitated by the dynamic nature of a mobile robot that is constantly changing its field of view. We test our approach using real data captured by a mobile robot navigating in Office and home environments. Our results indicate that the proposed approach outperforms several state-of-the-art techniques for scene recognition.
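A toy version of the "objects as intermediate representation" idea is sketched below: the scene label is inferred from detected object categories with a naive generative model P(scene | objects) ∝ P(scene) Π P(object | scene). The probability tables are invented; the paper populates such terms from online data sources.

```python
import math

P_SCENE = {"office": 0.5, "kitchen": 0.5}
# P(object observed | scene); values are illustrative, not learned.
P_OBJ_GIVEN_SCENE = {
    "office":  {"monitor": 0.80, "chair": 0.70, "mug": 0.30, "fridge": 0.02},
    "kitchen": {"monitor": 0.05, "chair": 0.40, "mug": 0.60, "fridge": 0.70},
}
EPS = 1e-3  # smoothing for objects never seen with a scene

def scene_posterior(detected_objects):
    log_post = {}
    for scene, prior in P_SCENE.items():
        ll = math.log(prior)
        for obj in detected_objects:
            ll += math.log(P_OBJ_GIVEN_SCENE[scene].get(obj, EPS))
        log_post[scene] = ll
    # Normalise in log space for readability.
    m = max(log_post.values())
    z = sum(math.exp(v - m) for v in log_post.values())
    return {s: math.exp(v - m) / z for s, v in log_post.items()}

print(scene_posterior(["monitor", "chair"]))   # strongly "office"
print(scene_posterior(["fridge", "mug"]))      # strongly "kitchen"
```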

8.
Most successful approaches on scene recognition tend to efficiently combine global image features with spatial local appearance and shape cues. On the other hand, less attention has been devoted to studying spatial texture features within scenes. Our method is based on the insight that scenes can be seen as a composition of micro-texture patterns. This paper analyzes the role of texture along with its spatial layout for scene recognition. However, one main drawback of the resulting spatial representation is its huge dimensionality. Hence, we propose a technique that addresses this problem by presenting a compact Spatial Pyramid (SP) representation. The basis of our compact representation, the Compact Adaptive Spatial Pyramid (CASP), is a two-stage compression strategy. This strategy is based on Agglomerative Information Bottleneck (AIB) theory and (i) compresses the least informative SP features and (ii) automatically learns the most appropriate shape for each category. Our method exceeds the state-of-the-art results on several challenging scene recognition data sets.
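A rough sketch of the agglomerative information bottleneck step is given below, assuming a toy joint distribution over feature bins and scene categories: the pair of bins whose merge loses the least mutual information with the category variable is merged repeatedly. It reduces the two-stage CASP strategy to its core greedy merge and is not the authors' implementation.

```python
import numpy as np

def kl(p, q):
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def merge_cost(p_joint, a, b):
    """Information lost by merging feature bins a and b (Jensen-Shannon form)."""
    pa, pb = p_joint[a].sum(), p_joint[b].sum()
    ca, cb = p_joint[a] / pa, p_joint[b] / pb          # p(category | bin)
    wa, wb = pa / (pa + pb), pb / (pa + pb)
    mix = wa * ca + wb * cb
    return (pa + pb) * (wa * kl(ca, mix) + wb * kl(cb, mix))

def aib_compress(p_joint, target_bins):
    """p_joint: (n_bins, n_categories) joint distribution; returns merged bins as lists."""
    bins = [[i] for i in range(p_joint.shape[0])]
    joint = p_joint.copy()
    while len(bins) > target_bins:
        costs = [(merge_cost(joint, i, j), i, j)
                 for i in range(len(bins)) for j in range(i + 1, len(bins))]
        _, i, j = min(costs)
        bins[i] = bins[i] + bins[j]
        joint[i] = joint[i] + joint[j]
        del bins[j]
        joint = np.delete(joint, j, axis=0)
    return bins

# Toy joint distribution p(feature_bin, scene_category) for 6 bins and 3 categories.
rng = np.random.default_rng(1)
counts = rng.integers(1, 20, size=(6, 3)).astype(float)
p = counts / counts.sum()
print(aib_compress(p, target_bins=3))   # three merged groups over the original six bins
```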

9.
10.
To address the low segmentation accuracy for multi-scale object classes and the weak association of contextual image features that the traditional U-Net suffers from in semantic segmentation of street scene images, we propose AS-UNet, an improved U-Net for accurate street scene segmentation. First, a spatial and channel squeeze & excitation (scSE) attention module is embedded into U-Net to guide the convolutional network, in both the channel and spatial dimensions, toward the semantic classes relevant to the segmentation task, so that more useful semantic information is extracted. Second, to capture global context, multi-scale feature maps are aggregated for feature enhancement by embedding an atrous spatial pyramid pooling (ASPP) multi-scale fusion module into U-Net. Finally, a combination of the cross-entropy loss and the Dice loss is used to tackle the class imbalance of street scene objects and further improve segmentation accuracy. Experiments on the Cityscapes and CamVid street scene datasets show that, compared with the traditional U-Net, the mean intersection over union (MIoU) of AS-UNet improves by ...
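The two ingredients named above translate naturally into PyTorch; the sketch below shows a generic scSE block and a combined cross-entropy + Dice loss. It is not the authors' AS-UNet code, and the reduction ratio, loss weight and class count are guesses.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SCSEBlock(nn.Module):
    """Spatial and channel squeeze & excitation: recalibrate a feature map both ways."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.channel_gate = nn.Sequential(                     # cSE branch
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1), nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1), nn.Sigmoid())
        self.spatial_gate = nn.Sequential(                     # sSE branch
            nn.Conv2d(channels, 1, 1), nn.Sigmoid())

    def forward(self, x):
        return x * self.channel_gate(x) + x * self.spatial_gate(x)

def ce_dice_loss(logits, target, num_classes, ce_weight=0.5):
    """Combined cross-entropy and soft Dice loss for class-imbalanced segmentation."""
    ce = F.cross_entropy(logits, target)
    probs = F.softmax(logits, dim=1)
    one_hot = F.one_hot(target, num_classes).permute(0, 3, 1, 2).float()
    dims = (0, 2, 3)
    intersection = (probs * one_hot).sum(dims)
    dice = (2 * intersection + 1e-6) / (probs.sum(dims) + one_hot.sum(dims) + 1e-6)
    return ce_weight * ce + (1 - ce_weight) * (1 - dice.mean())

# Toy usage: attention over a decoder feature map, then the loss on random logits.
feat = SCSEBlock(64)(torch.randn(2, 64, 32, 32))
logits = torch.randn(2, 19, 32, 32)            # e.g. 19 Cityscapes classes
target = torch.randint(0, 19, (2, 32, 32))
print(feat.shape, ce_dice_loss(logits, target, num_classes=19).item())
```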

11.
Traditional latent semantic analysis (LSA) cannot capture the spatial distribution of scene objects or discriminative information about latent topics. To address this, we propose a scene classification method based on multi-scale spatial discriminative probabilistic latent semantic analysis (PLSA). The image is first partitioned at multiple spatial scales with a spatial pyramid to obtain spatial information, and a PLSA model extracts the latent semantic information of each local block. The semantic information of the individual blocks is then concatenated to form the image's multi-scale spatial latent semantic representation. Finally, a proposed weight-learning method learns discriminative information across image topics, yielding a multi-scale spatial discriminative latent semantic representation, and the learned weights are embedded into a support vector machine (SVM) classifier to perform scene classification. Experiments on three commonly used scene datasets (Scene-13, Scene-15 and Caltech-101) show that the average classification accuracy of the method surpasses many state-of-the-art approaches, confirming its effectiveness and robustness.
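The spatial part of the method can be sketched as follows, using scikit-learn's LatentDirichletAllocation as a stand-in for PLSA (the two are close relatives) and omitting the discriminative weight learning: per-block visual-word histograms are turned into topic distributions, concatenated across pyramid blocks, and fed to an SVM. Visual words and labels are random placeholders.

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.svm import LinearSVC

N_WORDS, N_TOPICS, LEVELS = 200, 8, 2

def block_histograms(word_map, levels=LEVELS):
    """word_map: (H, W) array of visual-word ids; one histogram per pyramid block."""
    hists = []
    for level in range(levels + 1):
        cells = 2 ** level
        for rows in np.array_split(word_map, cells, axis=0):
            for block in np.array_split(rows, cells, axis=1):
                hists.append(np.bincount(block.ravel(), minlength=N_WORDS))
    return np.stack(hists)                      # (n_blocks, N_WORDS)

def image_feature(word_map, topic_model):
    """Concatenate per-block topic distributions into one multi-scale descriptor."""
    return topic_model.transform(block_histograms(word_map)).ravel()

# Toy data: 40 "images" of quantised local features with 2 fake scene labels.
rng = np.random.default_rng(0)
word_maps = rng.integers(0, N_WORDS, size=(40, 32, 32))
labels = rng.integers(0, 2, size=40)

lda = LatentDirichletAllocation(n_components=N_TOPICS, random_state=0)
lda.fit(np.vstack([block_histograms(w) for w in word_maps]))

X = np.stack([image_feature(w, lda) for w in word_maps])
clf = LinearSVC().fit(X, labels)               # SVM on the concatenated topic features
print(X.shape, clf.score(X, labels))
```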

12.
Jointly learning RGB image features and 3D geometric information in the RGB-D domain benefits indoor scene semantic segmentation, but traditional segmentation methods usually require accurate depth maps as input, which severely limits their applicability. We propose a new indoor scene understanding framework that builds a joint learning model from semantic and depth feature extraction networks to obtain depth-aware features. A geometry-guided depth feature transfer module and a pyramid feature fusion module combine the learned depth features, multi-scale spatial information and semantic features into a more expressive representation, enabling more accurate indoor semantic segmentation. Experiments show that the joint learning model achieves mean segmentation accuracies of 69.5% on NYU-Dv2 and 68.4% on SUN RGBD, giving better indoor segmentation performance and wider applicability than traditional methods.
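A loose PyTorch sketch of the fusion idea follows: depth features gate the semantic features and a small pyramid pooling step mixes in multi-scale context. The module name and structure are invented for illustration and do not reproduce the paper's joint learning network.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DepthGuidedFusion(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.gate = nn.Sequential(nn.Conv2d(channels, channels, 1), nn.Sigmoid())
        self.pool_sizes = (1, 2, 4)
        self.reduce = nn.Conv2d(channels * (1 + len(self.pool_sizes)), channels, 1)

    def forward(self, rgb_feat, depth_feat):
        # Depth features gate the semantic features (a simple "geometry guidance").
        fused = rgb_feat * self.gate(depth_feat) + rgb_feat
        # Pyramid pooling over the fused map to mix in multi-scale context.
        h, w = fused.shape[2:]
        pyramid = [fused] + [
            F.interpolate(F.adaptive_avg_pool2d(fused, s), size=(h, w),
                          mode="bilinear", align_corners=False)
            for s in self.pool_sizes]
        return self.reduce(torch.cat(pyramid, dim=1))

rgb_feat = torch.randn(1, 64, 60, 80)
depth_feat = torch.randn(1, 64, 60, 80)
out = DepthGuidedFusion(64)(rgb_feat, depth_feat)
print(out.shape)   # torch.Size([1, 64, 60, 80])
```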

13.
Objective: Data incompleteness and heavy noise caused by the density, complexity and frequent occlusions of objects in indoor point cloud scenes severely limit indoor scene reconstruction and make its accuracy hard to guarantee. To better recover complete scenes from unordered point clouds, we propose an indoor scene reconstruction method based on semantic segmentation. Method: The raw data are downsampled with a voxel filter, 3D scale-invariant feature transform (3D SIFT) keypoints are computed, and the downsampled result is fused with the keypoints to obtain an optimized downsampling of the scene. Planar features are extracted from the fused, sampled scene with random sample consensus (RANSAC) and fed into a PointNet network for training, so that coplanar points share the same local features; this yields per-point confidences over the dataset's categories. On this basis, a projection-based region growing refinement aggregates points belonging to the same object in the segmentation result, producing a finer segmentation. The segmented objects are then divided into interior and exterior environment elements, which are reconstructed by model matching and plane fitting, respectively. Results: Experiments on the S3DIS (Stanford large-scale 3D indoor space) dataset show that the fused sampling algorithm improves both the efficiency and the quality of the subsequent steps: plane extraction runs in only 15% of the time required before sampling, and the semantic segmentation method improves overall accuracy (OA) and mean intersection over union (mIoU) over PointNet by 2.3% and 4.2%, respectively. Conclusion: The method improves computational efficiency while preserving keypoints, clearly improves segmentation accuracy, and yields high-quality reconstruction results.
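The first two stages (voxel downsampling and RANSAC plane extraction) can be sketched with Open3D as below, assuming a readable point cloud file such as the hypothetical scene.ply; the 3D SIFT keypoint fusion, PointNet training and projection-based region growing are not shown.

```python
import open3d as o3d

pcd = o3d.io.read_point_cloud("scene.ply")          # hypothetical input file
down = pcd.voxel_down_sample(voxel_size=0.03)       # voxel-filter downsampling

# Iteratively peel off the largest planes (walls, floor, table tops, ...).
remaining = down
planes = []
for _ in range(4):
    if len(remaining.points) < 500:
        break
    model, inliers = remaining.segment_plane(distance_threshold=0.02,
                                             ransac_n=3,
                                             num_iterations=1000)
    planes.append(remaining.select_by_index(inliers))
    remaining = remaining.select_by_index(inliers, invert=True)
    a, b, c, d = model
    print(f"plane: {a:.2f}x + {b:.2f}y + {c:.2f}z + {d:.2f} = 0, "
          f"{len(inliers)} inliers")

# `planes` now holds candidate planar segments that could be fed to a point-wise
# classifier (e.g. PointNet) as in the pipeline described above.
```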

14.
Auditory scenes are temporal audio segments with coherent semantic content. Automatically classifying and grouping auditory scenes with similar semantics into categories is beneficial for many multimedia applications, such as semantic event detection and indexing. For such semantic categorization, auditory scenes are first characterized with either low-level acoustic features or some mid-level representations like audio effects, and then supervised classifiers or unsupervised clustering algorithms are employed to group scene segments into various semantic categories. In this paper, we focus on the problem of automatically categorizing audio scenes in an unsupervised manner. To achieve more reasonable clustering results, we introduce the co-clustering scheme to exploit potential grouping trends among different dimensions of feature spaces (either low-level or mid-level feature spaces) and provide a more accurate similarity measure for comparing auditory scenes. Moreover, we also extend the co-clustering scheme with a strategy based on the Bayesian information criterion (BIC) to automatically estimate the number of clusters. Evaluation performed on 272 auditory scenes extracted from 12 h of audio data shows very encouraging categorization results. Co-clustering achieved better performance than some traditional one-way clustering algorithms, both on the low-level acoustic features and on the mid-level audio effect representations. Finally, we present our vision regarding the applicability of this approach to general multimedia data, and also show some preliminary results on content-based image clustering.
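The co-clustering step can be sketched with scikit-learn's SpectralCoclustering standing in for the paper's scheme, plus a rough BIC-style score (a crude Gaussian "constant mean per block" approximation, not the paper's exact criterion) to pick the number of clusters. The scene-by-feature matrix below is synthetic.

```python
import numpy as np
from sklearn.cluster import SpectralCoclustering
from sklearn.datasets import make_biclusters

# Toy scene-by-feature matrix with 3 planted co-clusters.
X, _, _ = make_biclusters(shape=(60, 40), n_clusters=3, noise=5,
                          shuffle=True, random_state=0)

def bic_score(X, model, n_clusters):
    """BIC under a crude 'constant mean per co-cluster block' Gaussian model."""
    pred = np.zeros_like(X)
    for r in range(n_clusters):
        for c in range(n_clusters):
            mask = (model.row_labels_ == r)[:, None] & (model.column_labels_ == c)[None, :]
            if mask.any():
                pred[mask] = X[mask].mean()
    n = X.size
    rss = float(((X - pred) ** 2).sum())
    n_params = n_clusters * n_clusters           # one mean per block
    return n * np.log(rss / n) + n_params * np.log(n)

scores = {}
for k in range(2, 7):
    model = SpectralCoclustering(n_clusters=k, random_state=0).fit(X)
    scores[k] = bic_score(X, model, k)
best_k = min(scores, key=scores.get)
print({k: round(v, 1) for k, v in scores.items()}, "-> chosen k =", best_k)
```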

15.
The purpose of this work is the semantic visualization of complex 3D city models containing numerous dynamic entities, as well as performing interactive semantic walkthroughs and flights without predefined paths. This is achieved by using a 3D multilayer scene graph that integrates geometric and semantic information and by performing efficient geometric and what we call semantic view culling. The proposed semantic-geometric scene graph is a 3D structure composed of several layers that is suitable for visualizing geometric data with semantic meaning while the user navigates inside the 3D city model. The BqR-Tree is the data structure developed specifically for the geometric layer to speed up rendering in urban scenes. It is an improved R-Tree based on quadtree spatial partitioning, which renders faster than the usual R-Trees when view culling is applied in urban scenes. The BqR-Tree is defined by considering the city block as the basic and logical unit. The advantage of the block over the traditional unit, the building, is that it is easily identified regardless of the data source format and allows inclusion of mobile and semantic elements in a natural way. The usefulness of the 3D scene graph has been tested with low-structured data, which makes it applicable to almost all city data containing not only static but also dynamic elements.
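The flavour of quadtree-driven culling over city blocks can be shown with a minimal 2D quadtree, not the actual BqR-Tree: blocks are stored by bounding box and a rectangular view-region query returns only the potentially visible ones.

```python
class QuadTree:
    MAX_ITEMS = 4

    def __init__(self, bounds):                 # bounds = (xmin, ymin, xmax, ymax)
        self.bounds, self.items, self.children = bounds, [], None

    def _intersects(self, a, b):
        return not (a[2] < b[0] or b[2] < a[0] or a[3] < b[1] or b[3] < a[1])

    def insert(self, name, bbox):
        if self.children is not None:
            for child in self.children:
                if child._intersects(child.bounds, bbox):
                    child.insert(name, bbox)
            return
        self.items.append((name, bbox))
        if len(self.items) > self.MAX_ITEMS:
            self._split()

    def _split(self):
        x0, y0, x1, y1 = self.bounds
        mx, my = (x0 + x1) / 2, (y0 + y1) / 2
        self.children = [QuadTree(b) for b in
                         [(x0, y0, mx, my), (mx, y0, x1, my),
                          (x0, my, mx, y1), (mx, my, x1, y1)]]
        items, self.items = self.items, []
        for name, bbox in items:
            self.insert(name, bbox)

    def query(self, view_rect, found=None):
        """Collect blocks whose bounding boxes overlap the view region (culling)."""
        found = set() if found is None else found
        if not self._intersects(self.bounds, view_rect):
            return found
        found.update(name for name, bbox in self.items
                     if self._intersects(bbox, view_rect))
        for child in self.children or []:
            child.query(view_rect, found)
        return found

# Toy city: a 10 x 10 grid of blocks, then a query for the region near the viewer.
city = QuadTree((0, 0, 1000, 1000))
for i in range(100):
    x, y = (i % 10) * 100, (i // 10) * 100
    city.insert(f"block_{i}", (x + 5, y + 5, x + 95, y + 95))
print(sorted(city.query((0, 0, 250, 250))))     # only blocks overlapping the view region
```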

16.
17.
18.
In this paper we present the first large-scale scene attribute database. First, we perform crowdsourced human studies to find a taxonomy of 102 discriminative attributes. We discover attributes related to materials, surface properties, lighting, affordances, and spatial layout. Next, we build the "SUN attribute database" on top of the diverse SUN categorical database. We use crowdsourcing to annotate attributes for 14,340 images from 707 scene categories. We perform numerous experiments to study the interplay between scene attributes and scene categories. We train and evaluate attribute classifiers and then study the feasibility of attributes as an intermediate scene representation for scene classification, zero-shot learning, automatic image captioning, semantic image search, and parsing natural images. We show that when used as features for these tasks, low-dimensional scene attributes can compete with or improve on state-of-the-art performance. The experiments suggest that scene attributes are an effective low-dimensional feature for capturing high-level context and semantics in scenes.
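The "attributes as an intermediate representation" experiment can be sketched as follows: one binary classifier per attribute is trained on image features, and the vector of predicted attribute scores then serves as a low-dimensional feature for scene classification. The data here is synthetic; the real pipeline uses the SUN attribute annotations.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
n_images, n_dims, n_attributes = 300, 512, 12
X = rng.normal(size=(n_images, n_dims))                       # stand-in image features
A = (rng.random((n_images, n_attributes)) < 0.3).astype(int)  # binary attribute labels
scene_labels = rng.integers(0, 5, size=n_images)

# One classifier per attribute ("natural", "open", "man-made", ...).
attribute_models = [LogisticRegression(max_iter=1000).fit(X, A[:, k])
                    for k in range(n_attributes)]

def attribute_features(X):
    """Stack per-attribute probabilities into a low-dimensional scene descriptor."""
    return np.column_stack([m.predict_proba(X)[:, 1] for m in attribute_models])

F = attribute_features(X)
scene_clf = LinearSVC().fit(F, scene_labels)
print(F.shape, scene_clf.score(F, scene_labels))
```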

19.
Optic flow motion patterns can be a rich source of information about our own movement and about the structure of the environment we are moving in. We investigate the information available to the brain under real operating conditions by analyzing video sequences generated by physically moving a camera through various typical human environments. We consider to what extent the motion signal maps generated by a biologically plausible, two-dimensional array of correlation-based motion detectors (2DMD) not only depend on egomotion, but also reflect the spatial setup of such environments. We analyzed the local motion outputs by extracting the relative amounts of detected directions and comparing the spatial distribution of the motion signals to that of idealized optic flow. Using a simple template matching estimation technique, we are able to extract the focus of expansion and find relatively small errors that are distributed in characteristic patterns in different scenes. This shows that all types of scenes provide suitable motion information for extracting egomotion, despite the substantial levels of noise affecting the motion signal distributions, attributed to the sparse nature of optic flow and the presence of camera jitter. However, there are large differences in the shape of the direction distributions between different types of scenes; in particular, man-made office scenes are heavily dominated by directions along the cardinal axes, which is much less apparent in outdoor forest scenes. Further examination of motion magnitudes at different scales and the location of motion information in a scene revealed different patterns across different scene categories. This suggests that self-motion patterns are not only relevant for deducing heading direction and speed but also provide a rich information source for scene structure and could be important for the rapid formation of the gist of a scene under normal human locomotion.
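The template matching estimation can be illustrated with the following sketch (synthetic flow, not the 2DMD front end): for each candidate focus of expansion, observed flow directions are compared with an idealised radial expansion pattern and the best-scoring candidate is kept.

```python
import numpy as np

H, W = 60, 80
ys, xs = np.mgrid[0:H, 0:W]

def radial_template(foe_x, foe_y):
    """Unit vectors pointing away from the candidate FOE at every pixel."""
    dx, dy = xs - foe_x, ys - foe_y
    norm = np.hypot(dx, dy) + 1e-9
    return dx / norm, dy / norm

def estimate_foe(flow_u, flow_v, step=4):
    mag = np.hypot(flow_u, flow_v) + 1e-9
    u, v = flow_u / mag, flow_v / mag          # use directions only
    best, best_score = None, -np.inf
    for fy in range(0, H, step):
        for fx in range(0, W, step):
            tu, tv = radial_template(fx, fy)
            score = float((u * tu + v * tv).mean())   # mean cosine agreement
            if score > best_score:
                best, best_score = (fx, fy), score
    return best, best_score

# Synthetic expanding flow with a known FOE at (56, 24), plus noise standing in for jitter.
true_fx, true_fy = 56, 24
u = (xs - true_fx) * 0.1 + np.random.default_rng(0).normal(scale=0.5, size=(H, W))
v = (ys - true_fy) * 0.1 + np.random.default_rng(1).normal(scale=0.5, size=(H, W))
print(estimate_foe(u, v))   # the estimate should land near (56, 24)
```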

20.
An automatic 3D animation generation system for mobile phones covers the whole process from a user sending a message to the server, through information extraction, plot planning and scene planning, to generating a video animation related to the message content and delivering it to the recipient. Within this pipeline, the scene planning module determines the details of the plot on top of the qualitative plot plan and quantifies them into 3D animation scene files. Building on plot planning, this work studies the spatial layout of 3D scenes in the scene planning module: the available space of the 3D scene is laid out according to the semantic information of objects, and a layout knowledge base for 3D scenes is designed and implemented with semantic web technology. The result is reasonable placement of 3D objects: the system guarantees occlusion-free, collision-free placement, supports adding multiple instances of the same object, and makes placements diverse while reflecting the semantics of the objects.
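A toy sketch of semantics-driven, collision-free placement is shown below; the "knowledge base" is just a Python dict mapping object types to allowed zones and footprints, standing in for the semantic-web layout knowledge base described above.

```python
import random

# Hypothetical layout knowledge: where each object type may be placed, and its footprint.
KNOWLEDGE_BASE = {
    "sofa":  {"zone": (0.0, 0.0, 4.0, 3.0), "size": (2.0, 0.9)},
    "table": {"zone": (0.0, 0.0, 4.0, 3.0), "size": (1.2, 0.8)},
    "lamp":  {"zone": (3.0, 0.0, 5.0, 3.0), "size": (0.4, 0.4)},
}

def overlaps(a, b):
    return not (a[2] <= b[0] or b[2] <= a[0] or a[3] <= b[1] or b[3] <= a[1])

def place(obj_type, placed, rng, max_tries=200):
    """Sample a collision-free axis-aligned footprint inside the object's allowed zone."""
    zone = KNOWLEDGE_BASE[obj_type]["zone"]
    w, d = KNOWLEDGE_BASE[obj_type]["size"]
    for _ in range(max_tries):
        x = rng.uniform(zone[0], zone[2] - w)
        y = rng.uniform(zone[1], zone[3] - d)
        box = (x, y, x + w, y + d)
        if all(not overlaps(box, other) for _, other in placed):
            placed.append((obj_type, box))
            return box
    return None                                  # zone too crowded

rng = random.Random(7)
placed = []
for obj in ["sofa", "table", "lamp", "lamp"]:    # the same object type can appear twice
    box = place(obj, placed, rng)
    print(obj, "->", None if box is None else tuple(round(v, 2) for v in box))
```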
