Similar literature
1.
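The automatic mobile-phone 3D animation generation system covers the whole pipeline from a user sending a message to the server, through information extraction, plot planning, scene planning, and further processing, to generating a video animation related to the SMS content and delivering it to the recipient. Within this pipeline, the scene planning module builds on qualitative plot planning to fix the details of the plot and quantify them into a 3D animation scene file. On the basis of animation plot planning, this work studies the spatial layout of the 3D scene in the scene planning module: the available space of the 3D scene is laid out according to the semantic information of objects, and a layout knowledge base for 3D scenes is designed and implemented using Semantic Web technology, so that 3D objects are placed sensibly. The system guarantees occlusion-free and collision-free placement, supports adding multiple instances of the same object, and makes placements varied while still reflecting object semantics.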
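As a rough illustration of the collision-free placement mentioned in this abstract, the sketch below places rectangular footprints on a floor region by rejection sampling. The function names, the `(x, y, w, d)` footprint representation, and the sampling strategy are assumptions made for illustration only; the abstract does not describe the system's actual layout algorithm, which is driven by a semantic knowledge base.

```python
import random

def overlaps(a, b):
    """Return True if two axis-aligned footprints (x, y, w, d) intersect."""
    ax, ay, aw, ad = a
    bx, by, bw, bd = b
    return ax < bx + bw and bx < ax + aw and ay < by + bd and by < ay + ad

def place_objects(sizes, region, max_tries=1000, seed=0):
    """Place footprints of the given (w, d) sizes inside region (W, D).

    Hypothetical helper: each footprint is sampled uniformly until it does
    not overlap any footprint placed earlier.
    """
    rng = random.Random(seed)
    region_w, region_d = region
    placed = []
    for w, d in sizes:
        for _ in range(max_tries):
            candidate = (rng.uniform(0, region_w - w),
                         rng.uniform(0, region_d - d), w, d)
            if not any(overlaps(candidate, other) for other in placed):
                placed.append(candidate)
                break
        else:
            # Every sampled position collided with an existing footprint.
            raise RuntimeError("no collision-free position found")
    return placed
```

Passing the same size several times, for example `place_objects([(1, 1), (1, 1), (2, 1)], (10, 10))`, corresponds to adding multiple instances of one object. A semantic layout would additionally restrict the region each object may be sampled from.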

2.
Zhou Fangbo (周方波), Zhao Huailin (赵怀林), Liu Huaping (刘华平). 《智能系统学报》 (CAAI Transactions on Intelligent Systems), 2022, 17(5): 1032-1038
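When a mobile robot performs everyday household tasks, it must first be able to avoid obstacles and autonomously find objects in a room. To address how a mobile robot can efficiently search for a target object indoors, this paper proposes a scene-graph-based target search method for indoor mobile robots whose framework combines a navigation map, a semantic map, and a semantic relation graph. A semantic map containing the positions of landmark objects is built on top of the navigation map, so the robot can easily locate landmark objects. For dynamic objects, the robot uses the co-occurrence relations between objects in the semantic relation graph and first searches near the landmark objects with the highest relation strength. Physical experiments show that, with the semantic map and the semantic relation graph, the robot finds targets efficiently in indoor environments and significantly shortens the search path, which demonstrates the effectiveness of the method.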
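The search order described in this abstract can be sketched as a simple ranking of landmarks by relation strength. The dictionary layout, the function name `rank_landmarks`, and the example strengths are assumptions for illustration; the abstract does not say how the relation graph is stored or how strengths are computed.

```python
def rank_landmarks(target, relation_strength, landmark_positions):
    """Order known landmarks by how strongly they co-occur with the target.

    relation_strength: {object: {landmark: strength}} (hypothetical layout)
    landmark_positions: {landmark: (x, y)} taken from the semantic map
    """
    scores = relation_strength.get(target, {})
    # Only landmarks that are on the map and have a positive relation count.
    candidates = [name for name in landmark_positions
                  if scores.get(name, 0.0) > 0.0]
    return sorted(candidates, key=lambda name: scores[name], reverse=True)


relation_strength = {"cup": {"table": 0.7, "sink": 0.2, "sofa": 0.0}}
landmark_positions = {"table": (1.0, 2.0), "sink": (4.0, 0.0), "sofa": (2.0, 5.0)}
search_order = rank_landmarks("cup", relation_strength, landmark_positions)
```

Here the robot would visit the table first and the sink second, and would skip the sofa because its relation strength to the cup is zero.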

3.
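Estimating the spatial layout of structured indoor scenes is an active research topic in computer vision. However, for indoor scenes cluttered with many objects, most existing methods are easily affected by occlusion and cannot infer the layout accurately. The proposed method therefore takes full account of the geometric and semantic relations between the room and its objects, describes the 3D volumes of the room and of the objects inside it parametrically, and uses several kinds of high-level image semantics to obtain prior information about the objects. In addition, it adds several spatial constraints, such as spatial exclusivity and spatial position, which improves the layout estimate and at the same time provides key information for object recognition and localization. The method has low computational complexity, and experiments show that it yields more robust spatial layout inference in cluttered indoor scenes than existing classical methods.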

4.
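This paper combines image semantic segmentation with simultaneous localization and mapping (SLAM) to build a 3D semantic map of the environment. Keyframes are selected from the input image sequence by ORB-SLAM. An improved image semantic segmentation method based on DeepLab is proposed. An upsampling convolutional layer is added after the last layer of the convolutional network to address the coarseness of bilinear interpolation. The depth map of each keyframe serves as a gating signal that selects among different convolution operations, preserving detail for distant objects while keeping a large field of view for nearby objects. The segmented images are then aligned with the depth maps, and a dense 3D semantic map is built from the spatial correspondences between adjacent keyframes. Experiments show that, for both indoor and outdoor scenes, the algorithm segments accurately and produces good semantic maps when back-projected into 3D space; compared with most current methods based on DeepLab and deconvolution, it yields better semantic maps.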
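The back-projection step mentioned in this abstract can be illustrated with the standard pinhole camera model, which lifts each labelled pixel with valid depth to a 3D point in the camera frame. This is a minimal sketch, not the paper's implementation: the function name, the list-of-lists image layout, and the intrinsics `fx`, `fy`, `cx`, `cy` are assumptions.

```python
def backproject(depth, labels, fx, fy, cx, cy):
    """Lift every pixel with valid depth to a labelled 3D point.

    depth and labels are row-major lists of rows with the same shape.
    Returns a list of (x, y, z, label) tuples in the camera frame.
    """
    points = []
    for v, (depth_row, label_row) in enumerate(zip(depth, labels)):
        for u, (z, label) in enumerate(zip(depth_row, label_row)):
            if z <= 0:  # skip pixels with missing depth
                continue
            # Pinhole model: pixel (u, v) at depth z maps to (x, y, z).
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z, label))
    return points
```

To fuse several keyframes into one map, each point would additionally be transformed by the keyframe's estimated camera pose, which this sketch leaves out.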

5.
Fu Hao (付豪), Xu Hegen (徐和根), Zhang Zhiming (张志明), Qi Shaohua (齐少华). 《计算机应用》 (Journal of Computer Applications), 2021, 41(11): 3337-3344
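To address localization and static semantic map construction in dynamic scenes, a simultaneous localization and mapping (SLAM) algorithm for dynamic environments based on semantic and optical-flow constraints is proposed to reduce the impact of dynamic objects on localization and mapping. First, for each input frame, object masks are obtained by semantic segmentation, and feature points that violate the epipolar constraint are filtered out geometrically. Next, a dynamic probability is computed for each object by combining the object masks with optical flow; feature points are filtered by this probability to obtain static feature points, which are then used for camera pose estimation. Then, a static point cloud is built from the RGB-D images and the object dynamic probabilities, and a semantic octree map is built with the help of semantic segmentation. Finally, a sparse semantic map is created from the static point cloud and the semantic segmentation. Tests on the public TUM dataset show that, in highly dynamic scenes, the proposed algorithm improves absolute trajectory error and relative pose error by more than 95% compared with ORB-SLAM2, and reduces absolute trajectory error by 41% and 11% compared with DS-SLAM and DynaSLAM respectively, confirming good localization accuracy and robustness in highly dynamic scenes. Mapping experiments show that the algorithm produces a static semantic map, and that the sparse semantic map needs 99% less storage than the point cloud map.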
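The geometric filtering step in this abstract relies on the epipolar constraint: a static point seen in two frames should lie close to the epipolar line induced by its match. The sketch below shows one common way to test this, assuming a known fundamental matrix `F` and a pixel threshold; both the function names and the threshold value are illustrative assumptions, not details from the paper.

```python
import math

def epipolar_distance(F, p1, p2):
    """Distance in pixels from p2 to the epipolar line F @ p1.

    F is a 3x3 fundamental matrix given as nested lists; p1 and p2 are
    matched (u, v) pixel coordinates in the first and second frame.
    Assumes p1 is not the epipole, so the line is well defined.
    """
    x1 = (p1[0], p1[1], 1.0)
    # Epipolar line coefficients (a, b, c) with a*u + b*v + c = 0.
    a, b, c = (sum(F[i][j] * x1[j] for j in range(3)) for i in range(3))
    return abs(a * p2[0] + b * p2[1] + c) / math.hypot(a, b)

def static_matches(F, matches, threshold=1.0):
    """Keep only matches that satisfy the epipolar constraint."""
    return [(p1, p2) for p1, p2 in matches
            if epipolar_distance(F, p1, p2) <= threshold]
```

For a camera translating purely along its x axis with identity intrinsics, `F = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]` and the distance reduces to the vertical offset between the two matched points, so a match that shifts vertically by several pixels is rejected as likely dynamic.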

6.
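Visual SLAM (simultaneous localization and mapping) is a core technology for mobile robots, but traditional visual SLAM is still hard to apply to highly dynamic scenes, and its maps lack semantic information. A semantic SLAM method for dynamic environments is proposed. A deep learning network performs object detection on each image to locate the regions occupied by dynamic objects; features are extracted from the image and the feature points inside those regions are discarded; the remaining static feature points are used for pose computation; keyframes are semantically segmented; and map points belonging to dynamic objects are filtered out during semantic map construction, yielding a semantic map free of dynamic-object interference. Experiments on the TUM dataset show that the method improves pose estimation accuracy by 88.3% in dynamic environments while building a semantic map free of dynamic-object interference.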
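Discarding feature points that fall inside detected dynamic-object regions can be sketched as a point-in-box test. The `(x_min, y_min, x_max, y_max)` box format and the function name are assumptions; the abstract does not specify the detector's output format.

```python
def remove_dynamic_keypoints(keypoints, dynamic_boxes):
    """Return the keypoints that lie outside every dynamic-object box.

    keypoints: list of (x, y) pixel coordinates
    dynamic_boxes: list of (x_min, y_min, x_max, y_max) detections
    """
    def inside(point, box):
        x, y = point
        x_min, y_min, x_max, y_max = box
        return x_min <= x <= x_max and y_min <= y <= y_max

    return [point for point in keypoints
            if not any(inside(point, box) for box in dynamic_boxes)]


keypoints = [(5, 5), (50, 50), (120, 80)]
person_boxes = [(40, 40, 60, 60)]  # hypothetical detection of a moving person
static_keypoints = remove_dynamic_keypoints(keypoints, person_boxes)
```

A bounding box also removes static background points that happen to fall inside it, which is one reason other methods use per-pixel segmentation masks instead.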

7.
Mo Hongwei (莫宏伟), Tian Peng (田朋). 《控制与决策》 (Control and Decision), 2021, 36(12): 2881-2890
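Visual scene understanding includes detecting and recognizing objects, inferring the visual relationships between detected objects, and describing image regions with sentences. To understand scene images more completely and accurately, object detection, visual relationship detection, and image captioning are treated as visual tasks at three different semantic levels of scene understanding, and an image understanding model based on multi-level semantic features is proposed that connects these three levels to solve the scene understanding task jointly. The model uses a message-passing graph to iteratively update the semantic features of objects, relationship phrases, and image captions at the same time; the updated features are used to classify objects and visual relationships and to generate scene graphs and captions, and a fused attention mechanism is introduced to improve caption accuracy. Experiments on the Visual Genome and COCO datasets show that the proposed method outperforms existing methods on scene graph generation and image captioning.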

8.
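Objective: When performing simultaneous localization and mapping (SLAM), a robot needs to make effective use of scene information from unknown, complex environments. Because existing SLAM algorithms understand scene details insufficiently and lose detail in their maps, this paper constructs a mapping method for unknown environments that combines SLAM point cloud localization with a semantic segmentation network to reconstruct high-precision 3D maps. Methods: First, the camera pose is estimated from the real-time color information of the scene, and a deep learning network, HieSemNet (hierarchical semantic network), that fuses spatial multi-scale sparse and dense features is built to segment the unknown scene semantically, giving real-time 2D semantic information. Second, the spatial point cloud is estimated from the depth information and the camera pose, and the 2D semantic segmentation is fused with the 3D point cloud so that segmentation results map to the corresponding positions in the cloud, producing a high-precision point cloud map with semantic information and thus a 3D map reconstruction. Results: To validate the method, experiments were run separately on the HieSemNet network and on the semantic SLAM system. The network achieves good mean pixel accuracy and mean intersection over union: compared with the other networks, MPA (mean pixel accuracy) improves by 17.47%, 11.67%, 4.86%, 2.90%, and 0.44%, and MIoU (mean intersection over union) improves by 13.94%, 1.10%, 6.28%, 2.28%, and 0.62%, respectively. The SLAM algorithm captures more mapping information, and the resulting maps are more precise and more accurate. Conclusion: The method takes full account of segmentation quality for objects of different sizes, and HieSemNet effectively improves the accuracy of scene semantic segmentation. Compared with existing state-of-the-art semantic SLAM systems, the method clearly improves mapping precision and accuracy and produces higher-quality maps.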
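The two metrics reported in this abstract, MPA and MIoU, are commonly computed from a per-class confusion matrix. The sketch below uses the usual definitions; the function names and the convention that rows are ground-truth classes and columns are predictions are assumptions, and the paper's exact evaluation protocol is not given in the abstract.

```python
def mean_pixel_accuracy(confusion):
    """Mean over classes of (correct pixels) / (ground-truth pixels)."""
    accuracies = []
    for i, row in enumerate(confusion):
        total = sum(row)
        if total > 0:  # ignore classes absent from the ground truth
            accuracies.append(row[i] / total)
    return sum(accuracies) / len(accuracies)

def mean_iou(confusion):
    """Mean over classes of intersection / union of prediction and truth."""
    n = len(confusion)
    ious = []
    for i in range(n):
        true_pixels = sum(confusion[i])                       # row sum
        pred_pixels = sum(confusion[j][i] for j in range(n))  # column sum
        union = true_pixels + pred_pixels - confusion[i][i]
        if union > 0:
            ious.append(confusion[i][i] / union)
    return sum(ious) / len(ious)


# Two-class example: rows are ground truth, columns are predictions.
confusion = [[3, 1],
             [1, 5]]
mpa = mean_pixel_accuracy(confusion)  # (3/4 + 5/6) / 2
miou = mean_iou(confusion)            # (3/5 + 5/7) / 2
```

MIoU penalizes false positives as well as false negatives, which is why it is usually lower than MPA for the same predictions.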

9.
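When a robot performs the complex task of simultaneous localization and mapping (SLAM), it is easily disturbed by moving objects, which lowers localization accuracy, degrades map readability, and weakens system robustness. A SLAM algorithm based on deep learning and edge detection is therefore proposed. First, the YOLOv4 object detection algorithm obtains semantic information about the scene, giving preliminary semantic dynamic and static regions. At the same time, ORB feature points are extracted and the optical flow field is computed to select dynamic feature points, and semantic association is then used to identify the dynamic objects. The Canny operator computes the contour edges of the dynamic objects; static feature points outside the dynamic objects are used for camera pose estimation; keyframes are selected and their point clouds are stacked; and a static environment map is built from the point cloud with dynamic objects removed. Compared with ORB_SLAM2 on public datasets, the algorithm improves localization accuracy by more than 90% and clearly improves map readability. The results show that the algorithm effectively reduces the influence of moving objects on localization and mapping and significantly improves robustness.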

10.
Xi Zhihong (席志红), Han Shuangquan (韩双全), Wang Hongxu (王洪旭). 《计算机应用》 (Journal of Computer Applications), 2019, 39(10): 2847-2851
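To address the effect of dynamic objects on pose estimation in indoor simultaneous localization and mapping (SLAM) systems, a SLAM system based on semantic segmentation for dynamic scenes is proposed. After the camera captures an image, PSPNet (Pyramid Scene Parsing Network) first segments it semantically; image feature points are then extracted, those lying on dynamic objects are discarded, and the static feature points are used for camera pose estimation; finally, a semantic point cloud map and a semantic octree map are built. Repeated comparative tests on five dynamic sequences from a public dataset show that, relative to a SLAM system using SegNet, the standard deviation of the absolute trajectory error of the proposed system falls by 6.9% to 89.8%, and the standard deviations of translational and rotational drift improve by up to 73.61% and 72.90% respectively in highly dynamic scenes. The results show that the improved system significantly reduces pose estimation error in dynamic scenes and estimates camera pose accurately there.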

11.
In user interfaces of modern systems, users get the impression of directly interacting with application objects. In 3D based user interfaces, novel input devices, like hand and force input devices, are being introduced. They aim at providing natural ways of interaction. The use of a hand input device allows the recognition of static poses and dynamic gestures performed by a user's hand. This paper describes the use of a hand input device for interacting with a 3D graphical application. A dynamic gesture language, which allows users to teach some hand gestures, is presented. Furthermore, a user interface that integrates the recognition of these gestures and provides feedback for them is introduced. Particular attention has been paid to implementing a tool for easy specification of dynamic gestures, and to strategies for providing graphical feedback to users' interactions. To demonstrate that the introduced 3D user interface features, and the way the system presents graphical feedback, are not restricted to a hand input device, a force input device has also been integrated into the user interface.

12.
A framework for 3D object recognition is presented. Its flexibility and extensibility are accomplished through a uniform, parallel, and modular recognition architecture. Concurrent and stacked parameter transforms reconstruct a variety of features from the input scene. At each stage, constraint satisfaction networks collect and fuse the evidence obtained through the parameter transforms, ensuring a globally consistent interpretation of the input scene and allowing for the integration of diverse types of information. The final interpretation of the scene is a small consistent subset of the many initial hypotheses about partial features, primitive features, feature assemblies, and 3D objects computed by the various parameter transforms. A complete, integrated, and implemented system that extracts planar surfaces, patches of quadrics of revolution, and planar intersection curves of these surfaces from a depth map viewing 3D objects is described. Experimental results on the recognition behavior of the system are presented.

13.
In the traditional design process for a 3D environment, people usually depict a rough prototype to verify their ideas, and iteratively modify its configuration until they are satisfied with the general layout. In this activity, one of the main operations is the rearrangement of single and composite parts of a scene. With current desktop virtual reality (VR) systems, the selection and manipulation of arbitrary objects in 3D is still difficult. In this work, we present new and efficient techniques that allow even novice users to perform meaningful rearrangement tasks with traditional input devices. The results of our work show that the presented techniques can be mastered quickly and enable users to perform complex tasks on composite objects. Moreover, the system is easy to learn, supports creativity, and is fun to use.

14.
Research progress in 3D object recognition   Total citations: 19 (self-citations: 2; citations by others: 17)
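Driven by many practical applications in fields such as industry and medicine, 3D object recognition has become a very active research area. In general, a 3D object recognition system recognizes and localizes 3D objects in two stages: it first derives a representation of the scene from the input data acquired by sensors, and then matches that representation against the object representations stored in a database. To promote further progress in the field, this paper surveys the results of the past 10 years on three issues that the recognition process must address, namely sensor types, 3D object representation methods, and matching strategies, and classifies and summarizes the main methods. It also identifies problems in 3D vision systems that need further study, including restrictions on the shapes of the objects considered, the influence and representation of complex backgrounds, and the tension between "whole" and "part" in recognition.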

15.
In this paper, we present a new framework to determine up front orientations and detect salient views of 3D models. The salient viewpoint to human preferences is the most informative projection with correct upright orientation. Our method utilizes two Convolutional Neural Network (CNN) architectures to encode category‐specific information learnt from a large number of 3D shapes and 2D images on the web. Using the first CNN model with 3D voxel data, we generate a CNN shape feature to decide natural upright orientation of 3D objects. Once a 3D model is upright‐aligned, the front projection and salient views are scored by category recognition using the second CNN model. The second CNN is trained over popular photo collections from internet users. In order to model comfortable viewing angles of 3D models, a category‐dependent prior is also learnt from the users. Our approach effectively combines category‐specific scores and classical evaluations to produce a data‐driven viewpoint saliency map. The best viewpoints from the method are quantitatively and qualitatively validated with more than 100 objects from 20 categories. Our thumbnail images of 3D models are the most favoured among those from different approaches.

16.
We describe DataSplash, a direct manipulation system for creating semantic zoom visualizations of tabular (relational) data. DataSplash makes contributions in three areas that are key to the construction of such visualizations. First, DataSplash helps users graphically specify the visual appearance of groups of objects. Second, the system helps users visually program the way the appearance of groups of objects changes as users browse the visualization. Third, DataSplash allows users to create groups of graphical links between canvases. These direct manipulation facilities simplify the process of constructing semantic zoom applications, particularly ones that display large data sets.

17.
User interfaces of current 3D and virtual reality environments require highly interactive input/output (I/O) techniques and appropriate input devices, providing users with natural and intuitive ways of interacting. This paper presents an interaction model, some techniques, and some ways of using novel input devices for 3D user interfaces. The interaction model is based on a tool‐object syntax, where the interaction structure syntactically simulates an action sequence typical of a human's everyday life: One picks up a tool and then uses it on an object. Instead of using a conventional mouse, actions are input through two novel input devices, a hand‐ and a force‐input device. The devices can be used simultaneously or in sequence, and the information they convey can be processed in a combined or in an independent way by the system. The use of a hand‐input device allows the recognition of static poses and dynamic gestures performed by a user's hand. Hand gestures are used for selecting, or acting as, tools and for manipulating graphical objects. A system for teaching and recognizing dynamic gestures, and for providing graphical feedback for them, is described.

18.
In comparison to 2D maps, 3D mobile maps involve volumetric instead of flat representation of space, realistic instead of symbolic representation of objects, more variable views that are directional and bound to a first-person perspective, more degrees of freedom in movement, and dynamically changing object details. We conducted a field experiment to understand the influence of these qualities on a mobile spatial task where buildings shown on the map were to be localized in the real world. The representational differences were reflected in how often users interact with the physical environment and in when they are more likely to physically turn and move the device, instead of using virtual commands. 2D maps direct users into using reliable and ubiquitous environmental cues like street names and crossings, and 2D better affords the use of pre-knowledge and bodily action to reduce cognitive workload. Both acclaimed virtues of 3D mobile maps, rapid identification of objects and ego-centric alignment, worked poorly due to reasons we discuss. However, with practice, some 3D users learned to shift to 2D-like strategies and could thereby improve performance. We conclude with a discussion of how representational differences in mobile maps affect strategies of embodied interaction.

19.
We present a real-time object-based SLAM system that leverages the largest object database to date. Our approach comprises two main components: (1) a monocular SLAM algorithm that exploits object rigidity constraints to improve the map and find its real scale, and (2) a novel object recognition algorithm based on bags of binary words, which provides live detections with a database of 500 3D objects. The two components work together and benefit each other: the SLAM algorithm accumulates information from the observations of the objects, anchors object features to special map landmarks and sets constraints on the optimization. At the same time, objects partially or fully located within the map are used as a prior to guide the recognition algorithm, achieving higher recall. We evaluate our proposal on five real environments showing improvements on the accuracy of the map and efficiency with respect to other state-of-the-art techniques.
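The idea behind a bag of binary words can be sketched as follows: each binary descriptor is assigned to its nearest visual word under Hamming distance, and an image or object is summarized by the resulting word histogram. This is a simplified illustration with a flat, hand-written vocabulary of 8-bit words; the function names and the vocabulary are assumptions, and practical systems typically use much longer descriptors and a hierarchical vocabulary learned from data.

```python
def hamming(a, b):
    """Number of differing bits between two descriptors stored as ints."""
    return bin(a ^ b).count("1")

def bag_of_words(descriptors, vocabulary):
    """Histogram of nearest visual words for a set of binary descriptors."""
    histogram = [0] * len(vocabulary)
    for descriptor in descriptors:
        nearest = min(range(len(vocabulary)),
                      key=lambda i: hamming(descriptor, vocabulary[i]))
        histogram[nearest] += 1
    return histogram


vocabulary = [0b00000000, 0b11111111]               # hypothetical 8-bit words
descriptors = [0b00000001, 0b11110111, 0b00010000]  # hypothetical features
histogram = bag_of_words(descriptors, vocabulary)
```

Two histograms can then be compared with a vector similarity score to shortlist candidate objects before a more expensive geometric check.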

20.
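Map symbols are the basis for the visual representation of geographic information and an important part of GIS applications. A symbol library (ZGYSQ3D.style) for the three-dimensional lithospheric structure database of China is designed on the ArcGIS 8.1 platform. Map symbolization components are developed on top of the relevant objects in the ArcObjects component library, and automatic configuration of map symbols and management of the map symbol library are implemented in the master database management system.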
