Similar literature
 Found 20 similar articles (search time: 906 ms)
1.
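 Existing research on qualitative spatial reasoning mainly addresses qualitative constraint satisfaction problems involving a single class of objects and a single kind of spatial relation. This paper proposes the concept of heterogeneous qualitative spatial reasoning, which covers spatial-relation constraint satisfaction problems in three settings: combinations of different kinds of spatial relations (heterogeneous relations), combinations of different kinds of spatial objects (heterogeneous objects), and the fusion of qualitative and quantitative objects. It proposes combined reasoning over three or more heterogeneous relations, whereas earlier work mainly studied binary combinations; it gives a spatial reasoning algorithm for heterogeneous objects, whereas earlier work studied only representation models; and it studies spatial reasoning that fuses qualitative and quantitative objects, a problem that can also be expressed as extending a partial solution to a global solution. These results can be applied to ambient intelligence and other fields.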

2.
For video understanding, namely analyzing who did what in a video, actions and objects are the primary elements. Most studies on actions have handled recognition problems for well‐trimmed videos and focused on enhancing classification performance. However, action detection, which includes localization as well as recognition, is required because, in general, actions intersect in time and space. In addition, most studies have not considered extensibility for a newly added action that has been previously trained. This paper therefore proposes an extensible hierarchical method for detecting generic actions, which combine object movements and spatial relations between two objects, and inherited actions, which are determined by the related objects through an ontology‐ and rule‐based methodology. The hierarchical design of the method enables it to detect any interactive action based on the spatial relations between two objects. The method using object information achieves an F‐measure of 90.27%. Moreover, this paper describes the extensibility of the method for a new action contained in a video from a video domain that is different from the dataset used.

3.
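Research Status of Adversarial Intention Recognition Technology and Approaches to a Breakthrough   Total citations: 1 (self-citations: 0, citations by others: 1)
In adversarial activities, correctly recognizing the opponent's intention is a prerequisite for victory. Intention recognition in the military domain is one of the most difficult and complex recognition problems. This paper briefly introduces and analyzes several commonly used intention recognition techniques, including logic-based techniques and techniques based on probabilistic reasoning. It points out that existing techniques cannot handle problems that arise in military intention recognition, such as incomplete knowledge of the agent being recognized, contradictory available information, and the difficulty of detecting deception by that agent. It proposes a basic approach to these problems: a military adversarial intention recognition method that combines extenics with decision analysis and conflict analysis.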

4.
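MQS-SQL: A Mixed-Dimensional Qualitative Spatial Query Language   Total citations: 16 (self-citations: 2, citations by others: 14)
王生生, 刘大有, 杨博. Acta Electronica Sinica (《电子学报》), 2002, 30(Z1): 1995-1999
The RCC theory of qualitative spatial reasoning can express the topological relations of spatial objects fairly comprehensively. However, because it does not support mixed-dimensional spatial objects, RCC cannot be used directly for spatial queries. This paper extends RCC to build MRCC, a spatial relation model that can express mixed-dimensional spatial relations and is better suited to spatial queries. The model uses data structures common in GIS, supports the full topological (mereotopology) relations of mixed-dimensional spatial objects, and establishes unified, dimension-independent direction and distance relations based on the characteristics of mixed-dimensional objects. Using this model, the relational algebra of standard SQL is extended to implement the mixed-dimensional qualitative spatial query language MQS-SQL.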

5.
In this paper, we present a novel method for content adaptation and video summarization implemented fully in the compressed domain. Firstly, summarization of generic videos is modeled as the process of extracting human objects under various activities/events. Accordingly, frames are classified into five categories via fuzzy decision, including shot changes (cut and gradual transitions), motion activities (camera motion and object motion) and others, by using two inter-frame measurements. Secondly, human objects are detected using Haar-like features. With the detected human objects and the frame categories obtained, an activity level is determined for each frame to adapt to the video content. Continuous frames belonging to the same category are grouped to form one activity entry as content of interest (COI), which converts the original video into a series of activities. An overall adjustable quota is used to control the size of the generated summary for efficient streaming. Under this quota, the frames selected for summarization are determined by evenly sampling the accumulated activity levels for content adaptation. Quantitative evaluations have demonstrated the effectiveness and efficiency of the proposed approach, which provides a more flexible and general solution for this topic because domain-specific tasks such as accurate recognition of objects can be avoided.
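
The quota-based selection step in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `select_frames`, the use of slice midpoints as sampling targets, and the example activity values are all assumptions made for illustration. The idea is to accumulate per-frame activity levels and pick the frames at which the running total crosses evenly spaced targets, so that busy stretches of video contribute more frames than quiet ones.

```python
from bisect import bisect_left
from itertools import accumulate


def select_frames(activity_levels, quota):
    """Pick up to `quota` frame indices by evenly sampling cumulative activity.

    `activity_levels` holds one non-negative activity score per frame
    (hypothetical values; the abstract does not specify their scale).
    """
    if quota <= 0 or not activity_levels:
        return []
    cumulative = list(accumulate(activity_levels))
    total = cumulative[-1]
    if total <= 0:
        return []
    selected = []
    for k in range(quota):
        # Target the midpoint of the k-th equal slice of the total activity.
        target = (k + 0.5) * total / quota
        # First frame whose cumulative activity reaches the target.
        index = bisect_left(cumulative, target)
        # Skip duplicates when several targets fall on the same frame.
        if not selected or selected[-1] != index:
            selected.append(index)
    return selected


# Frames 2 and 3 carry most of the activity, so they are favored.
print(select_frames([0, 0, 5, 5, 0, 0, 1, 1], quota=4))  # [2, 3, 6]
```

Because duplicate hits are dropped, the sketch may return fewer frames than the quota when activity is concentrated in a handful of frames.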

6.
For a variety of applications such as video surveillance and event annotation, the spatial–temporal boundaries between video objects are required for annotating visual content with high-level semantics. In this paper, we define spatial–temporal sampling as a unified process of extracting video objects and computing their spatial–temporal boundaries using a learnt video object model. We first provide a computational approach for learning an optimal key-object codebook sequence from a set of training video clips to characterize the semantics of the detected video objects. Then, dynamic programming with the learnt codebook sequence is used to locate the video objects with spatial–temporal boundaries in a test video clip. To verify the performance of the proposed method, a human action detection and recognition system is constructed. Experimental results show that the proposed method gives good performance on several publicly available datasets in terms of detection accuracy and recognition rate.

7.
In conventional video production, logotypes are used to convey information about the content originator or the actual video content. Logotypes contain information that is critical for inferring genre, class and other important semantic features of video. This paper presents a framework to support semantic-based video classification and annotation. The backbone of the proposed framework is a technique for logotype extraction and recognition. The method consists of two main processing stages. The first stage performs temporal and spatial segmentation by calculating the minimal luminance variance region (MLVR) for a set of frames. Non-linear diffusion filters (NLDF) are used at this stage to reduce noise in the shape of the logotype. In the second stage, logotype classification and recognition are achieved. The earth mover's distance (EMD) is used as a metric to decide whether the detected MLVR belongs to one of the following logotype categories: learned or candidate. Learned logos are semantically annotated shapes available in the database. The semantic characterization of such logos is obtained through an iterative learning process. Candidate logos are non-annotated shapes extracted during the first processing stage. They are assigned to clusters grouping different instances of logos of similar shape. Using these clusters, false logotypes are removed and different instances of the same logo are averaged to obtain a unique prototype representing the underlying noisy cluster. Experiments involving several hours of MPEG video and around 1000 candidate logotypes have been carried out to show the robustness of both the detection and classification processes.
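
To make the role of the earth mover's distance more concrete, here is a minimal Python sketch of EMD in its simplest setting: two one-dimensional histograms with unit spacing between bins, where the distance reduces to the summed absolute difference of the two cumulative distributions. This is a deliberate simplification. The abstract does not say which shape signature or ground distance the authors use, and the function name `emd_1d` and the example histograms are hypothetical.

```python
from itertools import accumulate


def emd_1d(hist_p, hist_q):
    """Earth mover's distance between two 1-D histograms on the same bins.

    Both histograms are normalized to unit mass; bins are assumed to be
    one unit apart (an assumption of this sketch, not of the paper).
    """
    if len(hist_p) != len(hist_q):
        raise ValueError("histograms must have the same number of bins")
    total_p, total_q = sum(hist_p), sum(hist_q)
    if total_p <= 0 or total_q <= 0:
        raise ValueError("histograms must have positive total mass")
    cdf_p = accumulate(value / total_p for value in hist_p)
    cdf_q = accumulate(value / total_q for value in hist_q)
    # In 1-D, the optimal transport cost is the area between the two CDFs.
    return sum(abs(p - q) for p, q in zip(cdf_p, cdf_q))


# Moving all mass from the first bin to the third costs 1 unit of mass x 2 bins.
print(emd_1d([1, 0, 0], [0, 0, 1]))  # 2.0
```

A detected region could then be compared against each stored prototype and treated as a match when the distance falls below a threshold; any such threshold would have to be chosen empirically.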

8.
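 This paper extends the classical 4-intersection model to an 8-intersection model to represent the relations among three simple regions, and derives diagrams of the 109 topological relations that can actually occur among three simple regions. By studying these 109 topological relations, it builds a reasoning system for them, gives a composition table, and then gives the conceptual neighborhood graph of the 109 relations. The resulting topological relation model can be used for qualitative simulation of the topological relations between a robot and two specified obstacles, and offers some guidance for designing robot obstacle-avoidance mechanisms.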

9.
Detection of moving objects in video streams is the first relevant step of information extraction in many computer vision applications. Aside from the intrinsic usefulness of being able to segment video streams into moving and background components, detecting moving objects provides a focus of attention for recognition, classification, and activity analysis, making these later steps more efficient. We propose an approach based on self-organization through artificial neural networks, widely applied in human image processing systems and more generally in cognitive science. The proposed approach can handle scenes containing moving backgrounds, gradual illumination variations and camouflage, has no bootstrapping limitations, can include shadows cast by moving objects in the background model, and achieves robust detection for different types of videos taken with stationary cameras. We compare our method with other modeling techniques and report experimental results, in terms of both detection accuracy and processing speed, for color video sequences that represent typical situations critical for video surveillance systems.

10.
A VideoGIS system aims at combining geo-referenced video information with traditional geographic information in order to provide a more comprehensive understanding of a spatial location. Video data have been used with geographic information in some projects to facilitate a better understanding of the spatial objects of interest. This paper presents an ongoing VideoGIS project in which scalable geo-referenced video and geographic information (GI) are transmitted to GPS-guided vehicles. The hypermedia, which contains cross-referenced video and GI, is organized in a scalable (layered) fashion. Remote users can request, through 3G mobile devices, the abundant information related to the objects of interest, while adapting to heterogeneous network conditions and local CPU usage. An available-bandwidth estimation technique is used in the adaptive video transmission.

11.
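王生生, 刘大有. Acta Electronica Sinica (《电子学报》), 2003, 31(Z1): 2175-2178
When two objects in three-dimensional space are observed from a viewpoint, the phenomenon of one object hiding the other is called a spatial occlusion relation. In spatial reasoning and machine vision, it is an important observer-oriented spatial relation. Existing occlusion relation models such as LOS and ROC are all based on RCC (Region Connection Calculus) and therefore cannot support mixed-dimensional spatial objects. Yet in application areas of occlusion relations such as 3D GIS, spatial objects have varied dimensions. The mixed-dimensional spatial occlusion relation model MSO is therefore proposed. First, RCC is extended to mixed dimensions to obtain MRCC; then mixed-dimensional occlusion relations are defined on the basis of MRCC; finally, composition reasoning methods for MRCC are studied.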

12.
Detecting hazardous activity during driving can be useful in curbing roadside accidents. Existing techniques that use image-based features to encode such activity can sometimes misclassify crucial scenarios. One particular line of work by Zhao et al. (2013 [1], 2013 [2], 2011 [3]) suggests an image-based feature set that encodes the driver’s pose, which is categorized into one of four activities. We bring more clarity to understanding the activity by proposing a richer, video-based feature set that exploits spatiotemporal information about the driver. Our feature set encodes the driver’s pose, crucial variations in pose, and interactions with objects within the vehicle. The feature set is tested on our newly created dataset, since the ones used in the literature are not publicly available. Our proposed feature set captures a larger number of activities and, using standard classifiers and benchmarks, has shown significant improvements over the existing ones.

13.
14.
Segmentation of moving objects in video sequences is a basic task in many applications. However, it is still challenging due to the semantic gap between low-level visual features and the high-level human interpretation of video semantics. Compared with segmentation of fast-moving objects, accurate and perceptually consistent segmentation of slowly moving objects is more difficult. In this paper, a novel hybrid algorithm is proposed for segmentation of slowly moving objects in video sequences, aiming to obtain perceptually consistent results. Firstly, the temporal information of the differences among multiple frames is employed to detect initial moving regions. Then, the Gaussian mixture model (GMM) is employed and an improved expectation maximization (EM) algorithm is introduced to segment a spatial image into homogeneous regions. Finally, the results of motion detection and spatial segmentation are fused to extract the final moving objects. Experiments are conducted and provide convincing results.

15.
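An Extended Egg-Yolk Model for Indeterminate Regions   Total citations: 4 (self-citations: 0, citations by others: 4)
Modeling the topological relations of spatial regions is an important research topic in spatial reasoning, geographic information systems (GIS), computer vision, and related fields, and in recent years modeling topological relations between indeterminate regions has attracted great attention from researchers in these fields. An extended egg-yolk model for indeterminate regions is given based on triple predicates. The model has high cognitive plausibility, treats crisp regions uniformly as a special case, is extended on the basis of both the RCC5 and RCC8 relations, and supports topological relation analysis at multiple levels.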

16.
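Multi-Sensor Information Fusion Based on Dempster-Shafer Evidential Reasoning and Its Application   Total citations: 11 (self-citations: 0, citations by others: 11)
This paper explains in detail the principle of multi-sensor information fusion based on D-S evidential reasoning and the method of target recognition. It also introduces the application of this technique to fused radar target recognition.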
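
As a minimal illustration of the D-S fusion principle mentioned above, the following Python sketch applies Dempster's rule of combination to two mass functions. The rule itself is standard; everything else is assumed for illustration, including the function name `combine`, the two-hypothesis frame (`fighter`, `bomber`), and the mass values attributed to the two sensors, none of which come from the paper.

```python
from itertools import product


def combine(m1, m2):
    """Combine two mass functions with Dempster's rule of combination.

    Each mass function maps a frozenset of hypotheses to its mass.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            # Agreeing evidence supports the intersection of the two sets.
            combined[common] = combined.get(common, 0.0) + mass_a * mass_b
        else:
            # Disjoint sets contribute to the conflict mass K.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("total conflict: the sources cannot be combined")
    # Normalize by 1 - K so that the fused masses sum to one.
    return {h: v / (1.0 - conflict) for h, v in combined.items()}


# Hypothetical evidence from two sensors about a radar target.
FIGHTER = frozenset({"fighter"})
BOMBER = frozenset({"bomber"})
EITHER = FIGHTER | BOMBER  # the whole frame: "don't know"

sensor_a = {FIGHTER: 0.6, EITHER: 0.4}
sensor_b = {FIGHTER: 0.7, BOMBER: 0.1, EITHER: 0.2}
fused = combine(sensor_a, sensor_b)
print(round(fused[FIGHTER], 4))  # 0.8723
```

In this made-up example, the fused belief in `fighter` is higher than either sensor reported on its own, which is the behavior that makes the rule attractive for target recognition when the sources largely agree.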

17.
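In action recognition based on video images, different action videos captured by fixed-view cameras differ in viewpoint, which causes problems such as reduced recognition accuracy. Using multi-view video images is one way to improve recognition accuracy. A multi-view human action recognition algorithm based on a three-dimensional residual network (3D Residual Network, 3D ResNet) and a long short-term memory (LSTM) network is proposed: the 3D ResNet learns fused spatiotemporal features of the action sequences from each view, and a multi-layer LSTM network then learns representations of long-term activity sequences in the video stream and mines the temporal information between video frame sequences in depth. Experimental results on the NTU RGB+D 120 dataset show that the model's accuracy on multi-view video sequence action recognition reaches 83.2%.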

18.
We propose a framework consisting of several algorithms to recognize human activities that involve manipulating objects. Our proposed algorithm identifies objects being manipulated and models high-level tasks being performed accordingly. Realistic settings for such tasks pose several problems for computer vision, including sporadic occlusion by subjects, non-frontal poses, and objects with few local features. We show how size and segmentation information derived from depth data can address these challenges using simple and fast techniques. In particular, we show how to find the manipulating hand robustly and without supervision, properly detect and recognize objects, and properly use temporal information to fill in the gaps between sporadically detected objects, all through careful inclusion of depth cues. We evaluate our approach on a challenging dataset of 12 kitchen tasks that involve 24 objects, performed by 2 subjects. The entire framework yields 82%/84% precision (74%/83% recall) for task/object recognition. Our techniques significantly outperform the state of the art in activity/object recognition.

19.
We implement a video object segmentation system that integrates the novel concept of Voronoi Order with existing surface optimization techniques to support the MPEG-4 functionality of object-addressable video content in the form of video objects. The major enabling technologies for the MPEG-4 standard are systems that compute video object segmentation, i.e., the extraction of video objects from a given video sequence. Our surface optimization formulation describes the video object segmentation problem in the form of an energy function that integrates many visual processing techniques. By optimizing this surface, we balance visual information against predictions of models with a priori information and extract video objects from a video sequence. Since the global optimization of such an energy function is still an open problem, we use Voronoi Order to decompose our formulation into a tractable optimization via dynamic programming within an iterative framework. In conclusion, we show the results of the system on the MPEG-4 test sequences, introduce a novel objective measure, and compare results against those that are hand-segmented by the MPEG-4 committee.

20.
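邱玉, 赵杰煜, 汪燕芳. Acta Electronica Sinica (《电子学报》), 2016, 44(6): 1307-1313
The spatiotemporal relations among facial muscles play an important role in facial expression recognition, but current models cannot efficiently capture the complex global spatiotemporal relations of the face, which has kept them from being widely used. To solve this problem, this paper proposes a facial expression modeling method based on an interval algebra Bayesian network. The method captures not only the spatial relations of the face but also its complex temporal relations, and can thus recognize facial expressions more effectively. Moreover, it uses only tracking-based features and does not require manual labeling of peak frames, which can speed up training and recognition. Experiments on the standard databases CK+ and MMI show that the method effectively improves accuracy in facial expression recognition.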
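
The temporal relations that an interval-algebra model reasons over can be sketched in Python as follows. The sketch classifies the relation between two time intervals using the 13 relations of Allen's interval algebra. It covers only that building block, not the Bayesian network of the paper, and the function name `allen_relation` and the example intervals (frame ranges of two hypothetical facial events) are assumptions.

```python
def allen_relation(a, b):
    """Return the Allen interval-algebra relation of interval `a` to `b`.

    Intervals are (start, end) pairs with start < end.
    """
    (a_start, a_end), (b_start, b_end) = a, b
    if a_end < b_start:
        return "before"
    if b_end < a_start:
        return "after"
    if a_end == b_start:
        return "meets"
    if b_end == a_start:
        return "met-by"
    if a_start == b_start and a_end == b_end:
        return "equals"
    if a_start == b_start:
        return "starts" if a_end < b_end else "started-by"
    if a_end == b_end:
        return "finishes" if a_start > b_start else "finished-by"
    if b_start < a_start and a_end < b_end:
        return "during"
    if a_start < b_start and b_end < a_end:
        return "contains"
    # Only partial overlap remains; the earlier start decides the direction.
    return "overlaps" if a_start < b_start else "overlapped-by"


# Hypothetical frame ranges: a brow raise and a lip-corner pull.
brow_raise = (10, 25)
lip_corner_pull = (18, 40)
print(allen_relation(brow_raise, lip_corner_pull))  # overlaps
```

In a model of this kind, relations like these between pairs of facial events would serve as the variables whose dependencies the network captures.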
