Similar literature
 Found 20 similar documents; search took 31 ms
1.
2.
《Computers & Graphics》2002,26(6):951-970
This paper presents a method for understanding objects that can be considered thin objects. The proposed method is part of the shape understanding method developed by the author. The main novelty of the presented method is that the process of understanding thin objects is related to the visual concept represented as a symbolic name of the possible class of shapes. The possible classes of shape, viewed as hierarchical structures, are incorporated into the shape model. At each stage of the reasoning process that leads to assigning an examined object to one of the possible classes, novel processing methods are used. These methods are very efficient because they deal with a very specific class of shapes. In this paper, the 2-D objects that are classified as thin objects are regarded as geometrical objects without any reference to real-world objects. However, the shape understanding method is designed to understand an object at many levels of interpretation, such as the topological level, the linguistic level and the real-world reference level. This approach influences the way in which the system of shape understanding is designed. The system consists of different types of experts that perform different processing and reasoning tasks.

3.
Understanding is based on a large number of highly varied abilities, collectively called intelligence, that can be measured. In this paper, the understanding abilities of the shape understanding system (SUS) are tested with the methods used in intelligence tests. These tests are formulated as tasks given to the system, and its performance is compared with human performance on the same tasks. The main novelty of the presented method is that the process of understanding is related to the visual concept represented as a symbolic name of the possible class of shapes. The visual concept is one of the ingredients of the concept of the visual object (the phantom concept) that makes it possible to perform different tasks that are characteristic of visual understanding. The presented results are part of research aimed at developing a shape understanding method that is able to perform complex visual tasks connected with visual thinking. The shape understanding method is implemented as the shape understanding system. © 2005 Wiley Periodicals, Inc. Int J Int Syst 20: 799–826, 2005.

4.
5.
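The multilayer perceptron classifier is an effective data classification method, but its performance is limited by the training sample space. This paper uses an ensemble of multilayer perceptron classifiers to improve the classification of image regions in outdoor scene understanding, and proposes a method that automatically identifies the concept categories of multiple kinds of scenery in outdoor scene images. The method first extracts low-level visual features from the segmented image regions, then uses the ensemble classification method to establish the correspondence between region visual features and semantic categories, and finally merges regions with the same label to determine the high-level semantics of the scenery in the image. In tests on 150 images containing 5 kinds of scenery, the recognition rate reached 87%. Compared with the experimental results of the multilayer perceptron method, the proposed method achieves better performance, which indicates that it is suitable for image region classification. In addition, the ensemble method can be extended to other classification problems.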
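
As a rough illustration of how an ensemble of classifiers can label an image region, the sketch below combines member predictions by majority vote. The abstract does not say how the ensemble combines its members, so majority voting is an assumption, and the names `ensemble_predict`, `make_stub`, and `members` are hypothetical; the threshold stubs merely stand in for trained multilayer perceptrons.

```python
from collections import Counter
from typing import Callable, Hashable, Sequence

# A classifier maps a region's feature vector to a semantic label.
Classifier = Callable[[Sequence[float]], Hashable]


def ensemble_predict(members: Sequence[Classifier],
                     features: Sequence[float]) -> Hashable:
    """Return the label that most ensemble members vote for."""
    if not members:
        raise ValueError("the ensemble needs at least one member")
    votes = Counter(member(features) for member in members)
    return votes.most_common(1)[0][0]


def make_stub(index: int, threshold: float,
              label_hi: str, label_lo: str) -> Classifier:
    """Hypothetical stand-in for a trained MLP: thresholds one feature."""
    return lambda x: label_hi if x[index] > threshold else label_lo


# Three toy members that disagree on some inputs.
members = [
    make_stub(0, 0.5, "sky", "grass"),
    make_stub(0, 0.7, "sky", "grass"),
    make_stub(1, 0.5, "grass", "sky"),
]

print(ensemble_predict(members, [0.6, 0.2]))
```

With real models, each member would typically be trained on a different sample of the training regions, so that the vote compensates for the limited coverage of any single training set.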

6.
Image parsing is a process of understanding the contents of an image. The process normally involves labeling pixels or superpixels of a given image with classes of objects that may exist in the image. The accuracy of such labeling for the existing methodologies still needs to be improved. The parsing method needs to be able to identify multiple instances of objects of different classes and sizes. In our previous work, a novel feature representation for an instance of objects in an image was proposed for object recognition and image parsing. The feature representation consists of the histogram vector of 2 g of visual word ids of the two successive clockwise neighbors of any superpixels in the object instance and the shape vector of the instance. Using the feature representation, the instance can be classified with very high accuracy by the per class support vector machines (SVMs). A multi-objective genetic algorithm is also proposed to find a subset of image segments that would best constitute an instance for a class of objects, i.e., maximizing the SVM classification score and the size of the instance. However, the genetic algorithm can only identify a single instance for each class of objects, despite the fact that many instances of the same class may exist. In this paper, a crowding genetic algorithm is used instead to search for multiple optimal solutions and help alleviate this deficiency. The experimental results show that this crowding genetic algorithm performs better than the previously proposed method as well as the existing methodologies, in terms of class-wise and pixel-wise accuracy. The qualitative results also clearly show that this method can effectively identify multiple object instances existing in a given image.

7.
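Mo Hongwei, Tian Peng 《控制与决策》 (Control and Decision) 2021,36(12):2881-2890
Visual scene understanding includes detecting and recognizing objects, inferring the visual relationships between the detected objects, and describing image regions with sentences. To understand scene images more comprehensively and accurately, object detection, visual relationship detection, and image captioning are treated as visual tasks at three different semantic levels of scene understanding, and an image understanding model based on multi-level semantic features is proposed in which the three semantic levels are connected to one another to solve the scene understanding task jointly. The model uses a message passing graph to iterate and update the semantic features of objects, relationship phrases, and image captions simultaneously. The updated semantic features are used to classify objects and visual relationships and to generate scene graphs and captions, and a fused attention mechanism is introduced to improve caption accuracy. Experimental results on the Visual Genome and COCO datasets show that the proposed method outperforms existing methods on scene graph generation and image captioning.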

8.
9.
As the size of the available collections of 3D objects grows, database transactions become essential for their management with the key operation being retrieval (query). Large collections are also precategorized into classes so that a single class contains objects of the same type (e.g., human faces, cars, four-legged animals). It is shown that general object retrieval methods are inadequate for intraclass retrieval tasks. We advocate that such intraclass problems require a specialized method that can exploit the basic class characteristics in order to achieve higher accuracy. A novel 3D object retrieval method is presented which uses a parameterized annotated model of the shape of the class objects, incorporating its main characteristics. The annotated subdivision-based model is fitted onto objects of the class using a deformable model framework, converted to a geometry image and transformed into the wavelet domain. Object retrieval takes place in the wavelet domain. The method does not require user interaction, achieves high accuracy, is efficient for use with large databases, and is suitable for nonrigid object classes. We apply our method to the face recognition domain, one of the most challenging intraclass retrieval tasks. We used the Face Recognition Grand Challenge v2 database, yielding an average verification rate of 95.2 percent at 10⁻³ false accept rate. The latest results of our work can be found at http://www.cbl.uh.edu/UR8D/.
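
The figure quoted above, a verification rate at a fixed false accept rate, can be made concrete with a small sketch. This is not the authors' evaluation code: the function name `verification_rate_at_far`, the convention that higher scores mean greater similarity, and the toy scores are all assumptions made for illustration.

```python
from typing import Sequence


def verification_rate_at_far(genuine: Sequence[float],
                             impostor: Sequence[float],
                             target_far: float) -> float:
    """Fraction of genuine pairs accepted at a threshold whose
    false accept rate on the impostor pairs is at most target_far.

    Assumes higher scores mean more similar, and accepts a pair
    only when its score is strictly above the threshold.
    """
    if not genuine or not impostor:
        raise ValueError("both score lists must be non-empty")
    ranked = sorted(impostor, reverse=True)
    # Number of impostor pairs we can afford to accept.
    allowed = int(target_far * len(ranked))
    # Accepting scores strictly above ranked[allowed] lets through
    # at most `allowed` impostor pairs.
    threshold = ranked[allowed] if allowed < len(ranked) else float("-inf")
    accepted = sum(1 for score in genuine if score > threshold)
    return accepted / len(genuine)


# Toy scores: with 4 impostor pairs, a 25% FAR allows one false accept.
genuine_scores = [0.95, 0.5, 0.25, 0.8]
impostor_scores = [0.1, 0.2, 0.3, 0.9]
print(verification_rate_at_far(genuine_scores, impostor_scores, 0.25))
```

At a false accept rate of 10⁻³, at least a thousand impostor comparisons are needed before even one false accept is permitted, which is why such rates are reported on large benchmark databases.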

10.
We present a robust object tracking algorithm that handles spatially extended and temporally long object occlusions. The proposed approach is based on the concept of “object permanence” which suggests that a totally occluded object will re-emerge near its occluder. The proposed method does not require prior training to account for differences in the shape, size, color or motion of the objects to be tracked. Instead, the method automatically and dynamically builds appropriate object representations that enable robust and effective tracking and occlusion reasoning. The proposed approach has been evaluated on several image sequences showing either complex object manipulation tasks or human activity in the context of surveillance applications. Experimental results demonstrate that the developed tracker is capable of handling several challenging situations, where the labels of objects are correctly identified and maintained over time, despite the complex interactions among the tracked objects that lead to several layers of occlusions.

11.
Several methods have been presented in the literature that successfully use SIFT features for object identification, as they are reasonably invariant to translation, rotation, scale, illumination and partial occlusion. However, they perform poorly on classification tasks. In this work, SIFT features are used to solve object class recognition problems in images using a two-step process. In its first step, the proposed method performs clustering on the extracted features in order to characterize the appearance of the different classes. Then, in the classification step, it uses a three-layer Bayesian network for object class recognition. Experiments show quantitatively that clusters of SIFT features are suitable to represent classes of objects. The main contributions of this paper are the introduction of a Bayesian network approach in the classification step to improve performance in an object class recognition task, and detailed experiments that show robustness to changes in illumination, scale, rotation and partial occlusion.

12.
Tian Peng, Mo Hongwei, Jiang Laihao 《Applied Intelligence》2021,51(11):7781-7793

Understanding a scene image includes detecting and recognizing objects, estimating the interaction relationships of the detected objects, and describing image regions with sentences. However, because of the complexity and variety of scene images, existing methods treat object detection or visual relationship estimation as the research target in scene understanding, and the obtained results are not satisfactory. In this work, we propose a Multi-level Semantic Tasks Generation Network (MSTG) that leverages mutual connections across object detection, visual relationship detection and image captioning to solve the three vision tasks jointly, improve their accuracy, and achieve a more comprehensive and accurate understanding of scene images. The model uses a message passing graph to connect and iteratively update the different semantic features, which improves the accuracy of object detection and scene graph generation, and introduces a fused attention mechanism to improve the accuracy of image captioning. Experiments on the Visual Genome and COCO datasets indicate that the proposed method can jointly learn the three vision tasks and improve their accuracy.


13.
14.
Intelligent visual surveillance — A survey   Cited by: 3 (self-citations: 0, citations by others: 3)
Detection, tracking, and understanding of moving objects of interest in dynamic scenes have been active research areas in computer vision over the past decades. Intelligent visual surveillance (IVS) refers to an automated visual monitoring process that involves analysis and interpretation of object behaviors, as well as object detection and tracking, to understand the visual events of the scene. The main tasks of IVS include scene interpretation and wide-area surveillance control. Scene interpretation aims at detecting and tracking moving objects in an image sequence and understanding their behaviors. In the wide-area surveillance control task, multiple cameras or agents are controlled in a cooperative manner to monitor tagged objects in motion. This paper reviews recent advances and future research directions of these tasks. It consists of two parts: the first part surveys image enhancement, moving object detection and tracking, and motion behavior understanding. The second part reviews wide-area surveillance techniques based on the fusion of multiple visual sensors, camera calibration and cooperative camera systems.

15.
16.
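Zhang Huadi 《计算机应用研究》 (Application Research of Computers) 2020,37(12):3811-3814,3819
To address problems in current co-saliency detection methods, such as objects with widely differing semantic feature classes being falsely detected as co-salient objects, this paper proposes CSCCD, a co-saliency detection algorithm based on convolutional neural networks and semantic correlation. First, a guided superpixel filtering method is applied to the superpixel regions segmented by SLIC and the saliency regions generated by DSS, which shows the target boundary contours clearly. Next, Mask R-CNN is used to extract semantic features, and definitions of image semantic features and semantic consistency are given. Because objects of the same semantic category can be detected as different semantic categories when they appear in different forms during semantic feature extraction, the concept of semantically correlated classes of an image group is proposed; on the basis of this concept, semantically associated classes of an image group are defined, which solves the semantic association problem across multiple images. Finally, the saliency detection regions and the semantically consistent regions of the image group are fused to obtain the co-saliency detection result. Experimental results on public benchmark datasets show that the algorithm effectively highlights the whole target and its contour, and that its overall performance on objective quantitative measures improves markedly.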

17.
We propose a method that detects and segments multiple, partially occluded objects in images. A part hierarchy is defined for the object class. Both the segmentation and detection tasks are formulated as binary classification problems. A whole-object segmentor and several part detectors are learned by boosting weak classifiers based on local shape features. Given a new image, the part detectors are applied to obtain a number of part responses. All the edge pixels in the image that positively contribute to the part responses are extracted. A joint likelihood of multiple objects is defined based on the part detection responses and the object edges. Computation of the joint likelihood includes an inter-object occlusion reasoning that is based on the object silhouettes extracted with the whole-object segmentor. By maximizing the joint likelihood, part detection responses are grouped, merged, and assigned to multiple object hypotheses. The proposed approach is demonstrated with the class of pedestrians. The experimental results show that our method outperforms the previous ones.

18.
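Objective: A scene graph describes an image concisely and in a structured way. Existing scene graph generation methods focus on the visual features of the image and ignore the rich semantic information in the dataset. In addition, because of the long-tailed distribution of the dataset, most methods cannot reason well about triples that occur with low probability and instead tend to produce high-frequency triples. Moreover, most existing methods use the same network structure to infer both object and relationship categories, which is not tailored to either task. To address these problems, this paper proposes a scene graph generation algorithm that extracts global semantic information. Method: The network consists of four modules: semantic encoding, feature encoding, object inference, and relationship reasoning. The semantic encoding module extracts semantic information from image region descriptions and computes global statistical knowledge, and fuses them into robust global semantic information that assists the reasoning of uncommon triples. The object encoding module extracts the visual features of the image. The object inference and relationship reasoning modules use different feature fusion methods and learn features with a gated graph neural network and a gated recurrent unit, respectively. On this basis, object categories and relationship categories are inferred with the assistance of the global statistical knowledge. Finally, a parser constructs the scene graph, which describes the image in a structured way. Results: The method is compared with 10 other methods on the public Visual Genome dataset on three tasks: relationship classification, scene graph element classification, and scene graph generation. With and without the constraint that each pair of objects has only one relationship, the mean recall reaches 44.2% and 55.3%, respectively. In the visualization experiments, compared with...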

19.
In recent years, smart surveillance has been one of the most active research topics in computer vision because of its wide spectrum of promising applications. It centers on the use of automatic video analysis technologies for surveillance purposes. In general, a processing framework for smart surveillance consists of a preliminary motion detection step combined with high-level reasoning that allows automatic understanding of how observed scenes evolve. In this paper, we propose a surveillance framework based on a set of reliable visual algorithms that perform different tasks: a motion analysis approach that segments foreground regions is followed by three procedures, which perform object tracking, homographic transformations and edge matching, in order to achieve real-time monitoring of forbidden areas and the detection of abandoned or removed objects. Several experiments have been performed on different real image sequences acquired from a Messapic museum (indoor context) and the nearby archaeological site (outdoor context) to demonstrate the effectiveness and the flexibility of the proposed approach.

20.
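An image understanding method using 2D strings of line-drawing images   Cited by: 1 (self-citations: 1, citations by others: 0)
Gao Wen 《计算机学报》 (Chinese Journal of Computers) 1996,19(2):110-119
This paper proposes an automatic image understanding method based on symbolic operations and aimed at regular line-drawing images. The method represents spatial objects with algebraic symbols and describes the mutual relationships of objects in a specific projection space with two symbol strings. The algebraic system constructed by the method specifies algebraic operations, including differentiation and erosion, and defines several operation rules. With these operations and rules, the spatial position of objects in the image, their motion trends, and changes in their size can be analyzed automatically.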
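
To give a feel for describing spatial relationships with two symbol strings, the sketch below builds a simple 2D-string-style encoding: one string orders the object symbols along the horizontal projection, the other along the vertical projection. It assumes each object is reduced to a single reference point and uses `<` for "before" and `=` for "same position"; these conventions and the names `axis_string` and `two_d_string` are illustrative, and the paper's own algebraic operations (such as differentiation and erosion) are not reproduced here.

```python
from itertools import groupby
from typing import Dict, Tuple

Point = Tuple[float, float]


def axis_string(objects: Dict[str, Point], axis: int) -> str:
    """Order object symbols along one projection axis.

    Symbols sharing a coordinate are joined with '=', and groups
    are separated by '<' in increasing coordinate order.
    """
    ordered = sorted(objects.items(),
                     key=lambda item: (item[1][axis], item[0]))
    groups = []
    for _, members in groupby(ordered, key=lambda item: item[1][axis]):
        groups.append("=".join(name for name, _ in members))
    return "<".join(groups)


def two_d_string(objects: Dict[str, Point]) -> Tuple[str, str]:
    """Return the (horizontal, vertical) symbol strings of a scene."""
    return axis_string(objects, 0), axis_string(objects, 1)


# Hypothetical scene: symbol -> (x, y) reference point.
scene = {"A": (0, 2), "B": (1, 0), "C": (1, 2)}
print(two_d_string(scene))
```

Comparing the two strings of successive frames is one way such an encoding could expose changes in relative position, which is the kind of analysis the abstract describes.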
