Similar literature
Found 20 similar documents (search time: 15 ms)
1.
An object can often be uniquely identified by its shape, which is usually fairly invariant. However, when the search is for a type of object or an object category, variations in deformation (i.e. differences in body shape) and articulation (i.e. joint movement of limbs) complicate detection. We present a system that accounts for this articulation variation to improve the robustness of object detection by using deformable shapes as its main search criterion. Existing search techniques based on deformable shapes suffer from slow search times and poor best matches when images are cluttered and the search is not initialised. To overcome these drawbacks, our object detection system uses flexible shape templates that are augmented by salient object features and user-defined heuristics. Our approach reduces computation time by prioritising the search around these salient features and uses the template heuristics to find more accurate positive matches.
Binh Pham

2.
A general shape context framework is proposed for object/image retrieval in occluded and cluttered environments, with hundreds of models as potential matches for an input. The approach is general because it does not require input objects to be separated from a complex background. It works by first extracting consistent and structurally unique local neighborhood information from inputs or models, and then voting on the optimal matches. Its performance degrades gracefully with the amount of structural information that is occluded or lost. The local neighborhood information used by the system can be shape, color, texture features, etc.; currently, we employ shape information only. The voting mechanism is based on a novel hypercube-based indexing structure and is driven by dynamic programming. The proposed concepts have been tested on a database with thousands of images, with very encouraging results.

3.
This paper proposes a new method for detecting an object that contains multiple colors with non-homogeneous distributions in complex backgrounds, and for subsequently estimating the depth and shape of the object using a stereo camera. To extract features for object detection, the paper proposes fuzzy color histograms (FCHs) based on self-splitting clustering (SSC) of the hue-saturation (HS) color space. For each scanning window in a pyramid of scaled images, the FCH is obtained by accumulating the fuzzy degrees of all pixels belonging to each cluster. The FCH is fed to a fuzzy classifier to detect an object in the left image captured by the stereo camera. To find the matching object region in the right image, the left and right images are first segmented using the SSC-partitioned HS space. The depth of the object is then found by performing stereo matching on the segmented images. To find the shape of the object, a disparity map is built, using the estimated object depth to automatically determine the stereo matching window size and disparity search range. Finally, the shape of the object is segmented from the disparity map. Experimental results on the detection of different objects, with depth and shape estimation, are used to verify the performance of the proposed method. Comparisons with other detection and disparity map construction methods demonstrate the advantage of the proposed method.
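
The accumulation step in the abstract above can be illustrated with a minimal sketch. It is not the paper's implementation: the fixed cluster centers stand in for the SSC result, the Gaussian membership function and its `width` are assumptions, hue wraparound is ignored, and dividing by the pixel count is an added convenience so that windows of different sizes are comparable.

```python
import math


def fuzzy_color_histogram(pixels, centers, width=0.1):
    """Accumulate fuzzy membership degrees of (hue, saturation) pixels per cluster.

    pixels  : list of (h, s) pairs, both scaled to [0, 1]
    centers : list of (h, s) cluster centers (assumed given; SSC is not shown)
    width   : spread of the assumed Gaussian membership function
    """
    hist = [0.0] * len(centers)
    if not pixels:
        return hist
    for h, s in pixels:
        for k, (ch, cs) in enumerate(centers):
            dist2 = (h - ch) ** 2 + (s - cs) ** 2
            # Degree in (0, 1]: 1 at the cluster center, falling off with distance.
            hist[k] += math.exp(-dist2 / (2 * width ** 2))
    # Assumption: average over the window so that its size does not matter.
    return [v / len(pixels) for v in hist]
```

A hypothetical call: `fuzzy_color_histogram([(0.1, 0.5), (0.6, 0.5)], [(0.1, 0.5), (0.6, 0.5)])` returns one bin per cluster, and the resulting vector is what would be passed on to a classifier.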

4.
Wen Fang, Pattern Recognition, 2007, 40(8): 2163-2172
This paper proposes a new method for incorporating shape prior knowledge into geodesic active contours in order to detect partially occluded objects. The level set functions of the collected shapes are used as training data. They are projected onto a low-dimensional subspace using PCA, and their distribution is approximated by a Gaussian function. A shape prior model is constructed and incorporated into the geodesic active contour formulation to constrain the contour evolution process. To balance the image gradient force against the shape prior force, a weighting factor is introduced that adaptively guides the evolving curve under both forces. The curve converges with due consideration of both local shape variations and global shape consistency. Experimental results demonstrate that the proposed method makes object detection robust against partial occlusions.

5.
Detecting objects, estimating their pose, and recovering their 3D shape are critical problems in many vision and robotics applications. This paper addresses these needs using a two-stage approach. In the first stage, we propose a new method called DEHV (Depth-Encoded Hough Voting). DEHV jointly detects objects, infers their categories, estimates their pose, and infers/decodes object depth maps from either a single image (when no depth maps are available in testing) or a single image augmented with a depth map (when one is available in testing). Inspired by the Hough voting scheme introduced in [1], DEHV incorporates depth information into the process of learning distributions of image features (patches) representing an object category. DEHV takes advantage of the interplay between the scale of each object patch in the image and its distance (depth) from the corresponding physical patch attached to the 3D object. Once the depth map is given, a full reconstruction is achieved in a second (3D modelling) stage, where modified or state-of-the-art 3D shape and texture completion techniques are used to recover the complete 3D model. Extensive quantitative and qualitative experimental analysis on existing datasets [2], [3], [4] and a newly proposed 3D table-top object category dataset shows that our DEHV scheme obtains competitive detection and pose estimation results. Finally, the quality of 3D modelling in terms of both shape completion and texture completion is evaluated on a 3D modelling dataset containing both indoor and outdoor object categories. We demonstrate that our overall algorithm can obtain convincing 3D shape reconstruction from a single uncalibrated image.

6.
7.
Object detection is an essential component of automated vision-based surveillance systems. In general, object detectors are constructed using training examples obtained from large annotated data sets. The inevitable limitations of typical training data sets make such supervised methods unsuitable for building generic surveillance systems applicable to a wide variety of scenes and camera setups. In our previous work we proposed an unsupervised method for learning and detecting the dominant object class in a general dynamic scene observed by a static camera. In this paper, we investigate how to extend this method to the problem of multiple dominant object classes. We propose an approach to this extension and evaluate it using two representative surveillance video sequences.

8.
Biological and psychological evidence increasingly reveals that high-level geometrical and topological features are the keys to shape-based object recognition in the brain. Attracted by the excellent performance of neural visual systems, we simulate the mechanism of hypercolumns in the mammalian cortical area V1, which respond selectively to oriented bar stimuli. We design an orderly arranged hypercolumn array to extract and represent linear or near-linear stimuli in an image. Each unit of this array covers stimuli of various orientations in a small area, and multiple units together produce a low-dimensional vector that describes shape. Based on the neighborhood of units in the array, we construct a graph in which each node represents a short line segment with a certain position and slope. A contour segment in the image can therefore be represented by a route in this graph. The graph converts an image, typically composed of unstructured raw data, into structured and semantically enriched data. We search along the routes in the graph and compare them with a shape template for object detection. The graph raises the level of image representation, reduces the combinatorial load, improves the efficiency of object searching, and facilitates the incorporation of high-level knowledge. This work provides a systematic infrastructure for shape-based object recognition.

9.
In this paper we suggest a new way of representing planar two-dimensional shapes and a shape matching method that utilizes the new representation. By merging neighboring boundary runs, a shape can be partitioned into a set of triangles. These triangles are inherently connected according to a binary tree structure. We use the binary tree, with the triangles as its nodes, to represent the shape. This representation is found to be insensitive to the translation, rotation, scaling and skewing of the shape caused by changes in the viewer's location (or in the object's pose). Furthermore, the representation is multiresolution.

In shape matching we compare the two trees representing two given shapes node by node, following the breadth-first tree traversal sequence. The comparison starts at the top of the tree and moves downward, which means that we first compare the lower-resolution approximations of the two shapes. If the two approximations are different, the comparison stops. Otherwise, it goes on to compare the finer details of the two shapes. Only when the two shapes are very similar are the two corresponding trees compared entirely. Thus, the matching algorithm utilizes the multiresolution characteristic of the tree representation and appears to be very efficient.
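
The coarse-to-fine comparison described above can be sketched as a breadth-first traversal with early exit. This is an illustration under stated assumptions, not the authors' code: the `TriNode` layout, the numeric `descriptor` attached to each triangle, and the tolerance `tol` are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class TriNode:
    """One triangle of the shape tree (hypothetical layout)."""
    descriptor: tuple                     # e.g. normalised side-length ratios
    left: Optional["TriNode"] = None
    right: Optional["TriNode"] = None


def triangles_differ(a, b, tol):
    """Return True when two descriptors differ by more than tol in any entry."""
    return len(a) != len(b) or any(abs(x - y) > tol for x, y in zip(a, b))


def shapes_match(root_a, root_b, tol=0.05):
    """Compare two shape trees breadth-first, stopping at the first mismatch."""
    queue = deque([(root_a, root_b)])
    while queue:
        a, b = queue.popleft()
        if a is None and b is None:
            continue                      # both branches end here
        if a is None or b is None:
            return False                  # tree structures diverge
        if triangles_differ(a.descriptor, b.descriptor, tol):
            return False                  # coarse levels disagree: stop early
        queue.append((a.left, b.left))
        queue.append((a.right, b.right))
    return True
```

Because the queue is processed level by level, dissimilar shapes are rejected after only a few node comparisons near the root, and the full traversal is paid for only when the shapes are similar at every resolution.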


10.
This article proposes an extension of Haar-like features for their use in rapid object detection systems. These features differ from the traditional ones in that their rectangles are assigned optimal weights so as to maximize their ability to discriminate objects from clutter (non-objects). These features maintain the simplicity of evaluation of the traditional formulation while being more discriminative. The proposed features were trained to detect two types of objects: human frontal faces and human heart regions. Our experimental results suggest that the object detectors based on the proposed features are more accurate and faster than the object detectors built with traditional Haar-like features.
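
To illustrate why per-rectangle weights keep evaluation cheap, here is a minimal sketch of a weighted Haar-like feature computed from an integral image. The function names and the example weights are assumptions; how the article actually learns its optimal weights is not shown here.

```python
def integral_image(img):
    """Summed-area table with an extra zero row and column."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of the pixels in the rectangle with top-left (x, y), in four lookups."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def weighted_haar(ii, rects, weights):
    """Feature value: sum over rectangles of weight * pixel sum."""
    return sum(wt * rect_sum(ii, *r) for r, wt in zip(rects, weights))
```

With weights `(+1, -1)` on two adjacent rectangles this reduces to a traditional two-rectangle feature; any other weights cost exactly the same number of table lookups.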

11.
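Computer network security plays a pivotal role in today's society. This paper applies a pattern recognition method based on classifier selection to intrusion detection and proposes a network intrusion detection method based on static classifier selection. The method uses a new strategy to further partition each region obtained by clustering, selects a classifier for each resulting subregion, and incorporates the nearest-neighbor rule, which reduces the error of static classifier selection and improves detection performance. Clustering and selection (CS) is a typical static classifier selection method; experiments on the KDD'99 intrusion detection data set show that the proposed method outperforms the network intrusion detection method based on clustering and selection.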

12.
Unlike many other object recognition datasets, which provide either category-level or within-category annotations, we introduce a novel dataset called “IAIR-CarPed” with layered semantic labels ranging from categories to fine-grained subcategories. These labels are collected from 20 subjects via strict psychophysical experiments. To the best of our knowledge, this is the first time that an object recognition dataset has been built in this way to represent the adaptive and in-depth interpretations of objects in human vision. The dataset focuses on “car” and “pedestrian”, two representative categories that are important in real applications. It contains 3132 images collected from pictures taken under various conditions and 8567 objects carefully annotated by all 20 subjects. Besides fine-grained and layered semantic labels, five types of detailed visual difficulties of these objects are also provided, which can be used to evaluate the representation and generalization abilities of recognition systems against individual difficulties. We present the details of building this dataset, its statistics and properties, and then discuss its possible applications with some preliminary experimental results.

13.
We present a real-time object-based SLAM system that leverages the largest object database to date. Our approach comprises two main components: (1) a monocular SLAM algorithm that exploits object rigidity constraints to improve the map and find its real scale, and (2) a novel object recognition algorithm based on bags of binary words, which provides live detections with a database of 500 3D objects. The two components work together and benefit each other: the SLAM algorithm accumulates information from the observations of the objects, anchors object features to special map landmarks and sets constraints on the optimization. At the same time, objects partially or fully located within the map are used as a prior to guide the recognition algorithm, achieving higher recall. We evaluate our proposal in five real environments, showing improvements in map accuracy and in efficiency with respect to other state-of-the-art techniques.

14.
The present study employs deep learning methods to recognize repetitive assembly actions and estimate their operating times. It is intended to monitor the assembly process of workers and prevent assembly quality problems caused by missing key operational steps and irregular operation. Because assembly actions are repetitive and tool-dependent, recognition of the assembly action is treated as tool object detection. First, the YOLOv3 algorithm is applied to locate and classify the assembly tools and thereby recognize the worker's assembly action; the accuracy of action recognition is 92.8%. Then, the deep-learning-based pose estimation algorithm CPM is used to recognize human joints. Finally, the joint coordinates are extracted to judge the operating times of repetitive assembly actions, with an accuracy of 82.1%.

15.
16.
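Moving object detection based on a Gaussian mixture model   Total citations: 1, self-citations: 0, citations by others: 1
This paper proposes a new moving object detection algorithm that uses an adaptive background model with shadow detection and false-detection checking in the HSV color space, and applies it to the segmentation of moving objects. The algorithm handles background model extraction and updating, background disturbance, and changes in ambient illumination well. Experimental results show that the algorithm has good real-time performance, reliability, and accuracy.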

17.
Online multiple instance boosting for object detection   Total citations: 1, self-citations: 0, citations by others: 1
Semi-supervised or unsupervised incremental learning approaches based on online boosting are very popular for object detection. However, during online learning the positive examples labelled by the current classifier may not actually be “correct”, so previous approaches are unlikely to select the optimal weak classifier. This leads directly to a decline in algorithm performance. In this paper, we present an improved online multiple instance learning algorithm based on boosting (called OMILBoost) for object detection. It can pick out the truly correct image patch around a labelled example with high probability and thus effectively avoids the drifting problem. Furthermore, our method shows high performance when dealing with partial occlusions. Its effectiveness is demonstrated experimentally on six representative video sequences.

18.
In this paper we propose a novel, game-theoretic approach for finding multiple instances of an object category as sets of mutually coherent votes in a generalized Hough space. Existing Hough-voting based detection systems inherently have to apply parameter-sensitive non-maxima suppression (NMS) or mode detection techniques to find object center hypotheses. Moreover, the voting origins contributing to a particular maximum are lost, so object hypotheses are mostly indicated only by bounding boxes. To overcome these problems, we introduce a two-stage method that is applicable on top of any Hough-voting based detection framework. First, we define a Hough environment, where the geometric compatibilities of the voting elements are captured in a pairwise fashion. Then we analyze this environment within a game-theoretic setting, where we model the competition between voting elements as a Darwinian process driven by their mutual geometric compatibilities. In order to find multiple and possibly overlapping objects, we introduce a new enumeration method inspired by tabu search. As a result, we obtain the locations and voting element compositions of each object instance while bypassing the task of NMS. We demonstrate the broad applicability of our method on challenging datasets such as the extended TUD pedestrian crossing scene.

19.
RetinaNet is a typical representative of single-stage object detection and can address the problem of sample imbalance. However, because single-stage detectors lack a region proposal extraction step, RetinaNet performs poorly when objects deviate from the center or when multiple objects are crowded together. To solve this problem, we apply several optimization methods to RetinaNet to improve detection accuracy. First, FreeAnchor is introduced on top of RetinaNet, which can autonomously learn to match the target category; second, ResNeXt50 is used as the backbone to improve accuracy without increasing parameter complexity; third, a Bottom-up Path Augmentation module is used to enhance the transmission of location information and further improve the recall rate; finally, the soft-NMS method is used to reduce false positive detections and improve the average accuracy of object detection. We use the MS COCO data set to verify the new model. The mAP of the new model reaches 40.8, which is 4.3 higher than the baseline. This shows that the optimization methods are complementary and can effectively improve object detection accuracy while maintaining speed.
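
For readers unfamiliar with the last of these steps, the following is a minimal sketch of soft-NMS in its Gaussian-decay form. The abstract does not say which variant or which parameters the authors used, so the Gaussian penalty, `sigma`, and `score_thresh` are assumptions chosen for illustration.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian soft-NMS: decay overlapping scores instead of deleting boxes."""
    remaining = list(zip(boxes, scores))
    kept = []
    while remaining:
        # Select the highest-scoring box that is still in play.
        best = max(range(len(remaining)), key=lambda i: remaining[i][1])
        best_box, best_score = remaining.pop(best)
        kept.append((best_box, best_score))
        decayed = []
        for box, score in remaining:
            # The more a box overlaps the selected one, the more its score drops.
            score *= math.exp(-iou(best_box, box) ** 2 / sigma)
            if score >= score_thresh:
                decayed.append((box, score))
        remaining = decayed
    return kept
```

Unlike hard NMS, a heavily overlapping box is kept with a reduced score rather than removed outright, which helps when two real objects are crowded together.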

20.
Evaluation of object detection algorithms is a non-trivial task: a detection result is usually evaluated by comparing the bounding box of the detected object with the bounding box of the ground truth object. The commonly used precision and recall measures are computed from the overlap area of these two rectangles. However, these measures have several drawbacks: they don't give intuitive information about the proportion of correctly detected objects and the number of false alarms, and they cannot be accumulated across multiple images without creating ambiguity in their interpretation. Furthermore, quantitative and qualitative evaluation are often mixed, resulting in ambiguous measures.

In this paper we propose a new approach which tackles these problems. The performance of a detection algorithm is illustrated intuitively by performance graphs which present object-level precision and recall depending on constraints on detection quality. In order to compare different detection algorithms, a representative single performance value is computed from the graphs. The influence of the test database on the detection performance is illustrated by performance/generality graphs. The evaluation method can be applied to different types of object detection algorithms. It has been tested on different text detection algorithms, among which are the participants of the ICDAR 2003 text detection competition.

The work presented in this article was conceived in the framework of two industrial contracts with France Télécom, for the projects ECAV I and ECAV II, with respective numbers 001B575 and 0011BA66.
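
The idea of object-level precision and recall under a detection-quality constraint can be sketched as follows. This is a simplified stand-in, not the measure proposed in the article: it uses a single intersection-over-union threshold (`min_iou`, an assumed parameter) and greedy one-to-one matching.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def object_level_pr(detections, ground_truth, min_iou=0.5):
    """Count whole objects as matched or not; return (precision, recall)."""
    unmatched = list(ground_truth)
    true_pos = 0
    for det in detections:
        # Greedily pair each detection with its best remaining ground-truth box.
        best = max(unmatched, key=lambda gt: iou(det, gt), default=None)
        if best is not None and iou(det, best) >= min_iou:
            unmatched.remove(best)
            true_pos += 1
    precision = true_pos / len(detections) if detections else 0.0
    recall = true_pos / len(ground_truth) if ground_truth else 0.0
    return precision, recall
```

Sweeping `min_iou` from loose to strict and plotting the resulting pairs gives a curve in the spirit of the performance graphs mentioned above: the counts are whole objects, so they can be summed across images before dividing.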
