Similar literature
Found 10 similar articles (search time: 62 ms)
1.
This paper presents a complete, general, and modular system that, after a simple initial configuration, can detect and track each player on the court or field. The multi-camera system builds on a single-camera object detection and tracking system originally designed for video surveillance applications. The target sports are team sports (e.g., basketball, soccer). The main objective is to present a semi-supervised system able to detect and track players in multi-camera sports videos, focusing on fusing the tracks of detected blobs in order to match tracks across cameras. The proposed system is simpler than other state-of-the-art systems, can operate in real time, and leaves room for improvement and for reducing supervision at the cost of additional complexity. In addition to the detection and tracking system, an evaluation system has been designed to obtain quantitative results on system performance.

2.
As a special application of computer vision, automatic sports video analysis has been studied by a number of researchers. Sports video analysis via computer vision is a moderately challenging problem: it is more difficult than analyzing a video of a few laboratory members acting out a simple scenario, and easier than analyzing a video of a crowd at a subway station. The success of an analysis therefore depends heavily on how much prior information about the sport and setting can be exploited. The most challenging and important part is the tracking of the players (and the ball). With a multi-camera system, 3D tracking is feasible, which is much more meaningful for analysis than 2D tracking. As an initial step toward 3D player tracking from multi-view soccer videos, this paper deals with automatic initialization of player positions. Initial 3D positions can be estimated by exploiting some conditions of a soccer match. To make the estimation robust, prior knowledge of player features is learned by support vector machines (SVMs). Experimental results show that the proposed system is efficient for general soccer sequences.

3.
Automatic classification of shots extracted from news videos plays an important role in news video segmentation, which is an essential step towards effective indexing of broadcasters' digital databases. Despite the efforts reported by researchers in this field, no technique providing fully satisfactory performance has been presented so far. In this paper, we propose a multi-expert approach for unsupervised shot classification. The proposed multi-expert system (MES) combines three algorithms that are model-free and do not require a specific training phase. To assess the performance of the MES, we built a database significantly larger than those typically used in the field. Experimental results demonstrate the effectiveness of the proposed approach in terms of both shot classification and news story detection.

4.
Image and video processing techniques are frequently used in medical science applications. Computer vision-based systems have successfully replaced various manual medical processes, such as the analysis of physical and biomechanical parameters and the physical examination of patients. These systems are gaining popularity because of their robustness and the objectivity they bring to various medical procedures. The Hammersmith Infant Neurological Examinations (HINE) are a set of physical tests carried out on infants aged 3–24 months with neurological disorders. However, these tests are graded through visual observation, which can be highly subjective. A computer vision-aided approach can therefore assist the experts in the grading process. In this paper, we present a method of automatic exercise classification through visual analysis of HINE videos recorded at hospitals. We use scale-invariant feature transform (SIFT) features to generate a bag of words from the image frames of the video sequences. The frequency of these visual words is then used to classify the video sequences with a hidden Markov model (HMM). We also present a method of event segmentation for long videos containing more than two exercises. Event segmentation coupled with a classifier can help in the automatic indexing of long, continuous video sequences of the HINE set. Our proposed framework is a step forward in the automation of HINE tests through computer vision-based methods. We conducted tests on a dataset comprising 70 HINE video sequences and found that the proposed method can classify exercises with accuracy as high as 84%. The proposed work has direct applications in the automatic or semiautomatic analysis of the “vertical suspension” and “ventral suspension” tests of HINE. Though some of the critical tests, such as “pulled-to-sit,” “lateral tilting,” and “adductor’s angle measurement,” have already been addressed using image- and video-guided techniques, there is scope for further improvement.
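The visual-word frequency step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `bag_of_words_histogram`, the toy 2-D descriptors, and the three-word codebook are all assumptions made for illustration (real SIFT descriptors are 128-dimensional, and a codebook would normally be learned by clustering). Each descriptor is assigned to its nearest codeword and the relative frequencies are returned.

```python
from math import dist  # Euclidean distance, Python 3.8+


def bag_of_words_histogram(descriptors, codebook):
    """Return the relative frequency of each visual word in one frame.

    Hypothetical helper: each descriptor votes for its nearest codeword.
    """
    counts = [0] * len(codebook)
    for d in descriptors:
        nearest = min(range(len(codebook)), key=lambda k: dist(d, codebook[k]))
        counts[nearest] += 1
    total = sum(counts)
    # Normalise so frames with different numbers of keypoints are comparable.
    return [c / total for c in counts] if total else counts


# Toy data (assumed): three visual words and four descriptors in 2-D.
codebook = [(0.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
descriptors = [(1.0, 0.5), (9.0, 9.5), (0.2, 0.1), (1.0, 9.0)]
hist = bag_of_words_histogram(descriptors, codebook)
print(hist)
```

A sequence of such per-frame histograms is the kind of observation that a sequence classifier such as an HMM could then consume.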

5.
Soccer is the most popular sport in the world, and, given the explosive growth of soccer video, automatic processing of soccer images is a valuable alternative to manual solutions. This paper addresses a new multi-player detection algorithm for far-view frames, as an initial step for a wide range of applications such as player tracking. In the proposed detector, a two-step blob detection (grass-based blob detection followed by edge-based blob detection) is combined with an efficient search mechanism based on particle swarm optimization (PSO) by assigning sub-swarms to each detected blob. A sub-swarm is then initialized and tripled to search for three models corresponding to the two teams and the referee. The most player-like regions in the detected blobs are thus searched simultaneously by all sub-swarms flying through the solution space, expanding the scope from single-player detection to multi-player detection. Experimental results demonstrate the efficiency and robustness of the algorithm.

6.
We investigate the use of structure learning in Bayesian networks for a complex multimodal task: action detection in soccer videos. We show that classical score-oriented structure learning algorithms, such as K2, whose usefulness has been demonstrated on simple tasks, fail to provide a good network structure for classification tasks where many correlated observed variables are necessary to make a decision. We then compare several structure learning objective functions that aim to find the structure yielding the best classification results, extending existing solutions in the literature. Experimental results on a comprehensive data set of 7 videos show that a discriminative objective function based on conditional likelihood yields the best results, while augmented approaches offer a good compromise between learning speed and classification accuracy.

7.
Abnormality detection in crowded scenes plays a very important role in automatic monitoring of surveillance feeds. Here we present a novel framework for abnormality detection in crowd videos. The key idea of the approach is that rarely or sparsely occurring events correspond to abnormal activities, while regularly or commonly occurring events correspond to normal activities. Each input video is represented using feature matrices that capture the nature of the activity taking place while maintaining the spatial and temporal structure of the video. The feature matrices are decomposed into their low-rank and sparse components, where the sparse component corresponds to the abnormal activities. The approach does not require any explicit modeling of crowd behavior or any training, but information from training data can be seamlessly incorporated if it is available. The estimation is further improved by ensuring temporal and spatial coherence of the sparse component across the videos using a Kalman filter-like framework. This not only reduces outliers and noise but also fills in missing regions of the sparse component. Localization of the anomalies is obtained as a by-product of the proposed approach. Evaluation on the UMN and UCSD datasets and comparisons with several state-of-the-art crowd abnormality detection approaches show the effectiveness of the proposed approach. We also show results on a challenging crowd dataset created as part of this effort, with videos downloaded from the web.

8.
Temporal segmentation of videos into meaningful image sequences containing particular activities is an interesting problem in computer vision. We present a novel algorithm to achieve this semantic video segmentation. The segmentation task is accomplished through event detection in a frame-by-frame processing setup. We propose using one-class classification (OCC) techniques to detect events that indicate a new segment, since they have proven successful in object classification and allow for unsupervised event detection in a natural way. Various OCC schemes have been tested and compared, and an approach based on temporal self-similarity maps (TSSMs) is also presented. Testing was done on a challenging, publicly available thermal video dataset. The results are promising and show the suitability of our approaches for the task of temporal video segmentation.
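As a rough illustration of what a temporal self-similarity map is, the sketch below computes the matrix of pairwise distances between per-frame feature vectors. The abstract does not say how the TSSM is turned into segment boundaries, so the thresholding rule in `boundary_candidates` is purely an assumption, as are the function names and the toy 2-D features.

```python
from math import dist  # Euclidean distance, Python 3.8+


def self_similarity_map(features):
    """Pairwise distance matrix: entry [i][j] compares frame i with frame j."""
    n = len(features)
    return [[dist(features[i], features[j]) for j in range(n)] for i in range(n)]


def boundary_candidates(features, threshold):
    """Assumed rule: flag frame t when it differs strongly from frame t - 1."""
    ssm = self_similarity_map(features)
    return [t for t in range(1, len(features)) if ssm[t][t - 1] > threshold]


# Toy per-frame features (assumed): the activity changes at frame 3.
frames = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0)]
ssm = self_similarity_map(frames)
cuts = boundary_candidates(frames, threshold=1.0)
print(cuts)
```

In the map itself, frames belonging to the same activity form low-distance blocks along the diagonal, which is what makes the representation useful for segmentation.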

9.
In this paper, we present a user-based event detection method for social web videos. Previous research in event detection has focused on content-based techniques, such as pattern recognition algorithms that attempt to understand the contents of a video. A few user-centric approaches have considered either search keywords or external data such as comments, tags, and annotations. Moreover, some of the user-centric approaches impose extra effort on users in order to capture the required information. In this research, we describe a method for analyzing users' implicit interactions with a web video player, such as pause, play, and thirty-second skip or rewind. The results of our experiments indicate that even the simple heuristic of local maxima in user activity might effectively detect the same video events as those indicated manually. Notably, the proposed technique was more accurate in detecting events of short duration, because those events motivated increased user interaction in video hot-spots. The findings of this research provide evidence that we might be able to infer semantics about a piece of unstructured data just from the way people actually use it.
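The local-maxima heuristic mentioned in this abstract can be sketched as follows. This is an illustrative assumption, not the authors' code: the function names, the 10-second bin size, and the sample timestamps are all made up. Interaction timestamps (in seconds) are counted per time bin, and bins with more activity than their neighbours are reported as candidate hot-spots.

```python
def interaction_histogram(timestamps, video_length, bin_size):
    """Count interactions (pause, play, skip, ...) per time bin."""
    n_bins = -(-video_length // bin_size)  # ceiling division
    counts = [0] * n_bins
    for t in timestamps:
        if 0 <= t < video_length:
            counts[int(t // bin_size)] += 1
    return counts


def local_maxima(counts):
    """Indices of interior bins that exceed the left and match or exceed the right neighbour."""
    return [
        i
        for i in range(1, len(counts) - 1)
        if counts[i] > counts[i - 1] and counts[i] >= counts[i + 1]
    ]


# Hypothetical interaction log for a 60-second video, 10-second bins.
timestamps = [3, 12, 14, 15, 18, 25, 41, 43, 44, 47, 52]
counts = interaction_histogram(timestamps, video_length=60, bin_size=10)
peaks = local_maxima(counts)
hot_spot_starts = [i * 10 for i in peaks]
print(counts, hot_spot_starts)
```

In practice the counts would probably be smoothed before peak picking so that a single stray interaction does not create a spurious peak; that step is omitted here.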

10.
A novel method for detecting specular highlights in colonoscopy videos is presented. The method is based on an appearance model that defines specular highlights as bright spots that contrast strongly with adjacent regions. Our approach has two stages: segmentation and then classification of bright-spot regions. The former defines a set of candidate regions obtained through a region-growing process with local maxima as initial region seeds. This process creates a tree structure that keeps track, at each growing iteration, of the region frontier contrast; the final regions depend on restrictions on the contrast value. Non-specular regions are filtered out in a classification stage performed by a linear SVM classifier using model-based features from each region. We introduce a new validation database with more than 25,000 regions along with their corresponding pixel-wise annotations. We perform a comparative study against other approaches. Results show that our method is superior to other approaches, with our segmented regions being closer to the actual specular regions in the image. Finally, we also show how our methodology can be used to obtain an accurate prediction of polyp histology.
