Similar Articles
10 similar articles found (search time: 156 ms)
1.
Zhang H, Li L, Jia W, Fernstrom JD, Sclabassi RJ, Mao ZH, Sun M. Neurocomputing, 2011, 74(12-13): 2184-2192
A new technique to extract and evaluate physical activity patterns from image sequences captured by a wearable camera is presented in this paper. Unlike standard activity recognition schemes, the video data captured by our device do not include the wearer him/herself. The physical activity of the wearer, such as walking or exercising, is analyzed indirectly through the camera motion extracted from the acquired video frames. Two key tasks, pixel correspondence identification and motion feature extraction, are studied to recognize activity patterns. We utilize a multiscale approach to identify pixel correspondences. When compared with existing methods such as the Good Features detector and the Speeded-Up Robust Features (SURF) detector, our technique is more accurate and computationally efficient. Once the pixel correspondences, which define representative motion vectors, are determined, we build a set of activity pattern features based on motion statistics in each frame. Finally, the physical activity of the person wearing the camera is determined according to the global motion distribution in the video. Our algorithms are tested using different machine learning techniques, including the k-nearest neighbor (KNN), naive Bayes, and support vector machine (SVM) classifiers. The results show that many types of physical activity can be recognized from real-world video acquired in the field. Our results also indicate that, with specifically designed motion features in the input vectors, different classifiers can be used successfully with similar performance.
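A minimal sketch of the pipeline this abstract outlines, under stated assumptions: sparse pixel correspondences between consecutive frames give motion vectors, per-frame motion statistics form the feature vector, and a standard classifier such as KNN labels the activity. It uses OpenCV's generic Good Features/Lucas-Kanade routines and a synthetic frame pair as stand-ins for the authors' multiscale correspondence method and field data.

```python
import cv2
import numpy as np

def motion_features(prev_gray, gray):
    """Motion statistics from sparse pixel correspondences between two frames."""
    pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=200,
                                  qualityLevel=0.01, minDistance=7)
    if pts is None:
        return np.zeros(4)
    nxt, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, gray, pts, None)
    flow = (nxt - pts)[status.ravel() == 1].reshape(-1, 2)
    if len(flow) == 0:
        return np.zeros(4)
    mag = np.linalg.norm(flow, axis=1)
    ang = np.arctan2(flow[:, 1], flow[:, 0])
    # Global-motion descriptors: magnitude mean/std, median direction, spread.
    return np.array([mag.mean(), mag.std(), np.median(ang), ang.std()])

# Synthetic demo: a textured frame shifted 3 px to mimic a camera pan.
rng = np.random.default_rng(0)
frame0 = cv2.GaussianBlur((rng.random((240, 320)) * 255).astype(np.uint8), (5, 5), 0)
frame1 = np.roll(frame0, 3, axis=1)
print(motion_features(frame0, frame1))  # magnitude mean should be near 3
# Per-frame vectors like this would then be fed to KNN, naive Bayes, or an SVM.
```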

2.
We conducted a project with professional skiers and their trainers in which we used wearable sensors to improve the trainer-athlete relationship by helping them share their observations and impressions. In particular, we analyzed which sensors reveal important features describing the athlete's motions. Visualization software shows the athletes' movements by overlaying and synchronizing a video stream with sensor data. A system based on wearable sensors and video recording can reveal information about a skier's motions, helping trainers identify the skier's strengths and weaknesses.
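One ingredient of the visualization described here can be sketched simply: interpolate the sensor stream at each video frame's timestamp so the two streams can be overlaid in sync. The sample rates and the signal below are illustrative assumptions, not the project's actual hardware or software interface.

```python
import numpy as np

VIDEO_FPS = 30.0   # assumed camera frame rate
SENSOR_HZ = 100.0  # assumed wearable-sensor sample rate

sensor_t = np.arange(0.0, 10.0, 1.0 / SENSOR_HZ)   # sensor timestamps (s)
sensor_v = np.sin(2 * np.pi * 0.5 * sensor_t)      # stand-in acceleration trace

frame_t = np.arange(0.0, 10.0, 1.0 / VIDEO_FPS)    # video frame timestamps (s)
# Interpolate the sensor value at each frame instant; a visualization layer
# can then draw this value on top of the corresponding frame.
per_frame = np.interp(frame_t, sensor_t, sensor_v)
print(per_frame[:5])
```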

3.
Movement detection is gaining more and more attention among pattern recognition problems. Recognizing types of human movement activity is extremely useful for fall detection for elderly people. Wireless sensor network technology enables human motion data from wearable wireless sensor devices to be transmitted for remote processing. This paper studies methods to process the human motion data received from wearable wireless sensor devices in order to detect different types of human movement activities, such as sitting, standing, lying, falling, running, and walking. The machine learning methods k-nearest neighbor (KNN) and back-propagation neural network (BPNN) are used to classify the activities from the acquired sensor data based on labeled samples. Because a large amount of real-time raw data is received from the sensors and these data are noisy, feature construction and reduction are used to preprocess the raw data obtained from accelerometers embedded in wireless sensing motes before learning and classification. The singular value decomposition (SVD) technique is used to construct enriched features, which are then integrated with the machine learning algorithms for movement detection. The test data were collected from five adults. Experimental results show that our methods achieve promising performance on human movement recognition and fall detection.
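A hedged sketch of the preprocessing-plus-classification chain this abstract describes: windowed accelerometer data are projected onto the top singular vectors (SVD feature construction and reduction), and the reduced features are classified with KNN. The two synthetic "activities" stand in for the real five-adult recordings; window size and k are assumptions.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
# 200 windows x 150 samples (e.g., 3 s at 50 Hz) for two fake activity classes.
X = np.vstack([
    np.sin(np.linspace(0, 20, 150)) + 0.3 * rng.standard_normal((100, 150)),  # "walking"
    0.05 * rng.standard_normal((100, 150)),                                   # "sitting"
])
y = np.array([1] * 100 + [0] * 100)

# SVD-based feature construction: coordinates in the top-k right singular
# vectors of the centered data matrix (a real system would fit the SVD on
# training windows only).
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
features = Xc @ Vt[:5].T                      # keep k = 5 enriched features

X_tr, X_te, y_tr, y_te = train_test_split(features, y, random_state=0)
knn = KNeighborsClassifier(n_neighbors=3).fit(X_tr, y_tr)
print("accuracy:", knn.score(X_te, y_te))
```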

4.
Advances in the media and entertainment industries, including streaming audio and digital TV, present new challenges for managing and accessing large audio-visual collections. Current content management systems support retrieval using low-level features such as motion, color, and texture. However, low-level features often have little meaning for naive users, who much prefer to identify content using high-level semantics or concepts. This creates a gap between systems and their users that must be bridged for these systems to be used effectively. To this end, we first present a knowledge-based video indexing and content management framework for domain-specific videos (using basketball video as an example). We provide a solution for exploring video knowledge by mining associations from video data. Explicit definitions and evaluation measures (e.g., temporal support and confidence) for video associations are proposed by integrating the distinctive features of video data. Our approach uses video processing techniques to find visual and audio cues (e.g., the court field, camera motion activities, and applause), introduces multilevel sequential association mining to explore associations among the audio and visual cues, classifies the associations by assigning each a class label, and uses their appearances in the video to construct video indices. Our experimental results demonstrate the performance of the proposed approach.
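The temporal support and confidence measures named above can be illustrated on a toy cue stream. The exact definitions in the paper integrate video-specific features; the contiguous-subsequence versions below are simplified assumptions, as are the cue symbols.

```python
# One symbol per shot: C = court view, M = camera-motion burst, A = applause.
cue_stream = list("CMACMCACMAC")

def seq_count(stream, pattern):
    """Occurrences of `pattern` as a contiguous run of cues."""
    n, m = len(stream), len(pattern)
    return sum(1 for i in range(n - m + 1) if stream[i:i + m] == list(pattern))

def temporal_support(stream, pattern):
    """Fraction of positions at which the pattern occurs."""
    return seq_count(stream, pattern) / max(len(stream) - len(pattern) + 1, 1)

def confidence(stream, antecedent, consequent):
    """P(consequent follows | antecedent occurred)."""
    a = seq_count(stream, antecedent)
    return seq_count(stream, antecedent + consequent) / a if a else 0.0

print(temporal_support(cue_stream, "CM"))  # 0.3: court view then camera motion
print(confidence(cue_stream, "CM", "A"))   # ~0.67: ...followed by applause
```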

5.
The Microsoft SenseCam is a small, lightweight wearable camera used to passively capture photos and other sensor readings from a user's day-to-day activities. It captures on average 3,000 images in a typical day, equating to almost one million images per year. It can be used to aid memory by creating a personal multimedia lifelog, a visual recording of the wearer's life. However, the sheer volume of image data captured within a visual lifelog creates a number of challenges, particularly for locating relevant content. In this work, we explore the applicability of semantic concept detection, a method often used in video retrieval, to the domain of visual lifelogs. Our concept detector models the correspondence between low-level visual features and high-level semantic concepts (such as indoors, outdoors, people, buildings, etc.) using supervised machine learning, thereby estimating the probability of a concept's presence. We apply detection of 27 everyday semantic concepts to a lifelog collection composed of 257,518 SenseCam images from five users. The results were evaluated on a subset of 95,907 images to determine the detection accuracy for each semantic concept. We conducted further analysis of the temporal consistency, co-occurrence, and relationships among the detected concepts to investigate the robustness of the detectors in this domain more extensively.
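A hedged sketch of the concept-detection idea, with synthetic stand-ins for the visual features and for one concept such as "indoors": a supervised model turns low-level features into a per-image concept probability, and a smoothing pass exploits the temporal consistency of consecutive lifelog images that the analysis above examines.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
# Stand-in "low-level visual features" (e.g., color/texture descriptors).
X = rng.standard_normal((500, 16))
y = (X[:, 0] + 0.5 * rng.standard_normal(500) > 0).astype(int)  # e.g., "indoors"

clf = LogisticRegression().fit(X[:400], y[:400])  # supervised concept detector
p = clf.predict_proba(X[400:])[:, 1]              # per-image concept probability

# Temporal consistency: average probabilities over neighbouring images, since
# adjacent SenseCam photos usually depict the same setting.
smoothed = np.convolve(p, np.ones(5) / 5, mode="same")
print(smoothed[:10])
```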

6.
In this article, a learning framework that enables robotic arms to replicate new skills from human demonstration is proposed. The framework uses online human motion data acquired with wearable devices as an interactive interface for conveying the intended motion to the robot in an efficient and user-friendly way. This approach lets human tutors control all joints of the robotic manipulator in real time and achieve complex manipulation. The manipulator is controlled remotely with our low-cost wearable devices, which allow easy calibration and continuous motion mapping. We believe this approach can improve human-robot skill learning, adaptability, and the sensitivity of the proposed human-robot interaction for flexible task execution, enabling skill transfer and repeatability without complex coding skills.
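A minimal sketch of the continuous motion-mapping loop such a framework needs: calibrate once against the wearer's neutral pose, then stream joint angles to the manipulator every control tick. The device reader, robot interface, joint count, and limits are all placeholders, not the authors' hardware API.

```python
import numpy as np

def read_wearable():
    """Placeholder for the wearable-device driver (7 joint angles, rad)."""
    return np.array([0.1, -0.4, 0.8, 0.0, 0.3, -0.2, 0.5])

def send_to_robot(q):
    """Placeholder for the manipulator's command interface."""
    print("commanded joints:", np.round(q, 3))

offset = read_wearable()          # one-shot calibration: record neutral pose
scale = 1.0                       # assumed 1:1 human-to-robot joint mapping

for _ in range(3):                # control loop (would run in real time)
    q_human = read_wearable()
    q_robot = scale * (q_human - offset)   # map human joints to robot joints
    q_robot = np.clip(q_robot, -2.9, 2.9)  # respect assumed joint limits
    send_to_robot(q_robot)
```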

7.
We present a visual assistive system that performs mobile face detection and recognition in an unconstrained environment from a moving source using convolutional neural networks. The goal of the system is to reliably detect individuals approaching the person equipped with the system head-on. Face detection and recognition become very difficult in this setting because the user's movement causes camera shake, which introduces motion blur and noise into the input of the visual assistive system. Due to the shortage of related datasets, we create a dataset of videos captured from a mobile source that features motion blur and noise from camera shake, making this a very challenging case of face detection and recognition in unconstrained environments. The performance of the convolutional neural network is further compared with that of a cascade classifier. The results show promising performance in daylight and artificial lighting conditions, while moonlight conditions remain challenging and false positives must be reduced to obtain a robust system. We also provide a framework for implementing the system with smartphones and wearable devices, using video input and auditory notifications to guide the visually impaired.
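A hedged sketch of the cascade-classifier baseline the paper compares against, plus a crude simulation of the motion blur that camera shake introduces. The CNN detector itself is not reproduced; the input path is a placeholder, and only standard OpenCV calls are used.

```python
import cv2
import numpy as np

cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

frame = cv2.imread("frame.jpg")  # placeholder for one frame of mobile video
if frame is not None:
    # Simulate camera shake from a walking user: horizontal linear motion blur.
    kernel = np.full((1, 9), 1.0 / 9.0, dtype=np.float32)
    blurred = cv2.filter2D(frame, -1, kernel)

    gray = cv2.cvtColor(blurred, cv2.COLOR_BGR2GRAY)
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    print(f"{len(faces)} face(s) detected under simulated blur")
```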

8.
We present an effective technique for the automatic extraction, representation, and classification of digital video, and a visual language for formulating queries to access the semantic information contained in digital video. We have devised an algorithm that extracts motion information from a video sequence; it provides a low-cost extension to the motion compensation component of the MPEG compression algorithm. In this paper, we present a visual language called VEVA for querying multimedia information in general and video semantic information in particular. Unlike many other proposals that concentrate on browsing the data, VEVA offers a complete set of capabilities for specifying relationships between image components and formulating queries that search for objects, their motions, and their other associated characteristics. VEVA has proven very expressive in this context, mainly because many types of multimedia information are inherently visual in nature.
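The motion information mentioned above comes almost free from MPEG's motion-compensation stage, which matches each block of the current frame against the previous one. A toy exhaustive block-matching search, shown below on synthetic frames, illustrates what those motion vectors encode; block and search sizes are illustrative assumptions.

```python
import numpy as np

def block_motion(prev, cur, block=16, search=4):
    """One (dy, dx) motion vector per block of `cur`, matched against `prev`."""
    h, w = cur.shape
    vectors = np.zeros((h // block, w // block, 2), dtype=int)
    for by in range(h // block):
        for bx in range(w // block):
            y, x = by * block, bx * block
            ref = cur[y:y + block, x:x + block].astype(int)
            best, best_v = None, (0, 0)
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy <= h - block and 0 <= xx <= w - block:
                        cand = prev[yy:yy + block, xx:xx + block].astype(int)
                        sad = np.abs(ref - cand).sum()  # sum of absolute differences
                        if best is None or sad < best:
                            best, best_v = sad, (dy, dx)
            vectors[by, bx] = best_v
    return vectors

rng = np.random.default_rng(0)
prev = rng.integers(0, 256, (64, 64)).astype(np.uint8)
cur = np.roll(prev, (2, -1), axis=(0, 1))    # shift content down 2, left 1
print(block_motion(prev, cur)[1, 1])         # -> [-2  1]: where the block came from
```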

9.
A graphical model for audiovisual object tracking
We present a new approach to modeling and processing multimedia data. This approach is based on graphical models that combine audio and video variables. We demonstrate it by developing a new algorithm for tracking a moving object in a cluttered, noisy scene using two microphones and a camera. Our model uses unobserved variables to describe the data in terms of the process that generates them. It is therefore able to capture and exploit the statistical structure of the audio and video data separately, as well as their mutual dependencies. Model parameters are learned from data via an EM algorithm, and automatic calibration is performed as part of this procedure. Tracking is done by Bayesian inference of the object location from data. We demonstrate successful performance on multimedia clips captured in real-world scenarios using off-the-shelf equipment.
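A scalar-level sketch of the generative idea, hedged heavily: the real model ties microphone signals and image pixels to the latent object position, whereas this toy reduces each modality to a noisy 1-D position reading. EM still does what the abstract describes, learning each modality's noise level from data while Bayesian inference fuses the two streams.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 2000
x = rng.standard_normal(T) * 1.5        # unobserved object positions
a = x + rng.standard_normal(T) * 0.8    # audio-derived observations (noisier)
v = x + rng.standard_normal(T) * 0.3    # video-derived observations

var_x, var_a, var_v = 1.0, 2.0, 0.5     # initial parameter guesses
for _ in range(100):                    # EM iterations
    # E-step: Gaussian posterior over each x_t given both observations.
    prec = 1 / var_x + 1 / var_a + 1 / var_v
    tau2 = 1 / prec                              # posterior variance
    mu = (a / var_a + v / var_v) / prec          # fused position estimate
    # M-step: re-estimate the prior and the per-modality noise variances.
    var_x = np.mean(mu ** 2 + tau2)
    var_a = np.mean((a - mu) ** 2 + tau2)
    var_v = np.mean((v - mu) ** 2 + tau2)

print(f"learned noise std: audio={var_a ** 0.5:.2f}, video={var_v ** 0.5:.2f}")
# Approaches the generating values (0.8, 0.3); `mu` is the fused object track.
```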

10.
Automatic video content classification attracts much attention from researchers in multimedia analysis because managing video content is a challenging task. In this paper, a visual feature representation composed of editing, color, texture, and motion features is proposed and shown to be effective in differentiating among various video contents. A modified directed acyclic graph support vector machine (DAG-SVM) model is also presented as the classifier. Experiments show that the extracted features improve the ability to discriminate between different video contents while also reducing computational complexity. By introducing the DAG policy, the performance of the classifier is enhanced, and the classification results demonstrate the precision and effectiveness of this approach compared with two other classification methods. In addition, the proposed algorithm can be applied to video search, harmful-content filtering, and related tasks.
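A hedged sketch of the DAG decision policy such a classifier uses: train one binary SVM per class pair, then classify by keeping a list of candidate classes and letting each pairwise SVM eliminate one until a single class remains. Synthetic features stand in for the editing/color/texture/motion descriptors.

```python
import numpy as np
from itertools import combinations
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=600, n_features=12, n_informative=8,
                           n_classes=4, n_clusters_per_class=1, random_state=0)

# One binary SVM per pair of video-content classes.
pairwise = {}
for i, j in combinations(range(4), 2):
    mask = (y == i) | (y == j)
    pairwise[(i, j)] = SVC(kernel="rbf").fit(X[mask], y[mask])

def dag_predict(sample):
    """Walk the decision DAG: each pairwise test eliminates one class."""
    alive = list(range(4))
    while len(alive) > 1:
        i, j = alive[0], alive[-1]
        winner = pairwise[(min(i, j), max(i, j))].predict(sample.reshape(1, -1))[0]
        alive.remove(j if winner == i else i)
    return alive[0]

preds = np.array([dag_predict(s) for s in X])
print("training accuracy:", (preds == y).mean())
```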
