Similar Documents
20 similar documents found (search time: 31 ms)
1.
Wireless video streaming on smartphones drains a significant fraction of battery energy, which is primarily consumed by wireless network interfaces downloading unused data and repeatedly switching the radio interface. In this paper, we propose an energy-efficient download scheduling algorithm for video streaming based on an aggregate model that uses the user's video viewing history to predict user behavior when watching a new video, thereby minimizing wasted energy when streaming over wireless network interfaces. The aggregate model combines a personal retention model built from the user's own viewing history with audience retention derived from crowd-sourced viewing history, and it can accurately predict video-watching behavior by balancing "user interest" and "video attractiveness". We evaluate different users streaming multiple videos in various wireless environments, and the results show that the aggregate model can help reduce energy waste by 20% on average. We also discuss implementation details and extensions, such as dynamically updating the personal retention model, balancing audience and personal retention, and categorizing videos for a more accurate model.
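To make the scheduling idea concrete, here is a minimal Python sketch of how such an aggregate model could drive downloading: blend the personal and audience retention curves, then prefetch only the segments whose predicted retention clears a threshold. The blending weight, threshold, and all numbers below are illustrative assumptions, not parameters from the paper.

```python
def aggregate_retention(personal, audience, alpha=0.6):
    """Blend per-segment watch probabilities (equal-length lists)."""
    return [alpha * p + (1 - alpha) * a for p, a in zip(personal, audience)]

def segments_to_prefetch(personal, audience, threshold=0.5):
    """Schedule downloads only for segments the user is likely to watch."""
    blended = aggregate_retention(personal, audience)
    return [i for i, r in enumerate(blended) if r >= threshold]

personal = [0.95, 0.80, 0.40, 0.10]   # from the user's own viewing history
audience = [0.90, 0.70, 0.55, 0.30]   # crowd-sourced audience retention
print(segments_to_prefetch(personal, audience))  # -> [0, 1]
```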

2.
We have developed an easy-to-use and cost-effective system to construct textured 3D animated face models from videos with minimal user interaction. This is a particularly challenging task for faces due to their lack of prominent textures. We develop a robust system by following a model-based approach: we make full use of generic knowledge of faces in head motion determination, head tracking, model fitting, and multiple-view bundle adjustment. Our system first takes, with an ordinary video camera, images of the face of a person sitting in front of the camera and turning their head from one side to the other. After five manual clicks on two images to indicate the positions of the eye corners, nose tip, and mouth corners, the system automatically generates a realistic-looking 3D human head model that can be animated immediately (different poses, facial expressions, and talking). A user with a PC and a video camera can use our system to generate his/her face model in a few minutes. The face model can then be imported into his/her favorite game, and the user sees themselves and their friends take part in the game they are playing. We have demonstrated the system live on a laptop computer at many events and constructed face models for hundreds of people. It works robustly under various environment settings.

3.
Nowadays, a tremendous amount of video is captured continuously by an ever-growing number of video cameras distributed around the world. Since raw videos contain abundant needless information, browsing and retrieving them is inefficient and time consuming. Video synopsis is an effective way to browse and index such video by producing a short video representation while keeping the essential activities of the original video. However, video synopsis for a single camera is limited in its view scope, while understanding and monitoring overall activity over large scenarios is valuable and in demand. To address these issues, we propose a novel video synopsis algorithm for partially overlapping camera networks. Our main contributions reside in three aspects. First, our algorithm can generate video synopses for large scenarios, which facilitates understanding overall activities. Second, to reconstruct overall activity, we adopt a novel unsupervised graph matching algorithm to associate trajectories across cameras. Third, a novel multiple-kernel similarity is adopted to select key observations, eliminating content redundancy in the synopsis. We demonstrate the effectiveness of our approach on real surveillance videos captured by our camera network.
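As an illustration of the multiple-kernel similarity used for key-observation selection, the hedged sketch below combines per-feature Gaussian kernels into one score. The feature names, weights, and bandwidths are assumptions for illustration, not values from the paper.

```python
import numpy as np

def gaussian_kernel(x, y, bandwidth):
    """Standard Gaussian (RBF) kernel between two feature vectors."""
    d = np.linalg.norm(np.asarray(x, float) - np.asarray(y, float))
    return np.exp(-d ** 2 / (2 * bandwidth ** 2))

def multiple_kernel_similarity(obs_a, obs_b, weights=(0.5, 0.3, 0.2)):
    """obs_*: dicts with 'appearance', 'position', and 'time' features."""
    kernels = (
        gaussian_kernel(obs_a["appearance"], obs_b["appearance"], 1.0),
        gaussian_kernel(obs_a["position"], obs_b["position"], 5.0),
        gaussian_kernel(obs_a["time"], obs_b["time"], 10.0),
    )
    return sum(w * k for w, k in zip(weights, kernels))
```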

4.
We present a visual assistive system that features mobile face detection and recognition in an unconstrained environment from a mobile source using convolutional neural networks. The goal of the system is to effectively detect individuals who approach facing the person equipped with the system. Face detection and recognition become very difficult because the user's movement causes camera shake, resulting in motion blur and noise in the input to the visual assistive system. Due to the shortage of related datasets, we create a dataset of videos captured from a mobile source that features motion blur and noise from camera shake, making it a very challenging testbed for face detection and recognition in unconstrained environments. The performance of the convolutional neural network is further compared with a cascade classifier. The results show promising performance in daylight and artificial lighting conditions, while moonlight conditions remain challenging and false positives must be reduced to make the system robust. We also provide a framework for implementing the system with smartphones and wearable devices for video input and auditory notification to guide the visually impaired.
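For reference, the cascade-classifier baseline the paper compares against can be sketched with OpenCV's stock Haar cascade, as below. The input video path is hypothetical, and motion blur from camera shake is exactly the failure mode such a baseline struggles with.

```python
import cv2

cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

cap = cv2.VideoCapture("mobile_capture.mp4")  # hypothetical input video
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Detections degrade under motion blur and noise from camera shake.
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    for (x, y, w, h) in faces:
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
cap.release()
```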

5.
Schiele B., Jebara T., Oliver N. 《Micro, IEEE》 2001, 21(3): 44-52
Personal computers have not lived up to their name. Most machines sit on a desk and interact with users for only a small fraction of the day. Notebook computers have become smaller and faster, enabling mobility, but the same staid user paradigm persists: typically you must stop everything you're doing, use both hands, and give the computer your full attention. Wearable computing is poised to shatter our preconceptions of how we should use a computer. A personal computer should be worn like eyeglasses or clothing and continuously interact with the user on the basis of context. With heads-up (head-mounted) displays, unobtrusive input devices, personal wireless local area networks, and a host of other context-sensing and communication tools, wearable computers may act as intelligent assistants. It is argued that a wearable computing device that perceives its surroundings and presents multimedia information through a heads-up display can behave like an intelligent assistant to fulfill the promise of personal computing.

6.
Zhang H., Li L., Jia W., Fernstrom J.D., Sclabassi R.J., Mao Z.H., Sun M. 《Neurocomputing》 2011, 74(12-13): 2184-2192
A new technique to extract and evaluate physical activity patterns from image sequences captured by a wearable camera is presented in this paper. Unlike standard activity recognition schemes, the video data captured by our device do not include the wearer him/herself. The physical activity of the wearer, such as walking or exercising, is analyzed indirectly through the camera motion extracted from the acquired video frames. Two key tasks, pixel correspondence identification and motion feature extraction, are studied to recognize activity patterns. We utilize a multiscale approach to identify pixel correspondences. Compared with existing methods such as the Good Features to Track detector and the Speeded-Up Robust Features (SURF) detector, our technique is more accurate and computationally efficient. Once the pixel correspondences, which define representative motion vectors, are determined, we build a set of activity pattern features based on motion statistics in each frame. Finally, the physical activity of the person wearing the camera is determined according to the global motion distribution in the video. Our algorithms are tested using different machine learning techniques, including k-nearest neighbor (KNN), naive Bayes, and support vector machine (SVM) classifiers. The results show that many types of physical activities can be recognized from field-acquired real-world video. Our results also indicate that, with specifically designed motion features in the input vectors, different classifiers can be used successfully with similar performance.
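The classification stage maps naturally onto standard tooling; the sketch below trains the three classifier families the paper evaluates on per-frame motion-statistic features. The feature extraction is not shown, and the random data stands in for real features purely for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.random((600, 8))      # stand-in motion-statistics features
y = rng.integers(0, 4, 600)   # stand-in activity labels
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

for clf in (KNeighborsClassifier(n_neighbors=5), GaussianNB(), SVC(kernel="rbf")):
    clf.fit(X_tr, y_tr)
    print(type(clf).__name__, round(clf.score(X_te, y_te), 3))
```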

7.
Research and Implementation of HVS, a Virtual Real-Scene Space System
The virtual real-scene space system HVS uses a computer to automatically stitch, warp, and organize many discrete real-scene images or continuous video into a virtual space. Such a virtual space offers photo-quality visual effects and is called a virtual real-scene space. A virtual real-scene space provides users with roaming capabilities such as moving forward and backward, looking up and down, 360-degree panoramic viewing, and viewing near and far, and it runs on the PC platform. HVS is the platform for generating and roaming virtual real-scene spaces; this paper introduces its model, composition, and implementation.

8.
Wearable computing is a personal-space technology: a class of wearable personal mobile computing systems. Applying wearable computer systems scientifically to on-site power-grid inspection work can effectively improve work efficiency. Targeting the characteristics and requirements of intelligent substation inspection environments, this paper designs an intelligent substation inspection system based on a head-mounted binocular camera, proposes an abstract summary-map construction method, achieves global localization of inspection personnel based on the environment summary map together with stable front-end/back-end communication, and verifies the effectiveness and reliability of the system.

9.
With advances in technology and rapid economic development, people pay increasing attention to protecting their personal safety and property. To address this, we design a device that supports remote monitoring with automatic alerting. The device uses a TQ210 development board as the hardware platform and embedded Linux as the software platform. Images are captured by an external camera and processed with a frame-difference method to detect moving objects; when motion is detected, an alarm sound is played through a speaker. A web server built around Boa also runs on the development board, and a Wi-Fi module enables remote viewing of images and video of moving objects. Experimental results show that, in addition to conventional real-time video surveillance, the system immediately triggers an audible alarm when a moving object is detected and saves the images and video for later review or as evidence.
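The frame-difference detector at the core of the design can be sketched in a few lines of OpenCV; the threshold values and the camera index below are assumptions, and the alarm/saving logic is reduced to a placeholder print.

```python
import cv2

cap = cv2.VideoCapture(0)                          # external camera
ok, prev = cap.read()
prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    diff = cv2.absdiff(gray, prev_gray)            # frame difference
    _, mask = cv2.threshold(diff, 25, 255, cv2.THRESH_BINARY)
    if cv2.countNonZero(mask) > 0.01 * mask.size:  # enough pixels changed
        print("motion detected: trigger alarm, save frame and video")
    prev_gray = gray
cap.release()
```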

10.
To extend the coverage and flexibility of single-camera video surveillance, we design a remotely operable mobile video surveillance system composed of four modules. An Arduino-based smart car carries the camera, receives user commands, and performs mobile video capture. An embedded Linux system acquires the video data in real time through the V4L2 interface, sends the data over the network to a forwarding server, and relays user control commands to the smart car. The server forwards video to clients and user control commands to the Linux system. An Android-based mobile client displays the surveillance video and provides the user control interface. Compared with existing systems, this system achieves blind-spot-free surveillance with a single camera.
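The forwarding server's role is essentially a two-way relay; the hypothetical sketch below pipes the video stream from the embedded Linux board to a single client while passing control commands back. The ports, the lack of framing, and the single-client limitation are all simplifying assumptions.

```python
import socket
import threading

def pipe(src, dst):
    """Copy bytes from one connected socket to the other until EOF."""
    while True:
        data = src.recv(4096)
        if not data:
            break
        dst.sendall(data)

def serve_pair(video_port=9000, client_port=9001):
    up = socket.socket()
    up.bind(("", video_port))
    up.listen(1)
    down = socket.socket()
    down.bind(("", client_port))
    down.listen(1)
    board, _ = up.accept()     # embedded Linux board pushing video
    client, _ = down.accept()  # Android client viewing and controlling
    # Video flows board -> client; control commands flow client -> board.
    threading.Thread(target=pipe, args=(board, client), daemon=True).start()
    pipe(client, board)

if __name__ == "__main__":
    serve_pair()
```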

11.
This paper presents an automatic and robust approach to synthesizing stereoscopic videos from ordinary monocular videos acquired by commodity video cameras. Instead of recovering a depth map, the proposed method synthesizes the binocular parallax of the stereoscopic video directly from the motion parallax of the monocular video. The synthesis is formulated as an optimization problem with a cost function combining stereoscopic effect, similarity, and smoothness constraints. The optimization selects the most suitable frames in the input video for generating the stereoscopic video frames. With the optimized selection, convincing and smooth stereoscopic video can be synthesized even by simple constant-depth warping. No user interaction is required. We demonstrate visually plausible results on input clips acquired by an ordinary handheld video camera.
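One simple way to realize the frame-selection idea is greedily, as in the sketch below: for each frame, choose a partner frame that trades off a target parallax (stereo effect), inter-frame similarity, and smoothness with the previous choice. The weights, target parallax, and greedy strategy are illustrative assumptions; the paper formulates a full optimization.

```python
def select_partner_frames(parallax, similarity, w=(1.0, 0.5, 0.5),
                          target=10.0, window=15):
    """parallax[t][k-1], similarity[t][k-1]: precomputed scores for pairing
    frame t with frame t+k (k = 1..window); returns one offset per frame."""
    choices, prev_k = [], 1
    for t in range(len(parallax)):
        def cost(k):
            stereo = abs(parallax[t][k - 1] - target)  # hit target parallax
            dissim = 1.0 - similarity[t][k - 1]        # paired frames match
            smooth = abs(k - prev_k)                   # avoid jumpy offsets
            return w[0] * stereo + w[1] * dissim + w[2] * smooth
        prev_k = min(range(1, window + 1), key=cost)
        choices.append(prev_k)
    return choices
```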

12.
In this paper we present a computationally economical method of recovering the projective motion of head-mounted cameras or EyeTap devices for use in wearable computer-mediated reality. The tracking system combines featureless vision and inertial methods in a closed-loop system to achieve accurate, robust head tracking with inexpensive sensors. The combination of inertial and vision techniques provides the high-accuracy visual registration needed for fitting computer graphics onto real images, along with robustness to the large interframe camera motion caused by fast head rotations. Operating on a 1.2 GHz Pentium III wearable computer with hardware-accelerated graphics, the system is able to register live video images with less than 2 pixels of error (0.3 degrees) at 12 frames per second. Fast image registration is achieved by offloading computer vision computation onto the graphics hardware, which is readily available on many wearable computer systems. As an application of this tracking approach, we present a system that allows wearable computer users to share views of their current environments, stabilised to another viewer's head position.
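A common pattern for this kind of vision-inertial closed loop is a complementary filter: fast gyro integration corrected by slower, drift-free visual registration. The sketch below is a hedged single-axis illustration; the gain, rates, and values are assumptions, not the paper's filter.

```python
def fuse_orientation(angle, gyro_rate, vision_angle, dt, k=0.02):
    """One filter step for a single rotation axis (angles in degrees)."""
    predicted = angle + gyro_rate * dt        # inertial: fast but drifts
    if vision_angle is not None:              # vision: slow but absolute
        predicted += k * (vision_angle - predicted)
    return predicted

angle = 0.0
for gyro, vis in [(30.0, None), (30.0, None), (29.0, 1.9)]:
    angle = fuse_orientation(angle, gyro, vis, dt=1 / 60)
    print(round(angle, 4))
```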

13.
This paper proposes the "Substitute Interface," which utilizes the flat surfaces of objects around us as part of an ad hoc mobile device. The substitute interface is established by combining wearable devices such as a head-mounted display with a camera and a ring-type microphone. The camera recognizes which object the user intends to employ. When the user picks up and taps an object, such as a notebook, a virtual display is overlaid on the object, and the user can operate the ad hoc mobile device as if the object were part of the device. Display size can be changed easily by selecting a larger object. The user's pointing/selection actions are recognized by the combination of the camera and the ring-type microphone. We first investigate usage scenarios of tablet devices and then create a prototype that can operate as a tablet device. Experiments on the prototype confirm that the proposal functions as intended.

14.
Extracting foreground objects from videos captured by a handheld camera has emerged as a new challenge. While existing approaches exploit clues such as depth and motion to extract the foreground layer, they have limitations in handling partial movement and cast shadows. In this paper, we bring a novel perspective to these two issues by utilizing the occlusion maps induced by object and camera motion and taking advantage of interactive image segmentation methods. For partial movement, we treat each video frame as an image and synthesize "seeding" user interactions (i.e., a user manually marking foreground and background) from both forward and backward occlusion maps, leveraging advances in high-quality interactive image segmentation. For cast shadows, we utilize a paired-region shadow detection method to further refine the initial segmentation by removing detected shadow regions. Qualitative and quantitative evaluations on the Hopkins dataset demonstrate both the effectiveness and the efficiency of our proposed approach.
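The "synthesized seeding" step can be illustrated with OpenCV's GrabCut, a widely used interactive segmentation method: confident regions of an occlusion map become foreground/background seeds. The thresholds and the occlusion-map source are assumptions; the paper's exact segmentation backend may differ.

```python
import cv2
import numpy as np

def segment_frame(frame, occlusion):
    """frame: 8-bit BGR image; occlusion: float map in [0, 1], high = moving."""
    mask = np.full(frame.shape[:2], cv2.GC_PR_BGD, np.uint8)
    mask[occlusion > 0.8] = cv2.GC_FGD      # confident foreground seeds
    mask[occlusion < 0.1] = cv2.GC_BGD      # confident background seeds
    bgd = np.zeros((1, 65), np.float64)
    fgd = np.zeros((1, 65), np.float64)
    cv2.grabCut(frame, mask, None, bgd, fgd, 5, cv2.GC_INIT_WITH_MASK)
    return np.isin(mask, (cv2.GC_FGD, cv2.GC_PR_FGD)).astype(np.uint8)
```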

15.
In this paper, we propose a body-mounted system that captures user experience as audio/visual information. The proposed system consists of two cameras (head-detection and wide-angle) and a microphone. The head-detection camera captures the user's head motions, while the wide-angle color camera captures the user's frontal view. An image region approximately corresponding to the user's view is then synthesized from the wide-angle image based on the estimated head motions. The synthesized image and head-motion data are stored in a storage device along with the audio data. This system overcomes the disadvantages of head-mounted cameras in terms of ease of putting on and taking off, and it has less obtrusive visual impact on third persons. Using the proposed system, we can simultaneously record audio data, images in the user's field of view, and head gestures (nodding, shaking, etc.). These data carry significant information for recording and analyzing human activities and can be used in wider application domains such as a digital diary or interaction analysis. Experimental results demonstrate the effectiveness of the proposed system.
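The view-synthesis step amounts to shifting a crop window inside the wide-angle frame according to the estimated head pose. The sketch below assumes a simple linear pixels-per-degree mapping; the real system's geometry is likely more involved.

```python
import numpy as np

def user_view_crop(wide_frame, yaw_deg, pitch_deg,
                   fov_wide=120.0, view_size=(480, 640)):
    """Crop an approximate user-view window from the wide-angle frame."""
    h, w = wide_frame.shape[:2]
    px_per_deg = w / fov_wide
    cx = w // 2 + int(yaw_deg * px_per_deg)     # shift right for +yaw
    cy = h // 2 - int(pitch_deg * px_per_deg)   # shift up for +pitch
    vh, vw = view_size
    x0 = int(np.clip(cx - vw // 2, 0, w - vw))
    y0 = int(np.clip(cy - vh // 2, 0, h - vh))
    return wide_frame[y0:y0 + vh, x0:x0 + vw]
```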

16.
In this paper, we introduce the concept of a personal driving diary: a multimedia archive of a person's daily driving experience that describes the user's important driving events with annotated videos. This paper presents an automated system that constructs such a multimedia diary by analyzing videos obtained from a vehicle-mounted camera. The proposed system recognizes important interactions between the driving vehicle and other actors in the videos (e.g., accidents, overtaking) and labels them, together with contextual knowledge of the vehicle (e.g., mean velocity), to construct an event log. A decision-tree-based activity recognizer is designed to detect driving events of vehicles and pedestrians from first-person-view videos by analyzing their trajectories and spatio-temporal relationships. The constructed diary enables efficient searching and event-based browsing of video clips, which helps users retrieve videos of dangerous situations. Our experiments confirm that the proposed system reliably generates driving diaries by annotating vehicle events learned from training examples.
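A decision-tree event recognizer of this kind is easy to prototype with scikit-learn; the sketch below uses made-up trajectory features and labels purely to show the shape of the classifier, not the paper's actual feature set.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# features: [mean speed, lateral offset, closing rate, min distance] (assumed)
X = np.array([[15.0, 0.1, -0.2, 8.0],     # normal following
              [20.0, 2.5,  1.5, 3.0],     # overtaking
              [12.0, 0.0, -6.0, 0.5]])    # near-collision / accident
y = np.array(["follow", "overtake", "accident"])

tree = DecisionTreeClassifier(max_depth=3).fit(X, y)
print(tree.predict([[19.0, 2.2, 1.2, 2.8]]))  # likely "overtake"
```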

17.
Video cameras must produce images at a reasonable frame rate and with a reasonable depth of field. These requirements impose fundamental physical limits on the spatial resolution of the image detector. As a result, current cameras produce videos with a very low resolution. The resolution of videos can be computationally enhanced by moving the camera and applying super-resolution reconstruction algorithms. However, a moving camera introduces motion blur, which limits super-resolution quality. We analyze this effect and derive a theoretical result showing that motion blur has a substantial degrading effect on the performance of super-resolution. The conclusion is that, in order to achieve the highest resolution, motion blur should be avoided. Motion blur can be minimized by sampling the space-time volume of the video in a specific manner. We have developed a novel camera, called the "jitter camera," that achieves this sampling. By applying an adaptive super-resolution algorithm to the video produced by the jitter camera, we show that resolution can be notably enhanced for stationary or slowly moving objects, while it is improved slightly or left unchanged for objects with fast and complex motions. The end result is a video that has a significantly higher resolution than the captured one.

18.
We present a hybrid camera system for capturing video at high spatial and spectral resolutions. Composed of a red, green, and blue (RGB) video camera, a grayscale video camera, and a few optical elements, the hybrid camera system simultaneously records two video streams: an RGB video with high spatial resolution, and a multispectral (MS) video with low spatial resolution. After registration of the two video streams, our system propagates the MS information into the RGB video to produce a video with both high spectral and high spatial resolution. This propagation between videos is guided by color similarity of pixels in the spectral domain, proximity in the spatial domain, and the consistent color of each scene point in the temporal domain. The propagation algorithm, based on trilateral filtering, is designed to rapidly generate output video from the captured data at frame rates fast enough for real-time video analysis tasks such as tracking and surveillance. We evaluate the proposed system using both simulations with ground-truth data and real-world scenes. The accuracy of spectral capture is examined through comparisons with ground truth and with a commercial spectrometer. The utility of the high-resolution MS video data is demonstrated on the applications of dynamic white balance adjustment, object tracking, and separating the appearance contributions of different illumination sources. The various high-resolution MS video datasets that we captured will be made publicly available to facilitate research on dynamic spectral data analysis.
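The trilateral weighting can be sketched as the product of three Gaussian terms, one per guiding cue (spectral color similarity, spatial proximity, temporal consistency). The bandwidths and example values below are assumptions; the paper's filter is more elaborate and heavily optimized.

```python
import numpy as np

def trilateral_weight(rgb_p, rgb_q, xy_p, xy_q, dt,
                      sigma_c=10.0, sigma_s=5.0, sigma_t=2.0):
    """Weight of MS sample q when propagating to RGB pixel p."""
    color = np.exp(-np.sum((rgb_p - rgb_q) ** 2.0) / (2 * sigma_c ** 2))
    space = np.exp(-np.sum((xy_p - xy_q) ** 2.0) / (2 * sigma_s ** 2))
    time_ = np.exp(-float(dt) ** 2 / (2 * sigma_t ** 2))
    return color * space * time_

w = trilateral_weight(np.array([120., 80., 60.]), np.array([118., 82., 61.]),
                      np.array([10., 10.]), np.array([12., 9.]), dt=1)
print(round(w, 4))
```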

19.
We present a virtually documented environment system providing the user with high-level interaction possibilities. The system is dedicated to applications where the operator needs to have his hands free in order to access information, carry out measurements, and/or operate on a device (e.g., maintenance, instruction). The system merges video images acquired through a head-mounted video camera with synthetic data (multimedia documents including CAD models and text) and presents the merged images to the operator. Registration techniques allow the operator to visualise information properly correlated to the real world: this is essential for achieving a feeling of presence in the real environment. We increase the sense of immersion through high-level human-computer interaction (HCI), allowing hands-free access to information through vocal commands as well as multimodal interaction associating speech and gesture. In this way, the user can access and manipulate information in a very natural manner. We discuss the construction of the documentation system and the required functionalities that led to the system architecture.

20.
Multiple spatially-related videos are increasingly used in security, communication, and other applications. Since it can be difficult to understand the spatial relationships between multiple videos in complex environments (e.g. to predict a person's path through a building), some visualization techniques, such as video texture projection, have been used to aid spatial understanding. In this paper, we identify and begin to characterize an overall class of visualization techniques that combine video with 3D spatial context. This set of techniques, which we call contextualized videos, forms a design palette which must be well understood so that designers can select and use appropriate techniques that address the requirements of particular spatial video tasks. In this paper, we first identify user tasks in video surveillance that are likely to benefit from contextualized videos and discuss the video, model, and navigation related dimensions of the contextualized video design space. We then describe our contextualized video testbed which allows us to explore this design space and compose various video visualizations for evaluation. Finally, we describe the results of our process to identify promising design patterns through user selection of visualization features from the design space, followed by user interviews.
