Similar Literature
 Found 20 similar documents (search time: 46 ms)
1.
This paper addresses a sensor-based simultaneous localization and mapping (SLAM) algorithm for camera tracking in a virtual studio environment. Traditional camera tracking methods in virtual studios are vision-based or sensor-based. However, the chroma keying process in virtual studios requires color cues, such as a blue background, to segment the foreground objects to be inserted into images and videos. Chroma keying limits the application of vision-based tracking methods in virtual studios, since the background cannot provide enough feature information. Furthermore, conventional sensor-based tracking approaches suffer from jitter, drift, or expensive computation due to the characteristics of the individual sensor systems. Therefore, SLAM techniques from the mobile robot field are first investigated and adapted to camera tracking. Then, a sensor-based SLAM extension algorithm for two-dimensional (2D) camera tracking in a virtual studio is described. A technique called map adjustment is also proposed to increase the accuracy and efficiency of the algorithm. The feasibility and robustness of the algorithm are shown by experiments, and the simulation results demonstrate that the sensor-based SLAM algorithm can satisfy the fundamental 2D camera tracking requirements in a virtual studio environment.
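The abstract gives no implementation detail, but the core of most sensor-based SLAM formulations is an extended Kalman filter over a joint camera-and-landmark state. Below is a minimal 2D EKF-SLAM sketch under assumed models (linear odometry motion, range-bearing landmark measurements); none of it is taken from the paper itself:

```python
# A minimal 2D EKF-SLAM sketch. All models here are assumptions for
# illustration, not details from the paper.
import numpy as np

def ekf_predict(x, P, u, Q):
    """State x = [cam_x, cam_y, lm1_x, lm1_y, ...]; odometry u = [dx, dy]."""
    x = x.copy()
    x[0:2] += u                      # only the camera part of the state moves
    P = P.copy()
    P[0:2, 0:2] += Q                 # process noise acts on the camera only
    return x, P

def ekf_update(x, P, z, lm_idx, R):
    """Fuse one range-bearing measurement z = [r, phi] of landmark lm_idx."""
    j = 2 + 2 * lm_idx
    dx, dy = x[j] - x[0], x[j + 1] - x[1]
    q = dx ** 2 + dy ** 2
    z_hat = np.array([np.sqrt(q), np.arctan2(dy, dx)])
    H = np.zeros((2, len(x)))        # Jacobian of the measurement model
    H[0, 0], H[0, 1] = -dx / np.sqrt(q), -dy / np.sqrt(q)
    H[1, 0], H[1, 1] = dy / q, -dx / q
    H[:, j:j + 2] = -H[:, 0:2]       # landmark columns mirror the camera's
    y = z - z_hat
    y[1] = (y[1] + np.pi) % (2 * np.pi) - np.pi   # wrap bearing residual
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    return x + K @ y, (np.eye(len(x)) - K @ H) @ P
```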

2.
3.
A new vision-based framework and system for human–computer interaction in an Augmented Reality environment is presented in this article. The system allows users to interact with computer-generated virtual objects directly with their hands. With an efficient color segmentation algorithm, the system adapts to different lighting conditions and backgrounds, and it is suitable for real-time applications. The dominant features on the palm are detected and tracked to estimate the camera pose. After the camera pose relative to the user's hand has been reconstructed, 3D virtual objects can be augmented naturally onto the palm for the user to inspect and manipulate. With an off-the-shelf web camera and computer, natural bare-hand interaction with 2D and 3D virtual objects can be achieved at low cost.
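The abstract does not specify the segmentation algorithm, so the following is only a generic sketch of HSV skin-color segmentation with OpenCV; the threshold band and morphology sizes are illustrative assumptions:

```python
# A generic HSV skin-color segmentation sketch; thresholds are
# illustrative assumptions, not the paper's values.
import cv2
import numpy as np

def segment_hand(frame_bgr, lo=(0, 40, 60), hi=(25, 255, 255)):
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, np.array(lo, np.uint8), np.array(hi, np.uint8))
    kernel = np.ones((5, 5), np.uint8)
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)   # remove speckle
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)  # fill small holes
    return mask

def palm_region(mask):
    """Take the largest connected blob in the mask as the palm."""
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    return max(contours, key=cv2.contourArea) if contours else None
```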

4.
In augmented reality applications such as virtual studio TV production, multisite video conferencing using a virtual meeting room, and synthetic/natural hybrid coding according to the new ISO/MPEG-4 standard, a synthetic scene is mixed into a natural scene to generate a synthetic/natural hybrid image sequence. For realism, the illumination in both scenes should be identical. In this paper, the illumination of the natural scene is estimated automatically and applied to the synthetic scene. The natural scenes are restricted to scenes with nonoccluding, simple, moving, mainly rigid objects. For illumination estimation, these natural objects are automatically segmented in the natural image sequence and three-dimensionally (3-D) modeled using ellipsoid-like models. The 3-D shape, the 3-D motion, and the displaced frame difference between two succeeding images are evaluated to estimate three illumination parameters, which describe a distant point light source and ambient light. Using the estimated illumination parameters, the synthetic scene is rendered and mixed into the natural image sequence. Experimental results with a moving virtual object mixed into real video telephone sequences show that the virtual object appears natural, with the same shading and shadows as the real objects. Further, the shading and shadows allow the viewer to understand the motion trajectories of the objects much better.
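One common way to estimate a distant point light plus an ambient term is a linear least-squares fit of a Lambertian shading model to intensities at points with known normals. The sketch below assumes all sample points are lit (n·l > 0), which linearizes the problem; the paper's actual estimator may differ:

```python
# Fit ambient light plus one distant point light to observed Lambertian
# shading I = E_a + E_d * max(0, n . l). Assuming all sample points are
# lit makes the fit linear; this is our simplification.
import numpy as np

def estimate_illumination(normals, intensities):
    """normals: (N, 3) unit vectors; intensities: (N,) observed shading."""
    # Unknowns: [E_a, E_d*l_x, E_d*l_y, E_d*l_z]
    A = np.hstack([np.ones((len(normals), 1)), normals])
    coeffs, *_ = np.linalg.lstsq(A, intensities, rcond=None)
    E_a, v = coeffs[0], coeffs[1:]
    E_d = np.linalg.norm(v)          # intensity of the directed light
    return E_a, E_d, v / E_d         # ambient, directed, light direction
```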

5.
Augmented reality has been a research hotspot in human-computer interaction in recent years. Adding haptic perception to an augmented reality environment allows users to both see and feel virtual objects within the real scene. To achieve more natural interaction with virtual objects in an augmented reality environment, a 3D registration method fusing vision and haptics is proposed. A 3D registration matrix is obtained with image-based vision techniques; the transformation between the haptic space and the image space is then solved through their spatial transformation relations; and by combining both with their relations to the camera space, an augmented reality interaction scene fusing vision and haptics is realized. To verify the effectiveness of the method, a robot-assembly application based on visuo-haptic augmented reality was designed: users can touch and move robot parts in the real environment and feel a feedback force while touching, which makes the interaction more realistic.
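A minimal sketch of the transform chaining the abstract describes, using homogeneous 4×4 matrices; the names T_cam_img and T_cam_hap are ours, not the paper's notation:

```python
# Chain the camera-from-image registration (T_cam_img) with the
# camera-from-haptic calibration (T_cam_hap) to map haptic-space points
# into image/marker space. Names are our assumptions.
import numpy as np

def image_from_haptic(T_cam_img, T_cam_hap):
    """Return T_img_hap so that p_img = T_img_hap @ p_hap."""
    return np.linalg.inv(T_cam_img) @ T_cam_hap

def to_image_space(T_img_hap, p_hap_xyz):
    p = np.append(p_hap_xyz, 1.0)    # homogeneous coordinates
    q = T_img_hap @ p
    return q[:3] / q[3]
```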

6.
Automated virtual camera control has been widely used in animation and interactive virtual environments. We have developed a free-view video system prototype based on multiple sparse cameras that allows users to control the position and orientation of a virtual camera, enabling the observation of a real scene in three dimensions (3D) from any desired viewpoint. Automatic camera control can be activated to follow objects selected by the user. Our method combines a simple geometric model of the scene composed of planes (the virtual environment), augmented with visual information from the cameras and pre-computed tracking information of moving targets, to generate novel perspective-corrected 3D views of the virtual camera and the moving objects. To achieve real-time rendering performance, view-dependent texture-mapped billboards are used to render the moving objects at their correct locations, and foreground masks are used to remove the moving objects from the projected video streams. The current prototype runs on a PC with a common graphics card and can generate virtual 2D views from three cameras of resolution 768×576 with several moving objects at about 11 fps.
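The view-dependent billboard idea can be illustrated with a generic construction that orients a textured quad at the tracked object's position so it always faces the virtual camera; this is a standard billboard computation, not the paper's code:

```python
# Generic billboard construction. Inputs are numpy arrays of shape (3,);
# world_up must not be parallel to the view direction.
import numpy as np

def billboard_corners(obj_pos, cam_pos, width, height,
                      world_up=(0.0, 1.0, 0.0)):
    fwd = cam_pos - obj_pos
    fwd /= np.linalg.norm(fwd)                  # quad normal points at camera
    right = np.cross(world_up, fwd)
    right /= np.linalg.norm(right)
    up = np.cross(fwd, right)
    w, h = right * width / 2.0, up * height / 2.0
    return [obj_pos - w - h, obj_pos + w - h,   # counter-clockwise corners
            obj_pos + w + h, obj_pos - w + h]
```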

7.
Inserting synthetic objects into video sequences has gained much interest in recent years. Fast and robust vision-based algorithms are necessary to make such applications possible. Traditional pose tracking schemes using recursive structure-from-motion techniques adopt a single Kalman filter and thus only favor a certain type of camera motion. We propose a robust simultaneous pose tracking and structure recovery algorithm using the interacting multiple model (IMM) to improve performance. In particular, a set of three extended Kalman filters (EKFs), each describing a camera motion that occurs frequently in real situations (general, pure translation, pure rotation), is applied within the IMM framework to track the pose of a scene. Another set of EKFs, one filter for each model point, is used to refine the positions of the model features in 3-D space. The filters for pose tracking and structure refinement are executed in an interleaved manner, and the results are used for inserting virtual objects into the original video footage. The performance of the algorithm is demonstrated with both synthetic and real data. Comparisons with different approaches show that our method is more efficient and accurate.
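The IMM machinery itself is standard: per-model filter estimates are weighted by model probabilities updated from measurement likelihoods, then moment-matched into a single estimate. A hedged sketch of those two steps follows (the transition matrix and likelihoods are placeholders, not the paper's tuning):

```python
# Standard IMM steps: update model probabilities from per-model
# measurement likelihoods, then moment-match the per-model EKF outputs.
import numpy as np

def imm_model_probs(mu, L, PI):
    """mu: prior model probs; L: likelihood per model; PI: transition matrix."""
    c = PI.T @ mu                    # predicted model probabilities
    mu_new = L * c
    return mu_new / mu_new.sum()

def imm_combine(mu, xs, Ps):
    """Moment-match per-model states xs and covariances Ps into one pair."""
    x = sum(m * xi for m, xi in zip(mu, xs))
    P = sum(m * (Pi + np.outer(xi - x, xi - x))
            for m, xi, Pi in zip(mu, xs, Ps))
    return x, P
```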

8.
Multithreaded Hybrid Feature Tracking for Markerless Augmented Reality   (cited by 1: 0 self-citations, 1 external)
We describe a novel markerless camera tracking approach and user interaction methodology for augmented reality (AR) in unprepared tabletop environments. We propose a real-time system architecture that combines two types of feature tracking. Distinctive image features of the scene are detected and tracked frame-to-frame by computing optical flow. In order to achieve real-time performance, multiple operations are processed in a synchronized multi-threaded manner: capturing a video frame, tracking features using optical flow, detecting distinctive invariant features, and rendering an output frame. We also introduce a user interaction methodology for establishing a global coordinate system and for placing virtual objects in the AR environment by tracking a user's outstretched hand and estimating the camera pose relative to it. We evaluate the speed and accuracy of our hybrid feature tracking approach, and demonstrate a proof-of-concept application for enabling AR in unprepared tabletop environments, using bare hands for interaction.
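The frame-to-frame stage maps naturally onto OpenCV's pyramidal Lucas-Kanade tracker. A single-threaded sketch of just that step (the paper additionally runs capture, invariant-feature detection, and rendering in separate synchronized threads):

```python
# Frame-to-frame feature tracking with pyramidal Lucas-Kanade optical
# flow; only this one stage of the paper's pipeline is sketched here.
import cv2

def track_features(prev_gray, gray, pts):
    """pts: (N, 1, 2) float32 corners from the previous frame."""
    nxt, status, _err = cv2.calcOpticalFlowPyrLK(prev_gray, gray, pts, None)
    ok = status.ravel() == 1
    return pts[ok], nxt[ok]          # matched pairs for pose estimation

def detect_features(gray, max_corners=300):
    """Re-detect distinctive corners when too few tracks survive."""
    return cv2.goodFeaturesToTrack(gray, max_corners,
                                   qualityLevel=0.01, minDistance=7)
```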

9.
A multi-user 3-D virtual environment allows remote participants to communicate transparently, as if they were face-to-face. The sense of presence in such an environment can be established by representing each participant with a vivid human-like character called an avatar. We review several immersive technologies, including directional sound, eye gaze, hand gestures, lip synchronization, and facial expressions, that facilitate multimodal interaction among participants in the virtual environment using speech processing and animation techniques. Interactive collaboration can be further encouraged by the ability to share and manipulate 3-D objects in the virtual environment. A shared whiteboard makes it easy for participants in the virtual environment to convey their ideas graphically. We survey various kinds of capture devices used to provide input for the shared whiteboard. Efficient storage of a whiteboard session and precise retrieval at a later time bring up interesting research topics in information retrieval.

10.
We present a real-time multi-view facial capture system facilitated by synthetic training imagery. Our method achieves high-quality markerless facial performance capture in real time from multi-view helmet camera data, employing an actor-specific regressor. The regressor training is tailored to the specified actor's appearance, and we further condition it for the expected illumination conditions and the physical capture rig by generating the training data synthetically. In order to leverage the information present in live imagery, which is typically provided by multiple cameras, we propose a novel multi-view regression algorithm that uses multi-dimensional random ferns. We show that regressing on multiple video streams achieves higher quality than previous approaches designed to operate on only a single view. Furthermore, we evaluate possible camera placements and propose a novel camera configuration that allows cameras to be mounted outside the actor's field of view, which is very beneficial: the cameras are then less of a distraction for the actor and allow an unobstructed line of sight to the director and other actors. Our new real-time facial capture approach has immediate application in on-set virtual production, in particular with the ever-growing demand for motion-captured facial animation in visual effects and video games.
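A random fern hashes a handful of binary feature comparisons into a bin that stores a regression output; a multi-dimensional variant simply stores vector-valued bins. A toy single-fern sketch under our own assumptions (fixed-length feature vectors, running-mean bin training), much simpler than the paper's multi-view formulation:

```python
# Toy single random fern regressor; structure and training rule are our
# assumptions for illustration only.
import numpy as np

class FernRegressor:
    def __init__(self, n_tests, feat_dim, out_dim, rng):
        self.pairs = rng.integers(0, feat_dim, size=(n_tests, 2))
        self.bins = np.zeros((2 ** n_tests, out_dim))   # regression outputs
        self.counts = np.zeros(2 ** n_tests)

    def _bin(self, feats):
        bits = feats[self.pairs[:, 0]] > feats[self.pairs[:, 1]]
        return int(np.dot(bits, 1 << np.arange(len(bits))))

    def fit_one(self, feats, target):
        b = self._bin(feats)
        self.counts[b] += 1
        self.bins[b] += (target - self.bins[b]) / self.counts[b]  # running mean

    def predict(self, feats):
        return self.bins[self._bin(feats)]

# usage: fern = FernRegressor(8, 128, 2, np.random.default_rng(0))
```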

11.
Camera-view-invariant 3-D object retrieval is an important issue in many traditional and emerging applications such as security, surveillance, computer-aided design (CAD), virtual reality, and place recognition. One straightforward method for camera-view-invariant 3-D object retrieval is to consider all possible camera views of the 3-D objects. However, capturing and maintaining such views requires an enormous amount of time and labor. In addition, all camera views should be indexed for reasonable retrieval performance, which requires extra storage space and maintenance overhead. In the case of shape-based 3-D object retrieval, such overhead can be relieved by exploiting the symmetric shapes of most objects. In this paper, we propose a new shape-based indexing and matching scheme for real or rendered 3-D objects for camera-view-invariant object retrieval. In particular, in order to remove redundant camera views from the index, we propose a camera view skimming scheme that includes: i) mirror shape pairing and ii) camera view pruning according to the symmetrical patterns of object shapes. Since our camera view skimming scheme considerably reduces the number of camera views to be indexed, it relieves the storage requirement and improves the matching speed without sacrificing retrieval accuracy. Through various experiments, we show that the proposed scheme achieves excellent performance.
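The mirror-pairing intuition can be sketched as canonicalizing each view descriptor against its horizontal mirror, so that mirrored camera views share one index entry; the grid descriptor below is our assumption, not the paper's:

```python
# Canonicalize a view's shape descriptor against its mirror so mirrored
# views index the same entry. The descriptor format is assumed.
import numpy as np

def canonical_key(descriptor_grid):
    """descriptor_grid: 2D array (e.g., a binarized silhouette grid)."""
    mirrored = np.fliplr(descriptor_grid)
    a, b = tuple(descriptor_grid.ravel()), tuple(mirrored.ravel())
    return a if a <= b else b        # lexicographically smaller twin wins
```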

12.
This paper proposes an augmented reality content authoring system that enables ordinary users without programming skills to easily apply interactive features to virtual objects on a marker via gestures. The purpose of this system is to simplify the use of augmented reality (AR) technology for ordinary users, especially parents and preschool children who are unfamiliar with AR technology. The system provides an immersive AR environment with a head-mounted display and recognizes users' gestures via an RGB-D camera. Users can freely create the AR content they will use, without any special programming ability, simply by connecting virtual objects stored in a database to the system. Once the marker is recognized by the RGB-D camera worn by the user, he or she can apply various interactive features to the marker-based AR content using simple gestures: the interactive features allow virtual objects to be enlarged, shrunk, rotated, and moved with hand gestures. In addition to gesture interaction, the proposed system also allows for tangible interaction using markers. The AR content that the user edits is stored in a database and is retrieved whenever the markers are recognized. The results of comparative experiments indicate that the proposed system is easier to use and yields higher interaction satisfaction than AR environments based on fixed monitors or touch interaction on mobile screens.

13.
This paper presents a novel method that acquires camera position and orientation from a stereo image sequence without prior knowledge of the scene. To make the algorithm robust, the interacting multiple model probabilistic data association filter (IMMPDAF) is introduced. The interacting multiple model (IMM) technique allows more than one dynamic system to coexist in the filtering process and in return leads to improved accuracy and stability even under abrupt motion changes. The probabilistic data association (PDA) framework makes the automatic selection of measurement sets possible, resulting in enhanced robustness to occlusions and moving objects. In addition to the IMMPDAF, the trifocal tensor is employed in the computation so that the step of reconstructing 3-D models can be eliminated, which further guarantees estimation precision and computational efficiency. Real stereo image sequences were used to test the proposed method. The recovered 3-D motions are accurate in comparison with ground truth data and have been applied to control cameras in a virtual environment.
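The PDA step assigns each validated measurement an association weight from its Gaussian innovation likelihood, plus a weight for the event that no measurement is correct. A simplified sketch with placeholder detection, gating, and clutter parameters (not the paper's values):

```python
# Simplified PDA association weights; P_D, P_G, and the clutter term are
# placeholders for illustration.
import numpy as np

def pda_weights(innovations, S, P_D=0.9, P_G=0.99, clutter=1e-4):
    """innovations: list of residuals z_i - z_hat; S: innovation covariance."""
    Sinv = np.linalg.inv(S)
    norm = P_D / np.sqrt(np.linalg.det(2 * np.pi * S))
    L = [norm * np.exp(-0.5 * v @ Sinv @ v) for v in innovations]
    b0 = clutter * (1.0 - P_D * P_G)          # simplified "none correct" mass
    total = b0 + sum(L)
    return b0 / total, [l / total for l in L]
```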

14.
15.
This paper presents an efficient image-based approach to navigating a scene from only three wide-baseline uncalibrated images, without the explicit use of a 3D model. After automatically recovering corresponding points between each pair of images, an accurate trifocal plane is extracted from the trifocal tensor of the three images. Next, based on a small number of feature marks specified through a friendly GUI, correct dense disparity maps are obtained using our trinocular-stereo algorithm. Employing a barycentric warping scheme with the computed disparity, we can generate an arbitrary novel view within the triangle spanned by the three camera centers. Furthermore, after self-calibration of the cameras, 3D objects can be correctly augmented into the virtual environment synthesized by the tri-view morphing algorithm. Three applications of the tri-view morphing algorithm are demonstrated. The first is 4D video synthesis, which can fill the gap between a few sparsely located video cameras to synthetically generate video from a virtual moving camera; this synthetic camera can be used to view the dynamic scene from a novel view instead of the original static camera views. The second application is multiple-view morphing, where we can seamlessly fly through the scene over a 2D space constructed by more than three cameras. The last is dynamic scene synthesis using three still images, where several rigid objects may move in any orientation or direction: after segmenting the three reference frames into several layers, novel views of the dynamic scene can be generated by applying our algorithm. Finally, experiments illustrate that a series of photo-realistic virtual views can be generated to fly through a virtual environment covered by several static cameras.
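The barycentric idea can be illustrated generically: a virtual viewpoint inside the triangle of the three camera centers gets barycentric weights, which also blend the three pre-warped views. In the sketch below, the disparity-based warps are assumed to have been applied to the images already:

```python
# Barycentric weights of the virtual viewpoint, reused to blend the
# three pre-warped views. A generic construction, not the paper's code.
import numpy as np

def barycentric_weights(p, a, b, c):
    """Weights (wa, wb, wc) of 2D point p w.r.t. triangle (a, b, c)."""
    v0, v1, v2 = b - a, c - a, p - a
    d = v0[0] * v1[1] - v1[0] * v0[1]
    wb = (v2[0] * v1[1] - v1[0] * v2[1]) / d
    wc = (v0[0] * v2[1] - v2[0] * v0[1]) / d
    return 1.0 - wb - wc, wb, wc

def blend_views(warped_imgs, weights):
    """Weighted sum of three pre-warped float images of equal shape."""
    return sum(w * img for w, img in zip(weights, warped_imgs))
```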

16.
An immersive whiteboard system is presented in which users at multiple locations can communicate with each other. The system features a virtual environment with vivid avatars, stroke compression and streaming technology to deliver stroke data efficiently among meeting participants, friendly human interaction and navigation, and a whiteboard that is both virtual and physical. The whiteboard is both a physical platform for our input/output interfaces and a virtual screen for sharing common multimedia, and it is this correspondence that allows the user to physically write on the virtual whiteboard. In addition to drawing on the shared virtual board, the immersive whiteboard in our setup permits users to control the application menus, insert multimedia objects into the world, and navigate around the virtual environment. By integrating multimedia objects and avatar representations into an immersive environment, we provide users with a more transparent medium so that they feel as if they are communicating and interacting face-to-face. The whiteboard efficiently pulls all the collaboration technologies together. The goal of this collaborative system is to provide a convenient environment for participants to interact with each other and to support collaborative applications such as instant messaging, distance learning, and conferencing.
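Stroke compression for whiteboard streaming is often as simple as delta-encoding sampled pen positions; the abstract does not describe the actual codec, so the sketch below is only an illustrative guess:

```python
# Illustrative stroke delta-compression: first pen sample absolute, then
# small integer deltas per sample. Not the system's actual codec.
def compress_stroke(points):
    """points: list of (x, y) integer pen samples."""
    deltas = [(x2 - x1, y2 - y1)
              for (x1, y1), (x2, y2) in zip(points, points[1:])]
    return points[0], deltas

def decompress_stroke(first, deltas):
    pts = [first]
    for dx, dy in deltas:
        x, y = pts[-1]
        pts.append((x + dx, y + dy))
    return pts
```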

17.
18.
This paper presents an intuitive method for simulating architectural space and objects using gesture modeling. The method applies hand movements to retrieve data and generate objects under conditions that allow a larger tolerance. Communication between the gesture (actor) and the computer (listener) is conducted with a set of basic components such as gestures, rules, and interfaces, which are used as theoretical models to assist gesture-aided 3-D modeling. Gestures are analyzed by both their surface and deep structures, in the manner of syntactic structure. At the outset, people of different backgrounds were asked to model a virtual building, and several basic gesture types were identified by categorizing the experiments. The test results were used to build a theoretical model and conduct subsequent simulation. A 3-D digitizer with associated software serves as the communication interface between user and computer, and gestures are simulated by moving a stylus held in one hand.

19.
Augmented Reality (AR) composes virtual objects with real scenes in a mixed environment where human-computer interaction carries more semantic meaning. To seamlessly merge virtual objects with real scenes, correct occlusion handling is a significant challenge. We present an approach to separating occluded objects into multiple layers by utilizing depth, color, and neighborhood information. Scene depth is obtained by stereo cameras, and two local Gaussian kernels are used to represent color and spatial smoothness. These three cues are fused in a probability framework in which the occlusion information can be safely estimated. We apply our method to handle occlusions in video-based AR, where virtual objects are otherwise simply overlapped on real scenes. Experimental results show that the approach can correctly register virtual and real objects in different depth layers and provide a spatially aware interaction environment.
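A naive version of the cue fusion can be written as a per-pixel product of a depth likelihood and two Gaussian kernels over color similarity and spatial distance; the sigmas and the normalization below are illustrative, not the paper's:

```python
# Naive per-pixel fusion of depth, color, and spatial cues into a
# foreground probability. All parameters are illustrative assumptions.
import numpy as np

def foreground_probability(depth, depth_fg, color_dist, spatial_dist,
                           sigma_d=0.05, sigma_c=10.0, sigma_s=5.0):
    """All inputs are per-pixel arrays of the same shape."""
    p_depth = np.exp(-((depth - depth_fg) ** 2) / (2 * sigma_d ** 2))
    k_color = np.exp(-(color_dist ** 2) / (2 * sigma_c ** 2))
    k_space = np.exp(-(spatial_dist ** 2) / (2 * sigma_s ** 2))
    score_fg = p_depth * k_color * k_space    # support for "foreground layer"
    score_bg = 1.0 - p_depth                  # support from the depth cue alone
    return score_fg / (score_fg + score_bg + 1e-9)
```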

20.
Advanced interaction techniques in virtual environments   (cited by 4: 0 self-citations, 4 external)
Fundamental to much Virtual Reality work, in addition to high-level 3D graphical and multimedia scenes, is research on advanced methods of interaction. The "visitor" of such virtual worlds must be able to act and behave intuitively, as in everyday situations, and receive the natural, expected behaviour as feedback from the objects in the environment, so that he or she has the feeling of interacting directly with the application. In this paper we present several techniques to enrich the naturalness and enhance the user's involvement in the virtual environment. We show how the user is enabled to grab objects without using any specific, elaborate hand gesture, which is more intuitive and closer to the way humans naturally grasp. We also introduce a technique that makes it possible for the user to surround objects without any force-feedback interaction device: it allows the user to surround or "walk" the virtual hand over the object's surface and look for the best position to grab it.
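Gesture-free grabbing can be approximated with a simple proximity test: an object attaches to the virtual hand as soon as the hand enters its bounding sphere, with no explicit grasp gesture. A sketch under our own assumptions about the data structures:

```python
# Proximity-based, gesture-free grab test; the object representation and
# slack threshold are our assumptions, not the paper's design.
import numpy as np

def try_grab(hand_pos, objects, slack=0.02):
    """objects: list of (center (3,) array, radius); grabbed index or None."""
    for i, (center, radius) in enumerate(objects):
        if np.linalg.norm(hand_pos - center) <= radius + slack:
            return i
    return None
```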
