Similar Documents
20 similar documents found (search time: 281 ms).
1.
Can we make virtual characters in a scene interact with their surrounding objects through simple instructions? Is it possible to synthesize such motion plausibly with a diverse set of objects and instructions? Inspired by these questions, we present the first framework to synthesize the full-body motion of virtual human characters performing specified actions with 3D objects placed within their reach. Our system takes textual instructions specifying the objects and the associated ‘intentions’ of the virtual characters as input and outputs diverse sequences of full-body motions. This contrasts with existing works, where full-body action synthesis methods generally do not consider object interactions, and human-object interaction methods focus mainly on synthesizing hand or finger movements for grasping objects. We accomplish our objective by designing an intent-driven full-body motion generator, which uses a pair of decoupled conditional variational auto-regressors to learn the motion of the body parts in an autoregressive manner. We also optimize the 6-DoF pose of the objects so that they plausibly fit within the hands of the synthesized characters. We compare our proposed method with existing motion-synthesis methods and establish a new, stronger state of the art for the task of intent-driven motion synthesis.
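The paper's decoupled auto-regressors are not given as code here; below is a minimal, hypothetical sketch of one conditional variational auto-regressor step in PyTorch. The class name `BodyPartRegressor`, the dimensions, and the upper/lower-body split are illustrative assumptions, not the authors' architecture.

```python
# Sketch only: one autoregressive step of a conditional variational model that
# predicts the next pose of one body-part group, conditioned on the previous
# pose and an intent embedding. Names and sizes are illustrative.
import torch
import torch.nn as nn

class BodyPartRegressor(nn.Module):
    def __init__(self, pose_dim=66, intent_dim=32, latent_dim=16):
        super().__init__()
        cond_dim = pose_dim + intent_dim
        self.encoder = nn.Sequential(nn.Linear(pose_dim + cond_dim, 128), nn.ReLU(),
                                     nn.Linear(128, 2 * latent_dim))  # -> mu, logvar
        self.decoder = nn.Sequential(nn.Linear(latent_dim + cond_dim, 128), nn.ReLU(),
                                     nn.Linear(128, pose_dim))
        self.latent_dim = latent_dim

    def forward(self, next_pose, prev_pose, intent_emb):
        cond = torch.cat([prev_pose, intent_emb], dim=-1)
        mu, logvar = self.encoder(torch.cat([next_pose, cond], dim=-1)).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        return self.decoder(torch.cat([z, cond], dim=-1)), mu, logvar

    @torch.no_grad()
    def sample(self, prev_pose, intent_emb):
        # At test time, draw z from the prior and roll the pose forward one step.
        cond = torch.cat([prev_pose, intent_emb], dim=-1)
        z = torch.randn(prev_pose.shape[0], self.latent_dim)
        return self.decoder(torch.cat([z, cond], dim=-1))

# Two decoupled regressors, e.g. one for the upper and one for the lower body.
upper, lower = BodyPartRegressor(pose_dim=36), BodyPartRegressor(pose_dim=30)
prev, intent = torch.zeros(1, 36), torch.zeros(1, 32)
next_upper = upper.sample(prev, intent)   # iterate this call to roll out a sequence
```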

2.
Building a multimodal human-robot interface
When we begin to build and interact with machines or robots that either look like humans or have human functionalities and capabilities, people may well interact with their human-like machines in ways that mimic human-human communication. For example, if a robot has a face, a human might interact with it similarly to how humans interact with other creatures with faces. Specifically, a human might talk to it, gesture to it, smile at it, and so on. If a human interacts with a computer or a machine that understands spoken commands, the human might converse with the machine, expecting it to have competence in spoken language. In our research on a multimodal interface to mobile robots, we have assumed a model of communication and interaction that, in a sense, mimics how people communicate. Our interface therefore incorporates both natural language understanding and gesture recognition as communication modes. We limited the interface to these two modes to simplify integrating them in the interface and to make our research more tractable. We believe that with an integrated system, the user is less concerned with how to communicate (which interactive mode to employ for a task), and is therefore free to concentrate on the tasks and goals at hand. Because we integrate all our system's components, users can choose any combination of our interface's modalities. The onus is on our interface to integrate the input, process it, and produce the desired results.

3.
Particle-Based Simulation of Fluids
Due to our familiarity with how fluids move and interact, as well as their complexity, plausible animation of fluids remains a challenging problem. We present a particle interaction method for simulating fluids. The underlying equations of fluid motion are discretized using moving particles and their interactions. The method allows simulation and modeling of mixing fluids with different physical properties, fluid interactions with stationary objects, and fluids that exhibit significant interface breakup and fragmentation. The gridless computational method is suited for medium-scale problems since computational elements exist only where needed. The method fits well into the current user interaction paradigm and allows easy user control over the desired fluid motion.
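As a concrete illustration of the particle-interaction idea, here is a minimal sketch in the spirit of smoothed-particle methods: density is a kernel-weighted sum over neighbouring particles, and an equation of state maps density to pressure. The kernel and constants are textbook SPH choices, not necessarily the paper's exact formulation.

```python
# Sketch of particle-based density/pressure evaluation; constants illustrative.
import numpy as np

def sph_density_pressure(pos, mass=1.0, h=0.1, k=1000.0, rho0=1000.0):
    """pos: (N, 3) particle positions. Returns per-particle density and pressure."""
    d = np.linalg.norm(pos[:, None, :] - pos[None, :, :], axis=-1)  # pair distances
    w = np.where(d < h, (h**2 - d**2)**3, 0.0)                      # poly6-style kernel
    w *= 315.0 / (64.0 * np.pi * h**9)
    rho = mass * w.sum(axis=1)     # density: kernel-weighted sum over neighbours
    p = k * (rho - rho0)           # equation of state links density to pressure
    return rho, p

pos = np.random.rand(200, 3)       # 200 particles in a unit box
rho, p = sph_density_pressure(pos)
print(rho.mean(), p.mean())
```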

4.
5.
《Advanced Robotics》2013,27(8):717-742
A surface wave distributed actuation method and its proper design for safely transporting bedridden patients are explored in this paper. First, the basic principle of surface wave distributed actuation is presented, including a new kinematic feature that augments natural surface wave motion for enhanced transport efficiency of humans and elastic bodies. Kinematic modeling and analysis reveal that an object can be transferred by a simplified actuator architecture, which makes the concept amenable to hardware realization. A proof-of-concept prototype demonstrates that heavily loaded rigid objects, elastic objects and humans can be transported. Human tissue physiology is studied to establish worst-case criteria for safe and healthy interactions between the human and the support surface, depending on the duration of interaction. Static models are developed and solved using finite element methods to calculate interaction stresses for realistic, worst-case human-surface wave interaction scenarios. Based on these results, a new two-mode surface is designed to ensure safe interactions for both long-term support and short-term transport tasks.
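To make the kinematics concrete, a toy sketch of a travelling surface wave follows; a crest moving along the support surface is what carries the patient or object. The amplitude, wavelength and speed are illustrative values, and the paper's augmented wave motion is not modelled.

```python
# Toy kinematic model of a travelling wave on a support surface.
import numpy as np

def surface_height(x, t, A=0.02, wavelength=0.5, speed=0.3):
    k = 2 * np.pi / wavelength        # wavenumber
    omega = k * speed                 # angular frequency of the travelling wave
    return A * np.sin(k * x - omega * t)

x = np.linspace(0.0, 2.0, 400)        # points along a 2 m support surface
for t in (0.0, 0.5, 1.0):
    z = surface_height(x, t)
    print(f"t={t:.1f}s  crest at x={x[np.argmax(z)]:.2f} m")  # crest advances with t
```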

6.
Automatically observing and understanding human activities is one of the big challenges in computer vision research. Among the potential fields of application are areas such as robotics, human-computer interaction and medical research. In this article we present our work on unobtrusive observation and interpretation of human activities for the precise recognition of human full-body motions. The presented system requires no more than three cameras and is capable of tracking a large spectrum of motions in a wide variety of scenarios. This includes scenarios where the subject is partially occluded, where it manipulates objects as part of its activities, or where it interacts with the environment or other humans. Our system is self-training, i.e., it is capable of learning models of human motion over time. These are used both to improve the prediction of human dynamics and to provide the basis for the recognition and interpretation of observed activities. The accuracy and robustness obtained by our system is the combined result of several contributions. By taking an anthropometric human model and optimizing it towards use in a probabilistic tracking framework, we obtain a detailed biomechanical representation of human shape, posture and motion. Furthermore, we introduce a sophisticated hierarchical sampling strategy for tracking that is embedded in a probabilistic framework and outperforms state-of-the-art Bayesian methods. We then show how to track complex manipulation activities in everyday environments using a combination of learned human appearance models and implicit environment models. Finally, we discuss a locally consistent representation of human motion that we use as a basis for learning environment- and task-specific motion models. All methods presented in this article have been subject to extensive experimental evaluation on today's benchmarks and several challenging sequences, ranging from athletic exercises to ergonomic case studies to everyday manipulation tasks in a kitchen environment.
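The hierarchical sampling strategy itself is not reproduced here; the sketch below shows only the generic particle-filter step (predict, weight, resample) that such probabilistic body trackers build on, with a toy 3-DoF pose in place of a full biomechanical model.

```python
# Generic particle-filter step: predict, weight by observation likelihood, resample.
import numpy as np

rng = np.random.default_rng(0)

def pf_step(particles, weights, observation, motion_std=0.05, obs_std=0.1):
    # Predict: diffuse pose hypotheses with a simple motion model.
    particles = particles + rng.normal(0.0, motion_std, particles.shape)
    # Weight: likelihood of the observation under each hypothesis.
    err = np.linalg.norm(particles - observation, axis=1)
    weights = weights * np.exp(-0.5 * (err / obs_std) ** 2)
    weights /= weights.sum()
    # Resample: draw particles proportionally to weight (systematic resampling).
    idx = np.searchsorted(np.cumsum(weights),
                          (rng.random() + np.arange(len(weights))) / len(weights))
    return particles[idx], np.full(len(weights), 1.0 / len(weights))

particles = rng.normal(0.0, 1.0, (500, 3))      # 500 hypotheses of a 3-DoF pose
weights = np.full(500, 1.0 / 500)
particles, weights = pf_step(particles, weights, observation=np.array([0.2, -0.1, 0.4]))
```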

7.
For scenarios where the camera films targets from a relatively long distance, we propose a method for detecting human targets that exploits the periodicity of the human body's projection during walking. First, the moving target in each frame is obtained through motion segmentation; the target is then detected by computing the projection similarity of the moving target across frames. To simplify computation, the Hausdorff distance is used to measure the similarity between the moving target's projections, and to reduce storage, a codebook is used as the data structure for storing the similarity features.
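A minimal sketch of the projection-similarity test follows, assuming the silhouette's vertical projection is compared across frames with the symmetric Hausdorff distance (here via SciPy). The masks and the way the projection is turned into point sets are illustrative.

```python
# Compare silhouette projections of two frames with the Hausdorff distance.
import numpy as np
from scipy.spatial.distance import directed_hausdorff

def projection_curve(mask):
    """mask: binary (H, W) silhouette. Returns the vertical projection as 2-D points."""
    proj = mask.sum(axis=0)                         # foreground pixels per column
    return np.column_stack([np.arange(len(proj)), proj]).astype(float)

def projection_similarity(mask_a, mask_b):
    a, b = projection_curve(mask_a), projection_curve(mask_b)
    return max(directed_hausdorff(a, b)[0], directed_hausdorff(b, a)[0])

m1 = np.zeros((60, 40), dtype=np.uint8); m1[10:50, 15:25] = 1
m2 = np.zeros((60, 40), dtype=np.uint8); m2[12:52, 14:26] = 1
print(projection_similarity(m1, m2))    # small value -> similar walking-phase projections
```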

8.
Social cues facilitate engagement between interaction participants, whether they be two (or more) humans or a human and an artificial agent such as a robot. Previous work specific to human–agent/robot interaction has demonstrated the efficacy of implemented social behaviours, such as eye-gaze or facial gestures, for creating the illusion of engagement and positively impacting interaction with a human. We describe the implementation of THAMBS, The Thinking Head Attention Model and Behavioural System, which is used to model attention by controlling how a virtual agent reacts to external audio and visual stimuli within the context of an interaction with a human user. We evaluate the efficacy of THAMBS for a virtual agent mounted on a robotic platform in a controlled experimental setting, and collect both task- and behavioural-performance variables, along with self-reported ratings of engagement. Our results show that human subjects noticeably engaged more often, and in more interesting ways, with the robotic agent when THAMBS was activated, indicating that even a rudimentary display of attention by the robot elicits significantly increased attention from the human. Back-channelling had less of an effect on user behaviour. THAMBS and back-channelling did not interact, and neither had an effect on self-report ratings. Our results concerning THAMBS hold implications for the design of successful human–robot interactive behaviours.

9.
Safety, legibility and efficiency are essential for autonomous mobile robots that interact with humans. A key factor in this respect is bi-directional communication of navigation intent, which we focus on in this article with a particular view on industrial logistics applications. In the robot-to-human direction, we study how a robot can communicate its navigation intent using Spatial Augmented Reality (SAR) such that humans can intuitively understand the robot's intention and feel safe in the vicinity of robots. We conducted experiments with an autonomous forklift that projects various patterns on the shared floor space to convey its navigation intentions. We analyzed trajectories and eye-gaze patterns of humans while interacting with the autonomous forklift and carried out stimulated recall interviews (SRI) in order to identify desirable features for the projection of robot intentions. In the human-to-robot direction, we argue that robots in human-cohabited environments need human-aware task and motion planning to support safety and efficiency, ideally responding to people's motion intentions as soon as they can be inferred from human cues. Eye gaze can convey information about intentions beyond what can be inferred from the trajectory and head pose of a person. Hence, we propose eye-tracking glasses as safety equipment in industrial environments shared by humans and robots. In this work, we investigate the possibility of human-to-robot implicit intention transference solely from eye-gaze data and evaluate how the observed eye-gaze patterns of the participants relate to their navigation decisions. We again analyzed trajectories and eye-gaze patterns of humans while interacting with the autonomous forklift for clues that could reveal directional intent. Our analysis shows that people primarily gazed at the side of the robot on which they ultimately decided to pass. We discuss the implications of these results and relate them to a control approach that uses human gaze for early obstacle avoidance.
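The reported finding suggests a very simple decoder from gaze to navigation intent; the toy sketch below (a majority vote over which side of the robot fixations fall on) is our illustration, not the authors' analysis pipeline.

```python
# Toy gaze-to-intent decoder: predict the passing side from fixation positions.
import numpy as np

def predict_pass_side(gaze_points, robot_x):
    """gaze_points: (N, 2) fixations in floor coordinates; robot centred at x = robot_x."""
    left = np.sum(gaze_points[:, 0] < robot_x)
    right = len(gaze_points) - left
    return "left" if left > right else "right"

gaze = np.array([[0.8, 2.0], [0.7, 2.4], [1.4, 2.2], [0.6, 2.9]])
print(predict_pass_side(gaze, robot_x=1.0))   # most fixations left -> pass on the left
```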

10.
《Advanced Robotics》2013,27(9):983-999
Joint attention is one of the most important cognitive functions for the emergence of communication not only between humans, but also between humans and robots. In previous work, we have demonstrated how a robot can acquire primary joint attention behavior (gaze following) without external evaluation. However, this method needs the human to tell the robot when to shift its gaze. This paper presents a method that does not need such a constraint by introducing an attention selector based on a measure consisting of saliencies of object features and motion cues. In order to realize natural interaction, a self-organizing map for real-time face pattern separation and contingency learning for gaze following without external evaluation are utilized. The attention selector controls the robot gaze to switch often from the human face to an object and vice versa, and pairs of a face pattern and a gaze motor command are input to the contingency learning. The motion cues are expected to reduce the number of incorrect training data pairs due to the asynchronous interaction that affects the convergence of the contingency learning. The experimental result shows that gaze shift utilizing motion cues enables a robot to synchronize its own motion with human motion and to learn joint attention efficiently in about 20 min.
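A minimal sketch of an attention selector in the spirit described follows: each candidate target is scored by combining static feature saliency with a motion cue, and gaze switches to the highest-scoring target. The weighting and scores are illustrative placeholders.

```python
# Toy attention selector: combine feature saliency and motion cues per target.
import numpy as np

def select_attention(feature_saliency, motion_cue, w_motion=0.6):
    """Both inputs: (K,) scores for K candidate targets (faces, objects)."""
    score = (1.0 - w_motion) * feature_saliency + w_motion * motion_cue
    return int(np.argmax(score))

targets = ["human face", "red cup", "toy car"]
feat = np.array([0.7, 0.5, 0.3])     # static feature saliency (colour, size, ...)
motion = np.array([0.1, 0.0, 0.9])   # the toy car is currently being shaken
print("attend to:", targets[select_attention(feat, motion)])
```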

11.
Interpretation of images and videos containing humans interacting with different objects is a daunting task. It involves understanding the scene/event, analyzing human movements, recognizing manipulable objects, and observing the effect of the human movements on those objects. While each of these perceptual tasks can be conducted independently, recognition rates improve when interactions between them are considered. Motivated by psychological studies of human perception, we present a Bayesian approach that integrates the various perceptual tasks involved in understanding human-object interactions. Previous approaches to object and action recognition rely on static shape/appearance feature matching and motion analysis, respectively. Our approach goes beyond these traditional approaches and applies spatial and functional constraints to each of the perceptual elements for coherent semantic interpretation. Such constraints allow us to recognize objects and actions when the appearances are not discriminative enough. We also demonstrate the use of such constraints in recognizing actions from static images without using any motion information.
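The following toy sketch illustrates the integration idea: appearance-based object evidence and motion-based action evidence are fused through a functional-compatibility prior, so an ambiguous appearance can be resolved by the action it affords. All probabilities are made-up illustrative numbers.

```python
# Toy Bayesian fusion of object and action evidence via a compatibility prior.
import numpy as np

objects = ["cup", "spray bottle"]
actions = ["drinking", "spraying"]
p_obj = np.array([0.55, 0.45])              # appearance-only object posterior
p_act = np.array([0.40, 0.60])              # motion-only action posterior
# Functional compatibility prior P(action | object):
compat = np.array([[0.9, 0.1],              # cup -> drinking
                   [0.2, 0.8]])             # spray bottle -> spraying

joint = p_obj[:, None] * p_act[None, :] * compat
joint /= joint.sum()
o, a = np.unravel_index(np.argmax(joint), joint.shape)
print(f"jointly most likely: {objects[o]} + {actions[a]}")   # coherent interpretation
```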

12.
In projection-based Virtual Reality (VR) systems, typically only one head-tracked user views stereo images rendered from the correct view position. For other users, who are presented with a distorted image that moves with the first user's head motion, it is difficult to correctly view and interact with 3D objects in the virtual environment. In close-range VR systems, such as the Virtual Workbench, distortion effects are especially large because objects are within close range and users are relatively far apart. On these systems, multi-user collaboration proves to be difficult. In this paper, we analyze the problem and describe a novel, easy-to-implement method to prevent and reduce image distortion and its negative effects on close-range interaction task performance. First, our method combines a shared camera model and view-distortion compensation. It minimizes the overall distortion for each user, while important user-personal objects such as interaction cursors, rays and controls remain distortion-free. Second, our method retains co-location for interaction techniques to make interaction more consistent. We performed a user experiment on our Virtual Workbench to analyze user performance under distorted view conditions with and without our method. Our findings demonstrate the negative impact of view distortion on task performance and the positive effect our method introduces. This indicates that our method can enhance the multi-user collaboration experience on close-range, projection-based VR systems.

13.
In this paper, we propose a method for detecting humans and vehicles in imagery taken from a UAV. This is a challenging problem due to the limited number of pixels on target, which makes it more difficult to distinguish objects from background clutter and results in a much larger search space. We propose a method for constraining the search based on a number of geometric constraints obtained from the metadata. Specifically, we obtain the orientation of the ground-plane normal, the orientation of shadows cast by out-of-plane objects in the scene, and the relationship between object heights and the size of their corresponding shadows. We use this information in a geometry-based shadow and ground-plane-normal blob detector, which provides an initial estimate of the locations of shadow-casting out-of-plane (SCOOP) objects in the scene. These SCOOP candidate locations are then classified as either human or clutter using a combination of wavelet features and a Support Vector Machine. To detect vehicles, we similarly find potential vehicle candidates by combining SCOOP and inverted-SCOOP candidates and then classify them using wavelet features and an SVM. Our method works on a single frame and, unlike motion-detection-based methods, bypasses the entire pipeline of registration, motion detection, and tracking. This allows for the detection of stationary and slowly moving humans and vehicles while avoiding a search across the entire image, enabling accurate and fast localization. We show strong results on sequences from the VIVID and CLIF datasets and provide a comparative analysis.
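A simplified sketch of the classification stage follows, pairing 2-D wavelet coefficients (via PyWavelets) with an SVM (via scikit-learn). The chip size, wavelet choice and the synthetic training data are placeholders; the paper's actual wavelet features are not specified here.

```python
# Wavelet features of a candidate image chip feeding an SVM classifier.
import numpy as np
import pywt
from sklearn.svm import SVC

def wavelet_features(chip):
    """chip: (32, 32) grayscale patch around a SCOOP candidate."""
    coeffs = pywt.wavedec2(chip, "haar", level=2)
    arr, _ = pywt.coeffs_to_array(coeffs)          # flatten the coefficient pyramid
    return arr.ravel()

rng = np.random.default_rng(1)
# Synthetic stand-ins: 'human' chips have a bright vertical bar, 'clutter' is noise.
humans = [np.pad(np.ones((24, 6)), ((4, 4), (13, 13))) + 0.1 * rng.standard_normal((32, 32))
          for _ in range(20)]
clutter = [rng.standard_normal((32, 32)) for _ in range(20)]
X = np.array([wavelet_features(c) for c in humans + clutter])
y = np.array([1] * 20 + [0] * 20)
clf = SVC(kernel="rbf").fit(X, y)
print("predicted:", clf.predict(X[:2]))            # sanity check on training chips
```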

14.
Smart spaces use human–computer interfaces (HCIs) to improve how humans experience their surroundings. However, HCIs are sometimes less user-friendly and intuitive than their traditional counterparts. Our research aims to use everyday objects to create communication channels between spaces and people, which can then strengthen interpersonal emotional relationships through a natural and unobtrusive interface. This study explores how using simple instruments such as a whiskey glass, a table, and an MP3 player to interact with a dwelling improves user experience in an HCI-equipped smart space. We implemented a real smart space, the Time Home Pub, which not only adjusts the environmental atmosphere (such as background lighting, music, and photos) in response to human activities but also encourages a better connection between humans, their memories, and physical space. Time Home Pub was exhibited at the Taipei Fine Arts Museum in 2007 under the theme Architecture of Tomorrow. Preliminary evaluations by visitors demonstrate the feasibility of the system and how a smart space can change and improve human experiences through the use of new technology and architectural design elements.

15.
Robots that interact with humans in household environments must handle multiple real-time tasks simultaneously, such as carrying objects, avoiding collisions and conversing with humans. This article presents a design framework for the control and recognition processes that meet these requirements, taking into account stochastic human behaviour. The proposed design method first introduces a Petri net for the synchronisation of multiple tasks. The Petri net formulation is converted to Markov decision processes and handled in an optimal control framework. Three tasks (safety confirmation, object conveyance and conversation) interact and are expressed by the Petri net. Using the proposed framework, tasks that would normally be designed by integrating many if–then rules can be designed systematically, in a state-estimation and optimisation framework, from the viewpoint of shortest-time optimal control. The proposed arbitration method was verified by simulations and by experiments using RI-MAN, which was developed for interactive tasks with humans.
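Once the Petri net has been converted to a Markov decision process, shortest-time arbitration reduces to standard dynamic programming; the sketch below runs value iteration on a toy MDP with per-step costs, as an illustration of that backbone rather than the RI-MAN controller itself.

```python
# Value iteration on a toy MDP; states, actions and transitions are placeholders.
import numpy as np

n_states, n_actions, gamma = 4, 2, 0.95
rng = np.random.default_rng(2)
P = rng.dirichlet(np.ones(n_states), size=(n_actions, n_states))  # P[a, s, s']
R = -np.ones((n_actions, n_states))          # cost of 1 per step -> shortest time
R[:, 3] = 0.0                                # state 3: all tasks finished

V = np.zeros(n_states)
for _ in range(500):
    Q = R + gamma * P @ V                    # Q[a, s]
    V = Q.max(axis=0)
policy = Q.argmax(axis=0)                    # which task to serve in each state
print("value:", np.round(V, 2), "policy:", policy)
```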

16.
In the MPEG-4 video coding standard, each frame of a video sequence is represented by video object planes in order to support content-based interactive functionality. Generating video object planes requires effective segmentation of the moving objects in the video sequence and tracking of how they change over time. Among video segmentation methods, interactive segmentation of video objects can meet the required efficiency and quality targets, so we propose a method that combines interactive segmentation with automatic tracking to segment semantic video objects. In the initial segmentation, the video object contour is extracted by combining user interaction with the morphological watershed segmentation algorithm, and an improved contour-tracking method effectively increases the accuracy of the video object contour. For tracking in subsequent frames, a six-parameter affine transformation tracks changes in the moving object's contour, using the motion vector estimated by translation as the initial value for computing the six affine parameters. Experimental results show that the method can effectively segment and track moving video objects.
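As an illustration of the tracking stage, the sketch below fits a six-parameter (2×3) affine transform to contour-point correspondences by least squares; it is a closed-form stand-in for the paper's translation-initialised parameter estimation.

```python
# Fit a six-parameter affine transform to contour-point correspondences.
import numpy as np

def fit_affine(src, dst):
    """src, dst: (N, 2) corresponding contour points. Returns a 2x3 affine matrix."""
    A = np.hstack([src, np.ones((len(src), 1))])       # rows [x, y, 1]
    params, *_ = np.linalg.lstsq(A, dst, rcond=None)   # solves A @ params ~= dst
    return params.T                                    # 2x3: [[a, b, tx], [c, d, ty]]

theta = np.deg2rad(5.0)
true = np.array([[np.cos(theta), -np.sin(theta), 2.0],
                 [np.sin(theta),  np.cos(theta), 1.0]])
src = np.random.rand(50, 2) * 100
dst = src @ true[:, :2].T + true[:, 2]
print(np.round(fit_affine(src, dst), 3))               # recovers the 6 parameters
```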

17.
In engineering practice, assembly simulation is a useful way to test a planned assembly process and uncover assembly issues. However, it is usually carried out through time-consuming human–computer interaction, which entails a great deal of laborious manual work. Assembly simulation can exhibit the design intent in a virtual environment through the interactions between parts. Unfortunately, models built with current constraint-based methods cannot provide sufficient interaction information to support assembly simulation, and the needed information must be added manually by designers. To solve this problem, this paper introduces the concept of the interaction feature pair (IFP) and presents an automatic assembly simulation method based on IFP. The proposed IFP gives a part the knowledge of which part it will interact with and how. Based on IFP, automatic assembly simulation is carried out in two steps: first, a graph-based method is presented to generate the interaction sequence, which specifies when and which features should be mated during assembly; then, randomized motion planning is employed in the constructed C-space to find a collision-free path for each part, and the planning results in C-space are translated into the movements of parts in the simulation environment. With these two steps, the parts interact with one another according to the interaction sequence, automatically simulating the assembly of the product. Finally, an implementation example is presented and the results show the effectiveness of the proposed method.
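The first step, generating an interaction sequence from a graph of precedence constraints between interaction feature pairs, can be illustrated with a topological sort over a toy assembly graph; the part and feature names below are invented for the example.

```python
# Topological sort of precedence constraints -> one feasible mating sequence.
from graphlib import TopologicalSorter

# Edges read as "key must be mated before its listed successors".
precedence = {
    "shaft_into_housing": ["bearing_onto_shaft"],
    "bearing_onto_shaft": ["cover_onto_housing"],
    "seal_into_cover": ["cover_onto_housing"],
}
ts = TopologicalSorter()
for before, afters in precedence.items():
    for after in afters:
        ts.add(after, before)          # 'after' depends on 'before'
print(list(ts.static_order()))         # one valid interaction sequence
```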

18.
We introduce a weakly supervised approach for learning human actions modeled as interactions between humans and objects. Our approach is human-centric: We first localize a human in the image and then determine the object relevant for the action and its spatial relation with the human. The model is learned automatically from a set of still images annotated only with the action label. Our approach relies on a human detector to initialize the model learning. For robustness to various degrees of visibility, we build a detector that learns to combine a set of existing part detectors. Starting from humans detected in a set of images depicting the action, our approach determines the action object and its spatial relation to the human. Its final output is a probabilistic model of the human-object interaction, i.e., the spatial relation between the human and the object. We present an extensive experimental evaluation on the sports action data set from [1], the PASCAL Action 2010 data set [2], and a new human-object interaction data set.

19.
In avatar-mediated telepresence systems, a similar environment is assumed for the involved spaces, so that the avatar in the remote space can imitate, with the proper semantic intention, the user's motion performed in the local space. For example, the user touching the desk should be reproduced by the avatar in the remote space to correctly convey the intended meaning. It is unlikely, however, that the two physical spaces involved are exactly the same in terms of room size or the locations of the placed objects. Therefore, naively mapping the user's joint motion to the avatar will not create semantically correct avatar motion in relation to the remote environment. Existing studies have addressed the problem of retargeting human motions to an avatar for telepresence applications. Few studies, however, have focused on retargeting continuous full-body motions such as locomotion and object-interaction motions in a unified manner. In this paper, we propose a novel motion adaptation method that generates the full-body motions of a human-like avatar on the fly in the remote space. The proposed method handles locomotion and object-interaction motions, as well as smooth transitions between them according to given user actions, under the condition of a bijective environment mapping between morphologically similar spaces. Our experiments show the effectiveness of the proposed method in generating plausible and semantically correct full-body avatar motions in room-scale spaces.
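A toy sketch of the assumed bijective environment mapping follows: object-interaction targets are mapped through a per-object correspondence, while free-space locomotion positions are mapped through normalised room coordinates. Room sizes and the `object_map` dictionary are hypothetical.

```python
# Map a local-space position to the remote space under a bijective mapping.
import numpy as np

local_room = np.array([4.0, 3.0])       # local room extent (m)
remote_room = np.array([5.0, 3.5])      # remote room extent (m)
# Paired object anchors: local position -> remote position.
object_map = {"desk": (np.array([1.0, 2.0]), np.array([1.4, 2.3]))}

def map_position(p, touched=None):
    if touched in object_map:            # object interaction: snap to the paired object
        local_anchor, remote_anchor = object_map[touched]
        return remote_anchor + (p - local_anchor)
    return p / local_room * remote_room  # locomotion: preserve normalised coordinates

print(map_position(np.array([2.0, 1.5])))              # mid-room stays mid-room
print(map_position(np.array([1.1, 2.0]), "desk"))      # touch offset preserved at desk
```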

20.
Knowledge-intensive service processes should be managed in a human-oriented way, since the humans who undertake the complex intellectual operations in these processes are their most valuable resources. Human work is fundamentally collaborative and dynamic: humans interact and communicate with each other to accomplish their jobs in the process. To help people work together, a clear representation of the process should be provided so that they understand whom they should interact with and which activities need to be performed. To this end, Human Interaction Management (HIM), which was proposed to comprehensively support human work, adopts a role-based approach to process modeling. This approach, however, tends to hide the elements of interactions, even though collaborative interaction is one of the most fundamental aspects of human work. To remedy this problem, a state-driven modeling approach to human interactions was presented. It clearly visualizes the interactions so that humans can be guided through them. However, people do not simply follow a previously defined sequence of activities; they continuously work out how to proceed according to the state of affairs they encounter over the life of the work. To fully support the dynamic nature of human work, human interactions should be flexibly managed. Therefore, this paper presents a framework for the flexible management of human interactions. The framework makes it possible to manage interactions flexibly in a decentralized way, allowing participants to dynamically change an ongoing interaction through continuous negotiation over how to achieve its ultimate goal. It provides a basis for realizing decentralized management of human interactions in knowledge-intensive service processes.

