Similar Documents
20 similar documents found (search time: 46 ms)
1.
Although avatars may resemble communicative interface agents, they have for the most part not profited from recent research into autonomous embodied conversational systems. In particular, even though avatars function within conversational environments (for example, chat or games), and even though they often resemble humans (with a head, hands, and a body), they are incapable of representing the kinds of knowledge that humans have about how to use the body during communication. Humans, however, do make extensive use of the visual channel for interaction management, where many subtle and even involuntary cues are read from stance, gaze, and gesture. We argue that the modeling and animation of such fundamental behavior is crucial for the credibility and effectiveness of virtual interaction in chat. By treating the avatar as a communicative agent, we propose a method, drawing on work in conversation and discourse theory, to automate the animation of important communicative behavior. BodyChat is a system that allows users to communicate via text while their avatars automatically animate attention, salutations, turn taking, back-channel feedback, and facial expression. An evaluation shows that users found an avatar with autonomous conversational behaviors to be more natural than avatars whose behaviors they controlled, and to increase the perceived expressiveness of the conversation. Interestingly, users also felt that avatars with autonomous communicative behaviors provided a greater sense of user control.
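The mapping from conversational state to automatic nonverbal behavior that such a system performs can be pictured with a small rule-based sketch. The event names, behaviors, and rules below are illustrative assumptions, not BodyChat's actual design:

```python
# Hypothetical sketch: text chat and conversational state drive automatic
# nonverbal avatar behavior. Event and behavior names are assumptions.
def avatar_behaviors(event, text=""):
    """Map a conversational event to a list of nonverbal behaviors."""
    behaviors = []
    if event == "partner_approaches":
        behaviors += ["glance_at_partner", "smile"]              # show attention / availability
    elif event == "start_conversation":
        behaviors += ["face_partner", "wave", "raise_eyebrows"]  # salutation
    elif event == "partner_speaking":
        behaviors += ["gaze_at_partner", "nod_occasionally"]     # back-channel feedback
    elif event == "about_to_speak":
        behaviors += ["look_away_briefly", "gesture_hands"]      # take the turn
    elif event == "finished_utterance":
        behaviors += ["gaze_at_partner"]                         # give the turn
    if "!" in text:
        behaviors.append("emphasize_with_eyebrows")              # facial expression from text
    return behaviors

print(avatar_behaviors("partner_speaking"))
print(avatar_behaviors("finished_utterance", "See you later!"))
```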

2.
Artificial Intelligence, 2007, 171(8–9): 568–585
Head pose and gesture offer several conversational grounding cues and are used extensively in face-to-face interaction among people. To accurately recognize visual feedback, humans often use contextual knowledge from previous and current events to anticipate when feedback is most likely to occur. In this paper we describe how contextual information can be used to predict visual feedback and improve recognition of head gestures in human–computer interfaces. Lexical, prosodic, timing, and gesture features can be used to predict a user's visual feedback during conversational dialog with a robotic or virtual agent. In non-conversational interfaces, context features based on user–interface system events can improve detection of head gestures for dialog box confirmation or document browsing. Our user study with prototype gesture-based components indicates quantitative and qualitative benefits of gesture-based confirmation over conventional alternatives. Using a discriminative approach to contextual prediction and multi-modal integration, performance of head gesture detection was improved with context features even when the topic of the test set was significantly different from that of the training set.
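As a rough illustration of the discriminative, context-aware detection described here, the sketch below fuses a per-frame vision score with dialog-context features in a single classifier. The feature names and the synthetic data are assumptions, not the paper's actual features or code:

```python
# Illustrative sketch (not the paper's implementation): late fusion of a
# vision-based head-nod detector score with dialog-context features in a
# discriminative classifier. Feature names and data are hypothetical.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

def make_frame_features(n):
    vision_score = rng.normal(size=n)            # margin of a per-frame nod detector
    lexical_cue  = rng.integers(0, 2, size=n)    # e.g., agent just asked a yes/no question
    pause_len    = rng.exponential(0.5, size=n)  # seconds since agent finished speaking
    X = np.column_stack([vision_score, lexical_cue, pause_len])
    # Synthetic labels: nods are more likely after a question, during a pause,
    # and when the vision detector already fires.
    y = (0.8 * vision_score + 1.2 * lexical_cue - 0.5 * pause_len
         + rng.normal(scale=0.5, size=n)) > 0.5
    return X, y.astype(int)

X_train, y_train = make_frame_features(2000)
X_test, y_test = make_frame_features(500)

clf = SVC(kernel="rbf", C=1.0).fit(X_train, y_train)
print("frame-level accuracy with context features:", clf.score(X_test, y_test))
```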

3.
GeeAir: a universal multimodal remote control device for home appliances (cited by 2: 2 self-citations, 0 by others)
In this paper, we present a handheld device called GeeAir for remotely controlling home appliances via a mixed modality of speech, gesture, joystick, button, and light. This solution is superior to existing universal remote controllers in that it can be used naturally by users with physical or vision impairments. By combining diverse interaction techniques in a single device, GeeAir enables different user groups to control home appliances effectively, satisfying even the unmet needs of physically impaired and vision-impaired users while maintaining high usability and reliability. The experiments demonstrate that the GeeAir prototype achieves strong performance by standardizing a small set of verbal and gesture commands and by introducing feedback mechanisms.
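One way to picture such a device is as a dispatcher that maps standardized commands from any modality onto appliance actions, with audible and tactile feedback so the result does not depend on vision. The command table and feedback strings below are purely hypothetical, not GeeAir's actual command set:

```python
# Hypothetical sketch of a multimodal command dispatcher for home appliances.
COMMAND_TABLE = {
    ("speech", "turn on tv"): ("tv", "power_on"),
    ("speech", "volume up"):  ("tv", "volume_up"),
    ("gesture", "swipe_up"):  ("tv", "volume_up"),
    ("gesture", "shake"):     ("lamp", "toggle"),
    ("button", "red"):        ("air_conditioner", "power_toggle"),
    ("joystick", "up"):       ("tv", "channel_up"),
}

def handle_input(modality, token):
    action = COMMAND_TABLE.get((modality, token))
    if action is None:
        return "[vibration] error pulse  [speech] command not recognized"
    appliance, command = action
    # In a real device this would transmit an IR/RF code to the appliance.
    return f"send {command} to {appliance}  [speech] {appliance} {command.replace('_', ' ')}"

print(handle_input("speech", "turn on tv"))
print(handle_input("gesture", "swipe_up"))
print(handle_input("gesture", "wave"))   # unknown gesture -> audible/tactile error
```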

4.
In human–human communication we can adapt to or learn new gestures and new users using intelligence and contextual information. To achieve natural gesture-based interaction between humans and robots, the system should likewise be adaptable to new users, gestures, and robot behaviors. This paper presents an adaptive visual gesture recognition method for human–robot interaction using a knowledge-based software platform. The system is capable of recognizing users, static gestures composed of face and hand poses, and dynamic gestures of the face in motion. It learns new users and poses using a multi-cluster approach, and combines computer vision and knowledge-based approaches in order to adapt to new users, gestures, and robot behaviors. In the proposed method, a frame-based knowledge model is defined for person-centric gesture interpretation and human–robot interaction. It is implemented using the frame-based Software Platform for Agent and Knowledge Management (SPAK). The effectiveness of this method has been demonstrated by an experimental human–robot interaction system using the humanoid robot 'Robovie'.
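A frame-based knowledge model of this kind can be sketched as frames with named slots plus rules that consult them. The frame names, slots, and rule below are assumptions for illustration, not SPAK's actual representation:

```python
# Hypothetical sketch of a frame-based model for person-centric gesture
# interpretation, loosely in the spirit of the SPAK-based approach above.
from dataclasses import dataclass, field

@dataclass
class Frame:
    name: str
    slots: dict = field(default_factory=dict)

# One frame per known user, one per gesture, one per robot behavior.
user_frame = Frame("user:alice", {"face_id": "alice", "preferred_distance_m": 1.2})
gesture_frame = Frame("gesture:wave", {"hand_pose": "open_palm", "motion": "oscillating"})
behavior_frame = Frame("behavior:greet", {"robot_action": "wave_back_and_say_hello"})

def interpret(face_id: str, hand_pose: str, motion: str) -> str:
    """Person-centric interpretation: the same observation can map to
    different robot behaviors depending on which user frame is active."""
    if (face_id == user_frame.slots["face_id"]
            and hand_pose == gesture_frame.slots["hand_pose"]
            and motion == gesture_frame.slots["motion"]):
        return behavior_frame.slots["robot_action"]
    return "no_action"

print(interpret("alice", "open_palm", "oscillating"))  # -> wave_back_and_say_hello
```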

5.
Until now, research on the arrangement of verbal and non-verbal information in multimedia presentations has not considered the multimodal behavior of animated agents. In this paper, we present an experiment exploring the effects of different types of speech–gesture cooperation in agents’ behavior: redundancy (gestures duplicate pieces of information conveyed by speech), complementarity (information is distributed across speech and gestures), and a control condition in which gesture does not convey semantic information. Using a Latin-square design, these strategies were attributed to agents of different appearances to present different objects. Fifty-four male and fifty-four female users attended three short presentations performed by the agents, recalled the content of the presentations, and evaluated both the presentations and the agents. Although speech–gesture cooperation was not consciously perceived, it proved to influence users’ recall performance and subjective evaluations: redundancy increased verbal information recall, ratings of the quality of explanation, and the perceived expressiveness of the agents. Redundancy also resulted in higher likeability scores for the agents and a more positive perception of their personality. Users’ gender had no influence on this set of results.

6.
In this study, we highlight the theoretical underpinnings of human impression management tactics and link them to current research in embodied conversational agents. Specifically, we incorporated impression management behaviors into an embodied conversational agent in order to show that different influence strategies affect user perceptions, and how those perceptions might be moderated by user gender. We programmed the agent to use two human impression management techniques (ingratiation and self-promotion) and had the agent interact with 88 users. After the interaction, users reported their perceptions of the system’s power, trustworthiness, expertise, and attractiveness. The impression management techniques altered users’ perceptions and these perceptions were moderated by gender differences.

7.

Most of today's virtual environments are populated with some kind of autonomous life-like agents. Such agents follow a preprogrammed sequence of behaviors that excludes the user as a participating entity in the virtual society. In order to make inhabited virtual reality an attractive place for information exchange and social interaction, we need to equip the autonomous agents with some perception and interpretation skills. In this paper we present one such skill: human action recognition. In contrast to human–computer interfaces that focus on speech or hand gestures, we propose a full-body integration of the user. We present a model of human actions along with a real-time recognition system. To cover the bilateral aspect of human–computer interfaces, we also discuss some action response issues. In particular, we describe a motion management library that solves animation continuity and mixing problems. Finally, we illustrate our system with two examples and discuss what we have learned.

8.
Building a collaborative trusting relationship with users is crucial in a wide range of applications, such as advice-giving or financial transactions, and some minimal degree of cooperativeness is required in all applications to even initiate and maintain an interaction with a user. Despite the importance of this aspect of human–human relationships, few intelligent systems have tried to build user models of trust, credibility, or other similar interpersonal variables, or to influence these variables during interaction with users. Humans use a variety of kinds of social language, including small talk, to establish collaborative trusting interpersonal relationships. We argue that such strategies can also be used by intelligent agents, and that embodied conversational agents are ideally suited for this task given the myriad multimodal cues available to them for managing conversation. In this article we describe a model of the relationship between social language and interpersonal relationships, a new kind of discourse planner that is capable of generating social language to achieve interpersonal goals, and an actual implementation in an embodied conversational agent. We discuss an evaluation of our system in which the use of social language was demonstrated to have a significant effect on users’ perceptions of the agent’s knowledgeableness and ability to engage users, and on their trust, credibility, and how well they felt the system knew them, for users manifesting particular personality traits. This revised version was published online in July 2005 with corrections to the author name Bickmore.

9.
Off-the-shelf conversational agents are permeating people’s everyday lives. In these artificial intelligence devices, trust plays a key role in users’ initial adoption and successful utilization. Factors enhancing trust toward conversational agents include appearances, voice features, and communication styles. Synthesizing such work will be useful in designing evidence-based, trustworthy conversational agents appropriate for various contexts. We conducted a systematic review of the experimental studies that investigated the effect of conversational agents’ and users’ characteristics on trust. From a full-text review of 29 articles, we identified five agent design themes affecting trust toward conversational agents: social intelligence of the agent, voice characteristics and communication style, look of the agent, non-verbal communication, and performance quality. We also found that participants’ demographics, personality, or use context moderate the effect of these themes. We discuss implications for designing trustworthy conversational agents and responsibilities around stereotypes and social norm building through agent design.

10.
An accurate estimation of sentence units (SUs) in spontaneous speech is important for (1) helping listeners to better understand speech content and (2) supporting other natural language processing tasks that require sentence information. There has been much research on automatic SU detection; however, most previous studies have only used lexical and prosodic cues, and have not used nonverbal cues, e.g., gesture. Gestures play an important role in human conversations, including providing semantic content, expressing emotional status, and regulating conversational structure. Given the close relationship between gestures and speech, gestures may provide additional contributions to automatic SU detection. In this paper, we have investigated the use of gesture cues for enhancing SU detection. In particular, we have focused on: (1) collecting multimodal data resources involving gestures and SU events in human conversations, (2) analyzing the collected data sets to enrich our knowledge about the co-occurrence of gestures and SUs, and (3) building statistical models for detecting SUs using speech and gestural cues. Our data analyses suggest that some gesture patterns influence a word boundary’s probability of being an SU. On the basis of the data analyses, a set of novel gestural features was proposed for SU detection. A combination of speech and gestural features was found to provide more accurate SU predictions than using only speech features in discriminative models. Findings in this paper support the view that human conversations are processes involving multimodal cues, and so they are more effectively modeled using information from both verbal and nonverbal channels.
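The final point, that a discriminative model over both speech and gestural features predicts SU boundaries better than speech features alone, can be illustrated with a toy comparison. The features and synthetic data below are assumptions, not the corpus or features used in the paper:

```python
# Illustrative sketch: adding gestural cues to lexical/prosodic features for
# sentence-unit (SU) boundary detection at word boundaries. Data is synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
n = 4000

pause_dur   = rng.exponential(0.3, n)   # prosodic: pause after the word (s)
pitch_reset = rng.normal(0, 1, n)       # prosodic: F0 reset across the boundary
is_filler   = rng.integers(0, 2, n)     # lexical: word is a filler ("um", "uh")
hand_rest   = rng.integers(0, 2, n)     # gestural: hands return to rest position
gaze_shift  = rng.integers(0, 2, n)     # gestural: speaker shifts gaze to listener

y = (1.5 * pause_dur + 0.5 * pitch_reset + 0.8 * hand_rest + 0.6 * gaze_shift
     - 0.7 * is_filler + rng.normal(0, 0.8, n)) > 1.0

X_speech = np.column_stack([pause_dur, pitch_reset, is_filler])
X_all    = np.column_stack([X_speech, hand_rest, gaze_shift])

for name, X in [("speech only", X_speech), ("speech + gesture", X_all)]:
    Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)
    acc = LogisticRegression(max_iter=1000).fit(Xtr, ytr).score(Xte, yte)
    print(f"{name}: boundary accuracy = {acc:.3f}")
```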

11.
Exertion games (exergames) pose interesting challenges in terms of user interaction techniques. Players are commonly unable to use traditional input devices such as mouse and keyboard, given the body movement requirements of this type of videogame. In this work we propose a hand gesture interface to direct actions in a target-shooting exertion game that is played while exercising on an ergo-bike. A vision-based hand gesture interface for interacting with objects in a 3D videogame is designed and implemented. The system is capable of issuing game commands to any computer game that normally responds to mouse and keyboard, without modifying the underlying source code of the game. The vision system combines bag-of-features and a Support Vector Machine (SVM) to achieve user-independent, real-time hand gesture recognition. In particular, a Finite State Machine (FSM) is used to build the grammar that generates gesture commands for the game. We carried out a user study to gather feedback from participants, and our preliminary results show a high level of interest from users in this multimedia system, which implements a natural way of interacting. Despite some concerns in terms of comfort, users had a positive experience with our exertion game and expressed their intention to use a system like this in their daily lives.
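The recognition pipeline named here (bag-of-features descriptors, an SVM classifier, and an FSM grammar over recognized gestures) can be sketched roughly as follows; the descriptors are faked with random vectors, and the gesture vocabulary and FSM rules are assumptions, not the system's actual ones:

```python
# Minimal sketch, not the authors' implementation: bag-of-features + SVM
# hand-gesture classification and a small FSM that turns recognized gestures
# into game commands. Local descriptors would normally come from e.g. SIFT.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

rng = np.random.default_rng(2)
N_WORDS = 32

def fake_descriptors(gesture_id, n=40):
    # Hypothetical stand-in for the local descriptors of one hand image.
    return rng.normal(loc=gesture_id, scale=1.0, size=(n, 16))

# Build a visual vocabulary and bag-of-words histograms.
train_imgs = [(g, fake_descriptors(g)) for g in (0, 1, 2) for _ in range(30)]
codebook = KMeans(n_clusters=N_WORDS, n_init=4, random_state=0).fit(
    np.vstack([d for _, d in train_imgs]))

def bow_histogram(desc):
    words = codebook.predict(desc)
    return np.bincount(words, minlength=N_WORDS) / len(words)

X = np.array([bow_histogram(d) for _, d in train_imgs])
y = np.array([g for g, _ in train_imgs])
clf = SVC(kernel="rbf").fit(X, y)

# FSM grammar: a "fist" arms the weapon, a following "point" fires.
GESTURES = {0: "fist", 1: "point", 2: "open_palm"}
state = "idle"
for gesture_id in (2, 0, 1):   # simulated gesture stream
    label = GESTURES[clf.predict([bow_histogram(fake_descriptors(gesture_id))])[0]]
    if state == "idle" and label == "fist":
        state = "armed"
    elif state == "armed" and label == "point":
        print("issue game command: SHOOT")
        state = "idle"
```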

12.
In designing robot systems for human interaction, designers draw on aspects of human behavior that help them achieve specific design goals. For instance, the designer of an educational robot system may use speech, gaze, and gesture cues in a way that enhances its students’ learning. But what set of behaviors improves such outcomes? How might designers of such a robot system determine this set of behaviors? Conventional approaches to answering such questions primarily involve designers carrying out a series of experiments in which they manipulate a small number of design variables and measure the effects of these manipulations on specific interaction outcomes. However, these methods become infeasible when the design space is large and when the designer needs to understand the extent to which each variable contributes to achieving the desired effects. In this paper, we present a novel multivariate method for evaluating which behaviors of interactive robot systems improve interaction outcomes. We illustrate the use of this method in a case study in which we explore how different types of narrative gestures of a storytelling robot improve its users’ recall of the robot’s story, their ability to retell the robot’s story, their perceptions of and rapport with the robot, and their overall engagement in the experiment.
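As a purely illustrative stand-in for a multivariate evaluation (not the specific method the paper proposes), one can regress several interaction outcomes on several behavior variables at once and compare coefficient magnitudes. Variables and data below are hypothetical:

```python
# Hedged sketch: fit a linear model per outcome to ask how much each behavior
# variable contributes to each interaction outcome. Purely illustrative.
import numpy as np

rng = np.random.default_rng(3)
n = 120

# Hypothetical design variables: how often the storytelling robot used
# beat, deictic, and iconic gestures in a session (normalized counts).
beat, deictic, iconic = rng.random((3, n))
X = np.column_stack([np.ones(n), beat, deictic, iconic])

# Hypothetical outcomes: story recall score and rapport rating.
recall  = 0.2 + 0.1 * beat + 0.6 * deictic + 0.4 * iconic + rng.normal(0, 0.1, n)
rapport = 0.5 + 0.4 * beat + 0.1 * deictic + 0.2 * iconic + rng.normal(0, 0.1, n)

Y = np.column_stack([recall, rapport])
coefs, *_ = np.linalg.lstsq(X, Y, rcond=None)   # one coefficient column per outcome

for i, outcome in enumerate(["recall", "rapport"]):
    print(outcome, dict(zip(["intercept", "beat", "deictic", "iconic"],
                            np.round(coefs[:, i], 2))))
```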

13.
A multiplayer dice game was realized which is played by two users and one embodied conversational agent. During the game, the players have to lie to each other to win, and the longer the game goes on, the more probable it is that someone is lying, which creates highly emotional situations. We ran a number of evaluation studies with the system. The specific setting allows us to compare user–user interactions directly with user–agent interactions in the same game. So far, the users’ gaze behavior and their verbal behavior towards one another and towards the agent have been analyzed. Gaze and verbal behavior towards the agent partly resembles patterns found in the literature for human–human interactions; partly the behavior deviates from these observations and could be interpreted as rude or impolite, such as continuous staring, insulting, or talking about the agent. For most of these seemingly abusive behaviors, a more thorough analysis reveals that they are either acceptable or present some interesting insights for improving the interaction design between users and embodied conversational agents.

14.
Interactive agents such as pet robots or adaptive speech interface systems that need to form a mutual adaptation process with users should have two competences. One of these is recognizing reward information from users' expressed paralanguage, and the other is informing the learning system about the users by means of that reward information. The purpose of this study was to clarify the specific contents of reward information and the actual mechanism of a learning system by observing how two people can create smooth speech communication, such as that between owners and their pets.

A communication experiment was conducted to observe how human participants create smooth communication through acquiring meaning from utterances in languages they did not understand. Then, based on the experimental results, a meaning-acquisition model was constructed that considers the following two assumptions: (a) To achieve a mutual adaptive relationship with users, the model needs to induce users' adaptation and to exploit this induced adaptation to recognize the meanings of a user's speech sounds; and (b) to recognize users' utterances through trial-and-error interaction regardless of the language used, the model should focus on prosodic information in speech sounds, rather than on the phoneme information on which most past interface studies have focused.

The results confirmed that the proposed model could recognize the meanings of users' verbal commands by using participants' adaptations to the model for its meaning-acquisition process. However, this phenomenon was observed only when an experimenter gave the participants appropriate instructions equivalent to catchphrases that helped users learn how to use and interact intuitively with the model. Thus, this suggested the need for a subsequent study to discover how to induce the participants' adaptations or natural behaviors without giving these kinds of instructions.

15.
Multimodal identity tracking in a smart room (cited by 1: 1 self-citation, 0 by others)
The automatic detection, tracking, and identification of multiple people in intelligent environments are important building blocks on which smart interaction systems can be designed. These could be, for example, gesture recognizers, head-pose estimators, or far-field speech recognizers and dialog systems. In this paper, we present a system which is capable of tracking multiple people in a smart room environment while inferring their identities in a completely automatic and unobtrusive way. It relies on a set of fixed and active cameras to track the users and get close-ups of their faces for identification, and on several microphone arrays to determine active speakers and steer the attention of the system. Information coming asynchronously from several sources, such as position updates from audio or visual trackers and identification events from identification modules, is fused at a higher level to gradually refine the room’s situation model. The system has been trained on a small set of users and showed good performance at acquiring and keeping their identities in a smart room environment.
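The fusion step described here, where asynchronous position updates and identification events gradually refine a situation model, can be pictured with a small sketch; the event names and scoring rule are assumptions, not the system's actual data structures:

```python
# Illustrative sketch (assumed event names and structures): fusing asynchronous
# position updates from audio/visual trackers with identity events from
# face/voice ID modules into a gradually refined situation model.
from collections import defaultdict

class SituationModel:
    def __init__(self):
        self.tracks = {}                                           # track_id -> last known (x, y, t)
        self.id_scores = defaultdict(lambda: defaultdict(float))   # track_id -> {person: score}

    def on_position(self, track_id, x, y, t):
        self.tracks[track_id] = (x, y, t)

    def on_identity(self, track_id, person, confidence):
        # Accumulate evidence; repeated face close-ups or speaker-ID hits
        # gradually sharpen the identity estimate for a track.
        self.id_scores[track_id][person] += confidence

    def best_identity(self, track_id):
        scores = self.id_scores.get(track_id)
        return max(scores, key=scores.get) if scores else "unknown"

model = SituationModel()
model.on_position("track7", 2.1, 3.4, t=10.0)    # from a visual tracker
model.on_identity("track7", "alice", 0.6)        # from a face-ID close-up
model.on_identity("track7", "alice", 0.9)        # from a microphone-array speaker ID
print(model.best_identity("track7"))             # -> alice
```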

16.
17.
Despite the existence of advanced functions in smartphones, most blind people are still using old-fashioned phones with familiar layouts and tactile buttons they can rely on. Smartphones support accessibility features including vibration, speech and sound feedback, and screen readers. However, these features are only intended to provide feedback on user commands or input. It is still a challenge for blind people to discover functions on the screen and to input commands. Although voice commands are supported in smartphones, they are difficult for a system to recognize in noisy environments. At the same time, smartphones are integrated with sophisticated motion sensors, and motion gestures based on device tilt have been gaining attention for eyes-free input. We believe that such motion gesture interactions offer more efficient access to smartphone functions for blind people. However, most blind people are not smartphone users, and they are aware of neither the affordances available in smartphones nor the potential for interaction through motion gestures. To investigate the most usable gestures for blind people, we conducted a user-defined gesture study with 13 blind participants. Using the gesture set and design heuristics from the user study, we implemented motion gesture based interfaces with speech and vibration feedback for browsing phone books and making calls. We then conducted a second study to investigate the usability of the motion gesture interface and user experiences with the system. The findings indicated that motion gesture interfaces are more efficient than traditional button interfaces. Based on the study results, we provide implications for designing smartphone interfaces.
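A tilt-based motion gesture interface of this kind can be sketched as a classifier over accelerometer samples whose output drives phone-book navigation with spoken and vibration feedback. The thresholds, axis conventions, and feedback hooks below are assumptions rather than the implemented system:

```python
# Hypothetical sketch of tilt-gesture detection for eyes-free phone-book
# browsing. A real implementation would read the platform's accelerometer
# API and call its text-to-speech and vibration services.
import math

TILT_THRESHOLD_DEG = 25.0

def classify_tilt(ax, ay, az):
    """Map a single accelerometer sample (in g) to a browsing command."""
    pitch = math.degrees(math.atan2(ax, math.sqrt(ay * ay + az * az)))
    roll  = math.degrees(math.atan2(ay, math.sqrt(ax * ax + az * az)))
    if pitch >  TILT_THRESHOLD_DEG: return "next_contact"
    if pitch < -TILT_THRESHOLD_DEG: return "previous_contact"
    if roll  >  TILT_THRESHOLD_DEG: return "call_selected_contact"
    return None

contacts, index = ["Alice", "Bob", "Carol"], 0
for sample in [(0.6, 0.0, 0.8), (0.6, 0.0, 0.8), (0.0, 0.7, 0.7)]:
    command = classify_tilt(*sample)
    if command == "next_contact":
        index = (index + 1) % len(contacts)
        print(f"[speech] {contacts[index]}  [vibration] short pulse")
    elif command == "previous_contact":
        index = (index - 1) % len(contacts)
        print(f"[speech] {contacts[index]}  [vibration] short pulse")
    elif command == "call_selected_contact":
        print(f"[speech] calling {contacts[index]}  [vibration] long pulse")
```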

18.
19.
Interactive environments for music and multimedia (cited by 2: 0 self-citations, 2 by others)

20.
This paper promotes socially intelligent animated agents for the pedagogical task of English conversation training for native speakers of Japanese. As a novel feature, social role awareness is introduced to animated conversational agents, which are by now strong affective reasoners but otherwise often lack the social competence observed in humans. In particular, humans may easily adjust their behavior depending on their respective role in a social setting, whereas their synthetic counterparts tend to be driven mostly by emotions and personality. Our main contribution is the incorporation of a “social filter program” into the mental models of animated agents. This program may qualify an agent's expression of its emotional state according to the social context, thereby enhancing the agent's believability as a conversational partner. Our implemented system is web-based and demonstrates socially aware animated agents in a virtual coffee shop environment. An experiment with our conversation system shows that users consider socially aware agents more natural than agents that violate conventional practices.
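The idea of a social filter, qualifying the expression of an emotional state by the social context before it is rendered as behavior, can be illustrated with a minimal sketch; the weighting and the context variables used below (interlocutor power, social distance) are assumptions, not the program described in the paper:

```python
# Minimal sketch of a "social filter": the intensity of an agent's expressed
# emotion is attenuated or masked depending on the social context, before the
# raw emotional state is rendered as behavior. Weights are illustrative only.
def social_filter(emotion, intensity, interlocutor_power, social_distance):
    """Return the emotion/intensity the agent actually shows.

    intensity, interlocutor_power, social_distance are in [0, 1].
    """
    suppression = 0.6 * interlocutor_power + 0.4 * social_distance
    expressed = max(0.0, intensity - suppression)
    if emotion == "anger" and expressed < 0.2:
        return "neutral", 0.0   # mask weak anger toward a superior or stranger
    return emotion, round(expressed, 2)

# A waiter agent feeling moderate anger at a customer (high power, high distance)
print(social_filter("anger", 0.5, interlocutor_power=0.9, social_distance=0.8))  # ('neutral', 0.0)
# The same agent joking with a colleague (low power, low distance)
print(social_filter("joy", 0.7, interlocutor_power=0.2, social_distance=0.1))    # ('joy', 0.54)
```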
