Similar articles
 Found 20 similar articles (search time: 31 ms)
1.
In this article, a dialogue game is presented in which coherent conversational sequences with inconsistent and biased information are described at the speech act level. Inconsistent and biased information is represented with bilattice structures, and based on these bilattice structures, a multi-valued logic is defined that makes it possible to describe a dialogue game in which agents can communicate about their cognitive states with inconsistent and biased information. The dialogue game is formalized in three steps: first, the agent's cognitive state is defined as a set of multi-valued theories; second, dialogue rules are defined that prescribe permissible communicative acts based on the agent's cognitive state; and last, update rules are defined that change the agent's cognitive state as a result of communicative acts. We show that an example dialogue with inconsistent and biased information can be derived from our dialogue game.
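As a rough illustration of how a bilattice can represent inconsistent information, the sketch below encodes Belnap's four-valued structure, the smallest non-trivial bilattice, as pairs of evidence for and against a proposition. This is an assumption chosen for illustration: the abstract does not say which bilattice the authors use, the function names are hypothetical, and the treatment of biased information is not modelled here.

```python
from typing import Tuple

# A truth value is a pair (evidence for, evidence against), each 0 or 1.
Value = Tuple[int, int]

TRUE: Value = (1, 0)
FALSE: Value = (0, 1)
UNKNOWN: Value = (0, 0)       # no information at all
INCONSISTENT: Value = (1, 1)  # conflicting information


def neg(a: Value) -> Value:
    """Negation swaps the evidence for and the evidence against."""
    return (a[1], a[0])


def t_and(a: Value, b: Value) -> Value:
    """Meet in the truth ordering (conjunction)."""
    return (min(a[0], b[0]), max(a[1], b[1]))


def t_or(a: Value, b: Value) -> Value:
    """Join in the truth ordering (disjunction)."""
    return (max(a[0], b[0]), min(a[1], b[1]))


def k_meet(a: Value, b: Value) -> Value:
    """Meet in the knowledge ordering: what both sources agree on."""
    return (min(a[0], b[0]), min(a[1], b[1]))


def k_join(a: Value, b: Value) -> Value:
    """Join in the knowledge ordering: pool the evidence of both sources."""
    return (max(a[0], b[0]), max(a[1], b[1]))
```

Pooling the evidence of one agent who holds a proposition true with that of another who holds it false, `k_join(TRUE, FALSE)`, yields `INCONSISTENT` rather than an error, which is what lets a dialogue continue in the presence of conflicting information.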

2.
3.
We present a corpus-based prosodic analysis with the aim of uncovering the relationship between dialogue acts, personality and prosody, with a view to providing guidelines for the ECA Greta’s text-to-speech system. The corpus used is the SEMAINE corpus, featuring four different personalities, further annotated for dialogue acts and prosodic features. In order to show the importance of the choice of dialogue act taxonomy, two different taxonomies were used, the first corresponding to Searle’s taxonomy of speech acts and the second, inspired by Bunt’s DIT++, including a division of directive acts into finer categories. Our results show that finer-grained distinctions are important when choosing a taxonomy. We also show with some preliminary results that the prosodic correlates of dialogue acts are not always as cited in the literature and prove more complex and variable. By studying the realisation of different directive acts, we also observe differences in the communicative strategies of the ECA depending on personality, with a view to providing input to a speech system.

4.
This paper describes some of the basic cooperative mechanisms of dialogue. Ideal cooperation is seen as consisting of four features (cognitive consideration, joint purpose, ethical consideration and trust), which can also to some extent be seen as requirements building on each other. Weaker concepts such as “coordination” and “collaboration” have only some of these features or have them to lesser degrees. We point out the central role of ethics and trust in cooperation, and contrast the result with popular AI accounts of collaboration. Dialogue is also seen as associated with social activities, in which certain obligations and rights are connected with particular roles. Dialogue is seen to progress through the written, vocal or gestural contributions made by participants. Each of the contributions has associated with it both expressive and evocative functions, as well as specific obligations for participants. The interpretation of these functions depends on the surface form of a contribution, the activity and the local context. We illustrate the perspective by analysing dialogue extracts from three different activity types (a travel dialogue, a quarrel and a dialogue with a computer system). Finally, we consider what kind of information is shared in dialogue, and the ways in which dialogue participants manifest this sharing to each other through linguistic and other communicative behaviour. The paper concludes with a comparison to other accounts of dialogue and prospects for integration of these ideas within dialogue systems.

5.
In complex multiagent systems, the agents may be heterogeneous and possibly designed by different programmers. Thus, the importance of defining a standard framework for agent communication languages (ACL) with a clear semantics has been widely recognized. The semantics should be verifiable, clear, and practical. Most classical proposals (for instance, mentalistic semantics) fail to meet these objectives. This paper proposes a logic-based semantics, which is social in nature. The basic idea is to associate with each speech act a clear meaning in terms of a commitment induced by that speech act, and a penalty to be paid in case that commitment is violated. A violation criterion based on the existence of arguments is then defined per speech act. We show that the proposed semantics satisfies some key properties that ensure that the approach is well founded. The logical setting makes the semantics verifiable. Moreover, it is shown that the new semantics is practical because it captures the dynamics of dialogues and shows clearly how isolated speech acts can be connected for building dialogues. © 2008 Wiley Periodicals, Inc.

6.
In some cases, different pieces of contextual information are needed to translate an utterance in a dialogue properly. Interpreting such utterances often requires dialogue analysis, including speech acts and discourse analysis. In this paper, a statistical dialogue analysis model for Korean–English dialogue machine translation based on speech acts is proposed. The model uses syntactic patterns and n-grams of speech acts. The syntactic patterns include surface syntactic features which are related to the language-dependent expressions of speech acts. Speech-act n-grams are used to approximate the context of utterances. The key feature is the use of speech-act n-grams based on hierarchical recency. Experimental results with trigrams show that the proposed model achieves an accuracy of 66.87% for the top candidate and 82.35% for the top three candidates. These results indicate that the proposed model based on hierarchical recency outperforms the model based on linear recency.
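The idea of approximating context with speech-act trigrams can be sketched as follows. This is a minimal illustration under stated assumptions, not the model from the abstract: it conditions on the two immediately preceding speech acts (that is, linear recency, not the hierarchical recency the authors propose), uses plain relative frequencies, and omits the syntactic patterns. The function names and the toy speech-act labels are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

TrigramCounts = Dict[Tuple[str, str], Counter]


def train_trigrams(dialogues: List[List[str]]) -> TrigramCounts:
    """Count which speech act follows each pair of preceding speech acts."""
    counts: TrigramCounts = defaultdict(Counter)
    for acts in dialogues:
        # Pad with start symbols so the first acts of a dialogue have a context.
        padded = ["<s>", "<s>"] + acts
        for a, b, c in zip(padded, padded[1:], padded[2:]):
            counts[(a, b)][c] += 1
    return counts


def top_candidates(
    counts: TrigramCounts, prev2: str, prev1: str, k: int = 3
) -> List[Tuple[str, float]]:
    """Return up to k most probable next speech acts with relative frequencies."""
    following = counts.get((prev2, prev1), Counter())
    total = sum(following.values())
    # An unseen context yields an empty list, so no division by zero occurs.
    return [(act, n / total) for act, n in following.most_common(k)]
```

Returning the top `k` candidates mirrors the way the abstract reports accuracy both for the top candidate and for the top three candidates.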

7.
This paper presents a model of incremental speech generation in practical conversational systems. The model allows a conversational system to incrementally interpret spoken input, while simultaneously planning, realising and self-monitoring the system response. If these processes are time-consuming and result in a response delay, the system can automatically produce hesitations to retain the floor. While speaking, the system utilises hidden and overt self-corrections to accommodate revisions in the system. The model has been implemented in a general dialogue system framework. Using this framework, we have implemented a conversational game application. A Wizard-of-Oz experiment is presented, where the automatic speech recognizer is replaced by a Wizard who transcribes the spoken input. In this setting, the incremental model allows the system to start speaking while the user's utterance is being transcribed. In comparison to a non-incremental version of the same system, the incremental version has a shorter response time and is perceived as more efficient by the users.

8.
A linguistic form's compositional, timeless meaning can be surrounded or even contradicted by various social, aesthetic, or analogistic companion meanings. This paper addresses a series of problems in the structure of spoken language discourse, including turn-taking and grounding. It views these processes as composed of fine-grained actions, which resemble speech acts both in resulting from a computational mechanism of planning and in having a rich relationship to the specific linguistic features which serve to indicate their presence. The resulting notion of Conversation Acts is more general than speech act theory, encompassing not only the traditional speech acts but turn-taking, grounding, and higher-level argumentation acts as well. Furthermore, the traditional speech acts in this scheme become fully joint actions, whose successful performance requires full listener participation. This paper presents a detailed analysis of spoken language dialogue. It shows the role of each class of conversation acts in discourse structure, and discusses how each class can be processed and recognized. Conversation acts, it will be seen, better account for the success of conversation than speech act theory alone. They also provide a pragmatic view of meaning in which the literal/non-literal distinction is simply irrelevant.

9.
A discourse is composed of a sequence of sentences that must be interpreted with respect to the context in which they are uttered and to the actions that produce them: locutors' speech acts. The analysis of discourse content must be based on a pragmatic approach to the study of language in use. Some of the most obvious linguistic elements that require contextual information for their representation are deictic forms such as here, now, I, you, this, and verb tenses. Several authors have recognized a need for introducing contextual structures in knowledge representation models such as semantic networks. Sowa's Conceptual Graph Theory is a powerful approach to conceptually represent knowledge contained in discourses. However, it must be extended in order to represent several semantic and pragmatic mechanisms related to the expression of time in natural language. In this paper we present such an extension as a framework for modeling temporal knowledge in discourses integrating several features borrowed from speech act theory. First, we introduce the notions of time interval, temporal object, temporal situation, and temporal relation. Then, we discuss the importance of explicitly introducing the concept of time coordinate system in a discourse representation and we present different kinds of temporal contexts: narrator's perspective, agent's perspective and temporal localization. We show how this conceptual framework can be used to represent various referential mechanisms in discourse such as anaphoras, indexicals, direct and indirect styles. We also discuss how to model several linguistic phenomena such as speech act characteristics and the specification of performative and attitude utterances. Finally, we briefly discuss how verb tenses can be determined in a discourse on the basis of this temporal approach.

10.
Modeling conversation policies using permissions and obligations   (Total citations: 1; self-citations: 1; citations by others: 1)
Both conversation specifications and policies are required to facilitate effective agent communication. Specifications provide the order in which speech acts can occur in a meaningful conversation, whereas policies restrict the specifications that can be used in a certain conversation based on the sender, receiver, messages exchanged thus far, content, and other context. We propose that positive/negative permissions and obligations be used to model conversation specifications and policies. We also propose the use of ontologies to categorize speech acts such that high-level policies can be defined without going into specifics of the speech acts. This approach is independent of the syntax and semantics of the communication language and can be used for different agent communication languages. Our policy-based framework can help in agent communication in three ways: (i) to filter inappropriate messages, (ii) to help an agent to decide which speech act to use next, and (iii) to prevent an agent from sending inappropriate messages. Our work differs from most existing research on communication policies because it is not tightly coupled to any domain information such as the mental states of agents or specific communicative acts. Contributions of this work include: (i) an extensible framework that is applicable to varied domain knowledge and different agent communication languages, and (ii) the declarative representation of conversation specifications and policies in terms of permitted and obligated speech acts.
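The three uses of a policy listed in this abstract can be illustrated with a small sketch. Everything in it is a hypothetical simplification and not the authors' framework: the `Policy` class, the tiny `ONTOLOGY` mapping from speech acts to categories, and the rule that a negative permission overrides a positive one are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set

# Hypothetical ontology: concrete speech act -> high-level category.
ONTOLOGY: Dict[str, str] = {
    "query-if": "query",
    "query-ref": "query",
    "inform": "assertion",
    "refuse": "assertion",
}


@dataclass
class Policy:
    permitted: Set[str] = field(default_factory=set)   # positive permissions
    prohibited: Set[str] = field(default_factory=set)  # negative permissions
    # Obligations: category of the received act -> category the reply must have.
    obligations: Dict[str, str] = field(default_factory=dict)


def is_allowed(policy: Policy, act: str) -> bool:
    """Filter check, usable for both incoming and outgoing messages."""
    category = ONTOLOGY.get(act, act)
    if category in policy.prohibited:  # assumed: prohibition wins
        return False
    return category in policy.permitted


def next_acts(policy: Policy, received: str) -> List[str]:
    """Speech acts the agent may use next, honouring any obligation."""
    obligated = policy.obligations.get(ONTOLOGY.get(received, received))
    candidates = [
        act for act, category in ONTOLOGY.items()
        if obligated is None or category == obligated
    ]
    return sorted(act for act in candidates if is_allowed(policy, act))
```

Because the policy refers only to categories, the same `Policy` object keeps working if further speech acts are added to the ontology, which is the point of defining high-level policies over an ontology.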

11.
In this paper, we extend a temporal defeasible logic with a modal operator Committed to formalize commitments that agents undertake as a consequence of communicative actions (speech acts) during dialogues. We represent commitments as modal sentences. The defeasible dual of the modal operator Committed is a modal operator called Exempted. The logical setting makes the social-commitment-based semantics of speech acts verifiable and practical; it is possible to detect if, and when, a commitment is violated and/or complied with. One of the main advantages of the proposed system is that it allows for capturing the nonmonotonic behavior of the commitments induced by the relevant speech acts.

12.
On the Representation of Context   (Total citations: 1; self-citations: 0; citations by others: 1)
This paper revisits some foundational questions concerning the abstract representation of a discourse context. The context of a conversation is represented by a body of information that is presumed to be shared by the participants in the conversation: the information that the speaker presupposes at a point at which a speech act is interpreted. This notion is designed to represent both the information on which context-dependent speech acts depend, and the situation that speech acts are designed to affect, and so to be a representation of context that is appropriate for explaining the interaction of context and the contents expressed in them. After reviewing the motivating ideas and the outlines of the apparatus, the paper responds to a criticism of the framework, and considers the way it can help to clarify some phenomena concerning pronouns with indefinite antecedents.

13.
Conversational Actions and Discourse Situations   (Total citations: 2; self-citations: 0; citations by others: 2)
We use the idea that actions performed in a conversation become part of the common ground as the basis for a model of context that reconciles in a general and systematic fashion the differences between the theories of discourse context used for reference resolution, intention recognition, and dialogue management. We start from the treatment of anaphoric accessibility developed in discourse representation theory (DRT), and we show first how to obtain a discourse model that, while preserving DRT's basic ideas about referential accessibility, includes information about the occurrence of speech acts and their relations. Next, we show how the different kinds of 'structure' that play a role in conversation (discourse segmentation, turn-taking, and grounding) can be formulated in terms of information about speech acts, and use this same information as the basis for a model of the interpretation of fragmentary input.

14.
15.
We recently reported the use of Kohonen's feature map as the hidden layer of an RBF network for the recognition of spoken letters [1], and the analysis of sleep EEG [2]. The feature map was shown to act as an aid to visualization during the initial period of unsupervised learning in the hidden layer. In this paper, we again explore the topology-preserving properties of Kohonen's feature map, this time for the visual interpretation of speech. It is shown that speech sounds, such as words or phonemes, may be displayed as moving trajectories on a computer screen and enhanced for ease of interpretation. A system known as the Visual Ear is introduced, in which speech from a normal speaker is displayed alongside that of a pupil learning pronunciation, enabling a visual comparison to be made between the two. The application of the Visual Ear to accelerated learning of foreign languages, or as a general speech therapy tool, is then discussed, and the limitations of the present system are highlighted.

16.
Research into cognitive architectures is described within a framework spanning major issues in artificial intelligence and cognitive science. Earlier work on motivation is extended with a cognitive model of reasoning which, together with an affective mechanism, enables consistent decision-making across a variety of cognitive and reactive processes. Cognition involves the control of behaviour within both external and internal environments. The control of behaviour is vital to an autonomous system as it acts to further its goals. Except in the most spartan of environments, the potentially available information and associated combinatorics in a perception, cognition, and action sequence can tax even the most powerful agents. The affect magnitude concept solves some problems with BDI models, and allows for adaptive decision-making over a number of tasks in different domains. The cognitive and affective components are brought together using motivational constructs. The generic cognitive model can adapt to different environments and tasks as it makes use of motivational models to direct reactive and situated processes.

17.
Medical image interpretation is moving from 2D to volumetric images, thereby changing the cognitive and perceptual processes involved. This is expected to affect medical students' experienced cognitive load while learning image interpretation skills. In two studies, this exploratory research investigated whether measures inherent to image interpretation, i.e. human-computer interaction and eye tracking, relate to cognitive load. Subsequently, it investigated effects of volumetric image interpretation on second-year medical students' cognitive load. Study 1 measured human-computer interactions of participants during two volumetric image interpretation tasks. Using structural equation modelling, the latent variable ‘volumetric image information’ was identified from the data, which significantly predicted self-reported mental effort as a measure of cognitive load. Study 2 measured participants' eye movements during multiple 2D and volumetric image interpretation tasks. Multilevel analysis showed that time to locate a relevant structure in an image was significantly related to pupil dilation, as a proxy for cognitive load. It is discussed how combining human-computer interaction and eye tracking allows for comprehensive measurement of cognitive load. Combining such measures in a single model would allow for disentangling unique sources of cognitive load, leading to recommendations for implementation of volumetric image interpretation in the medical education curriculum.

18.
A multimodal interactive dialogue automaton (kiosk) for self-service is presented in the paper. The multimodal user interface allows people to interact with the kiosk through natural speech and gestures, in addition to the standard input and output devices. The architecture of the kiosk contains key modules for speech processing and computer vision. An array of four microphones is used for far-field capturing and recording of the user’s speech commands, which allows the kiosk to detect voice activity, to localize sources of desired speech signals, and to eliminate environmental acoustic noise. A noise-robust, speaker-independent recognition system is applied to automatic interpretation and understanding of continuous Russian speech. The distant speech recognizer uses a grammar of voice queries as well as garbage and silence models to improve recognition accuracy. A pair of portable video cameras is used for vision-based detection and tracking of the user’s head and body position inside the working area. A Russian-speaking talking head serves both for bimodal audio-visual speech synthesis and for improving communication intelligibility by turning the head towards an approaching client. The dialogue manager controls the flow of dialogue and synchronizes sub-modules for fusion of input modalities and fission of output modalities. The experiments made with the multimodal kiosk were directed at cognitive and usability studies of human-computer interaction by different communication means.

19.
The aim of this paper is to show how speech act theory can be used in systems development as a theoretical foundation for conceptual modelling. With the traditional notion of the conceptual model as an image of reality, the predominant modelling problem is to analyse how the external reality should be mapped into, and represented in, the system in a ‘true’ way. In contrast to this, we maintain that the main modelling problem should be to analyse the communication acts performed by use of the system within its business context. This implies an integration of traditional conceptual modelling with action-oriented business modelling based on speech act theory. With such an approach, it is possible to reconcile traditional conceptual modelling and the pragmatic aspects of language and computer use. It is argued that such reconciliation is essential to arrive at systems that provide relevant information to users and in which users can trace responsibilities for information, actions and commitments made.

20.
We present an approach to dynamically adapt the language models (LMs) used by a speech recognizer that is part of a spoken dialogue system. We have developed a grammar generation strategy that automatically adapts the LMs using the semantic information that the user provides (represented as dialogue concepts), together with the information regarding the intentions of the speaker (inferred by the dialogue manager, and represented as dialogue goals). We carry out the adaptation as a linear interpolation between a background LM and one or more of the LMs associated with the dialogue elements (concepts or goals) addressed by the user. The interpolation weights between those models are automatically estimated on each dialogue turn, using measures such as the posterior probabilities of concepts and goals, estimated as part of the inference procedure to determine the actions to be carried out. We propose two approaches to handle the LMs related to concepts and goals. In the first, we estimate an LM for each one of them; in the second, we apply several clustering strategies to group together those elements that share some common properties, and estimate an LM for each cluster. Our evaluation shows that the system can estimate a dynamic model adapted to each dialogue turn, which significantly improves speech recognition performance and, in turn, both the language understanding and the dialogue management tasks.
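The linear interpolation described in this abstract can be illustrated with a minimal sketch. It assumes toy unigram models stored as dictionaries; the function names, the fixed share reserved for the background LM, and the way posteriors are turned into weights are assumptions for illustration, not the authors' estimation procedure.

```python
from typing import Dict, List

LM = Dict[str, float]  # word -> probability (toy unigram model)


def interpolate(models: List[LM], weights: List[float]) -> LM:
    """Linear interpolation: P(w) = sum_i weights[i] * P_i(w)."""
    if len(models) != len(weights):
        raise ValueError("one weight per model is required")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    vocab = set().union(*models)
    return {
        w: sum(lam * m.get(w, 0.0) for m, lam in zip(models, weights))
        for w in vocab
    }


def turn_weights(posteriors: List[float], background_weight: float) -> List[float]:
    """Hypothetical per-turn weighting.

    The background LM keeps a fixed share; the remainder is split among the
    dialogue-element LMs in proportion to their posterior probabilities.
    Assumes at least one posterior is greater than zero.
    """
    total = sum(posteriors)
    rest = 1.0 - background_weight
    return [background_weight] + [rest * p / total for p in posteriors]
```

For example, with a background LM and one concept LM, `interpolate([background, concept], turn_weights([0.8], 0.5))` gives each model a weight of 0.5; recomputing the weights on every turn is what makes the resulting model dynamic.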
