Similar literature
1.
This paper presents PARADISE (PARAdigm for DIalogue System Evaluation), a general framework for evaluating and comparing the performance of spoken dialogue agents. The framework decouples task requirements from an agent's dialogue behaviours, supports comparisons among dialogue strategies, enables the calculation of performance over subdialogues and whole dialogues, specifies the relative contribution of various factors to performance, and makes it possible to compare agents performing different tasks by normalizing for task complexity. After presenting PARADISE, we illustrate its application to two different spoken dialogue agents. We show how to derive a performance function for each agent and how to generalize results across agents. We then show that once such a performance function has been derived, it can be used both for making predictions about future versions of an agent, and as feedback to the agent so that the agent can learn to optimize its behaviour based on its experiences with users over time.
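The abstract does not spell out the performance function, so the following is only a minimal sketch under stated assumptions. It assumes the weighted form commonly associated with PARADISE: a task-success score (kappa) and a set of dialogue costs are each z-score normalized, and performance is the weighted success minus the weighted costs. The names `z_normalize` and `performance`, and the example weights, are hypothetical; in practice the weights would be fitted to user-satisfaction ratings rather than chosen by hand.

```python
from statistics import mean, pstdev
from typing import Dict, List


def z_normalize(values: List[float]) -> List[float]:
    """Rescale values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values identical: no dialogue is above or below average.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def performance(kappa: List[float],
                costs: Dict[str, List[float]],
                alpha: float,
                weights: Dict[str, float]) -> List[float]:
    """Per-dialogue score: alpha * N(kappa) - sum_i w_i * N(cost_i).

    ASSUMPTION: this weighted form and the weights are illustrative,
    not taken from the abstract.
    """
    n_kappa = z_normalize(kappa)
    n_costs = {name: z_normalize(vals) for name, vals in costs.items()}
    return [
        alpha * n_kappa[d] - sum(weights[name] * n_costs[name][d] for name in costs)
        for d in range(len(kappa))
    ]


# Two hypothetical dialogues: task success and number of turns.
scores = performance(
    kappa=[0.2, 0.8],
    costs={"turns": [10.0, 20.0]},
    alpha=0.5,
    weights={"turns": 0.25},
)
```

Normalizing both success and costs puts them on a common scale, which is what would allow agents or tasks with different raw cost ranges to be compared.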

2.
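Conducting multi-turn dialogue effectively is one of the main goals of open-domain human-machine dialogue systems. Current neural dialogue generation models tend to produce generic, all-purpose responses in open-domain multi-turn dialogue and quickly fall into loops, while existing work on multi-turn dialogue does not consider the future direction of the conversation. Drawing on the global perspective of reinforcement learning, this paper uses the deep reinforcement learning algorithm DQN (deep Q-network) and proposes a multi-turn dialogue policy learning method that evaluates the candidate sentences at each turn with a deep value network and selects as the response the sentence with the greatest future return rather than the one with the highest generation probability. Experimental results show that the proposed method increases the average number of turns in multi-turn dialogues by two, and that its win rate on subjective comparative evaluation is 45% higher.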
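The selection rule described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: `Candidate`, `select_by_generation_prob`, `select_by_value` and `toy_q_value` are hypothetical names, and `toy_q_value` is a hand-written stand-in for a trained deep value network.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Candidate:
    text: str
    gen_log_prob: float  # log-probability assigned by the generation model


def select_by_generation_prob(candidates: List[Candidate]) -> Candidate:
    """Baseline: return the candidate the generator considers most likely."""
    return max(candidates, key=lambda c: c.gen_log_prob)


def select_by_value(state: str,
                    candidates: List[Candidate],
                    q_value: Callable[[str, str], float]) -> Candidate:
    """Return the candidate with the highest estimated future return."""
    return max(candidates, key=lambda c: q_value(state, c.text))


def toy_q_value(state: str, response: str) -> float:
    """ASSUMPTION: toy stand-in for a learned value network.

    It penalises a generic reply and rewards a question that invites
    the user to continue the conversation.
    """
    score = 0.0
    if response == "I don't know.":
        score -= 1.0
    if response.endswith("?"):
        score += 1.0
    return score


candidates = [
    Candidate("I don't know.", gen_log_prob=-0.5),
    Candidate("Which city are you visiting?", gen_log_prob=-2.0),
]
baseline_choice = select_by_generation_prob(candidates)
value_choice = select_by_value("I am planning a trip.", candidates, toy_q_value)
```

In this toy case the baseline picks the generic reply because it is the most probable, whereas value-based selection picks the question.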

3.
Spoken dialogue system performance can vary widely for different users, as well as for the same user during different dialogues. This paper presents the design and evaluation of an adaptive version of TOOT, a spoken dialogue system for retrieving online train schedules. Based on rules learned from a set of training dialogues, adaptive TOOT constructs a user model representing whether the user is having speech recognition problems as a particular dialogue progresses. Adaptive TOOT then automatically adapts its dialogue strategies based on this dynamically changing user model. An empirical evaluation of the system demonstrates the utility of the approach.

4.
This paper proposes a new technique to test the performance of spoken dialogue systems by artificially simulating the behaviour of three types of user (very cooperative, cooperative and not very cooperative) interacting with a system by means of spoken dialogues. Experiments using the technique were carried out to test the performance of a previously developed dialogue system designed for the fast-food domain and working with two kinds of language model for automatic speech recognition: one based on 17 prompt-dependent language models, and the other based on one prompt-independent language model. The use of the simulated user enables the identification of problems relating to the speech recognition, spoken language understanding, and dialogue management components of the system. In particular, in these experiments problems were encountered with the recognition and understanding of postal codes and addresses and with the lengthy sequences of repetitive confirmation turns required to correct these errors. By employing a simulated user in a range of different experimental conditions sufficient data can be generated to support a systematic analysis of potential problems and to enable fine-grained tuning of the system.

5.
We propose a dialogue game protocol for purchase negotiation dialogues which identifies appropriate speech acts, defines constraints on their utterances, and specifies the different sub-tasks agents need to perform in order to engage in dialogues according to this protocol. Our formalism combines a dialogue game similar to those in the philosophy of argumentation with a model of rational consumer purchase decision behaviour adopted from marketing theory. In addition to the dialogue game protocol, we present a portfolio of decision mechanisms for the participating agents engaged in the dialogue and use these to provide our formalism with an operational semantics. We show that these decision mechanisms are sufficient to generate automated purchase decision dialogues between autonomous software agents interacting according to our proposed dialogue game protocol.

6.
Ergonomics, 2012, 55(1): 43-55
The aim of the study was to determine the influence of textual feedback on the content and outcome of spoken interaction with a natural language dialogue system. More specifically, the assumption that textual feedback could disrupt spoken interaction was tested in a human–computer dialogue situation. In total, 48 adult participants, familiar with the system, had to find restaurants based on simple or difficult scenarios using a real natural language service system in a speech-only (phone), speech plus textual dialogue history (multimodal) or text-only (web) modality. The linguistic contents of the dialogues differed as a function of modality, but were similar whether the textual feedback was included in the spoken condition or not. These results add to burgeoning research efforts on multimodal feedback, in suggesting that textual feedback may have little or no detrimental effect on information searching with a real system.

Statement of Relevance: The results suggest that adding textual feedback to interfaces for human–computer dialogue could enhance spoken interaction rather than create interference. The literature currently suggests that adding textual feedback to tasks that depend on the visual sense benefits human–computer interaction. The addition of textual output when the spoken modality is heavily taxed by the task was investigated.

7.
Models of rationality typically rely on underlying logics that allow simulated agents to entertain beliefs about one another to any depth of nesting. Such models seem to be overly complex when used for belief modelling in environments in which cooperation between agents can be assumed, i.e., most HCI contexts. We examine some existing dialogue systems and find that deeply-nested beliefs are seldom supported, and that where present they appear to be unnecessary except in some situations involving deception. Use of nested beliefs is associated with nested reasoning (i.e., reasoning about other agents' reasoning). We argue that for cooperative dialogues, representations of individual nested beliefs of the third level (i.e., what A thinks B thinks A thinks B thinks) and beyond are in principle unnecessary unless directly available from the environment, because the corresponding nested reasoning is redundant. Since cooperation sometimes requires that agents reason about what is mutually believed, we propose a representation in which the second and all subsequent nesting levels are merged into a single category. In situations affording individual deeply-nested beliefs, such a representation restricts agents to human-like referring and repair strategies, where an unrestricted agent might make an unrealistic and perplexing utterance.

8.
Reinforcement techniques have been successfully used to maximise the expected cumulative reward of statistical dialogue systems. Typically, reinforcement learning is used to estimate the parameters of a dialogue policy which selects the system's responses based on the inferred dialogue state. However, the inference of the dialogue state itself depends on a dialogue model which describes the expected behaviour of a user when interacting with the system. Ideally the parameters of this dialogue model should also be optimised to maximise the expected cumulative reward. This article presents two novel reinforcement algorithms for learning the parameters of a dialogue model. First, the Natural Belief Critic algorithm is designed to optimise the model parameters while the policy is kept fixed. This algorithm is suitable, for example, in systems using a handcrafted policy, perhaps prescribed by other design considerations. Second, the Natural Actor and Belief Critic algorithm jointly optimises both the model and the policy parameters. The algorithms are evaluated on a statistical dialogue system modelled as a Partially Observable Markov Decision Process in a tourist information domain. The evaluation is performed with a user simulator and with real users. The experiments indicate that model parameters estimated to maximise the expected reward function provide improved performance compared to the baseline handcrafted parameters.

9.
Oral discourse is the primary form of human–human communication; hence, computer interfaces that communicate via unstructured spoken dialogues will presumably provide a more efficient, meaningful, and naturalistic interaction experience. Within the context of learning environments, there are theoretical positions supporting a speech facilitation hypothesis that predicts that spoken tutorial dialogues will increase learning more than typed dialogues. We evaluated this hypothesis in an experiment where 24 participants learned computer literacy via a spoken and a typed conversation with AutoTutor, an intelligent tutoring system with conversational dialogues. The results indicated that (a) enhanced content coverage was achieved in the spoken condition; (b) learning gains for both modalities were on par and greater than a no-instruction control; (c) although speech recognition errors were unrelated to learning gains, they were linked to participants' evaluations of the tutor; (d) participants adjusted their conversational styles when speaking compared to typing; (e) semantic and statistical natural language understanding approaches to comprehending learners' responses were more resilient to speech recognition errors than syntactic and symbolic-based approaches; and (f) simulated speech recognition errors had differential impacts on the fidelity of different semantic algorithms. We discuss the impact of our findings on the speech facilitation hypothesis and on human–computer interfaces that support spoken dialogues.

10.
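Because out-of-domain utterances are short, varied in expression, open-ended, and colloquial, recognizing the dialogue acts of out-of-domain utterances in restricted-domain spoken dialogue systems is a challenge. This paper proposes a random forest dialogue act recognition method that incorporates external unlabeled microblog data. The microblog data used need not be specially collected or selected according to the characteristics of the application domain, yet they share the colloquial style and diversity of expression of spoken dialogue; word vectors trained on them provide an effective similarity-based extension measure when out-of-domain utterances contain out-of-vocabulary characters or words. The random forest model generalizes well and is suited to classification tasks with limited training data. Tests on a Chinese domain-specific spoken dialogue corpus show that the proposed method for recognizing dialogue acts of out-of-domain utterances outperforms maximum entropy, convolutional neural networks, and other methods from recent short-text classification research.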

11.
This paper describes an experiment on the effects of learning, mode of interaction (written vs. spoken) and transfer mode on user performance and discourse organization during interaction with a natural language dialogue system. Forty-eight participants took part in a series of 12 dialogues with an information retrieval system presented either in the written or the spoken mode during the first six dialogues. The next six dialogues were then presented either in the same interaction mode or in another mode. The analysis of the results showed that performance (time, number of effective turns) improved throughout the dialogues whatever the mode of interaction. Nevertheless, performance was higher in the written mode. Moreover, mode-specific characteristics were observed. These consisted in greater use of subject pronouns and articles in the spoken mode. Similarly, in the spoken mode, the users found it easier to re-use the formulations presented in the system speech than in the written mode. Furthermore, the analysis also revealed a positive transfer effect on performance and discourse organization when the individuals first interacted in the spoken mode and then in the written mode. Both positive and negative transfer effects were observed when the individuals interacted first in the written mode followed by the spoken mode. The implications of the results are discussed in terms of direct and indirect consequences of modality effects on natural language dialogue interaction.

12.
The goal of dialogue management in a spoken dialogue system is to take actions based on observations and inferred beliefs. To ensure that the actions optimize the performance or robustness of the system, researchers have turned to reinforcement learning methods to learn policies for action selection. To derive an optimal policy from data, the dynamics of the system is often represented as a Markov Decision Process (MDP), which assumes that the state of the dialogue depends only on the previous state and action. In this article, we investigate whether constraining the state space by the Markov assumption, especially when the structure of the state space may be unknown, truly affords the highest reward. In simulation experiments conducted in the context of a dialogue system for interacting with a speech-enabled web browser, models under the Markov assumption did not perform as well as an alternative model which classifies the total reward with accumulating features. We discuss the implications of the study as well as its limitations.

13.
14.
We present the MATCH corpus, a unique data set of 447 dialogues in which 26 older and 24 younger adults interact with nine different spoken dialogue systems. The systems varied in the number of options presented and the confirmation strategy used. The corpus also contains information about the users’ cognitive abilities and detailed usability assessments of each dialogue system. The corpus, which was collected using a Wizard-of-Oz methodology, has been fully transcribed and annotated with dialogue acts and “Information State Update” (ISU) representations of dialogue context. Dialogue act and ISU annotations were performed semi-automatically. In addition to describing the corpus collection and annotation, we present a quantitative analysis of the interaction behaviour of older and younger users and discuss further applications of the corpus. We expect that the corpus will provide a key resource for modelling older people’s interaction with spoken dialogue systems.

15.
There is a strong relationship between evaluation and methods for automatically training language processing systems, where generally the same resource and metrics are used both to train system components and to evaluate them. To date, in dialogue systems research, this general methodology is not typically applied to the dialogue manager and spoken language generator. However, any metric for evaluating system performance can be used as a feedback function for automatically training the system. This approach is motivated with examples of the application of reinforcement learning to dialogue manager optimization, and the use of boosting to train the spoken language generator.

16.
Utterance topic analysis in spoken dialogue
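This paper studies how to determine the utterance topic in natural spoken dialogue on the basis of shallow semantic analysis. The utterance topic in a dialogue is first defined as the salient semantic entity on which the speaker focuses. The paper then discusses two properties of utterance topics so defined (discourse dependence and continuity) and the relationship between utterance topics and (extended) sentence types, and accordingly also introduces sentence types, their extension, and the recognition of extended sentence types. An utterance topic analysis algorithm is then built on this basis and applied to a real dialogue corpus. Experimental results show that the accuracy of utterance topic analysis can reach 6111~8716 %, depending on the extended sentence type and on the definition of accuracy.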

17.
Traditional dialogue systems use a fixed silence threshold to detect the end of users’ turns. Such a simplistic model can result in system behaviour that is both interruptive and unresponsive, which in turn affects user experience. Various studies have observed that human interlocutors take cues from speaker behaviour, such as prosody, syntax, and gestures, to coordinate smooth exchange of speaking turns. However, little effort has been made towards implementing these models in dialogue systems and verifying how well they model the turn-taking behaviour in human–computer interactions. We present a data-driven approach to building models for online detection of suitable feedback response locations in the user's speech. We first collected human–computer interaction data using a spoken dialogue system that can perform the Map Task with users (albeit using a trick). On this data, we trained various models that use automatically extractable prosodic, contextual and lexico-syntactic features for detecting response locations. Next, we implemented a trained model in the same dialogue system and evaluated it in interactions with users. The subjective and objective measures from the user evaluation confirm that a model trained on speaker behavioural cues offers both smoother turn-transitions and more responsive system behaviour.
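The fixed-silence-threshold baseline that this abstract criticises can be sketched as below. This is only an illustration of the baseline, not of the data-driven model the paper proposes; the function name `end_of_turn_index`, the frame length, and the threshold values are assumptions chosen for the example.

```python
from typing import List, Optional


def end_of_turn_index(frames_is_speech: List[bool],
                      frame_ms: int = 100,
                      silence_threshold_ms: int = 500) -> Optional[int]:
    """Return the frame index at which the turn is declared finished.

    The turn ends once the user has spoken and then stayed silent for
    at least `silence_threshold_ms`. Returns None if that never happens.
    """
    frames_needed = silence_threshold_ms // frame_ms
    silent_run = 0
    heard_speech = False
    for i, is_speech in enumerate(frames_is_speech):
        if is_speech:
            heard_speech = True
            silent_run = 0  # any speech resets the silence counter
        elif heard_speech:
            silent_run += 1
            if silent_run >= frames_needed:
                return i
    return None


# Hypothetical user turn: speech, a 300 ms hesitation, more speech, then silence.
frames = [True] * 5 + [False] * 3 + [True] * 2 + [False] * 5

long_threshold = end_of_turn_index(frames, silence_threshold_ms=500)
short_threshold = end_of_turn_index(frames, silence_threshold_ms=300)
```

With the 500 ms threshold the system waits through the whole final silence before responding, while with the 300 ms threshold it cuts in at the mid-turn hesitation. A single fixed value therefore trades responsiveness against interruption, which is the trade-off the abstract describes.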

18.
Dialogue state tracking (DST) leverages dialogue information to predict dialogue states, which are generally represented as slot-value pairs. However, previous work is usually limited in its ability to predict values efficiently because it lacks a powerful strategy for generating values from both the dialogue history and the predefined values. Because they predict values from the predefined value set, previous discriminative DST methods have difficulty handling unknown values. Previous generative DST methods determine values based on mentions in the dialogue history, which makes it difficult for them to handle uncovered and non-pointable mentions. Besides, existing generative DST methods usually ignore the unlabeled instances and suffer from the label noise problem, which limits the generation of mentions and eventually hurts performance. In this paper, we propose a unified shared-private network (USPN) to generate values from both the dialogue history and the predefined values through a unified strategy. Specifically, USPN uses an encoder to construct a complete generative space for each slot and to discern shared information between slots through a shared-private architecture. Then, our model predicts values from the generative space through a shared-private decoder. We further utilize reinforcement learning to alleviate the label noise problem by learning indirect supervision from semantic relations between conversational words and predefined slot-value pairs. Experimental results on three public datasets show the effectiveness of USPN, which outperforms state-of-the-art baselines in both supervised and unsupervised DST tasks.

19.
We address the issue of appropriate user modeling to generate cooperative responses to users in spoken dialogue systems. Unlike previous studies that have focused on a user’s knowledge, we propose more generalized modeling. We specifically set up three dimensions for user models: the skill level in use of the system, the knowledge level about the target domain, and the degree of urgency. Moreover, the models are automatically derived by decision tree learning using actual dialogue data collected by the system. We obtained reasonable accuracy in classification for all dimensions. Dialogue strategies based on user modeling were implemented on the Kyoto City Bus Information System that was developed at our laboratory. Experimental evaluations revealed that the cooperative responses adapted to each subject type served as good guides for novices without increasing dialogue duration for skilled users.

20.
In this paper, we present a synchronous text-based communication tool, referred to as Adaptive Communication Tool (ACT), which provides capabilities for adaptation and personalization. ACT supports both the free and the structured form of dialogue. The structured dialogue is implemented by two types of Scaffolding Sentence Templates (SST), i.e. sentence openers or communicative acts. The capability of adaptation is considered in the sense of making suggestions for the supported form of dialogue and SST type and providing the most meaningful and complete set of SST with respect to the learning outcomes addressed by the collaborative learning activity and the model of collaboration followed by the group members. Also, ACT enables learners to have control over the adaptation by selecting the form of dialogue and the SST type they prefer to use and enriching the provided SST set with their own ones in order to cover their communication needs. The results from the formative evaluation of the tool showed that (i) the proposed dialogue form, SST type and the provided set of SST cover students' communication needs, (ii) the capability of personalizing the communication by selecting the desired communication means as well as by enriching the provided SST set satisfied students, and (iii) students used both types of SST adequately, resulting in on-task and coherent dialogues.
