Similar literature
1.
For Part 1, see ibid., vol. 9, p. 3 (2007). In this paper, the task and user interface modules of a multimodal dialogue system development platform are presented. The main goal of this work is to provide a simple, application-independent solution to the problem of multimodal dialogue design for information-seeking applications. The proposed system architecture clearly separates the task and interface components of the system. A task manager is designed and implemented that consists of two main submodules: the electronic form module, which handles the list of attributes that have to be instantiated by the user, and the agenda module, which contains the sequence of user and system tasks. Both the electronic forms and the agenda can be dynamically updated by the user. Next, a spoken dialogue module is designed that implements the speech interface for the task manager. The dialogue manager can handle complex error-correction and clarification input from the user, building on the semantic and pragmatic modules presented in Part I of this paper. The spoken dialogue system is evaluated on a travel reservation task of the DARPA Communicator research program and shown to yield over 90% task completion and good performance on both objective and subjective evaluation metrics. Finally, a multimodal dialogue system that combines graphical and speech interfaces is designed, implemented and evaluated. Only minor modifications to the unimodal semantic and pragmatic modules were required to build the multimodal system. It is shown that the multimodal system significantly outperforms the unimodal speech-only system both in terms of efficiency (task success and time to completion) and user satisfaction for a travel reservation task.
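
The split between an electronic form and an agenda described in this abstract can be illustrated with a small sketch. The abstract gives no implementation details, so every name below (`ElectronicForm`, `Agenda`, `next_prompt`, the slot names) is a hypothetical stand-in rather than the authors' design.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ElectronicForm:
    """Attributes the user has to instantiate (hypothetical structure)."""
    slots: Dict[str, Optional[str]] = field(default_factory=dict)

    def fill(self, name: str, value: str) -> None:
        # Forms can be updated at any point in the dialogue.
        self.slots[name] = value

    def missing(self) -> List[str]:
        # Slots keep insertion order, so prompts follow the form layout.
        return [name for name, value in self.slots.items() if value is None]


@dataclass
class Agenda:
    """Ordered sequence of pending tasks (hypothetical structure)."""
    tasks: List[str] = field(default_factory=list)

    def next_task(self) -> Optional[str]:
        return self.tasks[0] if self.tasks else None

    def complete(self) -> None:
        if self.tasks:
            self.tasks.pop(0)


def next_prompt(form: ElectronicForm, agenda: Agenda) -> str:
    """Decide the next system move from the task state alone."""
    task = agenda.next_task()
    if task is None:
        return "done"
    missing = form.missing()
    if missing:
        return f"ask:{missing[0]}"
    return f"execute:{task}"
```

Because `next_prompt` returns an abstract move rather than a sentence, a speech or graphical front end could render it in its own way, which is one plausible reading of the task/interface separation the abstract describes.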

2.
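In recent years, with the development of artificial intelligence and the spread of smart devices, human-machine dialogue technology has attracted wide attention. Spoken language understanding is an important task in spoken dialogue systems, and spoken intent detection is a key step within it. Multi-turn dialogues exhibit complex linguistic phenomena such as semantic omission, frame representation, and intent shifts, which makes intent detection for multi-turn dialogue highly challenging. To address these difficulties, this paper proposes a gating-based information-sharing network that makes full use of the contextual information in multi-turn dialogue to improve detection performance. Specifically, character and pronunciation features are first combined to build the initial representations of the current-turn text and the context text, reducing the impact of speech recognition errors on the semantic representation. Second, a semantic encoder based on a hierarchical attention mechanism produces deep semantic representations of the current turn and the context, covering multi-level semantic information from characters to sentences to multi-turn text. Finally, a gating mechanism is introduced into a multi-task learning framework to build the gating-based information-sharing network, which uses contextual semantic information to assist intent detection for the current turn. Experimental results show that the proposed method uses contextual information efficiently to improve spoken intent detection, reaching 88.1% accuracy (Acc) and an 88.0% F1 score on the dataset of Evaluation Task 2 of the China Conference on Knowledge Graph and Semantic Computing (CCKS2018), a significant improvement over existing methods.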
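
The abstract does not give the form of its gate, so the following is only a generic illustration of how a gating mechanism can control how much context information flows into the current-turn representation. The element-wise gate, the additive fusion, and all parameter names are assumptions, not the paper's architecture.

```python
import math
from typing import List, Sequence


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def gated_fusion(
    current: Sequence[float],
    context: Sequence[float],
    w_cur: Sequence[float],
    w_ctx: Sequence[float],
    bias: Sequence[float],
) -> List[float]:
    """Fuse current-turn and context vectors with an element-wise gate.

    Assumed form: gate_i = sigmoid(w_cur_i * cur_i + w_ctx_i * ctx_i + b_i)
                  fused_i = cur_i + gate_i * ctx_i
    """
    if not (len(current) == len(context) == len(w_cur) == len(w_ctx) == len(bias)):
        raise ValueError("all vectors must have the same length")
    fused = []
    for cur, ctx, wc, wx, b in zip(current, context, w_cur, w_ctx, bias):
        gate = sigmoid(wc * cur + wx * ctx + b)
        fused.append(cur + gate * ctx)
    return fused
```

With a strongly negative bias the gate closes and the output reduces to the current-turn vector; in a trained model the gate parameters would be learned jointly with the intent classifier.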

3.
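A Chinese Dialogue System for Real-Time Stock Quote Queries   Total citations: 1, self-citations: 0, citations by others: 1
This paper introduces a spoken human-machine dialogue system for querying real-time stock quotes. The system integrates speech recognition, language understanding, and dialogue control. A situational semantic frame model is defined that handles several of the difficulties faced by spoken language understanding systems.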

4.
This article describes the various knowledge sources that, in general, are required to handle multimodal human-machine interaction efficiently: the task, user, dialogue, environment and system models. The first part discusses the content of these models, with special emphasis on problems that occur when speech is combined with other modalities. The second part focuses on spoken language characteristics and proposes an adapted semantic representation for the task model. It also describes a stochastic method to collect and process the information related to this model. The conclusion discusses an extension of this stochastic method to multimodality.

5.
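Spoken language parsing plays a critical role in human-machine dialogue systems and spoken language translation systems. This paper proposes a Chinese spoken language parsing method that combines statistics and rules; the parsing result is an intermediate semantic representation format. The method has two stages. First, a statistical method extracts the semantic information of the input sentence; then rules map this semantic information to the intermediate semantic representation format. Experiments show that the method is robust, avoids some drawbacks of purely rule-based parsing, and achieves a high parsing accuracy.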

6.
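Out-of-domain utterances are open-ended, colloquial, and varied in expression, so existing restricted-domain spoken dialogue systems cannot handle them well. This paper proposes a co-processing scheme for restricted-domain spoken dialogue systems. Based on the Artificial Intelligence Markup Language (AIML), it designs a set of understanding templates for user utterances with open semantics and classifies unmatched utterances into understanding templates by utterance similarity. It then adopts an extended finite-state automaton processing mode that, combined with the state and information of the dialogue-flow context, converts understanding templates into response templates, making up for the relative lack of dialogue-flow control in pure template matching. Tests in the Chinese mobile phone shopping guide domain show that the proposed co-processing method effectively helps the spoken dialogue system complete full restricted-domain dialogue flows and achieves better user satisfaction.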
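
A toy sketch of the idea in entry 6, where a matched understanding template is turned into a response template according to the current dialogue state. It uses plain regular expressions instead of real AIML, omits the similarity-based classification of unmatched utterances, and all template names, states, and responses are invented for illustration.

```python
import re
from typing import Tuple

# Hypothetical "understanding templates": a label plus a surface pattern.
UNDERSTANDING_TEMPLATES = [
    ("greeting", re.compile(r"^(hi|hello)\b", re.IGNORECASE)),
    ("ask_price", re.compile(r"\b(price|cost|how much)\b", re.IGNORECASE)),
]

# Hypothetical transition table: (state, template) -> (next state, response).
TRANSITIONS = {
    ("start", "greeting"): ("browsing", "Hello! Which phone are you interested in?"),
    ("browsing", "ask_price"): ("pricing", "Let me look up the price."),
}

FALLBACK_RESPONSE = "Sorry, could you rephrase that?"


def classify(utterance: str) -> str:
    """Return the label of the first matching template, or 'unknown'."""
    for label, pattern in UNDERSTANDING_TEMPLATES:
        if pattern.search(utterance):
            return label
    return "unknown"


def step(state: str, utterance: str) -> Tuple[str, str]:
    """Advance the automaton by one user turn.

    The same template can yield different responses in different states;
    an undefined (state, template) pair keeps the state and asks again.
    """
    label = classify(utterance)
    return TRANSITIONS.get((state, label), (state, FALLBACK_RESPONSE))
```

Keying the table on the pair (state, template) is what lets the dialogue context, rather than the surface match alone, decide the reply.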

7.
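Design and Implementation of the Campus Navigation System Easy Nav   Total citations: 10, self-citations: 0, citations by others: 10
This paper describes the design and implementation of EasyNav, a spoken dialogue system for campus navigation. After analysing the characteristics and requirements of spoken dialogue systems, we propose a rule-based language understanding pipeline suited to dialogue systems. In this pipeline, syntactic analysis uses a GLR parser to process a context-free grammar (CFG) and obtain sentence structure features that serve semantic analysis; the syntactic rules balance coverage against accuracy. Semantic analysis uses template matching that takes syntactic constraints into account, aims to capture the speaker's intent, and eliminates ambiguity introduced by syntactic analysis. The advantage of this design is that the system is easy to build and easy to extend.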

8.
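To understand spoken dialogue correctly and grasp the speaker's intent accurately, a spoken language system needs speech act analysis at the pragmatic level in addition to the necessary syntactic and semantic analysis. This paper proposes a method for speech act analysis of spoken Chinese that is based on a simplified recurrent network and jointly uses utterance-level micro-structural information and discourse-level macro-structural features. Trained and tested on a spoken corpus from the meeting-scheduling domain, the method achieved satisfactory results.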

9.
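To address the challenge that the expressive diversity of spoken Chinese questions poses for question understanding in dialogue systems, this paper adopts the design principle of "acquiring semantic knowledge on top of grammatical structure" and proposes a question understanding method for spoken dialogue systems that combines grammar and semantics. First, a grammar knowledge base independent of domain and application is compiled by hand. A sentence compression module then simplifies complex sentences to obtain structural information, and question-type pattern recognition yields a sentence pattern that uniquely determines the question's semantic organisation method, query strategy, and response mode. In parallel, based on a domain semantic knowledge base, the corresponding semantic information is extracted from the source sentence and organised according to the knowledge organisation method associated with the recognised sentence pattern, completing the understanding of the question. The method was applied in a Chinese mobile phone shopping guide dialogue system developed by the authors. Test results show that the method effectively accomplishes user question understanding within the dialogue flow.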

10.
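A Survey of Dialogue Management Methods in Spoken Dialogue Systems   Total citations: 1, self-citations: 1, citations by others: 0
Spoken dialogue systems are a core technology in human-computer interaction and an important route to harmonious human-computer interaction. They have great research significance and application value, and progress in the underlying theories and technologies has attracted continuing attention. This paper gives a fairly comprehensive and in-depth summary of the progress and current state of research on dialogue management and spoken dialogue systems. It first sets out the main research problems in spoken dialogue systems, including the research content and key technologies of each system module and the design of system portability and robustness. It then systematically analyses the existing spoken dialogue management strategies in terms of theoretical models, research progress, and usability. Finally, it looks ahead to possible future research directions and problems that urgently need solving.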

11.
Advanced Robotics, 2013, 27(1-2): 209-232
We describe an implementation integrating a complete spoken dialogue system with a mobile robot, which a human can direct to specific locations, ask for information about its status and supply information about its environment. The robot uses an internal map for navigation, and communicates its current orientation and accessible locations to the dialogue system using a topological map as interface. We focus on linguistic and inferential aspects of the human–robot communication process. The result is a novel approach using a principled semantic theory combined with techniques from automated deduction applied to a mobile robot platform. Due to the abstract level of the dialogue system, it is easily portable to other environments or applications.

12.
The aim of the French Media project was to define a protocol for the evaluation of speech understanding modules for dialog systems. Accordingly, a corpus of 1,257 real spoken dialogs related to hotel reservation and tourist information was recorded, transcribed and semantically annotated, and a semantic attribute-value representation was defined in which each conceptual relationship was represented by the names of the attributes. Two semantic annotation levels are distinguished in this approach. At the first level, each utterance is considered separately and the annotation represents the meaning of the statement without taking into account the dialog context. The second level of annotation then corresponds to the interpretation of the meaning of the statement by taking into account the dialog context; in this way a semantic representation of the dialog context is defined. This paper discusses the data collection, the detailed definition of both annotation levels, and the annotation scheme. Then the paper comments on both evaluation campaigns which were carried out during the project and discusses some results.

13.
Knowledge, 2006, 19(3): 153-163
Spoken dialogue systems can be considered knowledge-based systems designed to interact with users using speech in order to provide information or carry out simple tasks. Current systems are restricted to well-known domains that provide knowledge about the words and sentences the users will likely utter. Basically, these systems rely on an input interface composed of a speech recogniser and a semantic analyser, a dialogue manager, and an output interface composed of a response generator and a speech synthesiser. As an attempt to enhance the performance of the input interface, this paper proposes a technique based on a new type of speech recogniser made up of two modules. The first is a standard speech recogniser that receives the sentence uttered by the user and generates a graph of words. The second module analyses the graph and produces the recognised sentence using the context knowledge provided by the current prompt of the system. We evaluated the performance of two input interfaces working in a previously developed dialogue system: the original interface of the system and a new one that features the proposed technique. The experimental results show that when the sentences uttered by the users are analysed out of context by the new interface, the word accuracy and sentence understanding rates increase by 93.71% and 77.42% absolute, respectively, compared with the original interface. The price to pay for this clear enhancement is a small reduction in the scores when the new interface analyses sentences in context: they decrease by 2.05% and 3.41% absolute, respectively, compared with the original interface. Given that in real dialogues sentences may be analysed out of context, especially when they are uttered by inexperienced users, the technique can be very useful for enhancing system performance.

14.
This paper describes a domain-limited system for speech understanding as well as for speech translation. An integrated semantic decoder directly converts the preprocessed speech signal into its semantic representation by maximum a posteriori classification. By combining probabilistic knowledge on the acoustic, phonetic, syntactic, and semantic levels, the semantic decoder extracts the most probable meaning of the utterance. No separate speech recognition stage is needed because the Viterbi algorithm (which calculates acoustic probabilities using hidden Markov models) is integrated with a probabilistic chart parser (which calculates semantic and syntactic probabilities using special models). The semantic structure is introduced as a representation of an utterance's meaning. It can be used as an intermediate level for a succeeding intention decoder (within a speech understanding system that controls a running application by spoken input) as well as an interlingua level for a succeeding language production unit (within an automatic speech translation system that creates spoken output in another language). Following the above principles and using the respective algorithms, speech understanding and speech translation front-ends were successfully realised for the domains ‘graphic editor’, ‘service robot’, ‘medical image visualisation’ and ‘scheduling dialogues’.

15.
This paper presents a real-time system for human-machine spoken dialogue on the telephone in task-oriented domains. The system has been tested in a large trial with inexperienced users and it has proved robust enough to allow spontaneous interactions even for people with poor recognition performance. The robust behaviour of the system has been achieved by combining the use of specific language models during the recognition phase of analysis, the tolerance toward spontaneous speech phenomena, the activity of a robust parser, and the use of pragmatic-based dialogue knowledge. This integration of the different modules allows the system to deal with partial or total breakdowns at other levels of analysis. We report the field trial data of the system with respect to speech recognition metrics of word accuracy and sentence understanding rate, time-to-completion, time-to-acquisition of crucial parameters, and degree of success of the interactions in providing the speakers with the information they required. The evaluation data show that most of the subjects were able to interact fruitfully with the system. These results suggest that the design choices made to achieve robust behaviour are a promising way to create usable spoken language telephone systems.

16.
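Giving a chatbot personal information is essential for providing natural dialogue. To this end, a dialogue model with personal information is proposed, comprising three modules: question classification, personal-information response, and open-domain dialogue. In the question classification module, the effectiveness of different classification methods is analysed and tested. The personal-information response module uses a BiLSTM to encode semantic information, is trained with a contrastive loss function, and is compared experimentally against several matching models. The open-domain dialogue model uses maximum mutual information as its objective function, in order to reduce ...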

17.
In this paper, we propose a strategy for designing dialogue managers in spoken dialogue systems for a restricted domain. This strategy combines several information sources (intuition, observation and simulation) in order to maximize the adaptation between the system capability and the expectation of the user. These sources are combined by an iterative process consisting of five steps, where different dialogue alternatives are proposed and evaluated sequentially. The evaluation process includes different measures depending on the information required. Several measures are proposed and analyzed in each step. We also describe a user-modeling technique and an approach for designing the confirmation sub-dialogues based on recognition confidence measures. The knowledge-combining methodology is described and applied to a railway information system. In a subjective evaluation, users from the university gave the system a score of 3.9 on a 5-point scale, with an average call duration of 205 seconds. The employers of the railway company were more critical of the system. They gave it a score of 2.1 even though the system resolved more than half of the calls (57.8%) within an average call duration of three minutes (185 seconds).
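
Confirmation sub-dialogues driven by recognition confidence are commonly organised as a threshold policy. The sketch below shows that general pattern only; the two thresholds and the three strategy names are assumptions and are not taken from the paper.

```python
def confirmation_strategy(
    confidence: float,
    reject_below: float = 0.3,    # hypothetical threshold
    explicit_below: float = 0.7,  # hypothetical threshold
) -> str:
    """Pick a confirmation move from a recogniser confidence in [0, 1].

    - "reprompt": too unreliable, ask the user to repeat
    - "explicit": ask a yes/no question about the recognised value
    - "implicit": echo the value inside the next question and move on
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must lie in [0, 1]")
    if confidence < reject_below:
        return "reprompt"
    if confidence < explicit_below:
        return "explicit"
    return "implicit"
```

Raising `explicit_below` makes the system safer but slower, which is the kind of trade-off that measures such as call duration and task resolution can expose.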

18.
This paper describes a statistically motivated framework for performing real-time dialogue state updates and policy learning in a spoken dialogue system. The framework is based on the partially observable Markov decision process (POMDP), which provides a well-founded, statistical model of spoken dialogue management. However, exact belief state updates in a POMDP model are computationally intractable, so approximate methods must be used. This paper presents a tractable method based on the loopy belief propagation algorithm. Various simplifications are made, which improve the efficiency significantly compared to the original algorithm as well as compared to other POMDP-based dialogue state updating approaches. A second contribution of this paper is a method for learning in spoken dialogue systems which uses a component-based policy with the episodic Natural Actor Critic algorithm. The framework proposed in this paper was tested both in simulations and in a user trial. Both indicated that using Bayesian updates of the dialogue state significantly outperforms traditional definitions of the dialogue state. Policy learning worked effectively and the learned policy outperformed all others in simulations. In user trials the learned policy was also competitive, although its optimality was less conclusive. Overall, the Bayesian update of dialogue state framework was shown to be a feasible and effective approach to building real-world POMDP-based dialogue systems.
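
To make the notion of a Bayesian dialogue state update concrete, here is an exact update for a single slot with a handful of values. This is not the loopy belief propagation method of the paper, which approximates such updates over large factored states; the observation model (the recogniser is right with probability `confidence`, otherwise uniformly wrong) and the assumption that the user goal does not change between turns are both simplifications introduced here.

```python
from typing import Dict


def update_belief(
    belief: Dict[str, float], observed: str, confidence: float
) -> Dict[str, float]:
    """Return the posterior b'(v) proportional to P(observed | v) * b(v).

    Assumed observation model: P(observed | v) = confidence if v == observed,
    else (1 - confidence) / (n - 1), with n the number of slot values.
    """
    n = len(belief)
    if n < 2 or observed not in belief:
        raise ValueError("need at least two values and a known observation")
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must lie in [0, 1]")
    other = (1.0 - confidence) / (n - 1)
    unnormalised = {
        value: prior * (confidence if value == observed else other)
        for value, prior in belief.items()
    }
    total = sum(unnormalised.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under the belief")
    return {value: mass / total for value, mass in unnormalised.items()}
```

Repeated, consistent observations sharpen the distribution, while a low-confidence observation moves it only slightly; keeping the full distribution instead of a single best value is what distinguishes this from a traditional dialogue state.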

19.
We present an approach to dynamically adapt the language models (LMs) used by a speech recognizer that is part of a spoken dialogue system. We have developed a grammar generation strategy that automatically adapts the LMs using the semantic information that the user provides (represented as dialogue concepts), together with the information regarding the intentions of the speaker (inferred by the dialogue manager, and represented as dialogue goals). We carry out the adaptation as a linear interpolation between a background LM and one or more of the LMs associated with the dialogue elements (concepts or goals) addressed by the user. The interpolation weights between those models are automatically estimated on each dialogue turn, using measures such as the posterior probabilities of concepts and goals, estimated as part of the inference procedure to determine the actions to be carried out. We propose two approaches to handle the LMs related to concepts and goals. Whereas in the first one we estimate an LM for each of them, in the second one we apply several clustering strategies to group together those elements that share some common properties, and estimate an LM for each cluster. Our evaluation shows that the system can estimate a dynamic model adapted to each dialogue turn, which significantly improves speech recognition performance and, in turn, both the language understanding and the dialogue management tasks.
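
The linear interpolation mentioned in this abstract can be sketched on unigram models. The abstract says only that the weights are estimated per turn from measures such as posterior probabilities; the specific scheme below (a fixed background weight, with the remaining mass split in proportion to the posteriors) is an assumption made for illustration.

```python
from typing import Dict, List, Sequence


def interpolate_lms(
    background: Dict[str, float],
    element_lms: Sequence[Dict[str, float]],
    posteriors: Sequence[float],
    background_weight: float = 0.5,  # hypothetical fixed share
) -> Dict[str, float]:
    """Mix a background unigram LM with per-concept/goal unigram LMs.

    P(w) = bw * P_bg(w) + (1 - bw) * sum_i (post_i / sum(post)) * P_i(w)
    """
    if len(element_lms) != len(posteriors):
        raise ValueError("one posterior per element LM is required")
    total = sum(posteriors)
    if not element_lms or total <= 0.0:
        # Nothing to adapt towards: fall back to the background model.
        return dict(background)
    vocabulary = set(background)
    for lm in element_lms:
        vocabulary |= set(lm)
    mixed: Dict[str, float] = {}
    for word in vocabulary:
        prob = background_weight * background.get(word, 0.0)
        for lm, posterior in zip(element_lms, posteriors):
            weight = (1.0 - background_weight) * (posterior / total)
            prob += weight * lm.get(word, 0.0)
        mixed[word] = prob
    return mixed
```

Since the weights sum to one and each input is a distribution, the mixture is again a distribution; recomputing it on every turn is what makes the model dynamic.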

20.