Similar literature
1.
This paper proposes a domain-independent statistical methodology to develop dialog managers for spoken dialog systems. Our methodology employs a data-driven classification procedure to generate abstract representations of system turns, taking into account the previous history of the dialog. A statistical framework based on a dialog simulation technique is also introduced for the development and evaluation of dialog systems created using the methodology. The benefits and flexibility of the proposed methodology have been validated by developing statistical dialog managers for four spoken dialog systems of different complexity, designed for different languages (English, Italian, and Spanish) and application domains (from transactional to problem-solving tasks). The evaluation results show that the proposed methodology allows rapid development of new dialog managers as well as exploration of new dialog strategies, which permits developing enhanced versions of already existing systems.

2.
This paper describes our initial effort in developing a trilingual speech interface for financial information inquiries. Our foreign exchange inquiry system consists of: (i) monolingual and trilingual speech recognizers, which receive the user's spoken input in the form of microphone speech; (ii) a real-time data capture component, which continuously updates a relational database from a financial data satellite feed; and (iii) a trilingual speech generation component, which generates English and Chinese text based on the raw financial data. The generated text is then transformed into spoken presentations. English text is processed by the FESTIVAL synthesizer system. Chinese text is sent to our syllable-based synthesizer, which employs a concatenative resequencing technique to produce spoken presentations in Putonghua or Cantonese. The speech interface is augmented with a visual display, which aims to provide feedback to the user at all times during an interaction. Within the restricted scope of foreign exchange (FOREX), our recognition accuracies remain above 93%. Confusions across languages contributed significantly to our recognition errors, but most are confusions between the same currency/country names spoken in different languages. These errors are not detrimental with respect to data retrieval. Our concatenative resequencing technique reports the date, time and exchange rates of the input currency pair. A demonstration can be found at http://www.se.cuhk.edu.hk/hccl/demos/.

3.
This paper presents a real-time system for human-machine spoken dialogue on the telephone in task-oriented domains. The system has been tested in a large trial with inexperienced users and it has proved robust enough to allow spontaneous interactions even for people with poor recognition performance. The robust behaviour of the system has been achieved by combining the use of specific language models during the recognition phase of analysis, the tolerance toward spontaneous speech phenomena, the activity of a robust parser, and the use of pragmatic-based dialogue knowledge. This integration of the different modules allows the system to deal with partial or total breakdowns at other levels of analysis. We report the field trial data of the system with respect to speech recognition metrics of word accuracy and sentence understanding rate, time-to-completion, time-to-acquisition of crucial parameters, and degree of success of the interactions in providing the speakers with the information they required. The evaluation data show that most of the subjects were able to interact fruitfully with the system. These results suggest that the design choices made to achieve robust behaviour are a promising way to create usable spoken language telephone systems.

4.
This paper presents two projects concerned with the application of natural language processing technology for improving communication between Public Administration and citizens. The first project, GIST, is concerned with automatic multilingual generation of instructional texts for form-filling. The second project, TAMIC, aims at providing an interface for interactive access to information, centered on natural language processing and intended to be used by the clerk but with the active participation of the citizen. This revised version was published online in October 2005 with corrections to the Cover Date.

5.
This paper describes the evaluation of a natural language dialog-based navigation system (HappyAssistant) that helps users access e-commerce sites to find relevant information about products and services. The prototype system leverages technologies in natural language processing and human-computer interaction to create a faster and more intuitive way of interacting with websites, especially for less experienced users. The result of a comparative study shows that users prefer the natural language-enabled navigation two to one over the menu-driven navigation. In addition, the study confirmed the efficiency of using natural language dialog in terms of the number of clicks and the amount of time required to obtain the relevant information. In the case study, compared with the menu-driven system, the average number of clicks used in the natural language system was reduced by 63.2% and the average time was reduced by 33.3%.

6.
This paper concentrates on the problem of designing and developing a spoken query retrieval (SQR) system to access large document databases via voice. The main challenge is to identify and address issues related to the adaptation and scalability of integrating automatic speech recognition (ASR) and information retrieval (IR). A Context Aware Language Model (CALM) framework that allows information retrieval from large document databases via voice is presented, and findings from a research study using the framework are discussed.

7.
Despite more than a decade of research on medical information systems, deficiencies exist in our capability of establishing an effective environmental health information infrastructure. In this research, we present a pilot study on creating a feasible environmental health information infrastructure. The newly developed environmental health information system is a web-based platform that integrates databases, decision-making tools, and geographic information systems to support public health service and policy making. The study, which is part of a comprehensive effort known as Environmental Public Health Tracking proposed by the Centers for Disease Control and Prevention, opens the door for future research on a large-scale nationwide healthcare information infrastructure.
Ling Li

8.
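To address the limited coverage of bus surveillance systems currently on the market, this paper proposes an FPGA-based design for a new bus surveillance system that can monitor all areas inside and outside the bus simultaneously. The FPGA is the core device of the system; its main functions are configuring the video capture chip, merging four digital video streams into one, and displaying the video in real time. Experiments show that the design meets the image quality requirements while also satisfying the real-time requirement well.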

9.
It is widely acknowledged that users of Spoken Language Systems (SLS) want the ability to truncate system prompts by using a barge-in capability (e.g., Basson et al., 1995; Yankelovich et al., 1995). However, little has been published on how barge-in is used or whether it adversely affects Automatic Speech Recognition (ASR) and interface usability. Typically, user requests for barge-in are assumed to be based on the desire to make system interactions faster and therefore more similar to interactions with touch-tone systems. We believe that requests for a barge-in capability are rooted in the notion of discourse as a turn-taking event. Viewed in this way, we believe SLS can be enhanced to develop speech interfaces that are deemed more natural by users, as well as to increase system performance. This study addressed several issues. We found that users new to the system did not need to be informed about the barge-in capability before they attempted barge-in, that they used barge-in during almost half of their interactions with the system, and that they had identifiable patterns of barge-in use consistent with the turn-taking model. Results are presented and consequences for speech interface design as well as algorithm enhancement are discussed.

10.
This paper presents the structure and design criteria of a neural network-based multimedia information processing and analysis system (MIPAS) which can be used to deal with more complicated intelligence issues. According to the structure and design criteria, a software environment (SEMIPAS), which supports the implementation of multimedia information (image + speech, image + characters, speech + characters, image + speech + characters) processing and analysis applications, is implemented and introduced. Under this software environment, a multimedia information processing and analysis system called “To Know the World” is constructed. Experiments show that multimedia information processing and analysis is much more powerful and effective than single-medium information processing and analysis.

11.
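The out-of-vocabulary word problem in Chinese speech retrieval and a two-stage retrieval method   Total citations: 2, self-citations: 0, citations by others: 2
For large-scale Chinese speech retrieval tasks, this paper identifies the out-of-vocabulary (OOV) word problem in Chinese speech retrieval and proposes a two-stage retrieval method for OOV query terms. In Chinese speech recognition and retrieval, an OOV word can be recognized and retrieved as a sequence of in-vocabulary words, so it has been assumed that no OOV problem exists. This paper finds that retrieval performance for OOV query terms is far below that for in-vocabulary query terms, defines this as the OOV problem of the Chinese speech retrieval task, and proposes a two-stage retrieval method: the first stage improves recall through fuzzy phoneme matching, and the second stage improves precision through word lattice correction. Experiments show that the two-stage method greatly improves retrieval performance for typical OOV query terms, with a relative improvement of 24.1% in the FOM metric over the baseline system.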
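The two-stage idea in the abstract above (a permissive first pass for recall, a stricter second pass for precision) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the abstract does not specify the matching rule or the lattice correction, so stage one uses plain edit distance over phoneme windows, and stage two is reduced to a filter over a hypothetical `confidence` callback.

```python
def edit_distance(a, b):
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (x != y)))  # substitution or match
        prev = cur
    return prev[-1]


def stage_one(query, phones, max_dist):
    """High-recall pass: keep every window within max_dist edits of the query."""
    m = len(query)
    candidates = []
    for i in range(len(phones) - m + 1):
        d = edit_distance(query, phones[i:i + m])
        if d <= max_dist:
            candidates.append((i, d))
    return candidates


def stage_two(candidates, confidence, min_conf):
    """High-precision pass: keep candidates that a verifier scores highly.

    `confidence` is a hypothetical stand-in for the paper's lattice-based
    correction step, which the abstract does not describe in detail.
    """
    return [(i, d) for i, d in candidates if confidence(i) >= min_conf]


# Usage: the recognizer output contains "in" where the query has "ing".
query = ["b", "ei", "j", "ing"]
phones = ["w", "o", "b", "ei", "j", "in", "d", "a"]
hits = stage_one(query, phones, max_dist=1)
verified = stage_two(hits, confidence=lambda i: 0.9, min_conf=0.5)
```

Loosening `max_dist` raises recall at the cost of more false candidates, which is why a second verification pass is needed.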

12.
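To convert a speech segment into the correct character sequence when its language is unknown, language identification (LID) was integrated with speech recognition to build a Chinese-English large vocabulary continuous speech recognition (LVCSR) system. To decide the language of the speech as early as possible during recognition, and thereby reduce the decoding computation, language pruning during language identification was studied. The results show that a reasonable language pruning threshold can effectively reduce the system's computation and recognition time without degrading system performance.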
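As a rough illustration of threshold-based language pruning, the sketch below accumulates per-language log-likelihoods frame by frame and drops any language that falls more than a fixed margin behind the best one. The function name, the score format, and the pruning rule are assumptions made for this example; the abstract does not give the authors' actual criterion.

```python
def prune_languages(frame_scores, threshold):
    """Return (language, frame_index) at the point where one language remains.

    frame_scores: list of dicts mapping language -> log-likelihood of that frame.
    threshold: maximum allowed gap between a language's accumulated score and
               the best accumulated score before the language is pruned.
    """
    if not frame_scores:
        raise ValueError("frame_scores must not be empty")
    active = set(frame_scores[0])
    totals = dict.fromkeys(active, 0.0)
    last = 0
    for t, scores in enumerate(frame_scores):
        last = t
        for lang in active:
            totals[lang] += scores[lang]
        best = max(totals[lang] for lang in active)
        # Keep only languages whose accumulated score is close to the best.
        active = {lang for lang in active if best - totals[lang] <= threshold}
        if len(active) == 1:
            return next(iter(active)), t  # early decision: decode one language only
    # No early decision: fall back to the best-scoring surviving language.
    return max(active, key=totals.get), last


frames = [
    {"zh": -1.0, "en": -1.5},
    {"zh": -1.0, "en": -2.0},
    {"zh": -1.0, "en": -3.0},
]
decision = prune_languages(frames, threshold=2.0)
```

A smaller threshold commits to a language earlier and saves more decoding work, but increases the risk of pruning the correct language.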

13.
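HTK-based acoustic modeling of Uyghur continuous speech   Total citations: 3, self-citations: 1, citations by others: 2
Uyghur belongs to the Turkic branch of the Altaic language family and is an agglutinative language. Based on the characteristics of Uyghur, this paper analyzes and designs the overall structure of a Uyghur speech recognition system, discusses how to select the best recognition unit for Uyghur, proposes building context-dependent models based on decision-tree clustering, and uses Gaussian mixture distributions (GMD) to fit the observation probability distributions, optimizing the HMM system for Uyghur continuous speech to improve recognition performance. Finally, comparative experiments are presented and conclusions are drawn, providing a basis for future research on Uyghur continuous speech recognition.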
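For reference, fitting HMM observation probabilities with a Gaussian mixture is conventionally written as below. The symbols are standard textbook notation, not taken from the abstract: $b_j$ is the observation density of state $j$, $\mathbf{o}_t$ the feature vector at time $t$, $K$ the number of mixture components, and $c_{jk}$ the mixture weights.

```latex
b_j(\mathbf{o}_t) = \sum_{k=1}^{K} c_{jk}\,
  \mathcal{N}\!\left(\mathbf{o}_t;\ \boldsymbol{\mu}_{jk},\ \boldsymbol{\Sigma}_{jk}\right),
\qquad c_{jk} \ge 0,\quad \sum_{k=1}^{K} c_{jk} = 1
```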

14.
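The retrieval performance of multimedia information is constrained by its high dimensionality, large data volume, and poor interpretability. This paper proposes a model of an intelligent multimedia information retrieval system based on natural language understanding. By applying natural language understanding, data mining, and self-feedback techniques, the system broadens the retrieval scope to some extent and improves retrieval accuracy.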

15.
Obtaining training material for rarely used English words and common given names from countries where English is not spoken is difficult due to excessive time, storage and cost factors. By considering pe...

16.
Cognitive science uses the notion of computational information processing to explain cognitive information processing. Some philosophers have argued that anything can be described as doing computational information processing; if so, it is a vacuous notion for explanatory purposes. An attempt is made to explicate the notions of cognitive information processing and computational information processing and to specify the relationship between them. It is demonstrated that the resulting notion of computational information processing can only be realized in a restrictive class of dynamical systems called physical notational systems (after Goodman's theory of notationality), and that the systems generally appealed to by cognitive science (physical symbol systems) are indeed such systems. Furthermore, it turns out that other alternative conceptions of computational information processing, Fodor's (1975) Language of Thought and Cummins' (1989) Interpretational Semantics, appeal to substantially the same restrictive class of systems. The necessary connection of computational information processing with notationality saves the enterprise from charges of vacuousness and has some interesting implications for connectionism. But, unfortunately, it distorts the subject matter and entails some troubling consequences for a cognitive science which tries to make notationality do the work of genuine mental representations.

17.
18.
The success of information system development involving multi-organizational collaboration can depend heavily on effective knowledge sharing across boundaries. This paper reports on a comparative examination of knowledge sharing in two separate networks of public sector organizations participating in information technology innovation projects in New York State. As is typical of innovations resulting from recent government reforms, the knowledge sharing in these cases is a critical component of the information system development, involving a mix of tacit, explicit, and interactional forms of sharing across organizational boundaries. In one case the sharing is among state agencies and in the other across state and local government agencies. Using interviews, observations and document analysis, the longitudinal case studies follow knowledge sharing and other interactions in the interorganizational networks of these two distinct settings. Results confirm the difficulty of sharing knowledge across agencies, and further reveal the influences of several relevant factors (incentives, risks and barriers for sharing, and trust) on the effectiveness of knowledge sharing. The results contribute to theory on knowledge sharing processes in multi-organizational public sector settings and provide practical guidance for developing effective sharing relationships in collaborative cross-boundary information system initiatives. The research reported here is supported by the National Science Foundation grant #SES-9979839. The views and conclusions expressed in this report are those of the authors alone and do not reflect the views or policies of the National Science Foundation.

19.
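This paper presents a method for extracting definition sentences from Chinese text. The main features of the method are as follows: given the term being defined (a character string), the Boyer-Moore algorithm is used to locate the string in the text, and the sentence containing it is then searched for predicates that match definitional characteristics. Building on this, predicate ambiguity is resolved according to features of the predicate string, and the different grammatical units that modify the predicate in the definition sentence are identified according to the parsing results, thereby achieving definition sentence extraction based on recognition of string and syntactic features.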
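The first two steps described above (locate the term, then look for a definitional predicate in the same sentence) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses the Boyer-Moore-Horspool simplification rather than full Boyer-Moore, the cue list `DEFINITION_CUES` is hypothetical, and the disambiguation and parsing steps are omitted.

```python
import re

# Hypothetical cue predicates; the paper does not list its actual predicate set.
DEFINITION_CUES = ("是指", "指的是", "称为", "定义为")


def horspool_find_all(text, pattern):
    """Return the start index of every occurrence of pattern in text
    using the Boyer-Moore-Horspool bad-character rule."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    # Shift distance for each character of the pattern except the last one.
    shift = {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}
    hits, pos = [], 0
    while pos <= n - m:
        if text[pos:pos + m] == pattern:
            hits.append(pos)
        # Shift according to the text character under the pattern's last position.
        pos += shift.get(text[pos + m - 1], m)
    return hits


def candidate_definitions(text, term):
    """Return sentences where a cue predicate follows the term."""
    sentences = re.findall(r"[^。!?]+[。!?]?", text)
    found = []
    for sentence in sentences:
        hits = horspool_find_all(sentence, term)
        if hits and any(cue in sentence[hits[0] + len(term):]
                        for cue in DEFINITION_CUES):
            found.append(sentence.strip())
    return found


sample = "语料库是指经过整理的大规模文本集合。我们使用了一个语料库。"
definitions = candidate_definitions(sample, "语料库")
```

Only the first sentence is returned here, because the second mentions the term without any cue predicate after it.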

20.
Ergonomics, 2012, 55(1): 43-55
The aim of the study was to determine the influence of textual feedback on the content and outcome of spoken interaction with a natural language dialogue system. More specifically, the assumption that textual feedback could disrupt spoken interaction was tested in a human–computer dialogue situation. In total, 48 adult participants, familiar with the system, had to find restaurants based on simple or difficult scenarios using a real natural language service system in a speech-only (phone), speech plus textual dialogue history (multimodal) or text-only (web) modality. The linguistic contents of the dialogues differed as a function of modality, but were similar whether the textual feedback was included in the spoken condition or not. These results add to burgeoning research efforts on multimodal feedback, in suggesting that textual feedback may have little or no detrimental effect on information searching with a real system.

Statement of Relevance: The results suggest that adding textual feedback to interfaces for human–computer dialogue could enhance spoken interaction rather than create interference. The literature currently suggests that adding textual feedback to tasks that depend on the visual sense benefits human–computer interaction. The addition of textual output when the spoken modality is heavily taxed by the task was investigated.
