Similar literature
20 similar documents found
1.
Proposes a speech recognition system based on the Sunplus SPCE061A microcontroller and applies speech recognition technology to household lighting.

2.
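Analyzes the state of development and key techniques of speech emotion recognition, and applies a speech emotion recognition method based on the hidden Markov model to a robot, so that the robot can recognize the emotional information in human speech and respond with a corresponding emotional expression. The method works well in the service robot we developed, which can recognize emotion in human speech and interact with people to some extent.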

3.
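Speech recognition is one of the most fundamental topics in artificial intelligence. By applying the hidden Markov model, a mathematical model, across domains, researchers have addressed the correlation of statistical knowledge from acoustics, linguistics, and syntax. This article systematically explains the principle of the hidden Markov model and how it is applied in speech recognition, so that more researchers can become familiar with it.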

4.
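With the development of computer technology, artificial intelligence products are now widely used in many fields. Using regional dialects to communicate with such products has become an important research direction in human-computer interaction. Chongqing, in southwest China, is positioned by the state as an international metropolis where cultures from around the world converge with the flow of people. The Chongqing dialect, which carries the local culture, is the main language of communication in the region, so research on Chongqing-dialect speech recognition helps localize artificial intelligence products. Taking the Chongqing dialect as the object of study, this paper builds small corpora of the Chongqing dialect and of Chongqing-accented Mandarin and sets up a speech recognition system with HMM acoustic models. Acoustic models trained on each variety are used to recognize both varieties. Experiments show that each acoustic model recognizes its own variety with 100% accuracy; the Chongqing-dialect acoustic model recognizes Chongqing-accented Mandarin with 78.89% accuracy, and the Chongqing-accented Mandarin acoustic model recognizes the Chongqing dialect with 91.67% accuracy.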

5.
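A survey of speech recognition   Total citations: 1, self-citations: 0, citations by others: 1
Speech is an interdisciplinary field with far-reaching research value. After nearly 50 years of research, speech recognition technology has advanced greatly, but most products still exist only in the laboratory and have not reached practical use, so research on speech recognition needs to go deeper. This paper introduces the development of speech recognition, a speech system framework and the recognition process, the concept and construction of the HMM, and the open problems in speech recognition together with proposed solutions.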

6.
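Research on an intelligent speech recognition algorithm for robots   Total citations: 1, self-citations: 0, citations by others: 1
Zhou Lulu (周璐璐), Deng Jianghong (邓江洪). Computer Measurement & Control (《计算机测量与控制》), 2014, 22(10): 3267-3269, 3273
To address the low recognition rate of intelligent robots in speaker-independent speech recognition, a dual-threshold endpoint detection algorithm is proposed that accurately detects speech endpoints. Fractal dimension is combined with Mel-frequency cepstral coefficients (MFCC), and, based on the hidden Markov model (HMM), a command recognition system for intelligent robots is proposed. In a laboratory environment, the speech of 5 male and 5 female speakers was recorded with Cool Edit at a sampling rate of 8 kHz and 16-bit precision. The material consists of 5 command words, each recorded 6 times; each speaker's first 3 utterances serve as template speech and the last 3 as test speech. The experimental results show that the system recognition rate can exceed 85%, and that the feature parameters combining MFCC and fractal dimension improve the recognition rate and system performance. The method is feasible and effective for speaker-independent intelligent speech recognition.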
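
The abstract above names dual-threshold endpoint detection but does not give the algorithm. The following is only a minimal sketch of the general idea, assuming short-time energy is the sole feature (practical detectors often add zero-crossing rate). The function names, the frame sizes, and the threshold values are illustrative assumptions, not taken from the paper.

```python
def frame_energies(samples, frame_len=80, hop=40):
    """Short-time energy per frame (sum of squared samples).

    frame_len=80 is 10 ms at an 8 kHz sampling rate; both sizes are assumptions.
    """
    return [
        sum(x * x for x in samples[start:start + frame_len])
        for start in range(0, len(samples) - frame_len + 1, hop)
    ]


def detect_endpoints(energies, high, low):
    """Return (start_frame, end_frame) of the first speech segment, or None.

    Speech is confirmed where the energy exceeds `high`; the segment is then
    extended in both directions while the energy stays above `low`.
    """
    # Step 1: find the first frame that clears the high threshold.
    peak = next((i for i, e in enumerate(energies) if e > high), None)
    if peak is None:
        return None
    # Step 2: walk backwards to the onset using the low threshold.
    start = peak
    while start > 0 and energies[start - 1] > low:
        start -= 1
    # Step 3: walk forwards to the offset using the low threshold.
    end = peak
    while end + 1 < len(energies) and energies[end + 1] > low:
        end += 1
    return start, end
```

Using two thresholds lets the high one reject noise bursts while the low one recovers the weak onset and tail of a word, which a single threshold would either clip or confuse with noise.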

7.
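As robotics continues to develop, this paper proposes speech recognition as an intelligent mode of human-robot interaction. After studying the basic principles of HMM-based speech recognition, an isolated-word speech recognition system was built on a laboratory robot platform using the open-source HTK and Julius platforms. The system can extract voice commands for controlling the robot.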

8.
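This paper studies the design and implementation of a speech recognition industrial robot system based on an embedded system and a DSP. The embedded-plus-DSP scheme gives a better balance among the robot's performance, cost, configurability, and extensibility. For speech recognition, an improved MFCC method is used for feature extraction, and an HMM based on K-means segmentation is used for real-time speech learning and recognition, which improves the real-time performance and portability of the algorithm.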

9.
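Application of HMM-based speech recognition technology in embedded systems   Total citations: 8, self-citations: 0, citations by others: 8
Introduces the status and development of speech recognition in embedded systems and the advantages of using speech recognition algorithms in embedded systems, and describes a system based on HMM speech recognition technology.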

10.
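Speech recognition is an integrated technology involving many disciplines, and it is already widely applied in industry, the military, and medicine, as well as in product inspection and human-machine voice communication. It has long been a research hotspot, but existing speech recognition systems run slowly, cost a lot, and are inconvenient to use. These shortcomings limit recognition speed, hardware implementation, and application. Speech recognition for intelligent robots is especially difficult in noisy environments. Research on recognition technology for industrial intelligent robots is also attracting growing attention.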

11.
The aim of this work is to show the ability of stochastic regular grammars to generate accurate language models which can be well integrated, allocated and handled in a continuous speech recognition system. For this purpose, a syntactic version of the well-known n-gram model, called k-testable language in the strict sense (k-TSS), is used. The complete definition of a k-TSS stochastic finite state automaton is provided in the paper. One of the difficulties arising in representing a language model through a stochastic finite state network is that the recursive schema involved in the smoothing procedure must be adopted in the finite state formalism to achieve an efficient implementation of the backing-off mechanism. The use of the syntactic back-off smoothing technique applied to k-TSS language modelling allowed us to obtain a self-contained smoothed model integrating several k-TSS automata in a unique smoothed and integrated model, which is also fully defined in the paper. The proposed formulation leads to a very compact representation of the model parameters learned at training time: probability distribution and model structure. The dynamic expansion of the structure at decoding time allows an efficient integration in a continuous speech recognition system using a one-step decoding procedure. An experimental evaluation of the proposed formulation was carried out on two Spanish corpora. These experiments showed that regular grammars generate accurate language models (k-TSS) that can be efficiently represented and managed in real speech recognition systems, even for high values of k, leading to very good system performance.

12.
In a real environment, acoustic and language features often vary depending on the speakers, speaking styles and topic changes. To accommodate these changes, speech recognition approaches that include the incremental tracking of changing environments have attracted attention. This paper proposes a topic tracking language model that can adaptively track changes in topics based on current text information and previously estimated topic models in an on-line manner. The proposed model is applied to language model adaptation in speech recognition. We use the MIT OpenCourseWare corpus and Corpus of Spontaneous Japanese in speech recognition experiments, and show the effectiveness of the proposed method.

13.

Speech recognition is a fascinating process that offers the opportunity to interact and command the machine in the field of human-computer interactions. Speech recognition is a language-dependent system constructed directly based on the linguistic and textual properties of any language. Automatic speech recognition (ASR) systems are currently being used to translate speech to text flawlessly. Although ASR systems are being strongly executed in international languages, ASR systems’ implementation in the Bengali language has not reached an acceptable state. In this research work, we sedulously disclose the current status of the Bengali ASR system’s research endeavors. In what follows, we acquaint the challenges that are mostly encountered while constructing a Bengali ASR system. We split the challenges into language-dependent and language-independent challenges and guide how the particular complications may be overhauled. Following a rigorous investigation and highlighting the challenges, we conclude that Bengali ASR systems require specific construction of ASR architectures based on the Bengali language’s grammatical and phonetic structure.


14.
In this paper, the architecture of the first Iranian Farsi continuous speech recognizer and syntactic processor is introduced. In this system, by extracting suitable features of the speech signal (cepstral, delta-cepstral, energy and zero-crossing rate) and using a hybrid architecture of neural networks (a Self-Organizing Feature Map, SOFM, at the first stage and a Multi-Layer Perceptron, MLP, at the second stage), the Iranian Farsi phonemes are recognized. Then the string of phonemes is corrected, segmented and converted to formal text by using a non-stochastic method. For syntactic processing, the symbolic (using artificial intelligence techniques) and connectionist (using artificial neural networks) approaches are used to determine the correctness, position and kind of syntactic errors in Iranian Farsi sentences as well.

15.
A cache-based natural language model for speech recognition   Total citations: 4, self-citations: 0, citations by others: 4
Speech-recognition systems must often decide between competing ways of breaking up the acoustic input into strings of words. Since the possible strings may be acoustically similar, a language model is required; given a word string, the model returns its linguistic probability. Several Markov language models are discussed. A novel kind of language model which reflects short-term patterns of word use by means of a cache component (analogous to cache memory in hardware terminology) is presented. The model also contains a 3g-gram component of the traditional type. The combined model and a pure 3g-gram model were tested on samples drawn from the Lancaster-Oslo/Bergen (LOB) corpus of English text. The relative performance of the two models is examined, and suggestions for future improvements are made.
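
The abstract above does not state how the cache component and the static component are combined. One common way to combine two such estimates is linear interpolation, and the sketch below assumes exactly that, with a plain word-frequency cache over the most recent words. The class name, the `static_prob` callable, the cache size, and the interpolation weight are all hypothetical and are not the paper's formulation.

```python
from collections import Counter, deque


class CacheLanguageModel:
    """Interpolates a static model with a cache of recently seen words."""

    def __init__(self, static_prob, cache_size=200, cache_weight=0.2):
        # static_prob: callable (word, history) -> probability, e.g. an n-gram model.
        self.static_prob = static_prob
        self.cache = deque(maxlen=cache_size)
        self.counts = Counter()
        self.cache_weight = cache_weight

    def observe(self, word):
        """Add a recognized word to the cache, evicting the oldest if full."""
        if len(self.cache) == self.cache.maxlen:
            self.counts[self.cache[0]] -= 1  # this word is about to be evicted
        self.cache.append(word)
        self.counts[word] += 1

    def prob(self, word, history=()):
        """P(word) = (1 - w) * P_static(word | history) + w * P_cache(word)."""
        p_static = self.static_prob(word, history)
        if not self.cache:
            return p_static  # nothing cached yet, fall back to the static model
        p_cache = self.counts[word] / len(self.cache)
        return (1 - self.cache_weight) * p_static + self.cache_weight * p_cache
```

With this design, a word that has just been used gets a temporary boost that fades as it drops out of the cache, which is the short-term pattern of word use the abstract refers to.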

16.
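Proposes a method for implementing an emotional speech recognition system that combines hidden Markov models (HMM) with artificial neural networks (ANN). The construction of the system is described in terms of acquiring emotional speech material, extracting emotional speech features, and recognizing emotional speech. The system extracts emotional speech feature parameters, trains emotional speech model parameters, and recognizes recorded emotional speech. The results show that the system recognizes emotional speech well.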

17.
Use of the architectural features of transputers to provide realtime matching of speech recognition is considered. The dynamic time warping algorithm has been implemented in occam and tested initially for matching a single word against a single template. The implementation has been extended to the multitemplate case and preliminary results for this are given. Logical extension to a multitransputer system is also discussed. The single-transputer implementation has been undertaken using a T414 transputer resident in an IBM XT personal computer on an Inmos B004 evaluation board with 2 Mbyte of RAM.
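
For readers unfamiliar with dynamic time warping, the sketch below shows the textbook recurrence and a multitemplate match. It is written in Python rather than occam, uses one scalar feature per frame instead of feature vectors, and does not reflect the paper's transputer implementation. The function names and the template dictionary are illustrative assumptions.

```python
def dtw_distance(a, b):
    """Cumulative DTW cost between two sequences of scalar features."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] is the best cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Allowed steps: advance in a only, in b only, or in both.
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


def best_template(utterance, templates):
    """Multitemplate case: name of the template with the lowest DTW cost."""
    return min(templates, key=lambda name: dtw_distance(utterance, templates[name]))
```

Each template comparison is independent of the others, so in the multitemplate case the calls to `dtw_distance` can run in parallel, which is one natural way to spread the work over several processors.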

18.
19.
Language modeling for large-vocabulary conversational Arabic speech recognition is faced with the problem of the complex morphology of Arabic, which increases the perplexity and out-of-vocabulary rate. This problem is compounded by the enormous dialectal variability and differences between spoken and written language. In this paper, we investigate improvements in Arabic language modeling by developing various morphology-based language models. We present four different approaches to morphology-based language modeling, including a novel technique called factored language models. Experimental results are presented for both rescoring and first-pass recognition experiments.

20.
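To improve the reliability and efficiency of speech recognition, a speech recognition system was designed around a dual-CPU "MCU+DSP" architecture. The DSP[1] chip serves as the main processor of the hardware platform and performs the computation required for speech recognition. The MCU assists the DSP's computation and controls the movements of each part of the robot; the system's performance meets the requirement for real-time processing.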
