20 similar documents found (search time: 15 ms)
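With the development of large-vocabulary continuous speech recognition, more and more researchers choose initials and finals as recognition units. In initial/final-based Mandarin continuous speech recognition, accurate segmentation of the initial and final units is a very important step. Taking the acoustic characteristics of Mandarin pronunciation into account, a strategy is proposed that combines an initial-based segmentation method with an inter-segment distance method. Experimental results show that the method achieves accurate segmentation.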

A speech prosthesis has been developed based on the following idea. When a handicapped person such as a laryngectomee tries to speak in vain, the movements of the mouth, tongue, etc., are elicited. By detecting the movements, what he or she is trying to say can be determined. Then a speech synthesizer is driven to produce a voice of good quality.

We explore the use of clock skew of a wireless local area network access point (AP) as its fingerprint to detect unauthorized APs quickly and accurately. The main goal behind using clock skews is to overcome one of the major limitations of existing solutions: the inability to effectively detect Medium Access Control (MAC) address spoofing. We calculate the clock skew of an AP from the IEEE 802.11 Time Synchronization Function (TSF) time stamps sent out in the beacon/probe response frames. We use two different methods for this purpose, one based on linear programming and the other on a least-squares fit. We supplement these methods with a heuristic for differentiating original packets from those sent by fake APs. We collect TSF time stamp data from several APs in three different residential settings. Using our measurement data as well as data obtained from a large conference setting, we find that clock skews remain consistent over time for the same AP but vary significantly across APs. Furthermore, we improve the resolution of the received time stamps of the frames and show that with this enhancement, our methodology can find clock skews very quickly, using 50-100 packets in most cases. We also discuss and quantify the impact of various external factors, including temperature variation, virtualization, clock source selection, and NTP synchronization, on clock skews. Our results indicate that the use of clock skews appears to be an efficient and robust method for detecting fake APs in wireless local area networks.
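
The least-squares variant mentioned in this abstract can be illustrated with a minimal sketch. It assumes each beacon yields a pair (receiver arrival time, TSF time stamp), both in microseconds, and estimates the skew as the slope of the offset between the two clocks plotted against receiver time. The function name, the input format, and the parts-per-million scaling are illustrative assumptions, not the authors' implementation.

```python
def estimate_clock_skew_ppm(rx_times_us, tsf_times_us):
    """Least-squares slope of the (TSF - receiver) offset vs. receiver time, in ppm."""
    if len(rx_times_us) != len(tsf_times_us) or len(rx_times_us) < 2:
        raise ValueError("need at least two paired time stamps")
    # Work relative to the first sample so the numbers stay small.
    x = [t - rx_times_us[0] for t in rx_times_us]
    y = [(tsf - tsf_times_us[0]) - xi for tsf, xi in zip(tsf_times_us, x)]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("receiver time stamps must not all be equal")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    # The slope is dimensionless (us per us); scale it to parts per million.
    return (sxy / sxx) * 1e6


# Synthetic data (hypothetical): beacons every 102400 us from an AP
# whose clock runs 25 ppm fast relative to the receiver.
rx = [i * 102400 for i in range(100)]
tsf = [1_000_000 + t * (1 + 25e-6) for t in rx]
skew = estimate_clock_skew_ppm(rx, tsf)
```

On real captures the offsets are noisy and include outliers from delayed frames, which is presumably why the abstract also mentions a linear-programming method and a filtering heuristic; neither is reproduced here.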

This paper presents the design of a speech recognition IC using hidden Markov models (HMMs) with continuous observation densities. Results of offline and live recognition tests are also given. Our design employs a table look-up method to simplify the computation and hence the architecture of the circuit. Currently each state of the HMMs is represented by a double-mixture Gaussian distribution. With minor modifications, the proposed architecture can be extended to implement a recognizer in which models with higher-order multi-mixture Gaussian distributions are used for more precise acoustic modeling. The test chip is fabricated in a 0.35 μm CMOS technology. The maximum operating frequency is 62.5 MHz at 3.3 V. For a 50-word vocabulary, the estimated recognition time is about 0.16 s. Using noise-corrupted utterances, the recognition accuracy is 93.8% for isolated English digits. This performance is comparable to that of a software implementation of the same algorithm. A live recognition test was also run for a vocabulary of 11 Chinese words. The accuracy is 91.8% for five male and five female speakers.
Wei Han

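Speech recognition based on GDTW+SVM   Total citations: 3, self-citations: 0, citations by others: 3
To address the problem that speech signals yield feature-parameter sequences of differing dimensions after feature extraction, this paper proposes a speech recognition method based on an SVM with a GDTW kernel. The method first extracts features from the speech signal, maps the feature vectors into a high-dimensional feature space through the GDTW kernel, and then applies support vector machine classification in that space. Experiments show that, compared with the DTW algorithm and neural network methods, the approach is feasible and significantly improves the recognition rate of speech signals.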
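
The idea of comparing feature sequences of different lengths through a DTW-based kernel can be sketched as follows. The abstract does not give the kernel's exact form, so the Gaussian form `exp(-gamma * DTW distance)`, the Euclidean local distance, and the names `dtw_distance`, `gdtw_kernel`, and `gamma` are all assumptions made for illustration.

```python
import math


def dtw_distance(a, b):
    """DTW distance between two sequences of equal-dimension feature vectors."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = cheapest alignment of the first i frames of a
    # with the first j frames of b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])  # Euclidean frame distance
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


def gdtw_kernel(a, b, gamma=0.1):
    """Assumed kernel form: Gaussian-style function of the DTW distance."""
    return math.exp(-gamma * dtw_distance(a, b))


# Two toy sequences of different length that differ only by a repeated
# frame, and a third that does not match.
seq_a = [(0.0,), (1.0,), (2.0,)]
seq_b = [(0.0,), (1.0,), (1.0,), (2.0,)]
seq_c = [(0.0,), (3.0,)]
same = gdtw_kernel(seq_a, seq_b)
different = gdtw_kernel(seq_a, seq_c)
```

A kernel matrix built this way could be passed to an SVM that accepts precomputed kernels, which is one way to read the "map, then classify" pipeline the abstract describes.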

An architecture is presented for real-time continuous speech recognition based on a modified hidden Markov model. The algorithm is adapted to the needs of continuous speech recognition by efficient encoding of the state space and logarithmic encoding of the weights, so that products can be computed as sums. The paper presents the algorithm and its application-related modifications, the mapping of the algorithm to a special-purpose architecture, and the detailed design of this architecture using configurable logic. Emphasis is placed on how the attributes of the algorithm are exploited in a configurable-logic-based design. A concrete design example is presented with a coprocessor engine having one large FPGA, 64 Mbytes of synchronous DRAM (SDRAM), a small FPGA as an SDRAM controller, and 2 Mbytes of SRAM. This engine, operating at 66 MHz, performs roughly nine times as fast as a high-end personal computer running a fully optimized version of the same algorithm.
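
The remark that logarithmic encoding lets products be computed as sums can be illustrated with a generic log-domain Viterbi search. This is a textbook-style sketch on a toy two-state model, not the modified algorithm or the hardware mapping of the paper; the state names, probabilities, and function name are hypothetical.

```python
import math


def viterbi_log(obs, states, log_start, log_trans, log_emit):
    """Return (best_log_prob, best_path); path scores are sums of log weights."""
    # score[s] = best log-probability of any path that ends in state s.
    score = {s: log_start[s] + log_emit[s][obs[0]] for s in states}
    backptrs = []
    for o in obs[1:]:
        new_score, ptr = {}, {}
        for s in states:
            prev = max(states, key=lambda p: score[p] + log_trans[p][s])
            new_score[s] = score[prev] + log_trans[prev][s] + log_emit[s][o]
            ptr[s] = prev
        score = new_score
        backptrs.append(ptr)
    # Trace the best path backwards from the best final state.
    last = max(states, key=lambda s: score[s])
    path = [last]
    for ptr in reversed(backptrs):
        path.append(ptr[path[-1]])
    path.reverse()
    return score[last], path


def logs(table):
    """Convert a dict of probabilities to a dict of log-probabilities."""
    return {k: math.log(v) for k, v in table.items()}


# Toy model (hypothetical numbers).
states = ["H", "F"]
start = {"H": 0.6, "F": 0.4}
trans = {"H": {"H": 0.7, "F": 0.3}, "F": {"H": 0.4, "F": 0.6}}
emit = {
    "H": {"normal": 0.5, "cold": 0.4, "dizzy": 0.1},
    "F": {"normal": 0.1, "cold": 0.3, "dizzy": 0.6},
}
best_log_prob, best_path = viterbi_log(
    ["normal", "cold", "dizzy"],
    states,
    logs(start),
    {s: logs(row) for s, row in trans.items()},
    {s: logs(row) for s, row in emit.items()},
)
```

Because every update is an addition followed by a maximum, the inner loop needs no multiplier, which is the property a configurable-logic design can exploit.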

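The principles of continuous speech recognition are briefly analyzed, and a retrieval system for massive volumes of multimedia news material built on a speech recognition grid is introduced. The technology significantly improves how well material in a multimedia news production and broadcasting system is managed as an asset, and brings revolutionary change to multimedia content retrieval for audio and video media. Taking China Radio International (CRI) as an example, the practical results delivered by speech recognition grid technology are described.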

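Embedded systems are gradually becoming the platform of choice for practical speech recognition applications. This paper studies the factors that determine the computational complexity of HMM continuous speech recognition on an embedded platform, and proposes a slimmed-down computation method that combines feature-coefficient masking with integrated pruning, reducing computational complexity while maintaining the recognition rate. Experimental data from the embedded platform show that, compared with the baseline system, the slimmed HMM continuous speech recognition system reduces computation time from 100% to 27.91% of the baseline, while the recognition rate drops only from 89.65% to 89.41%.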

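The basic principles of speech recognition are introduced, along with some principles and methods for implementing speech recognition algorithms on the fixed-point digital signal processor ADSP2181.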

方杰  李英  钱红 《电声技术》2006,(8):46-49
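Building on a study of the double-threshold comparison method, a fixed-threshold, three-pass search method for speech endpoint detection is proposed. It consists of three parts: multi-word detection, endpoint repair, and missed-endpoint detection. It effectively resolves the threshold-setting problem that the double-threshold method faces when detecting the endpoints of continuous words; provided the speech signal is normalized, a single threshold suffices to detect the endpoints accurately. At relatively low signal-to-noise ratios, endpoint detection based on the short-time average magnitude of the signal's short-time relative autocorrelation sequence achieves high detection accuracy.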

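Combining acoustic models of different units in Mandarin continuous speech recognition   Total citations: 1, self-citations: 0, citations by others: 1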
张辉  杜利民 《电子与信息学报》2006,28(11):2045-2049
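This paper studies the combination of acoustic models trained on different acoustic units. In Mandarin continuous speech recognition, the popular units include context-dependent initial/final units and phoneme units. Experiments show that some Mandarin syllables are recognized more accurately with the initial/final model and others with the phoneme model. A method for combining the two acoustic models is proposed: both models are used simultaneously during recognition, while the model that causes a low recognition rate is avoided. Experiments show that with this method the syllable error rate decreases by 9.60% and 6.10% compared with the phoneme model and the initial/final model, respectively.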

This letter proposes the use of vowel sound detection for voice activity detection. Vowels have distinctive spectral peaks, which are likely to remain higher than their surroundings even after severe corruption. Therefore, by developing a method of detecting the spectral peaks of vowel sounds in corrupted signals, voice activity can also be detected, even in low signal-to-noise ratio (SNR) conditions. Experimental results indicate that the proposed algorithm performs reliably under various noise and low-SNR conditions. This method is suitable for mobile environments where the characteristics of noise may not be known in advance.

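A pattern-recognition-based method for obtaining blind-separated speech signals   Total citations: 1, self-citations: 1, citations by others: 0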
徐舜  刘郁林  柏森 《电声技术》2006,(12):38-42,46
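In batch blind separation of audio signals, it is difficult to splice the separated speech frames into continuous speech, mainly because blind signal separation leaves the ordering of the separated signals indeterminate. Based on the short-time stationarity of speech, the authors use the pitch period as the window length to segment the blind-separated signals continuously, take the average of the normalized autocorrelation function values of the data in each window as the pattern feature of the audio signal, and finally cluster the signals according to a similarity threshold and the minimum-distance principle. This overcomes the ordering indeterminacy when extracting blind-separated speech signals and yields continuous speech. Simulation experiments demonstrate the effectiveness of the method.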

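Speaker-dependent medium-vocabulary continuous speech recognition based on HMM/VQ   Total citations: 2, self-citations: 2, citations by others: 0
This paper discusses a continuous speech recognition method based on hidden Markov models (HMMs) and vector quantization (VQ). In this method, one HMM is built for each word, and the best state-transition path is searched in a state-transition network composed of multiple models, so that continuous speech is recognized without prior word segmentation. Using a finite-state grammar constraint and several other measures that improve recognition performance, the demonstration system can recognize 18 English sentence patterns and 150 words from a specific speaker. In a test with 312 sentences (2,710 words in total), the recognition delay was 62% of the utterance duration, the average speaking rate was 2.32 words per second, and the word recognition accuracy was 97.3%.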

冯国友  戴扬  沈海斌  时晓东 《电子器件》2007,30(3):1098-1101
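Traditional speech endpoint detection methods use simple features such as the short-time energy and zero-crossing rate of the signal as decision parameters. In practical applications, especially when the signal-to-noise ratio is low, these methods cannot meet system requirements. This paper uses the zero-crossing/energy product difference (零能积差) as the basis for deciding whether a sampled signal frame is speech, and implements it in hardware. Results show that, compared with traditional methods, the module maintains a high recognition rate while increasing speed and reducing area, and has practical value.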
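
The two traditional features named in this abstract, short-time energy and zero-crossing rate, can be computed per frame as in the sketch below. It covers only these baseline features, not the product-difference measure or the hardware module of the paper; the function name, the framing parameters, and the convention of treating zero as non-negative are assumptions.

```python
def frame_features(samples, frame_len, hop):
    """Return a list of (short_time_energy, zero_crossing_rate) per full frame."""
    if frame_len < 2 or hop < 1:
        raise ValueError("frame_len must be >= 2 and hop must be >= 1")
    feats = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        # Short-time energy: sum of squared samples in the frame.
        energy = sum(x * x for x in frame)
        # A zero crossing is a sign change between adjacent samples.
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        )
        feats.append((energy, crossings / (frame_len - 1)))
    return feats


# Toy signal: four samples of silence followed by a rapidly alternating burst.
signal = [0, 0, 0, 0, 1, -1, 1, -1]
features = frame_features(signal, frame_len=4, hop=4)
```

A simple detector would compare each feature with a threshold; the abstract's point is that such thresholds become unreliable at low signal-to-noise ratios.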

王守觉  曹文明 《电子学报》2006,34(2):267-271
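This paper first analyzes CASSANDRA-I, a semiconductor neural network processor that uses a PC as its host, then introduces the system implementation and functional characteristics of the new semiconductor neurocomputer CASSANDRA-II and applies it to the recognition of spoken greetings. Experimental results show that the recognition results of the CASSANDRA-II neurocomputer are better than those of the HMM model.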


An ASR system is built for continuous Kannada speech recognition. The acoustic and language models are created with the help of the Kaldi toolkit. The speech database is collected from native male and female Kannada speakers. 80% of the collected speech data is used for training the acoustic models and 20% is used for system testing. The performance of the system is presented in terms of Word Error Rate (WER). Wavelet Packet Decomposition along with a Mel filter bank is used for feature extraction. The proposed feature extraction performs slightly better than conventional features such as MFCC and PLP in terms of WRA and WER under uncontrolled conditions. For the speech corpus collected in the Kannada language, the proposed features show an improvement in Word Recognition Accuracy (WRA) of 1.79% over the baseline features.
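
Word Error Rate, the metric reported in this abstract, is conventionally the word-level edit distance between the reference and the recognizer output, divided by the number of reference words. The sketch below shows that standard computation on whitespace-tokenized strings; it is not Kaldi's scoring script, and the function name and the example sentences are hypothetical.

```python
def word_error_rate(reference, hypothesis):
    """(substitutions + deletions + insertions) / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] = edit distance between the processed reference prefix
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(
                prev[j] + 1,             # deletion of a reference word
                cur[j - 1] + 1,          # insertion of a hypothesis word
                prev[j - 1] + (r != h),  # match or substitution
            ))
        prev = cur
    return prev[-1] / len(ref)


# One substitution ("sat" -> "sit") and one deletion ("the") over 6 words.
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.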


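A Mandarin continuous speech recognition method based on a three-dimensional Viterbi algorithm   Total citations: 1, self-citations: 0, citations by others: 1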
赵力  邹采荣  吴镇扬 《电子学报》2000,28(7):67-69,58
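This paper proposes a Mandarin continuous speech recognition method based on a Viterbi algorithm in three-dimensional space. The method uses hidden Markov models (HMMs) for 60 phoneme units and HMMs for 8 tone units as the basic recognition models. The recognition results of the phoneme models and the tone models are integrated by a Viterbi algorithm over the three-dimensional space formed by the phoneme HMM states, the tone HMM states, and time.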

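As an important component of Mandarin speech recognition, tone recognition plays a key role. A new context-dependent model-based recognition method is proposed to improve the recognition rate in Mandarin continuous speech. The extraction and processing of the pitch contour used for tone recognition are introduced first; six features are then proposed to describe the trend of the pitch contour, with explicit formulas given for computing them. Using these features, and taking into account the changes to the pitch contour caused by the correlation between neighboring syllables in continuous speech, fine-grained tone models are built...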

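Speech emotion recognition using fuzzy entropy for parameter effectiveness analysis   Total citations: 5, self-citations: 0, citations by others: 5
This paper uses fuzzy entropy theory to analyze the uncertainty of speech emotion feature parameters with respect to the emotion patterns to be recognized, and proposes a method for measuring the effectiveness of emotion parameters with fuzzy entropy. Combining this effectiveness analysis with fuzzy comprehensive evaluation for emotion recognition of emotional speech signals gives good results.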
