Similar documents
 Found 20 similar documents; search took 15 ms
1.
2.
Conventional Hidden Markov Model (HMM) based Automatic Speech Recognition (ASR) systems generally use cepstral features as acoustic observations and phonemes as basic linguistic units. Among the most powerful features currently used in ASR systems are Mel-Frequency Cepstral Coefficients (MFCCs). Speech recognition is inherently complicated by variability in the speech signal, both within and across speakers. This variability leads to several kinds of mismatch between acoustic features and acoustic models and hence degrades system performance. The sensitivity of MFCCs to this variability has motivated many researchers to investigate new sets of speech feature parameters that make the acoustic models more robust and thus improve system performance. Combining diverse acoustic feature sets has great potential to enhance the performance of ASR systems. This paper is part of an ongoing research effort to build an accurate Arabic ASR system for teaching and learning purposes. It addresses the integration of complementary features into standard HMMs to make them more robust and thus improve their recognition accuracy. The complementary features investigated in this work are voiced formants and pitch, in combination with conventional MFCC features. A series of experiments under various combination strategies was performed to determine which of these integrated features significantly improves system performance. The Cambridge HTK tools were used as the development environment, and experimental results showed that the error rate decreased; the results seem very promising, even without language models.
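
The abstract above does not say which combination strategies were tested. As one plausible reading, the following minimal sketch shows frame-level concatenation (early fusion) of MFCC, pitch, and formant values. The function name `combine_features`, the toy values, and the convention of using 0.0 for unvoiced frames are all assumptions made for illustration, not details from the paper.

```python
from typing import List, Sequence


def combine_features(mfcc: Sequence[Sequence[float]],
                     pitch: Sequence[float],
                     formants: Sequence[Sequence[float]]) -> List[List[float]]:
    """Concatenate per-frame MFCC, pitch, and formant values into one vector per frame."""
    if not (len(mfcc) == len(pitch) == len(formants)):
        raise ValueError("all feature streams must have the same number of frames")
    return [list(m) + [p] + list(f) for m, p, f in zip(mfcc, pitch, formants)]


# Toy values for illustration only: 2 frames, 3 MFCCs, 1 pitch value, 2 formants.
frames = combine_features(
    mfcc=[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]],
    pitch=[120.0, 0.0],                      # 0.0 marks an unvoiced frame (assumed convention)
    formants=[[700.0, 1200.0], [0.0, 0.0]],
)
```

Each combined vector would then serve as the observation for one frame; in practice the streams usually need normalization first, since their numeric ranges differ widely.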

3.
4.
Printed Arabic character recognition using HMM   Cited by: 1 (self-citations: 0, citations by others: 1)
Arabic has a very rich vocabulary. More than 200 million people speak it as their native language, and over 1 billion people use it in religion-related activities. This paper presents a new technique for recognizing printed Arabic characters. After a word is segmented, each character/word is transformed in its entirety into a feature vector. The features of printed Arabic characters include strokes and bays in various directions, endpoints, intersection points, loops, dots, and zigzags. The word skeleton is decomposed into a number of links in orthographic order and then converted into a sequence of symbols using vector quantization. A single hidden Markov model is used to recognize the printed Arabic characters. Experimental results show that achieving a high recognition rate depends on the number of states in each sample.

5.
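To improve the robustness of speech recognition, a new feature combination method is proposed. The method weights and optimizes Mel-frequency cepstral coefficients (MFCCs) based on the F-ratio and feeds different feature combinations into speech hidden Markov models (HMMs) for training to find the most noise-robust combination. Principal component analysis (PCA) is then used for dimensionality reduction, and a support vector machine (SVM) classifier is added as a post-processor. Experiments show that the combination of improved MFCCs, short-time average energy, and the Teager energy operator gives the best recognition performance, with a recognition rate of 90.48%. After PCA dimensionality reduction, the recognition rate drops by 0.4% while computation speed improves. With the post-processor added, the system recognition rate reaches 95.25%, which improves the system's recognition efficiency and classification decision capability; accuracy is higher than that of conventional recognition methods.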
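
The abstract does not give the weighting formula. A common definition of the F-ratio for one feature dimension is the variance of the class means divided by the average within-class variance, and the sketch below scales each coefficient by its F-ratio relative to the largest one. The function names, the normalization by the maximum, and the toy data are assumptions for illustration.

```python
from statistics import mean, pvariance
from typing import Dict, List


def f_ratio(features_by_class: Dict[str, List[List[float]]]) -> List[float]:
    """Per-dimension F-ratio: variance of class means / mean within-class variance."""
    dims = len(next(iter(features_by_class.values()))[0])
    ratios = []
    for d in range(dims):
        # One column of values per class for dimension d.
        columns = [[vec[d] for vec in vecs] for vecs in features_by_class.values()]
        between = pvariance([mean(col) for col in columns])
        within = mean([pvariance(col) for col in columns])
        ratios.append(between / within if within > 0 else 0.0)
    return ratios


def weight_vector(vec: List[float], ratios: List[float]) -> List[float]:
    """Scale each coefficient by its F-ratio, normalized by the largest F-ratio."""
    top = max(ratios)
    if top == 0:
        return list(vec)  # no dimension discriminates; leave the vector unchanged
    return [x * r / top for x, r in zip(vec, ratios)]


# Toy data: dimension 0 separates the two classes, dimension 1 does not.
toy = {
    "a": [[1.0, 0.0], [3.0, 1.0]],
    "b": [[11.0, 0.0], [13.0, 1.0]],
}
ratios = f_ratio(toy)
weighted = weight_vector([2.0, 4.0], ratios)
```

Dimensions that separate the classes well keep their scale, while dimensions that carry little class information are attenuated.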

6.
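Application of a hybrid SVM and HMM model to face recognition   Cited by: 2 (self-citations: 1, citations by others: 1)
Face recognition is performed by combining a support vector machine (SVM) with a hidden Markov model (HMM). The face in the photo is first located, and independent basis features of each facial organ are extracted from the located region; the hybrid SVM and HMM model is then used to recognize the face in the located region. By exploiting the combined strengths of SVM and HMM, a relatively high recognition rate is achieved.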

7.
This paper presents a new hybrid method for continuous Arabic speech recognition based on triphone modelling. To do this, we apply Support Vector Machines (SVM) as an estimator of posterior probabilities within standard Hidden Markov Models (HMM). We describe a new approach that categorises Arabic vowels into long and short vowels, applied in the labelling phase of the speech signals. Using this new labelling method, we find that the SVM/HMM hybrid model is more efficient than standard HMMs and than the hybrid Multi-Layer Perceptron (MLP)/HMM system. The results obtained for the triphone-based Arabic speech recognition system are 64.68% with HMMs, 72.39% with MLP/HMM, and 74.01% with the SVM/HMM hybrid model. The WER obtained for continuous speech recognition by the three systems confirms the performance of SVM/HMM, which achieves the lowest average over 4 tested speakers, 11.42%.
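
For reference, word error rate (WER) is conventionally computed as the word-level edit distance between the reference and the recognized transcript, divided by the number of reference words. The sketch below follows that standard definition; the function name and the example strings are illustrative, and the paper's own scoring tool is not specified in the abstract.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the processed reference prefix and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


# One substitution (b -> x) and one deletion (d) against 4 reference words.
wer = word_error_rate("a b c d", "a x c")
```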

8.
A text-to-speech (TTS) system, also known as a speech synthesizer, has become one of the important technologies of recent years owing to its expanding field of applications. Much work on speech synthesis has been done for English and French, whereas many other languages, including Arabic, have only recently been taken into consideration. Arabic speech synthesis has not made sufficient progress and is still at an early stage, with low speech quality. Speech synthesis systems face several problems (e.g. speech quality, articulatory effects). Different methods have been proposed to solve these issues, such as the use of large units of different sizes. This method is mainly implemented with the concatenative approach to improve speech quality, and several works have proved its effectiveness. This paper presents an efficient Arabic TTS system based on the statistical parametric approach and non-uniform unit speech synthesis. Our system includes a diacritization engine. Modern Arabic text is written without the vowels, also called diacritic marks. These marks are nevertheless essential for determining the correct pronunciation of the text, which is why a diacritization engine is incorporated into our system. In this work, we propose a simple approach based on deep neural networks, which are trained to directly predict the diacritic marks and to predict the spectral and prosodic parameters. Furthermore, we propose a new simple stacked neural network approach to improve the accuracy of the acoustic models. Experimental results show that our diacritization system generates fully diacritized text with high precision and that our synthesis system produces high-quality speech.

9.
This paper presents a method for reconstructing unreliable spectral components of speech signals using the statistical distributions of the clean components. Our goal is to model the temporal patterns in the speech signal and to exploit correlations between speech features in the time and frequency domains simultaneously. In this approach, a hidden Markov model (HMM) is first trained on clean speech data to model the temporal patterns that appear in the sequences of spectral components. Using this model and the probability of each noisy spectral component occurring in each state, probability distributions for the noisy components are estimated. Then, by applying maximum a posteriori (MAP) estimation to these distributions, the final estimates of the unreliable spectral components are obtained. The proposed method is compared with a common missing-feature method based on probabilistic clustering of the feature vectors and with a state-of-the-art method based on sparse reconstruction. The experimental results exhibit a significant improvement in recognition accuracy on a noise-polluted Persian corpus.

10.
11.
Language modeling for large-vocabulary conversational Arabic speech recognition is faced with the problem of the complex morphology of Arabic, which increases the perplexity and out-of-vocabulary rate. This problem is compounded by the enormous dialectal variability and differences between spoken and written language. In this paper, we investigate improvements in Arabic language modeling by developing various morphology-based language models. We present four different approaches to morphology-based language modeling, including a novel technique called factored language models. Experimental results are presented for both rescoring and first-pass recognition experiments.

12.
13.
14.
15.
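For English speech, which has a large amount of feature data and complex pronunciation variation, hidden Markov models (HMMs) face more problems than they do with isolated words, such as the computational complexity of the Viterbi algorithm and the probability distribution problem in Gaussian mixture models. To realize a speaker-independent English speech recognition system based on HMMs and clustering, a combination of a segmental mean algorithm for reducing the dimensionality of speech feature parameters, a clustering cross-grouping algorithm, and an HMM grouping algorithm is proposed. Experimental results show that, compared with a single HMM, the algorithm not only raises the English speech recognition rate by nearly 3% but also increases the system's recognition speed by 20.1%.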
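
The abstract names a segmental mean algorithm without describing it. One simple reading is to split an utterance's frame sequence into a fixed number of contiguous segments and replace each segment with the mean of its frames, so every utterance yields the same number of vectors. The sketch below assumes near-equal segment lengths; the function name and the segmentation rule are assumptions, not the paper's specification.

```python
from typing import List, Sequence


def segmental_mean(frames: Sequence[Sequence[float]], num_segments: int) -> List[List[float]]:
    """Average the frames within each of `num_segments` contiguous, near-equal segments."""
    n = len(frames)
    if num_segments <= 0 or n < num_segments:
        raise ValueError("need at least one frame per segment")
    dims = len(frames[0])
    result = []
    for k in range(num_segments):
        # Integer boundaries guarantee every segment holds at least one frame.
        start = k * n // num_segments
        end = (k + 1) * n // num_segments
        segment = frames[start:end]
        result.append([sum(f[d] for f in segment) / len(segment) for d in range(dims)])
    return result


# Four 2-dimensional frames reduced to two segment means.
reduced = segmental_mean([[1.0, 10.0], [3.0, 30.0], [5.0, 50.0], [7.0, 70.0]], 2)
```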

16.
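Uyghur is an agglutinative language; its rich affixes can generate a very large vocabulary from the same stems, which makes Uyghur speech recognition research very difficult. Taking the characteristics of Uyghur into account, a Uyghur continuous speech corpus was built, and the HTK (HMM ToolKit) tools were used to implement a Uyghur continuous speech recognition system based on hidden Markov models (HMMs). At the acoustic level, triphones were chosen as the basic recognition units, a Uyghur triphone acoustic model was built, and decision trees, triphone tying, silence-model fixing, and increasing the number of Gaussian mixture components were used to improve recognition accuracy. At the language level, a statistical bigram language model suited to the characteristics of Uyghur speech was used. Finally, Uyghur continuous speech recognition experiments were carried out with the system.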
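
To make the bigram language model mentioned above concrete, the following minimal sketch estimates bigram probabilities by maximum likelihood from tokenized sentences. It is a generic illustration rather than the system's actual model: the `<s>` and `</s>` boundary markers, the function name, and the toy corpus are assumptions, and a real system would add smoothing for unseen word pairs.

```python
from collections import Counter
from typing import Dict, List, Tuple


def train_bigram(sentences: List[List[str]]) -> Dict[Tuple[str, str], float]:
    """Return P(curr | prev) = count(prev, curr) / count(prev) for every observed pair."""
    history_counts: Counter = Counter()
    pair_counts: Counter = Counter()
    for words in sentences:
        tokens = ["<s>"] + words + ["</s>"]
        for prev, curr in zip(tokens, tokens[1:]):
            history_counts[prev] += 1
            pair_counts[(prev, curr)] += 1
    return {pair: count / history_counts[pair[0]] for pair, count in pair_counts.items()}


# Toy corpus of two sentences.
model = train_bigram([["a", "b"], ["a", "c"]])
```

For an agglutinative language, the tokens could be stems and affixes rather than whole words, which keeps the vocabulary smaller; whether the cited system does this is not stated in the abstract.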

17.
In this paper, a structural method of recognising Arabic handwritten characters is proposed. The major problem in cursive text recognition is the segmentation into characters or into representative strokes. When we segment the cursive portions of words, we take into account the contextual properties of the Arabic grammar and the junction segments connecting the characters to each other along the writing line. The problem of overlapping characters is resolved with a contour-following algorithm associated with the labelling of the detected contours. In the recognition phase, the characters are gathered into ten families of candidate characters with similar shapes. Then a heterarchical analysis follows that checks the pattern via goal-directed feedback control.

18.
19.
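Parameter optimization for support vector machine kernel functions incurs extra computation, which reduces real-time performance in speech recognition applications. To address this drawback, a kernel function based on Chebyshev polynomials is applied in a speech recognition system. This kernel yields fewer support vectors during training; in addition, by incorporating the good properties of the Gaussian kernel, the generalized Chebyshev kernel is suitably improved to obtain a modified Chebyshev kernel. Comparative experiments on two different speech databases gave fairly satisfactory results, improving the generalization ability of the support vector machine and the robustness of the speech recognition system.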

20.
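To further improve the accuracy of speech recognition systems and make speech products more convenient to use, a speech recognition method combining a hidden Markov model with an algebraic neural network is proposed. The hidden Markov model generates the best speech state sequence, the output probabilities of that sequence serve as the input to a feedforward neural network, and the algebraic neural network performs classification and recognition. Simulations on the Matlab 7.0 experimental platform show that, compared with a traditional neural network, the method improves convergence speed, robustness, and recognition rate.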
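
The best state sequence of an HMM is normally found with the Viterbi algorithm. The sketch below is a textbook version for a discrete HMM, included to illustrate that first stage only; the neural network stage is not shown. The toy model (states, observations, and probabilities) is invented for illustration, and the paper's Matlab implementation may differ.

```python
from typing import Dict, List, Sequence, Tuple


def viterbi(observations: Sequence[str],
            states: Sequence[str],
            start_p: Dict[str, float],
            trans_p: Dict[str, Dict[str, float]],
            emit_p: Dict[str, Dict[str, float]]) -> Tuple[List[str], float]:
    """Return the most probable state path and its joint probability."""
    # best[t][s]: probability of the best path that ends in state s at time t.
    best = [{s: start_p[s] * emit_p[s][observations[0]] for s in states}]
    back: List[Dict[str, str]] = []
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev_state = max(states, key=lambda p: best[-1][p] * trans_p[p][s])
            scores[s] = best[-1][prev_state] * trans_p[prev_state][s] * emit_p[s][obs]
            pointers[s] = prev_state
        best.append(scores)
        back.append(pointers)
    # Trace the best path backwards from the most probable final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[-1][last]


# Toy two-state model, for illustration only.
states = ["Healthy", "Fever"]
start_p = {"Healthy": 0.6, "Fever": 0.4}
trans_p = {"Healthy": {"Healthy": 0.7, "Fever": 0.3},
           "Fever": {"Healthy": 0.4, "Fever": 0.6}}
emit_p = {"Healthy": {"normal": 0.5, "cold": 0.4, "dizzy": 0.1},
          "Fever": {"normal": 0.1, "cold": 0.3, "dizzy": 0.6}}
path, prob = viterbi(["normal", "cold", "dizzy"], states, start_p, trans_p, emit_p)
```

For long observation sequences, the products should be replaced by sums of log-probabilities to avoid numerical underflow.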
