Similar literature
 20 similar articles found
1.
This paper proposes a new feature extraction technique using wavelet based sub-band parameters (WBSP) for classification of unaspirated Hindi stop consonants. The extracted acoustic parameters show marked deviation from the values reported for English and other languages, since Hindi has distinguishing manner-based features. Because acoustic parameters are difficult to extract automatically for speech recognition, Mel Frequency Cepstral Coefficient (MFCC) based features are usually used. MFCC are based on the short time Fourier transform (STFT), which assumes the speech signal to be stationary over a short period. This assumption is specifically violated in the case of stop consonants. In WBSP, following the acoustic study, the features derived from CV syllables have different weighting factors, with the middle segment having the maximum. The wavelet transform is applied to split the signal into 8 sub-bands of different bandwidths, and the variation of energy in different sub-bands is also taken into account. WBSP gives improved classification scores. The number of filters used for feature extraction in WBSP (8) is smaller than the number used for MFCC (24). Its classification performance has been compared with four other techniques using a linear classifier. Further, principal component analysis (PCA) has also been applied to reduce dimensionality.
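The sub-band split described in this abstract can be illustrated with a minimal sketch. It assumes an orthonormal Haar wavelet and a 7-level dyadic decomposition, which yields 7 detail bands plus one approximation band (8 sub-bands of different bandwidths). The abstract does not name the wavelet, the decomposition tree, or the weighting factors, so `haar_step`, `subband_energies`, and the test frame are hypothetical illustrations, not the authors' method.

```python
import math

def haar_step(signal):
    """One level of the orthonormal Haar transform: returns (approximation, detail)."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal) - 1, 2)]
    return approx, detail

def subband_energies(signal, levels=7):
    """Energies of `levels` detail bands (finest first) plus the final approximation band.

    Assumption: a dyadic Haar decomposition; 7 levels give 8 sub-bands in total.
    """
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be a multiple of 2**levels")
    energies = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        energies.append(sum(d * d for d in detail))
    energies.append(sum(a * a for a in approx))
    return energies

# Hypothetical 256-sample frame standing in for one speech segment.
frame = [math.sin(2 * math.pi * 5 * n / 256) for n in range(256)]
energies = subband_energies(frame)
```

Because the Haar transform used here is orthonormal, the eight band energies sum to the energy of the input frame, which makes the vector a convenient per-segment feature before any segment weighting is applied.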

2.
This paper investigates the unique pharyngeal and uvular consonants of Arabic from the point of view of automatic speech recognition (ASR). The recognition error rates for these phonemes are compared in five experiments that involve different combinations of native and non-native Arabic speakers. The three most confusing consonants for every investigated consonant are discussed. All experiments use the Hidden Markov Model Toolkit (HTK) and the Linguistic Data Consortium (LDC) WestPoint Modern Standard Arabic (MSA) database. Results confirm that these distinct Arabic consonants are a major source of difficulty for Arabic ASR. While the recognition rate for certain of these unique consonants such as // can drop below 35% when uttered by non-native speakers, there is an advantage in including non-native speakers in ASR. In addition, regional differences in the pronunciation of MSA by native Arabic speakers require the attention of Arabic ASR research.

3.
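Fingerprint classification based on spectral energy   Total citations: 1, self-citations: 0, citations by others: 1
Fingerprint classification is a key technology in automatic fingerprint identification systems, but current algorithms still make large errors when classifying low-quality fingerprint images. To classify low-quality fingerprint images accurately, a fingerprint classification method based on spectral energy is proposed. First, a Fourier transform is applied to the fingerprint image block by block. The orientation map of the fingerprint image is then obtained from the distribution of energy in the spectrum, and the orientation vectors of the image around the core point are extracted as the feature vector of that fingerprint image. Finally, a K-nearest-neighbor classifier and a minimum-distance classifier are used to classify the input fingerprint. Experimental results on the NIST-4 fingerprint database demonstrate the effectiveness of the algorithm: classification accuracy reaches 94.1%, and the algorithm is considerably faster than comparable algorithms.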

4.
We propose an image prior for the model-based nonparametric classification of synthetic aperture radar (SAR) images that allows working with an infinite number of mixture components. In order to capture the spatial interactions of the pixel labels, the prior is derived by incorporating a conditional multinomial auto-logistic random field into the Normalized Gamma Process prior. In this way, we obtain an image classification prior that is free from the limitation on the number of classes and includes the smoothing constraint in the classification problem. In this model, we introduce a hyper-parameter that can control the preservation of the important classes and the extinction of the weak ones. The recall rates reported on synthetic and real TerraSAR-X images show that the proposed model is capable of accurately classifying the pixels. Unlike the existing methods, it applies a simple iterative update scheme without performing a hierarchical clustering strategy. We demonstrate that the proposed method estimates the number of classes more accurately than conventional finite mixture models.

5.
Energy bands and spectral cues for Arabic vowels recognition   Total citations: 1, self-citations: 0, citations by others: 1
The present study examines the short and long Arabic vowels (/a/, /a:/, /i/, /i:/, /u/ and /u:/) with a new approach based on three methods: formant frequency extraction, spectral moments, and energy bands. One characteristic that distinguishes Arabic from other languages is its long vowels, which can be pronounced with different durations. Formant frequencies are the most widely used cue for characterizing vowels in different languages; nevertheless, using formants alone was not very significant for vowel identification, especially as production duration increases. Therefore, our approach is to broaden previous studies and present new tools in order to characterize long vowels compared to short ones.

6.
A land cover classification map is necessary for modelling interactions between the land surface and the atmosphere, monitoring the environment and estimating food production. In order to classify land cover in SE Asia in 2000, the Normalized Difference Vegetation Index (NDVI), the reflectance of the near-infrared (NIR) band, and the reflectance of the short wave infrared (SWIR) band of Systeme pour l'Observation de la Terre (SPOT) VEGETATION data were used in this study. First, ground data were collected as training data, and supervised classification was performed on twelve months of NDVI data. As a result, some deserts and peripheral sparsely vegetated areas were classified as urban when compared with the world atlas. Secondly, the number of months in which the reflectance of the SWIR band is higher than that of the NIR band was counted in each pixel (the SWIR>NIR month-count condition); pixels with a count of 10 were classified as Sparse Herbaceous/Shrub and pixels with a count of 11 or 12 were classified as Bare Areas. Finally, land cover was classified based on the SWIR>NIR month-count condition combined with NDVI, and the result was compared with the existing land cover map. It was found that the SWIR>NIR month-count condition gives a better result for the classification of non-vegetated or sparsely vegetated areas than NDVI alone.
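The SWIR>NIR month-count condition from this abstract can be sketched as follows. The thresholds (10, and 11 or 12) come from the abstract; the function names and the reflectance values are hypothetical, and pixels that do not meet the condition are simply returned as `None` here on the assumption that they are left to the NDVI-based classification.

```python
def swir_gt_nir_month_count(swir_by_month, nir_by_month):
    """Count the months in which SWIR reflectance exceeds NIR reflectance for one pixel."""
    if len(swir_by_month) != 12 or len(nir_by_month) != 12:
        raise ValueError("expected twelve monthly values per band")
    return sum(1 for s, n in zip(swir_by_month, nir_by_month) if s > n)

def classify_sparse_pixel(count):
    """Map the month count to a class, using the thresholds stated in the abstract."""
    if count == 10:
        return "Sparse Herbaceous/Shrub"
    if count in (11, 12):
        return "Bare Areas"
    return None  # assumption: defer to the NDVI-based classification

# Hypothetical monthly reflectances for a single pixel (not real SPOT VEGETATION data).
swir = [0.30, 0.31, 0.32, 0.33, 0.34, 0.35, 0.35, 0.34, 0.33, 0.32, 0.31, 0.30]
nir = [0.25, 0.26, 0.27, 0.28, 0.29, 0.30, 0.30, 0.29, 0.28, 0.27, 0.40, 0.26]

count = swir_gt_nir_month_count(swir, nir)
label = classify_sparse_pixel(count)
```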

7.
A vast amount of valuable human knowledge is recorded in documents. The rapid growth in the number of machine-readable documents for public or private access necessitates the use of automatic text classification. While a lot of effort has been put into Western languages, mostly English, minimal experimentation has been done with Arabic. This paper presents, first, an up-to-date review of the work done in the field of Arabic text classification and, second, a large and diverse dataset that can be used for benchmarking Arabic text classification algorithms. The different techniques derived from the literature review are illustrated by their application to the proposed dataset. The results of various feature selections, weighting methods, and classification algorithms show, on average, the superiority of the support vector machine, followed by the decision tree algorithm (C4.5) and Naïve Bayes. The best classification accuracy was 97 % for the Islamic Topics dataset, and the least accurate was 61 % for the Arabic Poems dataset.

8.
Applied Soft Computing, 2008, 8(2): 1131-1149
Neural networks have shown good results for detecting a certain pattern in a given image. In this paper, faster neural networks for pattern detection are presented. Such processors are designed based on cross correlation in the frequency domain between the input matrix and the input weights of neural networks. This approach is developed to reduce the computation steps required by these faster neural networks for the detection process. The divide and conquer principle is applied through matrix decomposition. Each matrix is divided into smaller submatrices, and each one is then tested separately using a single faster neural processor. Furthermore, faster pattern detection is obtained by using parallel processing techniques to test the resulting submatrices at the same time using the same number of faster neural networks. In contrast to faster neural networks, the speed-up ratio is increased with the size of the input matrix when using faster neural networks and matrix decomposition. Moreover, the problem of local submatrix normalization in the frequency domain is solved. The effect of matrix normalization on the speed-up ratio of pattern detection is discussed. Simulation results show that local submatrix normalization through weight normalization is faster than submatrix normalization in the spatial domain. The overall speed-up ratio of the detection process is increased because the normalization of weights is done offline.
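The core idea of cross correlation in the frequency domain can be sketched in one dimension: the circular cross-correlation of an input `x` with weights `w` equals the inverse transform of `X * conj(W)`. This is a minimal sketch under stated assumptions, not the paper's processor design. It uses a naive O(N²) DFT only to stay self-contained (a practical implementation would use an FFT), and `dft`, `circular_xcorr_freq`, and `circular_xcorr_direct` are hypothetical names.

```python
import cmath

def dft(values, inverse=False):
    """Naive discrete Fourier transform (O(N^2)), used only to keep the sketch self-contained."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(values[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
        out.append(acc / n if inverse else acc)
    return out

def circular_xcorr_freq(x, w):
    """Circular cross-correlation computed as IDFT(DFT(x) * conj(DFT(w)))."""
    n = len(x)
    w_padded = list(w) + [0.0] * (n - len(w))  # zero-pad the weights to the input length
    spectrum_x = dft(x)
    spectrum_w = dft(w_padded)
    product = [a * b.conjugate() for a, b in zip(spectrum_x, spectrum_w)]
    return [v.real for v in dft(product, inverse=True)]

def circular_xcorr_direct(x, w):
    """Reference implementation in the spatial domain, for comparison."""
    n = len(x)
    return [sum(x[(k + j) % n] * w[j] for j in range(len(w))) for k in range(n)]

# Hypothetical input containing the weight pattern [1, 2, 3] starting at index 2.
x = [0.0, 0.0, 1.0, 2.0, 3.0, 0.0, 0.0, 0.0]
w = [1.0, 2.0, 3.0]
scores = circular_xcorr_freq(x, w)
best_shift = max(range(len(scores)), key=lambda k: scores[k])
```

The frequency-domain result matches the direct computation, and the highest score falls at the shift where the pattern sits in the input. The same identity extends to two-dimensional matrices, which is the setting the abstract describes.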

9.
10.
Multimedia Tools and Applications - Human age is a crucial factor in social interaction. It determines the way we interact with others. It is also a relevant forensic issue that can provide helpful...

11.
This paper presents an efficient approach for automatic speaker identification based on cepstral features and the Normalized Pitch Frequency (NPF). Most relevant speaker identification methods adopt a cepstral strategy. Inclusion of the pitch frequency as a new feature in the speaker identification process is expected to enhance the speaker identification accuracy. In the proposed framework for speaker identification, a neural classifier with a single hidden layer is used. Different transform domains are investigated for reliable feature extraction from the speech signal. Moreover, a pre-processing noise reduction step is used prior to the feature extraction process to enhance the performance of the speaker identification system. Simulation results show that using the NPF as a feature enhances the performance of the speaker identification system, especially with the Discrete Cosine Transform (DCT) and a wavelet denoising pre-processing step.

12.
Multimedia Tools and Applications - Spoofing attack detection is one of the essential components in automatic speaker verification (ASV) systems. The success of ASV-2015 shows a great perspective...

13.
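This paper proposes that mine radio systems and equipment be strictly prohibited from using frequencies allocated to broadcasting, television, radio astronomy, safety and rescue, radio navigation, and similar services, and lists the prohibited frequency bands. It proposes that mine radio systems and equipment should preferentially use amateur bands and industrial, scientific and medical (ISM) equipment bands, and lists the recommended bands. It proposes that, when the services of mine radio systems and equipment are identical or similar to surface services, bands identical or close to those used on the surface should be chosen, and lists the bands for similar services. It also lists the frequency bands currently in use in underground coal mines.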

14.
Classification in a normalized feature space using support vector machines   Total citations: 7, self-citations: 0, citations by others: 7
This paper discusses classification using support vector machines in a normalized feature space. We consider normalization both in input space and in feature space. Exploiting the fact that in this setting all points lie on the surface of a unit hypersphere, we replace the optimal separating hyperplane by one that is symmetric in its angles, leading to an improved estimator. These considerations are evaluated in numerical experiments on two real-world datasets. The stability to noise of this offset correction is subsequently investigated, as well as its optimality.
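The two kinds of normalization mentioned in this abstract can be sketched briefly. Input-space normalization rescales each vector to unit length; feature-space normalization is commonly written as k̃(x, y) = k(x, y) / sqrt(k(x, x) · k(y, y)), which places every mapped point on the unit hypersphere. The abstract does not give the paper's formulas or kernel, so the polynomial kernel and all function names below are assumptions for illustration.

```python
import math

def poly_kernel(x, y, degree=2):
    """Inhomogeneous polynomial kernel (an assumed example kernel)."""
    return (sum(a * b for a, b in zip(x, y)) + 1.0) ** degree

def normalized_kernel(kernel, x, y):
    """Feature-space normalization: k(x, y) / sqrt(k(x, x) * k(y, y))."""
    return kernel(x, y) / math.sqrt(kernel(x, x) * kernel(y, y))

def normalize_input(x):
    """Input-space normalization: rescale the vector to unit Euclidean length."""
    norm = math.sqrt(sum(a * a for a in x))
    return [a / norm for a in x]

# Hypothetical sample points.
x = [3.0, 4.0]
y = [-1.0, 2.0]

self_similarity = normalized_kernel(poly_kernel, x, x)
cross_similarity = normalized_kernel(poly_kernel, x, y)
unit_x = normalize_input(x)
```

With the normalized kernel, every point has self-similarity 1, so the kernel value between two points is the cosine of the angle between them in feature space.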

15.
This paper investigates the contribution of formants and prosodic features such as pitch and energy to Arabic speech recognition under real-life conditions. Our speech recognition system, based on Hidden Markov Models (HMMs), is implemented using the HTK Toolkit. The front-end of the system combines features based on conventional Mel-Frequency Cepstral Coefficients (MFCC), prosodic information, and formants. The experiments are performed on the ARADIGIT corpus, which is a database of Arabic spoken words. The results show that, in noisy environments, the resulting multivariate feature vectors lead to a significant improvement, up to 27%, in word accuracy relative to the word accuracy obtained from the state-of-the-art MFCC-based system.

16.
17.
18.
For fractal image encoding, this paper presents a novel kick-out method, based on a special measure called the one-norm of the normalized block, that discards impossible domain blocks at an early stage for the current range block. This reduces the encoding time. Since our proposed kick-out method is based on Jacquin’s full search method, both methods need to search the whole image and the decoded image quality is the same. On five typical test images, our proposed method achieves a 22% execution time improvement on average when compared with Jacquin’s full search method. Combining our proposed method with Truong et al.’s DCT inner product method, Lai et al.’s kick-out method, or both methods improves the encoding-time performance further.

19.
Many previous researchers have tried to develop sign language recognition systems in general, and Arabic sign language recognition systems specifically. They succeeded in achieving acceptable results at the level of isolated gestures, but none of them investigated the recognition of connected sequences of gestures. This paper focuses on how to recognize connected sequences of gestures in real time using a graph-matching technique, and on how the continuous input gestures are segmented and classified. Graphs are a general and powerful data structure useful for the representation of various objects and concepts. This work is a component of a real-time Arabic Sign Language Recognition system that applied a pulse-coupled neural network for static posture recognition in its first phase. This work can be adapted and applied to different sign languages and other recognition problems.

20.
The Arabic alphabet is used in around 27 languages, including Arabic, Persian, Kurdish, Urdu, and Jawi. Many researchers have developed systems for recognizing cursive handwritten Arabic words, using both holistic and segmentation-based approaches. This paper introduces a system that achieves high accuracy using efficient segmentation, feature extraction, and a recurrent neural network (RNN). We describe a robust rule-based segmentation algorithm that uses special feature points identified in the word skeleton to segment the cursive words into graphemes. We show that careful selection from a wide range of features extracted during and after the segmentation stage produces a feature set that significantly reduces the label error. We demonstrate that, using the same RNN recognition engine, the segmentation approach with efficient feature extraction gives better results than a holistic approach that extracts features from raw pixels. We evaluated this segmentation approach against an improved version of the holistic system MDLSTM that won the ICDAR 2009 Arabic handwritten word recognition competition. On the IfN/ENIT database of handwritten Arabic words, the segmentation approach reduces the average label error by 18.5 %, the sequence error by 22.3 %, and the execution time by 31 %, relative to MDLSTM. This approach also has the best published accuracies on two IfN/ENIT test sets.
