Similar literature
 20 similar documents found (search time: 190 ms)
1.
Speech and speaker recognition are important tasks for computer systems. In this paper, an expert speaker recognition system based on optimum wavelet packet entropy is proposed for speaker recognition using real speech/voice signals. The study combines a new feature extraction approach with classification, using optimum wavelet packet entropy parameter values. These values are obtained from measured real English-language speech/voice signal waveforms using a speech experimental set. A genetic-wavelet packet-neural network (GWPNN) model is developed in this study. GWPNN includes three layers: a genetic algorithm, a wavelet packet layer, and a multi-layer perceptron. The genetic algorithm layer selects the feature extraction method and obtains the optimum wavelet entropy parameter values. One of four feature extraction methods is selected by the genetic algorithm: wavelet packet decomposition, wavelet packet decomposition – short-time Fourier transform, wavelet packet decomposition – Born–Jordan time–frequency representation, and wavelet packet decomposition – Choi–Williams time–frequency representation. The wavelet packet layer performs optimum feature extraction in the time–frequency domain and is composed of wavelet packet decomposition and wavelet packet entropies. The multi-layer perceptron of GWPNN, a feed-forward neural network, is used to evaluate the fitness function of the genetic algorithm and to classify speakers. The performance of the developed system was evaluated using noisy English speech/voice signals. The test results showed that the system was effective in detecting real speech signals. The correct classification rate was about 85% for speaker classification.

2.
In this study, an expert speaker identification system is presented for speaker identification using Turkish speech signals. A discrete wavelet adaptive network based fuzzy inference system (DWANFIS) model is used for this purpose. The model consists of two layers: a discrete wavelet layer and an adaptive network based fuzzy inference system. The discrete wavelet layer performs adaptive feature extraction in the time–frequency domain and is composed of discrete wavelet decomposition and discrete wavelet entropy. The performance of the system is evaluated using repeated speech signals. The test results show the effectiveness of the intelligent system presented in this paper. The rate of correct classification is about 90.55% for the sample speakers.

3.
In this paper, an intelligent diagnosis system for heart valve disease based on principal component analysis (PCA) and an adaptive network based fuzzy inference system (ANFIS) is introduced. The system combines feature extraction and classification of Doppler signal waveforms measured at the heart valve using Doppler ultrasound (DHS). Wavelet entropy values are used as features. The system has three phases. In the pre-processing phase, data acquisition and pre-processing of the DHS signals are performed. In the feature extraction phase, the feature vector is extracted by calculating 12 wavelet entropy values for each DHS signal, and the dimension of the Doppler signal dataset is reduced from 12 features to 6 features using PCA. In the classification phase, these reduced wavelet entropy features are fed as inputs to the ANFIS classifier. The correct diagnosis performance of the PCA–ANFIS intelligent system is calculated on 215 samples. The classification accuracy of the PCA–ANFIS intelligent system was 96% for normal subjects and 93.1% for abnormal subjects.
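The 12-to-6 feature reduction described in this abstract can be illustrated with a minimal PCA sketch. It is not the authors' implementation: the function name `pca_reduce` is hypothetical, the data are random numbers standing in for the 215 x 12 wavelet entropy matrix, and the SVD route is just one common way to compute PCA.

```python
import numpy as np


def pca_reduce(features, n_components):
    """Project the rows of `features` onto their top principal components."""
    centered = features - features.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing singular value.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    explained = singular_values**2 / np.sum(singular_values**2)
    return centered @ vt[:n_components].T, explained[:n_components]


rng = np.random.default_rng(0)
# Synthetic stand-in (assumption): 215 signals x 12 wavelet entropy values.
entropy_features = rng.normal(size=(215, 12))
reduced, explained = pca_reduce(entropy_features, n_components=6)
print(reduced.shape, round(float(explained.sum()), 3))
```

The `explained` vector gives the fraction of variance carried by each retained component, which is one way to judge whether 6 components are enough for a given dataset.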

4.
In recent years, expert target recognition has become a very important topic in the radar literature. In this study, a target recognition system is introduced for expert target recognition (ATR) using the radar target echo signals of High Range Resolution (HRR) radars. The study combines adaptive feature extraction and classification using optimum wavelet entropy parameter values. The features are extracted from radar target echo signals. A genetic wavelet extreme learning machine classifier model (GAWELM) is developed for expert target recognition. GAWELM consists of three stages: a genetic algorithm, wavelet analysis, and an extreme learning machine (ELM) classifier. Previous studies of radar target recognition have shown that the learning speed of feedforward networks is in general much slower than required, which has been a major disadvantage. There are two important causes: (1) slow gradient-based learning algorithms are commonly used to train neural networks, and (2) all the parameters of the networks are tuned iteratively by such learning algorithms.
In this paper, a new learning algorithm named extreme learning machine (ELM) for single-hidden layer feedforward networks (SLFNs) (Ahern et al., 1989; Al-Otum and Al-Sowayan, 2011; Avci et al., 2005a; Avci et al., 2005b; Biswal et al., 2009; Frigui et al., in press; Cao et al., 2010; Guo et al., 2011; Famili et al., 1997; Han and Huang, 2006; Huang et al., 2011; Huang et al., 2006; Huang and Siew, 2005; Huang et al., 2009; Jiang et al., 2011; Kubrusly and Levan, 2009; Le et al., 2011; Lhermitte et al., in press; Martínez-Martínez et al., 2011; Matlab, 2011; Nelson et al., 2002; Nejad and Zakeri, 2011; Tabib et al., 2009; Tang et al., 2011), which randomly chooses hidden nodes and analytically determines the output weights of SLFNs, is used to eliminate these disadvantages of feedforward networks in the expert target recognition area. The genetic algorithm (GA) stage is then used to select the feature extraction method and to find the optimum wavelet entropy parameter values. The optimal one of four feature extraction methods is selected by the GA. The four feature extraction methods in the proposed GAWELM model are discrete wavelet transform (DWT), discrete wavelet transform–short-time Fourier transform (DWT–STFT), discrete wavelet transform–Born–Jordan time–frequency transform (DWT–BJTFT), and discrete wavelet transform–Choi–Williams time–frequency transform (DWT–CWTFT). The discrete wavelet transform stage performs optimum feature extraction in the time–frequency domain and includes the discrete wavelet transform and the calculation of discrete wavelet entropies. The ELM classifier is used to evaluate the fitness function of the genetic algorithm and to classify radar targets. The performance of the developed GAWELM expert radar target recognition system is examined using noisy real radar target echo signals.
The application results show that the GAWELM system is effective in recognizing real radar target echo signals. The correct classification rate of the GAWELM system is about 90% for the radar target types used in this study.
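The ELM idea summarized in this abstract (hidden nodes chosen at random, output weights determined analytically) can be sketched as follows. This is a minimal illustration under stated assumptions, not the GAWELM system: the tanh activation, the hidden-layer size, the names `elm_train` and `elm_predict`, and the two-cluster toy data are all chosen here for illustration.

```python
import numpy as np


def elm_train(x, targets, n_hidden, rng):
    """Fit a single-hidden-layer network the ELM way."""
    # Hidden weights and biases are drawn at random and never updated.
    w = rng.normal(size=(x.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    hidden = np.tanh(x @ w + b)  # activation choice is an assumption
    # Output weights: least-squares solution via the Moore-Penrose pseudoinverse.
    beta = np.linalg.pinv(hidden) @ targets
    return w, b, beta


def elm_predict(x, w, b, beta):
    return np.tanh(x @ w + b) @ beta


rng = np.random.default_rng(0)
# Toy stand-in for feature vectors (assumption): two well-separated clusters.
x = np.vstack([rng.normal(-2.0, 1.0, (50, 4)), rng.normal(2.0, 1.0, (50, 4))])
labels = np.repeat([0, 1], 50)
targets = np.eye(2)[labels]  # one-hot class targets

w, b, beta = elm_train(x, targets, n_hidden=20, rng=rng)
predicted = elm_predict(x, w, b, beta).argmax(axis=1)
accuracy = float((predicted == labels).mean())
print(accuracy)
```

Because the only fitted quantity is `beta`, training is a single linear solve rather than an iterative gradient descent, which is the speed advantage the abstract refers to.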

5.
In this paper, an intelligent diagnosis method for gear fault identification and classification based on vibration signals, using the discrete wavelet transform and an adaptive neuro-fuzzy inference system (ANFIS), is presented. The discrete wavelet transform (DWT) plays an important role in signal feature extraction in the proposed system. Abnormal transient signals appear at different decomposition levels and can be used to recognize the various faults from the DWT figure. However, many fault conditions are hard to inspect accurately by the naked eye. In the present study, a feature extraction method based on the discrete wavelet transform with energy spectrum is proposed. Wavelets of different orders are considered in order to identify fault features accurately. The database is built from energy-spectrum feature vectors, which are used as input patterns in the training and identification process. Furthermore, the ANFIS is proposed to identify and classify the faulty gear positions and the gear fault conditions in the fault diagnosis system. The proposed ANFIS includes both the qualitative approximation of fuzzy logic and the adaptive capability of neural networks. The experimental results verified that the proposed ANFIS has good potential for gear fault identification. In the proposed system, the ANFIS achieved an identification accuracy that was more satisfactory than traditional visual inspection.

6.
An expert system is presented for interpretation of the Doppler signals of heart valve diseases based on pattern recognition. We deal in particular with the combination of feature extraction and classification from Doppler signal waveforms measured at the heart valve using Doppler ultrasound. A wavelet neural network model developed by us is used. The model consists of two layers: a wavelet layer and a multilayer perceptron. The wavelet layer, used for adaptive feature extraction in the time–frequency domain, is composed of wavelet decomposition and wavelet entropy. The multilayer perceptron, used for classification, is a feedforward neural network. The performance of the developed system has been evaluated on 215 samples. The test results show that the system is effective in detecting Doppler heart sounds. The classification rate averaged 91% correct for 123 test subjects.

7.
In this work, an average framing linear prediction coding (AFLPC) technique for text-independent speaker identification systems is presented. Conventionally, linear prediction coding (LPC) has been applied in speech recognition applications. In this study, however, a combination of modified LPC with the wavelet transform (WT), termed AFLPC, is proposed for speaker identification. The investigation procedure is based on feature extraction and voice classification. In the feature extraction phase, the distinguishing characteristics of the speaker's vocal tract were extracted using the AFLPC technique. The size of a speaker's feature vector can be optimized in terms of an acceptable recognition rate by means of a genetic algorithm (GA). An LPC order of 30 was found to be the best according to the system performance. In the classification phase, a probabilistic neural network (PNN) is applied because of its rapid response and ease of implementation. In the practical investigation, the performances of different wavelet transforms in conjunction with AFLPC were compared with one another. In addition, the capability of the proposed system was examined by comparing it with other systems proposed in the literature. The PNN classifier achieves a better recognition rate (97.36%) with the wavelet packet (WP) and AFLPC feature extraction method, termed WPLPCF. The proposed system was also analyzed in additive white Gaussian noise (AWGN) and real noise environments, with rates of 58.56% at 0 dB and 70.52% at 5 dB. The recognition rates of the Gaussian mixture model (GMM) over the whole database reached their lowest value when the number of training samples was small.

8.
Local features for any pattern recognition system are based on information extracted locally. In this paper, a local feature extraction technique was developed. The feature was extracted in the time–frequency plane by taking the moving average along the diagonal directions of the plane. The feature captured the time–frequency events, producing a unique pattern for each speaker that can be viewed as a voice print of the speaker. Hence, we refer to this technique as the voice print-based local feature. The proposed feature was compared to other features, including mel-frequency cepstral coefficients (MFCC), for speaker recognition using two different databases. One of the databases used in the comparison is a subset of an LDC database consisting of two short sentences uttered by 182 speakers. The proposed feature attained a 98.35% recognition rate, compared to 96.7% for MFCC, on the LDC subset.

9.
This paper presents the application of an adaptive neuro-fuzzy inference system (ANFIS) model to the estimation of vigilance level using electroencephalogram (EEG) signals recorded during the transition from wakefulness to sleep. The developed ANFIS model combines the adaptive capabilities of neural networks with the qualitative approach of fuzzy logic. The study comprises three stages. In the first stage, three types of EEG signals (alert, drowsy, and sleep) were obtained from 30 healthy subjects. In the second stage, for feature extraction, the EEG signals were decomposed into their sub-bands using the discrete wavelet transform (DWT), and the entropy of each sub-band was calculated using the Shannon entropy algorithm. In the third stage, the ANFIS was trained with the back-propagation gradient descent method in combination with the least squares method. The extracted features of the three types of EEG signals were used as input patterns for three ANFIS classifiers. To improve estimation accuracy, a fourth ANFIS classifier (the combining ANFIS) was trained using the outputs of the three ANFIS classifiers as input data. The performance of the ANFIS model was tested using EEG data obtained from 12 healthy subjects that had not been used for training. The results confirmed that the developed ANFIS classifier has potential for estimating vigilance level from EEG signals.
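The second stage of this abstract (DWT sub-bands followed by one Shannon entropy value per sub-band) can be sketched in a few lines. The sketch makes several assumptions not stated in the abstract: a Haar wavelet, four decomposition levels, a random test signal, and an entropy computed on the normalized coefficient energies. Other conventions exist, for example a non-normalized form that sums -c² log c² over the coefficients.

```python
import numpy as np


def haar_dwt(signal):
    """One level of the orthonormal Haar DWT: returns (approximation, detail)."""
    even, odd = signal[0::2], signal[1::2]
    return (even + odd) / np.sqrt(2), (even - odd) / np.sqrt(2)


def subbands(signal, levels):
    """Detail bands from fine to coarse, followed by the final approximation."""
    approx = np.asarray(signal, dtype=float)
    bands = []
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        bands.append(detail)
    bands.append(approx)
    return bands


def shannon_entropy(coeffs):
    """Shannon entropy of the normalized coefficient energies (assumed convention)."""
    energy = np.asarray(coeffs, dtype=float) ** 2
    total = energy.sum()
    if total == 0.0:
        return 0.0
    p = energy / total
    p = p[p > 0]  # skip zero terms, since 0 * log(0) is taken as 0
    return float(-np.sum(p * np.log(p)))


rng = np.random.default_rng(0)
# Hypothetical stand-in for one EEG epoch; length must be divisible by 2**levels.
epoch = rng.normal(size=256)
features = [shannon_entropy(band) for band in subbands(epoch, levels=4)]
print(len(features))
```

Each signal is thus summarized by one entropy value per sub-band (five values here), which is what keeps the classifier input small.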

10.
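Spoofed-speech attacks, typified by speech replayed from portable playback devices, pose a severe challenge to speaker recognition systems. To address such replay attacks, this paper proposes a multi-band replayed-speech detection algorithm based on wavelet packets. First, the signal obtained after wavelet packet decomposition and reconstruction is Fourier transformed, and the maximum of each frame's spectrum is taken. Then, a logarithm operation and the discrete cosine transform (DCT) are applied to obtain the detection features. Finally, a Gaussian mixture model (GMM) is used as...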

11.
This paper presents a wavelet-based feature extraction method for human gait recognition. Selecting the features with the most discriminative information is the key to improving recognition performance. The frequency domain representation of the gait image is obtained using the fast Fourier transform. Next, a discrete wavelet transform is applied to the obtained spectrum. With single-level wavelet decomposition, four coefficients are generated. The sum of the entropies of these four wavelet coefficients is computed, yielding the wavelet Entropy Image (wEnI), which is used here as the candidate feature for human gait recognition. A template matching-based approach is used for classification. The performance of the proposed wEnI feature is evaluated using whole-based and part-based methods. The experimental results show that the wEnI feature performs better than state-of-the-art gait features in common use.

12.
The most widely used speech representation is based on the mel-frequency cepstral coefficients, which incorporate biologically inspired characteristics into artificial recognizers. However, recognition performance with these features can still be enhanced, especially in adverse conditions. Recent advances have been made with the introduction of wavelet-based representations for different kinds of signals, which have been shown to improve classification performance. However, finding an adequate wavelet-based representation for a particular problem is still an important challenge. In this work we propose a genetic algorithm to evolve a speech representation, based on a non-orthogonal wavelet decomposition, for phoneme classification. The results, obtained for a set of Spanish phonemes, show that the proposed genetic algorithm is able to find a representation that improves speech recognition results. Moreover, the optimized representation was evaluated in noise conditions.

13.
Listening via stethoscope is a preferred method used by physicians to distinguish normal from abnormal cardiac systems. However, listening with a stethoscope has a number of constraints. The interpretation of various heart sounds depends on the physician's hearing ability, experience, and skill. Such limitations may be reduced by developing biomedical-based decision support systems. In this study, a biomedical-based decision support system was developed for the classification of heart sound signals obtained via stethoscope from 120 subjects with normal, pulmonary, and mitral stenosis heart valve diseases. The developed system comprises three stages. In the first stage, for feature extraction, the heart sound signals were decomposed into their sub-bands using the discrete wavelet transform (DWT). In the second stage, the entropy of each sub-band was calculated using the Shannon entropy algorithm to reduce the dimensionality of the feature vectors obtained via DWT. In the third stage, the reduced features of the three types of heart sound signals were used as input patterns for the adaptive neuro-fuzzy inference system (ANFIS) classifiers. The developed method reached 98.33% classification accuracy, and it was shown that the proposed method is effective for detection of heart valve diseases.

14.
In this paper, an intelligent analog modulation identification system is presented for interpretation of analog modulated signals. The paper deals especially with the combination of feature extraction and classification for analog modulated signals. Six types of analog modulated signals are used in this study (AM, DSB, USB, LSB, FM, and PM). A discrete wavelet neural network-adaptive wavelet entropy (DWNN-ANE) model is used, which consists of two layers: discrete wavelet-adaptive wavelet entropy and multi-layer perceptron neural networks for intelligent analog modulation identification. The discrete wavelet layer performs adaptive feature extraction in the time-frequency domain and is composed of DWT and adaptive wavelet entropy. The performance of the system is evaluated using a total of 1080 analog modulated signals. The test results show the effectiveness of the intelligent system presented in this paper. The rate of correct classification is about 98.34% for the sample analog modulated signals.

15.
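Speaker recognition algorithm based on HHT cepstral coefficients   Cited by: 1 (self-citations: 0, citations by others: 1)
LPCC reflects only the static characteristics of speech and cannot highlight its low-frequency local characteristics. To address this, a speaker recognition algorithm that uses HHT cepstral coefficients as features is proposed. The empirical mode decomposition (EMD) in HHT gives a better description of the low-frequency local characteristics of speech, and the Hilbert transform captures its dynamic characteristics, remedying the shortcomings of LPCC. EMD decomposes the speech into a series of intrinsic mode function (IMF) components, which are Hilbert transformed to obtain the Hilbert marginal spectrum. The log power spectrum of the total marginal spectrum is computed and a DCT is applied to obtain 13-dimensional cepstral coefficients, which are fed into a Gaussian mixture model for speaker recognition. Simulation results show that, compared with LPCC, the speaker recognition algorithm based on HHT cepstral coefficients improves the recognition rate by 12.59%, but increases the feature extraction time by 19.27 s.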
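The last step of this abstract (log power spectrum, then DCT, keeping 13 coefficients) can be sketched as below. Only that final step is shown: the EMD and Hilbert marginal spectrum are not implemented, the input is a hypothetical non-negative amplitude spectrum, and the unnormalized DCT-II and the names `dct_ii` and `cepstral_coefficients` are assumptions made for illustration.

```python
import numpy as np


def dct_ii(values):
    """Unnormalized DCT-II, written out explicitly to avoid extra dependencies."""
    n = len(values)
    k = np.arange(n)[:, None]  # output index
    m = np.arange(n)[None, :]  # input index
    basis = np.cos(np.pi * k * (2 * m + 1) / (2 * n))
    return basis @ np.asarray(values, dtype=float)


def cepstral_coefficients(spectrum, n_coeffs=13, floor=1e-12):
    """Log power spectrum followed by a DCT, truncated to n_coeffs values."""
    power = np.asarray(spectrum, dtype=float) ** 2
    log_power = np.log(np.maximum(power, floor))  # floor avoids log(0)
    return dct_ii(log_power)[:n_coeffs]


# Hypothetical stand-in for a marginal spectrum: a smooth decaying envelope.
frequencies = np.linspace(0.0, 1.0, 64)
marginal_spectrum = np.exp(-3.0 * frequencies) + 0.05
coefficients = cepstral_coefficients(marginal_spectrum)
print(coefficients.shape)
```

Keeping only the first coefficients retains the slowly varying envelope of the log spectrum, which is the usual motivation for truncating a cepstrum.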

16.
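To address the difficulty of feature extraction and the low classification accuracy for motor imagery EEG signals, an algorithm is proposed that extracts features using wavelet entropy and classifies them with a support vector machine (SVM). The power of the motor imagery EEG signal is computed, the wavelet packet scale is selected through theoretical analysis, and the signal power is decomposed by wavelet packets and its wavelet packet entropy (WPE) is calculated. The interpolated wavelet packet entropy values of the C3 and C4 leads are extracted to form the feature vector, which is fed as input to the SVM classifier. The algorithm was validated on the Graz data from the international BCI Competition 2003, where its highest classification accuracy reached 97.56%. The algorithm has a low feature vector dimension, a small data volume, and high classification accuracy, and can serve as a reference method for feature extraction and classification of motor imagery EEG signals.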

17.
This paper presents an efficient approach for automatic speaker identification based on cepstral features and the Normalized Pitch Frequency (NPF). Most relevant speaker identification methods adopt a cepstral strategy. Including the pitch frequency as a new feature in the speaker identification process is expected to enhance identification accuracy. In the proposed framework, a neural classifier with a single hidden layer is used. Different transform domains are investigated for reliable feature extraction from the speech signal. Moreover, a noise reduction pre-processing step is applied before feature extraction to enhance the performance of the speaker identification system. Simulation results show that using the NPF as a feature enhances the performance of the speaker identification system, especially with the Discrete Cosine Transform (DCT) and a wavelet denoising pre-processing step.

18.
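To address speech signal denoising, a wavelet entropy adaptive threshold denoising method is proposed. First, the noisy speech signal is decomposed with the wavelet transform, and the wavelet entropy of the sub-band intervals of the decomposed signal is computed. The wavelet entropy is then combined with an adaptive threshold to determine the threshold for the high-frequency coefficients at each level, a compromise exponential threshold function is applied to denoise the high-frequency coefficients of each level, and the denoised speech signal is reconstructed. Finally, the performance of the wavelet entropy adaptive threshold, minimax threshold, fixed threshold, and unbiased risk threshold denoising methods is compared. The experimental results show that at an input signal-to-noise ratio (SNR) of 5 dB, the wavelet entropy adaptive threshold method gives the highest output SNR, and its input-output SNR curve lies above those of the other three threshold denoising methods, confirming that the algorithm has better denoising performance.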
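The decompose, threshold, reconstruct pipeline in this abstract can be sketched as follows. The sketch does not implement the abstract's entropy-adaptive threshold or its compromise exponential threshold function, whose exact forms are not given here. It swaps in a standard soft threshold with a single universal threshold, which roughly corresponds to the "fixed threshold" baseline. The Haar wavelet, the four levels, the median-based noise estimate, and the square-wave test signal are further assumptions.

```python
import numpy as np


def haar_forward(x):
    even, odd = x[0::2], x[1::2]
    return (even + odd) / np.sqrt(2), (even - odd) / np.sqrt(2)


def haar_inverse(approx, detail):
    x = np.empty(2 * len(approx))
    x[0::2] = (approx + detail) / np.sqrt(2)
    x[1::2] = (approx - detail) / np.sqrt(2)
    return x


def soft_threshold(coeffs, threshold):
    """Shrink coefficients toward zero; those below the threshold become zero."""
    return np.sign(coeffs) * np.maximum(np.abs(coeffs) - threshold, 0.0)


def denoise(noisy, levels):
    approx = np.asarray(noisy, dtype=float)
    details = []
    for _ in range(levels):
        approx, detail = haar_forward(approx)
        details.append(detail)  # details[0] is the finest level
    # Noise level estimated from the finest details (median absolute deviation).
    sigma = np.median(np.abs(details[0])) / 0.6745
    threshold = sigma * np.sqrt(2.0 * np.log(len(noisy)))  # universal threshold
    for detail in reversed(details):  # rebuild from coarsest to finest
        approx = haar_inverse(approx, soft_threshold(detail, threshold))
    return approx


rng = np.random.default_rng(0)
# Toy signal (assumption): a square wave, not speech; length divisible by 2**levels.
clean = np.where(np.arange(1024) < 512, 1.0, -1.0)
noisy = clean + rng.normal(scale=0.3, size=clean.size)
denoised = denoise(noisy, levels=4)
mse_noisy = float(np.mean((noisy - clean) ** 2))
mse_denoised = float(np.mean((denoised - clean) ** 2))
print(mse_noisy, mse_denoised)
```

A per-level threshold, as the abstract describes, would replace the single `threshold` value with one computed separately for each entry of `details`.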

19.
In this paper, an automatic diagnosis system for hepatitis diseases based on Linear Discriminant Analysis (LDA) and an Adaptive Network based Fuzzy Inference System (ANFIS) is introduced. The system combines feature extraction and classification and has two stages: feature extraction–reduction and classification. In the feature extraction–reduction stage, the hepatitis features were obtained from the UCI Repository of Machine Learning Databases, and the number of features was then reduced from 19 to 8 using LDA. In the classification stage, these reduced features are fed as inputs to the ANFIS classifier. The correct diagnosis performance of the LDA-ANFIS automatic diagnosis system for hepatitis disease is estimated using classification accuracy, sensitivity, and specificity analysis. The classification accuracy of the LDA-ANFIS system for the diagnosis of hepatitis disease was about 94.16%.

20.
In the present study, wavelet transform (WT) and neural network techniques were developed for speech-based text-independent speaker identification. A feature extraction method was developed that combines the first five formants with the Shannon entropy of the wavelet packet (WP) at level four. Thirty-five features were fed to feed-forward backpropagation neural networks (FFPBNN) for classification. Feature extraction and classification are performed by the wavelet packet and formants neural networks (WPFNN) expert system. The reported results show that the proposed method can perform an effective analysis, with average identification rates reaching 91.09. Two published methods were investigated for comparison. The best recognition rate was obtained with WPFNN. The discrete wavelet transform (DWT) was studied to improve the system's robustness against noise of −2 dB.
