Similar articles
1.
Most state‐of‐the‐art speech recognition systems use continuous‐mixture hidden Markov models (CMHMMs) as acoustic models. On the other hand, it is well known that discrete hidden Markov model (DHMM) systems perform poorly because they are affected by quantization distortion. In this paper, we present an efficient acoustic modeling method based on discrete distributions for large‐vocabulary continuous speech recognition (LVCSR). In our previous work, we proposed the maximum a posteriori (MAP) estimation of discrete‐mixture hidden Markov model (DMHMM) parameters and showed that the DMHMM system performed better in noisy conditions than the conventional CMHMM system. However, we conducted the recognition experiments on a read‐speech task in which the vocabulary size was only 5k. In addition, the DMHMM was not effective in clean conditions in that work. In this paper, we have developed a DMHMM‐based LVCSR system and evaluated the system on a more difficult task under clean conditions. In Japan, the large‐scale spontaneous speech database ‘Corpus of Spontaneous Japanese’ has been used as the common evaluation database for spontaneous speech, and we used it for our experiments. The results show that the DMHMM system achieved almost the same performance as the CMHMM system. Moreover, performance could be improved further by a histogram equalization method. Copyright © 2010 Institute of Electrical Engineers of Japan. Published by John Wiley & Sons, Inc.

2.
Multiple-size units-based acoustic modeling has been proposed for large vocabulary speech recognition systems to improve recognition accuracy with limited training data. By introducing a limited number of long-size units into the unit set, this modeling scheme can achieve better acoustic model precision than modeling with short-size units only, without losing model trainability. However, such a multiple-size unit acoustic modeling paradigm does not always bring a reliable improvement in recognition performance: when a large number of long-size units are added, the amount of training data for short-size units decreases, resulting in insufficiently trained models. In this paper, a modified Baum-Welch training method is proposed, which uses product hidden Markov models (PHMMs) to couple units of different sizes and enables them to share the same portions of training data. Experimental results confirm the validity of the proposed method.

3.
In this paper, we explore features extracted from steady vowel segments for improving the performance of a speaker identification system under background noise. Steady vowel regions are produced by periodic impulse‐like excitation and they contain relatively high signal energy. Hence, speaker‐specific information present in steady vowel regions may be less affected by the noise. In this work, steady vowel regions are determined by using the knowledge of accurate vowel onset points and epochs. Speaker identification studies are carried out using the TIMIT database for white and vehicle noises. Universal background model–Gaussian mixture model‐based modeling is explored for developing speaker models. A significant improvement in speaker identification performance in noisy environments is observed when features extracted from steady vowel regions are used. Copyright © 2012 John Wiley & Sons, Ltd.

4.
To achieve decision‐level fusion of multi‐regional features and highlight the credibility of different regional evidences, a facial expression recognition method based on multi‐regional evidence fusion is proposed. A block histogram of gradient Gabor features in three regions, namely eyebrows, eyes, and mouth, is extracted from a facial image and regarded as evidence in expression classification. Then, category membership and regional contribution are solved with the region‐weighted semisupervised fuzzy c‐means clustering algorithm to construct the initial basic probability assignment (BPA) and emphasize the importance of different evidences, respectively. The initial BPA of evidence is further reassigned by combining region contribution and evidence supportability to reduce evidential conflict. Finally, the decision‐level fusion of multi‐regional evidences is obtained based on the Dempster–Shafer (D–S) combination rule. The experimental results for the Cohn–Kanade expression database show that the BPA construction method based on category‐membership degree and the reassignment strategy based on region contribution and evidence supportability improve the recognition rate and maintain good robustness for all types of expressions. Compared with existing decision‐level fusion strategies and classification methods, the proposed recognition framework based on D–S evidence theory has advantages in recognition performance and reliability, particularly in increasing the recognition rate for expressions that are difficult to distinguish, such as fear, sadness, and disgust. © 2016 Institute of Electrical Engineers of Japan. Published by John Wiley & Sons, Inc.
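
The Dempster–Shafer combination step named in the abstract above can be illustrated with a minimal sketch. The abstract gives neither BPA values nor the reassignment formula, so the masses for the eye and mouth regions below are invented purely for illustration, and the function name `dempster_combine` is hypothetical. Only the standard Dempster rule is shown, not the authors' region‐weighted reassignment.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two basic probability assignments (BPAs) with Dempster's rule.

    Each BPA maps a frozenset of hypotheses to its mass; masses sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            # Agreeing evidence supports the intersection of the two sets.
            combined[common] = combined.get(common, 0.0) + mass_a * mass_b
        else:
            # Disjoint focal elements contribute to the conflict mass K.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("evidence is in total conflict; rule is undefined")
    # Normalise by 1 - K so the combined masses sum to 1 again.
    return {k: v / (1.0 - conflict) for k, v in combined.items()}


# Illustrative (made-up) evidence from two facial regions.
FEAR, SAD = "fear", "sadness"
eyes = {frozenset({FEAR}): 0.6, frozenset({FEAR, SAD}): 0.4}
mouth = {frozenset({SAD}): 0.3, frozenset({FEAR, SAD}): 0.7}
fused = dempster_combine(eyes, mouth)
```

In this toy case the two regions disagree on part of their mass, and the rule redistributes the remaining mass after discarding the conflicting portion.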

5.
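Research on a speaker recognition system based on vowel MFCC
Speaker recognition is essentially the process of extracting speaker features from speech information and then performing pattern recognition on them. There are many ways to identify a speaker. This paper proposes first extracting the vowels from the speech, then computing the MFCC (Mel-frequency cepstral coefficient) feature parameters of the vowels and combining them with DTW (dynamic time warping) in multi-speaker, multi-word experiments. The experiments show that this approach raises the recognition rate by about 5%: from an average recognition rate of 83% for the original words to 88% after vowel extraction.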
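
The DTW matching step mentioned in the abstract above can be sketched as follows. This is a textbook dynamic time warping distance, not the authors' implementation; the function name `dtw_distance` is hypothetical, and the input sequences are assumed to be per-frame feature vectors (for example MFCC vectors) that have already been extracted.

```python
import math


def dtw_distance(seq_a, seq_b):
    """Dynamic time warping distance between two sequences of feature vectors."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated distance aligning seq_a[:i] with seq_b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(seq_a[i - 1], seq_b[j - 1])  # Euclidean frame distance
            cost[i][j] = local + min(
                cost[i - 1][j],      # stretch seq_b (repeat its frame)
                cost[i][j - 1],      # stretch seq_a (repeat its frame)
                cost[i - 1][j - 1],  # advance both sequences
            )
    return cost[n][m]


# Toy one-dimensional "feature" sequences: same shape, different duration.
template = [[0.0], [1.0], [2.0]]
utterance = [[0.0], [1.0], [1.0], [2.0]]
distance = dtw_distance(template, utterance)
```

A recognizer built this way would compare an utterance against each stored template and pick the template with the smallest distance.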

6.
The cascaded H‐bridge (CHB) multilevel inverter is recognized as the most suitable topology for high‐power medium‐voltage power quality conditioning applications. This paper presents mathematical modeling and an effective controller design methodology for CHB‐based active power filters (APFs), which achieve dynamic reactive power and harmonic compensation. The most crucial problem in CHB‐APF control is the simultaneous requirement of both accurate harmonic current compensation and dc‐link voltage stabilization among the H‐bridges, which is the prerequisite for the stable operation of the CHB‐APF. To achieve dc‐link stabilization, a novel voltage balancing algorithm is proposed by splitting the dc‐link voltage control task into two parts, namely, the average voltage control and the voltage balancing control, where the sine and cosine functions of the phase angle of the fundamental component of the grid voltage are used, respectively. To ensure accurate phase tracking, a novel phase‐locked loop (PLL) is proposed by using the adaptive linear neural network (ADALINE), where the grid voltage background distortion is also taken into account. The superior performance of the ADALINE‐PLL is validated by comparison with existing PLLs in the literature. Furthermore, the proportional‐resonant (PR) controller is used for the reference current tracking. A separate ADALINE algorithm is applied for reference current generation (RCG) for the CHB‐APF. The excellent performance of the ADALINE‐based RCG scheme is verified by comparison with the existing RCG schemes, namely, the low‐pass filter approach and the single‐phase p‐q method. Experimental results on a three‐module CHB‐APF are presented, which verify the effectiveness of the proposed control algorithms. Copyright © 2011 John Wiley & Sons, Ltd.

7.
This paper proposes a cooperative control of battery energy storage (BES) units within a microgrid (MG), which includes two control subsystems for the charge and discharge operation modes of the BES. In addition, the proposed cooperative control strategy provides accurate reactive power sharing among the BES units. During discharge operation, the proposed strategy utilizes a SoC-based droop control in order to avoid rapid depletion of the BES units, by giving the highest priority to their SoC level and respecting their power rating. This is achieved without any disturbance in the power balance of the MG. In addition, during charge operation of the BES units, the proposed control method uses a proportional-integral (PI) controller to limit the power absorbed by the BES and match it with the available surplus power from the renewable energy sources (RESs). This in turn avoids any power imbalance within the system. Finally, to utilize the extra capacity of the BES converters and also to avoid overloading of the RESs, a new adaptive virtual impedance (AVI) strategy is proposed, which provides accurate reactive power sharing by imposing a virtual impedance in series with the coupling impedance of each BES and RES unit. The system performance is validated through extensive simulations carried out in PSCAD/EMTDC software.

8.
In this paper, we present a new no‐reference (NR) image quality evaluation model for Joint Photographic Experts Group (JPEG) and JPEG2000 coded images. The proposed model is based on the blockiness around the block boundary, average absolute difference between adjacent pixels within the block, and zero‐crossing (ZC) rate within the block of the image. Subjective experimental results of the Laboratory for Image and Video Engineering (LIVE) Image Quality Assessment Database were used to train and test the model, which achieved sufficient quality prediction performance. Copyright © 2008 Institute of Electrical Engineers of Japan. Published by John Wiley & Sons, Inc.
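
The three features named in the abstract above (blockiness at block boundaries, average absolute difference inside blocks, and zero‐crossing rate) can be illustrated on a single row of pixels. This is only a sketch under stated assumptions: the abstract does not give the exact definitions or how the features are combined into a score, so the 8‐pixel block size, the function name `row_features`, and the simplification of counting zero crossings along the whole row rather than per block are all assumptions.

```python
def row_features(row, block=8):
    """Return (blockiness, in-block activity, zero-crossing rate) for one pixel row."""
    # Horizontal difference signal between neighbouring pixels.
    diffs = [row[j + 1] - row[j] for j in range(len(row) - 1)]

    # diffs[j] straddles a block boundary when pixel j is the last one of a block.
    boundary = [abs(d) for j, d in enumerate(diffs) if (j + 1) % block == 0]
    inner = [abs(d) for j, d in enumerate(diffs) if (j + 1) % block != 0]

    blockiness = sum(boundary) / len(boundary) if boundary else 0.0
    activity = sum(inner) / len(inner) if inner else 0.0

    # A zero crossing is a sign change between consecutive differences.
    crossings = sum(1 for a, b in zip(diffs, diffs[1:]) if a * b < 0)
    zc_rate = crossings / (len(diffs) - 1) if len(diffs) > 1 else 0.0
    return blockiness, activity, zc_rate


# Two flat 8-pixel blocks with a step between them: pure blocking artifact.
blocky_row = [10] * 8 + [20] * 8
features = row_features(blocky_row)
```

A flat row with a step exactly at the block boundary yields high blockiness and zero activity, which is the pattern a JPEG blocking artifact would produce.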

9.
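Stochastic subspace algorithm for low-frequency oscillation mode identification based on FCM clustering
Accurately capturing oscillation modes is important for effectively suppressing low-frequency oscillations, and measurement-based low-frequency oscillation mode identification methods have broad application prospects in online monitoring and identification. To address the difficulty of order determination and the tendency of mode identification algorithms to produce spurious modes, this paper proposes a multi-order stochastic subspace algorithm based on fuzzy C-means clustering. Multi-order subspace computation captures all possible system modes; the fuzzy C-means algorithm then determines the actual minimum order, and the final dominant oscillation modes are obtained after spurious modes are screened out. The approach also reduces interference and improves the noise robustness of the identification. The proposed algorithm is compared with the Prony algorithm, and its applicability and robustness are verified on a four-machine two-area system and on phasor measurement unit data from an actual power grid.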

10.
This paper describes a DC micro‐grid system interconnecting distributed power generators. The system consists of five generation and control units: a solar‐cell generation unit, a wind‐turbine generation unit, a battery energy‐storage unit, a flywheel power‐leveling unit, and an AC grid‐interconnecting power control unit. A control method is proposed for suppressing the circulating current by detecting only the DC grid voltage. This method brings high reliability, high flexibility, and maintenance‐free operation to the system. The method pays attention to the DC output voltage performance of each unit. Each of the power control units and the energy‐storage unit is controlled to act as a voltage source with imaginary impedance. On the other hand, each of the two generation units is controlled to act as a current source. The power‐leveling unit is controlled to act as a current source with frequency selectivity like that of a high‐pass filter. A 10‐kW prototype system experimentally verifies the validity and effectiveness of the proposed control method for the DC‐grid system. © 2009 Wiley Periodicals, Inc. Electr Eng Jpn, 167(2): 86–93, 2009; Published online in Wiley InterScience (www.interscience.wiley.com). DOI 10.1002/eej.20603

11.
12.
In this paper we present two supervised speaker adaptation methods, a feature normalization method and an MCE/GPD algorithm, developed to implement an MSVQ-based adaptive Chinese syllable recognition system. In MSVQ-based speech recognition, each recognition unit is represented as a time sequence of codebooks. The first proposed method is feature normalization, in which we model the inter-speaker variability as a linear transformation. By applying the feature normalization, the target speaker's speech is normalized to reduce the inter-speaker acoustic variability. In the second adaptation method we first present an implementation of the MCE/GPD algorithm for discriminatively training an MSVQ-based speech recognizer. It is expected that this method can separate the confusable classes and enhance speaker adaptation capability. By applying the MCE/GPD algorithm, the MSVQ-based recognizer parameters are adjusted iteratively to minimize the classification error rate. We carried out recognition experiments on highly confusable Chinese syllables to assess its performance. Using the standard Chinese syllable database CRDB in China, the results show that when the two adaptation methods are combined, the error rate reduction on open data is over 62% with a single set of adaptation training data. Therefore, when the amount of adaptation data is limited, the adaptation methods can lead to substantial improvement. As the training data increase, the capability of speaker adaptation is improved by using the MCE/GPD training only, so it can be used for tracking spectral evolution over time and provides a robust means for adaptive speech recognition. © 1997 John Wiley & Sons, Ltd.

13.
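In face recognition, features obtained by independent component analysis (ICA) describe the original image well but lack strong discriminative classification ability. Discriminative common vectors (DCV) is a method that derives the projection matrix in the null space of the within-class scatter matrix and, compared with linear discriminant analysis (LDA), yields more discriminative features. This paper therefore proposes a new feature extraction algorithm, called I-DCV: the ICA algorithm is first applied to the preprocessed face images to remove second-order and higher-order redundant information, DCV is then used to further process the resulting independent feature vectors, and classification is finally performed according to Euclidean distance. Experimental results show that the proposed method achieves good recognition performance.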

14.
In this paper, we propose a novel learning method for a two‐channel linear filter for target sound extraction in a non‐stationary noisy environment using a two‐channel microphone array. The method is based on a correlation coefficient between the sounds received by the two microphones. The cue signal, which is correlated with the variation in the S/N of the received sounds, is generated using the correlation coefficient and is applied to the learning. Computer simulation results show that the proposed method outperforms the previously proposed method, even in the consonant sections of the speech signal. © 2006 Wiley Periodicals, Inc. Electr Eng Jpn, 155(3): 45–52, 2006; Published online in Wiley InterScience (www.interscience.wiley.com). DOI 10.1002/eej.20141

15.
A reference‐less all‐digital burst‐mode clock and data recovery circuit (CDR) is proposed in this paper. The burst‐mode CDR includes a coarse and a fine time‐to‐digital converter (TDC) with an embedded phase generator. A low‐power current‐starved inverter is employed as the delay unit of the fine TDC to achieve high measurement resolution. A calibration method that diminishes the inherent delay is used to reduce the quantization error of the recovered clock. The proposed CDR is fabricated in a 65‐nm CMOS process. Experimental results show that the CDR operates from 0.9 to 1.1 Gbps and has a 13‐bit consecutive identical digits (CIDs) tolerance.

16.
The large‐scale penetration of renewable energy resources has increased the need for improved control techniques and modular power electronic converter structures for efficient and reliable operation of grid‐connected systems. This study investigates the performance of a grid‐connected 3‐phase 3‐level neutral‐point clamped voltage source inverter for renewable energy integration using an improved current control technique. For medium‐ or high‐voltage grid interfacing, the multilevel inverter structure is generally used to reduce the voltage stress across the switching devices as well as the harmonic distortion. The neutral‐point clamped voltage source inverter is controlled using a decoupling technique along with proper grid synchronization via a moving average filter–based phase‐locked loop. The moving average filter–based phase‐locked loop is used to reduce the delay in grid angle estimation under balanced as well as distorted grid conditions. A Lyapunov‐based approach for analysing the stability of the system is also discussed. In this study, the hardware‐in‐loop (HIL) simulation of the control algorithm and the grid synchronization technique is realized using a Virtex‐6 FPGA ML605 evaluation kit. The performance of the system is analyzed by conducting a time‐domain simulation in the Matlab/Simulink platform and is then examined in the HIL environment. The simulation and the hardware cosimulation results are presented to validate the effectiveness of the proposed control scheme.

17.
In this paper, a twice second‐order high‐pass error feedback (HPEF) scheme, which applies a re‐feedback process and a phase‐bit splitting technique to the second‐order HPEF, is proposed for designing a simplified low‐spur direct digital frequency synthesizer. The proposed method performs phase‐bit splitting and re‐feedback so that the phase changes substantially and the periodicity of the phase sequences in the original feedback path is strongly scrambled. In addition, the noise spectrum power is spread more uniformly, which effectively suppresses the spurs caused by phase‐truncation error. The twice second‐order HPEF is implemented on a field programmable gate array development board, the Altera Stratix II EP2S60. The simulation and experimental results show that the proposed method achieves better spectral performance, such as spurious‐free dynamic range, than the basic phase truncation, first‐order HPEF, and second‐order HPEF architectures. Copyright © 2012 John Wiley & Sons, Ltd.

18.
In this paper, a single‐phase quasi‐Z‐source (qZS) inverter (qZSI), integrating pulse width modulation (PWM) control with an interleaved‐and‐shifted shoot‐through state (STS) placement modulation technique, is proposed to simultaneously achieve both dc voltage boost and dc‐ac inversion. Instead of placing the STS in both inverter legs simultaneously, the method inserts the STS only in the left or right inverter leg during the positive or negative half cycle of the output voltage, respectively, to reduce switching losses and thermal stresses of the power devices. The STS shift is also studied to decrease the number of switching events of the power devices and thus further improve the efficiency. Theoretical analysis and design guidelines for the studied inverter are included. The effectiveness and performance improvements of the devised scheme and modulation strategy are demonstrated experimentally on a laboratory prototype and compared with previous studies.

19.
The interaction between humans and machines has become an issue of concern in recent years. Besides facial expressions and gestures, speech has proved to be one of the most promising modalities for automatic emotion recognition. Affective computing aims to support HCI (Human-Computer Interaction) at a psychological level, allowing PCs to adjust their reactions to human requirements. Therefore, the recognition of emotion is pivotal in high-level interactions. Each emotion has distinctive properties that allow us to recognize it. Changes in the acoustic signal produced for an identical expression or sentence are essentially a direct result of biophysical changes (for example, the stress-induced narrowing of the larynx) set off by emotions. This connection between acoustic cues and emotions has made Speech Emotion Recognition one of the active topics in affective computing. The main aim of a Speech Emotion Recognition algorithm is to determine the emotional state of a speaker from recorded speech signals. This research presents the results of applying k-NN and OVA-SVM to MFCC features with and without a feature selection approach. First, the MFCC features were extracted from the audio signal to characterize the properties of emotional speech. Second, nine basic statistical measures were calculated from the MFCCs, yielding 117-dimensional features that were used to train the classifiers for seven different classes of emotions (Anger, Happiness, Disgust, Fear, Sadness, Boredom and Neutral). Next, classification was done in four steps. First, all 117 features were classified using both classifiers. Second, the best classifier was found, and the features were then scaled to [-1, 1] and classified. In the third step, whichever option (with or without feature scaling) performed better in the second step was selected, and classification was done for each of the basic statistical measures separately. Finally, in the fourth step, the combination of statistical measures that gives better performance was derived using the forward feature selection method. Experiments were carried out using k-NN with different k values and a linear OVA-based SVM classifier with different optimal values. The Berlin emotional speech database for the German language was used to test the proposed methodology, and recognition rates as high as 60% were achieved for the recognition of emotion from the voice signal with the set of statistical measures (median, maximum, mean, inter-quartile range, skewness). OVA-SVM performs better than k-NN, and the use of the feature selection technique yields a high recognition rate.
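
The feature construction described in the abstract above (statistical measures computed over MFCC frames, concatenated into one vector per utterance) can be sketched as follows. The abstract names only five of the nine measures (median, maximum, mean, inter-quartile range, skewness), so this sketch uses just those five. The figure of 13 coefficients per frame is an inference from 117 = 13 × 9, not something the abstract states. The function names are hypothetical, the skewness definition (population third standardized moment) is an assumption, and MFCC extraction itself is omitted: `frames` is assumed to be an already computed list of per-frame coefficient lists.

```python
import statistics


def skewness(values):
    """Population skewness: third central moment over variance^1.5."""
    mu = statistics.fmean(values)
    m2 = sum((v - mu) ** 2 for v in values) / len(values)
    m3 = sum((v - mu) ** 3 for v in values) / len(values)
    return m3 / m2 ** 1.5 if m2 > 0 else 0.0


def utterance_features(frames):
    """Concatenate five statistics of each coefficient track over all frames."""
    features = []
    for track in zip(*frames):  # one track = one coefficient across time
        q1, _, q3 = statistics.quantiles(track, n=4)
        features += [
            statistics.median(track),
            max(track),
            statistics.fmean(track),
            q3 - q1,  # inter-quartile range
            skewness(track),
        ]
    return features


# Toy input: 4 frames with 2 coefficients each (real MFCC frames would have more).
frames = [[1.0, 0.0], [2.0, 0.0], [3.0, 0.0], [4.0, 0.0]]
vector = utterance_features(frames)
```

With 13 coefficients and all nine measures, the same construction would give the 117-dimensional vector the abstract reports; the resulting vectors could then be scaled to [-1, 1] and passed to a k-NN or SVM classifier.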

20.
This paper presents a free‐weighting matrix (FWM) method based on a linear control design approach for the wide‐area robust damping (WARD) controller associated with a flexible AC transmission system (FACTS) device to improve the dynamic performance of large‐scale power systems. First, the linearized reduced‐order plant model is established, which efficiently considers the time delay of the remote feedback signals transmitted by wide‐area measurement systems. Then, based on robust control theory, the design of the FACTS‐WARD controller is formulated as a standard delay‐dependent state‐feedback robust control problem, which is described by a set of linear matrix inequality constraints. Furthermore, in order to obtain the optimal control parameters that can endure the maximum time delay, an FWM approach is proposed to solve the time‐dependent problem of the time‐delay system. Meanwhile, an iterative algorithm based on cone complementary linearization is presented to search for the optimal control parameters. Finally, nonlinear simulations on the 2‐area 4‐machine and the 5‐area 16‐machine test systems are performed to evaluate the control performance of the proposed robust wide‐area time‐delay control approach. © 2011 Institute of Electrical Engineers of Japan. Published by John Wiley & Sons, Inc.
