Similar literature
20 similar documents found (search time: 0 ms)
1.
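Traditional audio time-scale modification uses the SOLA method, which is ill-suited to the transients that are widespread in audio signals and is therefore limited. Building on this method, an improved SOLA algorithm is proposed that effectively improves on the traditional algorithm. Alternatively, a time-scale modification method based on a sinusoidal model can be used, which flexibly applies different processing to different types of signals and better preserves transient characteristics. Simulation results show that both methods resolve the improper handling of transients in the traditional SOLA algorithm, and that each has its own advantages and disadvantages.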

2.
In this paper, a novel method for solving speech signal chaotic time series prediction models is proposed. A phase space is reconstructed based on the chaotic characteristics of the speech signal, and the genetic programming (GP) algorithm is introduced to solve the speech chaotic time series prediction models on the phase space with embedding dimension m and time delay τ. The chaotic time series models of the speech signal are then built. By standardizing these models and optimizing their parameters, a chaotic time series coding model of the speech signal with a certain generalization ability is obtained. The experimental results show that the proposed method obtains the speech signal chaotic time series prediction models much more effectively, and achieves better coding accuracy than linear predictive coding (LPC) algorithms and a neural network model.

3.
Suprasegmental (prosodic) features of discourse provide a vehicle by which speakers convey their mental purposes to listeners. Generating suitable prosody information is critical to expressing messages and improving the intelligibility and naturalness of synthetic speech. Generic prosody generators should provide information about pitch frequency (F0) contours, energy levels, word durations, and inter-word pause durations for speech synthesizers. The present study used a recurrent neural network (RNN) for prosody generation. The inputs of this RNN were word-level and syllable-level linguistic features. To provide data efficiently for the RNN-based prosody generator in the training, validation, and test phases, automatic segmentation and labeling of phonemes were performed. The number of inputs to the RNN was reduced by employing a binary gravitational search algorithm (BGSA) for feature selection (FS). The proposed prosody generator provided 12 output prosodic parameters for the current syllable, representing pitch contour, log-energy contour, inter-syllable pause duration, duration of the syllable, duration of the vowel in the syllable, and vowel onset time. Experimental results demonstrated the success of the RNN-based prosody generator in synthesizing the six prosodic elements with acceptable root mean square error (RMSE). By using a BGSA-based FS unit, a lighter neural model was achieved with a 53% reduction in the number of weight connections, producing RMSEs with acceptable degradation relative to the prosody generator without an FS unit. The performance of the BGSA-based FS method was compared with a binary particle swarm optimization (BPSO) algorithm, and the BGSA showed slightly better results. A modified mean opinion score scale was used to evaluate the intelligibility and naturalness of speech synthesized using the proposed method.

4.
5.
The current work intended to enhance our knowledge of changes or lack of changes in the speech signal when people were being deceptive. In particular, the study attempted to investigate the appropriateness of using speech cues in detecting deception. Truthful, deceptive and control speech were elicited from ten speakers in an interview setting. The data were subjected to acoustic analysis and results are presented on a range of speech parameters including fundamental frequency (f0), overall amplitude and mean vowel formants F1, F2 and F3. A significant correlation could not be established between deceptiveness/truthfulness and any of the acoustic features examined. Directions for future work are highlighted.

6.
A new sensor method based on the series piezoelectric quartz crystal (SPQC) sensing technique was proposed for studying the effect of surfactants on the growth of Pseudomonas aeruginosa in media. The frequency shift curves under different growth conditions were obtained and compared with each other. By fitting the frequency shift (ΔF) curves, three kinetic parameters (μm, λ and A) were obtained to describe the growth of the microorganisms. When ΔF became small and the frequency detection time (FDT) was prolonged, the surfactants had an inhibitory effect on the growth of the bacteria. At the same time, the lag time (λ) was prolonged, and the maximum specific growth rate (μm) and the asymptote (A) decreased. Using the proposed method, the influence of the concentration and ion type of surfactants, and of the hydrophilic group and the number of ethoxyl groups of nonionic surfactants, on the growth of the bacteria was investigated in detail.

7.
Achieving good surface roughness without wasting material is a major concern in the present manufacturing sector. Hence, in order to achieve good surface roughness and reduce production time, optimization is necessary. In this study, optimization techniques based on swarm intelligence (SI), namely the firefly algorithm (FA), particle swarm optimization (PSO) and a newly introduced metaheuristic algorithm, the bat algorithm (BA), have been implemented for optimizing machining parameters, namely cutting speed, feed rate, depth of cut, tool flank wear and cutting tool vibrations, in order to achieve minimum surface roughness. Two parameters, Ra and Rt, have been considered for evaluating the surface roughness. The performance of BA has been compared with FA and PSO, the latter being a widely used optimization algorithm in machining. The results indicate that BA produces better optimization than FA and PSO. Based on the literature review carried out, this work is a first attempt at using the metaheuristic BA in machining applications.

8.
A linear time recognition algorithm for proper interval graphs   Total citations: 1, self-citations: 0, citations by others: 1
We propose a linear time recognition algorithm for proper interval graphs. The algorithm is based on a certain ordering of vertices, called a bicompatible elimination ordering (BCO). Given a BCO of a biconnected proper interval graph G, we also propose a linear time algorithm to construct a Hamiltonian cycle of G.

9.
The NP-complete problem Proper Interval Vertex Deletion is to decide whether an input graph on n vertices and m edges can be turned into a proper interval graph by deleting at most k vertices. Van Bevern et al. (In: Proceedings WG 2010. Lecture notes in computer science, vol. 6410, pp. 232–243, 2010) showed that this problem can be solved in $\mathcal {O}((14k +14)^{k+1} kn^{6})$ time. We improve this result by presenting an $\mathcal {O}(6^{k} kn^{6})$ time algorithm for Proper Interval Vertex Deletion. Our fixed-parameter algorithm is based on a new structural result stating that every connected component of a {claw, net, tent, $C_4$, $C_5$, $C_6$}-free graph is a proper circular arc graph, combined with a simple greedy algorithm that solves Proper Interval Vertex Deletion on {claw, net, tent, $C_4$, $C_5$, $C_6$}-free graphs in $\mathcal {O}(n+m)$ time. Our approach also yields a polynomial-time 6-approximation algorithm for the optimization variant of Proper Interval Vertex Deletion.

10.
To generate the structure and parameters of a fuzzy rule base automatically, a particle swarm optimization algorithm with different length of particles (DLPPSO) is proposed in the paper. The main finding is that the structure and parameters of a fuzzy rule base can be generated automatically by the proposed PSO. In this method, the best fitness (fgbest) and the number (Ngbest) of active rules of the best particle in the current generation, the best fitness (fpbesti) that the ith particle has achieved so far, and the number (Npbesti) of its active rules when that best position emerged are used to determine the active rules of the ith particle in each generation. To increase the diversity of structure, a mutation operator is used to change the number of active rules of the particles. Compared with some other PSOs with different length of particles, the algorithm has good adaptive performance. To demonstrate the effectiveness of the given algorithm, a nonlinear function and two time series are used in the simulation experiments. Simulation results demonstrate that the proposed method can approximate the nonlinear function and forecast the time series efficiently.

11.
In this study, a combination of an artificial neural network (ANN) and the ant colony optimization (ACO) algorithm has been used for modeling and reducing NOx and soot emissions from a direct injection diesel engine. A feed-forward multi-layer perceptron (MLP) network is used to represent the relationship between the input parameters (i.e., engine speed, intake air temperature, rate of fuel mass injected, and power) and the output parameters (i.e., NOx and soot emissions). The ACO algorithm is employed to find the optimum intake air temperatures and rates of fuel mass injected for different engine speeds and powers, with the purpose of simultaneously reducing NOx and soot. The results reveal that the ANN can appropriately model the exhaust NOx and soot emissions, with correlation factors of 0.98 and 0.96, respectively. Further, the employed ACO algorithm yields 32% and 7% reductions in NOx and soot, respectively. The response time of the optimization process was less than 4 min on the particular PC system used in the present work. The high accuracy and speed of the model show its potential for application in intelligent control systems for diesel engines.

12.
A three-state Markov model of speech on telephone lines is developed. The model considers the alternating occurrence of telephone calls and intercall gaps on the telephone lines. During a phone call there are several talkspurts (speech without break) and pauses (silence without break) in the user's speech. The three types of events occurring on the telephone lines, namely intercall gaps (large gaps), talkspurts and pauses, are assumed to have negative exponential density functions with different transition rate parameters. The steady state probability distribution, the average and variance of the number of busy channels, and the system utilization are evaluated as functions of the call loss fraction (P0f). The Synchronous Time Division Multiplexing (STDM) system with the number of channels in the group equal to 6 and 24 is considered. The model is also applied to the Time Assignment Speech Interpolation (TASI) system, and the relationships among the number of user terminals, the speech Freeze Out Fraction (FOF) and the system utilization are obtained for various values of P0f. The STDM and TASI systems applied to a group of 6 channels are simulated on the EC-1030 computer to check the validity of the analytical results. The results of this study are presented in graphs and may be used as guidelines in the design of TASI systems.

13.
The scale-invariant behavior of the time structure of air pollutant concentration (APC) was investigated by applying the box counting method to APC time series. One-year series of hourly average APC observations, including O3, CO, SO2, NO, NO2, and PM10, obtained from urban, traffic, and national park air monitoring stations in Taipei (Taiwan), were transformed by this method into a useful compact form, namely the box-dimension (DB)-threshold (Th) and critical scale (CS)-threshold (Th) plots. The validity of this approach was supported by the result that the practical implications of the DB-Th (or CS-Th) plots could be interpreted in terms of traditional statistical parameters. Since the dependences of both DB and CS on the Th values were closely related to the variation of APC in time, they were used to characterize the temporal distribution of APC. The analysis confirmed the existence of scale invariance in the investigated APC time series. Moreover, DB (CS) was shown to be a decreasing (increasing) function of the threshold level, implying multifractal characteristics, i.e. the weak and intense regions scale differently. Some practical applications based on the box counting method were also discussed.
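The box counting idea described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the function names `box_counts` and `box_dimension`, the strict "greater than threshold" test, and the least-squares fit over all supplied scales are all hypothetical choices.

```python
import math

def box_counts(series, threshold, scales):
    """Count, for each box size s (in samples), the time boxes that
    contain at least one value above the threshold (assumed: strict >)."""
    exceed = [t for t, x in enumerate(series) if x > threshold]
    return {s: len({t // s for t in exceed}) for s in scales}

def box_dimension(counts):
    """Least-squares slope of log N(s) against log(1/s).
    Needs at least two scales with a nonzero count."""
    pts = [(math.log(1.0 / s), math.log(n)) for s, n in counts.items() if n > 0]
    mean_x = sum(x for x, _ in pts) / len(pts)
    mean_y = sum(y for _, y in pts) / len(pts)
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    return sxy / sxx

# A series that is always above the threshold fills the time axis (slope 1);
# a single isolated exceedance behaves like a point (slope 0).
always = box_counts([5.0] * 64, 1.0, [1, 2, 4, 8])
single = box_counts([0.0] * 31 + [5.0] + [0.0] * 32, 1.0, [1, 2, 4, 8])
```

Repeating the estimate over a range of thresholds would give one point per threshold of a DB-Th style curve.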

14.
This paper proposes a method for detecting word boundaries in continuous speech signals for Standard Colloquial Bengali (SCB), commonly referred to as Bangla. Bangla is a bound stress language with stress on the first syllable. Stress leaves its signature on the supra-segmental parameters of the speech signal, which may help to detect word boundaries in continuous speech. The parameters used in the present study are: (1) difference of the nucleus vowel duration across the syllable boundary, (2) difference of the normalized nucleus vowel power across the syllable boundary, (3) normalized F0 difference across the syllable boundary, (4) difference of the average normalized F0 across the syllable boundary, (5) difference of the normalized maximum periodic power of nucleus vowels across the syllable boundary, (6) onset duration of the nucleus vowel. Altogether, 225 sentences spoken by five native Bangla informants of both sexes, in the age group of 20–50 years, in a normal laboratory environment are used in this study. These sentences contain 2734 syllables and 1103 words, sentence terminal words being excluded. A recognition score of 87.8% is reported with a classifier based on a distance function weighted by the inverse of the variance. Both speaker-dependent and speaker-independent studies are included.

15.
We propose a new approach to estimating the a priori signal-to-noise ratio (SNR) based on a multiple linear regression (MLR) technique. In contrast to estimation of the a priori SNR with the decision-directed (DD) method, which uses the estimated speech spectrum in the previous frame, we propose to find the a priori SNR with the MLR technique by incorporating regression parameters such as the ratio between the local energy of the noisy speech and its derived minimum, along with the a posteriori SNR. In the experimental step, regression coefficients obtained using the MLR are assigned according to various noise types, for which we employ a real-time noise classification scheme based on a Gaussian mixture model (GMM). Evaluations using both objective speech quality measures and subjective listening tests under various ambient noise environments show that the proposed algorithm performs better than the conventional methods.

16.
Modeling NOx emissions from coal-fired utility boilers is critical for developing a predictive emissions monitoring system (PEMS) and for implementing a combustion optimization software package for low-NOx combustion. This paper presents an efficient NOx emissions model based on support vector regression (SVR), and compares its performance with traditional modeling techniques, i.e., back propagation (BPNN) and generalized regression (GRNN) neural networks. A large amount of NOx emissions data from an actual power plant was employed to train and validate the SVR model as well as the two neural network models. Moreover, an ant colony optimization (ACO) based technique was proposed to select the generalization parameter C and the Gaussian kernel parameter γ. The focus is on the predictive accuracy and time response characteristics of the SVR model. Results show that the ACO algorithm can automatically obtain the optimal parameters, C and γ, of the SVR model with very high predictive accuracy. The NOx emissions predicted by the SVR model, when compared with those of the BPNN model, were in good agreement with the measured values, and were comparable to those estimated by the GRNN model. The time needed to establish the optimum SVR model was on the scale of minutes, which is suitable for on-line and real-time modeling of NOx emissions from coal-fired utility boilers.

17.
The Cocke-Younger-Kasami algorithm (CYK) always requires $O(n^3)$ time and $O(n^2)$ space to recognize a trial sentence $\omega = w_1 w_2 \ldots w_n$, given an ε-free context-free grammar in Chomsky normal form. The same inductive rule that underlies the CYK algorithm may be used to produce a variant that computes the same information but requires (1) a maximum of $O(n^3)$ time and $O(n^2)$ space, and (2) only $O(s(n))$ space and time for an unambiguous grammar, where $s(n)$ is the number of triples $(A, i, j)$ for which a nonterminal symbol $A$ derives $w_i w_{i+1} \ldots w_{i+j-1}$. In this case, time and space consumed are at worst $O(n^2)$. It is shown in addition, for any grammar, that a parse may be obtained from the table left from the recognition algorithm in time $O(s(n))$ whether or not the grammar is ambiguous. The same procedure for the CYK algorithm requires time $O(n^2)$. The performance of our variant is quite similar to that of the Earley algorithm, except that the Earley algorithm substitutes for $s(n)$ a function which is usually smaller. The RAM model we use is strictly identical to the model used in the CYK algorithm. CR categories: 4.20, 5.23, 5.25.
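For reference, the baseline CYK recognizer that this abstract takes as its starting point can be sketched as below. This is the textbook cubic-time algorithm, not the variant the paper proposes; the function name `cyk_recognize` and the grammar encoding (lists of `(A, terminal)` and `(A, B, C)` tuples) are assumptions made for illustration.

```python
def cyk_recognize(words, unary, binary, start="S"):
    """Textbook CYK recognition for a grammar in Chomsky normal form.

    unary  -- list of (A, terminal) rules, A -> terminal
    binary -- list of (A, B, C) rules,     A -> B C
    """
    n = len(words)
    if n == 0:
        return False  # an epsilon-free grammar cannot derive the empty sentence
    # table[i][l - 1] holds the nonterminals deriving words[i : i + l]
    table = [[set() for _ in range(n - i)] for i in range(n)]
    for i, w in enumerate(words):
        table[i][0] = {a for a, t in unary if t == w}
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            cell = table[i][length - 1]
            for split in range(1, length):
                left = table[i][split - 1]
                right = table[i + split][length - split - 1]
                for a, b, c in binary:
                    if b in left and c in right:
                        cell.add(a)
    return start in table[0][n - 1]

# Hypothetical CNF grammar for the language a^n b^n (n >= 1):
#   S -> A B | A X,  X -> S B,  A -> a,  B -> b
UNARY = [("A", "a"), ("B", "b")]
BINARY = [("S", "A", "B"), ("S", "A", "X"), ("X", "S", "B")]
```

The three nested loops over length, start position and split point are what give the $O(n^3)$ time bound, and the triangular table is the $O(n^2)$ space; the number of nonempty (nonterminal, start, length) entries corresponds to the quantity the abstract calls $s(n)$.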

18.
We propose a coupled hidden Markov model (CHMM) approach to video-realistic speech animation, which realizes realistic facial animations driven by speaker-independent continuous speech. Unlike hidden Markov model (HMM)-based animation approaches that use a single-state chain, we use CHMMs to explicitly model the subtle characteristics of audio-visual speech, e.g., the asynchrony, temporal dependency (synchrony), and different speech classes between the two modalities. We derive an expectation maximization (EM)-based A/V conversion algorithm for the CHMMs, which converts acoustic speech into decent facial animation parameters. We also present a video-realistic speech animation system. The system transforms the facial animation parameters into a mouth animation sequence, refines the animation with a performance refinement process, and finally stitches the animated mouth seamlessly onto a background facial sequence. We have compared the animation performance of the CHMM with that of HMMs, multi-stream HMMs and factorial HMMs, both objectively and subjectively. Results show that the CHMMs achieve superior animation performance. The ph-vi-CHMM system, which adopts different state variables (phoneme states and viseme states) in the audio and visual modalities, performs best. The proposed approach indicates that explicitly modelling audio-visual speech is promising for speech animation.

19.
Estimating the noise power spectral density (PSD) from the corrupted speech signal is an essential component of speech enhancement algorithms. In this paper, a novel noise PSD estimation algorithm based on minimum mean-square error (MMSE) is proposed. The noise PSD estimate is obtained by recursively smoothing the MMSE estimate of the current noise spectral power. For the noise spectral power estimation, a spectral weighting function is derived, which depends on the a priori signal-to-noise ratio (SNR). Since the speech spectral power is highly important for the a priori SNR estimate, this paper proposes an MMSE spectral power estimator incorporating speech presence uncertainty (SPU) for the speech spectral power estimate, in order to improve the a priori SNR estimate. Moreover, a bias correction factor is derived for the speech spectral power estimation bias. The estimated speech spectral power is then used in the “decision-directed” (DD) estimator of the a priori SNR to achieve fast noise tracking. Compared with three state-of-the-art approaches, i.e., minimum statistics (MS), an MMSE-based approach, and a speech presence probability (SPP)-based approach, the experimental results show that the proposed algorithm exhibits better noise tracking capability under various nonstationary noise environments and SNR conditions. When it is employed in a speech enhancement system, improved speech enhancement performance in terms of segmental SNR improvement (SSNR+) and perceptual evaluation of speech quality (PESQ) can be observed.
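As a point of reference for the decision-directed (DD) estimator mentioned in this abstract, the classic form of the update can be sketched as follows. This is the standard DD rule, not the modified estimator the paper proposes; the function name `decision_directed_snr`, its argument names, and the default smoothing factor of 0.98 are assumptions for illustration.

```python
def decision_directed_snr(prev_speech_power, noise_power, noisy_power, alpha=0.98):
    """Classic decision-directed a priori SNR estimate, per frequency bin.

    prev_speech_power -- estimated speech spectral power from the previous frame
    noise_power       -- estimated noise PSD for each bin
    noisy_power       -- noisy-speech periodogram of the current frame
    alpha             -- smoothing factor in [0, 1) (assumed default)
    """
    estimates = []
    for s_prev, lam, y in zip(prev_speech_power, noise_power, noisy_power):
        gamma = y / lam  # a posteriori SNR of the current frame
        xi = alpha * s_prev / lam + (1.0 - alpha) * max(gamma - 1.0, 0.0)
        estimates.append(xi)
    return estimates
```

The first term carries over the previous frame's speech estimate, and the second is a clipped instantaneous estimate from the current frame; a larger `alpha` gives smoother but slower-reacting estimates.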

20.
This paper presents a heavy-tailed mixture model for describing time-varying conditional distributions in time series of returns on prices. Student-t component distributions are taken to capture the heavy tails typically encountered in such financial data. We design a mixture MT(m)-GARCH(p, q) volatility model for returns, and develop an EM algorithm for maximum likelihood estimation of its parameters. This includes formulation of proper temporal derivatives for the volatility parameters. The experiments with a low order MT(2)-GARCH(1, 1) show that it yields results with improved statistical characteristics and economic performance compared to linear and nonlinear heavy-tail GARCH, as well as normal mixture GARCH. We demonstrate that our model leads to reliable Value-at-Risk performance in short and long trading positions across different confidence levels.
