Similar literature
 20 similar documents found; search took 390 ms
1.
This paper proposes a method for tuning the weights of unit selection cost functions in a syllable-based text-to-speech (TTS) synthesis system. In this work, the unit selection cost functions, namely target cost and concatenation cost, are designed specifically for syllables. The method tunes the weights so that perceptual preference patterns are taken into account when units are selected. It uses a genetic algorithm to derive the optimal weights. A fitness function is designed to map perceptual preference patterns onto the weights of the unit selection cost functions. The effectiveness of the proposed method is evaluated with both subjective and objective measures. The results show that the derived optimal weights synthesize better-quality speech than manually tuned weights.
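The weight-tuning idea in this abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's method: the abstract does not give the fitness function, so the pairwise-agreement fitness below, the function names (`weighted_cost`, `fitness`, `tune_weights`), and all parameter values are assumptions made for the example.

```python
import random
from typing import List, Sequence, Tuple

# Hypothetical data format: each pair holds the sub-cost vector of the
# candidate listeners preferred and that of the candidate they rejected.
Pair = Tuple[Sequence[float], Sequence[float]]


def weighted_cost(weights: Sequence[float], sub_costs: Sequence[float]) -> float:
    """Total selection cost as a weighted sum of sub-costs."""
    return sum(w * c for w, c in zip(weights, sub_costs))


def fitness(weights: Sequence[float], pairs: Sequence[Pair]) -> float:
    """Assumed fitness: fraction of listener preferences the weights reproduce."""
    agree = sum(
        weighted_cost(weights, good) < weighted_cost(weights, bad)
        for good, bad in pairs
    )
    return agree / len(pairs)


def normalize(weights: Sequence[float]) -> List[float]:
    """Scale weights to sum to 1 (uniform weights if they are all zero)."""
    total = sum(weights)
    if total <= 0:
        return [1.0 / len(weights)] * len(weights)
    return [w / total for w in weights]


def tune_weights(
    pairs: Sequence[Pair],
    n_weights: int,
    pop_size: int = 30,
    generations: int = 50,
    mutation_sd: float = 0.1,
    seed: int = 0,
) -> List[float]:
    """Simple generational GA with elitism, tournaments, and Gaussian mutation."""
    rng = random.Random(seed)
    population = [
        normalize([rng.random() for _ in range(n_weights)])
        for _ in range(pop_size)
    ]
    for _ in range(generations):
        ranked = sorted(population, key=lambda w: fitness(w, pairs), reverse=True)
        next_gen = ranked[:2]  # elitism: the two best individuals survive unchanged
        while len(next_gen) < pop_size:
            # Tournament selection: best of three random individuals, twice.
            a = max(rng.sample(ranked, 3), key=lambda w: fitness(w, pairs))
            b = max(rng.sample(ranked, 3), key=lambda w: fitness(w, pairs))
            # Uniform crossover, then Gaussian mutation clipped at zero.
            child = [rng.choice(genes) for genes in zip(a, b)]
            child = [max(0.0, g + rng.gauss(0.0, mutation_sd)) for g in child]
            next_gen.append(normalize(child))
        population = next_gen
    return max(population, key=lambda w: fitness(w, pairs))
```

With elitism, the best fitness in the population never decreases from one generation to the next. A real system would presumably derive the preference pairs from listening tests and might use a smoother fitness than a raw agreement rate.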

2.
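Obtaining speech features through feature selection is currently one of the more effective approaches to speaker recognition. However, the optimal speech features vary with the specific application environment. This paper therefore proposes a wrapper-based genetic feature selection algorithm over four types of speech features (FSF-WrGAF). The algorithm extracts four types of speech feature parameters and performs wrapper-style dynamic feature selection using a chain-like agent genetic algorithm and GMM-UBM to achieve high recognition accuracy. Several metrics are used to test the algorithm's performance. Experimental results show that the algorithm is simple to implement and yields clear improvements, significantly outperforming comparable algorithms on several metrics (recognition rate, EER, and DET curve).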

3.
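In distributed databases, relations and their fragments are stored as multiple replicas across multiple sites, which enlarges the query search space, raises time complexity, and lowers the efficiency of searching for a query execution plan (QEP). To address this problem, a parallel genetic and Max-Min Ant System algorithm (PGA-MMAS) based on the design criteria of a fragment allocation selector (FSS) is proposed. First, an FSS is designed around a real enterprise distributed information management system to heuristically select better relation replicas, reducing query join cost and shrinking the search space of PGA-MMAS. Next, exploiting the fast convergence of the genetic algorithm (GA), the final join relations are encoded and subjected to parallel genetic operations to obtain a set of relatively good QEPs, which are converted into the initial pheromone distribution of the parallel Max-Min Ant System (MMAS) so that it finds the globally optimal QEP more quickly. Finally, simulation experiments are run with different numbers of relations. The results show that the FSS-based PGA-MMAS searches for the optimal QEP more efficiently than the original GA and than the FSS-based GA, MMAS, and GA-MMAS. Validation in a real engineering application shows that the high-quality QEPs found by the proposed algorithm improve the efficiency of multi-relation queries in distributed databases.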

4.
This paper presents the design and development of an Auto Associative Neural Network (AANN) based unrestricted prosodic information synthesizer. An unrestricted text-to-speech (TTS) system is capable of synthesizing speech from different domains with improved quality. This paper deals with a corpus-driven text-to-speech system based on the concatenative synthesis approach. Concatenative speech synthesis involves concatenating basic units to synthesize intelligible, natural-sounding speech. A corpus-based method (unit selection) uses a large inventory from which units are selected and concatenated. Prosody prediction is performed with a five-layer auto associative neural network, which helps to improve the quality of the synthesized speech. Syllables are used as the basic units of the speech synthesis database. The database consisting of the units along with their annotated information is called the annotated speech corpus. A clustering technique is applied to the annotated speech corpus to select the appropriate unit for concatenation, based on the lowest total join cost of the speech unit. Discontinuities at the unit boundaries are reduced using the mel-LPC smoothing technique. Experiments were conducted for the Dravidian language Tamil, and the results demonstrate the improved intelligibility and naturalness of the proposed method. The proposed system is applicable to other languages provided the syllabification rules are changed accordingly.

5.
This paper describes a new Korean Text-to-Speech (TTS) system based on a large speech corpus. Conventional concatenative TTS systems still produce machine-like synthetic speech. The poor naturalness is caused by excessive prosodic modification using a small speech database. To cope with this problem, we utilized a dynamic unit selection method based on a large speech database without prosodic modification. The proposed TTS system adopts triphones as synthesis units. We designed a new sentence set maximizing phonetic or prosodic coverage of Korean triphones. All the utterances were segmented automatically into phonemes using a speech recognizer. With the segmented phonemes, we achieved a synthesis unit cost of zero if two synthesis units were placed consecutively in an utterance. This reduces the number of concatenation points at which mismatches may occur. In this paper, we present data concerning the realization of major prosodic variations through a consideration of prosodic phrase break strength. The phrase break was divided into four levels of strength based on pause length. Using phrase break strength, triphones were further classified to reflect major prosodic variations. To predict phrase break strength from text, we adopted an HMM-like Part-of-Speech (POS) sequence model. The model achieved 73.5% accuracy for 4-level break strength prediction. For unit selection, a Viterbi beam search was performed to find the most appropriate triphone sequence, namely the one with the minimum continuation cost of prosody and spectrum at concatenation boundaries. In an informal listening test, we found that the proposed Korean corpus-based TTS system showed better naturalness than the conventional demisyllable-based one.
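The Viterbi beam search mentioned in this abstract can be illustrated with a short sketch. It is a generic formulation under stated assumptions, not the system's implementation: the function name `select_units`, the callback signatures, and the default beam width are hypothetical, and the actual cost definitions are not given in the abstract.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

Unit = TypeVar("Unit")


def select_units(
    candidates: Sequence[Sequence[Unit]],
    target_cost: Callable[[int, Unit], float],
    join_cost: Callable[[Unit, Unit], float],
    beam_width: int = 10,
) -> Tuple[List[Unit], float]:
    """Return the unit sequence with the lowest accumulated cost.

    candidates[i] lists the candidate units for target position i.
    Only the beam_width cheapest partial paths are kept after each step,
    so the result is exact only when the beam is wide enough.
    """
    if not candidates or any(len(c) == 0 for c in candidates):
        raise ValueError("every target position needs at least one candidate")

    # Each beam entry is (accumulated cost, path so far).
    beam = sorted(
        ((target_cost(0, u), [u]) for u in candidates[0]),
        key=lambda entry: entry[0],
    )[:beam_width]

    for i in range(1, len(candidates)):
        extended = []
        for u in candidates[i]:
            # Viterbi step: keep only the cheapest predecessor path for u.
            cost, path = min(
                ((c + join_cost(p[-1], u), p) for c, p in beam),
                key=lambda entry: entry[0],
            )
            extended.append((cost + target_cost(i, u), path + [u]))
        # Beam pruning: discard all but the cheapest partial paths.
        beam = sorted(extended, key=lambda entry: entry[0])[:beam_width]

    best_cost, best_path = beam[0]
    return best_path, best_cost
```

A join cost that returns zero for units that were adjacent in the same recorded utterance reproduces the zero-cost rule described above, which steers the search toward long contiguous stretches of the corpus.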

6.
In unit selection-based concatenative speech synthesis, join cost (also known as concatenation cost), which measures how well two units can be joined together, is one of the main criteria for selecting appropriate units from the inventory. Usually, some form of local parameter smoothing is also needed to disguise the remaining discontinuities. This paper presents a subjective evaluation of three join cost functions and three smoothing methods. We also describe the design and performance of a listening test. The three join cost functions were taken from our previous study, where we proposed join cost functions derived from spectral distances, which have good correlations with perceptual scores obtained for a range of concatenation discontinuities. This evaluation allows us to further validate their ability to predict concatenation discontinuities. The units for synthesis stimuli are obtained from a state-of-the-art unit selection text-to-speech system: rVoice from Rhetorical Systems Ltd. In this paper, we report listeners' preferences for each join cost in combination with each smoothing method.

7.
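A unit selection speech synthesis method incorporating automatic error detection is proposed. The method aims to design unit selection criteria that agree more closely with subjective listening impressions, so as to improve the naturalness of synthetic speech. First, a crowdsourcing web platform is used to collect listeners' subjective evaluations of synthetic speech quickly and in large quantities, replacing the traditional approach of collecting such data from experts with linguistic knowledge. Then, based on these evaluations, features such as syllable duration, unit cost, and acoustic parameter distance are extracted from the corresponding speech to build a synthesis error detector based on a support vector machine. At synthesis time, the detector rescores the N paths output by conventional unit selection to determine the optimal unit sequence. Preference listening tests show that the method effectively improves the naturalness of synthetic speech.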

8.
This paper presents a variable-length unit selection scheme based on syntactic cost to select text-to-speech (TTS) synthesis units. The syntactic structure of a sentence is derived from a probabilistic context-free grammar (PCFG) and represented as a syntactic vector. The syntactic difference between target and candidate units (words or phrases) is estimated by the cosine measure, with the inside probability of the PCFG acting as a weight. Latent semantic analysis (LSA) is applied to reduce the dimensionality of the syntactic vectors. A dynamic programming algorithm is adopted to obtain a concatenated unit sequence with minimum cost. A speech database rich in syntactic properties is designed and collected as the unit inventory. Several experiments with statistical testing are conducted to assess the quality of the synthetic speech as perceived by human subjects. The proposed method outperforms a synthesizer that does not consider syntactic properties. The structural syntax estimates the substitution cost better than the acoustic features alone.
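One plausible reading of the weighted cosine measure in this abstract is sketched below. The abstract does not say exactly how the inside probability enters the measure, so applying it as a per-dimension weight is an assumption, as are the function names `weighted_cosine` and `syntactic_cost`.

```python
import math
from typing import Sequence


def weighted_cosine(
    target: Sequence[float],
    candidate: Sequence[float],
    weights: Sequence[float],
) -> float:
    """Cosine similarity in which dimension k is scaled by weights[k].

    Assumption: weights[k] stands in for the PCFG inside probability
    associated with dimension k of the syntactic vectors.
    """
    if not (len(target) == len(candidate) == len(weights)):
        raise ValueError("vectors and weights must have the same length")
    dot = sum(w * t * c for w, t, c in zip(weights, target, candidate))
    norm_t = math.sqrt(sum(w * t * t for w, t in zip(weights, target)))
    norm_c = math.sqrt(sum(w * c * c for w, c in zip(weights, candidate)))
    if norm_t == 0.0 or norm_c == 0.0:
        return 0.0  # no usable syntactic information in one of the vectors
    return dot / (norm_t * norm_c)


def syntactic_cost(
    target: Sequence[float],
    candidate: Sequence[float],
    weights: Sequence[float],
) -> float:
    """Turn the similarity into a cost: 0 for identical direction, 1 for orthogonal."""
    return 1.0 - weighted_cosine(target, candidate, weights)
```

A dimension with zero weight is ignored entirely, so candidates that differ from the target only in low-probability structure are penalized little.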

9.
《Artificial Intelligence》1986,29(3):339-347
A heuristic search strategy via islands is suggested to significantly decrease the number of nodes expanded. Algorithm I, which searches through a set of island nodes (“island set”), is presented assuming that the island set contains at least one node on an optimal cost path. This algorithm is shown to be admissible and expands no more nodes than A*. For cases where the island set does not contain a node on an optimal cost path (or on any path), Algorithm I', a modification of Algorithm I, is suggested. This algorithm ensures a suboptimal cost path (which may be optimal) and in extreme cases falls back to A*.

10.
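Research on a Chinese speech synthesis system based on a statistical prosody model   Total citations: 2, self-citations: 4, citations by others: 2
This paper describes an approach that uses statistical models to analyze the prosodic hierarchy of Chinese and to model prosody, and builds a Chinese speech synthesis system on that basis. It details the construction of the prosodic cost function and the algorithm for automatically training its parameters. It also analyzes how interactions among prosodic features affect the selection of syllable units, and finally implements a syllable unit selection model for Chinese speech synthesis in continuous speech. Tests show that the proposed statistical approach to prosodic hierarchy analysis and prosody modeling is well suited to building Chinese speech synthesis systems and yields synthetic speech with good naturalness.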

11.
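杨万春, 张晨曦, 穆斌. Journal of Computer Applications (《计算机应用》), 2016, 36(8): 2207-2212
Service level agreement (SLA)-aware service selection is an NP-hard problem. To address the dimension and granularity issues in service selection, a quality of service (QoS)-aware service selection optimization model combining semantic and transactional properties is proposed. The model optimizes service selection along three dimensions (semantic link matching degree, QoS, and transactions), and a coding strategy supporting multiple granularities is designed. To address the high time complexity of service selection, a hybrid optimization algorithm combining clonal selection with a genetic algorithm is proposed. The algorithm first uses a dynamic fitness function to eliminate, generation by generation, individuals that violate the constraints. Second, it assigns priorities to the transactional properties and designs knowledge-guided heuristic crossover and mutation operators according to those priorities, ensuring that individuals satisfy the transactional requirements. Finally, it applies clonal selection to elite individuals within the genetic algorithm to strengthen the search for the optimal solution. In simulation experiments, the algorithm outperforms the genetic algorithm in both the accuracy and the success rate of service selection; its running time is slightly higher than that of the genetic algorithm but far lower than that of exhaustive search. The results show that the proposed algorithm maintains the quality of service selection at a modest time cost.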

12.
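Against the background of developing an English text-to-speech system, English speech is synthesized using a concatenative synthesis method based on a large corpus. Because English has polysyllabic words and an unlimited vocabulary, three concatenation units of different lengths are used: word, syllable, and phone. The CART (classification and regression tree) decision-tree method is introduced to preselect speech units from the large corpus, and a corresponding unit selection algorithm is designed. Experiments show that the method yields clear, natural synthesis results and improves the efficiency of unit selection.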

13.
In concatenative Text-to-Speech, the size of the speech corpus is closely related to synthetic speech quality. In this paper, we describe our work on a new corpus-based Bell Labs' TTS system. This encompasses large acoustic inventories with a rich set of annotations, models and data structures for representing and managing such inventories, and an optimal unit selection algorithm that accommodates a broad range of possible cost criteria. We also propose a new method for setting weights in the cost functions based on a perceptual preference test. Our results show that this approach can successfully predict human preference patterns. Synthetic speech using weights determined in this manner consistently demonstrates smoother transitions and higher voice quality than speech using manually set weights.

14.
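For the logistics scheduling problem in a discrete workshop with automated guided vehicles (AGVs), a multi-objective logistics scheduling optimization model is built with two objectives: minimizing the time penalty cost of logistics tasks and minimizing the total travel distance of the vehicles. A multi-objective hybrid variable neighborhood search genetic algorithm based on Pareto optimization (VNSGA-II) is designed. Built on a genetic algorithm, it achieves multi-objective optimization by evaluating the population with the Pareto ranking and crowding distance calculations of NSGA-II. To improve search ability and avoid local optima, an elite-preserving memory archive protects elite individuals, and the local search ability of variable neighborhood search is exploited through six random neighborhood structures designed for the characteristics of the model. An insertion neighborhood based on the critical AGV and a swap neighborhood based on critical logistics tasks are also proposed as adjustment strategies to reduce cost further. Finally, taking the logistics scheduling of a discrete workshop as an example, the problem is solved with VNSGA-II, the elitist fast nondominated sorting genetic algorithm II (Nondominated Sorting Genetic Algorithm II, NSGA-II), and the strength Pareto evolutionary algorithm 2 (Strength Pareto Evolutionary Algorithm 2, SPEA2). The results show that VNSGA-II obtains a better Pareto solution set, verifying the effectiveness and feasibility of the algorithm.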
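The Pareto ranking and crowding distance calculations that this abstract borrows from NSGA-II can be sketched as follows. This is a plain illustration rather than the authors' code: it uses a simple repeated-filtering sort instead of the bookkeeping of the fast nondominated sort, assumes every objective is minimized, and the function names are hypothetical.

```python
from typing import Dict, List, Sequence

Point = Sequence[float]  # one objective vector, e.g. (time penalty cost, travel distance)


def dominates(a: Point, b: Point) -> bool:
    """True if a is no worse than b on every objective and better on at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated_sort(points: Sequence[Point]) -> List[List[int]]:
    """Split point indices into Pareto fronts, best front first."""
    remaining = set(range(len(points)))
    fronts: List[List[int]] = []
    while remaining:
        # A point belongs to the current front if nothing left dominates it.
        front = sorted(
            i for i in remaining
            if not any(dominates(points[j], points[i]) for j in remaining if j != i)
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts


def crowding_distance(points: Sequence[Point], front: Sequence[int]) -> Dict[int, float]:
    """Crowding distance of each index in one front (larger means less crowded)."""
    distance = {i: 0.0 for i in front}
    n_objectives = len(points[front[0]])
    for m in range(n_objectives):
        ordered = sorted(front, key=lambda i: points[i][m])
        low, high = points[ordered[0]][m], points[ordered[-1]][m]
        # Boundary solutions are always kept, so they get infinite distance.
        distance[ordered[0]] = distance[ordered[-1]] = float("inf")
        if high == low:
            continue
        for k in range(1, len(ordered) - 1):
            gap = points[ordered[k + 1]][m] - points[ordered[k - 1]][m]
            distance[ordered[k]] += gap / (high - low)
    return distance
```

Selection then prefers individuals in earlier fronts and, within the same front, those with the larger crowding distance, which keeps the solution set spread along the trade-off between the two objectives.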

15.
In an electricity market, generation companies need suitable bidding models to maximize their profits. Each supplier therefore bids strategically, choosing its bidding coefficients to counter its competitors' bidding strategies. In this paper, the optimal bidding strategy problem is solved using a novel algorithm based on the Shuffled Frog Leaping Algorithm (SFLA). SFLA is a memetic meta-heuristic designed to seek a globally optimal solution by performing a heuristic search. It combines the benefits of the genetic-based Memetic Algorithm (MA) and the social behavior-based Particle Swarm Optimization (PSO). As a result, it searches more precisely, avoiding premature convergence and the need to select operators. The proposed method therefore overcomes the shortcomings of the Genetic Algorithm (GA) and PSO, namely operator selection and premature convergence. An important merit of the proposed SFLA is its faster convergence. The proposed method is verified numerically through computer simulations on the IEEE 30-bus system, consisting of 6 suppliers, and a practical 75-bus Indian system, consisting of 15 suppliers. The results show that SFLA takes less computational time and produces higher profits than Fuzzy Adaptive PSO (FAPSO), PSO, and GA.

16.
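王欣, 吴志勇, 蔡莲红. Journal of Software (《软件学报》), 2014, 25(S2): 63-69
Speech synthesis is an important medium in human-computer spoken interaction, and unit selection algorithms have long been a research focus in concatenative synthesis. Building on the traditional cost function-based unit selection algorithm for concatenative synthesis, the stable-segment boundary model of diphones is applied to words and syllables; a hierarchical variable-length unit selection algorithm over the three unit models then selects the best unit sequence from the corpus and concatenates it into the final speech. On the one hand, the algorithm uses a hierarchical, unified variable-length selection strategy to choose, wherever possible, larger units with better prosodic properties and acoustic continuity, which markedly reduces the number of concatenation points and places points prone to coarticulation or segmentation errors inside larger units. On the other hand, it changes the traditional unit boundary type through stable-segment segmentation, taking full advantage of the good concatenation properties of diphone stable-segment boundaries and thereby improving the continuity and naturalness of the synthetic speech. Evaluation results show that the method significantly improves synthesis quality compared with the traditional diphone concatenative method.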

17.
BURS theory provides a powerful mechanism to efficiently generate pattern matches in a given expression tree. BURS, which stands for bottom-up rewrite system, is based on term rewrite systems, to which costs are added. We formalise the underlying theory, and derive an algorithm that computes all pattern matches. This algorithm terminates if the term rewrite system is finite. We couple this algorithm with the well-known search algorithm A* that carries out pattern selection. The search algorithm is directed by a cost heuristic that estimates the minimum cost of code that has yet to be generated. The advantage of using a search algorithm is that we need to compute only those costs that may be part of an optimal rewrite sequence (and not the costs of all possible rewrite sequences as in dynamic programming). A system that implements the algorithms presented in this work has been built. Received: 20 November 1995 / 26 June 1996

18.
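Train rescheduling is a special class of NP-complete problem. Because of its many constraints, huge search space, and narrow feasible region, an optimal solution is hard to obtain. Based on the characteristics of the high-speed train rescheduling problem, and building on the firefly algorithm (FA), a representative and promising intelligent algorithm, a discrete firefly algorithm (DFA) tailored to the practical problem is proposed. To increase the diversity of the firefly swarm and keep the algorithm from falling into local optima, a perturbation mechanism based on variable neighborhood search is adopted. The algorithm is applied to the high-speed train rescheduling problem, and comparative case studies show that the schedule computed by the discrete firefly algorithm is better than that produced by ordinary heuristic algorithms.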

19.
Clonal Strategy Algorithm Based on the Immune Memory   Total citations: 4, self-citations: 0, citations by others: 4
Based on the clonal selection theory and the immune memory mechanism of the natural immune system, a novel artificial immune system algorithm, the Clonal Strategy Algorithm based on the Immune Memory (CSAIM), is proposed in this paper. The algorithm evolves the antibody population and the memory unit at the same time and, by using a clonal selection operator, combines global optimization with local search. According to the antibody-antibody (Ab-Ab) affinity and the antibody-antigen (Ab-Ag) affinity, the algorithm adaptively allocates the sizes of the memory unit and the antibody population. It is proved theoretically that CSAIM converges with probability 1. Computer simulations on eight benchmark functions and one instance of the traveling salesman problem (TSP) show that CSAIM converges quickly, enhances the diversity of the population, and avoids premature convergence to some extent.

20.
We present the Dynamic Programming Projected Phase-Slope Algorithm (DYPSA) for automatic estimation of glottal closure instants (GCIs) in voiced speech. Accurate estimation of GCIs is an important tool that can be applied to a wide range of speech processing tasks including speech analysis, synthesis and coding. DYPSA is automatic and operates using the speech signal alone without the need for an EGG signal. The algorithm employs the phase-slope function and a novel phase-slope projection technique for estimating GCI candidates from the speech signal. The most likely candidates are then selected using a dynamic programming technique to minimize a cost function that we define. We review and evaluate three existing methods of GCI estimation and compare the new DYPSA algorithm to them. Results are presented for the APLAWD and SAM databases, for which 95.7% and 93.1% of GCIs are correctly identified.
