19 similar documents found (search time: 109 ms)
1.
2.
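Building on a simulation of how a person visually identifies pitch periods, this paper proposes a time-domain pitch period extraction algorithm based on the visual shape of the speech waveform. The algorithm determines the pitch period from several attributes of the waveform: the amplitudes and positions of the primary and secondary peaks, and the distance from each peak to the preceding peak. It is simple, computationally cheap, and can accurately locate each pitch period.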
3.
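The autocorrelation function (ACF) method, the average magnitude difference function (AMDF) method, and the wavelet transform method are classic pitch detection methods. This paper briefly analyzes their shortcomings when each is used alone and proposes a wavelet-based weighted autocorrelation detection method. The approximation components of a multi-level wavelet transform are combined in a weighted sum to emphasize pitch information, and the autocorrelation function is weighted by an improved AMDF to emphasize the peak at the true pitch period, improving detection accuracy. Experiments show that, compared with the conventional ACF and AMDF methods, the proposed method reduces frequency-doubling and frequency-halving errors, improves pitch detection accuracy, and still yields fairly accurate results at a signal-to-noise ratio of -5 dB.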
4.
5.
6.
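The autocorrelation function method and the wavelet transform method are classic pitch detection methods. After briefly analyzing the shortcomings of using either one alone, this paper proposes a detection method that combines an improved autocorrelation with weighted wavelet components. The improved autocorrelation function applies amplitude compensation to the conventional autocorrelation function, offsetting the amplitude decay that occurs as the lag increases. The components of a multi-level wavelet transform are combined in a weighted sum to emphasize the pitch information in speech, and the two methods are then combined to emphasize the peak at the true pitch period. Experimental results show that, compared with the conventional autocorrelation and wavelet transform methods, the combined method reduces frequency-doubling, frequency-halving, and pseudo-random-point errors and improves pitch detection accuracy.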
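The lag-dependent decay mentioned in this abstract can be illustrated with a small sketch. The abstract does not say how the amplitude compensation is done; dividing by the number of summed terms (the unbiased estimator) is an assumption made here for illustration, and the function names are hypothetical.

```python
import math

def raw_acf(frame, lag):
    """Conventional short-time autocorrelation: sums only len(frame) - lag products,
    so its amplitude shrinks as the lag grows."""
    return sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))

def compensated_acf(frame, lag):
    """Assumed compensation (not taken from the paper): divide by the number of
    summed terms so that peaks at different lags are comparable."""
    return raw_acf(frame, lag) / (len(frame) - lag)

# Synthetic frame with a period of 50 samples.
frame = [math.sin(2 * math.pi * n / 50) for n in range(300)]

# The raw peak at two periods (lag 100) is lower than at one period (lag 50);
# after compensation the two peaks have the same height.
print(raw_acf(frame, 50), raw_acf(frame, 100))
print(compensated_acf(frame, 50), compensated_acf(frame, 100))
```

With this normalization, any bias toward short lags that comes purely from the shrinking summation window is removed.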
7.
QIAN Kai-hua 《数字社区&智能家居》2008,(10)
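Based on a study of voice conversion, this paper proposes a method for converting the characteristics of a source speaker into those of a target speaker. The feature parameters for voice conversion fall into two classes: (1) spectral feature parameters; (2) pitch and tone patterns. The signal model and the conversion method are described for each. Spectral features are modeled with phoneme-based two-dimensional HMMs, and the F0 contour represents pitch and tone. Pitch-synchronous overlap-add is used to modify the pitch period, tone, and speaking rate.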
8.
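Accurate detection of the pitch frequency of speech is one of the difficult problems in speech signal processing. This paper proposes a weighted short-time autocorrelation function (ACF) algorithm for extracting the pitch frequency. Building on the conventional ACF method, the square of the short-time average magnitude difference function (AMDF) is used to weight the ACF, which strengthens the peaks of the short-time autocorrelation function at multiples of the pitch period. The extracted fundamental frequency contour is then smoothed. Experimental results show that the method improves the accuracy of pitch period detection.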
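A minimal sketch of an AMDF-weighted ACF pitch estimator follows. The abstract says only that the squared AMDF weights the ACF; dividing the ACF by the squared AMDF plus a small constant is an assumption here. The function names, the search range, and `eps` are hypothetical, and the smoothing step is omitted.

```python
import math

def short_time_acf(frame, lag):
    """Short-time autocorrelation of one frame at one lag."""
    return sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))

def short_time_amdf(frame, lag):
    """Average magnitude difference of one frame at one lag (near zero at the pitch period)."""
    n = len(frame) - lag
    return sum(abs(frame[i] - frame[i + lag]) for i in range(n)) / n

def weighted_acf_pitch(frame, fs, f_min=60.0, f_max=400.0, eps=1e-3):
    """Return the pitch estimate in Hz from the lag that maximizes ACF / (AMDF^2 + eps).
    The division form and eps are illustrative assumptions."""
    lag_min = int(fs / f_max)
    lag_max = int(fs / f_min)
    best_lag, best_score = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        score = short_time_acf(frame, lag) / (short_time_amdf(frame, lag) ** 2 + eps)
        if score > best_score:
            best_lag, best_score = lag, score
    return fs / best_lag

# Synthetic voiced frame: 100 Hz fundamental plus a second harmonic, sampled at 8 kHz.
fs = 8000
frame = [math.sin(2 * math.pi * 100 * n / fs) + 0.5 * math.sin(2 * math.pi * 200 * n / fs)
         for n in range(400)]
print(weighted_acf_pitch(frame, fs))
```

Because the AMDF has a valley where the ACF has a peak, the ratio sharpens the peak at the true period relative to neighbouring lags.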
9.
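This paper introduces background on English pronunciation and intonation and, in light of the current situation of English learners in China, studies an objective evaluation system for English sentences based on stress and prosody. Energy features are extracted from the speech to mark stress in English sentences, and an improved Pairwise Variability Index (PVI) algorithm serves as the core of sentence evaluation, with the aim of improving speakers' command of stress and rhythm in English sentences.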
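For orientation, the widely used normalized PVI can be computed as below. This is the standard formulation, not the improved variant the abstract refers to, whose details are not given; the function name and the sample durations are hypothetical.

```python
def normalized_pvi(durations):
    """Standard normalized Pairwise Variability Index:
    100 * mean over successive pairs of |d_k - d_{k+1}| / ((d_k + d_{k+1}) / 2)."""
    if len(durations) < 2:
        raise ValueError("need at least two intervals")
    total = 0.0
    for a, b in zip(durations, durations[1:]):
        total += abs(a - b) / ((a + b) / 2.0)
    return 100.0 * total / (len(durations) - 1)

# Hypothetical interval durations in milliseconds.
print(normalized_pvi([100, 200, 100, 200]))  # strongly alternating rhythm
print(normalized_pvi([120, 120, 120]))       # perfectly even rhythm
```

Higher values indicate greater contrast between neighbouring intervals, which is typical of stress-timed rhythm.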
10.
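QIAN Kai-hua (钱开华) 《数字社区&智能家居》2008,(4):132-134
Based on a study of voice conversion, this paper proposes a method for converting the characteristics of a source speaker into those of a target speaker. The feature parameters for voice conversion fall into two classes: (1) spectral feature parameters; (2) pitch and tone patterns. The signal model and the conversion method are described for each. Spectral features are modeled with phoneme-based two-dimensional HMMs, and the F0 contour represents pitch and tone. Pitch-synchronous overlap-add is used to modify the pitch period, tone, and speaking rate.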
11.
Although English pitch accent detection has been studied extensively, relatively few works explore Mandarin stress detection. Moreover, Mandarin stress detection and English pitch accent detection have not been compared and analyzed as counterpart tasks. In this paper, we discuss Mandarin stress detection and compare it with English pitch accent detection. The paper makes two contributions. First, we use a classifier combination method to detect Mandarin stress and English pitch accent using acoustic, lexical and syntactic evidence. Our proposed method outperforms the baseline system on both the Mandarin prosodic annotation corpus ASCCD and the English prosodic annotation corpus, the Boston University Radio News Corpus (BURNC). We also verify our proposed method on additional prosodic annotation and continuous speech data. Second, we present a feature analysis: duration, pitch, energy and intensity features are compared for Mandarin stress detection and English pitch accent detection. Based on the analysis of the prosodic annotation corpora, we also verify some linguistic conclusions.
12.
Automatic prosodic break detection and annotation are important for both speech understanding and natural speech synthesis. In this paper, we discuss automatic prosodic break detection and feature analysis. The paper makes two contributions. First, we use a classifier combination method to detect Mandarin and English prosodic breaks using acoustic, lexical and syntactic evidence. Our proposed method achieves better performance on both the Mandarin prosodic annotation corpus (Annotated Speech Corpus of Chinese Discourse) and the English prosodic annotation corpus (Boston University Radio News Corpus) than the baseline system and the experimental results reported by other researchers. Second, we present a feature analysis for prosodic break detection: the roles of different features, such as duration, pitch, energy, and intensity, are analyzed and compared in Mandarin and English prosodic break detection. Based on the feature analysis, we also verify some linguistic conclusions.
13.
《Computer Speech and Language》2014,28(5):1083-1114
Discriminative confidence based on multi-layer perceptrons (MLPs) and multiple features has shown a significant advantage over the widely used lattice-based confidence in spoken term detection (STD). Although the MLP-based framework can handle any features derived from a multitude of sources, using all possible features may lead to overly complex models and hence reduced generality. In this paper, we design an extensive set of features and analyze their contribution to STD individually and as a group. The main goal is to choose a small set of features that are sufficiently informative while keeping the model simple and generalizable. We employ two established models to conduct the analysis: one is linear regression, which targets the most relevant features, and the other is logistic linear regression, which targets the most discriminative features. We find that the most informative features are those derived from diverse sources (ASR decoding, duration and lexical properties) and that the two models deliver highly consistent feature ranks. STD experiments on both English and Spanish data demonstrate significant performance gains with the proposed feature sets.
14.
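Sentiment lexicons are a basic resource for text sentiment analysis, but building them by hand is labor-intensive and coverage is limited. One feasible approach is to automatically extract sentiment words from microblog data, an important medium through which new sentiment words spread. Using the Chinese microblog data provided for Task 3 of the COAE 2014 evaluation, this paper finds that conventional co-occurrence-based methods, such as pointwise mutual information, are ineffective for discovering new sentiment words in Chinese microblog data. It therefore designs a set of classification features based on context words, namely N-gram features, to characterize the contexts and usage patterns of sentiment words, trains a classifier using known sentiment words as training data, and classifies candidate sentiment words. Experimental results show that this method outperforms conventional co-occurrence-based methods. The experiments also show that, unlike in English, Chinese sentiment words often appear as nouns, and co-occurrence-based methods cannot effectively distinguish such sentiment words. This is the main reason they fail, and the proposed classification features address the problem.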
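One way to picture context-based N-gram features is the sketch below. The abstract does not define the exact feature templates, so the window size, padding symbols, and feature naming here are all hypothetical, and an English token list is used only for readability.

```python
def context_ngram_features(tokens, index, window=2):
    """Collect left and right context n-grams (n = 1..window) around the candidate
    word at position `index`. Templates and names are illustrative assumptions."""
    padded = ["<s>"] * window + list(tokens) + ["</s>"] * window
    center = index + window
    left = padded[center - window:center]
    right = padded[center + 1:center + 1 + window]
    features = []
    for n in range(1, window + 1):
        features.append("L%d=%s" % (n, "_".join(left[-n:])))
        features.append("R%d=%s" % (n, "_".join(right[:n])))
    return features

tokens = ["this", "movie", "is", "really", "awesome", "!"]
print(context_ngram_features(tokens, 4))
# ['L1=really', 'R1=!', 'L2=is_really', 'R2=!_</s>']
```

Features like these could be fed to any standard classifier trained on known sentiment words. The abstract does not say which classifier the authors used.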
15.
William Yang Wang Fadi Biadsy Andrew Rosenberg Julia Hirschberg 《Computer Speech and Language》2013,27(1):168-189
Traditional studies of speaker state focus primarily upon one-stage classification techniques using standard acoustic features. In this article, we investigate multiple novel features and approaches to two recent tasks in speaker state detection: level-of-interest (LOI) detection and intoxication detection. In the task of LOI prediction, we propose a novel Discriminative TFIDF feature to capture important lexical information and a novel Prosodic Event detection approach using AuToBI; we combine these with acoustic features for this task using a new multilevel multistream prediction feedback and similarity-based hierarchical fusion learning approach. Our experimental results outperform published results of all systems in the 2010 Interspeech Paralinguistic Challenge – Affect Subchallenge. In the intoxication detection task, we evaluate the performance of Prosodic Event-based, phone duration-based, phonotactic, and phonetic-spectral based approaches, finding that a combination of the phonotactic and phonetic-spectral approaches achieves significant improvement over the 2011 Interspeech Speaker State Challenge – Intoxication Subchallenge baseline. We discuss our results using these new features and approaches and their implications for future research.
16.
Investigating new, effective feature extraction methods for the speech signal is an important way to improve the performance of automatic speech recognition (ASR) systems. Because the reconstructed phase space (RPS) is well suited to capturing signal dynamics, in this paper we propose a new method for feature extraction from the trajectory of the speech signal in the RPS. This method is based upon modeling the speech trajectory using the multivariate autoregressive (MVAR) method. Linear discriminant analysis (LDA) is then used to simultaneously decorrelate the final feature set and reduce its dimension. Experimental results show that an MVAR model of order 6 is appropriate for modeling the trajectory of speech signals in the RPS. Recognition experiments are conducted with an HMM-based continuous speech recognition system and a naive Bayes isolated phoneme classifier on the Persian FARSDAT and American English TIMIT corpora to compare the proposed features with some older RPS-based features and traditional spectral-based MFCC features.
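A reconstructed phase space is commonly built by time-delay embedding, as sketched below. The abstract gives neither the embedding dimension nor the delay, so the values used here are placeholders and the function name is hypothetical. The MVAR modeling and LDA steps are not shown.

```python
def delay_embed(signal, dim, delay):
    """Time-delay embedding: point i is (s[i], s[i + delay], ..., s[i + (dim - 1) * delay])."""
    n_points = len(signal) - (dim - 1) * delay
    if n_points <= 0:
        raise ValueError("signal too short for this dim and delay")
    return [tuple(signal[i + k * delay] for k in range(dim)) for i in range(n_points)]

# Toy signal; dim=3 and delay=2 are placeholder values, not taken from the paper.
print(delay_embed([0, 1, 2, 3, 4, 5], dim=3, delay=2))
# [(0, 2, 4), (1, 3, 5)]
```

Each tuple is one point on the trajectory, and the sequence of points is what a trajectory model such as MVAR would be fitted to.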
17.
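Event detection aims to automatically identify event triggers in unstructured text and classify them into the correct event types. Compared with English, Chinese must be word-segmented before lexical information can be used, and it also suffers from mismatches between segmented words and triggers. Considering the characteristics of the Chinese language and of the event detection task, this paper proposes a Chinese event detection model enhanced with multi-word lexical features: an external lexicon supplies the character-level model with a word set containing multiple candidate words, so that lexical information from several possible segmentations can be used. Static word-frequency statistics from text and an automatic word segmentation tool jointly determine the weights of the words in the set, yielding more precise lexical semantics. In comparative experiments against existing models on the ACE2005 Chinese dataset, the proposed method achieves the best performance, confirming its effectiveness for Chinese event detection.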
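The idea of giving each character a set of candidate words from an external lexicon can be sketched as follows. This covers only the lexicon-matching step; the weighting by word frequency and by a segmentation tool described in the abstract is not shown. The function name, the toy lexicon, and the maximum word length are hypothetical.

```python
def words_covering_each_char(sentence, lexicon, max_len=4):
    """For every character, list all lexicon words (length >= 2) that cover it,
    regardless of which segmentation they would belong to."""
    covering = [[] for _ in sentence]
    for start in range(len(sentence)):
        for end in range(start + 2, min(len(sentence), start + max_len) + 1):
            word = sentence[start:end]
            if word in lexicon:
                for pos in range(start, end):
                    covering[pos].append(word)
    return covering

# Toy lexicon chosen to create overlapping segmentations.
lexicon = {"南京", "南京市", "市长", "长江", "大桥", "长江大桥"}
result = words_covering_each_char("南京市长江大桥", lexicon)
print(result[2])  # words covering the character "市"
```

In this example the character "市" is covered by both "南京市" and "市长", which is the kind of ambiguity a single segmentation would hide from a character-level model.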
18.
《IEEE transactions on audio, speech, and language processing》2009,17(5):935-944
19.
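Large-scale unlabeled corpora contain rich lexical information that can help improve models for joint Chinese word segmentation and part-of-speech tagging. This paper extracts distributional information about words from an unlabeled corpus and represents it as high-dimensional vectors. An autoencoder neural network then learns, without supervision, an encoding of these high-dimensional vectors, yielding low-dimensional feature representations that can be used directly in the segmentation and tagging model. Experiments on the Penn Chinese Treebank 5.0 dataset show that the resulting lexical features substantially help the segmentation and tagging model and, for part-of-speech tagging, outperform an unsupervised feature learning method that combines principal component analysis with k-means clustering.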