Similar literature
 20 similar documents found.
1.
A novel approach to the combination of volatility forecasts is discussed. The proposed procedure makes use of the generalized method of moments (GMM) for estimating the combination weights. The asymptotic properties of the GMM estimator are derived while its finite sample properties are assessed by means of a simulation study. The results of an application to a time series of daily returns on the S&P500 are presented.

2.
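Research on a GMM-based text-independent speaker recognition system   Total citations: 1 (self-citations: 2; citations by others: 1)
For training Gaussian mixture models (GMM), the traditional parameter initialization methods (random initialization and K-means clustering) are improved, and a new method combining the splitting method with K-means clustering is proposed. Experiments show that the improved method raises the system's average recognition rate by 15.47% and 7.5% over the traditional methods. The effects of the GMM order, the covariance threshold, and the pre-emphasis coefficient on the recognition rate are studied. The experimental results are analyzed in detail and, based on the experimental data, the best-performing value of each parameter is selected so that the resulting speaker recognition system achieves a relatively high recognition rate. Experiments show that, under the specified experimental conditions, the system reaches a recognition rate above 90%.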

3.
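A method for recognizing natural environmental sounds based on Gaussian mixture models (GMM) is proposed. Mel-frequency cepstral coefficients (MFCCs) are extracted to analyze the sound signal; for each sound class, a GMM is built from the MFCC feature set using the expectation-maximization algorithm; recognition uses the minimum-error-rate decision rule together with voting. Using GMMs to recognize 36 kinds of natural environmental sounds yields an accuracy of up to 95.83%, better than K-nearest neighbors (KNN).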
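
The decision stage described in this abstract (per-class GMMs, a minimum-error-rate rule, then voting) can be sketched roughly as follows. This is an illustrative sketch, not the authors' implementation: it assumes equal class priors (so the minimum-error-rate rule reduces to picking the class with the highest likelihood), diagonal covariances, and already-trained models. The function names, the toy class labels "rain" and "wind", and the two-dimensional "features" are hypothetical stand-ins for real MFCC vectors.

```python
import math
from collections import Counter

def gmm_log_likelihood(x, weights, means, variances):
    """Log-likelihood of one feature vector under a diagonal-covariance GMM."""
    log_terms = []
    for w, mu, var in zip(weights, means, variances):
        log_p = math.log(w)
        for xi, mi, vi in zip(x, mu, var):
            log_p += -0.5 * (math.log(2 * math.pi * vi) + (xi - mi) ** 2 / vi)
        log_terms.append(log_p)
    # Log-sum-exp over mixture components for numerical stability.
    m = max(log_terms)
    return m + math.log(sum(math.exp(t - m) for t in log_terms))

def classify_by_vote(frames, models):
    """Label every frame with its most likely class, then take a majority vote."""
    votes = Counter()
    for x in frames:
        best = max(models, key=lambda name: gmm_log_likelihood(x, *models[name]))
        votes[best] += 1
    return votes.most_common(1)[0][0]

# Hypothetical toy models: (weights, means, variances) per class.
models = {
    "rain": ([0.5, 0.5], [[0.0, 0.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]]),
    "wind": ([1.0], [[5.0, 5.0]], [[1.0, 1.0]]),
}
frames = [[0.2, 0.1], [0.9, 1.1], [4.8, 5.2]]
predicted = classify_by_vote(frames, models)
print(predicted)
```

In this toy input two of the three frames lie near the "rain" components, so the vote favors that class even though one frame is closer to "wind".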

4.
Object recognition by combining paraperspective images   Total citations: 2 (self-citations: 2; citations by others: 0)
This paper provides a study on object recognition under paraperspective projection. Discussed is the problem of determining whether or not a given image was obtained from a 3-D object to be recognized. First it is clarified that paraperspective projection is the first-order approximation of perspective projection. Then it is shown that, if we represent an object as a set of its feature points and the object undergoes a rigid transformation or an affine transformation, any paraperspective image can be expressed as a linear combination of several appropriate paraperspective images: we need at least three images for rigid transformations, whereas we need at least two images for affine transformations. Particularly in the case of a rigid transformation, the coefficients of the combination have to satisfy two conditions: orthogonality and norm equality. A simple algorithm to solve the above problem based on these properties is presented: a linear, single-shot algorithm. Some experimental results with synthetic images and real images are also given. This work was done while the author was with ATR Auditory and Visual Perception Research Laboratories. Advanced Research Laboratory, Hitachi, Ltd.

5.
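A method is proposed that applies a regression model based on deep neural network (DNN) feature mapping to the identity vector (i-vector)/probabilistic linear discriminant analysis (PLDA) speaker system. By fitting the nonlinear function relating the i-vectors of noisy speech to those of clean speech, the DNN obtains an approximate representation of the clean-speech i-vector, thereby reducing the impact of noise on system performance. Experiments on the TIMIT dataset verify the feasibility and effectiveness of the method.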

6.
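Improved speaker clustering initialization and GMM for multi-speaker recognition*   Total citations: 2 (self-citations: 1; citations by others: 1)
To address the poor accuracy of linear initialization for multi-speaker clustering, an improved cluster initialization method is proposed. The method introduces the Bayesian information criterion (BIC) to test and split the initial clusters produced by linear initialization, which effectively raises the purity of the initial speaker clusters. The method is then applied to a Gaussian mixture model (GMM) multi-speaker recognition system. Experimental results show that the proposed method raises the average cluster purity (ACP) by 48.51% and lowers the system's recognition error rate by 12.09% on average.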

7.
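To explore the role of Gaussian mixture models in speaker recognition, a GMM-based speaker recognition system is designed. The system consists of four modules: audio signal preprocessing, voice activity detection, speaker model building, and audio signal recognition. The first three modules form the model-training part of the system, and the last module forms the recognition part. The GMM-based voice activity detector in the second module is the novel contribution of the work. Some of the system's tunable parameters and its recognition error rate were tested on audio-visual meetings from the Augmented Multi-party Interaction meeting corpus. Simulation results show that, with the help of the voice activity detector and several filtering algorithms, the system's recognition accuracy on audio containing overlapping speech reaches 83.02%.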

8.
Most state-of-the-art speaker recognition systems are based on discriminative learning approaches. On the other hand, generative Gaussian mixture models (GMM) have been widely used in speaker recognition during the last decades. In an earlier work, we proposed an algorithm for discriminative training of GMM with diagonal covariances under a large margin criterion. In this paper, we propose an improvement of this algorithm, which has the major advantage of being computationally highly efficient and thus well suited to large-scale databases. We also develop a new strategy to detect and handle the outliers that occur in the training data. To evaluate the performance of our new algorithm, we carry out full NIST speaker identification and verification tasks using NIST-SRE’2006 data, in a Symmetrical Factor Analysis compensation scheme. The results show that our system significantly outperforms the traditional discriminative support vector machine (SVM)-based system of SVM-GMM supervectors in both speaker recognition tasks.

9.
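For the complex, heterogeneous wireless environment of mines, an underground signal recognition method based on higher-order cumulants and a DNN model is proposed, achieving automatic modulation recognition of underground digital signals including BPSK, QPSK, 8PSK, 2FSK, 4FSK, 8FSK, 32QAM, 64QAM, and OFDM. Theoretical higher-order cumulant values of the nine digital signals are derived, and the Fourier transform is used to improve signal discriminability. The effect of the underground small-scale fading channel on the higher-order cumulants is analyzed, and expressions for the higher-order cumulants of signals after passing through the underground fading channel are derived. Feature parameters are constructed from the theoretical cumulant values and used to train the DNN model, which performs the recognition. Simulation results show that the method achieves excellent modulation recognition performance over the mine Nakagami-m fading channel: the average correct recognition rate is above 89.2% at an SNR of -5 dB and 100% at SNRs above 5 dB. The method offers a new approach to signal recognition and detection in special complex environments.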
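
As a rough illustration of the kind of feature this abstract relies on, the sketch below estimates second- and fourth-order cumulants of a zero-mean complex baseband sequence from sample moments, using the standard definitions C40 = M40 - 3·M20² and C42 = M42 - |M20|² - 2·M21². The abstract does not state which cumulants or feature ratios the authors use, so the choice of C40 and C42, the function names, and the noiseless unit-power test constellations are all assumptions made for illustration.

```python
def moment(x, p, q):
    """Sample estimate of the mixed moment M_pq = E[x^(p-q) * conj(x)^q]."""
    return sum(s ** (p - q) * s.conjugate() ** q for s in x) / len(x)

def cumulants(x):
    """Second- and fourth-order cumulants of a zero-mean complex sequence."""
    x = [complex(s) for s in x]
    m20 = moment(x, 2, 0)
    m21 = moment(x, 2, 1)   # average power E[|x|^2]
    m40 = moment(x, 4, 0)
    m42 = moment(x, 4, 2)   # E[|x|^4]
    return {
        "C20": m20,
        "C21": m21,
        "C40": m40 - 3 * m20 ** 2,
        "C42": m42 - abs(m20) ** 2 - 2 * m21 ** 2,
    }

# Noiseless, unit-power constellations with each symbol used equally often.
bpsk = [1, -1, 1, -1]
qpsk = [1, 1j, -1, -1j]

c_bpsk = cumulants(bpsk)
c_qpsk = cumulants(qpsk)
print(c_bpsk["C40"], c_bpsk["C42"])
print(c_qpsk["C40"], c_qpsk["C42"])
```

The two constellations give different C40 values, which is what makes such statistics usable as inputs to a classifier; a fading channel and noise would shift the estimates, which is the effect the abstract says the authors analyze.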

10.
With the development of intelligent surveillance systems, human behavior recognition has been extensively researched. Most previous methods recognized human behavior based on spatial and temporal features from the current input image sequences, without predicting behavior from previously recognized behaviors. As an example of behavior prediction, “punching” is more probable in the current frame when the previous behavior is “standing” than when it is “lying down.” Nevertheless, there has been little study of combining currently recognized behavior information with behavior prediction. Therefore, we propose a fuzzy-system-based behavior recognition technique that combines behavior prediction and recognition. To perform behavior recognition during daytime and nighttime, a dual camera system of visible light and thermal (far infrared light) cameras is used to capture 12 datasets including 11 different human behaviors in various surveillance environments. Experimental results with the collected datasets and an open database showed that the proposed method achieved higher behavior recognition accuracy than conventional methods.

11.
Speech processing is a very important research area that includes speaker recognition, speech synthesis, speech codecs, and speech noise reduction. Many languages have different speaking styles called accents or dialects. Identifying the accent before speech recognition can improve the performance of speech recognition systems. If a language has many accents, accent recognition becomes crucial. Telugu is an Indian language widely spoken in the southern part of India. Telugu has several accents; the main ones are coastal Andhra, Telangana, and Rayalaseema. In this work, speech samples are collected from native speakers of the different Telugu accents for both training and testing. Mel frequency cepstral coefficient (MFCC) features are extracted from each training and test sample. A Gaussian mixture model (GMM) is then used to classify the speech by accent. The overall efficiency of the proposed system in recognizing the region a speaker belongs to, based on accent, is 91%.

12.
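Finger vein recognition authenticates individuals using the uniqueness of the vein structure in human fingers, and offers high security and convenience. To further improve the performance of finger vein recognition systems, a method that fuses local and global features is proposed. Local features are extracted with the local binary pattern method and matching scores are computed with the Hamming distance; global features are extracted with bidirectional two-dimensional principal component analysis and matching scores are computed with the Euclidean distance; the two matching scores are fused at the score level to produce the recognition result. Experimental results show that local and global features are complementary and that fusing them effectively improves recognition accuracy.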
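
A minimal sketch of score-level fusion along these lines is given below. The abstract names the two distances but not the fusion rule, so the weighted sum, the scaling of the Euclidean distance by a hypothetical upper bound `global_max`, and the default weight of 0.5 are assumptions chosen only to make the idea concrete. Feature extraction (LBP and bidirectional 2D PCA) is not shown; the inputs are taken to be an already-computed binary code and a real-valued feature vector.

```python
import math

def hamming_score(code_a, code_b):
    """Normalized Hamming distance between equal-length binary codes (0 = identical)."""
    if len(code_a) != len(code_b):
        raise ValueError("codes must have the same length")
    return sum(a != b for a, b in zip(code_a, code_b)) / len(code_a)

def euclidean_score(vec_a, vec_b):
    """Euclidean distance between two global feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(vec_a, vec_b)))

def fuse(local_score, global_score, global_max, weight=0.5):
    """Weighted sum of both distances after scaling the Euclidean one into [0, 1]."""
    scaled_global = min(global_score / global_max, 1.0)
    return weight * local_score + (1 - weight) * scaled_global

local = hamming_score([1, 0, 1, 1], [1, 1, 1, 0])
global_ = euclidean_score([0.0, 0.0], [3.0, 4.0])
fused = fuse(local, global_, global_max=10.0)
print(local, global_, fused)
```

Both inputs here are distances, so a lower fused value means a closer match; a decision threshold on the fused score would have to be tuned on real data.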

13.
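An online handwritten Uyghur whole-word recognition system based on a dual-engine HMM and GMM recognition model is presented. In the GMM part, the system extracts 8-direction features, generates 8-direction feature pattern images, locates spatial sampling points, and extracts blurred directional features. After refined iterative training, the GMM model file is obtained. In the HMM part, the system uses stroke-segment features to obtain a feature sequence of stroke segmentation points; after refined iterative training, the HMM model file is obtained. The GMM and HMM model files are each packaged separately and then jointly packaged into a dictionary. In the first round of experiments the system's recognition rate reached 97%, and in the second round it reached 99%.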

14.
Computational Visual Media - This paper presents a vision-based system for recognizing when elderly adults fall. A fall is characterized by shape deformation and high motion. We represent shape...

15.
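徐鹏进, 郭莉, 刘书昌. 《计算机应用》 (Journal of Computer Applications), 2011, 31(Z2): 172-175
By combining the subharmonic-to-harmonic ratio (SHR) pitch extraction algorithm with an endpoint detection algorithm based on short-time energy and zero-crossing rate, a joint pitch and endpoint detection algorithm is implemented. The method uses short-time energy and zero-crossing rate as the basis for note segmentation and then uses pitch jumps to detect note changes. Simulation experiments show that, with the SHR pitch extraction algorithm, the joint detection improves note recognition accuracy and, particularly in noisy environments, outperforms traditional endpoint detection algorithms.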
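
The two frame-level quantities this abstract builds on can be computed as in the sketch below. It covers only short-time energy and zero-crossing rate with a simple threshold test; the SHR pitch extractor and the pitch-jump logic are not reproduced. The frame length, hop, thresholds, the rule that a frame is active when either quantity exceeds its threshold, and all function names are assumptions for illustration.

```python
def split_frames(signal, frame_len, hop):
    """Cut a sample sequence into fixed-length frames (trailing remainder dropped)."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]

def short_time_energy(frame):
    """Sum of squared samples in one frame."""
    return sum(s * s for s in frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def detect_active(signal, frame_len, hop, energy_thr, zcr_thr):
    """Mark a frame active when its energy or its zero-crossing rate is high."""
    return [short_time_energy(f) > energy_thr or zero_crossing_rate(f) > zcr_thr
            for f in split_frames(signal, frame_len, hop)]

# Toy signal: silence, a burst alternating between +1 and -1, silence.
signal = [0.0] * 8 + [1.0, -1.0] * 4 + [0.0] * 8
active = detect_active(signal, frame_len=4, hop=4, energy_thr=1.0, zcr_thr=0.5)
print(active)
```

The first and last active frames give candidate endpoints; a practical detector would usually add hysteresis or a minimum duration so that isolated noisy frames are not taken as note boundaries.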

16.
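Voice modeling methods are studied using speaker recognition principles and digital speech signal processing, and a GMM-based baseline system for voice recognition in the VDR environment is established. Starting from an analysis of the factors that affect the voice recognition rate, the shortcomings of traditional algorithms are pointed out and a speech endpoint detection algorithm based on approximate entropy is proposed. Theoretical analysis and experimental results show that the new algorithm effectively suppresses high-dynamic-range impulsive noise, resolves false detection of speech, and raises the recognition rate by 66% at a low signal-to-noise ratio of 0 dB.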
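
For readers unfamiliar with the statistic, the sketch below computes approximate entropy for a single frame of samples using the usual definition ApEn(m, r) = Φ(m) - Φ(m+1), with self-matches counted. How the authors turn this value into an endpoint decision is not described in the abstract, so only the statistic itself is shown. The defaults m = 2 and an absolute tolerance r = 0.2 are assumptions (r is often set relative to the signal's standard deviation), as is the function name.

```python
import math

def approximate_entropy(series, m=2, r=0.2):
    """Approximate entropy ApEn(m, r) of a sequence; lower means more regular."""
    n = len(series)
    if n < m + 2:
        raise ValueError("series is too short for the chosen m")

    def phi(dim):
        count = n - dim + 1
        templates = [series[i:i + dim] for i in range(count)]
        total = 0.0
        for a in templates:
            # Count templates within tolerance r (Chebyshev distance),
            # including the template itself, so the log is always defined.
            matches = sum(
                1 for b in templates
                if max(abs(x - y) for x, y in zip(a, b)) <= r
            )
            total += math.log(matches / count)
        return total / count

    return phi(m) - phi(m + 1)

constant = [1.0] * 20
periodic = [0.0, 1.0] * 10
print(approximate_entropy(constant))
print(approximate_entropy(periodic))
```

Perfectly regular inputs such as the two above give values at or near zero, while irregular inputs give larger values, so a frame-wise threshold on this statistic is one plausible basis for an endpoint detector.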

17.
Ma Xueqi, Tao Dapeng, Liu Weifeng. Multimedia Tools and Applications, 2019, 78(10): 13313-13329

The ever-growing popularity of mobile networks and electronics has prompted intensive research on the management of multimedia data (e.g. text, image, video, and audio). This has led to research on semi-supervised learning, which can incorporate a small number of labeled samples and a large number of unlabeled samples by exploiting the local structure of the data distribution. Manifold regularization and pairwise constraints are representative semi-supervised learning methods. In this paper, we introduce a novel local structure preserving approach that considers both manifold regularization and pairwise constraints. Specifically, we construct a new graph Laplacian that, unlike the traditional Laplacian, takes advantage of pairwise constraints. The proposed graph Laplacian can better preserve the local geometry of the data distribution and achieve effective recognition. On this basis, we build graph regularized classifiers, including support vector machines and kernel least squares as special cases, for action recognition. Experimental results on a multimodal human action database (CAS-YNU-MHAD) show that our proposed algorithms outperform the general algorithms.

18.
19.
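Scene recognition combining structure and texture features   Total citations: 1 (self-citations: 0; citations by others: 1)
Although scene recognition has made considerable progress in computer vision, it remains a highly challenging problem. Some earlier scene recognition methods require the training images to be semantically annotated by hand in advance, and most are based on the "bag of features" model, which requires clustering a large number of extracted features; this is costly in both computation and memory, and the choice of initial cluster centers and number of clusters strongly affects recognition performance. This paper therefore proposes an unsupervised scene recognition method that is not based on the "bag of features" model. Images at several resolutions are first built by subsampling, and structure and texture features are extracted at each resolution level. The structure features are represented by the gradient orientation histogram descriptor proposed in the paper, and the texture features by the responses of the image to a Gabor filter bank and the Schmid filter set. Structure and texture are treated as two mutually independent feature channels, which are finally combined and classified with an SVM to recognize scenes automatically. Tests on the 8-, 13-, and 15-class scene image datasets of Oliva, Li Fei-Fei, and Lazebnik et al. show that the gradient orientation histogram descriptor gives better scene recognition performance than the classic SIFT descriptor, and that the combined structure-and-texture method achieves good recognition results on all three widely used scene datasets.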

20.
An approach to the problem of inter-speaker variability in automatic speech recognition is described which exploits systematic vowel differences in a two-stage process of adaptation to individual speaker characteristics. In stage one, an accent identification procedure selects one of four gross regional English accents on the basis of vowel quality differences within four calibration sentences. In stage two, an adjustment procedure shifts the regional reference vowel space onto the speaker's vowel space as calculated from the accent identification data. Results for 58 speakers from the four regional accent areas are presented.
