Similar literature
 20 similar documents found
1.
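Xu Yunxi, Yu Yibiao. Journal of Computer Applications (《计算机应用》), 2008, 28(2): 339-341
Vector quantization (VQ) is one of the most widely used modeling methods in text-independent speaker recognition, and its main problem is codebook design. Speech feature parameters are high-dimensional data with complex sample distributions, so codebook design is difficult, and the traditional LBG algorithm can only obtain a locally optimal codebook. A new VQ codebook design method is proposed that integrates niche techniques and the K-means algorithm into the training process of an immune algorithm, forming a hybrid immune algorithm. It uses an improved mutation operator designed for clustering high-dimensional data, which reduces the blindness of random mutation and strengthens the population's global and local search ability, while vaccination improves the convergence speed. Speaker recognition experiments show that, compared with traditional LBG and with VQ codebook design based on a hybrid genetic algorithm, the method obtains better model parameters and further improves the system's recognition rate.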

2.
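When its initial codebook is chosen at random, the LBG algorithm is prone to empty cells, tends to get trapped in local minima, and needs many iterations. To address these defects, this paper draws on fuzzy clustering theory to introduce a cascaded fuzzy-clustering and LBG algorithm for VQ codebook design and training: a codebook is first trained with a fuzzy clustering algorithm, the resulting codebook is used as the initial codebook of the traditional LBG algorithm, and training then continues with traditional LBG. The principle and method of the combined fuzzy-clustering and LBG algorithm are discussed; the algorithm was used to train, respectively, speech linear...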

3.
A GPU implementation for LBG and SOM training   Total citations: 1 (self-citations: 1, citations by others: 0)
Vector quantization (VQ) is an effective technique applicable in a wide range of areas, such as image compression and pattern recognition. The most time-consuming procedure of VQ is codebook training, and two of the frequently used training algorithms are LBG and the self-organizing map (SOM). Nowadays, desktop computers are usually equipped with programmable graphics processing units (GPUs), whose parallel data-processing ability is ideal for accelerating codebook training. Although there are some GPU algorithms for LBG training, their implementations suffer from a large amount of data transfer between CPU and GPU and a large number of rendering passes within each training iteration. This paper presents a novel GPU-based implementation of LBG and SOM training. More specifically, we utilize the random write ability of the vertex shader to reduce the overheads mentioned above. Our experimental results show that our approach can run four times faster than the previous approach.

4.
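An improved LBG algorithm based on a variance-normalized distortion measure   Total citations: 3 (self-citations: 1, citations by others: 2)
Vector quantization (VQ) is widely used in speaker recognition systems. VQ codebooks are usually generated with the LBG algorithm, with the distortion measure being a Euclidean distance that weights all vector components equally. In speaker recognition systems, however, the components of the feature vectors are distributed differently, and the degree of this difference varies from speaker to speaker. Because parameters in dimensions with different distributions differ in their effectiveness for speaker recognition, the paper proposes a distortion measure that reflects this difference in effectiveness: the variance-normalized distortion measure. Based on this measure, and combined with a temporally correlated initial codebook design method and an effective technique for handling empty cells, the paper proposes an improved LBG algorithm, uses it to train improved VQ speaker models, and carries out speaker recognition experiments.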
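The contrast between an equal-weight Euclidean distortion and a variance-normalized one can be sketched as follows. The abstract does not give the formula, so the form used here (squared error per component divided by that component's variance over the training set) is an assumption, and the function names are hypothetical.

```python
def component_variances(vectors):
    """Per-dimension (population) variance of equal-length feature vectors."""
    n = len(vectors)
    dim = len(vectors[0])
    means = [sum(v[k] for v in vectors) / n for k in range(dim)]
    return [sum((v[k] - means[k]) ** 2 for v in vectors) / n for k in range(dim)]


def euclidean_sq(x, c):
    """Equal-weight squared Euclidean distortion."""
    return sum((xk - ck) ** 2 for xk, ck in zip(x, c))


def variance_normalized(x, c, variances, eps=1e-12):
    """Assumed form: each squared component error is divided by its variance."""
    return sum((xk - ck) ** 2 / (vk + eps) for xk, ck, vk in zip(x, c, variances))


# Toy data: the second component spreads 100 times wider than the first.
train = [[0.0, 0.0], [1.0, 100.0], [2.0, 200.0]]
var = component_variances(train)

# A one-unit error in the narrow component and a 100-unit error in the wide one
# differ by a factor of 10,000 under the Euclidean measure, but count equally
# once each component is scaled by its own variance.
narrow = variance_normalized([0.0, 0.0], [1.0, 0.0], var)
wide = variance_normalized([0.0, 0.0], [0.0, 100.0], var)
```

In a speaker-specific model the variances would presumably be estimated per speaker, since the abstract notes that the spread differs between speakers.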

5.
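Text-independent speaker recognition using a genetic algorithm   Total citations: 1 (self-citations: 0, citations by others: 1)
In speaker recognition systems based on vector quantization (VQ), codebook design with the K-means method easily falls into a local optimum, and the choice of initial codebook strongly affects the design of the best codebook. To solve this problem, a genetic algorithm (GA) is combined with VQ based on a non-parametric model, yielding a GA-K algorithm for VQ codebook design. The algorithm uses the global optimization ability of the GA to obtain an optimal VQ codebook, avoiding the LBG algorithm's strong tendency to converge to a local optimum; through the GA's own parameters, combined with the fast convergence of K-means, it searches for the globally optimal codebook in the training vector space. Experimental results show that the GA-K algorithm outperforms the LBG algorithm and balances convergence against recognition rate well.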

6.
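An efficient LBG algorithm based on simulated annealing   Total citations: 7 (self-citations: 0, citations by others: 7)
The traditional LBG algorithm for VQ codebook design is sensitive to the initial codebook and easily gets trapped in local minima during iteration. To address these defects, an improved LBG algorithm based on simulated annealing is proposed, with details given on the characterization of the perturbation factor, the choice of perturbation strategy, the stability criterion, and the cooling schedule used during annealing. Simulation results show that the improved algorithm effectively avoids sensitivity to the initial codebook, and improves both search performance and the quality of images reconstructed after compression.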

7.
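An initial codebook algorithm for vector quantization   Total citations: 2 (self-citations: 0, citations by others: 2)
Initial codebook design is important in vector quantization: it influences or determines the number of iterations of the subsequent codebook generation algorithm and the quality of the final codebook. Existing initial codebook algorithms behave highly randomly and match the source poorly. To address this, an initial codebook algorithm is proposed that sorts the training vectors by the sum of their components and then performs split averaging. The algorithm uses a characteristic quantity of the vectors and removes the dependence on image structure factors, so it produces robust initial codebooks. Experiments demonstrate the effectiveness of the method; combining it with the LBG algorithm further improves codebook quality.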
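One plausible reading of "sort by component sum, then split and average" is sketched below. The abstract does not specify how the sorted list is divided, so the equal-sized contiguous groups and the function name `sum_sorted_initial_codebook` are assumptions made for illustration.

```python
def sum_sorted_initial_codebook(train, codebook_size):
    """Sort vectors by component sum, cut into contiguous groups, average each."""
    if not 0 < codebook_size <= len(train):
        raise ValueError("codebook_size must be between 1 and len(train)")
    ordered = sorted(train, key=sum)  # stable sort on the sum of components
    n, dim = len(ordered), len(ordered[0])
    codebook = []
    for i in range(codebook_size):
        # Contiguous slice boundaries that spread any remainder evenly.
        lo = i * n // codebook_size
        hi = (i + 1) * n // codebook_size
        group = ordered[lo:hi]
        codebook.append([sum(v[k] for v in group) / len(group) for k in range(dim)])
    return codebook


train = [[9, 9], [1, 1], [8, 10], [0, 2]]
initial = sum_sorted_initial_codebook(train, 2)
```

Because the procedure is deterministic, repeated runs on the same training set give the same starting codebook, which is the property the abstract contrasts with random initialization.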

8.
This paper evaluates the impact of three special forms of the Minkowski metric (Euclidean, city block, and Chebyshev distances) on the performance of conventional vector quantization (VQ) and Gaussian mixture model (GMM) based closed-set text-independent speaker recognition systems, in terms of recognition rate and confidence in decisions. For the VQ-based system, evaluations are carried out using the two most common clustering algorithms, LBG and K-means, and it is shown which pairing of clustering algorithm and distance should be used to exploit the best attributes of both and achieve the best recognition rate for a given codebook size. For the GMM-based system, we introduce the metrics into the GMM by using a concatenation of the LBG and K-means algorithms to estimate the initial mean vectors, to which the system performance is sensitive, and explore their impact on system performance. We also compare results obtained on clean speech (TIMIT) and telephone speech databases (NTIMIT and NIST2001) with the modern classifiers VQ-UBM and GMM-UBM. It is found that there are cases where the conventional VQ-based system outperforms the modern systems. Moreover, the impact of distance metrics on the performance of the conventional and modern systems depends on the recognition task imposed (verification or identification).
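The three Minkowski special cases named in the abstract can be written out directly. The sketch below also shows that the choice of distance can change which codeword a vector is assigned to; the helper `nearest_codeword` and the toy values are illustrative and not taken from the paper.

```python
import math


def city_block(x, y):
    """Minkowski distance of order 1: sum of absolute differences."""
    return sum(abs(a - b) for a, b in zip(x, y))


def euclidean(x, y):
    """Minkowski distance of order 2."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def chebyshev(x, y):
    """Limit of the Minkowski distance as the order goes to infinity."""
    return max(abs(a - b) for a, b in zip(x, y))


def nearest_codeword(x, codebook, distance):
    """Index of the codeword closest to x under the chosen distance."""
    return min(range(len(codebook)), key=lambda i: distance(x, codebook[i]))


x = [0.0, 0.0]
codebook = [[3.0, 3.0], [0.0, 4.0]]
picks = {d.__name__: nearest_codeword(x, codebook, d)
         for d in (city_block, euclidean, chebyshev)}
```

Here the city block and Euclidean distances both select the second codeword, while the Chebyshev distance selects the first, so the partition of the feature space depends on the metric.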

9.
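The LBG algorithm is improved by using fuzzy C-means clustering to determine the centroids, and is applied to vector quantization of codebooks for MFCC speech parameters. Experimental results show that the algorithm has a quantization error close to that of LBG alone, determines the codebook size adaptively, and significantly reduces the codebook size, lowering codebook storage requirements.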

10.
This article develops an evolutionary fuzzy particle swarm optimization (FPSO) learning algorithm that automatically extracts a near-optimal vector quantization (VQ) codebook for image compression. The fuzzy particle swarm optimization vector quantization (FPSOVQ) learning scheme combines the advantages of the adaptive fuzzy inference method (FIM), the simple VQ concept, and efficient particle swarm optimization (PSO) to create a near-optimal codebook automatically. The FIM is known as a soft-decision method for measuring the relational grade of a given sequence; here it is applied to determine the degree of similarity between the codebook and the original image patterns. Despite the popularity of the Linde–Buzo–Gray (LBG) algorithm, the evolutionary PSO learning algorithm is used to optimize the fuzzy inference system, which extracts appropriate codebooks for compressing several grey-level test images. The proposed FPSOVQ learning scheme is compared with the LBG-based VQ learning method on several real image compression examples to demonstrate its results.

11.
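Wang Rangding, Du Chengtou. Computer Engineering (《计算机工程》), 2004, 30(17): 146-148
A VQ-based method is studied for recognizing Chinese speech commands from a limited set of speakers in a speaker-independent setting; the targets are a limited, specific group of people (for example, 5 to 6) and a limited set of Chinese phrases. The paper uses MFCC as the recognition feature and trains the VQ codebook with an improved LBG algorithm. To improve the recognition and rejection rates, it proposes an effective speech endpoint detection method based on cepstral distance and a practical rejection method. Experimental results show that, after training by a limited number of people in an ordinary office environment with background noise, the system achieves a test recognition rate above 99% when trained speakers are within 0.5 m of the recognition system, and a rejection rate of 82% for untrained speakers.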

12.
The traditional LBG algorithm is a purely iterative optimization procedure for obtaining a vector quantization (VQ) codebook: an initial codebook is refined at every iteration to reduce the distortion between the code vectors and a given training data set. However, such iterative learning algorithms easily converge to a local optimum when a high-quality initial codebook is not available. In this article, an efficient heuristic-based learning method, called novel particle swarm optimization (NPSO), is proposed to design a suitable VQ codebook for an image compression system. To improve on the performance of basic PSO, the centroid updating mechanism applies a single-step gradient descent update within the heuristic learning procedure, allowing NPSO to reach a near-optimal reconstructed image quickly. To demonstrate the proposed NPSO learning scheme, an image with several horizontal grey bars is first used to show the efficiency of the NPSO learning mechanism. The LBG and NPSO learning methods are then compared on reconstruction performance for several images: “Lena,” “Airplane,” “Cameraman,” and “Peppers.” In the experiments, the NPSO learning algorithm outperforms conventional LBG methods for building an image compression system.
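The iterative refinement described in the first sentence can be sketched as the two alternating steps at the core of LBG: assign each training vector to its nearest code vector, then move each code vector to the centroid of its cell. This is a minimal sketch under stated assumptions: it starts from a caller-supplied codebook rather than LBG's splitting initialization, leaves empty cells unchanged, and the names `lbg_refine`, `tol`, and `max_iter` are hypothetical.

```python
def sq_dist(x, c):
    """Squared Euclidean distortion between a vector and a code vector."""
    return sum((a - b) ** 2 for a, b in zip(x, c))


def nearest(x, codebook):
    """Index of the code vector with the smallest distortion to x."""
    return min(range(len(codebook)), key=lambda i: sq_dist(x, codebook[i]))


def mean_distortion(train, codebook):
    """Average distortion of the training set under the given codebook."""
    return sum(sq_dist(x, codebook[nearest(x, codebook)]) for x in train) / len(train)


def lbg_refine(train, codebook, tol=1e-6, max_iter=100):
    """Refine an initial codebook; returns (codebook, mean distortion)."""
    codebook = [list(c) for c in codebook]
    prev = mean_distortion(train, codebook)
    for _ in range(max_iter):
        # Nearest-neighbour step: partition the training set into cells.
        cells = [[] for _ in codebook]
        for x in train:
            cells[nearest(x, codebook)].append(x)
        # Centroid step: each code vector moves to the mean of its cell.
        for j, cell in enumerate(cells):
            if cell:  # an empty cell keeps its old code vector in this sketch
                codebook[j] = [sum(col) / len(cell) for col in zip(*cell)]
        cur = mean_distortion(train, codebook)
        # Stop once the relative improvement falls below the tolerance.
        converged = prev - cur <= tol * max(cur, 1e-12)
        prev = cur
        if converged:
            break
    return codebook, prev


train = [[0.0], [1.0], [10.0], [11.0]]
codebook, distortion = lbg_refine(train, [[0.0], [1.0]])
```

Each step can only lower or preserve the mean distortion, which is why the loop converges but can stall at a local optimum that depends on the starting codebook.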

13.
Medical image compression has recently become essential for handling large amounts of medical data effectively for storage and communication. Vector quantization (VQ) is a popular image compression technique, and the commonly used VQ model is Linde–Buzo–Gray (LBG), which constructs a locally optimal codebook to compress images. Codebook construction is treated as an optimization problem, and a bioinspired algorithm is employed to solve it. This article proposes a VQ codebook construction approach called the L2‐LBG method, which uses the Lion optimization algorithm (LOA) and the Lempel Ziv Markov chain Algorithm (LZMA). Once LOA has constructed the codebook, LZMA is applied to compress the index table and further increase the compression performance of the LOA. A set of experiments was carried out on benchmark medical images, and a comparative analysis was conducted with Cuckoo Search‐based LBG (CS‐LBG), Firefly‐based LBG (FF‐LBG), and JPEG2000. The compression efficiency of the presented model was validated in terms of compression ratio (CR), compression factor (CF), bit rate, and peak signal to noise ratio (PSNR). The proposed L2‐LBG method obtained a higher CR of 0.3425375 and PSNR value of 52.62459 compared to the CS‐LBG, FF‐LBG, and JPEG2000 methods. The experimental values show that the L2‐LBG process yields effective compression performance with a better‐quality reconstructed image.
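Two of the quality measures listed above have standard textbook definitions, sketched here for 8-bit images stored as flat lists of pixel values. Whether the paper computes them in exactly this way is an assumption, and the bit-rate helper covers only the fixed-length VQ index before any LZMA coding of the index table.

```python
import math


def mse(original, reconstructed):
    """Mean squared error between two equally sized pixel sequences."""
    return sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)


def psnr(original, reconstructed, peak=255):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    err = mse(original, reconstructed)
    if err == 0:
        return float("inf")
    return 10 * math.log10(peak ** 2 / err)


def vq_bits_per_pixel(codebook_size, block_pixels):
    """Bits per pixel when each block is sent as one fixed-length index."""
    return math.log2(codebook_size) / block_pixels


original = [10, 20, 30, 40]
reconstructed = [11, 19, 31, 39]  # every pixel is off by exactly one level
quality_db = psnr(original, reconstructed)
rate = vq_bits_per_pixel(256, 16)  # 256 code vectors, 4x4 pixel blocks
```

With a unit error on every pixel the MSE is 1, so the PSNR reduces to 20·log10(255), roughly 48.13 dB; a 256-entry codebook on 4x4 blocks costs 8 bits per 16 pixels, or 0.5 bits per pixel.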

14.
A new vector quantization method (LBG-U) closely related to a particular class of neural network models (growing self-organizing networks) is presented. LBG-U consists mainly of repeated runs of the well-known LBG algorithm. Each time LBG converges, however, a novel measure of utility is assigned to each codebook vector. Thereafter, the vector with minimum utility is moved to a new location, LBG is run on the resulting modified codebook until convergence, another vector is moved, and so on. Since a strictly monotonic improvement of the LBG-generated codebooks is enforced, it can be proved that LBG-U terminates in a finite number of steps. Experiments with artificial data demonstrate significant improvements in RMSE over LBG, combined with only modestly higher computational costs.

15.
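A hybrid phoneme recognition method combining an improved CP network with HMM   Total citations: 2 (self-citations: 0, citations by others: 2)
A hybrid phoneme recognition method is proposed that combines an improved counter-propagation (CP) neural network with a hidden Markov model (HMM). Its distinguishing feature is that an improved CP network, with properties such as supervised learning vector quantization (LVQ) and dynamic node allocation, generates the codebook of a discrete-HMM phoneme recognition system. The codebook of a hybrid phoneme recognition system built this way is therefore in effect a high-performance classifier with strong discriminative ability, trained by the supervised LVQ algorithm. This means that, before the speech signal is modeled with the HMM, the observation sequence produced by the codebook...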

16.
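The classic clustering method for VQ codebook training is the LBG algorithm, but its main defect is a strong dependence on the initial codebook, which makes it prone to premature trapping in local minima. This paper studies a randomized local search clustering algorithm for VQ-based speaker recognition. The algorithm does not depend on initial conditions, has a regular structure, is easy to implement, performs well, and has excellent global optimization search ability. It performed well in speech parameter clustering experiments, and the resulting codebooks are of higher quality than those of the classic LBG algorithm, offering a new approach to designing quasi-globally-optimal codebooks in VQ-based speaker recognition.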

17.
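This paper presents a speaker recognition algorithm based on vector quantization (VQ). Because its computation is simple, VQ-based speaker recognition is widely used in the speaker recognition field. Experiments with different speech parameters show that vector quantization is an effective method for speaker recognition.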

18.
Vector quantization (VQ) is a powerful technique for digital image compression. The traditional, widely used Linde–Buzo–Gray (LBG) algorithm always generates a locally optimal codebook. Recently, particle swarm optimization (PSO) has been adapted to obtain a near-globally-optimal VQ codebook, and an alternative method, quantum particle swarm optimization (QPSO), has been developed to improve on the results of the original PSO algorithm. In this paper, we apply a new swarm algorithm, honey bee mating optimization (HBMO), to construct the VQ codebook. The results are compared with three other methods: the LBG, PSO–LBG, and QPSO–LBG algorithms. Experimental results show that the proposed HBMO–LBG algorithm is more reliable and that its reconstructed images are of higher quality than those generated by the other three methods.

19.
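Speaker recognition based on distribution feature statistics   Total citations: 2 (self-citations: 2, citations by others: 0)
A definition of speaker distribution features based on a common codebook is given, and a speaker recognition algorithm based on distribution feature statistics is proposed. A common codebook is built from the training speech of all reference speakers to partition the speech feature space, and each reference speaker is modeled by the distribution of his or her training speech over the common codewords. During recognition, a pairwise sequence alignment method is introduced to match the distribution feature statistics of the test speech against the reference speaker models by similarity, thereby identifying the speaker. Experiments show that the method further speeds up VQ-based speaker recognition while maintaining the recognition rate.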
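The modeling step, counting how a speaker's frames fall across the codewords of a shared codebook, can be sketched as a normalized histogram. The abstract does not define the distribution feature precisely, so this relative-frequency form and the name `codeword_histogram` are assumptions; the pairwise sequence alignment used for matching is not reproduced here.

```python
def sq_dist(x, c):
    """Squared Euclidean distance between a frame and a codeword."""
    return sum((a - b) ** 2 for a, b in zip(x, c))


def codeword_histogram(frames, codebook):
    """Fraction of frames whose nearest codeword is each codebook entry."""
    counts = [0] * len(codebook)
    for x in frames:
        j = min(range(len(codebook)), key=lambda i: sq_dist(x, codebook[i]))
        counts[j] += 1
    return [c / len(frames) for c in counts]


common_codebook = [[0.0], [10.0]]            # shared by all reference speakers
speaker_a = [[0.0], [1.0], [10.0], [9.0]]    # frames split evenly
speaker_b = [[0.5], [1.5], [2.0], [11.0]]    # frames mostly near codeword 0
model_a = codeword_histogram(speaker_a, common_codebook)
model_b = codeword_histogram(speaker_b, common_codebook)
```

Because every speaker is described against the same codebook, a test utterance needs only one pass of nearest-codeword search regardless of the number of reference speakers, which is consistent with the speed gain the abstract reports.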

20.
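To address the distortion that arises when vector quantization is used for speaker recognition, a method is proposed, based on the pronunciation characteristics of Chinese speech, that combines vector quantization with clustering of speech features: before the VQ codebook is trained, the feature vectors are first screened by clustering. Experimental results show that, for test speech segments 4 s long and a recognition rate held at about 95%, ordinary vector quantization requires a codebook size of 64, whereas the proposed method requires only 8, a reduction by a factor of 8. The results indicate that the method not only alleviates, to some extent, the distortion caused by insufficient training samples, but also achieves good recognition results with fewer codewords, thereby improving recognition efficiency.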
