Similar literature
1.
Vector quantization in speech coding   Cited by: 8 (self-citations: 0, by others: 8)
Quantization, the process of approximating continuous-amplitude signals by digital (discrete-amplitude) signals, is an important aspect of data compression or coding, the field concerned with the reduction of the number of bits necessary to transmit or store analog data, subject to a distortion or fidelity criterion. The independent quantization of each signal value or parameter is termed scalar quantization, while the joint quantization of a block of parameters is termed block or vector quantization. This tutorial review presents the basic concepts employed in vector quantization and gives a realistic assessment of its benefits and costs when compared to scalar quantization. Vector quantization is presented as a process of redundancy removal that makes effective use of four interrelated properties of vector parameters: linear dependency (correlation), nonlinear dependency, shape of the probability density function (pdf), and vector dimensionality itself. In contrast, scalar quantization can utilize effectively only linear dependency and pdf shape. The basic concepts are illustrated by means of simple examples and the theoretical limits of vector quantizer performance are reviewed, based on results from rate-distortion theory. Practical issues relating to quantizer design, implementation, and performance in actual applications are explored. While many of the methods presented are quite general and can be used for the coding of arbitrary signals, this paper focuses primarily on the coding of speech signals and parameters.
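As an illustrative sketch (not taken from the paper), the joint quantization of a block of parameters can be pictured as a nearest-neighbor search over a codebook. The function names, the squared-error distortion, and the toy 2-D codebook below are assumptions chosen for illustration only.

```python
import math

def vq_encode(vector, codebook):
    """Return the index of the codevector closest to `vector`
    under squared Euclidean distance (an assumed distortion measure)."""
    best_index, best_dist = 0, math.inf
    for i, code in enumerate(codebook):
        dist = sum((v - c) ** 2 for v, c in zip(vector, code))
        if dist < best_dist:
            best_index, best_dist = i, dist
    return best_index

def vq_decode(index, codebook):
    """Reconstruct the block of parameters from its codebook index."""
    return codebook[index]

# Hypothetical codebook: 4 codevectors for 2-D blocks, i.e. 2 bits per
# block (1 bit per sample). The entries lie on the diagonal, which suits
# pairs of strongly correlated samples.
CODEBOOK = [(-1.0, -1.0), (-0.3, -0.3), (0.3, 0.3), (1.0, 1.0)]

index = vq_encode((0.9, 1.1), CODEBOOK)
reconstruction = vq_decode(index, CODEBOOK)
```

A scalar quantizer with the same bit budget would have only two levels per sample; placing the four codevectors along the diagonal is one simple way a vector quantizer can exploit linear dependency between the samples.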

2.
Vector quantization: A pattern-matching technique for speech coding   Cited by: 1 (self-citations: 0, by others: 1)

3.
A method for vector quantisation of pitch predictor coefficients according to a minimum residual energy criterion is proposed and compared to vector quantisation using the traditional minimum squared error between coefficients. Squared error quantisation is found to be adequate for 1-tap prediction, but for 3-tap prediction the residual energy method performs consistently better. The predictor code-books are also found to give robust performance outside the training sequence.

4.
Vector quantization by deterministic annealing   Cited by: 7 (self-citations: 0, by others: 7)
A deterministic annealing approach is suggested to search for the optimal vector quantizer given a set of training data. The problem is reformulated within a probabilistic framework. No prior knowledge is assumed about the source density, and the principle of maximum entropy is used to obtain the association probabilities at a given average distortion. The corresponding Lagrange multiplier is inversely related to the 'temperature' and is used to control the annealing process. In this process, as the temperature is lowered, the system undergoes a sequence of phase transitions when existing clusters split naturally, without use of heuristics. The resulting codebook is independent of the codebook used to initialize the iterations.
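A minimal sketch of one annealing step, assuming squared-error distortion and the Gibbs form of the association probabilities that the maximum-entropy principle gives at a fixed average distortion. The function names and the data layout (tuples of floats) are hypothetical and are not taken from the paper.

```python
import math

def association_probabilities(x, codebook, temperature):
    """Soft assignment of data point x to each codevector:
    p(j | x) is proportional to exp(-d(x, y_j) / T)."""
    dists = [sum((a - b) ** 2 for a, b in zip(x, y)) for y in codebook]
    d_min = min(dists)  # shift the exponents for numerical stability
    weights = [math.exp(-(d - d_min) / temperature) for d in dists]
    total = sum(weights)
    return [w / total for w in weights]

def update_codebook(data, codebook, temperature):
    """Move each codevector to the probability-weighted mean of the data."""
    dim = len(codebook[0])
    sums = [[0.0] * dim for _ in codebook]
    mass = [0.0] * len(codebook)
    for x in data:
        probs = association_probabilities(x, codebook, temperature)
        for j, p in enumerate(probs):
            mass[j] += p
            for d in range(dim):
                sums[j][d] += p * x[d]
    new_codebook = []
    for j, old in enumerate(codebook):
        if mass[j] > 0.0:
            new_codebook.append(tuple(s / mass[j] for s in sums[j]))
        else:
            new_codebook.append(old)  # keep a codevector that attracts no mass
    return new_codebook
```

At a high temperature every point is shared almost equally, so all codevectors are pulled toward the global mean of the data; as the temperature is lowered the assignments harden and the codevectors separate, which is the qualitative behavior the abstract describes.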

5.
Vector quantization with complexity costs   Cited by: 2 (self-citations: 0, by others: 2)
Vector quantization is a data compression method by which a set of data points is encoded by a reduced set of reference vectors: the codebook. A vector quantization strategy is discussed that jointly optimizes distortion errors and the codebook complexity, thereby determining the size of the codebook. A maximum entropy estimation of the cost function yields an optimal number of reference vectors, their positions, and their assignment probabilities. The dependence of the codebook density on the data density for different complexity functions is investigated in the limit of asymptotic quantization levels. How different complexity measures influence the efficiency of vector quantizers is studied for the task of image compression. The wavelet coefficients of gray-level images are quantized, and the reconstruction error is measured. The approach establishes a unifying framework for different quantization methods like K-means clustering and its fuzzy version, entropy constrained vector quantization or topological feature maps, and competitive neural networks.

6.
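An image coding algorithm that combines the multiwavelet transform with vector quantization (MDWT VQ) is proposed. The image is first decomposed by a multiwavelet transform, and the high-frequency coefficients are then VQ-encoded using a codebook generated by an improved LBG algorithm. The algorithm fully exploits the similarity among the directional subimages across resolution levels in the multiwavelet domain: codebook address indexing is performed only for the highest-resolution level, and the coefficients of the lower-resolution levels, arranged in a specific organization, directly reuse the address index information of the highest-resolution level. Comparative experiments confirm that the algorithm improves on conventional single-wavelet image coding algorithms in both reconstructed image quality and bit rate reduction.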

7.
Super resolution pitch determination of speech signals   Cited by: 7 (self-citations: 0, by others: 7)
Based on a new similarity model for the voice excitation process, a novel pitch determination procedure is derived. The unique features of the proposed algorithm are infinite (super) resolution, better accuracy than the difference limen for F0, robustness to noise, reliability, and modest computational complexity. The algorithm is instrumental to speech processing applications which require pitch synchronous spectral analysis. The computational complexity of the proposed algorithm is well within the capacity of modern digital signal processing (DSP) technology and therefore can be implemented in real time.

8.
In this paper, we propose a binary-tree structure neural network model suitable for structured clustering. During and after training, the centroids of the clusters in this model always form a binary tree in the input pattern space. This model is used to design tree search vector quantization codebooks for image coding. Simulation results show that the acquired codebook not only produces better-quality images but also achieves a higher compression ratio than conventional tree search vector quantization. When source coding is applied after VQ, the new model performs better than the generalized Lloyd algorithm in terms of distortion, bits per pixel, and encoding complexity for low-detail and medium-detail images.

9.
In speech processing, estimation of the speech pitch period is important. Real-time pitch detection is only possible by the selection of an efficient algorithm suitable for implementation on a programmable processor or in special-purpose hardware. The use of the periodogram algorithm (p.a.) is proposed to detect the pitch period of voiced speech. This algorithm is attractive for the following reasons: (a) it has no multiply operation; (b) when implemented on a 16-bit computer (e.g. microprocessor) the computation can be done in integer arithmetic without exceeding the microprocessor's dynamic range; (c) it is a simple technique for estimating the pitch period with reasonable accuracy. Results of the analysis of speech signals and sinusoids using the periodogram algorithm are presented and comparisons are made with the average magnitude difference function (a.m.d.f.), which is an alternative method of estimating the pitch period of voiced speech.
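The abstract does not spell out the periodogram algorithm, so it is not reproduced here. As a hedged sketch of the comparison method only, the average magnitude difference function can be written as the mean of |x[n] - x[n + lag]| over the overlapping samples, with the pitch period estimated as the lag that minimizes it. The function names and the lag search range are assumptions for illustration.

```python
import math

def amdf(signal, lag):
    """Average magnitude difference between the signal and a copy of
    itself shifted by `lag` samples (averaged over the overlap)."""
    n = len(signal) - lag
    return sum(abs(signal[i] - signal[i + lag]) for i in range(n)) / n

def estimate_pitch_period(signal, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the smallest AMDF value.
    The search range is a caller-supplied assumption about plausible pitch."""
    return min(range(min_lag, max_lag + 1), key=lambda lag: amdf(signal, lag))

# Synthetic test input: a sinusoid with a period of 50 samples.
sinusoid = [math.sin(2 * math.pi * n / 50) for n in range(400)]
period = estimate_pitch_period(sinusoid, 20, 80)
```

The inner sum uses only subtractions and absolute values, so with integer samples it can be accumulated in integer arithmetic; the final division is a normalization that compensates for the shrinking overlap at larger lags.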

10.
We investigate variable-precision classification (VPC) for speeding up vector quantization (VQ). VPC evaluates bit-serially, from the most significant bit. When the magnitude of the error due to the unevaluated bits is less than the absolute magnitude of the discriminant, we can classify without processing the remaining bits. A proof shows that as the operand precision increases, the average necessary precision becomes asymptotically independent of the operand precision. VPC makes the complexity of the L2 norm equivalent to that of the L1 norm. In VQ of real images, on average, the codevector element precision necessary for classification was under four bits. We implemented binary classification circuitry using VPC and conventional approaches. The key modules were designed and their performance estimated assuming 1.0-μm gate array technology. The implementations could search binary pruned trees at the television-quality video rate. When the overall execution time is important, VPC more than halves the computational complexity.

11.
A hierarchical classified vector quantization (HCVQ) method is described. In this method, the image is coded in several steps, starting with a relatively large block size, and successively dividing the block into smaller sub-blocks in a quad-tree fashion. The initial block is first vector quantized in the normal way. Classified vector quantization is then performed for its sub-blocks using the vector index of the initial block, i.e. rough information about the image, and the location of the sub-block within the initial block as classifiers. The coding proceeds in a similar way, adding more information about the fine details at each level. The method is found to be effective and to give a good subjective quality. It is also simple to implement, leading to coding speeds typical of a tree search VQ.

12.
Vector quantization for compression of multichannel ECG   Cited by: 2 (self-citations: 0, by others: 2)
We propose a scheme based on vector quantization (VQ) for the data-compression of multichannel ECG waveforms. N-channel ECG is first coded using m-AZTEC, a new, multichannel extension of the AZTEC algorithm. As in AZTEC, the waveform is approximated using only lines and slopes; however, in m-AZTEC, the N-channels are coded simultaneously into a sequence of N + 1 dimensional vectors, thus exploiting the correlation that exists across channels in the AZTEC duration-parameter. Classified vector quantization (CVQ) of the m-AZTEC output is next performed to exploit the correlation in the other AZTEC parameter, namely, the value-parameter. CVQ preserves the waveform morphology by treating the lines and slopes as two perceptually-distinct classes. Both m-AZTEC and CVQ provide data-compression and their performance improves as the number of channels increases. Moreover, the final output differs little from the AZTEC output and hence ought to enjoy the same acceptability.

13.
This paper studies two kinds of pitch predictor structures in speech compression coding, namely open-loop and closed-loop. Some simplified approaches for solving the pitch predictor equation are suggested, and their performance is compared under several conditions. Computer simulation results are shown.

14.
An algorithm for estimating wideband pitch trajectories of acoustic signals is described. Its software implementation is fast, and real-time operation can be achieved by using inexpensive hardware. Results are presented for representative speech and music signals.

15.
16.
Shape invariant time-scale and pitch modification of speech   Cited by: 7 (self-citations: 0, by others: 7)
The simplified linear model of speech production predicts that when the rate of articulation is changed, the resulting waveform takes on the appearance of the original, except for a change in the time scale. A time-scale modification system that preserves this shape-invariance property during voicing is developed. This is done using a version of the sinusoidal analysis-synthesis system that models and independently modifies the phase contributions of the vocal tract and vocal cord excitation. An important property of the system is its ability to perform time-varying rates of change. Extensions of the method are applied to fixed and time-varying pitch modification of speech. The sine-wave analysis-synthesis system also allows for shape-invariant joint time-scale and pitch modification, and allows for the adjustment of the time scale and pitch according to speech characteristics such as the degree of voicing.

17.
Adaptive vector transform quantization (AVTQ) as a coding system is discussed. The optimal bit assignment is derived based on vector quantization asymptotic theory for different PDFs (probability density functions) of the transform coefficients. Strategies for shaping the quantization noise spectrum and for adapting the bit assignment to the changes in the speech statistics are discussed. A good estimate of the efficiency of any coding system is given by the system coding gain over scalar PCM (pulse code modulation). Based on the optimal bit allocation, the coding gain of the vector transform quantization (VTQ) system operating on a stationary input signal is derived. The VTQ coding gain demonstrates a significant advantage of vector quantization over scalar quantization within the framework of transform coding. System simulation results are presented for a first-order Gauss-Markov process and for typical speech waveforms. The results of fixed and adaptive systems are compared for speech input. Also, the AVTQ results are compared to known scalar speech coding systems.

18.
An RNN-based robust signal bias removal (RRSBR) method is proposed for improving both the recognition performance and the computational efficiency of the SBR method for adverse Mandarin speech recognition. It differs from the SBR method in using three broad-class sub-codebooks to encode the feature vector of each frame and combining the three encoding residuals to form the frame-level signal bias estimate. A novel approach that softly combines the broad-class encoding residuals using dynamic weighting functions generated by an RNN is applied. Experimental results show that the RRSBR method significantly outperforms the SBR method.

19.
A Chinese continuous speech recognition system based on the classical hidden Markov model   Cited by: 1 (self-citations: 0, by others: 1)
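This paper constructs a Chinese continuous speech recognition system based on the classical hidden Markov model (HMM) and quantitatively analyzes and evaluates the performance of the classical HMM.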

20.
Vector quantization of image subbands: a survey   Cited by: 13 (self-citations: 0, by others: 13)
Subband and wavelet decompositions are powerful tools in image coding because of their decorrelating effects on image pixels, the concentration of energy in a few coefficients, their multirate/multiresolution framework, and their frequency splitting, which allows for efficient coding matched to the statistics of each frequency band and to the characteristics of the human visual system. Vector quantization (VQ) provides a means of converting the decomposed signal into bits in a manner that takes advantage of remaining inter- and intraband correlation as well as of the more flexible partitions of higher dimensional vector spaces. Since 1988, a growing body of research has examined the use of VQ for subband/wavelet transform coefficients. We present a survey of these methods.

