Similar articles
 20 similar articles found (search time: 31 ms)
1.
Mobile communication over 3G networks has grown rapidly in recent years, and it may be of interest to transmit secret messages over 3G voice channels. In this paper, we introduce a new covert communication scheme based on Adaptive Multi-Rate Wideband (AMR-WB) encoded speech. An adaptive suboptimal pulse combination constrained (ASOPCC) method is presented to embed data in the compressed speech signal of the AMR-WB codec. The method exploits the “redundancy” created by the non-exhaustive search of the algebraic codebook to encode secret information. An embedding factor η controls the number of embedded bits. By setting η properly, ASOPCC can offer a better trade-off between speech quality and embedding capacity during coding mode switching. Experimental results show that the proposed method is promising in terms of both high capacity and good imperceptibility. Although ASOPCC is applied only to the AMR-WB codec in this article, it can be extended to any other speech codec based on Algebraic Code-Excited Linear Prediction (ACELP).

2.
Recent advances in speech coding have made wideband coding feasible at bit rates suitable for mobile communication. Here we propose a novel hybrid harmonic Code Excited Linear Prediction (CELP) scheme for high-band coding in a band-split scalable wideband codec, where the low band (0–4 kHz) is critically subsampled and coded selectively using existing narrowband codecs such as 5.3 kbps and 6.3 kbps G.723.1, 8 kbps G.729, and 11.8 kbps G.729E. The high-band signal is divided into stationary mode (SM) and non-stationary mode (NSM) components based on its characteristics. In the SM portion, the high-band signal is compressed using a multi-stage coder that combines the sinusoidal model and CELP. The first stage applies the damping factor matching pursuit (MP) algorithm without either Over-Lap-Add (OLA) or smoothly interpolative synthesis, and the second stage uses CELP with a circular codebook. In the NSM portion, the high-band signal is coded by CELP with both pulse and circular codebooks using a complexity-reduced algorithm. To ensure scalability in high-band coding, two enhancement layers increase the number of pulses and control the number of quantized sinusoidal parameters. For efficient bit allocation and enhanced performance, the pitch of the high-band codec is estimated from the quantized pitch parameter of the low-band codec. This paper describes the new algorithm and discusses novel techniques for efficient bandwidth wideband speech coding and subjective quality performance. An informal listening test rated the subjective speech quality as comparable to that obtainable with G.722.2 as the fullband wideband codec and G.722.2 as the highband codec, the recently standardized band-split wideband codec.

3.
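Because the ITU-T G.723.1 speech coding algorithm has high computational complexity, its application and implementation are subject to many restrictions. This paper proposes a low-complexity closed-loop pitch search algorithm. The algorithm is still based on a 5th-order pitch predictor, but it does not obtain the five pitch prediction gains by searching a 20-dimensional vector codebook as the original algorithm does. Instead, it uses the 20-dimensional vector to form a Wiener-Hopf equation and exploits the short-time stationarity of speech to simplify that equation into a Toeplitz system of linear equations, whose solution gives the required pitch prediction gains. The gains are then vector-quantized with a 5-dimensional codebook, so that a 5-dimensional vector codebook search replaces the original 20-dimensional one. This halves the computation of the closed-loop pitch search with only a slight drop in speech quality, while remaining bitstream-compatible with G.723.1.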
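The Toeplitz step described in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the index convention (tap i corresponds to a delay of `lag + i - 2`), the restriction to lags no shorter than the frame, and the use of plain Gaussian elimination rather than a structure-exploiting Levinson-type solver are all assumptions made for illustration.

```python
def solve_linear(a, b):
    """Solve a @ g = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    g = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * g[j] for j in range(i + 1, n))
        g[i] = (m[i][n] - s) / m[i][i]
    return g


def pitch_gains(past, target, lag, taps=5):
    """Estimate the gains of a `taps`-tap pitch predictor (taps assumed odd).

    Hypothetical sketch: the exact normal-equation matrix is replaced by a
    Toeplitz matrix of autocorrelations (short-time stationarity assumption).
    """
    half = taps // 2
    n = len(target)
    start = len(past) - lag  # sample of `past` one lag before target[0]
    if lag < n + half or start - half < 0:
        raise ValueError("sketch assumes n + half <= lag <= len(past) - half")
    # Stretch of past excitation covered by all tap positions.
    seg = past[start - half:start + half + n]
    # Autocorrelation at lags 0..taps-1 defines the Toeplitz matrix.
    rho = [sum(seg[k] * seg[k + d] for k in range(n)) for d in range(taps)]
    big_r = [[rho[abs(i - j)] for j in range(taps)] for i in range(taps)]
    # Cross-correlation of the target with each shifted past segment.
    cross = [sum(target[k] * seg[taps - 1 - i + k] for k in range(n))
             for i in range(taps)]
    return solve_linear(big_r, cross)
```

The sketch solves one 5-by-5 system per candidate lag; a real coder would follow it with the 5-dimensional gain quantization mentioned in the abstract, which is not shown here.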

4.
Low-delay code-excited linear prediction (LD-CELP) is an attractive algorithm for implementing vocoders in voice over Internet protocol networks. It was proposed for coding speech at 16 kbps with toll quality. However, operation at rates below 16 kbps is desirable so that traffic can be accommodated during system overload. In this paper, an array of self-organizing maps (SOMs) replaces the traditional codebook search module recommended in ITU-T G.728 to determine the optimum index of the shape codebook. The SOMs are trained with a modified supervised algorithm in which some training parameters are optimized by particle swarm optimization (PSO). Based on the occurrence frequency of codevectors, six bits for the shape codebook and two bits for the gain codebook are used, yielding a vocoder with a lower bit rate than the traditional ITU-T G.728 vocoder. A comparison of the proposed PSO-trained SOM array, used as the codebook search module in LD-CELP, with a conventional LD-CELP implementation shows that execution time is reduced by up to 44%, while the degradation in voice quality, measured by mean opinion score, perceptual evaluation of speech quality (PESQ), and segmental signal-to-noise ratio (SNRseg), remains acceptable.
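For readers unfamiliar with SNRseg, the following is a minimal sketch of one common way to compute it. The frame length, the clamping range of -10 dB to 35 dB, and the skipping of silent frames are conventions assumed here; the abstract does not state which variant the authors used.

```python
import math


def segmental_snr(reference, decoded, frame_len=160,
                  floor_db=-10.0, ceil_db=35.0):
    """Mean of per-frame SNRs in dB, each clamped to [floor_db, ceil_db].

    Assumed conventions: non-overlapping frames, silent reference frames
    skipped, frames with zero error counted at ceil_db.
    """
    snrs = []
    for start in range(0, len(reference) - frame_len + 1, frame_len):
        ref = reference[start:start + frame_len]
        dec = decoded[start:start + frame_len]
        signal = sum(r * r for r in ref)
        noise = sum((r - d) ** 2 for r, d in zip(ref, dec))
        if signal == 0.0:
            continue  # a silent frame has no meaningful SNR
        snr = ceil_db if noise == 0.0 else 10.0 * math.log10(signal / noise)
        snrs.append(min(max(snr, floor_db), ceil_db))
    if not snrs:
        raise ValueError("no non-silent frames to average")
    return sum(snrs) / len(snrs)
```

Averaging in the dB domain per frame, rather than computing one global SNR, keeps loud segments from masking errors in quiet ones, which is why the measure is often reported alongside MOS and PESQ.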

5.
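To reduce the complexity of the algebraic code-excited linear prediction (ACELP) speech coding algorithm and make real-time implementation easier, an effective improved algorithm is proposed. For the adaptive codebook search, a discontinuous open-loop pitch search algorithm is proposed that uses a time decimation factor to sample speech discontinuously over different delay ranges. For the algebraic codebook search, a consistent pulse replacement method is proposed that uses pulse position preselection and a loop decision mechanism to control the number of codebook searches. Simulations on G.729A show that the improved algorithm effectively reduces the complexity of the ACELP codebook search while preserving speech quality.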

6.
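A wideband speech coding algorithm based on adaptive weighted spectral interpolation (STRAIGHT) is proposed. The input speech signal is first analyzed with STRAIGHT to obtain accurate fundamental frequency and spectral parameters; effective compression is then achieved through time-domain decimation and frequency-domain modeling. Unlike the fixed frame length of conventional coders, the time-domain decimation uses an adaptive variable frame length, so that coded storage is allocated more sensibly according to how the speech actually varies. Subjective listening tests show that, for speech sampled at 16 kHz, the algorithm at 6 kbps achieves quality comparable to AMR-WB (G.722.2) at 8.85 kbps. The algorithm also offers strong control over the duration, fundamental frequency, and spectral parameters of the reconstructed speech.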

7.
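This paper introduces the AMR-WB (Adaptive Multi-Rate Wideband) speech coding algorithm for third-generation mobile communication, briefly describes its encoding and decoding principles, reports a fixed-point C simulation of the algorithm, and gives simulation results for its computational complexity and memory requirements.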

8.
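To reduce the complexity of fixed codebook search, a fast search method based on segment-wise codevector optimization is proposed, building on the pulse replacement method. Segment-wise optimization of the codevector lowers computational complexity while preserving speech quality. Experimental results show that, compared with the depth-first tree search used in AMR-WB and with the conventional pulse replacement algorithm, the proposed algorithm reduces complexity by 70% to 80% without affecting speech quality.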

9.
The excellent performance of code-excited linear prediction (CELP) coders in communications-quality speech coding below 8 kbps gives this architecture a predominant role in medium-rate and low-rate speech coding, as evidenced by the adoption of several recent fixed-rate and variable-rate standards. Unfortunately, some of these CELP-based schemes are not completely described in the literature and are consequently difficult to understand and implement efficiently. This paper presents an original study of the G.723.1 codec. The G.723.1 encoder is designed to compress voice signals with bandwidth up to 4 kHz efficiently and to deliver an encoded data stream at a very low bit rate with good speech quality (typical applications being voice coding for video conferencing over the GSTN and Voice over IP). We perform a detailed, step-by-step analysis, describing the MP-MLQ/ACELP speech coder from the point of view of a classical CELP structure. This approach allows us to identify, from theoretical considerations, the starting internal structure of each processing block in the encoder. These results are used to break down the main encoding loop. Finally, using this starting internal structure, we derive the algorithm for the pitch predictor block, one of the most difficult parts of the ITU-T G.723.1 encoder. The accompanying comments, explanations, and diagrams allow regular DSP programmers to implement and debug the corresponding software efficiently.

10.
This paper proposes a modification to the transmission of the excitation codevector and the sign magnitude of its non-zero pulses using a “codebook partition and label assignment” approach, which reduces the number of bits required to transmit them over the communication channel in the legacy 8 kbps CS-ACELP speech codec. The proposed approach uses the excitation codebook structure of the forward mode of standard G.729E (11.8 kbps), with two non-zero pulses per track, which avoids the need for two algebraic codebook structures (one for the forward mode and one for the backward mode of G.729E), together with a least-significant-pulse replacement approach for finding the optimized excitation codevector. The proposed modification of the legacy 8 kbps CS-ACELP codec (80 bits/10 ms) yields a bit rate of 10.6 kbps (106 bits/10 ms) with better objective and subjective results than the legacy 8 kbps CS-ACELP coder, and it also avoids the codebook mode switching of the standard 11.8 kbps G.729E CS-ACELP coder. The paper also proposes reducing the number of searches for the final excitation codevector by taking the initial codevector as the final codevector, which improves speech quality compared with the output of legacy G.729 CS-ACELP at 8 kbps. Both the legacy 8 kbps CS-ACELP codec and the proposed 10.6 kbps CS-ACELP codec are implemented in MATLAB. Subjective and objective analyses of the proposed 10.6 kbps codec are carried out to evaluate its performance, and the results are compared with those of the legacy 8 kbps CS-ACELP codec using tables and graphs. The results show that PESQ and MOS scores are quite comparable for each set of wave files even though bitrates are reduced. The consistency and efficiency of the proposed algorithm are supported by computing the population mean with a 95% confidence interval from the objective and subjective results.

11.
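卓越, 周敬利. 《计算机仿真》 (Computer Simulation), 2004, 21(11): 110-113
The G.728 speech compression standard has an algorithmic delay of only 0.625 ms, which is very satisfactory for the vast majority of applications. However, the bandwidth it occupies is somewhat high. Many methods have been tried to lower the bit rate of G.728, such as using only the first 32 shape codewords or only the odd-numbered codewords. Building on this earlier work, the paper proposes a new 12.8 kbit/s coder. The coder exploits statistical differences in codeword usage between speakers to generate a unique codebook for each speaker automatically. Its computational complexity is comparable to that of other coders at the same bit rate, memory use increases slightly, and speech quality improves noticeably.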

12.
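In AMR-WB, fixed codebook search is a key module affecting both performance and complexity, accounting for about 40% of total complexity. To reduce computation, an efficient codebook search algorithm based on codeword splitting and sub-codeword pulse replacement is proposed. The algorithm has four steps: (1) an initial codeword is split into two or more sub-codewords; (2) each sub-codeword is updated by least-significant-pulse replacement; (3) the updated sub-codewords are combined into a candidate codeword; (4) the initial and candidate codewords are compared and the better one is selected as the final codeword. Experiments show that encoding time is reduced by about 16% compared with the conventional method.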
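The four steps listed in this abstract can be sketched roughly as follows. This is an illustrative reading, not the authors' code: it assumes the usual ACELP search criterion Q = (d·c)² / (cᵀΦc), one pulse per track, an initial codeword chosen from the largest |d| in each track, and sub-codewords that are scored on their own. The track layout and all function names are hypothetical.

```python
def sign(x):
    return 1 if x >= 0 else -1


def quality(pulses, d, phi):
    """ACELP-style criterion Q = (d.c)^2 / (c' phi c) for signed unit pulses."""
    if not pulses:
        return 0.0
    num = sum(s * d[p] for p, s in pulses) ** 2
    den = sum(si * sj * phi[pi][pj] for pi, si in pulses for pj, sj in pulses)
    return num / den if den > 0 else 0.0


def initial_codeword(d, tracks):
    """Assumed initialization: one pulse per track at the largest |d|."""
    codeword = []
    for track in tracks:
        p = max(track, key=lambda i: abs(d[i]))
        codeword.append((p, sign(d[p])))
    return codeword


def replace_least_significant(pulses, sub_tracks, d, phi):
    """Step 2: swap the pulse whose removal hurts Q least, if a swap helps."""
    k = max(range(len(pulses)),
            key=lambda i: quality(pulses[:i] + pulses[i + 1:], d, phi))
    best, best_q = list(pulses), quality(pulses, d, phi)
    for p in sub_tracks[k]:
        trial = pulses[:k] + [(p, sign(d[p]))] + pulses[k + 1:]
        q = quality(trial, d, phi)
        if q > best_q:
            best, best_q = trial, q
    return best


def split_and_replace_search(d, phi, tracks, parts=2):
    initial = initial_codeword(d, tracks)
    size = -(-len(tracks) // parts)  # ceiling division
    candidate = []
    for start in range(0, len(tracks), size):       # step 1: split
        sub = initial[start:start + size]
        sub_tracks = tracks[start:start + size]
        candidate += replace_least_significant(sub, sub_tracks, d, phi)  # 2, 3
    # Step 4: keep whichever full codeword scores higher.
    return max(initial, candidate, key=lambda cw: quality(cw, d, phi))
```

Here `d` stands for the correlation between the target and the impulse response, and `phi` for the impulse-response correlation matrix; both are taken as given inputs. The final comparison guarantees the returned codeword never scores below the initial one.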

13.
This paper presents novel techniques for source-controlled variable-rate wideband speech coding. These techniques are used in the variable-rate multimode wideband (VMR-WB) speech codec recently selected by the Third-Generation Partnership Project 2 (3GPP2) for wideband (WB) speech telephony, streaming, and multimedia messaging services in the cdma2000 third-generation wireless system. The codec uses efficient coding modes optimized for different classes of speech signal: generic coding based on AMR-WB for transients and onsets, voiced coding optimized for stable voiced signals, unvoiced coding optimized for unvoiced segments, and comfort noise generation for inactive segments. Several innovations enable very good performance at average bit rates below 8 kb/s for active speech. The article gives an overview of the codec and describes in detail some of its novel features: a robust pitch tracking algorithm, coding-mode-dependent prediction in linear prediction (LP) filter quantization, and novel frame erasure concealment techniques, including supplementary information for reconstructing lost onsets and improving decoder convergence. Selected results from the Selection and Characterization tests of the codec illustrate its performance.

14.
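For AMR-WB wideband speech coding over wireless fading channels, an unequal error protection (UEP) transmission scheme based on multi-rate punctured convolutional codes is proposed. According to the differing importance of AMR-WB coded bits, a punctured convolutional code with strong error protection is applied to the Class A bits, while a high-rate punctured convolutional code protects the Class B bits. Results show that, for the same transmission bandwidth, unequal error protection effectively improves AMR-WB speech quality over wireless fading channels.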
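As a rough illustration of how puncturing yields different protection levels from one mother code, the sketch below encodes both bit classes with the same rate-1/2 convolutional code and punctures only the less important class. The generator polynomials (7, 5 octal), the rate-3/4 puncturing pattern, the unpunctured rate for Class A, and the omission of tail bits are textbook choices assumed here, not details taken from the paper.

```python
def conv_encode(bits, generators=(0b111, 0b101), constraint=3):
    """Rate-1/2 convolutional encoder; output bits are interleaved per input.

    Assumed mother code: constraint length 3, generators 7 and 5 (octal).
    No tail bits are appended in this sketch.
    """
    mask = (1 << constraint) - 1
    state = 0
    out = []
    for b in bits:
        state = ((state << 1) | b) & mask
        for g in generators:
            out.append(bin(state & g).count("1") % 2)  # parity of tapped bits
    return out


def puncture(coded, pattern):
    """Keep coded[i] wherever the cyclically repeated pattern holds a 1."""
    return [c for i, c in enumerate(coded) if pattern[i % len(pattern)]]


RATE_1_2 = (1, 1)              # keep every coded bit (strong protection)
RATE_3_4 = (1, 1, 0, 1, 1, 0)  # 3 input bits -> 6 coded bits -> keep 4


def protect(class_a, class_b):
    """Unequal error protection: Class A at rate 1/2, Class B at rate 3/4."""
    strong = puncture(conv_encode(class_a), RATE_1_2)
    weak = puncture(conv_encode(class_b), RATE_3_4)
    return strong, weak
```

Because the saved channel bits on Class B can be spent on Class A, the total bandwidth can stay fixed while the most sensitive bits receive the stronger code, which is the trade-off the abstract describes.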

15.
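G.729A is a fourth-generation speech coding standard released by ITU-T for the PSTN. It uses the conjugate-structure algebraic code-excited linear prediction (CS-ACELP) algorithm at a bit rate of 8 kbps. After a brief introduction to the G.729A encoding and decoding algorithms, this paper discusses in detail how to implement the codec in real time on the TMS320C541 fixed-point DSP, covering the hardware and software design of the system and the key techniques, and then presents detailed experimental results. The tests lead to the conclusion that one full-duplex G.729A codec on the 'C541 requires 7.23K words of program memory and 6.7K words of data memory, with a peak computational load of 15.5 MIPS.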

16.
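Digital signal processors are widely used in speech coding and decoding. After a brief introduction to the TMS320C50 fixed-point DSP and the ITU-T G.723.1 speech codec algorithm, this paper discusses in detail the implementation of G.723.1 on the TMS320C50 and its key technical points, chiefly memory layout, algorithm and code optimization, and data precision. The resulting codec passes the ITU-T G.723.1 standard test vectors, uses little memory, and achieves high encoding and decoding speed.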

17.
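Implementation and Application of the G.729A Codec Algorithm on the DM642   Cited by: 3 (self-citations: 0, citations by others: 3)
沈勇, 唐昆. 《微计算机信息》, 2006, 22(2): 134-136
Voice communication is the foundation of multimedia communication, and the G.729A standard is widely used in it because of its high quality and low delay. The TI TMS320DM642 high-performance processor is designed specifically for multimedia communication and forms the core of multimedia terminals. This paper studies optimization methods and a concrete application of the G.729A algorithm on the DM642 and implements a highly efficient codec.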

18.
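To lower the bit rate, the G.728 algorithm is modified and an 8 kbit/s speech coding algorithm with a delay of 2.5 ms is proposed. The algorithm introduces a dual-codebook structure consisting of an adaptive codebook built from the most recent past excitation and a normalized fixed codebook. The true gain values are computed and then quantized, using fixed quantization for the adaptive codebook gain and adaptive quantization for the fixed codebook gain. The codebook search first performs backward pitch detection and then finely searches the adaptive codebook around the pitch period T. The best excitation is found by searching 64 adaptive codevectors, 256 fixed codevectors, and 8 gain values for each, at a cost of 20 bits per frame. Measured by average segmental SNR and Perceptual Evaluation of Speech Quality (PESQ), the coding quality of the modified algorithm is close to that of G.728.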
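The figures in this abstract are mutually consistent, which a short calculation makes explicit. The sketch assumes that each index is sent with the minimum whole number of bits and that the frame duration equals the stated 2.5 ms delay; the function names are hypothetical.

```python
def index_bits(n_choices):
    """Whole bits needed to index n_choices alternatives."""
    return max(1, (n_choices - 1).bit_length())


def frame_budget(adaptive_vectors=64, fixed_vectors=256,
                 gain_levels=8, frame_ms=2.5):
    """Bits per frame and resulting bit rate (bit/s).

    Assumption: one adaptive index, one fixed index, and one gain index
    for each of the two codebooks, with a frame as long as the 2.5 ms delay.
    """
    bits = (index_bits(adaptive_vectors) + index_bits(fixed_vectors)
            + 2 * index_bits(gain_levels))
    rate_bps = bits * 1000.0 / frame_ms
    return bits, rate_bps
```

With the defaults this gives 6 + 8 + 3 + 3 = 20 bits per frame, and 20 bits every 2.5 ms is 8000 bit/s, matching the 20 bits per frame and 8 kbit/s quoted in the abstract.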

19.
Speech coding compresses speech without perceptual loss, but it can eliminate or degrade the speech- and speaker-specific features used in a wide range of applications such as automatic speaker and speech recognition, biometric authentication, and prosody evaluation. The present work investigates the effect of speech coding on the quality of features extracted from codec-compressed speech, including Mel Frequency Cepstral Coefficients, Gammatone Frequency Cepstral Coefficients, Power-Normalized Cepstral Coefficients, Perceptual Linear Prediction Cepstral Coefficients, RASTA-Perceptual Linear Prediction Cepstral Coefficients, Residue Cepstrum Coefficients, and Linear Predictive Coding-derived cepstral coefficients. The codecs selected for this study are G.711, G.729, G.722.2, Enhanced Voice Services, Mixed Excitation Linear Prediction, and three codecs based on the compressive sensing framework. The analysis also covers how the quality of extracted features varies with the bit rates supported by Enhanced Voice Services, G.722.2, and the compressive sensing codecs. The quality of epochs, fundamental frequency, and formants estimated from codec-compressed speech is analyzed as well. Across the features extracted from the output of the selected codecs, the Mixed Excitation Linear Prediction codec introduces the least variation, owing to its unique representation of the excitation. For the compressive sensing based codecs, the quality of extracted features improves drastically as the bit rate increases, owing to the waveform-type coding they use. For the most popular Code Excited Linear Prediction codec, based on the Analysis-by-Synthesis coding paradigm, the impact of Linear Predictive Coding order on feature extraction is investigated. The quality of extracted features improves with the order of linear prediction, and optimum performance is obtained for a Linear Predictive Coding order between 20 and 30; the optimum varies with gender and with the statistical characteristics of the speech. Although the basic purpose of a codec is to compress a single voice source, the performance of codecs in a multi-speaker environment, the most common environment in the majority of speech processing applications, is also studied. Here a multi-speaker environment with two speakers is considered, and the quality of the individual speech signals improves as the diversity of the mixtures passed through the codecs increases. The perceptual quality of individual speech signals extracted from the codec-compressed speech is almost the same for the Mixed Excitation Linear Prediction and Enhanced Voice Services codecs, but the Mixed Excitation Linear Prediction codec preserves features better than the Enhanced Voice Services codec.

20.
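An Improved Fast Fixed-Codebook Search Algorithm for the G.729 Standard   Cited by: 1 (self-citations: 1, citations by others: 0)
In the conjugate-structure algebraic code-excited linear prediction (CS-ACELP) coder recommended in G.729, fixed codebook search accounts for a large share of the whole speech coding algorithm and directly affects its complexity. Full search is very accurate but requires too many searches, whereas conventional pulse-sequence replacement search needs fewer searches but yields poorer synthesized speech. To address this, an improved codebook search algorithm based on pulse-sequence replacement is proposed. It sets a loop threshold, performs a full search over part of the pulse combinations after the pulse sequence is reset, and introduces two-pulse position replacement, which effectively reduces the number of searches while improving search accuracy. Experimental results show that the algorithm clearly improves synthesized speech quality with only a small increase in complexity.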
