Found 20 similar articles (search time: 60 ms)
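To ease network transmission and local storage, large numbers of audio files must be compressed, but the savings in storage space come at the cost of audio quality. Targeting MPEG-1 Layer 3 (mp3), the most widely used lossy audio compression method, this work uses ASRGAN (Audio Super-Resolution Generative Adversarial Nets) to restore the quality of audio whose bit rate has been reduced. A generative model and a discriminative model are trained so that each improves the other, and overlapped weighting is applied; dilated convolutions and bidirectional recurrent networks strengthen the network's ability to handle very long sequences, and the best-performing audio enhancement model is finally selected. The method reduces the network bandwidth and storage capacity needed to transmit and store audio while still delivering good audio quality.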
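The abstract does not say how the overlapped weighting is carried out. One common reading is that a long signal is cut into overlapping frames, each frame is enhanced separately, and the results are blended with window weights. The sketch below illustrates that reading only: the function name `overlap_weighted_merge`, the Hann-like window, and the `process` callback (a stand-in for the trained model) are all assumptions.

```python
import math

def overlap_weighted_merge(signal, frame_len, hop, process):
    """Process overlapping frames and blend them with normalized window weights.

    `process` is a hypothetical stand-in for the enhancement model; it must
    return a sequence of the same length as the frame it receives.
    """
    if not 0 < hop <= frame_len:
        raise ValueError("hop must satisfy 0 < hop <= frame_len")
    n = len(signal)
    out = [0.0] * n
    weight_sum = [0.0] * n
    # Hann-like window sampled at half-integer points, so every weight is > 0.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * (i + 0.5) / frame_len)
              for i in range(frame_len)]
    start = 0
    while start < n:
        frame = signal[start:start + frame_len]   # the last frame may be shorter
        processed = process(frame)
        if len(processed) != len(frame):
            raise ValueError("process must preserve the frame length")
        for i, value in enumerate(processed):
            out[start + i] += window[i] * value
            weight_sum[start + i] += window[i]
        if start + frame_len >= n:
            break
        start += hop
    # Dividing by the accumulated weights turns the sum into a weighted average.
    return [o / w for o, w in zip(out, weight_sum)]
```

Because the output is a weighted average, an identity `process` returns the input unchanged, which is a convenient sanity check before plugging in a real model.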
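This paper presents the system structure and principles of real-time sound simulation using a Sound Blaster card, discusses in detail the difficulties and problems encountered in the implementation, and gives concrete solutions.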
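Building on the integer lifting wavelet transform, the masking characteristics of human hearing, and the local neighborhood characteristics of digital audio, an adaptive wavelet-domain digital audio watermark embedding algorithm is proposed. The algorithm has the following features: (1) it uses the masking characteristics of the human auditory system to determine watermark embedding positions adaptively; (2) it introduces the efficient integer lifting wavelet transform; (3) it exploits the local neighborhood characteristics of digital audio to adjust the watermark embedding depth intelligently; (4) extracting the watermark does not require the original audio signal. Comparative experiments show that the adaptive algorithm not only has good transparency but is also robust to attacks such as additive noise, lossy compression, low-pass filtering, resampling, and requantization (especially additive noise and low-pass filtering).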
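This paper describes how to use a sound card as the A/D and D/A converter and LabVIEW to implement sound playback, sound acquisition, and spectrum analysis of sound signals, together realizing a sound equalizer. The equalizer designed here can deepen the understanding of signal acquisition and signal processing and has some reference value.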
In this paper we describe a prototype spatial audio user interface for a Global Positioning System (GPS). The interface is
designed to allow mobile users to carry out location tasks while their eyes, hands or attention are otherwise engaged. Audio
user interfaces for GPS have typically been designed to meet the needs of visually impaired users, and generally, though not
exclusively, employ speech-audio. In contrast, our prototype system uses a simple form of non-speech, spatial audio. This
paper analyses various candidate audio mappings for location and distance information. A variety of tasks, design considerations,
design trade-offs and opportunities are considered. The findings from pilot empirical testing are reported. Finally, opportunities
for improvements to the system and for future evaluation are explored.
An effective audio information retrieval technique (total citations: 2; self-citations: 0; citations by others: 2)
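Audio data retrieval is urgently needed for digital information retrieval, yet international research on audio retrieval techniques remains far from satisfactory. A new audio retrieval mechanism is proposed: the wavelet transform is used to generate feature vectors for audio data, and association rule mining discovers the relationships between the feature element vectors and the classes to which the audio data belong, enabling audio classification and retrieval. Experiments show that the method achieves high retrieval efficiency, greatly shortens computation time, and delivers good retrieval performance.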
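To achieve high-transparency, high-capacity information hiding in compressed audio, a new blind-detection steganography algorithm based on MPEG audio coding is proposed. Variable-length codewords (VLC) are first paired to extend the original codeword space, and codeword mapping rules are then used to embed the secret information. The algorithm keeps the compressed audio file size unchanged before and after embedding, and the embedding process does not require fully decoding the MPEG audio. Experimental results show that the proposed algorithm has low computational complexity while achieving high hiding capacity and good imperceptibility.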
《IEEE transactions on audio, speech, and language processing》2006,14(6):1902-1911
A new method for the objective assessment and prediction of perceived audio quality is introduced. It represents an expansion of the speech quality measure $q_C$, introduced by Hansen and Kollmeier, and is based on a psychoacoustically validated, quantitative model of the “effective” peripheral auditory processing by Dau et al. To evaluate the audio quality of a given distorted signal relative to a corresponding high-quality reference signal, the auditory model is employed to compute “internal representations” of the signals, which are partly assimilated in order to account for assumed cognitive aspects. The linear cross correlation coefficient of the assimilated internal representations represents the perceptual similarity measure (PSM). PSM shows good correlations with subjective quality ratings if different types of audio signals are considered separately, whereas a better accuracy of signal-independent quality prediction is achieved by a second quality measure $PSM_t$, represented by the fifth percentile of the sequence of instantaneous audio quality PSM(t). The new measures were evaluated using a large database of subjective listening tests that were originally carried out on behalf of the International Telecommunication Union (ITU) and Moving Picture Experts Group (MPEG) for the evaluation of various low bit-rate audio codecs. Additional tests with data unknown in the development phase of the model were carried out. Except for linear distortions, the new method shows a higher prediction accuracy than the ITU-R recommendation BS.1387 (“PEAQ”) for the tested data.
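The two measures in this abstract reduce to simple statistics once the internal representations are available. The sketch below shows only that final step: a linear cross-correlation coefficient for PSM and a fifth percentile for $PSM_t$. The auditory model and the assimilation step are not modeled, the inputs are placeholder sequences, and the names `psm` and `psm_t` and the nearest-rank percentile rule are assumptions rather than details from the paper.

```python
import math

def psm(ref, test):
    """Linear (Pearson) cross-correlation coefficient of two equal-length sequences."""
    if len(ref) != len(test) or len(ref) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_ref = sum(ref) / len(ref)
    mean_test = sum(test) / len(test)
    cov = sum((r - mean_ref) * (t - mean_test) for r, t in zip(ref, test))
    var_ref = sum((r - mean_ref) ** 2 for r in ref)
    var_test = sum((t - mean_test) ** 2 for t in test)
    if var_ref == 0 or var_test == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_ref * var_test)

def psm_t(instantaneous, percentile=5.0):
    """Percentile of a sequence of instantaneous values (nearest-rank rule, assumed)."""
    if not instantaneous:
        raise ValueError("empty sequence")
    ordered = sorted(instantaneous)
    rank = max(1, math.ceil(percentile * len(ordered) / 100))
    return ordered[rank - 1]
```

Taking a low percentile rather than the mean makes the score sensitive to the worst moments of a signal, which is presumably why it generalizes better across signal types.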
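A method for hiding information in audio that resists desynchronization attacks (total citations: 4; self-citations: 0; citations by others: 4)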
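Based on audio information hiding, a secure voice communication method that effectively resists malicious desynchronization attacks is proposed. The secret speech is compressed and encoded, and the independent intra-frame coding property of the G.729 coding standard provides intra-frame self-synchronization of the speech bitstream; a quantization method hides the speech information in the wavelet domain of the carrier audio; and a PN sequence serves as the time-domain synchronization frame to locate where the secret information is hidden. The algorithm has low complexity, its hiding capacity meets the requirements of normal voice communication, and detecting and extracting the secret speech does not require the original audio. Experiments show that the algorithm performs well against audio processing (such as added noise, MP3 compression, resampling, and random cropping), and that it is more robust than comparable methods, particularly against malicious cropping of the audio signal.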
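Locating a PN synchronization frame in the time domain is typically done with a sliding correlation. The sketch below is a minimal illustration under assumptions: the chips are ±1 values drawn from a seeded generator rather than the paper's actual PN generator, and `make_pn` and `locate_sync` are hypothetical names.

```python
import random

def make_pn(length, seed=1):
    """Pseudo-random +/-1 chip sequence (a stand-in for the paper's PN sequence)."""
    rng = random.Random(seed)
    return [rng.choice((-1, 1)) for _ in range(length)]

def locate_sync(signal, pn):
    """Return the offset where the sliding correlation with `pn` is largest."""
    if len(signal) < len(pn):
        raise ValueError("signal is shorter than the PN sequence")
    best_offset, best_score = 0, float("-inf")
    for offset in range(len(signal) - len(pn) + 1):
        window = signal[offset:offset + len(pn)]
        score = sum(s * c for s, c in zip(window, pn))
        if score > best_score:
            best_offset, best_score = offset, score
    return best_offset
```

After cropping, the receiver can rerun the search on the remaining samples, so the hidden payload following the sync frame can still be found at its new position.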
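For blind or visually impaired people, making phone calls has always been troublesome: with many contacts and a large phone book, memorizing every number is clearly unrealistic. This article explores how blind or visually impaired people can record and look up phone numbers, proposes a design for a new type of voice phone book, and implements the design with the eSZL000 audio processing chip from Yilong (易隆).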
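In recent years, in some interactive scenarios, object-based spatial audio coding has allowed users to render and recombine specific objects more flexibly and individually. However, encoding each audio object separately makes the overall bit rate rise sharply as the number of objects grows. Spatial audio object coding (SAOC), proposed by MPEG, downmixes all objects into a single mixed signal and extracts a small amount of side information for each object. When more than 32 audio objects are encoded, however, the side-information bit rate grows with the number of objects and can even far exceed the bit rate of the downmix. To solve this problem, a dynamic quantization method for spatial parameters based on spatial position constraints is proposed for object-based coding. Taking the summing localization principle as its theoretical basis, it uses the constraint between the spatial position of a virtual sound source and the spatial positions of the audio objects that produce it to determine a spatially constrained region and a local spatial quantization codebook, and then quantizes and encodes the spatial parameters according to the extracted spatial direction of the virtual source. Finally, subjective and objective experiments show that, with roughly equivalent audio quality and spatial direction, the side-information bit rate is about 30% lower than with SAOC.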
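To further improve the noise reduction performance of beamforming, a robust post-filtering adaptive spatial beamforming algorithm is studied. Microphones replace the tapped delay lines of a conventional beamformer so that every microphone has a first-order filter; the classical linearly constrained minimum variance (LCMV) criterion makes the spatial beamformer produce a speech reference signal, which, together with the noise reference signal output by the blocking matrix, passes through an adaptive multichannel canceller to remove interfering noise effectively; finally, post-filtering further improves speech quality. Experimental results show that, compared with the conventional post-filtering adaptive beamforming algorithm, the method clearly improves noise reduction and yields a higher output signal-to-noise ratio.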
Network transmission is liable to errors and data loss. In movie transmission, packets of video frames are subject to loss or even explicit elimination for many reasons including congestion handling and the achievement of higher compression. Not only does the loss of video frames cause significant reduction in video quality, but it could also cause a loss of synchronization between the audio and video streams. If not corrected, this cumulative loss can seriously degrade the motion picture's quality beyond viewers' tolerance. In this paper, we study and classify the effect of audio-video de-synchronization. Afterwards, we develop and examine the performance and appropriateness of the application of many client-based techniques in the estimation of lost frames using the existing received frames, without the need for retransmissions or error control information. The estimated frames are injected at their appropriate locations in the movie stream to restore the loss. The objective is to enhance video quality by finding a very close estimate to the original frames at a suitable computation cost, and to contribute to the restoration of synchronization within the tolerance level of viewers.
贾丽娟 《计算机光盘软件与应用》2010,(3):69-69,66
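The computer sound card is one of the most basic components of multimedia technology: hardware that converts between sound waves and digital signals. This paper analyzes a spectral-subtraction speech enhancement system based on a computer sound card. It first analyzes the sound-card-based speech enhancement system and then introduces the spectral subtraction algorithm; the work has some reference value.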
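For readers unfamiliar with spectral subtraction, the following is a minimal sketch of the magnitude-domain variant for a single frame. It assumes a per-bin noise magnitude estimate is already available (for example from a speech-free frame). The naive DFT, the function names, and the `floor` parameter are illustrative assumptions, not details taken from the paper.

```python
import cmath
import math

def dft(x):
    """Naive O(n^2) discrete Fourier transform, adequate for short frames."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Inverse of `dft`, returning the real part of each sample."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]

def spectral_subtract(frame, noise_mag, floor=0.0):
    """Subtract a noise magnitude estimate from each bin, keeping the noisy phase."""
    if len(frame) != len(noise_mag):
        raise ValueError("noise_mag needs one value per DFT bin")
    cleaned = []
    for bin_value, n_mag in zip(dft(frame), noise_mag):
        mag = abs(bin_value)
        # Clamp so the magnitude never drops below floor * mag (and never below 0).
        new_mag = max(mag - n_mag, floor * mag)
        cleaned.append(cmath.rect(new_mag, cmath.phase(bin_value)))
    return idft(cleaned)
```

A small positive `floor` is a common way to reduce the "musical noise" left behind when bins are zeroed outright. A practical system would use an FFT and overlapping windowed frames instead of a naive DFT on isolated frames.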
WANG Xiangyang, CUI Yongrui, YANG Hongying, ZHAO Hong 《通讯和计算机》2005,2(1):49-56
Digital audio watermarking embeds inaudible information into digital audio data for the purposes of copyright protection, ownership verification, covert communication, and/or auxiliary data carrying. In this paper, we present a novel watermarking scheme to embed a meaningful gray image into digital audio by quantizing the wavelet coefficients (using integer lifting wavelet transform) of audio samples. Our audio-dependent watermarking procedure directly exploits temporal and frequency perceptual masking of the human auditory system (HAS) to guarantee that the embedded watermark image is inaudible and robust. The watermark is constructed by utilizing still image compression technique, breaking each audio clip into smaller segments, selecting the perceptually significant audio segments to wavelet transform, and quantizing the perceptually significant wavelet coefficients. The proposed watermarking algorithm can extract the watermark image without the help of the original digital audio signals. We also demonstrate the robustness of that watermarking procedure to audio degradations and distortions, e.g., those that result from noise addition, MPEG compression, low pass filtering, resampling, and requantization.
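The core idea of embedding by quantizing integer wavelet coefficients can be sketched briefly. The example below uses a one-level integer Haar lifting step and parity-based quantization index modulation on the first approximation coefficients. All of these are assumptions for illustration: the abstract does not name the wavelet, the quantizer, or the step size, and the perceptual-masking selection of coefficients is omitted.

```python
def haar_lift_forward(x):
    """One-level integer Haar lifting: returns (approximation, detail) lists."""
    if len(x) % 2:
        raise ValueError("need an even number of samples")
    approx, detail = [], []
    for i in range(0, len(x), 2):
        d = x[i + 1] - x[i]
        approx.append(x[i] + (d >> 1))   # >> 1 is floor division by 2
        detail.append(d)
    return approx, detail

def haar_lift_inverse(approx, detail):
    """Exact integer inverse of `haar_lift_forward`."""
    x = []
    for s, d in zip(approx, detail):
        even = s - (d >> 1)
        x.extend((even, even + d))
    return x

def qim_embed(coeff, bit, step):
    """Move `coeff` to the nearest multiple of `step` whose index parity is `bit`."""
    q = (coeff + step // 2) // step
    if q % 2 != bit:
        q += 1 if coeff >= q * step else -1
    return q * step

def qim_extract(coeff, step):
    """Recover the bit from the parity of the nearest quantization index."""
    return ((coeff + step // 2) // step) % 2

def embed_bits(samples, bits, step=8):
    approx, detail = haar_lift_forward(samples)
    if len(bits) > len(approx):
        raise ValueError("not enough coefficients for the payload")
    for i, bit in enumerate(bits):
        approx[i] = qim_embed(approx[i], bit, step)
    return haar_lift_inverse(approx, detail)

def extract_bits(samples, count, step=8):
    # Blind extraction: only the marked samples and the step size are needed.
    approx, _ = haar_lift_forward(samples)
    return [qim_extract(a, step) for a in approx[:count]]
```

A larger `step` tolerates stronger distortion of the coefficients but changes the audio more, which is the trade-off that perceptual masking is meant to manage.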
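The design and implementation of an industrial speech recognition system whose speech enhancement method cascades acoustic coupling with a speech enhancement module are analyzed and studied, and the algorithm is modeled. Spectral subtraction and MMSE-LSA speech enhancement algorithms are compared and the experimental data analyzed, and the industrial robot speech recognition system achieves a higher recognition rate in noisy environments.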