Similar literature
 Found 20 similar documents (search time: 961 ms)
1.
In this paper, we present a fine-grained parallel implementation of the MPEG-2 video encoder on the Intel Paragon XP/S parallel computer. We use a data-parallel approach and exploit parallelism within each frame, unlike some previous approaches that process several disjoint video sequences concurrently. This makes our encoder suitable for real-time applications where the complete video sequence may not be present on disk and may only become available frame by frame over time. The Express parallel programming environment is employed as the underlying message-passing system, making our encoder portable across a wide range of parallel and distributed architectures. The encoder also provides control over various parameters such as the number of processors in each dimension, the size of the motion search window, buffer management, and bitrate. Moreover, it is flexible enough to allow new and faster algorithms for different stages of the codec to replace the current ones. Comparisons of execution times, speedups, and frame encoding rates using different numbers of processors are provided. An analysis of frame data distribution among multiple processors is also presented. In addition, our study reveals the degrees of parallelism and the bottlenecks in the various computational modules of the MPEG-2 algorithm. We have used two motion estimation techniques and five different video sequences for our experiments. Using maximum parallelism by assigning one block per processor, an encoding rate higher than 30 frames/s has been achieved.
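
The data-parallel decomposition described in this abstract can be illustrated with a minimal sketch that divides the macroblocks of one frame over a two-dimensional processor grid. The function names `split_range` and `partition_frame`, the 16×16 macroblock size, and the even contiguous distribution are assumptions made for illustration; the abstract does not state how the authors actually distribute frame data.

```python
def split_range(n, parts):
    """Split range(n) into `parts` contiguous chunks whose sizes differ by at most 1."""
    base, extra = divmod(n, parts)
    bounds, start = [], 0
    for i in range(parts):
        size = base + (1 if i < extra else 0)
        bounds.append((start, start + size))
        start += size
    return bounds


def partition_frame(width, height, procs_x, procs_y, mb=16):
    """Map each processor (px, py) to the macroblock rectangle it would encode.

    The result maps (px, py) -> ((col_start, col_end), (row_start, row_end)),
    measured in macroblocks with exclusive end indices. Hypothetical layout.
    """
    mbs_x = (width + mb - 1) // mb   # macroblock columns in the frame
    mbs_y = (height + mb - 1) // mb  # macroblock rows in the frame
    cols = split_range(mbs_x, procs_x)
    rows = split_range(mbs_y, procs_y)
    return {(px, py): (cols[px], rows[py])
            for py in range(procs_y) for px in range(procs_x)}


# A 352x240 frame (22 x 15 macroblocks) on a 4 x 3 processor grid.
tiles = partition_frame(352, 240, 4, 3)
```

Here `procs_x` and `procs_y` play the role of the "number of processors in each dimension" parameter mentioned in the abstract; setting them equal to the macroblock counts gives the one-block-per-processor extreme.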

2.
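To address the load imbalance that arises in HEVC parallel video encoders using uniform slice partitioning, this paper proposes a load-balancing algorithm for a multi-slice HEVC parallel encoder. Starting from the encoding parameters, it analyzes how factors such as the quantization parameter, the number of reference frames, and the group of pictures relate to encoding time, and proposes an encoding-time prediction model based on these parameters. Using the coding information of already-encoded frames that are adjacent in position and in temporal layer, together with the actual encoding parameters, the model predicts the encoding time of the current frame, and this prediction drives the load balancing of the multi-slice HEVC parallel encoder. Experimental results show that, compared with the existing uniform slice partitioning method, the proposed method improves the speedup by about 9.23%, with almost negligible loss in coding performance.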
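
To make the load-balancing step concrete, the sketch below chooses slice boundaries so that each slice carries roughly the same predicted cost. It assumes that a predicted encoding cost is already available per row of coding blocks. The abstract's prediction model is not reproduced here, and the name `balance_slices` and the prefix-sum cutting rule are illustrative assumptions rather than the authors' algorithm.

```python
from itertools import accumulate


def balance_slices(row_costs, num_slices):
    """Return (start, end) row ranges whose predicted costs are roughly equal.

    row_costs[i] is the predicted encoding cost of row i (assumed input).
    Requires 1 <= num_slices <= len(row_costs).
    """
    prefix = list(accumulate(row_costs))
    total = prefix[-1]
    bounds, start = [], 0
    for k in range(1, num_slices):
        target = total * k / num_slices
        # First row whose cumulative cost reaches the k-th share of the total.
        cut = next(i for i, c in enumerate(prefix) if c >= target) + 1
        cut = max(cut, start + 1)                         # at least one row per slice
        cut = min(cut, len(row_costs) - (num_slices - k))  # leave rows for later slices
        bounds.append((start, cut))
        start = cut
    bounds.append((start, len(row_costs)))
    return bounds


# Rows near the top are predicted to be expensive, so the first slice gets fewer rows.
print(balance_slices([4, 4, 1, 1, 1, 1, 2, 2], 2))  # [(0, 2), (2, 8)]
```

With a uniform split the two slices would cost 10 and 6 units; the balanced split gives 8 and 8, which is the kind of imbalance the abstract sets out to remove.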

3.
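王晗, 林涛. 《计算机工程》 (Computer Engineering), 2006, 32(7): 224-226
Given the characteristics of the new video compression standard H.264, Intel's Hyper-Threading Technology and OpenMP can be used to let a software encoder perform thread-level parallel computation, greatly increasing encoding speed. This paper analyzes Hyper-Threading Technology, OpenMP, and the different levels of parallelism in the encoding process. It shows that optimizing an H.264 encoder for parallel processing with OpenMP, supported by a processor with Hyper-Threading Technology, will greatly improve encoder performance.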

4.
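The MQ coder used in JPEG2000 is a lossless data compression algorithm that outperforms Huffman coding. An MQ coder that follows the JPEG2000 algorithm procedure is slow, which limits the real-time performance of the whole coding system. This paper adopts a partially parallel algorithm in which the MQ coder can encode up to 2 data pairs per clock cycle, effectively raising the coder's data throughput, and designs a VLSI architecture based on a 3-stage pipeline. Experimental results show that the algorithm encodes 1.32 bits per clock cycle on average, about 32% more efficient than the conventional algorithm.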

5.
MPEG-4, currently being finalized by the Moving Picture Experts Group of the ISO, is a multimedia standard that aims to support content-based coding of audio, text, image, and video (synthetic and natural) data, multiplexing of coded data, as well as composition and representation of audiovisual scenes. One of the most critical components of an MPEG-4 environment is the system encoder. An MPEG-4 scene may contain several audio and video objects, images, and text, each of which must be encoded individually and then multiplexed to form the system bitstream. Due to its flexible features, object-based nature, and provision for user interaction, the MPEG-4 encoder is highly suitable for a software-based implementation. Building a full-scale software-based MPEG-4 system encoder with real-time encoding speed is a nontrivial task and requires massive computation. We have built such an encoder using a cluster of workstations collectively working as a virtual parallel machine. Parallel processing of the MPEG-4 encoder needs to be carried out carefully, as objects may appear or disappear dynamically in a scene. In addition, objects may be synchronized with each other. User interactions may also prohibit a straightforward parallelization. We propose a modeling methodology that captures the spatio-temporal relationship between various objects and user interaction. We then propose a number of scheduling algorithms that periodically allocate MPEG-4 objects to multiple workstations while meeting load balancing and synchronization requirements among multiple objects. Each scheduling algorithm has its own performance and complexity characteristics. The experimental results, while showing real-time encoding rates, exhibit tradeoffs between load balancing, scheduling overhead cost, and global performance.

6.
High efficiency video coding (HEVC) is the newest video coding standard and supports powerful video compression at the increased picture resolutions of ultrahigh definition (UHD). Compared to the previous standard, HEVC doubles coding efficiency at the cost of a tremendous increase in encoder computational complexity, which makes it difficult to support commercial UHD video services. In particular, an optimized HEVC encoder for UHD is expected to be deployed as a key technology for emerging smart surveillance systems in Internet of Things environments. A single-instruction-multiple-data implementation on an Intel x86 processor and several fast encoding schemes were investigated to reduce the complexity of the HEVC reference model (HM) encoder. The fast encoding schemes included early termination processes and data-level parallel processing. The computational complexity of the proposed HEVC encoder was reduced by a factor of approximately 192 compared with the HM encoder, with an acceptable coding loss.

7.
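Research on a buffer-based method for extending encoder interfaces   Total citations: 3 (self-citations: 0, citations by others: 3)
In data acquisition, measurement, and control systems, many absolute angle encoders used for detection output their data over an SSI interface, and the output of such an encoder sometimes has to be connected to several independent measurement and control systems. Because SSI is a single-access interface, sharing it among multiple hosts is a difficult problem that nevertheless has to be solved. This paper proposes a buffered method for reading the value of an absolute angle encoder in parallel, achieves parallel reading of the signal, and presents the ideas and methods of the hardware and software design.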

8.
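An H.264 multi-granularity parallel encoder based on homogeneous multi-core processors   Total citations: 2 (self-citations: 0, citations by others: 2)
H.264's low bit rate and high video quality come at the cost of greater encoding complexity. Developing parallel encoding algorithms suited to multi-core processor platforms is an important research topic for raising encoding speed, and it matters greatly for real-time transmission and large-scale sharing of high-definition video. Using the open-source H.264 encoder project X264 and building on slice-level and data-level parallel encoding algorithms, the authors analyze the reference relationships between frames and propose and implement a frame-level parallel algorithm with a variable number of B frames. Based on the reference relationships between macroblocks, they design a pipeline-like macroblock-level parallel method. On an Intel homogeneous multi-core platform, they propose a parallel encoding scheme that combines four granularities (frame level, slice level, macroblock level, and data level) and develop an H.264 multi-granularity parallel encoder. Experimental results show that, with only a small increase in bit rate, the encoder achieves a good encoding speedup, and the encoded video meets high-quality requirements.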
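
A pipeline-like macroblock-level schedule of the kind mentioned in this abstract is often realized as a two-dimensional wavefront. The sketch below assumes the common dependency pattern in which a macroblock needs its left, top-left, top, and top-right neighbours; under that assumption, macroblock (x, y) can start at step x + 2y. The abstract does not give the authors' exact scheme, so the function names and the dependency set are illustrative.

```python
from collections import defaultdict


def dependencies(x, y, mbs_x):
    """Neighbours assumed to be finished before (x, y): left, top-left, top, top-right."""
    candidates = [(x - 1, y), (x - 1, y - 1), (x, y - 1), (x + 1, y - 1)]
    return [(cx, cy) for cx, cy in candidates if 0 <= cx < mbs_x and cy >= 0]


def wavefront_schedule(mbs_x, mbs_y):
    """Group macroblocks into waves; blocks within one wave do not depend on each other."""
    waves = defaultdict(list)
    for y in range(mbs_y):
        for x in range(mbs_x):
            waves[x + 2 * y].append((x, y))  # earliest step at which (x, y) can run
    return [waves[step] for step in sorted(waves)]


# A 4 x 3 macroblock frame needs 8 waves instead of 12 sequential steps.
schedule = wavefront_schedule(4, 3)
```

Each row starts two steps behind the row above it, which is what gives the schedule its pipeline-like shape; the available parallelism grows with frame width.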

9.
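To measure encoder error quickly and accurately in the field, an encoder error measurement system was developed. Because encoder error comes mainly from subdivision (interpolation) error, the system focuses on analyzing subdivision error. It uses two MAX125 A/D converter chips to sample the encoder's output signals and sends the data through a parallel port to a computer for error analysis. Compared with traditional error measurement systems, it is fast, portable, and simple. The system was used to measure and analyze the error of a 21-bit encoder, which showed that the method is feasible.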

10.
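A portable theodolite with an absolute zero position must pass through zero after power-on, and its zero position is unstable, so it cannot stay calibrated to a given azimuth for long. To develop a portable theodolite with absolute encoding, and building on the theory of parallel optical code discs, this paper proposes a serial code disc sensor that uses a left-shift cyclic code as its encoding, with a CCD as the sensing and subdivision element and with the code disc pattern recognized and subdivided in software. Experimental results show that the serial code disc can serve as a new type of angle sensor.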

11.
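This paper describes the algorithm study and implementation of a multi-slice parallel AVS real-time encoder. Slice-based parallel encoding offers high processing speed, low latency, and simple data handling; its drawback is striping artifacts. A method for reducing the striping artifacts caused by slice-based parallel encoding is presented. Experimental results show that a standard-definition AVS real-time encoder is feasible.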

12.
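The high complexity and low throughput of the MQ coding algorithm severely restrict its application. Based on an analysis of how the coder state is updated when encoding a continuous code stream, this paper uses a sliding-window mechanism and probability statistics to predict interval changes and reduce dependencies among parallel data, and designs three MQ coder VLSI architectures with different degrees of parallelism, which are optimized and implemented on an FPGA. Experimental results show that, compared with a single-input MQ coder, the three architectures raise the system's processing rate to varying degrees without affecting the device's operating frequency or coding efficiency, offering a broad range of choices for large-scale application of MQ coding.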

13.
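A configurable integer transform unit is proposed and used in the adaptive transform module of an H.264/AVC High Profile video encoder. Configured by a transform-type signal, the unit performs the corresponding transform operation. The design was implemented and verified on an Altera Cyclone II FPGA; the maximum operating frequency after place and route is 63 MHz, and a transform module built from four configurable transform units meets the real-time encoding requirement for HD 1080p video at 50 frames/s.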
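
As background for the integer transform mentioned in this abstract, the sketch below computes the 4×4 forward core transform of H.264, Y = C·X·Cᵀ, in software. The abstract does not list which transform types the configurable unit supports, so treating the 4×4 core transform as one of them is an assumption; the scaling that H.264 folds into quantization is omitted, and the helper names are illustrative.

```python
# 4x4 forward core transform matrix of H.264 (integer approximation of the DCT).
CF = [[1, 1, 1, 1],
      [2, 1, -1, -2],
      [1, -1, -1, 1],
      [1, -2, 2, -1]]


def matmul(p, q):
    """Multiply two 4x4 integer matrices given as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(m):
    """Return the transpose of a matrix given as nested lists."""
    return [list(col) for col in zip(*m)]


def forward_transform_4x4(block):
    """Compute CF * block * CF^T without the post-scaling done in quantization."""
    return matmul(matmul(CF, block), transpose(CF))


# A flat residual block has energy only in the DC coefficient.
flat = [[3] * 4 for _ in range(4)]
coeffs = forward_transform_4x4(flat)
```

Because the matrix entries are only ±1 and ±2, the transform needs additions and shifts but no multipliers, which is one reason it maps well to small hardware units.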

14.
Document-level machine translation (MT) remains challenging because it is difficult to use document-level global context efficiently during translation. In this paper, we propose a hierarchical model that learns the global context for document-level neural machine translation (NMT). It uses a sentence encoder to capture intra-sentence dependencies and a document encoder to model document-level inter-sentence consistency and coherence. With this hierarchical architecture, we feed the extracted document-level global context back to each word in a top-down fashion, so that different translations of a word can be distinguished according to its specific surrounding context. Notably, we explore the effect of three popular attention functions during the information backward-distribution phase to take a closer look at how our model distributes global context information. In addition, since large-scale in-domain document-level parallel corpora are usually unavailable, we use a two-step training strategy that takes advantage of a large-scale corpus of out-of-domain parallel sentence pairs and a small-scale corpus of in-domain parallel document pairs to achieve domain adaptability. In experiments on Chinese-English and English-German corpora, our model improves significantly on the Transformer baseline, by 4.5 BLEU points on average, which demonstrates the effectiveness of the proposed hierarchical model in document-level NMT.

15.
Albertengo, G.; Sisto, R. IEEE Micro, 1990, 10(5): 63-71
Theoretical aspects of encoding cyclic redundancy codes (CRCs) are reviewed. A method of designing hardware parallel encoders for CRCs that is based on digital system theory and z-transforms is presented. It allows designers to derive the logic equations of the parallel encoder circuit for any generator polynomial. A few interesting application areas for hardware parallel encoders are pointed out.
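
The abstract derives the parallel logic equations with digital system theory and z-transforms. The sketch below swaps in a different and simpler technique that reaches the same kind of result: it runs the bit-serial CRC register symbolically for w clocks, so that each next-state bit becomes an XOR of current state bits and input bits. The register convention (MSB-first, no bit reflection), the function names, and the example generator x^8 + x^2 + x + 1 are assumptions chosen for illustration.

```python
def serial_step(state, d, poly, zero):
    """One clock of a bit-serial CRC register (MSB-first, no reflection).

    state[i] is register bit i, d is the incoming data bit, and poly holds the
    generator coefficients below x^n. `zero` is the XOR identity for the value
    type in use: 0 for ints, frozenset() for symbolic terms.
    """
    fb = state[-1] ^ d  # feedback = top register bit XOR input bit
    return [(state[i - 1] if i else zero) ^ (fb if (poly >> i) & 1 else zero)
            for i in range(len(state))]


def derive_parallel_equations(poly, n, w):
    """Run w symbolic serial steps; each result bit is a set of XOR-ed terms.

    Terms are ("s", i) for current state bit i and ("d", k) for the k-th input
    bit. Symmetric difference of sets models XOR, so repeated terms cancel.
    """
    state = [frozenset({("s", i)}) for i in range(n)]
    for k in range(w):
        state = serial_step(state, frozenset({("d", k)}), poly, frozenset())
    return state


def parallel_step(equations, state, data):
    """Evaluate the derived equations, consuming len(data) bits in one step."""
    values = {("s", i): b for i, b in enumerate(state)}
    values.update({("d", k): b for k, b in enumerate(data)})
    return [sum(values[t] for t in eq) % 2 for eq in equations]


# Example: generator x^8 + x^2 + x + 1 (poly = 0x07), 4 input bits per clock.
EQUATIONS = derive_parallel_equations(0x07, 8, 4)
```

Each entry of `EQUATIONS` lists the signals that feed one XOR tree of the parallel circuit; changing `poly`, `n`, or `w` yields the equations for another generator polynomial or data width.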

16.
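The H.264 video coding standard adopts many new techniques that give it better coding efficiency but also increase computational complexity, so it cannot meet real-time requirements. The parallel computing capability of the Streaming SIMD Extensions 2 (SSE2) instruction set can improve a computer's real-time processing of multimedia data. This paper uses SSE2 to optimize, at the instruction level, several time-consuming key modules of H.264: SAD computation in integer-pixel motion estimation, the integer DCT, quantization, the Hadamard transform, and SATD computation in sub-pixel motion estimation. Experimental results show that, while preserving video quality, the optimized modules run faster, so the overall encoding speed of the H.264 encoder better meets real-time requirements.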
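
For readers unfamiliar with the two cost measures named in this abstract, the following scalar sketch defines SAD and SATD for a 4×4 block. It is a plain reference version, not the SSE2 code the abstract describes, and the function names are illustrative. Encoders often divide the SATD sum by a normalization factor; that step is left out here as an implementation choice.

```python
# 4x4 Hadamard matrix in sequency order; it is symmetric, so H4 equals its transpose.
H4 = [[1, 1, 1, 1],
      [1, 1, -1, -1],
      [1, -1, -1, 1],
      [1, -1, 1, -1]]


def matmul(p, q):
    """Multiply two 4x4 integer matrices given as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def sad(cur, ref):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(c - r) for row_c, row_r in zip(cur, ref)
               for c, r in zip(row_c, row_r))


def satd4x4(cur, ref):
    """Sum of absolute Hadamard-transformed differences (unnormalized)."""
    diff = [[c - r for c, r in zip(row_c, row_r)] for row_c, row_r in zip(cur, ref)]
    transformed = matmul(matmul(H4, diff), H4)
    return sum(abs(v) for row in transformed for v in row)


cur = [[10] * 4 for _ in range(4)]
ref = [[8] * 4 for _ in range(4)]
print(sad(cur, ref), satd4x4(cur, ref))  # 32 32
```

Both functions apply the same operation independently to many pixels, which is why loops of this shape are natural candidates for the SIMD optimization the abstract reports.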

17.
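From the perspective of single-instruction, multiple-data parallel computation, this paper introduces object-oriented ideas into the parallel computation of SAD values and presents an improved image organization optimization algorithm. Motion-prediction tests on several standard test sequences show that, for H.264/AVC, currently the most complex video coding standard, the algorithm clearly increases the encoder's speed, which supports real-time video communication over narrowband channels.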

18.
In this paper, we introduce a turbo coded modulation scheme called multilevel turbo coded-continuous phase frequency shift keying (MLTC-CPFSK). The underlying basis of multilevel coding is to partition a signal set into several levels and to encode each level separately through the respective layer of the encoder. In MLTC-CPFSK, to provide phase continuity of the signals, a turbo encoder and a continuous phase encoder (CPE) are serially concatenated at the last level, while all other levels consist of only a turbo encoder. Therefore, the proposed system contains multiple turbo encoder/decoder blocks in its architecture. The parallel input data sequences are encoded by our multilevel scheme and mapped to CPFSK signals. Then, for the purpose of performance analysis, these modulated signals are passed through AWGN and fading channels. At the receiver side, the input sequence of the first level is estimated by the first turbo decoder block. Subsequently, the input sequences of the other levels are computed using the estimated input bit streams of the respective previous levels. Simulation results are obtained for 4-ary CPFSK two-level and 8-ary CPFSK three-level turbo codes over AWGN, Rician, and Rayleigh channels for three iterations, with frame sizes of 100 and 1024. It is concluded that satisfactory performance is achieved in MLTC-CPFSK systems for all SNR values in various fading environments.

19.
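An improved line-based two-dimensional wavelet transform architecture is proposed, together with a bit-plane-parallel bit-plane coder and an arithmetic coder with a four-stage pipeline, all integrated into a single SoPC to implement a JPEG2000 encoding system. The whole design was verified on an Altera Stratix II EP2S60F1020C5 platform; at a maximum clock frequency of 98 MHz it encodes 512×512 grayscale images at 52 frames/s, meeting the requirement for real-time encoding.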

20.
Recent advances in semiconductor technologies make it possible to integrate many processor cores in a small device package. The parallel execution capability of such multi-core processors can be exploited to enhance the performance of many traditional sequential applications. There have been numerous research activities to develop parallelization techniques using the OpenMP programming model in order to speed up sequential applications such as the H.264/AVC codec, but mostly in the PC environment. Therefore, it is difficult to understand which parallelization technique fits well with the H.264/AVC encoder on an embedded multi-core architecture. In this paper, we present parallelization techniques applicable to the H.264/AVC encoder on ARM MPCore using the OpenMP programming model. Further, we propose an analytical model for the performance estimation of the H.264/AVC encoder, and we then verify the model's accuracy by performing simulations using a hardware/software co-verification tool. Our experimental results show that the parallelization techniques proposed in this paper for the embedded multi-core platform improve the encoder performance by up to 2.36 times, and that the parallelization technique exploiting data-level parallelism outperforms the one using task-level parallelism by 41%. It is also observed that balancing loads among processor cores is a critical factor in achieving better scalability in the encoder.
