Similar literature
 Found 20 similar articles; search took 15 ms
1.
This paper proposes a task-based hybrid parallel and hybrid pipeline (THPHP) scheme to implement multi-standard video algorithms, including MPEG-2, H.264, and audio video coding standard (AVS), on a heterogeneous coarse-grained reconfigurable processor called the reconfigurable multimedia system (REMUS). The proposed schemes greatly improve decoding performance and satisfy the real-time requirements of various high-definition (HD) video decoding standards. In THPHP, we propose both a task-based hybrid parallel scheme, in which macro-block (MB)-level, block-level, and sub-block-level decoding tasks are parallelized to improve data processing throughput, and a hybrid pipeline scheme, in which slice-level, MB-level, block-level, and sub-block-level computations are pipelined to improve efficiency. Computation-intensive tasks, such as motion compensation, intra prediction, inverse discrete cosine transform, reconstruction, and deblocking filter, are implemented on two reconfigurable processing units, which are the core computing engines of REMUS. Thanks to the proposed schemes, the implementations can achieve H.264 high profile (HP) 1920×1080@30 fps streams, AVS Jizhun profile (JP) 1920×1080@39 fps streams, and MPEG-2 main profile (MP) 1920×1080@41 fps streams when working at a frequency of 200 MHz. Compared with XPP-III (a commercial reconfigurable processor), when implementing H.264 HD decoding, the performance and energy efficiency on REMUS are improved by 1.81× and 14.3×, respectively.
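To put the real-time figures in this abstract in perspective, the short Python sketch below estimates the average number of clock cycles available per macroblock at a given resolution, frame rate, and clock frequency. It is only back-of-the-envelope arithmetic: the 16×16 macroblock size and the rounding of the frame height up to a whole number of macroblocks are assumptions, and the function name is hypothetical rather than anything taken from the paper.

```python
import math

def cycles_per_macroblock(width, height, fps, clock_hz, mb_size=16):
    """Average clock cycles available per macroblock for real-time decoding.

    Assumes square macroblocks of mb_size pixels and pads each dimension
    up to a whole number of macroblocks (e.g. 1080 rows -> 68 MB rows).
    """
    mbs_per_row = math.ceil(width / mb_size)
    mb_rows = math.ceil(height / mb_size)
    mbs_per_second = mbs_per_row * mb_rows * fps
    return clock_hz / mbs_per_second

# H.264 HP figure quoted in the abstract: 1920x1080 at 30 fps, 200 MHz clock.
budget = cycles_per_macroblock(1920, 1080, 30, 200_000_000)
print(f"{budget:.0f} cycles per macroblock")
```

Under these assumptions the decoder has roughly 817 cycles on average for every macroblock, which suggests why the decoding stages need to be both parallelized and pipelined.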

2.
This paper proposes a novel relay selection strategy based on beamforming (BF) feedback information carried in a purpose-designed sector sweep (SSW) report frame for millimeter-wave (mmWave) wireless personal area networks (WPANs). First, an SSW report frame compatible with the IEEE 802.11ad standard is designed. Second, an approach for collecting instantaneous channel state information (CSI) overheard during BF is devised. Third, with the aim of minimizing the outage probability and maximizing the overall network throughput capacity, the optimal relay selection problem for non-line-of-sight (NLoS) links is formulated as a bipartite graph matching problem, and the Kuhn-Munkres (KM) algorithm is used to solve it. Both theoretical analysis and simulation results show that, with CSI that accounts for NLoS conditions and relays selected according to the principle of maximizing overall network throughput capacity, the strategy improves on opportunistic relay selection in terms of overall network throughput capacity and outage probability, with minimal modifications to IEEE 802.11ad.
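The bipartite formulation in this abstract can be illustrated with a toy assignment problem in Python: each NLoS link is matched to a distinct relay so that the summed throughput is as large as possible. The sketch below deliberately uses exhaustive search over permutations instead of the Kuhn-Munkres algorithm, which finds the same optimum in polynomial time; the rate matrix and all names are hypothetical and not taken from the paper.

```python
from itertools import permutations

def best_relay_assignment(throughput):
    """Return (assignment, total) maximizing the summed throughput.

    throughput[i][j] is a hypothetical throughput of link i via relay j.
    Exhaustive search over one-to-one assignments, so only suitable for
    tiny inputs; requires at least as many relays as links.
    """
    n_links = len(throughput)
    n_relays = len(throughput[0])
    best_total, best_choice = float("-inf"), None
    for choice in permutations(range(n_relays), n_links):
        total = sum(throughput[i][j] for i, j in enumerate(choice))
        if total > best_total:
            best_total, best_choice = total, choice
    return list(best_choice), best_total

# Made-up rates for three links (rows) and three candidate relays (columns).
rates = [
    [4.0, 1.0, 3.0],
    [2.0, 0.5, 5.0],
    [3.0, 2.0, 2.0],
]
assignment, total = best_relay_assignment(rates)
print(assignment, total)
```

With these made-up rates the best matching sends link 0 through relay 0, link 1 through relay 2, and link 2 through relay 1. Note that picking the best relay for each link independently would assign relay 0 to both link 0 and link 2, which is the kind of conflict a matching formulation rules out.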

3.
In order to meet the increased computational demands of, e.g., multimedia applications, such as video processing in HDTV, and communication applications, such as baseband processing in telecommunication systems, the architectures of reconfigurable devices have evolved to coarse-grained compositions of functional units or program-controlled processors, which are operated in a coordinated manner to improve performance and energy efficiency. In this survey we explore the field of coarse-grained reconfigurable computing on the basis of the hardware aspects of granularity, reconfigurability, and interconnection networks, and discuss the effects of these on energy-related properties and scalability. We also consider the computation models that are being adopted for programming such machines, models that expose the parallelism inherent in the application in order to achieve better performance. We classify the coarse-grained reconfigurable architectures into four categories and present some existing examples of these categories. Finally, we identify the emerging trends of introducing asynchronous techniques at the architectural level and using nano-electronics from a technological perspective in the reconfigurable computing discipline.

4.
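In high-definition video decoding, sub-pixel interpolation is one of the most computation-intensive stages. To address the shortcomings of existing sub-pixel interpolation architectures in balancing performance and flexibility, this paper proposes a reconfigurable sub-pixel interpolation architecture for multi-standard video decoding. By analyzing the commonalities and differences among the interpolation modes of different standards, a novel reconfigurable parallel-serial hybrid filter structure is proposed, in which the data transfer paths, the input/output data modes, and the filter computation units can all be dynamically configured; it supports multiple video standards, including VC-1, H.264/263, AVS, and MPEG-1/2/4. Experimental results show that the design can perform multi-standard real-time HDTV 1080p (1920×1088@30 fps) video decoding; compared with existing work, the design supports more HD video codec standards with the same silicon resources. The design has been used in a multimedia SoC chip.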

5.
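A model of a reconfigurable system   Total citations: 4, self-citations: 0, citations by others: 4
MorphoSys M1, designed at the University of California, Irvine, is a representative coarse-grained reconfigurable system that achieves good speedups for many multimedia applications. However, some shortcomings in its design leave its functional units underutilized, which significantly limits further improvement of its overall performance. To address these shortcomings, and taking into account the state of research on reconfigurable systems and the characteristics of some multimedia applications, a new model of a reconfigurable system is proposed. Experimental data show that, for many multimedia applications and encryption/decryption algorithms, the improved model achieves a speedup of at least 16% over MorphoSys M1.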

6.
Numerous classic multimedia activities already exist in elementary schools, alongside regular classes which are more or less technically supported and offer many possibilities for a sensible introduction of modern ICT (information-communication technology) into schools. By using modern technologies, it is possible to upgrade multimedia activities, enrich and at the same time raise work quality and efficiency in schools, increase knowledge, skills and competences in an unforced way, and increase the competitiveness of teachers and students with modern information communication equipment and technologies. By introducing modern technologies to school work, a modern organisation of school and class activities is enabled, technical culture improves, parents and experts from the environment are more involved as mentors to students and assistants to teachers. Realisation of students' ideas and projects is enabled on different projects where the students can express various forms of talent, as numerous possibilities for developing entrepreneur thinking will open. Schools acquire multimedia material which can enhance the learning process, archives, activities, school image, and website; they connect to local media and can cooperate in international school web projects, improve their recognition and competitiveness, and increase the chances of acquiring additional resources for their work.

7.
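Application of pipelined configuration technology in reconfigurable processors   Total citations: 1, self-citations: 1, citations by others: 0
This paper proposes a pipelined configuration technique for reconfigurable processors that effectively reduces configuration time and speeds up application execution. The reconfigurable processor comprises a general-purpose processor and a coarse-grained reconfigurable array. The array handles the loops that account for most of an application's execution time; these loops are decomposed into rows that execute on the array in a pipelined manner. The technique was verified on an FPGA verification system. The applications verified include the integer discrete cosine transform and motion estimation from the H.264 benchmark. Compared with the conventional reconfigurable processors PipeRench and MorphoSys and TI's TMS320DM642 DSP, performance improves by about 3.5 times.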

8.
9.
In the development of robotic limbs, the side of members is of importance to define the shape of artificial limbs and the range of movements. It is mainly significant for biomedical applications concerning patients suffering arms or legs injuries. In this paper, the concept of an ambidextrous design for robot hands is introduced. The fingers can curl in one way or another, to imitate either a right hand or a left hand. The advantages and inconveniences of different models have been investigated to optimise the range and the maximum force applied by fingers. Besides, a remote control interface is integrated to the system, allowing both to send commands through internet and to display a video streaming of the ambidextrous hand as feedback. Therefore, a robotic prosthesis could be used for the first time in telerehabilitation. The main application areas targeted are physiotherapy after strokes or management of phantom pains for amputees by learning to control the ambidextrous hand. A client application is also accessible on Facebook social network, making the robotic limb easily reachable for the patients. Additionally the ambidextrous hand can be used for robotics research as well as artistic performances.

10.
《Parallel Computing》2002,28(7-8):1111-1139
Multimedia processing is becoming increasingly important, with a wide variety of applications ranging from multimedia cell phones to high-definition interactive television. Media processing techniques typically involve the capture, storage, manipulation and transmission of multimedia objects such as text, handwritten data, audio objects, still images, 2D/3D graphics, animation and full-motion video. A number of implementation strategies have been proposed for processing multimedia data. These approaches can be broadly classified into two major categories, namely (i) general-purpose processors with programmable media processing capabilities, and (ii) dedicated implementations (ASICs). We have performed a detailed complexity analysis of the recent multimedia standard (MPEG-4), which has shown the potential for reconfigurable computing that adapts the underlying hardware dynamically in response to changes in the input data or processing environment. We therefore propose a methodology for designing a reconfigurable media processor. This involves hardware–software co-design implemented in the form of a parser, profiler, recurring pattern analyzer, and spatial and temporal partitioner. The proposed methodology enables efficient partitioning of resources for complex and time-critical multimedia applications.

11.
Coarse-grained reconfigurable architectures can enhance the performance of critical loops and computation-intensive functions. Such architectures need efficient compilation techniques to map algorithms onto customized architectural configurations. A new compilation approach uses a generic reconfigurable architecture to tackle the memory bottleneck that typically limits the performance of many applications.

12.
This paper presents a new parallelization model, called coarse-grained thread pipelining, for exploiting speculative coarse-grained parallelism from general-purpose application programs in shared-memory multiprocessor systems. This parallelization model, which is based on the fine-grained thread pipelining model proposed for the superthreaded architecture, allows concurrent execution of loop iterations in a pipelined fashion with runtime data-dependence checking and control speculation. The speculative execution combined with the runtime dependence checking allows the parallelization of a variety of program constructs that cannot be parallelized with existing runtime parallelization algorithms. The pipelined execution of loop iterations in this new technique results in lower parallelization overhead than in other existing techniques. We evaluated the performance of this new model using some real applications and a synthetic benchmark. These experiments show that programs with a sufficiently large grain size compared to the parallelization overhead obtain significant speedup using this model. The results from the synthetic benchmark provide a means for estimating the performance that can be obtained from application programs that will be parallelized with this model. The library routines developed for this thread pipelining model are also useful for evaluating the correctness of the code generated by the superthreaded compiler and for debugging and verifying the simulator for the superthreaded processor.

13.
14.
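The CPU/FPGA hybrid architecture is a common structure in reconfigurable computing. To simplify the use of the FPGA in such an architecture, a hardware-thread approach is proposed and an execution mechanism for hardware threads is designed, so that reconfigurable resources are used in the form of hardware threads. Software and hardware threads can execute in parallel as multiple threads through shared data storage: the computation-intensive parts of a program run as hardware threads on the FPGA, while the control-intensive parts run as software threads on the CPU. On a hybrid-architecture platform simulated with Simics, experiments on DES, MD5SUM, and merge sort converted to software/hardware multithreading show an average execution speedup of 2.30, effectively exploiting the computing performance of the CPU/FPGA hybrid architecture.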

15.
High-definition video applications often require heavy computation, high bandwidth, and large amounts of memory, which makes their real-time implementation difficult. Multi-core architectures with parallelism provide new solutions for implementing complex multimedia applications in real time. It is well known that the speed of the H.264 encoder can be increased on a multi-core architecture through parallelism. Most of the parallelization methods proposed earlier for this purpose suffer from limited scalability and data dependencies. In this paper, we present results obtained using data-level parallelism at the Group-Of-Pictures (GOP) level for the video encoder. The proposed technique encodes each GOP independently and is implemented on JM 18.0 using advanced data structures and OpenMP programming techniques. The performance of the parallelized video encoder is evaluated for various resolutions based on parameters such as encoding speed, bit rate, memory requirements, and PSNR. The results show that with GOP-level parallelism, very high speedup values can be achieved without much degradation in video quality.
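The idea of GOP-level parallelism described in this abstract can be sketched in a few lines of Python. The paper itself uses OpenMP on the JM 18.0 reference encoder; the sketch below is only a toy analogue under loud assumptions: frames are single integers, `encode_gop` is a hypothetical stand-in for a real encoder, and a thread pool is used purely to keep the example short (a CPU-bound encoder in Python would more likely use processes).

```python
from concurrent.futures import ThreadPoolExecutor

def split_into_gops(frames, gop_size):
    """Cut the frame sequence into consecutive groups of pictures."""
    return [frames[i:i + gop_size] for i in range(0, len(frames), gop_size)]

def encode_gop(gop):
    """Toy encoder: first frame stored as-is ("I"), the rest as
    differences from the previous frame in the same GOP ("P")."""
    encoded = [("I", gop[0])]
    for prev, cur in zip(gop, gop[1:]):
        encoded.append(("P", cur - prev))
    return encoded

def encode_parallel(frames, gop_size, workers=4):
    """Encode every GOP independently and concatenate in display order."""
    gops = split_into_gops(frames, gop_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in input order, so GOP order is preserved.
        encoded_gops = list(pool.map(encode_gop, gops))
    return [item for gop in encoded_gops for item in gop]

frames = [10, 12, 15, 15, 20, 21, 30]  # toy "frames": one integer each
stream = encode_parallel(frames, gop_size=3)
print(stream)
```

Because no GOP refers to frames outside itself, the workers never need to communicate, which is what makes this level of parallelism simple to scale. The trade-off is that every GOP starts with a full "I" frame, so shorter GOPs give more parallel work units but a larger output.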

16.
傅丽丽  曾国荪 《计算机科学》2010,37(11):302-306
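The N-body problem is a classic dynamics problem that is widely applied in many fields. As the problem size grows, however, the demand on computing performance becomes the main obstacle to its study. FPGA reconfigurable technology has recently become a focus of high-performance computing because of its hardware-programmable structure and highly parallel processing capability. Taking FPGA-accelerated solution of the N-body problem as an example, this paper describes a new approach to solving computation-intensive tasks.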
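To show why the N-body problem becomes computation-intensive as it grows, here is a minimal Python sketch of the textbook direct-sum force calculation, in which every body interacts with every other body, so the work grows as O(N²). This is not the paper's FPGA design; the function name, the unit gravitational constant, and the small softening term added to the squared distance are all assumptions made for illustration.

```python
def accelerations(positions, masses, g=1.0, softening=1e-9):
    """Direct-sum gravitational accelerations for bodies in 3-D.

    positions: list of [x, y, z]; masses: list of floats.
    Every pair is visited, so the cost is O(N^2) per evaluation.
    """
    n = len(positions)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            # Vector from body i to body j.
            d = [positions[j][k] - positions[i][k] for k in range(3)]
            dist_sq = sum(c * c for c in d) + softening
            inv_dist_cubed = dist_sq ** -1.5
            for k in range(3):
                acc[i][k] += g * masses[j] * d[k] * inv_dist_cubed
    return acc

# Two bodies one unit apart on the x axis, masses 1 and 2.
acc = accelerations([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]], [1.0, 2.0])
print(acc)
```

Each iteration of the inner loop is independent of the others for a given body, which is the kind of regular, data-parallel work that maps well onto parallel hardware such as an FPGA.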

17.
New standards in signal, multimedia, and network processing for embedded electronics are characterized by computationally intensive algorithms and a need for high flexibility due to rapidly changing specifications. In order to meet the demanding challenges of increasing computational requirements and stringent constraints on area and power consumption in embedded engineering, there is a gradual trend towards coarse-grained parallel embedded processors. Furthermore, such processors are equipped with dynamic reconfiguration features to support time- and space-multiplexed execution of the algorithms. However, the formidable problem of efficiently mapping applications (mostly loop algorithms) onto such architectures has hindered their mass acceptance. In this paper we present (a) a highly parameterizable, tightly coupled, and reconfigurable parallel processor architecture together with the corresponding power breakdown and reconfiguration time analysis of a case study application, (b) a retargetable methodology for mapping loop algorithms, (c) a co-design framework for modeling, simulation, and programming of such architectures, and (d) loosely coupled communication with a host processor.

18.
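To effectively address the problem that computation-intensive graphics operations in graphical terminal application protocols, such as multimedia playback, degrade terminal service quality, this paper analyzes the multimedia playback process in terminal application protocols and, through negotiation of multimedia processing capability between server and terminal, proposes an adaptive scheme that distributes computation-intensive graphics operations between the two. In the scheme, the control process of multimedia playback and the computation-intensive graphics operations such as video decoding are implemented on the server and on the terminal, respectively. Experimental results show that the scheme markedly reduces server load and improves multimedia playback quality in terminal applications.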

19.
Recently, a new video coding standard called HEVC has been developed to address the challenges of today's media market; on average, it halves the bit stream size produced by the former video coding standard, H.264/AVC, at the same video quality. However, the computing requirements for encoding video with this improved compression efficiency have increased significantly. In this paper, we focus on applying parallel processing techniques to the HEVC encoder to significantly reduce the computational power requirements without disturbing the coding efficiency. We propose several parallelization approaches to the HEVC encoder that are well suited to multicore architectures. Our proposals use the OpenMP programming paradigm at a coarse-grain parallelization level which we call the GOP-based level. GOP-based approaches encode several groups of consecutive frames simultaneously. How these GOPs are formed and distributed is critical to obtaining good parallel performance, also taking into account the level of coding efficiency degradation. The results show that near-ideal efficiencies are obtained using up to 12 cores.

20.
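With the rise of multimedia technology, presentation-based teaching is increasingly replacing traditional teaching. In this mode, the main bottleneck in transmitting multimedia information is video; video, with its huge data volume, is a major obstacle to the development of multimedia technology. Compressed video is easier to store, and compression does not affect the final visual quality of the work. To this end, this paper systematically studies how to record presentation content using a stream of timeline + event + frame, so as to compress the video data as much as possible and achieve the best balance between file size and picture quality.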
