Similar articles
 Found 20 similar articles (search time: 0 ms)
1.
An analog CMOS chip set for implementing artificial neural networks (ANNs) has been fabricated and tested. The chip set consists of two cascadable chips: a neuron chip and a synapse chip. Neurons on the neuron chips can be interconnected at random via synapses on the synapse chips, thus implementing an ANN with arbitrary topology. The neuron test chip contains an array of 4 neurons with well-defined hyperbolic tangent activation functions, which are implemented using parasitic lateral bipolar transistors. The synapse test chip is a cascadable 4×4 matrix-vector multiplier with variable matrix elements of 10-bit resolution. The propagation delay of the test chips was measured at 2.6 μs per layer.
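The layer described in this abstract can be sketched in software as a quantized matrix-vector product followed by tanh neurons. This is only a behavioral sketch: the function names `quantize` and `layer`, the weight range `w_max`, and the symmetric rounding scheme are assumptions for illustration, not details taken from the chips.

```python
import math

def quantize(w, bits=10, w_max=1.0):
    """Clamp w to [-w_max, w_max] and round it to a symmetric grid.

    With bits=10 the grid has 511 steps on each side of zero
    (an assumed scheme; the abstract only states 10-bit resolution).
    """
    levels = 2 ** (bits - 1) - 1
    w = max(-w_max, min(w_max, w))
    return round(w / w_max * levels) / levels * w_max

def layer(weights, inputs, bits=10):
    """One layer: quantized matrix-vector product, then tanh activation."""
    outputs = []
    for row in weights:
        s = sum(quantize(w, bits) * x for w, x in zip(row, inputs))
        outputs.append(math.tanh(s))
    return outputs
```

Cascading several such calls, one per layer, mirrors how the cascadable chips would be chained.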

2.
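Based on an extensive survey and an analysis of the relevant literature from the past decade or so, this paper reviews research progress on random access techniques that support multimedia services, covering the design ideas behind random access protocols and multimedia service types as well as three commonly used methods for classifying multimedia services. It describes in detail the common design ideas behind two classes of random access protocols that support multimedia services: those based on access thresholds and those based on retransmission backoff. It also proposes important future research directions for random access techniques that support multimedia services.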

3.
Cache performance is strongly influenced by the type of locality embodied in programs. In particular, multimedia programs handling images and videos are characterized by a bidimensional spatial locality, which is not adequately exploited by standard caches. In this paper we propose novel cache prefetching techniques for image data, called neighbor prefetching, able to improve exploitation of bidimensional spatial locality. A performance comparison is provided against other assessed prefetching techniques on a multimedia workload (with MPEG-2 and MPEG-4 decoding, image processing, and visual object segmentation), including a detailed evaluation of both the miss rate and the memory access time. Results show that neighbor prefetching achieves a significant reduction in the time due to delayed memory cycles (more than 97% on MPEG-4, compared with 75% for the second-best technique). This reduction leads to a substantial speedup on the overall memory access time (up to 140% for MPEG-4). Performance has been measured with the PRIMA trace-driven simulator, specifically devised to support cache prefetching.
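To illustrate what bidimensional spatial locality means for a prefetcher, the sketch below computes the cache blocks directly above and below the block that holds a given pixel address in a row-major image. It is not the paper's neighbor prefetching algorithm, whose details the abstract does not give; the function name, its parameters, and the assumption that `row_bytes` is a multiple of `block_bytes` are all hypothetical.

```python
def neighbor_blocks(addr, base, row_bytes, block_bytes, height):
    """Return block-aligned addresses vertically adjacent to addr's block.

    Assumes a row-major image starting at `base`, with `row_bytes` a
    multiple of `block_bytes` (hypothetical layout for illustration).
    """
    offset = addr - base
    row = offset // row_bytes
    col_block = (offset % row_bytes) // block_bytes
    neighbors = []
    for r in (row - 1, row + 1):
        if 0 <= r < height:  # skip rows outside the image
            neighbors.append(base + r * row_bytes + col_block * block_bytes)
    return neighbors
```

A standard sequential prefetcher would fetch only the next block in the same row; the vertical neighbors returned here are the ones it misses.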

4.
Lu Xu, Ding Hongwei, Yang Zhijun, Bao Liyong, Wang Liqing, Liu Qianlin. Multimedia Tools and Applications, 2020, 79(23-24): 16547-16571
Multimedia Tools and Applications - In recent years, multimedia video services have developed rapidly. However, the increasing number and variety of videos is likely to cause packet loss and delay...

5.
Due to the ever-increasing resolution and frame rate of mainstream video sequences, memory access has become the main performance bottleneck of video decoding. To reduce the required off-chip memory, many decoders employ an on-chip cache. However, they cannot distinguish whether a data block is reusable because they lack information about undecoded Macro Blocks (MBs), and thus often evict reusable data from the cache while preserving non-reusable data, which wastes off-chip memory bandwidth. In this paper, we manage to make full use of the cache from a novel perspective, i.e., an auxiliary bitstream. Concretely, since the memory access behavior of video decoding is determined during video encoding, the encoder can pack the memory access behaviors of video decoding into an auxiliary bitstream, which informs the decoder whether a data block will be reused by future MBs. Hence, such an auxiliary stream can enable optimal management of the cache. To effectively reduce the size of the auxiliary bitstream, we propose an Auxiliary Prior Information Coding (APIC) method that complies with the current video standards. For future video standards, we introduce a Super Block scan Order (SBO) for MB organization to further reduce the bitrate overhead of the auxiliary bitstream. These ideas are evaluated on a number of representative video sequences. The additional prior information can reduce the required off-chip memory bandwidth for motion compensation by over 35% (for a 60 kB cache), while causing a bitrate increase of less than 2.3% for high definition (HD) videos.

6.
High definition (HD) and ultra-high definition (UHD) digital TV require high-resolution images, and the large volume of data transfers between processors and memory devices often becomes the bottleneck of the system. Video and image signal processing usually requires square or rectangular blocks of pixel data. Accessing such blocks requires frequent precharging and activation of new rows, which results in extra latencies for reading and writing pixel data in memory devices. This paper proposes an efficient memory controller for video and image processing to reduce the latencies for reading and writing blocks of pixel data. The controller stores a frame of pixel data by distributing contiguous lines of pixel data to multiple banks in sequence. Its efficiency is further enhanced with an interface protocol such as AMBA AXI, in which outstanding transactions are allowed. Memory controllers based on the proposed scheme are designed, and their performance and efficiency are compared with previous work.
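The idea of distributing contiguous lines of pixel data to multiple banks in sequence can be sketched with a simple modulo mapping. The mapping below is an assumption for illustration (the abstract does not give the controller's exact address mapping), and it further assumes each image line fits in one row of a bank.

```python
def map_line(y, num_banks):
    """Map image line y to (bank, row_in_bank) using round-robin interleaving."""
    return y % num_banks, y // num_banks

def banks_for_block(y0, height, num_banks):
    """List the bank touched by each line of a block starting at line y0."""
    return [map_line(y, num_banks)[0] for y in range(y0, y0 + height)]
```

With four banks, a block four lines tall touches four different banks, so the row activations can overlap instead of serializing on a single bank.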

7.
This article considers the efficient application of multiple march tests, realized by generating different address sequences, to detect faults in random access memory. A new algorithm for generating address sequences is proposed, and the efficiency gained from its application is evaluated. Experimental results on the use of the proposed method for generating address sequences are presented, and the efficiency of the method in multiple-run random access memory tests is demonstrated.
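As background for readers unfamiliar with march tests, the sketch below runs the well-known MATS+ march test, {any(w0); up(r0, w1); down(r1, w0)}, over a caller-supplied address sequence, so the same test can be repeated with different sequences. The `FaultyMemory` model and the example sequences are hypothetical; the article's own address generation algorithm is not described in the abstract and is not reproduced here.

```python
class FaultyMemory:
    """One-bit-per-cell memory model with optional stuck-at faults (hypothetical)."""

    def __init__(self, size, stuck_at=None):
        self.cells = [0] * size
        self.stuck_at = dict(stuck_at or {})  # address -> forced value

    def write(self, addr, value):
        self.cells[addr] = self.stuck_at.get(addr, value)

    def read(self, addr):
        return self.stuck_at.get(addr, self.cells[addr])

def mats_plus(mem, order):
    """Run MATS+ using `order` as the ascending address sequence.

    Returns the sorted addresses at which a read returned the wrong value.
    """
    failures = set()
    for a in order:                 # element 1: write 0 everywhere
        mem.write(a, 0)
    for a in order:                 # element 2: read 0, write 1, ascending
        if mem.read(a) != 0:
            failures.add(a)
        mem.write(a, 1)
    for a in reversed(order):       # element 3: read 1, write 0, descending
        if mem.read(a) != 1:
            failures.add(a)
        mem.write(a, 0)
    return sorted(failures)
```

A multiple-run test would call `mats_plus` several times, each time with a different permutation of the addresses, for example `list(range(8))` and then `[(5 * a + 1) % 8 for a in range(8)]`.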

8.
9.
10.

With the rapid development of cloud computing and mobile networks, multimedia content can be accessed conveniently. Recently, some novel intelligent caching-based approaches have been proposed to improve the memory architectures for multimedia applications. These applications often face bottleneck-related challenges that result in performance degradation and service delays. Intelligent multimedia network applications access shared data through a specific network file system. This exposes them to the processing constraints of hard-drive storage and may introduce bottlenecks. Therefore, to improve the performance of these multimedia network applications, we present an intelligent distributed memory caching system. We integrate the multimedia application message passing interface in a multi-threaded environment and propose an algorithm that can handle concurrent response behavior for different multimedia applications. Results demonstrate that our proposed scheme outperforms traditional approaches in terms of throughput and file read access.

11.
The recent emergence of multimedia services, such as Broadcast TV and Video on Demand over traditional twisted pair access networks, has complicated network management, which must guarantee a decent Quality of Experience (QoE) for each user. The huge number of services and the wide variety of service specifics require QoE management on a per-user and per-service basis. This complexity can be tackled through the design of an autonomic QoE management architecture. In this article, the Knowledge Plane is presented as an autonomic layer that optimizes the QoE in multimedia access networks from the service originator to the user. It autonomously detects network problems, e.g. a congested link or bit errors on a link, and determines an appropriate corrective action, e.g. switching to a lower bit rate video or adding an appropriate number of FEC packets. The generic Knowledge Plane architecture is discussed, incorporating the triple design goal of an autonomic, generic and scalable architecture. The viability of an implementation using neural networks is investigated by comparing it with a reasoner based on analytical equations. Performance results for both reasoners are presented in terms of both QoS and QoE metrics.

12.
Many current graphical display systems use a buffer memory system to hold a two-dimensional image array to be modified and displayed. To speed up updates, the buffer memory system must access many image points within an image subarray in parallel. This paper proposes an efficient buffer memory system for a fast, high-resolution graphical display system. The memory system provides parallel access to pq image points within a block (p×q), horizontal (1×pq), vertical (pq×1), forward-diagonal, or backward-diagonal subarray of a two-dimensional M×N image array, where the design parameters p and q are both powers of two. In the address calculation and routing circuit of the proposed buffer memory system, the address differences of the five subarray types are prearranged according to the index numbers of the memory modules and stored in two static random access memories (SRAMs), so that the address differences are simply added to the base address to obtain the addresses according to the index numbers of the memory modules. In addition, for fast address calculation, the single multiplication operation in the base address calculation is replaced by an SRAM access, so that the multiplication operation can be performed during the SRAM access for the address differences for the case when N is not a power of two. Compared with previous circuits, the proposed address calculation and routing circuit improves on hardware cost, control complexity, and speed.

13.
This paper describes a network-based video capture and processing peripheral, called the Vidboard, for a distributed multimedia system centered around a 1-Gbit/s asynchronous transfer mode (ATM) network. The Vidboard is capable of generating full-motion video streams having a range of presentation (picture size, color space, etc.) and network (traffic, transport, etc.) characteristics. The board is also capable of decoupling video from the real-time constraints of the television world, which allows easier integration of video into the software environment of computer systems. A suite of ATM-based protocols has been developed for transmitting video from the Vidboard to a workstation, and a series of experiments are presented in which video is transmitted to a workstation for display.

14.
In this paper, we present a novel lateral cell design for phase change random access memory (PCRAM) that features improved thermal management to efficiently reduce the current consumption during the reset operation. Simulation, fabrication and electrical characterization results of the lateral concept are presented.

15.
In this paper, an adaptive random access strategy is presented for multi-channel relaying networks to address the random access of non-real-time (NRT) services. In the proposed scheme, NRT services access the base station (BS) by first accessing the nearest relay node (RN). When a collision occurs, for the sake of fast and efficient access, the user begins a frequency-domain backoff rather than retrying at random in the time domain. A notable feature of this scheme is that the RN adaptively determines the maximum allowed frequency backoff window at each access period, according to the new arrival rate as well as the number of available access channels. Moreover, to alleviate the interference caused by sub-channel reuse among RNs, a fractional frequency reuse scheme is also considered. The analysis and numerical results demonstrate that our scheme achieves higher throughput, lower collision probability and lower access delay than conventional slotted Aloha as well as the scheme without frequency backoff window adaptation.

16.
With the rapidly growing complexity of 3D applications, the memory subsystem has become the most bandwidth-exhausting bottleneck in a Graphics Processing Unit (GPU). To produce realistic images, tens to hundreds of thousands of primitives are used. Furthermore, each primitive generates thousands of pixels, and these pixels are computed by shaders with special effects, which may even blend multiple texture pixels from external memory to obtain a final color. To hide the long latency of texture operations, the shaders are usually highly multithreaded to increase throughput. However, conventional memory scheduling mechanisms are unaware of the producer-consumer relationship between primitives and pixels. They either assume that all initiators are independent or use a fixed priority scheme. This paper proposes Demand Look-Ahead (DLA) memory access scheduling, which is based on the status of each unit in the GPU and dynamically generates priorities for the memory request scheduler. By considering the producer-consumer relationship, the proposed mechanism reschedules the most urgent requests to be serviced first. Experimental results show that the proposed DLA improves FPS and IPC by 1.47% and 1.44%, respectively, over First-Ready First-Come-First-Serve (FR-FCFS). By integrating DLA with Bank-level Parallelism Awareness (BPA), DLA-BPA improves FPS and IPC by 7.28% and 6.55%, respectively. Furthermore, DLA-BPA improves shader thread performance by 22.06% and increases the attainable bandwidth by 5.91%.
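For context, the FR-FCFS baseline named in this abstract can be sketched as follows: among pending requests, those that hit the currently open row of their bank go first, and ties are broken by arrival order. This sketch covers only the baseline policy, not DLA or DLA-BPA; the `Request` fields and the `open_rows` dictionary are hypothetical names chosen for illustration.

```python
from collections import namedtuple

# arrival: arrival order (lower is older); bank and row: DRAM target
Request = namedtuple("Request", "arrival bank row")

def fr_fcfs_pick(queue, open_rows):
    """Pick the next request under FR-FCFS.

    open_rows maps bank -> currently open row (banks with no open row are absent).
    Row hits are preferred; among equals, the oldest request wins.
    """
    def key(req):
        row_hit = open_rows.get(req.bank) == req.row
        return (not row_hit, req.arrival)
    return min(queue, key=key)
```

A scheduler that is aware of producer-consumer urgency, as the abstract proposes, would add a further term to this priority key; how DLA computes that term is not stated in the abstract.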

17.
This paper proposes a neuromorphic analog CMOS controller for interlimb coordination in quadruped locomotion. Animal locomotion, such as walking, running, swimming, and flying, is based on periodic rhythmic movements. These rhythmic movements are driven by a biological neural network called the central pattern generator (CPG). In recent years, many researchers have applied CPGs to locomotion controllers in robotics. However, most of these have been developed with digital processors and thus have several problems, such as high power consumption. To overcome such problems, a CPG controller based on an analog CMOS circuit is proposed. Since the CMOS transistors in the circuit operate in their subthreshold region and under a low supply voltage, the controller can reduce power consumption. Moreover, low-cost production and miniaturization of controllers are expected. We have shown through computer simulation that the circuit can generate several periodic rhythmic patterns and switch between these patterns promptly.

18.
Mobile e-health applications provide users and healthcare practitioners with an insightful way to check users'/patients' status and monitor their daily calorie intake and activities. This paper proposes a cloud-based mobile e-health calorie system that can classify the food objects on a plate and compute the overall calories of each food object with high accuracy. The novelty of our system is that we not only offload the heavy computational functions of the system to the cloud, but also employ an intelligent cloud-broker mechanism to strategically and efficiently utilize cloud instances to provide accurate results with improved response time. The broker system uses a dynamic cloud allocation mechanism that decides on allocating and de-allocating cloud instances in real time to ensure that the average response time stays within a predefined threshold. We further demonstrate various scenarios to explain the workflow of the cloud components, including segmentation, deep learning, indexing of food images, decision making algorithms, calorie computation, and scheduling management, as part of the proposed cloud broker model. The implementation results show that the proposed cloud broker yields a 45% gain in the overall time taken to process the images in the cloud. With the dynamic cloud allocation mechanism, we were able to reduce the average time consumption by 77.21% when 60 images were processed in parallel.

19.
An adaptive image interpolation algorithm for image/video processing   Total citations: 6, self-citations: 0, citations by others: 6
Image interpolation is one of the key technologies in image/video processing. In this study, a new adaptive image interpolation algorithm is proposed. The objective of the proposed approach is to recover up-sampled image frames from the corresponding decimated (low-resolution) image frames. In each iteration of the proposed approach, two proposed nonlinear filters are used to iteratively regenerate the high-frequency components lost in the decimation procedure. Finally, a post-processing procedure is adopted to reduce the blocking artifacts in the interpolated images. Based on the experimental results obtained in this study, in terms of the average PSNRp (peak signal-to-noise ratio) in dB and a subjective measure of the quality of the interpolated images, the interpolation results of the proposed approach are better than those of the three existing interpolation approaches used for comparison.
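The peak signal-to-noise ratio used as the objective metric here is conventionally defined as PSNR = 10·log10(peak² / MSE), where MSE is the mean squared error between the reference and the reconstructed image. The sketch below implements that standard definition over flat pixel sequences; the function name, the default `peak=255` (8-bit pixels), and the flat-list input format are assumptions, and the study's exact PSNRp variant may differ.

```python
import math

def psnr(reference, test, peak=255):
    """Standard PSNR in dB between two equal-length pixel sequences."""
    if len(reference) != len(test) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(peak ** 2 / mse)
```

To score an interpolation method this way, the original full-resolution frame serves as `reference` and the interpolated frame as `test`.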

20.
1 Introduction Graph processing has received significant attention for its ability to cope with large-scale and complex unstructured data in the real world. However, most graph processing applications exhibit an irregular memory access pattern, which leads to poor locality in the memory access stream [1].
