
1.

Hot Chips Turns 20 [Guest editors' introduction]

Kozyrakis Christos Waerdt Jan-Willem van de 《Micro, IEEE》2009,29(2):4-5

The five papers in this special issue are extended versions of papers presented at the Hot Chips conference in August 2008.

2.

Hardware/compiler codevelopment for an embedded media processor

Kozyrakis C. Judd D. Gebis J. Williams S. Patterson D. Yelick K. 《Proceedings of the IEEE. Institute of Electrical and Electronics Engineers》2001,89(11):1694-1709

Embedded and portable systems running multimedia applications create a new challenge for hardware architects. A microprocessor for such applications needs to be easy to program like a general-purpose processor and have the performance and power efficiency of a digital signal processor. This paper presents the codevelopment of the instruction set, the hardware, and the compiler for the Vector IRAM media processor. A vector architecture is used to exploit the data parallelism of multimedia programs, which allows the use of highly modular hardware and enables implementations that combine high performance, low power consumption, and reduced design complexity. It also leads to a compiler model that is efficient both in terms of performance and executable code size. The memory system for the vector processor is implemented using embedded DRAM technology, which provides high bandwidth in an integrated, cost-effective manner. The hardware and the compiler for this architecture make complementary contributions to the efficiency of the overall system. This paper explores the interactions and tradeoffs between them, as well as the enhancements to a vector architecture necessary for multimedia processing. We also describe how the architecture, design, and compiler features come together in a prototype system-on-a-chip, able to execute 3.2 billion operations per second per watt.

3.

Scalable processors in the billion-transistor era: IRAM

Kozyrakis C.E. Perissakis S. Patterson D. Anderson T. Asanovic K. Cardwell N. Fromm R. Golbus J. Gribstad B. Keeton K. Thomas R. Treuhaft N. Yelick K. 《Computer》1997,30(9):75-78

Members of the University of California, Berkeley, argue that the memory system will be the greatest inhibitor of performance gains in future architectures. Thus, they propose the intelligent RAM, or IRAM. This approach greatly increases the on-chip memory capacity by using DRAM technology instead of much less dense SRAM memory cells. The resultant on-chip memory capacity, coupled with the high bandwidths available on chip, should allow cost-effective vector processors to reach performance levels much higher than those of traditional architectures. Although vector processors require explicit compilation, the authors claim that vector compilation technology is mature (having been used for decades in supercomputers) and, furthermore, that future workloads will contain more heavily vectorizable components.

4.

Models and Metrics to Enable Energy-Efficiency Optimizations

Rivoire S. Shah M.A. Ranganathan P. Kozyrakis C. Meza J. 《Computer》2007,40(12):39-48

Power consumption and energy efficiency are important factors in the initial design and day-to-day management of computer systems. Researchers and system designers need benchmarks that characterize energy efficiency to evaluate systems and identify promising new technologies. To predict the effects of new designs and configurations, they also need accurate methods of modeling power consumption.

5.

Scalable, vector processors for embedded systems

Kozyrakis C.E. Patterson D.A. 《Micro, IEEE》2003,23(6):36-45

For embedded applications with data-level parallelism, a vector processor offers high performance at low power consumption and low design complexity. Unlike superscalar and VLIW designs, a vector processor is scalable and can optimally match specific application requirements. To demonstrate that vector architectures meet the requirements of embedded media processing, we evaluate the Vector IRAM, or VIRAM (pronounced "V-IRAM"), architecture developed at UC Berkeley, using benchmarks from the Embedded Microprocessor Benchmark Consortium (EEMBC). Our evaluation covers all three components of the VIRAM architecture: the instruction set, the vectorizing compiler, and the processor microarchitecture. We show that a compiler can vectorize embedded tasks automatically without compromising code density. We also describe a prototype vector processor that outperforms high-end superscalar and VLIW designs by 1.5x to 100x for media tasks, without compromising power consumption. Finally, we demonstrate that clustering and modular design techniques let a vector processor scale to tens of arithmetic data paths before wide instruction-issue capabilities become necessary.

6.

Transactional Memory: The Hardware-Software Interface (total citations: 1; self-citations: 0; citations by others: 1)

McDonald A. Carlstrom B.D. Chung J. Minh C.C. Chafi H. Kozyrakis C. Olukotun K. 《Micro, IEEE》2007,27(1):67-76

As multicore chips become ubiquitous, the need to provide architectural support for practical parallel programming is becoming critical. Conventional lock-based concurrency control techniques are difficult to use, requiring the programmer to navigate through the minefield of coarse- versus fine-grained locks, deadlock, livelock, lock convoying, and priority inversion. This explicit management of concurrency is beyond the reach of the average programmer, threatening to waste the additional parallelism available with multicore architectures. This comprehensive architecture supports nested transactions, transactional handlers, and two-phase commit. The result is a seamless integration of transactional memory with modern programming languages and runtime environments.

7.

A case for intelligent RAM

Patterson D. Anderson T. Cardwell N. Fromm R. Keeton K. Kozyrakis C. Thomas R. Yelick K. 《Micro, IEEE》1997,17(2):34-44

Two trends call into question the current practice of fabricating microprocessors and DRAMs as different chips on different fabrication lines. The gap between processor and DRAM speed is growing at 50% per year, and the size and organization of memory on a single DRAM chip is becoming awkward to use, yet size is growing at 60% per year. Intelligent RAM, or IRAM, merges processing and memory into a single chip to lower memory latency, increase memory bandwidth, and improve energy efficiency. It also allows more flexible selection of memory size and organization, and promises savings in board area. This article reviews the state of microprocessors and DRAMs today, explores some of the opportunities and challenges for IRAMs, and finally estimates performance and energy efficiency of three IRAM designs.

8.

RAMP: Research Accelerator for Multiple Processors (total citations: 1; self-citations: 0; citations by others: 1)

Wawrzynek J. Patterson D. Oskin M. Shin-Lien Lu Kozyrakis C. Hoe J.C. Chiou D. Asanovic K. 《Micro, IEEE》2007,27(2):46-57

The RAMP project's goal is to enable the intensive, multidisciplinary innovation that the computing industry will need to tackle the problems of parallel processing. RAMP itself is an open-source, community-developed, FPGA-based emulator of parallel architectures. Its design framework lets a large, collaborative community develop and contribute reusable, composable design modules. Three complete designs (for transactional memory, distributed systems, and distributed-shared memory) demonstrate the platform's potential.