A fully pipelined single-precision floating-point unit in the synergistic processor element of a CELL processor |
| |
Authors: | Hwa-Joon Oh, Mueller SM, Jacobi C, Tran KD, Cottier SR, Michael BW, Nishikawa H, Totsuka Y, Namatame T, Yano N, Machida T, Dhong SH |
| |
Affiliation: | IBM Systems & Technology Group, Austin, TX, USA |
| |
Abstract: | The floating-point unit (FPU) in the synergistic processor element (SPE) of a CELL processor is a fully pipelined 4-way single-instruction multiple-data (SIMD) unit designed to accelerate media and data streaming with 128-bit operands. It supports 32-bit single-precision floating-point and 16-bit integer operands with two latencies, six and seven cycles, at a delay of 11 FO4 per stage. The FPU optimizes the performance of critical single-precision multiply-add operations. Since exact rounding, exceptions, and denormalized-number handling are not important to multimedia applications, IEEE correctness on single-precision floating-point numbers is sacrificed for performance and design simplicity. The FPU employs fine-grained clock gating to save power. The design has 768K transistors in 1.3 mm², fabricated in 90-nm SOI technology. Correct operation has been observed up to 5.6 GHz at 1.4 V and 56 °C, delivering 44.8 GFlops. Architecture, logic, circuits, and integration are codesigned to meet the performance, power, and area goals. |
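The 44.8 GFlops figure in the abstract is consistent with a simple peak-rate calculation. The sketch below assumes one 4-way SIMD multiply-add issued per cycle, with each multiply-add counted as two floating-point operations. The abstract does not spell out this breakdown, so treat it as an assumption; the function and parameter names are hypothetical.

```python
def peak_gflops(clock_ghz: float, simd_lanes: int = 4, flops_per_lane: int = 2) -> float:
    """Peak throughput in GFlops for a fully pipelined SIMD multiply-add unit.

    Assumptions (not stated in the abstract):
      - one SIMD instruction completes per cycle (fully pipelined),
      - each lane's multiply-add counts as 2 floating-point operations.
    """
    return clock_ghz * simd_lanes * flops_per_lane


# 5.6 GHz clock and 4 SIMD lanes, both taken from the abstract.
peak = peak_gflops(5.6)
print(f"{peak:.1f} GFlops")
```

Under these assumptions the calculation is 5.6 GHz x 4 lanes x 2 operations = 44.8 GFlops, which matches the reported number.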
| |