Reducing division latency with reciprocal caches |
| |
Authors: | Stuart F. Oberman, Michael J. Flynn |
| |
Affiliation: | Computer Systems Laboratory, Department of Electrical Engineering, Stanford University, Stanford, CA 94305-9030, USA
|
| |
Abstract: | Floating-point division is generally regarded as a high-latency operation in typical floating-point applications. Many techniques exist for increasing division performance, often at the cost of increasing chip area, cycle time, or both. This paper presents two methods for reducing the latency of division. Using applications from the SPECfp92 and NAS benchmark suites, these methods are evaluated to determine their effects on overall system performance. The notion of recurring computation is presented, and it is shown how recurring division can be exploited using an additional, dedicated division cache. For multiplication-based division algorithms, reciprocal caches can be used to store recurring reciprocals. Results show that reciprocal caches can achieve nearly a twofold speedup in division performance for reasonable cache sizes. |
| |
This article is indexed in SpringerLink and other databases. |
|
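The reciprocal-cache idea in the abstract can be illustrated with a small software model. The sketch below is not the authors' design: the direct-mapped organization, the index function, the class name `ReciprocalCache`, and the cycle counts `miss_cycles` and `hit_cycles` are all assumptions chosen for illustration. It only shows the general mechanism, in which a recurring divisor finds its reciprocal in the cache and the division reduces to a single multiplication.

```python
import struct


class ReciprocalCache:
    """Illustrative direct-mapped cache from divisor bit patterns to reciprocals."""

    def __init__(self, num_entries=128, miss_cycles=10, hit_cycles=5):
        # num_entries, miss_cycles and hit_cycles are hypothetical values,
        # not figures taken from the paper.
        self.num_entries = num_entries
        self.miss_cycles = miss_cycles
        self.hit_cycles = hit_cycles
        self.tags = [None] * num_entries
        self.recips = [0.0] * num_entries
        self.hits = 0
        self.misses = 0
        self.cycles = 0

    @staticmethod
    def _bits(x):
        # Raw 64-bit IEEE 754 pattern of a double, used as the cache tag.
        return struct.unpack("<Q", struct.pack("<d", x))[0]

    def divide(self, dividend, divisor):
        tag = self._bits(divisor)
        # Assumed index function: fold the upper and lower halves together.
        index = (tag ^ (tag >> 32)) % self.num_entries

        if self.tags[index] == tag:
            # Hit: the reciprocal is already known, so only a multiply is needed.
            self.hits += 1
            self.cycles += self.hit_cycles
            recip = self.recips[index]
        else:
            # Miss: compute the reciprocal the slow way and remember it.
            self.misses += 1
            self.cycles += self.miss_cycles
            recip = 1.0 / divisor
            self.tags[index] = tag
            self.recips[index] = recip

        # Multiplying by a rounded reciprocal is not always identical to a
        # correctly rounded division; this model ignores that final rounding.
        return dividend * recip


if __name__ == "__main__":
    cache = ReciprocalCache()
    for x in (1.0, 3.0, 5.0):
        cache.divide(x, 4.0)  # the divisor 4.0 recurs after the first call
    print(cache.hits, cache.misses, cache.cycles)
```

In this model the benefit depends entirely on how often divisors recur and on the ratio of the two assumed latencies, which is the kind of question the benchmark-based evaluation described in the abstract addresses.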