Upscaledb: Efficient integer-key compression in a key-value store using SIMD instructions |
| |
Affiliation: | 1. Department of Computer Sciences, Vrije Universiteit Amsterdam, De Boelelaan 1081, 1081HV Amsterdam, The Netherlands; 2. Departamento de Lenguajes y Sistemas Informáticos, University of Seville, Avda. Reina Mercedes 41012, Sevilla, Spain; 3. Department of Mathematics and Computer Science, Eindhoven University of Technology, PO Box 513, 5600MB Eindhoven, The Netherlands |
| |
Abstract: | Compression can sometimes improve performance by making more of the data available to the processors faster. We consider the compression of integer keys in a B+-tree index. For this purpose, systems such as IBM DB2 use variable-byte compression over differentially coded keys. We revisit this problem with various compression alternatives such as Google's VarIntGB, Binary Packing and Frame-of-Reference. In all cases, we describe algorithms that can operate directly on compressed data. Many of our alternatives exploit the single-instruction-multiple-data (SIMD) instructions supported by modern CPUs. We evaluate our techniques in a database environment provided by Upscaledb, a production-quality key-value database. Our best techniques are SIMD accelerated: they reduce memory usage while improving single-threaded speeds. In particular, a differentially coded SIMD binary-packing technique (BP128) can offer superior query speed (e.g., 40% better than an uncompressed database) while providing the best compression (e.g., by a factor of ten). For analytic workloads, our fast compression techniques offer compelling benefits. Our software is available as open source. |
| |
Keywords: | B+-tree Data compression Vectorization Key-value stores |
This article is indexed in ScienceDirect and other databases. |
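
The baseline named in the abstract, variable-byte compression over differentially coded keys, can be illustrated with a minimal Python sketch. This is not the Upscaledb or IBM DB2 implementation: the function names, the little-endian 7-bit byte layout, and the assumption of non-negative sorted keys are choices made here for illustration only, and the sketch is scalar rather than SIMD accelerated.

```python
from typing import Iterable, List


def encode_delta_varbyte(sorted_keys: Iterable[int]) -> bytes:
    """Differentially code sorted non-negative keys, then variable-byte encode.

    Hypothetical layout: each delta is split into 7-bit groups, least
    significant group first; the high bit of a byte is set when more
    groups follow.
    """
    out = bytearray()
    prev = 0
    for key in sorted_keys:
        delta = key - prev  # store the gap to the previous key, not the key
        if delta < 0:
            raise ValueError("keys must be non-negative and sorted")
        prev = key
        while delta >= 0x80:
            out.append((delta & 0x7F) | 0x80)  # continuation bit set
            delta >>= 7
        out.append(delta)  # final group, continuation bit clear
    return bytes(out)


def decode_delta_varbyte(data: bytes) -> List[int]:
    """Invert encode_delta_varbyte: rebuild deltas, then prefix-sum them."""
    keys: List[int] = []
    prev = 0
    delta = 0
    shift = 0
    for byte in data:
        delta |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7  # more groups of this delta follow
        else:
            prev += delta  # running sum restores the original key
            keys.append(prev)
            delta = 0
            shift = 0
    return keys


keys = [1000, 1003, 1010, 1500, 100000]
packed = encode_delta_varbyte(keys)
restored = decode_delta_varbyte(packed)
```

In this example the five keys occupy 9 bytes instead of the 20 bytes needed for five 32-bit integers, because small gaps between neighboring keys fit in one or two bytes each.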
|