Star join revisited: Performance internals for cluster architectures
Authors: Josep, Victor, Calisto, Josep-L.
Affiliation:

a. IBM Toronto Laboratory, 8200 Warden Avenue, Markham, ON, Canada L6G 1C7

b. Universitat Politècnica de Catalunya, DAMA-UPC and Computer Architecture Department, Jordi Girona 1–3, Campus Nord-UPC, Modul D6, E-08034 Barcelona, Spain

Abstract: Data warehouse workloads are crucial for the support of on-line analytical processing (OLAP). Coping with OLAP queries over the huge amounts of data involved calls for large parallel computers. The trend today is to use cluster architectures, which offer a reasonable balance between cost and performance. In such cases, applications must be tuned to minimize the amount of I/O and communication so that the global execution time is reduced as much as possible.

In this paper, we model and analyze the most up-to-date strategies for ad hoc star join query processing in a cluster of computers. We show that, for ad hoc query processing with a limited amount of resources available, these strategies still have room for improvement in terms of both I/O and inter-node data traffic. Our analysis concludes with the proposal of a hybrid solution that improves on both aspects compared to the previous techniques and shows near-optimal results in a broad spectrum of cases.

Keywords: Star join; Query processing; Data warehouses; Cluster architectures
This article is indexed in ScienceDirect and other databases.