
OLAP Foreign Key Join Algorithm for MIC Coprocessors
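Cite this article: ZHANG Yu, ZHANG Yan-Song, CHEN Hong, WANG Shan. OLAP foreign key join algorithm for MIC coprocessors [J]. Journal of Software, 2017, 28(3): 490-501.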
Authors: ZHANG Yu, ZHANG Yan-Song, CHEN Hong, WANG Shan
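Affiliations: National Satellite Meteorological Center, China Meteorological Administration, Beijing 100081; Key Laboratory of Data Engineering and Knowledge Engineering (Renmin University of China), Ministry of Education, Beijing 100872; School of Information, Renmin University of China, Beijing 100872; National Survey Research Center, Renmin University of China, Beijing 100872; Key Laboratory of Data Engineering and Knowledge Engineering (Renmin University of China), Ministry of Education, Beijing 100872; School of Information, Renmin University of China, Beijing 100872; Key Laboratory of Data Engineering and Knowledge Engineering (Renmin University of China), Ministry of Education, Beijing 100872; School of Information, Renmin University of China, Beijing 100872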
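Funding: National High Technology Research and Development Program of China (863 Program) (2015AA015307); Fundamental Research Funds for the Central Universities (16XNLQ02); Huawei Innovation Research Program (HIRP20140507, HIRP20140510)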
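Abstract: The many-core Xeon Phi coprocessor has become an emerging mainstream platform for high-performance computing. For database applications, in-memory analytical processing is a compute-intensive workload whose performance depends mainly on in-memory foreign key joins between the large fact table and the dimension tables. This paper focuses on a cache-friendly foreign key join algorithm, as opposed to the cache-conscious partitioned hash join and the cache-oblivious no-partitioning hash join, that suits the small LLC and highly concurrent threads of the Xeon Phi coprocessor. By exploiting the surrogate key feature of OLAP schemas, hash probing based on key matching can be further simplified to surrogate key referencing between the fact table and the dimension tables under the primary key-foreign key referential integrity constraint. The complex hash table and the CPU-expensive hash probe are thus replaced by direct access to a surrogate vector, with each foreign key value mapped to a memory offset in that vector. The foreign key join algorithm based on surrogate vector referencing can be applied simply and efficiently on the Xeon Phi coprocessor platform, using more cores and highly concurrent threads to hide memory access latency. In the experiments, traditional hash join algorithms (the no-partitioning hash join and the radix-partitioned hash join) and the surrogate vector referencing foreign key join are tested and compared on a Xeon E5-2650 v3 10-core processor platform and a Xeon Phi 5110P 60-core coprocessor platform. The results give a comprehensive picture of how mainstream in-memory foreign key join algorithms perform on different datasets and different platforms.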

Keywords: in-memory OLAP; foreign key join; surrogate key; surrogate key referencing
Received: 2016/7/18
Revised: 2016/9/14

OLAP Foreign Join Algorithm for MIC Coprocessor
ZHANG Yu, ZHANG Yan-Song, CHEN Hong and WANG Shan. OLAP Foreign Join Algorithm for MIC Coprocessor [J]. Journal of Software, 2017, 28(3): 490-501.
Authors: ZHANG Yu, ZHANG Yan-Song, CHEN Hong and WANG Shan
Affiliation: National Satellite Meteorological Center, China Meteorological Administration, Beijing 100081, China; Key Laboratory of Data Engineering and Knowledge Engineering (Renmin University of China), Beijing 100872, China; School of Information, Renmin University of China, Beijing 100872, China; National Survey Research Center, Renmin University of China, Beijing 100872, China; Key Laboratory of Data Engineering and Knowledge Engineering (Renmin University of China), Beijing 100872, China; School of Information, Renmin University of China, Beijing 100872, China; Key Laboratory of Data Engineering and Knowledge Engineering (Renmin University of China), Beijing 100872, China; School of Information, Renmin University of China, Beijing 100872, China
Abstract: The emerging Many Integrated Core (MIC) architecture Xeon Phi coprocessor is becoming a mainstream platform for high-performance computing. For database applications, in-memory analytics is a compute-intensive workload in which in-memory foreign key joins between the big fact table and the dimension tables dominate OLAP performance. This paper focuses on a cache-friendly foreign key join, as opposed to the cache-conscious radix-partitioning hash join and the cache-oblivious no-partitioning hash join, to suit the small LLC size and the massive simultaneous multi-threading mechanism of the Xeon Phi coprocessor. By exploiting the characteristics of surrogate keys in OLAP schemas, key-matching hash probing can be further simplified to surrogate key referencing between the fact table and the dimension tables under the PK-FK referential constraint, so that the complex hash table and the CPU-cycle-consuming hash probe are replaced by direct references into a surrogate vector, with each foreign key mapped to an offset address in that vector. The surrogate vector referencing foreign key join is simple and efficient to implement on the Xeon Phi coprocessor, where more cores and massive simultaneous multi-threading overlap memory access latency. In the experiments, the surrogate vector referencing foreign key join algorithm and traditional hash join algorithms (NPO and PRO) are compared on both a Xeon E5-2650 v3 10-core CPU platform and a Xeon Phi 5110P 60-core platform. The experimental results give a comprehensive view of how mainstream in-memory foreign key join algorithms perform with different datasets on different platforms.
Keywords: in-memory OLAP; foreign key join; surrogate key; surrogate vector referencing
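
The core idea in the abstract, replacing a hash probe with a direct offset lookup into a surrogate vector, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the assumption that surrogate keys are dense integers starting at 1, and the single payload column per dimension row are all hypothetical choices made for illustration. A dictionary-based hash join is included only as a baseline for comparison.

```python
def build_surrogate_vector(dim_rows):
    """Build a vector indexed by surrogate key.

    Assumption (not from the paper): dim_rows is a list of
    (surrogate_key, payload) pairs whose keys are the dense
    integers 1..n, so key k maps to offset k - 1.
    """
    vec = [None] * len(dim_rows)
    for key, payload in dim_rows:
        vec[key - 1] = payload
    return vec


def surrogate_vector_join(fact_fks, vec):
    """Join by direct referencing: each foreign key is an offset.

    No hashing and no key comparison are needed, because the
    PK-FK constraint guarantees every foreign key is a valid key.
    """
    return [vec[fk - 1] for fk in fact_fks]


def hash_join(fact_fks, dim_rows):
    """Baseline: build a hash table on the dimension, then probe it."""
    table = dict(dim_rows)
    return [table[fk] for fk in fact_fks]
```

Given a dimension such as `[(1, "ASIA"), (2, "EUROPE"), (3, "AMERICA")]` and a fact-table foreign key column `[3, 1, 1, 2]`, both functions return the same joined payloads; the surrogate vector version reaches each one with a single array access.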