Similar literature
1.
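Array OLAP Query Processing Technology Based on Vector Computing   Cited by: 1 (self-citations: 0, citations by others: 1)
Multicore and many-core processors have become the mainstream configuration of new large-memory computing platforms with strong parallel processing capability. Multicore processors follow optimization techniques centered on the size of the LLC (last-level cache), whereas many-core processors, such as Phi and GPU coprocessors, are designed with smaller caches and use more hardware threads to hide memory access latency. As the number of processing cores grows, computing frameworks increasingly favor designs that target large numbers of cores, execute code efficiently, and scale well. This paper proposes Array OLAP, an in-memory analytical processing framework based on array storage and vector processing that simplifies both the OLAP storage model and the query processing model. In the Array OLAP framework, dimension tables are normalized into vector-based dimension filters, and the fact table is normalized into measure attributes with a multidimensional index. Through multidimensional index computation, a multidimensional query is reduced to a vector index scan on the fact table followed by aggregation according to the measure expression. The normalized vector lookup and vector index scan execute efficiently, and the staged processing model adapts well to different computing platforms, allowing each stage to be assigned to the platform that suits it best. Array OLAP is also designed around the characteristics of data warehouse schemas: the vector processing model is simple and works efficiently because dimension tables in a data warehouse are small and grow slowly. The paper describes the Array OLAP framework on different platforms and evaluates its performance with benchmarks; compared with current in-memory analytical databases, Array OLAP outperforms the mainstream systems and can be migrated smoothly to new hardware platforms.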
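
To make the processing model in this abstract concrete, here is a minimal Python sketch of a dimension-filter vector combined with a single scan over fact-table index columns. It is an illustration under stated assumptions, not the paper's implementation: the toy schema, the data, and the names `build_filter` and `vector_scan` are all hypothetical, and dimension rows are assumed to be addressable by their array position.

```python
from collections import defaultdict

# Hypothetical toy star schema: a dimension row is addressed by its array position.
date_dim = [{"year": 2004}, {"year": 2005}, {"year": 2005}]
region_dim = [{"name": "ASIA"}, {"name": "EUROPE"}]

# Fact table as parallel arrays: one index column per dimension plus one measure.
fact_date = [0, 1, 2, 1, 0]
fact_region = [0, 0, 1, 1, 0]
fact_revenue = [10, 20, 30, 40, 50]


def build_filter(dim, predicate, group_key):
    """Build a vector with one slot per dimension row: the grouping value
    if the row passes the predicate, otherwise None."""
    return [group_key(row) if predicate(row) else None for row in dim]


def vector_scan(filters_and_columns, measure):
    """Scan the fact arrays once, probe each dimension filter by position,
    and sum the measure per combination of grouping values."""
    totals = defaultdict(int)
    for i, value in enumerate(measure):
        key = []
        for dim_filter, column in filters_and_columns:
            group = dim_filter[column[i]]
            if group is None:  # fact row rejected by this dimension
                break
            key.append(group)
        else:
            totals[tuple(key)] += value
    return dict(totals)


# Query: total revenue in 2005, grouped by year and region name.
date_filter = build_filter(date_dim, lambda r: r["year"] == 2005, lambda r: r["year"])
region_filter = build_filter(region_dim, lambda r: True, lambda r: r["name"])
result = vector_scan(
    [(date_filter, fact_date), (region_filter, fact_region)], fact_revenue
)
print(result)
```

Because the filters are plain arrays probed by position, the scan needs no hash tables, which is the property the abstract credits for efficient code execution.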

2.
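Given the complexity of marketing operations and the large data volumes in the automotive industry, this paper proposes an application scheme based on online analytical processing (OLAP), presents the system architecture, builds marketing analysis models commonly used in the automotive industry, and develops an automotive marketing OLAP system that is cross-platform, open, and customizable. The system supports multiple mainstream databases and common data models and offers several ways of presenting data. It provides technical support for analyzing the sales and after-sales data of the Hafei automobile sales company (哈飞汽车销售公司), giving the enterprise a scientific basis for understanding market dynamics in depth and setting sound marketing strategies.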

3.
Data warehouse workloads are crucial for the support of on-line analytical processing (OLAP). The strategy to cope with OLAP queries on such huge amounts of data calls for the use of large parallel computers. The trend today is to use cluster architectures that show a reasonable balance between cost and performance. In such cases, it is necessary to tune the applications in order to minimize the amount of I/O and communication, such that the global execution time is reduced as much as possible.

In this paper, we model and analyze the most up-to-date strategies for ad hoc star join query processing in a cluster of computers. We show that, for ad hoc query processing and assuming a limited amount of resources available, these strategies still have room for improvement both in terms of I/O and inter-node data traffic communication. Our analysis concludes with the proposal of a hybrid solution that improves these two aspects compared to the previous techniques, and shows near optimal results in a broad spectrum of cases.


4.
Caching web pages is an important part of web infrastructures. Medium to large-scale infrastructures deploy a cluster of servers to solve the scalability and storage problems inherent in caching. In this paper we present dynamic information-based scalable hashing that evenly hashes client requests to a cluster of cache servers, resulting in performance scalability. Runtime information is used to determine when and how to cache pages. Cached pages are stored and retrieved mutually exclusively to/from all the servers to minimize the use of storage, resulting in storage scalability. We set up an experimental environment consisting of various machines, including client servers, a cluster of 16 cache servers, and a load balancer. We demonstrate through experimental results that dynamic information-based scalable hashing maximizes both performance scalability and storage scalability, whereas the existing approaches achieve only one of the two. Copyright © 2011 John Wiley & Sons, Ltd.
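
The abstract does not spell out the hashing scheme, so the following Python sketch only illustrates the two ideas it names: hashing each request to exactly one cache server, and using runtime information to decide when to cache. It substitutes plain modulo hashing and a request-count threshold. The class `CacheCluster`, the function `server_for`, and the threshold `CACHE_AFTER_REQUESTS` are hypothetical and are not taken from the paper.

```python
import hashlib
from collections import Counter

NUM_SERVERS = 16          # cluster size used in the experiment described above
CACHE_AFTER_REQUESTS = 2  # hypothetical threshold, not from the paper


def server_for(url, num_servers=NUM_SERVERS):
    """Map a URL to one server index with a stable hash (plain modulo hashing)."""
    digest = hashlib.sha256(url.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_servers


class CacheCluster:
    """Toy in-process stand-in for a cluster of cache servers."""

    def __init__(self, num_servers=NUM_SERVERS):
        self.stores = [dict() for _ in range(num_servers)]
        self.requests = Counter()  # runtime information: uncached requests per URL

    def get(self, url, fetch):
        store = self.stores[server_for(url, len(self.stores))]
        if url in store:
            return store[url]
        page = fetch(url)
        self.requests[url] += 1
        # Cache only pages that have been requested often enough.
        if self.requests[url] >= CACHE_AFTER_REQUESTS:
            store[url] = page
        return page


origin_calls = []


def fetch_from_origin(url):
    """Hypothetical origin fetch that records each call."""
    origin_calls.append(url)
    return "body:" + url


cluster = CacheCluster()
pages = [cluster.get("/index.html", fetch_from_origin) for _ in range(3)]
holders = sum("/index.html" in store for store in cluster.stores)
```

Since every URL maps to a single server, a cached page is never stored twice, which is one simple way to get the mutually exclusive storage the abstract describes.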

5.
Data warehouse workloads are crucial for the support of on-line analytical processing (OLAP). The strategy to cope with OLAP queries on such huge amounts of data calls for the use of large parallel computers. The trend today is to use cluster architectures that show a reasonable balance between cost and performance. In such cases, it is necessary to tune the applications in order to minimize the amount of I/O and communication, such that the global execution time is reduced as much as possible. In this paper, we model and analyze the most up-to-date strategies for ad hoc star join query processing in a cluster of computers. We show that, for ad hoc query processing and assuming a limited amount of resources available, these strategies still have room for improvement both in terms of I/O and inter-node data traffic communication. Our analysis concludes with the proposal of a hybrid solution that improves these two aspects compared to the previous techniques, and shows near optimal results in a broad spectrum of cases.

6.
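Yin Heping (印和平), Yang Kehua (杨科华). Microcomputer Development (《微机发展》), 2005, 15(10): 107-109, 112
This paper discusses the shortcomings of current OLAP systems based on the C/S (client/server) architecture and proposes an OLAP system based on the B/S (browser/server) architecture. The system reads the semantic object storage files generated by an OLAP modeling tool and hides the differences among analysis data sources from users. It can use ordinary relational databases, data warehouses, and data cubes as analysis data sources, presents analysis results to users through a browser, and supports various analysis operations on those results.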

7.
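As an important part of financial informatization, databases must cope with sustained growth in business volume and with demands for high availability and scalability. The single-node architecture of traditional databases such as MySQL and Oracle can no longer meet current financial service requirements in availability, scalability, and storage capacity. Distributed databases aim to address the challenges faced by single-node databases, offering a more flexible architecture and keeping systems running stably. Drawing on actual financial business requirements, this paper designs and implements a distributed database featuring distributed transaction support, a distributed SQL engine, and hybrid transactional/analytical processing. All components of the system are designed with redundancy; a Raft-like enhanced consensus algorithm ensures high availability and strong data consistency in the storage layer, and a Zookeeper-based cluster scheduling scheme ensures high availability of the scheduling layer.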

8.
This paper applies online analytic processing (OLAP), a widely accepted database analysis technique, to a product data management (PDM) database to evaluate the performance of in-progress product development. This paper introduces a set of processing key performance indicators (KPIs) that can measure ongoing product development. To convert and analyze operational data in a PDM database with interactive OLAP operations, this study proposes a multidimensional data model specified by product facts with associated dimensions. The model is implemented using a commercial OLAP engine, and applied to a database supported by a prototype PDM system. The OLAP engine allows analysts to interactively evaluate the performance of in-progress product development in a multidimensional data space. This is a far more flexible and efficient approach than other result-oriented static evaluation approaches.

9.
The performance of online analytical processing (OLAP) is critical for meeting the increasing requirements of massive-volume analytical applications. Typical techniques, such as in-memory processing, column storage, and join indexes, focus on high-performance storage media, efficient storage models, and reduced query processing. While they effectively perform OLAP applications, there is a vital limitation: main-memory database based OLAP (MMOLAP) cannot provide high performance for a large data set. In this paper, we propose a novel memory dimension table model, in which the primary keys of the dimension table can be directly mapped to dimensional tuple addresses. To achieve higher performance of dimensional tuple access, we optimize our storage model for dimension tables based on OLAP query workload features. We present directly dimensional tuple accessing (DDTA) based join (DDTA-JOIN), a technique to optimize query processing on the memory dimension table by direct dimensional tuple access. We also propose an optimization of the predicate tree that shortens predicate operation length by pruning useless predicate processing. Our experimental results show that the DDTA-JOIN algorithm is superior to both simulated row-store main memory query processing and the open-source column-store main memory database MonetDB, thanks to the reduced join cost and simple yet efficient query processing.
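
As a rough illustration of mapping primary keys directly to dimension tuple addresses, the Python sketch here stores a dimension table as an array and resolves each foreign key with one index operation instead of a hash probe. It is a simplified stand-in for the idea, not the DDTA-JOIN algorithm itself: the data, the function name `ddta_join_sum`, and the assumption of dense 1-based keys are all hypothetical.

```python
# Dimension table as an array. Assumption: primary keys are dense and 1-based,
# so the tuple with primary key k sits at index k - 1.
customer_dim = [
    ("ASIA", "CHINA"),     # key 1
    ("EUROPE", "FRANCE"),  # key 2
    ("ASIA", "JAPAN"),     # key 3
]

# Fact rows: (customer foreign key, revenue).
fact_rows = [(1, 100), (2, 250), (3, 75), (1, 40)]


def ddta_join_sum(facts, dim, region):
    """Sum revenue for fact rows whose customer lies in the given region,
    reaching each dimension tuple by direct array indexing."""
    total = 0
    for foreign_key, revenue in facts:
        dim_region, _nation = dim[foreign_key - 1]  # direct tuple access
        if dim_region == region:
            total += revenue
    return total


asia_revenue = ddta_join_sum(fact_rows, customer_dim, "ASIA")
print(asia_revenue)
```

The join cost per fact row is a single array access, which is the kind of saving the abstract attributes to direct dimensional tuple access.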

10.
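Yuan Chunyan (袁春燕). Office Automation (《办公自动化》), 2010, (10): 38-39, 47
With the development of data warehouse and online analytical processing (OLAP) technologies, multidimensional data query and analysis have been widely applied to information processing in commerce, finance, the military, and other fields, providing strong support for decision analysis across industries. This paper analyzes and summarizes the concepts related to data warehouses and OLAP, the OLAP multidimensional data model, and its core technologies.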
