1.

Ding Xiangwu, Li Zitong 《Computer Science (计算机科学)》2016,43(11):265-271, 308

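Integrated multi-core CPU-GPU architectures have become the direction of development for processor chips, and exploiting their parallel computing power for data processing has become a research hotspot in the database field. To improve the query performance of column-store systems, the paper first improves the workload-distribution strategy of existing co-processing mechanisms: it monitors the CPU utilization of the database system and dynamically assigns each processor a reasonable partition of the data. Next, for the data-prefetching mechanism on integrated multi-core CPU-GPU architectures, it proposes a model for determining the size of the prefetched data, and it optimizes GPU memory access according to the characteristics of GPU memory access. Finally, using OpenCL as the programming language, it implements a column-store sort-merge join algorithm on an integrated multi-core CPU-GPU architecture and applies the proposed methods to optimize join processing. Experiments show that the proposed optimization strategies improve the sort-merge join performance of a column-store system by 33%.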
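As a rough illustration of the sort-merge join that this abstract optimizes, the following is a minimal single-threaded sketch in Python. It is not the paper's OpenCL implementation and includes none of its CPU-GPU load balancing or prefetching; the function name `sort_merge_join` and the representation of a column as a plain list of integer keys are assumptions made for the example.

```python
from typing import List, Tuple


def sort_merge_join(left: List[int], right: List[int]) -> List[Tuple[int, int]]:
    """Equi-join two key columns; return (left_row, right_row) index pairs."""
    # Sort row indices by key so the input columns themselves stay untouched.
    l_idx = sorted(range(len(left)), key=left.__getitem__)
    r_idx = sorted(range(len(right)), key=right.__getitem__)

    out: List[Tuple[int, int]] = []
    i = j = 0
    while i < len(l_idx) and j < len(r_idx):
        lk, rk = left[l_idx[i]], right[r_idx[j]]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Locate the run of equal keys on each side.
            i_end = i
            while i_end < len(l_idx) and left[l_idx[i_end]] == lk:
                i_end += 1
            j_end = j
            while j_end < len(r_idx) and right[r_idx[j_end]] == lk:
                j_end += 1
            # Emit the cross product of the two runs.
            for a in l_idx[i:i_end]:
                for b in r_idx[j:j_end]:
                    out.append((a, b))
            i, j = i_end, j_end
    return out
```

In a column store, the returned row-index pairs would then be used to fetch the remaining columns of the matching rows.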

2.

An Adaptive Parallel Distributive Join Algorithm on a Cluster of Workstations

Soon M. Chung Arindam Chatterjee 《The Journal of supercomputing》2002,21(1):5-35

In this paper, we present an adaptive version of the parallel Distributive Join (DJ) algorithm that we proposed in [5]. The adaptive parallel DJ algorithm can handle the data skew in operand relations efficiently. We implemented the original and adaptive parallel DJ algorithms on a network of Alpha workstations using the Parallel Virtual Machine (PVM). We analyzed the performance of the algorithms, and compared it with that of the parallel Hybrid-Hash (HH) join algorithms. Our results show that the parallel DJ algorithms perform comparably with the parallel HH join algorithms over the entire range of the number of processors used and for different join selectivities. A significant advantage of the parallel DJ algorithms is that they can easily support non-equijoin operations.

3.

An Online Join Method for Skewed Data Streams

Wang Chunkai, Meng Xiaofeng 《Journal of Software (软件学报)》2018,29(3):869-882

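Distributed join processing in a parallel environment requires a partitioning strategy that reduces state migration and communication overhead. Compared with database management systems, online θ-join operations in distributed data stream management systems demand higher computation cost and more memory. A join model based on a complete bipartite graph can support join operations over distributed data streams. Because each relation of the join is stored only in the processing units on one side of the bipartite graph, no data needs to be replicated, and the processing units are independent of one another, the model is memory-efficient, elastic, and scalable. However, unstable stream rates and unbalanced attribute-value distributions mean that joins over skewed data streams easily lead to load imbalance in the cluster. For joins over skewed data streams, the model cannot dynamically allocate query nodes and requires manual intervention to set the data-grouping parameters. Its efficiency is even lower for join queries over the full history of the data. To address these problems, the paper proposes a framework for managing joins over skewed data streams, which uses a hybrid key-based and tuple-based partitioning scheme to handle skewed data on each side of the bipartite graph model. It also designs a strategy for dynamically reallocating query nodes and a state-migration algorithm to support full-history join queries and adaptive resource management. Experiments on synthetic and real data show that the scheme handles joins over skewed data effectively and further improves the throughput of distributed data stream management systems, in particular reducing computation cost in cloud environments.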

4.

Research on an Efficient Parallel Join Algorithm Resistant to Data Skew

LU Shu-ying, ZHU Ping 《Digital Community & Smart Home (数字社区&智能家居)》2008,(31)

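By analyzing the ABJ+ algorithm and the Hybrid hash join algorithm, and by combining and improving the two, the paper proposes a parallel binary join algorithm that overcomes various kinds of data skew. It activates different modules under different data-skew conditions, overcoming the load imbalance that data skew causes.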

5.

A Multi-Level Partitioned Hash Join Algorithm on Array Many-Core Architectures

Shi Song, Ning Yongbo, Li Hongliang, Zheng Fang 《Computer Science (计算机科学)》2016,43(3):18-22

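Join is one of the most time-consuming and most frequently used operations in data query processing, so increasing its speed is of great significance. Array many-core processors are an important class of many-core processors with powerful parallel capability that can be used to accelerate parallel computation. Based on the architecture of array many-core processors, the paper designs and optimizes an efficient multi-level partitioned hash join algorithm. The algorithm greatly reduces the number of main-memory accesses through a multi-level partitioning strategy and effectively eliminates the impact of data skew through a partition-rearrangement method, achieving high performance. Experimental results on a prototype system of the heterogeneous fused array many-core processor DFMC (Deeply-Fused Many Core) show that the multi-level partitioned hash join algorithm on DFMC performs 8.0 times as well as the fastest join algorithm on the coupled CPU-GPU architecture, indicating that array many-core processors have an advantage for accelerating data query applications.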
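The general idea of partitioning both inputs over several levels before building and probing a hash table can be sketched as follows. This is a generic radix-style illustration in Python, not the DFMC algorithm from the abstract: it omits the partition-rearrangement step for skew and any hardware-specific memory handling. The row layout `(key, payload)` and the parameter names `bits`, `levels`, and `shift` are assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Row = Tuple[int, str]  # (join key, payload); a hypothetical row layout


def partition(rows: List[Row], bits: int, shift: int) -> Dict[int, List[Row]]:
    """Split rows by `bits` bits of the key, starting at bit position `shift`."""
    mask = (1 << bits) - 1
    parts: Dict[int, List[Row]] = defaultdict(list)
    for row in rows:
        parts[(row[0] >> shift) & mask].append(row)
    return parts


def build_probe(r: List[Row], s: List[Row]) -> List[Tuple[int, str, str]]:
    """Classic hash join on one partition pair: build on r, probe with s."""
    table: Dict[int, List[str]] = defaultdict(list)
    for key, payload in r:
        table[key].append(payload)
    return [(key, rp, sp) for key, sp in s for rp in table.get(key, [])]


def partitioned_hash_join(r: List[Row], s: List[Row], bits: int = 2,
                          levels: int = 2, shift: int = 0
                          ) -> List[Tuple[int, str, str]]:
    """Partition both inputs `levels` times, then join matching partitions."""
    if levels == 0:
        return build_probe(r, s)
    r_parts = partition(r, bits, shift)
    s_parts = partition(s, bits, shift)
    out: List[Tuple[int, str, str]] = []
    for pid, r_part in r_parts.items():
        s_part = s_parts.get(pid)
        if s_part:  # only partitions present on both sides can match
            out.extend(partitioned_hash_join(r_part, s_part, bits,
                                             levels - 1, shift + bits))
    return out
```

Each level consumes a different slice of key bits, so rows with equal keys always land in the same partition pair, and each final partition is small enough that its hash table stays in fast memory.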

6.

Research on an Efficient Parallel Join Algorithm Resistant to Data Skew

Lu Shuying, Zhu Ping 《Digital Community & Smart Home (数字社区&智能家居)》2008,3(11):938-939

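By analyzing the ABJ+ algorithm and the Hybrid hash join algorithm, and by combining and improving the two, the paper proposes a parallel binary join algorithm that overcomes various kinds of data skew. It activates different modules under different data-skew conditions, overcoming the load imbalance that data skew causes.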

7.

A generalised temporal algebra

Georgia 《Data & Knowledge Engineering》2006,57(3):283-310

The algebra (TNA) of a generalised temporal database model supporting temporal relations nested to any finite depth is presented. The temporal nested relations consist of temporal nested attributes which are formed from temporal attributes together with the corresponding time-varying attributes. Therefore, the temporal dimension of the model is nested and is not integral with the corresponding time-dependent value. All the operations of the algebra are defined recursively and are proved to be closed. In particular, considering the natural join operation for temporal nested relations, different cases are presented, distinguished by the types and the nesting levels of the common attributes that participate in the natural join operation.

8.

Granularity reduction in temporal document databases

Kjetil Nørvåg 《Information Systems》2006

With rapidly decreasing storage costs, temporal document databases are now a viable solution in many contexts. However, storing an ever-growing database can still be too costly, and as a consequence it is desirable to be able to physically delete old versions of data. Traditionally, this has been performed by an operation called vacuuming, where the oldest versions are physically deleted or migrated from secondary storage to less costly tertiary storage. In temporal document databases on the other hand, it is often more appropriate to remove intermediate versions instead of removing the oldest versions. We call this operation granularity reduction. In this paper we describe the concept of granularity reduction, and present six strategies for selecting the document versions to eliminate. Three of the strategies have been implemented in the V2 temporal document database system, and in this context we discuss the cost of applying the strategies.

9.

A conceptual model for the logical design of temporal databases

Debabrata Terence M. Veda C. 《Decision Support Systems》1995,15(4)

Although widely advocated as a tool for the conceptual modelling of data, the Entity-Relationship (E-R) model [4] and its extensions are generally lacking in constructs to model the dynamic nature of the real world, making them inadequate for designing temporal databases. This research first extends the E-R model to a Temporal Event-Entity-Relationship Model (TEERM), by introducing events as an additional construct. Second, a method is proposed for mapping this conceptual model into a temporal relational model for the logical design of temporal relational databases with a corresponding set of integrity constraints. The model is illustrated with an example and evaluated using a set of criteria proposed by Batini et al. [2]. The model appears to be expressive, simple and easy to use, and should, therefore, aid the temporal database design process significantly.

10.

Discovering multi-label temporal patterns in sequence databases

Yen-Liang Chen Shin-Yi Wu 《Information Sciences》2011,181(3):398-418

Sequential pattern mining is one of the most important data mining techniques. Previous research on mining sequential patterns discovered patterns from point-based event data, interval-based event data, and hybrid event data. In many real life applications, however, an event may involve many statuses; it might not occur only at one certain point in time or over a period of time. In this work, we propose a generalized representation of temporal events. We treat events as multi-label events with many statuses, and introduce an algorithm called MLTPM to discover multi-label temporal patterns from temporal databases. The experimental results show that the efficiency and scalability of the MLTPM algorithm are satisfactory. We also discuss interesting multi-label temporal patterns discovered when MLTPM was applied to historical Nasdaq data.

11.

Mining temporal interval relational rules from temporal data

Yong Joon Lee 《Journal of Systems and Software》2009,82(1):155-167

Temporal data mining is still an important research topic, since there are application areas that need knowledge from temporal data, such as sequential patterns, similar time sequences, and cyclic and temporal association rules. Although there are many studies of temporal data mining, they do not deal with discovering knowledge from temporal interval data such as patient histories, purchaser histories, and web logs. We propose a new temporal data mining technique that can extract temporal interval relation rules from temporal interval data by using Allen’s theory: a preprocessing algorithm designed for the generalization of temporal interval data and a temporal relation algorithm for mining temporal relation rules from the generalized temporal interval data. This technique can provide more useful knowledge in comparison with conventional data mining techniques.
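Allen's theory, on which the abstract relies, distinguishes thirteen qualitative relations between two intervals. The sketch below classifies a pair of intervals into one of them. It only illustrates the relations themselves, not the paper's preprocessing or rule-mining algorithms; the function name `allen_relation` and the representation of an interval as a `(start, end)` tuple with `start < end` are assumptions made for the example.

```python
from typing import Tuple

Interval = Tuple[int, int]  # (start, end) with start < end; assumed layout


def allen_relation(a: Interval, b: Interval) -> str:
    """Return the Allen relation that holds from interval a to interval b."""
    a_s, a_e = a
    b_s, b_e = b
    if a_s >= a_e or b_s >= b_e:
        raise ValueError("intervals must satisfy start < end")

    # Disjoint or touching intervals.
    if a_e < b_s:
        return "before"
    if a_e == b_s:
        return "meets"
    if b_e < a_s:
        return "after"
    if b_e == a_s:
        return "met-by"

    # From here on the intervals share more than a single point.
    if a_s == b_s and a_e == b_e:
        return "equals"
    if a_s == b_s:
        return "starts" if a_e < b_e else "started-by"
    if a_e == b_e:
        return "finishes" if a_s > b_s else "finished-by"
    if b_s < a_s and a_e < b_e:
        return "during"
    if a_s < b_s and b_e < a_e:
        return "contains"
    # Remaining case: partial overlap with all four endpoints distinct.
    return "overlaps" if a_s < b_s else "overlapped-by"
```

A rule miner of the kind described would apply such a classification to pairs of interval events (for example, two episodes in a patient history) and count how often each relation occurs.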

12.

Decomposition Properties of Interval-Valued Fuzzy Graphs

Yang Wenhua, Li Shenggang 《Computer Engineering and Applications (计算机工程与应用)》2012,48(31):25-29

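Necessary and sufficient conditions are given for the decomposition of interval-valued fuzzy graphs under the Cartesian product and composition operations, and it is proved that interval-valued fuzzy graphs can be decomposed under the union and join operations.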

13.

An Extended Hash Join Method for Statistical Databases

Ye Hanlin 《Journal of Computer Research and Development (计算机研究与发展)》1993,30(5):18-21,49

14.

Application of SQL Outer Join Queries in System Development (total citations: 1; self-citations: 0; citations by others: 1)

Wang Pingqin, Dong Fuguo, Zhou Xiangfeng 《Computer Development & Applications (电脑开发与应用)》2008,21(3):57-58

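The paper introduces methods and techniques for invoking and executing SQL statements from ASP.NET/C#, focusing on techniques for writing outer-join SQL queries and their application in system development. Extensive use has shown that correct use of outer-join queries not only reduces network data traffic but, more importantly, improves system efficiency. Finally, a real case from the development of a production supervision and management system for a company illustrates the application of outer-join queries and subquery techniques.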
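To make the outer-join point concrete, the sketch below runs a single `LEFT OUTER JOIN` that returns every customer together with an order summary, including customers with no orders. It uses Python's standard `sqlite3` module as a stand-in for the ASP.NET/C# setting of the article; the `customers` and `orders` tables, their columns, and the function name `orders_per_customer` are hypothetical and not taken from the paper.

```python
import sqlite3
from typing import List, Tuple


def orders_per_customer() -> List[Tuple[str, int, float]]:
    """Return (name, order count, total amount) for every customer."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema and data, created only for this illustration.
    conn.executescript("""
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (id INTEGER PRIMARY KEY,
                             customer_id INTEGER, amount REAL);
        INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob');
        INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 5.0);
    """)
    # The outer join keeps customers without orders; their order columns
    # are NULL, so COUNT(o.id) yields 0 and COALESCE turns the sum into 0.
    rows = conn.execute("""
        SELECT c.name, COUNT(o.id), COALESCE(SUM(o.amount), 0)
        FROM customers AS c
        LEFT OUTER JOIN orders AS o ON o.customer_id = c.id
        GROUP BY c.id, c.name
        ORDER BY c.id
    """).fetchall()
    conn.close()
    return rows
```

One query of this shape replaces a loop that would otherwise issue a separate order lookup per customer, which is one plausible reading of the abstract's claim about reduced network traffic.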

15.

Structural Join for XQuery Queries over XML Streams

Wu Xiaoyong, Zhang Yu, Sun Donghai 《Computer Engineering (计算机工程)》2008,34(4):63-65

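By analyzing the differences and connections between XQuery queries and XPath queries, the paper defines E-XSIEQ, an extension of the basic XSIEQ machine; it is an indexed, stack-based automaton. It proposes using a variable table to collect XPath query results and organizing these intermediate results into a set of atomic tables, so that during result construction the atomic tables can be joined quickly according to the contextual relationships between their tuples. The structural join algorithm used in XQuery query processing is described, and a time-performance analysis of result construction is given.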

16.

Query Optimization for Joins with Multiple User-Defined Functions in Constraint Databases

Hong Xiaoguang, Yang Bo, Wang Haiyang 《Chinese Journal of Computers (计算机学报)》2000,23(10):1072-1077

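Database query optimization has long been a focus of research in the database community, yet the optimization of query predicates containing user-defined functions, and in particular of joins whose condition is the satisfaction of a user-defined function, has not been studied in depth. The paper gives a detailed discussion of this class of problems and designs an optimization scheme.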

17.

A fast and effective method to find correlations among attributes in databases (total citations: 3; self-citations: 0; citations by others: 3)

Elaine P. M. de Sousa Caetano Traina Jr. Agma J. M. Traina Leejay Wu Christos Faloutsos 《Data mining and knowledge discovery》2007,14(3):367-407

The problem of identifying meaningful patterns in a database lies at the very heart of data mining. A core objective of data mining processes is the recognition of inter-attribute correlations. Not only are correlations necessary for predictions and classifications – since rules would fail in the absence of pattern – but also the identification of groups of mutually correlated attributes expedites the selection of a representative subset of attributes, from which existing mappings allow others to be derived. In this paper, we describe a scalable, effective algorithm to identify groups of correlated attributes. This algorithm can handle non-linear correlations between attributes, and is not restricted to a specific family of mapping functions, such as the set of polynomials. We show the results of our evaluation of the algorithm applied to synthetic and real world datasets, and demonstrate that it is able to spot the correlated attributes. Moreover, the execution time of the proposed technique is linear on the number of elements and of correlations in the dataset.

18.

Modelling and solving temporal reasoning as propositional satisfiability

Duc Nghia Pham John Thornton Abdul Sattar 《Artificial Intelligence》2008,172(15):1752-1782

Representing and reasoning about time dependent information is a key research issue in many areas of computer science and artificial intelligence. One of the best known and widely used formalisms for representing interval-based qualitative temporal information is Allen's interval algebra (IA). The fundamental reasoning task in IA is to find a scenario that is consistent with the given information. This problem is in general NP-complete. In this paper, we investigate how an interval-based representation, or IA network, can be encoded into a propositional formula of Boolean variables and/or predicates in decidable theories. Our task is to discover whether satisfying such a formula can be more efficient than finding a consistent scenario for the original problem. There are two basic approaches to modelling an IA network: one represents the relations between intervals as variables and the other represents the end-points of each interval as variables. By combining these two approaches with three different Boolean satisfiability (SAT) encoding schemes, we produced six encoding schemes for converting IA to SAT. In addition, we also showed how IA networks can be formulated into satisfiability modulo theories (SMT) formulae based on the quantifier-free integer difference logic (QF-IDL). These encodings were empirically studied using randomly generated IA problems of sizes ranging from 20 to 100 nodes. A general conclusion we draw from these experimental results is that encoding IA into SAT produces better results than existing approaches. More specifically, we show that the new point-based 1-D support SAT encoding of IA produces consistently better results than the other alternatives considered. In comparison with the six different SAT encodings, the SMT encoding came fourth after the point-based and interval-based 1-D support schemes and the point-based direct scheme.
Further, we observe that the phase transition region maps directly from the IA encoding to each SAT or SMT encoding, but, surprisingly, the location of the hard region varies according to the encoding scheme. Our results also show a fixed performance ranking order over the various encoding schemes.

19.

Modification semantics in now-relative databases (total citations: 3; self-citations: 0; citations by others: 3)

Kristian Torp Christian S. Jensen Richard T. Snodgrass 《Information Systems》2004,29(8):653-683

Most real-world databases record time-varying information. In such databases, the notion of “the current time,” or NOW, occurs naturally and prominently. For example, when capturing the past states of a relation using begin and end time columns, tuples that are part of the current state have some past time as their begin time and NOW as their end time. While the semantics of such variable databases has been described in detail and is well understood, the modification of variable databases remains unexplored. This paper defines the semantics of modifications involving the variable NOW. More specifically, the problems with modifications in the presence of NOW are explored, illustrating that the main problems are with modifications of tuples that reach into the future. The paper defines the semantics of modifications (including insertions, deletions, and updates) of databases without NOW, with NOW, and with values of the type NOW+Δ, where Δ is a non-variable time duration. To accommodate these semantics, three new timestamp values are introduced. Finally, implementation is explored. We show how to represent the variable NOW with columns of standard SQL data types and give a mapping from SQL on NOW-relative data to standard SQL on these columns. The paper thereby completes the semantics, the querying, and the modification of now-relative databases.

20.

An algorithm for testing lossless join property in relational databases

Lishing Liu Alan Demers 《Information Processing Letters》1980,11(2):73-76
