
1.

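Design of a Parallel k-means Clustering Algorithm Based on the Hadoop Cloud Computing Platform (total citations: 2; self-citations: 0; citations by others: 2)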

赵卫中马慧芳傅燕翔史忠植《计算机科学》2011,38(10):166-168

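With the development of database technology and the rapid spread of the Internet, the volume of data that real applications must process has grown sharply, so clustering research faces many new problems and challenges, such as massive data sets and new computing environments. This paper studies in depth a parallel k-means clustering algorithm based on the Hadoop cloud computing platform and presents the methods and strategies used in its design. Experiments on several data sets of different sizes show that the parallel clustering algorithm achieves good speedup, scaleup, and sizeup, making it suitable for analyzing and mining massive data.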
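The abstract does not give the algorithm's details, so the following is only a minimal sketch of how one k-means iteration is commonly split into map, combine, and reduce steps. It is plain Python that simulates the phases in a single process and uses no Hadoop API; the function names (`nearest`, `map_combine`, `reduce_phase`, `kmeans`) and the choice to keep an empty cluster's old centroid are assumptions made for illustration.

```python
def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centroids[i])),
    )


def map_combine(split, centroids):
    """Map + combine for one input split: per-cluster partial sums and counts."""
    partial = {}
    for point in split:
        k = nearest(point, centroids)
        sums, count = partial.get(k, ([0.0] * len(point), 0))
        partial[k] = ([s + p for s, p in zip(sums, point)], count + 1)
    return partial


def reduce_phase(partials, centroids):
    """Reduce: merge partial sums from all splits and emit new centroids."""
    totals = {}
    for partial in partials:
        for k, (sums, count) in partial.items():
            acc, n = totals.get(k, ([0.0] * len(sums), 0))
            totals[k] = ([a + s for a, s in zip(acc, sums)], n + count)
    # Assumption: a cluster that received no points keeps its old centroid.
    return [
        [s / totals[k][1] for s in totals[k][0]] if k in totals else list(c)
        for k, c in enumerate(centroids)
    ]


def kmeans(splits, centroids, iterations=10):
    """Run a fixed number of iterations; each split stands in for one mapper."""
    for _ in range(iterations):
        partials = [map_combine(split, centroids) for split in splits]
        centroids = reduce_phase(partials, centroids)
    return centroids
```

In a real cluster each call to `map_combine` would run on a different node, and only the small per-cluster sums would cross the network, which is what keeps the communication cost low.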

2.

Tuning a parallel database algorithm on a shared-memory multiprocessor

Goetz Graefe Shreekant S. Thakkar 《Software》1992,22(7):495-517

Database query processing can benefit significantly from parallelism. Parallel database algorithms combine substantial CPU and I/O activity, memory requirements, and massive data exchange between processes, all of which must be considered to obtain optimal performance. Since parallel external sorting is a very typical example, we have focused on sorting to tune Volcano, a new query processing system. The purpose of the Volcano project is to provide efficient, extensible tools for query and request processing in novel application domains, particularly in object-oriented and scientific database systems, and for experimental database performance research. It includes all query processing algorithms conventionally used in relational database systems as well as several new ones, and can execute all of them in parallel. In this article, we present Volcano's parallel external sorting algorithm and a sequence of enhancements to improve its performance. We obtained very good absolute performance, 84 seconds for 100 MB of data, as well as near-linear speedup with sixteen CPUs and disks. Furthermore, these results were achieved on a shared-memory machine despite the common belief that parallel query processing is best implemented on distributed-memory systems. We detail our tuning measures and report on their effectiveness.

3.

A Large-Scale Image Retrieval Scheme Based on the Hadoop Cloud Computing Platform

朱为盛王鹏《计算机应用》2014,34(3):695-699

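To address the difficulties that traditional image retrieval methods face with massive image data, this paper proposes a large-scale image retrieval scheme (MR-BoVW) based on the traditional bag-of-visual-words (BoVW) model and the MapReduce computing model. The scheme makes full use of the massive storage capacity and strong parallel computing capability of the Hadoop cloud computing platform. To handle image data better, an improved Hadoop image data processing method is introduced first; on that basis, the scheme is implemented in MapReduce in three stages: feature vector generation, feature clustering, and vector representation of images together with inverted index construction. Several groups of experiments show that MR-BoVW achieves good speedup, scaleup, and sizeup, with efficiency above 0.62 in all cases and flat scaleup and sizeup curves, making it suitable for large-scale image retrieval.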

4.

Parallelizing OODBMS traversals: a performance evaluation

David J. De Witt Jeffrey F. Naughton John C. Shafer Shivakumar Venkataraman 《The VLDB Journal The International Journal on Very Large Data Bases》1996,5(1):3-18

In this paper we describe the design and implementation of ParSets, a means of exploiting parallelism in the SHORE OODBMS. We used ParSets to parallelize the graph traversal portion of the OO7 OODBMS benchmark, and present speedup and scaleup results from parallel SHORE running these traversals on a cluster of commodity workstations connected by a standard ethernet. For some OO7 traversals, SHORE achieved excellent speedup and scaleup; for other OO7 traversals, only marginal speedup and scaleup occurred. The characteristics of these traversals shed light on when the ParSet approach to parallelism can and cannot be applied to speed up an application. Edited by Henry F. Korth and Amith Sheth. Received November 1994 / Accepted March 20, 1995

5.

An optimal algorithm for Gaussian elimination of band matrices on an MIMD computer

《Parallel Computing》1990,15(1-3):133-145

This paper describes a parallel algorithm for the LU decomposition of band matrices using Gaussian elimination. The matrix dimension is n × n with 2r−1 diagonals. In the case when 1 r 2 p an optimal number of the processors, , is determined according to the equation . When 2 p r n a number of processors, p, stated by Veldhorst is adopted (see [7]). For a band matrix with 2r−1 diagonals (1 r 2p), a task scheduling procedure is defined with the aim of obtaining maximal parallelism in system operation, i.e. good load balancing. The architecture of the system is of MIMD type. The processors are connected via a common bus. Communication and synchronization are performed by message passing.

6.

A parallel distributive join algorithm for cube-connected multiprocessors

Chung S.M. Jaerheen Yang 《Parallel and Distributed Systems, IEEE Transactions on》1996,7(2):127-137

This paper presents a parallel distributive join algorithm for cube-connected multiprocessors. The performance analysis shows that the proposed algorithm has an almost linear speedup over the sequential distributive join algorithm as the number of processors increases, and its performance is comparable to that of the parallel hybrid-hash join algorithm. A big advantage of the proposed algorithm over hash-based join algorithms is that it does not have the bucket overflow problem caused by nonuniform hashing of the smaller operand relation. Moreover, the proposed algorithm can easily support the nonequijoin operation, which is very hard to implement by using hash-based join algorithms.

7.

Scheduling parallel iterative methods on multiprocessor systems

Nikolaos M. Missirlis 《Parallel Computing》1987,5(3):295-302

The paper describes the implementation of the Successive Overrelaxation (SOR) method on an asynchronous multiprocessor computer for solving large linear systems. The parallel algorithm is derived by dividing the serial SOR method into noninterfering tasks, which are then combined with an optimal schedule for a feasible number of processors. The important features of the algorithm are: (i) it achieves a speedup S_p of O(N/3) and an efficiency E_p of 2/3 using P = [N/2] processors, where N is the number of equations; (ii) it contains a high level of inherent parallelism, while the convergence theory of the parallel SOR method is the same as that of its sequential counterpart; and (iii) it may be modified to use block methods in order to minimise the overhead due to communication and synchronisation of the processors.
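For reference, the serial SOR iteration that the paper takes as its starting point can be sketched as below. This is the textbook sequential method only; the task decomposition and processor schedule described in the abstract are not reproduced. The function name `sor`, the default relaxation factor `omega=1.25`, and the stopping rule are assumptions chosen for illustration.

```python
def sor(A, b, omega=1.25, tol=1e-10, max_iter=10_000):
    """Solve A x = b with serial Successive Overrelaxation.

    A is a list of rows with nonzero diagonal entries. The iteration
    converges for symmetric positive definite A when 0 < omega < 2.
    """
    n = len(b)
    x = [0.0] * n
    for _ in range(max_iter):
        max_change = 0.0
        for i in range(n):
            # Uses the newest values of x[j] for j < i (Gauss-Seidel order).
            sigma = sum(A[i][j] * x[j] for j in range(n) if j != i)
            gauss_seidel = (b[i] - sigma) / A[i][i]
            new_xi = (1.0 - omega) * x[i] + omega * gauss_seidel
            max_change = max(max_change, abs(new_xi - x[i]))
            x[i] = new_xi
        if max_change < tol:
            break
    return x
```

The dependence of each `x[i]` update on the freshly updated earlier components is what makes the method awkward to parallelize directly, and is why a decomposition into noninterfering tasks is needed.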

8.

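Design and Implementation of a Parallel Cuckoo Search Algorithm Based on CUDA (total citations: 1; self-citations: 0; citations by others: 1)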

韦向远 ;杨辉华 ;谢谱模《计算机科学与探索》2014,(6):665-673

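Cuckoo search (CS) is an intelligent metaheuristic algorithm developed in recent years that has been applied successfully to a variety of optimization problems. To address the excessive computation time of CS on big-data and large-scale complex problems, this paper proposes a parallel cuckoo search algorithm based on the compute unified device architecture (CUDA). The parallel implementation combines task parallelism with data parallelism: graphics processing unit (GPU) thread blocks are mapped to cuckoo individuals and threads to each dimension of an individual's data, and the nest position update, individual fitness evaluation, nest rebuilding, and best-individual search operations of CS all run in parallel. The entire optimization loop of CS runs on the GPU, which reduces the CPU-GPU communication overhead during computation. Simulation experiments on four classic benchmark functions show that, compared with the standard CS algorithm and with consistent convergence behavior, the CUDA-based parallel CS algorithm achieves a computational speedup of up to 110 times.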

9.

Research on Parallel Algorithms for Path Expressions (total citations: 1; self-citations: 0; citations by others: 1)

王国仁于戈张斌郑怀远《计算机学报》1999,22(2):126-133

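In object-oriented database systems, path expressions are an essential query facility for locating complex objects, so optimizing and parallelizing their execution is one of the key factors in building a high-performance object-oriented database system. Because the OQL language is orthogonal, path expressions can be nested in the SELECT, FROM, and/or WHERE clauses; we focus on the parallel evaluation of path expressions in the WHERE clause, which are also called complex predicates. After analyzing existing methods for evaluating path expressions, this paper proposes two new parallel algorithms for evaluating path expressions,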

10.

Parallel Sequence Mining on Shared-Memory Machines

《Journal of Parallel and Distributed Computing》2001,61(3):401-426

We present pSPADE, a parallel algorithm for fast discovery of frequent sequences in large databases. pSPADE decomposes the original search space into smaller suffix-based classes. Each class can be solved in main-memory using efficient search techniques and simple join operations. Furthermore, each class can be solved independently on each processor requiring no synchronization. However, dynamic interclass and intraclass load balancing must be exploited to ensure that each processor gets an equal amount of work. Experiments on a 12-processor SGI Origin 2000 shared memory system show good speedup and excellent scaleup results.

11.

Solving very large instances of the scheduling of independent tasks problem on the GPU (total citations: 1; self-citations: 0; citations by others: 1)

Frédéric Pinel Bernabé Dorronsoro Pascal Bouvry 《Journal of Parallel and Distributed Computing》2013

In this paper, we present two new parallel algorithms to solve large instances of the scheduling of independent tasks problem. First, we describe a parallel version of the Min–min heuristic. Second, we present GraphCell, an advanced parallel cellular genetic algorithm (CGA) for the GPU. Two new generic recombination operators that take advantage of the massive parallelism of the GPU are proposed for GraphCell. A speedup study shows the high performance of the parallel Min–min algorithm in the GPU versus several CPU versions of the algorithm (both sequential and parallel using multiple threads). GraphCell improves state-of-the-art solutions, especially for larger problems, and it provides an alternative to our GPU Min–min heuristic when more accurate solutions are needed, at the expense of an increased runtime.

12.

Parallel Implementation of OpenVX and 3D Rendering on a Polymorphic GPU

延酉玫李涛王鹏博韩俊刚李雪丹姚静乔虹《计算机应用》2015,35(1):53-57

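Image processing, machine vision, and 3D graphics rendering are all massively parallel workloads. By making full use of the programmability and flexible parallel processing modes of the polymorphic array architecture for graphics and image processing (PAAG) processor, and by adopting a parallel design method that combines operation-level and data-level parallelism, this work implements the OpenVX Kernel functions and 3D graphics rendering. Experimental results show that, in the parallel implementation of the OpenVX standard image processing Kernel functions and of graphics rendering, the multiple-instruction multiple-data (MIMD) parallel mode of the PAAG processor yields a linear speedup with slope 1, which is more efficient than the nonlinear speedup with slope below 1 obtained with the single-instruction multiple-data (SIMD) parallel mode of traditional graphics processing units (GPUs).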

13.

Applying parallel computer systems to solve symmetric tridiagonal eigenvalue problems

Mi Lu Xiangzhen Qiao 《Parallel Computing》1992,18(12):1301-1315

A block parallel partitioning method for computing the eigenvalues of a symmetric tridiagonal matrix is presented. The algorithm is based on partitioning in a way that ensures load balance during computation. This method is applicable to both shared-memory and distributed-memory MIMD systems. Compared with other parallel tridiagonal eigenvalue algorithms in the literature, the proposed algorithm achieves a higher, linear speedup of O(p) on a parallel computer with p-fold parallelism, and the data communication between processors is less than that required for other methods. The results were tested and evaluated on an MIMD machine, and were within 62% to 98% of the predicted performance.

14.

Protein database search of hybrid alignment algorithm based on GPU parallel acceleration

Wei Zhou Zhanxiu Cai Bo Lian Jincai Wang Jianping Ma 《The Journal of supercomputing》2017,73(10):4517-4534

In biological research, alignment of protein sequences by computer is often needed to find similarities between them. Although the alignment of two sequences can be computed in a reasonable time, massive sequence alignment problems such as protein database search still consume a great deal of central processing unit (CPU) time. In this paper, an optimized protein database search method is presented and tested with the Swiss-Prot database on graphic processing unit (GPU) devices; in addition, CPU multi-threaded computing is used to realize a GPU-based heterogeneous parallelism. In our proposed method, a hybrid alignment approach is implemented by combining the Smith–Waterman local alignment algorithm with the Needleman–Wunsch global alignment algorithm, and parallel database search is realized with the compute unified device architecture (CUDA) parallel computing framework. In the experiment, the algorithm is tested on a lower-end and a higher-end personal computer equipped with GeForce GTX 750 Ti and GeForce GTX 1070 graphics cards, respectively. The results show that the parallel method proposed in this paper can achieve a speedup of up to 138.86 times over the serial counterpart, significantly improving the efficiency and convenience of protein database search.
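As background for the local-alignment half of the hybrid approach, a serial Smith–Waterman score computation can be sketched as follows. This is not the paper's hybrid or GPU method: it uses a simple linear gap penalty, and the scoring values (`match=2`, `mismatch=-1`, `gap=-2`) and the function name are assumptions for illustration rather than the substitution matrix a real protein search would use.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of strings a and b (linear gap penalty)."""
    prev = [0] * (len(b) + 1)  # previous row of the dynamic-programming table
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Clamping at 0 lets an alignment restart anywhere (local alignment).
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best
```

In a database search, one query is scored against every database sequence independently, which is the source of the coarse-grained parallelism that GPU implementations exploit.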

15.

A Join Operation Algorithm for the Data Grid Environment (total citations: 5; self-citations: 1; citations by others: 5)

杨东华李建中张文平《计算机研究与发展》2004,41(10):1848-1855

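A data grid is a distributed data management architecture that provides a coordinated management mechanism for resources distributed across the grid. Database management systems play an important role in data grids, and among database operations the join is one of the most frequently used and most time-consuming. To date, no published work has proposed a join algorithm for the data grid environment. This paper studies join algorithms for massive data in a data grid environment. To cope with the heterogeneous network bandwidth between grid nodes, it uses a relation reduction algorithm, row-blocked transmission, and pipelined parallelism to reduce query response time. Theoretical analysis and experimental results show that the algorithm performs well in reducing network communication overhead, increasing I/O and CPU parallelism, and lowering response time.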

16.

Effectiveness of parallel joins

Seetha Lakshmi M. Yu P.S. 《Knowledge and Data Engineering, IEEE Transactions on》1990,2(4):410-424

The effectiveness of parallel processing of relational join operations is examined. The skew in the distribution of join attribute values and the stochastic nature of the task processing times are identified as the major factors that can affect the effective exploitation of parallelism. Expressions for the execution time of parallel hash join and semijoin are derived and their effectiveness analyzed. When many small processors are used in the parallel architecture, the skew can result in some processors becoming sources of bottleneck while other processors are being underutilized. Even in the absence of skew, the variations in the processing times of the parallel tasks belonging to a query can lead to high task synchronization delay and impact the maximum speedup achievable through parallel execution. For example, when the task processing time on each processor is exponential with the same mean, the speedup is proportional to P/ln(P) where P is the number of processors. Other factors such as memory size, communication bandwidth, etc., can lead to even lower speedup. These are quantified using analytical models.
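The P/ln(P) behavior can be illustrated with a simple model, which is an assumption here and may differ from the paper's analytical model: one task per processor, independent exponential task times with a common mean, serial time equal to the sum of the task times, and parallel time equal to the slowest task. The expected maximum of P such variables is the mean times the harmonic number H_P, so the expected-time speedup is P/H_P, and H_P grows like ln(P).

```python
import math


def harmonic(p):
    """Harmonic number H_p = 1 + 1/2 + ... + 1/p."""
    return sum(1.0 / k for k in range(1, p + 1))


def expected_speedup(p):
    """Ratio of expected serial time to expected parallel time.

    Assumed model: p independent exponential tasks with equal mean, one
    per processor; the parallel run finishes when the slowest task does.
    """
    return p / harmonic(p)


if __name__ == "__main__":
    for p in (2, 16, 128, 1024):
        print(p, round(expected_speedup(p), 2), round(p / math.log(p), 2))
```

Because H_P exceeds ln(P) by roughly Euler's constant, the modeled speedup sits slightly below P/ln(P) while growing at the same rate.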

17.

The 0–1 Integer Programming Problem in a finite ring with identity

Bart Rice 《Computers & Mathematics with Applications》1981,7(6):497-502

We define the 0–1 Integer Programming Problem in a finite field or finite ring with identity as: given an m × n matrix A and an m × 1 vector b with entries in the ring R, find or determine the non-existence of a 0–1 vector x such that Ax = b. We give an easily implemented enumerative algorithm for solving this problem, along with conditions that spurious solutions occur with probability as small as desired. Finally, we show that the problem is NP-complete if R is the ring of integers modulo r for r ≥ 3. This result suggests that it will be difficult to improve on our algorithm.
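To make the problem statement concrete, the sketch below solves it by exhaustive search over all 0–1 vectors when R is the ring of integers modulo r. This is plain brute force, exponential in n, and is not the enumerative algorithm of the paper, which the abstract does not describe; the function name `solve_01_mod` is hypothetical.

```python
from itertools import product


def solve_01_mod(A, b, r):
    """Return a 0-1 tuple x with A x = b (mod r), or None if none exists.

    A is a list of m rows of length n, b a list of m integers.
    Tries all 2**n candidate vectors, so it only suits small n.
    """
    n = len(A[0])
    for x in product((0, 1), repeat=n):
        if all(
            sum(a * xi for a, xi in zip(row, x)) % r == bi % r
            for row, bi in zip(A, b)
        ):
            return x
    return None
```

For example, with A = [[1, 2], [2, 2]], b = [0, 1], and r = 3, the only solution is x = (1, 1), since 1 + 2 = 3 ≡ 0 and 2 + 2 = 4 ≡ 1 modulo 3.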

18.

The power of inequality semijoins

Philip A. Bernstein Nathan Goodman 《Information Systems》1981,6(4):255-265

Semijoin is a relational operator used in many relational query processing algorithms. Semijoins can be used to “reduce” the database by delimiting portions of the database that contain data relevant to a given query. For some queries, there exist sequences of semijoins that delimit the exact portions of the database needed to answer the query. Such sequences are called full reducers.

This paper considers a class of queries called natural inequality queries (NI queries), and characterizes a subclass for which full reducers exist. We also present an efficient algorithm that decides whether an NI query lies within this subclass, and constructs a full reducer for the query. The NI queries are a subset of the aggregate-free, conjunctive queries of QUEL, and permit join clauses to include <, ≤, =, ≥, >.
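A semijoin with an arbitrary comparison operator can be sketched in a few lines. The example only shows the operator itself, not the paper's algorithm for constructing full reducers; the representation of relations as lists of dicts and the names `semijoin`, `r_attr`, and `s_attr` are assumptions for illustration.

```python
import operator


def semijoin(R, S, r_attr, s_attr, op=operator.eq):
    """Keep the tuples of R that satisfy `r[r_attr] op s[s_attr]` for some s in S.

    Only attributes of R appear in the result, which is why a semijoin
    can shrink R without building the full join.
    """
    return [r for r in R if any(op(r[r_attr], s[s_attr]) for s in S)]
```

For instance, with R = [{"id": 1, "x": 5}, {"id": 2, "x": 20}] and S = [{"y": 10}], calling `semijoin(R, S, "x", "y", operator.lt)` keeps only the tuple with x = 5, because 5 < 10 holds and 20 < 10 does not.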

19.

A constant-time parallel sorting algorithm and its optical implementation

Louri A. Hatch J.A. Jr. Jongwhoa Na 《Micro, IEEE》1995,15(3):60-71

High-speed electronic sorting networks are difficult to implement with VLSI technology because of the dense and global connectivity required. Optics eliminates this bottleneck by offering global interconnections, massive parallelism, and noninterfering communications. We present a parallel sorting algorithm and its efficient optical implementation using currently available optical hardware. The algorithm sorts n data elements in a few steps, independent of the number of elements to be sorted. Thus, it is a constant-time sorting algorithm, that is, O(1) time.

20.

Implementation of a Parallel K-Nearest-Neighbor Join Algorithm Based on CUDA

潘茜张育平陈海燕《计算机科学》2016,43(10):190-192, 219

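For K-nearest-neighbor join (KNNJ) queries over large-scale spatial data, a parallel optimization of the K-nearest-neighbor join algorithm under the CUDA programming model is designed. The parallel process is divided into two stages: 1) build R-Tree indexes for the data sets P and Q involved in the query; 2) perform the KNNJ query using the R-Tree indexes. First, minimum bounding boxes are partitioned according to the positions of the nodes, and the R-Tree index is built under CUDA using a recursive grid sorting algorithm. Then the KNNJ query is performed under CUDA using the R-Tree index, in two phases: parallel distance computation and parallel distance sorting. In the distance phase, each thread computes the distance between one pair of points, and these computations are mutually independent and run in parallel. In the sorting phase, quicksort is parallelized with CUDA. Experimental results show that, as the sample size grows, the advantage of the R-Tree-based parallel K-nearest-neighbor join algorithm becomes more pronounced, and that the algorithm is efficient and scalable.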
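The two query phases, distance computation and distance sorting, can be illustrated with a serial brute-force K-nearest-neighbor join. This sketch deliberately omits the R-Tree index and the CUDA parallelization described in the abstract, and it compares every point of P with every point of Q; the function name `knn_join` and the dictionary result format are assumptions for illustration.

```python
import math


def knn_join(P, Q, k):
    """For each point in P, return the indices of its k nearest points in Q.

    Brute force: O(len(P) * len(Q)) distance computations, each of which
    is independent of the others and could be assigned to its own thread.
    """
    result = {}
    for i, p in enumerate(P):
        # Phase 1: distance from p to every q (Euclidean, via math.dist).
        dists = [(math.dist(p, q), j) for j, q in enumerate(Q)]
        # Phase 2: sort by distance (ties broken by index) and keep the first k.
        dists.sort()
        result[i] = [j for _, j in dists[:k]]
    return result
```

An index such as the R-Tree reduces the number of candidate pairs that reach the distance phase, which is where the approach in the abstract differs from this exhaustive version.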