Article Search
Subscription full text: 9 articles
Free full text: 0 articles
By subject: General Industrial Technology (1); Automation Technology (8)
By year: 2016 (1); 2014 (1); 2013 (1); 2012 (1); 2011 (1); 2008 (1); 2007 (1); 2001 (1); 2000 (1)
9 results found (search time: 31 ms)
1.
In this paper, we focus on multi-core branch-and-bound algorithms for solving large-scale permutation-based optimization problems. We investigate five work stealing (WS) strategies built on a new data structure called integer-vector-matrix (IVM). In these strategies, each thread owns a private IVM that locally manages a set of subproblems enumerated using a factorial number system. The WS strategies differ in how the victim thread is selected and in the granularity of the stolen work units (intervals of factoradics). To assess the efficiency of the private IVM-based WS approach, the five strategies have been extensively evaluated on the permutation flowshop scheduling problem and compared with their conventional linked-list-based counterparts. The results show that IVM-based WS outperforms its linked-list-based counterpart in terms of CPU time, memory usage and number of WS operations performed. Copyright © 2016 John Wiley & Sons, Ltd.
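To make the interval-based encoding concrete, here is a minimal sketch (illustrative Python, not the paper's IVM implementation; all names are ours): a rank in 0..n!-1 maps to a factoradic (Lehmer code) and hence to a permutation, so a work-stealing operation reduces to splitting an interval of ranks.

```python
# Minimal sketch (not the paper's IVM code): permutations of size n are ranked
# 0 .. n!-1; a rank's factoradic digits act as a Lehmer code, so a contiguous
# interval [first, last) of ranks describes a chunk of the permutation space.
from math import factorial

def rank_to_factoradic(rank, n):
    """Digits d[0..n-1] with d[i] in 0..n-1-i (mixed-radix / factorial base)."""
    digits = []
    for i in range(n, 0, -1):
        f = factorial(i - 1)
        digits.append(rank // f)
        rank %= f
    return digits

def factoradic_to_permutation(digits, items):
    """Decode a Lehmer code into the corresponding permutation of `items`."""
    pool = list(items)
    return [pool.pop(d) for d in digits]

def steal_half(victim_interval):
    """Work stealing on intervals: the thief takes the upper half of the
    victim's remaining [first, last) range of ranks."""
    first, last = victim_interval
    mid = (first + last) // 2
    return (first, mid), (mid, last)

if __name__ == "__main__":
    n, jobs = 4, [0, 1, 2, 3]
    perms = {tuple(factoradic_to_permutation(rank_to_factoradic(r, n), jobs))
             for r in range(factorial(n))}
    assert len(perms) == factorial(n)        # ranks 0..n!-1 cover every permutation
    print(steal_half((0, factorial(n))))     # ((0, 12), (12, 24))
```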
2.
Branch-and-bound (B&B) algorithms are attractive methods for solving combinatorial optimization problems to optimality using an implicit enumeration of a dynamically built tree-based search space. Nevertheless, they are time-consuming when dealing with large problem instances. Therefore, pruning tree nodes (subproblems) is traditionally used as a powerful mechanism to reduce the size of the explored search space. Pruning requires performing the bounding operation, which consists of applying a lower bound function to the subproblems generated during the exploration process. Preliminary experiments on the Flow-Shop scheduling problem (FSP) have shown that the bounding operation consumes over 98% of the execution time of the B&B algorithm. In this paper, we investigate the use of graphics processing unit (GPU) computing as a major complementary way to speed up the search. We revisit the design and implementation of the parallel bounding model on GPU accelerators. The proposed approach enables data access optimization. Extensive experiments have been carried out on well-known FSP benchmarks using an Nvidia Tesla C2050 GPU card. Compared to a single-core CPU execution on an Intel Core i7-970 processor without GPU, speedups of more than 100× are achieved for large problem instances. At an equivalent peak performance, the GPU-accelerated B&B is twice as fast as its multi-core counterpart. Copyright © 2013 John Wiley & Sons, Ltd.
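As a rough illustration of the pool-based bounding model (not the authors' CUDA kernels), the sketch below evaluates the lower bounds of a whole pool of subproblems in one data-parallel pass, with NumPy standing in for the GPU and a deliberately simplified one-machine flowshop bound.

```python
# Sketch of the pool-based bounding offload: decomposed subproblems are gathered
# into a pool and their lower bounds are evaluated in one batched pass
# (NumPy here stands in for the GPU kernel; the bound is a simplified one).
import numpy as np

def one_machine_bounds(prefix_completion, remaining_work):
    """Simplified flowshop lower bound, batched over a pool of subproblems.

    prefix_completion : (pool, machines) completion times of the fixed prefix
    remaining_work    : (pool, machines) processing time still unscheduled
    Every unscheduled job must still cross each machine after the prefix, so
    max over machines of (prefix completion + remaining work) is a valid LB.
    """
    return np.max(prefix_completion + remaining_work, axis=1)

def prune(pool_bounds, best_makespan):
    """Keep only subproblem indices whose bound can still improve the best."""
    return np.nonzero(pool_bounds < best_makespan)[0]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    prefix = rng.integers(10, 50, size=(1024, 20)).astype(float)
    remain = rng.integers(5, 40, size=(1024, 20)).astype(float)
    lbs = one_machine_bounds(prefix, remain)
    survivors = prune(lbs, best_makespan=80.0)
    print(len(survivors), "of", len(lbs), "subproblems survive pruning")
```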
3.
In this paper, we investigate the problem of scheduling precedence-constrained parallel applications on heterogeneous computing systems (HCSs) such as cloud computing infrastructures. This kind of application has been studied and used in many research works, most of which propose algorithms to minimize the completion time (makespan) without paying much attention to energy consumption. We propose a new parallel bi-objective hybrid genetic algorithm that takes into account not only the makespan but also the energy consumption. We particularly focus on the island parallel model and the multi-start parallel model. Our new method relies on dynamic voltage scaling (DVS) to minimize energy consumption. In terms of energy consumption, the obtained results show that our approach outperforms previous scheduling methods by a significant margin. In terms of completion time, the obtained schedules are also shorter than those of the other algorithms. Furthermore, our study demonstrates the potential of DVS.
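A minimal sketch of the bi-objective evaluation under DVS follows; the voltage/frequency levels and the dynamic-energy model (energy roughly proportional to V²·f·t) are illustrative assumptions, not the paper's exact model.

```python
# Hedged sketch of the (makespan, energy) evaluation under DVS: lowering the
# supply voltage/frequency stretches a task but cuts dynamic power as ~V^2 * f.
from dataclasses import dataclass

# Hypothetical DVS levels: (voltage, relative frequency)
DVS_LEVELS = [(1.5, 1.0), (1.2, 0.8), (1.0, 0.6)]

@dataclass
class Task:
    base_time: float   # execution time at full frequency
    level: int         # index into DVS_LEVELS chosen by the GA

def evaluate(schedule):
    """Return (makespan, energy) for a list of per-processor task lists."""
    makespan, energy = 0.0, 0.0
    for proc_tasks in schedule:
        finish = 0.0
        for t in proc_tasks:
            volt, freq = DVS_LEVELS[t.level]
            duration = t.base_time / freq            # slower clock -> longer task
            finish += duration
            energy += (volt ** 2) * freq * duration  # dynamic energy ~ V^2 f t
        makespan = max(makespan, finish)
    return makespan, energy

def dominates(a, b):
    """Pareto dominance used to keep the non-dominated schedules."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

if __name__ == "__main__":
    fast = [[Task(10, 0), Task(8, 0)], [Task(12, 0)]]
    slow = [[Task(10, 2), Task(8, 2)], [Task(12, 2)]]
    # Neither schedule dominates the other: the classic makespan/energy trade-off.
    print(evaluate(fast), evaluate(slow), dominates(evaluate(slow), evaluate(fast)))
```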
4.
In this paper, we revisit the design and implementation of Branch-and-Bound (B&B) algorithms for solving large combinatorial optimization problems on GPU-enhanced multi-core machines. B&B is a tree-based optimization method that uses four operators (selection, branching, bounding and pruning) to build and explore a highly irregular tree representing the solution space. In our previous work, we proposed a GPU-accelerated approach in which only a single CPU core is used and only the bounding operator is performed on the GPU device. Here, we extend that approach (LL-GB&B) in order to minimize the CPU-GPU communication latency and thread divergence. This objective is achieved through a GPU-based fine-grained parallelization of the branching and pruning operators in addition to the bounding one. The second contribution consists in investigating the combination of a GPU with multi-core processing. Two scenarios have been explored, leading to two approaches: a concurrent one (RLL-GB&B) and a cooperative one (PLL-GB&B). In the first, the exploration process is performed concurrently by the GPU and the CPU cores. In the cooperative approach, the CPU cores prepare pools of tree nodes and off-load them to the GPU using data streaming while the GPU performs the exploration. The different approaches have been extensively experimented on the Flowshop scheduling problem. Compared to a single-CPU execution, LL-GB&B achieves accelerations of up to ×160 for large problem instances. Moreover, when combining multi-core and GPU, we find that using RLL-GB&B is not beneficial, while PLL-GB&B enables an improvement of up to 36% compared to LL-GB&B.
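The per-node logic of the branching and pruning operators that the extension moves to the GPU can be sketched as follows (plain Python on a toy objective; the actual implementation is fine-grained CUDA operating over pools of nodes, and the bound here is only a placeholder).

```python
# Sketch of the branching / bounding / pruning operators of a B&B loop.
def branch(node):
    """Fix one more job: one child per job not yet scheduled."""
    scheduled, remaining = node
    return [(scheduled + (job,), remaining - {job}) for job in remaining]

def prune(children, lower_bound, best_makespan):
    """Discard children whose lower bound cannot beat the incumbent."""
    return [c for c in children if lower_bound(c) < best_makespan]

def explore(root, lower_bound, evaluate_leaf):
    """Depth-first B&B loop; on the GPU device, branch/bound/prune are applied
    in parallel over pools of nodes instead of one node at a time."""
    best, stack = float("inf"), [root]
    while stack:
        scheduled, remaining = node = stack.pop()
        if not remaining:
            best = min(best, evaluate_leaf(scheduled))
            continue
        stack.extend(prune(branch(node), lower_bound, best))
    return best

if __name__ == "__main__":
    # Toy instance: the "makespan" is a weighted position sum, so the cost of the
    # fixed prefix is a valid lower bound and the optimum is easy to check (10).
    weights = {0: 3, 1: 1, 2: 2}
    cost = lambda seq: sum((i + 1) * weights[j] for i, j in enumerate(seq))
    root = ((), frozenset(weights))
    print(explore(root, lambda n: cost(n[0]), cost))
```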
5.
Melab N., Talbi E.-G., Petiton S. The Journal of Supercomputing, 2000, 17(2): 167-185
This paper presents a parallel adaptive version of the block-based Gauss-Jordan algorithm used to invert large matrices. This version includes a characterization of the workload and a mechanism for folding/unfolding it. Furthermore, the paper proposes a work scheduling strategy and an application-oriented solution to the fault tolerance problem. The application is implemented with MARS and experimented in dedicated and non-dedicated environments. The results show that an absolute efficiency of 92% is possible on a cluster of DEC/ALPHA processors interconnected by a Gigaswitch network, and an absolute efficiency of 67% can be obtained on an Ethernet network of SUN-Sparc 4 workstations. Moreover, the algorithm is tested on a meta-system including both sets of machines. Finally, an out-of-core solution for the algorithm is proposed. This solution saves 66% of the data input operations and reduces the main memory space required to store the data of the algorithm by a factor of q, where q is the dimension, in data blocks, of the matrix to be inverted.
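For reference, here is a minimal serial sketch of block-based Gauss-Jordan inversion; the parallel, adaptive, fault-tolerant and out-of-core aspects of the paper are omitted, and the block size, names and test matrix are illustrative.

```python
# Serial sketch: the n x n matrix is processed as a q x q grid of b x b blocks
# and inverted in place by block sweeps (assumes nonsingular pivot blocks).
import numpy as np

def block_gauss_jordan_inverse(a, b):
    n = a.shape[0]
    q = n // b
    blk = lambda i, j: np.s_[i * b:(i + 1) * b, j * b:(j + 1) * b]
    for k in range(q):
        pivot_inv = np.linalg.inv(a[blk(k, k)])
        # Row update: A[k,j] <- inv(A[k,k]) @ A[k,j]
        for j in range(q):
            if j != k:
                a[blk(k, j)] = pivot_inv @ a[blk(k, j)]
        for i in range(q):
            if i == k:
                continue
            # Elimination: A[i,j] <- A[i,j] - A[i,k] @ A[k,j]   (updated A[k,j])
            for j in range(q):
                if j != k:
                    a[blk(i, j)] -= a[blk(i, k)] @ a[blk(k, j)]
            # Column update: A[i,k] <- -A[i,k] @ inv(A[k,k])
            a[blk(i, k)] = -a[blk(i, k)] @ pivot_inv
        a[blk(k, k)] = pivot_inv
    return a

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    m = rng.standard_normal((6, 6)) + 10 * np.eye(6)   # well conditioned
    inv = block_gauss_jordan_inverse(m.copy(), b=2)
    print(np.allclose(inv @ m, np.eye(6)))             # True
```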
6.
Data Mining: A Key Contribution to E-business (total citations: 5; self-citations: 0; citations by others: 5)
Data mining consists of extracting knowledge from huge volumes of data, allowing better business decisions to be taken. In this paper, we show how data mining fits into the knowledge discovery process. We highlight its potential applications and the techniques most often used to perform it. Association rule mining is presented as a case study. Furthermore, we show through an integrated architecture how data mining can contribute to e-business via the new technologies. Finally, we present some commercially available architectures.
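For the association-rule case study, the toy sketch below computes the standard support and confidence measures on a small basket dataset (the data and names are illustrative, not from the paper).

```python
# Support and confidence of an association rule X -> Y over a basket database.
from itertools import combinations

transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
]

def support(itemset):
    """Fraction of transactions containing every item of `itemset`."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(lhs, rhs):
    """P(rhs in basket | lhs in basket) = support(lhs u rhs) / support(lhs)."""
    return support(lhs | rhs) / support(lhs)

def frequent_pairs(min_support=0.4):
    items = sorted(set().union(*transactions))
    return [set(p) for p in combinations(items, 2) if support(set(p)) >= min_support]

if __name__ == "__main__":
    print(frequent_pairs())
    print(confidence({"diapers"}, {"beer"}))   # 3 of 4 diaper baskets contain beer
```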
7.
The integration of genomics and patient-related data is considered one of the most promising research topics in health care. Started in 2004, the Grid for Geno Medicine (GGM) project aims at providing a comprehensive grid software infrastructure designed to allow biologists to mine and analyze relationships between medical, genetic and genomic data stored in distributed data warehouses. The proposed layered service-oriented architecture offers a number of independent but compliant services that can be deployed in a grid environment. This paper presents these services, with emphasis on their integration into a common software platform and on the use case that is carried out. It also presents the current state of the developments and of the performance evaluations.
8.
A hierarchical hybrid model of parallel metaheuristics is proposed, combining an evolutionary algorithm and an adaptive simulated annealing. The algorithms are executed inside a grid environment with different parallelization strategies: the synchronous multi-start model, parallel evaluation of different solutions, and an insular model with asynchronous migrations. Furthermore, a conjugate-gradient local search method is employed at different stages of the exploration process. The algorithms were evaluated on the protein structure prediction problem, using as benchmarks the tryptophan-cage protein (Brookhaven Protein Data Bank ID: 1L2Y), the tryptophan-zipper protein (PDB ID: 1LE1) and the α-Cyclodextrin complex. Experiments were performed on a nation-wide grid infrastructure spanning six distinct administrative domains and gathering nearly 1,000 CPUs. The complexity of the protein structure prediction problem remains prohibitive for large proteins, making parallel computing on the computational grid essential for its efficient resolution. This work was carried out within the DOCK (Conformational Sampling and Docking on Grids) project, supported by the ANR (Agence Nationale de la Recherche). The project is a joint effort of LIFL (USTL-CNRS-INRIA), IBL (CNRS-INSERM) and CEA DSV/DRDC.
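The insular model with asynchronous migrations can be sketched as follows; a toy sphere function stands in for the protein-energy landscape, and the grid deployment, the adaptive simulated annealing and the conjugate-gradient local search of the paper are omitted.

```python
# Sketch of the insular model: each island evolves its own population and
# periodically sends its best individual to a neighbour's mailbox, which the
# neighbour reads whenever it is ready (asynchronous migration).
import random
from collections import deque

def fitness(x):                      # placeholder objective to minimize
    return sum(v * v for v in x)

def mutate(x, step=0.3):
    return [v + random.gauss(0.0, step) for v in x]

def evolve(population, migrant=None):
    """One (mu+lambda)-style step: mutate everyone, keep the best half."""
    pool = population + [mutate(p) for p in population]
    if migrant is not None:
        pool.append(migrant)         # incoming individual from another island
    pool.sort(key=fitness)
    return pool[:len(population)]

def run_islands(n_islands=4, pop_size=8, dim=5, generations=50, migrate_every=10):
    random.seed(0)
    islands = [[[random.uniform(-5, 5) for _ in range(dim)] for _ in range(pop_size)]
               for _ in range(n_islands)]
    mailboxes = [deque() for _ in range(n_islands)]    # read when ready
    for gen in range(generations):
        for i, pop in enumerate(islands):
            migrant = mailboxes[i].popleft() if mailboxes[i] else None
            islands[i] = evolve(pop, migrant)
            if gen % migrate_every == 0:               # ring topology migration
                mailboxes[(i + 1) % n_islands].append(islands[i][0])
    return min(fitness(isl[0]) for isl in islands)

if __name__ == "__main__":
    print(run_islands())   # best energy found across islands
```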
9.
Well suited to embarrassingly parallel applications, the master-worker (MW) paradigm has been widely and successfully used in parallel distributed computing. Nevertheless, its scalability is very limited in large computational grids. A natural way to improve scalability is to add a layer of masters between the master and the workers, resulting in a hierarchical MW (HMW). In most existing HMW frameworks and algorithms, only a single layer of masters is used, the hierarchy is built statically and the granularity of tasks is fixed. Such frameworks and algorithms are not suited to grids, which are volatile, heterogeneous and large-scale environments. In this paper, we revisit the HMW paradigm to match these characteristics of grids. We propose a new dynamic adaptive multi-layer hierarchical MW (AHMW) that addresses the scalability, volatility and heterogeneity issues. The construction and deployment of the hierarchy and the task management (deployment, decomposition of work, distribution of tasks, …) are performed in a dynamic, collaborative and distributed way. The framework has been applied to the parallel Branch and Bound algorithm and experimented on the Flow-Shop scheduling problem. The implementation uses the ProActive grid middleware, and large-scale experiments have been conducted using about 2000 processors of the Grid'5000 French nation-wide grid infrastructure. The results demonstrate the high scalability of the proposed approach and its efficiency in terms of deployment cost, decomposition and distribution of work, and exploration time. They also show that AHMW outperforms HMW and MW in scalability and in deployment and exploration time.
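A sequential stand-in for the multi-layer hierarchical master-worker idea is sketched below (illustrative only: the real AHMW framework deploys masters and workers across the grid and builds and adapts the hierarchy dynamically; the interval "work unit" and all names are ours).

```python
# Multi-layer MW as recursive decomposition: a master splits its work among the
# layer below; leaves (workers) process fine-grained units; results roll back up.
def decompose(interval, fanout):
    """Split a [lo, hi) work interval into roughly equal sub-intervals."""
    lo, hi = interval
    step = max(1, (hi - lo) // fanout)
    bounds = list(range(lo, hi, step)) + [hi]
    return [(bounds[i], bounds[i + 1]) for i in range(len(bounds) - 1)]

def worker(interval, evaluate):
    """Leaf level: exhaustively evaluate the fine-grained work unit."""
    lo, hi = interval
    return min(evaluate(x) for x in range(lo, hi))

def node(interval, evaluate, fanout, depth):
    """A master at depth > 0 delegates to the layer below, then aggregates."""
    if depth == 0:
        return worker(interval, evaluate)
    results = [node(sub, evaluate, fanout, depth - 1)
               for sub in decompose(interval, fanout)]
    return min(results)

if __name__ == "__main__":
    # Top master -> 4 sub-masters -> workers, over the work interval [0, 10000).
    best = node((0, 10_000), evaluate=lambda x: (x - 4321) ** 2, fanout=4, depth=2)
    print(best)   # 0, found by whichever worker owns 4321
```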