Similar Literature
20 similar documents retrieved (search time: 593 ms).
1.
BSP is a bridging model between abstract execution and concrete parallel systems. The structure and abstraction brought by BSP allow for portable parallel programs with scalable performance predictions, without dealing with the low-level details of architectures. In the past, we designed BSML for programming BSP algorithms in ML. However, the simplicity of the BSP model does not fit the complexity of today's hierarchical architectures, such as clusters of machines with multiple multi-core processors. The Multi-BSP model is an extension of the BSP model that brings a tree-based view of the nested components of hierarchical architectures. To program Multi-BSP algorithms in ML, we propose the Multi-ML language as an extension of BSML, in which a specific kind of recursion is used to traverse the hierarchy of computing nodes. We define a formal semantics of the language and present preliminary experiments that show performance improvements with respect to BSML.
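To make the tree-shaped recursion concrete, here is a minimal Python sketch (not Multi-ML itself, which is an ML extension) of how a Multi-BSP-style computation recurses over the nested components of a hierarchical machine; MultiBspNode, leaf_step, and combine are hypothetical names chosen for illustration.

```python
# Minimal sketch (Python, not Multi-ML): evaluating a Multi-BSP-style
# computation by recursing over a tree of nested components.
# MultiBspNode, leaf_step, and combine are hypothetical names.

from dataclasses import dataclass, field
from typing import List

@dataclass
class MultiBspNode:
    children: List["MultiBspNode"] = field(default_factory=list)

def multi_bsp_run(node, leaf_step, combine, data):
    """Recurse through the hierarchy: leaves compute, inner nodes
    synchronize their children and combine partial results."""
    if not node.children:                      # leaf: one BSP computing unit
        return leaf_step(data)
    # split work among sub-components, one chunk per child
    chunks = [data[i::len(node.children)] for i in range(len(node.children))]
    partials = [multi_bsp_run(c, leaf_step, combine, chunk)
                for c, chunk in zip(node.children, chunks)]
    return combine(partials)                   # synchronization at this level

# Example: hierarchical sum on a 2-node machine with 2 cores each.
machine = MultiBspNode([MultiBspNode([MultiBspNode(), MultiBspNode()]),
                        MultiBspNode([MultiBspNode(), MultiBspNode()])])
print(multi_bsp_run(machine, sum, sum, list(range(16))))  # -> 120
```

The same recursive structure applies whatever the leaf computation is: the tree mirrors the machine, and each inner node plays the role of a barrier for its subtree.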

2.
We present BSP-Why, a tool for the deductive verification of BSP algorithms with subgroup synchronisation. From BSP programs, BSP-Why generates sequential code for the back-end verification condition generator Why, and thus benefits from its large range of existing provers. By enabling subgroups, the user can prove the correctness of programs that run on hierarchical machines, e.g. clusters of multi-cores. In general, BSP-Why is able to generate proof obligations for MPI programs that use only collective operations. Our case studies are distributed state-space construction algorithms, the basis of model checking.

3.
We describe a scheme for subdividing long-running, variable-length analyses into short, fixed-length BOINC workunits, using phylogenetic analyses as an example. Fixed-length workunits decrease variance in analysis runtime, improve overall system throughput, and make BOINC a more useful resource for analyses that require a relatively fast turnaround time, such as the phylogenetic analyses submitted by users of the GARLI web service at molecularevolution.org. Additionally, we explain why these changes will benefit volunteers who contribute their processing power to BOINC projects, such as the Lattice BOINC Project (http://boinc.umiacs.umd.edu). Our results, which demonstrate the advantages of relatively short workunits, should be of general interest to anyone who develops and deploys an application on the BOINC platform.

4.
How can we discover interesting patterns from time-evolving high-speed data streams? How can we analyze the data streams quickly and accurately, with little space overhead? How can we guarantee that the found patterns are self-consistent? High-speed data streams have been receiving increasing attention due to their wide applications such as sensor networks, network traffic, and social networks. The most fundamental task on a data stream is frequent pattern mining; in particular, focusing on recent data is important in real applications. In this paper, we develop two algorithms for discovering recently frequent patterns in data streams. First, we propose TwMinSwap to find the top-k recently frequent items in data streams, a deterministic version of our motivating algorithm TwSample, which provides theoretical guarantees based on item sampling. TwMinSwap improves on TwSample in terms of speed, accuracy, and memory usage. Both require only O(k) memory and do not require any prior knowledge about the stream, such as its length or the number of distinct items in it. Second, we propose TwMinSwap-Is to find the top-k recently frequent itemsets in data streams. We especially focus on keeping the discovered itemsets self-consistent, the most important property for reliable results, while using O(k) memory space under the assumption of a constant itemset size. Through extensive experiments, we demonstrate that TwMinSwap outperforms all competitors in terms of accuracy and memory usage, with fast running time. We also show that TwMinSwap-Is is more accurate than its competitor and discovers recently frequent itemsets of reasonably large size (at most 5–7, depending on the dataset). Thanks to TwMinSwap and TwMinSwap-Is, we report interesting discoveries in real-world data streams, including differing trends between the winning and losing U.S. presidential candidates, and temporal human contact patterns.
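As an illustration of the counter-based, O(k)-memory idea, here is a hedged Python sketch of a top-k recent-item tracker; the paper's exact TwMinSwap update rules are not reproduced here, so the time-weighting via a decay factor and the count inheritance on a swap (as in counter-based sketches such as Space-Saving) are assumptions.

```python
# Hedged sketch of a TwMinSwap-style tracker (the paper's exact update
# rules may differ): k time-weighted counters; an untracked item can
# swap out the current minimum. The decay factor is an assumed knob.

def track_topk(stream, k, decay=0.99):
    counts = {}                              # item -> time-weighted count
    for item in stream:
        counts = {x: c * decay for x, c in counts.items()}  # age old counts
        if item in counts:
            counts[item] += 1.0
        elif len(counts) < k:
            counts[item] = 1.0
        else:                                # table full: min-swap
            victim = min(counts, key=counts.get)
            inherited = counts[victim]       # assumption: inherit the count,
            del counts[victim]               # as in Space-Saving-like sketches
            counts[item] = inherited + 1.0
        # O(k) memory throughout; no prior knowledge of the stream needed
    return sorted(counts, key=counts.get, reverse=True)

print(track_topk("abacabadabacabae", k=3))   # 'a' dominates recent history
```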

5.
Massively parallel architectures are mainly based on a heterogeneous setup: they are composed of different computing devices that speed up specific code regions, named kernels. These kernels are usually offloaded to the corresponding devices. Porting applications to a specific heterogeneous platform is a costly task in terms of time and human resources. The key points in the porting process are the manual analysis of the source code and kernel detection. Each device of these heterogeneous platforms has its own restrictions, such as memory allocation support, and kernels must be mapped to suitable computing devices. We introduce AKI, an automatic kernel identification and annotation tool that aims to identify potential kernels in sequential C++ applications, namely those kernels that can be offloaded to heterogeneous computing devices. To annotate these kernels, REPARA C++ attributes have been defined. This annotation mechanism can aid future automatic source-to-source transformation tools targeting parallel heterogeneous platforms. AKI has been evaluated on all benchmarks included in the NAS suite, which incorporates a large set of realistic high-performance applications. The evaluation results demonstrate that AKI is a competitive solution for identifying and annotating parallel code fragments (i.e., kernels).

6.
7.
In this paper we consider both the maximization variant Max Rep and the minimization variant Min Rep of the famous Label Cover problem. So far, the best approximation ratio known for these two problems was \(O(\sqrt{n})\), and indeed some authors suggested that this ratio might be the best possible approximation factor for them. We show, in fact, that there is an \(O(n^{1/3})\)-approximation algorithm for Max Rep and an \(O(n^{1/3}\log^{2/3} n)\)-approximation algorithm for Min Rep. In addition, we exhibit a randomized reduction from Densest k-Subgraph to Max Rep, showing that any approximation factor for Max Rep implies the same factor (up to a constant) for Densest k-Subgraph.

8.
Given a simple undirected graph G = (V, E) and an integer k < |V|, the Sparsest k-Subgraph problem asks for a set of k vertices that induces the minimum number of edges. As a generalization of the classical independent set problem, Sparsest k-Subgraph is NP-hard, and even not approximable unless P = NP, in general graphs. Thus, we investigate Sparsest k-Subgraph in graph classes where independent set is polynomial-time solvable, such as subclasses of perfect graphs. Our two main results are the NP-hardness of Sparsest k-Subgraph on chordal graphs, and a greedy 2-approximation algorithm. Finally, we also show how to derive a PTAS for Sparsest k-Subgraph on proper interval graphs.
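For intuition, the following Python sketch shows a greedy heuristic in the spirit of the paper's 2-approximation: repeatedly add the vertex that induces the fewest new edges with the current selection. The paper's analyzed greedy rule may differ.

```python
# Hedged sketch: a greedy heuristic for Sparsest k-Subgraph that
# repeatedly adds the vertex inducing the fewest new edges with the
# current selection. The paper's analyzed 2-approximation may differ.

def greedy_sparsest_k_subgraph(adj, k):
    chosen = set()
    for _ in range(k):
        # pick the vertex adding the fewest edges into the chosen set
        v = min((u for u in adj if u not in chosen),
                key=lambda u: sum(1 for w in adj[u] if w in chosen))
        chosen.add(v)
    return chosen

# 6-vertex path 0-1-2-3-4-5, adjacency-list representation
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
S = greedy_sparsest_k_subgraph(adj, 3)
edges = sum(1 for u in S for w in adj[u] if w in S) // 2
print(S, edges)   # e.g. {0, 2, 4} with 0 induced edges
```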

9.
Let t be a program-predicate that tests another program p with respect to a given postcondition; concrete tests d (input data of the program p) are the input data of t. Consider the program t when the value of its argument d is unknown. Then a proof that the predicate t is true for all input data of the program p constitutes verification of p with respect to the given postcondition. In this paper, we describe experiments on the automatic verification of a number of cache coherence protocols with the SCP4 supercompiler (an optimizer of programs written in the REFAL-5 functional language).

10.
Task assignment on the Internet has been widely applied to many areas, e.g., online labor markets, online paper review, and social activity organization. In this paper, we are concerned with the task assignment problem in the online labor market, termed ClusterHire. We improve the definition of the ClusterHire problem and propose an efficient and effective algorithm, entitled Influence. In addition, we place a participation constraint on ClusterHire: it bounds the load of each expert in order to keep any member from overworking. For the participation-constrained ClusterHire problem, we devise two algorithms, named ProjectFirst and Era. The former generates a participation-constrained team by adding experts to an initial team; the latter generates one by removing the experts with the minimum influence from the universe of experts. The experimental evaluations indicate that 1) Influence performs better than the state-of-the-art algorithms in terms of effectiveness and time efficiency; 2) ProjectFirst performs better than Era in terms of time efficiency, yet Era performs better than ProjectFirst in terms of effectiveness.

11.
We study the computational complexity of the existence and verification problems for wonderfully stable partitions (WSPE and WSPV) and of the existence problem for strictly core stable coalition structures (SCSCS) in enemy-oriented hedonic games. In this note, we show that WSPV is NP-complete and that both WSPE and SCSCS are DP-hard, where DP is the second level of the Boolean hierarchy, and we discuss an approach for classifying the latter two problems in terms of their complexity.

12.
This paper discusses the implementation of deformable ring gears in multi-body simulation models (MBS models) of planetary gearboxes, using the example of a wind turbine (WT). For this purpose, an add-on to the MBS software Simpack was developed and tested by the project partners Bosch Rexroth and the Institute for Machine Elements and Machine Design (IME) at RWTH Aachen University. The presented research includes a measurement campaign on a test rig owned by Bosch Rexroth to obtain data for validating the newly developed add-on.

13.
Satisfiability Modulo Theories (SMT) has been widely investigated over the last decade. Recently, researchers have extended SMT to the optimization problem over linear arithmetic constraints. To the best of our knowledge, Symba and OPT-MathSAT are the two most efficient solvers available for this problem. The key algorithms used by Symba and OPT-MathSAT consist of a loop over two procedures: 1) critical finding, which detects a critical point that is very likely to be globally optimal, and 2) global checking, which confirms that the critical point really is globally optimal. In this paper, we propose a new approach based on the Simplex method widely used in operations research. Our fundamental idea is to find several critical points by constructing and solving a series of linear problems with the Simplex method. Our approach replaces the critical-finding algorithms of Symba and OPT-MathSAT, reducing the runtime of critical finding and decreasing the number of executions of global checking. The correctness of our approach is proved. Our experiments evaluate our implementation against Symba and OPT-MathSAT on a critical class of problems in real-time systems. Our approach outperforms Symba on 99.6% of the benchmarks and is superior to OPT-MathSAT in large-scale cases where the number of tasks is more than 24. The experimental results demonstrate that our approach has great potential and competitiveness for this optimization problem.
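The building block of such an approach, solving a linear problem with the Simplex method, can be sketched with SciPy as below. This is not the authors' implementation, only an illustration of the kind of subproblem the critical-finding loop solves; the toy objective and constraints are made up.

```python
# Hedged sketch: solving one linear problem with a Simplex-based solver,
# the subroutine repeatedly invoked during critical finding. Not the
# Symba/OPT-MathSAT code; the objective and constraints are invented.

from scipy.optimize import linprog

# maximize x + 2y  subject to  x + y <= 4,  x - y <= 1,  x, y >= 0
res = linprog(c=[-1, -2],                 # linprog minimizes, so negate
              A_ub=[[1, 1], [1, -1]],
              b_ub=[4, 1],
              bounds=[(0, None), (0, None)],
              method="highs-ds")          # HiGHS dual simplex backend
print(res.x, -res.fun)                    # optimum at a vertex: [0, 4], 8.0
```

Each such solve lands on a vertex of the feasible polyhedron, which is exactly the kind of candidate "critical point" that global checking then has to confirm or refute.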

14.
Providing high-level tools for parallel programming while sustaining a high level of performance has been a challenge that techniques like Domain Specific Embedded Languages try to solve. In previous works, we investigated the design of such a DSEL, NT\(^2\), providing a Matlab-like syntax for parallel numerical computations inside a C++ library. In this paper, we show how NT\(^2\) has been redesigned for shared memory systems in an extensible and portable way. The new NT\(^2\) design relies on a tiered Parallel Skeleton system built using asynchronous task management and automatic compile-time taskification of user-level code. We describe how this system can operate various shared memory runtimes and evaluate the design using two benchmarks implementing linear algebra algorithms.

15.
There is an increasing interest in executing complex analyses over large graphs, many of which require processing a large number of multi-hop neighborhoods or subgraphs. Examples include ego network analysis, motif counting, finding social circles, personalized recommendations, link prediction, anomaly detection, analyzing influence cascades, and others. These tasks are not well served by existing vertex-centric graph processing frameworks, where user programs are only able to directly access the state of a single vertex at a time, resulting in high communication, scheduling, and memory overheads in executing such tasks. Further, most existing graph processing frameworks ignore the challenges in extracting the relevant portions of the graph that an analysis task is interested in, and loading those onto distributed memory. This paper introduces NScale, a novel end-to-end graph processing framework that enables the distributed execution of complex subgraph-centric analytics over large-scale graphs in the cloud. NScale enables users to write programs at the level of subgraphs rather than at the level of vertices. Unlike most previous graph processing frameworks, which apply the user program to the entire graph, NScale allows users to declaratively specify subgraphs of interest. Our framework includes a novel graph extraction and packing (GEP) module that utilizes a cost-based optimizer to partition and pack the subgraphs of interest into memory on as few machines as possible. The distributed execution engine then takes over and runs the user program in parallel on those subgraphs, restricting the scope of the execution appropriately, and utilizes novel techniques to minimize memory consumption by exploiting overlaps among the subgraphs. We present a comprehensive empirical evaluation comparing against three state-of-the-art systems, namely Giraph, GraphLab, and GraphX, on several real-world datasets and a variety of analysis tasks. Our experimental results show orders-of-magnitude improvements in performance and drastic reductions in the cost of analytics compared to vertex-centric approaches.

16.
A degree-constrained graph orientation of an undirected graph G is an assignment of a direction to each edge in G such that the outdegree of every vertex in the resulting directed graph satisfies a specified lower and/or upper bound. Such graph orientations have been studied for a long time, and various characterizations of their existence are known. In this paper, we consider four related optimization problems introduced by Asahiro et al. (LNCS 7422, 332–343, 2012): for any fixed non-negative integer W, the problems MAX W-LIGHT, MIN W-LIGHT, MAX W-HEAVY, and MIN W-HEAVY take as input an undirected graph G and ask for an orientation of G that maximizes or minimizes the number of vertices with outdegree at most W or at least W; earlier results on these problems appear in the same reference.
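As a hedged illustration (not the paper's algorithms), the Python sketch below orients each edge greedily toward balance and then counts the MAX W-LIGHT objective, i.e., the number of vertices with outdegree at most W.

```python
# Hedged sketch: a naive balancing orientation heuristic (not the
# paper's algorithms) plus the objective counted by MAX W-LIGHT, the
# number of vertices with outdegree at most W.

from collections import defaultdict

def orient_balanced(edges):
    outdeg = defaultdict(int)
    orientation = []
    for u, v in edges:
        # give the edge to the endpoint with currently smaller outdegree
        tail, head = (u, v) if outdeg[u] <= outdeg[v] else (v, u)
        outdeg[tail] += 1
        orientation.append((tail, head))
    return orientation, outdeg

def count_w_light(outdeg, vertices, W):
    return sum(1 for v in vertices if outdeg[v] <= W)

edges = [(0, 1), (1, 2), (2, 0), (2, 3)]
orientation, outdeg = orient_balanced(edges)
print(orientation, count_w_light(outdeg, range(4), W=1))  # all 4 are 1-light
```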

17.
What do the k-core structures of real-world graphs look like? What are the common patterns and the anomalies? How can we exploit them for applications? A k-core is the maximal subgraph in which all vertices have degree at least k. This concept has been applied to such diverse areas as hierarchical structure analysis, graph visualization, and graph clustering. Here, we explore pervasive patterns related to k-cores that emerge in graphs from diverse domains. Our discoveries are: (1) Mirror Pattern: coreness (i.e., the maximum k such that a vertex belongs to the k-core) is strongly correlated with degree. (2) Core-Triangle Pattern: degeneracy (i.e., the maximum k such that the k-core exists) obeys a 3-to-1 power law with respect to the count of triangles. (3) Structured Core Pattern: degeneracy-cores are not cliques but have non-trivial structures such as core-periphery and communities. Our algorithmic contributions show the usefulness of these patterns. (1) Core-A, which measures the deviation from Mirror Pattern, successfully spots anomalies in real-world graphs. (2) Core-D, a single-pass streaming algorithm based on Core-Triangle Pattern, accurately estimates degeneracy up to 12\(\times \) faster than its competitor. (3) Core-S, inspired by Structured Core Pattern, identifies influential spreaders up to 17\(\times \) faster than its competitors with comparable accuracy.
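For reference, coreness and degeneracy as used above can be computed with the standard peeling algorithm; the sketch below is a straightforward heap-based Python version and is not one of the paper's algorithms.

```python
# Standard peeling algorithm for coreness (the maximum k such that a
# vertex belongs to the k-core); degeneracy is the maximum coreness.
# Heap-based version; stale heap entries are skipped lazily.

import heapq

def coreness(adj):
    deg = {v: len(ns) for v, ns in adj.items()}
    heap = [(d, v) for v, d in deg.items()]
    heapq.heapify(heap)
    removed, core, k = set(), {}, 0
    while heap:
        d, v = heapq.heappop(heap)
        if v in removed or d > deg[v]:
            continue                     # stale entry: degree has dropped
        k = max(k, deg[v])               # vertices are peeled in core order
        core[v] = k
        removed.add(v)
        for w in adj[v]:
            if w not in removed:
                deg[w] -= 1
                heapq.heappush(heap, (deg[w], w))
    return core                          # degeneracy = max(core.values())

# triangle 0-1-2 with a pendant vertex 3
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
print(coreness(adj))                     # {3: 1, 0: 2, 1: 2, 2: 2}
```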

18.
The paper studies three fundamental problems in graph analytics: computing connected components (CCs), biconnected components (BCCs), and 2-edge-connected components (ECCs) of a graph. With the recent advent of big data, developing efficient distributed algorithms for computing the CCs, BCCs, and ECCs of a big graph has received increasing interest. As with the existing research efforts, we focus on the Pregel programming model, although the techniques may be extended to other programming models including MapReduce and Spark. The state-of-the-art techniques for computing CCs and BCCs in Pregel incur \(O(m \times \#\text{supersteps})\) total cost for both data communication and computation, where m is the number of edges in the graph and #supersteps is the number of supersteps. Since the network communication speed is usually much slower than the computation speed, communication costs dominate the total running time of the existing techniques. In this paper, we propose a new paradigm based on graph decomposition that computes CCs and BCCs with O(m) total communication cost. The total computation cost of our techniques is also smaller than that of the existing techniques in practice, though theoretically almost the same. Moreover, we also study the distributed computation of ECCs; we are the first to study this problem, and we propose an approach with O(m) total communication cost. Comprehensive empirical studies demonstrate that our approaches can outperform the existing techniques by one order of magnitude in total running time.
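For context, the classic Pregel-style baseline for CCs is hash-min label propagation, whose superstep count is exactly what multiplies the O(m) per-round communication; below is a sequential Python simulation of it (not the paper's decomposition-based algorithm).

```python
# Classic Pregel-style connected-components baseline (hash-min label
# propagation), whose superstep count drives the O(m x #supersteps)
# communication cost. Sequential simulation; a real Pregel run is
# synchronous, with messages delivered between supersteps.

def pregel_cc(adj):
    label = {v: v for v in adj}          # each vertex starts with its own id
    supersteps, changed = 0, True
    while changed:                       # one loop iteration = one superstep
        supersteps += 1
        changed = False
        for v in adj:                    # each vertex adopts the minimum
            best = min([label[v]] + [label[w] for w in adj[v]])
            if best < label[v]:
                label[v] = best
                changed = True
        # every edge carries a message per superstep -> O(m) per round
    return label, supersteps

adj = {0: [1], 1: [0], 2: [3], 3: [2, 4], 4: [3]}
print(pregel_cc(adj))                    # two components: {0,1} and {2,3,4}
```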

19.
Given a real-world graph, how can we find a large subgraph whose partition quality is much better than that of the original? How can we use a partition of that subgraph to discover a high-quality global partition? Although graph partitioning, especially with balanced sizes, has received attention in various applications, it is known to be NP-hard, and it is also known that real graphs have no good cuts at large scales. In this paper, we propose a novel approach to graph partitioning. Our first focus is on finding a large subgraph with high-quality partitions, in terms of conductance. Despite the difficulty of the task for the whole graph, we observe that there is a large connected subgraph whose partition quality is much better than the original. Our proposed method MTP finds such a subgraph by removing "hub" nodes with large degrees and taking the remaining giant connected component. Further, we extend MTP to gbMTP (Global Balanced MTP) for discovering a global balanced partition. gbMTP attaches the nodes excluded by MTP to the partition found by MTP in a greedy way. In experiments, we demonstrate that MTP finds a large subgraph with low-conductance partitions, compared with competing methods. We also show that the competitors cannot find connected subgraphs, while our method does by construction. The improvement in partition quality for the subgraph is especially noticeable for large-scale cuts: for a balanced partition, down to 14% of the original conductance with a subgraph size of 70% of the total. As a result, the found subgraph has clear partitions at almost all scales, in contrast to the original. Moreover, gbMTP generally discovers global balanced partitions whose conductance is lower than that found by METIS, the state-of-the-art graph partitioning method.
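The core of MTP as described above is easy to sketch: drop the highest-degree hub nodes, then keep the giant connected component of what remains. In the Python sketch below, the fraction of nodes treated as hubs is an assumed knob, not a value from the paper.

```python
# Hedged sketch of the MTP idea as described in the abstract: remove
# the highest-degree "hub" nodes, then keep the giant connected
# component. The hub fraction is an assumed parameter.

def mtp_subgraph(adj, hub_fraction=0.01):
    n_hubs = max(1, int(len(adj) * hub_fraction))
    hubs = set(sorted(adj, key=lambda v: len(adj[v]), reverse=True)[:n_hubs])
    kept = {v for v in adj if v not in hubs}
    # giant connected component of the remaining graph, via DFS
    best, seen = set(), set()
    for s in kept:
        if s in seen:
            continue
        comp, stack = set(), [s]
        while stack:
            v = stack.pop()
            if v in comp:
                continue
            comp.add(v)
            stack.extend(w for w in adj[v] if w in kept and w not in comp)
        seen |= comp
        best = comp if len(comp) > len(best) else best
    return best

# hub 0 joins two otherwise separate paths 1-2-3 and 4-5
adj = {0: [1, 2, 3, 4, 5], 1: [0, 2], 2: [0, 1, 3],
       3: [0, 2], 4: [0, 5], 5: [0, 4]}
print(mtp_subgraph(adj, hub_fraction=0.2))   # -> {1, 2, 3}
```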

20.
Given a large collection of co-evolving online activities, such as searches for the keywords "Xbox", "PlayStation", and "Wii", how can we find patterns and rules? Are these keywords related? If so, are they competing against each other? Can we forecast the volume of user activity for the coming month? We conjecture that online activities compete for user attention in the same way that species in an ecosystem compete for food. We present EcoWeb (Ecosystem on the Web), an intuitive model designed as a non-linear dynamical system for mining large-scale co-evolving online activities. Our second contribution is a novel, parameter-free, and scalable fitting algorithm, EcoWeb-Fit, that estimates the parameters of EcoWeb. Extensive experiments on real data show that EcoWeb is effective, in that it can capture long-range dynamics and meaningful patterns such as seasonalities, and practical, in that it can provide accurate long-range forecasts. EcoWeb consistently outperforms existing methods in terms of both accuracy and execution speed.
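The ecosystem analogy suggests dynamics in the family of Lotka-Volterra competition models; the sketch below simulates such a system with forward Euler as an illustration only, since the actual EcoWeb equations and parameters are not reproduced here.

```python
# Hedged sketch: a Lotka-Volterra competition system, the classic
# ecological model behind the "species competing for food" analogy.
# The actual EcoWeb equations and parameters may differ.

import numpy as np

def simulate(r, K, A, x0, steps, dt=0.1):
    """dx_i/dt = r_i * x_i * (1 - (A @ x)_i / K_i), forward Euler."""
    x = np.array(x0, dtype=float)
    traj = [x.copy()]
    for _ in range(steps):
        x = x + dt * r * x * (1.0 - (A @ x) / K)
        traj.append(x.copy())
    return np.array(traj)

r = np.array([0.5, 0.5])                 # growth rates of two activities
K = np.array([100.0, 100.0])             # attention "carrying capacities"
A = np.array([[1.0, 0.8],                # A[i][j]: how much j suppresses i
              [0.8, 1.0]])
traj = simulate(r, K, A, x0=[10.0, 5.0], steps=500)
print(traj[-1])                          # both settle near 100/1.8 ~ 55.6
```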
