首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
S-Net is a declarative coordination language and component technology aimed at radically facilitating software engineering for modern parallel compute systems by near-complete separation of concerns between application (component) engineering and concurrency orchestration. S-Net builds on the concept of stream processing to structure networks of communicating asynchronous components implemented in a conventional (sequential) language. In this paper we present the design, implementation and evaluation of a new and innovative runtime system for S-Net streaming networks. The Front runtime system outperforms the existing implementations of S-Net by orders of magnitude for stress-test benchmarks, significantly reduces runtimes of fully-fledged parallel applications with compute-intensive components and achieves good scalability on our 48-core test system.  相似文献   

2.
The adoption of Artificial Neural Networks (ANNs) in safety-related applications is often avoided because it is difficult to rule out possible misbehaviors with traditional analytical or probabilistic techniques. In this paper we present NeVer, our tool for checking safety of ANNs. NeVer encodes the problem of verifying safety of ANNs into the problem of satisfying corresponding Boolean combinations of linear arithmetic constraints. We describe the main verification algorithm and the structure of NeVer. We present also empirical results confirming the effectiveness of NeVer on realistic case studies.  相似文献   

3.
In recent years multi-core processors have seen broad adoption in application domains ranging from embedded systems through general-purpose computing to large-scale data centres. Simulation technology for multi-core systems, however, lags behind and does not provide the simulation speed required to effectively support design space exploration and parallel software development. While state-of-the-art instruction set simulators (Iss) for single-core machines reach or exceed the performance levels of speed-optimised silicon implementations of embedded processors, the same does not hold for multi-core simulators where large performance penalties are to be paid. In this paper we develop a fast and scalable simulation methodology for multi-core platforms based on parallel and just-in-time (Jit) dynamic binary translation (Dbt). Our approach can model large-scale multi-core configurations, does not rely on prior profiling, instrumentation, or compilation, and works for all binaries targeting a state-of-the-art embedded multi-core platform implementing the ARCompact instruction set architecture (Isa). We have evaluated our parallel simulation methodology against the industry standard Splash-2 and Eembc MultiBench benchmarks and demonstrate simulation speeds up to 25,307 Mips on a 32-core x86 host machine for as many as 2,048 target processors whilst exhibiting minimal and near constant overhead, including memory considerations.  相似文献   

4.
The uml Profile for Modeling and Analysis of Real-Time and Embedded (RTE) systems has recently been adopted by the OMG. Its Time Model extends the informal and simplistic Simple Time package proposed by Unified Modeling Language (UML2) and offers a broad range of capabilities required to model RTE systems including discrete/dense and chronometric/logical time. The Marte specification introduces a Time Structure inspired from several time models of the concurrency theory and proposes a new clock constraint specification language (ccsl) to specify, within the context of the uml, logical and chronometric time constraints. A semantic model in ccsl is attached to a (uml) model to give its timed causality semantics. In that sense, ccsl is comparable to the Ptolemy environment, in which directors give the semantics to models according to predefined models of computation and communication. This paper focuses on one historical model of computation of Ptolemy [Synchronous Data Flow (SDF)] and shows how to build SDF graphs by combining uml models and ccsl.  相似文献   

5.
We describe PSurface, a C $++$ library that allows to store and access piecewise linear maps between simplicial surfaces in $\mathbb{R }^2$ and $\mathbb{R }^3$ . Piecewise linear maps can be used, e.g., to construct boundary approximations for finite element grids, and grid intersections for domain decomposition methods. In computer graphics the maps allow to build level-of-detail representations as well as texture- and bump maps. The PSurface library can be used as the basis for the implementation of a wide range of algorithms that use piecewise linear maps between triangulated surfaces. A few simple examples are given in this work. We document the data structures and algorithms and show how PSurface is used in the numerical analysis framework Dune and the visualization software Amira.  相似文献   

6.
Graphs appear in numerous applications including cyber security, the Internet, social networks, protein networks, recommendation systems, citation networks, and many more. Graphs with millions or even billions of nodes and edges are common-place. How to store such large graphs efficiently? What are the core operations/queries on those graph? How to answer the graph queries quickly? We propose Gbase, an efficient analysis platform for large graphs. The key novelties lie in (1) our storage and compression scheme for a parallel, distributed settings and (2) the carefully chosen graph operations and their efficient implementations. We designed and implemented an instance of Gbase using Mapreduce/Hadoop. Gbase provides a parallel indexing mechanism for graph operations that both saves storage space, as well as accelerates query responses. We run numerous experiments on real and synthetic graphs, spanning billions of nodes and edges, and we show that our proposed Gbase is indeed fast, scalable, and nimble, with significant savings in space and time.  相似文献   

7.
A DIN Kernel LISP Draft (DKLisp) has been developed by DIN as Reaction to Action D1 (N79), short term goal, of ISO WG16. It defines a subset language, as compatible as possible with the ANSICommon-Lisp draft, but also with theEuLisp draft. It combines the most important LISP main stream features in a single, compact, but nevertheless complete language definition, which thereby could be well suited as basis for a short term InternationalLisp Standard. Besides the functional and knowledge processing features, the expressive power of the language is well comparable with contemporary procedural languages, as e.g. C++ (of course without libraries). Important features ofDKLisp are:
  • to be a “Lisp-1,” but allowing an easy “Lisp-2” transformation;
  • to be a simple, powerful and standardized educationalLisp;
  • to omit all features, which are unclean or in heavy discussion;
  • DKLisp programs run nearly unchanged inCommon-Lisp;
  • DKLisp contains a simple object and package system;
  • DKLisp contains those data classes and control structures also common to most modernLisp and non-Lisp languages;
  • DKLisp offers a simple stream I/O;
  • DKLisp contains a clean unified hierarchical class/type system;
  • DKLisp contains the typical “Lisp-features” in an orthogonal way;
  • DKLisp allows and encourages really small but powerful implementations;
  • DKLisp comes in levels, so allowing ANSICommon-Lisp to be an extension ofDKLisp level-1.
  • The present is the second version of the proposal, namely version 1.2, with slight differences with respect to the one sent to ISO. Sources of such changes were the remarks generously sent by many readers of the previous attempt.  相似文献   

    8.
    In this paper, we present Para Miner which is a generic and parallel algorithm for closed pattern mining. Para Miner is built on the principles of pattern enumeration in strongly accessible set systems. Its efficiency is due to a novel dataset reduction technique (that we call EL-reduction), combined with novel technique for performing dataset reduction in a parallel execution on a multi-core architecture. We illustrate Para Miner’s genericity by using this algorithm to solve three different pattern mining problems: the frequent itemset mining problem, the mining frequent connected relational graphs problem and the mining gradual itemsets problem. In this paper, we prove the soundness and the completeness of Para Miner. Furthermore, our experiments show that despite being a generic algorithm, Para Miner can compete with specialized state of the art algorithms designed for the pattern mining problems mentioned above. Besides, for the particular problem of gradual itemset mining, Para Miner outperforms the state of the art algorithm by two orders of magnitude.  相似文献   

    9.
    Reasoning about the termination of equational programs in sophisticated equational languages such as Elan, Maude, OBJ, CafeOBJ, Haskell, and so on, requires support for advanced features such as evaluation strategies, rewriting modulo, use of extra variables in conditions, partiality, and expressive type systems (possibly including polymorphism and higher-order). However, many of those features are, at best, only partially supported by current term rewriting termination tools (for instance mu-term, C i ME, AProVE, TTT, Termptation, etc.) while they may be essential to ensure termination. We present a sequence of theory transformations that can be used to bridge the gap between expressive membership equational programs and such termination tools, and prove the correctness of such transformations. We also discuss a prototype tool performing the transformations on Maude equational programs and sending the resulting transformed theories to some of the aforementioned standard termination tools.  相似文献   

    10.
    Overlay networks support a wide range of peer-to-peer media streaming applications on the Internet. The user experience of such applications is affected by the churn resilience of the system. When peers disconnect from the system, streamed data may be delayed or lost due to missing links in the overlay topology. In this paper, we explore a proactive strategy to create churn-aware overlay networks that reduce the potential of disruptions caused by churn events. We describe Chams, a middleware for constructing overlay networks that mitigates the impact of churn. Chams uses a ??hybrid?? approach??it implicitly defines an overlay topology using a gossip-style mechanism, while taking the reliability of peers into account. Unlike systems for overlay construction, Chams supports a variety of topologies used in media streaming systems, such as trees, multi-trees and forests. We evaluate Chams with different topologies and show that it reduces the impact of churn, while imposing only low computational and message overheads.  相似文献   

    11.
    Given a set of input points, the Steiner Tree Problem (STP) is to find a minimum-length tree that connects the input points, where it is possible to add new points to minimize the length of the tree. Solving the STP is of great importance since it is one of the fundamental problems in network design, very large scale integration routing, multicast routing, wire length estimation, computational biology, and many other areas. However, the STP is NP-hard, which shatters any hopes of finding a polynomial-time algorithm to solve the problem exactly. This is why the majority of research has looked at finding efficient heuristic algorithms. Additionally, many authors focused their work on utilizing the ever-increasing computational power and developed many parallel and distributed methods for solving the problem. In this way we are able to obtain better results in less time than ever before. Here, we present a survey of the parallel and distributed methods for solving the STP and discuss some of their applications.  相似文献   

    12.
    Scientific applications are getting increasingly complex, e.g., to improve their accuracy by taking into account more phenomena. Meanwhile, computing infrastructures are continuing their fast evolution. Thus, software engineering is becoming a major issue to offer ease of development, portability and maintainability while achieving high performance. Component based software engineering offers a promising approach that enables the manipulation of the software architecture of applications. However, existing models do not provide an adequate support for performance portability of HPC applications. This paper proposes a low level component model (L \(^2\) C) that supports inter-component interactions for typical scenarios of high performance computing, such as process-local shared memory and function invocation (C++ and Fortran), MPI, and Corba. To study the benefits of using L \(^2\) C, this paper walks through an example of stencil computation, i.e. a structured mesh Jacobi implementation of the 2D heat equation parallelized through domain decomposition. The experimental results obtained on the Grid’5000 testbed and on the Curie supercomputer show that L \(^2\) C can achieve performance similar to that of native implementations, while easing performance portability.  相似文献   

    13.
    In this paper we study the prevalent problem of graph partitioning by analyzing the diffusion-based partitioning heuristic Bubble-FOS/C, a key component of a practical successful graph partitioner?(Meyerhenke et al. in J. Parallel Distrib. Comput. 69(9):750?C761, 2009). We begin by studying the disturbed diffusion scheme FOS/C, which computes the similarity measure used in Bubble-FOS/C and is therefore the most crucial component. By relating FOS/C to random walks, we obtain precise characterizations of the behavior of FOS/C on tori and hypercubes. Besides leading to new knowledge on FOS/C (and therefore also on Bubble-FOS/C), these characterizations have been recently used for the analysis of load balancing algorithms?(Berenbrink et al. in Proceedings of the 22nd Annual Symposium on Discrete Algorithms, pp. 429?C439, 2011). We then regard Bubble-FOS/C, which has been shown in previous experiments to produce solutions with good partition shapes and other favorable properties. In this paper we prove that it computes a relaxed solution to an edge cut minimizing binary quadratic program (BQP). This result provides the first substantial theoretical insight why Bubble-FOS/C yields good experimental results in terms of graph partitioning metrics. Moreover, we show that in bisections computed by Bubble-FOS/C, at least one of the two parts is connected. Using the aforementioned relation between FOS/C and random walks, we prove that in vertex-transitive graphs both parts must be connected components.  相似文献   

    14.
    Golomb rulers are special rulers where for any two marks it holds that the distance between them is unique. They find applications in radio frequency selection, radio astronomy, data encryption, communication networks, and bioinformatics. An important subproblem in constructing “compact” Golomb rulers is Golomb Subruler  (GSR), which asks whether it is possible to make a given ruler Golomb by removing at most \(k\) marks. We initiate a study of GSR from a parameterized complexity perspective. In particular, we consider a natural hypergraph characterization of rulers and investigate the construction and structure of the corresponding hypergraphs. We exploit their properties to derive polynomial-time data reduction rules that reduce a given instance of GSR to an equivalent one with \({{\mathrm{O}}}(k^3)\)  marks. Finally, we complement a recent computational complexity study of GSR by providing a simplified reduction that shows NP-hardness even when all integers are bounded by a polynomial in the input length.  相似文献   

    15.
    We treat the problem of generating cost-optimal schedules for orders with individual due dates and cost functions based on earliness/tardiness. Orders can run in parallel in a resource-constrained manufacturing environment, where resources are subject to stochastic breakdowns. The goal is to generate schedules while minimizing the expected costs. First, we estimate the distribution of each order type by simulation (assuming a reasonable machine/load model) and derive from the cost-function an optimal offset from the due date of each individual order. Second, these optimal offsets are then used to guide the generation of schedules which are responsible to resolve resource conflicts. Third, we evaluate the generated schedules by simulation. The approach is demonstrated by means of a non-trivial case-study from lacquer production. Optimal offsets are derived with the Modest/Möbius tool, schedules are generated using Uppaal Cora. The experimental results show that our approach achieves good results in all considered scenarios, and better results than an approach based on adding slack to processing times.  相似文献   

    16.
    In this paper, we present a thorough integration of qualitative representations and reasoning for positional information for domestic service robotics domains into our high-level robot control. In domestic settings for service robots like in the RoboCup@Home competitions, complex tasks such as “get the cup from the kitchen and bring it to the living room” or “find me this and that object in the apartment” have to be accomplished. At these competitions the robots may only be instructed by natural language. As humans use qualitative concepts such as “near” or “far”, the robot needs to cope with them, too. For our domestic robot, we use the robot programming and plan language Readylog, our variant of Golog. In previous work we extended the action language Golog, which was developed for the high-level control of agents and robots, with fuzzy set-based qualitative concepts. We now extend our framework to positional fuzzy fluents with an associated positional context called frames. With that and our underlying reasoning mechanism we can transform qualitative positional information from one context to another to account for changes in context such as the point of view or the scale. We demonstrate how qualitative positional fluents based on a fuzzy set semantics can be deployed in domestic domains and showcase how reasoning with these qualitative notions can seamlessly be applied to a fetch-and-carry task in a RoboCup@Home scenario.  相似文献   

    17.
    This paper presents a distributed (Bulk-Synchronous Parallel or bsp) algorithm to compute on-the-fly whether a structured model of a security protocol satisfies a ctl \(^*\) formula. Using the structured nature of the security protocols allows us to design a simple method to distribute the state space under consideration in a need-driven fashion. Based on this distribution of the states, the algorithm for logical checking of a ltl formula can be simplified and optimised allowing, with few tricky modifications, the design of an efficient algorithm for ctl \(^*\) checking. Some prototype implementations have been developed, allowing to run benchmarks to investigate the parallel behaviour of our algorithms.  相似文献   

    18.
    19.
    In this paper we describe the libMesh (http://libmesh.sourceforge.net) framework for parallel adaptive finite element applications. libMesh is an open-source software library that has been developed to facilitate serial and parallel simulation of multiscale, multiphysics applications using adaptive mesh refinement and coarsening strategies. The main software development is being carried out in the CFDLab (http://cfdlab.ae.utexas.edu) at the University of Texas, but as with other open-source software projects; contributions are being made elsewhere in the US and abroad. The main goals of this article are: (1) to provide a basic reference source that describes libMesh and the underlying philosophy and software design approach; (2) to give sufficient detail and references on the adaptive mesh refinement and coarsening (AMR/C) scheme for applications analysts and developers; and (3) to describe the parallel implementation and data structures with supporting discussion of domain decomposition, message passing, and details related to dynamic repartitioning for parallel AMR/C. Other aspects related to C++ programming paradigms, reusability for diverse applications, adaptive modeling, physics-independent error indicators, and similar concepts are briefly discussed. Finally, results from some applications using the library are presented and areas of future research are discussed.  相似文献   

    20.
    设为首页 | 免责声明 | 关于勤云 | 加入收藏

    Copyright©北京勤云科技发展有限公司  京ICP备09084417号