1.

Python accelerators for high-performance computing

Ami Marowka 《The Journal of supercomputing》2018,74(4):1449-1460

Python has become the preferred language for teaching in academia, and it is one of the most popular programming languages for scientific computing. This wide popularity persists despite the weak performance of the language, a weakness that motivates the Python community's efforts to improve it. In this article, we follow these efforts, focusing on one specific promising solution that aims to provide high performance and performance portability for Python applications.

2.

A new definition for high-performance computing

《Micro, IEEE》2002,22(2):2-2

3.

Communication system for high-performance distributed computing

S. Hariri J.-B. Park M. Parashar G. C. Fox 《Concurrency and Computation》1994,6(4):251-270

With the current advances in computer and networking technology coupled with the availability of software tools for parallel and distributed computing, there has been increased interest in high-performance distributed computing (HPDC). We envision that HPDC environments with supercomputing capabilities will be available in the near future. However, a number of issues have to be resolved before future network-based applications can fully exploit the potential of the HPDC environment. In the paper we present an architecture for a high-speed local area network and a communication system that provides HPDC applications with high bandwidth and low latency. We also characterize the message-passing primitives required in HPDC applications and develop a communication protocol that implements these primitives efficiently.

4.

An RMI-based secondary development model for high-performance computing grids (cited by: 1; self-citations: 0; citations by others: 1)

Rongqiang Cao Zongyan Cao Xuebin Chi Haili Xiao 《计算机应用 (Journal of Computer Applications)》2010,30(9):2526-2529

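Based on the characteristics of high-performance computing and grids, a grid secondary development model (GRM) is proposed. Combining RMI with SSL/TLS, the model provides a consistent interface for accessing middleware, hides the complexity of accessing the grid over a network, and solves the problem of transmitting sensitive data across insecure wide-area networks. GRM is implemented on top of the middleware of a scientific computing grid. Experience developing several GRM-based user interfaces, together with experimental results, shows that GRM gives developers an easy-to-use, full-featured development model with good performance and portability.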

5.

MOVIE model for open-systems-based high-performance distributed computing

Wojtek Furmanski Chris Faigle Tom Haupt Janusz Niemiec Marek Podgorny Diglio Simoni 《Concurrency and Computation》1993,5(4):287-308

MOVIE (Multitasking Object-oriented Visual Interactive Environment) is a new software system for high-performance distributed computing (HPDC), currently in the advanced design and implementation stage at Northeast Parallel Architectures Center (NPAC), Syracuse University. The MOVIE system is structured as a multiserver network of interpreters of the high-level object-oriented programming language MovieScript. MovieScript derives from PostScript and extends it in the C++ syntax-based object-oriented interpreted style towards 3D graphics, high-performance computing and general-purpose high-level communication protocol for distributed and MIMD-parallel computing. The paper describes the overall open systems-based MOVIE design and itemizes currently implemented, developed and planned components of the system.

6.

A flexible high-performance computing platform based on space-grade Virtex FPGAs

Ian Troxel Greg Lara 《电子技术应用 (Application of Electronic Technique)》2009,35(4)

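Combining space-grade Virtex FPGAs with a reconfigurable system architecture makes it possible to meet the stringent size, weight, and power requirements of space-based systems while shortening the design cycle. SEAKR Engineering used reconfigurable Xilinx Virtex FPGAs to create a flexible high-performance computing platform that serves as the core of a variety of space-based systems. The new computing platform has been successfully deployed on four space missions.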

7.

The marketplace of high-performance computing

《Parallel Computing》1999,25(13-14):1517-1544

In this paper we analyze the major trends and changes in the High-Performance Computing (HPC) marketplace since the beginning of the journal 'Parallel Computing'. The initial success of vector computers in the 1970s was driven by raw performance. The introduction of this type of computer system started the area of 'Supercomputing'. In the 1980s the availability of standard development environments and of application software packages became more important. Next to performance, these factors determined the success of MP vector systems, especially with industrial customers. MPPs became successful in the early 1990s due to their better price/performance ratios, which were made possible by the attack of the 'killer-micros'. In the lower and medium market segments the MPPs were replaced by microprocessor-based symmetrical multiprocessor (SMP) systems in the middle of the 1990s. Their success formed the basis for the use of new cluster concepts for very high-end systems. In the last few years only the companies which have entered the emerging markets for massively parallel database servers and financial applications attract enough business volume to be able to support the hardware development for the numerical high-end computing market as well. Success in the traditional floating-point-intensive engineering applications seems to be no longer sufficient for survival in the market.

8.

A parallel computing architecture for high-performance OWL reasoning

《Parallel Computing》2019

The Web Ontology Language (OWL) is a widely used knowledge representation language for describing knowledge in application domains by using classes, properties, and individuals. Ontology classification is an important and widely used service that computes a taxonomy of all classes occurring in an ontology. It can require significant amounts of runtime, but most OWL reasoners do not support any kind of parallel processing. We present a novel thread-level parallel architecture for ontology classification, which is ideally suited for shared-memory SMP servers, but does not rely on locking techniques and thus avoids possible race conditions. We evaluated our prototype implementation with a set of real-world ontologies. Our experiments demonstrate very good scalability, resulting in a speedup that is linear in the number of available cores.

9.

Special issue editorial: Accelerators for high-performance computing

Ramón Doallo Basilio B. Fraguela 《Journal of Parallel and Distributed Computing》2012

10.

Implementation of hierarchical F-channels for high-performance distributed computing

Keith Shafer Mohan Ahuja 《Distributed Computing》1995,8(4):211-218

Summary. High-performance distributed computing systems require high-performance communication systems. F-channels and Hierarchical F-channels address this need by permitting a high level of concurrency like non-FIFO channels while retaining the simplicity of FIFO channels critical to the design and proof of many distributed algorithms. In this paper, we present counter-based implementations for F-channels and Hierarchical F-channels using message augmentation, that is, appending control information to a message. These implementations guarantee that no messages are unnecessarily delayed at the receiving end.

Keith Shafer received the B.A. degree in computer science and mathematics in 1986 from Mount Vernon Nazarene College, Mount Vernon, Ohio, USA, and the M.S. and Ph.D. degrees in computer science from The Ohio State University, Columbus, Ohio, USA, in 1988 and 1992, respectively. He is currently a Senior Research Scientist at OCLC Online Computer Library Center Inc., Dublin, OH, USA. His research interests include tools for comparing logical channels and methods for automatically constructing corpus grammars from tagged documents as an aid for database preparation and document conversion. Dr. Shafer is a member of the IEEE Computer Society.

Mohan Ahuja received the M.A. degree in 1983 and the Ph.D. degree in 1985, both in computer science, from the University of Texas at Austin. He is currently with the Department of Computer Science and Engineering, Univ. of California, San Diego. His recent research contributions include Global Flushing, message receipt in Receive-Phases, Incremental Publication of a Partial Order, Design of Highways (a high-performance distributed programming system) and, in collaboration with others, Passive-space and Time View, Performance evaluation of F-Channels, and Units of Computation in Fault-Tolerant Distributed Systems. His current research interests are in high-performance distributed communication and computing architectures, building high-performance systems, distributed operating systems, distributed algorithms, fault tolerance, and performance evaluation.

Parts of this paper appeared in two conference papers: (1) Distributed Modeling and Implementation of High Performance Communication Architectures, in Proceedings of the Thirteenth IEEE International Conference on Distributed Computing Systems, pages 56–65, 1993, and (2) Process-Channel-Agent-Process model of asynchronous distributed communication, in Proceedings of the Twelfth IEEE International Conference on Distributed Computing Systems, pages 4–11, 1992.

11.

A scalable high-performance computing solution for networks on chips

《Micro, IEEE》2002,22(5):46-55

The Eclipse network-on-a-chip architecture uses a sophisticated parallel programming model, realized through multithreaded processors, interleaved memory modules, and a high-capacity interconnection network to support system-on-a-chip designs.

12.

Evaluation of messaging middleware for high-performance cloud computing

Roberto R. Expósito Guillermo L. Taboada Sabela Ramos Juan Touriño Ramón Doallo 《Personal and Ubiquitous Computing》2013,17(8):1709-1719

Cloud computing is posing several challenges, such as security, fault tolerance, access interface singularity, and network constraints, both in terms of latency and bandwidth. In this scenario, the performance of communications depends both on the network fabric and its efficient support in virtualized environments, which ultimately determines the overall system performance. To solve the current network constraints in cloud services, their providers are deploying high-speed networks, such as 10 Gigabit Ethernet. This paper presents an evaluation of high-performance computing message-passing middleware on a cloud computing infrastructure, Amazon EC2 cluster compute instances, equipped with 10 Gigabit Ethernet. The analysis of the experimental results, confronted with a similar testbed, has shown the significant impact that virtualized environments still have on communication performance, which demands more efficient communication middleware support to get over the current cloud network limitations.

13.

Intelligent technologies of high-performance computing

I. V. Sergienko I. N. Molchanov A. N. Khimich 《Cybernetics and Systems Analysis》2010,46(5):833-844

This article analyzes mathematical and technological problems that arise in performing computational experiments on modern high-performance computers (supercomputers). As a means of overcoming the difficulties associated with the analysis and solution of computer model problems under conditions of approximate initial data on computers with parallel architectures, intelligent technologies are proposed that are based on intelligent software supported by architectural decisions of an intelligent computer and predictive system software.

14.

A low-overhead networking mechanism for virtualized high-performance computing systems (cited by: 1; self-citations: 0; citations by others: 1)

Jae-Wan Jang Euiseong Seo Heeseung Jo Jin-Soo Kim 《The Journal of supercomputing》2012,59(1):443-468

The use of virtualized parallel and distributed computing systems is rapidly becoming mainstream due to the significant benefits of high energy efficiency and low management cost. Processing network operations in a virtual machine, however, incurs a lot of overhead from the arbitration of network devices between virtual machines, which is inherent in the virtualized architecture. Since data transfer between server nodes frequently occurs in parallel and distributed computing systems, the high overhead of networking may induce significant performance loss in the overall system. This paper introduces the design and implementation of a novel networking mechanism with low overhead for virtualized server nodes. By sacrificing isolation between virtual machines, which is insignificant in distributed or parallel computing systems, our approach significantly reduces the processing overhead in networking operations by up to 29% of processor load, along with up to 36% of processor cache misses. Furthermore, it improves network bandwidth by up to 8%, especially when transmitting large packets. As a result, our prototype enhances the performance of real-world workloads by up to 12% in our evaluation.

15.

A sliding window technique for interactive high-performance computing scenarios

《Advances in Engineering Software》2015

Interactive high-performance computing is doubtlessly beneficial for many computational science and engineering applications whenever simulation results should be visually processed in real time, i.e. during the computation process. Nevertheless, interactive HPC entails a lot of new challenges that have to be solved – one of them addressing the fast and efficient data transfer between a simulation back end and visualisation front end, as several gigabytes of data per second are nothing unusual for a simulation running on some (hundred) thousand cores. Here, a new approach based on a sliding window technique is introduced that copes with any bandwidth limitations and allows users to study both large and small scale effects of the simulation results in an interactive fashion.

16.

Performance analysis challenges and framework for high-performance reconfigurable computing

Seth Koehler John Curreri Alan D. George 《Parallel Computing》2008,34(4-5):217-230

Reconfigurable computing (RC) applications employing both microprocessors and FPGAs have potential for large speedup when compared with traditional (software) parallel applications. However, this potential is marred by the additional complexity of these dual-paradigm systems, making it difficult to identify performance bottlenecks and achieve desired performance. Performance analysis concepts and tools are well researched and widely available for traditional parallel applications but are lacking in RC, despite being of great importance due to the applications’ increased complexity. In this paper, we explore challenges and present new techniques in automated instrumentation, runtime measurement, and visualization of RC application behavior. We also present ideas for integration with conventional performance analysis tools to create a unified tool for RC applications as well as our initial framework for FPGA instrumentation and measurement. Results from a case study are provided using a prototype of this new tool.

17.

A distributed collaborative high-performance computing framework for solar-terrestrial space information

Shanshan Li Qun Wang 《计算机工程与应用 (Computer Engineering and Applications)》2010,46(16):9-11

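Against the background of research into the activity patterns of the solar-terrestrial system, and building on CCA (Common Component Architecture), a component specification recently proposed in the United States for large-scale scientific computing, this work designs and proposes DCHF-SI, a distributed collaborative high-performance computing framework for solar-terrestrial space information. The framework integrates services for component-based encapsulation of physical models, construction and management of simulation applications, model interoperability, distributed fault tolerance, and computational steering with visualization. It can use the network to integrate large numbers of distributed high-performance computing resources and space-physics model resources to build loosely coupled multi-physics simulation applications. It supports distributed collaborative high-performance computing of solar-terrestrial space information, addresses the complexity of coupled multi-physics simulation, and ultimately provides support for space weather forecasting service systems.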

18.

A taxonomy of task-based parallel programming technologies for high-performance computing

Peter Thoman Kiril Dichev Thomas Heller Roman Iakymchuk Xavier Aguilar Khalid Hasanov Philipp Gschwandtner Pierre Lemarinier Stefano Markidis Herbert Jordan Thomas Fahringer Kostas Katrinis Erwin Laure Dimitrios S. Nikolopoulos 《The Journal of supercomputing》2018,74(4):1422-1434

Task-based programming models for shared memory—such as Cilk Plus and OpenMP 3—are well established and documented. However, with the increase in parallel, many-core, and heterogeneous systems, a number of research-driven projects have developed more diversified task-based support, employing various programming and runtime features. Unfortunately, despite the fact that dozens of different task-based systems exist today and are actively used for parallel and high-performance computing (HPC), no comprehensive overview or classification of task-based technologies for HPC exists. In this paper, we provide an initial task-focused taxonomy for HPC technologies, which covers both programming interfaces and runtime mechanisms. We demonstrate the usefulness of our taxonomy by classifying state-of-the-art task-based environments in use today.

19.

Programming environments for high-performance Grid computing: the Albatross project

Thilo Henri E. Jason Rob Lionel Rutger Kees 《Future Generation Computer Systems》2002,18(8)

The aim of the Albatross project is to study applications and programming environments for computational Grids. We focus on high-performance applications, running in parallel on multiple clusters or MPPs that are connected by wide-area networks (WANs). We briefly present three Grid programming environments developed in the context of the Albatross project: the MagPIe library for collective communication with MPI, the replicated method invocation (RepMI) mechanism for Java, and the Java-based Satin system for running divide-and-conquer programs on Grid platforms. A major challenge in investigating the performance of such applications is the actual WAN behavior. Typical wide-area links are just part of the Internet and thus shared among many applications, making runtime measurements irreproducible and thus scientifically hardly valuable. To overcome this problem, we developed a WAN emulator as part of Panda, our general-purpose communication substrate. The WAN emulator allows us to run parallel applications on a single (large) parallel machine with only the wide-area links being emulated. The Panda emulator is highly accurate and configurable at runtime. We present a case study in which Satin runs across various emulated WAN scenarios.

20.

GPU-based high-performance computing for integrated surface–sub-surface flow modeling

《Environmental Modelling & Software》2015

The widespread availability of high-resolution lidar data provides an opportunity to capture micro-topographic control on the partitioning and transport of water for incorporation in coupled surface–sub-surface flow modeling. However, large-scale simulations of integrated flow at the lidar data resolution are computationally expensive due to the density of the computational grid and the iterative nature of the algorithms for solving nonlinearity. Here we present a distributed, physically based integrated flow model that couples two-dimensional overland flow and three-dimensional variably saturated sub-surface flow on a GPU-based (Graphics Processing Unit) parallel computing architecture. An Alternating Direction Implicit (ADI) scheme modified for the GPU structure is used for numerical solutions in both models. A boundary condition switching approach is applied to partition potential water fluxes into actual fluxes for the coupling between surface and sub-surface models. The algorithms are verified using five benchmark problems that have been widely adopted in the literature. This is followed by a large-scale simulation using lidar data. We demonstrate that the method is computationally efficient and produces physically consistent solutions. This computational efficiency suggests the feasibility of GPU computing for fully distributed, physics-based hydrologic models over large areas.