Similar Documents
20 similar documents found.
1.
2.
The mismatch between compute performance and I/O performance has long been a stumbling block as supercomputers evolve from petaflops to exaflops. Currently, many parallel applications are I/O intensive, and their overall running times are typically limited by I/O performance. To quantify the I/O performance bottleneck and highlight the significance of achieving scalable performance in peta/exascale supercomputing, in this paper we introduce for the first time a formal definition of the ‘storage wall’ from the perspective of parallel application scalability. We quantify the effects of the storage bottleneck by providing a storage-bounded speedup, defining the storage wall quantitatively, presenting existence theorems for the storage wall, and classifying system architectures according to their I/O performance variation. We analyze and extrapolate the existence of the storage wall through experiments on Tianhe-1A and case studies on Jaguar. These results provide insights into how to alleviate the storage wall bottleneck in system design and achieve hardware/software optimizations in peta/exascale supercomputing.
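The abstract does not reproduce the paper's formal definitions. As an illustrative sketch only, assuming a hypothetical Amdahl-like model in which compute time scales with processor count while I/O time does not, a storage-bounded speedup and its limiting "wall" can be written as:

```python
def storage_bounded_speedup(t_compute, t_io, n):
    """Speedup on n processors for a job whose serial compute time is
    t_compute and whose I/O time t_io does not scale with n.
    (Hypothetical model for illustration, not the paper's definition.)"""
    return (t_compute + t_io) / (t_compute / n + t_io)

def storage_wall_limit(t_compute, t_io):
    """Upper bound on the speedup as n grows without bound:
    lim n->inf of storage_bounded_speedup = 1 + t_compute / t_io."""
    return 1.0 + t_compute / t_io
```

Under this model a job that is 99% compute (t_compute = 99, t_io = 1) achieves a speedup of only 50 on 99 processors, and no processor count can push it past 100; that asymptote is the "wall" the abstract describes.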

3.
4.
A framework consisting of reusable modules acting at different levels can provide fault tolerance mechanisms for synchronous and asynchronous communication to ensure coherence of parallel tasks.

5.
A distributed heterogeneous supercomputing management system   (cited by 3: 0 self-citations, 3 by others)
Ghafoor, A.; Yang, J. Computer, 1993, 26(6):78-86
A general management framework for distributed heterogeneous supercomputing systems (DHSSs) that is based on an application-characterization technique is presented. The technique uses code profiling and analytical benchmarking of supercomputers. Optimal scheduling of tasks in these systems is an NP-complete problem. The use of network caching to reduce the complexity associated with the scheduling decisions is discussed. An experimental prototype of a DHSS management system is described.
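The abstract does not name a specific heuristic for the NP-complete scheduling problem. As a hedged sketch of how profiling and benchmarking estimates might feed a scheduler, the classic min-min heuristic over a hypothetical ETC (expected time to compute) matrix looks like this:

```python
def min_min_schedule(etc):
    """Min-min list scheduling heuristic (illustrative, not the paper's
    method). etc[t][m] is the estimated run time of task t on machine m,
    e.g. derived from code profiling and analytical benchmarking.
    Returns (assignment dict task -> machine, makespan)."""
    n_tasks, n_machines = len(etc), len(etc[0])
    ready = [0.0] * n_machines          # earliest free time per machine
    unassigned = set(range(n_tasks))
    assignment = {}
    while unassigned:
        # Pick the (task, machine) pair with the minimum completion time.
        ct, t, m = min(
            (ready[m] + etc[t][m], t, m)
            for t in unassigned for m in range(n_machines))
        assignment[t] = m
        ready[m] = ct
        unassigned.remove(t)
    return assignment, max(ready)
```

For a toy ETC matrix `[[3, 1], [2, 4]]`, task 0 goes to the fast machine 1 and task 1 to machine 0, giving a makespan of 2.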

6.
7.
8.
To ensure that the supercomputing environment built under the Chinese Academy of Sciences "11th Five-Year Plan" major informatization project provides stable and reliable services, a fault-tolerance framework for the three-tier supercomputing environment is proposed. Addressing the two major areas of computing-environment reliability and compute-node reliability, and through reliability studies in three main aspects (job reliability, service reliability, and grid-node reliability), a reliability solution for the three-tier supercomputing environment is proposed and implemented. The framework focuses on eliminating the impact of single points of failure, ensuring that the system can continue to provide highly available high-performance computing services after a single-point failure occurs.

9.
10.
The range and definition of digital media is vast, reaching from PCs to consumer electronics, but at the center of it all is the display controller. Powered by advances in semiconductor fabrication and the expansion of the PC market, new graphics controllers, video processors, and audio processors have emerged that promise astounding home entertainment systems at affordable prices. Along with digital media advances have come developments in communications devices. However, the cost of expanding the physical infrastructure is thwarting advances in communications. In addition, both the communications infrastructure problems and the lack of interesting or compelling 3D content are limiting the interactivity of the home entertainment system. But the pace of development and function assimilation continues unabated, and amazing new digital media devices and systems are appearing.

11.
This paper introduces the overall architecture of a neural computing platform, describes how grid-based distributed supercomputing theory is applied within the platform, details the design and implementation, and concludes by analyzing the advantages of this approach.

12.
There are large-scale computation problems that could benefit from the power of the Web. The authors have chosen the year 2007 to set a hardware scenario where petaflops performance could be obtained by a variety of architectures, including an ATM-connected corporate intranet. They classify the kinds of parallelism found in different applications, arguing that the Web has advantages in the expression of general forms of parallelism that make it suitable as the software environment of choice for these applications on all hardware platforms. They conclude by analyzing Web software approaches in some detail, taking an inevitably nearer-term perspective. Their purpose is to show how the Web and MPP can be advanced synergistically to solve different problems.

13.
We describe a concurrent visualization pipeline designed for operation in a production supercomputing environment. The facility was initially developed on the NASA Ames "Columbia" supercomputer for a massively parallel forecast model (GEOS4). During the 2005 Atlantic hurricane season, GEOS4 was run 4 times a day under tight time constraints so that its output could be included in an ensemble prediction made available to forecasters at the National Hurricane Center. Given this time-critical context, we designed a configurable concurrent pipeline to visualize multiple global fields without significantly affecting the runtime model performance or reliability. We use MPEG compression of the accruing images to facilitate live low-bandwidth distribution of multiple visualization streams to remote sites. We also describe the use of our concurrent visualization framework with a global ocean circulation model, which provides an 864-fold increase in the temporal resolution of practically achievable animations. In both the atmospheric and oceanic circulation models, the application scientists gained new insights into their model dynamics thanks to the high-temporal-resolution animations attainable.

14.
Woodward, P.R. Computer, 1996, 29(10):99-111
I am fortunate to have had access to supercomputers for the last 28 years. Over this time I have used them to simulate time-dependent fluid flows in the compressible regime. Strong shocks and unstable multifluid boundaries, along with the phenomenon of fluid turbulence, have provided the simulation complexity that demands supercomputer power. The supercomputers I have used (the CDC 6600, 7600, and Star-100; the Cray-1, Cray X-MP, Cray-2, and Cray C-90; the Connection Machines CM-2 and CM-5; the Cray T3D; and the Silicon Graphics Challenge Array and Power Challenge Array) span three revolutions in supercomputer design: the introduction of vector supercomputing, parallel supercomputing on multiple CPUs, and supercomputing on hierarchically organized clusters of microprocessors with cache memories. The last revolution is still in progress, so its outcome is somewhat uncertain. I view these design revolutions through the prism of my specialty and through applications of the supercomputers I have used. Also, because these supercomputer design changes have driven equally important changes in numerical algorithms and the programs that implement them, I describe the three revolutions from this perspective.

15.
Most supercomputing systems are built on the Linux operating system, which restricts the use of application software based on the Windows operating system. Moreover, the steep learning curve of supercomputer operation deters users unfamiliar with Linux and drives users away. Starting from a Linux supercomputing environment, this work explores ways of running Windows applications that also preserve the convenience of system operation and maintenance. Using technologies such as X11 forwarding, Wine, and virtualization, it provides users with a Windows application runtime environment compatible with the supercomputer's job scheduling system, together with secure and stable access to users' personal files. The configuration methods and examples presented can serve as a solution for supercomputing centers with similar needs, broadening the range of applicable user software and improving user satisfaction.

16.
Experimental research in dependability has evolved over the past 30 years, accompanied by dramatic changes in the computing industry. To understand the magnitude and nature of this evolution, this paper analyzes industrial trends, namely: 1) shifting error sources, 2) explosive complexity, and 3) global volume. Under each of these trends, the paper explores research technologies applicable either to the finished product or artifact, or to the processes used to produce products. The study provides a framework not only to reflect on the research of the past, but also to project the needs of the future.

17.
Large supercomputers are built today using thousands of commodity components, and suffer from poor reliability due to frequent component failures. The characteristics of failure observed on large-scale systems differ from smaller scale systems studied in the past. One striking difference is that system events are clustered temporally and spatially, which complicates failure analysis and application design. Developing a clear understanding of failures for large-scale systems is a critical step in building more reliable systems and applications that can better tolerate and recover from failures. In this paper, we analyze the event logs of two large IBM Blue Gene systems, statistically characterize system failures, present a model for predicting the probability of node failure, and assess the effects of differing rates of failure on job failures for large-scale systems. The work presented in this paper will be useful for developers and designers seeking to deploy efficient and reliable petascale systems.
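The abstract's prediction model is not given here. As a simplified illustration, assuming independent, exponentially distributed node lifetimes (an idealization that the paper's observed temporal and spatial clustering of events precisely violates), the probability that a job loses at least one node can be sketched as:

```python
import math

def p_job_failure(node_mtbf_hours, n_nodes, job_hours):
    """Probability that at least one of n_nodes fails during a job of
    length job_hours, assuming independent exponential node lifetimes.
    (Simplified sketch; real failures cluster in time and space.)"""
    # Aggregate failure rate of the node set over the job window.
    expected_failures = n_nodes * job_hours / node_mtbf_hours
    return 1.0 - math.exp(-expected_failures)
```

With a per-node MTBF of 1,000,000 hours, a 24-hour job on 10,000 nodes still fails with probability of roughly 0.21 under this model, which is why checkpointing and failure prediction matter at petascale.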

18.
19.
Recent developments in European supercomputing are reviewed, covering both the latest hardware trends and the increasing difficulties scientists face in utilising these machines to perform large-scale numerical simulations. These challenges are reflected in the large number of international initiatives that have come into being over the last few years, founded in anticipation of the exascale hardware foreseen within the next decade. The role of a key supercomputing institution within these programmes is described using the example of the Jülich Supercomputing Centre (JSC), and progress in setting up its own community-oriented support units for scientific computing, the Simulation Laboratories, is reported. Finally, an assessment is made of some common grand challenges and their suitability for scaling to exaflop-scale computation.

20.
There is a common misconception that the automobile industry is slow to adopt new technologies such as artificial intelligence (AI) and soft computing. The reality is that many new technologies are deployed and brought to the public through the vehicles that people drive. This paper provides an overview and a sampling of the many ways the automotive industry has utilized AI, soft computing, and other intelligent system technologies in domains as diverse as manufacturing, diagnostics, on-board systems, warranty analysis, and design. Oleg Gusikhin received the Ph.D. degree from the St. Petersburg Institute of Informatics and Automation of the Russian Academy of Sciences and the M.B.A. degree from the University of Michigan, Ann Arbor, MI. Since 1993, he has been with the Ford Motor Company, where he is a Technical Leader at the Ford Manufacturing and Vehicle Design Research Laboratory, engaged in functional areas including information technology, advanced electronics manufacturing, and research and advanced engineering. He has also been involved in the design and implementation of intelligent control applications for manufacturing and vehicle systems. He is the recipient of the 2004 Henry Ford Technology Award. He holds two U.S. patents and has published over 30 articles in refereed journals and conference proceedings. He is an Associate Editor of the International Journal of Flexible Manufacturing Systems, a Certified Fellow of the American Production and Inventory Control Society, and a member of IEEE and SME. Nestor Rychtyckyj received the Ph.D. degree in computer science from Wayne State University, Detroit, MI. He is a technical expert in Artificial Intelligence at Ford Motor Company, Dearborn, MI, in Advanced and Manufacturing Engineering Systems. His current research interests include the application of knowledge-based systems for vehicle assembly process planning and scheduling.
Currently, his responsibilities include the development of automotive ontologies, intelligent manufacturing systems, controlled languages, machine translation, and corporate terminology management. He has published more than 30 papers in refereed journals and conference proceedings. He is a member of AAAI, ACM, and the IEEE Computer Society. Dimitar P. Filev received the Ph.D. degree in electrical engineering from the Czech Technical University, Prague, in 1979. He is a Senior Technical Leader, Intelligent Control and Information Systems, with Ford Research and Advanced Engineering, specializing in industrial intelligent systems and technologies for control, diagnostics, and decision making. He conducts research in systems theory and applications, modeling of complex systems, and intelligent modeling and control, and has published 3 books and over 160 articles in refereed journals and conference proceedings. He holds 14 granted U.S. patents and numerous foreign patents in the area of industrial intelligent systems. He is the recipient of the 1995 Award for Excellence of MCB University Press. He was awarded the Henry Ford Technology Award four times for the development and implementation of advanced intelligent control technologies. He is an Associate Editor of the International Journal of General Systems and the International Journal of Approximate Reasoning. He is a member of the Board of Governors of the IEEE Systems, Man and Cybernetics Society and President of the North American Fuzzy Information Processing Society (NAFIPS).
