Similar literature
1.
Multidimensional array I/O in Panda 1.0   Total citations: 1, self-citations: 0, citations by others: 1
Large multidimensional arrays are a common data type in high-performance scientific applications. Without special techniques for handling input and output, I/O can easily become a large fraction of execution time for applications using these arrays, especially on parallel platforms. Our research seeks to provide scientific programmers with simpler and more abstract interfaces for accessing persistent multidimensional arrays, and to produce advanced I/O libraries supporting more efficient layout alternatives for these arrays on disk and in main memory. We have created the Panda (Persistence AND Arrays) I/O library as a result of developing interfaces and libraries for applications in computational fluid dynamics in the areas of checkpoint, restart, and time-step output data. In the applications we have studied, we find that a simple, abstract interface can be used to insulate programmers from physical storage implementation details, while providing improved I/O performance at the same time. (A preliminary version of this paper was presented at Supercomputing '94.)

2.
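Parallel file systems are an important way to address the I/O bottleneck. Research shows that when the strided file access patterns of scientific applications are combined with the methods existing parallel file systems use to access such data, the resulting I/O performance for large data sets is unacceptable. To improve the I/O performance of noncontiguous data access in parallel file systems, a new high-performance I/O method was created: user-defined file views combined with merged I/O requests. The method was implemented in the WPFS parallel file system. Research and experimental results show that the method has the potential to improve the performance of scientific applications.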
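The abstract does not describe how WPFS merges requests, so the following is only a minimal sketch of the general idea of coalescing I/O requests for a strided access pattern. The function name `coalesce_requests`, the `(offset, length)` tuple representation, and the `max_gap` parameter are assumptions made for illustration and are not taken from WPFS.

```python
from typing import List, Tuple

Request = Tuple[int, int]  # (byte offset, length); hypothetical representation


def coalesce_requests(requests: List[Request], max_gap: int = 0) -> List[Request]:
    """Merge requests that touch, overlap, or lie within max_gap bytes of each other."""
    merged: List[Request] = []
    for offset, length in sorted(requests):
        if merged:
            last_offset, last_length = merged[-1]
            last_end = last_offset + last_length
            if offset - last_end <= max_gap:
                # Extend the previous request instead of issuing a new one.
                new_end = max(last_end, offset + length)
                merged[-1] = (last_offset, new_end - last_offset)
                continue
        merged.append((offset, length))
    return merged


# A strided pattern: 4 records of 100 bytes, one every 256 bytes.
strided = [(i * 256, 100) for i in range(4)]
exact = coalesce_requests(strided)                   # holes are skipped: 4 requests
with_gaps = coalesce_requests(strided, max_gap=156)  # holes are read too: 1 request
```

With `max_gap=0` only adjacent or overlapping requests are merged. A larger `max_gap` trades extra bytes transferred for fewer requests, which is the usual tension in this kind of optimization.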

3.
A critical performance issue for a number of scientific and engineering applications is the efficient transfer of data to secondary storage. Languages such as High Performance Fortran (HPF) have been introduced to allow programming distributed-memory systems at a relatively high level of abstraction. However, the present version of HPF does not provide appropriate constructs for controlling the parallel I/O capabilities of these systems. In this paper, constructs to specify parallel I/O operations on multidimensional arrays in the context of HPF are proposed. The paper also presents implementation concepts that are based on the HPF compiler VFC and the parallel I/O run-time system Panda. Experimental performance results are discussed in the context of financial management and traffic simulation applications.

4.
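To alleviate the I/O bottleneck, research can proceed along six lines: applications, scalable algorithms, compilers and languages, run-time libraries, operating systems, and architecture. Among these, the I/O architecture is the key support for all the other technical approaches. Current parallel I/O performance analysis lacks a rigorous theoretical model that could provide a basis for I/O architecture design. Addressing the scalability of parallel computer systems, this paper studies the effect of I/O load on that scalability, establishes an I/O-bounded parallel speedup performance model, and analyzes the scalability of three I/O architectures commonly used in current large-scale parallel computer systems. On this theoretical basis, it proposes a scalable parallel I/O system architecture for high-performance computing. It also proposes several strategies that effectively reduce the service time of I/O operations, thereby enhancing system scalability and laying a foundation for further research.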

5.
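Using machine learning to solve technical problems in storage is currently one of the active research topics in the storage field. Reinforcement learning is a special kind of machine learning that takes environmental feedback as input and adapts to the environment. By observing changes in the environment state and evaluating the effect of control decisions on system performance, it can select the optimal control policy, so intelligent RAID control based on reinforcement learning has significant research value. Targeting the characteristics of high-performance computing applications, this paper introduces reinforcement learning into the RAID controller and proposes RL-scheduler, an intelligent I/O scheduling algorithm that uses a Q-learning strategy to implement an autonomic scheduling policy for parallel applications. RL-scheduler jointly considers scheduling fairness, disk seek time, and the I/O access efficiency of MPI applications, and proposes an interleaved organization of multiple Q-tables to improve Q-table update efficiency. Experimental results show that RL-scheduler shortens the average I/O service time of parallel applications and improves the I/O throughput of large-scale parallel computing systems.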
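The abstract names Q-learning but gives no state, action, or reward design, so the following is only a sketch of the standard tabular Q-learning update applied to a toy scheduling decision. The state label `"busy"`, the two action names, the reward (negative service time in milliseconds), and the values of `alpha` and `gamma` are all hypothetical and are not RL-scheduler's actual design.

```python
from collections import defaultdict
from typing import Dict, Hashable, Sequence, Tuple

QTable = Dict[Tuple[Hashable, Hashable], float]


def q_update(q: QTable, state, action, reward: float, next_state,
             actions: Sequence, alpha: float = 0.5, gamma: float = 0.9) -> float:
    """Apply one tabular Q-learning step and return the new Q(state, action)."""
    best_next = max(q[(next_state, a)] for a in actions)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
    return q[(state, action)]


def greedy_action(q: QTable, state, actions: Sequence):
    """Pick the action with the highest learned value in this state."""
    return max(actions, key=lambda a: q[(state, a)])


# Hypothetical example: the action is which pending request to dispatch next.
ACTIONS = ("nearest_seek", "oldest_request")
q: QTable = defaultdict(float)
# Reward is the negative service time in ms (assumed), so shorter is better.
q_update(q, "busy", "nearest_seek", reward=-4.0, next_state="busy", actions=ACTIONS)
q_update(q, "busy", "oldest_request", reward=-9.0, next_state="busy", actions=ACTIONS)
choice = greedy_action(q, "busy", ACTIONS)
```

A real scheduler would also need an exploration policy and a reward that reflects fairness, which this sketch leaves out.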

6.
The Scalable I/O (SIO) Initiative's Low-Level Application Programming Interface (SIO LLAPI) provides file system implementers with a simple low-level interface to support high-level parallel I/O interfaces efficiently and effectively. This paper describes a reference implementation and the evaluation of the SIO LLAPI on the Intel Paragon multicomputer. The implementation provides a file system structure and striping algorithm compatible with the Parallel File System (PFS) of the Intel Paragon, and runs either inside the kernel or as a user-level library. Scatter-gather addressing for reads and writes, asynchronous I/O, client caching and prefetching mechanisms, a file access hint mechanism, collective I/O, and highly efficient file copy have been implemented. Preliminary experience shows that the SIO LLAPI provides opportunities for significant performance improvement and is easy to implement. Some high-level file system interfaces and applications, such as PFS, ADIO, and a Hartree-Fock application, are also implemented on top of SIO. The performance of this PFS is at least the same as that of Intel's native PFS, and in many cases, such as small sequential file access, huge I/O requests, and collective I/O, it is stable and much better. The SIO features help to support high-level interfaces easily, quickly, and more efficiently, and the cache, prefetching, and hints are useful for getting better performance under different access models. The scalability and performance of SIO are limited by network latency, scalable network bandwidth, memory copy bandwidth, memory size, and the pattern of I/O requests. The tradeoff between generality and efficiency should be considered in implementation.

7.
This paper presents a new scheme of I/O scheduling on storage servers of distributed/parallel file systems, for yielding better I/O performance. To this end, we first analyze read/write requests in the I/O queue of the storage server (we name them block I/Os), by using our proposed technique of horizontal partition. Then, all block requests are divided into multiple groups on the basis of their offsets. That is to say, all requests related to the same chunk file are grouped together and then satisfied within the same time slot between opening and closing the target chunk file on the storage server. As a result, the time needed to complete block I/O requests can be significantly decreased, because fewer file operations are performed on the corresponding chunk files in the low-level file systems of the server machines. Furthermore, we introduce an algorithm to rate a priority for each group of block I/O requests, and the storage server then dispatches groups of I/Os by following the priority order. Consequently, applications with higher I/O priorities, e.g. those with fewer I/O operations and a small amount of involved data, can finish at an earlier time. We implement a prototype of this server-side scheduling in the PARTE file system to demonstrate the feasibility and applicability of the proposed scheme. Experimental results show that the newly proposed scheme can achieve better I/O bandwidth and less I/O time, compared with the First Come First Served strategy as well as other server-side I/O scheduling approaches.
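The grouping and dispatch steps described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the PARTE implementation: the `BlockRequest` fields, the fixed `CHUNK_SIZE`, and the priority rule (fewer requests first, then fewer bytes) are all hypothetical stand-ins for the paper's horizontal partition and rating algorithm.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class BlockRequest:
    app: str      # issuing application (hypothetical field)
    offset: int   # byte offset in the logical file
    size: int     # bytes requested


CHUNK_SIZE = 64 * 1024  # assumed chunk-file size; not taken from the paper


def group_by_chunk(queue: List[BlockRequest]) -> Dict[int, List[BlockRequest]]:
    """Group queued requests by the chunk file their offset falls into."""
    groups: Dict[int, List[BlockRequest]] = defaultdict(list)
    for req in queue:
        groups[req.offset // CHUNK_SIZE].append(req)
    return dict(groups)


def dispatch_order(groups: Dict[int, List[BlockRequest]]) -> List[int]:
    """Order chunk groups so that groups with few, small requests go first."""
    def cost(chunk_id: int):
        reqs = groups[chunk_id]
        return (len(reqs), sum(r.size for r in reqs), chunk_id)
    return sorted(groups, key=cost)


queue = [
    BlockRequest("A", 0, 4096),
    BlockRequest("B", 70_000, 512),
    BlockRequest("A", 8192, 4096),
    BlockRequest("A", 16384, 4096),
]
groups = group_by_chunk(queue)
order = dispatch_order(groups)  # chunk ids in the order they would be served
```

Serving each group in one pass means a chunk file is opened and closed once per group rather than once per request, which is the saving the abstract attributes to the scheme.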

8.
With the advent of new computing paradigms, parallel file systems serve not only traditional scientific computing applications but also non-scientific computing applications, such as financial computing, business, and public administration. Parallel file systems provide storage services for multiple applications. As a result, various requirements need to be met. However, parallel file systems usually provide a unified storage solution, which cannot meet specific application needs. In this paper, an extended file handle scheme is proposed to deal with this problem. The original file handle is extended to record I/O optimization information, which allows file systems to specify optimizations for a file or directory based on workload characteristics. Therefore, fine-grained management of I/O optimizations can be achieved. On the basis of the extended file handle scheme, data prefetching and small file optimization mechanisms are proposed for parallel file systems. The experimental results show that the proposed approach improves the aggregate throughput of the overall system by up to 189.75%.

9.
A NOW-based parallel I/O system   Total citations: 1, self-citations: 0, citations by others: 1
李冀  陈晓林  陆桑璐  陈贵海  谢立 《软件学报》2001,12(11):1654-1659
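As NOW (networks of workstations) are used ever more widely in scientific research, providing high-performance input and output for scientific computing on NOW has become a new challenge. Based on the characteristics of NOW, CION (collective I/O on NOW system), a parallel I/O system tailored to NOW that uses collective I/O, was designed and implemented. CION draws on the strengths of DDIO (disk-directed I/O) and two-phase I/O, and also adopts a series of optimizations such as data sieving. Preliminary tests have shown good system performance.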

10.
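The parallel I/O task model of RAID-10 systems is analyzed, and a satisfaction metric for I/O service is proposed using fuzzy functions. Based on this metric, an I/O task scheduling algorithm suited to RAID-10 systems is proposed, which improves the real-time performance of RAID-10 systems and their load-balancing capability.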

11.
In the Big Data era, the gap between the storage performance and an application’s I/O requirement is increasing. I/O congestion caused by concurrent storage accesses from multiple applications is inevitable and severely harms the performance. Conventional approaches either focus on optimizing an application’s access pattern individually or handle I/O requests on a low-level storage layer without any knowledge from the upper-level applications. In this paper, we present a novel I/O-aware bandwidth allocation framework to coordinate ongoing I/O requests on petascale computing systems. The motivation behind this innovation is that the resource management system has a holistic view of both the system state and jobs’ activities and can dynamically control the jobs’ status or allocate resources on the fly during their execution. We treat a job’s I/O requests as periodical sub-jobs within its lifecycle and transform the I/O congestion issue into a classical scheduling problem. Based on this model, we propose a bandwidth management mechanism as an extension to the existing scheduling system. We design several bandwidth allocation policies with different optimization objectives either on user-oriented metrics or system performance. We conduct extensive trace-based simulations using real job traces and I/O traces from a production IBM Blue Gene/Q system at Argonne National Laboratory. Experimental results demonstrate that our new design can improve job performance by more than 30%, as well as increasing system performance.

12.
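Challenges posed by I/O-intensive applications to parallel storage systems and research on solutions   Total citations: 1, self-citations: 0, citations by others: 1
High-performance computing applications that are highly I/O-intensive place ever higher demands on the overall performance of the storage systems of high-performance computers. An important class of applications, represented by seismic data processing for oil exploration, exhibits huge I/O data volumes, high I/O access density, and high read/write bandwidth requirements on individual disk-array storage components. In the Lustre file system, insufficient output bandwidth of the disk-array devices that provide the object storage service becomes a major factor limiting the overall performance of the storage system. To address this problem, a cache management method is proposed that adds a VDISK module on the client side and a Cache module on the OST side; the two cooperate to improve the utilization of the I/O output bandwidth of the parallel file system. In addition, the method makes full use of idle client memory and the communication bandwidth between clients to reduce the applications' demands on the output bandwidth of the disk-array devices. Verification with a large-scale parallel model shows that VDISK increases the effectively usable output bandwidth and improves the I/O efficiency of the external storage system.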

13.
I/O performance of an RAID-10 style parallel file system   Total citations: 1, self-citations: 0, citations by others: 1
Without any additional cost, all the disks on the nodes of a cluster can be connected together through CEFT-PVFS, an RAID-10 style parallel file system, to provide multi-GB/s parallel I/O performance. I/O response time is one of the most important measures of quality of service for a client. When multiple clients submit data-intensive jobs at the same time, the response time experienced by the user is an indicator of the power of the cluster. In this paper, a queuing model is used to analyze in detail the average response time when multiple clients access CEFT-PVFS. The results reveal that response time is a function of several operational parameters. I/O response time decreases as the I/O buffer hit rate for read requests, the write buffer size for write requests, and the number of server nodes in the parallel file system increase, while the higher the I/O request arrival rate, the longer the I/O response time. On the other hand, the collective power of a large cluster supported by CEFT-PVFS is shown to be able to sustain a steady and stable I/O response time for a relatively large range of the request arrival rate.
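The abstract does not state which queuing model the paper uses. As a rough illustration of the qualitative trends it reports, the sketch below assumes the simplest textbook case, independent M/M/1 servers with requests split evenly among them, where the mean response time is 1 / (service rate - arrival rate per server). The function name and the numeric rates are hypothetical.

```python
def mm1_response_time(arrival_rate: float, service_rate: float, servers: int = 1) -> float:
    """Mean response time when requests are split evenly over independent M/M/1 servers."""
    per_server = arrival_rate / servers
    if per_server >= service_rate:
        raise ValueError("unstable queue: per-server arrival rate must be below service rate")
    return 1.0 / (service_rate - per_server)


# Rates are in requests per second; the values are made up for illustration.
light = mm1_response_time(50.0, 100.0)              # lightly loaded single server
heavy = mm1_response_time(90.0, 100.0)              # same server near saturation
scaled = mm1_response_time(90.0, 100.0, servers=2)  # same load over two servers
```

Under this assumed model, response time grows sharply as the arrival rate approaches the service rate and falls again when server nodes are added, matching the direction of the trends in the abstract but not its actual analysis.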

14.
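An I/O Server system based on the transparent computing paradigm is designed and implemented. I/O Server and I/O Client are the two software modules of I/O Manager, a network storage access service that supports remote booting and running of multiple operating systems in a transparent computing environment; I/O Server runs on the server side and I/O Client on the client side. In the transparent computing paradigm, client hardware is separated from the operating system, and the operating systems and their applications that users need are stored on the server. When a client boots, I/O Server and the boot protocol download I/O Client to the end system, where it runs; I/O Client then issues I/O requests to I/O Server. I/O Server analyzes the received requests, classifies them by priority, schedules them with priority-based time-sliced round-robin, operates on the virtual disk files on the server, reduces disk I/O operations through prefetching and caching policies, and returns the results to the client. In this way it supports remote booting of operating systems and serves the various requests issued while the system is running.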

15.
叶孝斌  杨树强 《计算机工程》2000,26(3):57-58,76
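Parallel I/O is one of the effective ways to improve the performance of parallel database systems based on a shared-nothing architecture. It provides high-bandwidth I/O through parallel disk service and parallelized network transfer. This paper designs and implements parallel I/O for a shared-nothing parallel database system and discusses several key issues in designing parallel I/O and the techniques for implementing it.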

16.
Cloud computing is currently being explored by the scientific community to assess its suitability for High Performance Computing (HPC) environments. In this novel paradigm, compute and storage resources, as well as applications, can be dynamically provisioned on a pay-per-use basis. This paper presents a thorough evaluation of the I/O storage subsystem using the Amazon EC2 Cluster Compute platform and the recent High I/O instance type, to determine its suitability for I/O-intensive applications. The evaluation has been carried out at different layers using representative benchmarks in order to evaluate the low-level cloud storage devices available in Amazon EC2, ephemeral disks and Elastic Block Store (EBS) volumes, both on local and distributed file systems. In addition, several I/O interfaces (POSIX, MPI-IO and HDF5) commonly used by scientific workloads have also been assessed. Furthermore, the scalability of a representative parallel I/O code has also been analyzed at the application level, taking into account both performance and cost metrics. The analysis of the experimental results has shown that available cloud storage devices can have different performance characteristics and usage constraints. Our comprehensive evaluation can help scientists to increase significantly (up to several times) the performance of I/O-intensive applications in Amazon EC2 cloud. An example of optimal configuration that can maximize I/O performance in this cloud is the use of a RAID 0 of 2 ephemeral disks, TCP with 9,000 bytes MTU, NFS async and MPI-IO on the High I/O instance type, which provides ephemeral disks backed by Solid State Drive (SSD) technology.

17.
Computation offloading enables mobile devices to execute rich applications by using the abundant computing resources of powerful server systems. The distributed shared memory based (DSM-based) computation offloading approach is expected to be especially popular in the near future because it can dynamically migrate running threads to computing nodes and does not require any modifications of existing applications to do so. The current DSM-based computation offloading scheme, however, has focused on efficiently offloading computationally intensive applications and has not considered the significant performance degradation caused by processing the I/O requests issued by offloaded threads. Because most mobile applications are interactive and thus yield frequent I/O requests, efficient handling of I/O operations is critically important. In this paper, we quantitatively analyze the performance degradation caused by I/O processing in DSM-based computation offloading schemes using representative commodity applications. To remedy the performance degradation, we apply a remote I/O scheme based on remote device support to computation offloading. The proposed approach improves the execution time by up to 43.6% and saves up to 17.7% of energy consumption in comparison with the existing offloading schemes. Selective compression of the remote I/O scheme reduces the network traffic by up to 53.5%.

18.
In future computer system design, I/O systems will have to support continuous media such as video and audio, whose system demands are different from those of data such as text. Multimedia computing requires us to focus on designing I/O systems that can handle real-time demands. Video- and audio-stream playback and teleconferencing are real-time applications with different I/O demands. We primarily consider playback applications which require guaranteed real-time I/O throughput. In a multimedia server, different service phases of a real-time request are disk, small computer systems interface (SCSI) bus, and processor scheduling. Additional service might be needed if the request must be satisfied across a local area network. We restrict ourselves to the support provided at the server, with special emphasis on two service phases: disk scheduling and SCSI bus contention. When requests have to be satisfied within deadlines, traditional real-time systems use scheduling algorithms such as earliest deadline first (EDF) and least slack time first. However, EDF makes the assumption that disks are preemptable, and the seek-time overheads of its strict real-time scheduling result in poor disk utilization. We can provide the constant data rate necessary for real-time requests in various ways that require trade-offs. We analyze how trade-offs that involve buffer space affect the performance of scheduling policies. We also show that deferred deadlines, which increase buffer requirements, improve system performance significantly.
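The seek-time cost of strict EDF mentioned in this abstract can be illustrated with a small sketch. The `(deadline, track)` request representation, the sample values, and the use of total head movement as a proxy for seek overhead are assumptions for illustration; the paper's own policies and its deferred-deadline scheme are not reproduced here.

```python
import heapq
from typing import List, Tuple

Request = Tuple[float, int]  # (deadline in ms, disk track); hypothetical fields


def edf_order(requests: List[Request]) -> List[Request]:
    """Serve requests strictly by earliest deadline."""
    heap = list(requests)
    heapq.heapify(heap)
    return [heapq.heappop(heap) for _ in range(len(heap))]


def seek_distance(order: List[Request], start_track: int = 0) -> int:
    """Total head movement, in tracks, needed to serve requests in this order."""
    total, position = 0, start_track
    for _, track in order:
        total += abs(track - position)
        position = track
    return total


requests = [(30.0, 90), (10.0, 80), (20.0, 5), (40.0, 10)]
edf = edf_order(requests)
sweep = sorted(requests, key=lambda r: r[1])  # one-directional sweep, ignores deadlines
edf_seek = seek_distance(edf)
sweep_seek = seek_distance(sweep)
```

In this toy queue the deadline order makes the head cross the disk repeatedly, while the sweep order moves it far less but gives no deadline guarantee. Policies that relax deadlines in exchange for buffer space sit between these two extremes.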

19.
Krieger O., Stumm M., Unrau R. 《Computer》 1994, 27(3): 75-82
The authors introduce an application-level I/O facility, the Alloc Stream Facility, that addresses three primary goals. First, ASF addresses recent computing substrate changes to improve performance, allowing applications to benefit from specific features such as mapped files. Second, it is designed for parallel systems, maximizing concurrency and reporting errors properly. Finally, its modular and object-oriented structure allows it to support a variety of popular I/O interfaces (including stdio and C++ stream I/O) and to be tuned to system behavior, exploiting a system's strengths while avoiding its weaknesses. On a number of standard Unix systems, I/O-intensive applications perform substantially better when linked to the Alloc facility. Also, modifying applications to use a new interface provided by the facility can improve performance by another factor of two. These performance improvements are achieved primarily by reducing data copying and the number of system calls. Not visible in these improvements is the extra degree of concurrency the facility brings to multithreaded and parallel applications.

20.
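How to solve the I/O bottleneck effectively has long been a key technical problem for high-performance parallel computers. This paper proposes HPPIO, an efficient shared parallel I/O system that is based on the CC-NUMA parallel system architecture and adopts a series of efficient sharing and parallel I/O techniques. The paper describes its efficient shared parallel I/O architecture, which combines distributed and centralized organization, and the design of its high-performance I/O controller based on PCI Express.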
