20 similar documents found; search took 31 ms
1.
Bandwidth, latency, system speed, and, of course, the size of future microprocessor systems all depend heavily on interconnection technologies. Interconnection will become the key performance bottleneck as semiconductor technology improvements continue to reduce feature size. In this article, we describe the use of on-chip area I/O for future microprocessor systems on the basis of a case study we made of an Intel Pentium system. Area I/O is simply a method of locating I/Os over the entire chip instead of just the periphery. We show that with area I/O, system designers can achieve significant performance gains and size reductions at both the system and chip levels. We also explain how area I/O, in conjunction with high-density interconnects, leads to a new package and chip partitioning concept.
2.
A Hint Frequency Based Approach to Enhancing the I/O Performance of Multilevel Cache Storage Systems
Xiao-Dong Meng, Chen-Tao Wu, Min-Yi Guo, Jie Li, Xiao-Yao Liang, Bin Yao, Long Zheng 《Journal of Computer Science and Technology》2017,32(2):312-328
With enormous and increasing user demand, I/O performance is one of the primary considerations in building a data center. Several new technologies in data centers, such as tiered storage, have prompted the widespread use of multilevel cache techniques. In these storage systems, the upper level typically serves as a cache for the lower level, forming a distributed multilevel cache system. Although many excellent multilevel cache algorithms have been proposed to improve I/O performance, they can still be enhanced by exploiting the history information carried in hints. To address this challenge, in this paper we propose a novel hint frequency based approach (HFA) to improve the overall multilevel cache performance of storage systems. The main idea of HFA is to use hint frequencies (the total number of demotions/promotions performed via demote/promote hints) to efficiently explore the valuable history information of data blocks across multiple levels. HFA can be applied with several popular multilevel cache algorithms, such as Demote, Promote, and Hint-K. Simulation results show that, compared with the original multilevel cache algorithms such as Demote, Promote, and Hint-K, HFA can improve I/O performance by up to 20% under different I/O workloads.
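The abstract's core mechanism, counting how often each block is demoted or promoted and using that count as an eviction signal, can be pictured with a toy two-level cache. This is a hypothetical simplification for illustration, not the paper's HFA algorithm; the class and method names are invented.

```python
from collections import OrderedDict, defaultdict

class HintFrequencyCache:
    """Toy two-level cache: blocks that are demoted/promoted often have a
    high hint frequency and are kept in the upper level longer. A sketch of
    the hint-history idea only, not the paper's HFA."""

    def __init__(self, upper_size, lower_size):
        self.upper = OrderedDict()            # block -> None, in LRU order
        self.lower = OrderedDict()
        self.upper_size = upper_size
        self.lower_size = lower_size
        self.hint_freq = defaultdict(int)     # demotions + promotions per block

    def _demote(self):
        # Demote the upper-level block with the lowest hint frequency.
        while len(self.upper) > self.upper_size:
            victim = min(self.upper, key=lambda b: self.hint_freq[b])
            self.upper.pop(victim)
            self.hint_freq[victim] += 1       # record the demote hint
            self.lower[victim] = None
            while len(self.lower) > self.lower_size:
                self.lower.popitem(last=False)   # plain LRU at the lower level

    def access(self, block):
        if block in self.upper:
            self.upper.move_to_end(block)
            return "upper hit"
        if block in self.lower:
            self.lower.pop(block)
            self.hint_freq[block] += 1        # record the promote hint
            self.upper[block] = None
            self._demote()
            return "lower hit"
        self.upper[block] = None              # miss: install in the upper level
        self._demote()
        return "miss"
```

A block that keeps bouncing between levels accumulates a high hint frequency and stops being chosen as the demotion victim, which is the history effect the paper exploits.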
3.
4.
To enable efficient sharing of network I/O resources among multiple operating system kernels, an I/O virtualization method based on a global address space is proposed. The method adopts a para-virtualization design: with global address space support, both the master and slave kernels can issue I/O operations directly to the network device on the critical communication path, achieving the best possible I/O virtualization performance. Taking the HPP architecture as an example, this paper studies the key techniques for applying the proposed method to virtualize InfiniBand networks under HPP, implementing both OS bypass and master-kernel bypass for slave-kernel I/O communication. Tests on the Dawning 6000 prototype system show that, with identical master and slave kernel configurations, the communication performance of slave kernels using virtualized InfiniBand is comparable to that of the master kernel, and the impact of I/O virtualization on application performance is less than 2%.
5.
Memory devices can be used as storage systems to provide lower latency than disk and flash storage can achieve. However, traditional buffered input/output (I/O) and direct I/O are not optimized for memory-based storage: traditional buffered I/O incurs a redundant memory copy through the disk cache, and traditional direct I/O does not support byte addressing. Memory-mapped direct I/O optimizes file operations for byte-addressable persistent memory, which appears to the CPU as main memory; however, its interface is not always compatible with existing applications, and it cannot be used for peripheral memory devices (e.g., networked memory devices and hardware RAM drives) that are not attached to the memory bus. This paper presents a new Linux I/O layer, byte direct I/O (BDIO), that can process byte-addressable direct I/O using the standard application programming interface. It requires no modification of existing application programs and can be used not only for main memory but also for peripheral memory devices that are not addressable by a memory management unit. The proposed BDIO layer allows file systems and device drivers to support BDIO easily. The new I/O layer achieved 18% to 102% performance improvements in evaluation experiments with online transaction processing, file server, and desktop virtualization storage workloads.
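BDIO itself is a kernel-level layer, but the contrast the abstract draws, an explicit read() that copies through the page cache versus byte-addressable memory-mapped access, can be illustrated from user space. This is a rough analogy only; BDIO's actual interface is the standard read/write API inside the kernel.

```python
import mmap
import os
import tempfile

# Create a small file to read back two ways.
fd, path = tempfile.mkstemp()
os.write(fd, bytes(range(256)))
os.close(fd)

# Buffered I/O: an explicit read() copies data out of the kernel page cache
# into a user buffer.
with open(path, "rb") as f:
    f.seek(100)
    buffered = f.read(16)

# Memory mapping: the file appears in the address space and any byte range
# can be addressed directly, with no per-call copy into a separate buffer.
with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
    mapped = bytes(m[100:116])

assert buffered == mapped == bytes(range(100, 116))
os.unlink(path)
```

The mmap path is byte-granular like persistent memory access; BDIO's contribution is delivering that granularity through the ordinary read/write interface, including for devices a memory management unit cannot map.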
6.
7.
T. L. Kunii Y. Shinagawa R. M. Paul M. F. Khan A. A. Khokhar 《Multimedia Systems》1995,3(5-6):298-304
As the proliferation of multimedia systems continues in diverse application areas, it is becoming increasingly apparent that the performance of the I/O subsystem is a critical limiting factor in the usefulness of such systems. This has spurred extensive research to discover and design efficient and robust I/O systems for the storage and retrieval of multimedia data. To mitigate the effects of the I/O bottleneck in multimedia application environments, we must employ novel technologies and efficient algorithms, and we must use available resources carefully. This paper identifies some significant issues involved and presents a survey of the techniques developed or proposed during recent years to make multimedia I/O more efficient.
8.
Garcia-Carballeira F. Carretero J. Calderon A. Perez J.M. Garcia J.D. 《Parallel and Distributed Systems, IEEE Transactions on》2004,15(6):533-545
Caching has been intensively used in memory and traditional file systems to improve system performance. However, the use of caching in parallel file systems and I/O libraries has been limited to I/O nodes to avoid cache coherence problems. We specify an adaptive cache coherence protocol that is very suitable for parallel file systems and parallel I/O libraries. This model exploits the use of caching, both at processing and I/O nodes, providing performance improvement mechanisms such as aggressive prefetching and delayed-write techniques. The cache coherence problem is solved by using a dynamic scheme of cache coherence protocols with different sizes and shapes of granularity. The proposed model is very appropriate for parallel I/O interfaces, such as MPI-IO. Performance results, obtained on an IBM SP2, are presented to demonstrate the advantages offered by the cache management methods proposed.
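The two mechanisms the abstract names, aggressive prefetching and delayed writes, can be sketched with a minimal client-side cache. This is a hypothetical model of the ideas only, not the paper's MPI-IO coherence protocol; the class and counters are invented.

```python
class PrefetchingCache:
    """Client-side cache sketch: sequential prefetch on read misses, and
    delayed (batched) write-back of dirty blocks."""

    def __init__(self, backend, depth=2):
        self.backend = backend          # stand-in for an I/O node: block -> data
        self.cache = {}
        self.dirty = {}
        self.depth = depth              # how many following blocks to prefetch
        self.fetches = 0                # backend accesses performed

    def read(self, block):
        if block not in self.cache:
            # Aggressive prefetch: pull the next `depth` blocks in as well.
            for b in range(block, block + 1 + self.depth):
                if b in self.backend and b not in self.cache:
                    self.cache[b] = self.backend[b]
                    self.fetches += 1
        return self.cache[block]

    def write(self, block, data):
        # Delayed write: stage the update locally instead of writing through.
        self.cache[block] = data
        self.dirty[block] = data

    def flush(self):
        # One batched write-back replaces many small synchronous writes.
        self.backend.update(self.dirty)
        self.dirty.clear()
```

Sequential reads after the first miss hit the cache, and writes reach the backend only at flush time, which is exactly why such caching at processing nodes needs the coherence protocol the paper proposes.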
9.
10.
New intelligent adapters, advanced bus architectures, and powerful microprocessors have resulted in a new generation of personal computers with true multimedia capabilities. Collaborative applications are the most demanding applications of multimedia technologies today. We present a performance analysis of how effective video conferencing applications can be supported with personal computers connected through a local area network (LAN). We also evaluate the performance impact of an advanced, peer-to-peer I/O protocol. The key factor in the performance of a video conferencing system over a LAN is the video compression and decompression algorithms. At high video frame rates, the peer-to-peer I/O protocol performs better than the traditional, bus-master, interrupt-driven I/O protocol. 相似文献
11.
Local area networks and computer I/O are both interconnects that move information from one location to another. Despite this shared purpose, LANs have traditionally connected independent and widely separated computers. In contrast, computer I/O has traditionally connected a host to peripheral devices such as terminals, disks, and tape drives. Because these connection tasks were different, the architectures developed for one task were not suitable for the other. Consequently, the technologies used to implement one architecture could not address the issues faced by the other, and the technologies were seen as fundamentally different. However, an examination of the architectural requirements of modern I/O and LANs shows that the differences between the two technologies are now disappearing. We believe that LAN and I/O architectures are in fact converging, and that this convergence reflects significant changes in how and where computing resources are used. To illustrate this convergence and its implications, this article examines several modern LANs and channels.
12.
For I/O requests of the same type, the response time of a flash-based solid-state drive (SSD) is roughly a linear function of request size, and SSD read and write performance is asymmetric. Exploiting these properties, a size-based I/O scheduling algorithm for SSDs (SIOS) is proposed to improve SSD I/O performance in terms of average request response time. Given the read/write asymmetry, read and write requests are grouped and reads are served first. On this basis, small requests in the waiting queue are processed first, reducing the average waiting time of queued requests. Experiments on SLC and MLC SSDs, driven by five test workloads and compared with three scheduling algorithms in the Linux kernel, show that SIOS reduces average response time by 18.4%, 25.8%, 14.9%, 14.5%, and 13.1% on the SLC SSD, and by 16.9%, 24.4%, 13.1%, 13.0%, and 13.7% on the MLC SSD. The results show that SIOS effectively reduces the average response time of I/O requests and improves the I/O performance of SSD storage systems.
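The scheduling rule described, group by operation, serve reads before writes, and within each group serve the smallest request first, can be sketched with a linear service-time model. The per-KB costs below are made-up numbers for illustration; the real algorithm runs inside the block layer.

```python
def sios_order(queue):
    """Reads before writes; within each class, smallest request first."""
    reads = sorted((r for r in queue if r["op"] == "read"), key=lambda r: r["size"])
    writes = sorted((r for r in queue if r["op"] == "write"), key=lambda r: r["size"])
    return reads + writes

def avg_response(order, read_cost=0.1, write_cost=0.3):
    """Mean response time when service time grows linearly with request size
    (response = queueing wait + own service time); costs are hypothetical."""
    clock, total = 0.0, 0.0
    for r in order:
        per_kb = read_cost if r["op"] == "read" else write_cost
        clock += per_kb * r["size"]     # this request completes at `clock`
        total += clock
    return total / len(order)

queue = [{"op": "write", "size": 64},
         {"op": "read", "size": 32},
         {"op": "read", "size": 4}]

# Serving cheap reads first lowers the average response time versus FIFO.
assert avg_response(sios_order(queue)) < avg_response(queue)
```

This is the classic shortest-job-first effect: because SSD service time scales with request size and reads are cheaper than writes, reordering shrinks the mean wait without changing total work.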
13.
14.
Traditional file systems (such as UFS) have several inherent drawbacks when supporting cache servers: metadata consistency maintenance, synchronous write operations, memory copies, and multiple layers of caching. To address them, we designed and implemented Sloth, a new, efficient, and highly portable file system. The system is implemented at the application level and employs techniques such as asynchronous writes and file aggregation. Simulation experiments show that the Sloth file system effectively improves disk read/write performance and greatly reduces the number of disk accesses.
15.
Heuristics for scheduling I/O operations (Cited: 1; self-citations: 0, others: 1)
Jain R. Somalwar K. Werth J. Browne J.C. 《Parallel and Distributed Systems, IEEE Transactions on》1997,8(3):310-320
The I/O bottleneck in parallel computer systems has recently begun receiving increasing interest. Most attention has focused on improving the performance of I/O devices using fairly low-level parallelism in techniques such as disk striping and interleaving. Widely applicable solutions, however, will require an integrated approach that addresses the problem at multiple system levels, including applications, systems software, and architecture. We propose that within the context of such an integrated approach, scheduling parallel I/O operations will become increasingly attractive and can potentially provide substantial performance benefits. We describe a simple I/O scheduling problem and present approximate algorithms for its solution. The costs of using these algorithms in terms of execution time, and the benefits in terms of reduced time to complete a batch of I/O operations, are compared with the situations in which no scheduling is used, and in which an optimal scheduling algorithm is used. The comparison is performed both theoretically and experimentally. We have found that, in exchange for a small execution-time overhead, the approximate scheduling algorithms can provide substantial improvements in I/O completion times.
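A batch I/O scheduling problem of the kind described, where each transfer occupies one processor and one disk for a time slot and two transfers sharing either resource cannot run concurrently, admits a simple greedy approximation. This sketches the general idea under those assumptions; it is not one of the paper's algorithms.

```python
def greedy_io_schedule(ops):
    """Assign each I/O operation (processor, disk) to the earliest time slot
    in which both its processor and its disk are free."""
    busy = set()       # occupied (kind, resource_id, slot) triples
    schedule = {}
    for i, (proc, disk) in enumerate(ops):
        slot = 0
        while ("p", proc, slot) in busy or ("d", disk, slot) in busy:
            slot += 1
        busy.add(("p", proc, slot))
        busy.add(("d", disk, slot))
        schedule[i] = slot
    return schedule

# Four transfers between 2 processors and 2 disks fit in 2 slots,
# which matches the lower bound (each processor must do 2 transfers).
sched = greedy_io_schedule([(0, 0), (0, 1), (1, 0), (1, 1)])
```

Greedy slot assignment like this is fast but only approximate in general; the paper's contribution is quantifying how close such heuristics come to an optimal schedule and at what execution-time cost.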
16.
Design, Implementation, and Performance Testing of the VODStar Video-on-Demand System (Cited: 8; self-citations: 0, others: 8)
This technical report describes the design, implementation, and performance testing of VODStar, a video-on-demand system that adopts new channel scheduling schemes. The system combines two such schemes, CM (controlled multicast) and EEB (extended exponential broadcasting), and uses an architecture that couples a channel-scheduling-centric controller with a media-data-centric "data pump"; it implements protocols including RTSP, RTP, and SDP over IP networks. Performance tests show that VODStar effectively alleviates the server I/O and network bandwidth bottlenecks of traditional VOD systems and delivers comparatively high performance.
17.
Xiuqiao Li, Limin Xiao, Meikang Qiu, Bin Dong, Li Ruan 《The Journal of Supercomputing》2014,68(2):996-1021
Parallel file systems are serving more and more applications from various fields. Different applications have different I/O workload characteristics, which place diverse requirements on access to storage resources. However, parallel file systems often adopt a "one-size-fits-all" solution, which fails to meet specific application needs and hinders the full exploitation of potential performance. This paper presents a framework that enables dynamic, fine-grained file I/O path selection at runtime. The framework adopts a file handle-rich scheme that allows file systems to choose corresponding optimizations to serve I/O requests. Consistency control algorithms are proposed to ensure data consistency while changing optimizations at runtime. One case study on our prototype shows that choosing proper optimizations can improve I/O performance for small files and large files by up to 40% and 64.4%, respectively. Another case study shows that data prefetch performance for real-world application traces can be improved by up to 193% by selecting the correct prefetch patterns. Simulations in a large-scale environment also show that our method is scalable, and that both the memory consumption and the consistency control overhead are negligible.
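The "file handle-rich" idea, letting per-handle hints select an I/O optimization at request time, can be pictured as a small dispatcher. The optimization names, hint keys, and size threshold below are all invented for illustration and are not taken from the paper.

```python
# Hypothetical mapping from a workload hint to a serving optimization.
OPTIMIZATIONS = {
    "small_files": "request aggregation",
    "large_files": "striped transfers",
    "sequential":  "prefetching",
}

def choose_io_path(handle_hints, request_size, small_threshold=64 * 1024):
    """Pick the optimization for one request. If the file handle carries no
    workload hint, fall back to classifying the request by its size."""
    hint = handle_hints.get("workload")
    if hint is None:
        hint = "small_files" if request_size < small_threshold else "large_files"
    return OPTIMIZATIONS.get(hint, "default path")
```

Because the decision is made per handle and per request, two applications sharing the file system can take different I/O paths simultaneously, which is the flexibility the "one-size-fits-all" design lacks.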
18.
A critical performance issue for a number of scientific and engineering applications is the efficient transfer of data to secondary storage. Languages such as High Performance Fortran (HPF) have been introduced to allow programming distributed-memory systems at a relatively high level of abstraction. However, the present version of HPF does not provide appropriate constructs for controlling the parallel I/O capabilities of these systems. In this paper, constructs to specify parallel I/O operations on multidimensional arrays in the context of HPF are proposed. The paper also presents implementation concepts that are based on the HPF compiler VFC and the parallel I/O run-time system Panda. Experimental performance results are discussed in the context of financial management and traffic simulation applications.
19.
《Journal of Parallel and Distributed Computing》2005,65(10):1190-1203
Network contention hotspots can limit network throughput for parallel disk I/O, even when the interconnection network appears to be sufficiently provisioned. We studied I/O hotspots in mesh networks as a function of the spatial layout of an application's compute nodes relative to the I/O nodes. Our analytical modeling and dynamic simulations show that when I/O nodes are configured on one side of a two-dimensional mesh, realizable I/O throughput is at best bounded by four times the network bandwidth per link. Maximal performance depends on the spatial layout of jobs, and cannot be further improved by adding I/O nodes. Applying these results, we devised a new parallel layout allocation strategy (PLAS) which minimizes I/O hotspots and approaches the theoretical best case for parallel I/O throughput. Our I/O performance analysis and processor allocation strategy are applicable to a wide range of contemporary and emerging high-performance computing systems.
20.
With the growth of big data, Hadoop has become one of the important tools for big data processing. In practice, Hadoop's I/O operations constrain system performance. Hadoop usually reduces I/O by compressing data in software, but software compression is slow, so a hardware compression accelerator is used in its place. Because Hadoop runs on the Java virtual machine, it cannot directly invoke the underlying I/O hardware compression accelerator. By implementing Hadoop compressor/decompressor classes and designing a C++ dynamic link library, two key problems are solved: obtaining the data to be compressed from the Hadoop system, and streaming that data to the I/O hardware compression accelerator, thereby integrating the accelerator into the Hadoop framework. Experimental results show that the I/O hardware compression accelerator achieves a compression speed of 15.9 Byte/s/Hz, and that integrating it improves Hadoop system performance by a factor of two.