Similar literature
1.
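Today, disk I/O speed cannot keep pace with CPU speed, which follows Moore's law, and network I/O resources are scarce, so I/O is often the bottleneck in data processing. Hadoop can store petabytes of data, which makes the I/O problem more pronounced. Compression is an important I/O tuning technique: it reduces the I/O load and speeds up data transfer on disk and over the network. The paper first analyzes the characteristics of the compression algorithms available in Hadoop and derives a compression usage strategy that helps Hadoop users decide how to use compression; experiments then verify and supplement the strategy. Based on this strategy, some Hadoop applications can improve their efficiency by 65% when compression is used appropriately.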
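The trade-off behind such a strategy (a better compression ratio usually costs more CPU time) can be illustrated with a minimal sketch. It uses Python's standard-library codecs as stand-ins; these are an assumption for illustration and are not necessarily the codecs evaluated in the paper, and the sample data and the `measure` helper are hypothetical.

```python
import bz2
import lzma
import time
import zlib


def measure(name, compress, data):
    """Return (codec name, compression ratio, seconds taken) for one codec."""
    start = time.perf_counter()
    packed = compress(data)
    elapsed = time.perf_counter() - start
    return name, len(data) / len(packed), elapsed


# Hypothetical sample: repetitive log-like text, which compresses well.
sample = b"2017-08-01 INFO task attempt finished, bytes=4096\n" * 20000

results = [
    measure("zlib", zlib.compress, sample),
    measure("bz2", bz2.compress, sample),
    measure("lzma", lzma.compress, sample),
]

for name, ratio, seconds in results:
    print(f"{name:5s} ratio={ratio:10.1f} time={seconds * 1000:8.2f} ms")
```

Whether compression pays off depends on the workload: the time saved on disk and network transfer has to outweigh the CPU time spent compressing and decompressing, which is why the ratio and the elapsed time are reported side by side.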

2.
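In the big-data era, application domains demand ever more performance and reliability from computer storage systems. New storage media offer a good opportunity to improve storage performance, and SSD-based storage arrays (RAIS) are now widely used in many kinds of storage systems. In a traditional RAIS system, when one SSD fails, its data is recovered through a data reconstruction operation; reconstruction takes a long time and degrades the system's ability to serve I/O requests from upper-layer applications. To address this problem, the authors design and implement a storage-pool architecture based on multi-threaded concurrent processing. The architecture processes the I/O requests in the storage pool concurrently, which improves the performance of both user I/O and reconstruction I/O. They also propose a load-adaptive I/O scheduling policy that improves reconstruction efficiency while preserving the quality of service of user I/O. Experimental results show that the storage-pool-based multi-threaded concurrent I/O architecture improves data reconstruction performance, and that the load-adaptive policy dynamically adjusts the scheduling ratio between user I/O and reconstruction I/O according to the user I/O load.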

3.
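刘星, 江松, 王洋, 范小朋, 须成忠. Journal of Software (《软件学报》), 2017, 28(8): 1968-1981
Small synchronous writes are common in all kinds of computing environments and can be generated by software at every layer of a computer system, from the underlying operating system up to applications. However, an operating system's file system uses the block as its smallest logically addressable unit, so small writes cause severe write amplification and greatly reduce the system's I/O performance. To solve this problem, the authors propose a new I/O scheduler named "Hitchhike". The scheduler identifies small writes, compresses the data in another data block, and embeds the small data in the space freed by compression, so that the small data is written to disk together with that block. The synchronous small write is thus completed through asynchronous write-back, which both alleviates disk write amplification and greatly improves the efficiency of small synchronous writes. The authors implement a Hitchhike prototype based on the Deadline scheduler in Linux and use the Filebench benchmark to evaluate its throughput, I/O latency, and other metrics. Compared with traditional I/O schedulers, Hitchhike improves the performance of small synchronous writes by up to 48.6%.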
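The two ideas in this abstract, write amplification and embedding small data in the space freed by compressing another block, can be sketched as follows. This is only a user-space illustration under stated assumptions: the 4096-byte block size, the 4-byte length header, the use of `zlib`, and the names `write_amplification`, `try_hitchhike`, and `unpack` are all hypothetical and do not describe the scheduler's actual on-disk format or kernel implementation.

```python
import zlib

BLOCK_SIZE = 4096  # assumed block size in bytes
HEADER_SIZE = 4    # assumed header: 2 bytes per length field


def write_amplification(payload_size, block_size=BLOCK_SIZE):
    """Bytes physically written divided by bytes the application asked to write."""
    blocks = -(-payload_size // block_size)  # ceiling division
    return blocks * block_size / payload_size


def try_hitchhike(host_block, small_data, block_size=BLOCK_SIZE):
    """Compress the host block; if the freed space fits the small data,
    return one packed block holding both, otherwise return None."""
    compressed = zlib.compress(host_block)
    if HEADER_SIZE + len(compressed) + len(small_data) > block_size:
        return None  # not enough space freed; the small write cannot ride along
    header = len(compressed).to_bytes(2, "big") + len(small_data).to_bytes(2, "big")
    return (header + compressed + small_data).ljust(block_size, b"\0")


def unpack(block):
    """Recover the original host block and the embedded small data."""
    compressed_len = int.from_bytes(block[0:2], "big")
    small_len = int.from_bytes(block[2:4], "big")
    body_start = HEADER_SIZE
    body_end = body_start + compressed_len
    host_block = zlib.decompress(block[body_start:body_end])
    small_data = block[body_end:body_end + small_len]
    return host_block, small_data


host = b"A" * BLOCK_SIZE       # a highly compressible host block
record = b"commit-record-42"   # a small synchronous write
packed = try_hitchhike(host, record)
print(write_amplification(64), len(packed), unpack(packed) == (host, record))
```

In this sketch a 64-byte write that occupies a whole 4096-byte block has an amplification factor of 64, while a small record that rides along with an already-scheduled block adds no extra block write.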

4.
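High-performance computing systems need a reliable and efficient parallel file system. The Lustre cluster file system is a typical object-storage-based cluster file system that suits aggregated I/O over large volumes of data. Large-file I/O achieves very high bandwidth, but small-file I/O performs poorly. Targeting two aspects of Lustre's design that hinder small-file I/O, the paper proposes the Filter Cache method: a cache that holds small-file I/O data is added to Lustre's OST component so that small-file I/O on the OST side proceeds asynchronously. This reduces the completion time of small-file I/O as perceived by the user and improves small-file I/O performance.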

5.
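Maintaining cache coherence is key to building shared-memory multiprocessor systems, and the way the distributed shared I/O system and I/O data coherence are implemented directly affects the performance of a CC-NUMA system. Based on SCCMP (scalable cache coherence multi-processors), a large-scale CC-NUMA system, the authors build a distributed shared I/O subsystem on the HyperTransport interconnect architecture. Hardware maintains the coherence of data accessed by I/O devices through DMA, which solves the distributed cache coherence problem. The paper analyzes in detail how I/O accesses affect the cache protocol, describes the I/O data coherence policy and its hardware implementation, and evaluates the system's I/O performance on an FPGA verification platform.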

6.
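Disk contention in Oracle is a common problem that slows I/O and degrades system performance. It can be reduced effectively, and system performance improved, by separating sequential I/O; using striping to spread random I/O; storing data and indexes separately; eliminating non-Oracle I/O on the disks; reducing row migration and row chaining; and reducing fragmentation.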

7.
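Design of a Parallel File System   Cited by: 4 (self-citations: 2, citations by others: 2)
李群, 谢立. Computer Science (《计算机科学》), 1996, 23(4): 35-39
As processors and networks become faster and faster, external I/O devices have fallen behind by three to four orders of magnitude and have become the bottleneck that limits overall system speed. At the same time, applications such as multimedia and image processing require ever higher data transfer rates, so a high-speed I/O subsystem is needed to bridge the speed gap between disks and processors and to support the high data transfer rates of I/O-intensive applications. To raise the I/O rate and provide large I/O bandwidth, disks can be used in parallel at the hardware level. For an MIMD system that uses a multi-disk input/output subsystem,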

8.
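Using the open-source Hadoop platform, this work studies whether MapReduce is suitable for network-I/O-intensive programs that operate on lightweight data sets. The authors use the MapReduce programming model to rework a typical application of this kind, an FTP site scanner; build a small Hadoop cluster; adjust Hadoop's default configuration; and test the performance of the program before and after the rework with real data. The experiments show that the MapReduce programming model distributes well and is applicable to network-I/O-intensive programs with lightweight data sets.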

9.
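This paper proposes an I/O acceleration strategy for iSCSI network computing systems: transparent, reliable iSCSI multicast based on similar payloads. By checking the data blocks of I/O requests for payload similarity and extending the original iSCSI protocol, the strategy builds two data transfer paths in an iSCSI network computing system, the iSCSI protocol packet path and a multicast path for similar payloads, to speed up the loading of network computing data. Test results show that the strategy effectively improves the concurrent I/O performance of multiple client hosts.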

10.
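An Efficient Virtual-Machine-Based Method for Disk I/O Characteristics Analysis   Cited by: 1 (self-citations: 0, citations by others: 1)
沈玉良, 许鲁. Journal of Software (《软件学报》), 2010, 21(4): 849-862
Because disk systems are inherently mechanical, disk I/O often becomes the performance bottleneck of a computer system. Collecting and analyzing the disk I/O characteristics of an application system is therefore an important basis for performance optimization. Unlike earlier I/O characteristics analysis methods, this paper presents an online method for analyzing disk I/O characteristics based on the Xen 3.0 virtual machine system. In a virtual machine environment, the collection method can be applied transparently to any operating system without modifying it. The method efficiently collects many basic I/O characteristics online, including disk I/O block size, I/O latency, I/O inter-arrival time, I/O spatial locality, temporal locality, and the distribution of disk I/O hot spots. Tests and analysis show that the online method has low system overhead and very little impact on the I/O performance of the application system. The paper also presents I/O characteristics analysis results for workloads such as large-file copy and Filebench's filemicro and varmail.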

11.
The Hadoop Distributed File System (HDFS) is designed to run on commodity hardware and can be used as a stand-alone general purpose distributed file system (Hdfs user guide, 2008). It provides the ability to access bulk data with high I/O throughput. As a result, this system is suitable for applications that have large I/O data sets. However, the performance of HDFS decreases dramatically when handling interaction-intensive files, i.e., files that are relatively small but frequently accessed. The paper analyzes the cause of the throughput degradation when accessing interaction-intensive files and presents an enhanced HDFS architecture, along with an associated storage allocation algorithm, that overcomes the performance degradation problem. Experiments show that with the proposed architecture and storage allocation algorithm, HDFS throughput for interaction-intensive files increases by 300% on average, with only a negligible performance decrease for large data set tasks.

12.
In the last decades, there has been an explosion in the volume of data to be processed by data-intensive computing applications. As a result, processing I/O operations efficiently has become an important challenge. SSDs (solid state drives) are an effective solution that not only improves the I/O throughput but also reduces the amount of I/O transfer by adopting the concept of active SSDs. Active SSDs offload a part of the data-processing tasks usually performed in the host to the SSD. Offloading data-processing tasks removes extra data transfer and improves the overall data processing performance. In this work, we propose ActiveSort, a novel mechanism to improve the external sorting algorithm using the concept of active SSDs. External sorting is used extensively in data-intensive computing frameworks such as Hadoop. By performing merge operations on-the-fly within the SSD, ActiveSort reduces the amount of I/O transfer and improves the performance of external sorting in Hadoop. Our evaluation results on a real SSD platform indicate that the Hadoop applications using ActiveSort outperform the original Hadoop by up to 36.1%. ActiveSort reduces the amount of data written by up to 40.4%, thereby improving the lifetime of the SSD.
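The merge phase of external sorting that this abstract refers to can be sketched on the host side as follows. This is a minimal illustration, not ActiveSort itself: ActiveSort performs the merge inside the SSD, which cannot be reproduced in user-space Python. The helper names `write_sorted_runs` and `merge_runs`, the temporary-file layout, and the integer-per-line format are assumptions made for the example.

```python
import heapq
import tempfile


def write_sorted_runs(values, run_size):
    """Split the input into chunks that fit in memory, sort each chunk,
    and store it as a sorted run in a temporary file."""
    runs = []
    for start in range(0, len(values), run_size):
        run = tempfile.TemporaryFile(mode="w+")
        for value in sorted(values[start:start + run_size]):
            run.write(f"{value}\n")
        run.seek(0)  # rewind so the run can be read back during the merge
        runs.append(run)
    return runs


def merge_runs(runs):
    """Lazily merge the sorted runs, yielding one value at a time
    instead of writing a merged intermediate file."""
    streams = [(int(line) for line in run) for run in runs]
    yield from heapq.merge(*streams)


data = [5, 3, 9, 1, 7, 2, 8]
runs = write_sorted_runs(data, run_size=3)
merged = list(merge_runs(runs))
print(merged)

for run in runs:
    run.close()
```

Because `merge_runs` is a generator, the merged output is produced while it is being read rather than being written out first, which is the property the abstract credits with reducing write traffic.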

13.
Introduces queuing network models for the performance analysis of SPMD (single-program, multiple-data) applications executed on general-purpose parallel architectures such as MIMD (multiple-instruction, multiple-data) machines and clusters of workstations. The models are based on the pattern of computation, communication and I/O operations of typical parallel applications. Analysis of the models leads to the definition of speedup surfaces which capture the relative influence of processors and I/O parallelism and show the effects of different hardware and software components on the performance. Since the parameters of the models correspond to measurable program and hardware characteristics, the models can be used to anticipate the performance behavior of a parallel application as a function of the target architecture (i.e., the number of processors, number of disks, I/O topology, etc.).

14.
Database machines are special purpose backend architectures that are designed to support database management system operations efficiently. An important problem in the development of database machines has been that of increasing their performance. Earlier research on the performance evaluation of database machines has indicated that I/O operations constitute a principal performance bottleneck. This is increasingly the case with the advances in multiprocessing and the growth in the volume of data handled by a database machine. One possible strategy to improve the performance of such a system, which handles huge volumes of data, is to store the data in a compressed form. This can be achieved by introducing VLSI chips for data compression so that data can be compressed and decompressed "on-the-fly". A set of hardware algorithms for data compression based on the Huffman coding scheme, proposed in an earlier work, is described. The main focus of this paper is the investigation conducted by the authors to study the effect of incorporating such hardware in a special purpose backend relational database machine. Detailed analytical models of a relational database machine and the analytical results that quantify the performance improvement due to compression hardware are presented.
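The Huffman coding scheme mentioned in this abstract can be illustrated with a short software sketch. It shows only the textbook algorithm (repeatedly merging the two least frequent subtrees); it is not the hardware algorithm described in the paper, and the function names `huffman_codes`, `encode`, and `decode` are chosen for this example.

```python
import heapq
from collections import Counter


def huffman_codes(data):
    """Build a prefix-code table {symbol: bit string} from symbol frequencies."""
    freq = Counter(data)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs a one-bit code.
        return {next(iter(freq)): "0"}
    # Heap entries are (weight, tie_breaker, {symbol: code_so_far}); the unique
    # tie_breaker ensures the dictionaries themselves are never compared.
    heap = [(weight, i, {symbol: ""}) for i, (symbol, weight) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        weight_a, _, left = heapq.heappop(heap)
        weight_b, _, right = heapq.heappop(heap)
        merged = {symbol: "0" + code for symbol, code in left.items()}
        merged.update({symbol: "1" + code for symbol, code in right.items()})
        heapq.heappush(heap, (weight_a + weight_b, next_id, merged))
        next_id += 1
    return heap[0][2]


def encode(data, codes):
    """Concatenate the code of every symbol into one bit string."""
    return "".join(codes[symbol] for symbol in data)


def decode(bits, codes):
    """Walk the bit string and emit a symbol whenever a full code is matched."""
    reverse = {code: symbol for symbol, code in codes.items()}
    output, current = [], ""
    for bit in bits:
        current += bit
        if current in reverse:
            output.append(reverse[current])
            current = ""
    return "".join(output)


text = "abracadabra"
codes = huffman_codes(text)
bits = encode(text, codes)
print(len(bits), "bits instead of", 8 * len(text))
print(decode(bits, codes) == text)
```

Frequent symbols receive shorter codes, so stored data shrinks and fewer bytes cross the I/O path, at the cost of encoding and decoding work that the paper proposes to move into dedicated hardware.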

15.
This paper introduces queuing network models for the performance analysis of SPMD applications executed on general-purpose parallel architectures such as MIMD and clusters of workstations. The models are based on the pattern of computation, communication, and I/O operations of typical parallel applications. Analysis of the models leads to the definition of speedup surfaces which capture the relative influence of processors and I/O parallelism and show the effects of different hardware and software components on the performance. Since the parameters of the models correspond to measurable program and hardware characteristics, the models can be used to anticipate the performance behavior of a parallel application as a function of the target architecture (i.e., number of processors, number of disks, I/O topology, etc).

16.
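The Hadoop Distributed File System (HDFS) is a low-cost, highly fault-tolerant distributed file system suited to running on commodity hardware. It provides high-throughput data access and suits applications that work on large data sets. However, HDFS still faces some performance optimization problems, such as insufficient load balancing. Although the load balancer that ships with Hadoop can rebalance the cluster, it requires the user to supply a static threshold in advance. To overcome the fixed and subjective nature of this threshold, the authors analyze, evaluate, and optimize parameters such as disk space usage, CPU utilization, memory utilization, disk I/O utilization, and network bandwidth utilization to form an expression that computes the threshold, and they validate the threshold calculation and the load balancing through theoretical analysis and simulation experiments. The experimental results show that, compared with Hadoop's algorithm based on a static input threshold, the method achieves better balance and improves the utilization of computing resources.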

17.
The paper proposes using hardware accelerators for analysis and data processing in on-chip logic control systems that include an interacting processor system, memory, and configurable logic components. Data processing is assumed to execute operations over sets of elements, each of which can be activated by software and realized in hardware as parallel networks that allow pipelined processing where necessary. New methods for designing and using sorting and search networks are proposed, and the results of their theoretical and experimental comparison with existing networks are presented.

18.
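ROPDetector: A Real-Time ROP Attack Detection Method Based on Hardware Performance Counters   Cited by: 1 (self-citations: 0, citations by others: 1)
Return-Oriented Programming (ROP) is one of the most widely used attack techniques for exploiting software vulnerabilities; it can bypass defenses such as data execution prevention and address space layout randomization. This paper proposes a hardware-based method for detecting ROP attacks in real time. Without requiring any side information (such as source code or compiler support) or binary rewriting, the method uses the hardware performance counters in modern CPUs to monitor the execution of the target program and extracts the low-level hardware event features that appear during a ROP attack in order to detect the attack in real time. The authors implement a prototype system, ROPDetector, in a 32-bit Linux environment, run experiments with real ROP attacks and vulnerabilities, compare the method experimentally with similar approaches, and finally evaluate the system's performance overhead. The experimental results show that the method effectively detects real ROP attacks. With detection periods of 6 and 9 mispredicted return instructions, the system performance overhead is only 5.05% and 5.25%, the disk I/O performance overhead only 0.94% and 2%, and the network I/O performance overhead only 0.06% and 0.78%, respectively.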

19.
Patt, Y.N. Computer, 1994, 27(3): 15-16
A computer system can be partitioned into hardware and the software executing on that hardware. The hardware consists of processor(s), memory, and "everything else". The "everything else" we generally combine under the umbrella "I/O", whose job it is to manage the availability of information to and from the processor(s) and memory. That information comes from storage devices, networks, and nonstorage devices. The I/O subsystem is the collection of all three; its influence on performance is a reflection of how well it manages the availability of information to and from all three. The impression today, from both the hardware side and the software side, is that the I/O subsystem can certainly stand improvement. The author considers improvements to the I/O subsystem.
