Similar Documents
1.
In performance analysis of computer systems, trace-driven simulation techniques have the important advantage of credibility and accuracy. Unfortunately, traces are usually difficult to obtain, and little work has been done to provide efficient tools to help in the process of reducing and manipulating them. This paper presents TRAMP, a tool for the data reduction and data analysis phases of trace-driven simulation studies. TRAMP has three main advantages: it accepts a variety of common trace formats; it provides a programmable user interface in which many actions can be directly specified; and it is easy to extend. TRAMP is particularly helpful for reducing and analysing complex trace data, such as traces of file system or database activity. This paper presents the design principles and implementation techniques of TRAMP and provides a few concrete examples of the use of this tool.

2.
With the increasing presence, scale, and complexity of distributed systems, resource failures are becoming an important and practical topic of computer science research. While numerous failure models and failure-aware algorithms exist, their comparison has been hampered by the lack of public failure data sets and data processing tools. To facilitate the design, validation, and comparison of fault-tolerant models and algorithms, we have created the Failure Trace Archive (FTA)—an online, public repository of failure traces collected from diverse parallel and distributed systems. In this work, we first describe the design of the archive, in particular of the standard FTA data format, and the design of a toolbox that facilitates automated analysis of trace data sets. We also discuss the use of the FTA for various current and future purposes. Second, after applying the toolbox to nine failure traces collected from distributed systems used in various application domains (e.g., HPC, Internet operation, and various online applications), we present a comparative analysis of failures in various distributed systems. Our analysis presents various statistical insights and typical statistical modeling results for the availability of individual resources in various distributed systems. The analysis results underline the need for public availability of trace data from different distributed systems. Last, we show how different interpretations of the meaning of failure data can result in different conclusions for failure modeling and job scheduling in distributed systems. Our results for different interpretations show evidence that there may be a need for further revisiting existing failure-aware algorithms, when applied for general rather than for domain-specific distributed systems.
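The FTA toolbox itself is not reproduced here, but the per-resource availability analysis described above can be illustrated with a minimal sketch. The record layout, field names, and the sample events below are assumptions for illustration, not the actual FTA data format.

```python
from collections import defaultdict

def availability(events, observation_time):
    """Compute per-resource availability from (resource, start, end, state) events.

    `events` is a list of tuples; `state` is 'up' or 'down'. This layout is an
    assumed, simplified stand-in for a real failure-trace format such as the FTA's.
    """
    up_time = defaultdict(float)
    for resource, start, end, state in events:
        if state == 'up':
            up_time[resource] += end - start
    return {r: t / observation_time for r, t in up_time.items()}

# Hypothetical trace fragment: two nodes observed for 1000 seconds.
events = [
    ('node-1', 0, 700, 'up'), ('node-1', 700, 750, 'down'), ('node-1', 750, 1000, 'up'),
    ('node-2', 0, 1000, 'up'),
]
print(availability(events, observation_time=1000.0))  # {'node-1': 0.95, 'node-2': 1.0}
```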

3.
This paper designs a networked distributed file system based on software-defined networking (SDN). By fully exploiting dynamic data about the underlying network, the system obtains the best path for real-time data transfers, so that the performance of the distributed file system is significantly improved by effectively steering its large data flows. File read/write and repair operations were tested on a prototype of the system; compared with a traditional network environment, the SDN-based distributed file system shows significantly better read/write and repair performance and is better suited to scenarios with heavy and markedly heterogeneous network traffic.
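The abstract does not state the path-selection algorithm; the sketch below only illustrates the general idea of picking a transfer path from live network measurements, here as a shortest-path computation over assumed per-link load costs. The topology, node names, and cost model are hypothetical.

```python
import heapq

def best_path(links, src, dst):
    """Pick a least-cost path using current link costs (e.g. measured utilisation).

    `links` maps a node to a list of (neighbour, cost) pairs; using utilisation as
    the edge cost is an assumption made for this illustration.
    """
    dist, prev = {src: 0.0}, {}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > dist.get(u, float('inf')):
            continue
        for v, cost in links.get(u, []):
            nd = d + cost
            if nd < dist.get(v, float('inf')):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    path, node = [dst], dst
    while node != src:
        node = prev[node]
        path.append(node)
    return list(reversed(path))

# Hypothetical topology: costs stand in for link load reported by the SDN controller.
links = {'s1': [('s2', 0.2), ('s3', 0.9)], 's2': [('s4', 0.3)], 's3': [('s4', 0.1)]}
print(best_path(links, 's1', 's4'))  # ['s1', 's2', 's4']
```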

4.
Name service is a core part of a heterogeneous distributed file system. Drawing on many distributed file systems at home and abroad, this paper proposes a new naming scheme, export sharing, which organically combines local and remote file systems while preserving the naming syntax of the original file system. The name server is implemented with a hybrid distribution scheme, achieving location transparency.

5.
Analysing the workloads that run on a file system helps optimise the performance of distributed file systems and is essential for building new storage systems. As workloads grow in complexity and vary widely in scale, intuition-based analysis cannot fully capture the characteristics of workload traces. To address this, a model for distributed log analysis and workload feature extraction is proposed. First, read- and write-related information is extracted from distributed file system logs by keyword; second, workload features are characterised both statistically and temporally; finally, the potential for system optimisation based on these features is analysed. Experimental results show that the model is feasible and accurate, produces fairly detailed statistical and temporal workload features, and offers low overhead, good timeliness, and ease of analysis. It can be used to guide the synthesis of workloads with the same characteristics, hot-data monitoring, and cache prefetching optimisation.
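A minimal sketch of the keyword-based extraction and the statistical/temporal characterisation described above. The log line layout, the READ/WRITE keywords, and the chosen features are assumptions, since the paper's actual log format is not given.

```python
import re
import statistics

# Assumed log layout: "<timestamp> <op> <path> <bytes>", with op containing READ or WRITE.
LINE = re.compile(r'(?P<ts>\d+(?:\.\d+)?)\s+(?P<op>\S*(READ|WRITE)\S*)\s+(?P<path>\S+)\s+(?P<size>\d+)')

def extract_features(lines):
    """Keep only read/write records and summarise them statistically and temporally."""
    records = [m.groupdict() for m in (LINE.search(l) for l in lines) if m]
    features = {}
    for op in ('READ', 'WRITE'):
        recs = [r for r in records if op in r['op']]
        if not recs:
            continue
        sizes = [int(r['size']) for r in recs]
        times = sorted(float(r['ts']) for r in recs)
        gaps = [b - a for a, b in zip(times, times[1:])]
        features[op] = {
            'count': len(recs),
            'mean_size': statistics.mean(sizes),
            'mean_interarrival': statistics.mean(gaps) if gaps else None,
        }
    return features

sample = ['100.0 OP_READ /data/a 4096', '100.5 OP_WRITE /data/b 8192', '101.0 OP_READ /data/a 4096']
print(extract_features(sample))
```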

6.
By detecting hidden files inside a virtual machine, a detection tool can promptly determine whether the VM has been attacked. Traditional file detection tools reside inside the monitored VM and are therefore vulnerable to malware. Based on virtual machine introspection, a modular VM file detection method, FDM, is designed and implemented. Using knowledge of the operating system kernel, FDM parses the physical hardware on which the VM depends, builds a semantic view of the VM's files, and discovers hidden files by comparing this view with the file list obtained inside the VM. FDM implements hardware state parsing and OS semantic information acquisition as separate modules, so it retains the tamper resistance of VM introspection while gaining the portability and efficiency of a modular architecture. Experimental results show that FDM detects hidden files inside a VM accurately and quickly.
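FDM's hardware-state parsing is far more involved than can be shown here; the sketch below only illustrates the final comparison step, reporting files that appear in the externally reconstructed semantic view but are missing from the listing produced inside the VM. Both file sets are hypothetical inputs.

```python
def hidden_files(introspected_view, in_vm_listing):
    """Files visible in the externally reconstructed view but absent from the
    in-VM listing are candidates for files hidden by malware."""
    return sorted(set(introspected_view) - set(in_vm_listing))

# Hypothetical file sets: the view built outside the VM versus the listing reported inside it.
view = {'/etc/passwd', '/tmp/.rootkit.ko', '/var/log/syslog'}
inside = {'/etc/passwd', '/var/log/syslog'}
print(hidden_files(view, inside))  # ['/tmp/.rootkit.ko']
```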

8.
Design and Implementation of a Distributed File System Based on Network Software RAID
With the rapid development of computer and networking technology, networked distributed storage has become a focus of storage research. As the manager of data in a distributed storage system, a distributed file system provides users on different nodes with a file access interface to the underlying storage system. This paper designs and implements a distributed file system based on network software RAID and describes its design ideas, overall structure, and key techniques in detail.

9.
This research proposes and tests an approach to engineering distributed file systems that are aimed at wide-scale, Internet-based use. The premise is that replication is essential to deliver performance and availability, yet the traditional conservative replica consistency algorithms do not scale to this environment. Our Ficus replicated file system uses a single-copy availability, optimistic update policy with reconciliation algorithms that reliably detect concurrent updates and automatically restore the consistency of directory replicas. The system uses the peer-to-peer model in which all machines are architectural equals but still permits configuration in a client-server arrangement where appropriate. Ficus has been used for six years at several geographically scattered installations. This paper details and evaluates the use of optimistic replica consistency, automatic update conflict detection and repair, the peer-to-peer (as opposed to client-server) interaction model, and the stackable file system architecture in the design and construction of Ficus. The paper concludes with a number of lessons learned from the experience of designing, building, measuring, and living with an optimistically replicated file system. © 1998 John Wiley & Sons, Ltd.
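Ficus's reconciliation algorithms are not reproduced here; the sketch below only illustrates, with assumed data structures, how optimistic replication schemes of this kind typically detect concurrent updates by comparing per-replica version vectors.

```python
def compare(vv_a, vv_b):
    """Compare two version vectors: 'equal', 'a_dominates', 'b_dominates', or 'conflict'.

    A version vector maps a replica id to the number of updates it has applied; this
    structure is a generic illustration, not Ficus's actual replica metadata.
    """
    replicas = set(vv_a) | set(vv_b)
    a_ahead = any(vv_a.get(r, 0) > vv_b.get(r, 0) for r in replicas)
    b_ahead = any(vv_b.get(r, 0) > vv_a.get(r, 0) for r in replicas)
    if a_ahead and b_ahead:
        return 'conflict'          # concurrent updates: needs reconciliation/repair
    if a_ahead:
        return 'a_dominates'
    if b_ahead:
        return 'b_dominates'
    return 'equal'

print(compare({'r1': 2, 'r2': 0}, {'r1': 1, 'r2': 1}))  # 'conflict'
```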

10.
Understanding the behavioural aspects of a software system can be made easier if efficient tool support is provided. Lately, there has been an increase in the number of tools for analysing execution traces. These tools, however, have different formats for representing execution traces, which hinders interoperability and limits reuse and sharing of data. To allow for better synergies among trace analysis tools, it would be beneficial to develop a standard format for exchanging traces. In this paper, we present a graph-based format, called compact trace format (CTF), which we hope will lead the way towards such a standard. CTF can model traces generated from a variety of programming languages, including both object-oriented and procedural ones. CTF is built with scalability in mind to overcome the vast size of most interesting traces. Indeed, the design of CTF is based on the idea that call trees can be transformed into more compact ordered directed acyclic graphs by representing similar subtrees only once. CTF is also supported by our trace analysis tool SEAT (Software Exploration and Analysis Tool).
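The compaction idea can be sketched briefly with hash-consing: subtrees of the call tree are stored only once, turning the tree into a smaller DAG. The node representation and the toy call tree below are assumptions, and for simplicity the sketch shares only identical (rather than merely similar) subtrees.

```python
def compact(call, children, table=None):
    """Hash-cons a call tree bottom-up so identical subtrees share one node id.

    `call` is the routine name and `children` a list of (call, children) subtrees.
    Returns the node id; `table` maps a structural key to its unique DAG node id.
    """
    if table is None:
        table = {}
    child_ids = tuple(compact(c, cc, table) for c, cc in children)
    key = (call, child_ids)
    if key not in table:
        table[key] = len(table)   # new unique node in the DAG
    return table[key]

# Toy call tree: main calls f twice, and f always calls g; both f-subtrees collapse to one node.
table = {}
root = compact('main', [('f', [('g', [])]), ('f', [('g', [])])], table)
print(len(table))  # 3 unique nodes (g, f, main) instead of 5 tree nodes
```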

11.
Research on a Performance-Optimised Small-File Storage and Access Strategy
In distributed file systems, small-file management generally suffers from poor access performance and significant wasted storage space. To address these problems, a performance-optimised small-file storage and access (SFSA) strategy is proposed. SFSA stores logically contiguous data in contiguous regions of the physical disk whenever possible, uses a cache in the role of the metadata server, and improves cache utilisation with simplified file information nodes, thereby improving small-file access performance. When writing, it aggregates updated data and related data in the same folder domain into a single I/O request, reducing file fragmentation and improving storage space utilisation. During file transfer it exploits locality by sending batches of frequently accessed small files in advance, reducing connection setup overhead and improving transfer performance. Theoretical analysis and experiments show that SFSA's design effectively optimises small-file storage and access performance.
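A minimal sketch of the write-aggregation idea described above: updates to small files and to related data in the same folder domain are buffered and flushed as a single I/O request. The buffer layout, the flush threshold, and grouping the "folder domain" by parent directory are assumptions for illustration, not SFSA's actual design.

```python
import os
from collections import defaultdict

class AggregatingWriter:
    """Buffer small-file updates per folder domain and flush each domain in one I/O."""

    def __init__(self, flush_threshold=64 * 1024):
        self.flush_threshold = flush_threshold          # assumed threshold, not from the paper
        self.buffers = defaultdict(list)                # folder domain -> [(path, bytes), ...]

    def write(self, path, data):
        domain = os.path.dirname(path)                  # group related data by parent directory
        self.buffers[domain].append((path, data))
        if sum(len(d) for _, d in self.buffers[domain]) >= self.flush_threshold:
            self.flush(domain)

    def flush(self, domain):
        batch = self.buffers.pop(domain, [])
        if batch:
            # In a real system this single request would be issued to the storage layer.
            print(f"one I/O request for {domain}: {len(batch)} files, "
                  f"{sum(len(d) for _, d in batch)} bytes")

w = AggregatingWriter(flush_threshold=10)
w.write('/photos/a.jpg', b'12345')
w.write('/photos/b.jpg', b'67890')   # second write crosses the threshold and triggers the flush
```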

12.
As the cloud computing market has grown in recent years, cloud storage systems, the infrastructure underlying cloud platforms, have become increasingly important. Tens of thousands of Internet applications already run in cloud environments, and many more are about to move from traditional environments to cloud platforms. Different applications have different storage requirements: the size of individual files, the number of files, I/O access patterns, and read/write ratios all place different demands on the underlying storage system. This suggests that in a cloud environment a single file system may be unable to satisfy the storage needs of all applications, so this paper attempts to optimise overall storage performance by deploying several different distributed file systems within a single cloud platform. Optimising such a hybrid file system first requires analysing the performance characteristics of the individual file systems. This paper quantitatively analyses several distributed file systems commonly used in cloud environments: ceph, moosefs, glusterfs, and hdfs. The experimental results show that, even for identical read/write operations on the same file, performance differs markedly across distributed file systems: when individual files are smaller than 256MB, moosefs's average write performance is 22.3% higher than that of the other file systems; when individual files are larger than 256KB, glusterfs's average read performance is 21.0% higher than that of the others. These results provide the basis for designing and implementing a hybrid file system built on these distributed file systems.

13.
We describe a simulator which emulates the activity of a shared memory, common bus multiprocessor system with private caches. Both kernel and user program activities are considered, thus allowing an accurate analysis and evaluation of coherence protocol performance. The simulator can generate synthetic traces, based on a wide set of input parameters which specify processor, kernel and workload features. Other parameters allow us to detail the multiprocessor architecture for which the analysis has to be carried out. An actual-trace-driven simulation is possible, too, in order to evaluate the performance of a specific multiprocessor with respect to a given workload, if traces concerning this workload are available. In a separate section, we describe how actual traces can also be used to extract a set of input parameters for synthetic trace generation. Finally, we show how the simulator may be successfully employed to carry out a detailed performance analysis of a specific coherence protocol.
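The simulator's parameter set is much richer than can be shown here; the sketch below only illustrates generating a synthetic memory-reference trace from a few workload parameters (write ratio, probability of touching shared data). The parameter names, record layout, and address model are assumptions, not the simulator's actual interface.

```python
import random

def synthetic_trace(n_refs, n_cpus, write_ratio, shared_prob,
                    shared_blocks=64, private_blocks=1024, seed=0):
    """Generate (cpu, op, block) records mimicking a shared-memory multiprocessor workload.

    The parameters are illustrative stand-ins for the simulator's workload/kernel knobs.
    """
    rng = random.Random(seed)
    trace = []
    for _ in range(n_refs):
        cpu = rng.randrange(n_cpus)
        op = 'W' if rng.random() < write_ratio else 'R'
        if rng.random() < shared_prob:
            block = rng.randrange(shared_blocks)                      # shared region
        else:
            block = shared_blocks + cpu * private_blocks + rng.randrange(private_blocks)
        trace.append((cpu, op, block))
    return trace

for record in synthetic_trace(5, n_cpus=4, write_ratio=0.3, shared_prob=0.2):
    print(record)
```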

14.
This paper analyses the shortcomings of traditional file monitoring techniques and then describes in detail the study and in-kernel implementation of real-time file monitoring on Linux, together with the methods and implementations for intercepting system calls on different Linux versions.

15.
In many-task computing (MTC), applications such as scientific workflows or parameter sweeps communicate via intermediate files; application performance strongly depends on the file system in use. The state of the art uses runtime systems providing in-memory file storage that is designed for data locality: files are placed on those nodes that write or read them. With data locality, however, task distribution conflicts with data distribution, leading to application slowdown, and worse, to prohibitive storage imbalance. To overcome these limitations, we present MemFS, a fully symmetrical, in-memory runtime file system that stripes files across all compute nodes, based on a distributed hash function. Our cluster experiments with Montage and BLAST workflows, using up to 512 cores, show that MemFS has both better performance and better scalability than the state-of-the-art, locality-based file system, AMFS. Furthermore, our evaluation on a public commercial cloud validates our cluster results. On this platform MemFS shows excellent scalability up to 1024 cores and is able to saturate the 10G Ethernet bandwidth when running BLAST and Montage.
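A minimal sketch of the symmetrical placement idea: each stripe of a file is mapped to a compute node by hashing the file name and stripe index, so no node is preferred. The hash choice and stripe size are assumptions; MemFS's actual distributed hash function is not specified in the abstract.

```python
import hashlib

def stripe_placement(path, file_size, nodes, stripe_size=4 * 1024 * 1024):
    """Map each stripe of a file to a node via a hash of (path, stripe index)."""
    n_stripes = (file_size + stripe_size - 1) // stripe_size
    placement = []
    for i in range(n_stripes):
        digest = hashlib.sha1(f"{path}:{i}".encode()).digest()
        node = nodes[int.from_bytes(digest[:8], 'big') % len(nodes)]
        placement.append((i, node))
    return placement

nodes = [f"node{i}" for i in range(8)]
print(stripe_placement('/workflow/intermediate.dat', 18 * 1024 * 1024, nodes))
# e.g. [(0, 'node3'), (1, 'node7'), (2, 'node0')] -- stripes spread across all nodes
```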

16.
To relieve the pressure of storing massive numbers of small files on a single storage device, a distributed storage technique for massive small-file data in a domestically produced (homegrown) computing environment is proposed. A clustering algorithm is used to merge the massive small files. With maximum balance as the objective and under multiple constraints, an artificial fish swarm algorithm is used to solve for the distributed storage plan. Following this plan, the small-file data are migrated to storage nodes and their storage devices, completing the distributed storage. The results show that memory usage across 14 storage nodes and 28 storage devices is fairly balanced, with high memory resource utilisation. While small-file samples are migrated and stored onto the nodes, the distributed-storage balance degree fluctuates but stays above the preset threshold of 1.0 throughout, indicating good balance and demonstrating the effectiveness of the proposed storage technique.
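The paper's exact balance-degree metric is not given in the abstract; the sketch below uses an assumed definition (the inverse of the coefficient of variation of per-node memory usage) purely to illustrate scoring how evenly a candidate placement loads the storage nodes. The sample usage figures are hypothetical.

```python
import statistics

def balance_degree(node_usage):
    """Score how evenly loaded the nodes are; higher means more balanced.

    Assumed metric for illustration: mean usage divided by its standard deviation
    (inverse coefficient of variation). The paper's own definition is not shown.
    """
    mean = statistics.mean(node_usage)
    stdev = statistics.pstdev(node_usage)
    return float('inf') if stdev == 0 else mean / stdev

# Hypothetical memory usage (GB) on a few storage nodes for two candidate placements.
print(balance_degree([10.1, 9.8, 10.0, 10.2]))  # well balanced -> large value
print(balance_degree([18.0, 2.0, 15.0, 5.0]))   # skewed -> small value
```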

17.
Debugger Design Based on the Process File System
Early UNIX debuggers were designed around ptrace. In 1984 Killian proposed tracing processes through the process file system; because of its many advantages, DEBUG and DBX on the latest UNIX SVR4.2, as well as GNU's GDB, all adopt this approach. This paper discusses how to design a debugger based on the process file system, covering program loading during process control, access to the process address space, and the breakpoint implementation mechanism.

18.
Dynamic analysis through execution traces is frequently used to analyze the runtime behavior of software systems. However, tracing long running executions generates voluminous data, which are complicated to analyze and manage. Extracting interesting performance or correctness characteristics out of large traces of data from several processes and threads is a challenging task. Trace abstraction and visualization are potential solutions to alleviate this challenge. Several efforts have been made over the years in many subfields of computer science for trace data collection, maintenance, analysis, and visualization. Many analyses start with an inspection of an overview of the trace, before digging deeper and studying more focused and detailed data. These techniques are common and well supported in geographical information systems, automatically adjusting the level of detail depending on the scale. However, most trace visualization tools operate at a single level of representation, which is not adequate to support multilevel analysis. Sophisticated techniques and heuristics are needed to address this problem. Multi-scale (multilevel) visualization with support for zoom and focus operations is an effective way to enable this kind of analysis. Considerable research and several surveys have been published in the field of trace visualization; however, multi-scale visualization has so far received little attention. In this paper, we provide a survey and methodological structure for categorizing tools and techniques aiming at multi-scale abstraction and visualization of execution trace data, and we discuss the requirements and challenges that must be met to satisfy evolving user demands.

19.
With the rapid development of storage systems and the increasingly demanding requirements placed on them in practice, a distributed I/O log collection system is designed and implemented in order to study the runtime behaviour of the storage I/O subsystem. Through a central console, the system controls multiple nodes of a distributed system simultaneously and collects their I/O logs in parallel, providing effective data for analysing and replaying the distributed system's I/O logs. The design and implementation of the system are described in detail.

20.
A Survey of Research on HDFS Storage and Optimisation Techniques
HDFS (Hadoop Distributed File System), an open-source distributed file system optimised for data append and read, offers portability, high fault tolerance, and massive horizontal scalability. After more than ten years of development, HDFS is widely used for big-data storage. As the underlying platform for storing massive data, HDFS holds huge volumes of structured and unstructured data and supports rich application scenarios such as complex query analysis, interactive analysis, detailed-record queries, key-value reads and writes, and iterative computation. Performance problems in HDFS affect all big-data systems and applications built on top of it, so optimising HDFS storage performance is critical. This paper introduces the principles and features of HDFS and summarises existing HDFS storage and optimisation techniques along three dimensions: logical file structure, hardware devices, and application workload. It also surveys recent research on HDFS storage and optimisation. In the future, as applications on top of HDFS become richer and the underlying hardware platforms evolve, data storage on heterogeneous platforms, workload-adaptive storage optimisation, and storage optimisation combined with machine learning will be the main research directions.
