Similar literature
 20 similar documents found
1.
This paper presents an effective method of metadata rebalancing in exascale distributed file systems. Exponential data growth has created the need for an adaptive and robust distributed file system, whose typical architecture is a large cluster of metadata servers and data servers. Although each metadata server may initially hold an equal share of the entire metadata set, the placement of metadata among metadata servers eventually becomes globally imbalanced, and this imbalance worsens over time. To ensure that disproportionate metadata placement does not degrade the intrinsic performance of a metadata server cluster, the cluster's balance must be restored periodically. However, this is not easy, because rebalancing seriously hampers the normal operation of a file system. The situation gets worse at exascale, with both an ever-present heavy workload on the file system and frequent failures of server components. One of the primary reasons for the degraded performance is that file system clients frequently fail to look up metadata from the metadata server cluster while metadata is being rebalanced; thus, metadata operations cannot proceed at their normal speed. We propose a metadata rebalance model that minimizes failures of metadata operations during the rebalance period and validate the proposed model through a cost analysis. The analysis results demonstrate that our model makes online metadata rebalancing feasible without obstructing normal operation and increases the chances of maintaining balance in a huge cluster of metadata servers.

2.
Data deduplication for file communication across a wide area network (WAN), in applications such as file synchronization and mirroring in cloud environments, usually achieves significant bandwidth savings at the cost of significant time overheads. These overheads include the time required for deduplication at two geographically distributed nodes (e.g., the disk access bottleneck) and the duplication query/answer operations between the sender and the receiver, since each query or answer introduces at least one round-trip time (RTT) of latency. In this paper, we present a data deduplication system across WAN with metadata feedback and metadata utilization (MFMU), in order to reduce the deduplication-related time overheads. In the proposed MFMU system, selective metadata feedback from the receiver to the sender is introduced to reduce the number of duplication query/answer operations. In addition, to limit the metadata-related disk I/O operations at the receiver, as well as the bandwidth overhead introduced by the metadata feedback, a metadata utilization component based on a hysteresis hash re-chunking mechanism is introduced. Our experimental results demonstrate that MFMU achieves an average deduplication acceleration of 20% to 40%, with the bandwidth saving ratio not reduced by the metadata feedback, compared with the "baseline" content-defined chunking (CDC) used in LBFS (Low-Bandwidth Network File System) and with existing state-of-the-art deduplication solutions based on Bimodal chunking algorithms.
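The content-defined chunking (CDC) baseline mentioned in this abstract can be illustrated with a minimal sketch. It is not the MFMU or LBFS implementation: the function name `cdc_chunks`, all parameter values, and the use of a BLAKE2 hash over a sliding window (instead of a true rolling fingerprint such as Rabin's) are assumptions chosen for brevity.

```python
import hashlib

def cdc_chunks(data, window=16, mask=0x3FF, min_size=256, max_size=8192):
    """Split `data` at content-defined boundaries (illustrative parameters).

    A boundary is declared where the fingerprint of the last `window` bytes
    has its low bits equal to zero, so boundaries depend on content rather
    than on absolute offsets. min_size must be >= window.
    """
    chunks = []
    start = 0
    for i in range(len(data)):
        size = i - start + 1
        if size < min_size:
            continue  # enforce a minimum chunk size
        # Assumption: hash the window from scratch; a real system would
        # update a rolling hash in O(1) per byte instead.
        digest = hashlib.blake2b(data[i - window + 1:i + 1], digest_size=4).digest()
        fingerprint = int.from_bytes(digest, "big")
        if (fingerprint & mask) == 0 or size >= max_size:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks
```

Each chunk would then be identified by a hash, and only chunks whose hashes the receiver does not already hold would be sent across the WAN.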

3.
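Metadata consistency strategy of DCFS2   Cited by: 6 (self-citations: 0, citations by others: 6)
As cluster applications demand ever more performance, capacity, and scale from cluster file systems, adopting multiple metadata servers is an inevitable trend in cluster file system development. Distributed metadata processing based on multiple metadata servers is an important problem in file system research. The cluster file system DCFS2 uses distributed logging and an improved two-phase commit protocol to solve the metadata consistency problem under distributed metadata processing. Performance tests show that the distributed-log-based metadata processing strategy adopted by DCFS2 delivers high I/O performance and ensures that the file system recovers quickly after a metadata server fails.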

4.
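Persistent memory (PMEM) combines the low-latency byte addressability of memory with the persistence of disk, and will bring revolutionary changes and a far-reaching impact to existing software architectures. Distributed storage is widely used in cloud computing and data centers; however, existing back-end storage engines, represented by Ceph BlueStore, are designed for traditional mechanical disks and solid-state drives (solid state di...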

5.
董豪宇  陈康 《计算机应用》2020,40(9):2577-2585
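To address the problem that software overhead limits the overall performance of conventional network file systems on high-speed storage hardware, a method of building a file system with the Storage Performance Development Kit (SPDK) is proposed, and on this basis a prototype network file system, RUFS, is implemented. The system uses a key-value store to emulate the file system's directory tree and to manage file system metadata, and stores file contents through SPDK. In addition, it serves the file system to clients via Remote Direct Memory Access (RDMA). Compared with NFS+ext4, RUFS improves read and write throughput by 202.2% and 738.9%, respectively, for 4 KB random access, and reduces average read and write latency by 74.4% and 97.2%, respectively; for 4 MB sequential access, read and write throughput improve by 153.1% and 44.0%, respectively. RUFS also has a significant advantage over NFS+ext4 on most metadata operations; in particular, directory creation throughput improves by about 5,693.8%. The system can fully exploit the performance of high-speed networks and high-speed storage devices, providing users with a file system service with lower latency and higher throughput.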

7.
蒋炎华 《计算机应用》2011,31(2):462-465
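A decentralized metadata management method for computing resource sharing platforms is proposed. It uses a peer-to-peer approach to spread data and metadata across other nodes in the network rather than keeping them on back-end servers. When the platform runs data-intensive applications, the method supports reads and writes by a large number of parallel workers, and offers random access, flexible access granularity, and support for heavily loaded parallel reads and writes. Using distributed hash table technology, large volumes of metadata are partitioned into a tree-structured segment tree. The processes of reading and writing data and metadata and of appending new data are described. Test results show that, for data-intensive applications such as 3D image rendering, the method achieves high aggregate bandwidth and high average read/write bandwidth when different workers access, read, and write in parallel.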

8.
We describe a data deduplication system for backup storage of PC disk images, named in-RAM metadata utilizing deduplication (IR-MUD). First, in-RAM hash granularity adaptation and miniLZO-based data compression are proposed to reduce the in-RAM metadata size and thereby the space overhead of the in-RAM metadata caches. Second, an in-RAM metadata write cache, as opposed to the traditional metadata read cache, is proposed to further reduce metadata-related disk I/O operations and improve deduplication throughput. During deduplication, the metadata write cache is managed following the LRU caching policy. For each manifest that is hit in the metadata write cache, an expensive manifest reloading operation from the disk is avoided. After deduplication, all the manifests in the metadata write cache are cleared and stored on the disk. Our experimental results using a 1.5 TB real-world disk image dataset show that 1) IR-MUD achieved about 95% size reduction for the deduplication metadata, with a small time overhead introduced; 2) when the metadata write cache was not utilized, with the same RAM space for the metadata read cache, IR-MUD achieved a 400% higher RAM hit ratio and a 50% higher deduplication throughput than the classic Sparse Indexing deduplication system, which uses no metadata utilization approaches; and 3) when the metadata write cache was utilized and enough RAM space was available, IR-MUD achieved a 500% higher RAM hit ratio than Sparse Indexing and a 70% higher deduplication throughput than IR-MUD with only a single metadata read cache. The in-RAM metadata harnessing and metadata write caching approaches of IR-MUD can be applied in most parallel deduplication systems to improve metadata caching efficiency.
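The LRU-managed metadata write cache described in this abstract can be sketched as follows. This is a rough illustration only: the class `ManifestWriteCache`, the dictionary standing in for the on-disk manifest store, and the write-back of evicted manifests are assumptions, not details taken from IR-MUD.

```python
from collections import OrderedDict

class ManifestWriteCache:
    """LRU cache of manifests (lists of chunk hashes) that absorbs updates in RAM."""

    def __init__(self, capacity, disk):
        self.capacity = capacity
        self.disk = disk            # assumption: a dict stands in for the on-disk store
        self.cache = OrderedDict()  # least recently used entry comes first
        self.disk_loads = 0         # counts manifest reloads from "disk"

    def get(self, manifest_id):
        if manifest_id in self.cache:
            self.cache.move_to_end(manifest_id)  # hit: no disk reload needed
            return self.cache[manifest_id]
        self.disk_loads += 1                     # miss: reload (or create) the manifest
        manifest = list(self.disk.get(manifest_id, []))
        self.cache[manifest_id] = manifest
        while len(self.cache) > self.capacity:
            old_id, old_manifest = self.cache.popitem(last=False)
            self.disk[old_id] = old_manifest     # assumption: write back on eviction
        return manifest

    def append(self, manifest_id, chunk_hash):
        self.get(manifest_id).append(chunk_hash)

    def flush(self):
        """After deduplication, store all cached manifests and clear the cache."""
        self.disk.update(self.cache)
        self.cache.clear()
```

Every call to `append` that finds its manifest in the cache avoids a disk reload, which is the saving the abstract attributes to the write cache; `flush` corresponds to the final step in which cached manifests are stored on disk.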

9.
操顺德  华宇  冯丹  孙园园  左鹏飞 《软件学报》2017,28(8):1999-2009
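Based on an analysis of the characteristics of video surveillance data and of traditional storage schemes, a high-performance distributed storage system is proposed. Unlike traditional file-based storage, a logical volume structure is designed: unstructured video stream data are organized in this structure and written directly to raw disk devices, which solves the storage performance degradation caused by random disk I/O and disk fragmentation in traditional schemes. Metadata are organized as a two-level index managed by the state manager and the storage servers, respectively, which greatly reduces the amount of metadata the state manager must handle, eliminates the performance bottleneck, and provides retrieval accurate to the second. In addition, a flexible storage server grouping strategy with mutual backup within each group gives the storage system fault tolerance and linear scalability. System tests show that, on low-cost PC servers, a single server can record 400 1080P video streams simultaneously, with a write speed 2.5 times that of the local file system.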

10.
An adaptive metadata load balancing mechanism for object storage systems   Cited by: 1 (self-citations: 0, citations by others: 1)
陈涛  肖侬  刘芳 《软件学报》2013,24(2):331-342
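Object-based storage systems are widely used in research, engineering, and services. In such systems, metadata load balancing plays an important role in improving overall I/O performance. Existing metadata load balancing strategies cannot dynamically balance metadata access load, and their adaptability and fault tolerance need improvement. An adaptive distributed metadata load balancing mechanism (adaptable distributed load balancing of metadata, ADMLB) is proposed, comprising a basic load balancing algorithm and a distributed incremental load balancing algorithm. The basic algorithm distributes load fairly according to server capability, and the distributed algorithm periodically adjusts the load distribution. ADMLB distributes load evenly among metadata servers in a distributed manner, adapts to load changes, has good fault tolerance, and lets users locate metadata servers efficiently.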

11.
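Web services such as social networking sites and e-commerce are growing rapidly, and they need to store huge numbers of small files such as images, music, and microblog posts. Traditional distributed storage systems such as HDFS (Hadoop distributed file system) are designed for large files; when storing small files they suffer from excessive metadata overhead and high access latency, and so do not suit workloads with massive numbers of small files. An analysis of the system architecture and read/write flow of TFS (Taobao file system) shows that TFS must establish at least three network connections for every read or write, which increases read/write latency. To address the challenges of massive small-file storage and the problems of TFS, a new low-latency, highly available distributed storage scheme for massive small files is proposed, and the distributed file system SFFS (small-file file system) is implemented. Performance tests show that, compared with TFS, SFFS reduces write latency by 76.6% and read latency by about 10%. An analysis of the system structure indicates that, compared with TFS, SFFS places a lighter load on the central node, recovers from failures faster, and has an advantage in availability.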

12.
High-performance Web sites rely on Web server "farms", hundreds of computers serving the same content, for scalability, reliability, and low-latency access to Internet content. Deploying these scalable farms typically requires the power of distributed or clustered file systems. Building Web server farms on file systems complements hierarchical proxy caching. Proxy caching replicates Web content throughout the Internet, thereby reducing latency from network delays and off-loading traffic from the primary servers. Web server farms scale resources at a single site, reducing latency from queuing delays. Both technologies are essential when building a high-performance infrastructure for content delivery. The authors present a cache consistency model and locking protocol customized for file systems that are used as scalable infrastructure for Web server farms. The protocol takes advantage of the Web's relaxed consistency semantics to reduce latencies and network overhead. Our hybrid approach preserves strong consistency for concurrent write sharing, with time-based consistency and push caching for readers (Web servers). Using simulation, we compare our approach to the Andrew file system and the sequential consistency file system protocols we propose to replace.

13.
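1 Introduction. With advances in storage subsystems, data compression, and network bandwidth, video on demand (VOD) has received a strong boost. The essence of VOD is that users actively select the information that interests them according to their own needs; this initiative and selectivity give VOD broad prospects in advertising, information retrieval, entertainment, education, and other fields. Well-known universities and companies in China and abroad have conducted in-depth research in this area.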

14.
In this work, we propose a novel hard disk technique, "AV Disk", for modern multimedia applications. Modern hard disk drives adopt complex sector layout mechanisms to reduce track and head switch overhead. While these complex sector layout mechanisms can reduce the average overhead of track and head switches, they bring larger variability in that overhead. From a multimedia application's point of view, it is more important to minimize the worst-case I/O latency than to improve the average I/O latency. We focus on minimizing track switch overhead, as well as its variability, in disk I/O. We propose that the tracks of the hard disk drive be aligned with a certain I/O size. We develop an elaborate performance model with which we can compute the optimal I/O unit size for multimedia applications. We propose that the hard disk controller be responsible for positioning data blocks on the platter in such a manner that I/O units, each 32–128 KByte in size, are not placed across track boundaries. The optimal I/O unit size is used in aligning the tracks. We develop a Skewed Sector Sparing technique for aligning a track with a given I/O size. However, when the I/O unit for alignment is increased to 128 KByte, 17% of the disk space becomes unusable. Despite the decreased storage area, the track aligning technique increases the overall performance of the hard disk. According to our simulation-based experiment, overall disk performance increases by about 5–25%. Given that capacity of hard disk increases 100% every year, we cautiously regard it as a reasonable tradeoff to increase the I/O latency of the disk.

15.
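Distributed key-value stores replicate data to the local engines of multiple storage servers and use a consistency protocol to keep the replicas consistent. Implementations built around the log-structured merge tree as the core data structure are the most common. However, the log-structured merge tree, designed for general-purpose workloads, does not suit the special workload of the consistency logic: it degrades insert, delete, and update performance and causes space amplification during full repair. To address these problems, this paper proposes a new local engine, PheonixLSM. By adding constraints on insert/delete/update operations and flush operations, it eliminates the double-write problem in the insert/delete/update path of the distributed key-value store and improves engine performance. By restructuring the SST file layout underlying the log-structured merge tree, it supports real-time space reclamation on deletion and eliminates the extra space amplification during full repair. Experiments show that, compared with the native local engine, a distributed key-value store using PheonixLSM improves insert/delete/update performance by 90.7%, reduces space amplification during full repair from 65.6% to 6.4%, and cuts repair time by 72.3%.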

16.
The flash-based SSD is used as a tiered cache between RAM and HDD. Conventional schemes do not utilize the nonvolatile nature of the SSD and cannot cache write requests, even though writes are a significant, and often dominant, fraction of storage workloads. To cache write requests, the SSD cache must manage its data and metadata persistently and consistently, and guarantee no data loss even after a crash. Persistent cache management may require frequent metadata changes and cause high overhead. Some researchers insist that a nonvolatile persistent cache requires additional primitives that are not supported by general SSDs on the market. We propose a fully persistent read/write cache that improves both read and write performance, does not require any special primitive, has low overhead, guarantees the integrity of the cache metadata and the consistency of the cached data even during a crash or power failure, and is able to recover the flash cache quickly without any data loss. We implemented the persistent read/write cache as a block device driver in Linux. Our scheme targets virtual desktop infrastructure servers, so the evaluation was performed with massive, real desktop traces of five users over ten days. The evaluation shows that our scheme outperforms an LRU version of the SSD cache by 50% and the read-only version of our scheme by 37%, on average, across all experiments. This paper describes most parts of our scheme in detail. Detailed pseudo-code is included in the Appendix.

17.
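Traditional network file systems struggle to meet the I/O demands of high-performance computing systems. The parallel network file system PNFS can effectively address the scalability, availability, and performance problems of traditional network file systems. First, the PNFS architecture is designed so that the metadata server is separated from the storage servers, eliminating the I/O bottleneck caused by a centralized server structure. Then the PNFS prototype is benchmarked, and the results are compared with those of NFS in the same environment. The results show that PNFS gives clients parallel access to file data, with high read/write I/O bandwidth and low access latency, and that client I/O bandwidth scales linearly with the number of storage servers, so PNFS can meet the I/O demands of high-performance computing well.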

18.
Due to the rapid development of flash memory technology, NAND flash has been widely used as a storage device in portable embedded systems, personal computers, and enterprise systems. However, flash memory is prone to performance degradation due to the long latency of flash program and erase operations. One common technique for hiding long program latency is to use a temporary buffer to hold write data. Although DRAM is often used to implement the buffer because of its high performance and low bit cost, it is volatile; thus, data may be lost on a power failure in the storage system. As a solution to this issue, recent operating systems frequently issue flush commands to force storage devices to permanently move data from the buffer into the non-volatile area. However, excessive use of flush commands may worsen the write performance of the storage system. In this paper, we propose two data loss recovery techniques that require fewer write operations to flash memory. These techniques remove unnecessary flash writes by storing storage metadata together with user data, utilizing the spare area associated with each data page.

19.
In this paper, a closed queuing network model with both single and multiple servers is proposed to model dataflow in a multi-threaded architecture. Multi-threading is useful in reducing latency by switching among a set of threads in order to improve processor utilization. There are two sets of processors: synchronization processors and execution processors. Synchronization processors handle load/store operations, and execution processors handle arithmetic/logic and control operations. A closed queuing network model is suitable for a large number of job arrivals. The normalization constant for the given model is derived using a recursive algorithm. State diagrams are drawn from the hybrid closed queuing network model, and the steady-state balance equations are derived from them. Performance measures such as average response time and average system throughput are derived and plotted against the total number of processors in the closed queuing network model. Other important performance measures, such as processor utilizations, average queue lengths, average waiting times, and relative utilizations, are also derived.
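The abstract does not say which recursive algorithm it uses for the normalization constant. A standard choice for closed product-form networks is Buzen's convolution algorithm, sketched below under the simplifying assumption that every station has a single, load-independent server (the paper's model also includes multi-server stations, which this sketch does not cover). The function names and the example demands are illustrative.

```python
def normalization_constants(demands, n_jobs):
    """Return [G(0), ..., G(n_jobs)] via Buzen's convolution algorithm.

    demands[m] is the service demand of station m (visit ratio times mean
    service time); all stations are single-server and load-independent.
    """
    g = [1.0] + [0.0] * n_jobs
    for d in demands:                    # fold in one station at a time
        for n in range(1, n_jobs + 1):
            g[n] += d * g[n - 1]         # G_m(n) = G_{m-1}(n) + d * G_m(n-1)
    return g

def throughput(demands, n_jobs):
    """System throughput X(N) = G(N-1) / G(N)."""
    g = normalization_constants(demands, n_jobs)
    return g[n_jobs - 1] / g[n_jobs]

def utilizations(demands, n_jobs):
    """Utilization of each station: U_m = demand_m * X(N)."""
    x = throughput(demands, n_jobs)
    return [d * x for d in demands]
```

For two stations with unit demand and two circulating jobs, `normalization_constants([1.0, 1.0], 2)` gives G = [1, 2, 3], so the throughput is 2/3 and each station is busy two thirds of the time.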

20.
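Research on an update mechanism for distributed databases based on cooperative caching   Cited by: 1 (self-citations: 0, citations by others: 1)
Reducing the cost of write operations in database transactions is critical to the performance of distributed database systems. This paper proposes an update mechanism for distributed databases based on cooperative caching: a global cooperative buffer pool is built on top of the physical memory of the distributed database server nodes and used to cache written records, which reduces the disk access overhead of database transactions. The paper studies this update mechanism and the improvement in transaction performance it brings.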
