Similar Documents
1.
Data Grid integrates geographically distributed resources for solving data-intensive scientific applications. Effective scheduling in the Grid can reduce the amount of data transferred among nodes by submitting a job to a node where most of the requested data files are available. Scheduling is a traditional problem in parallel and distributed systems. However, due to the special issues and goals of the Grid, traditional approaches are no longer effective in this environment, and it is necessary to propose methods specialized for this kind of parallel and distributed system. Another solution is to use a data replication strategy that creates multiple copies of files and stores them in convenient locations to shorten file access times. Combining these two concepts, in this paper we develop a job scheduling policy, called hierarchical job scheduling strategy (HJSS), and a dynamic data replication strategy, called advanced dynamic hierarchical replication strategy (ADHRS), to improve data access efficiency in a hierarchical Data Grid. HJSS uses hierarchical scheduling to reduce the search time for an appropriate computing node. It considers network characteristics, the number of jobs waiting in queue, file locations, and the disk read speed of the storage drive at data sources. Moreover, due to limited storage capacity, a good replica replacement algorithm is needed. We present a novel replacement strategy which deletes files in two steps when free space is not enough for a new replica: first, it deletes the files with the minimum transfer time; second, if space is still insufficient, it considers the last time the replica was requested, the number of accesses, the size of the replica, and the file transfer time. Simulation results show that the proposed algorithm outperforms other algorithms in terms of job execution time, number of intercommunications, number of replications, hit ratio, computing resource usage and storage usage.
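The two-step replacement policy described in this abstract can be sketched as follows. The field names, the cheap-transfer cutoff, and the composite score's weighting are hypothetical illustrations, not taken from the paper:

```python
def select_victims(replicas, needed_space, cheap_cutoff, now):
    """Pick replicas to delete until needed_space bytes are freed.

    Step 1: evict replicas that are cheap to re-transfer.
    Step 2: if still short, rank the rest by a composite score of
    recency, access count, size and transfer time (weights hypothetical).
    """
    victims, freed = [], 0
    # Step 1: consider files in order of minimum transfer time
    for r in sorted(replicas, key=lambda r: r["transfer_time"]):
        if freed >= needed_space:
            return victims
        if r["transfer_time"] <= cheap_cutoff:
            victims.append(r)
            freed += r["size"]
    # Step 2: composite score -- old, rarely used, large, cheap-to-refetch
    # replicas score highest and are evicted first
    rest = [r for r in replicas if r not in victims]
    def score(r):
        age = now - r["last_request"]
        return age * r["size"] / ((1 + r["access_count"]) * r["transfer_time"])
    for r in sorted(rest, key=score, reverse=True):
        if freed >= needed_space:
            break
        victims.append(r)
        freed += r["size"]
    return victims
```

The point of the two phases is that a replica with a small transfer time is cheap to re-create later, so it is the safest thing to evict first.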

2.
PMC: Select Materialized Cells in Data Cubes
QC-Tree is one of the most storage-efficient structures for data cubes in an MOLAP system. Although QC-Tree can achieve a high compression ratio, it is still a fully materialized data cube. In this paper, an improved structure, PMC, is presented that allows materializing only a part of the cells in a QC-Tree to save more storage space. There is a notable difference between our partial materialization algorithm and traditional materialized-view selection algorithms. In a traditional algorithm, when a view is selected, all the cells in that view are materialized; if a view is not selected, none of its cells are. This strategy results in unstable query performance. The presented algorithm, in contrast, selects and materializes data at the cell level and, along with further reduced space and update cost, can ensure stable query performance. A series of experiments are conducted on both synthetic and real data sets. The results show that PMC can further reduce the storage space occupied by the data cube and can shorten the time to update the cube.

3.
To accommodate the explosively increasing amount of data in many areas such as scientific computing and e-Business, physical storage devices and control components have been separated from traditional computing systems to become a scalable, intelligent storage subsystem that, when appropriately designed, should provide a transparent storage interface, effective data allocation, flexible and efficient storage management, and other impressive features. The design goals and desirable features of such a storage subsystem include high performance, high scalability, high availability, high reliability and high security. Extensive research has been conducted in this field by researchers all over the world, yet many issues still remain open and challenging. This paper studies five different online massive storage systems and one offline storage system that we have developed with research grant support from China. The storage pool with multiple network-attached RAIDs avoids expensive store-and-forward data copying between the server and storage system, improving the data transfer rate by a factor of 2-3 over a traditional disk array. Two types of high-performance distributed storage systems for local-area network storage are introduced in the paper. One of them is the Virtual Interface Storage Architecture (VISA), where VI as a communication protocol replaces the TCP/IP protocol in the system. VISA is shown to outperform IP SAN by designing and implementing the vSCSI (VI-attached SCSI) protocol to support SCSI commands in the VI network. The other is a fault-tolerant parallel virtual file system that is designed and implemented to provide high I/O performance and high reliability. A global distributed storage system for wide-area network storage is discussed in detail in the paper, where a Storage Service Provider is added to provide storage service and plays the role of user agent for the storage system.
Object-based storage systems not only store data but also adopt the attributes and methods of objects that encapsulate the data. The adaptive policy triggering mechanism (APTM), which borrows proven machine learning techniques to improve the scalability of object storage systems, embodies the idea of smart storage devices and facilitates the self-management of massive storage systems. A typical offline massive storage system is used to back up data or store documents, for which tape virtualization technology is discussed. Finally, a domain-based storage management framework for different types of storage systems is presented.

4.
Many scientific fields increasingly use high-performance computing (HPC) to process and analyze massive amounts of experimental data, while storage systems in today's HPC environments have to cope with new access patterns. These patterns include many metadata operations, small I/O requests, or randomized file I/O, while general-purpose parallel file systems have been optimized for sequential shared access to large files. Burst buffer file systems create a separate file system that applications can use to store temporary data. They aggregate node-local storage available within the compute nodes or use dedicated SSD clusters, and offer a peak bandwidth higher than that of the backend parallel file system without interfering with it. However, burst buffer file systems typically offer many features that a scientific application, running in isolation for a limited amount of time, does not require. We present GekkoFS, a temporary, highly scalable file system which has been specifically optimized for the aforementioned use cases. GekkoFS provides relaxed POSIX semantics, offering only the features actually required by most (not all) applications. GekkoFS is therefore able to provide scalable I/O performance and reaches millions of metadata operations even for a small number of nodes, significantly outperforming the capabilities of common parallel file systems.

5.
A cost-effective fault-tolerant scheme for RAIDs
The rapid progress in mass storage technology has made it possible for designers to implement large data storage systems for a variety of applications. One of the efficient ways to build large storage systems is to use RAIDs, but conventional RAIDs can recover data only when a single error occurs. In large RAID systems, the fault probability increases with the number of disks, and the use of disks with large storage capacity prolongs the recovery time, which in turn increases the probability of a second disk failing. Therefore, it is necessary to develop methods to recover data when two or more errors have occurred. In this paper, a fault-tolerant scheme based on an extended Reed-Solomon code is proposed, and a recovery procedure that corrects up to two errors is designed and implemented jointly in software and hardware; the scheme is verified by computer simulation. In this scheme, only two redundant disks are used to recover from up to two disk faults. The encoding and decoding methods and their software/hardware implementation are described, as is the application of the scheme in software RAIDs built into cluster computers. Compared with existing methods such as EVENODD and DH, the proposed scheme offers distinct improvements in implementation and redundancy.
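As a rough illustration of why two redundant disks suffice to survive two failures, here is a minimal P/Q parity sketch over GF(2^8), the standard Reed-Solomon-style construction also used in RAID-6. This is not the paper's extended code, just the textbook version:

```python
def gf_mul(a, b):
    """Multiply in GF(2^8) modulo x^8 + x^4 + x^3 + x^2 + 1 (0x11d)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
    return r

def gf_pow(a, n):
    r = 1
    for _ in range(n):
        r = gf_mul(r, a)
    return r

def gf_inv(a):
    return gf_pow(a, 254)  # a^254 = a^-1 in GF(2^8)

def pq_parity(data):
    """P is a plain XOR; Q weights byte i by the generator power 2^i."""
    p = q = 0
    for i, d in enumerate(data):
        p ^= d
        q ^= gf_mul(gf_pow(2, i), d)
    return p, q

def recover_two(data, x, y, p, q):
    """Recover erased bytes at positions x and y (x != y) from P and Q."""
    pp, qp = p, q  # P', Q': parity contributions of the two missing bytes
    for i, d in enumerate(data):
        if i not in (x, y):
            pp ^= d
            qp ^= gf_mul(gf_pow(2, i), d)
    # Solve  d_x ^ d_y = P'  and  g^x*d_x ^ g^y*d_y = Q'  in GF(2^8)
    gx, gy = gf_pow(2, x), gf_pow(2, y)
    dx = gf_mul(qp ^ gf_mul(gy, pp), gf_inv(gx ^ gy))
    dy = pp ^ dx
    return dx, dy
```

In a real array each "byte" is a whole strip and the loop runs byte-wise over the strips; hardware assist replaces the software field arithmetic.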

6.
Fault tolerance is very important in cluster computing and has been implemented in many famous cluster-computing systems using checkpoint/restart mechanisms. But existing checkpointing algorithms cannot restore the states of a file system when rolling back the execution of a program, so there are many restrictions on file accesses in existing fault-tolerance systems. This paper presents the SCR algorithm, an algorithm based on atomic operations and consistent scheduling, which can restore the states of file systems. File operations are classified into idempotent operations and non-idempotent operations: a non-idempotent operation modifies a file system's state, while an idempotent operation does not. The SCR algorithm tracks changes of the file system state. It logs each non-idempotent operation used by user programs, together with the information needed to undo that operation, on disk. When rolling the program back to a checkpoint, the SCR algorithm reverts the file system state to the last checkpoint time. With the SCR algorithm, users are allowed to use any file operation in their programs.

7.
Traditionally, when the execution of a workflow is interrupted, compensation work tasks are used for workflow recovery in order to keep the database consistent. In this paper, we open a new, feasible way to a system-supported recovery mechanism via the database log file and the workflow log file.

8.
The access frequency of different files in a file system is dissimilar. If the file system can optimize the block layout of the hot files that are frequently accessed, its performance will be improved. This paper presents a high-performance block layout mechanism, Active Block Layout (ABL). ABL records the access frequency of every file in the file system and actively optimizes the block layout of hot files by block duplication. The duplicated blocks can be placed in a special zone of the track, called the "Cooling Zone". ABL automatically determines the placement position and the copy count of the blocks that need to be duplicated. To reduce the overhead of block duplication, this paper also presents a mechanism that uses the potential disk bandwidth to perform the duplication without obviously degrading file system performance.
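The bookkeeping side of such a mechanism is straightforward; a minimal sketch is below. The threshold, the copy-count heuristic, and all names are hypothetical, and ABL's actual contribution (placing the copies in the on-track Cooling Zone) happens in the disk layout, which is not modeled here:

```python
from collections import Counter

class HotFileTracker:
    """Count per-file accesses and report files hot enough to duplicate."""

    def __init__(self, hot_threshold):
        self.hot_threshold = hot_threshold
        self.accesses = Counter()

    def record_access(self, path):
        self.accesses[path] += 1

    def hot_files(self):
        # Files past the threshold are candidates for block duplication
        # into the track's "Cooling Zone".
        return [f for f, n in self.accesses.items() if n >= self.hot_threshold]

    def copy_count(self, path, max_copies=4):
        # Heuristic: more accesses -> more duplicated copies, capped.
        return min(max_copies, self.accesses[path] // self.hot_threshold)
```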

9.
Currently, cloud computing systems use simple key-value data processing, which cannot support similarity search effectively due to the lack of efficient index structures, and with increasing dimensionality the existing tree-like index structures suffer from the "curse of dimensionality". In this paper, a novel VF-CAN indexing scheme is proposed. VF-CAN integrates content addressable network (CAN) based routing with an improved vector approximation file (VA-file) index. There are two index levels in this scheme: a global index and a local index. The local index, VAK-file, is built for the data in each storage node. The VAK-file is the k-means clustering result of the VA-file approximation vectors according to their degree of proximity. Each cluster forms a separate local index file, and each file stores the approximation vectors contained in the cluster. The vector of each cluster center is stored in the cluster-center information file of the corresponding storage node. In the global index, storage nodes are organized into a CAN overlay network, and in order to reduce the computation cost, only the clustering information of the local index is published to the overlay network through the CAN interface. Experimental results show that VF-CAN reduces the index storage space and improves query performance effectively.

10.
The ever-growing demand for high-performance computation calls for progressively larger parallel distributed file systems to match its requirements. These file systems can achieve high performance for large I/O operations by distributing load across numerous data servers. However, they fail to provide quality service for applications pertaining to small files. In this paper, we propose a delegable metadata service (DMS) for hiding the latency of metadata accesses and optimizing small-file performance. In addition, four techniques have been designed to maintain consistency and efficiency in DMS: pre-allocated serial metahandles, directory-based metadata replacement, packing transaction operations, and fine-grained lock revocation. These schemes have been employed in the Cappella parallel distributed file system, and various experiments complying with industrial standards have been conducted to evaluate its efficiency. The results show that our design achieves significant improvement in the performance of both metadata operations and small-file access. Moreover, this scheme is widely applicable for integration within many other distributed file systems.

11.
To improve the reliability of RAID systems, the RAID3/5/6 algorithms generate parity information via XOR operations as data is written. When a RAID failure occurs, data reconstruction is likewise completed through XOR operations. XOR is therefore one of the most frequent and important operations in a running RAID system. This paper describes in detail the working principle and software module design of XOR computation using the application acceleration unit of the Intel IOP321 processor. Experimental tests show that the dedicated hardware XOR unit is more than 7 times faster than software XOR on the embedded processor, effectively resolving the XOR performance bottleneck in embedded environments.
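The XOR parity and reconstruction the abstract refers to reduce to a few lines in software; a hardware unit such as the IOP321's accelerator performs the same byte-wise computation, only faster:

```python
from functools import reduce

def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks (RAID parity computation)."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

def reconstruct(surviving_blocks, parity):
    """Rebuild the single missing block from the survivors plus the parity.

    Works because XOR-ing the parity with all surviving data blocks
    cancels every term except the missing one.
    """
    return xor_blocks(surviving_blocks + [parity])
```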

12.
RAID-VCR: A RAID Architecture that Can Tolerate Three Disk Failures
A new RAID architecture, RAID-VCR, is proposed. It needs only three additional disks to store parity information, yet can tolerate the failure of any three member disks. Compared with other existing RAID architectures, RAID-VCR greatly improves disaster tolerance while having very little impact on disk space utilization and system throughput. RAID-VCR's encoding and decoding procedures are based on simple XOR operations, and user data is stored in plaintext, so read operations can be performed efficiently. Simulation results show that RAID-VCR achieves good encoding and decoding performance and has promising application prospects.

13.
This paper designs and implements BW-VSDS, a large-capacity, scalable, high-performance and highly reliable network virtual storage system. Compared with other network storage systems, it has the following features: 1) it adopts a virtual storage management architecture with in-band metadata management and out-of-band data access, making storage management more flexible and the system more scalable; 2) within a single node, it uses storage virtualization over multiple virtual volumes, virtual pools, and network storage devices to build network virtual storage devices for various storage applications, realizing a three-layer hierarchical storage virtualization model that shares the capacity and bandwidth of storage devices internally and provides virtual disks with different attributes externally; 3) it adopts an allocate-on-write policy to improve storage space utilization and uses data block reorganization to improve read/write I/O performance; 4) it implements cascaded virtual snapshots using device lists and bitmaps, supporting incremental snapshots, copy-on-write and redirect-on-write, and enabling data sharing between source volumes and snapshot volumes; 5) it proposes a back-end centralized out-of-band redundancy management structure combined with out-of-band storage virtualization management: reads and writes access the storage nodes directly, while the redundancy management node caches data mirrored from the storage nodes in an on-disk log and then performs RAID5 redundancy computation in the background, improving the reliability of active data and reducing the impact of redundancy computation on write performance.

14.
RAID has long been established as an effective way to provide highly reliable as well as high-performance disk subsystems. However, reliability in RAID systems comes at the cost of extra disks. In this paper, we describe a mechanism, which we have termed RAID0.5, that enables striped disks with very high data reliability but low disk cost. We take advantage of the fact that most disk systems use offline backup systems for disaster recovery. With these offline backup systems in place, the disk system needs to replicate only the data written since the last backup, drastically reducing the storage space requirement. Though RAID0.5 has the same data loss characteristics as traditional mirroring, the lower storage space comes at the cost of lower availability. Thus, RAID0.5 trades lower disk cost against lower availability while still preserving very high data reliability. We present analytical reliability models and experimental results that demonstrate the enhanced reliability and performance of the proposed RAID0.5 system.

15.
For cloud storage platforms, this paper proposes SFMAF, an adaptive fault-tolerance mechanism based on access statistics. The mechanism maintains a file access frequency table using an approximate least-recently-used algorithm and adaptively adjusts the fault-tolerance scheme accordingly: SFMAF uses replica redundancy for frequently read files and Reed-Solomon (RS) erasure coding for infrequently read files. Experimental results show that, at an acceptable increase in CPU and memory usage, SFMAF reduces internal data transfer traffic relative to pure replica redundancy, i.e., it reduces the system's storage space consumption.
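The adaptive choice SFMAF makes can be sketched as follows. The bounded table with LRU eviction stands in for the paper's approximate-LRU frequency table; the capacity, threshold, and names are placeholders, not the paper's actual parameters:

```python
from collections import OrderedDict

class AdaptiveFaultTolerance:
    """Keep a bounded access-frequency table (LRU eviction) and pick
    replication for hot files, erasure coding for cold ones."""

    def __init__(self, capacity, hot_threshold):
        self.capacity = capacity
        self.hot_threshold = hot_threshold
        self.freq = OrderedDict()  # path -> read count, in LRU order

    def record_read(self, path):
        # Re-inserting moves the entry to the most-recently-used end.
        self.freq[path] = self.freq.pop(path, 0) + 1
        if len(self.freq) > self.capacity:
            self.freq.popitem(last=False)  # evict least recently used

    def scheme(self, path):
        if self.freq.get(path, 0) >= self.hot_threshold:
            return "replication"   # fast reads for hot files
        return "reed-solomon"      # space-efficient for cold files
```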

16.
The Sunway (神威) 3000A mass storage system adopts a network data redundancy method based on file striping, supports network RAID1 and RAID5 redundancy modes, and can tolerate storage server and disk array failures online; it is a highly available distributed storage system.

17.
毛友发  杨明福 《计算机工程》2003,29(21):23-24,40
The storage area network (SAN) is the most promising network storage technology for the future. RAID is widely applied in high-performance storage systems, and under SAN conditions the RAIDs are heterogeneous. This paper proposes a novel file RAID (fileRAID) mechanism that integrates RAID into the file system, overcoming the shortcomings of standalone hardware and software RAID. It is adaptive and mixes multiple RAID levels, and under SAN it offers good availability, scalability, reliability, dynamic adaptability and high performance. An adaptive fileRAID algorithm is also proposed.

18.
Although redundant disk arrays introduce fault tolerance and thus greatly improve data reliability, they also cause performance degradation. Moreover, as the number of disks increases, the probability of disk failure grows markedly. When a single disk fails, the array's data is not yet lost and it can still serve system requests, but the array is then working "impaired", in a degraded mode. This paper builds a queueing model of the redundant disk array RAID5 and performs simulation calculations, proposes the concept of performance loss rate, and uses it as a metric for evaluating the performance loss of disk arrays. Analysis of the calculation results shows that RAID5…

19.
To reduce the storage overhead of fault-tolerance mechanisms in distributed systems, an erasure-code-based fault-tolerance mechanism is applied in a distributed file system. This paper summarizes the theoretical foundations of implementing erasure-code fault tolerance and analyzes its system reliability. After laying out the concrete implementation steps, it explains several key algorithm modules, and finally tests the mechanism in a distributed system environment. Experimental results show that the mechanism can effectively recover damaged data. With reasonable choices of cache block size and number of file stripes, its encoding and decoding rates match the network transfer rate of a local area network well, and it saves storage space.
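The storage saving that motivates erasure coding over replication is simple arithmetic; the sketch below uses example parameters (3-way replication versus a hypothetical k=4, m=2 MDS code), not figures from the paper:

```python
def replication_overhead(copies):
    """Total stored bytes per user byte under n-way replication."""
    return float(copies)

def erasure_overhead(k, m):
    """Total stored bytes per user byte for a (k data + m parity) code."""
    return (k + m) / k

# Both configurations tolerate the loss of any two nodes holding an object:
# 3-way replication loses data only when all 3 copies fail, storing 3x;
# a (4, 2) MDS code loses data only when more than 2 of its 6 fragments
# fail, storing just 1.5x -- half the space for the same loss tolerance.
```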

20.
A Scalable Distributed RAID Storage Cluster System
A scalable distributed RAID storage cluster system with a low per-unit storage cost is proposed to replace hardware RAID disk arrays. The system has three features: 1) storage resources are shared through a networked shared-storage model, and every data node is a fully equal peer, an easily extensible realization of a P2P architecture; 2) RAID metadata is synchronized among nodes in user space, giving the system the Single I/O Space (SIOS) property and supporting various RAID techniques; 3) a kernel-space virtual block device access interface is provided, transparent to file systems. Test results show that the system reaches a peak sequential read bandwidth of 190 MBps; under a RAID6 redundancy configuration, the failure of one or two nodes does not interrupt data access service, but one node failure reduces sequential read bandwidth by 15% and two node failures reduce it by 18%.
