Similar Documents
20 similar documents found (search time: 46 ms)
1.
Research on Key Technologies for Data Management in the High Energy Physics Grid   (total citations: 1; self: 0; others: 1)
This paper first outlines the requirements and development of the high energy physics grid, then analyzes and discusses its key data management technologies in depth, including name services, data replication management, data transfer, mass storage systems, and user access interfaces. Finally, it presents the design of a prototype file system for high energy physics grid data management.

2.
Research on a Campus Grid File System   (total citations: 1; self: 0; others: 1)
With the development of information technology, scientific computing and parallel processing must handle ever larger volumes of data, and existing distributed file systems find it increasingly difficult to support massive data storage and broad, geographically distributed coordination and sharing of resources. A campus grid file system is one of the key technologies for building a campus grid system: it unifies the resources of the campus grid, and its adoption will further advance grid research.

3.
马常霞 (Ma Changxia). 《微机发展》 (Microcomputer Development), 2006, 16(4): 153–154
With the development of information technology, scientific computing and parallel processing must handle ever larger volumes of data, and existing distributed file systems find it increasingly difficult to support massive data storage and broad, geographically distributed coordination and sharing of resources. A campus grid file system is one of the key technologies for building a campus grid system: it unifies the resources of the campus grid, and its adoption will further advance grid research.

4.
Current high energy physics grid computing platforms rely mainly on a single grid middleware and operating system, yet in practice more and more heterogeneous resources need to be integrated. How to share high energy physics grid data across multiple platforms, spanning different grid middleware and operating systems, is therefore a fundamental research problem. Without modifying the existing platforms, this paper proposes and implements a grid data sharing system with which users can transparently manage and share massive data across multiple platforms. The paper describes the system's architecture, implementation, performance optimizations, and usage scenarios.

5.
This paper proposes a client-side, dynamically adaptive file replica selection algorithm for high energy physics grid environments. The algorithm predicts the best replica from historical transfer information and can be configured to match actual operating conditions. Theoretical analysis and results from a production environment show that the algorithm meets the needs of the high energy physics grid well.
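To make the idea concrete, here is a minimal, hypothetical sketch of replica selection driven by historical transfer statistics; it is not the paper's algorithm, and the smoothing factor, site names, and measurements are all assumptions.

```python
# Illustrative sketch only (NOT the paper's algorithm): pick a replica site based on
# an exponentially smoothed estimate of past transfer throughput. Names are hypothetical.
from collections import defaultdict

class ReplicaSelector:
    def __init__(self, alpha=0.3, default_mbps=10.0):
        self.alpha = alpha                              # weight given to the newest observation
        self.est = defaultdict(lambda: default_mbps)    # smoothed throughput estimate per site

    def record(self, site, observed_mbps):
        """Update the history with a finished transfer (exponential smoothing)."""
        self.est[site] = self.alpha * observed_mbps + (1 - self.alpha) * self.est[site]

    def choose(self, candidate_sites):
        """Pick the candidate replica site with the best estimated throughput."""
        return max(candidate_sites, key=lambda s: self.est[s])

selector = ReplicaSelector()
selector.record("se01.ihep.ac.cn", 85.0)   # hypothetical sites and measurements
selector.record("se02.cern.ch", 40.0)
best = selector.choose(["se01.ihep.ac.cn", "se02.cern.ch"])
```

Exponential smoothing keeps recent transfers influential while damping one-off fluctuations, which is one simple way to let the choice adapt over time.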

6.
Data storage allows users working in different storage environments to access data efficiently. This paper surveys the main storage technologies in grid environments, discusses the issues a storage method must address when data are kept locally and remotely, and describes the basic transfer and storage units and the storage modes used for data storage.

7.
Implementation of a Journaling File System on Embedded Storage Devices   (total citations: 9; self: 1; others: 8)
In embedded systems, an unexpected power-off often corrupts the file system and causes data loss, so a special kind of file system is needed to prevent such corruption. As a journaling file system designed specifically for embedded memory devices, JFFS is exactly the file system required. To enable wider use of JFFS, Redflag Software Co., Ltd. has successfully solved the problem of implementing JFFS on DiskOnChip, a special kind of embedded memory device. This paper mainly discusses the design of JFFS and its implementation on DiskOnChip.

8.
The GOS file system for grid environments differs greatly from an ordinary computer file system. The paper presents the architecture of a grid file system, introduces mechanisms for path resolution, mounting, and file access control, and studies the design of the grid file system.
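As a rough illustration of what a path resolution and mount mechanism can look like, the sketch below maps virtual grid paths to physical storage endpoints by longest-prefix match; the mount table and URLs are invented for the example and do not reflect the actual GOS design.

```python
# Hypothetical mount table: virtual grid path prefix -> physical storage endpoint.
MOUNTS = {
    "/grid/hep": "gsiftp://se.ihep.ac.cn/data",
    "/grid/bio": "http://storage.example.org/bio",
}

def resolve(virtual_path: str) -> str:
    """Return the physical URL for a virtual path via longest-prefix match."""
    for prefix in sorted(MOUNTS, key=len, reverse=True):   # try the longest mount point first
        if virtual_path.startswith(prefix + "/"):
            return MOUNTS[prefix] + virtual_path[len(prefix):]
    raise FileNotFoundError(f"no mount point covers {virtual_path}")

print(resolve("/grid/hep/run2024/file001.root"))
```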

9.
High energy physics research mostly depends on large scientific facilities at fixed sites and produces massive experimental data, so computations often have to run against massive data held at remote sites. For such distributed data, the traditional high energy physics computing model uses grid technology for cross-domain data sharing, but low resource utilization, long response times, and difficult deployment and maintenance limit the use of grid technology for data sharing among small and medium-sized sites. To address data sharing among small and medium-sized sites in the high energy physics computing environment, this work designs a remote file system built around the idea of streaming and caching: it localizes remote data access, provides a low-latency data access mode, implements on-demand data transfer and management over HTTP, and offers hashed block storage with a unified file view. Compared with EOS, Lustre, and GlusterFS, the distributed file systems commonly used in high energy physics computing, it offers wide-area usability, insensitivity to network latency, and a high-performance data access mode.
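The sketch below illustrates the general on-demand, block-oriented access pattern over HTTP with a local cache, in the spirit of the streaming-and-cache idea; the block size, cache layout, and hashing scheme are assumptions, not the system's actual design.

```python
# Hedged sketch: fetch fixed-size blocks of a remote file via HTTP Range requests,
# caching each block locally under a hash of (url, block index). All parameters are assumed.
import hashlib, os, urllib.request

BLOCK = 4 * 1024 * 1024            # assumed 4 MiB block size
CACHE_DIR = "/tmp/remote-cache"    # hypothetical local cache directory

def read_block(url: str, index: int) -> bytes:
    """Return one block of the remote file, serving it from the local cache when possible."""
    key = hashlib.sha1(f"{url}#{index}".encode()).hexdigest()
    path = os.path.join(CACHE_DIR, key)
    if os.path.exists(path):                       # cache hit: no network traffic
        with open(path, "rb") as f:
            return f.read()
    start = index * BLOCK
    req = urllib.request.Request(url, headers={"Range": f"bytes={start}-{start + BLOCK - 1}"})
    with urllib.request.urlopen(req) as resp:      # cache miss: fetch just this byte range
        data = resp.read()
    os.makedirs(CACHE_DIR, exist_ok=True)
    with open(path, "wb") as f:                    # keep the block for later reads
        f.write(data)
    return data
```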

10.
Data storage allows users working in different storage environments to access data efficiently. This paper surveys the main storage technologies in grid environments, discusses the issues a storage method must address when data are kept locally and remotely, and describes the basic transfer and storage units and the storage modes used for data storage.

11.
    
Grid computing promises access to large amounts of computing power, but so far adoption of Grid computing has been limited to highly specialized experts for three reasons. First, users are used to batch systems, and interfaces to Grid software are often complex and different from those of batch systems. Second, users are used to having transparent file access, which Grid software does not conveniently provide. Third, efforts to achieve widespread coordination of computers while solving the first two problems are hampered when clusters are on private networks. Here we bring together a variety of software that allows users to almost transparently use Grid resources as if they were local resources while providing transparent access to files, even when private networks intervene. As a motivating example, the BaBar Monte Carlo production system is deployed on a truly distributed environment, the European DataGrid, without any modification to the application itself. Copyright © 2005 John Wiley & Sons, Ltd.

12.
High energy physics data consist of physics events that are independent of one another, so a computing task can be parallelized by running many jobs that process many different data files at the same time; high energy physics computing is therefore a typical high-throughput computing workload. The IHEP computing cluster uses the open-source TORQUE/Maui for resource management and job scheduling, and enforces fairness by partitioning the cluster into queues and capping each user's number of running jobs, which leaves overall resource utilization very low. SLURM and HTCondor are both popular open-source resource management systems: the former offers rich job scheduling policies, the latter is well suited to high-throughput computing, and either could replace the aging, poorly maintained TORQUE/Maui as a viable way to manage the cluster. On SLURM and HTCondor test clusters we simulated the job submission behavior of Daya Bay experiment users, measured the resource allocation behavior and efficiency of SLURM and HTCondor, compared them with the actual scheduling of the same jobs on the IHEP TORQUE/Maui cluster, analyzed the strengths and weaknesses of SLURM and HTCondor, and discussed the feasibility of using either system to manage the IHEP computing cluster.
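The high-throughput pattern described above, one independent job per data file, can be illustrated with a short submission script; the data directory, partition name, and analysis executable below are hypothetical, while `sbatch --wrap` is a standard SLURM option.

```python
# Sketch of the one-job-per-file pattern: submit an independent SLURM job for each data file.
import pathlib, subprocess

for f in sorted(pathlib.Path("/data/dayabay").glob("*.root")):   # hypothetical data directory
    subprocess.run(
        ["sbatch", "--partition=htc",                            # hypothetical partition name
         f"--wrap=analyze_events {f}"],                          # hypothetical analysis command
        check=True,
    )
```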

13.
The DotGrid platform is a Grid infrastructure integrated with a set of open, standard protocols recently implemented on top of Microsoft .NET on Windows and Mono .NET on UNIX/Linux. The DotGrid infrastructure, along with its proposed protocols, provides a sound approach to targeting other platforms, e.g., the native C/C++ runtime. In this paper, we propose a new concurrent file transfer protocol called DotDFS as a high-throughput distributed file transfer component for DotGrid. DotDFS introduces open binary protocols for efficient file transfers on current Grid infrastructures. The DotDFS protocol also provides mechanisms for multiple file streams to achieve high-throughput file transfer, similar to the GridFTP protocol, but by proposing and implementing a new parallel TCP connection-oriented paradigm. Almost no prior work has proposed a concurrent file transfer protocol that simultaneously employs threaded and event-driven models at the protocol level; to our knowledge, DotDFS is the first concurrent file transfer protocol that, from this viewpoint, presents a new computing paradigm in the field of data transmission protocols. In our LAN tests, we achieved better results than the Globus GridFTP implementation, particularly for multiple TCP streams and directory tree transfers. Our LAN memory-to-memory tests show that DotDFS reaches 94% of the bottleneck bandwidth while GridFTP reaches 91%. In LAN disk-to-disk tests, comparing the DotDFS protocol with GridFTP reveals a set of interesting technical problems in GridFTP, both in the nature of the protocol and in its implementation by Globus. In the WAN experiments, we propose a new approach to analytical modeling of file transfer protocols like DotDFS, inspired by sampling, experimentation, and mathematical interpolation. The cross-platform, open-standards features of DotDFS provide a substantial framework for unifying data access and resource sharing in real heterogeneous Grid environments.
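As a generic illustration of multi-stream transfer (not the DotDFS binary protocol), the sketch below splits a file of known size into byte ranges, fetches each range over its own HTTP connection, and writes it at the corresponding offset; the URL and size are assumed inputs.

```python
# Generic multi-stream transfer sketch: several concurrent byte-range downloads of one file.
import concurrent.futures, urllib.request

def fetch_range(url, start, end, out_path):
    req = urllib.request.Request(url, headers={"Range": f"bytes={start}-{end}"})
    with urllib.request.urlopen(req) as resp:
        data = resp.read()
    with open(out_path, "r+b") as f:       # each stream writes its chunk at its own offset
        f.seek(start)
        f.write(data)

def parallel_download(url, size, out_path, streams=4):
    """Download `url` of known `size` bytes using `streams` concurrent range requests."""
    with open(out_path, "wb") as f:        # pre-allocate the output file
        f.truncate(size)
    chunk = size // streams
    ranges = [(i * chunk, size - 1 if i == streams - 1 else (i + 1) * chunk - 1)
              for i in range(streams)]
    with concurrent.futures.ThreadPoolExecutor(max_workers=streams) as pool:
        futures = [pool.submit(fetch_range, url, s, e, out_path) for s, e in ranges]
        for fut in futures:
            fut.result()                   # propagate any transfer error
```

In practice the file size could be obtained beforehand, for example from the Content-Length header of a HEAD request.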

14.
In a data grid composed of multiple computer clusters, idle resources on grid computing nodes can be harvested to support data parallel computing (DPC). This paper proposes a data grid management model based on classification and statistical mechanisms, and studies a grid resource management model supporting DPC in terms of the idle grid resources available at different times, the various kinds of DPC, and logical computer clusters. Experiments show that the model effectively optimizes the use of idle resources for data parallel computing in a grid environment.

15.
Scientific workflow orchestration interoperating HTC and HPC resources   (total citations: 1; self: 0; others: 1)
In this work we describe our developments towards providing a unified access method to different types of computing infrastructures at the interoperation level. To that end, we have developed a middleware suite that bridges the non-interoperable middleware stacks used for building distributed computing infrastructures, UNICORE and gLite. Our solution allows users to transparently access and operate on HPC and HTC resources from a single interface. Using Kepler as the workflow manager, we provide users with the integration of codes needed to create scientific workflows that access both types of infrastructure.

16.
Data are a key driver of progress in astronomy. Distributed storage and high performance computing (HPC) help cope with the complexity and the irregular storage and computation of massive astronomical data. The fusion of multiple kinds of information and multiple disciplines has become inevitable in astronomical research, and astronomical big data has entered the era of large-scale computing. HPC offers new means for processing and analyzing astronomical big data and provides solutions to problems that traditional approaches cannot handle. Based on the classification and characteristics of astronomical data, and with HPC as the supporting technology, this paper studies the fusion, efficient storage and access, analysis and follow-up processing, and visualization of astronomical big data, summarizes the technical characteristics of the current stage, proposes research strategies and technical methods for handling astronomical big data, and discusses the open problems and development trends in this area.

17.
Grid Data Management: Open Problems and New Issues   (total citations: 3; self: 0; others: 3)
Initially developed for the scientific community, Grid computing is now gaining much interest in important areas such as enterprise information systems. This makes data management critical, since the techniques must scale up while addressing the autonomy, dynamicity and heterogeneity of the data sources. In this paper, we discuss the main open problems and new issues related to Grid data management. We first recall the main principles behind data management in distributed systems and the basic techniques. Then we specify precisely the requirements for Grid data management. Finally, we introduce the main techniques needed to address these requirements. This implies revisiting distributed database techniques in major ways, in particular, using P2P techniques. Work partially funded by ARA “Massive Data” of the French ministry of research (project Respire), the European Strep Grid4All project, the CAPES–COFECUB Daad project and the CNPq–INRIA Gridata project.

18.
Data Mining Based on the Knowledge Grid   (total citations: 8; self: 0; others: 8)
魏定国, 彭宏 (Wei Dingguo, Peng Hong). 《计算机科学》 (Computer Science), 2006, 33(6): 210–213
Data in industry, science, and commerce are usually distributed across different locations and must be maintained in a distributed fashion at those locations. Only distributed, parallel processing systems with very strong computing capabilities can analyze the very large data sets these fields produce. The grid provides effective computational support for distributed knowledge discovery applications. To support data mining development on the grid, this paper presents a system called the Knowledge Grid, discusses how to use it to design and implement data mining applications, and explains how grid resources are searched, how software and data components are composed, and how data mining applications execute on the grid.

19.
Grid portals are web gateways that aim to conceal the underlying infrastructure by providing pervasive, transparent, user-friendly, ubiquitous and seamless access to heterogeneous and geographically spread resources (i.e. storage, computational facilities, services, sensors, networks and databases). The Climate-G Portal is the web gateway of the Climate-G testbed (an interdisciplinary research effort involving scientists in both Europe and the US) and is devoted to climate change research. The main goal of this paper is to present the Climate-G Portal, providing a complete understanding of the international context, discussing its main requirements, challenges, architecture and key functionalities, and finally carrying out and presenting a multi-dimensional analysis of the Climate-G Portal, starting from a general schema proposed and discussed in this work.

20.
Scientific data analysis and visualization have become key components of today's large-scale simulations. Due to the rapidly increasing data volume and the awkward I/O patterns of highly structured files, known serial methods and tools cannot scale well and usually perform poorly on traditional architectures. In this paper, we propose a new framework, ParSA (parallel scientific data analysis), for high-throughput and scalable scientific analysis on a distributed file system. ParSA presents optimization strategies for grouping and splitting logical units to exploit the distributed I/O of the file system, for scheduling the distribution of block replicas to reduce network reads, and for maximizing the overlap of data reading, processing, and transfer during computation. In addition, ParSA provides interfaces similar to the NetCDF Operators (NCO), which are used in most climate data diagnostic packages, making the framework easy to use. We use ParSA to accelerate well-known analysis methods for climate models on the Hadoop Distributed File System (HDFS). Experimental results demonstrate the high efficiency and scalability of ParSA, which reaches a maximum throughput of 1.3 GB/s on a six-node Hadoop cluster with five disks per node, whereas a RAID-6 storage node reaches only 392 MB/s.
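The read/process overlap that ParSA exploits can be illustrated with a simple producer-consumer sketch; it shows only the overlap idea, not ParSA's HDFS-aware grouping or replica scheduling, and the chunk size and file names are assumptions.

```python
# Minimal sketch of overlapping I/O and computation: a reader thread streams chunks into
# a bounded queue while the main thread processes them. Chunk size and paths are assumed.
import queue, threading

def reader(paths, q, chunk=64 * 1024 * 1024):        # assumed 64 MiB chunks
    for p in paths:
        with open(p, "rb") as f:
            while (buf := f.read(chunk)):
                q.put(buf)                           # reading overlaps with processing
    q.put(None)                                      # sentinel: no more data

def process(q, results):
    while (buf := q.get()) is not None:
        results.append(len(buf))                     # placeholder for real analysis

q, results = queue.Queue(maxsize=4), []              # bounded queue caps memory use
t = threading.Thread(target=reader, args=(["a.nc", "b.nc"], q))  # hypothetical files
t.start()
process(q, results)
t.join()
```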
