Similar literature
 Found 20 similar documents; search took 31 ms
1.
A scientific workflow environment for Earth system related studies   Total citations: 1, self-citations: 0, citations by others: 1
Many separate tasks must be performed to configure, run, and analyze Earth system modeling applications. This work is motivated by the complexities of running a large modeling system on a high performance network and the need to reduce those complexities, particularly for the average user. Scientific workflow systems can be used to simplify these tasks and their relationships, although how to implement such systems is still an open research area. In this paper, we present a methodology to combine a scientific workflow and modeling framework approach to create a standardized work environment and provide a first example of a self-describing Earth system model. We then show the results of an example workflow that is based on the proposed methodology. The example workflow allows running and analyzing a global circulation model on both a grid computing environment and a cluster system, with meaningful abstractions for the model and computing environment. As can be seen through this example, a layered approach to collecting provenance and metadata information has the added benefit of documenting a run in far greater detail than before. This approach facilitates exploration of runs and leads to possible reproducibility.

2.
3.
Provenance information in eScience is metadata that's critical to effectively manage the exponentially increasing volumes of scientific data from industrial-scale experiment protocols. Semantic provenance, based on domain-specific provenance ontologies, lets software applications unambiguously interpret data in the correct context. The semantic provenance framework for eScience data comprises expressive provenance information and domain-specific provenance ontologies and applies this information to data management. The authors' "two degrees of separation" approach advocates the creation of high-quality provenance information using specialized services. In contrast to workflow engines generating provenance information as a core functionality, the specialized provenance services are integrated into a scientific workflow on demand. This article describes an implementation of the semantic provenance framework for glycoproteomics.

4.
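Research on Modeling of Workflow Management Systems   Total citations: 5, self-citations: 1, citations by others: 4
Four interoperability modes of workflow process models are analyzed, and a "balanced" interoperability mode is proposed. XPDL 1.0 is extended, and the extended XPDL is used as the process definition language to design a process model that expresses an enterprise's control flow, data flow, and resource flow. Building on this work, workflow modeling is implemented using a client/server (C/S) architecture, COM component technology, and the Windows Sockets network programming interface specification.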

5.
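Provenance management is one of the core functions of a scientific workflow system. In the context of scientific workflows, provenance can be divided into workflow definition provenance and workflow execution provenance, which describe the metadata, process dependencies, and data evolution of the workflow definition and execution phases respectively. This paper focuses on techniques for representing and querying workflow definition provenance and execution provenance, and explains solutions to problems specific to the scientific workflow domain, such as the "black box" problem, the dependency differentiation problem, and fine-grained provenance. It also introduces some existing provenance systems for scientific workflows and offers an outlook on the future of provenance technology.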

6.
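With the development of databases, networks, and distributed computing, organizational tasks are increasingly automated and service-related information is increasingly computerized. Real information systems often need business processes (workflows) composed of multiple related tasks, which shifts attention in security from protecting static subjects and objects in standalone computer systems to protection through dynamic authorization as tasks execute. Current access control models protect resources from the system's point of view and do not consider the execution context when controlling permissions; such static access control cannot meet the access control requirements of workflows. To address the difficulty of adapting access control policies to workflow systems, this article introduces a new security model, Task-Based Access Control (TBAC), which manages permissions dynamically and in real time according to tasks and their states. The article introduces the basic concepts of TBAC, describes and analyzes its model, and models a typical approval process of an approval system. TBAC combines the workflow of a real application with the various relationships needed for access control as a whole, and can clearly express the control mechanisms of complex workflows.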
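
The core idea of TBAC in this abstract, that a permission depends on the task and its current state rather than on a static subject/object table, can be illustrated with a minimal Python sketch. The names `TaskState`, `Task`, and `is_allowed`, and the three-state lifecycle, are hypothetical and chosen only for illustration; the abstract does not specify the model's actual structures.

```python
from dataclasses import dataclass
from enum import Enum


class TaskState(Enum):
    # Hypothetical task lifecycle; the paper's state set may differ.
    PENDING = "pending"
    ACTIVE = "active"
    DONE = "done"


@dataclass
class Task:
    name: str
    assignee: str
    permissions: frozenset  # permissions usable only while the task is active
    state: TaskState = TaskState.PENDING


def is_allowed(task, user, permission):
    """Grant a permission only to the assignee and only while the task is active."""
    return (
        task.state is TaskState.ACTIVE
        and task.assignee == user
        and permission in task.permissions
    )


approve = Task("approve_request", "alice", frozenset({"read_form", "sign_form"}))
before = is_allowed(approve, "alice", "sign_form")   # task not started yet
approve.state = TaskState.ACTIVE
during = is_allowed(approve, "alice", "sign_form")   # task is running
approve.state = TaskState.DONE
after = is_allowed(approve, "alice", "sign_form")    # task finished
```

In this sketch the same user and the same permission give different answers before, during, and after the task, which is the dynamic behavior a static access control list cannot express.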

7.
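Data Tracking in a Scientific Workflow Service Framework Integrated with an Object Deputy Database   Total citations: 2, self-citations: 0, citations by others: 2
This paper proposes a scientific workflow service framework integrated with a database. It uses the object deputy model to describe the execution of a series of scientific tasks, so that workflow management operations can be performed in a manner similar to traditional database management operations. In addition, based on the bidirectional pointer mechanism of the object deputy database, the paper proposes a new data tracking method that offers higher performance than the annotation or inversion methods, saving a large amount of storage space and reducing extra computational cost. To further improve the efficiency of data tracking, the paper also proposes a scheme for partially materializing intermediate data; experiments show that it achieves good system performance.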

8.
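Research on Workflow Based on Interoperation Behavior Models of Objects and Instances   Total citations: 10, self-citations: 1, citations by others: 9
By dividing workflow into a logic domain and an activity domain, local and global interoperation behavior models are established for process objects in the logic domain and activity instances in the activity domain. The interoperation logic within each of these two behavior models is analyzed, and the operating mechanism of workflow is examined from the perspective of integrating objects and instances, providing a theoretical basis for extending workflow models and reusing behavioral activities. An example illustrates the operating relationships of the global interoperation behavior model of workflow process objects and activity instances.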

9.
The Directed Acyclic Graph (DAG) is an important tool for workflow modeling and data provenance management, and it usually performs well in these applications. In some workflow applications, however, besides the data or control dependencies between atomic tasks, there is a further requirement that each atomic task be accomplished at an expected stage. This paper therefore proposes an improved DAG model, LDAG, in which each vertex has a level. Three cases of the level of vertices are discussed. For a reasonable one of these cases, this paper proposes a topological ordering algorithm and proves its correctness. In addition, it discusses the complexity of the algorithm and some other relevant problems.
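
To make the idea of a level-aware topological ordering concrete, here is a minimal Python sketch. It is not the algorithm from the paper, whose details the abstract does not give. It assumes one plausible reading of the level constraint: every edge goes from a vertex to one of equal or higher level, and the output must never move back to a lower level. The function name `leveled_topological_order` and the dict-based encoding are hypothetical.

```python
import heapq
from collections import defaultdict


def leveled_topological_order(levels, edges):
    """Return a topological order in which vertex levels never decrease.

    levels: dict mapping vertex name -> integer level (assumed encoding)
    edges:  iterable of (u, v) pairs meaning u must precede v
    Raises ValueError on a cycle or on an edge pointing to a lower level.
    """
    successors = defaultdict(list)
    indegree = {v: 0 for v in levels}
    for u, v in edges:
        if levels[u] > levels[v]:
            raise ValueError(f"edge {u}->{v} points to a lower level")
        successors[u].append(v)
        indegree[v] += 1

    # Kahn's algorithm with a min-heap keyed on (level, vertex name), so the
    # lowest-level vertex whose predecessors are all done is emitted first.
    ready = [(levels[v], v) for v, d in indegree.items() if d == 0]
    heapq.heapify(ready)
    order = []
    while ready:
        _, u = heapq.heappop(ready)
        order.append(u)
        for v in successors[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                heapq.heappush(ready, (levels[v], v))

    if len(order) != len(levels):
        raise ValueError("graph contains a cycle")
    return order


example_levels = {"a": 0, "b": 1, "c": 0, "d": 1}
example_edges = [("a", "b"), ("c", "d"), ("a", "d")]
example_order = leveled_topological_order(example_levels, example_edges)
```

Under the stated assumption, the heap guarantees that the emitted levels are non-decreasing, because any unfinished lower-level vertex always has a ready ancestor of no higher level. With a binary heap the running time is O((V + E) log V).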

10.
This paper investigates an interoperable framework to disseminate Earth science data to different application domains. The proposed framework can manage different Earth science data products and raster snapshots over time through the use of relevant metadata information. The framework generates images to be accessed by GIS software for various Earth science and web-based applications. Access is enabled through compliance with the Open Geospatial Consortium's Web Map Service (WMS) for interoperability, such that any WMS viewer can access the service. The framework can provide GIS users the capability to incorporate geospatial information from other WMS servers. Using the United States NEXt generation weather RADar (NEXRAD) data, we demonstrate how the proposed framework can facilitate the dissemination of Earth science data to a broad community in a near real-time fashion. The proposed framework can be used to manage and disseminate various types of spatiotemporal Earth science data.

11.
This paper describes the metadata and metadata management algorithms necessary to handle the concurrent execution of multiple tasks from a single workflow, in a collaborative service oriented architecture environment. Metadata requirements are imposed by the distributed workflow that calculates thermoelastic properties of materials at high pressures and temperatures. The scientific relevance of this workflow is also discussed. We explain the basic metaphor, the receipt, underlying the metadata management. We show the actual Java representation of the receipt, and explain how it is converted to XML in order to be transferred between servers and stored in a database. We also discuss how the collaborative aspect of user activity on running workflows could potentially lead to race conditions, how this affects requirements on metadata, and how these race conditions are precluded. Finally, we describe an additional metadata structure, complementary to the receipts, that contains general information about the workflow.

12.
In this paper we propose an efficient and scalable storage model and lookup for provenance logs. The proposed system exploits the loosely coupled structure of the provenance logs by separating metadata from the generating process to manage large datasets with good scalability. In addition, the system uses a trie-based lookup table to greatly improve the provenance data lookup time. Performance results on thousands of graph logs show that our prototype implementation can effectively handle logs without any resource over-utilization, thus leading to good scalability.
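
As an illustration of why a trie suits this kind of lookup, the following Python sketch indexes log identifiers character by character, so a lookup costs time proportional to the key length rather than to the number of stored logs. The class names, the key format (`run-001/task-a`), and the stored `store://` locations are hypothetical; the abstract does not describe the paper's actual table layout.

```python
class TrieNode:
    def __init__(self):
        self.children = {}     # next character -> TrieNode
        self.terminal = False  # True if a stored key ends at this node
        self.value = None      # payload, e.g. where the log is stored


class ProvenanceTrie:
    """Character trie mapping log identifiers to storage locations."""

    def __init__(self):
        self.root = TrieNode()

    def insert(self, key, location):
        node = self.root
        for ch in key:
            node = node.children.setdefault(ch, TrieNode())
        node.terminal = True
        node.value = location

    def lookup(self, key):
        """Return the stored location, or None if the key is absent."""
        node = self.root
        for ch in key:
            node = node.children.get(ch)
            if node is None:
                return None
        return node.value if node.terminal else None

    def keys_with_prefix(self, prefix):
        """Return all stored keys that start with the given prefix, sorted."""
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        found = []
        stack = [(prefix, node)]
        while stack:
            path, current = stack.pop()
            if current.terminal:
                found.append(path)
            for ch, child in current.children.items():
                stack.append((path + ch, child))
        return sorted(found)


index = ProvenanceTrie()
index.insert("run-001/task-a", "store://0")
index.insert("run-001/task-b", "store://1")
index.insert("run-002/task-a", "store://2")
```

The prefix query is what a hash table cannot offer cheaply: every log of one run shares a key prefix, so it can be enumerated without scanning unrelated entries.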

13.
Understanding regional-scale water resource systems requires understanding coupled hydrologic and climate interactions. The traditional approach in the hydrologic sciences and engineering fields has been to either treat the atmosphere as a forcing condition on the hydrologic model, or to adopt a specific hydrologic model design in order to be interoperable with a climate model. We propose here a different approach that follows a service-oriented architecture and uses standard interfaces and tools: the Earth System Modeling Framework (ESMF) from the weather and climate community and the Open Modeling Interface (OpenMI) from the hydrologic community. A novel technical challenge of this work is that the climate model runs on a high performance computer and the hydrologic model runs on a personal computer. In order to complete a two-way coupling, issues with security and job scheduling had to be overcome. The resulting application demonstrates interoperability across disciplinary boundaries and has the potential to address emerging questions about climate impacts on local water resource systems. The approach also has the potential to be adapted for other climate impacts applications that involve different communities, multiple frameworks, and models running on different computing platforms. Along with the results of our coupled modeling system, we present a scaling analysis that indicates how the system will behave as geographic extents and model resolutions are changed to address regional-scale water resources management problems.

14.
15.
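The paper first introduces the classification of metadata standards and the interoperability strategy in scientific and technological resource databases, and proposes an architecture for a metadata registry system based on the characteristics of scientific and technological resource data. Key technologies of the metadata registry system are analyzed and addressed. The study shows that metadata registration aids the management of metadata standards and promotes better use, sharing, exchange, and integration of scientific and technological resource data.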

16.
Today there exists a wide variety of scientific workflow management systems, each designed to fulfill the needs of a certain scientific community. Unfortunately, once a workflow application has been designed in one particular system it becomes very hard to share it with users working with different systems. Portability of workflows and interoperability between current systems barely exist. In this work, we present the fine-grained interoperability solution proposed in the SHIWA European project that brings together four representative European workflow systems: ASKALON, MOTEUR, WS-PGRADE, and Triana. The proposed interoperability is realised at two levels of abstraction: abstract and concrete. At the abstract level, we propose a generic Interoperable Workflow Intermediate Representation (IWIR) that can be used as a common bridge for translating workflows between different languages independent of the underlying distributed computing infrastructure. At the concrete level, we propose a bundling technique that aggregates the abstract IWIR representation and concrete task representations to enable workflow instantiation, execution and scheduling. We illustrate case studies using two real-workflow applications designed in a native environment and then translated and executed by a foreign workflow system in a foreign distributed computing infrastructure.

17.
Automation of the execution of computational tasks is at the heart of improving scientific productivity. Over the last years, scientific workflows have been established as an important abstraction that captures data processing and computation of large and complex scientific applications. By allowing scientists to model and express entire data processing steps and their dependencies, workflow management systems relieve scientists from the details of an application and manage its execution on a computational infrastructure. As the resource requirements of today's computational and data science applications that process vast amounts of data keep increasing, there is a compelling case for a new generation of advances in high-performance computing, commonly termed extreme-scale computing, which will bring forth multiple challenges for the design of workflow applications and management systems. This paper presents a novel characterization of workflow management systems using features commonly associated with extreme-scale computing applications. We classify 15 popular workflow management systems in terms of workflow execution models, heterogeneous computing environments, and data access methods. The paper also surveys workflow applications and identifies gaps for future research on the road to extreme-scale workflows and management systems.

18.
Integrated environmental modeling (IEM) provides a systematic way to couple models for integrated analysis. Coupled models in IEM often exchange data at runtime for time-step based executions. It is a challenge to track which raw observations or intermediate data exchanged at runtime contribute to individual model outputs. Time-step level provenance is needed to audit the trail of model execution or perform diagnosis in case of anomalies. This paper introduces a method to support provenance awareness in IEM. It suggests that individual models should expose necessary interfaces for provenance capturing in IEM environments. The provenance is represented using the W3C PROV model for interoperability. Fine-grained provenance is inferred based on coarse-grained provenance and temporal characteristics of computations of numerical time marching models. The approach is implemented in OpenMI-compliant models. A case study of model provenance tracking and inference in a watershed runoff simulation scenario illustrates the applicability of the approach.

19.
The paper presents a complete solution for modeling scientific and business workflow applications, static and just-in-time QoS selection of services, and workflow execution in a real environment. The workflow application is modeled as an acyclic directed graph where nodes denote tasks and edges denote dependencies between the tasks. The BeesyCluster middleware is used to allow providers to publish services from sequential or parallel applications, from their servers or clusters. Optimization algorithms are proposed to select a capable service for each task so that a global criterion is optimized, such as the product of workflow execution time and cost, a linear combination of the two, or minimization of the time under a cost constraint. The paper presents implementation details of the multithreaded workflow execution engine implemented in JEE. Several tests were performed for three different optimization goals for two business and scientific workflow applications. Finally, the overhead of the solution is presented.
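
The selection problem described here can be made concrete with a small Python sketch that picks one service per task to minimize the product of workflow execution time and total cost. This is an exhaustive search used purely for illustration, not one of the optimization algorithms proposed in the paper. The function names, the `(service_name, time, cost)` offer format, and the assumption that tasks are supplied in topological order are all hypothetical.

```python
from itertools import product


def workflow_time(order, preds, duration):
    """Critical-path length of the DAG; `order` must be a topological order."""
    finish = {}
    for task in order:
        start = max((finish[p] for p in preds.get(task, [])), default=0.0)
        finish[task] = start + duration[task]
    return max(finish.values())


def select_services(order, preds, offers):
    """Try every assignment of one service per task; return (score, choice).

    offers: task -> list of (service_name, time, cost) tuples (assumed format)
    score:  workflow execution time multiplied by total cost
    """
    best = None
    for choice in product(*(offers[t] for t in order)):
        duration = {t: c[1] for t, c in zip(order, choice)}
        total_cost = sum(c[2] for c in choice)
        score = workflow_time(order, preds, duration) * total_cost
        if best is None or score < best[0]:
            best = (score, {t: c[0] for t, c in zip(order, choice)})
    return best


# Tasks A and B run in parallel; C waits for both.
order = ["A", "B", "C"]
preds = {"C": ["A", "B"]}
offers = {
    "A": [("a_fast", 1, 5), ("a_slow", 3, 1)],
    "B": [("b_only", 3, 1)],
    "C": [("c_only", 1, 1)],
}
best_score, best_choice = select_services(order, preds, offers)
```

In this example the faster service for A is not worth paying for, because B already keeps the critical path at the same length. Exhaustive search grows exponentially with the number of tasks, which is why dedicated optimization algorithms are needed for realistic workflows.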

20.
One of the main advantages of using a scientific workflow management system (SWfMS) is to orchestrate data flows among scientific activities and register provenance of the whole workflow execution. Nevertheless, the execution control of distributed activities in high performance computing environments by SWfMS presents challenges such as steering control and provenance gathering. Such challenges may become a complex task to be accomplished in bioinformatics experiments, particularly in Many Task Computing scenarios. This paper presents a data parallelism solution for a bioinformatics experiment supported by Hydra, a middleware that bridges SWfMS and high performance computing to enable workflow parallelization with provenance gathering. Hydra Many Task Computing parallelization strategies can be registered and reused. Using Hydra, provenance may also be uniformly gathered. We have evaluated Hydra using an Orthologous Gene Identification workflow. Experimental results show that a systematic approach for distributing parallel activities is viable, saving scientists' time and diminishing operational errors, with the additional benefits of distributed provenance support. Copyright © 2011 John Wiley & Sons, Ltd.
