期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

HELIO: Discovery and analysis of data in heliophysics

Robert Bentley John Brooke André Csillaghy Donal Fellows Anja Le Blanc Mauro Messerotti David Pérez-Suárez Gabriele Pierantoni Marco Soldati 《Future Generation Computer Systems》2013,29(8):2157-2168

Heliophysics is the study of highly energetic events that originate on the Sun and propagate through the solar system. Such events can cause critical and possibly fatal disruption of the electromagnetic systems on spacecraft and on ground-based structures such as electric power grids, so there is a clear need to understand the events in their totality as they propagate through space and time. The e-Science challenge posed is that the data was gathered by many observatories and communities that have hitherto not needed to work together. Firstly, this involves the problem of helping users to more easily find and understand the relevance of data, especially data from outside their domain. Secondly, it involves solving challenges of data integration. We describe the design of the HELIO infrastructure, based on the use of Web services linked together by workflows and accessible via portal-based user interfaces. We also discuss current progress in the implementation of this infrastructure and the feedback from the user community. 相似文献

2.

Scaling up workflow-based applications

Scott Callaghan Ewa Deelman Dan Gunter Gideon Juve Philip Maechling Christopher Brooks Karan Vahi Kevin Milner Robert Graves Edward Field David Okaya Thomas Jordan 《Journal of Computer and System Sciences》2010,76(6):428-446

相似文献

3.

Information flow analysis of scientific workflows

Ping Yang Shiyong Lu Mikhail I. Gofman Zijiang Yang 《Journal of Computer and System Sciences》2010,76(6):390-402

Recently, scientific workflows have emerged as a platform for automating and accelerating data processing and data sharing in scientific communities. Many scientific workflows have been developed for collaborative research projects that involve a number of geographically distributed organizations. Sharing of data and computation across organizations in different administrative domains is essential in such a collaborative environment. Because of the competitive nature of scientific research, it is important to ensure that sensitive information in scientific workflows can be accessed by and propagated to only authorized parties. To address this problem, we present techniques for analyzing how information propagates in scientific workflows. We also present algorithms for incrementally analyzing how information propagates upon every change to an existing scientific workflow. 相似文献

4.

Mapping Abstract Complex Workflows onto Grid Environments 总被引：18，自引：0，他引：18

Ewa Deelman James Blythe Yolanda Gil Carl Kesselman Gaurang Mehta Karan Vahi Kent Blackburn Albert Lazzarini Adam Arbree Richard Cavanaugh Scott Koranda 《Journal of Grid Computing》2003,1(1):25-39

In this paper we address the problem of automatically generating job workflows for the Grid. These workflows describe the execution of a complex application built from individual application components. In our work we have developed two workflow generators: the first (the Concrete Workflow Generator CWG) maps an abstract workflow defined in terms of application-level components to the set of available Grid resources. The second generator (Abstract and Concrete Workflow Generator, ACWG) takes a wider perspective and not only performs the abstract to concrete mapping but also enables the construction of the abstract workflow based on the available components. This system operates in the application domain and chooses application components based on the application metadata attributes. We describe our current ACWG based on AI planning technologies and outline how these technologies can play a crucial role in developing complex application workflows in Grid environments. Although our work is preliminary, CWG has already been used to map high energy physics applications onto the Grid. In one particular experiment, a set of production runs lasted 7 days and resulted in the generation of 167,500 events by 678 jobs. Additionally, ACWG was used to map gravitational physics workflows, with hundreds of nodes onto the available resources, resulting in 975 tasks, 1365 data transfers and 975 output files produced. 相似文献

5.

Modeling and Managing Interactions among Business Processes 总被引：3，自引：0，他引：3

Fabio Casati Angela Discenza 《Journal of Systems Integration》2001,10(2):145-168

Most workflow management systems (WfMSs) only support the separate andindependent execution of business processes. However, processes often needto interact with each other, in order to synchronize the execution of theiractivities, to exchange process data, to request execution of services, orto notify progresses in process execution. Recent market trends also raisethe need for cooperation and interaction between processes executed in differentorganizations, posing additional challenges. In fact, in order to reduce costsand provide better services, companies are pushed to increase cooperation and toform virtual enterprises, where business processes span across organizationalboundaries and are composed of cooperating workflows executed in differentorganizations. Workflow interaction in a cross-organizational environment iscomplicated by the heterogeneity of workflow management platforms on top ofwhich workflows are defined and executed and by the different and possiblycompeting business policies and business goals that drive process executionin each organization.In this paper we propose a model and system that enable interactionbetween workflows executed in the same or in different organizations. Weextend traditional workflow models by allowing workflows to publish andsubscribe to events, and by enabling the definition of points in the processexecution where events should be sent or received. Event notifications aremanaged by a suitable event service that is capable of filtering andcorrelating events, and of dispatching them to the appropriate targetworkflow instances. The extended model can be easily mapped onto anyworkflow model, since event specific constructs can be specified by means ofordinary workflow activities, for which we provide the implementation. Inaddition, the event service is easily portable to different platforms, anddoes not require integration with the WfMS that supports the cooperatingworkflows. Therefore, the proposed approach is applicable in virtually anyenvironment and is independent on the specific platform adopted 相似文献

6.

Overhead Analysis of Scientific Workflows in Grid Environments

Prodan R. Fahringer T. 《Parallel and Distributed Systems, IEEE Transactions on》2008,19(3):378-393

Scientific workflows are a topic of great interest in the grid community that sees in the workflow model an attractive paradigm for programming distributed wide-area grid infrastructures. Traditionally, the grid workflow execution is approached as a pure best effort scheduling problem that maps the activities onto the grid processors based on appropriate optimization or local matchmaking heuristics such that the overall execution time is minimized. Even though such heuristics often deliver effective results, the execution in dynamic and unpredictable grid environments is prone to severe performance losses that must be understood for minimizing the completion time or for the efficient use of high-performance resources. In this paper, we propose a new systematic approach to help the scientists and middleware developers understand the most severe sources of performance losses that occur when executing scientific workflows in dynamic grid environments. We introduce an ideal model for the lowest execution time that can be achieved by a workflow and explain the difference to the real measured grid execution time based on a hierarchy of performance overheads for grid computing. We describe how to systematically measure and compute the overheads from individual activities to larger workflow regions and adjust well-known parallel processing metrics to the scope of grid computing, including speedup and efficiency. We present a distributed online tool for computing and analyzing the performance overheads in real time based on event correlation techniques and introduce several performance contracts as quality-of-service parameters to be enforced during the workflow execution beyond traditional best effort practices. We illustrate our method through postmortem and online performance analysis of two real-world workflow applications executed in the Austrian grid environment. 相似文献

7.

Exploring Workflow Interoperability for Neuroimage Analysis on the SHIWA Platform

Vladimir Korkhov Dagmar Krefting Tamas Kukla Gabor Z. Terstyanszky Matthan W. A. Caan Silvia D. Olabarriaga 《Journal of Grid Computing》2013,11(3):505-522

Neuroimaging is a field that benefits from distributed computing infrastructures (DCIs) to perform data processing and analysis, which is often achieved using Grid workflow systems. Collaborative research in neuroimaging requires ways to facilitate exchange between different groups, in particular to enable sharing, re-use and interoperability of applications implemented as workflows. The SHIWA project provides solutions to facilitate sharing and exchange of workflows between workflow systems and DCI resources. In this paper we present and analyse how the SHIWA Platform was used to implement various cases in which workflow exchange supports collaboration in neuroscience. The SHIWA Platform and the implemented solutions are described and analysed from a “user” perspective, in this case workflow developers and neuroscientists. We conclude that the platform in its current form is valuable for these cases, and we identify remaining challenges. 相似文献

8.

Static analysis of Taverna workflows to predict provenance patterns

《Future Generation Computer Systems》2017

相似文献

9.

Effective and efficient similarity search in scientific workflow repositories

《Future Generation Computer Systems》2016

Scientific workflows have become a valuable tool for large-scale data processing and analysis. This has led to the creation of specialized online repositories to facilitate workflow sharing and reuse. Over time, these repositories have grown to sizes that call for advanced methods to support workflow discovery, in particular for similarity search. Effective similarity search requires both high quality algorithms for the comparison of scientific workflows and efficient strategies for indexing, searching, and ranking of search results. Yet, the graph structure of scientific workflows poses severe challenges to each of these steps. Here, we present a complete system for effective and efficient similarity search in scientific workflow repositories, based on the Layer Decomposition approach to scientific workflow comparison. Layer Decomposition specifically accounts for the directed dataflow underlying scientific workflows and, compared to other state-of-the-art methods, delivers best results for similarity search at comparably low runtimes. Stacking Layer Decomposition with even faster, structure-agnostic approaches allows us to use proven, off-the-shelf tools for workflow indexing to further reduce runtimes and scale similarity search to sizes of current repositories. 相似文献

10.

A Provenance-based Adaptive Scheduling Heuristic for Parallel Scientific Workflows in Clouds

Daniel de Oliveira Kary A. C. S. Oca?a Fernanda Bai?o Marta Mattoso 《Journal of Grid Computing》2012,10(3):521-552

In the last years, scientific workflows have emerged as a fundamental abstraction for structuring and executing scientific experiments in computational environments. Scientific workflows are becoming increasingly complex and more demanding in terms of computational resources, thus requiring the usage of parallel techniques and high performance computing (HPC) environments. Meanwhile, clouds have emerged as a new paradigm where resources are virtualized and provided on demand. By using clouds, scientists have expanded beyond single parallel computers to hundreds or even thousands of virtual machines. Although the initial focus of clouds was to provide high throughput computing, clouds are already being used to provide an HPC environment where elastic resources can be instantiated on demand during the course of a scientific workflow. However, this model also raises many open, yet important, challenges such as scheduling workflow activities. Scheduling parallel scientific workflows in the cloud is a very complex task since we have to take into account many different criteria and to explore the elasticity characteristic for optimizing workflow execution. In this paper, we introduce an adaptive scheduling heuristic for parallel execution of scientific workflows in the cloud that is based on three criteria: total execution time (makespan), reliability and financial cost. Besides scheduling workflow activities based on a 3-objective cost model, this approach also scales resources up and down according to the restrictions imposed by scientists before workflow execution. This tuning is based on provenance data captured and queried at runtime. We conducted a thorough validation of our approach using a real bioinformatics workflow. The experiments were performed in SciCumulus, a cloud workflow engine for managing scientific workflow execution. 相似文献

11.

Efficiently supporting secure and reliable collaboration in scientific workflows

Jiangang Ma Jinli Cao Yanchun Zhang 《Journal of Computer and System Sciences》2010,76(6):475-489

Recently, workflow technologies have been increasingly used in scientific communities. Scientists carry out research by employing scientific workflows to automate computing steps, analyze large data sets and integrate distributed computing processes. This is a challenging task because of insecure procedures in a distributed environment. In this paper, we present an access control framework and models for supporting secure and reliable collaboration. The proposed approaches combine control flows and data flow models to describe scientific workflows, and extend the atomicity sphere concept by considering two levels of atomicity abstraction at the level of process as well as at the level of data, in order to maintain the process consistency and the data consistency in the presence of failures. We also present a case study in a scientific research scenario to show the effectiveness of our approaches. 相似文献

12.

Cloud-aware data intensive workflow scheduling on volunteer computing systems

《Future Generation Computer Systems》2015

Volunteer computing systems offer high computing power to the scientific communities to run large data intensive scientific workflows. However, these computing environments provide the best effort infrastructure to execute high performance jobs. This work aims to schedule scientific and data intensive workflows on hybrid of the volunteer computing system and Cloud resources to enhance the utilization of these environments and increase the percentage of workflow that meets the deadline. The proposed workflow scheduling system partitions a workflow into sub-workflows to minimize data dependencies among the sub-workflows. Then these sub-workflows are scheduled to distribute on volunteer resources according to the proximity of resources and the load balancing policy. The execution time of each sub-workflow on the selected volunteer resources is estimated in this phase. If any of the sub-workflows misses the sub-deadline due to the large waiting time, we consider re-scheduling of this sub-workflow into the public Cloud resources. This re-scheduling improves the system performance by increasing the percentage of workflows that meet the deadline. The proposed Cloud-aware data intensive scheduling algorithm increases the percentage of workflow that meet the deadline with a factor of 75% in average with respect to the execution of workflows on the volunteer resources. 相似文献

13.

Visualization workflows for level-12 HUC scales: Towards an expert system for watershed analysis in a distributed computing environment

《Environmental Modelling & Software》2016

Visualization workflows are important services for expert users to analyze watersheds when using our HydroTerre end-to-end workflows. Analysis is an interactive and iterative process and we demonstrate that the expert user can focus on model results, not data preparation, by using a web application to rapidly create, tune, and calibrate hydrological models anywhere in the continental USA (CONUS). The HydroTerre system captures user interaction for provenance and reproducibility to share modeling strategies with modelers. Our end-to-end workflow consists of four workflows. The first is data workflows using Essential Terrestrial Variables (ETV) data sets that we demonstrated to construct watershed models anywhere in the CONUS (Leonard and Duffy, 2013). The second is data-model workflows that transform the data workflow results to model inputs. The model inputs are consumed in the third workflow, model workflows (Leonard and Duffy, 2014a) that handle distribution of data and model within High Performance Computing (HPC) environments. This article focuses on our fourth workflow, visualization workflows, which consume the first three workflows to form an end-to-end system to create and share hydrological model results efficiently for analysis and peer review. We show how visualization workflows are incorporated into the HydroTerre infrastructure design and demonstrate the efficiency and robustness for an expert modeler to produce, analyze, and share new hydrological models using CONUS national datasets. 相似文献

14.

Using ontologies for verification and validation of workflow-based experiments

《Journal of Web Semantics》2017

相似文献

15.

Social workflows—Vision and potential study

《Information Systems》2015

Social workflows pervade peoples׳ everyday life. Whenever a group of persons works together on a challenging or multifaceted task, a social workflow begins. Unlike traditional business workflows, such social workflows aim at supporting processes that contain personal tasks and data. In this work, we envision a social workflow service as part of a social network that enables private individuals to construct social workflows according to their specific needs and to keep track of the workflow execution. The proposed features for a social workflow service could help individuals to accomplish their private goals. The presented idea is contrasted with established research areas and applications to show the degree of novelty of this work. It is shown how novel ideas for knowledge management, facilitated by a process-oriented case-based reasoning approach, support private individuals and how they can obtain an appropriate social workflow through sharing and reuse of respective experience. Two empirical studies confirm the potential benefits of a social workflow service in general and the core features of the developed concept. 相似文献

16.

Scientific workflows with the jABC framework

Anna-Lena Lamprecht Bernhard Steffen Tiziana Margaria 《International Journal on Software Tools for Technology Transfer (STTT)》2016,18(6):629-651

The jABC is a framework for process modelling and execution according to the XMDD (eXtreme model-driven design) paradigm, which advocates the rigorous use of user-level models in the software development process and software life cycle. We have used the jABC in the domain of scientific workflows for more than a decade now—an occasion to look back and take stock of our experiences in the field. On the one hand, we discuss results from the analysis of a sample of nearly 100 scientific workflow applications that have been implemented with the jABC. On the other hand, we reflect on our experiences and observations regarding the workflow development process with the framework. We then derive and discuss ongoing further developments and future perspectives for the framework, all with an emphasis on simplicity for end users through increased domain specificity. Concretely, we describe how the use of the PROPHETS synthesis plugin can enable a semantics-based simplification of the workflow design process, how with the jABC4 and DyWA frameworks more attention is paid to the ease of data management, and how the Cinco SCCE Meta-Tooling Suite can be used to generate tailored workflow management tools. 相似文献

17.

Performance metrics and ontologies for Grid workflows

《Future Generation Computer Systems》2007,23(6):760-772

Many Grid workflow middleware services require knowledge about the performance behavior of Grid applications/services in order to effectively select, compose, and execute workflows in dynamic and complex Grid systems. To provide performance information for building such knowledge, Grid workflow performance tools have to select, measure, and analyze various performance metrics of workflows. However, there is a lack of a comprehensive study of performance metrics which can be used to evaluate the performance of a workflow executed in the Grid. Moreover, given the complexity of both Grid systems and workflows, semantics of essential performance-related concepts and relationships, and associated performance data in Grid workflows should be well described. In this paper, we analyze performance metrics that performance monitoring and analysis tools should provide during the evaluation of the performance of Grid workflows. Performance metrics are associated with multiple levels of abstraction. We introduce an ontology for describing performance data of Grid workflows and illustrate how the ontology can be utilized for monitoring and analyzing the performance of Grid workflows. 相似文献

18.

Nonintrusive collection and management of data provenance in scientific workflows

Giorgos Tylissanakis Yiannis Cotronis 《Concurrency and Computation》2012,24(18):2268-2281

In this paper, we introduce an efficient mechanism to collect, store, and retrieve data provenance information in workflows of multiphysics simulations. Using notifications, we enable the nonintrusive collection of information about workflow events during workflow execution. Combining these events with workflow structure information, constant for every execution of a workflow, we obtain the data provenance information for the specific run of the workflow. Data provenance information is structured into a graph that represents workflow events on the basis of their causal dependency. We use a graph database to store this graph and utilize the traversal framework provided, to efficiently retrieve data provenance information from the graph by traversing backwards from a data object to every workflow event that is part of its provenance. Finally, we integrate data provenance information with semantics of workflow services to provide complete and meaningful data provenance information. Copyright © 2012 John Wiley & Sons, Ltd. 相似文献

19.

A Case Study into Using Common Real-Time Workflow Monitoring Infrastructure for Scientific Workflows

Karan Vahi Ian Harvey Taghrid Samak Daniel Gunter Kieran Evans David Rogers Ian Taylor Monte Goode Fabio Silva Eddie Al-Shakarchi Gaurang Mehta Ewa Deelman Andrew Jones 《Journal of Grid Computing》2013,11(3):381-406

Scientific workflow systems support various workflow representations, operational modes, and configurations. Regardless of the system used, end users have common needs: to track the status of their workflows in real time, be notified of execution anomalies and failures automatically, perform troubleshooting, and automate the analysis of the workflow results. In this paper, we describe how the Stampede monitoring infrastructure was integrated with the Pegasus Workflow Management System and the Triana Workflow Systems, in order to add generic real time monitoring and troubleshooting capabilities across both systems. Stampede is an infrastructure that provides interoperable monitoring using a three-layer model: (1) a common data model to describe workflow and job executions; (2) high-performance tools to load workflow logs conforming to the data model into a data store; and (3) a common query interface. This paper describes the integration of Stampede monitoring architecture with Pegasus and Triana and shows the new analysis capabilities that Stampede provides to these workflow systems. The successful integration of Stampede with these workflow engines demonstrates the generic nature of the Stampede monitoring infrastructure and its potential to provide a common platform for monitoring across scientific workflow engines. 相似文献

20.

Abstract,link, publish,exploit: An end to end framework for workflow sharing

《Future Generation Computer Systems》2017

Scientific workflows are increasingly used to manage and share scientific computations and methods to analyze data. A variety of systems have been developed that store the workflows executed and make them part of public repositories However, workflows are published in the idiosyncratic format of the workflow system used for the creation and execution of the workflows. Browsing, linking and using the stored workflows and their results often becomes a challenge for scientists who may only be familiar with one system. In this paper we present an approach for addressing this issue by publishing and exploiting workflows as data on the Web with a representation that is independent from the workflow system used to create them. In order to achieve our goal, we follow the Linked Data Principles to publish workflow inputs, intermediate results, outputs and codes; and we reuse and extend well established standards like W3C PROV. We illustrate our approach by publishing workflows and consuming them with different tools designed to address common scenarios for workflow exploitation. 相似文献