期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Dynamic Instrumentation, Performance Monitoring and Analysis of Grid Scientific Workflows

Hong-Linh Truong Thomas Fahringer Schahram Dustdar 《Journal of Grid Computing》2005,3(1-2):1-18

While existing work concentrates on developing QoS models of business workflows and Web services, few tools have been developed to support the monitoring and performance analysis of scientific workflows in Grids. This paper describes novel Grid services for dynamic instrumentation of Grid-based applications, performance monitoring and analysis of Grid scientific workflows. We describe a Grid dynamic instrumentation service that provides a widely accessible interface for other services and users to conduct the dynamic instrumentation of Grid applications during the runtime. We introduce a Grid performance analysis service for Grid scientific workflows. The analysis service utilizes various types of data including workflow graphs, monitoring data of resources, execution status of activities, and performance measurements obtained from the dynamic instrumentation of invoked applications, and provides a rich set of functionalities and features to support the online monitoring and performance analysis of scientific workflows. Workflows and their relevant information including performance metrics are stored and utilized for comparing the performance of constructs of different workflows and for supporting multi-workflow analysis. The work described in this paper is supported in part by the Austrian Science Fund as part of the Aurora Project under contract SFBF1104 and by the European Union through the IST-2002-511385 project K-WfGrid. 相似文献

2.

A Distributed Workflow Management System with Case Study of Real-life Scientific Applications on Grids

Qishi Wu Mengxia Zhu Yi Gu Patrick Brown Xukang Lu Wuyin Lin Yangang Liu 《Journal of Grid Computing》2012,10(3):367-393

Next-generation scientific applications feature complex workflows comprised of many computing modules with intricate inter-module dependencies. Supporting such scientific workflows in wide-area networks especially Grids and optimizing their performance are crucial to the success of collaborative scientific discovery. We develop a Scientific Workflow Automation and Management Platform (SWAMP), which enables scientists to conveniently assemble, execute, monitor, control, and steer computing workflows in distributed environments via a unified web-based user interface. The SWAMP architecture is built entirely on a seamless composition of web services: the functionalities of its own are provided and its interactions with other tools or systems are enabled through web services for easy access over standard Internet protocols while being independent of different platforms and programming languages. SWAMP also incorporates a class of efficient workflow mapping schemes to achieve optimal end-to-end performance based on rigorous performance modeling and algorithm design. The performance superiority of SWAMP over existing workflow mapping schemes is justified by extensive simulations, and the system efficacy is illustrated by large-scale experiments on real-life scientific workflows for climate modeling through effective system implementation, deployment, and testing on the Open Science Grid. 相似文献

3.

Grid Service Orchestration Using the Business Process Execution Language (BPEL) 总被引：6，自引：0，他引：6

Wolfgang Emmerich Ben Butchart Liang Chen Bruno Wassermann Sarah L. Price 《Journal of Grid Computing》2005,3(3-4):283-304

Modern scientific applications often need to be distributed across Grids. Increasingly applications rely on services, such as job submission, data transfer or data portal services. We refer to such services as Grid services. While the invocation of Grid services could be hard coded in theory, scientific users want to orchestrate service invocations more flexibly. In enterprise applications, the orchestration of web services is achieved using emerging orchestration standards, most notably the Business Process Execution Language (BPEL). We describe our experience in orchestrating scientific workflows using BPEL. We have gained this experience during an extensive case study that orchestrates Grid services for the automation of a polymorph prediction application. Using this example, we explain the extent with which the BPEL language supports the definition of scientific workflows. We then describe the reliability, performance and scalability that can be achieved by executing a complex scientific workflow with ActiveBPEL, an industrial strength but freely available BPEL engine. *The work has been funded by the UK EPSRC through grants GR/R97207/01 (e-Materials) and GR/S90843/01 (OMII Managed Programme). 相似文献

4.

SEAM: A state-entity-activity-model for a well-defined workflowdevelopment methodology

Bajaj A. Ram S. 《Knowledge and Data Engineering, IEEE Transactions on》2002,14(2):415-431

Current conceptual workflow models use either informally defined conceptual models or several formally defined conceptual models that capture different aspects of the workflow, e.g., the data, process, and organizational aspects of the workflow. To the best of our knowledge, there are no algorithms that can amalgamate these models to yield a single view of reality. A fragmented conceptual view is useful for systems analysis and documentation. However, it fails to realize the potential of conceptual models to provide a convenient interface to automate the design and management of workflows. First, as a step toward accomplishing this objective, we propose SEAM (State-Entity-Activity-Model), a conceptual workflow model defined in terms of set theory. Second, no attempt has been made, to the best of our knowledge, to incorporate time into a conceptual workflow model. SEAM incorporates the temporal aspect of workflows. Third, we apply SEAM to a real-life organizational unit's workflows. In this work, we show a subset of the workflows modeled for this organization using SEAM. We also demonstrate, via a prototype application, how the SEAM schema can be implemented on a relational database management system. We present the lessons we learned about the advantages obtained for the organization and, for developers who choose to use SEAM, we also present potential pitfalls in using the SEAM methodology to build workflow systems on relational platforms. The information contained in this work is sufficient enough to allow application developers to utilize SEAM as a methodology to analyze, design, and construct workflow applications on current relational database management systems. The definition of SEAM as a context-free grammar, definition of its semantics, and its mapping to relational platforms should be sufficient also, to allow the construction of an automated workflow design and construction tool with SEAM as the user interface 相似文献

5.

A Collaborative Multiagent System for Mining Transcriptional Regulatory Elements

Xiong Yun Zheng Guangyong Yang Qing Zhu Yangyong 《Intelligent Systems, IEEE》2009,24(3):26-37

相似文献

6.

P-GRADE: A Grid Programming Environment

P. Kacsuk G. Dózsa J. Kovács R. Lovas N. Podhorszki Z. Balaton G. Gombás 《Journal of Grid Computing》2003,1(2):171-197

P-GRADE provides a high-level graphical environment to develop parallel applications transparently both for parallel systems and the Grid. P-GRADE supports the interactive execution of parallel programs as well as the creation of a Condor, Condor-G or Globus job to execute parallel programs in the Grid. In P-GRADE, the user can generate either PVM or MPI code according to the underlying Grid where the parallel application should be executed. PVM applications generated by P-GRADE can migrate between different Grid sites and as a result P-GRADE guarantees reliable, fault-tolerant parallel program execution in the Grid. The GRM/PROVE performance monitoring and visualisation toolset has been extended towards the Grid and connected to a general Grid monitor (Mercury) developed in the EU GridLab project. Using the Mercury/GRM/PROVE Grid application monitoring infrastructure any parallel application launched by P-GRADE can be remotely monitored and analysed at run time even if the application migrates among Grid sites. P-GRADE supports workflow definition and co-ordinated multi-job execution for the Grid. Such workflow management can provide parallel execution at both inter-job and intra-job level. Automatic checkpoint mechanism for parallel programs supports the migration of parallel jobs inside the workflow providing a fault-tolerant workflow execution mechanism. The paper describes all of these features of P-GRADE and their implementation concepts. 相似文献

7.

Optimizing Grid-Based Workflow Execution

Gurmeet Singh Carl Kesselman Ewa Deelman 《Journal of Grid Computing》2005,3(3-4):201-219

Large-scale applications can be expressed as a set of tasks with data dependencies between them, also known as application workflows. Due to the scale and data processing requirements of these applications, they require Grid computing and storage resources. So far, the focus has been on developing easy to use interfaces for composing these workflows and finding an optimal mapping of tasks in the workflow to the Grid resources in order to minimize the completion time of the application. After this mapping is done, a workflow execution engine is required to run the workflow over the mapped resources. In this paper, we show that the performance of the workflow execution engine in executing the workflow can also be a critical factor in determining the workflow completion time. Using Condor as the workflow execution engine, we examine the various factors that affect the completion time of a fine granularity astronomy workflow. We show that changing the system parameters that influence these factors and restructuring the workflow can drastically reduce the completion time of this class of workflows. We also examine the effect on the optimizations developed for the astronomy application on a coarser granularity biology application. We were able to reduce the completion time of the Montage and the Tomography application workflows by 90% and 50%, respectively. 相似文献

8.

Adaptive service scheduling for workflow applications in Service-Oriented Grid 总被引：1，自引：1，他引：0

Sung Ho Chin Taeweon Suh Heon Chang Yu 《The Journal of supercomputing》2010,52(3):253-283

When the workflow application is executed in Service-Oriented Grid (SOG), performance issues such as service scheduling should be considered, to achieve high and stable performance in execution. However, most of the prior works on workflow management neither study the performance issues nor provide evaluation methodologies on the performance of Grid Services. Therefore, it is infeasible to apply for the service scheduling problem in SOG. In this paper, we propose and model evaluation metrics for the Grid Service performance. The metrics are extracted based on common properties of Grid Services and are used to quantify and evaluate the performance of an individual Grid Service. With these metrics, we develop a service scheduling scheme with a list scheduling heuristic, to choose proper and optimal Grid Services for tasks in workflow applications. It ensures high performance in the execution of the workflow applications. In addition, we propose a low-overhead rescheduling method, referred to as Adaptive List Scheduling for Service (ALSS), to adapt to the dynamic nature of a grid environment. ALSS provides stable performance for workflow applications, even in abnormal circumstances. Finally, we design an experimental environment with actual traces and perform simulations to quantify the benefits of our approach. Throughout the experiments, we demonstrate that ALSS outperforms conventional scheduling methods. Our scheme produces a scheduling performance that is superior to AHEFT by 50.2%, SLACK by 50.8%, HEFT by 68.3%, MaxMin by 72.0%, MinMin by 71.0%, and Myopic by 69.8%. 相似文献

9.

A Taxonomy of Workflow Management Systems for Grid Computing 总被引：12，自引：0，他引：12

Jia Yu Rajkumar Buyya 《Journal of Grid Computing》2005,3(3-4):171-200

With the advent of Grid and application technologies, scientists and engineers are building more and more complex applications to manage and process large data sets, and execute scientific experiments on distributed resources. Such application scenarios require means for composing and executing complex workflows. Therefore, many efforts have been made towards the development of workflow management systems for Grid computing. In this paper, we propose a taxonomy that characterizes and classifies various approaches for building and executing workflows on Grids. We also survey several representative Grid workflow systems developed by various projects world-wide to demonstrate the comprehensiveness of the taxonomy. The taxonomy not only highlights the design and engineering similarities and differences of state-of-the-art in Grid workflow systems, but also identifies the areas that need further research. 相似文献

10.

A Grid resource brokering strategy based on resource and network performance in Grid

Valliyammai Chinnaiah^{Author Vitae} Thamarai Selvi Somasundaram Author Vitae 《Future Generation Computer Systems》2012,28(3):491-499

To achieve high performance distributed data access and computing in Grid environment, monitoring of resource and network performance is vital. Our proposed Grid network monitoring architecture is modeled by the Grid scheduler. The proposed Grid network monitoring retrieves network metrics using sensors as network monitoring tools. The mobile agents are migrated to start the sensors to measure the network metrics in all Grid Resources from the Resource Broker. The raw data provided by the monitoring tools is used to produce a high level view of the Grid through the set of internal cost functions. The network cost function is formed by combining various network metrics such as bandwidth, Round Trip Time, jitter and packet loss to measure the network performance. This paper presents the Grid Resource Brokering strategy which analyzes the network metrics along with the resource metrics for the selection of the Grid resource to submit the job and the proposed approach is integrated with CARE Resource Broker (CRB) for job submission. The experimental results are evident for the minimization of job completion time for the submitted job. The simulation results also prove that the more number of jobs are completed with the proposed strategy which influences the better utilization of the Grid resources. 相似文献

11.

A Case Study into Using Common Real-Time Workflow Monitoring Infrastructure for Scientific Workflows

Karan Vahi Ian Harvey Taghrid Samak Daniel Gunter Kieran Evans David Rogers Ian Taylor Monte Goode Fabio Silva Eddie Al-Shakarchi Gaurang Mehta Ewa Deelman Andrew Jones 《Journal of Grid Computing》2013,11(3):381-406

Scientific workflow systems support various workflow representations, operational modes, and configurations. Regardless of the system used, end users have common needs: to track the status of their workflows in real time, be notified of execution anomalies and failures automatically, perform troubleshooting, and automate the analysis of the workflow results. In this paper, we describe how the Stampede monitoring infrastructure was integrated with the Pegasus Workflow Management System and the Triana Workflow Systems, in order to add generic real time monitoring and troubleshooting capabilities across both systems. Stampede is an infrastructure that provides interoperable monitoring using a three-layer model: (1) a common data model to describe workflow and job executions; (2) high-performance tools to load workflow logs conforming to the data model into a data store; and (3) a common query interface. This paper describes the integration of Stampede monitoring architecture with Pegasus and Triana and shows the new analysis capabilities that Stampede provides to these workflow systems. The successful integration of Stampede with these workflow engines demonstrates the generic nature of the Stampede monitoring infrastructure and its potential to provide a common platform for monitoring across scientific workflow engines. 相似文献

12.

Abstract,link, publish,exploit: An end to end framework for workflow sharing

《Future Generation Computer Systems》2017

Scientific workflows are increasingly used to manage and share scientific computations and methods to analyze data. A variety of systems have been developed that store the workflows executed and make them part of public repositories However, workflows are published in the idiosyncratic format of the workflow system used for the creation and execution of the workflows. Browsing, linking and using the stored workflows and their results often becomes a challenge for scientists who may only be familiar with one system. In this paper we present an approach for addressing this issue by publishing and exploiting workflows as data on the Web with a representation that is independent from the workflow system used to create them. In order to achieve our goal, we follow the Linked Data Principles to publish workflow inputs, intermediate results, outputs and codes; and we reuse and extend well established standards like W3C PROV. We illustrate our approach by publishing workflows and consuming them with different tools designed to address common scenarios for workflow exploitation. 相似文献

13.

Multiple Workflow Scheduling Strategies with User Run Time Estimates on a Grid

Adán Hirales-Carbajal Andrei Tchernykh Ramin Yahyapour José Luis González-García Thomas R?blitz Juan Manuel Ramírez-Alcaraz 《Journal of Grid Computing》2012,10(2):325-346

In this paper, we present an experimental study of deterministic non-preemptive multiple workflow scheduling strategies on a Grid. We distinguish twenty five strategies depending on the type and amount of information they require. We analyze scheduling strategies that consist of two and four stages: labeling, adaptive allocation, prioritization, and parallel machine scheduling. We apply these strategies in the context of executing the Cybershake, Epigenomics, Genome, Inspiral, LIGO, Montage, and SIPHT workflows applications. In order to provide performance comparison, we performed a joint analysis considering three metrics. A case study is given and corresponding results indicate that well known DAG scheduling algorithms designed for single DAG and single machine settings are not well suited for Grid scheduling scenarios, where user run time estimates are available. We show that the proposed new strategies outperform other strategies in terms of approximation factor, mean critical path waiting time, and critical path slowdown. The robustness of these strategies is also discussed. 相似文献

14.

Achieving Interoperation of Grid Data Resources via Workflow Level Integration

Tamas Kiss Tamas Kukla 《Journal of Grid Computing》2009,7(3):355-374

Production Grids are becoming widely utilized by the e-Science community to run computation and data intensive experiments more efficiently. Unfortunately, different production Grid infrastructures are based on different middleware technologies, both for computation and for data access. Although there is significant effort from the Grid community to standardize the underlying middleware, solutions that allow existing non-standard tools to interoperate are one of the major concerns of Grid users today. This paper describes the generic requirements towards the interoperation of Grid data resources within computational workflows, and suggests integration techniques that allow workflow engines to access various heterogeneous data resources during workflow execution. Reference implementations of these techniques are presented and recommendations on their applicability and suitability are made. 相似文献

15.

Campus Grids Meet Applications: Modeling, Metascheduling and Integration

Yonghong Yan Barbara M. Chapman 《Journal of Grid Computing》2006,4(2):159-175

Air Quality Forecasting (AQF) is a new discipline that attempts to reliably predict atmospheric pollution. An AQF application has complex workflows and in order to produce timely and reliable forecast results, each execution requires access to diverse and distributed computational and storage resources. Deploying AQF on Grids is one option to satisfy such needs, but requires the related Grid middleware to support automated workflow scheduling and execution on Grid resources. In this paper, we analyze the challenges in deploying an AQF application in a campus Grid environment and present our current efforts to develop a general solution for Grid-enabling scientific workflow applications in the GRACCE project. In GRACCE, an application’s workflow is described using GAMDL, a powerful dataflow language for describing application logic. The GRACCE metascheduling architecture provides the functionalities required for co-allocating Grid resources for workflow tasks, scheduling the workflows and monitoring their execution. By providing an integrated framework for modeling and metascheduling scientific workflow applications on Grid resources, we make it easy to build a customized environment with end-to-end support for application Grid deployment, from the management of an application and its dataset, to the automatic execution and analysis of its results.The work has been performed as part of the University of Houston’s Sun Microsystems Center of Excellence in Geosciences [38]. 相似文献

16.

基于随机良构工作流网的工作流进程性能分析

沙静庞善臣《计算机科学》2011,38(4):226-229

工作流进程中的QoS受很多非功能性因素的影响,例如性能、可靠性以及安全性等等。对QoS度量的管理直接影响到参与工作流具体应用的服务能否顺利完成。因此,当服务被工作流或Web进程创建或管理时,底层的工作流引擎必须估计、检测并控制用户的QoS。基于已有的工作流模型分解算法,利用随机良构工作流网的数值分析,给出了分析工作流进程QoS的一种新方法,并结合实例加以验证。相似文献

17.

Modeling and Managing Interactions among Business Processes 总被引：3，自引：0，他引：3

Fabio Casati Angela Discenza 《Journal of Systems Integration》2001,10(2):145-168

Most workflow management systems (WfMSs) only support the separate andindependent execution of business processes. However, processes often needto interact with each other, in order to synchronize the execution of theiractivities, to exchange process data, to request execution of services, orto notify progresses in process execution. Recent market trends also raisethe need for cooperation and interaction between processes executed in differentorganizations, posing additional challenges. In fact, in order to reduce costsand provide better services, companies are pushed to increase cooperation and toform virtual enterprises, where business processes span across organizationalboundaries and are composed of cooperating workflows executed in differentorganizations. Workflow interaction in a cross-organizational environment iscomplicated by the heterogeneity of workflow management platforms on top ofwhich workflows are defined and executed and by the different and possiblycompeting business policies and business goals that drive process executionin each organization.In this paper we propose a model and system that enable interactionbetween workflows executed in the same or in different organizations. Weextend traditional workflow models by allowing workflows to publish andsubscribe to events, and by enabling the definition of points in the processexecution where events should be sent or received. Event notifications aremanaged by a suitable event service that is capable of filtering andcorrelating events, and of dispatching them to the appropriate targetworkflow instances. The extended model can be easily mapped onto anyworkflow model, since event specific constructs can be specified by means ofordinary workflow activities, for which we provide the implementation. Inaddition, the event service is easily portable to different platforms, anddoes not require integration with the WfMS that supports the cooperatingworkflows. Therefore, the proposed approach is applicable in virtually anyenvironment and is independent on the specific platform adopted 相似文献

18.

GeneGrid: Architecture, Implementation and Application

P. V. Jithesh P. Donachy T. Harmer N. Kelly R. Perrott S. Wasnik J. Johnston M. McCurley M. Townsley S. McKee 《Journal of Grid Computing》2006,4(2):209-222

The emergence of Grid computing technology has opened up an unprecedented opportunity for biologists to share and access data, resources and tools in an integrated environment leading to a greater chance of knowledge discovery. GeneGrid is a Grid computing framework that seamlessly integrates a myriad of heterogeneous resources spanning multiple administrative domains and locations. It provides scientists an integrated environment for the streamlined access of a number of bioinformatics programs and databases through a simple and intuitive interface. It acts as a virtual bioinformatics laboratory by allowing scientists to create, execute and manage workflows that represent bioinformatics experiments. A number of cooperating Grid services interact in an orchestrated manner to provide this functionality. This paper gives insight into the details of the architecture, components and implementation of GeneGrid. 相似文献

19.

Using imbalance metrics to optimize task clustering in scientific workflow executions

《Future Generation Computer Systems》2015

Scientific workflows can be composed of many fine computational granularity tasks. The runtime of these tasks may be shorter than the duration of system overheads, for example, when using multiple resources of a cloud infrastructure. Task clustering is a runtime optimization technique that merges multiple short running tasks into a single job such that the scheduling overhead is reduced and the overall runtime performance is improved. However, existing task clustering strategies only provide a coarse-grained approach that relies on an over-simplified workflow model. In this work, we examine the reasons that cause Runtime Imbalance and Dependency Imbalance in task clustering. Then, we propose quantitative metrics to evaluate the severity of the two imbalance problems. Furthermore, we propose a series of task balancing methods (horizontal and vertical) to address the load balance problem when performing task clustering for five widely used scientific workflows. Finally, we analyze the relationship between these metric values and the performance of proposed task balancing methods. A trace-based simulation shows that our methods can significantly decrease the runtime of workflow applications when compared to a baseline execution. We also compare the performance of our methods with two algorithms described in the literature. 相似文献

20.

A probabilistic approach to modeling and estimating the QoS of web-services-based workflows 总被引：1，自引：0，他引：1

San-Yih Hwang Haojun Wang Jaideep Srivastava 《Information Sciences》2007,177(23):5484-5503

Web services promise to become a key enabling technology for B2B e-commerce. One of the most-touted features of Web services is their capability to recursively construct a Web service as a workflow of other existing Web services. The quality of service (QoS) of Web-services-based workflows may be an essential determinant when selecting constituent Web services and determining the service-level agreement with users. To make such a selection possible, it is essential to estimate the QoS of a WS workflow based on the QoSs of its constituent WSs. In the context of WS workflow, this estimation can be made by a method called QoS aggregation. While most of the existing work on QoS aggregation treats the QoS as a deterministic value, we argue that due to some uncertainty related to a WS, it is more realistic to model its QoS as a random variable, and estimate the QoS of a WS workflow probabilistically. In this paper, we identify a set of QoS metrics in the context of WS workflows, and propose a unified probabilistic model for describing QoS values of a broader spectrum of atomic and composite Web services. Emulation data are used to demonstrate the efficiency and accuracy of the proposed approach. 相似文献