Similar Literature
20 similar documents found (search time: 31 ms)
1.
2.
3.
Workflow technology continues to play an important role as a means for specifying and enacting computational experiments in modern science. Reusing and re-purposing workflows allow scientists to do new experiments faster, since the workflows capture useful expertise from others. As workflow libraries grow, scientists face the challenge of finding workflows appropriate for their task, understanding what each workflow does, and reusing relevant portions of a given workflow. We believe that workflows would be easier to understand and reuse if high-level views (abstractions) of their activities were available in workflow libraries. As a first step towards obtaining these abstractions, we report in this paper on the results of a manual analysis performed over a set of real-world scientific workflows from Taverna, Wings, Galaxy and VisTrails. Our analysis has resulted in a set of scientific workflow motifs that outline (i) the kinds of data-intensive activities observed in workflows (Data-Operation motifs), and (ii) the different manners in which activities are implemented within workflows (Workflow-Oriented motifs). These motifs help to identify the functionality of the steps in a given workflow, to develop best practices for workflow design, and to develop approaches for the automated generation of workflow abstractions.
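As a hint of what machine-readable motif annotations might look like, here is a small Python sketch; the concrete motif names below are illustrative examples within the two families the abstract identifies, not the paper's full catalogue.

```python
from enum import Enum

# Illustrative motif labels in the two families described above.
class DataOperationMotif(Enum):
    DATA_RETRIEVAL = "retrieve external data"
    DATA_PREPARATION = "filter/clean/format data"
    DATA_ANALYSIS = "run the core analysis"
    DATA_VISUALIZATION = "render results"

class WorkflowOrientedMotif(Enum):
    STATEFUL_INVOCATION = "step keeps state between calls"
    COMPOSITE_STEP = "step wraps a sub-workflow"

# Annotating steps with motifs yields the high-level view argued for above;
# the step names are hypothetical.
steps = {
    "fetch_genes": DataOperationMotif.DATA_RETRIEVAL,
    "normalize": DataOperationMotif.DATA_PREPARATION,
    "cluster": DataOperationMotif.DATA_ANALYSIS,
}
print({name: motif.value for name, motif in steps.items()})
```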

4.
Workflows are a popular means of automating processes in many domains, ranging from high-level business process modeling to lower-level web service orchestration. However, state-of-the-art workflow languages offer a limited set of modularization mechanisms. This results in monolithic workflow specifications, in which different concerns are scattered across the workflow and tangled with one another. This hinders the design, evolution, and reusability of workflows expressed in these languages. We address this problem through the Unify framework. This framework enables uniform modularization of workflows by supporting the specification of all workflow concerns, including crosscutting ones, in isolation from one another. These independently specified workflow concerns are connected to each other using workflow-specific connectors. To further facilitate the development of workflows, we enable the definition of concern-specific languages (CSLs) on top of the Unify framework. A CSL facilitates the expression of a family of workflow concerns by offering abstractions that map well to the concerns' domain. Thus, domain experts can add concerns to a workflow using concern-specific language constructs. We exemplify the specification of a workflow in Unify, and show the definition and application of two concern-specific languages built on top of Unify.
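To make the idea concrete, here is a conceptual Python sketch, not Unify's actual API: a core concern and a crosscutting logging concern are specified in isolation and then composed by a workflow-specific connector. All names are illustrative.

```python
# Core concern: the ordered steps of a (hypothetical) payment process.
def payment_concern():
    return ["check_credit", "charge_card", "send_receipt"]

# Crosscutting concern, kept separate from the core workflow.
def logging_concern(step):
    return f"log({step})"

# Workflow-specific connector: weave the advice in before every step.
def before_each(steps, advice):
    return [item for step in steps for item in (advice(step), step)]

woven = before_each(payment_concern(), logging_concern)
print(woven)
# ['log(check_credit)', 'check_credit', 'log(charge_card)', 'charge_card', ...]
```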

5.
6.
One application of workflow systems is the management of administrative processes characterized by the transmission of information among the users of an organization. The tasks in these processes are carried out by users responsible for confirming, modifying or adding information throughout the process. Such processes need to be defined in workflow management systems in which all the elements are precisely identified and can easily be adapted to changes in the sequence of tasks, in the users involved, or in the data transmitted from one task to another. Processes of this kind are easier to reuse when they are represented in ontologies. On the one hand, existing ontologies that represent domain elements can be reused; on the other, ontologies have an excellent expressive capacity for precisely defining tasks, their relationships, and the flow of control among them. This paper proposes a complete model, together with the necessary software tools, for tackling this issue.
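A minimal sketch of the representational idea, assuming the rdflib library and an invented vocabulary rather than the paper's actual ontology: tasks, their sequence, and responsible roles become triples, so adapting the process means editing triples.

```python
from rdflib import Graph, Literal, Namespace, RDF

# Hypothetical administrative-workflow vocabulary.
WF = Namespace("http://example.org/adminwf#")

g = Graph()
g.add((WF.ReviewRequest, RDF.type, WF.Task))
g.add((WF.ApproveRequest, RDF.type, WF.Task))
# Control flow and responsibilities as ordinary triples.
g.add((WF.ReviewRequest, WF.followedBy, WF.ApproveRequest))
g.add((WF.ReviewRequest, WF.responsibleRole, Literal("clerk")))

# Changing the task sequence is just replacing a triple, which is what makes
# the ontology representation easy to adapt when the process changes.
print(g.serialize(format="turtle"))
```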

Álvaro E. Prieto is a teaching/research assistant professor of Computer Science at the University of Extremadura, Spain. He holds an MSc in Computer Science from the University of Extremadura (2000). His Ph.D. research addresses the use of ontologies in workflows. He is currently involved in various national and regional R&D&I projects. Adolfo Lozano-Tello is a teaching/research assistant professor in the Computer Science Department at the University of Extremadura, Spain. He received his Ph.D. (2002), which earned an extraordinary-thesis award, on the selection of ontologies for software applications. He has published more than 50 papers on these topics in Software Engineering and Knowledge Engineering.

7.
A Taxonomy of Workflow Management Systems for Grid Computing   (total citations: 12; self-citations: 0; citations by others: 12)
With the advent of Grid and application technologies, scientists and engineers are building more and more complex applications to manage and process large data sets, and to execute scientific experiments on distributed resources. Such application scenarios require means for composing and executing complex workflows. Therefore, many efforts have been made towards the development of workflow management systems for Grid computing. In this paper, we propose a taxonomy that characterizes and classifies various approaches for building and executing workflows on Grids. We also survey several representative Grid workflow systems developed by various projects worldwide to demonstrate the comprehensiveness of the taxonomy. The taxonomy not only highlights the design and engineering similarities and differences of state-of-the-art Grid workflow systems, but also identifies the areas that need further research.
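One way such a taxonomy can be put to work is as a validation structure for classifying systems. The dimensions and values in this sketch are illustrative stand-ins, not the paper's exact categories.

```python
# Illustrative taxonomy dimensions and admissible values.
taxonomy = {
    "workflow design": {"structure": ["DAG", "non-DAG"],
                        "model": ["abstract", "concrete"]},
    "scheduling": {"architecture": ["centralized", "hierarchical",
                                    "decentralized"]},
    "fault tolerance": {"level": ["task-level", "workflow-level"]},
}

def classify(system, profile):
    # Check a system's profile against the taxonomy before recording it.
    for dimension, facets in profile.items():
        for facet, value in facets.items():
            if value not in taxonomy[dimension][facet]:
                raise ValueError(f"{system}: {value!r} is not a known {facet}")
    return {system: profile}

print(classify("hypothetical-engine",
               {"workflow design": {"structure": "DAG", "model": "abstract"},
                "scheduling": {"architecture": "centralized"}}))
```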

8.
Cloud computing has established itself as an interesting computational model that provides a wide range of resources, such as storage, databases and computing power, to several types of users. Recently, the concept of cloud computing was extended with the concept of federated clouds, where resources from different cloud providers are inter-connected to perform a common action (e.g. execute a scientific workflow). Users can benefit from both single-provider and federated cloud environments to execute their scientific workflows, since they can get the necessary amount of resources on demand. Several of these workflows demand high performance and parallelism techniques, since many activities are data- and compute-intensive and can execute for hours, days or even weeks. Some Scientific Workflow Management Systems (SWfMS) already provide parallelism capabilities for scientific workflows in single-provider clouds. Most of them rely on creating a virtual cluster to execute the workflow in parallel, but they also rely on the user to estimate the number of virtual machines to allocate for this virtual cluster, and most SWfMS use this initial user-defined configuration for the entire workflow execution. Dimensioning the virtual cluster is therefore a top-priority task: an under- or over-dimensioned cluster degrades workflow performance or unnecessarily increases financial costs. This dimensioning is far from trivial in a single-provider cloud, and especially so in federated clouds, due to the huge number of virtual machine types to choose from in each location and provider. In this article, we propose an approach named GraspCC-fed to produce an optimal (or near-optimal) estimate of the number of virtual machines to allocate for each workflow. GraspCC-fed extends a previously proposed GRASP-based heuristic for standalone applications to scientific workflows executed in both single-provider and federated clouds. For the experiments, GraspCC-fed was coupled to an adapted version of the SciCumulus workflow engine for federated clouds. We believe that GraspCC-fed can be an important decision-support tool for users, helping to determine an optimal virtual cluster configuration for parallel cloud-based scientific workflows.
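The dimensioning problem lends itself to a compact illustration. Below is a minimal GRASP-style sketch, not GraspCC-fed itself: it assumes hypothetical task durations, a made-up hourly VM price, and a simple list-scheduling makespan model, and searches for a good number of VMs.

```python
import random

# Hypothetical inputs: task durations in hours and an assumed hourly VM price.
tasks = [3.0, 1.5, 2.0, 4.0, 0.5, 2.5]
price_per_vm_hour = 0.10

def objective(n_vms):
    # Estimate the makespan with longest-processing-time list scheduling,
    # then scalarize time and money into a single toy cost.
    loads = [0.0] * n_vms
    for t in sorted(tasks, reverse=True):
        loads[loads.index(min(loads))] += t
    makespan = max(loads)
    return makespan + price_per_vm_hour * n_vms * makespan

def grasp(iterations=50, alpha=0.3):
    candidates = list(range(1, len(tasks) + 1))
    best = 1
    for _ in range(iterations):
        # Greedy randomized construction: sample from the restricted
        # candidate list (the best alpha-fraction of cluster sizes).
        ranked = sorted(candidates, key=objective)
        rcl = ranked[: max(1, int(alpha * len(ranked)))]
        n = random.choice(rcl)
        # Local search: move to a better neighbouring size while one exists.
        improved = True
        while improved:
            improved = False
            for nb in (n - 1, n + 1):
                if 1 <= nb <= len(tasks) and objective(nb) < objective(n):
                    n, improved = nb, True
        if objective(n) < objective(best):
            best = n
    return best

print(grasp())  # a near-optimal number of VMs for this toy instance
```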

9.
In this paper, we leverage the previous work on the SHIWA bundling format and expand on this specification in order to facilitate workflow execution within a multi-workflow environment. We introduce a scalable and robust execution pool environment that supports workflows consisting of sub-workflows built upon a multitude of different workflow engines and environments, and also provide a common workflow representation for seamless connectivity through serialization to workflow bundles. We also present a meta-workflow scenario based upon this system. Workflow bundles employ the lightweight Open Archives Initiative Object Reuse and Exchange (ORE) Web-based standard to provide a common format for representing and sharing workflows and the associated metadata required for their execution. This generalized bundling approach is already available within five workflow engines and has proven a useful environment for inter-workflow experimentation. The execution pool facilitates federated access to multiple distributed computing infrastructures supported by the underlying workflow engines subscribed to the pool. Workflow bundles are exposed using the eXtensible Messaging and Presence Protocol (XMPP), which provides the necessary communication backbone to enable multiple workflow engine agents to asynchronously publish and subscribe to bundles in meta-workflow pipelines. We present experiments showing the scalability and robustness of the pool execution approach, with results showing that overheads remain controlled for up to 150 workflow agents and that agent failures have very limited impact. We then demonstrate the applicability of our architecture by describing how a Java-based music analysis workflow can be distributed within such a multi-workflow environment consisting of the Triana and MOTEUR workflow engines.
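The pool pattern can be illustrated without a real XMPP stack. The Python sketch below substitutes an in-memory queue for XMPP pub/sub, so it only shows the shape of the interaction described above: multiple engine agents subscribed to one pool, asynchronously picking up published bundles. All names are illustrative.

```python
import queue
import threading
import time

class BundlePool:
    """In-memory stand-in for the XMPP-backed execution pool."""

    def __init__(self):
        self._bundles = queue.Queue()

    def publish(self, bundle):
        self._bundles.put(bundle)

    def subscribe(self, agent, engine):
        def worker():
            while True:
                bundle = self._bundles.get()  # blocks until a bundle arrives
                print(f"{agent} ({engine}) executes {bundle['workflow']}")
                self._bundles.task_done()
        threading.Thread(target=worker, daemon=True).start()

pool = BundlePool()
pool.subscribe("agent-1", "Triana")
pool.subscribe("agent-2", "MOTEUR")
pool.publish({"workflow": "music-analysis", "format": "SHIWA bundle"})
pool.publish({"workflow": "feature-extraction", "format": "SHIWA bundle"})
time.sleep(0.2)  # let the daemon agents drain the pool before exiting
```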

10.
Workflow management systems (WfMS) are widely used by business enterprises as tools for administrating, automating and scheduling business process activities with the available resources. Since the control-flow specifications of workflows are designed manually, they contain assumptions and errors that lead to inaccurate workflow models. Decision points, the XOR nodes in a workflow graph model, determine the path chosen toward completion of any process invocation. In this work, we show that positioning decision points at their earliest possible locations can improve process efficiency by decreasing their uncertainty and identifying redundant activities. We present novel techniques to discover these earliest positions by analyzing workflow logs and to transform the model graph accordingly. The experimental results show that the transformed model is more efficient with respect to its average execution time and uncertainty when compared to the original model.
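As a rough illustration of the log-analysis idea, and not the authors' actual technique, the sketch below finds the earliest trace position at which the eventual XOR outcome is already determined; the log and activity names are hypothetical.

```python
from collections import defaultdict

# Hypothetical event log: each trace is the sequence of activities executed,
# ending in the branch that the XOR decision eventually selected.
log = [
    ["A", "B", "C", "ship"],
    ["A", "B", "C", "ship"],
    ["A", "D", "C", "reject"],
    ["A", "D", "C", "reject"],
]

def earliest_decision_index(traces):
    """First position whose prefix already determines the final branch."""
    outcomes = [t[-1] for t in traces]
    for i in range(len(min(traces, key=len))):
        prefix_to_outcomes = defaultdict(set)
        for t, o in zip(traces, outcomes):
            prefix_to_outcomes[tuple(t[: i + 1])].add(o)
        # If every distinct prefix maps to exactly one outcome, the decision
        # point could be moved up to this activity.
        if all(len(v) == 1 for v in prefix_to_outcomes.values()):
            return i
    return None

print(earliest_decision_index(log))  # -> 1: the branch is known after A,B vs A,D
```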

11.
12.
Social workflows pervade people's everyday lives. Whenever a group of people works together on a challenging or multifaceted task, a social workflow begins. Unlike traditional business workflows, such social workflows aim at supporting processes that contain personal tasks and data. In this work, we envision a social workflow service, as part of a social network, that enables private individuals to construct social workflows according to their specific needs and to keep track of workflow execution. The proposed features for a social workflow service could help individuals accomplish their private goals. The presented idea is contrasted with established research areas and applications to show the degree of novelty of this work. It is shown how novel ideas for knowledge management, facilitated by a process-oriented case-based reasoning approach, support private individuals, and how individuals can obtain an appropriate social workflow by sharing and reusing the corresponding experience. Two empirical studies confirm the potential benefits of a social workflow service in general and the core features of the developed concept.
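The retrieval step of such a process-oriented case-based reasoning service can be hinted at with a toy similarity search; the case base, task sets, and Jaccard measure here are illustrative assumptions, not the paper's method.

```python
# Illustrative case base: each past social workflow summarised by its tasks.
cases = {
    "move-house": {"book van", "pack boxes", "notify landlord"},
    "plan-party": {"book venue", "invite guests", "order food"},
}

def retrieve(query_tasks, case_base):
    # Return the most similar stored workflow by Jaccard overlap of tasks.
    def jaccard(a, b):
        return len(a & b) / len(a | b)
    return max(case_base, key=lambda name: jaccard(query_tasks, case_base[name]))

print(retrieve({"invite guests", "book venue", "hire band"}, cases))
# -> 'plan-party': the most reusable past experience for this new goal
```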

13.
14.
Extract, Transform and Load (ETL) processes organized as workflows play an important role in data warehousing. As ETL workflows are usually complex, various ETL facilities have been developed to address their control-flow process modeling and execution control. To evaluate the quality of ETL facilities, synthetic ETL workflow test cases, consisting of both control-flow and data-flow aspects, are needed to check ETL facility functionality at construction time and to validate the correctness and performance of ETL facilities at run time. Although some synthetic workflow and data-set test-case generation approaches exist in the literature, little work considers both aspects at the same time specifically for ETL workflow generators. To address this issue, this paper proposes a schema-aware ETL workflow generator with which users can characterize their ETL workflows by various parameters and obtain ETL workflow test cases with a control flow of ETL activities, compliant schemas, and associated recordsets. Our generator consists of three steps. First, given the type and ratio of individual activities and their connection characteristic parameters, the generator produces ETL activities and forms an ETL skeleton that determines how the generated activities cooperate with each other. Second, given schema transformation characteristic parameters, e.g. ranges of numbers of attributes, the generator resolves attribute dependencies and refines input/output schemas with compliant attributes and their data types. In the last step, recordsets are generated following cardinality specifications. In our experiments, ETL workflows in specific patterns are produced to show the expressiveness of the generator, and further experiments generating thousands of ETL workflow test cases in seconds verify its usability.
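The three generation steps can be sketched as follows. The parameter names, activity types, and the purely linear skeleton are simplifying assumptions for illustration, not the generator's real interface.

```python
import random

# Hypothetical characterization parameters mirroring the three steps above.
params = {
    "n_activities": 5,
    "type_ratio": {"filter": 0.4, "join": 0.2, "aggregate": 0.4},
    "attrs_range": (3, 6),
    "cardinality": 10,
}

def generate(p):
    # Step 1: draw activity types by ratio and chain them into a linear
    # skeleton (a real generator supports richer connection patterns).
    types = random.choices(list(p["type_ratio"]),
                           weights=list(p["type_ratio"].values()),
                           k=p["n_activities"])
    skeleton = [(f"act{i}", t) for i, t in enumerate(types)]
    # Step 2: give every activity an output schema within the attribute range.
    schemas = {name: [f"{name}_a{j}"
                      for j in range(random.randint(*p["attrs_range"]))]
               for name, _ in skeleton}
    # Step 3: materialize a recordset of the requested cardinality for the
    # source activity's schema.
    source = skeleton[0][0]
    records = [{a: random.random() for a in schemas[source]}
               for _ in range(p["cardinality"])]
    return skeleton, schemas, records

skeleton, schemas, records = generate(params)
print(skeleton[0], len(records))
```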

15.
This study explores the relationship between primary care physicians' interactions with health information technology and primary care workflow. Clinical encounters were recorded with high-resolution video cameras to capture physicians' workflow and their interaction with two objects of interest: the electronic health record (EHR) system and the patient. To analyze the data, a coding scheme was developed, based on a validated list of primary care tasks, to define the presence or absence of a task, the time spent on each task, and the sequence of tasks. Results revealed divergent workflows and significant differences in physicians' EHR use surrounding common workflow tasks: gathering information, documenting information, and recommending/discussing treatment options. These differences suggest impacts of EHR use on primary care workflow and capture types of workflows that can inform future studies with larger sample sizes toward more effective designs of EHR systems in primary care clinics. Future research on this topic and design strategies for effective health information technology in primary care are discussed.
Relevance to industry: This paper presents the effect of EHR use on the workflow of a primary care visit. Understanding physicians' interaction styles can inform the design of specific features of future health IT systems for more effective and efficient workflow in outpatient settings.

16.
An important challenge for the adoption of cloud computing in the scientific community remains the efficient allocation and execution of data-intensive scientific workflows to reduce execution time and the size of transferred data. The transferred-data overhead is becoming significant with emerging scientific workflows whose input/output files and intermediate data products range in the hundreds of gigabytes. The allocation of scientific workflows on public clouds can be described through a variety of perspectives and parameters, and has been proved to be NP-complete. This paper proposes an evolutionary approach for task allocation on public clouds considering data transfer and execution time. In our framework, a solution is represented using an allocation chromosome, which encodes the allocation of tasks to nodes, and an ordering chromosome, which defines the execution order according to the scientific workflow representation. We propose a multi-objective optimization that relies on a cloud cost model and employs tailored evolution operators. Starting from a population of possible solutions, we employ crossover and mutation operators on both chromosomes, aiming to optimize the data transferred between nodes as well as the total workflow runtime. The crossover operators combine parts of solutions to reduce data overhead, whereas the mutation operators swap parts of the same chromosome according to pre-defined rules. Our experimental study compares the proposed approach with current state-of-the-art approaches using synthetic and real-life workflows. Our algorithm performs similarly to existing heuristics for small workflows and shows improvements of up to 80 % for larger synthetic workflows. To further validate our approach, we compare the allocation and scheduling obtained by our approach with those obtained by popular scientific workflow managers when real workflows with hundreds of tasks are executed on a public cloud. The results show a 10 % improvement in runtime over existing schedulers, caused by an 80 % reduction in transferred data and by optimized allocation and ordering of tasks. This improved data locality has greater impact as it can be employed to improve and study data provenance and to facilitate data persistence for scientific workflows.
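The dual-chromosome encoding can be sketched compactly. The illustration below assumes the representation described above (an allocation chromosome plus an ordering chromosome); the operators shown are a standard reassignment mutation and order crossover (OX), not necessarily the paper's tailored operators, and a real implementation must additionally respect task dependencies.

```python
import random

n_tasks, n_nodes = 6, 3

# allocation[i] = node that runs task i; ordering = execution order of tasks.
allocation = [random.randrange(n_nodes) for _ in range(n_tasks)]
ordering = random.sample(range(n_tasks), n_tasks)

def mutate_allocation(alloc):
    # Reassign one randomly chosen task to a different node.
    child = alloc[:]
    i = random.randrange(len(child))
    child[i] = random.choice([n for n in range(n_nodes) if n != child[i]])
    return child

def crossover_ordering(p1, p2):
    # Order crossover (OX): keep a slice of p1, fill the rest in p2's order,
    # so the child stays a valid permutation of the tasks.
    a, b = sorted(random.sample(range(len(p1)), 2))
    child = [None] * len(p1)
    child[a:b] = p1[a:b]
    rest = [t for t in p2 if t not in p1[a:b]]
    for i in range(len(child)):
        if child[i] is None:
            child[i] = rest.pop(0)
    return child

print(mutate_allocation(allocation))
print(crossover_ordering(ordering, random.sample(range(n_tasks), n_tasks)))
```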

17.
18.
Nowadays, more and more computer-based scientific experiments need to handle massive amounts of data. Their data processing consists of multiple computational steps with dependencies among them. A data-intensive scientific workflow is useful for modeling such a process. Since the sequential execution of data-intensive scientific workflows may take a long time, Scientific Workflow Management Systems (SWfMSs) should enable their parallel execution and exploit resources distributed across different infrastructures, such as grids and clouds. This paper provides a survey of data-intensive scientific workflow management in SWfMSs and of their parallelization techniques. Based on a SWfMS functional architecture, we give a comparative analysis of the existing solutions. Finally, we identify research issues for improving the execution of data-intensive scientific workflows in a multisite cloud.

19.
In order to design workflows in changing and dynamic environments, a flexible, correct, and rapid realization of activity-flow models is required. In particular, techniques are needed to design workflows capable of adapting themselves effectively when exceptional situations occur during process execution. The authors present an approach to flexible workflow design based on rules and patterns, developed in the framework of the WIDE project. Rules allow a high degree of flexibility during workflow design by modeling exceptional aspects of the workflow separately from the main activity flow. Patterns model frequently occurring exceptional situations in a generalized way by providing the designer with rule skeletons and suggestions for their instantiation, together with indications of relationships with other rules, with the activity flow, and with related information. Pattern-based design relies on a pattern catalog containing patterns to be reused and on a formal basis for specializing and instantiating the available patterns.
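The rule-and-pattern idea can be sketched as event-condition-action triples with a reusable pattern skeleton. The names below are illustrative and do not reproduce WIDE's actual rule language.

```python
from dataclasses import dataclass
from typing import Callable

# A minimal event-condition-action rule, modeled separately from the main
# activity flow, in the spirit of the approach described above.
@dataclass
class ECARule:
    event: str
    condition: Callable[[dict], bool]
    action: Callable[[dict], None]

def deadline_pattern(task, hours, escalate):
    # A reusable "deadline expired" pattern skeleton, instantiated per task.
    return ECARule(
        event=f"timer:{task}",
        condition=lambda ctx: ctx["elapsed_hours"] > hours,
        action=escalate,
    )

rule = deadline_pattern("approve_order", 48,
                        lambda ctx: print("escalating case", ctx["case_id"]))
ctx = {"case_id": 7, "elapsed_hours": 50}
if rule.condition(ctx):   # the exceptional situation occurred
    rule.action(ctx)      # react without touching the main activity flow
```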

20.