共查询到20条相似文献,搜索用时 15 毫秒
1.
An automated process discovery technique generates a process model from an event log recording the execution of a business process. For it to be useful, the generated process model should be as simple as possible, while accurately capturing the behavior recorded in, and implied by, the event log. Most existing automated process discovery techniques generate flat process models. When confronted to large event logs, these approaches lead to overly complex or inaccurate process models. An alternative is to apply a divide-and-conquer approach by decomposing the process into stages and discovering one model per stage. It turns out, however, that existing divide-and-conquer process discovery approaches often produce less accurate models than flat discovery techniques, when applied to real-life event logs. This article proposes an automated method to identify business process stages from an event log and an automated technique to discover process models based on a given stage-based process decomposition. An experimental evaluation shows that: (i) relative to existing automated process decomposition methods in the field of process mining, the proposed method leads to stage-based decompositions that are closer to decompositions derived by human experts; and (ii) the proposed stage-based process discovery technique outperforms existing flat and divide-and-conquer discovery techniques with respect to well-accepted measures of accuracy and achieves comparable results in terms of model complexity. 相似文献
2.
Micha? WalickiAuthor VitaeDiogo R. FerreiraAuthor Vitae 《Data & Knowledge Engineering》2011,70(10):821-841
Finding the case id in unlabeled event logs is arguably one of the hardest challenges in process mining research. While this problem has been addressed with greedy approaches, these usually converge to sub-optimal solutions. In this work, we describe an approach to perform complete search over the search space. We formulate the problem as a matter of finding the minimal set of patterns contained in a sequence, where patterns can be interleaved but do not have repeating symbols. This represents a new problem that has not been previously addressed in the literature, with NP-hard variants and conjectured NP-completeness. We solve it in a stepwise manner, by generating and verifying a list of candidate solutions. The techniques, introduced to address various subtasks, can be applied independently for solving more specific problems. The approach has been implemented and applied in a case study with real data from a business process supported in a software application. 相似文献
3.
Companies have to adhere to compliance requirements. The compliance analysis of business operations is typically a joint effort of business experts and compliance experts. Those experts need to create a common understanding of business processes to effectively conduct compliance management. In this paper, we present a technique that aims at supporting this process. We argue that process templates generated out of compliance requirements provide a basis for negotiation among business and compliance experts. We introduce a semi-automated and iterative approach to the synthesis of such process templates from compliance requirements expressed in Linear Temporal Logic (LTL). We show how generic constraints related to business process execution are incorporated and present criteria that point at underspecification. Further, we outline how such underspecification may be resolved to iteratively build up a complete specification. For the synthesis, we leverage existing work on process mining and process restructuring. However, our approach is not limited to the control-flow perspective, but also considers direct and indirect data-flow dependencies. Finally, we elaborate on the application of the derived process templates and present an implementation of our approach. 相似文献
4.
《Expert systems with applications》2014,41(16):7291-7306
In an inter-organizational setting the manual construction of process models is challenging because the different people involved have to put together their partial knowledge about the overall process. Process mining, an automated technique to discover and analyze process models, can facilitate the construction of inter-organizational process models. This paper presents a technique to merge the input data of the different partners of an inter-organizational process in order to serve as input for process mining algorithms. The technique consists of a method for configuring and executing the merge and an algorithm that searches for links between the data of the different partners and that suggests rules to the user on how to merge the data. Tool support is provided in the open source process mining framework ProM. The method and the algorithm are tested using two artificial and three real life datasets that confirm their effectiveness and efficiency. 相似文献
5.
This paper discusses four algorithms for detecting anomalies in logs of process aware systems. One of the algorithms only marks as potential anomalies traces that are infrequent in the log. The other three algorithms: threshold, iterative and sampling are based on mining a process model from the log, or a subset of it. The algorithms were evaluated on a set of 1500 artificial logs, with different profiles on the number of anomalous traces and the number of times each anomalous traces was present in the log. The sampling algorithm proved to be the most effective solution. We also applied the algorithm to a real log, and compared the resulting detected anomalous traces with the ones detected by a different procedure that relies on manual choices. 相似文献
6.
The search for good lineal, or depth-first, spanning trees is an important aspect in the implementation of a wide assortment of graph algorithms. We consider the complexity of findingoptimal lineal spanning trees under various notions of optimality. In particular, we show that several natural problems, such as constructing a shortest or a tallest lineal tree, are NP-hard. We also address the issue of polynomial-time, near-optimization strategies for these difficult problems, showing that efficient absolute approximation algorithms cannot exist unlessP = NP.This author's research was supported in part by the Sandia University Research Program and by the National Science Foundation under Grant M IP-8603879.This author's research was supported in part by the National Science Foundation under Grants ECS-8403859 and MIP-8603879. 相似文献
7.
In real-world business processes it is often difficult to explain why some process instances take longer than usual to complete. With process mining techniques, it is possible to do an a posteriori analysis of a large number of process instances and detect the occurrence of delays, but discovering the actual cause of such delays is a different problem. For example, it may be the case that when a certain activity is performed or a certain user (or combination of users) participates in the process, the process suffers a delay. In this work, we show that it is possible to retrieve possible causes of delay based on the information recorded in an event log. The approach consists in translating the event log into a logical representation, and then applying decision tree induction to classify process instances according to duration. Besides splitting those instances into several subsets, each path in the tree yields a rule that explains why a given subset has an average duration that is higher or lower than other subsets of instances. The approach is applied in two case studies involving real-world event logs, where it succeeds in discovering meaningful causes of delay, some of which having been pointed out by domain experts. 相似文献
8.
The problem of traffic sign recognition is generally approached by first constructing a classifier, which is trained by some relevant image features extracted from traffic signs, to recognize new unknown traffic signs. Feature selection and instance selection are two important data preprocessing steps in data mining, with the former aimed at removing some irrelevant and/or redundant features from a given dataset and the latter at discarding the faulty data. However, there has thus far been no study examining the impact of performing feature and instance selection on traffic sign recognition performance. Given that genetic algorithms (GA) have been widely used for these types of data preprocessing tasks in related studies, we introduce a novel genetic-based biological algorithm (GBA). GBA fits “biological evolution” into the evolutionary process, where the most streamlined process also complies with reasonable rules. In other words, after long-term evolution, organisms find the most efficient way to allocate resources and evolve. Similarly, we closely simulate the natural evolution of an algorithm, to find an option it will be both efficient and effective. Experiments are carried out comparing the performance of the GBA and a GA based on the German Traffic Sign Recognition Benchmark. The results show that the GBA outperforms the GA in terms of the reduction rate, classification accuracy, and computational cost. 相似文献
9.
Process mining is the research domain that is dedicated to the a posteriori analysis of business process executions. The techniques developed within this research area are specifically designed to provide profound insight by exploiting the untapped reservoir of knowledge that resides within event logs of information systems. Process discovery is one specific subdomain of process mining that entails the discovery of control-flow models from such event logs. Assessing the quality of discovered process models is an essential element, both for conducting process mining research as well as for the use of process mining in practice. In this paper, a multi-dimensional quality assessment is presented in order to comprehensively evaluate process discovery techniques. In contrast to previous studies, the major contribution of this paper is the use of eight real-life event logs. For instance, we show that evaluation based on real-life event logs significantly differs from the traditional approach to assess process discovery techniques using artificial event logs. In addition, we provide an extensive overview of available process discovery techniques and we describe how discovered process models can be assessed regarding both accuracy and comprehensibility. The results of our study indicate that the HeuristicsMiner algorithm is especially suited in a real-life setting. However, it is also shown that, particularly for highly complex event logs, knowledge discovery from such data sets can become a major problem for traditional process discovery techniques. 相似文献
10.
Given a model of the expected behavior of a business process and given an event log recording its observed behavior, the problem of business process conformance checking is that of identifying and describing the differences between the process model and the event log. A desirable feature of a conformance checking technique is that it should identify a minimal yet complete set of differences. Existing conformance checking techniques that fulfill this property exhibit limited scalability when confronted to large and complex process models and event logs. One reason for this limitation is that existing techniques compare each execution trace in the log against the process model separately, without reusing computations made for one trace when processing subsequent traces. Yet, the execution traces of a business process typically share common fragments (e.g. prefixes and suffixes). A second reason is that these techniques do not integrate mechanisms to tackle the combinatorial state explosion inherent to process models with high levels of concurrency. This paper presents two techniques that address these sources of inefficiency. The first technique starts by transforming the process model and the event log into two automata. These automata are then compared based on a synchronized product, which is computed using an A* heuristic with an admissible heuristic function, thus guaranteeing that the resulting synchronized product captures all differences and is minimal in size. The synchronized product is then used to extract optimal (minimal-length) alignments between each trace of the log and the closest corresponding trace of the model. By representing the event log as a single automaton, this technique allows computations for shared prefixes and suffixes to be made only once. The second technique decomposes the process model into a set of automata, known as S-components, such that the product of these automata is equal to the automaton of the whole process model. A product automaton is computed for each S-component separately. The resulting product automata are then recomposed into a single product automaton capturing all the differences between the process model and the event log, but without minimality guarantees. An empirical evaluation using 40 real-life event logs shows that, used in tandem, the proposed techniques outperform state-of-the-art baselines in terms of execution times in a vast majority of cases, with improvements ranging from several-fold to one order of magnitude. Moreover, the decomposition-based technique leads to optimal trace alignments for the vast majority of datasets and close to optimal alignments for the remaining ones. 相似文献
11.
Christian Bessiere 《Artificial Intelligence》2009,173(11):1054-1078
We propose Range and Roots which are two common patterns useful for specifying a wide range of counting and occurrence constraints. We design specialised propagation algorithms for these two patterns. Counting and occurrence constraints specified using these patterns thus directly inherit a propagation algorithm. To illustrate the capabilities of the Range and Roots constraints, we specify a number of global constraints taken from the literature. Preliminary experiments demonstrate that propagating counting and occurrence constraints using these two patterns leads to a small loss in performance when compared to specialised global constraints and is competitive with alternative decompositions using elementary constraints. 相似文献
12.
《Expert systems with applications》2014,41(11):5340-5352
A business process (BP) consists of a set of activities which are performed in coordination in an organizational and technical environment and which jointly realize a business goal. In such context, BP management (BPM) can be seen as supporting BPs using methods, techniques, and software in order to design, enact, control, and analyze operational processes involving humans, organizations, applications, and other sources of information. Since the accurate management of BPs is receiving increasing attention, conformance checking, i.e., verifying whether the observed behavior matches a modelled behavior, is becoming more and more critical. Moreover, declarative languages are more frequently used to provide an increased flexibility. However, whereas there exist solid conformance checking techniques for imperative models, little work has been conducted for declarative models. Furthermore, only control-flow perspective is usually considered although other perspectives (e.g., data) are crucial. In addition, most approaches exclusively check the conformance without providing any related diagnostics. To enhance the accurate management of flexible BPs, this work presents a constraint-based approach for conformance checking over declarative BP models (including both control-flow and data perspectives). In addition, two constraint-based proposals for providing related diagnosis are detailed. To demonstrate both the effectiveness and the efficiency of the proposed approaches, the analysis of different performance measures related to a wide diversified set of test models of varying complexity has been performed. 相似文献
13.
Discovering Social Networks from Event Logs 总被引:5,自引:0,他引:5
Process mining techniques allow for the discovery of knowledge based on so-called “event logs”, i.e., a log recording the
execution of activities in some business process. Many information systems provide such logs, e.g., most WFM, ERP, CRM, SCM,
and B2B systems record transactions in a systematic way. Process mining techniques typically focus on performance and control-flow
issues. However, event logs typically also log the performer, e.g., the person initiating or completing some activity. This paper focuses on mining social networks using this information.
For example, it is possible to build a social network based on the hand-over of work from one performer to the next. By combining
concepts from workflow management and social network analysis, it is possible to discover and analyze social networks. This
paper defines metrics, presents a tool, and applies these to a real event log within the setting of a large Dutch organization. 相似文献
14.
Given an undirected network with positive edge costs and a positive integer , the minimum-degree constrained minimum spanning tree problem is the problem of finding a spanning tree with minimum total cost such that each non-leaf node in the tree has a degree of at least . This problem is new to the literature while the related problem with upper bound constraints on degrees is well studied. Mixed-integer programs proposed for either type of problem is composed, in general, of a tree-defining part and a degree-enforcing part. In our formulation of the minimum-degree constrained minimum spanning tree problem, the tree-defining part is based on the Miller–Tucker–Zemlin constraints while the only earlier paper available in the literature on this problem uses single and multi-commodity flow-based formulations that are well studied for the case of upper degree constraints. We propose a new set of constraints for the degree-enforcing part that lead to significantly better solution times than earlier approaches when used in conjunction with Miller–Tucker–Zemlin constraints. 相似文献
15.
Process mining can be viewed as the missing link between model-based process analysis and data-oriented analysis techniques. Lion׳s share of process mining research has been focusing on process discovery (creating process models from raw data) and replay techniques to check conformance and analyze bottlenecks. These techniques have helped organizations to address compliance and performance problems. However, for a more refined analysis, it is essential to correlate different process characteristics. For example, do deviations from the normative process cause additional delays and costs? Are rejected cases handled differently in the initial phases of the process? What is the influence of a doctor׳s experience on treatment process? These and other questions may involve process characteristics related to different perspectives (control-flow, data-flow, time, organization, cost, compliance, etc.). Specific questions (e.g., predicting the remaining processing time) have been investigated before, but a generic approach was missing thus far. The proposed framework unifies a number of approaches for correlation analysis proposed in literature, proposing a general solution that can perform those analyses and many more. The approach has been implemented in ProM and combines process and data mining techniques. In this paper, we also demonstrate the applicability using a case study conducted with the UWV (Employee Insurance Agency), one of the largest “administrative factories” in The Netherlands. 相似文献
16.
Analysing performance of business processes is an important vehicle to improve their operation. Specifically, an accurate assessment of sojourn times and remaining times enables bottleneck analysis and resource planning. Recently, methods to create respective performance models from event logs have been proposed. These works have several limitations, though: They either consider control-flow and performance information separately, or rely on an ad-hoc selection of temporal relations between events. In this paper, we introduce the Temporal Network Representation (TNR) of a log. It is based on Allen’s interval algebra, comprises the pairwise temporal relations for activity executions, and potentially incorporates the context in which these relations have been observed. We demonstrate the usefulness of the TNR for detecting (unrecorded) delays and for probabilistic mining of variants when modelling the performance of a process. In order to compare different models from the performance perspective, we further develop a framework for measuring performance fitness. Under this framework, TNR-based process discovery is guaranteed to dominate existing techniques in measuring performance characteristics of a process. In addition, we show how contextual information in terms of the congestion levels of the process can be mined in order to further improve capabilities for performance analysis. To illustrate the practical value of the proposed models, we evaluate our approaches with three real-life datasets. Our experiments show that the TNR yields an improvement in performance fitness over state-of-the-art algorithms, while congestion learning is able to accurately reconstruct congestion levels from event data. 相似文献
17.
18.
19.
Current workflow management technology offers rich support for process-oriented coordination of distributed teamwork. In this
paper, we evaluate the performance of an industrial workflow process where similar tasks can be performed by various actors
at many different locations. We analyzed a large workflow process log with state-of-the-art mining tools associated with the
ProM framework. Our analysis leads to the conclusion that there is a positive effect on process performance when workflow
actors are geographically close to each other. Our case study shows that the use of workflow technology in itself is not sufficient
to level geographical barriers between team members and that additional measures are required for a desirable performance.
相似文献
Byungduk JeongEmail: |