Found 20 similar documents (search time: 15 ms)
1.
2.
Simona Bernardi Juan L. Domínguez Abel Gómez Christophe Joubert José Merseguer Diego Perez-Palacin José I. Requeno Alberto Romeu 《Empirical Software Engineering》2018,23(6):3394-3441
Software performance engineering is a mature field that offers methods to assess system performance. Process mining is a promising research field applied to gain insight into system processes. The interplay of these two fields opens promising applications in industry. In this work, we report our experience applying a methodology, based on process mining techniques, for the performance assessment of a commercial data-intensive software application. The methodology has successfully assessed the scalability of future versions of this system. Moreover, it has identified bottleneck components and the replication needed to fulfill business rules. The system, an integrated port operations management system, has been developed by Prodevelop, a medium-sized software enterprise with high expertise in geospatial technologies. The performance assessment was carried out by a team composed of practitioners and researchers. Finally, the paper offers an in-depth discussion of the lessons learned during the experience, which will be useful for practitioners adopting the methodology and for researchers seeking new routes.
3.
Gang Wu Huxing Zhang Meikang Qiu Zhong Ming Jiayin Li Xiao Qin 《Journal of Parallel and Distributed Computing》2013
Nowadays, there is an increasing demand to monitor, analyze, and control large-scale distributed systems. Events detected during monitoring are temporally correlated, which is helpful for resource allocation, job scheduling, and failure prediction. To discover the correlations among detected events, many existing approaches concentrate the detected events into an event database and perform data mining on it. We argue that these approaches do not scale to large distributed systems, as monitored events grow so fast that event correlation discovery can hardly be done with the power of a single computer. In this paper, we present a decentralized approach to efficiently detect events, filter irrelevant events, and discover their temporal correlations. We propose a MapReduce-based algorithm, MapReduce-Apriori, to mine event association rules, which utilizes the computational resources of multiple dedicated nodes of the system. Experimental results show that our decentralized event correlation mining algorithm achieves nearly ideal speedup compared to centralized mining approaches.
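The paper's MapReduce-Apriori algorithm is not reproduced here, but the map/count/filter shape it builds on can be sketched in a minimal single-process form (the event windows, function names, and support threshold below are illustrative assumptions, not taken from the paper):

```python
from itertools import combinations
from collections import Counter

def map_phase(event_window, k):
    """Emit (candidate itemset, 1) pairs for every k-item combination
    of events observed together in one monitoring window."""
    return [(frozenset(c), 1) for c in combinations(sorted(set(event_window)), k)]

def reduce_phase(pairs, min_support):
    """Sum the counts per candidate itemset and keep the frequent ones."""
    counts = Counter()
    for itemset, n in pairs:
        counts[itemset] += n
    return {s: c for s, c in counts.items() if c >= min_support}

# Event windows as they might be collected by independent monitoring nodes
windows = [["disk_full", "job_fail"], ["disk_full", "job_fail"],
           ["net_drop", "job_fail"]]
pairs = [p for w in windows for p in map_phase(w, 2)]
frequent = reduce_phase(pairs, min_support=2)
# {"disk_full", "job_fail"} co-occurs twice and survives the support filter
```

In an actual MapReduce deployment the map calls would run on the monitoring nodes and the reduce would be sharded by itemset; the sketch only shows the data flow.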
4.
5.
A novel approach for process mining based on event types
Lijie Wen Jianmin Wang Wil M. P. van der Aalst Biqing Huang Jiaguang Sun 《Journal of Intelligent Information Systems》2009,32(2):163-190
Despite the omnipresence of event logs in transactional information systems (cf. WFM, ERP, CRM, SCM, and B2B systems), historic information is rarely used to analyze the underlying processes. Process mining aims at improving this by providing techniques and tools for discovering process, control, data, organizational, and social structures from event logs; i.e., the basic idea of process mining is to diagnose business processes by mining event logs for knowledge. Given its potential and challenges, it is no surprise that process mining has recently become a vibrant research area. In this paper, a novel approach for process mining based on two event types, i.e., START and COMPLETE, is proposed. Information about the start and completion of tasks can be used to explicitly detect parallelism. The algorithm presented in this paper overcomes some of the limitations of existing algorithms such as the α-algorithm (e.g., short loops) and therefore enhances the applicability of process mining.
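The core idea of using START and COMPLETE events to detect parallelism explicitly can be illustrated with a small sketch (the task names and timestamps are invented for illustration; this is not the paper's actual algorithm):

```python
def overlaps(a, b):
    """Two tasks ran in parallel if their START..COMPLETE intervals overlap."""
    (a_start, a_end), (b_start, b_end) = a, b
    return a_start < b_end and b_start < a_end

# (START, COMPLETE) timestamps for each task in one trace -- hypothetical log
trace = {"A": (0, 5), "B": (3, 8), "C": (9, 12)}

parallel = {tuple(sorted((x, y)))
            for x in trace for y in trace
            if x < y and overlaps(trace[x], trace[y])}
# A and B overlap in [3, 5); C starts only after both have completed
```

With only atomic events (a single timestamp per task), this overlap test is impossible, which is why interleaving-based algorithms such as the α-algorithm must infer parallelism indirectly.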
6.
Wil M. P. van der Aalst 《Distributed and Parallel Databases》2013,31(4):471-507
The practical relevance of process mining is increasing as more and more event data become available. Process mining techniques aim to discover, monitor and improve real processes by extracting knowledge from event logs. The two most prominent process mining tasks are: (i) process discovery: learning a process model from example behavior recorded in an event log, and (ii) conformance checking: diagnosing and quantifying discrepancies between observed behavior and modeled behavior. The increasing volume of event data provides both opportunities and challenges for process mining. Existing process mining techniques have problems dealing with large event logs referring to many different activities. Therefore, we propose a generic approach to decompose process mining problems. The decomposition approach is generic and can be combined with different existing process discovery and conformance checking techniques. It is possible to split computationally challenging process mining problems into many smaller problems that can be analyzed easily and whose results can be combined into solutions for the original problems.
7.
The quality of knowledge in a knowledge repository determines the effectiveness of knowledge reuse and sharing. Knowledge to be reused should be checked in advance through a knowledge maintenance process. The knowledge maintenance process model is difficult to construct because of the trade-off between efficiency and effectiveness. In this paper, process mining is applied to analyze knowledge maintenance logs in order to discover the underlying process and then construct a more appropriate knowledge maintenance process model. We analyze knowledge maintenance logs from the control-flow perspective to find a good characterization of knowledge maintenance tasks and their dependencies. In addition, the logs are analyzed from the organizational perspective to cluster the performers who are qualified to do the same kinds of tasks and to derive the relations among these clusters. The proposed approach has been applied in a knowledge management system. The experimental results show that our approach is feasible and efficient.
8.
An improved session identification algorithm for Web access logs
To address the session identification problem in Web log mining, this paper improves the Timeout method and the reference-length method and proposes an improved session identification approach. The approach uses the site's topology to dynamically set a time-gap threshold for each page, tying each page's threshold to its importance. It also defines content pages flexibly and introduces heuristic rules for them, overcoming the reference-length method's inherent limitation that a session contains only one content page. The approach improves the accuracy of session identification, and experimental results show that it is effective.
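A per-page timeout rule of this kind can be sketched in a few lines (the page names, threshold values, and function are illustrative assumptions, not the paper's actual thresholds, which are derived from site topology):

```python
def split_sessions(requests, thresholds, default=30 * 60):
    """Split one user's click stream into sessions. A new session starts
    when the gap after a page exceeds that page's own timeout threshold,
    so important content pages tolerate longer dwell times than
    navigation pages. `requests` is a list of (page, timestamp) pairs."""
    sessions, current = [], []
    for i, (page, ts) in enumerate(requests):
        if current:
            prev_page, prev_ts = requests[i - 1]
            if ts - prev_ts > thresholds.get(prev_page, default):
                sessions.append(current)
                current = []
        current.append((page, ts))
    if current:
        sessions.append(current)
    return sessions

# Hypothetical per-page thresholds in seconds
thresholds = {"/index": 120, "/article": 1800}
clicks = [("/index", 0), ("/article", 60), ("/index", 2100), ("/article", 2160)]
sessions = split_sessions(clicks, thresholds)
# gap after /article is 2040 s > 1800 s, so the stream splits into two sessions
```

The classical Timeout method is the special case where `thresholds` is empty and every page falls back to the single global `default`.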
9.
A session identification algorithm for Web log mining
Session identification is a key step in Web log mining, yet many methods produce imprecise sessions. To address this problem, this paper builds on the widely used Timeout method and proposes an improved identification method based on an average time threshold. By dynamically computing the average time gap between the requests within a session and adjusting each page's time threshold individually, instead of applying a single a priori threshold to all user pages as traditional methods do, the method identifies long sessions more accurately. Finally, a second identification pass is performed over the generated candidate session set, making the identified sessions more reasonable and effective. Experimental results show that session quality is improved.
10.
This paper presents the design and implementation of a SCADA system for brine extraction and transportation based on a Siemens S7-300 PLC and the ForceControl (力控) configuration software, and uses examples from production use to illustrate the system's functions, including data acquisition, control, measurement, parameter adjustment, and alarm signaling for the equipment operating in each area of the brine extraction and transportation site.
11.
12.
Lee J. Wells Fadel M. Megahed Cory B. Niziolek Jaime A. Camelio William H. Woodall 《Journal of Intelligent Manufacturing》2013,24(6):1267-1279
Statistical process control (SPC) methods have been extensively applied to monitor the quality performance of manufacturing processes in order to quickly detect and correct out-of-control conditions. As sensor and measurement technologies advance, there is a continual need to adapt and refine SPC methods to effectively and efficiently use these new data-sets. One of the most state-of-the-art dimensional measurement technologies currently being implemented in industry is the 3D laser scanner, which rapidly provides millions of data points to represent an entire manufactured part's surface. Consequently, these data have great potential to detect unexpected faults, i.e., faults that are not captured by measuring a small number of predefined dimensions. However, in order for this potential to be realized, SPC methods capable of handling these large data-sets need to be developed. This paper presents an approach to performing SPC using point clouds obtained through a 3D laser scanner. The proposed approach transforms high-dimensional point clouds into linear profiles through the use of Q–Q plots, which can be monitored by well-established profile monitoring techniques. In this paper, point clouds are simulated to determine the performance of the proposed approach under varying fault scenarios. In addition, experimental studies were performed to determine the effectiveness of the proposed approach using actual point cloud data. The results of these experiments show that the proposed approach can significantly improve the monitoring capabilities for manufactured parts that are characterized by complex surface geometries.
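The transformation of a point cloud into a linear Q–Q profile can be sketched roughly as follows (a toy one-dimensional version with invented deviation values; the paper's method operates on full 3D scan data):

```python
from statistics import NormalDist

def qq_profile(deviations, n_points=9):
    """Reduce a cloud of surface deviations to a linear Q-Q profile:
    empirical quantiles of the deviations paired with standard-normal
    quantiles. An in-control part yields a near-straight line that
    standard profile-monitoring charts can track."""
    xs = sorted(deviations)
    qs = [(i + 0.5) / n_points for i in range(n_points)]
    empirical = [xs[min(int(q * len(xs)), len(xs) - 1)] for q in qs]
    theoretical = [NormalDist().inv_cdf(q) for q in qs]
    return list(zip(theoretical, empirical))

# Deviations (mm) of scanned points from the nominal surface -- toy data
profile = qq_profile([0.01, -0.02, 0.00, 0.03, -0.01, 0.02, -0.03, 0.01, 0.00])
```

Because both coordinates of the profile are monotone by construction, a fault that shifts or distorts the deviation distribution bends or displaces the line, which is what the downstream profile-monitoring chart detects.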
13.
This paper proposes an improved session identification method based on a site's home page and navigation pages, taking the home page or a navigation page as the marker that a new session has started. Real Web logs are selected, the improved method is implemented in PL/SQL, and it is compared with existing methods. Experimental results show that the improved method identifies sessions more effectively than existing methods.
14.
Recently, researchers discovered that one of the major problems in mining event logs is to discover a simple, sound, and complete process model. But since mining techniques can only reproduce the behaviour recorded in the log, the fitness of the reproduced model is a function of the event log's completeness. In this paper, a Fuzzy-Genetic Mining model based on Bayesian Scoring Functions (FGM-BSF), which we call a probabilistic approach, was developed to tackle problems that emanate from incomplete event logs. The main motivation for using genetic mining for process discovery is to benefit from the global search performed by the algorithm. Incompleteness in processes involves uncertainty and is tackled by using the probabilistic nature of the scoring functions in a Bayesian network based on fuzzy logic value prediction. The global search performed by the genetic approach is well suited to dealing with a population that has both good and bad individuals. Hence, the proposed approach helps to build a robust fitness function for the genetic algorithm through high-lift traces representing only good individuals not detected by a mining model without an intelligent system. Our approach was implemented on the Java platform with MySQL for event log parsing and preprocessing, while the actual discovery was done in ProM. The results showed that the proposed approach achieved a fitness of 0.98 when compared with existing schemes.
15.
Jeffrey Xu Yu Zhiheng Li Guimei Liu 《The VLDB Journal The International Journal on Very Large Data Bases》2008,17(4):947-970
Data mining has attracted a lot of research effort during the past decade. However, little work has been reported on the efficiency of supporting a large number of users who issue different data mining queries periodically, when there are new needs and when data is updated. Our work is motivated by the fact that the pattern-growth method is one of the most efficient methods for frequent pattern mining; it constructs an initial tree and mines frequent patterns on top of the tree. In this paper, we present a data mining proxy approach that can reduce the I/O cost of constructing an initial tree by utilizing trees that are already resident in memory. The tree we construct is the smallest for a given data mining query. In addition, our proxy approach can also reduce the CPU cost of mining patterns, because the cost of mining depends on the sizes of the trees. The focus of this work is to construct an initial tree efficiently. We propose three tree operations to construct a tree. With a unique coding scheme, we can efficiently project subtrees from on-disk trees or in-memory trees. Our performance study indicates that the data mining proxy significantly reduces the I/O cost to construct trees and the CPU cost to mine patterns over the trees constructed.
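The prefix-tree structure that pattern-growth methods build on can be sketched minimally (this is a bare FP-style tree insert showing prefix sharing, not the paper's proxy, coding scheme, or tree operations; the transactions are invented):

```python
class Node:
    """A node in a prefix (FP-style) tree: item label, support count, children."""
    def __init__(self, item=None):
        self.item, self.count, self.children = item, 0, {}

def insert(root, transaction):
    """Insert one sorted transaction, sharing prefixes with earlier ones
    and incrementing the support count along the shared path."""
    node = root
    for item in transaction:
        node = node.children.setdefault(item, Node(item))
        node.count += 1

root = Node()
for t in [["a", "b", "c"], ["a", "b"], ["a", "d"]]:
    insert(root, t)
# "a" is the shared prefix of all three transactions, so its count is 3
```

The I/O savings the paper targets come from reusing such trees already in memory instead of rebuilding them from the raw transaction data for every query.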
16.
A policy-based process mining framework: mining business policy texts for discovering process models
Jiexun Li Harry Jiannan Wang Zhu Zhang J. Leon Zhao 《Information Systems and E-Business Management》2010,8(2):169-188
Many organizations use business policies to govern their business processes, often resulting in huge amounts of policy documents. As new regulations such as Sarbanes-Oxley arise, these business policies must be modified to ensure their correctness and consistency. Given the large amounts of business policies, manually analyzing policy documents to discover process information is very time-consuming and imposes an excessive workload. In order to provide a solution to this information overload problem, we propose a novel approach named Policy-based Process Mining (PBPM) to automatically extract process information from policy documents. Several text mining algorithms are applied to business policy texts in order to discover process-related policies and extract such process components as tasks, data items, and resources. Experiments are conducted to validate the extracted components, and the results are found to be very promising. To the best of our knowledge, PBPM is the first approach that applies text mining to discovering business process components from unstructured policy documents. The initial research results presented in this paper will require more research effort to make PBPM a practical solution.
17.
A novel process monitoring scheme is proposed to compensate for shortcomings in the conventional independent component analysis (ICA)-based monitoring method. The primary idea is first to augment the observed data matrix in order to take the process dynamics into consideration. An outlier rejection rule is then proposed to screen out outliers, in order to better describe the majority of the data. Finally, a rectangular measure is used as a monitoring statistic. The proposed approach is investigated via three cases: a simulation example, the Tennessee Eastman process, and a real industrial case. Results indicate that the proposed method is more efficient than alternative methods.
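The data-matrix augmentation step, stacking each observation with its recent past so that a static monitoring method also sees process dynamics, can be sketched as follows (the lag count and sample values are illustrative; the paper's exact augmentation may differ):

```python
def augment(samples, lags=1):
    """Build a time-lagged data matrix: each row concatenates the current
    sample with its `lags` predecessors, so a static monitor (e.g. ICA)
    also captures the process dynamics encoded in the recent history."""
    return [sum((samples[j] for j in range(i - lags, i + 1)), [])
            for i in range(lags, len(samples))]

# Three 2-variable observations; with lags=1 each augmented row has 4 entries
X = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
X_aug = augment(X, lags=1)
```

The monitoring statistic is then computed on the rows of the augmented matrix rather than on the raw observations.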
18.
Session identification methods in Web log mining
To achieve better session identification and thus provide accurate mining data for subsequent pattern discovery, this paper analyzes the session identification methods in common use and proposes a user session identification method based on the home page of the site being mined. In line with users' browsing habits, the method takes the site's home page as the marker that a new user session has started and strengthens the definition of a user session. The method is implemented with database programming, and the identified sessions are stored in a database for use in subsequent data mining. Experimental results show that the method identifies more user sessions and identifies them with higher accuracy.
19.
A flexible approach for visual data mining
The exploration of heterogeneous information spaces requires suitable mining methods as well as effective visual interfaces. Most existing systems concentrate either on mining algorithms or on visualization techniques. This paper describes a flexible framework for visual data mining which combines analytical and visual methods to achieve a better understanding of the information space. We provide several pre-processing methods for unstructured information spaces, such as flexible hierarchy generation with user-controlled refinement. Moreover, we develop new visualization techniques, including an intuitive focus+context technique to visualize complex hierarchical graphs. A special feature of our system is a new paradigm for visualizing information structures within their frame of reference.
20.
《Future Generation Computer Systems》2007,23(1):48-54
We describe a grid-based approach for enterprise-scale data mining, which is based on leveraging parallel database technology for data storage, and on-demand compute servers for parallelism in the statistical computations. This approach is targeted towards the use of data mining in highly-automated vertical business applications, where the data is stored on one or more relational database systems, and an independent set of high-performance compute servers or a network of low-cost, commodity processors is used to improve the application performance and overall workload management. The goal of this paper is to describe an algorithmic decomposition of data mining kernels between the data storage and compute grids, which makes it possible to exploit the parallelism on the respective grids in a simple way, while minimizing the data transfer between these grids. This approach is compatible with existing standards for data mining task specification and results reporting, so that larger applications using these data mining algorithms do not have to be modified to benefit from this grid-based approach.