Discovering Social Networks from Event Logs   Cited by: 5 (self-citations: 0, citations by others: 5)
Process mining techniques allow for the discovery of knowledge based on so-called “event logs”, i.e., a log recording the execution of activities in some business process. Many information systems provide such logs, e.g., most WFM, ERP, CRM, SCM, and B2B systems record transactions in a systematic way. Process mining techniques typically focus on performance and control-flow issues. However, event logs typically also log the performer, e.g., the person initiating or completing some activity. This paper focuses on mining social networks using this information. For example, it is possible to build a social network based on the hand-over of work from one performer to the next. By combining concepts from workflow management and social network analysis, it is possible to discover and analyze social networks. This paper defines metrics, presents a tool, and applies these to a real event log within the setting of a large Dutch organization.
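The hand-over-of-work idea mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not give the paper's metric definitions, so everything here is an assumption: the log format (case id, activity, performer), the function name `handover_network`, and the choice to count only direct successions between two different performers within one case.

```python
from collections import Counter, defaultdict


def handover_network(events):
    """Count direct hand-overs of work between performers.

    events: iterable of (case_id, activity, performer) tuples, assumed to be
    ordered by time within each case (hypothetical log format).
    """
    performers_by_case = defaultdict(list)
    for case_id, _activity, performer in events:
        performers_by_case[case_id].append(performer)

    handovers = Counter()
    for performers in performers_by_case.values():
        # Each consecutive pair of events in a case is one potential hand-over.
        for source, target in zip(performers, performers[1:]):
            if source != target:  # assumption: ignore work staying with one person
                handovers[(source, target)] += 1
    return handovers


# Illustrative log, not taken from the paper.
log = [
    ("case1", "register", "Ann"),
    ("case1", "check", "Bob"),
    ("case1", "decide", "Carol"),
    ("case2", "register", "Ann"),
    ("case2", "check", "Bob"),
    ("case2", "decide", "Bob"),
]
network = handover_network(log)
```

The resulting counter maps directed performer pairs to edge weights, which can then be handed to any graph library for social network analysis.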

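A Granular Computing Approach to Knowledge Discovery in Relational Databases   Cited by: 1 (self-citations: 0, citations by others: 1)
Qiu Taorong, Liu Qing, Huang Houkuan. Acta Automatica Sinica, 2009, 35(8): 1071-1079
A granular computing method is proposed for mining multidimensional, multilevel association rules of different granularities from relational databases or information systems. First, a framework for knowledge discovery from relational databases or information systems is given, based on the partition model of granular computing. Second, a granular computing method for generating frequent k-itemsets is proposed. Finally, the method is illustrated with a practical example, tested on two different types of data sets under different support thresholds, and compared with two classical methods. The test results show that the proposed method is effective, and that granular computing makes the semantics of the association rules clearer and easier to understand.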
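One common way to read frequent k-itemset generation in granular terms is sketched below. It is only an illustration, not the algorithm of the paper, which the abstract does not spell out. The assumptions are: each attribute-value pair defines an elementary granule (the set of record ids carrying that value), the support of an itemset is the size of the intersection of its granules, and candidates are enumerated by brute force without pruning. All names and data are hypothetical.

```python
from itertools import combinations


def elementary_granules(rows):
    """Map each (attribute, value) pair to the set of record ids that carry it."""
    granules = {}
    for record_id, row in enumerate(rows):
        for attribute, value in row.items():
            granules.setdefault((attribute, value), set()).add(record_id)
    return granules


def frequent_itemsets(rows, k, min_support):
    """Return k-itemsets whose granule intersection has at least min_support records."""
    granules = elementary_granules(rows)
    result = {}
    for items in combinations(sorted(granules), k):
        # An itemset may use each attribute at most once.
        if len({attribute for attribute, _ in items}) < k:
            continue
        records = set.intersection(*(granules[item] for item in items))
        if len(records) >= min_support:
            result[items] = len(records)
    return result


# Illustrative relation, not taken from the paper.
rows = [
    {"age": "young", "income": "high", "buys": "yes"},
    {"age": "young", "income": "low", "buys": "no"},
    {"age": "old", "income": "high", "buys": "yes"},
    {"age": "young", "income": "high", "buys": "yes"},
]
frequent_pairs = frequent_itemsets(rows, k=2, min_support=2)
```

Because every itemset is tied to an explicit set of records, the meaning of a rule can be read directly from the granules it intersects.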

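Computer integrated manufacturing systems (CIMS) for the process industries adopt a three-layer BPS/MES/PCS architecture. The article notes that the limitations of the existing three-layer CIMS architecture have been clearly identified, and in response proposes a new data platform based on data mining and data storage technologies. A unified data platform is designed using knowledge discovery techniques; it manages the enterprise's explicit knowledge and discovers implicit knowledge in production and management activities. The results show that the proposed modern integrated automation system for the process industries has a complete structure for information collection and knowledge sharing.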

Large databases are accumulating records of objects such as stock market fluctuations, medical treatments, and weather changes in a given area, where each record consists of multiple attributes whose values change over time. Our work is motivated by prediction, which distinguishes it from the work in [4, 5, 8, 11]. We want to help learn from past data and make informed decisions for the future. This paper contributes to the theory and development of temporal data mining.

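To address the large volumes of data generated and accumulated in complex industrial processes, the technical foundations of data mining and the characteristics of the data are analyzed, an integrated data mining model is proposed, and a data mining implementation scheme based on SQL Server 2000 is given, providing a reference for knowledge discovery in complex industrial processes.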

Efficient Rule-Based Attribute-Oriented Induction for Data Mining   Cited by: 3 (self-citations: 0, citations by others: 3)
Data mining has become an important technique with tremendous potential in many commercial and industrial applications. Attribute-oriented induction is a powerful mining technique and has been successfully implemented in the data mining system DBMiner (Han et al. Proc. 1996 Int'l Conf. on Data Mining and Knowledge Discovery (KDD'96), Portland, Oregon, 1996). However, its induction capability is limited by unconditional concept generalization. In this paper, we extend concept generalization to rule-based concept hierarchies, which greatly enhances its induction power. When a previously proposed induction algorithm is applied to the more general rule-based case, a problem of induction anomaly occurs, which impacts its efficiency. We have developed an efficient algorithm that facilitates induction in the rule-based case and avoids the anomaly. Performance studies have shown that the algorithm is superior to a previously proposed algorithm based on backtracking.

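Petri nets are a powerful modeling tool and have been widely used for modeling and analyzing business processes, but the basic Petri net offers no support for cost analysis of business processes. The paper first introduces the definition of the basic Petri net. Then, to meet the need for cost budget analysis in practical business process modeling, it extends the basic Petri net with a price factor, defining the price Petri net and its transition firing rule. It next defines the concept of a priced state space and gives an algorithm for constructing the reachable priced state space. Finally, an example shows that price Petri nets can effectively support cost analysis of business processes.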
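The idea of a priced firing rule and a reachable priced state space can be sketched as follows. The abstract does not give the formal definitions, so this is an illustration under stated assumptions: a price is attached to each transition, firing a transition adds its price to an accumulated cost, and a priced state is a (marking, accumulated cost) pair. The net, the prices, and all function names are hypothetical.

```python
from collections import deque


def normalize(marking):
    """Hashable form of a marking that ignores empty places."""
    return frozenset((place, n) for place, n in marking.items() if n > 0)


def enabled(marking, transition):
    """A transition is enabled if every input place holds enough tokens."""
    return all(marking.get(place, 0) >= n for place, n in transition["inputs"].items())


def fire(marking, cost, transition):
    """Fire an enabled transition: move tokens and add the transition's price."""
    new_marking = dict(marking)
    for place, n in transition["inputs"].items():
        new_marking[place] -= n
    for place, n in transition["outputs"].items():
        new_marking[place] = new_marking.get(place, 0) + n
    return new_marking, cost + transition["price"]


def reachable_priced_states(initial, transitions, max_states=1000):
    """Breadth-first construction of reachable (marking, cost) pairs.

    max_states bounds the search, since a cyclic net can accumulate cost forever.
    """
    seen = {(normalize(initial), 0)}
    queue = deque([(initial, 0)])
    while queue and len(seen) < max_states:
        marking, cost = queue.popleft()
        for transition in transitions:
            if enabled(marking, transition):
                new_marking, new_cost = fire(marking, cost, transition)
                key = (normalize(new_marking), new_cost)
                if key not in seen:
                    seen.add(key)
                    queue.append((new_marking, new_cost))
    return seen


# Illustrative order-handling net, not taken from the paper.
transitions = [
    {"name": "check", "inputs": {"received": 1}, "outputs": {"checked": 1}, "price": 5},
    {"name": "approve_manually", "inputs": {"checked": 1}, "outputs": {"done": 1}, "price": 20},
    {"name": "approve_automatically", "inputs": {"checked": 1}, "outputs": {"done": 1}, "price": 8},
]
states = reachable_priced_states({"received": 1}, transitions)
final_costs = sorted(cost for marking, cost in states if ("done", 1) in marking)
```

Comparing the costs of the states that share the final marking shows how alternative routes through the process differ in price.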

In this paper, we propose two sampling theories of rule discovery based on generality and accuracy. The first theory concerns the worst case: it extends a preliminary version of PAC learning, which represents a worst-case analysis for classification. In our analysis, a rule is defined as a probabilistic constraint of true assignment to the class attribute for corresponding examples, and we mainly analyze the case in which we try to avoid finding a bad rule. Effectiveness of our approach is demonstrated through examples for conjunction-rule discovery. The second theory concerns a distribution-based case: it represents the conditions that a rule exceeds pre-specified thresholds for generality and accuracy with high reliability. The idea is to assume a 2-dimensional normal distribution for two probabilistic variables, and obtain the conditions based on their confidence region. This approach has been validated experimentally using 21 benchmark data sets in the machine learning community against conventional methods each of which evaluates the reliability of generality. Discussions on related work are provided for PAC learning, multiple comparison, and analysis of association-rule discovery.

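Service-oriented computing has been a research hotspot in recent years; on the software side it is mainly reflected in Web service discovery, service selection, and service composition. A service discovery method oriented toward automated reasoning is proposed. It establishes a Petri net based flow control method and an automaton reasoning model, and gives several theorems and properties to demonstrate the feasibility of service discovery. This enables automatic identification in service discovery and effective service composition. Finally, an analysis of a set of ordering services from Amazon shows that the method is feasible and effective.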

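ARA, an algorithm for mining approximate association rules based on directed atomic rules, is proposed to support exploratory knowledge discovery guided by user interest. ARA uses a two-level hash table structure to count the items in database transactions that correspond to directed atomic rules. Composite rules are obtained by combining the antecedent items of frequent atomic rules and estimating their support and confidence. ARA requires only a single database scan. Experimental results show that ARA is fast and uses little memory, making it well suited to large databases and to data mining environments that demand fast response.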
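The single-scan counting step described in this abstract can be sketched with a two-level hash table. The abstract does not define ARA's data layout or its estimation formulas, so the following is an assumption throughout: an atomic rule is taken to be a one-item antecedent pointing at one user-chosen target item, the outer table is keyed by antecedent and the inner table by target, and the estimation of composite rules is not reproduced. All names and data are hypothetical.

```python
from collections import defaultdict


def count_atomic_rules(transactions, targets):
    """One pass over the transactions, filling a two-level hash table.

    pair_counts[antecedent][target] counts transactions containing both items;
    item_counts[item] counts transactions containing the item.
    """
    pair_counts = defaultdict(lambda: defaultdict(int))
    item_counts = defaultdict(int)
    for transaction in transactions:
        items = set(transaction)
        for item in items:
            item_counts[item] += 1
        for target in items & targets:
            for antecedent in items - {target}:
                pair_counts[antecedent][target] += 1
    return pair_counts, item_counts


def atomic_rules(transactions, targets, min_support, min_confidence):
    """Return {(antecedent, target): (support, confidence)} above both thresholds."""
    pair_counts, item_counts = count_atomic_rules(transactions, set(targets))
    total = len(transactions)
    rules = {}
    for antecedent, inner in pair_counts.items():
        for target, count in inner.items():
            support = count / total
            confidence = count / item_counts[antecedent]
            if support >= min_support and confidence >= min_confidence:
                rules[(antecedent, target)] = (support, confidence)
    return rules


# Illustrative transactions, not taken from the paper.
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter"],
    ["bread", "milk", "jam"],
]
rules = atomic_rules(transactions, {"milk"}, min_support=0.5, min_confidence=0.8)
```

Restricting the inner table to the targets the user cares about is what keeps the memory footprint small in this sketch.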

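By analyzing and studying various workflow modeling methods, this paper proposes a workflow modeling method based on component technology. It introduces the concept of a component workflow net and new modeling elements, which make workflow process descriptions more expressive and improve readability and reusability. A modeling application example is given, and directions for further research are pointed out.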

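The paper summarizes the latest theoretical and application progress in data mining and knowledge discovery methods, reviews in detail some key technologies in research and application, and finally gives an outlook on future theoretical and application trends in data mining and knowledge discovery.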

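Research on a Rough-NN Model for Geographic Information Knowledge Acquisition   Cited by: 1 (self-citations: 0, citations by others: 1)
A rough set neural network model that combines rough sets with neural networks is proposed for knowledge acquisition from geographic information with high autocorrelation. The main idea is to use the discernibility matrix to form a reduction algorithm that yields the simplest if-then rules, and then to construct a three-layer neural network that simulates these rules, where the network's inputs and outputs are determined by the parameter training method proposed in the paper. The model is implemented in VB, and simulation experiments are carried out on flood and drought disasters in the Songhua River basin. The results show that the model can quickly obtain the simplest if-then rules and reach correct decisions.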
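The discernibility matrix that the reduction step relies on can be sketched using the standard rough set definition: for each pair of objects with different decisions, the matrix entry is the set of condition attributes on which the two objects differ, and attributes that appear as singleton entries form the core. The paper's own reduction algorithm and its neural network stage are not reproduced here; the attribute names and data are hypothetical.

```python
def discernibility_matrix(objects, condition_attrs, decision_attr):
    """Entries for object pairs (i, j), i < j, whose decisions differ."""
    matrix = {}
    for i in range(len(objects)):
        for j in range(i + 1, len(objects)):
            if objects[i][decision_attr] != objects[j][decision_attr]:
                matrix[(i, j)] = frozenset(
                    attr for attr in condition_attrs
                    if objects[i][attr] != objects[j][attr]
                )
    return matrix


def core(matrix):
    """Attributes that are the only way to discern some pair of objects."""
    return {next(iter(entry)) for entry in matrix.values() if len(entry) == 1}


# Illustrative decision table, not taken from the paper.
objects = [
    {"rain": "high", "soil": "wet", "flood": "yes"},
    {"rain": "high", "soil": "dry", "flood": "no"},
    {"rain": "low", "soil": "dry", "flood": "no"},
]
matrix = discernibility_matrix(objects, ["rain", "soil"], "flood")
core_attrs = core(matrix)
```

Any reduct must contain the core attributes, which is why they are a natural starting point for deriving short if-then rules.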

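Mutual exclusion is an effective way to resolve resource usage conflicts and achieve resource sharing, but simple mutual exclusion methods cause some problems for synchronization. Starting from the Petri net representation of mutually exclusive processes, improved and optimized solutions for mutually exclusive processes are analyzed and proposed. Using the concept of synchronic distance, the logical synchronic distance, temporal synchronic distance, and data synchronization strategy of mutually exclusive processes of different kinds are analyzed and calculated, demonstrating the advantages of the optimized solution in reducing synchronic distance, system running time, and resource occupancy.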

This paper presents a text mining approach for discovering knowledge in texts in order to later construct decision support systems. Text mining can take advantage of knowledge stored in textual documents, reducing the effort for knowledge acquisition. The approach consists in performing a mining process on concepts present in texts instead of working with words. The assumption is that concepts represent real world events and characteristics better than words, allowing the understanding and the explanation of the reasoning used in decision processes. The proposed approach extracts concepts expressed in natural phrases, and then analyzes their distributions and associations. Concept distributions and associations are used to characterize classes or situations. After the discovery process, the obtained knowledge can be embedded in automated systems to classify elements or to suggest actions or solutions to problems. In this paper, experiments using the approach in a psychiatric domain are discussed. Concepts extracted from textual medical records represent patients' symptoms, signs, and social/behavioral characteristics. An automatic system was constructed with the approach: a classifier whose goal is to help physicians in disease diagnoses. Results from this system show that the approach is feasible for constructing decision support systems with satisfactory performance.

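Application of Data-Mining-Based Knowledge Discovery to Optimal Dispatching of Hydropower Stations   Cited by: 1 (self-citations: 0, citations by others: 1)
The paper mainly discusses the application of knowledge discovery based on data mining techniques in hydropower dispatching systems. A data-mining-based knowledge discovery method is proposed, the concept of a topological space of knowledge vector sets is established, and an uncertain knowledge representation method based on vector sets in that topological space is proposed.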

Advanced Scout: Data Mining and Knowledge Discovery in NBA Data   Cited by: 1 (self-citations: 0, citations by others: 1)
Advanced Scout is a PC-based data mining application used by National Basketball Association (NBA) coaching staffs to discover interesting patterns in basketball game data. We describe Advanced Scout software from the perspective of data mining and knowledge discovery. This paper highlights the pre-processing of raw data that the program performs, describes the data mining aspects of the software and how the interpretation of patterns supports the process of knowledge discovery. The underlying technique of attribute focusing as the basis of the algorithm is also described. The process of pattern interpretation is facilitated by allowing the user to relate patterns to video tape.

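Design of an Agent-Based Knowledge Discovery Model   Cited by: 8 (self-citations: 3, citations by others: 8)
Research on KDD (Knowledge Discovery in Database) models is an important branch of the data mining field. Existing models each have their strengths, but none is perfect, and they perform particularly poorly in terms of intelligence. The article designs an agent-based intelligent data mining system that uses multi-agent technology to implement information collection, preprocessing, querying, automatic knowledge extraction, and data mining, making the whole mining process knowledge-driven and intelligent. It can provide the necessary support for intelligent information systems.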

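Research Status and Prospects of Rough Set Knowledge Discovery   Cited by: 18 (self-citations: 4, citations by others: 14)
By examining the historical development of rough set knowledge discovery theory and discussing the current state of research, and with reference to the main existing rough set knowledge discovery systems, the paper identifies open problems in rough set knowledge discovery and gives an outlook on research in the coming years.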

This paper describes how succinct rules, which reduce the size of decision tables, can be found by employing multiple-valued logic (MVL). Two multiple-valued algebras are described, one based on level detection, and the other on literal functions. Then a decision table which had also been reduced in size using rough set theory is reduced using both algebras, and it is seen that all three approaches lead to reductions of comparable simplicity. The new methods require coding the values of each attribute as integers. Then an MVL function that maps the coding of the condition attributes to the coding of the decision attribute is found. As the coded table is sparse, only some of the basis functions for each algebra are required. Then a simple approach requiring the reduction of a matrix to row echelon form is used to find all suitable MVL functions. By decomposing a function in terms of its variables, a complete set of rules can be found. The MVL function encodes the data in a very compact form, and its decomposition into subfunctions reveals a good way to slice up the table into subtables. The structure of the subfunctions can then be used to simplify each subtable until compact sets of rules emerge. Alternatively, rules can be found by substitution into the MVL function. Encoding a decision table using MVL makes the data easy to manipulate and can uncover relationships that may not become apparent when using other methods.

