1.
This paper studies the problem of mining frequent itemsets along with their temporal patterns from large transaction sets. A model is proposed in which users define a large set of temporal patterns that are interesting or meaningful to them. A temporal pattern defines the set of time points where the user expects a discovered itemset to be frequent. The model is general in that (i) no constraints are placed on the interesting patterns given by the users, and (ii) two measures, inclusiveness and exclusiveness, are used to capture how well the temporal patterns match the time points given by the discovered itemsets. Intuitively, these measures indicate to what extent a discovered itemset is frequent at time points included in a temporal pattern p, but not at time points outside p. Using these two measures, one can model many temporal data mining problems that have appeared in the literature, as well as some that have not yet been studied. By exploiting the relationships within and between itemset space and pattern space simultaneously, a series of pruning techniques is developed to speed up the mining process. Experiments show that these pruning techniques yield speedups of up to 100 times over a direct extension of non-temporal data mining algorithms.
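The abstract above describes inclusiveness and exclusiveness only intuitively and gives no formulas. The following minimal sketch shows one way such measures could be computed; the two ratios, the function names, and the toy data are all assumptions made for illustration and are not the paper's definitions.

```python
from collections import defaultdict

def frequent_time_points(transactions, itemset, min_support):
    """Return the time points at which `itemset` is frequent.

    transactions: iterable of (time_point, set_of_items) pairs.
    An itemset counts as frequent at a time point when the fraction of
    that time point's transactions containing it is >= min_support.
    """
    total = defaultdict(int)
    hits = defaultdict(int)
    for time_point, items in transactions:
        total[time_point] += 1
        if itemset <= items:
            hits[time_point] += 1
    return {t for t in total if hits[t] / total[t] >= min_support}

def inclusiveness(frequent_at, pattern):
    """ASSUMED definition: share of the pattern's time points where the itemset is frequent."""
    return len(frequent_at & pattern) / len(pattern) if pattern else 0.0

def exclusiveness(frequent_at, pattern, all_points):
    """ASSUMED definition: share of time points outside the pattern where the itemset is NOT frequent."""
    outside = all_points - pattern
    return len(outside - frequent_at) / len(outside) if outside else 1.0

# Hypothetical toy data: four time points, two transactions each.
transactions = [
    (1, {"milk", "bread"}), (1, {"milk", "bread", "eggs"}),
    (2, {"milk", "bread"}), (2, {"eggs"}),
    (3, {"eggs"}),          (3, {"milk"}),
    (4, {"milk", "bread"}), (4, {"bread", "milk", "tea"}),
]
itemset = {"milk", "bread"}
pattern = {1, 2}                      # user expects the itemset to be frequent at t=1 and t=2
all_points = {t for t, _ in transactions}

frequent_at = frequent_time_points(transactions, itemset, min_support=0.6)
incl = inclusiveness(frequent_at, pattern)
excl = exclusiveness(frequent_at, pattern, all_points)
```

Under these assumed ratios, an itemset matches a pattern perfectly when both values equal 1: it is frequent at every time point in p and at none outside p.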
2.
《Expert systems with applications》2014,41(1):195-209
The processes of logistics service providers are considered highly human-centric, flexible and complex. Deviations from the standard operating procedures described in the designed process models are not uncommon and may result in significant uncertainties. Acquiring insight into the dynamics of the actual logistics processes can effectively assist in mitigating the uncovered risks and creating strategic advantages, which are the result of uncertainties with, respectively, a negative and a positive impact on the organizational objectives. In this paper a comprehensive methodology for applying process mining in logistics is presented, covering event log extraction and preprocessing as well as the execution of exploratory, performance and conformance analyses. The applicability of the presented methodology and roadmap is demonstrated with a case study at an important Chinese port that specializes in bulk cargo.
3.
余茂坦 《数字社区&智能家居》2006,(12):22-22,99
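This paper discusses the problem of knowledge acquisition in reservoir operation systems, that is, data mining (DM) for reservoir operation. Based on the role and characteristics of DM and the problems existing in the reservoir operation domain, it analyzes the need for DM in reservoir operation and the data foundation that makes DM feasible there, and points out several application directions.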
4.
余茂坦 《数字社区&智能家居》2006,(35)
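This paper discusses the problem of knowledge acquisition in reservoir operation systems, that is, data mining (DM) for reservoir operation. Based on the role and characteristics of DM and the problems existing in the reservoir operation domain, it analyzes the need for DM in reservoir operation and the data foundation that makes DM feasible there, and points out several application directions.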
5.
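Forecasting is a method and technique needed in many industries. As data accumulate, many industries now face forecasting problems over massive data sets. Starting from forecasting methods based on mining massive data, this paper presents a model of a data mining forecasting system, describes the concrete forecasting process through an industry case, and finally analyzes how forecasting results are evaluated and selected.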
6.
An Automatic Evaluation System for Knowledge in KDD  Cited by: 1 total (self-citations: 0, by others: 1)
How to automatically evaluate discovered knowledge is an important problem in KDD, yet little research has addressed it. This paper presents an automatic system for evaluating knowledge and provides several solutions. It first describes the relevant concepts, then gives the construction of the system, and finally uses a case to validate it.
7.
Knowledge Discovery in Text: Text Mining Based on Information Extraction  Cited by: 9 total (self-citations: 0, by others: 9)
In the general context of Knowledge Discovery, Knowledge Discovery in Text (KDT), which uses text mining techniques to extract and induce hidden knowledge from unstructured text data, is surging in data and natural language processing research. KDT is a multidisciplinary field spanning Artificial Intelligence, Machine Learning, Natural Lan[…]ing with an emphasis on its IE (Information Extraction)-based induction and on practices oriented toward specific sublanguage fields.
8.
Ersin Ersoy; Engin Çallı; Batuhan Erdoğan; Selami Bağrıyanık; Hasan Sözer 《Journal of Software: Evolution and Process》2024,36(5):e2615
There have been success stories reported regarding the adoption of agile software development methods in the industry. There also exist observations on their limitations. One of these limitations is scalability since agile methods like Scrum were originally designed for small software teams. Scalable agile frameworks were introduced to address this limitation. We conducted an industrial case study on the adoption of such a framework, called Nexus. Our study involves quantitative and qualitative evaluation based on observations within a product development organization over a period of 12 months. Scrum is used for the development of a product during the first 6 months of this period. Nexus is used in the remaining 6 months. Data are collected throughout the whole period for measuring productivity, quality, and team member motivation. Results suggest a significant increase in productivity and product quality after switching to Nexus. Team motivation was slightly improved as well.
9.
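This paper first proposes PP, an efficient algorithm for mining frequent itemsets. It uses a tree-based representation of pattern support sets, which avoids repeatedly scanning the database and recursively building as many pattern support sets as there are frequent patterns; it is 1 to 3 orders of magnitude faster than Apriori and FPGrowth. PP is further extended into CRM-PP, an effective algorithm for discovering classification rules. CRM-PP integrates multiple-support pruning into the frequent itemset discovery phase, turning the two-phase mining approach into a single-phase one. CRM-PP is likewise 1 to 3 orders of magnitude more efficient than two-phase algorithms based on Apriori and FPGrowth.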
10.
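Sequential patterns are a very useful kind of knowledge pattern found in time-related databases, and research on methods for discovering them is of great importance. This paper presents an algorithm for mining sequential patterns in databases. By carefully studying the storage structures for intermediate and result data during mining, it greatly reduces the number of database scans and improves the algorithm's efficiency.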
11.
TANG Zhi-gui 《数字社区&智能家居》2008,(1)
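For the large volumes of data generated and accumulated in complex industrial processes, this paper analyzes the technical foundations of data mining and the characteristics of the data, proposes an integrated data mining model, and gives a data mining implementation scheme based on SQL Server 2000, offering a reference for knowledge discovery in complex industrial processes.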
12.
13.
14.
15.
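Design of an Agent-Based Knowledge Discovery Model  Cited by: 8 total (self-citations: 3, by others: 8)
Research on KDD (Knowledge Discovery in Databases) models is an important branch of the data mining field. Existing models each have their strengths but none is perfect, and they perform especially poorly in terms of intelligence. This paper designs an agent-based intelligent data mining system that uses multi-agent technology to implement information collection, preprocessing, querying, automatic knowledge extraction, and data mining, making the whole mining process knowledge-driven and intelligent. It can provide the necessary support for intelligent information systems.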
16.
Anonymity preserving pattern discovery  Cited by: 5 total (self-citations: 0, by others: 5)
Maurizio Atzori; Francesco Bonchi; Fosca Giannotti; Dino Pedreschi 《The VLDB Journal: The International Journal on Very Large Data Bases》2008,17(4):703-727
It is generally believed that data mining results do not violate the anonymity of the individuals recorded in the source database. In fact, data mining models and patterns, in order to ensure a required statistical significance, represent a large number of individuals and thus conceal individual identities: this is the case of the minimum support threshold in frequent pattern mining. In this paper we show that this belief is ill-founded. By shifting the concept of k-anonymity from the source data to the extracted patterns, we formally characterize the notion of a threat to anonymity in the context of pattern discovery, and provide a methodology to efficiently and effectively identify all such possible threats that arise from the disclosure of the set of extracted patterns. On this basis, we obtain a formal notion of privacy protection that allows the disclosure of the extracted knowledge while protecting the anonymity of the individuals in the source database. Moreover, in order to handle the cases where the threats to anonymity cannot be avoided, we study how to eliminate such threats by means of pattern (not data!) distortion performed in a controlled way.
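To make the idea of a threat arising from disclosed patterns concrete, here is a minimal sketch of how published support counts alone can reveal a small group of individuals. It uses standard inclusion-exclusion over supports; the function names, the threat criterion (a derived count strictly between 0 and k), and the toy supports are assumptions for illustration, not the detection procedure from the paper.

```python
from itertools import combinations

def channel_count(supports, base, full):
    """Number of records that contain every item in `base` and none of the
    items in `full - base`, derived from itemset supports alone via
    inclusion-exclusion. Raises KeyError if an intermediate itemset is missing."""
    extra = sorted(full - base)
    total = 0
    for r in range(len(extra) + 1):
        for added in combinations(extra, r):
            total += (-1) ** r * supports[base | frozenset(added)]
    return total

def anonymity_threats(supports, k):
    """List (base, full, count) triples whose derived count is in (0, k).

    ASSUMPTION: such a count is treated as a threat because it isolates
    fewer than k individuals, even though every published pattern is frequent."""
    threats = []
    patterns = list(supports)
    for full in patterns:
        for base in patterns:
            if base < full:  # proper subset
                try:
                    n = channel_count(supports, base, full)
                except KeyError:
                    continue  # some intermediate itemset was not published
                if 0 < n < k:
                    threats.append((base, full, n))
    return threats

# Hypothetical published supports, all far above a minimum support threshold.
supports = {
    frozenset("a"): 100,
    frozenset("b"): 150,
    frozenset("ab"): 98,
}
threats = anonymity_threats(supports, k=5)
```

In this toy case both `a` and `ab` are well supported, yet their difference shows that exactly 2 records contain `a` without `b`, which falls below k = 5.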
17.
Ricardo Pérez-Castillo; Ignacio García-Rodríguez de Guzmán; Mario Piattini; Ángeles S. Places 《Software》2012,42(2):159-189
Business processes have become one of the key assets of organizations, since these processes allow them to discover and control what occurs in their environments, with information systems automating most of an organization's processes. Unfortunately, as a result of uncontrolled maintenance, information systems age over time until it is necessary to replace them with new and modernized systems. However, while systems are aging, meaningful business knowledge that is not present in any of the organization's other assets gradually becomes embedded in them. The preservation of this knowledge through the recovery of the underlying business processes is, therefore, a critical problem. As a solution to this problem, the paper provides a model-driven procedure for recovering business processes from legacy information systems. The procedure proposes a set of models at different abstraction levels, along with the model transformations between them. The paper also provides a supporting tool, which facilitates its adoption. Moreover, a real-life case study concerning an e-government system applies the proposed recovery procedure to validate its effectiveness and efficiency. The case study was carried out by following a formal protocol to improve its rigor and replicability. Copyright © 2011 John Wiley & Sons, Ltd.
18.
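Research on Ontology-Based Web Classification Techniques  Cited by: 2 total (self-citations: 3, by others: 2)
This paper proposes an abstract, ontology-based Web mining model. It first uses an ontology to represent the domain to be mined, then converts the data collected from users into tables, and finally performs knowledge discovery according to definitions and formulas. The abstract Web mining model can extract approximate concepts hidden behind large amounts of information in the Semantic Web, thereby achieving knowledge discovery.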
19.
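Structured data mining and complex-type data mining are both related and distinct. How to unify the two in a single theoretical framework that can guide research on data mining and knowledge discovery has become a pressing problem. This paper proposes UMKDSS, a unified state-space model of knowledge discovery, which links structured data mining with complex-type data mining and provides theoretical guidance for the latter. The paper concludes with an application example of UMKDSS in Web text mining.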