期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Association rules on significant rare data using second support

《国际计算机数学杂志》2012,89(1):69-80

Association rule is one of the data mining techniques involved in discovering information that represents the association among data. Data in the database sometimes appear infrequent but highly associated with a specific data. This paper proposes a technique for significant rare data by introducing second support in discovering the association rules of such data. We show that the proposed approach provides better performance as compared to standard association rules techniques. 相似文献

2.

一种关联规则增量更新算法 总被引：22，自引：0，他引：22

陈劲松施小英《计算机工程》2002,28(7):106-107

针对事务数据库的内容不断增加后相应关联规则的更新问题，提出了一种简单高效的增量式关联规则挖掘算法SFUA，并和已有的FUP算法进行了分析比较。相似文献

3.

负增量式关联规则更新算法 总被引：3，自引：0，他引：3

张师超张继连陈峰倪艾玲《计算机科学》2005,32(9):153-155

模式维护是数据挖掘中一个具有挑战性的任务.现有的增量式关联规则更新算法主要解决两种情况下的维护问题：一是最小支持度不变,而数据量增加;二是数据量不变,而改变最小支持度.本文提出了一种负增量关联规则更新算法.实验表明,该算法是有效的. 相似文献

4.

Using association rules to mine for strong approximate dependencies

Daniel Sánchez José María Serrano Ignacio Blanco Maria Jose Martín-Bautista María-Amparo Vila 《Data mining and knowledge discovery》2008,16(3):313-348

In this paper we deal with the problem of mining for approximate dependencies (AD) in relational databases. We introduce a definition of AD based on the concept of association rule, by means of suitable definitions of the concepts of item and transaction. This definition allow us to measure both the accuracy and support of an AD. We provide an interpretation of the new measures based on the complexity of the theory (set of rules) that describes the dependence, and we employ this interpretation to compare the new measures with existing ones. A methodology to adapt existing association rule mining algorithms to the task of discovering ADs is introduced. The adapted algorithms obtain the set of ADs that hold in a relation with accuracy and support greater than user-defined thresholds. The experiments we have performed show that our approach performs reasonably well over large databases with real-world data. 相似文献

5.

Assessment of data quality in accounting data with association rules

《Expert systems with applications》2014,41(5):2259-2268

Business rules are an effective way to control data quality. Business experts can directly enter the rules into appropriate software without error prone communication with programmers. However, not all business situations and possible data quality problems can be considered in advance. In situations where business rules have not been defined yet, patterns of data handling may arise in practice. We employ data mining to accounting transactions in order to discover such patterns. The discovered patterns are represented in form of association rules. Then, deviations from discovered patterns can be marked as potential data quality violations that need to be examined by humans. Data quality breaches can be expensive but manual examination of many transactions is also expensive. Therefore, the goal is to find a balance between marking too many and too few transactions as being potentially erroneous. We apply appropriate procedures to evaluate the classification accuracy of developed association rules and support the decision on the number of deviations to be manually examined based on economic principles. 相似文献

6.

Elicitation of classification rules by fuzzy data mining 总被引：1，自引：0，他引：1

Yi-Chung Hu Gwo-Hshiung Tzeng 《Engineering Applications of Artificial Intelligence》2003,16(7-8):709-716

Data mining techniques can be used to find potentially useful patterns from data and to ease the knowledge acquisition bottleneck in building prototype rule-based systems. Based on the partition methods presented in simple-fuzzy-partition-based method (SFPBM) proposed by Hu et al. (Comput. Ind. Eng. 43(4) (2002) 735), the aim of this paper is to propose a new fuzzy data mining technique consisting of two phases to find fuzzy if–then rules for classification problems: one to find frequent fuzzy grids by using a pre-specified simple fuzzy partition method to divide each quantitative attribute, and the other to generate fuzzy classification rules from frequent fuzzy grids. To improve the classification performance of the proposed method, we specially incorporate adaptive rules proposed by Nozaki et al. (IEEE Trans. Fuzzy Syst. 4(3) (1996) 238) into our methods to adjust the confidence of each classification rule. For classification generalization ability, the simulation results from the iris data demonstrate that the proposed method may effectively derive fuzzy classification rules from training samples. 相似文献

7.

Ranking discovered rules from data mining with multiple criteria by data envelopment analysis

Mu-Chen Chen 《Expert systems with applications》2007,33(4):1110-1116

In data mining applications, it is important to develop evaluation methods for selecting quality and profitable rules. This paper utilizes a non-parametric approach, Data Envelopment Analysis (DEA), to estimate and rank the efficiency of association rules with multiple criteria. The interestingness of association rules is conventionally measured based on support and confidence. For specific applications, domain knowledge can be further designed as measures to evaluate the discovered rules. For example, in market basket analysis, the product value and cross-selling profit associated with the association rule can serve as essential measures to rule interestingness. In this paper, these domain measures are also included in the rule ranking procedure for selecting valuable rules for implementation. An example of market basket analysis is applied to illustrate the DEA based methodology for measuring the efficiency of association rules with multiple criteria. 相似文献

8.

一种基于约束的关联规则挖掘算法

李广原杨炳儒周如旗《计算机科学》2012,39(1):244-247

基于约束的关联规则挖掘是一种重要的关联挖掘,能按照用户给出的条件来实行有针对性的挖掘。大多数此类算法仅处理具有一种约束的挖掘,因而其应用受到一定程度的限制。提出一种新的基于约束的关联规则挖掘算法MCAL,它同时处理两种类型的约束:非单调性约束和单调性约束。算法包括3个步骤:第一步,挖掘当前数据集的频繁1项集;第二,应用约束的性质和有效剪枝策略来寻找约束点,同时生成频繁项的条件数据库;最后,递归地应用前面两步寻找条件数据库中频繁项的约束点,以生成满足约束的全部频繁项集。通过实验对比,无论从运行时间还是可扩展性来说,本算法均达到较好的效果。相似文献

9.

An evolutionary algorithm to discover quantitative association rules from huge databases without the need for an a priori discretization 总被引：1，自引：0，他引：1

Victoria Pachón Álvarez Jacinto Mata Vázquez 《Expert systems with applications》2012,39(1):585-593

Association rules are one of the most frequently used tools for finding relationships between different attributes in a database. There are various techniques for obtaining these rules, the most common of which are those which give categorical association rules. However, when we need to relate attributes which are numeric and discrete, we turn to methods which generate quantitative association rules, a far less studied method than the above. In addition, when the database is extremely large, many of these tools cannot be used. In this paper, we present an evolutionary tool for finding association rules in databases (both small and large) comprising quantitative and categorical attributes without the need for an a priori discretization of the domain of the numeric attributes. Finally, we evaluate the tool using both real and synthetic databases. 相似文献

10.

Multi-objective PSO algorithm for mining numerical association rules without a priori discretization

《Expert systems with applications》2014,41(9):4259-4273

In the domain of association rules mining (ARM) discovering the rules for numerical attributes is still a challenging issue. Most of the popular approaches for numerical ARM require a priori data discretization to handle the numerical attributes. Moreover, in the process of discovering relations among data, often more than one objective (quality measure) is required, and in most cases, such objectives include conflicting measures. In such a situation, it is recommended to obtain the optimal trade-off between objectives. This paper deals with the numerical ARM problem using a multi-objective perspective by proposing a multi-objective particle swarm optimization algorithm (i.e., MOPAR) for numerical ARM that discovers numerical association rules (ARs) in only one single step. To identify more efficient ARs, several objectives are defined in the proposed multi-objective optimization approach, including confidence, comprehensibility, and interestingness. Finally, by using the Pareto optimality the best ARs are extracted. To deal with numerical attributes, we use rough values containing lower and upper bounds to show the intervals of attributes. In the experimental section of the paper, we analyze the effect of operators used in this study, compare our method to the most popular evolutionary-based proposals for ARM and present an analysis of the mined ARs. The results show that MOPAR extracts reliable (with confidence values close to 95%), comprehensible, and interesting numerical ARs when attaining the optimal trade-off between confidence, comprehensibility and interestingness. 相似文献