Similar literature
1.
Mining fuzzy association rules from uncertain data   Cited by: 3 total (self-citations: 3, citations by others: 0)
Association rule mining is an important data analysis method for discovering associations within data. Many previous studies focus on finding fuzzy association rules in precise, certain data. Unfortunately, real-world data tends to be uncertain because of human errors, instrument errors, recording errors, and so on. The question that immediately arises is how to mine fuzzy association rules from uncertain data. To this end, this paper proposes a scheme for representing uncertain data. The representation is based on possibility distributions, because possibility theory establishes a close connection between the concepts of similarity and uncertainty and so provides an excellent framework for handling uncertain data. We then develop an algorithm to mine fuzzy association rules from uncertain data represented by possibility distributions. Experimental results on survey data show that the proposed approach can discover interesting and valuable patterns with high certainty.

2.
Association rule mining is one of the most popular data analysis methods for discovering associations within data. Because of their practical usefulness, association rule mining algorithms have been applied to a wide variety of datasets. Little attention has been paid, however, to how association mining techniques can be applied to questionnaire data. This paper therefore first identifies the data types that may appear in a questionnaire. We then introduce the questionnaire data mining problem and define the rule patterns that can be mined from questionnaire data. A unified approach based on fuzzy techniques is developed so that all data types can be handled in a uniform manner, and an algorithm is then developed to discover fuzzy association rules from the questionnaire dataset. Finally, we evaluate the performance of the proposed algorithm; the results indicate that our method can find interesting association rules that previous mining algorithms would not have found.

3.
Data mining is the process of extracting desirable knowledge or interesting patterns from existing databases for specific purposes. Most conventional data-mining algorithms identify relationships among transactions using binary values; however, transactions with quantitative values are common in real-world applications. This paper therefore proposes a new data-mining algorithm for extracting interesting knowledge from transactions stored as quantitative values. The proposed algorithm integrates fuzzy set concepts with the Apriori mining algorithm to find interesting fuzzy association rules in given transaction data sets. Experiments with student grades at I-Shou University were also carried out to verify the performance of the proposed algorithm.
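
As a rough illustration of how quantitative values can feed an Apriori-style support count, the sketch below maps grades to fuzzy sets and sums membership degrees instead of counting 0/1 occurrences. The membership functions (`low`, `middle`, `high`), their breakpoints, the sample grades, and the use of `min` to combine items are all assumptions made for this example; the abstract does not specify them.

```python
# Hypothetical membership functions for a grade on a 0-100 scale.
def low(x):
    return max(0.0, min(1.0, (75 - x) / 15))

def middle(x):
    return max(0.0, min((x - 60) / 15, (90 - x) / 15))

def high(x):
    return max(0.0, min(1.0, (x - 75) / 15))

MEMBERSHIP = {"low": low, "middle": middle, "high": high}

def fuzzy_support(transactions, itemset):
    """Average membership of the transactions in a fuzzy itemset.

    itemset is a list of (attribute, linguistic_term) pairs; the degree of
    one transaction is the minimum membership over the pairs (an assumed,
    commonly used intersection operator).
    """
    total = 0.0
    for t in transactions:
        total += min(MEMBERSHIP[term](t[attr]) for attr, term in itemset)
    return total / len(transactions)

def fuzzy_confidence(transactions, antecedent, consequent):
    """Fuzzy support of the whole rule divided by that of the antecedent."""
    return (fuzzy_support(transactions, antecedent + consequent)
            / fuzzy_support(transactions, antecedent))

# Made-up sample data, not the I-Shou University grades.
grades = [
    {"math": 90, "physics": 90},
    {"math": 75, "physics": 90},
    {"math": 60, "physics": 60},
    {"math": 90, "physics": 75},
]
math_high = [("math", "high")]
physics_high = [("physics", "high")]
print(fuzzy_support(grades, math_high))
print(fuzzy_confidence(grades, math_high, physics_high))
```

An Apriori-style miner would keep only the fuzzy itemsets whose support reaches a minimum threshold and extend those level by level.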

4.
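With fuzzy concepts and fuzzy operations, time intervals are easy to describe. For a specified calendar pattern, different time intervals can carry different weights according to their membership degrees. Building on fuzzy calendar algebra and combining the ideas of incremental mining and progressive counting, this paper proposes a fuzzy-calendar-based method for mining fuzzy temporal association rules. Both theoretical analysis and experimental results show that the algorithm is efficient and feasible.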

5.
In this work we develop techniques for discovering patterns with periodicity. Patterns with periodicity are those that occur at regular time intervals, so the problem has two aspects: finding the pattern and determining its periodicity. The difficulty of the task lies in discovering these regular time intervals, i.e., the periodicity. Periodicities in a database are usually imprecise and subject to disturbances, and may occur at time intervals in multiple time granularities. To overcome these difficulties and discover patterns with fuzzy periodicity, we propose the fuzzy periodic calendar, which defines fuzzy periodicities. Furthermore, we develop algorithms for mining fuzzy periodicities and the fuzzy periodic association rules within them. Experimental results show that our method is effective in discovering fuzzy periodic association rules.

6.
The concept of fuzzy sets is one of the most fundamental and influential tools in the development of computational intelligence. In this paper the fuzzy pincer search algorithm is proposed. It generates fuzzy association rules by combining top-down and bottom-up approaches. A fuzzy grid representation is used to reduce the number of database scans, and the algorithm trims the number of candidate fuzzy grids at each level. It has been observed that fuzzy association rules provide a more realistic visualization of the knowledge extracted from databases.

7.
In association rule mining, the trade-off between avoiding harmful spurious rules and preserving authentic ones is an ever-critical barrier to obtaining reliable and useful results. The statistically sound technique for evaluating the statistical significance of association rules is superior at preventing spurious rules, yet it can also cause severe loss of true rules in the presence of data error. This study presents a new and improved method for statistical testing of association rules with uncertain, erroneous data. An original mathematical model was established to describe how data error propagates through the computational procedures of the statistical test. Based on the error model, a scheme combining analytic and simulative processes was designed to correct the statistical test for distortions caused by data error. Experiments on both synthetic and real-world data show that the method significantly recovers the loss of true rules (reduces type-2 error) that data error causes in the original statistically sound method. Meanwhile, the new method maintains effective control over the familywise error rate, which is the distinctive advantage of the original statistically sound technique. Furthermore, the method is robust against inaccurate data error probability information and against situations that do not fulfill the commonly accepted assumption of independent error probabilities for different data items. The method is particularly effective for rules that are the most practically meaningful yet sensitive to data error. The method proves promising for enhancing the value of association rule mining results and helping users make correct decisions.

8.
Mining fuzzy association rules for classification problems   Cited by: 3 total (self-citations: 0, citations by others: 3)
The effective development of data mining techniques for discovering knowledge from training samples for classification problems in industrial engineering is necessary in applications such as group technology. This paper proposes a learning algorithm, which can be viewed as a knowledge acquisition tool, to effectively discover fuzzy association rules for classification problems. The consequent of each rule is a single class label. The proposed learning algorithm consists of two phases: one generates large fuzzy grids from training samples by fuzzy partitioning of each attribute, and the other generates fuzzy association rules for classification problems from the large fuzzy grids. The algorithm is implemented by scanning the training samples stored in a database only once and applying a sequence of Boolean operations to generate fuzzy grids and fuzzy rules; it can therefore be easily extended to discover other types of fuzzy association rules. Simulation results on the iris data demonstrate that the proposed learning algorithm can effectively derive fuzzy association rules for classification problems.

9.
Data mining is the process of extracting desirable knowledge or interesting patterns from existing databases for specific purposes. In real-world applications, transactions may contain quantitative values, and each item in a temporal database may have a lifespan. In this paper, we therefore propose a data mining algorithm for deriving fuzzy temporal association rules. It first transforms each quantitative value into a fuzzy set using the given membership functions. Meanwhile, item lifespans are collected and recorded in a temporal information table through a transformation process. The algorithm then calculates the scalar cardinality of each linguistic term of each item. A mining process based on fuzzy counts and item lifespans is then performed to find fuzzy temporal association rules. Finally, experiments on two simulation datasets and the foodmart dataset show the effectiveness and efficiency of the proposed approach.
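
One way to picture the role of item lifespans is to divide an itemset's fuzzy count by the number of transactions recorded since all of its items first appeared, rather than by the size of the whole database. The sketch below does this under several assumptions that are not taken from the paper: a lifespan starts at an item's first transaction and runs to the end of the data, memberships are combined with `min`, and the `many` membership function and sample transactions are invented for illustration.

```python
def first_appearance(transactions, item):
    """Index of the first transaction containing the item, or None."""
    for i, t in enumerate(transactions):
        if item in t:
            return i
    return None

def temporal_fuzzy_support(transactions, itemset, membership):
    """Fuzzy count of an itemset divided by the size of its common lifespan.

    transactions: list of dicts mapping item -> purchased quantity.
    itemset: list of (item, linguistic_term) pairs.
    membership: dict mapping a term to a function quantity -> [0, 1].
    """
    starts = [first_appearance(transactions, item) for item, _ in itemset]
    if any(s is None for s in starts):
        return 0.0
    start = max(starts)  # the common lifespan begins when the last item appears
    window = transactions[start:]
    count = 0.0
    for t in window:
        if all(item in t for item, _ in itemset):
            count += min(membership[term](t[item]) for item, term in itemset)
    return count / len(window)

# Hypothetical membership function and data.
membership = {"many": lambda q: min(1.0, q / 10)}
transactions = [
    {"milk": 10},
    {"milk": 5},
    {"milk": 10, "tea": 10},
    {"tea": 5},
]
print(temporal_fuzzy_support(transactions, [("tea", "many")], membership))
print(temporal_fuzzy_support(transactions, [("milk", "many"), ("tea", "many")], membership))
```

In this toy data, tea only enters the database at the third transaction, so its support is measured over two transactions instead of four; an item introduced late is not penalized for the period before it existed.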

10.
Mining fuzzy association rules in a bank-account database   Cited by: 1 total (self-citations: 0, citations by others: 1)
This paper describes how we applied a fuzzy technique to a data-mining task involving a large database provided by an international bank with offices in Hong Kong. The database contains the demographic data of over 320,000 customers and their banking transactions, collected over a six-month period. By mining the database, the bank hoped to discover interesting patterns in the data. The bank expected that the hidden patterns would reveal different characteristics of different customers so that it could better serve and retain them. To help the bank achieve its goal, we developed a fuzzy technique called fuzzy association rule mining II (FARM II). FARM II can handle both relational and transactional data. It can also handle fuzzy data. Relational and transactional data allow FARM II to discover multidimensional association rules, whereas fuzzy data allow some of the patterns to be more easily revealed and expressed. To effectively uncover the hidden associations in the bank-account database, FARM II performs several steps, which are described in detail in this paper. With FARM II, the bank identified some interesting characteristics of customers who had once used the bank's loan services but later decided to stop using them. The bank translated what it discovered into actionable items by offering incentives to retain its existing customers.

11.
Lee, Stolfo, and Mok [1] previously reported the use of association rules and frequency episodes for mining audit data to gain knowledge for intrusion detection. Integrating association rules and frequency episodes with fuzzy logic can produce more abstract and flexible patterns for intrusion detection, since many quantitative features are involved in intrusion detection and security itself is fuzzy. We present a modification of a previously reported algorithm for mining fuzzy association rules, define the concept of fuzzy frequency episodes, and present an original algorithm for mining fuzzy frequency episodes. We add a normalization step to the procedure for mining fuzzy association rules in order to prevent one data instance from contributing more than others. We also modify the procedure for mining frequency episodes to learn fuzzy frequency episodes. Experimental results show the utility of fuzzy association rules and fuzzy frequency episodes for intrusion detection. © 2000 John Wiley & Sons, Inc.
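
The normalization step mentioned above can be pictured as follows. When fuzzy sets overlap, one instance's membership degrees for a single feature may add up to more than 1, so that instance would weigh more in the support counts than an instance whose degrees add up to less. The sketch assumes the normalization simply rescales the degrees to sum to 1; the abstract does not give the exact formula, and the feature values shown are invented.

```python
def normalize(memberships):
    """Rescale one instance's membership degrees for one feature to sum to 1.

    memberships: dict mapping a fuzzy set name to a degree in [0, 1].
    An instance that belongs to no fuzzy set is returned unchanged.
    """
    total = sum(memberships.values())
    if total == 0:
        return dict(memberships)
    return {term: mu / total for term, mu in memberships.items()}

# Hypothetical degrees for one audit record's "connection count" feature.
raw = {"low": 0.2, "medium": 0.9, "high": 0.4}
normalized = normalize(raw)
print(normalized)
```

After rescaling, every instance contributes a total weight of 1 per feature to the fuzzy support counts, whatever the overlap between its fuzzy sets.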

12.
闫伟  张浩  陆剑峰 《计算机应用》2005,25(11):2676-2678
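Fuzzy clustering, a data mining technique, is used to analyze the interval values of historical data in process enterprises, and fuzzy association rules are then used to mine useful rules. The paper first describes the RFCM algorithm for fuzzy clustering and the Apriori algorithm for association rules, and analyzes the workflow of the Fuzzy_ClustApriori algorithm for obtaining fuzzy association rules. The RFCM algorithm is then applied to real data to obtain fuzzy numbers for the different classes. The fuzzified parameter points are processed following the steps of the Fuzzy_ClustApriori algorithm, yielding valuable fuzzy rules that provide a theoretical basis for production optimization in process enterprises.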

13.
This article presents a new differential evolution (DE) algorithm for mining optimized, statistically significant fuzzy association rules that are abundant in number and high in rule interestingness measure (RIM) values, with strict control over the risk of spurious rules. The risk control over spurious rules, the most distinctive feature of the proposed DE compared with existing evolutionary algorithms (EAs) for association rule mining (ARM), is realized via two new statistically sound significance tests on the rules. The two tests, in the experimentwise and generationwise adjustment approaches, can respectively limit the familywise error rate (the probability that any spurious rule occurs in the ARM result) and the percentage of spurious rules to the user-specified level. Experiments on variously sized data show that the proposed DE can keep the risk of spurious rules well below the user-specified level, which is beyond the ability of existing EA-based ARM. The new method also carries forward the advantages of EA-based ARM and the distinctive merits of DE in optimizing the rules: it can obtain several times as many rules and as high RIM values as conventional non-evolutionary ARM, and even more informative rules and better RIM values than genetic-algorithm-based ARM. Case studies on hotel room price determinants and wildfire risk factors demonstrate the practical usefulness of the proposed DE.
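
To make the idea of limiting the familywise error rate concrete, the sketch below tests crisp rules with a one-sided Fisher exact test and a Bonferroni-adjusted threshold. This is a familiar textbook technique swapped in for illustration, not the two significance tests proposed in the article; the function names, the rule counts, and the choice to correct only for the rules actually listed are assumptions.

```python
from math import comb

def fisher_p_value(n, n_a, n_b, n_ab):
    """One-sided Fisher exact p-value for the rule A -> B.

    n: number of transactions; n_a, n_b: transactions containing A, B;
    n_ab: transactions containing both. Returns the probability of seeing
    at least n_ab co-occurrences if A and B were independent.
    """
    upper = min(n_a, n_b)
    tail = sum(comb(n_b, k) * comb(n - n_b, n_a - k)
               for k in range(n_ab, upper + 1))
    return tail / comb(n, n_a)

def significant_rules(rules, n, alpha=0.05):
    """Keep the rules whose p-value passes the Bonferroni threshold alpha / m."""
    threshold = alpha / len(rules)
    return [name for name, (n_a, n_b, n_ab) in rules.items()
            if fisher_p_value(n, n_a, n_b, n_ab) <= threshold]

# Hypothetical counts (n_a, n_b, n_ab) over n = 10 transactions.
rules = {"A -> B": (5, 5, 5), "C -> D": (5, 5, 4)}
print(significant_rules(rules, n=10))
```

Dividing alpha by the number of tests bounds the probability that any accepted rule is spurious by alpha. In a real miner the divisor would need to reflect every rule that could have been examined, not only the ones reported.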

14.
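Mining classification association rules from Web logs   Cited by: 1 total (self-citations: 0, citations by others: 1)
User classification is an important task in research on Web access pattern mining. This paper proposes a method that applies associative classification to classify Web users: Web log files are first preprocessed to obtain a training transaction dataset, classification association rules are then mined from that transaction set, and the mined rule set is used to build a classifier, so that users can be classified according to their access history.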

15.
The association rules discovered by traditional support–confidence based algorithms provide concise statements of potentially useful information hidden in databases. However, considering only the constraints of minimum support and minimum confidence is far from satisfactory in many cases. In this paper, we propose a fuzzy method to formulate how interesting an association rule may be. Interestingness is indicated by the membership values of a rule in two fuzzy sets (the stronger rule set and the weaker rule set), which provides much more flexibility than traditional methods for discovering potentially more interesting association rules. Furthermore, revised algorithms based on the Apriori algorithm and a matrix structure are designed under this framework.

16.
17.
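Rule extraction over a single-level hierarchy suffers from low extraction accuracy and long running times, and struggles to meet users' needs. To address this, a quantitative data mining algorithm based on improved multi-level fuzzy association rules is proposed. Using frequent itemsets, the algorithm forms a top-down mining process through progressively deepening iterations. It integrates fuzzy set theory, data mining algorithms, and multi-level classification techniques to find fuzzy association rules in transaction datasets, uncovering the knowledge implicit in the quantitative values stored in transaction databases with a multi-level structure and meeting users' customized mining needs. Experimental results show that the proposed algorithm has a clear advantage over other algorithms in mining accuracy and running time, and opens up new room for the practical application of multi-level association rule extraction.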

18.
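Mining high-confidence association rules   Cited by: 3 total (self-citations: 1, citations by others: 2)
Traditional association rules and utility-based association rules overlook rules whose support or utility is low but whose confidence is very high. Such high-confidence rules can help people avoid risk and improve their success rate. To mine these low-support (or low-utility), high-confidence rules, the HCARM algorithm is proposed. HCARM uses a partitioning approach to handle large datasets and a new pruning strategy to shrink the search space. In addition, a length threshold minlen makes HCARM suitable for mining long patterns. Experimental results show that the method is effective for high-confidence long patterns.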
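
The sketch below illustrates why a support threshold can hide a highly reliable rule: the example rule holds in every transaction where its antecedent appears, yet it covers only one transaction in ten. It shows only the standard support and confidence measures on invented data; it is not the HCARM algorithm, whose partitioning and pruning steps the abstract does not detail.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Support of antecedent and consequent together over support of antecedent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)

# Hypothetical market-basket data: one rare but perfectly reliable pairing.
transactions = [{"caviar", "champagne"}] + [{"bread", "milk"}] * 9

rule_support = support(transactions, {"caviar", "champagne"})
rule_confidence = confidence(transactions, {"caviar"}, {"champagne"})
print(rule_support, rule_confidence)
```

With a typical minimum support of, say, 0.3, a support-first miner would discard `caviar -> champagne` before its confidence is ever computed.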

19.
Extracting fuzzy classification rules from partially labeled data   Cited by: 1 total (self-citations: 1, citations by others: 0)
The interpretability and flexibility of fuzzy if-then rules make them a popular basis for classifiers. It is common to extract them from a database of examples. However, the data available in many practical applications are often unlabeled and must be labeled manually by the user or by expensive analyses. The idea of semi-supervised learning is to use as much labeled data as is available and additionally exploit the information in the unlabeled data. In this paper we describe an approach to learning fuzzy classification rules from partially labeled datasets.

20.
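Sequential fuzzy association rule algorithms are inconvenient for processing massive flight data because of their poor scalability and long response times. This paper therefore adopts a parallel algorithm for mining fuzzy association rules. It first uses a parallel fuzzy c-means algorithm to partition the quantitative attributes into several fuzzy sets, using the fuzzy sets to soften the partition boundaries of the attributes; it then uses an improved parallel algorithm for mining Boolean association rules to discover frequent fuzzy attribute sets. Validation on a flight database shows that the parallel algorithm has good scalability, sizeup, and speedup.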
