Similar literature
 20 similar documents found (search time: 31 ms)
1.
This paper presents improvements to rule induction algorithms that resolve the ties arising in special cases during rule generation for specific training data sets. The improvements are demonstrated by experimental results on various data sets. A tie occurs in a decision tree induction algorithm when the class prediction at a leaf node cannot be determined by majority voting. When there is such a conflict at a leaf node, both its source and a solution must be found. In this paper, we propose calculating an influence factor for each attribute, together with an update procedure for the decision tree that deals with the problem and provides subsequent rectification steps.
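The tie condition described above can be sketched in a few lines. This is only an illustration of majority voting at a leaf; the function name `leaf_prediction` is hypothetical, and the sketch does not implement the paper's influence factor or update procedure, which the abstract does not define.

```python
from collections import Counter

def leaf_prediction(labels):
    """Return (predicted_class, is_tie) for the class labels that reach a leaf."""
    if not labels:
        raise ValueError("a leaf needs at least one training label")
    counts = Counter(labels)
    top = max(counts.values())
    # Every class that reaches the highest count is a candidate winner.
    winners = sorted(c for c, n in counts.items() if n == top)
    if len(winners) > 1:
        # Majority voting cannot decide: this is the tie the abstract refers to.
        return None, True
    return winners[0], False
```

A leaf holding `["yes", "no"]` is reported as a tie, while `["yes", "no", "yes"]` resolves to `"yes"`.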

2.
This paper defines a new kind of rule, the probability functional dependency rule, which can express the degree of functional dependency. Five algorithms, from the simple to the complex, are presented to mine this kind of rule under different conditions. Related theorems are proved to ensure the efficiency and correctness of these algorithms.

3.
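Research on an algorithm for mining constrained association rules in a distributed environment   Total citations: 2, self-citations: 0, citations by others: 2
Association rules are an important research topic in data mining. Constraint-based association rule mining facilitates interactive exploration and analysis. This paper studies the problem of mining constrained association rules in a distributed environment. Building on the parallel association rule mining algorithm CD and the constrained association rule mining algorithm Direct, it proposes a new distributed algorithm for mining constrained association rules, DMA_IC. The algorithm is highly effective for the problem of distributed mining of constrained association rules. The paper also discusses the communication performance of DMA_IC.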

4.
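A method for detecting anomalies in spatio-temporal data   Total citations: 1, self-citations: 0, citations by others: 1
Based on the "k times the standard deviation" criterion, a spatio-temporal anomaly detection method using double deviation of the thematic attribute is proposed. Within the spatial neighborhood of each feature, the criterion is used to detect spatially anomalous data at each time instant; within the temporal neighborhood of each spatially anomalous datum, the criterion is applied again to judge whether the feature is also a temporal anomaly. Data that are anomalous in both the spatial and the temporal neighborhood are defined as spatio-temporal anomalies. Experimental results show that the method is effective and feasible.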
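The two-stage test described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data layout (`series[i][t]`), the neighbor map `spatial_nbrs`, the symmetric time `window`, the use of the population standard deviation, and the exclusion of the tested value from its own neighborhood are all hypothetical choices.

```python
from statistics import mean, pstdev

def deviates(value, neighbours, k):
    """k-sigma test: is `value` more than k standard deviations from the neighbour mean?"""
    if len(neighbours) < 2:
        return False  # too few neighbours to estimate a spread
    mu, sigma = mean(neighbours), pstdev(neighbours)
    return abs(value - mu) > k * sigma

def spatiotemporal_outliers(series, spatial_nbrs, k=2.0, window=2):
    """series[i][t]: value of feature i at time t; spatial_nbrs[i]: ids of nearby features."""
    outliers = []
    for i, values in series.items():
        for t, x in enumerate(values):
            # Stage 1: compare with the spatial neighbours at the same time instant.
            space = [series[j][t] for j in spatial_nbrs[i]]
            if not deviates(x, space, k):
                continue
            # Stage 2: compare with the same feature at nearby time instants.
            lo, hi = max(0, t - window), min(len(values), t + window + 1)
            time = [values[u] for u in range(lo, hi) if u != t]
            if deviates(x, time, k):
                outliers.append((i, t))
    return outliers
```

A value is reported only when it fails the test in both neighborhoods, so a feature that is constantly different from its spatial neighbors is not flagged.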

5.
Generalized multidimensional association rules   Total citations: 2, self-citations: 0, citations by others: 2
The problem of association rule mining has gained considerable prominence in the data mining community for its use as an important tool for knowledge discovery from large-scale databases, and there has been a spurt of research activity around this problem. Traditional association rule mining is limited to intra-transaction associations. Only recently was the concept of the N-dimensional inter-transaction association rule (NDITAR) proposed by H. J. Lu. This paper modifies and extends Lu's definition of NDITAR based on an analysis of its limitations, and subsequently introduces the generalized multidimensional association rule (GMDAR), which is more general, flexible and reasonable than NDITAR.

6.
The recent progress in high-speed communication networks and large-capacity storage devices has led to a tremendous increase in the number of databases and the volume of data in them. This has created a need to discover structural equivalence relationships from the databases since queries tend to access information from structurally equivalent media objects residing in different databases. The more databases there are, the more query-processing performance improvement can be achieved when the structural equivalence relationships are automatically discovered. In response to such a demand, association rule mining has emerged and proven to be a highly successful technique for discovering knowledge from large databases. In this paper, we explore a generalized affinity-based association rule mining approach to discover the quasi-equivalence relationships from a network of databases. The algorithm is implemented and two empirical studies on real databases are conducted. The results show that the proposed generalized affinity-based association rule mining approach not only correctly exploits the set of quasi-equivalent media objects from the databases, but also outperforms the basic association rule mining approach in the discovery of the quasi-equivalent media object pairs. Received 16 September 1999 / Revised 12 September 2000 / Accepted in revised form 9 January 2001

7.
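Taking the PX adsorption separation process as the object of study, a data mining algorithm based on the SOM model is used to analyze it. The SOM model plays a key role throughout the mining process. On the one hand, as an effective tool for exploratory data analysis, it provides the basis for further mining. On the other hand, it supplies parameter guidance and data support for the clustering algorithm. Ultimately, the data mining achieves two goals: steady-state optimization regions of the operating parameters under different load conditions are obtained, and a visual real-time evaluation model that can guide operators in improving operation is established.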

8.
This paper considers the problem of finding predictive and useful association rules with a new Web mining algorithm, the streaming association rule (SAR) model. We first adopt a weighted order-dependent scheme (assigning more weight to pages visited early) rather than the traditional Boolean scheme (assigning 1 to visited and 0 to non-visited pages). In this way, we intend to improve the limited representation of navigation patterns in previous association rule mining (ARM) algorithms. We also note that most traditional association rule models are not scalable because they require multiple scans of all records to re-calibrate a predictive model when the original databases are updated. The proposed SAR model takes a “divide-and-conquer” approach and requires only a single scan of the data sets to avoid the curse of dimensionality. Through comparative experiments on a real-world data set, we show that prediction models based on a weighted order-dependent representation are more accurate in predicting the next moves of Web navigators than models based on a Boolean representation. In particular, when combined with several heuristics developed to eliminate redundant association rules, SAR models show very comparable prediction accuracy while maintaining only a small fraction of the association rules of traditional ARM models. Finally, we quantify and graphically show the significance or contribution of each page to forming unique rule sets in each database segment.
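The difference between the two session encodings can be illustrated with a short sketch. The abstract does not give the weighting formula, so the linear decay `(n - position) / n` used here is purely an assumption, and the function names are hypothetical.

```python
def boolean_vector(session, pages):
    """Traditional encoding: 1 if the page was visited in the session, else 0."""
    visited = set(session)
    return [1 if p in visited else 0 for p in pages]

def weighted_vector(session, pages):
    """Order-dependent encoding: pages visited earlier receive a larger weight.

    The linear decay below is an illustrative assumption, not the paper's formula.
    """
    n = len(session)
    weights = {}
    for pos, page in enumerate(session):
        # setdefault keeps the weight of the first (earliest) visit to a page.
        weights.setdefault(page, (n - pos) / n)
    return [weights.get(p, 0.0) for p in pages]
```

For the session `["home", "cart", "search", "cart"]` over the pages `["home", "search", "cart", "help"]`, the Boolean vector is `[1, 1, 1, 0]`, whereas the weighted vector `[1.0, 0.5, 0.75, 0.0]` also records that `cart` was reached before `search`.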

9.
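Zheng Xiangyun, Chen Zhigang, Huang Rui, Li Bo. 《计算机应用》 (Journal of Computer Applications), 2015, 35(9): 2569-2573
To address the low precision of traditional recommendation algorithms, a new data mining model suited to book recommendation (BR), the BR_LDA model, is proposed on the basis of the latent Dirichlet allocation (LDA) topic mining model. Content similarity analysis between the target borrower's historical borrowing data and other book data yields other books whose content is highly similar to the books the target borrower has borrowed. Similarity analysis between the target borrower's historical borrowing data and that of other borrowers yields the historical borrowing data of the nearest-neighbor borrowers. By computing the probability that each book is recommended, the books of potential interest to the target borrower are finally obtained. In particular, when the number of recommendations is 4000, the precision of the BR_LDA model is 6.2% and 4.5% higher than that of the multi-feature method and the association rule method, respectively; when the number of recommendations is 500, it is 2.1% and 0.5% higher than that of the nearest-neighbor collaborative filtering method and the matrix factorization method, respectively. Experiments show that the model can more accurately recommend to the target borrower new books in categories of historical interest and books in new categories of potential interest.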

10.
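Analysis of library reader communities using clustering techniques   Total citations: 1, self-citations: 0, citations by others: 1
Using library readers' borrowing-volume data as an example, the clustering techniques of the Statistical Analysis System (SAS) are applied to mine data on library reader communities. The borrowing-volume data are first preprocessed and then subjected to cluster analysis; finally, the Average, Ward and Single methods are compared, and the classification of reader communities in university libraries is discussed. The results show that the Average method is better suited to analyzing relatively evenly distributed data, and that the analysis results can provide a scientific basis for managers' decisions.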

11.
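To improve the detection and identification of weakly connected overlapping communities, a detection algorithm for weakly connected overlapping communities based on a time-interaction-biased influence propagation model is proposed. An objective function for partitioning the community detection graph model is designed, and the community structure is used to optimize processor load balancing so as to improve the efficiency of solving the model. Approximately active edges are redefined on the basis of neighborhood edge density, and an influence propagation model is constructed to identify users that interact with high frequency, thereby improving the identification of weakly connected users. On this basis, a time-interaction-biased community detection method is proposed. Experimental results show that the method achieves high identification accuracy and efficiency when detecting overlapping communities.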

12.
Sequential rule mining is an important data mining task used in a wide range of applications. However, current algorithms for discovering sequential rules common to several sequences use very restrictive definitions of sequential rules, which make them unable to recognize that similar rules can describe the same phenomenon. This can have many undesirable effects, such as (1) similar rules that are rated differently, (2) rules that are not found because they are considered uninteresting when taken individually, and (3) rules that are too specific, which makes them less likely to be used for making predictions. In this paper, we address these problems by proposing a more general form of sequential rules such that items in the antecedent and in the consequent of each rule are unordered. We propose an algorithm named CMRules for mining this form of rules. The algorithm proceeds by first finding association rules to prune the search space for items that occur jointly in many sequences. Then it eliminates association rules that do not meet the minimum confidence and support thresholds according to the sequential ordering. We evaluate the performance of CMRules in three different ways. First, we provide an analysis of its time complexity. Second, we compare its performance (in terms of execution time, memory usage and scalability) with an adaptation of an algorithm from the literature that we name CMDeo. For this comparison, we use three real-life public datasets, which have different characteristics and represent three kinds of data. In many cases, results show that CMRules is faster and has better scalability for low support thresholds than CMDeo. Lastly, we report a successful application of the algorithm in a tutoring agent.
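The second step, checking a candidate rule against the sequential ordering, can be sketched as below. This is not the CMRules implementation: it assumes a rule X ⇒ Y occurs in a sequence when every item of X has appeared before all items of Y appear, that support is the fraction of sequences with such an occurrence, and that confidence divides by the number of sequences containing X. The function names are hypothetical.

```python
def occurs(sequence, antecedent, consequent):
    """True if all antecedent items appear, in any order, before all consequent items."""
    seen = set()
    for k, itemset in enumerate(sequence):
        seen |= itemset
        if antecedent <= seen:
            # The earliest completion point leaves the longest remainder to search.
            rest = set().union(*sequence[k + 1:])
            return consequent <= rest
    return False

def sequential_measures(sequences, antecedent, consequent):
    """Return (support, confidence) of antecedent => consequent under the ordering test."""
    antecedent, consequent = set(antecedent), set(consequent)
    n_rule = sum(occurs(s, antecedent, consequent) for s in sequences)
    n_ante = sum(antecedent <= set().union(*s) for s in sequences)
    support = n_rule / len(sequences)
    confidence = n_rule / n_ante if n_ante else 0.0
    return support, confidence
```

Because the antecedent is treated as a set, the sequences `a, b, c` and `b, a, c` both count as occurrences of {a, b} ⇒ {c}, which is the generalization the abstract describes. A candidate rule would then be discarded when either measure falls below its threshold.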

13.
Mining Informative Rule Set for Prediction   Total citations: 2, self-citations: 0, citations by others: 2
Mining transaction databases for association rules usually generates a large number of rules, most of which are unnecessary when used for subsequent prediction. In this paper we define a rule set for a given transaction database that is much smaller than the association rule set but makes the same predictions as the association rule set by the confidence priority. We call this rule set the informative rule set. The informative rule set is not constrained to particular target items, and it is smaller than the non-redundant association rule set. We characterise the relationships between the informative rule set and the non-redundant association rule set. We present an algorithm that directly generates the informative rule set without generating all frequent itemsets first, and that accesses the database less frequently than other direct methods. We show experimentally that the informative rule set is much smaller and can be generated more efficiently than both the association rule set and the non-redundant association rule set.

14.
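An adaptive model generation (AMG) model based on data mining and a hybrid genetic algorithm (HGA) is proposed. An improved clustering algorithm partitions the behavior records of the network and system into normal and abnormal behavior bases; the HGA mines intrusion rules from the behavior bases and adds them to the rule base, and detection is performed by a hybrid detection module. Experimental results show that the AMG model detects unknown network intrusions with a higher detection rate and a lower false detection rate.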

15.
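To address the inaccurate mining results of the Hopfield neural network-based maximal frequent itemset mining (HNNMFI) algorithm, an improved association rule mining algorithm based on a Hopfield neural network using the current-threshold adaptive memristor (TEAM) model is proposed. First, the TEAM model is used to design and implement the synapses; the ability of the threshold memristor's memristance to vary continuously with a square-wave voltage is exploited to set and update the synaptic weights, adapting to the input of the association rule mining algorithm. Second, the energy function of the original algorithm is modified to align with the standard energy function, the weights are represented by memristance values, and the weights and biases are amplified. Finally, an algorithm that generates association rules from the maximal frequent itemsets is designed. In 1000 simulation runs on 10 groups of random transaction sets of size at most 30, the proposed algorithm improves the accuracy of the association mining results by more than 33.9 percentage points over the HNNMFI algorithm, showing that memristors can effectively improve the accuracy of Hopfield neural networks in association rule mining.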

16.
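Research on a distributed constrained association rule mining algorithm based on the frequent pattern tree   Total citations: 1, self-citations: 0, citations by others: 1
Mining constrained association rules in a distributed environment is one of the current research hotspots. Building on the FP-growth algorithm, this paper proposes a new distributed algorithm for mining constrained association rules, DAMICFP. The algorithm is highly effective for the problem of distributed mining of constrained association rules.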

17.
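Nodes in delay-tolerant networks behave selfishly because their resources are limited. To improve cooperative behavior among nodes and thereby the overall performance of the network, a mechanism for promoting node cooperation based on evolutionary game theory (EGT) is proposed. First, the prisoner's dilemma model is used to build the payoff matrix of the game between a node and its neighbors. Second, the social authority of a node is defined on the basis of degree centrality. Further, the influence of social authority is taken into account in the strategy update rule: a node imitates and learns from the neighbor that currently has the higher social authority. Finally, simulation experiments are carried out on an opportunistic network environment simulator using real dynamic network topology data. The results show that, compared with the Fermi update rule with randomly selected neighbors, the update rule that accounts for social authority better promotes the emergence of cooperative behavior among nodes and thus improves the overall performance of the network.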
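The ingredients named above can be put together in a small sketch. The payoff values, the `noise` parameter, and all function names are hypothetical, and the abstract does not say how the imitation probability is computed once the authority neighbor has been chosen; the standard Fermi function is used here only as a stand-in.

```python
import math

# Prisoner's dilemma payoffs for the row player; illustrative values with T > R > P > S.
PAYOFF = {("C", "C"): 3.0, ("C", "D"): 0.0, ("D", "C"): 5.0, ("D", "D"): 1.0}

def total_payoff(node, graph, strategy):
    """Sum of the payoffs a node collects from playing against each neighbour."""
    return sum(PAYOFF[(strategy[node], strategy[nbr])] for nbr in graph[node])

def degree_centrality(node, graph):
    """Social authority proxy: fraction of the other nodes this node is linked to."""
    return len(graph[node]) / (len(graph) - 1)

def authority_neighbour(node, graph):
    """Pick the neighbour with the highest degree centrality (ties broken by name)."""
    return max(sorted(graph[node]), key=lambda nbr: degree_centrality(nbr, graph))

def fermi_probability(own, other, noise=1.0):
    """Standard Fermi function: probability of copying a neighbour with payoff `other`."""
    x = (own - other) / noise
    if x > 700:  # math.exp would overflow; the probability is effectively zero
        return 0.0
    return 1.0 / (1.0 + math.exp(x))
```

In a star network with a defecting hub and cooperating leaves, each leaf selects the hub as its model and, because the hub earns more, copies its strategy with probability close to 1 under these assumptions.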

18.
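The artificial fish swarm algorithm is improved in aspects such as visual range and step size, and the raw data on table tennis techniques and tactics are preprocessed according to the characteristics of those techniques and tactics. A data mining model for classification rules of table tennis techniques and tactics based on the improved artificial fish swarm algorithm is established, and match examples of top table tennis players are analyzed. The results show that, compared with association rule data mining of table tennis techniques and tactics, the model has considerable advantages in mining quality and mining effect.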

19.
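Research progress on techniques for mining dynamical intervention rules in sub-complex systems   Total citations: 3, self-citations: 1, citations by others: 2
Tang Changjie, Zhang Yue, Tang Liang, Li Chuan, Chen Yu. 《计算机应用》 (Journal of Computer Applications), 2008, 28(11): 2732-2736
Intervention rule mining in sub-complex systems is a new topic in the data mining field. This survey reviews the research background and typical problems of intervention rules in sub-complex systems; uses examples to describe some basic concepts and terms of the field, such as intervention correlation, transitive correlation, intervention typing and intervention algebra; and introduces preliminary explorations and results in mining intervention rules in sub-complex systems, including mining algorithms for naive intervention rules and numerical intervention rules, as well as a density-based intervention analysis model for data streams and related results.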

20.
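Association rule mining based on a hybrid genetic clonal algorithm   Total citations: 1, self-citations: 0, citations by others: 1
Fu Baolong. 《计算机工程》 (Computer Engineering), 2009, 35(22): 216-217
To address association rule mining in data mining applications, an association rule mining method based on a hybrid genetic clonal algorithm is presented. The algorithm combines the advantages of the genetic algorithm and the clonal algorithm: a clone operation produces a new group of individuals, mutation and crossover are applied independently to each of them, and the crossover and mutation probabilities are selected dynamically in an adaptive manner. This effectively overcomes the tendency of genetic algorithms to become trapped in local optima and thus yields the optimal solution to the problem. Experimental results show that the method solves the association rule mining problem efficiently.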
