Found 20 similar documents (search time: 46 ms).
1.
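The paper first analyzes a classification algorithm based on association rules. It then improves the method the algorithm uses to extract class association rules, yielding a new association-rule-based classification algorithm. Finally, it compares the running efficiency and practicality of the two algorithms using results obtained on cotton pest and disease data.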
2.
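Based on the concepts and representations of association rules and classification rules, this paper points out that association rule mining restricted to rules associated with one specified item amounts to classification rule mining. It argues that classification rules are a special case of association rules and describes the characteristics association rules have in this special case. Building on this argument, it proposes an algorithm that discovers classification rules within an association rule mining algorithm by using a constrained conditional probability distribution.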
3.
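Association rules are a common data mining method. Building on the concept of frequent itemsets in the Apriori algorithm and adding the concepts of meta-vectors, sub-rules, and parent rules, this paper proposes an improved association rule mining method (the Improve algorithm). The method overcomes shortcomings of traditional association rule mining by mining rules while the frequent itemsets are being generated, which improves mining efficiency.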
4.
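Discovering classification rules in databases is an important part of data mining. Because data sets often consist of imprecise data, they cannot be cleanly divided into a positive example set and a negative example set, so learning from examples cannot be applied directly to discover classification rules. This paper uses association rule techniques to convert the original data set into a decision table that is noise-free and highly representative; classification rules can then be mined by applying learning from examples to the decision table.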
5.
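In data mining, the IUA algorithm can miss frequent itemsets, so some association rules are never mined. Based on an analysis of classic association rule mining algorithms such as Apriori and IUA, this paper proposes HIUA, an update algorithm based on the most recent mining results. HIUA combines the strengths of Apriori and IUA: when the minimum support changes and the most recent mining results are available, it aims to generate as few candidate itemsets as possible, obtains the complete set of new frequent itemsets, and improves the efficiency of the algorithm.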
6.
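This paper analyzes and compares current representative associative classification algorithms and summarizes the open problems of association-rule-based classification, so that users can choose an algorithm suited to their needs and researchers can study and improve the algorithms to propose better-performing classifiers.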
7.
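In distributed systems, the efficiency of association rule mining algorithms depends on how frequent itemsets are determined and on the communication volume between network sites. To generate frequent itemsets more efficiently, this paper proposes a new data preprocessing method for relational databases and an array-based frequent itemset generation algorithm. The new preprocessing method reduces the number of candidate itemsets, and the binary arrays need only logical AND operations to generate frequent itemsets. The algorithm is combined with SDMA, a distributed mining algorithm for star network topologies, and applied in mining experiments. Theoretical analysis and experimental results show that the algorithm improves mining efficiency and is feasible.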
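The idea of counting support with logical AND over binary arrays can be sketched as follows. This is a minimal illustration and not the paper's algorithm: the function names (`item_bitmaps`, `support`, `frequent_itemsets`), the use of Python integers as bit arrays, and the brute-force enumeration of candidates are all assumptions made for the example.

```python
from itertools import combinations


def item_bitmaps(transactions):
    """Map each item to an int whose bit i is set when transaction i contains the item."""
    bitmaps = {}
    for i, transaction in enumerate(transactions):
        for item in transaction:
            bitmaps[item] = bitmaps.get(item, 0) | (1 << i)
    return bitmaps


def support(itemset, bitmaps):
    """Support count of an itemset: AND the item bitmaps, then count the set bits."""
    items = iter(itemset)
    acc = bitmaps.get(next(items), 0)
    for item in items:
        acc &= bitmaps.get(item, 0)
    return bin(acc).count("1")


def frequent_itemsets(transactions, min_count, max_size=3):
    """Hypothetical helper: enumerate itemsets built from frequent single items."""
    bitmaps = item_bitmaps(transactions)
    # Only items that are frequent on their own can appear in a frequent itemset.
    items = sorted(i for i in bitmaps if support((i,), bitmaps) >= min_count)
    frequent = {}
    for size in range(1, max_size + 1):
        for candidate in combinations(items, size):
            count = support(candidate, bitmaps)
            if count >= min_count:
                frequent[candidate] = count
    return frequent


transactions = [{"a", "b"}, {"a", "c"}, {"a", "b", "c"}, {"b"}]
print(frequent_itemsets(transactions, min_count=2))
```

Each support count needs only one AND per item, with no rescan of the transaction list, which is the property the abstract attributes to the binary-array representation.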
8.
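Based on a lemma on the preservation of pattern information, this paper proposes FPST, an association rule mining algorithm based on frequent pattern stack transformation. It describes the corresponding stack construction and stack transformation algorithms and reports a performance analysis and comparative experiments; the results show that the algorithm performs well.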
9.
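In distributed systems, the efficiency of association rule mining algorithms depends on how frequent itemsets are determined and on the communication volume between network sites. To generate frequent itemsets more efficiently, this paper proposes a new data preprocessing method for relational databases and an array-based frequent itemset generation algorithm. The new preprocessing method reduces the number of candidate itemsets, and the binary arrays need only logical AND operations to generate frequent itemsets. The algorithm is combined with SDMA, a distributed mining algorithm for star network topologies, and applied in mining experiments. Theoretical analysis and experimental results show that the algorithm improves mining efficiency and is feasible.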
10.
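Classifying users is an important research direction in Web mining. This paper proposes a classification method based on association rules and applies it to predicting user interests. First, the server log files are preprocessed to form a set of access transactions. Next, this transaction set is mined to find all class association rules that satisfy the minimum support. Finally, these class association rules are used to predict user interests. Experiments show that the method is effective.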
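The three steps above (transactions, class association rules, prediction) can be sketched roughly as below. This is an illustrative assumption, not the authors' implementation: the record layout, the function names, the added minimum-confidence threshold, and the "first matching rule wins" prediction strategy are all hypothetical choices, since the abstract only mentions minimum support.

```python
from collections import Counter
from itertools import combinations


def mine_class_rules(records, min_support, min_confidence, max_len=2):
    """Mine rules `body -> label` from records of the form (set_of_items, label).

    min_confidence is an assumption of this sketch; the abstract names only
    a minimum support threshold.
    """
    n = len(records)
    body_counts = Counter()   # how often each body (item combination) occurs
    rule_counts = Counter()   # how often each body occurs together with a label
    for items, label in records:
        for size in range(1, max_len + 1):
            for body in combinations(sorted(items), size):
                body_counts[body] += 1
                rule_counts[(body, label)] += 1
    rules = []
    for (body, label), count in rule_counts.items():
        sup = count / n
        conf = count / body_counts[body]
        if sup >= min_support and conf >= min_confidence:
            rules.append((body, label, sup, conf))
    # Rank by confidence, then support, then shorter body; ties broken by name.
    rules.sort(key=lambda r: (-r[3], -r[2], len(r[0]), r[0], r[1]))
    return rules


def predict(rules, items, default=None):
    """Return the label of the highest-ranked rule whose body is contained in items."""
    for body, label, _, _ in rules:
        if set(body) <= items:
            return label
    return default


# Hypothetical access transactions: pages visited in a session and the user's class.
sessions = [
    ({"sports", "news"}, "sports_fan"),
    ({"sports"}, "sports_fan"),
    ({"news", "finance"}, "investor"),
    ({"finance"}, "investor"),
]
rules = mine_class_rules(sessions, min_support=0.5, min_confidence=0.6)
print(predict(rules, {"sports", "weather"}))
```

Enumerating every item combination is exponential in session length; a practical miner would prune candidates Apriori-style, which this sketch omits for brevity.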
11.
New methodologies and tools have gradually made the life cycle for software development more human-independent. Much of the
research in this field focuses on defect reduction, defect identification and defect prediction. Defect prediction is a relatively
new research area that involves using various methods from artificial intelligence to data mining. Identifying and locating
defects in software projects is a difficult task. Measuring software in a continuous and disciplined manner provides many
advantages such as the accurate estimation of project costs and schedules as well as improving product and process qualities.
This study aims to propose a model to predict the number of defects in the new version of a software product with respect
to the previous stable version. The new version may contain changes related to a new feature or a modification in the algorithm
or bug fixes. Our proposed model aims to predict the new defects introduced into the new version by analyzing the types of
changes in an objective and formal manner as well as considering the lines of code (LOC) change. Defect predictors are helpful
tools for both project managers and developers. Accurate predictors may help reduce test times and guide developers towards
implementing higher-quality code. Our proposed model can aid software engineers in determining the stability of software
before it goes into production. Furthermore, such a model may provide useful insight for understanding the effects of a feature,
bug fix or change in the process of defect detection.
12.
Rather than detecting defects at an early stage to reduce their impact, defect prevention means that defects are prevented from occurring in advance. Causal analysis is a common approach to discovering the causes of defects and taking corrective actions. However, selecting defects to analyze from large numbers of reported defects is time consuming and requires significant effort. To address this problem, this study proposes a defect prediction approach in which the reported defects and performed actions are used to discover the patterns of actions that are likely to cause defects. The approach proposed in this study is adapted from Action-Based Defect Prediction (ABDP), an approach that uses classification with a decision tree technique to build a prediction model, and performs association rule mining on the records of actions and defects. An action is defined as a basic operation used to perform a software project, while a defect is defined as a software flaw that can arise at any stage of the software process. The association rule mining finds the maximum rule set with specific minimum support and confidence, and thus the discovered knowledge can be used to interpret the prediction models and software process behaviors. The discovered patterns can then be applied to predict the defects generated by subsequent actions and to take the necessary corrective actions to avoid defects. The proposed defect prediction approach applies association rule mining to discover defect patterns, and multi-interval discretization to handle the continuous attributes of actions. The proposed approach is applied to a business project, giving excellent prediction results and revealing the efficiency of the proposed approach. The main benefit of using this approach is that the discovered defect patterns can be used to evaluate subsequent actions for in-process projects, and reduce variance of the reported data resulting from different projects. Additionally, the discovered patterns can be used in causal analysis to identify the causes of defects for software process improvement.
13.
Software defect prediction (SDP) plays a significant part in identifying the most defect-prone modules before software testing and allocating limited testing resources. One of the most commonly used classifiers in SDP is naive Bayes (NB). Despite the simplicity of the NB classifier, it can often perform better than more complicated classification models. In NB, the features are assumed to be equally important, and the numeric features are assumed to have a normal distribution. However, the features often do not contribute equally to the classification, and a Kolmogorov-Smirnov test usually shows that they are not normally distributed; this may harm the performance of the NB classifier. Therefore, this paper proposes a new weighted naive Bayes method based on information diffusion (WNB-ID) for SDP. More specifically, for the equal importance assumption, we investigate six weight assignment methods for setting the feature weights and then choose the most suitable one based on the F-measure. For the normal distribution assumption, we apply the information diffusion model (IDM) to compute the probability density of each feature instead of the default probability density function of the normal distribution. We carry out experiments on 10 software defect data sets of three types of projects in three different programming languages provided by the PROMISE repository. Several well-known classifiers and ensemble methods are included for comparison. The final experimental results demonstrate the effectiveness and practicability of the proposed method.
14.
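To address software defect prediction, this paper introduces the least squares support vector machine (LS-SVM) algorithm, speeds up hyperparameter selection, and gives a fast method for correcting the model by adding new samples one at a time. Using software complexity metrics as the guiding thread, it builds a software defect prediction model based on FLS-SVM. A concrete example illustrates how the model runs and shows that, with small samples, it predicts better than a neural network; the regression equation is used to identify the complexity metrics that significantly affect software defects.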
15.
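To ensure software reliability and quality, this paper proposes a software defect prediction method based on the software development life cycle that uses a PCA-BP fuzzy neural network. Considering the factors that affect software reliability, and drawing on relevant standards and engineering practice, metrics that affect software reliability are selected. Metric data were collected from a class of flight control software in a real project, the proposed model was used to predict defects, and the predictions were compared with those of a traditional BP neural network model. The comparison shows that the PCA-BP neural network method, which incorporates principal component analysis, converges faster and predicts more accurately than the BP-neural-network-based method.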
16.
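Building on a review of existing traffic prediction methods, this paper proposes a network traffic prediction method that combines several prediction techniques. Following the idea of wavelet multi-scale decomposition and reconstruction, the method decomposes network traffic by wavelet transform into approximation and detail signals at different scales, then reconstructs each branch separately into a low-frequency sequence and high-frequency sequences. Given the different characteristics of the low- and high-frequency sequences, an autoregressive (AR) model and linear minimum mean square error (LMMSE) estimation are used, respectively, to predict future traffic, and the results are recombined into the predicted traffic. Simulation experiments on real network traffic show that the method predicts future network traffic fairly accurately.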
17.
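To use the defect information detected by software testing activities to evaluate their effectiveness and improve them, this paper proposes a test effectiveness evaluation method based on defect metrics and defect baselines. A classification framework for software defects is built on orthogonal defect classification (ODC); the associations between defect trigger characteristics, defect type statistics, and testing activities are studied; and a defect association baseline is established. By analyzing how actual defect metrics deviate from the defect baseline, the method evaluates the effectiveness of testing activities and identifies directions for improving them. A case analysis shows that the model can be conveniently applied to the qualitative evaluation of testing effectiveness.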
18.
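Traditional implication-style representations of association rules and graphical visualization methods are hard for non-expert users to understand. To address this, this paper proposes a new association rule visualization method based on natural language generation. The method brings natural language generation techniques into association rule visualization: explanation patterns in a domain knowledge base turn each item of an association rule into a simple natural language sentence, and sentence planning and surface realization then produce fluent natural language sentences. The final experimental results are easy for ordinary users to understand and apply, helping them obtain more valuable information.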
19.
Software defect prediction aims to find potential defects based on historical data and software features. Software features can reflect the characteristics of software modules. However, some of these features may be more relevant to the class (defective or non-defective), while others may be redundant or irrelevant. To fully measure the correlation between different features and the class, we present a feature selection approach based on a similarity measure (SM) for software defect prediction. First, the feature weights are updated according to the similarity of samples in different classes. Second, a feature ranking list is generated by sorting the feature weights in descending order, and all feature subsets are selected from the feature ranking list in sequence. Finally, all feature subsets are evaluated on a k-nearest neighbor (KNN) model and measured by an area under curve (AUC) metric for classification performance. The experiments are conducted on 11 National Aeronautics and Space Administration (NASA) datasets, and the results show that our approach performs better than or is comparable to the compared feature selection approaches in terms of classification performance.
20.
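In network security situation awareness systems, situation prediction is a key step. To maintain and improve prediction accuracy, this paper combines the strong optimization performance of particle swarm optimization with the prediction accuracy of support vector machines and proposes a PSO-SVM prediction model built on cumulative-sum data preprocessing. The model accumulates the original sequence, which weakens the effect of irregular disturbances in it and strengthens its regularity, and combines this with particle-swarm-optimized support vector machines (PSO-SVM) to make better use of their high prediction accuracy. Simulation experiments test the model's effectiveness and compare its results with those of the plain PSO-SVM prediction model, confirming its superior prediction accuracy.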
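The cumulative-sum preprocessing step can be illustrated with a short sketch. It covers only the accumulation and its inverse, not the PSO-SVM model itself; the function names and the sample values are hypothetical, and the assumption is that the predictor is trained on the accumulated series and its outputs are differenced back to the original scale.

```python
from itertools import accumulate


def accumulate_series(values):
    """Running totals of the original series (the accumulated series)."""
    return list(accumulate(values))


def restore_series(cumulative):
    """Invert the accumulation by differencing adjacent running totals."""
    if not cumulative:
        return []
    return [cumulative[0]] + [b - a for a, b in zip(cumulative, cumulative[1:])]


# Hypothetical situation values with an irregular dip at the fourth point.
original = [3, 1, 4, 1, 5]
smoothed = accumulate_series(original)  # monotone when all values are non-negative
print(smoothed)
print(restore_series(smoothed))
```

For non-negative inputs the accumulated series never decreases, which is one way to see why accumulation dampens irregular fluctuations before a model is fitted.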