Similar literature
 Found 20 similar documents; search took 31 ms
1.
2.
3.
4.
Sara Elmidaoui, Laila Cheikhi, Ali Idri, Alain Abran. Journal of Computer Science and Technology, 2020, 35(5): 1147-1174

Maintaining software once implemented on the end-user side is laborious and, over its lifetime, is most often considerably more expensive than the initial software development. The prediction of software maintainability has emerged as an important research topic to address industry expectations for reducing costs, in particular, maintenance costs. Researchers and practitioners have been working on proposing and identifying a variety of techniques ranging from statistical to machine learning (ML) for better prediction of software maintainability. This review has been carried out to analyze the empirical evidence on the accuracy of software product maintainability prediction (SPMP) using ML techniques. This paper analyzes and discusses the findings of 77 selected studies published from 2000 to 2018 according to the following criteria: maintainability prediction techniques, validation methods, accuracy criteria, overall accuracy of ML techniques, and the techniques offering the best performance. The review process followed the well-known systematic review process. The results show that ML techniques are frequently used in predicting maintainability. In particular, artificial neural network (ANN), support vector machine/regression (SVM/R), regression & decision trees (DT), and fuzzy & neuro fuzzy (FNF) techniques are more accurate in terms of PRED and MMRE. The N-fold and leave-one-out cross-validation methods, and the MMRE and PRED accuracy criteria are frequently used in empirical studies. In general, ML techniques outperformed non-machine learning techniques, e.g., regression analysis (RA) techniques, while FNF outperformed SVM/R, DT, and ANN in most experiments. However, while many techniques were reported superior, no specific one can be identified as the best.
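
The two accuracy criteria named in this abstract can be illustrated with a short sketch. It assumes the commonly used definitions (MMRE as the mean of |actual - predicted| / |actual|, and PRED(q) as the fraction of predictions whose relative error is at most q); the function names and the sample values are made up for illustration and do not come from the reviewed studies.

```python
from typing import Sequence


def mre(actual: float, predicted: float) -> float:
    """Magnitude of relative error for a single observation."""
    return abs(actual - predicted) / abs(actual)


def mmre(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean magnitude of relative error over all observations (lower is better)."""
    errors = [mre(a, p) for a, p in zip(actual, predicted)]
    return sum(errors) / len(errors)


def pred(actual: Sequence[float], predicted: Sequence[float], q: float = 0.25) -> float:
    """PRED(q): share of predictions whose relative error is <= q (higher is better)."""
    errors = [mre(a, p) for a, p in zip(actual, predicted)]
    return sum(1 for e in errors if e <= q) / len(errors)


# Hypothetical maintainability measurements and model predictions.
actual = [10.0, 20.0, 40.0, 50.0]
predicted = [12.0, 18.0, 20.0, 50.0]

print(f"MMRE = {mmre(actual, predicted):.2f}")            # relative errors: 0.2, 0.1, 0.5, 0.0
print(f"PRED(0.25) = {pred(actual, predicted, 0.25):.2f}")
```

A model can score well on one criterion and poorly on the other, which is presumably why studies tend to report both.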


5.
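This paper builds two quantitative prediction models for the acid dissociation constant (pKa) of 180 phenyl-containing carboxylic acid compounds. The compounds have molecular weights between 122.12 and 288.34 and contain the elements H, C, N, O, S, F, Cl, Br, and I. The Cerius2 program was used to calculate 236 molecular descriptors to represent these compounds, and 12 of them were selected using statistical methods. Multiple linear regression (MLR) and support vector machine (SVM) regression, each combined with 10-fold cross-validation, were used to predict the pKa values. The MLR model predicted pKa with a correlation coefficient of 0.90 and a standard deviation of 0.32; the SVM model gave better results, with a correlation coefficient of 0.91 and a standard deviation of 0.31.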

6.
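Using 2D-autocorrelation descriptors as structural parameters, variables were selected with PSO and stepwise regression and then combined with SVM and other machine learning algorithms in a quantitative structure-activity relationship (QSAR) study of the inhibitory activity of 28 phenylpropene salt compounds against EBV-EA. The results show that the PSO-v-SVM model has the best robustness and predictive performance. The five 2D-autocorrelation descriptors selected by PSO for this model are ATS5v, ATS6e, ATS8e, ATS3p, and GATS5p. For this model, the correlation coefficients for the training set fit (R²) and for leave-one-out cross-validation (q²(cv)) are 0.986 and 0.930, respectively, and the correlation coefficient for test set prediction (R²(ext)) reaches 0.955. An analysis of the physicochemical meaning of the five variables indicates that polarizability, van der Waals volume, and electronegativity account for approximately 57.13%, 15.90%, and 26.97%, respectively, of the influence on the inhibitory activity of these compounds.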

7.
8.
Support vector machine: A tool for mapping mineral prospectivity   (total citations: 1; self-citations: 0; citations by others: 1)
In this contribution, we describe an application of support vector machine (SVM), a supervised learning algorithm, to mineral prospectivity mapping. The free R package e1071 is used to construct an SVM with a sigmoid kernel function to map prospectivity for Au deposits in western Meguma Terrain of Nova Scotia (Canada). The SVM classification accuracies of ‘deposit’ are 100%, and the SVM classification accuracies of ‘non-deposit’ are greater than 85%. The SVM classifications of mineral prospectivity have 5-9% lower total errors, 13-14% higher false-positive errors and 25-30% lower false-negative errors compared to those of the WofE prediction. The prospective target areas predicted by both SVM and WofE reflect, nonetheless, controls of Au deposit occurrence in the study area by NE-SW trending anticlines and contact zones between Goldenville and Halifax Formations. The results of the study indicate the usefulness of SVM as a tool for predictive mapping of mineral prospectivity.

9.
Support vector machines (SVM) and other machine-learning (ML) methods have been explored as ligand-based virtual screening (VS) tools for facilitating lead discovery. While exhibiting good hit selection performance, in screening large compound libraries these methods tend to produce lower hit rates than the best performing VS tools, partly because their training sets contain a limited spectrum of inactive compounds. We tested whether the performance of SVM can be improved by using training sets of diverse inactive compounds. In retrospective database screening of active compounds of single mechanism (HIV protease inhibitors, DHFR inhibitors, dopamine antagonists) and multiple mechanisms (CNS active agents) from large libraries of 2.986 million compounds, the yields, hit rates, and enrichment factors of our SVM models are 52.4–78.0%, 4.7–73.8%, and 214–10,543, respectively, compared to those of 62–95%, 0.65–35%, and 20–1200 by structure-based VS and 55–81%, 0.2–0.7%, and 110–795 by other ligand-based VS tools in screening libraries of ≥1 million compounds. The hit rates are comparable to, and the enrichment factors are substantially better than, the best results of other VS tools. 24.3–87.6% of the predicted hits are outside the known hit families. SVM appears to be potentially useful for facilitating lead discovery in VS of large compound libraries.
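
To make the three screening measures concrete, here is a minimal sketch. It assumes the usual definitions (yield as the share of known actives recovered, hit rate as the share of selected compounds that are active, and enrichment factor as the hit rate divided by the fraction of actives in the library), which the abstract itself does not spell out. The function name and all counts are hypothetical and are not taken from the study.

```python
def screening_metrics(n_library: int, n_actives: int, n_selected: int, n_true_hits: int):
    """Return (yield, hit rate, enrichment factor) for one virtual screening run.

    n_library   -- total compounds screened
    n_actives   -- known active compounds in the library
    n_selected  -- compounds the model flags as hits
    n_true_hits -- flagged compounds that are actually active
    """
    yield_ = n_true_hits / n_actives      # share of actives that were recovered
    hit_rate = n_true_hits / n_selected   # share of flagged compounds that are active
    base_rate = n_actives / n_library     # hit rate expected from random selection
    enrichment = hit_rate / base_rate     # improvement over random selection
    return yield_, hit_rate, enrichment


# Hypothetical run: 1,000,000 compounds, 100 actives, 500 flagged, 60 of them active.
y, h, ef = screening_metrics(1_000_000, 100, 500, 60)
print(f"yield = {y:.1%}, hit rate = {h:.1%}, enrichment factor = {ef:.0f}")
```

Because the enrichment factor divides by the base rate of actives, it grows quickly as libraries get larger and actives get rarer, so values are best compared between screens of similar size.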

10.
Wu Shuai, Wang Xiong, Duan Yunfeng. Microcomputer Information (《微计算机信息》), 2007, 23(12): 163-165
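The support vector machine (SVM) data mining method is used to predict customer churn propensity in the mobile communications industry, and the SVM predictions are compared with those of a decision tree algorithm. The results show that SVM classifies the attribute data selected in this paper more effectively and that the prediction model remains fairly stable across different training set sizes. Experiments confirm that selecting 22.31% of all customers with the proposed model identifies 50.07% of the customers who churn, which indicates that the model has practical value.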
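
The two percentages reported in this abstract can be turned into a lift figure, a common way of reading such results. The sketch below assumes lift is defined as the share of churners captured divided by the share of customers targeted; the function name is illustrative, and the abstract itself does not report a lift value.

```python
def lift(fraction_targeted: float, fraction_captured: float) -> float:
    """Lift over random targeting: captured share divided by targeted share."""
    return fraction_captured / fraction_targeted


# Figures from the abstract: 22.31% of customers selected, 50.07% of churners found.
print(round(lift(0.2231, 0.5007), 2))  # 2.24
```

Under that definition, the model finds churners about 2.24 times as efficiently as contacting the same number of customers at random.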

11.

12.
In this paper, we compare some traditional statistical methods for predicting financial distress to some more “unconventional” methods, such as decision tree classification, neural networks, and evolutionary computation techniques, using data collected from 200 Taiwan Stock Exchange Corporation (TSEC) listed companies. Empirical experiments were conducted using a total of 42 ratios including 33 financial, 8 non-financial and 1 combined macroeconomic index, using principal component analysis (PCA) to extract suitable variables. This paper makes four critical contributions: (1) with nearly 80% fewer financial ratios after applying the PCA method, the prediction performance is still able to provide highly accurate forecasts of financial bankruptcy; (2) we show that traditional statistical methods are better able to handle large datasets without sacrificing prediction performance, while intelligent techniques achieve better performance with smaller datasets and would be adversely affected by huge datasets; (3) empirical results show that C5.0 and CART provide the best prediction performance for imminent bankruptcies; and (4) Support Vector Machines (SVMs) with evolutionary computation provide a good balance of high-accuracy short- and long-term performance predictions for healthy and distressed firms. Therefore, the experimental results show that the Particle Swarm Optimization (PSO) integrated with SVM (PSO-SVM) approach could be considered for predicting potential financial distress.

13.
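Proteins that interact with nucleic acids play extremely important roles in many aspects of gene function, and predicting whether a protein interacts with nucleic acids has attracted wide attention in bioinformatics. In this paper, amino acid composition, the physicochemical properties of amino acids, and protein structure information are used as feature parameters, and the support vector machine method is used to predict proteins that interact with nucleic acids. Three protein datasets, covering proteins that interact with rRNA, RNA, and DNA, were each used to train an SVM, select the optimal kernel function, optimize the kernel parameters, and build a classification model, which was then used to predict whether a protein interacts with nucleic acids. The results show that, even for proteins with sequence similarity below 40%, 10-fold cross-validation on the three datasets gives prediction accuracies of 93.75%, 83.41%, and 81.85%, respectively. Testing the resulting models on external test sets gives prediction accuracies of 93.8%, 84.2%, and 81.9%, respectively. On this basis, we built an online software system for predicting whether a protein interacts with nucleic acids, available at http://chemdata.shu.edu.cn/protein_na.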

14.
Bankruptcy prediction has drawn considerable research interest, and recent studies have shown that machine learning techniques achieve better performance than traditional statistical ones. This paper applies support vector machines (SVMs) to the bankruptcy prediction problem in an attempt to suggest a new model with better explanatory power and stability. To serve this purpose, we use a grid-search technique with 5-fold cross-validation to find the optimal parameter values of the SVM kernel function. In addition, to evaluate the prediction accuracy of SVM, we compare its performance with those of multiple discriminant analysis (MDA), logistic regression analysis (Logit), and three-layer fully connected back-propagation neural networks (BPNs). The experimental results show that SVM outperforms the other methods.
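
The grid search with 5-fold cross-validation mentioned here can be sketched roughly as follows. This is not the paper's implementation: to stay self-contained it replaces the SVM with a toy RBF-weighted voting classifier, and the data, the `gamma` grid, and every function name are assumptions made only for illustration.

```python
import itertools
import math
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def rbf_vote_classifier(train, gamma):
    """Toy stand-in for an SVM: classify by the sign of an RBF-weighted label vote."""
    def predict(x):
        score = sum(
            y * math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, xi)))
            for xi, y in train
        )
        return 1 if score >= 0 else -1
    return predict


def cv_accuracy(data, k, **params):
    """Average held-out accuracy over k folds for one parameter setting."""
    correct = 0
    for fold in k_fold_indices(len(data), k):
        held_out = set(fold)
        train = [d for i, d in enumerate(data) if i not in held_out]
        predict = rbf_vote_classifier(train, **params)
        correct += sum(predict(data[i][0]) == data[i][1] for i in fold)
    return correct / len(data)


def grid_search(data, grid, k=5):
    """Try every parameter combination and keep the one with the best CV accuracy."""
    names = sorted(grid)
    best = None
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = cv_accuracy(data, k, **params)
        if best is None or score > best[1]:
            best = (params, score)
    return best


# Synthetic two-class data (hypothetical): two Gaussian clusters labelled -1 and +1.
rng = random.Random(1)
data = [((rng.gauss(c, 1.0), rng.gauss(c, 1.0)), label)
        for c, label in ((-2.0, -1), (2.0, 1)) for _ in range(25)]

best_params, best_score = grid_search(data, {"gamma": [0.01, 0.1, 1.0, 10.0]})
print(best_params, best_score)
```

Scoring each setting only on held-out folds is the point of the design: a kernel parameter that merely memorizes the training data does not get selected.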

15.
16.
We previously investigated the classification and prediction of dopamine D1 receptor agonists and antagonists using a topological fragment spectra (TFS)-based support vector machine (SVM), in which the dataset contained noise compounds that had no D1 receptor activity. This work extended the dataset to seven activity classes (dopamine D1, D2, and auto-receptor agonists, and D1, D2, D3, and D4 antagonists) and increased the noise ratio to ten times that of active compounds. In total, this study used 16,008 compounds for training and 1,779 compounds for prediction. The TFS-based SVM gave good, stable results for both classification and prediction, even in the case that included ten times the noise data. The resulting model correctly predicted 97.6% of the prediction set of 1,779 compounds.

17.
Predicting defect-prone software modules using support vector machines   (total citations: 2; self-citations: 0; citations by others: 2)
Effective prediction of defect-prone software modules can enable software developers to focus quality assurance activities and allocate effort and resources more efficiently. Support vector machines (SVM) have been successfully applied for solving both classification and regression problems in many applications. This paper evaluates the capability of SVM in predicting defect-prone software modules and compares its prediction performance against eight statistical and machine learning models in the context of four NASA datasets. The results indicate that the prediction performance of SVM is generally better than, or at least competitive with, the compared models.

18.
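Cytochrome P450 2C9 (CYP2C9) is an important xenobiotic-metabolizing enzyme in the liver. Many drugs and chemicals can inhibit or interfere with its activity, so predicting drug-drug interactions based on CYP2C9 inhibition early in drug discovery is important for screening and discovering new drugs. This paper aims to build a prediction model for CYP2C9 inhibitors and to identify the parameters that differ significantly between inhibitors and non-inhibitors. A dataset of 81 compounds was selected; 64 of them were randomly chosen as the training set and the rest formed the validation set. A total of 250 molecular parameters were used to represent the compounds numerically. Stepwise discriminant analysis and K-means cluster analysis were used to build mathematical models, and the validation set was used to test their predictive ability. The results show that, on the training set, accuracy was 96.4% for inhibitors and 97.2% for non-inhibitors; on the validation set, accuracy was 85.7% for inhibitors and 90.0% for non-inhibitors. With K-means clustering, accuracy for inhibitors and non-inhibitors reached 82.9% and 86.9%, respectively. Further analysis of the results identified the parameters contributing most to the model as the electrotopological state indices of amino and alkenyl groups in the molecule, the number of carbocyclic rings, and hydrophobicity parameters. These parameters are important for distinguishing the structural differences between inhibitors and non-inhibitors and for guiding the screening and discovery of CYP2C9 inhibitors.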

19.
Financial distress prediction is an important and widely researched issue because of its potential significant influence on bank lending decisions and profitability. Since the 1970s, many mathematical and statistical researchers have proposed prediction models on such issues. Given the recent vigorous growth of artificial intelligence (AI) and data mining techniques, many researchers have begun to apply those techniques to the problem of bankruptcy prediction. Among these techniques, the support vector machine (SVM) has been applied successfully and has performed well in comparisons with other AI and statistical methods. Particle swarm optimization (PSO) has been increasingly employed in conjunction with AI techniques and has provided reliable optimization capability. However, research addressing PSO and SVM integration is scarce, although there is great potential for useful applications in this field. This paper proposes an adaptive inertia weight (AIW) method for improving PSO performance and integrates it with SVM in two aspects: feature subset selection and parameter optimization. The experiments collected 54 listed companies as initial samples from American bank datasets. The proposed adaptive PSO-SVM approach could be a more suitable methodology for predicting potential financial distress. This approach also proves its capability to handle scalable and non-scalable function problems.

20.