1.

南世慧魏伟吴华清邹金蓉赵志文《计算机科学》2018,45(8):141-145

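Existing Web server fingerprinting methods readily give inaccurate results when response headers have been tampered with, and existing machine-learning-based methods must send a large number of requests in advance. To address these problems, the authors analyze the relationships among response-header features and propose a Web server fingerprinting algorithm based on KNN and GBDT. It needs to send only two different types of abnormal requests to identify the fingerprint type and version range of the corresponding Web server. Comparative experiments against existing Web server fingerprinting algorithms show that the proposed algorithm improves both identification speed and accuracy.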

2.

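Synergistic Drug Prediction Based on Heterogeneous Network Features and Gradient Boosting Decision Trees (基于异构网络特征与梯度提升决策树的协同药物预测)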

聂丽霞刘辉邹凌《计算机应用与软件》2020,37(4):48-52

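Drug combinations play an increasingly important role in treating complex diseases, especially cancer. Taking the targets of a drug combination as the initial nodes, a random walk with restart is run on a drug-protein heterogeneous network; the converged probability distribution serves as the feature vector of the drug combination, and a gradient boosting decision tree model is trained to predict new drug combinations. Evaluation on a standard drug combination dataset shows that the method outperforms seven other typical classifiers and traditional boosting algorithms, and that the heterogeneous-network features markedly improve the performance of every classifier, raising the AUC from 0.528 to 0.909.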
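The random walk with restart described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name, the toy four-node adjacency matrix, the restart probability of 0.3, and the convergence tolerance are all assumptions chosen for illustration. The walk starts from the seed nodes (standing in for drug targets), and the converged visiting probabilities play the role of the feature vector.

```python
def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Return converged visiting probabilities for a walk that restarts at `seeds`."""
    n = len(adj)
    # Column-normalise so trans[i][j] is the probability of stepping from node j to node i.
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    trans = [[adj[i][j] / col_sums[j] if col_sums[j] else 0.0
              for j in range(n)] for i in range(n)]
    # Initial distribution: uniform over the seed nodes.
    p0 = [1.0 / len(seeds) if i in seeds else 0.0 for i in range(n)]
    p = p0[:]
    for _ in range(max_iter):
        # One step: follow an edge with probability (1 - restart), else jump back to a seed.
        nxt = [(1.0 - restart) * sum(trans[i][j] * p[j] for j in range(n))
               + restart * p0[i] for i in range(n)]
        converged = sum(abs(a - b) for a, b in zip(nxt, p)) < tol
        p = nxt
        if converged:
            break
    return p


# Hypothetical toy network: a path 0 - 1 - 2 - 3, with node 0 as the only seed.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
features = random_walk_with_restart(adjacency, seeds={0})
```

In this toy case the probabilities sum to one and decay with distance from the seed, which is the property that makes the converged vector usable as a network-aware feature.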

3.

《Applied Soft Computing》2019

Insider trading is a kind of criminal behavior in the stock market that exploits nonpublic information. In recent years, it has become the major illegal activity in China’s stock market. In this study, a combined approach of GBDT (Gradient Boosting Decision Tree) and DE (Differential Evolution) is proposed to identify insider trading activities using data on relevant indicators. First, insider trading samples from 2007 to 2017 and corresponding non-insider trading samples are collected. Next, the proposed model is trained with GBDT, and the initial parameters of the GBDT are optimized by DE. Finally, out-of-sample cases are classified by the trained GBDT–DE model and its performance is evaluated. The experimental results show that the proposed method performs best for insider trading identification with a time window length of 90 days, indicating that the relevant indicators under the 90-day time window are relatively more useful. Additionally, under all three time window lengths, the relative importance results show that several indicators are consistently crucial for insider trading identification. Furthermore, the proposed approach significantly outperforms other benchmark methods, demonstrating that it could be applied as an intelligent system to improve identification accuracy and efficiency for insider trading regulation in China's stock market.

4.

Parallel Regression Prediction of Power-Supply Coal Consumption Based on the Spark Platform (基于spark平台的供电煤耗并行回归预测)

李偲希白全生舒畅肖祥武《电力大数据》2021,24(11):85-92

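Thermal power plant data are large in volume and complex. Using a Spark-based parallel regression algorithm, this work addresses the long running time and low prediction accuracy of traditional regression models for power-supply coal consumption. The paper uses one year of operating data from a power plant collected on a big data platform. The data are cleaned and preprocessed, including outlier screening and missing-value filling; the operating conditions are checked for stability, and healthy data under stable conditions are selected for analysis. Grey relational analysis is then used to select the 12 features with the highest relational degree for predicting the plant's power-supply coal consumption. Through parameter tuning, Spark-based parallel regression models of power-supply coal consumption are built with random forests and gradient boosting decision trees, and the experimental results are compared, analyzed, and summarized. The results show that both the random forest regression model and the gradient boosting decision tree regression model predict power-supply coal consumption well, but the random forest regression model is comparatively more accurate.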

5.

Yanting Li 《SECURITY AND PRIVACY》2022,5(1):e190

6.

Federated Learning Method Based on Differentially Private Knowledge Transfer (基于差分隐私保护知识迁移的联邦学习方法)

徐晨阳葛丽娜王哲周永权秦霞田蕾《计算机应用研究》2023,40(8)

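Federated learning solves the data-silo problem in machine learning. However, the parties' datasets may differ considerably in sample space and feature space, which lowers the prediction accuracy of the federated model. To address this problem, a federated learning method based on differentially private knowledge transfer is proposed. The method uses boundary-expanding locality-sensitive hashing to compute the similarity between instances held by different parties and weights instances by similarity during training, achieving instance-based federated transfer learning. During this process the instances themselves need not be disclosed to other parties, which prevents direct privacy leakage. To reduce indirect privacy leakage during knowledge transfer, a differential privacy mechanism is also introduced that perturbs the gradient data transmitted between parties, protecting privacy during knowledge transfer. Theoretical analysis shows that the knowledge transfer process satisfies ε-differential privacy. The method is implemented on the XGBoost gradient boosted tree model, and experiments show that, compared with methods without knowledge transfer, it reduces the test error of the federated model by more than 6% on average.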
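The abstract does not say which differential privacy mechanism perturbs the transmitted gradients, so the following is only a sketch of one standard option, the Laplace mechanism, applied to a sum of clipped gradients. The function name, the clipping bound, and the choice to release a single sum are assumptions for illustration. Under the replace-one-record notion of neighboring datasets, a sum of values clipped to [-clip, clip] has sensitivity 2 * clip, so Laplace noise with scale 2 * clip / epsilon yields an ε-differentially private release of that sum.

```python
import random


def private_gradient_sum(gradients, clip, epsilon, rng=None):
    """Release the sum of clipped gradients with Laplace noise (epsilon-DP for this sum)."""
    if clip <= 0 or epsilon <= 0:
        raise ValueError("clip and epsilon must be positive")
    rng = rng or random.Random()
    # Clip each per-instance gradient so one record changes the sum by at most 2 * clip.
    clipped = [max(-clip, min(clip, g)) for g in gradients]
    sensitivity = 2.0 * clip
    scale = sensitivity / epsilon
    # The difference of two i.i.d. exponentials with mean `scale` is Laplace(0, scale).
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return sum(clipped) + noise


# Hypothetical gradients; the second and third are clipped to -1.0 and 1.0.
noisy = private_gradient_sum([0.5, -3.0, 2.0], clip=1.0, epsilon=1.0,
                             rng=random.Random(0))
```

A smaller epsilon gives stronger privacy and a larger noise scale; each additional released statistic consumes further privacy budget, which this sketch does not track.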

7.

GBDT-Based Anomaly Detection for Satellite Engineering Parameters (基于GBDT的卫星工程参数异常检测)

马文臻王爱玲李旭东黎建辉邹自明李云龙《计算机系统应用》2022,31(1):138-144

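Diagnosing in-orbit anomalies of satellites and their payloads is an important support for efficient and safe satellite operation, and developing intelligent, efficient satellite anomaly detection methods is one of the research foci of satellite ground systems. Against the application background of the satellite missions in China's Strategic Priority Program on Space Science, and according to the data characteristics and anomaly patterns of space science satellites, the work builds, on the principle of the gradient boosting decision tree (GBDT), a satellite engineering...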

8.

Decision Tree Classification of Network Traffic (网络流量的决策树分类) (total citations: 1; self-citations: 1; citations by others: 1)

王宇余顺争《小型微型计算机系统》2009,30(11)

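Application identification and traffic classification are prerequisites for network management, security, research, and related tasks. With the rapid growth of networks and the continual emergence of new applications, classification techniques based on transport-layer port numbers and deep packet inspection can no longer meet demand. This paper verifies that the statistical properties of network traffic can effectively distinguish different applications, proposes a supervised network traffic classification method based on the C4.5 decision tree classifier, and discusses two improvements: boosting and feature selection. Experimental results show that the C4.5 classifier has moderate training complexity, high accuracy, and fast classification speed. Boosting further improves the classifier's accuracy at the cost of a large increase in training time and a slight slowdown in classification, while feature selection speeds up classification and slightly lowers accuracy.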

9.

Research on Privacy-Preserving Encrypted Traffic Detection (隐私保护的加密流量检测研究)

张心语张秉晟孟泉润任奎《网络与信息安全学报》2021,7(4):101-113

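Existing encrypted traffic detection techniques lack privacy protection for data and models, which not only violates privacy laws and regulations but can also lead to serious leakage of sensitive information. The work studies an encrypted traffic detection model based on the gradient boosting decision tree (GBDT) algorithm and, combining it with differential privacy, designs and implements a privacy-preserving encrypted traffic detection system. Malicious traffic from DDoS attacks and port scans is detected on the CICIDS2017 dataset, and the system's performance...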

10.

Detection of Metamorphic Macro Viruses Based on Gradient Boosting Decision Trees (基于梯度提升决策树的变形宏病毒检测)

闫华刘嘉位凯志古亮《计算机系统应用》2021,30(5):39-46

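Macro viruses are widely used in advanced persistent threats. They are cheap to mutate and can be mutated in flexible ways, which makes it hard for traditional antivirus systems based on virus rule libraries to counter them effectively. A metamorphic macro virus detection method based on gradient boosting decision trees is proposed. Guided by the experience of virus experts, the method carries out large-scale feature engineering, builds fine-grained models of metamorphic macro viruses through lexical analysis, and trains the model on massive numbers of samples. Experiments show that the method can accurately detect ... spreading in enterprise user networks...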

11.

《Advanced Engineering Informatics》2020

12.

Covariance-guided One-Class Support Vector Machine

Naimul Mefraz Khan Riadh Ksantini Imran Shafiq Ahmad Ling Guan 《Pattern recognition》2014

In one-class classification, the low variance directions in the training data carry crucial information to build a good model of the target class. Boundary-based methods like One-Class Support Vector Machine (OSVM) preferentially separate the data from outliers along the large variance directions. On the other hand, retaining only the low variance directions can result in sacrificing some initial properties of the original data and is not desirable, especially in the case of limited training samples. This paper introduces a Covariance-guided One-Class Support Vector Machine (COSVM) classification method which emphasizes the low variance projectional directions of the training data without compromising any important characteristics. COSVM improves upon the OSVM method by controlling the direction of the separating hyperplane through incorporation of the estimated covariance matrix from the training data. Our proposed method is a convex optimization problem resulting in one global optimum solution which can be solved efficiently with the help of existing numerical methods. The method also keeps the principal structure of the OSVM method intact, and can be implemented easily with the existing OSVM libraries. Comparative experimental results with contemporary one-class classifiers on numerous artificial and benchmark datasets demonstrate that our method results in significantly better classification performance.

13.

MultiBoosting: A Technique for Combining Boosting and Wagging (total citations: 12; self-citations: 0; citations by others: 12)

Webb Geoffrey I. 《Machine Learning》2000,40(2):159-196

MultiBoosting is an extension to the highly successful AdaBoost technique for forming decision committees. MultiBoosting can be viewed as combining AdaBoost with wagging. It is able to harness both AdaBoost's high bias and variance reduction with wagging's superior variance reduction. Using C4.5 as the base learning algorithm, MultiBoosting is demonstrated to produce decision committees with lower error than either AdaBoost or wagging significantly more often than the reverse over a large representative cross-section of UCI data sets. It offers the further advantage over AdaBoost of suiting parallel execution.

14.

Intranet Defense Algorithm Based on Pseudo Gradient Boosting Decision Trees (基于伪梯度提升决策树的内网防御算法)

厉柏伸李领治孙涌朱艳琴《计算机科学》2018,45(4):157-162

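Drawing on the idea of the TF-IDF algorithm, the paper proposes feature frequency, forest frequency, and a pseudo gradient boosting decision tree, addressing the problem that erroneous data become marginalized as the number of iterations of a gradient boosting decision tree grows. In the pseudo gradient boosting decision tree, every decision tree is generated on a bootstrapped version of the original dataset, so the dataset need not be resampled at each iteration. Intranet defense experiments on a distributed cluster show that, on training sets of a certain scale, the pseudo gradient boosting decision tree achieves better prediction accuracy.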

15.

Gravity-Based Outlier Detection Algorithm (基于引力的孤立点检测算法)

孟建良姚亮程伟想《计算机应用与软件》2009,26(1)

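A gravity-based outlier detection algorithm is proposed. It mines the outliers hidden in a dataset by jointly considering how factors such as the density around a data object and the distances between data objects affect the definition of an outlier. The concepts and techniques related to the algorithm are given, the algorithm is described in detail, and experiments are run on real data. The experiments show that the algorithm scales well with the dimensionality of the dataset, identifies outliers effectively, and reflects how isolated each data object is within the dataset.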

16.

Minimizing False Positives of a Decision Tree Classifier for Intrusion Detection on the Internet (total citations: 1; self-citations: 0; citations by others: 1)

Satoru Ohta Ryosuke Kurebayashi Kiyoshi Kobayashi 《Journal of Network and Systems Management》2008,16(4):399-419

Machine learning or data mining technologies are often used in network intrusion detection systems. An intrusion detection system based on machine learning utilizes a classifier to infer the current state from the observed traffic attributes. The problem with learning-based intrusion detection is that it leads to false positives and so incurs unnecessary additional operation costs. This paper investigates a method to decrease the false positives generated by an intrusion detection system that employs a decision tree as its classifier. The paper first points out that the information-gain criterion used in previous studies to select the attributes in the tree-constructing algorithm is not effective in achieving low false positive rates. Instead of the information-gain criterion, this paper proposes a new function that evaluates the goodness of an attribute by considering the significance of error types. The proposed function can successfully choose an attribute that suppresses false positives from the given attribute set and the effectiveness of using it is confirmed experimentally. This paper also examines the more trivial leaf rewriting approach to benchmark the proposed method. The comparison shows that the proposed attribute evaluation function yields better solutions than the leaf rewriting approach.

17.

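Application of an Improved Decision Tree Algorithm to Potential Customer Acquisition (改进的决策树算法在潜在客户获取中的应用) (total citations: 1; self-citations: 0; citations by others: 1)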

赵华宋顺林《计算机工程与应用》2005,41(11):196-198

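In corporate marketing, targeting potential customers can save considerable expense and increase profit. This paper applies an improved decision tree algorithm that incorporates the boosting idea to mine and predict potential customer groups, and proposes a reasonable and feasible data mining process for acquiring potential customers to guide a company's marketing decisions. Experimental results show that the method has good theoretical and practical value.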

18.

A Hybrid Multi-Concept Acquisition System (一种混合型多概念获取系统)

高阳刘海涛周志华陈兆乾《软件学报》2000,11(4):453-460

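The article implements HMCAS (hybrid multi-concept acquisition system). With either discrete-valued or continuous-valued inputs, HMCAS supports incremental supervised learning. Its core algorithm, HMCAP, is based on the probability distribution of the instance space, combines symbolic learning with neural network learning, and produces concept descriptions in the form of hybrid decision trees. A prototype of HMCAS has been applied successfully to typhoon forecasting.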

19.

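Unsupervised Transfer Learning Boosting Method with Optimized Weights under Multi-Source Domain Distributions (多源域分布下优化权重的无监督迁移学习Boosting方法)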

李赟波王士同《计算机应用研究》2023,40(2)

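The deep decision tree transfer learning Boosting method (DTrBoost) can effectively transfer knowledge to a target domain in the supervised single-source setting, but it cannot handle unsupervised transfer with multiple source domains. To address this, an unsupervised transfer learning Boosting method with optimized weights under multi-source domain distributions is proposed. The main idea is to compute a KL value for each source domain from its distribution and that of the target domain, then use the comparison to select a suitable number of samples from the different source domains, train a classifier, and assign pseudo-labels to the target-domain samples. Finally, different learning weights are assigned according to each source domain's KL distance, and the labeled samples of all source domains are trained together with the pseudo-labeled target domain in an ensemble to obtain the final result. Comparative experiments show that the proposed algorithm achieves better classification accuracy and adapts to different datasets: the classification error rate falls by 2.4% on average, and by more than 6% on the marketing dataset, where the effect is strongest.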
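The abstract states that source domains are weighted by their KL distance to the target domain but does not give the formula. The sketch below is therefore a hypothetical illustration, not the paper's scheme: it assumes each domain is summarized as a discrete histogram over the same bins, computes KL(target || source), and maps the divergences to normalized weights with exp(-KL), so that closer source domains receive larger weights. All names and the toy histograms are assumptions.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same bins."""
    # eps guards against division by zero when a source bin is empty.
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0)


def source_weights(target_hist, source_hists):
    """Assumed weighting: weight of each source is proportional to exp(-KL)."""
    scores = [math.exp(-kl_divergence(target_hist, s)) for s in source_hists]
    total = sum(scores)
    return [s / total for s in scores]


# Hypothetical histograms: the first source matches the target, the second does not.
target = [0.5, 0.5]
sources = [[0.5, 0.5], [0.9, 0.1]]
weights = source_weights(target, sources)
```

Any monotonically decreasing map from divergence to weight would serve the same illustrative purpose; exp(-KL) is used here only because it keeps the weights positive and bounded.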

20.

Short-Term Traffic Flow Prediction Model Based on Gradient Boosting Regression Trees (基于梯度提升回归树的短时交通流预测模型)

沈夏炯张俊涛韩道军《计算机科学》2018,45(6):222-227, 264

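Short-term traffic flow prediction is an important part of traffic flow modeling and plays an important role in managing and controlling urban road traffic. However, the prediction accuracy of common time series models (such as ARIMA) and random forest (RF) models is limited by the residuals produced when the model is built and by the input variables. To address this problem, a short-term traffic prediction model based on gradient boosting regression trees is proposed to predict traffic speed. First, the model introduces the Huber loss function to handle model residuals. Second, the input variables account for the spatial correlation with adjacent road sections and the temporal correlation that affect the predicted section. During training, the model repeatedly adjusts the weights of the weak learners to correct its residuals, which improves prediction accuracy. Experiments use traffic speed data from an urban expressway and compare the model with ARIMA and random forest models using metrics such as MSE and MAPE. The results show that the proposed model achieves the best prediction accuracy, confirming its effectiveness for short-term traffic flow prediction.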
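For readers unfamiliar with the Huber loss mentioned in this abstract, here is a minimal sketch of its standard definition and of the pseudo-residual (negative gradient) that a gradient boosting regressor fits at each stage. The threshold delta = 1.0 and the function names are assumptions; the paper's own parameter choices are not given in the abstract.

```python
def huber_loss(y, pred, delta=1.0):
    """Quadratic for small residuals, linear for large ones."""
    r = y - pred
    if abs(r) <= delta:
        return 0.5 * r * r
    return delta * (abs(r) - 0.5 * delta)


def huber_pseudo_residual(y, pred, delta=1.0):
    """Negative gradient of the Huber loss with respect to the prediction."""
    r = y - pred
    if abs(r) <= delta:
        return r
    # Large residuals are capped at +/- delta, which limits the pull of outliers.
    return delta if r > 0 else -delta


small = huber_loss(1.0, 0.5)             # 0.125, quadratic branch
large = huber_loss(5.0, 0.0)             # 4.5, linear branch
capped = huber_pseudo_residual(5.0, 0.0)  # 1.0, capped at delta
```

Because the pseudo-residual is capped, a single extreme speed reading influences the next tree no more than a residual of size delta would, which is the usual motivation for preferring Huber loss over squared error on noisy data.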