
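Traffic Classification Framework Based on Ensemble Clustering
Cite this article:LU Gang, YU Xiang-Zhan, ZHANG Hong-Li, GUO Rong-Hua. Traffic Classification Framework Based on Ensemble Clustering[J]. Journal of Software (软件学报), 2016, 27(11): 2870-2883.
Authors:LU Gang  YU Xiang-Zhan  ZHANG Hong-Li  GUO Rong-Hua
Affiliation:Chinese Luoyang Electronic Equipment Center, Luoyang, Henan 471003; School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang 150001; School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang 150001; Chinese Luoyang Electronic Equipment Center, Luoyang, Henan 471003
Funding:National Natural Science Foundation of China (61303061, 61402485); Open Project of the State Key Laboratory of High Performance Computing (201513-01)
Abstract (translated from the Chinese):Traffic classification is the foundation of, and the key to, optimizing network quality of service. Machine learning algorithms classify traffic using the statistical features of data flows, which matters greatly for identifying encrypted traffic and traffic of proprietary protocols. However, feature bias and class imbalance are two major challenges for machine learning based traffic classification. Feature bias means that some flow statistical features raise the identification accuracy for some applications while lowering it for others. Class imbalance means that a machine learning traffic classifier identifies applications with few samples with low accuracy. To solve these problems, a traffic classification framework based on ensemble clustering (TCFEC) is proposed. TCFEC consists of several base classifiers, each built by clustering in a different feature subspace, and an optimal decision component, and it improves the accuracy of traffic classification. Specifically, compared with traditional machine learning traffic classifiers, TCFEC improves average flow accuracy by up to 5% and byte accuracy by up to 6%.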

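Keywords (translated from the Chinese):traffic classification framework based on ensemble clustering  ensemble clustering  traffic classification  flow features  machine learning
Received:2015-03-16
Revised:2015-04-07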

Traffic Classification Framework Based on Ensemble Clustering
LU Gang, YU Xiang-Zhan, ZHANG Hong-Li and GUO Rong-Hua. Traffic Classification Framework Based on Ensemble Clustering[J]. Journal of Software, 2016, 27(11): 2870-2883.
Authors:LU Gang, YU Xiang-Zhan, ZHANG Hong-Li and GUO Rong-Hua
Affiliation:Chinese Luoyang Electronic Equipment Center, Luoyang 471003, China; School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China; School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China; Chinese Luoyang Electronic Equipment Center, Luoyang 471003, China
Abstract:Traffic classification is the basis of, and the key to, optimizing network quality of service. Machine learning algorithms classify traffic using flow statistics, which makes them important for identifying encrypted and proprietary-protocol traffic. However, machine learning based traffic classification faces two main challenges: discriminator bias and class imbalance. Discriminator bias means that some flow statistics improve the accuracy for some applications while reducing it for others. Class imbalance means that a machine learning based traffic classifier identifies applications with few samples (minority applications) with low accuracy. To address these two issues, this paper proposes a traffic classification framework based on ensemble clustering (TCFEC). TCFEC is composed of several base classifiers, each trained by clustering in a different feature subspace, and an optimal decision component, and it improves the accuracy of traffic classification. Specifically, compared with traffic classifiers based on traditional machine learning algorithms, TCFEC improves average flow accuracy by up to 5% and average byte accuracy by up to 6%.
Keywords:traffic classification framework based on ensemble clustering (TCFEC)  ensemble clustering  traffic classification  flow-based feature  machine learning
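
The abstract gives no implementation detail, so the following is only a minimal sketch of the general idea it describes: several base classifiers, each clustering flows in its own feature subspace, combined by a decision step. Everything specific here is an assumption made for illustration. Plain k-means stands in for whatever clustering the paper uses, a simple majority vote stands in for the paper's optimal decision component, and the names `SubspaceClusterClassifier`, `ensemble_predict`, and `flow_and_byte_accuracy` are hypothetical. The two metrics follow the common definitions (fraction of flows, and fraction of bytes, that are correctly classified), which may differ from the paper's exact definitions.

```python
from collections import Counter
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means (an illustrative stand-in); returns k centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            groups[idx].append(p)
        # Recompute each centroid; keep the old one if its cluster is empty.
        centroids = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids


class SubspaceClusterClassifier:
    """Hypothetical base classifier: cluster in one feature subspace,
    then label each cluster with the majority class of its training flows."""

    def __init__(self, feature_idx, k, seed=0):
        self.feature_idx = feature_idx
        self.k = k
        self.seed = seed

    def _project(self, flow):
        return tuple(flow[i] for i in self.feature_idx)

    def _nearest(self, p):
        return min(range(len(self.centroids)),
                   key=lambda c: sq_dist(p, self.centroids[c]))

    def fit(self, flows, labels):
        pts = [self._project(f) for f in flows]
        self.centroids = kmeans(pts, self.k, seed=self.seed)
        votes = [Counter() for _ in self.centroids]
        for p, y in zip(pts, labels):
            votes[self._nearest(p)][y] += 1
        fallback = Counter(labels).most_common(1)[0][0]
        self.cluster_label = [
            v.most_common(1)[0][0] if v else fallback for v in votes
        ]
        return self

    def predict(self, flow):
        return self.cluster_label[self._nearest(self._project(flow))]


def ensemble_predict(classifiers, flow):
    """Majority vote over base classifiers (assumed decision rule)."""
    votes = Counter(clf.predict(flow) for clf in classifiers)
    return votes.most_common(1)[0][0]


def flow_and_byte_accuracy(true, pred, sizes):
    """Fraction of flows, and of bytes, that are correctly classified."""
    correct = [t == p for t, p in zip(true, pred)]
    flow_acc = sum(correct) / len(true)
    byte_acc = sum(s for s, ok in zip(sizes, correct) if ok) / sum(sizes)
    return flow_acc, byte_acc


# Toy flows: (mean packet size, duration in seconds, packet count).
flows = [(100, 1.0, 10), (110, 1.2, 12), (105, 0.9, 11),
         (900, 30.0, 500), (950, 28.0, 520), (920, 33.0, 480)]
labels = ["web", "web", "web", "bulk", "bulk", "bulk"]

# One base classifier per (arbitrarily chosen) feature subspace.
subspaces = [[0], [1, 2], [0, 2]]
ensemble = [SubspaceClusterClassifier(s, k=2).fit(flows, labels)
            for s in subspaces]
```

The metric helper also shows why the abstract reports flow accuracy and byte accuracy separately: with true labels `["web", "web", "bulk"]`, predictions `["web", "bulk", "bulk"]`, and flow sizes `[100, 100, 800]` bytes, two of three flows are correct, but those two flows carry 900 of the 1000 bytes, so the two figures differ.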