
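Network traffic classification method based on mutual-information-driven selective clustering ensemble
Cite this article: DING Yaojun, CAI Wandong. Network traffic classification method based on mutual-information-driven selective clustering ensemble [J]. Journal of Computer Applications, 2013, 33(1): 80-82.
Authors: DING Yaojun  CAI Wandong
Affiliations: 1. School of Computer Science, Northwestern Polytechnical University, Xi'an 710129; 2. School of Information Engineering, Xianyang Normal University, Xianyang, Shaanxi 712000
Funding: National 863 Program of China (2009AA01Z424); Special Project of the Education Department of Shaanxi Province (12JK0933)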
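Abstract: Internet traffic is difficult to label, and a single clusterer generalizes poorly. To improve traffic classification accuracy, a selective clustering ensemble method based on mutual information (MI) theory is proposed. First, the normalized mutual information (NMI) is computed between the K-means clustering results obtained with different initial cluster numbers K and the true distribution of traffic protocols in the training set. Next, the sequence of K values for the K-means base clusterers in the ensemble is selected according to the NMI values. Finally, a consensus function based on quadratic mutual information (QMI) produces the consensus clustering, and a semi-supervised method labels the resulting clusters. Experiments compare the overall classification accuracy of the clustering ensemble method with that of a single clustering algorithm on four different test sets. The results show that the overall traffic classification accuracy of the clustering ensemble method can reach 90%. By applying a clustering ensemble model to network traffic classification, the proposed method improves both classification accuracy and classification stability across datasets.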
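
For reference, the NMI score used in the first step can be written as follows. The abstract does not say which normalization the authors apply, so the geometric-mean form below is an assumption (it is one widely used convention). Here $C$ denotes the cluster assignment produced by K-means and $Y$ the true protocol label; both symbols are introduced only for this illustration.

```latex
I(C;Y) = \sum_{c}\sum_{y} p(c,y)\,\log\frac{p(c,y)}{p(c)\,p(y)}, \qquad
H(C) = -\sum_{c} p(c)\,\log p(c), \qquad
H(Y) = -\sum_{y} p(y)\,\log p(y)

\mathrm{NMI}(C,Y) = \frac{I(C;Y)}{\sqrt{H(C)\,H(Y)}}
```

Under this form the score lies in $[0, 1]$: it equals 1 when the clusters reproduce the protocol labels exactly (up to renaming) and 0 when the two are statistically independent.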

Keywords: clustering ensemble; K-means; traffic classification; mutual information
Received: 2012-08-01
Revised: 2012-08-28

Internet traffic classification method based on selective clustering ensemble of mutual information
DING Yaojun, CAI Wandong. Internet traffic classification method based on selective clustering ensemble of mutual information [J]. Journal of Computer Applications, 2013, 33(1): 80-82.
Authors:DING Yaojun  CAI Wandong
Affiliation: 1. School of Computer Science, Northwestern Polytechnical University, Xi'an, Shaanxi 710129, China
2. School of Information Engineering, Xianyang Normal University, Xianyang, Shaanxi 712000, China
Abstract: Because Internet traffic is difficult to label and a single clustering algorithm generalizes poorly, a selective clustering ensemble method based on Mutual Information (MI) is proposed to improve the accuracy of traffic classification. First, the Normalized Mutual Information (NMI) is computed between the clustering results of the K-means algorithm, run with different initial cluster numbers K, and the distribution of protocol labels in the training set. Then, based on the NMI values, a series of K values is selected for the K-means base clusterers of the ensemble. Finally, a consensus function based on Quadratic Mutual Information (QMI) builds the consensus partition, and the clusters are labeled with a semi-supervised method. The overall accuracy of the clustering ensemble method is compared with that of a single clustering algorithm on four test sets, and the experimental results show that the overall accuracy of the clustering ensemble method can reach 90%. By applying a clustering ensemble model to Internet traffic classification, the proposed method improves both the overall classification accuracy and the stability of classification across datasets.
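
The selection and labeling steps described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the selection rule, so keeping the `top_n` highest-scoring K values is an assumption; NMI is normalized by the geometric mean of the entropies, which is also an assumption; the semi-supervised labeling is approximated by a majority vote over a small labeled subset; and the QMI consensus function is omitted entirely. All function names (`nmi`, `kmeans`, `select_ks`, `label_clusters`) are hypothetical.

```python
import math
import random
from collections import Counter


def nmi(labels_a, labels_b):
    """NMI normalized by sqrt(H(A) * H(B)) (assumed normalization)."""
    n = len(labels_a)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    count_ab = Counter(zip(labels_a, labels_b))
    mi = sum((c / n) * math.log(c * n / (count_a[a] * count_b[b]))
             for (a, b), c in count_ab.items())
    h_a = -sum((c / n) * math.log(c / n) for c in count_a.values())
    h_b = -sum((c / n) * math.log(c / n) for c in count_b.values())
    if h_a == 0 or h_b == 0:
        return 0.0  # a single-cluster partition carries no information
    return mi / math.sqrt(h_a * h_b)


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd-style K-means on tuples of floats; returns cluster ids."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)

    def nearest(p):
        return min(range(k),
                   key=lambda j: sum((x - c) ** 2
                                     for x, c in zip(p, centers[j])))

    for _ in range(iters):
        assign = [nearest(p) for p in points]
        for j in range(k):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = tuple(sum(col) / len(members)
                                   for col in zip(*members))
    return [nearest(p) for p in points]


def select_ks(points, true_labels, candidate_ks, top_n):
    """Keep the top_n candidate K values ranked by NMI (assumed rule)."""
    scored = [(nmi(kmeans(points, k), true_labels), k) for k in candidate_ks]
    scored.sort(reverse=True)
    return [k for _, k in scored[:top_n]]


def label_clusters(assign, known_labels):
    """Majority vote: known_labels maps sample index -> protocol label."""
    votes = {}
    for i, label in known_labels.items():
        votes.setdefault(assign[i], Counter())[label] += 1
    return {c: cnt.most_common(1)[0][0] for c, cnt in votes.items()}


# Toy flows with two made-up features and made-up protocol labels.
flows = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
         (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
protocols = ["http", "http", "http", "dns", "dns", "dns"]
print(select_ks(flows, protocols, candidate_ks=[1, 2, 3, 4], top_n=2))
```

In a full pipeline, each selected K would yield one base partition, the base partitions would be combined by the consensus function, and `label_clusters` would then be applied to the consensus partition using whatever labeled flows are available.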
Keywords: clustering ensemble; K-means; traffic classification; Mutual Information (MI)
This article is indexed in CNKI and other databases.