Internet traffic clustering with side information
Authors: Yu Wang, Yang Xiang, Jun Zhang, Wanlei Zhou, Bailin Xie
Affiliation: 1. School of Information Technology, Deakin University, Melbourne, Australia; 2. Cisco School of Informatics, Guangdong University of Foreign Studies, Guangzhou, China
Abstract: Internet traffic classification is an essential function of network management and security systems. Because traditional port-based and payload-based classification approaches have well-known limitations, the past several years have seen extensive research on using machine learning techniques to classify Internet traffic based on packet-level and flow-level characteristics. To learn from unlabeled traffic data, previous studies have applied some classic clustering methods, but the reported accuracy is unsatisfactory. In this paper, we propose a semi-supervised approach for accurate Internet traffic clustering, motivated by the observation that partial equivalence relationships exist widely among Internet traffic flows. In particular, we formulate the problem using a Gaussian Mixture Model (GMM) with set-based equivalence constraints and propose a constrained Expectation Maximization (EM) algorithm for clustering. Experiments with real-world packet traces show that the proposed approach can significantly improve the quality of the resulting traffic clusters.
Keywords: Traffic classification; Semi-supervised machine learning; Constrained clustering
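
The abstract describes a GMM with set-based equivalence constraints fitted by a constrained EM algorithm. The following is a minimal sketch of that general idea, not the authors' implementation: it assumes every flow carries a set id, that flows sharing a set id must belong to the same mixture component, and that unconstrained flows sit in singleton sets. The names `constrained_em`, `log_gauss`, and `set_ids`, the farthest-point initialization, and the choice to estimate mixing weights per set are all assumptions made for illustration.

```python
import numpy as np

def log_gauss(X, mean, cov):
    """Log density of each row of X under the Gaussian N(mean, cov)."""
    d = X.shape[1]
    L = np.linalg.cholesky(cov)
    z = np.linalg.solve(L, (X - mean).T)            # whitened residuals, shape (d, n)
    log_det = 2.0 * np.sum(np.log(np.diag(L)))
    return -0.5 * (d * np.log(2.0 * np.pi) + log_det + np.sum(z ** 2, axis=0))

def constrained_em(X, set_ids, k, n_iter=50, reg=1e-6, seed=0):
    """EM for a GMM in which all points of one set share a single component.

    set_ids[i] is the (hypothetical) equivalence-set id of flow i.
    Returns per-point hard labels and the fitted component means.
    """
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    rng = np.random.default_rng(seed)
    _, inv = np.unique(np.asarray(set_ids), return_inverse=True)
    s = inv.max() + 1                                # number of sets

    # Farthest-point initialization of the means (an illustrative choice).
    means = [X[rng.integers(n)]]
    for _ in range(1, k):
        dist = np.min([np.sum((X - m) ** 2, axis=1) for m in means], axis=0)
        means.append(X[np.argmax(dist)])
    means = np.array(means, dtype=float)
    covs = np.array([np.eye(d) * (X.var(axis=0).mean() + reg)] * k)
    weights = np.full(k, 1.0 / k)

    for _ in range(n_iter):
        # E-step: one posterior per set, from the product of member likelihoods.
        log_p = np.column_stack([log_gauss(X, means[j], covs[j]) for j in range(k)])
        set_log = np.zeros((s, k))
        np.add.at(set_log, inv, log_p)               # sum log-likelihoods within each set
        set_log += np.log(weights)
        set_log -= set_log.max(axis=1, keepdims=True)
        resp = np.exp(set_log)
        resp /= resp.sum(axis=1, keepdims=True)

        # M-step: every point inherits the responsibility of its set.
        point_resp = resp[inv]                       # shape (n, k)
        nk = point_resp.sum(axis=0) + 1e-12          # guard against empty components
        weights = (resp.sum(axis=0) + 1e-12) / (s + k * 1e-12)
        for j in range(k):
            means[j] = point_resp[:, j] @ X / nk[j]
            diff = X - means[j]
            covs[j] = (point_resp[:, j, None] * diff).T @ diff / nk[j] + reg * np.eye(d)

    return resp.argmax(axis=1)[inv], means
```

Because the posterior is computed once per set rather than once per flow, the equivalence constraints hold by construction: flows in the same set always receive the same cluster label. With every flow in its own set, the procedure reduces to ordinary EM for a GMM.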
This article is indexed in ScienceDirect and other databases.