
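Research on an IP Flow Classification Method Based on Heuristic Search
Cite this article: WU Fei, ZENG Fan-ping, XIONG Neng, DENG Chao-qiang, DONG Qi-xing. Research on an IP Flow Classification Method Based on Heuristic Search[J]. Mini-micro Systems (小型微型计算机系统), 2012, 0(10): 2153-2157
Authors: WU Fei, ZENG Fan-ping, XIONG Neng, DENG Chao-qiang, DONG Qi-xing
Affiliations: School of Computer Science and Technology, University of Science and Technology of China; State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences; Anhui Province Key Lab of Software in Computing and Communication
Funding: Supported by the Natural Science Foundation of Anhui Province (11040606M131)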
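Abstract: IP flow classification based on application-layer payload features is highly accurate, but when the feature library is large, traversing and matching the whole library takes a great deal of time. To address this, we propose an IP flow classification method that combines application-layer payload features with heuristic search. Common features are extracted from the packets generated by various applications and used to build heuristic rules, and the feature library is partitioned into multiple feature subsets according to these rules. During packet matching, only the specific subset selected by the heuristic rules has to be searched, which greatly reduces matching against irrelevant features, makes the subset to be matched more targeted, and improves time performance. For some applications, packets are classified with a DNS-guided method, which partially removes the drawback that payload-based methods cannot identify encrypted data. The algorithm was implemented in C and compared experimentally with the algorithm of the open-source software l7-filter. The results show that, in offline mode, the proposed method classifies 6 to 10 times faster than l7-filter, with an overall identification accuracy above 98%.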

Keywords: flow classification; heuristic rules; regular expression; l7-filter
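
The partitioning idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' C implementation: the abstract does not say which common features the heuristic rules use, so the rule below (the first payload byte) is an assumption, and the four signatures and the names `SIGNATURES`, `FIRST_BYTES`, `classify`, and `classify_linear` are hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical feature library: (application, payload regex).
# Illustrative patterns only, not taken from the paper or from l7-filter.
SIGNATURES = [
    ("http", re.compile(rb"^(GET|POST|HEAD) \S+ HTTP/1\.[01]")),
    ("ssh", re.compile(rb"^SSH-[12]\.[0-9]")),
    ("smtp", re.compile(rb"^220 .*SMTP", re.IGNORECASE)),
    ("bittorrent", re.compile(rb"^\x13BitTorrent protocol")),
]

# Assumed heuristic rule: the first bytes a payload of each application may start with.
FIRST_BYTES = {
    "http": b"GPH",
    "ssh": b"S",
    "smtp": b"2",
    "bittorrent": b"\x13",
}


def build_subsets(signatures, first_bytes):
    """Partition the feature library into subsets keyed by first payload byte."""
    subsets = defaultdict(list)
    for app, pattern in signatures:
        for byte in first_bytes[app]:  # iterating bytes yields ints
            subsets[byte].append((app, pattern))
    return subsets


SUBSETS = build_subsets(SIGNATURES, FIRST_BYTES)


def classify(payload):
    """Match only the subset selected by the heuristic.

    Returns (application or None, number of regexes tried).
    """
    if not payload:
        return None, 0
    tried = 0
    for app, pattern in SUBSETS.get(payload[0], []):
        tried += 1
        if pattern.match(payload):
            return app, tried
    return None, tried


def classify_linear(payload):
    """Baseline: traverse the whole feature library in order."""
    tried = 0
    for app, pattern in SIGNATURES:
        tried += 1
        if pattern.match(payload):
            return app, tried
    return None, tried
```

In this toy library, a BitTorrent handshake is matched after one regex attempt with `classify` but only after four with `classify_linear`, and a payload whose first byte selects no subset is rejected without running any regex. The savings in a real library depend on how evenly the chosen heuristic splits the features.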

Research of IP Flow Classification Based on Heuristic Search
WU Fei, ZENG Fan-ping, XIONG Neng, DENG Chao-qiang, DONG Qi-xing. Research of IP Flow Classification Based on Heuristic Search[J]. Mini-micro Systems, 2012, 0(10): 2153-2157
Authors: WU Fei, ZENG Fan-ping, XIONG Neng, DENG Chao-qiang, DONG Qi-xing
Affiliation: 1. School of Computer Science and Technology, University of Science and Technology of China, Hefei 230026, China; 2. State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences, Beijing 100190, China; 3. Anhui Province Key Lab of Software in Computing and Communication, Hefei 230026, China
Abstract: IP flow classification based on application-layer payload features is relatively accurate, but matching against the feature library takes a long time when the library is large. To solve this problem, this paper proposes a traffic classification approach that combines application-layer payload features with heuristic search. First, we extract common features from the packets generated by a variety of applications and use them to establish heuristic rules. Second, we divide the feature library into several feature subsets according to these rules. Then, during classification, only the specific feature subset selected by the heuristic rules needs to be matched, so matching against irrelevant features is greatly reduced, the subset to be matched is more targeted, and time performance improves. For some applications we use DNS as a guide in classification, which partially overcomes the drawback that encrypted data cannot be identified from application-layer payload features. The algorithm is implemented in C and compared with l7-filter. The experiments show that, offline, the classification speed of the proposed method is 6 to 10 times that of l7-filter, and its overall identification accuracy exceeds 98%.
Keywords: traffic classification; heuristic rules; regular expression; l7-filter
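
The abstract does not describe how DNS guides classification, so the following Python sketch shows only one plausible reading: remember which server addresses were returned for known domains, then label later flows to those addresses even when the payload is encrypted. The class `DnsGuidedClassifier`, the rule table `DOMAIN_RULES`, and the example domains and addresses are all hypothetical, and DNS responses are assumed to be already parsed into a name and a list of addresses.

```python
# Hypothetical mapping from domain suffix to application label.
DOMAIN_RULES = {
    "mail.example.com": "webmail",
    "video.example.net": "video",
}


class DnsGuidedClassifier:
    """Label flows by the domain name that resolved to their server address."""

    def __init__(self, domain_rules):
        self.domain_rules = domain_rules
        self.ip_to_app = {}  # server address -> application label

    def _app_for(self, name):
        # Normalize the queried name, then match exact names and subdomains.
        name = name.rstrip(".").lower()
        for suffix, app in self.domain_rules.items():
            if name == suffix or name.endswith("." + suffix):
                return app
        return None

    def observe_dns_answer(self, name, addresses):
        """Record the addresses returned for a name that matches a known domain."""
        app = self._app_for(name)
        if app is not None:
            for addr in addresses:
                self.ip_to_app[addr] = app

    def classify_flow(self, server_ip):
        """Return the application label for a flow, or None if no DNS hint exists."""
        return self.ip_to_app.get(server_ip)


clf = DnsGuidedClassifier(DOMAIN_RULES)
clf.observe_dns_answer("cdn.video.example.net.", ["198.51.100.7"])
label = clf.classify_flow("198.51.100.7")    # labeled from the DNS hint
unknown = clf.classify_flow("192.0.2.10")    # no DNS answer seen for this address
```

A flow with no matching DNS hint returns `None` here, at which point a classifier could fall back to payload matching. A table like this also goes stale as addresses are reassigned, so a practical version would need to expire entries, for example using the TTL of the DNS record.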
This article is indexed in CNKI and other databases.