Similar literature
 20 similar documents found (search time: 31 ms)
1.
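Hu Ting  Wang Yong  Tao Xiaoling 《Computer Engineering》(计算机工程) 2011,37(6):104-106
To address the low accuracy and limited applicability of current traffic classification methods based on port-number matching and signature identification, a network traffic classification method based on the supervised self-organizing map (SSOM) is proposed. Using a training set of labeled network traffic, the method changes the weight adjustment rule in the self-organizing map (SOM) training process so that the winning neuron in the output layer is easier to select and the boundaries between classes are clearer, thereby improving classification performance. Experimental results show that SSOM outperforms SOM in both resolution and topological continuity and achieves higher accuracy in network traffic classification.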

2.
Accurate and timely traffic classification is critical in network security monitoring and traffic engineering. Traditional methods based on port numbers and protocols have proven to be ineffective in the face of dynamic port allocation and packet encapsulation. Signature matching methods, on the other hand, require a known signature set and processing of the packet payload, and can only handle the signatures of a limited number of IP packets in real time. A machine learning method based on SVM (support vector machine) is proposed in this paper for accurate Internet traffic classification. The method classifies Internet traffic into broad application categories according to the network flow parameters obtained from the packet headers. An optimized feature set is obtained via multiple classifier selection methods. Experimental results using traffic from a campus backbone show that an accuracy of 99.42% is achieved with the regular biased training and testing samples. An accuracy of 97.17% is achieved when unbiased training and testing samples are used with the same feature set. Furthermore, as all the feature parameters are computable from the packet headers, the proposed method is also applicable to encrypted network traffic.

3.
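To address the low accuracy, weak generality, and privacy risks of traditional encrypted traffic classification methods, an encrypted traffic classification method based on convolutional neural networks is proposed that avoids relying on raw traffic data and prevents overfitting to the byte structure of specific applications. Using the packet size and arrival time information of network traffic, a method is designed to convert raw traffic into a two-dimensional image: each cell of the histogram represents the number of packets of the corresponding size arriving in the corresponding time interval. The method does not depend on packet payloads and so avoids privacy violations. The LeNet-5 convolutional neural network model is optimized to improve classification accuracy: Inception modules are embedded for multi-dimensional feature extraction and feature fusion, and 1*1 convolutions control the output feature dimension. Average pooling and convolutional layers replace the fully connected layers, which speeds up computation and avoids overfitting. The sliding-window method from object detection tasks is used to split each unidirectional network flow into equal-sized blocks, ensuring that blocks in the training set and the test set from a single session do not overlap, which augments the dataset. Classification experiments on the ISCX dataset show an accuracy above 95% for the application traffic classification task. Comparative experiments show that when the training and test sets are of different types, traditional classification methods suffer a marked accuracy drop or fail outright, whereas the proposed method still reaches 89.2% accuracy, demonstrating that it applies to both encrypted and unencrypted traffic. All experiments were based on imbalanced datasets, ...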
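The flow-to-image step described in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `flow_to_histogram`, the 4x4 grid, the 1500-byte size cap, and the assumption that arrival times lie in `[0, duration]` are all hypothetical choices made for the example.

```python
def flow_to_histogram(packets, duration, max_size=1500, bins=4):
    """Build a bins x bins grid of packet counts.

    packets: iterable of (arrival_time_seconds, size_bytes) pairs, with
    arrival times assumed to lie in [0, duration].
    Rows index packet size, columns index arrival time.
    """
    grid = [[0] * bins for _ in range(bins)]
    for arrival, size in packets:
        # Clamp to the last bin so values on the upper edge stay inside the grid.
        col = min(int(arrival / duration * bins), bins - 1)
        row = min(int(size / max_size * bins), bins - 1)
        grid[row][col] += 1
    return grid


flow = [(0.0, 100), (0.1, 1500), (0.9, 1400), (0.95, 1450)]
image = flow_to_histogram(flow, duration=1.0)
```

Only packet sizes and timestamps are read, so no payload bytes are needed, which matches the privacy argument made in the abstract.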

4.
Traffic visualization tools help network operators maintain awareness of the status of a network, including anomalous activities. Unfortunately, the network operator may look away from the visualizer when beginning network forensics, such as launching a terminal application, logging into a server, and analyzing log files. Thus, the operator's eyes leave the visual screen even while valuable information is displayed. Our motivation is to make visualization tools usable as a network operation console. Whereas previous tools focused on outputting packet information, we herein extend the visualizer to accept input from operators so that they can start their operations. Since little such software exists for our purpose, we develop PACKTER, which visualizes traffic based on per-packet information in real time. We also extend PACKTER with a function for interfacing with a network forensic system, which allows the operator to select an individual packet using a mouse, to start network forensics using a keyboard, and to receive results without looking away from the PACKTER viewer.

5.
Persistently saturated links are abnormal conditions that indicate bottlenecks in Internet traffic. Network operators are interested in detecting such links for troubleshooting, to improve capacity planning and traffic estimation, and to detect denial-of-service attacks. Currently, bottleneck links can be detected either locally, through SNMP information, or remotely, through active probing or passive flow-based analysis. However, local SNMP information may not be available due to administrative restrictions, and existing remote approaches are not used systematically because of their network or computation overhead. This paper proposes a new approach to remotely detect the presence of bottleneck links using spectral and statistical analysis of traffic. Our approach is passive, operates on aggregate traffic without flow separation, and supports remote detection of bottlenecks, addressing some of the major limitations of existing approaches. Our technique assumes that traffic through the bottleneck is dominated by packets with a common size (typically the maximum transmission unit, for reasons discussed in Section 5.1). With this assumption, we observe that bottlenecks imprint periodicities on packet transmissions based on the packet size and link bandwidth. Such periodicities manifest themselves as strong frequencies in the spectral representation of the aggregate traffic observed at a downstream monitoring point. We propose a detection algorithm based on rigorous statistical methods to detect the presence of bottleneck links by examining strong frequencies in aggregate traffic. We use data from live Internet traces to evaluate the performance of our algorithm under various network conditions. Results show that with proper parameters our algorithm can provide excellent accuracy (up to 95%) even if the traffic through the bottleneck link accounts for less than 10% of the aggregate traffic.
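As a rough illustration of the periodicity the abstract above relies on: a saturated link that sends equal-sized packets back to back emits one packet every (packet size in bits) / (link bandwidth) seconds, so the expected spectral peak sits at bandwidth / (8 x packet bytes) Hz. The sketch below computes that frequency and a normalized Fourier magnitude at a chosen frequency. The function names and the 10 Mbit/s, 1500-byte example are assumptions for illustration; the paper's statistical detection algorithm is not reproduced here.

```python
import cmath
import math


def bottleneck_frequency_hz(link_bps, packet_bytes):
    """Packets per second on a saturated link carrying equal-sized packets."""
    return link_bps / (packet_bytes * 8)


def spectral_strength(timestamps, freq_hz):
    """Normalized Fourier magnitude of a packet arrival train at freq_hz.

    Close to 1 when arrivals are periodic at freq_hz, close to 0 otherwise.
    """
    total = sum(cmath.exp(-2j * math.pi * freq_hz * t) for t in timestamps)
    return abs(total) / len(timestamps)


f0 = bottleneck_frequency_hz(10_000_000, 1500)  # 10 Mbit/s link, 1500-byte packets
arrivals = [k / f0 for k in range(50)]          # idealized back-to-back arrivals
on_peak = spectral_strength(arrivals, f0)
off_peak = spectral_strength(arrivals, 1.37 * f0)
```

In real aggregate traffic the peak is weaker because cross traffic and jitter blur the arrival times, which is why the abstract pairs the spectral view with statistical testing.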

6.
Offline/realtime traffic classification using semi-supervised learning   Total citations: 4, self-citations: 0, citations by others: 4  
Jeffrey  Anirban  Martin  Ira  Carey 《Performance Evaluation》2007,64(9-12):1194-1213
Identifying and categorizing network traffic by application type is challenging because of the continued evolution of applications, especially of those with a desire to be undetectable. The diminished effectiveness of port-based identification and the overheads of deep packet inspection approaches motivate us to classify traffic by exploiting distinctive flow characteristics of applications when they communicate on a network. In this paper, we explore this latter approach and propose a semi-supervised classification method that can accommodate both known and unknown applications. To the best of our knowledge, this is the first work to use semi-supervised learning techniques for the traffic classification problem. Our approach allows classifiers to be designed from training data that consists of only a few labeled and many unlabeled flows. We consider pragmatic classification issues such as longevity of classifiers and the need for retraining of classifiers. Our performance evaluation using empirical Internet traffic traces that span a 6-month period shows that: (1) high flow and byte classification accuracy (i.e., greater than 90%) can be achieved using training data that consists of a small number of labeled and a large number of unlabeled flows; (2) presence of “mice” and “elephant” flows in the Internet complicates the design of classifiers, especially of those with high byte accuracy, and necessitates the use of weighted sampling techniques to obtain training flows; and (3) retraining of classifiers is necessary only when there are non-transient changes in the network usage characteristics. As a proof of concept, we implement prototype offline and realtime classification systems to demonstrate the feasibility of our approach.

7.
Packet classification is implemented in modern network routers to provide differentiated services based on packet header information. Traditional packet classification only reports a single matched rule with the highest priority for an incoming packet and takes an action accordingly. With the emergence of new Internet applications such as network intrusion detection systems, all matched rules need to be reported. This multi-match problem is more challenging and has attracted attention in recent years. Because of the stringent time budget for classification, architectural solutions using ternary content addressable memory (TCAM) are the preferred choice for backbone network routers. However, despite its advantage in search speed, TCAM is much more expensive than SRAM and is notorious for its extraordinarily high power consumption. These problems limit the application and scalability of TCAM-based solutions. This paper presents a tree-based multi-match packet classification technique combining the benefits of both TCAMs and SRAMs. The experiments show that the proposed solution achieves significantly greater savings in both memory space and power consumption for packet matching compared to existing solutions.
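The difference between single-match and multi-match classification in the abstract above can be shown with a small ternary (value/mask) rule table. This sketch uses a plain linear scan to show the semantics only; it is not the tree-based TCAM/SRAM technique the paper proposes, and the rule names, 8-bit keys, and dictionary layout are hypothetical.

```python
def rule_matches(rule, key):
    """Ternary match: only bits set in the mask are compared."""
    return (key & rule["mask"]) == (rule["value"] & rule["mask"])


def multi_match(rules, key):
    """Return the names of all matching rules, highest priority first."""
    ordered = sorted(rules, key=lambda r: r["priority"])
    return [r["name"] for r in ordered if rule_matches(r, key)]


def single_match(rules, key):
    """Return only the highest-priority matching rule name, or None."""
    hits = multi_match(rules, key)
    return hits[0] if hits else None


# Lower priority number means higher priority (an assumption of this sketch).
rules = [
    {"name": "r0", "priority": 0, "value": 0b10100000, "mask": 0b11110000},
    {"name": "r1", "priority": 1, "value": 0b10000000, "mask": 0b11000000},
    {"name": "r2", "priority": 2, "value": 0b00000000, "mask": 0b00000000},
]
all_hits = multi_match(rules, 0b10101111)
best_hit = single_match(rules, 0b10101111)
```

A traditional classifier would stop at `best_hit`, whereas an intrusion detection system needs every entry in `all_hits`.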

8.
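In today's era of information explosion and rapid network growth, network attacks and threats are increasing, and malicious traffic identification plays a very important role in network security. Deep learning has shown superior performance in image processing and natural language processing, so many studies have applied it to traffic classification. When applying deep learning to traffic identification, some studies truncate or zero-pad the raw traffic data; truncation tends to lose part of the traffic information, and zero-padding tends to introduce information that is useless for model training. To address this problem, this paper proposes an Indefinite Length Convolutional Neural Network (ILCNN) for malicious traffic classification. The model takes variable-length input, using raw traffic data that is neither truncated nor zero-padded, and uses pooling to convert variable-length feature vectors into fixed-length feature vectors, ultimately classifying malicious traffic. Experiments on the CICIDS-2017 dataset show that the ILCNN model reaches 0.999208 in terms of F1-Score. Compared with existing work on malicious traffic classification, the proposed ILCNN improves both F1-Score and accuracy.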
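One common way to turn variable-length features into a fixed-length vector is global max pooling over the length axis. The abstract above says only that pooling is used, so treating it as global max pooling is an assumption of this sketch, as are the function name and the toy feature maps.

```python
def global_max_pool(feature_map):
    """Collapse a variable-length feature map into one value per channel.

    feature_map: non-empty list of positions, each a list of C channel values.
    Returns a list of length C regardless of how many positions there are.
    """
    return [max(channel) for channel in zip(*feature_map)]


short_flow = [[1.0, -2.0], [0.5, 3.0]]                # 2 positions, 2 channels
long_flow = [[0, 1], [2, 0], [1, 5], [-1, 2]]         # 4 positions, 2 channels
pooled_short = global_max_pool(short_flow)
pooled_long = global_max_pool(long_flow)
```

Both outputs have the same length, so a fixed-size classifier head can follow without truncating or padding the input.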

9.
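Decision tree classification of network traffic   Total citations: 2, self-citations: 1, citations by others: 1  
Application identification and traffic classification are prerequisites for network management, security, research, and related tasks. With the rapid growth of networks and the continual emergence of new applications, classification techniques based on transport-layer port numbers and deep packet inspection can no longer meet requirements. This paper verifies that the statistical properties of network traffic can effectively distinguish different applications, proposes a supervised network traffic classification method based on the C4.5 decision tree classifier, and discusses two improvements: boosting and feature selection. Experimental results show that the C4.5 classifier has moderate training complexity, high accuracy, and fast classification; boosting further improves accuracy at the cost of a large increase in training time and slightly slower classification; the feature selection algorithm speeds up classification while slightly lowering accuracy.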
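C4.5 chooses split attributes by gain ratio, that is, information gain divided by the entropy of the split itself. The sketch below computes it for a discrete attribute; the flow labels and attribute values are made-up examples, and the threshold search that C4.5 performs for continuous attributes is left out.

```python
from collections import Counter
from math import log2


def entropy(items):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(items)
    return -sum((n / total) * log2(n / total) for n in Counter(items).values())


def gain_ratio(feature_values, labels):
    """Information gain of the split divided by the split's own entropy."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    split_info = entropy(feature_values)
    # A feature with a single value carries no split information.
    return info_gain / split_info if split_info > 0 else 0.0


labels = ["web", "web", "p2p", "p2p"]
useful = gain_ratio(["low", "low", "high", "high"], labels)   # separates classes
useless = gain_ratio(["a", "b", "a", "b"], labels)            # tells nothing
```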

10.
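Webshell is the most commonly used malicious backdoor program for persistent control of Web application systems and poses a serious threat to the secure operation of Web servers. Most Webshell detection methods train on the data of the entire request packet; this approach identifies web-page Webshells poorly and trains models inefficiently. To address these problems, a Webshell malicious traffic detection method based on multi-feature fusion is proposed. The method takes three dimensions of information as features: packet meta-information, packet payload content, and traffic access behavior. Combining domain knowledge, it extracts features from the request and response packets in the data flow along these three dimensions, then fuses the extracted features to form a discriminative model that can detect different attack types. Experimental results show that, compared with previous methods, the proposed method substantially improves precision on the binary classification of normal and malicious traffic, reaching 99.25%; training and detection efficiency also improve significantly, with training time and detection time falling by 95.73% and 86.14%, respectively.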

11.
We consider the mean–variance relationship of the number of flows in traffic aggregation, where flows are divided into several groups randomly, based on a predefined flow aggregation index, such as source IP address. We first derive a quadratic relationship between the mean and the variance of the number of flows belonging to a randomly chosen traffic aggregation group. Note that the result is applicable to sampled flows obtained through packet sampling. We then show that our analytically derived mean–variance relationship fits well with those in actual packet trace data sets. Next, we present two applications of the mean–variance relationship to traffic management. One is an application to detecting network anomalies through monitoring a time series of traffic. Using the mean–variance relationship, we determine the traffic aggregation level in traffic monitoring so that it meets two predefined requirements on false positive and false negative ratios simultaneously. The other is an application to load balancing among network devices that require per-flow management. We utilize the mean–variance relationship for estimating the processing capability required in each network device.

12.
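With the diversification of network application service types and the continuing development of traffic encryption technology, encrypted traffic identification has become a major challenge in network security. Traditional traffic identification techniques such as deep packet inspection cannot effectively identify encrypted traffic, whereas encrypted traffic identification based on machine learning performs well. This paper therefore proposes an encrypted traffic classification model that combines the gradient boosting decision tree (GBDT) algorithm with the logistic regression (LR) algorithm, tunes hyperparameters with Bayesian optimization (BO), and uses time-related flow features to identify regular encrypted traffic and VPN-encrypted traffic. It achieves an overall identification accuracy above 90%, outperforming other commonly used classification models.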

13.
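The network security threats facing industrial Internet of Things (IoT) systems are growing with the wide adoption of IoT technology, and information security has become a major challenge in their development. MQTT (Message Queuing Telemetry Transport) is a mainstream protocol for IoT communication, and IoT communication security based on this protocol is a current research hotspot. Traditional traffic identification techniques such as deep packet inspection cannot effectively identify anomalous traffic that conforms to the packet format, whereas anomalous traffic identification based on machine learning performs well. An MQTT anomalous traffic detection method based on the random forest algorithm is therefore proposed; it achieves an overall MQTT anomalous traffic identification accuracy above 90%, outperforming other commonly used classification models.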

14.
Diagnosing Traffic Anomalies Using a Two-Phase Model   Total citations: 1, self-citations: 0, citations by others: 1
Network traffic anomalies are unusual changes in a network, so diagnosing anomalies is important for network management. Feature-based anomaly detection models (ab)normal network traffic behavior by analyzing packet header features. The PCA-subspace method (Principal Component Analysis) has been verified as an efficient feature-based approach to network-wide anomaly detection. Despite its powerful ability for network-wide traffic detection, the PCA-subspace method cannot be effectively used for detection on a single link. In this paper, unlike most works, which focus on detection in flow-level traffic, we propose a new approach, B6-SVM, based on observations of six traffic features of packet-level traffic, to detect anomalies in packet-level traffic on a single link. The basic idea of B6-SVM is to diagnose anomalies in a multi-dimensional view of traffic features using a Support Vector Machine (SVM). Through two-phase classification, B6-SVM can detect anomalies with a high detection rate and a low false alarm rate. The test results demonstrate the effectiveness and potential of our technique in diagnosing anomalies. Further, compared to previous feature-based anomaly detection approaches, B6-SVM provides a framework to automatically identify possible anomalous types. The framework of B6-SVM is generic and therefore, we expect the derived insights will be helpful for similar future research efforts.

15.
The problem of classifying traffic flows in networks has become more and more important in recent times, and much research has been dedicated to it. In recent years, there has been a lot of interest in classifying traffic flows by application, based on the statistical features of each flow. Information about the applications that are being used on a network is very useful in network design, accounting, management, and security. In our previous work we proposed a classification algorithm for Internet traffic flow classification based on Artificial Immune Systems (AIS). We also applied the algorithm on an available data set, and found that the algorithm performed as well as other algorithms, and was insensitive to input parameters, which makes it valuable for embedded systems. It is also very simple to implement, and generalizes well from small training data sets. In this research, we expanded on the previous research by introducing several optimizations in the training and classification phases of the algorithm. We improved the design of the original algorithm in order to make it more predictable. We also give the asymptotic complexity of the optimized algorithm and derive a bound on its generalization error. Lastly, we also experimented with several different distance formulas to improve the classification performance. In this paper we show that the changes and optimizations applied to the original algorithm do not functionally change it, while making its execution 50–60% faster. We also show that, for this application, the Manhattan distance gives 1–2% higher classification accuracy than the Euclidean distance, making the accuracy of the algorithm comparable to that of a Naïve Bayes classifier in previous research that uses the same data set.

16.
Classifying traffic into specific network applications is essential for application-aware network management, and it becomes more challenging because modern applications complicate their network behaviors. While port number-based classifiers work only for some well-known applications and signature-based classifiers are not applicable to encrypted packet payloads, researchers tend to classify network traffic based on behaviors observed in network applications. In this paper, a session level flow classification (SLFC) approach is proposed to classify network flows as a session, which comprises flows in the same conversation. SLFC first classifies flows into the corresponding applications by packet size distribution (PSD) and then groups flows as sessions by port locality. With PSD, each flow is transformed into a set of points in a two-dimensional space and the distances between each flow and the representatives of pre-selected applications are computed. The flow is recognized as the application having the minimum distance. Meanwhile, port locality is used to group flows as sessions because an application often uses consecutive port numbers within a session. If flows of a session are classified into different applications, an arbitration algorithm is invoked to make the correction. The evaluation shows that SLFC achieves high accuracy rates on both flow and session classifications, namely 99.9% and 99.98%, respectively. When SLFC is applied to online classification, it is able to make decisions quickly by checking at most 300 packets for long-lasting flows. Based on our test data, an average of 72% of packets in long-lasting flows can be skipped without reducing the classification accuracy rates.
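The port-locality idea in the abstract above, grouping flows whose port numbers are consecutive, might look roughly like this. The function name and the `max_gap` threshold are hypothetical; the abstract does not state how close two ports must be to count as the same session, and the PSD distance and arbitration steps are not covered here.

```python
def group_by_port_locality(ports, max_gap=1):
    """Group port numbers into sessions of consecutive (or near) ports.

    max_gap is an assumed tolerance: two ports join the same session when
    they differ by at most max_gap after sorting.
    """
    sessions = []
    for port in sorted(ports):
        if sessions and port - sessions[-1][-1] <= max_gap:
            sessions[-1].append(port)   # extend the current session
        else:
            sessions.append([port])     # start a new session
    return sessions


sessions = group_by_port_locality([5001, 6010, 5000, 5002, 6011])
```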

17.
Demands on data communication networks continue to drive the need for increasingly faster link speeds. Optical packet switching networks promise to provide data rates that are sufficiently high to satisfy the needs of the future Internet core network. However, a key technological problem with optical packet switching is the very small size of packet buffers that can be implemented in the optical domain. Existing protocols, for example the widely used Transmission Control Protocol (TCP), do not perform well in such small-buffer networks. To address this problem, we have proposed techniques for actively pacing traffic at edge networks to ensure that traffic bursts are reduced or eliminated and thus do not cause packet losses in routers with small buffers. We have also shown that this traffic pacing can improve the performance of conventional networks that use small buffers (e.g., to reduce the cost of buffer memory on routers). A key challenge in this context is to develop systems that can perform such packet pacing efficiently and at high data rates. In this paper, we present the design and prototype of a hardware implementation of our packet pacing technique. We discuss and evaluate design trade-offs and present performance results from a prototype implementation based on a NetFPGA field-programmable gate array system. Our results show that traffic pacing can be implemented with few hardware resources and without reducing system throughput. Therefore, we believe that traffic pacing can be deployed widely to improve the operation of current and future networks.

18.
Network operators and mobile carriers are facing serious security challenges caused by an increasing number of services provided by smartphone Apps. For example, Android OS has more than 1 million Apps in stores. Hence, network administrators tend to adopt strict policies to secure their infrastructure. The aim of this study is to propose an efficient framework that has a classification component based on traffic analysis of Android Apps. The framework differs from other proposed studies by focusing on identifying Apps traffic from a network perspective without introducing any overhead on subscribers' smartphones. Additionally, it involves a technique for pre-processing network flows generated by Apps to acquire a set of features that are used to build an identification model using machine learning algorithms. The classification model is built using classification ensembles. A group of chosen users contributes to training the classification model, which learns the normal behavior of selected Apps. Eventually, the model should be able to detect abnormal behavior of similar Apps across the network. A 93.78% classification accuracy is achieved with a low false positive rate under 0.5%. In addition, the framework is able to detect abnormal flows of unknown classes by implementing an outlier detection mechanism, for which a 94% accuracy is reported.

19.
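The support vector machine (SVM) method offers good classification accuracy, stability, and generalization and has seen initial use in network traffic classification, but on large-scale traffic classification problems it suffers from high computational complexity and slow classifier training. A fast SVM method based on bit compression is therefore proposed. A bit compression algorithm aggregates and compresses the initial training set into a new sample set carrying weight information, shrinking the sample set while losing as little of the original sample information as possible; a weighted SVM algorithm then trains the traffic classifier. Comparative experiments on large-scale traffic sample sets show that the fast SVM method substantially reduces classifier training time and prediction time for unknown samples at a small cost in classification accuracy; moreover, provided compression is not excessive, its accuracy is higher than that of random-sampling SVM at the same compression ratio. The method retains SVM's good classification stability and generalization while effectively improving its ability to handle large-scale traffic classification.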

20.
In this work, we develop a novel packet scheduling algorithm that properly incorporates the semantics of a packet. We find that improvement in overall packet loss does not necessarily coincide with improvement in user perceivable QoS. The objective of this work is to develop a packet scheduling mechanism which can improve the user perceivable QoS. We do not focus on improving packet loss, delay, or burstiness. We develop a metric called “Packet Significance” that quantifies the importance of a packet from the perspective of compression. Packet significance incorporates the inter-frame and intra-frame information dependency and the transitive information dependency characteristics of modern compression schemes. We apply packet significance in scheduling packets. In our context, packet scheduling consists of two technical ingredients: packet selection and interval selection. Under limited network bandwidth availability, it is desirable to transmit a subset of the packets rather than the entire set. We use a greedy approach in selecting packets for transmission and use packet significance as the selection criterion. In determining the transmission interval of a packet, we also incorporate the packet significance. Simulation-based experiments with eight video clips were performed. We embed the decoding engine in our simulation software and examine the user perceivable QoS (PSNR). We compare the performance of the proposed algorithm with a best-effort scheduling scheme and with a scheduling scheme based on a simple QoS metric. Our Significance-Aware Scheduling scheme (SAPS) effectively incorporates the semantics of a packet and delivers the best user perceivable QoS. SAPS can result in more packet loss or burstier traffic. Despite these limitations, SAPS successfully improves the overall user perceivable QoS.
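The greedy packet selection step mentioned in the abstract above could be sketched as below, under the assumption that each packet already carries a numeric significance score. How that score is derived from frame dependencies is not given in the abstract, so the values, field names, and byte budget here are purely illustrative, and interval selection is not covered.

```python
def select_packets(packets, budget_bytes):
    """Greedily pick the most significant packets that fit in the budget.

    packets: list of dicts with "id", "size" (bytes) and "significance".
    Returns the chosen ids in the order they were selected.
    """
    chosen = []
    remaining = budget_bytes
    for pkt in sorted(packets, key=lambda p: p["significance"], reverse=True):
        if pkt["size"] <= remaining:
            chosen.append(pkt["id"])
            remaining -= pkt["size"]
    return chosen


# Hypothetical scores: the I-frame packet matters most, the B-frame packet least.
packets = [
    {"id": "I", "size": 1000, "significance": 9},
    {"id": "P1", "size": 600, "significance": 5},
    {"id": "P2", "size": 600, "significance": 4},
    {"id": "B", "size": 300, "significance": 1},
]
sent = select_packets(packets, budget_bytes=2000)
```

With a 2000-byte budget the second P-frame packet is dropped even though it outranks the B-frame packet, because it no longer fits; this is the usual trade-off of a greedy rule.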
