Similar Literature
 Found 20 similar documents; search took 15 ms
1.
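Support vector machines (SVMs) offer good classification accuracy, stability, and generalization, and have seen initial use in network traffic classification. On large-scale traffic classification problems, however, they suffer from high computational complexity and slow classifier training. To address this, a fast SVM method based on bit compression is proposed. A bit-compression algorithm aggregates and compresses the initial training set into a new sample set that carries weight information, shrinking the set while losing as little of the original sample information as possible; a weight-based SVM algorithm then trains the traffic classifier on it. Traffic classification experiments on large sample sets show that the fast SVM method substantially reduces both the training time of the classifier and the prediction time for unknown samples, at a small cost in classification accuracy. Moreover, provided the data are not over-compressed, its accuracy exceeds that of random-sampling SVM at the same compression ratio. The method retains the good classification stability and generalization of SVM while effectively improving its ability to handle large-scale traffic classification.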
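The abstract does not spell out the bit-compression algorithm, so the following is only a minimal sketch of one way such aggregation could work, assuming non-negative integer features. The function name `bit_compress` and the `shift` parameter are hypothetical: samples whose features share the same high-order bits (and the same label) are merged into one centroid, and the group size becomes the sample weight.

```python
from collections import defaultdict


def bit_compress(samples, labels, shift):
    """Merge same-label samples that fall into the same quantization cell.

    Assumption (not from the source): a cell is defined by dropping the
    `shift` low-order bits of every integer feature.
    Returns a list of (centroid, label, weight) tuples.
    """
    groups = defaultdict(list)
    for features, label in zip(samples, labels):
        cell = tuple(value >> shift for value in features)
        groups[(cell, label)].append(features)

    compressed = []
    for (_cell, label), members in groups.items():
        weight = len(members)  # how many original samples this point stands for
        centroid = tuple(sum(column) / weight for column in zip(*members))
        compressed.append((centroid, label, weight))
    return compressed
```

The resulting weights could then be handed to any SVM trainer that accepts per-sample weights (for example, the `sample_weight` argument of scikit-learn's `SVC.fit`); whether this matches the weighted SVM used in the paper is not stated in the abstract.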

2.
An SVM-Based Method for Classifying P2P Network Traffic   Total citations: 10 (self-citations: 1, citations by others: 9)
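A method for classifying P2P network traffic based on SVM is proposed. It uses statistical features of network traffic together with the SVM method, which is grounded in statistical theory, to classify P2P traffic by application type. Four kinds of P2P traffic are studied: BitTorrent (file sharing), PPLive (streaming media), Skype (Internet telephony), and MSN (instant messaging). The paper presents the overall framework of SVM-based P2P traffic classification, describes how traffic samples are acquired and processed, and reports on the construction of the classifier and the experimental results. The experiments confirm the effectiveness of the proposed method, with an average classification precision of 92.38%.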

3.
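To address the low accuracy and limited applicability of traditional traffic classification techniques based on transport-layer ports and on signatures, a method using a tree-augmented Bayesian classifier is proposed. The method builds a classification model from the statistical attributes of network traffic using Bayesian methods grounded in statistical theory, then uses the model to classify unknown traffic. Experiments analyze how different weights and dataset sizes affect its performance and compare it with the NB and C4.5 algorithms. The results show that the method achieves good classification performance and high classification accuracy.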

4.
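Encrypted traffic classification methods based on traditional recurrent neural networks generally suffer from poor parallelism and low runtime efficiency. To classify encrypted traffic quickly and accurately, an encrypted traffic classification method based on a multi-layer bidirectional simple recurrent unit (SRU) with attention (MLBSRU-A) model is proposed. Feature learning and classification are unified in a single end-to-end model, and the highly parallelizable sequence modeling of the SRU improves overall runtime efficiency. To raise the classification accuracy of MLBSRU-A, multiple bidirectional SRU layers are stacked so that features are extracted automatically from raw traffic, and an attention mechanism assigns different weights to the features, increasing the discrimination among important features. Experimental results show that on the public ISCX VPN-nonVPN dataset, MLBSRU-A achieves high classification accuracy and runtime efficiency: compared with the BGRUA model, its fine-grained classification accuracy is 4.34% higher and its training time is 55.38% shorter. On the USTC-TFC 2016 dataset, MLBSRU-A detects unknown encrypted malicious traffic with an accuracy of 99.50% and reaches a fine-grained classification accuracy of 98.84%, combining high-accuracy detection of unknown encrypted malicious traffic with fine-grained classification of encrypted malicious traffic.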
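As a rough illustration of the attention step only (not the MLBSRU-A architecture itself), the sketch below shows generic softmax attention pooling: each hidden state gets a weight from a relevance score, and the weighted sum forms the representation passed to the classifier. The function name `attention_pool` and the idea that scores are supplied directly are assumptions made for brevity; in a real model the scores would be learned.

```python
import math


def attention_pool(hidden_states, scores):
    """Softmax the scores into weights and return (weights, weighted sum).

    hidden_states: list of equal-length feature vectors (one per time step).
    scores: one relevance score per time step (assumed given here).
    """
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(score - peak) for score in scores]
    total = sum(exps)
    weights = [value / total for value in exps]

    dim = len(hidden_states[0])
    context = [
        sum(weight * state[i] for weight, state in zip(weights, hidden_states))
        for i in range(dim)
    ]
    return weights, context
```

Time steps with higher scores contribute more to the pooled vector, which is the sense in which attention "assigns different weights to the features".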

5.
焦程波 (Jiao Chengbo). 《计算机应用》 (Journal of Computer Applications), 2011, 31(11): 2965-2968
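Mixed flows forwarded by a network address translator (NAT) exhibit external traffic characteristics similar to those of P2P data flows. Practical tests show that when the capture point lies behind a NAT, current P2P traffic-feature identification methods (TLI) produce false alarms and missed detections because they do not distinguish NAT-forwarded mixed flows. To solve this problem, a P2P traffic detection method based on flow identity recognition is proposed. It first analyzes the IP identification (IP ID) time series to attribute the flows within a NAT-forwarded mixed flow to their originating devices, and then applies traffic-feature detection to find P2P traffic. Tests with currently popular P2P applications show that the method effectively identifies P2P traffic in NAT mixed flows and substantially reduces the false-alarm and miss rates.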
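The abstract does not give the details of the IP ID analysis. The sketch below assumes the commonly described behavior in which each host increments a single IP ID counter per packet sent, so packets from one device form a slowly increasing sequence modulo 65536. The function name `assign_devices` and the `max_gap` threshold are hypothetical.

```python
def assign_devices(ip_ids, max_gap=64):
    """Greedily split an interleaved IP ID series into per-device sequences.

    Assumption (not from the source): each device behind the NAT increments
    its own 16-bit IP ID counter, so a packet belongs to the device whose
    last IP ID lies a small positive step behind it (modulo 65536).
    Returns one inferred device index per packet.
    """
    last_ids = []      # last IP ID seen for each inferred device
    assignment = []
    for ip_id in ip_ids:
        best_device, best_gap = None, None
        for device, last in enumerate(last_ids):
            gap = (ip_id - last) % 65536  # forward distance with wraparound
            if 0 < gap <= max_gap and (best_gap is None or gap < best_gap):
                best_device, best_gap = device, gap
        if best_device is None:
            last_ids.append(ip_id)        # no sequence fits: start a new device
            best_device = len(last_ids) - 1
        else:
            last_ids[best_device] = ip_id
        assignment.append(best_device)
    return assignment
```

Once packets are attributed to devices in this way, a per-device traffic-feature detector can be run on each separated stream. Hosts that randomize IP IDs would defeat this simple heuristic.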

6.
Cervical cancer is one of the leading causes of cancer death in women worldwide. The disease can be cured if the patient is diagnosed at the pre-cancerous lesion stage or earlier. A physical examination technique widely used in screening is the Papanicolaou test, or Pap test. In this research, a method for automatic cervical cancer cell segmentation and classification is proposed. A single-cell image is segmented into nucleus, cytoplasm, and background using the fuzzy C-means (FCM) clustering technique. Four cell classes in the ERUDIT and LCH datasets are considered: normal, low-grade squamous intraepithelial lesion (LSIL), high-grade squamous intraepithelial lesion (HSIL), and squamous cell carcinoma (SCC). A 2-class problem is obtained by grouping the last 3 classes into one abnormal class. The Herlev dataset, by contrast, consists of 7 cell classes: superficial squamous, intermediate squamous, columnar, mild dysplasia, moderate dysplasia, severe dysplasia, and carcinoma in situ. These 7 classes can also be grouped to form a 2-class problem. The 3 datasets were tested on 5 classifiers: Bayesian classifier, linear discriminant analysis (LDA), K-nearest neighbor (KNN), artificial neural network (ANN), and support vector machine (SVM). For the ERUDIT dataset, ANN with 5 nucleus-based features yielded accuracies of 96.20% and 97.83% on the 4-class and 2-class problems, respectively. For the Herlev dataset, ANN with 9 cell-based features yielded accuracies of 93.78% and 99.27% on the 7-class and 2-class problems, respectively. For the LCH dataset, ANN with 9 cell-based features yielded accuracies of 95.00% and 97.00% on the 4-class and 2-class problems, respectively. The segmentation and classification performance of the proposed method was compared with that of hard C-means clustering and the watershed technique. The results show that the proposed automatic approach yields very good performance and is better than its counterparts.
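For readers unfamiliar with fuzzy C-means, here is a minimal sketch of the standard FCM iteration on one-dimensional values such as grey levels. It is not the paper's segmentation pipeline; the function names, the fuzzifier default `m=2.0`, the fixed iteration count, and the caller-supplied initial centers are all assumptions made for illustration.

```python
def fcm_memberships(values, centers, m):
    """Standard FCM membership: u_ik is proportional to d_ik ** (-2 / (m - 1))."""
    rows = []
    for x in values:
        dists = [abs(x - c) for c in centers]
        if 0.0 in dists:
            # x coincides with a center: give it full membership there
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            rows.append([h / sum(hits) for h in hits])
        else:
            inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
            total = sum(inv)
            rows.append([v / total for v in inv])
    return rows


def fuzzy_c_means(values, centers, m=2.0, iters=50):
    """Alternate membership and center updates; return (centers, memberships)."""
    centers = list(centers)
    for _ in range(iters):
        u = fcm_memberships(values, centers, m)
        centers = [
            sum((row[k] ** m) * x for row, x in zip(u, values))
            / sum(row[k] ** m for row in u)
            for k in range(len(centers))
        ]
    return centers, fcm_memberships(values, centers, m)
```

With three clusters on pixel intensities, taking the largest membership per pixel would give a hard three-way labelling analogous to nucleus, cytoplasm, and background, although real images would normally need spatial or colour information as well.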

7.
张剑 (Zhang Jian), 曹萍 (Cao Ping), 寿国础 (Shou Guochu). 《计算机应用》 (Journal of Computer Applications), 2012, 32(7): 1807-1811
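Network traffic classification algorithms that rely on the statistical properties of data flows are complex and perform poorly in real time. To address this, a network traffic identification method based on transport-layer topology is proposed. Because different application types exhibit different host-connection topologies at aggregation nodes, the method extracts topological features for each application type, combines them with deep packet inspection (DPI) to build an application-type library, and uses this library together with heuristic rules to identify and classify typical application types quickly. Experimental results show that the method identifies each major application type with a precision above 85% and reduces the proportion of unidentified flows from 18% under DPI to 7%. It makes effective use of the connection-topology information of different application types and can improve identification accuracy.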

8.
Internet traffic classification plays an important role in network security and management. Past research has used flow-level statistical features for accurate and efficient classification, for example in nearest-neighbor-based supervised classifiers. However, the classification accuracy of supervised approaches suffers significantly when the training set is small. More importantly, a model built from a static training set cannot adapt to the non-static nature of Internet traffic. With the drastic evolution of the Internet, network traffic cannot be assumed to be static. In this paper, we develop the concept of 'self-learning' to deal with these two challenges. We propose, design and develop a new classifier called Self-Learning Intelligent Classifier (SLIC). SLIC starts with a small number of training instances, then self-learns and rebuilds the classification model dynamically, with the aim of achieving high accuracy in classifying non-static traffic flows. We carry out performance evaluations using two real-world traffic traces and demonstrate the effectiveness of SLIC. The results show that SLIC achieves a significant improvement in accuracy over the state-of-the-art approach.

9.
A Method for Classifying P2P Traffic Using Machine Learning Algorithms*   Total citations: 4 (self-citations: 0, citations by others: 4)
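The rapid growth of P2P applications causes network congestion and many other problems, and traditional P2P traffic classification methods based on ports and payloads have many shortcomings. Using information extracted from P2P flows that is independent of port, protocol, and payload as features, a proposed ReliefF-CFS-based method selects feature subsets of flows, and machine learning algorithms are studied for classifying P2P traffic. Classification that uses statistics of the first N packets of a flow as features is also studied. Experimental results show that the proposed method achieves good classification accuracy.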
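The abstract does not list which statistics are taken over the first N packets, so the feature set below is only an illustrative guess. The function name `first_n_features` and the `(timestamp, size)` packet representation are assumptions; none of the features depend on port, protocol, or payload.

```python
from statistics import mean, pstdev


def first_n_features(packets, n=5):
    """Summarize the first n packets of a flow.

    packets: list of (timestamp_seconds, size_bytes) tuples in arrival order.
    Returns a dict of simple size and inter-arrival statistics.
    """
    head = packets[:n]
    if not head:
        raise ValueError("flow has no packets")
    sizes = [size for _, size in head]
    times = [timestamp for timestamp, _ in head]
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    return {
        "count": len(head),
        "min_size": min(sizes),
        "max_size": max(sizes),
        "mean_size": mean(sizes),
        "std_size": pstdev(sizes),
        "mean_gap": mean(gaps) if gaps else 0.0,
    }
```

Feature vectors of this kind can be computed as soon as the Nth packet arrives, which is what makes early-packet features attractive for classifying flows before they finish.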

10.
The goal of network traffic classification is to identify the protocols or types of protocols in network traffic. In particular, identifying network traffic with high resource consumption, such as peer-to-peer (P2P) traffic, is a major concern for Internet Service Providers (ISPs) and network managers. Most current flow-based classification approaches report high accuracy without paying attention to the generalization ability of the classifier. Without this ability, however, a classifier may not be suitable for on-line classification. In this paper, a number of experiments on real traffic help to elucidate the reason for this lack of generalization. It is also shown that one way to attain generalization ability is to use dynamic classifiers. From these results, a dynamic classification approach based on the pairing of flows according to a similarity criterion is proposed. The pairing method is not a classifier by itself. Rather, its goal is to determine quickly whether two given flows are similar enough to conclude that they correspond to the same protocol. When this method is combined with a classifier, most flows do not need to be explicitly evaluated by the latter, so the computational overhead is reduced without a significant reduction in accuracy. As a case study, we explore complementing the pairing method with payload inspection. In the experiments performed, the pairing approach generalizes well to traffic obtained under conditions and scenarios different from those used for calibration. Moreover, a large portion of the traffic left unclassified by payload inspection is categorized by the pairing method.
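The similarity criterion itself is not given in the abstract. The sketch below assumes a deliberately simple one, two flows being "similar" when they share the same server IP address and port, purely to show how pairing can spare the classifier. The names `server_endpoint`, `classify_with_pairing`, and the flow dictionary keys are hypothetical.

```python
def server_endpoint(flow):
    """Hypothetical similarity key: flows to the same server endpoint pair up."""
    return (flow["dst_ip"], flow["dst_port"])


def classify_with_pairing(flows, inspect_payload, pair_key=server_endpoint):
    """Label flows, calling the expensive classifier only for unpaired flows.

    inspect_payload(flow) returns a protocol label, or None if it cannot decide.
    Returns (labels, number_of_classifier_calls).
    """
    known = {}       # similarity key -> label learned from an earlier flow
    labels = []
    inspected = 0
    for flow in flows:
        key = pair_key(flow)
        if key in known:
            labels.append(known[key])  # paired with an already labelled flow
            continue
        label = inspect_payload(flow)
        inspected += 1
        if label is not None:
            known[key] = label         # remember only definite decisions
        labels.append(label)
    return labels, inspected
```

In this sketch a flow whose payload is uninformative (for example, encrypted) still receives a label if an earlier similar flow was identified, which mirrors the reported effect that pairing categorizes traffic that payload inspection alone leaves unclassified.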

11.

Visual sleep stage scoring by human experts is the current gold standard for sleep analysis. However, this method is tedious, time-consuming, prone to human error, and unable to detect the microstructure of sleep such as the cyclic alternating pattern (CAP), which is an important diagnostic factor for sleep disorders such as insomnia and obstructive sleep apnea (OSA). The CAP is observed only as subtle changes in electroencephalogram (EEG) signals during non-rapid eye movement (NREM) sleep, making it very difficult for human experts to discern. Hence, it is important to have an automated system, developed using artificial intelligence, for accurate and robust CAP detection and sleep stage classification. In this study, a deep learning model based on a 1-dimensional convolutional neural network (1D-CNN) is proposed for CAP detection and homogeneous 3-class sleep stage classification, namely wakefulness (W), rapid eye movement (REM), and NREM sleep. The proposed model is developed using standardized EEG recordings. The developed CNN achieved good performance for 3-class sleep stage classification, with a classification accuracy of 90.46%. The proposed model also yielded a classification accuracy of 73.64% on the balanced CAP dataset and a sensitivity of 92.06% on the unbalanced CAP dataset. It correctly identified the majority of A-phases, which made up only 12.6% of the unbalanced dataset. The developed prototype is ready to be tested with more data before clinical application.


12.
A Survey of P2P Traffic Identification Techniques   Total citations: 1 (self-citations: 0, citations by others: 1)
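After summarizing the concepts behind the P2P traffic identification problem, this survey gives a fairly comprehensive analysis of existing P2P traffic identification techniques. The problem is defined formally with the help of a classification model. According to the identification features they use, existing techniques are divided into five classes: port-based, traffic-feature-based, application-layer-signature-based, dual-feature-based, and statistical-behavior-feature-based methods. Each class is introduced, analyzed, and compared for strengths and weaknesses. The emerging problem of identifying P2P streaming-media traffic is discussed, and development trends in P2P traffic identification are summarized.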

13.
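In P2P streaming-media traffic, control flows and data flows differ greatly in their statistical features, so deep flow inspection (DFI) identifies them poorly. Drawing on the idea behind DFI, a method for identifying P2P streaming-media traffic based on endpoint features is proposed. For each network endpoint, the method extracts six effective features and combines them with machine learning to identify P2P streaming-media traffic. Experimental results show that the method achieves higher overall accuracy than DFI and can be used for online identification of P2P streaming media.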

14.
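For flow identification and classification, the mainstream techniques are currently statistical, and their core step is extracting an effective set of feature attributes. These methods assume that features are uncorrelated and that data are uncorrelated. Because this assumption is unreasonable, classification performance is limited. Although much research has focused on feature correlation, data correlation has remained hard to address. Traffic fractal theory, which is built on data correlation, is therefore introduced. The original theory is modified and adjusted as needed to suit flow classification and identification, its validity is established by theoretical proof, and a series of experiments demonstrates the method's practical performance in coarse-grained classification, unknown-flow classification, and related tasks.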

15.

Service availability plays a vital role in computer networks, and Distributed Denial of Service (DDoS) attacks pose a growing threat to it each year. Machine learning (ML) is a promising approach widely used for DDoS detection, and it obtains satisfactory results for previously known attacks. However, ML classifiers are almost incapable of detecting unknown malicious traffic. This paper proposes a novel method combining supervised and unsupervised algorithms. First, a clustering algorithm separates anomalous traffic from normal data using several flow-based features. Then, using certain statistical measures, a classification algorithm labels the clusters. Employing a big data processing framework, we evaluate the proposed method by training on the CICIDS2017 dataset and testing on a different set of attacks provided in the more up-to-date CICDDoS2019. The results demonstrate that the Positive Likelihood Ratio (LR+) of our method is approximately 198% higher than that of the ML classification algorithms.
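For reference, the Positive Likelihood Ratio is conventionally defined as sensitivity divided by the false positive rate, that is, LR+ = TPR / (1 - specificity). The small helper below computes it from confusion-matrix counts; the function name and the choice to return infinity when there are no false positives are illustrative conventions, not taken from the paper.

```python
def positive_likelihood_ratio(tp, fn, fp, tn):
    """LR+ = sensitivity / (1 - specificity), from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)           # true positive rate
    false_positive_rate = fp / (fp + tn)   # equals 1 - specificity
    if false_positive_rate == 0:
        return float("inf")                # no false alarms at all
    return sensitivity / false_positive_rate
```

A detector that flags 90 of 100 attack flows and raises false alarms on 5 of 100 benign flows has an LR+ of about 18, meaning an alert is roughly 18 times more likely on attack traffic than on benign traffic.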


16.
It is estimated that 70% or more of broadband bandwidth is consumed by transmitting music, games, video, and other content through peer-to-peer (P2P) clients. To detect, identify, and manage P2P traffic, methods based on port, payload, and transport-layer features have been proposed. Most of them have been applied to offline traffic classification, mainly for performance reasons. In this paper, an online hybrid traffic classifier based on network processors (NPs) is proposed. The hardware classifier classifies P2P traffic at line speed based on static characteristics, and the software classifier, based on a Flexible Neural Tree (FNT), helps to learn and select P2P traffic attributes from the statistical characteristics of P2P traffic. Experimental results show that the hybrid classifier performs well for online classification of P2P traffic on a gigabit network. The proposed framework also shows good extensibility for adding new P2P features and adapting to new P2P applications online.

17.
李麟青 (Li Linqing), 杨哲 (Yang Zhe), 朱艳琴 (Zhu Yanqin). 《计算机应用》 (Journal of Computer Applications), 2011, 31(12): 3210-3214
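P2P traffic has become the major part of Internet traffic; it consumes large amounts of bandwidth and degrades quality of service. Detecting P2P traffic accurately and in real time helps in supervising P2P applications and in studying their behavior and development. For BT traffic, which accounts for the largest share of P2P traffic, a hybrid detection method is proposed. It consists of three sub-methods that detect the plaintext flows, ciphertext flows, and signaling flows in BT traffic, respectively, and it predicts BT traffic that is about to occur. Experimental results show that the method's recall, accuracy, and real-time performance are all better than those of several machine learning methods that currently offer the best real-time performance.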

18.
顾苏杭 (Gu Suhang), 王士同 (Wang Shitong). 《控制与决策》 (Control and Decision), 2020, 35(11): 2653-2664
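Because every class in a real-world dataset latently or overtly carries its own data style information, a double-knowledge-representation classification method that mines this style information is proposed. In the training stage, the K-nearest neighbor (KNN) algorithm is used to build a social network that expresses the organizational structure among data points, and the properties of the social network are used to mine the style information of individual data points and of each class as a whole. In the classification stage, double knowledge representation constrains the classification behavior of the method: a test sample is assigned the label for which it is most similar to the classification model both physically and in style. Compared with the other classification methods tested, the proposed method achieves at least competitive performance on datasets with no style or only weak style, and superior performance on datasets with pronounced style.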

19.
For mobile communication traffic series, accurate multistep prediction plays an important role in network management, capacity planning, traffic congestion control, channel equalization, and similar tasks. A novel time series forecasting method based on echo state networks and a multiplicative seasonal ARIMA model is proposed for these multiperiodic, nonstationary traffic series. Motivated by the fact that real traffic series exhibit periodicities of 6, 12, and 24 h as well as 1 week, we isolate most of these features for each cell and integrate all the wavelet multiresolution sublayers into two parts in order to alleviate the accumulated error. The multiplicative seasonal ARIMA model predicts the seasonal part, and echo state networks handle the smooth part because of their strong approximation capabilities and convenience. Experimental results on a real traffic dataset show that the proposed method achieves good prediction accuracy.

20.
Handwritten digit recognition has long been an open problem in pattern classification and is of great importance in industry. The heart of the problem lies in designing an efficient algorithm that can recognize digits written and submitted by users via tablets, scanners, and other digital devices. From an engineering point of view, it is desirable to achieve good performance with limited resources. To this end, we have developed a new approach to handwritten digit recognition that uses a small number of patterns in the training phase. To improve overall classification performance, the literature suggests combining the decisions of multiple classifiers rather than using the output of the best classifier in the ensemble; in this new approach, therefore, an ensemble of classifiers is used to recognize handwritten digits. The classifiers in the proposed system are based on the singular value decomposition (SVD) algorithm. The experimental results and the literature show that the SVD algorithm is suitable for handling sparse matrices such as those of handwritten digits. The decisions obtained by the SVD classifiers are combined by a novel combination rule that we call reliable multi-phase particle swarm optimization. We call the method "reliable" because we have introduced a novel reliability parameter that tackles the problem of PSO being trapped in local minima. Compared with previous methods, one significant advantage of the proposed method is that it is not sensitive to the size of the training set. The proposed method uses just 15% of the dataset for training, whereas other methods usually use 60 to 75% of the whole dataset. To evaluate the proposed method, we tested our algorithm on a Farsi/Arabic handwritten digit dataset. What makes recognition of handwritten Farsi/Arabic digits more challenging is that some of the digits can legitimately be written in different shapes. Therefore, 6000 hard samples (600 samples per class) were chosen by the K-nearest neighbor algorithm from the HODA dataset, a standard Farsi/Arabic digit dataset. Experimental results show that the proposed method is fast, accurate, and robust against the local minima of PSO. Finally, the proposed method is compared with state-of-the-art methods and with several ensemble classifiers based on MLP, RBF, and ANFIS with various combination rules.

