Similar literature
 20 similar documents found
1.
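For the detection of malicious applications on the Android platform, an Android malware detection method based on an ensemble-learning voting algorithm, MASV (Soft-Voting Algorithm), is proposed to classify unknown applications effectively. The base data for the experiments were obtained from known open-source datasets; the application set contains 213 256 benign applications and 18 363 malicious applications. The SVM-RFE feature selection algorithm is used to reduce the feature dimensionality. An ensemble of classifiers, namely SVM (Support Vector Machine), K-NN (K-Nearest Neighbor), NB (Naïve Bayes), CART (Classification and Regression Tree), and RF (Random Forest), is used to distinguish malicious applications from benign ones. A gradient ascent algorithm is used to determine the weights of the base classifiers in the ensemble's soft vote. Experimental results show that the method achieves an accuracy of 99.27% in malicious application detection.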
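
The weighted soft vote described in this abstract can be illustrated with a minimal sketch. It assumes each base classifier already outputs class probabilities for one sample and that the weights are given; the function name `soft_vote` and the numbers are hypothetical, and the gradient ascent step that learns the weights is not shown.

```python
def soft_vote(probabilities, weights):
    """Weighted soft vote over base classifiers (illustrative sketch).

    probabilities: one list per base classifier, holding that classifier's
    class-probability estimates for a single sample.
    weights: one non-negative weight per base classifier (assumed given).
    Returns (predicted_class_index, combined_probabilities).
    """
    if len(probabilities) != len(weights):
        raise ValueError("one weight per classifier is required")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_classes = len(probabilities[0])
    # Weighted average of the per-classifier probabilities for each class.
    combined = [
        sum(w * p[c] for w, p in zip(weights, probabilities)) / total
        for c in range(n_classes)
    ]
    predicted = max(range(n_classes), key=combined.__getitem__)
    return predicted, combined


# Hypothetical outputs of three base classifiers for classes [benign, malicious].
probs = [[0.6, 0.4], [0.3, 0.7], [0.2, 0.8]]
equal_label, _ = soft_vote(probs, [1.0, 1.0, 1.0])
skewed_label, skewed_probs = soft_vote(probs, [8.0, 1.0, 1.0])
```

With equal weights the two classifiers favouring class 1 win; a large weight on the first classifier flips the decision, which is why the choice of weights matters.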

2.
Automatic personality perception is the prediction of the personality that others attribute to a person in a given situation. Its aim is to forecast, from nonverbal behavior, the behavior of the speaker as perceived by the listener. Extroversion, Conscientiousness, Agreeableness, Neuroticism, and Openness are the speaker traits used for personality assessment. In this work, a speaker trait prediction approach for automatic personality assessment is proposed. The approach models the relationship between the speech signal and personality traits using spectral features. The experiments are conducted on the SSPNet Personality Corpus. Frequency Domain Linear Prediction and Mel Frequency Cepstral Coefficient features are extracted for the prediction of speaker traits. Classification is done using instance-based k-Nearest Neighbor and Support Vector Machine (SVM) classifiers. The experimental results show that the k-Nearest Neighbor classifier outperforms the SVM classifier. The classification accuracy is between 90 and 100%.

3.
We describe a system that learns from examples to recognize persons in images taken indoors. Images of full-body persons are represented by color-based and shape-based features. Recognition is carried out through combinations of Support Vector Machine (SVM) classifiers. Different types of multi-class strategies based on SVMs are explored and compared to k-Nearest Neighbors classifiers. The experimental results show high recognition rates and indicate the strength of SVM-based classifiers in improving both generalization and run-time performance. The system works in real time.

4.
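To address the high resource consumption, long detection cycles, and poor classification performance of current mainstream malicious web page detection techniques, a Stacking-based ensemble detection method for malicious web pages is proposed, applying heterogeneous classifier ensembles to malicious web page detection and identification. The detection model is obtained by extracting web page features, analyzing the relevant factors, and applying ensemble learning for classification. The primary classifiers are built with the K-nearest neighbor (KNN), logistic regression, and decision tree algorithms, while the secondary meta-classifier is built with the support vector machine (SVM) algorithm. Compared with traditional malicious web page detection approaches, the method improves recognition accuracy by 0.7% while consuming fewer resources and running faster, reaching a high accuracy of 98.12%. Experimental results show that the detection model built by the proposed method can identify malicious web pages efficiently and accurately.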

5.
In this paper, a classifier motivated by statistical learning theory, i.e., the support vector machine, combined with a new approach based on a multiclass directed acyclic graph, is proposed for the classification of four types of electrocardiogram signals. The motivation for selecting the Directed Acyclic Graph Support Vector Machine (DAGSVM) is to obtain a more accurate classifier at lower computational cost. Empirical mode decomposition followed by singular value decomposition is used to compute the feature vector matrix. Further, fivefold cross-validation and particle swarm optimization are used to select the SVM model parameters and so improve the performance of DAGSVM. The proposed algorithm is compared with two other classifiers, i.e., K-Nearest Neighbor (KNN) and Artificial Neural Network (ANN). The DAGSVM yields an average accuracy of 98.96%, against 95.83% and 96.66% for the KNN and the ANN, respectively. The results confirm the superiority of the DAGSVM approach over the other classifiers.
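
How a decision DAG turns pairwise binary classifiers into a multiclass decision can be sketched as follows. This is a generic illustration of DAG-style evaluation, not the paper's implementation: the pairwise deciders here are toy nearest-centre rules standing in for trained SVMs, and the names `dag_predict`, `centers`, and `make_pair` are hypothetical.

```python
from itertools import combinations


def dag_predict(x, classes, pairwise):
    """Walk a decision DAG over an ordered list of classes.

    pairwise[(a, b)] is a binary decision function returning a or b for x,
    where a precedes b in `classes`. Each step removes the losing class,
    so k classes need k - 1 pairwise evaluations.
    """
    remaining = list(classes)
    while len(remaining) > 1:
        first, last = remaining[0], remaining[-1]
        winner = pairwise[(first, last)](x)
        if winner == first:
            remaining.pop()      # the last class lost this comparison
        else:
            remaining.pop(0)     # the first class lost this comparison
    return remaining[0]


# Toy stand-ins for trained pairwise SVMs: pick the nearer 1-D class centre.
centers = {"A": 0.0, "B": 10.0, "C": 20.0, "D": 30.0}


def make_pair(a, b):
    return lambda x: a if abs(x - centers[a]) <= abs(x - centers[b]) else b


pairwise = {(a, b): make_pair(a, b) for a, b in combinations(centers, 2)}
label = dag_predict(12.0, list(centers), pairwise)
```

For four classes, as in the abstract, only three of the six pairwise classifiers are evaluated per sample, which is the source of the lower test-time cost.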

6.
To develop Human-centric Driver Assistance Systems (HDAS) for automatic understanding and characterizing of driver behaviors, an efficient feature extraction of driving postures based on the Geronimo–Hardin–Massopust (GHM) multiwavelet transform is proposed, and Multilayer Perceptron (MLP) classifiers with three layers are then exploited to recognize four pre-defined classes of driving postures. With features extracted from a driving posture dataset created at Southeast University (SEU), holdout and cross-validation experiments on driving posture classification are conducted with MLP classifiers and compared with the Intersection Kernel Support Vector Machines (IKSVMs), the k-Nearest Neighbor (kNN) classifier, and the Parzen classifier. The experimental results show that feature extraction based on the GHM multiwavelet transform together with an MLP classifier, using a softmax activation function in the output layer and a hyperbolic tangent activation function in the hidden layer, offers the best classification performance compared to the IKSVMs, kNN, and Parzen classifiers. The experimental results also show that talking on a cellular phone is the most difficult of the four predefined classes to classify, with results of 83.01% and 84.04% in the holdout and cross-validation experiments, respectively. These results show the effectiveness of the feature extraction approach using the GHM multiwavelet transform and the MLP classifier in automatically understanding and characterizing driver behaviors for HDAS.

7.
The traditional Support Vector Machine (SVM) solution suffers from O(n^2) time complexity, which makes it impractical for very large datasets. To reduce this high computational complexity, several data reduction methods have been proposed in previous studies. However, such methods are not effective at extracting informative patterns. In this paper, a two-stage informative pattern extraction approach is proposed. The first stage is data cleaning based on bootstrap sampling. A bundle of weak SVM classifiers is constructed on the sampled datasets. Training data correctly classified by all the weak classifiers are removed because they carry little useful information for training. To extract more informative training data, two informative pattern extraction algorithms are proposed in the second stage. As most training data are eliminated and only the more informative samples remain, the final SVM training time is reduced significantly. The contributions of this paper are three-fold. (1) A parallelized bootstrap sampling based method is proposed to clean the initial training data, eliminating a large number of training data with little information. (2) Two algorithms are presented to extract more informative training data effectively. Both are based on maximum information entropy according to the empirical misclassification probability of each sample estimated in the first stage. Training time is therefore reduced further as the training data are reduced further. (3) Empirical studies on four large datasets show the effectiveness of the approach in reducing the training data size and the computational cost, compared with state-of-the-art algorithms including PEGASOS, LIBLINEAR SVM, and RSVM. Meanwhile, the generalization performance of the approach is comparable with that of the baseline methods.
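
The two stages can be pictured with a rough sketch. It is an assumption-laden reading of the abstract, not the paper's algorithms: each sample's empirical misclassification probability is taken as the fraction of weak classifiers that got it wrong, samples with zero errors are cleaned, and the rest are ranked by binary entropy. The names `binary_entropy` and `rank_informative` and the error counts are hypothetical.

```python
import math


def binary_entropy(p):
    """Entropy in bits of a Bernoulli variable with success probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def rank_informative(error_counts, n_classifiers):
    """Return indices of retained samples, highest entropy first.

    error_counts[i] is how many of the weak classifiers misclassified
    sample i (assumed to come from the bootstrap stage).
    """
    scored = []
    for index, errors in enumerate(error_counts):
        if errors == 0:
            continue  # classified correctly by every weak classifier: cleaned
        scored.append((binary_entropy(errors / n_classifiers), index))
    # Sort by descending entropy, breaking ties by sample index.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [index for _, index in scored]


# Hypothetical error counts for four samples under ten weak classifiers.
ranking = rank_informative([0, 5, 10, 1], n_classifiers=10)
```

Under this reading, a sample misclassified about half the time ranks highest, while one that is always right is dropped before the final SVM is trained.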

8.
The errors resulting from satellite configuration geometry can be characterized by the Geometric Dilution of Precision (GDOP). In optimal satellite subset selection, a lower GDOP value usually gives better accuracy in GPS positioning. However, GDOP computation based on complicated transformation and inversion of measurement matrices is a time-consuming procedure. This paper deals with the classification of GPS GDOP using Bayesian decision theory based on Parzen estimation. The conditional probability of each class is estimated by the Parzen algorithm. Then, based on Bayesian decision theory, the class with the maximum posterior probability is selected. Experiments on a measured dataset demonstrate that the proposed algorithm yields a mean classification improvement of 4.08% over the Support Vector Machine (SVM) and 9.83% over the K-Nearest Neighbour (KNN) classifier. Additional work on feature extraction has been performed based on Principal Component Analysis (PCA). The results demonstrate that the feature extraction approach gives the best performance for all classifiers.
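
The Parzen-plus-Bayes decision rule described here can be sketched in a few lines. The sketch assumes one-dimensional features, a Gaussian window, a fixed bandwidth `h`, and priors taken from class frequencies; all of these, along with the toy data and function names, are illustrative choices that the abstract does not specify.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used here as the Parzen window."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def parzen_density(x, samples, h):
    """Parzen estimate of p(x | class) from 1-D samples with bandwidth h."""
    return sum(gaussian_kernel((x - s) / h) for s in samples) / (len(samples) * h)


def classify(x, training, h=1.0):
    """Pick the class with the largest prior * class-conditional density."""
    n_total = sum(len(samples) for samples in training.values())
    scores = {
        label: (len(samples) / n_total) * parzen_density(x, samples, h)
        for label, samples in training.items()
    }
    return max(scores, key=scores.get)


# Hypothetical 1-D feature values for two GDOP classes.
training = {"low": [1.0, 1.5, 2.0], "high": [5.0, 5.5, 6.0]}
near_low = classify(1.2, training)
near_high = classify(5.8, training)
```

Because the posterior is proportional to prior times likelihood, comparing these products is enough to select the maximum-posterior class without computing the normalizing constant.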

9.
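A multi-feature fusion method based on support vector machine and k-nearest neighbor classifiers   Cited by: 1 (self-citations: 0, citations by others: 1)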
陈丽  陈静 《计算机应用》2009,29(3):833-835
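To address the one-sidedness and limited accuracy of traditional classification methods that rely on a single classifier, and the tendency of support vector machines to misclassify points near the separating hyperplane, a multi-feature fusion method based on support vector machine (SVM) and k-nearest neighbor (KNN) is proposed. In this algorithm, the features of the sample set are assumed to be divisible into L groups. First, the SVM algorithm constructs a separating hyperplane from each group of feature data in the training set, giving L hyperplanes in total. Second, the SVM-KNN method is applied to the test set, yielding a decision profile matrix composed of L groups of posterior probabilities. Finally, multi-feature fusion is performed on this matrix to output the final classification result. Numerical experiments on the Iris plant data show that the SVM-KNN-based multi-feature fusion method improves the average prediction accuracy by 28.7% and 1.9% over using a single SVM or the SVM-KNN method alone, respectively.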

10.
In this paper we investigate the combination of four machine learning methods for text categorization using Dempster's rule of combination. These methods are the Support Vector Machine (SVM), k-Nearest Neighbor (kNN), the kNN model-based approach (kNNM), and Rocchio. We first present a general representation of the outputs of different classifiers, in particular modeling each output as a piece of evidence by using a novel evidence structure called the focal element triplet. Furthermore, we investigate an effective method for combining pieces of evidence derived from classifiers generated by 10-fold cross-validation. Finally, we evaluate our methods on the 20-newsgroup and Reuters-21578 benchmark data sets and perform a comparative analysis with majority voting in combining multiple classifiers, along with the previous result. Our experimental results show that the best combined classifier can improve on the performance of the individual classifiers and that Dempster's rule of combination outperforms majority voting in combining multiple classifiers.
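
Dempster's rule of combination itself is standard and can be sketched briefly. The example below is generic: mass functions are dictionaries keyed by `frozenset` focal elements, and the category names and mass values are hypothetical. Giving each classifier two singleton focal elements plus the whole frame is an illustrative choice here, not a statement of how the paper defines its triplet.

```python
def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    m1, m2: dicts mapping frozenset focal elements to masses summing to 1.
    Mass falling on empty intersections is treated as conflict and
    normalized away.
    """
    combined = {}
    conflict = 0.0
    for a, mass_a in m1.items():
        for b, mass_b in m2.items():
            intersection = a & b
            if intersection:
                combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
            else:
                conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("total conflict: the sources cannot be combined")
    return {k: v / (1.0 - conflict) for k, v in combined.items()}


# Hypothetical evidence from two text classifiers over three categories.
frame = frozenset({"sport", "politics", "tech"})
m_first = {frozenset({"sport"}): 0.6, frozenset({"politics"}): 0.3, frame: 0.1}
m_second = {frozenset({"sport"}): 0.5, frozenset({"tech"}): 0.3, frame: 0.2}
fused = dempster_combine(m_first, m_second)
best = max(fused, key=fused.get)
```

In this toy case both sources lean towards the same category, so the combined mass on it exceeds either individual mass, which is the reinforcing behaviour that majority voting cannot express.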

11.
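Research on a multi-class classification algorithm based on fuzzy support vector machines   Cited by: 1 (self-citations: 1, citations by others: 0)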
张钊  费一楠  宋麟  王锁柱 《计算机应用》2008,28(7):1681-1683
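To address the multi-class problem in support vector machine theory and the sensitivity of SVM to noisy data, a fuzzy support vector machine multi-class classification algorithm based on a binary tree is proposed. The algorithm introduces a fuzzy membership function into the binary-tree-based SVM multi-class algorithm. According to the different influence of each sample on the classification result, a KNN-based fuzzy membership measure computes the corresponding membership value, which yields a different penalty value for each sample; data that are unimportant to the classification result can thus be ignored when the separating hyperplane is constructed. Experiments show that the algorithm has good resistance to interference and good classification performance.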

12.
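In text classification, both the KNN and SVM algorithms achieve high classification accuracy, but each has inherent drawbacks: KNN incurs excessive computation when there are many training samples, while SVM is overly sensitive to noisy data and cannot accurately classify data points lying near the separating hyperplane. On this basis, a hybrid classification algorithm based on variable precision rough set theory is proposed, which makes full use of the strengths of both algorithms while overcoming their weaknesses. Experiments show that the hybrid algorithm effectively improves both computational complexity and classification accuracy.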

13.

Brain tumor segmentation is the process of separating a brain tumor from normal brain tissue. Segmenting a tumor from MR images is a very challenging task because brain tumors vary in shape and size. Segmentation proceeds in multiple phases: pre-processing, segmentation, feature extraction, feature reduction, and classification of the tumor as benign or malignant. In this paper, Otsu thresholding is used in the segmentation phase, the Discrete Wavelet Transform (DWT) in the feature extraction phase, Principal Component Analysis (PCA) in the feature reduction phase, and the Support Vector Machine (SVM), Least Squares Support Vector Machine (LS-SVM), Proximal Support Vector Machine (PSVM), and Twin Support Vector Machine (TWSVM) in the classification phase. We have compared the performance of all these classifiers; TWSVM outperformed all the others, with 100% accuracy.
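
The Otsu thresholding step named in this abstract can be illustrated with a short, generic sketch that picks the grey level maximizing the between-class variance. It works on a flat list of integer intensities rather than a real MR image, and the function name and toy pixel values are hypothetical.

```python
def otsu_threshold(pixels, levels=256):
    """Return the grey level t that maximizes between-class variance.

    pixels: iterable of integer intensities in [0, levels).
    Pixels with value <= t form one class, the rest form the other.
    """
    hist = [0] * levels
    for value in pixels:
        hist[value] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    weight_low = 0      # number of pixels at or below the candidate level
    sum_low = 0         # intensity sum of those pixels
    best_level, best_variance = 0, -1.0
    for level in range(levels):
        weight_low += hist[level]
        sum_low += level * hist[level]
        if weight_low == 0:
            continue    # no pixels in the low class yet
        weight_high = total - weight_low
        if weight_high == 0:
            break       # every pixel is already in the low class
        mean_low = sum_low / weight_low
        mean_high = (sum_all - sum_low) / weight_high
        variance = weight_low * weight_high * (mean_low - mean_high) ** 2
        if variance > best_variance:
            best_level, best_variance = level, variance
    return best_level


# Toy "image": a dark region and a bright region.
threshold = otsu_threshold([10] * 50 + [200] * 50)
```

Any level between the two intensity clusters separates them equally well in this toy case; on real images the histogram is continuous and the maximum is usually unique.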

14.
Despite the online availability of data, analysis of this information in academic research is arduous. This article explores the application of supervised machine learning (SML) to overcome challenges associated with online data analysis. In SML, classifiers are used to categorize and code binary data. Based on a case study of Dutch employees' work-related tweets, this paper compares the coding performance of three classifiers: Linear Support Vector Machine, Naïve Bayes, and logistic regression. The performance of these classifiers is assessed by examining accuracy, precision, recall, the area under the precision-recall curve, and Krippendorff's alpha. These indices are obtained by comparing the coding decisions of the classifier to manual coding decisions. The findings indicate that the Linear Support Vector Machine and Naïve Bayes classifiers outperform the logistic regression classifier. This study also compared the performance of these classifiers on stratified random samples and simple random samples of training data. The findings indicate that in smaller training sets stratified random training samples perform better than random training samples, whereas in large training sets (n = 4000) random samples yield better results. Finally, the Linear Support Vector Machine classifier was trained with 4000 tweets and subsequently used to categorize 578,581 tweets obtained from 430 employees.

15.
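To address the limited resources of the current Android platform and its insufficient malware detection capability, a ROM-customization-based framework for dynamically monitoring Android software behavior is built, taking existing behavioral features of Android installation methods, trigger methods, and malicious payloads as the basis for identification. Using the information gain, chi-square test, and Fisher Score feature selection methods, the effectiveness of four classifiers, namely support vector machine (SVM), decision tree, k-nearest neighbor (KNN), and naive Bayes (NB), is evaluated for Android malware classification and detection. An evaluation of the overall classification performance on behavior logs from 20916 malicious samples and 17086 benign samples shows that the SVM algorithm achieves an accuracy above 93% and a false positive rate below 2% in malware determination, giving the best overall performance. The approach can be applied in online cloud analysis environments and detection platforms to meet the demands of processing massive numbers of samples.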

16.
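The characteristics of imbalanced datasets cause many difficulties in classification. Classification methods for imbalanced datasets are analyzed and summarized. Among data sampling methods, classification methods for imbalanced datasets are introduced from three aspects: undersampling, oversampling, and hybrid sampling. Undersampling methods are divided into three kinds, based on K-nearest neighbors, Bagging, and Boosting; oversampling methods are analyzed from two perspectives, the Synthetic Minority Over-sampling Technology (SMOTE) and the Support Vector Machine (SVM). The advantages and disadvantages of these two kinds of sampling methods are compared, and the performance of the algorithms on the same datasets is compared, analyzed, and summarized. Classification methods for imbalanced datasets are then summarized from four aspects: deep learning, extreme learning machines, cost sensitivity, and feature selection. Finally, directions for future work are discussed.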

17.
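To improve the performance of network security situation assessment, a network security situation assessment model combining K-nearest neighbor and support vector machine (KNN-SVM) is proposed. The network security dataset is fed to a support vector machine for learning to find the set of support vectors. For a network security situation sample to be assessed, its distance to the optimal separating hyperplane is computed; if the distance is greater than a threshold, the support vector machine is used for the assessment, otherwise K-nearest neighbor is used. This addresses the SVM's tendency to misclassify samples near the hyperplane and reduces the SVM's misjudgment rate. Simulation results show that, compared with SVM alone, KNN-SVM improves the accuracy of network security situation assessment and performs more stably.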
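
The switching rule in this abstract can be sketched as follows. The sketch assumes a linear SVM whose weights `w` and bias `b` are already trained, two classes labelled +1 and -1, and a set of labelled reference points for the KNN step (these could be the support vectors, but that is an assumption). The function name and all numbers are hypothetical.

```python
import math


def knn_svm_predict(x, w, b, threshold, neighbors, k=3):
    """Use the linear SVM far from the hyperplane and kNN close to it.

    w, b: weights and bias of an already-trained linear SVM (assumed given).
    neighbors: list of (point, label) pairs with labels +1 or -1.
    k should be odd so that the vote cannot tie.
    """
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    distance = abs(score) / math.sqrt(sum(wi * wi for wi in w))
    if distance > threshold:
        return 1 if score > 0 else -1
    # Near the hyperplane: majority vote among the k nearest reference points.
    nearest = sorted(neighbors, key=lambda item: math.dist(x, item[0]))[:k]
    vote = sum(label for _, label in nearest)
    return 1 if vote > 0 else -1


# Hypothetical hyperplane x1 = 0 and a few labelled reference points.
w, b = [1.0, 0.0], 0.0
reference = [((0.2, 0.0), -1), ((0.3, 0.1), -1), ((-0.5, 0.0), 1), ((3.0, 0.0), 1)]
far_label = knn_svm_predict((5.0, 0.0), w, b, threshold=1.0, neighbors=reference)
near_label = knn_svm_predict((0.25, 0.0), w, b, threshold=1.0, neighbors=reference)
```

In the second call the SVM score alone would give +1, but the sample lies within the threshold distance, so the local neighbourhood decides instead.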

18.
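Research on a classification algorithm based on support vector machine and reverse K-nearest neighbor   Cited by: 1 (self-citations: 0, citations by others: 1)
To address the problem that points near the decision hyperplane are easily misclassified when a support vector machine classifies samples, the reverse K-nearest neighbor method is first introduced into classification, yielding a reverse K-nearest neighbor classification algorithm (RKNN). The support vector machine (SVM) is then combined with RKNN to give a classification algorithm based on SVM and reverse K-nearest neighbor (SVM-RKNN). Finally, to avoid the one-sidedness that a single classifier may exhibit, a multi-feature fusion classification method based on SVM-RKNN is proposed. Experimental results show that the classification accuracy of SVM-RKNN is on average 2.13% higher than that of SVM, and that of the SVM-RKNN-based multi-feature fusion algorithm is on average 2.54% and 0.41% higher than that of SVM and SVM-RKNN, respectively.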

19.
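A ship target recognition method for remote sensing images based on support vector machines   Cited by: 2 (self-citations: 0, citations by others: 2)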
李毅  徐守时 《计算机仿真》2006,23(6):180-183
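To address ship target recognition in high-resolution remote sensing images, a ship target classification method based on support vector machines is proposed. The support vector machine (SVM) is a new type of machine learning method that is based on the structural risk minimization induction principle and has excellent learning ability. Compared with traditional methods, SVM not only has a simple structure but also offers markedly better technical performance, especially generalization ability. This paper briefly introduces statistical learning theory and the SVM algorithm, applies SVM to ship target recognition in remote sensing images, and carries out comparative experiments against traditional ship recognition methods. The experimental results show that the proposed classifier clearly outperforms the other traditional classifiers in recognition performance and achieves a higher recognition rate.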

20.
Generalized sparse metric learning with relative comparisons   Cited by: 2 (self-citations: 2, citations by others: 0)
The objective of sparse metric learning is to learn a distance measure from a set of data in addition to finding a low-dimensional representation. Despite demonstrated success, the performance of existing sparse metric learning approaches is usually limited because the methods assume certain problem relaxations or target the SML objective indirectly. In this paper, we propose a Generalized Sparse Metric Learning (GSML) method. This novel framework offers a unified view for understanding many existing sparse metric learning algorithms, including the Sparse Metric Learning framework proposed in (Rosales and Fung ACM International conference on knowledge discovery and data mining (KDD), pp 367–373, 2006), the Large Margin Nearest Neighbor (Weinberger et al. in Advances in neural information processing systems (NIPS), 2006; Weinberger and Saul in Proceedings of the twenty-fifth international conference on machine learning (ICML-2008), 2008), and the D-ranking Vector Machine (D-ranking VM) (Ouyang and Gray in Proceedings of the twenty-fifth international conference on machine learning (ICML-2008), 2008). Moreover, GSML also establishes a close relationship with the Pairwise Support Vector Machine (Vert et al. in BMC Bioinform, 8, 2007). Furthermore, the proposed framework is capable of extending many current non-sparse metric learning models to their sparse versions, including Relevant Component Analysis (Bar-Hillel et al. in J Mach Learn Res, 6:937–965, 2005) and a state-of-the-art method proposed in (Xing et al. Advances in neural information processing systems (NIPS), 2002). We present the detailed framework, provide theoretical justifications, build various connections with other models, and propose an iterative optimization method, making the framework both theoretically important and practically scalable for medium or large datasets. Experimental results show that this generalized framework outperforms six state-of-the-art methods, with higher accuracy and significantly smaller dimensionality, on seven publicly available datasets.
