Found 20 similar documents (search time: 15 ms)
1.
Support Vector Machines (SVM) represent one of the most promising Machine Learning (ML) tools that can be applied to the problem of traffic classification in IP networks. In the case of SVMs, there are still open questions that need to be addressed before they can be generally applied to traffic classifiers. Having been designed essentially as techniques for binary classification, their generalization to multi-class problems is still under research. Furthermore, their performance is highly sensitive to the correct optimization of their working parameters. In this paper we describe an approach to traffic classification based on SVM. We apply one of the approaches to solving multi-class problems with SVMs to the task of statistical traffic classification, and describe a simple optimization algorithm that allows the classifier to perform correctly with as few as a few hundred training samples. The accuracy of the proposed classifier is then evaluated over three sets of traffic traces, coming from different topological points in the Internet. Although the results are relatively preliminary, they confirm that SVM-based classifiers can be very effective at discriminating traffic generated by different applications, even with reduced training set sizes.
2.
We propose twin SVM, a binary SVM classifier that determines two nonparallel planes by solving two related SVM-type problems, each of which is smaller than in a conventional SVM. The twin SVM formulation is in the spirit of proximal SVMs via generalized eigenvalues. On several benchmark data sets, twin SVM is not only fast, but also shows good generalization. Twin SVM is also useful for automatically discovering two-dimensional projections of the data.
3.
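When the numbers of samples in different classes are imbalanced, the training errors of the C-SVM classification algorithm fall disproportionately on the class with fewer samples. When duplicate samples appear in the sample set, they are treated as new samples and recomputed, which increases training time. To address these two problems, their causes are analyzed and a weighted support vector machine algorithm is proposed that compensates for the adverse effects of class differences and speeds up decisions on duplicate samples. To improve the generalization performance of the algorithm, a genetic algorithm is introduced during model training to automatically select two parameters: the penalty factor and the kernel width. Experimental results show that the algorithm effectively handles class imbalance and duplicate samples, and that the trained model generalizes well.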
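The two ideas in the abstract above (a larger penalty for the smaller class, and handling each distinct sample once) can be illustrated with a short sketch. This is not the paper's algorithm: the inverse-frequency weighting rule, the function names `class_penalties` and `merge_duplicates`, and the toy data are all assumptions chosen for illustration.

```python
from collections import Counter


def class_penalties(labels, base_c=1.0):
    """Assumed heuristic: penalty C_k = base_c * n / (k * n_k), so rarer classes cost more."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: base_c * n / (k * cnt) for cls, cnt in counts.items()}


def merge_duplicates(samples, labels):
    """Collapse repeated (sample, label) pairs into one entry carrying a multiplicity."""
    multiplicity = Counter((tuple(x), y) for x, y in zip(samples, labels))
    return [(list(x), y, m) for (x, y), m in multiplicity.items()]


# Toy data (hypothetical): three positive samples, two of them identical, one negative.
samples = [[0.0, 1.0], [0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]
labels = [1, 1, 1, -1]

penalties = class_penalties(labels)
merged = merge_duplicates(samples, labels)
# Per-sample weight = class penalty times how often the sample occurred.
sample_weights = [penalties[y] * m for _, y, m in merged]
```

A weighted SVM solver would then use `sample_weights` in place of a single global penalty, so the duplicated sample is processed once but still counts twice.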
4.
Engine ignition pattern analysis is one of the trouble-diagnosis methods for automotive gasoline engines. Based on the waveform of the ignition pattern, the mechanic uses experience and handbooks to guess which engine parts may be malfunctioning. However, this manual diagnostic method is imprecise because many ignition patterns are very similar. A diagnosis may therefore need many trials to identify the malfunctioning parts, and the mechanic needs to disassemble and reassemble the engine parts for verification. To tackle this problem, the Wavelet Packet Transform (WPT) is first employed to extract the features of the ignition pattern. From the extracted features, statistics over the frequency subbands of the pattern are produced, which are then used by Multi-class Least Squares Support Vector Machines (MCLS-SVM) for engine fault classification. With the newly proposed classification system, the number of diagnostic trials can be reduced. MCLS-SVM is also compared with a typical classification method, the Multi-layer Perceptron (MLP). Experimental results show that MCLS-SVM produces higher diagnostic accuracy than MLP.
5.
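The support vector machine (SVM) method makes no assumptions about the distribution of the samples. Based on the structural risk minimization principle, it gives an optimal solution to learning problems with small samples and retains good uniform convergence as the number of samples tends to infinity. The MSVM method, proposed on the basis of SVM, uses a memory function to replace the samples from a single feedback round with the samples accumulated over all previous feedback rounds, which increases the number of training samples, reduces oscillation in precision, and improves retrieval accuracy. To reduce the burden on users, memory-based labeling is also proposed. Experiments show that MSVM avoids the local minimization problem caused by an overly small training set, classifies the images in an image database fairly accurately, and effectively reduces the burden on users.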
6.
Pattern classification methods are a crucial direction in the current study of brain–computer interface (BCI) technology. A simple yet effective ensemble approach for electroencephalogram (EEG) signal classification, named the random electrode selection ensemble (RESE), is developed to overcome the instability of Fisher discriminant feature extraction in BCI applications. Multiple individual classifiers are constructed through the random selection of recording electrodes that reflect the physiological background of user-intended mental activities. In a feature subspace determined by a couple of randomly selected electrodes, principal component analysis (PCA) is first used to carry out dimensionality reduction. The Fisher discriminant is then adopted for feature extraction, and a Bayesian classifier with a Gaussian mixture model (GMM) approximating the feature distribution is trained. For a test sample, the outputs from all the Bayesian classifiers are combined to give the final prediction for its label. Theoretical analysis and classification experiments with real EEG signals indicate that the RESE approach is both effective and efficient.
7.
Sleep stage scoring is a challenging task. Most existing sleep stage classification approaches rely on analysing electroencephalography (EEG) signals in the time or frequency domain. A novel technique for EEG sleep stage classification is proposed in this paper. The statistical features and the similarities of complex networks are used to classify single-channel EEG signals into six sleep stages. Firstly, each EEG segment of 30 s is divided into 75 sub-segments, and then different statistical features are extracted from each sub-segment. Here, feature extraction serves to reduce the dimensionality of the EEG data and the processing time in the classification stage. Secondly, each vector of the extracted features, which represents one EEG segment, is transferred into a complex network. Thirdly, the similarity properties of the complex networks are extracted and classified into one of the six sleep stages using a k-means classifier. For further investigation, in the statistical feature extraction phase two sets of statistical features are tested and ranked based on the performance of the complex networks. To investigate the classification ability of complex networks combined with k-means, the extracted statistical features were also forwarded to a k-means classifier and a support vector machine (SVM) for comparison. We also compare the proposed method with other existing methods in the literature. The experimental results show that the proposed method attains better classification results and a reasonable execution time compared with the SVM, k-means and the other existing methods. The research results in this paper indicate that the proposed method can assist neurologists and sleep specialists in diagnosing and monitoring sleep disorders.
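The first step described above, splitting a 30 s segment into 75 sub-segments and computing statistics on each, can be sketched as follows. The abstract does not state the sampling rate or which statistics are used, so the 100 Hz rate, the four features (minimum, maximum, mean, standard deviation), the synthetic sine input, and the name `subsegment_features` are assumptions.

```python
import math
import statistics


def subsegment_features(epoch, n_sub=75):
    """Split one epoch into n_sub equal sub-segments and return their statistics as one vector."""
    if len(epoch) % n_sub != 0:
        raise ValueError("epoch length must be divisible by the number of sub-segments")
    size = len(epoch) // n_sub
    features = []
    for i in range(n_sub):
        chunk = epoch[i * size:(i + 1) * size]
        # Assumed feature set: min, max, mean, population standard deviation.
        features.extend([
            min(chunk),
            max(chunk),
            statistics.fmean(chunk),
            statistics.pstdev(chunk),
        ])
    return features


fs = 100  # assumed sampling rate in Hz
# Synthetic 30 s "EEG" epoch: a 10 Hz sine wave, used only to exercise the function.
epoch = [math.sin(2 * math.pi * 10 * t / fs) for t in range(30 * fs)]
features = subsegment_features(epoch)
```

Under these assumptions each sub-segment spans 0.4 s (40 samples), and the 3000-sample epoch is reduced to a 300-element feature vector.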
8.
This paper presents a new approach to classifying fault types and predicting the fault location in high-voltage power transmission lines, using Support Vector Machines (SVM) and the Wavelet Transform (WT) of the measured one-terminal voltage and current transient signals. A wavelet entropy criterion is applied to the wavelet detail coefficients to reduce the size of the feature vector before the classification and prediction stages. Experiments performed for different kinds of faults occurring on the transmission line demonstrate very good accuracy of the proposed fault location algorithm. The fault classification error is below 1% for all tested fault conditions. The average error of fault location in a 380 kV, 360 km transmission line is below 0.26% and the maximum error did not exceed 0.95 km.
9.
Pattern Analysis and Applications - In this paper, deep-stacked error minimized extreme learning machine autoencoder (DSEMELMAE) and sine–cosine monarch butterfly optimization-based minimum...
10.
Over the past two decades, wavelet theory has been used for the processing of biomedical signals for feature extraction, compression and de-noising applications. However, the question of which wavelet family is the most suitable for analysis of non-stationary bio-signals is still open among researchers. This paper attempts to find the most useful wavelet function among the existing members of the wavelet families for electroencephalogram (EEG) signal analysis. The EEGs considered for this study include both normal and abnormal signals, such as epileptic EEG. Important features such as energy, entropy and standard deviation at different sub-bands were computed using the wavelet functions Haar, Daubechies (orders 2-10), Coiflets (orders 1-10), and Biorthogonal (orders 1.1, 2.4, 3.5, and 4.4). Feature vectors were used to model and train the Probabilistic Neural Network (PNN), and the classification accuracies were evaluated for each case. The results obtained from the PNN classifier were compared with those of a Support Vector Machine (SVM) classifier. From the statistical analysis, it was found that Coiflets 1 is the most suitable candidate among the wavelet families considered in this study for accurate classification of the EEG signals. This work attempts to improve computing efficiency by selecting the most suitable wavelet function, so that EEG signals can be processed efficiently and accurately with less computation time.
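The sub-band features named above (energy, entropy, standard deviation) can be illustrated with the simplest of the listed wavelets, Haar. The sketch below is written from the textbook definition of the orthonormal Haar transform, not from the paper's code. The entropy definition (Shannon entropy of the normalized squared coefficients), the three decomposition levels, the test signal, and the helper names are assumptions.

```python
import math
import statistics


def haar_step(signal):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_subbands(signal, levels):
    """Return detail bands D1..D<levels> and the final approximation band."""
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    bands = {}
    approx = list(signal)
    for level in range(1, levels + 1):
        approx, detail = haar_step(approx)
        bands[f"D{level}"] = detail
    bands[f"A{levels}"] = approx
    return bands


def band_features(coeffs):
    """Energy, Shannon entropy of normalized squared coefficients (assumed), and std."""
    energy = sum(c * c for c in coeffs)
    entropy = 0.0
    if energy > 0.0:
        probs = [c * c / energy for c in coeffs]
        entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    return {"energy": energy, "entropy": entropy, "std": statistics.pstdev(coeffs)}


# Synthetic two-tone signal (hypothetical), 64 samples.
signal = [math.sin(2 * math.pi * t / 16) + 0.5 * math.sin(2 * math.pi * t / 4)
          for t in range(64)]
bands = haar_subbands(signal, levels=3)
features = {name: band_features(c) for name, c in bands.items()}
```

Because this Haar transform is orthonormal, the band energies sum to the energy of the input signal, which gives a quick sanity check on any implementation.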
11.
With advances in computer technology, virtual screening of small molecules has come into use in drug discovery. Since there are thousands of compounds in the early phase of drug discovery, a fast classification method that can distinguish between active and inactive molecules can be used for screening large compound collections. In this study, we used Support Vector Machines (SVM) for this type of classification task. SVM is a powerful classification tool that is becoming increasingly popular in various machine-learning applications. The data sets consist of 631 compounds for the training set and 216 compounds for a separate test set. In the data pre-processing step, Pearson's correlation coefficient was used as a filter to eliminate redundant features. After application of the correlation filter, a single SVM was applied to this reduced data set. Moreover, we investigated the performance of SVM with different feature selection strategies, including SVM–Recursive Feature Elimination, Wrapper Method and Subset Selection. All feature selection methods generally perform better than a single SVM, while Subset Selection outperforms the other feature selection methods. We have tested SVM as a classification tool in a real-life drug discovery problem, and our results reveal that it could be a useful method for classification tasks in the early phase of drug discovery.
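A correlation filter of the kind mentioned above can be sketched in a few lines. The abstract does not give the threshold or the order in which features are examined, so the greedy left-to-right rule, the 0.95 cutoff, the treatment of constant features, and the names `pearson` and `correlation_filter` are assumptions.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0.0 or vy == 0.0:
        return 0.0  # assumed convention: a constant feature correlates with nothing
    return cov / math.sqrt(vx * vy)


def correlation_filter(columns, threshold=0.95):
    """Keep a feature only if |r| with every already-kept feature is below the threshold."""
    kept = []
    for j, col in enumerate(columns):
        if all(abs(pearson(col, columns[k])) < threshold for k in kept):
            kept.append(j)
    return kept


# Hypothetical descriptor columns: feature 1 is almost exactly twice feature 0.
columns = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.1, 4.0, 6.2, 7.9, 10.0],
    [5.0, 1.0, 4.0, 2.0, 3.0],
]
kept = correlation_filter(columns)
```

Here feature 1 is dropped as redundant with feature 0, while feature 2 (correlation of -0.3 with feature 0) is kept.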
12.
This paper presents a new approach called clustering technique-based least square support vector machine (CT-LS-SVM) for the classification of EEG signals. Decision making is performed in two stages. In the first stage, a clustering technique (CT) is used to extract representative features of EEG data. In the second stage, a least square support vector machine (LS-SVM) is applied to the extracted features to classify two-class EEG signals. To demonstrate the effectiveness of the proposed method, several experiments have been conducted on three publicly available benchmark databases: one for epileptic EEG data, one for mental imagery tasks EEG data and one for motor imagery EEG data. Our proposed approach achieves an average sensitivity, specificity and classification accuracy of 94.92%, 93.44% and 94.18%, respectively, for the epileptic EEG data; 83.98%, 84.37% and 84.17%, respectively, for the motor imagery EEG data; and 64.61%, 58.77% and 61.69%, respectively, for the mental imagery tasks EEG data. The performance of the CT-LS-SVM algorithm is compared in terms of classification accuracy and execution (running) time with our previous study, where simple random sampling with a least square support vector machine (SRS-LS-SVM) was employed for EEG signal classification. We also compare the proposed method with other existing methods in the literature for the three databases. The experimental results show that the proposed algorithm can produce a better classification rate than previously reported methods and takes much less execution time than the SRS-LS-SVM technique. The research findings in this paper indicate that the proposed approach is very efficient for classification of two-class EEG signals.
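For readers unfamiliar with the three figures reported above, the following sketch computes sensitivity, specificity and accuracy from the counts of a two-class confusion matrix. The counts are invented for illustration and the function name `binary_metrics` is an assumption.

```python
def binary_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity and accuracy from two-class confusion counts."""
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical counts with 100 positive and 100 negative test samples.
sens, spec, acc = binary_metrics(tp=90, fn=10, tn=80, fp=20)
```

When both classes contain the same number of test samples, accuracy equals the mean of sensitivity and specificity. The reported epileptic figures fit that pattern, since (94.92 + 93.44) / 2 = 94.18, although the abstract does not state the class sizes.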
13.
This paper is concerned with a two-stage procedure for analysis and classification of electroencephalogram (EEG) signals for twenty schizophrenic patients and twenty age-matched control participants. For each case, 20 channels of EEG are recorded. First, the more informative channels are selected using mutual information techniques. Then, genetic programming is employed to select the best features from the selected channels. Several features, including autoregressive model parameters, band power and fractal dimension, are used for the purpose of classification. Both linear discriminant analysis (LDA) and adaptive boosting (Adaboost) are trained using tenfold cross validation to classify the reduced feature set, and classification accuracies of 85.90% and 91.94% are obtained by LDA and Adaboost, respectively. Another interesting observation from the channel selection procedure is that most of the selected channels are located in the prefrontal and temporal lobes, confirming neuropsychological and neuroanatomical findings. The results obtained by the proposed approach are compared with a one-stage procedure, principal component analysis (PCA)-based feature selection, utilizing only 100 features selected from all channels. It is shown that the two-stage procedure, consisting of channel selection followed by feature reduction, gives improved results with efficient computation time.
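Tenfold cross validation, as used above, partitions the samples into ten folds and tests on each fold in turn while training on the other nine. The sketch below generates such index splits for 40 participants. The shuffling seed, the non-stratified assignment, and the name `k_fold_indices` are assumptions; the abstract does not say how the folds were formed.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds, round-robin.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# 20 patients + 20 controls = 40 samples, giving ten folds of four samples each.
splits = list(k_fold_indices(40, k=10))
```

Every sample appears in exactly one test fold, so the reported accuracy is an average over predictions made on data the classifier did not see during training.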
14.
In this paper we explore the use of text-mining methods for the identification of the author of a text. We apply the support vector machine (SVM) to this problem because it can cope with half a million inputs, requires no feature selection, and can process the frequency vector of all words of a text. We performed a number of experiments with texts from a German newspaper. With nearly perfect reliability the SVM was able to reject other authors, and it detected the target author in 60–80% of the cases. In a second experiment, we ignored nouns, verbs and adjectives and replaced them by grammatical tags and bigrams. This resulted in slightly reduced performance. Author detection with SVMs on full word forms was remarkably robust even if the author wrote about different topics.
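The "frequency vector of all words of a text" mentioned above can be sketched as follows. The tokenization rule, the lowercasing (which discards German noun capitalization), the relative-frequency normalization, and the two toy sentences are assumptions made for illustration.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Relative frequency of every word form in a text (lowercased: a simplification)."""
    words = re.findall(r"\w+", text.lower())
    counts = Counter(words)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()} if total else {}


def to_vector(freqs, vocabulary):
    """Project a frequency dictionary onto a fixed, ordered vocabulary."""
    return [freqs.get(w, 0.0) for w in vocabulary]


# Two toy documents (hypothetical).
docs = ["Der Hund sieht den Hund", "Die Katze sieht den Hund"]
freqs = [word_frequencies(d) for d in docs]
vocabulary = sorted(set().union(*freqs))
vectors = [to_vector(f, vocabulary) for f in freqs]
```

With a real corpus the vocabulary grows to hundreds of thousands of word forms, which is the high-dimensional input the abstract says the SVM handles without feature selection.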
15.
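The standard proximal support vector machine (PSVM) does not account for imbalanced sample distributions. To address this, a modified PSVM algorithm (MPSVM) is proposed. Different penalty factors are assigned to the positive and negative sample sets according to the imbalance in the number of training samples, and the penalty factor in the original optimization problem is changed from a scalar to a diagonal matrix. The decision functions of the linear and nonlinear MPSVM are then derived, and the operating mechanism and performance of MPSVM are compared with those of PSVM and the imbalanced SVM. Experimental results show that MPSVM outperforms PSVM and is more efficient than the imbalanced SVM method.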
16.
The Maximal Discrepancy (MD) is a powerful statistical method, which has been proposed for model selection and error estimation in classification problems. This approach is particularly attractive when dealing with small sample problems, since it avoids the use of a separate validation set. Unfortunately, the MD method requires a bounded loss function, which is usually avoided by most learning algorithms, including the Support Vector Machine (SVM), because it gives rise to a non-convex optimization problem. We derive in this work a new approach for rigorously applying the MD technique to the error estimation of the SVM and, at the same time, preserving the original SVM framework.
17.
When dealing with pattern recognition problems one encounters different types of prior knowledge. It is important to incorporate such knowledge into the classification method at hand. A very common type of prior knowledge is that many data sets lie on some kind of manifold. Distance-based classification methods can make use of this through a modified distance measure called geodesic distance. We introduce a new kind of kernel for support vector machines that incorporates geodesic distance and is therefore applicable in cases where such transformation invariance is known. Experimental results show that the performance of our method is comparable to that of other state-of-the-art methods.
18.
A brain–computer interface (BCI) provides a link between the human brain and a computer. The task of discriminating four classes (left and right hands and feet) of motor imagery movements of a simple limb-based BCI is still challenging because most imaginary movements in the motor cortex have close spatial representations. We aimed to classify binary limb movements, rather than the direction of movement within one limb. We also investigated joint time-frequency methods to improve classification accuracies. Neither of these, to our knowledge, has been investigated previously in BCI. We recorded EEG data from eleven participants, and demonstrated the classification of four classes of simple-limb motor imagery with an accuracy of 91.46% using intrinsic time-scale decomposition and 88.99% using empirical mode decomposition. In binary classifications, we achieved average accuracies of 89.90% when classifying imaginary movements of left hand versus right hand, 93.1% for left hand versus right foot, 94.00% for left hand versus left foot, 83.82% for left foot versus right foot, 97.62% for right hand versus left foot, and 95.11% for right hand versus right foot. The results show that the binary classification performance is slightly better than that of four-class classification. Our results also show that there is no significant difference in terms of spatial distribution between left and right foot motor imagery movements. There is also no difference in classification performances involving left or right foot movement. This work demonstrates that binary and four-class movements of the left and right feet and hands can be classified using recorded EEG signals of the motor cortex, and that an intrinsic time-scale decomposition (ITD) feature extraction method can be used for real-time brain–computer interfaces.
19.
Classification of Electroencephalogram (EEG) data for imagined motor movements has been a challenge in the design and development of Brain Computer Interfaces (BCIs). There are two principal challenges. The first is the variability in the recorded EEG data, which manifests across trials as well as across individuals. Consequently, features that are more discriminative need to be identified before any pattern recognition technique can be applied. The second challenge is in the pattern recognition domain. The number of data samples in a class of interest, e.g. a specific action, is a small fraction of the total data, which is composed of samples corresponding to all actions of all users. Building a robust classifier when learning from a highly unbalanced dataset is very difficult; minimizing the classification error typically causes the larger class to overwhelm the smaller one. We show that the combination of ‘classifiability’ for selecting the optimal frequency band and the use of the Twin Support Vector Machine (Twin SVM) for classification yields significantly improved generalization. On benchmark BCI Competition datasets, the proposed approach often yields up to 20% improvement over the state-of-the-art.
20.
In this article, we propose some methods for deriving symbolic interpretation of data in the form of rule-based learning systems by using Support Vector Machines (SVM). First, Radial Basis Function Neural Network (RBFNN) learning techniques are explored, as is usual in the literature, since the local nature of this paradigm makes it a suitable platform for performing rule extraction. By using support vectors from a learned SVM, it is possible in our approach to use any standard Radial Basis Function (RBF) learning technique for the rule extraction, whilst avoiding the problem of overlap between classes. We show that by merging node centers and support vectors, explanation rules can be obtained in the form of ellipsoids and hyper-rectangles. Next, in a dual form, following the framework developed for RBFNN, we construct an algorithm for SVM. Taking SVM as the main paradigm, geometry in the input space is defined from a combination of support vectors and prototype vectors obtained from any clustering algorithm. Finally, randomness associated with clustering algorithms or RBF learning is avoided by using only a learned SVM to define the geometry of the studied region. The results obtained from a number of experiments on benchmarks in different domains are also given, leading to a conclusion on the viability of our proposal.