Similar literature
 Found 20 similar articles (search time: 46 ms)
1.
In wireless sensor networks, target classification differs from that in centralized sensing systems because of distributed detection, wireless communication and limited resources. We study the classification of moving vehicles in wireless sensor networks using the acoustic signals the vehicles emit. Three techniques are combined in this paper: wavelet decomposition, weighted k-nearest-neighbor and Dempster-Shafer theory. Finally, we use real-world experimental data to validate the classification methods. The results show that the wavelet-based feature extraction method can extract stable features from acoustic signals, and that fusion with Dempster's rule improves classification performance.
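
The fusion step named in this abstract, Dempster's rule, can be illustrated with a minimal sketch. The abstract does not say how the sensor nodes' outputs are turned into mass functions, so the two-node example, the hypothesis names and the function name `dempster_combine` are assumptions made only for illustration.

```python
from itertools import product

def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule of combination.

    Each mass function maps a frozenset of hypotheses to its mass, and
    the masses within one function are assumed to sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            # Agreeing evidence supports the shared hypotheses.
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Disjoint hypotheses contribute to the conflict mass.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    # Renormalize by the non-conflicting mass.
    return {h: m / (1.0 - conflict) for h, m in combined.items()}

# Hypothetical evidence from two sensor nodes about one vehicle.
CAR, TRUCK = frozenset({"car"}), frozenset({"truck"})
EITHER = CAR | TRUCK
node_a = {CAR: 0.6, TRUCK: 0.1, EITHER: 0.3}
node_b = {CAR: 0.5, TRUCK: 0.2, EITHER: 0.3}
fused = dempster_combine(node_a, node_b)
```

In this toy case both nodes lean toward "car", and the fused mass on "car" ends up higher than either node's own, which is the kind of gain the abstract attributes to fusion.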

2.
This study examines the applicability of support vector machine (SVM) ensembles to traffic incident detection. The SVM has been proposed for traffic incident detection because it produces a nonlinear classifier with maximum generality and has exhibited performance as good as that of neural networks. However, the classification result of an SVM implemented in practice depends on the choice of kernel function and parameters. To avoid the burden of choosing kernel functions and tuning parameters, and to improve on the limited classification and detection performance of a real SVM, we propose using SVM ensembles to detect incidents. We also propose a new aggregation method that combines SVM classifiers based on certainty. Moreover, we propose a hybrid performance index (PI) for evaluating SVM ensembles in incident detection, which combines the common criteria of detection rate (DR), false alarm rate (FAR), mean time to detection (MTTD), and classification rate (CR). Several SVM ensembles were developed based on bagging, boosting and cross-validation committees with different combining approaches, and were tested on one real data set collected at the I-880 Freeway in California. The experimental results show that the SVM ensembles outperform a single SVM based AID in terms of DR, FAR, MTTD, CR and PI. We used a non-parametric test, the Wilcoxon signed ranks test, to compare six combining schemes. Our proposed combining method performs as well as majority vote and weighted vote. Finally, we also investigated the influence of ensemble size on detection performance.

3.
In this paper, we investigate the performance of statistical, mathematical programming and heuristic linear models for cost-sensitive classification. In particular, we use five cost-sensitive techniques: Fisher's discriminant analysis (DA), asymmetric misclassification cost mixed integer programming (AMC-MIP), cost-sensitive support vector machine (CS-SVM), a hybrid support vector machine and mixed integer programming (SVMIP) and a heuristic cost-sensitive genetic algorithm (CGA). Using simulated datasets of varying group overlaps, data distributions and class biases, and real-world datasets from the financial and medical domains, we compare the five techniques on overall holdout sample misclassification cost. The results on simulated datasets indicate that when group overlap is low and the data distribution is exponential, DA appears to provide superior performance. For all other situations with simulated datasets, CS-SVM provides superior performance. On real-world datasets from the financial domain, CGA and AMC-MIP hold a slight edge over the two SVM-based classifiers. However, for medical domains with mixed continuous and discrete attributes, the SVM classifiers perform better than the heuristic (CGA) and AMC-MIP classifiers. The SVMIP model is the most computationally inefficient and the poorest performing model.

4.
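Identifying peer-to-peer (P2P) traffic is currently a hot topic in network management research. Methods based on the support vector machine (SVM) are among the commonly used P2P traffic identification methods. However, SVM performance depends mainly on its parameters and on the features it uses, and traditional methods treat SVM parameter optimization and feature selection as separate problems, which makes it difficult to obtain an SVM classifier with the best overall performance. This paper proposes a P2P traffic identification method that combines an optimal artificial bee colony algorithm with the SVM. The artificial bee colony algorithm treats SVM parameter tuning and feature selection as a single optimization problem solved simultaneously, yielding the parameters and feature subset with the best overall performance. Experiments on real P2P data show that the proposed method has good adaptability and classification accuracy, obtains the optimal feature subset and SVM parameters at the same time, and improves the overall performance of the SVM classifier.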

5.
Collaborative applications are characterized by high levels of data sharing. Optimistic replication has been suggested as a mechanism to enable highly concurrent access to the shared data while providing full application-defined consistency guarantees. A growing number of emerging cooperative applications are suited to Peer-to-Peer (P2P) networks. However, deploying such applications in P2P networks requires a mechanism that handles their high data sharing in a dynamic, scalable and available way. Previous work on optimistic replication has mainly concentrated on centralized systems. Centralized approaches are inappropriate for a P2P setting because of their limited availability and their vulnerability to failures and network partitions. In this paper, we focus on a reconciliation algorithm designed to be deployed in large-scale cooperative applications, such as P2P Wiki. The main contribution of this paper is a distributed reconciliation algorithm designed for P2P networks (P2P-reconciler). Other important contributions are: a basic cost model for computing communication costs in a DHT overlay network; a strategy for computing the cost of each reconciliation step taking the cost model into account; and an algorithm that dynamically selects the best nodes for each reconciliation step. Furthermore, since P2P networks are built independently of the underlying topology, which may cause high latencies and large overheads that degrade performance, we also propose a topology-aware variant of our P2P-reconciler algorithm and show the significant gains from using it. Our P2P-reconciler solution enables high levels of concurrency thanks to semantic reconciliation and yields high availability and excellent scalability, with acceptable performance and limited overhead.

6.
This work presents the parallel fast condensed nearest neighbor (PFCNN) rule, a distributed method for computing a consistent subset of a very large data set for the nearest neighbor classification rule. To cope with the communication overhead typical of distributed environments and to reduce memory requirements, different variants of the basic PFCNN method are introduced. Spatial cost, CPU cost, and communication overhead are analyzed for all the algorithms. Experimental results on both synthetic and real very large data sets show that these methods can be profitably applied to enormous collections of data. Indeed, they scale up well and are efficient in memory consumption, confirming the theoretical analysis, and achieve noticeable data reduction and good classification accuracy. To the best of our knowledge, this is the first distributed algorithm for computing a training set consistent subset for the nearest neighbor rule.
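
The notion of a "consistent subset" can be made concrete with a small sketch. This is not PFCNN: it is the classic sequential condensed nearest neighbor rule (Hart's CNN), swapped in because the abstract gives no algorithmic detail. The function names and the toy data are assumptions, and the data is assumed to contain no identical points with different labels.

```python
def nearest_label(point, prototypes):
    """Return the label of the prototype closest to point (squared Euclidean)."""
    def sq_dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))
    return min(prototypes, key=lambda proto: sq_dist(point, proto[0]))[1]

def condensed_subset(training):
    """Hart's condensed nearest neighbor rule (sequential, not PFCNN).

    training is a list of (point, label) pairs. The returned subset is
    consistent: 1-NN over the subset labels every training sample
    correctly, provided no point appears with two different labels.
    """
    subset = [training[0]]
    changed = True
    while changed:
        changed = False
        for sample in training:
            point, label = sample
            # Absorb every sample the current subset still misclassifies.
            if sample not in subset and nearest_label(point, subset) != label:
                subset.append(sample)
                changed = True
    return subset

# Toy one-dimensional data with two well-separated classes.
training = [((0.0,), "a"), ((0.1,), "a"), ((0.2,), "a"),
            ((1.0,), "b"), ((1.1,), "b"), ((1.2,), "b")]
subset = condensed_subset(training)
```

On this toy set two prototypes are enough to classify all six samples, which is the data reduction effect the abstract reports at much larger scale.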

7.
Sensor networks are multihop wireless networks of resource-constrained sensor nodes used to realize high-level collaborative sensing tasks. To query or access data generated by the sensor nodes, the sensor network can be viewed as a distributed database. In this paper, we develop algorithms for communication-efficient implementation of the join of multiple (two or more) data streams in a sensor network. Distributed implementation of join in sensor networks is particularly challenging because of their unique characteristics, such as limited memory and battery energy on individual nodes, arbitrary and dynamic network topology, multihop communication, and unreliable infrastructure. One of our proposed approaches, the perpendicular approach (PA), is load balanced and in fact incurs near-optimal communication cost for the special case of binary joins in grid networks, under the assumption of uniform generation of tuples across the network. We compare the performance of our approaches through extensive simulations on the ns2 simulator, and show that PA substantially prolongs the network lifetime compared to other approaches, especially for joins involving spatial constraints.

8.
Although classification in centralized environments has been widely studied in recent years, classification in P2P networks remains an important research problem because of the popularity of P2P computing environments. The main goal of classification in P2P networks is to decrease prediction error efficiently with small network overhead. In this paper, we propose an OS-ELM based ensemble classification framework for distributed classification in a hierarchical P2P network. In the framework, we apply the incremental learning principle of OS-ELM to the hierarchical P2P network to generate an ensemble classifier. The ensemble classifier can be implemented in the P2P network in two ways: one-by-one ensemble classification and parallel ensemble classification. Furthermore, we propose a peer selection approach based on data space coverage to reduce the high communication cost and large delay. We also design a two-layer index structure to support peer selection efficiently. A peer creates a local Quad-tree to index its local data, and a super-peer creates a global Quad-tree to summarize its local indexes. Extensive experimental studies verify the efficiency and effectiveness of the proposed algorithms.

9.
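Telecom customer churn prediction based on cost-sensitive SVM*   Cited by: 3 (self-citations: 0, by others: 3)
To address the class imbalance of customer churn data sets and the differing costs of misclassification, cost-sensitive learning is applied to the support vector machine with different penalty coefficients proposed by Veropoulos to build a customer churn prediction model, which is validated on real telecom churn data. A comparison with the traditional SVM, C4.5 and ANN shows that the method improves accuracy, hit rate, coverage rate and lift, indicating that it effectively handles both the class imbalance of the data set and the misclassification cost problem and is an effective method for customer churn prediction.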

10.
11.
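Application of SVM to multi-source remote sensing image classification   Cited by: 7 (self-citations: 1, by others: 7)
In land use/cover classification from remote sensing images, classification accuracy can be improved in two ways. The first is to add data sources that aid classification, introducing auxiliary geographic data and the normalized difference vegetation index (NDVI) for multi-source information fusion. The second is to choose a better classification method, such as support vector machine (SVM) learning, which overcomes the weaknesses of the maximum likelihood method and neural networks and is well suited to classifying high-dimensional, complex, small-sample multi-source data. To improve the accuracy of multi-source remote sensing image classification, model selection for SVMs in remote sensing image classification is also studied, including the choice of multi-class model and kernel function. The classification results show that the SVM achieves higher accuracy than traditional classification methods, and that an SVM model with a radial basis function kernel and the one-against-one multi-class method is particularly well suited to multi-source remote sensing image classification. SVM-based multi-source land use/cover classification can therefore greatly improve classification accuracy.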

12.
Classifying large-scale, high-dimensional datasets poses three challenges: (1) the training and classification phases impose a huge computational burden; (2) storing many training samples requires a large amount of memory; and (3) decision rules are difficult to determine in high-dimensional data. The nonlinear support vector machine (SVM) is a popular classifier that performs well on high-dimensional datasets. However, it easily leads to overfitting, especially when the data are not evenly distributed. The profile support vector machine (PSVM) was recently proposed to solve this problem. Because local learning is superior to global learning, multiple linear SVM models are trained to obtain performance similar to that of a nonlinear SVM model. However, PSVM is inefficient in the training phase. In this paper, we propose a fast classification strategy for PSVM that reduces both training time and classification time. We first choose border samples near the decision boundary from the training samples. The reduced training samples are then clustered into several local subsets by the MagKmeans algorithm, for which we propose a fast search method to find the optimal solution. Each cluster is used to learn multiple linear SVM models. Both artificial and real datasets are used to evaluate the proposed method. In the experimental results, the proposed method avoids overfitting and underfitting, and the proposed strategy is both effective and efficient.

13.
Electronic mail is a major revolution over traditional communication systems owing to its convenient, economical, fast, and easy-to-use nature. A major bottleneck in electronic communications is the enormous dissemination of unwanted, harmful emails known as spam. A major concern is the development of suitable filters that can adequately capture those emails and achieve a high performance rate. Machine learning (ML) researchers have developed many approaches to tackle this problem. Within machine learning, support vector machines (SVM) have made a large contribution to the development of spam email filtering. Different SVM-based schemes have been proposed through text classification (TC) approaches. A crucial problem when using SVM is the choice of kernel, as it directly affects the separation of emails in the feature space. This paper presents a thorough investigation of several distance-based kernels and specifies their spam filtering behavior with SVM. Most kernels used in recent studies concern continuous data and neglect the structure of the text. In contrast to classical kernels, we propose the use of various string kernels for spam filtering, and we show how effectively string kernels suit the spam filtering problem. Data preprocessing is also a vital part of text classification, where the objective is to generate feature vectors usable by SVM kernels. We detail feature mapping variants in TC that yield improved performance for the standard SVM in the filtering task. Furthermore, to cope with real-time scenarios, we propose an online active framework for spam filtering. We present empirical results from an extensive study of online, transductive, and online active methods for classifying spam emails in real time. We show that the active online method using string kernels achieves higher precision and recall rates.

14.
In this paper, a novel Support Vector Machine (SVM) variant that makes use of robust statistics is proposed. We investigate the use of statistically robust location and dispersion estimators to enhance the performance of SVMs, and test it on two-class and multi-class classification problems. Moreover, we propose a novel method for class-specific multi-class SVM, which uses the covariance matrix of only one class, i.e., the class that we are interested in separating from the others, while ignoring the dispersion of the other classes. We performed experiments on artificial data as well as on many publicly available real-world classification databases. The proposed approach performs better than other SVM variants, especially when the training data contain outliers. Finally, we applied the proposed method to facial expression recognition on three well-known facial expression databases, showing that it outperforms previously published attempts.

15.
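The support vector machine (SVM) is one of the most popular binary classification algorithms. Most real-world data sets require multi-class classification, and the directed acyclic graph (DAG) method is one of the most widely used ways of extending the SVM to multiple classes. It invokes few classifiers and runs quickly, but downward accumulation of errors and classification bias can reduce the accuracy of DAG classification results. When using DAG-SVM, there are k! alternative structures for k classes, and choosing a DAG structure suited to the characteristics of the data set can effectively improve accuracy. A method based on estimated accuracy is proposed: the DAG structure with the highest estimated accuracy is selected from the candidates by exhaustive search and used to classify the test set. Experimental results show that, compared with other methods, classification accuracy on the test data sets improves significantly with the DAG structure selected by this method, and the method works well for multi-class classification of data sets with a moderate number of classes.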
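
A rough sketch of the two ideas in this abstract follows: evaluating a decision DAG over pairwise classifiers, and choosing among the k! class orderings by exhaustive search. The abstract does not specify its accuracy estimator, so accuracy on a held-out validation set is assumed here. The `pairwise` callable stands in for trained binary SVMs, and all names and data are hypothetical.

```python
from itertools import permutations

def dag_predict(sample, class_order, pairwise):
    """Classify sample with a decision DAG over pairwise classifiers.

    pairwise(a, b, sample) must return either a or b. Each step tests
    the first class against the last and eliminates the loser, so k
    classes need only k - 1 classifier calls.
    """
    remaining = list(class_order)
    while len(remaining) > 1:
        first, last = remaining[0], remaining[-1]
        if pairwise(first, last, sample) == first:
            remaining.pop()       # "last" is eliminated
        else:
            remaining.pop(0)      # "first" is eliminated
    return remaining[0]

def best_order(classes, validation, pairwise):
    """Exhaustively pick the ordering with the highest validation accuracy."""
    def accuracy(order):
        hits = sum(dag_predict(x, order, pairwise) == y for x, y in validation)
        return hits / len(validation)
    return max(permutations(classes), key=accuracy)

# Toy stand-in for trained binary SVMs: pick the nearer class center.
CENTERS = {"a": 0.0, "b": 1.0, "c": 2.0}

def toy_pairwise(a, b, sample):
    return a if abs(sample - CENTERS[a]) <= abs(sample - CENTERS[b]) else b

validation = [(0.1, "a"), (0.9, "b"), (1.9, "c")]
order = best_order(list(CENTERS), validation, toy_pairwise)
prediction = dag_predict(1.9, order, toy_pairwise)
```

Because the search visits all k! orderings, it is only practical for a small number of classes, which matches the abstract's remark about data sets with few classes.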

16.
Support vector machines (SVM) are an emerging data classification technique with many diverse applications. Feature subset selection, along with the parameter setting in the SVM training procedure, significantly influences classification accuracy. In this paper, the asymptotic behaviors of support vector machines are fused with a genetic algorithm (GA) to generate feature chromosomes, which direct the search of the genetic algorithm to the straight line of optimal generalization error in the hyperparameter space. On this basis, a new approach, termed GA with feature chromosomes, is proposed to optimize the feature subset and the SVM parameters simultaneously. To evaluate the proposed approach, the experiments use several real-world datasets from the UCI database and from the Benchmark database. Compared with the GA without feature chromosomes, grid search, and other approaches, the proposed approach not only achieves higher classification accuracy and smaller feature subsets, but also requires less processing time.

17.
Approximate Distributed K-Means Clustering over a Peer-to-Peer Network   Cited by: 4 (self-citations: 0, by others: 4)
Data-intensive Peer-to-Peer (P2P) networks are finding an increasing number of applications. Data mining in such P2P environments is a natural extension. However, common monolithic data mining architectures do not fit well in such environments, since they typically require centralizing the distributed data, which is usually not practical in a large P2P network. Distributed data mining algorithms that avoid large-scale synchronization or data centralization offer an alternative. This paper considers the distributed K-means clustering problem where the data and computing resources are distributed over a large P2P network. It offers two algorithms that approximate the result of the standard centralized K-means clustering algorithm. The first is designed to operate in a dynamic P2P network and produces clusterings by "local" synchronization only. The second uses uniformly sampled peers and provides analytical guarantees on the accuracy of clustering on a P2P network. Empirical results show that both algorithms perform well compared to their centralized counterparts, at a modest communication cost.
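
The reason K-means does not need raw data in one place can be shown with a small sketch. It is neither of the paper's two algorithms: here every peer shares per-centroid sums and counts with every other peer, which reproduces centralized K-means exactly. The paper instead approximates this with local synchronization or sampled peers. One-dimensional points and all names are assumptions for illustration.

```python
def local_stats(points, centroids):
    """Per-centroid [sum, count] of the local points assigned to it."""
    stats = [[0.0, 0] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: (p - centroids[i]) ** 2)
        stats[nearest][0] += p
        stats[nearest][1] += 1
    return stats

def merge_and_update(all_stats, centroids):
    """Merge every peer's statistics into new centroids."""
    updated = []
    for i, old in enumerate(centroids):
        total = sum(stats[i][0] for stats in all_stats)
        count = sum(stats[i][1] for stats in all_stats)
        # A centroid that attracted no points keeps its old position.
        updated.append(total / count if count else old)
    return updated

def kmeans_by_statistics(peers, centroids, rounds=10):
    """Run K-means while exchanging only sums and counts, never raw points."""
    for _ in range(rounds):
        all_stats = [local_stats(points, centroids) for points in peers]
        centroids = merge_and_update(all_stats, centroids)
    return centroids

# Two hypothetical peers, each holding part of the data.
peers = [[0.0, 0.2, 9.8], [0.1, 10.0, 10.2]]
result = kmeans_by_statistics(peers, [0.0, 5.0])
```

Each round costs one [sum, count] pair per centroid per peer, independent of how many points a peer holds. A network-wide exchange every round is still the large-scale synchronization the abstract sets out to avoid.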

18.
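To address the high communication overhead and low classification accuracy of traditional distributed data stream mining algorithms, a distributed data stream mining algorithm based on support vector data description is proposed. Local sites rapidly update the data stream information, learn meta-level data with a support vector machine algorithm, and transmit it to the central site. The central site receives and merges the meta-level data to form the global classification result. Experimental results show that the algorithm reduces the network traffic between local sites and the central site while obtaining a highly accurate global classification result.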

19.
Type-2 fuzzy logic-based classifier fusion for support vector machines   Cited by: 1 (self-citations: 0, by others: 1)
As a machine-learning tool, support vector machines (SVMs) have been gaining popularity because of their promising performance. However, the generalization ability of SVMs often depends on whether the selected kernel functions suit the real classification data. To lessen the sensitivity of SVM classification to the choice of kernel and to improve SVM generalization ability, this paper proposes a fuzzy fusion model that combines multiple SVM classifiers. To better handle the uncertainties in real classification data and in the membership functions (MFs) of the traditional type-1 fuzzy logic system (FLS), we apply interval type-2 fuzzy sets to construct a type-2 SVM fusion FLS. This type-2 fusion architecture takes into account the classification results of the individual SVM classifiers and outputs the combined classification decision. Besides the distances of data examples to the SVM hyperplanes, the type-2 fuzzy SVM fusion system also considers the accuracy of the individual SVMs. Our experiments show that the type-2 based SVM fusion classifiers outperform individual SVM classifiers in most cases, and that the type-2 fuzzy logic-based SVM fusion model is generally better than the type-1 based SVM fusion model.

20.
Peers in Mobile P2P (MP2P) networks exploit both structured and unstructured styles to communicate in a peer-to-peer fashion. Such networks involve two types of peers: benign peers and malicious peers. Determining the identity of peers is complicated by user mobility and the unrestricted switching (ON/OFF) of mobile devices. MP2P networks require a scalable, distributed and lightweight secure communication scheme, but existing communication approaches cannot satisfy these requirements. In this paper, we propose an Adaptive Trusted Request and Authorization model (ATRA) over MP2P networks, which exploits the limited historical interaction information among peers and a Bayesian game to ensure secure communication. The simulation results reveal that, regardless of a peer's ability to obtain another peer's trust and risk data, request peers always spontaneously connect to trusted resource peers, and resource peers always preferentially authorize trusted request peers. A performance comparison of ATRA with state-of-the-art secure communication schemes over MP2P networks shows that ATRA can: (a) improve the success rate of node type identification, (b) reduce the time required to find secure connections, (c) provide efficient resource sharing, and (d) maintain a lower average cost.
