Similar Documents
20 similar documents found (search time: 15 ms)
1.
Past work on object detection has emphasized feature extraction and classification; relatively little attention has been given to the critical issue of feature selection. The main trend in feature extraction has been to represent the data in a lower-dimensional space, for example using principal component analysis (PCA). Without an effective scheme for selecting an appropriate set of features in this space, however, these methods rely mostly on powerful classification algorithms to deal with redundant and irrelevant features. In this paper, we argue that feature selection is an important problem in object detection and demonstrate that genetic algorithms (GAs) provide a simple, general, and powerful framework for selecting good subsets of features, leading to improved detection rates. As a case study, we consider PCA for feature extraction and support vector machines (SVMs) for classification. The goal is to search the PCA space using GAs for a subset of eigenvectors encoding important information about the target concept of interest. This is in contrast to traditional methods, which select some percentage of the top eigenvectors to represent the target concept independently of the classification task. We have tested the proposed framework on two challenging applications: vehicle detection and face detection. Our experimental results show significant performance improvements in both cases.
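A minimal sketch of the eigenvector-selection idea, assuming a generic binary-encoded GA with truncation selection, uniform crossover, and bit-flip mutation; the dataset, population size, and rates are illustrative, not taken from the paper:

```python
# GA-style search over subsets of PCA eigenvectors, scored by SVM accuracy.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)
Z = PCA(n_components=30).fit_transform(X)   # projection onto 30 eigenvectors
rng = np.random.default_rng(0)

def fitness(mask):
    # Cross-validated SVM accuracy on the selected eigenvector subset.
    if not mask.any():
        return 0.0
    return cross_val_score(SVC(), Z[:, mask], y, cv=3).mean()

pop = rng.random((20, Z.shape[1])) < 0.5    # random initial subsets
for gen in range(15):
    scores = np.array([fitness(ind) for ind in pop])
    parents = pop[np.argsort(scores)[::-1][:10]]      # truncation selection
    children = []
    for _ in range(10):
        a, b = parents[rng.integers(10)], parents[rng.integers(10)]
        child = np.where(rng.random(Z.shape[1]) < 0.5, a, b)  # uniform crossover
        child ^= rng.random(Z.shape[1]) < 0.02                # bit-flip mutation
        children.append(child)
    pop = np.vstack([parents, children])

best = pop[np.argmax([fitness(ind) for ind in pop])]
print("selected eigenvectors:", np.flatnonzero(best))
```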

2.
The local outlier factor (LOF) quantifies how strongly a process data point deviates from its local neighborhood. Industrial processes, however, demand real-time anomaly detection, and computing the outlier factor for every sample point is computationally expensive. This paper therefore improves the LOF algorithm: the local reachability density of each object is computed from its k-nearest neighbors, and a preprocessing method for the sample points, CDC (Closest Distance to Center), prunes the samples by first computing each point's distance to the data center, eliminating the large majority of points that cannot be outliers. The improved LOF value then needs to be computed only for the remaining points, which raises the efficiency of outlier detection. Simulations on Tennessee Eastman (TE) process data show that, while preserving outlier detection accuracy, the method runs in less time than LOF.
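A sketch of the CDC-style pruning step followed by LOF scoring; the 90% pruning ratio and the synthetic data are illustrative assumptions, not the paper's settings:

```python
# Prune points nearest the data center, then score only the candidates.
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

X = np.random.default_rng(1).normal(size=(2000, 5))   # stand-in for TE data
dist_to_center = np.linalg.norm(X - X.mean(axis=0), axis=1)
keep = dist_to_center > np.quantile(dist_to_center, 0.90)  # candidates only

# For simplicity, LOF neighborhoods here are computed within the candidate
# set; the paper computes k-NN reachability densities over the process data.
lof = LocalOutlierFactor(n_neighbors=20).fit(X[keep])
scores = -lof.negative_outlier_factor_                 # larger = more outlying
print(f"scored {keep.sum()} of {len(X)} samples")
```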

3.
4.
5.
The traditional kernelized correlation filter (KCF) tracking algorithm, when affected by illumination change, severe occlusion, or the target leaving the field of view, loses the target and keeps tracking background information as if it were the target, with no way to re-locate it. To address this, an outlier detection method is introduced on top of KCF as a target-loss early-warning mechanism, and a re-detection and re-localization mechanism for lost targets is proposed. The method inspects the response peak of every frame; when an abnormal peak is found, the target is judged to be lost or about to be lost, the warning mechanism fires, template updating is suspended, and the re-detection mechanism searches the full frame to re-locate the target. Experimental results show that the improved algorithm achieves a precision of 0.751 and a success rate of 0.579, improvements of 5.77% and 12.43% respectively over the traditional KCF tracker. The method solves the problem that the KCF tracker cannot recover the target and resume tracking after loss, improves tracking performance, and achieves long-term tracking.
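A toy version of the peak-based early warning, assuming a simple 3-sigma test of the response-map peak against running statistics of past confident frames; the paper's actual anomaly test may differ:

```python
# Flag a frame as "target lost" when its response peak is anomalously low.
import numpy as np

class LossDetector:
    def __init__(self, warmup=10, k=3.0):
        self.peaks, self.warmup, self.k = [], warmup, k

    def update(self, response_map):
        peak = float(response_map.max())
        lost = (len(self.peaks) >= self.warmup and
                peak < np.mean(self.peaks) - self.k * np.std(self.peaks))
        if not lost:
            self.peaks.append(peak)   # update statistics only on confident frames
        return lost                    # True -> freeze template, re-detect full frame
```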

6.
This article addresses some problems in outlier detection and variable selection in linear regression models. First, outlier detection suffers from the problems known as smearing and masking: smearing means that one outlier makes another, non-outlier observation appear to be an outlier, and masking means that one outlier prevents another from being detected. Detecting outliers one by one may therefore give misleading results. This article presents a genetic algorithm that considers different possible groupings of the data into outlier and non-outlier observations, so that all outliers are detected at the same time. Second, it is known that outlier detection and variable selection can influence each other, and that different results may be obtained depending on the order in which these two tasks are performed. It may therefore be useful to consider the tasks simultaneously, and a genetic algorithm for simultaneous outlier detection and variable selection is suggested. Two real data sets are used to illustrate the algorithms, which are shown to work well. In addition, the scalability of the algorithms is examined in an experiment using generated data.
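One way to realize the grouping idea is a binary chromosome that flags outliers, with a fitness that fits least squares on the unflagged rows; the BIC-like penalty below is an illustrative assumption, not the article's criterion:

```python
# Fitness for a chromosome `mask` (True = flagged outlier), to be minimized.
import numpy as np

def fitness(mask, X, y, penalty=2.0):
    keep = ~mask
    if keep.sum() <= X.shape[1] + 1:          # too few rows to fit the model
        return np.inf
    beta, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
    rss = ((y[keep] - X[keep] @ beta) ** 2).sum()
    n = keep.sum()
    # Fit quality on the clean rows plus a cost for each flagged outlier,
    # so the GA cannot trivially flag everything.
    return n * np.log(rss / n) + penalty * mask.sum() * np.log(len(y))
```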

7.
Intrusion detection is a critical issue because the prevention of intrusions depends on their detection; accurate detection is therefore essential to secure information in the computer and network systems of any organization, whether private, public, or governmental. Several intrusion detection approaches are available, but their main problem is performance, which can be enhanced by increasing detection rates and reducing false positives. This shortcoming of existing techniques is the focus of this paper. Their poor performance stems from raw datasets whose redundant features confuse the classifier and lead to inaccurate detection. Recent approaches have used principal component analysis (PCA) for feature subset selection based on the highest eigenvalues, but the features corresponding to the highest eigenvalues may not have optimal sensitivity for the classifier, since many sensitive features are ignored. Instead of the traditional approach of selecting features with the highest eigenvalues, as in PCA, this research applies a genetic algorithm to search for the genetic principal components that offer a subset of features with optimal sensitivity and the highest discriminatory power. The support vector machine (SVM) is used for classification. The experiments use the Knowledge Discovery and Data Mining (KDD) Cup dataset. The performance of this approach was analyzed and compared with existing approaches. The results show that the proposed method enhances SVM performance in intrusion detection, outperforms the existing approaches, and can minimize the number of features while maximizing detection rates.

8.
孟凡  陈广  王勇  高阳  高德群  贾文龙 《计算机应用》2021,41(8):2453-2459
Traditional reservoir oil-bearing prospecting methods combine the seismic attributes produced as seismic waves pass through strata with geological drilling data and conventional geophysical methods for comprehensive interpretation, but such methods tend to be costly and to depend heavily on expert prior knowledge. To address this, based on seismic data from the Subei Basin of the Jiangsu Oilfield and taking into account the sparsity and randomness of oil-bearing samples, an anomaly detection algorithm based on multi-granularity temporal structure representation is proposed that predicts directly from post-stack seismic trace data. The algorithm first extracts multi-granularity temporal structures from a single seismic trace to form independent feature representations; it then fuses the features extracted at multiple granularities into a joint representation of the trace; finally, the fused features are jointly trained and discriminated with a cost-sensitive method to obtain the oil-bearing prospecting result. Simulations on real raw seismic data from the Jiangsu Oilfield show that the proposed algorithm improves the area under the curve (AUC) by 10% over both the long short-term memory (LSTM) and gated recurrent unit (GRU) algorithms.
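A sketch of a multi-granularity representation for one trace: simple window statistics at several window sizes are concatenated into one fused vector. The statistics and window sizes are illustrative assumptions, not the paper's feature design:

```python
# Build a fused multi-granularity feature vector for a single seismic trace.
import numpy as np

def granularity_features(trace, window):
    # Non-overlapping windows of one size; mean and std per window.
    w = trace[: len(trace) // window * window].reshape(-1, window)
    return np.concatenate([w.mean(axis=1), w.std(axis=1)])

def fused_representation(trace, windows=(8, 16, 32)):
    # Concatenate the per-granularity representations into one vector.
    return np.concatenate([granularity_features(trace, w) for w in windows])

# Cost-sensitive training can then be approximated with class weights, e.g.
# sklearn's LogisticRegression(class_weight={0: 1, 1: 10}) on these vectors.
```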

9.
A simple electron band theory model of the heat of formation, ΔH, of transition metal alloys is used to predict ΔH for 276 transition metal alloys at equiatomic composition. The model employs a rectangular d-band electron density of states. Some of the input parameters, namely the bandwidth, Fermi level position, and number of electrons in the band, are allowed to vary within certain constraints to closely approximate any known value of ΔH. The resulting predictions are considered to have errors of the same order as the experiments.
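For context, a rectangular d-band of width W (density of states 10/W) filled with n_d electrons gives the standard Friedel-model band energy below; the paper's ΔH builds on differences of such band energies between alloy and elements, though its exact parameterization may differ:

```latex
E_F = \frac{n_d W}{10} - \frac{W}{2}, \qquad
E_{\mathrm{band}} = \int_{-W/2}^{E_F} \frac{10}{W}\, E \,\mathrm{d}E
                  = -\frac{W}{20}\, n_d \left(10 - n_d\right),
```

which is deepest at half filling (n_d = 5).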

10.
One of the fundamental motivations for feature selection is to overcome the curse of dimensionality. This paper presents a novel feature selection method that combines the differential evolution (DE) optimization method with a proposed repair mechanism based on feature distribution measures. The new method, abbreviated DEFS, applies the DE float-number optimizer to the combinatorial optimization problem of feature selection. To make the solutions generated by the float optimizer suitable for feature selection, a roulette wheel structure is constructed and supplied with the probabilities of the feature distribution. These probabilities are built up over the iterations by identifying the features that contribute to the most promising solutions. The proposed DEFS is used to search for optimal subsets of features in datasets of varying dimensionality. It is then used to aid the selection of the best wavelet packet transform (WPT) basis for classification problems, thus acting as part of a feature extraction process. Practical results indicate the significance of the proposed method in comparison with other feature selection methods.
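A sketch of the roulette-wheel repair, assuming each DE individual is a float vector of the desired subset size whose genes round to feature indices; duplicate indices are replaced by draws from the feature-distribution probabilities. Names and details are assumptions:

```python
# Map a DE float vector to a valid, duplicate-free feature subset.
import numpy as np

rng = np.random.default_rng(2)

def repair(float_vec, n_features, feature_probs):
    idx = np.clip(np.round(float_vec).astype(int), 0, n_features - 1)
    chosen = list(dict.fromkeys(idx.tolist()))   # drop duplicates, keep order
    while len(chosen) < len(float_vec):          # refill via roulette wheel
        cand = int(rng.choice(n_features, p=feature_probs))
        if cand not in chosen:
            chosen.append(cand)
    return np.array(chosen)
```

In this scheme `feature_probs` would be re-estimated each iteration from how often features occur in the fittest individuals, which is what steers the wheel toward promising features.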

11.
Attribute subset selection based on rough sets is a crucial preprocessing step in data mining and pattern recognition to reduce modeling complexity. To cope with the era of big data, new approaches need to be explored to address this problem effectively. In this paper, we review recent work on attribute subset selection in decision-theoretic rough set models. We also introduce a scalable implementation of a parallel genetic algorithm in Hadoop MapReduce to approximate the minimum reduct, which has the same discernibility power as the original attribute set in the decision table. We then focus on intrusion detection in computer networks and apply the proposed approach to four datasets with varying characteristics. The results show that the proposed model can be a powerful tool for boosting the performance of identifying the attributes in the minimum reduct in large-scale decision systems.
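A small sketch of a reduct-quality fitness such a GA could use, based on the rough-set dependency degree; the pandas formulation is illustrative, and a MapReduce version would distribute this counting over data blocks:

```python
# gamma(B) = |POS_B(D)| / |U|: the fraction of rows whose attribute values
# on subset B determine the decision uniquely.
import pandas as pd

def dependency_degree(df: pd.DataFrame, attrs: list, decision: str) -> float:
    # A block of the partition induced by `attrs` is consistent if it
    # carries exactly one decision value.
    blocks = df.groupby(attrs)[decision].agg(["size", "nunique"])
    consistent = blocks.loc[blocks["nunique"] == 1, "size"].sum()
    return consistent / len(df)

# A GA chromosome encodes a candidate subset B; a natural fitness rewards
# gamma(B) matching gamma(all attributes) while penalizing |B|.
```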

12.
Pattern Analysis and Applications - Existing architectures used in face anti-spoofing tend to deploy registered spatial measurements to generate feature vectors for spoof detection. This means that...

13.
V. Ravi  C. Pramodh   《Applied Soft Computing》2008,8(4):1539-1548
This paper proposes the application of a new principal component neural network (PCNN) architecture to the bankruptcy prediction problem in commercial banks, together with a new feature subset selection (FSS) algorithm. In this architecture, the hidden layer is completely replaced by what is referred to as a 'principal component layer', which consists of a few selected principal components that perform the function of hidden nodes. The study also proposes an algorithm based on the threshold accepting (TA) meta-heuristic to train the PCNN. The architecture greatly reduces the number of weights, as there are no formal connections between the input layer and the principal component layer. The efficacy of the algorithm is tested on Spanish and Turkish bank datasets. The results showed the high generalization power of the PCNN in 10-fold cross-validation, and the feature subsets selected in each of the examples showed high discriminating power. The PCNN is also compared with PCA-TANN and PCA-BPNN, which use PCA as a preprocessor and have one hidden layer each; further comparisons are made with TANN and BPNN. All classifiers are compared on the AUC criterion (area under the receiver operating characteristic (ROC) curve), the ROC curve being drawn for each classifier with sensitivity on the y-axis and 1 − specificity on the x-axis. Based on the experiments conducted, the proposed PCNN hybrids outperformed the other classifiers in terms of AUC, and the proposed feature subset selection algorithm proved very stable and powerful.
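A minimal sketch of the two ideas together, assuming a fixed PCA projection as the 'principal component layer' and a single sigmoid output node trained by threshold accepting; the architecture details, move size, and schedule are illustrative assumptions:

```python
# Train output weights over a PCA 'hidden' layer with threshold accepting:
# accept a random move if it worsens the loss by less than a shrinking threshold.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(3)

def train_pcnn(X, y, n_components=5, iters=2000):
    Z = PCA(n_components=n_components).fit_transform(X)  # principal component layer
    Z = np.c_[Z, np.ones(len(Z))]                        # bias column
    w = rng.normal(size=Z.shape[1])

    def loss(w):
        p = 1.0 / (1.0 + np.exp(-Z @ w))                 # sigmoid output node
        return np.mean((p - y) ** 2)                     # y assumed in {0, 1}

    current, thresh = loss(w), 0.05
    for _ in range(iters):
        w_new = w + rng.normal(scale=0.1, size=w.shape)  # random move
        l_new = loss(w_new)
        if l_new < current + thresh:                     # threshold accepting rule
            w, current = w_new, l_new
        thresh *= 0.999                                  # shrink the threshold
    return w
```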

14.
To prevent internal data leakage, database activity monitoring uses software agents to analyze protocol traffic over networks and to observe local database activities. However, the large volume of data obtained from database activity monitoring has been a significant barrier to effective monitoring and analysis of database activities. In this paper, we present database activity monitoring by means of a density-based outlier detection method and a commercial database activity monitoring solution. To make the outlier detection computation efficient, we exploit a kd-tree index and an approximate k-nearest neighbors (ANN) search method, which significantly reduce the outlier computation time. The proposed methodology was successfully applied to a very large log dataset collected from the Korea Atomic Energy Research Institute (KAERI). The results show that the proposed method can effectively detect outliers in database activities in a shorter computation time.
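A minimal sketch of the indexing speed-up, assuming SciPy's cKDTree; its `eps` argument makes the k-NN query approximate, and a plain k-distance score stands in here for the full density-based score:

```python
# Approximate k-NN via a kd-tree; score each point by its k-distance.
import numpy as np
from scipy.spatial import cKDTree

X = np.random.default_rng(4).normal(size=(50000, 4))  # stand-in for log features
tree = cKDTree(X)
# k+1 because each point's nearest neighbor is itself; eps > 0 = approximate.
dists, _ = tree.query(X, k=21, eps=0.5)
score = dists[:, -1]                       # distance to the 20th neighbor
outliers = np.argsort(score)[-100:]        # the 100 most isolated activities
```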

15.
Outlier detection research sees many new algorithms every year that often appear only slightly different from existing methods, along with experiments that show them to "clearly outperform" the others. However, few approaches come with a clear analysis of existing methods and a solid theoretical differentiation. Here, we provide a formalized method of analysis that allows a theoretical comparison and generalization of many existing methods. Our unified view improves understanding of the shared properties and of the differences between outlier detection models. By abstracting the notion of locality from the classic distance-based notion, our framework facilitates the construction of abstract methods for many special data types that are usually handled with specialized algorithms. In particular, spatial neighborhood can be seen as a special case of locality, and we therefore compare and generalize approaches to spatial outlier detection in detail. We also discuss temporal data such as video streams, and graph data such as community networks. Since we reproduce the results of specialized approaches with our general framework, and even improve upon them, the framework provides reasonable baselines for evaluating the true merits of specialized approaches. At the same time, seeing spatial outlier detection as a special case of local outlier detection opens up new potential for the analysis and advancement of methods.

16.
Design of a trusted peer-to-peer communication framework for distributed intrusion detection
To improve the real-time performance and security of distributed intrusion detection, a model of a trusted peer-to-peer communication framework for distributed intrusion detection is proposed. Drawing on P2P and agent technology, the intrusion detection agents on different network nodes are peers that cooperate in a coordinated defense by sharing detection information. Drawing further on secure communication techniques, an authentication server is set up in the network, and any two network processes not on the same node must communicate through this server, which improves the security of the intrusion detection system itself. A prototype system was designed and implemented, and experiments on the prototype demonstrate the correctness and feasibility of the model.

17.
Shannon's information entropy is widely used in rough set theory. This paper uses the rough entropy of rough sets to detect outliers, proposing a rough-entropy-based outlier detection method and applying it to unsupervised intrusion detection. First, a new definition of outliers based on rough entropy is given and a corresponding detection algorithm, rough entropy-based outlier detection (REOD), is designed. Second, by treating intrusions as outliers, REOD is applied to intrusion detection, yielding a new unsupervised intrusion detection method. Experiments on multiple datasets show that REOD has good outlier detection performance. Moreover, compared with existing intrusion detection methods, REOD achieves a higher detection rate and a lower false alarm rate, and its computational cost is small, making it suitable for detecting intrusions in massive, high-dimensional data.
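A heavily simplified sketch in the spirit of entropy-based outlier scoring: objects whose removal most reduces the total entropy of the attribute-value partitions are scored as outliers. This scoring rule is an illustrative assumption; REOD's actual rough-entropy definition differs in detail:

```python
# Score each row by how much its removal lowers per-attribute partition entropy.
import numpy as np
from collections import Counter

def partition_entropy(values):
    counts = np.array(list(Counter(values).values()), dtype=float)
    p = counts / counts.sum()
    return -(p * np.log2(p)).sum()

def outlier_scores(rows):
    rows = [tuple(r) for r in rows]
    n_attr = len(rows[0])
    base = sum(partition_entropy([r[a] for r in rows]) for a in range(n_attr))
    scores = []
    for i in range(len(rows)):                 # leave-one-out entropy change
        rest = rows[:i] + rows[i + 1:]
        e = sum(partition_entropy([r[a] for r in rest]) for a in range(n_attr))
        scores.append(base - e)                # bigger drop = more outlying
    return scores
```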

18.
Feature subset selection is a substantial problem in data classification tasks. Its purpose is to find an efficient subset of the original features that increases both the efficiency and the accuracy of classification while reducing its cost. Working with high-dimensional datasets that have a very large number of predictive attributes but few instances requires techniques for selecting an optimal feature subset. In this paper, a hybrid method is proposed for efficient subset selection in high-dimensional datasets. The proposed algorithm runs filter and wrapper algorithms in two phases. In the filter phase, the symmetrical uncertainty (SU) criterion is used to weight features by how well they discriminate the classes. In the wrapper phase, both FICA (fuzzy imperialist competitive algorithm) and IWSSr (incremental wrapper subset selection with replacement) are executed in the weighted feature space to find relevant attributes. The new scheme is successfully applied to 10 standard high-dimensional datasets, especially from the biosciences and medicine, where the number of features is large compared to the number of samples, inducing a severe curse of dimensionality. Comparison with other algorithms confirms that our method achieves the highest accuracy rate and is also able to find an efficient, compact subset.
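The filter criterion is standard: SU(X, Y) = 2·I(X; Y) / (H(X) + H(Y)). A minimal sketch for discrete feature and class vectors (binning continuous features beforehand is assumed):

```python
# Symmetrical uncertainty between a discrete feature x and class labels y.
import numpy as np
from collections import Counter

def entropy(v):
    p = np.array(list(Counter(v).values()), dtype=float)
    p /= p.sum()
    return -(p * np.log2(p)).sum()

def symmetrical_uncertainty(x, y):
    hx, hy = entropy(x), entropy(y)
    hxy = entropy(list(zip(x, y)))         # joint entropy H(X, Y)
    mi = hx + hy - hxy                     # mutual information I(X; Y)
    return 2 * mi / (hx + hy) if hx + hy > 0 else 0.0

# Filter phase: rank features by SU against the class label, then hand the
# weighted feature space to the wrapper search (FICA + IWSSr in the paper).
```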

19.
Automatic recognition of digital modulations plays an important role in applications such as software-defined radio. This study investigates the design of an accurate system for recognizing digital modulations. First, an efficient system is introduced that includes two main modules: a feature extraction module and a classifier module. The first module extracts a suitable combination of higher-order moments up to the eighth order, higher-order cumulants up to the eighth order, and instantaneous characteristics of the digital modulations; these features are applied for the first time in this area. In the classifier module, several supervised classifiers are investigated, such as a multilayer perceptron neural network, radial basis functions, and a multi-class support vector machine based classifier, and the best classifier for the considered modulations is chosen experimentally. A hybrid heuristic recognition system is then proposed in which an optimization module improves the generalization performance of the classifier: it optimizes the classifier design by searching for the best values of the parameters that tune its discriminant function (kernel parameter selection) and, upstream, for the best subset of features to feed the classifier. Simulation results show that the proposed system has very high recognition accuracy, achieved with few features selected by a particle swarm optimizer.
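A sketch of a few of the cumulant features, using the standard moment formulas for zero-mean complex baseband samples; the paper's full feature set also spans moments up to eighth order and instantaneous characteristics:

```python
# Fourth-order cumulants from complex samples: C40 = M40 - 3*M20^2,
# C42 = M42 - |M20|^2 - 2*M21^2, with M_pq = E[x^(p-q) * conj(x)^q].
import numpy as np

def cumulant_features(x):
    x = x - x.mean()                      # zero-mean complex baseband samples
    M20 = np.mean(x ** 2)
    M21 = np.mean(np.abs(x) ** 2)
    M40 = np.mean(x ** 4)
    M42 = np.mean(np.abs(x) ** 4)
    C40 = M40 - 3 * M20 ** 2
    C42 = M42 - np.abs(M20) ** 2 - 2 * M21 ** 2
    # Normalizing by M21**2 makes the features signal-power invariant;
    # magnitudes |C40|, |C42| are typically what feeds the classifier.
    return {"C40": C40 / M21 ** 2, "C42": C42 / M21 ** 2}
```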

20.
In this paper, we present an approach to improving microaneurysm detection in digital color fundus images. Instead of following the standard process of preprocessing, candidate extraction, and classification, we propose a novel approach that combines several preprocessing methods and candidate extractors before the classification step. We ensure high flexibility by using a modular model and a simulated annealing-based search algorithm to find the optimal combination. Our experimental results show that the proposed method outperforms the current state-of-the-art individual microaneurysm candidate extractors.
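A compact sketch of the simulated-annealing search, assuming a binary mask over the available preprocessing methods and candidate extractors, with a placeholder `evaluate` standing in for the detection score of the assembled pipeline:

```python
# Simulated annealing over on/off combinations of pipeline components.
import math
import random

def anneal(n_components, evaluate, iters=500, t0=1.0, cooling=0.99):
    state = [random.random() < 0.5 for _ in range(n_components)]
    energy = 1.0 - evaluate(state)                 # minimize 1 - detection score
    t = t0
    for _ in range(iters):
        cand = state[:]
        cand[random.randrange(n_components)] ^= True    # flip one component
        e_new = 1.0 - evaluate(cand)
        # Accept improvements always; accept worse moves with Boltzmann probability.
        if e_new < energy or random.random() < math.exp((energy - e_new) / t):
            state, energy = cand, e_new
        t *= cooling                                    # cool the temperature
    return state
```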
