
20 similar documents found (search time: 31 ms)

Normalized Mutual Information Feature Selection (total citations: 6, self-citations: 0, citations by others: 6)

《Neural Networks, IEEE Transactions on》2009,20(2):189-201

A filter method of feature selection based on mutual information, called normalized mutual information feature selection (NMIFS), is presented. NMIFS is an enhancement over Battiti's MIFS, MIFS-U, and mRMR methods. The average normalized mutual information is proposed as a measure of redundancy among features. NMIFS outperformed MIFS, MIFS-U, and mRMR on several artificial and benchmark data sets without requiring a user-defined parameter. In addition, NMIFS is combined with a genetic algorithm to form a hybrid filter/wrapper method called GAMIFS. This includes an initialization procedure and a mutation operator based on NMIFS to speed up the convergence of the genetic algorithm. GAMIFS overcomes the limitations of incremental search algorithms that are unable to find dependencies between groups of features.
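The greedy selection described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes discrete features, scores each candidate as its mutual information with the class minus the average normalized mutual information with the already-selected features, and assumes the normalization divides by the smaller of the two marginal entropies. The function names (`nmifs_select`, `normalized_mi`) are hypothetical.

```python
import math
from collections import Counter


def entropy(xs):
    """Shannon entropy (in nats) of a discrete sequence."""
    n = len(xs)
    return -sum((c / n) * math.log(c / n) for c in Counter(xs).values())


def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for two discrete sequences."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def normalized_mi(xs, ys):
    """MI divided by the smaller marginal entropy (assumed normalization)."""
    denom = min(entropy(xs), entropy(ys))
    return mutual_information(xs, ys) / denom if denom > 0 else 0.0


def nmifs_select(features, target, k):
    """Greedily pick k names from a dict mapping name -> discrete column."""
    remaining = dict(features)
    selected = []
    while remaining and len(selected) < k:
        def score(name):
            # Relevance to the class minus average redundancy with the picks so far.
            relevance = mutual_information(remaining[name], target)
            if not selected:
                return relevance
            redundancy = sum(
                normalized_mi(remaining[name], features[s]) for s in selected
            ) / len(selected)
            return relevance - redundancy

        best = max(remaining, key=score)
        selected.append(best)
        del remaining[best]
    return selected
```

With two informative features and an exact copy of one of them, the redundancy term keeps the copy out of a two-feature selection, and no trade-off parameter has to be set by the user.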

Feature selection based on information-theoretic criteria in classification analysis (total citations: 3, self-citations: 1, citations by others: 2)

黄金杰 吕宁 李双全 蔡云泽 《自动化学报》 (Acta Automatica Sinica) 2008,34(3):383-392

Feature selection aims to reduce the dimensionality of patterns for classificatory analysis by selecting the most informative instead of irrelevant and/or redundant features. In this study, two novel information-theoretic measures for feature ranking are presented: one is an improved formula to estimate the conditional mutual information between the candidate feature fi and the target class C given the subset of selected features S, i.e., I(C;fi|S), under the assumption that information of features is distributed uniformly; the other is a mutual information (MI) based constructive criterion that is able to capture both irrelevant and redundant input features under arbitrary distributions of information of features. With these two measures, two new feature selection algorithms, called the quadratic MI-based feature selection (QMIFS) approach and the MI-based constructive criterion (MICC) approach, respectively, are proposed, in which no parameters like β in Battiti's MIFS and Kwak and Choi's MIFS-U methods need to be preset. Thus, the intractable problem of how to choose an appropriate value of β to trade off relevance to the target classes against redundancy with the already-selected features is avoided completely. Experimental results demonstrate the good performance of QMIFS and MICC on both synthetic and benchmark data sets.

Mixed feature selection based on granulation and approximation

Qinghua Hu Jinfu Liu Daren Yu 《Knowledge》2008,21(4):294-304

Feature subset selection presents a common challenge for the applications where data with tens or hundreds of features are available. Existing feature selection algorithms are mainly designed for dealing with numerical or categorical attributes. However, data usually comes with a mixed format in real-world applications. In this paper, we generalize Pawlak’s rough set model into δ neighborhood rough set model and k-nearest-neighbor rough set model, where the objects with numerical attributes are granulated with δ neighborhood relations or k-nearest-neighbor relations, while objects with categorical features are granulated with equivalence relations. Then the induced information granules are used to approximate the decision with lower and upper approximations. We compute the lower approximations of decision to measure the significance of attributes. Based on the proposed models, we give the definition of significance of mixed features and construct a greedy attribute reduction algorithm. We compare the proposed algorithm with others in terms of the number of selected features and classification performance. Experiments show the proposed technique is effective.
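The idea of granulating mixed data and scoring attributes by the lower approximation of the decision can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the paper's algorithm: numeric values are treated as neighbors when they differ by at most `delta` in every selected attribute, categorical values (represented here as strings) must match exactly, and the stopping rule (stop when the score no longer increases) is an assumption. The names `in_neighborhood`, `dependency`, and `greedy_reduct` are hypothetical.

```python
def in_neighborhood(u, v, delta):
    """True if v lies in the delta-neighborhood of u on every attribute."""
    return all(
        a == b if isinstance(a, str) else abs(a - b) <= delta
        for a, b in zip(u, v)
    )


def dependency(rows, labels, delta):
    """Fraction of samples whose whole neighborhood shares their decision label,
    i.e. the relative size of the lower approximation of the decision."""
    positive = 0
    for i, u in enumerate(rows):
        if all(
            labels[j] == labels[i]
            for j, v in enumerate(rows)
            if in_neighborhood(u, v, delta)
        ):
            positive += 1
    return positive / len(rows)


def greedy_reduct(columns, labels, delta):
    """Forward selection: repeatedly add the attribute that raises dependency most."""
    selected, best_so_far = [], 0.0
    remaining = list(columns)
    while remaining:
        scored = []
        for name in remaining:
            rows = list(zip(*(columns[c] for c in selected + [name])))
            scored.append((dependency(rows, labels, delta), name))
        value, name = max(scored)
        if value <= best_so_far:  # no attribute improves the score: stop
            break
        selected.append(name)
        remaining.remove(name)
        best_so_far = value
    return selected
```

For example, a numeric attribute that separates the two classes by more than `delta` reaches a dependency of 1.0 on its own, so a constant attribute is never added after it.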

Initial-interval driven biasing of nodal concentration along a fixed curve

L. D. Flippen Jr 《Computers & Structures》1999,70(6):1108-698

Element size transitioning in the construction of spatial meshes for finite element models is often controlled by biasing the concentration of nodes, towards one end or the other, along each of a set of curves in the model. A simple, common and efficient scheme to implement such nodal concentration biasing along a given curve is to require that the nodal spacings δ_i be (sequence) terms bⁱδ₀ of a geometric series. Current practice takes the parameter value b, or its equivalent, as an independent input, so that the initial nodal spacing δ₀ must be a computed output. This is the most straightforward approach, but the lack of direct control over the value δ₀ is a significant shortcoming. In an element size transitioning scenario, δ₀ is often a parameter for which the model builder/analyst has independent quantitative information. It may represent the a priori known thickness of a thin bond or weld, for example. A more rational choice for these cases, proposed by this paper, is a scheme for which δ₀ is an independent input parameter instead of b. The parameter b is computed by a convergence-guaranteed algorithm for which the existence of b as a single-valued function of its input is proven.
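To make the δ₀-as-input scheme concrete, here is a minimal Python sketch. It does not reproduce the paper's algorithm; plain bisection is swapped in as a simple stand-in. It assumes n intervals with spacings δ_i = bⁱδ₀ for i = 0, …, n−1 that must add up to a curve length L; the names `solve_bias`, `n`, and `length` are hypothetical. Because the sum δ₀(1 + b + … + bⁿ⁻¹) increases strictly with b for b > 0, a unique root exists whenever n ≥ 2 and 0 < δ₀ < L.

```python
def solve_bias(delta0, n, length, tol=1e-12):
    """Find b > 0 such that delta0 * (1 + b + ... + b**(n-1)) equals length."""
    if n < 2 or not 0 < delta0 < length:
        raise ValueError("need n >= 2 and 0 < delta0 < length")

    def total(b):
        # Sum of the n geometric spacings for a given ratio b.
        return delta0 * sum(b ** i for i in range(n))

    lo, hi = 0.0, 1.0
    while total(hi) < length:  # grow the bracket until it contains the root
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):  # bisection on the monotone sum
        mid = 0.5 * (lo + hi)
        if total(mid) < length:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Once b is known, the node spacings follow as `[delta0 * b ** i for i in range(n)]`, so the first spacing is exactly the value the analyst supplied.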

Conditional mutual information-based feature selection for congestive heart failure recognition using heart rate variability

Yu SN Lee MY 《Computer methods and programs in biomedicine》2012,108(1):299-309

Feature selection plays an important role in pattern recognition systems. In this study, we explored the problem of selecting effective heart rate variability (HRV) features for recognizing congestive heart failure (CHF) based on mutual information (MI). The MI-based greedy feature selection approach proposed by Battiti was adopted in the study. The mutual information conditioned on the first-selected feature was used as a criterion for feature selection. The uniform distribution assumption was used to reduce the computational load. In addition, a logarithmic exponent weighting was added to model the relative importance of the MI with respect to the number of already-selected features. The CHF recognition system contained a feature extractor that generated 50 features in four categories from the input HRV sequences. The proposed feature selector, termed UCMIFS, proceeded to select the most effective features for the succeeding support vector machine (SVM) classifier. Prior to feature selection, the 50 features produced a high accuracy of 96.38%, which confirmed the representativeness of the original feature set. The performance of the UCMIFS selector was demonstrated to be superior to the other MI-based feature selectors including MIFS-U, CMIFS, and mRMR. When compared to the other outstanding selectors published in the literature, the proposed UCMIFS outperformed them, reaching an accuracy of 97.59% in recognizing CHF using only 15 features. The results demonstrated the advantage of using the recruited features in characterizing HRV sequences for CHF recognition. The UCMIFS selector further improved the efficiency of the recognition system with substantially lowered feature dimensions and elevated recognition rate.

Input feature selection for classification problems (total citations: 30, self-citations: 0, citations by others: 30)

Kwak N. Chong-Ho Choi 《Neural Networks, IEEE Transactions on》2002,13(1):143-159

Feature selection plays an important role in classifying systems such as neural networks (NNs). The available attributes may be relevant, irrelevant, or redundant, and because a dataset can be huge, reducing the number of attributes by selecting only the relevant ones is desirable. In doing so, higher performance with lower computational effort is expected. In this paper, we propose two feature selection algorithms. The limitation of mutual information feature selector (MIFS) is analyzed and a method to overcome this limitation is studied. One of the proposed algorithms makes more considered use of mutual information between input attributes and output classes than the MIFS. What is demonstrated is that the proposed method can provide the performance of the ideal greedy selection algorithm when information is distributed uniformly. The computational load for this algorithm is nearly the same as that of MIFS. In addition, another feature selection algorithm using the Taguchi method is proposed. This is advanced as a solution to the question as to how to identify good features with as few experiments as possible. The proposed algorithms are applied to several classification problems and compared with MIFS. These two algorithms can be combined to complement each other's limitations. The combined algorithm performed well in several experiments and should prove to be a useful method in selecting features for classification problems.

An application of C-calculus to texture analysis: C-transforms

A. Apostolico E.R. Caianiello E. Fischetti S. Vitulano 《Pattern recognition》1978,10(5-6):389-396

A method for the analysis and discrimination of textures, based on C-calculus, is proposed.

The concepts of C-space and C-transform of a digitized signal are introduced as simple tools, well suited to the visualization of the filtering properties of C-calculus: C-filters are thus also defined and the “natural” role they seem to play in problems concerning textures is investigated in some practical instances. In particular, C-transforms of some sample textures are provided and texture classification in C-space is performed. Discrimination of objects against textural background is obtained by C-filtering, in an inherently parallel fashion.

The philosophy involved in this approach is finally briefly discussed in a comparison with some extant methods.

Estimating optimal feature subsets using efficient estimation of high-dimensional mutual information (total citations: 4, self-citations: 0, citations by others: 4)

Chow T.W.S. Huang D. 《Neural Networks, IEEE Transactions on》2005,16(1):213-224

A novel feature selection method using the concept of mutual information (MI) is proposed in this paper. In all MI based feature selection methods, effective and efficient estimation of high-dimensional MI is crucial. In this paper, a pruned Parzen window estimator and the quadratic mutual information (QMI) are combined to address this problem. The results show that the proposed approach can estimate the MI in an effective and efficient way. With this contribution, a novel feature selection method is developed to identify the salient features one by one. Also, the appropriate feature subsets for classification can be reliably estimated. The proposed methodology is thoroughly tested in four different classification applications in which the number of features ranged from fewer than 10 to over 15000. The presented results are very promising and corroborate the contribution of the proposed feature selection methodology.

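Linear property testing method for Gaussian kernel selection

韩志卓 廖士中 《模式识别与人工智能》 (Pattern Recognition and Artificial Intelligence) 2017,30(9):815-821

Kernel selection directly affects the performance of kernel methods. Existing Gaussian kernel selection methods have a computational complexity of Ω(n²), which hinders the development of large-scale kernel methods. This paper proposes a linear property testing method for Gaussian kernel selection. Unlike traditional kernel selection methods, its query complexity is O(ln(1/δ)/ε²) and its computational complexity is independent of the sample size. The paper first defines the linearity level of a function and proves that the linearity level can approximately measure the distance between a function and the class of linear functions; on this basis, it proposes a linear property testing criterion for Gaussian kernel selection. The criterion is then applied to evaluate and select Gaussian kernels efficiently in the random Fourier feature space. Theoretical analysis and experiments show that applying property testing to Gaussian kernel selection is effective and feasible.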

10.

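Research on a hybrid feature selection method for Chinese sentiment based on the Lasso algorithm

李燕 卫志华 徐凯 《计算机科学》 (Computer Science) 2018,45(1):39-46

An important problem in Chinese sentiment analysis is sentiment polarity classification. Sentiment feature selection is the prerequisite and foundation of machine-learning-based sentiment polarity classification; its role is to reduce the dimensionality of the feature set by removing irrelevant or redundant features. A hybrid sentiment feature selection method that combines the Lasso algorithm with filter-based feature selection methods is proposed. First, the Lasso penalized regression algorithm screens the original feature set to obtain a sentiment classification feature subset with low redundancy. Next, filter methods such as CHI, MI, and IG are applied to this subset to evaluate the dependency weight between candidate feature words and text categories, and candidate feature words with low relevance are removed accordingly. Finally, the proposed method is compared with DF, MI, IG, and CHI for different numbers of feature words, using an SVM classifier with a Gaussian kernel. Experiments on a microblog short-text corpus show that the proposed algorithm is effective and efficient. Moreover, when the dimensionality of the feature subset is smaller than the number of samples, the proposed hybrid method improves on the feature selection results of DF, MI, IG, and CHI to some degree. A comparison of recognition rate and recall shows that the Lasso-MI method is more effective than MI and the other filter methods.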

11.

Minimizing the distance to one evader while chasing another

I. Shevchenko 《Computers & Mathematics with Applications》2004,47(12):1827-1855

To approach a simple game Δ² of P and E = {E₁, E₂} with no a priori evaders' role assignment and the payoff equal to the distance to one evader at an instant of catching another, we introduce a concept of casting and study the games Δ_1,2 and Δ_2,1 for preassigned and Δ_p² for open-loop casting procedures. Since Δ_p² is reduced to Δ_1,2 or Δ_2,1 which, in turn, are distinguished only by their notations, we focus attention mainly on Δ_1,2. According to the tenet of transition, Δ_1,2 is divided into a concatenation of Δ_1,2^b (basic) and Δ_1,2^a (auxiliary) games that model the problem before and after the first instant of E₁ capture. The games Δ_1,2^a, Δ_1,2^b, Δ_1,2 are studied one after another using Isaacs' approach as extended by Berkowitz, Breakwell, Bernhard et al.

12.

Local feature extraction for iris recognition with automatic scale selection (total citations: 1, self-citations: 0, citations by others: 1)

Chenhong Lu  Zhaoyang Lu   《Image and vision computing》2008,26(7):935-940

This paper presents an iris recognition system using an automatic scale selection algorithm for iris feature extraction. The proposed system first filters the given iris image with a bank of Laplacian of Gaussian (LoG) filters at many different scales and computes the normalized response of every filter. The parameter γ used to normalize the filter responses is derived by analyzing the scale-space maxima of the blob feature detector responses. Then the maximum normalized response over scales at each point is selected as the optimal filter output for the given iris image, and the binary codes for iris feature representation are obtained by encoding these optimal outputs through a zero threshold. Comparative experimental results clearly demonstrate the efficient performance of the proposed algorithm.

13.

Effects of spatial variability in light use efficiency on satellite-based NPP monitoring (total citations: 11, self-citations: 0, citations by others: 11)

David P. Turner  Stith T. Gower  Warren B. Cohen  Matthew Gregory  Tom K. Maiersperger 《Remote sensing of environment》2002,80(3):397-405

Light use efficiency (LUE) algorithms are a potentially effective approach to monitoring global net primary production (NPP) using satellite-borne sensors such as the Moderate Resolution Imaging Spectroradiometer (MODIS). However, these algorithms are applied at relatively coarse spatial resolutions (≥1 km), which may subsume significant heterogeneity in vegetation LUE (ε_n, g MJ⁻¹) and, hence, introduce error. To examine the effects of spatial heterogeneity on a LUE algorithm, imagery from the Advanced Very High Resolution Radiometer (AVHRR) at ≈1-km resolution was used to implement a LUE approach for NPP estimation over a 25-km² area of corn (Zea mays L.) and soybean (Glycine max Merr.) in central Illinois, USA. Results from several ε_n formulations were compared with a NPP reference surface based on measured NPPs and a high spatial resolution land cover surface derived from Landsat ETM+. Determination of ε_n based on measurements of biomass production and monitoring of absorbed photosynthetically active radiation (APAR) revealed that ε_n of soybean was 68% of that for corn. When a LUE algorithm for estimating NPP was implemented in the study area using the assumption of homogeneous cropland and the ε_n for corn, the estimate for total biomass production was 126% of that from the NPP reference surface. Because of counteracting errors, total biomass production using the soybean ε_n was closer (86%) to that from the NPP reference surface. Retention of high spatial resolution land cover to assign ε_n resulted in a total NPP very similar to the reference NPP because differences in leaf phenology between the crop types were small except early in the growing season. These results suggest several alternative approaches to accounting for land cover heterogeneity in ε_n when implementing LUE algorithms at coarse resolution.

14.

Mutual information-based method for selecting informative feature sets

Gunawan Herman  Bang Zhang  Yang Wang  Getian Ye  Fang Chen 《Pattern recognition》2013,46(12):3315-3327

Feature selection is one of the fundamental problems in pattern recognition and data mining. A popular and effective approach to feature selection is based on information theory, namely the mutual information of features and class variable. In this paper we compare eight different mutual information-based feature selection methods. Based on the analysis of the comparison results, we propose a new mutual information-based feature selection method. By taking into account both the class-dependent and class-independent correlation among features, the proposed method selects a less redundant and more informative set of features. The advantage of the proposed method over other methods is demonstrated by the results of experiments on UCI datasets (Asuncion and Newman, 2010 [1]) and object recognition.

15.

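Semi-supervised feature selection method based on local discriminant constraints

严菲 王晓栋 《模式识别与人工智能》 (Pattern Recognition and Artificial Intelligence) 2017,30(1):89-95

Feature selection aims to select the most representative features of the data to be processed and to reduce the dimensionality of the feature space. This paper proposes a semi-supervised feature selection method based on local discriminant constraints. The method makes full use of both labeled and unlabeled samples to train the feature selection model, uses the local discriminant information between neighboring data points to improve the model's accuracy, and introduces an l_2,1 constraint to improve the discriminability among features and avoid noise interference. Finally, experiments verify the effectiveness of the method.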

16.

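Graph-optimized unsupervised group feature selection method based on l_2,0-norm sparsity and fuzzy similarity

孟田田 周水生 田昕润 《模式识别与人工智能》 (Pattern Recognition and Artificial Intelligence) 2023,36(1):34-48

Most graph-based unsupervised feature selection methods use l_2,1-norm sparse regularization of the projection matrix in place of the non-convex l_2,0-norm constraint. However, l_2,1-norm regularization selects features one at a time according to their scores and does not consider correlations among features. This paper therefore proposes a graph-optimized unsupervised group feature selection method based on l_2,0-norm sparsity and fuzzy similarity, which performs graph learning and feature selection simultaneously. In graph learning, a similarity matrix with exact connected components is learned. In feature selection, the number of non-zero rows of the projection matrix is constrained, which achieves group feature selection. To handle the non-convex l_2,0-norm constraint, a feature selection vector with elements 0 or 1 is introduced, which transforms the l_2,0-norm constrained problem into a 0-1 integer programming problem; the discrete 0-1 integer constraint is then converted into two continuous constraints for solution. Finally, a fuzzy similarity factor is introduced to extend the method and learn a more accurate graph structure. Experiments on real datasets demonstrate the effectiveness of the method.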

17.

Multivariate correlation coefficient and mutual information-based feature selection in intrusion detection

Sara Mohammadi  Mostafa Ghazizadeh-Ahsaee 《Information Security Journal: A Global Perspective》2017,26(5):229-239

Feature selection is one of the major problems in an intrusion detection system (IDS) since there are additional and irrelevant features. This problem causes incorrect classification and a low detection rate in those systems. In this article, four feature selection algorithms, named multivariate linear correlation coefficient (MLCFS), feature grouping based on multivariate mutual information (FGMMI), feature grouping based on linear correlation coefficient (FGLCC), and feature grouping based on pairwise MI, are proposed to solve this problem. These algorithms are implementable in any IDS. Both linear and nonlinear measures are used in the sense that the correlation coefficient and the multivariate correlation coefficient are linear, whereas the MI and the multivariate MI are nonlinear. Least Square Support Vector Machine (LS-SVM) as an intrusion classifier is used to evaluate the selected features. Experimental results on the KDDcup99 and Network Security Laboratory-Knowledge Discovery and Data Mining (NSL) datasets showed that the proposed feature selection methods have a higher detection rate and accuracy and a lower false-positive rate compared with the pairwise linear correlation coefficient and the pairwise MI employed in several previous algorithms.

18.

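Joint feature selection method based on relevance and redundancy

周城 葛斌 唐九阳 肖卫东 《计算机科学》 (Computer Science) 2012,39(4):181-184

This paper compares document frequency, which is independent of class information, with the class-dependent information gain, mutual information, and χ² statistic feature selection methods. On this basis, it analyzes the drawbacks of earlier approaches that directly combine these two types of feature selection methods and proposes a joint feature selection algorithm based on relevance and redundancy. The algorithm combines document frequency with information gain, mutual information, and the χ² statistic, respectively, for feature selection. It aims to remove redundant features and retain features that benefit classification, thereby improving text sentiment classification. Experimental results show that the joint feature selection method performs well and can effectively reduce the feature dimensionality.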

19.

On absolute summability factors of infinite series

Ekrem Savaş 《Computers & Mathematics with Applications》2008,56(1):25-29

In this paper a general theorem on |A,δ|_k-summability methods has been proved. This theorem includes, as a special case, a known result in [E. Savas, Factors for |A|_k Summability of infinite series, Comput. Math. Appl. 53 (2007) 1045–1049].

20.

An improved lower bound on approximation algorithms for the Closest Substring problem

Jianxin Wang  Jianer Chen  Min Huang 《Information Processing Letters》2008,107(1):24-28

The Closest Substring problem (the CSP problem) is a basic NP-hard problem in the study of computational biology. It is known that the problem has polynomial time approximation schemes. In this paper, we prove that unless the Exponential Time Hypothesis fails, the CSP problem has no polynomial time approximation schemes of running time f(1/ε)n^o(1/ε) for any function f. This essentially excludes the possibility that the CSP problem has a practical polynomial time approximation scheme even for moderate values of the error bound ε. As a consequence, it is unlikely that the study of approximation schemes for the CSP problem in the literature would lead to practical approximation algorithms for the problem for small error bound ε.