Similar literature
 20 similar documents found; search time 0 ms
1.
Recently developed methods for learning sparse classifiers are among the state-of-the-art in supervised learning. These methods learn classifiers that incorporate weighted sums of basis functions with sparsity-promoting priors encouraging the weight estimates to be either significantly large or exactly zero. From a learning-theoretic perspective, these methods control the capacity of the learned classifier by minimizing the number of basis functions used, resulting in better generalization. This paper presents three contributions related to learning sparse classifiers. First, we introduce a true multiclass formulation based on multinomial logistic regression. Second, by combining a bound optimization approach with a component-wise update procedure, we derive fast exact algorithms for learning sparse multiclass classifiers that scale favorably in both the number of training samples and the feature dimensionality, making them applicable even to large data sets in high-dimensional feature spaces. To the best of our knowledge, these are the first algorithms to perform exact multinomial logistic regression with a sparsity-promoting prior. Third, we show how nontrivial generalization bounds can be derived for our classifier in the binary case. Experimental results on standard benchmark data sets attest to the accuracy, sparsity, and efficiency of the proposed methods.
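As a rough illustration of two ingredients named in the abstract above, the sketch below computes multinomial logistic (softmax) class probabilities and applies a soft-threshold to each weight, the standard operation through which an l1-type penalty yields weights that are exactly zero. It is not the paper's bound-optimization algorithm; the function names, the threshold value, and the toy weights are assumptions made for illustration.

```python
import math

def softmax_probs(weights, x):
    """Class probabilities for multinomial logistic regression.

    weights: one weight vector per class; x: feature vector.
    """
    scores = [sum(w_j * x_j for w_j, x_j in zip(w, x)) for w in weights]
    top = max(scores)  # subtract the max score for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def soft_threshold(w, lam):
    """Shrink w toward zero by lam; values in [-lam, lam] become exactly 0."""
    if w > lam:
        return w - lam
    if w < -lam:
        return w + lam
    return 0.0

# Hypothetical toy weights for 3 classes and 2 features.
weights = [[1.2, -0.05], [0.03, 0.8], [-0.6, 0.02]]
sparse_weights = [[soft_threshold(w, 0.1) for w in row] for row in weights]
probs = softmax_probs(sparse_weights, [1.0, 2.0])
```

Small weights are set to exactly zero while large ones are only shrunk, which is the "significantly large or exactly zero" behaviour the abstract describes.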

2.
Yun   《Computers & Security》2005,24(8):662-674
Although researchers have long studied using statistical modeling techniques to detect anomaly intrusion and profile user behavior, the feasibility of applying multinomial logistic regression modeling to predict multi-attack types has not been addressed, and the risk factors associated with individual major attacks remain unclear. To address the gaps, this study used the KDD-cup 1999 data and bootstrap simulation method to fit 3000 multinomial logistic regression models with the most frequent attack types (probe, DoS, U2R, and R2L) as an unordered independent variable, and identified 13 risk factors that are statistically significantly associated with these attacks. These risk factors were then used to construct a final multinomial model that had an ROC area of 0.99 for detecting abnormal events. Compared with the top KDD-cup 1999 winning results that were based on a rule-based decision tree algorithm, the multinomial logistic model-based classification results had similar sensitivity values in detecting normal (98.3% vs. 99.5%), probe (85.6% vs. 83.3%), and DoS (97.2% vs. 97.1%); remarkably high sensitivity in U2R (25.9% vs. 13.2%) and R2L (11.2% vs. 8.4%); and a significantly lower overall misclassification rate (18.9% vs. 35.7%). The study emphasizes that the multinomial logistic regression modeling technique with the 13 risk factors provides a robust approach to detect anomaly intrusion.

3.
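In pattern classification and recognition problems involving large-scale complex data, the vast majority of classifiers run into the thorny problem of the curse of dimensionality. Nonlinear dimensionality reduction based on supervised manifold learning, applied before high-dimensional data are classified, offers an effective solution. Multinomial logistic regression is used for classification and prediction, and is combined with unsupervised manifold learning based on nonlinear dimensionality reduction to classify both image and non-image data, yielding a new classification and recognition method. Extensive experiments and comparative analysis verify the superiority of the proposed method.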

5.
This paper compares the effectiveness of artificial intelligence (AI) techniques in fault diagnosis of rolling element bearings. The features for classification are extracted through wavelet packet decomposition using the RBIO 5.5 wavelet. The whole classification is done using two features: energy and kurtosis. The data samples for classification are taken with reference to a healthy bearing, thus minimizing the errors from the experimental set-up. Four bearing conditions are used in this paper: outer race defect, inner race defect, ball defect, and combined defect on the outer race, inner race and ball. Localized defects of micron level are induced through laser machining. The effectiveness of three AI techniques, viz. ANN, SVM and multinomial logistic regression, is compared. The results show that logistic regression is more effective than the other two techniques, ANN and SVM.
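The two features named in the abstract above can be sketched as follows. The wavelet packet decomposition step is omitted; the inputs are hypothetical coefficient lists standing in for one sub-band, and the kurtosis used here is the plain (non-excess) fourth standardized moment, which is an assumption since the abstract does not say which definition was used.

```python
def energy(coeffs):
    """Sum of squared coefficients."""
    return sum(c * c for c in coeffs)

def kurtosis(coeffs):
    """Fourth central moment divided by the squared variance (non-excess)."""
    n = len(coeffs)
    mean = sum(coeffs) / n
    m2 = sum((c - mean) ** 2 for c in coeffs) / n
    m4 = sum((c - mean) ** 4 for c in coeffs) / n
    return m4 / (m2 * m2)

# Hypothetical sub-band coefficients: a smooth pattern and an impulsive one.
smooth = [1.0, -1.0, 1.0, -1.0]
impulsive = [0.0, 0.0, 0.0, 4.0]

features_smooth = (energy(smooth), kurtosis(smooth))
features_impulsive = (energy(impulsive), kurtosis(impulsive))
```

An impulsive sequence gives a higher kurtosis than a smooth one, which is why kurtosis is commonly used as an indicator of localized defects.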

6.
《International Journal of Computer Mathematics》2012,89(15):3113-3124
In this paper, we study a more general kernel regression learning algorithm with coefficient regularization. A non-iid setting is considered, where the sequence of probability measures for sampling is not identical but the sequence of marginal distributions for sampling converges exponentially fast in the dual of a Hölder space; the samples z_i, i ≥ 1, are weakly dependent and satisfy a strongly mixing condition. Satisfactory capacity-independent error bounds and learning rates are derived for this learning algorithm using integral operator techniques.

7.
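For image classification with large volumes of data, semi-supervised learning with Laplacian regularization has shown broad application prospects. However, Laplacian regularization biases the classification function toward a constant function, which tends to result in poor extrapolation ability. A logistic regression model based on Hessian regularization is proposed for image classification; Hessian regularization can better predict data points outside the region. Image classification experiments on the MIR Flickr database show that the Hessian-regularized logistic regression model is more effective than SVM, logistic regression, and Laplacian-regularized logistic regression.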

8.
In this article, a novel active learning approach is proposed for the classification of hyperspectral imagery using quasi-Newton multinomial logistic regression/Davidon, Fletcher, and Powell selective variance (MLR-DFP-SV). The proposed approach consists of two main steps: (1) a fast solution for the MLR classifier, where the logistic regressors are obtained by the use of the quasi-Newton algorithm; and (2) selection of the most informative unlabelled samples. The SV method is applied to select the most informative unlabelled samples, based on the posterior density distributions. Experiments on two real hyperspectral data sets confirmed that the proposed approach can effectively select the most informative unlabelled samples and improve the classification accuracy. Three different methods – the maximum information (MI), breaking ties (BT), and minimum error (ME) methods – were also used to obtain the most informative unlabelled samples, and it was found that the new sample selection method – SV – can select more informative samples than the BT, MI, and ME methods.

9.
Multimedia Tools and Applications - The multinomial logistic Gaussian process is a flexible non-parametric model for multi-class classification tasks. These tasks are often involved in solving a...

10.
Some medical and epidemiological surveys have been designed to predict a nominal response variable with several levels. With regard to the type of pregnancy there are four possible states: wanted, unwanted by wife, unwanted by husband and unwanted by couple. In this paper, we predict the type of pregnancy, as well as the factors influencing it, using two different models and compare them. Because the type of pregnancy has several levels, we developed a multinomial logistic regression and a neural network based on the data and compared their results using three statistical indices: sensitivity, specificity and kappa coefficient. Based on these three indices, the neural network proved to be a better fit for prediction on the data than multinomial logistic regression. When the relations among variables are complex, one can use neural networks instead of multinomial logistic regression to predict nominal response variables with several levels in order to gain more accurate predictions.
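The three indices named in the abstract above can be computed from a confusion matrix as sketched below. The matrix values are made up for illustration and are not taken from the study; sensitivity and specificity are computed one class at a time (one-vs-rest), which is an assumption about how they were applied to a response with several levels.

```python
def one_vs_rest_rates(cm, k):
    """Sensitivity and specificity of class k.

    cm[i][j] = number of cases with true class i predicted as class j.
    """
    total = sum(sum(row) for row in cm)
    tp = cm[k][k]
    fn = sum(cm[k]) - tp                 # class k cases predicted as another class
    fp = sum(row[k] for row in cm) - tp  # other cases predicted as class k
    tn = total - tp - fn - fp
    return tp / (tp + fn), tn / (tn + fp)

def cohen_kappa(cm):
    """Agreement between true and predicted classes, corrected for chance."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    observed = sum(cm[i][i] for i in range(n)) / total
    expected = sum(
        sum(cm[i]) * sum(row[i] for row in cm) for i in range(n)
    ) / (total * total)
    return (observed - expected) / (1 - expected)

# Hypothetical 2-class confusion matrix; the functions accept any square matrix.
cm = [[40, 10], [5, 45]]
sensitivity, specificity = one_vs_rest_rates(cm, 0)
kappa = cohen_kappa(cm)
```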

11.
In this paper, we consider the coefficient-based regularized least-squares regression problem with the l^q-regularizer (1≤q≤2) and data dependent hypothesis spaces. Algorithms in data dependent hypothesis spaces perform well with the property of flexibility. We conduct a unified error analysis by a stepping stone technique. An empirical covering number technique is also employed in our study to improve sample error. Compared with existing results, we make a few improvements: First, we obtain a significantly sharper learning rate that can be arbitrarily close to O(m^{-1}) under reasonable conditions, which is regarded as the best learning rate in learning theory. Second, our results cover the case q=1, which is novel. Finally, our results hold under very general conditions.

12.
13.
Technology credit scoring models have been used to screen loan applicant firms based on their technology. Typically a logistic regression model is employed to relate the probability of a firm's loan default to several evaluation attributes associated with technology. However, these attributes are evaluated in linguistic expressions represented by fuzzy numbers. Besides, the possibility of loan default can be described in verbal terms as well. To handle these fuzzy input and output data, we propose a fuzzy credit scoring model that can be applied to predict the default possibility of a loan for a firm that is approved based on its technology. Fuzzy logistic regression is presented in this study as an appropriate prediction approach for credit scoring with fuzzy input and output. The performance of the model is improved compared to that of typical logistic regression. This study is expected to contribute to practical utilization of technology credit scoring with linguistic evaluation attributes.

14.
Kuss and McLerran, in a paper in this journal, provide SAS code for the estimation of multinomial logistic models for correlated data. Their motivation derived from two papers that recommended estimating such models using a Poisson likelihood, which is according to Kuss and McLerran "statistically correct but computationally inefficient". Kuss and McLerran propose several estimating methods. Some of these are based on the fact that the multinomial model is a multivariate binary model. Subsequently a procedure proposed by Wright is exploited to fit the models. In this paper we will show that the new computation methods, based on the approach by Wright, are statistically incorrect because they do not take into account that for multinomial data a multivariate link function is needed. An alternative estimation strategy is proposed using the clustered bootstrap.
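The resampling step of a clustered bootstrap, as mentioned in the abstract above, can be sketched as follows: whole clusters are drawn with replacement so that correlated responses stay together. This is only the resampling step under assumed inputs; the cluster names and response codes are hypothetical, and refitting the multinomial model on each resample is omitted.

```python
import random

def clustered_bootstrap(data_by_cluster, rng):
    """Draw as many clusters as there are in the data, with replacement.

    Each drawn cluster contributes all of its rows, so within-cluster
    correlation is preserved in the resample.
    """
    ids = list(data_by_cluster)
    drawn = rng.choices(ids, k=len(ids))
    return [data_by_cluster[c] for c in drawn]

# Hypothetical data: cluster id -> list of categorical responses.
data_by_cluster = {"c1": [0, 1, 1], "c2": [2, 2], "c3": [0]}
rng = random.Random(0)  # fixed seed so the sketch is reproducible
sample = clustered_bootstrap(data_by_cluster, rng)
```

Repeating this many times and refitting the model on each resample gives an empirical distribution for the estimates.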

15.
《International Journal of Computer Mathematics》2012,89(7):1471-1483
This paper studies the regularized learning algorithm associated with the least-square loss and reproducing kernel Hilbert space. The goal is an error analysis for the regression problem in learning theory. The upper and lower bounds of the error are estimated simultaneously, which yields the optimal learning rate. The upper bound depends on the covering number and the approximation property of the reproducing kernel Hilbert space. The lower bound depends on the entropy number of the set that includes the regression function. Also, the rate is independent of the choice of the index q of the regularization term.

16.
We show how multinomial logistic models with correlated responses can be estimated within SAS software. To achieve this, random effects and marginal models are introduced and the respective SAS code is given. An example data set on physicians' recommendations and preferences in traumatic brain injury rehabilitation is used for illustration. The main motivation for this work is two recent papers that recommend estimating multinomial logistic models with correlated responses by using a Poisson likelihood, which is statistically correct but computationally inefficient.

17.
Multimedia Tools and Applications - Orientation of human body is an important feature that can be used for behavioral analysis in surveillance systems. This cue contains useful information such as...

18.
Sepsis is one of the main causes of death for non-coronary ICU (Intensive Care Unit) patients and has become the 10th most common cause of death in western societies. This is a transversal condition affecting immunocompromised patients, critically ill patients, post-surgery patients, patients with AIDS, and the elderly. In western countries, septic patients account for as much as 25% of ICU bed utilization and the pathology affects 1-2% of all hospitalizations. Its mortality rates range from 12.8% for sepsis to 45.7% for septic shock. The prediction of mortality caused by sepsis is, therefore, a relevant research challenge from a medical viewpoint. The clinical indicators currently in use for this type of prediction have been criticized for their poor prognostic significance. In this study, we redescribe sepsis indicators through latent model-based feature extraction, using factor analysis. These extracted indicators are then applied to the prediction of mortality caused by sepsis. The reported results show that the proposed method improves on the results obtained with the current standard mortality predictor, which is based on the APACHE II score.

19.
We are interested in testing hypotheses that concern the parameter of a logistic regression model. A robust Wald-type test based on a weighted Bianco and Yohai [Bianco, A.M., Yohai, V.J., 1996. Robust estimation in the logistic regression model. In: H. Rieder (Ed.), Robust Statistics, Data Analysis, and Computer Intensive Methods. In: Lecture Notes in Statistics, vol. 109, Springer Verlag, New York, pp. 17–34] estimator, as implemented by Croux and Haesbroeck [Croux, C., Haesbroeck, G., 2003. Implementing the Bianco and Yohai estimator for logistic regression. Computational Statistics and Data Analysis 44, 273–295], is proposed. The asymptotic distribution of the test statistic is derived. We carry out an empirical study to gain further insight into the stability of the p-value. Finally, a Monte Carlo study is performed to investigate the stability of both the level and the power of the test, for different choices of the weight function.

20.