Similar Documents
20 similar documents were found.
1.
We consider the accuracy estimation of a classifier constructed on a given training sample. The naive resubstitution estimate is known to have a downward bias problem. The traditional approach to tackling this bias problem is cross-validation. The bootstrap is another way to bring down the high variability of cross-validation. But a direct comparison of the two estimators, cross-validation and bootstrap, is not fair because the latter estimator requires much heavier computation. We performed an empirical study to compare the .632+ bootstrap estimator with the repeated 10-fold cross-validation and the repeated one-third holdout estimator. All the estimators were set to require about the same amount of computation. In the simulation study, the repeated 10-fold cross-validation estimator was found to have better performance than the .632+ bootstrap estimator when the classifier is highly adaptive to the training sample. We have also found that the .632+ bootstrap estimator suffers from a bias problem for large samples as well as for small samples.
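As a point of reference for the comparison above, the following is a minimal Monte-Carlo sketch of the .632+ bootstrap estimator (Efron and Tibshirani's weighting of resubstitution against the leave-one-out bootstrap). It assumes scikit-learn is available and uses a 3-nearest-neighbour rule as a placeholder classifier; it illustrates the formula only and is not the authors' experimental setup.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def err_632_plus(X, y, clf, B=50, rng=np.random.default_rng(0)):
    X, y = np.asarray(X), np.asarray(y)
    n = len(y)
    clf.fit(X, y)
    pred_full = clf.predict(X)
    err_resub = np.mean(pred_full != y)                       # resubstitution error
    # leave-one-out bootstrap: error on cases left out of each bootstrap sample
    errs, counts = np.zeros(n), np.zeros(n)
    for _ in range(B):
        idx = rng.integers(0, n, n)
        out = np.setdiff1d(np.arange(n), idx)
        if out.size == 0:
            continue
        clf.fit(X[idx], y[idx])
        errs[out] += clf.predict(X[out]) != y[out]
        counts[out] += 1
    err_boot = np.mean(errs[counts > 0] / counts[counts > 0])
    # no-information error rate gamma and relative overfitting rate R
    classes = np.unique(y)
    p = np.array([np.mean(y == c) for c in classes])          # class priors
    q = np.array([np.mean(pred_full == c) for c in classes])  # prediction frequencies
    gamma = np.sum(p * (1.0 - q))
    R = (err_boot - err_resub) / (gamma - err_resub) if gamma > err_resub else 0.0
    w = 0.632 / (1.0 - 0.368 * min(max(R, 0.0), 1.0))
    return (1.0 - w) * err_resub + w * err_boot

# Example: err_632_plus(X, y, KNeighborsClassifier(n_neighbors=3))
```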

2.
In the problem of binary classification (or medical diagnosis), the classification rule (or diagnostic test) produces a continuous decision variable which is compared to a critical value (or threshold). Test values above (or below) that threshold are called positive (or negative) for disease. The two types of errors associated with every threshold value are Type I (false positive) and Type II (false negative) errors. The Receiver Operating Characteristic (ROC) curve describes the relationship between the probabilities of these two types of errors. The inverse problem is considered; i.e., given the ROC curve (or its estimate) of a particular classification rule, one is interested in finding the value of the threshold ξ that leads to a specific operating point on that curve. A nonparametric method for estimating the threshold is proposed, and the asymptotic distribution of the proposed estimator is derived. Results from simulated data and real-world data are presented for finite sample sizes. Finding a particular threshold value is crucial in medical diagnosis, among other fields, where a medical test is used to classify a patient as “diseased” or “nondiseased” based on comparing the test result to a particular threshold value. When the ROC is estimated, an operating point is obtained by fixing the probability of one type of error and obtaining the other from the estimated curve. Threshold estimation can then be viewed as quantile estimation for one distribution, with the second distribution used to fix the operating point.
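A minimal sketch of the quantile view described in this abstract: for a target false-positive rate alpha, the threshold is estimated as the empirical (1 − alpha) quantile of the non-diseased scores, and the corresponding sensitivity is read off from the diseased scores. The function names and the commented example data are illustrative, not the authors'.

```python
import numpy as np

def threshold_at_fpr(neg_scores, alpha):
    """Estimate the cut-off xi such that P(score > xi | non-diseased) is about alpha."""
    return np.quantile(np.asarray(neg_scores), 1.0 - alpha)

def sensitivity_at_threshold(pos_scores, xi):
    """Plug the estimated threshold into the diseased group to read off the TPR."""
    return np.mean(np.asarray(pos_scores) > xi)

# rng = np.random.default_rng(1)
# neg, pos = rng.normal(0, 1, 500), rng.normal(1.5, 1, 500)
# xi = threshold_at_fpr(neg, 0.05)
# print(xi, sensitivity_at_threshold(pos, xi))
```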

3.
Many safety related and critical systems warn of potentially dangerous events; for example, the short term conflict alert (STCA) system warns of airspace infractions between aircraft. Although installed with current technology, such critical systems may become out of date due to changes in the circumstances in which they function, operational procedures, and the regulatory environment. Current practice is to "tune," by hand, the many parameters governing the system in order to optimize the operating point in terms of the true positive and false positive rates, which are frequently associated with highly imbalanced costs. We cast the tuning of critical systems as a multiobjective optimization problem. We show how a region of the optimal receiver operating characteristic (ROC) curve may be obtained, permitting the system operators to select the operating point. We apply this methodology to the STCA system, using a multiobjective (1+1) evolution strategy, showing that we can improve upon the current hand-tuned operating point, as well as providing the salient ROC curve describing the true positive versus false positive tradeoff. We also provide results for three-objective optimization of the alert response time in addition to the true and false positive rates. Additionally, we illustrate the use of bootstrapping for representing evaluation uncertainty on estimated Pareto fronts, where the evaluation of a system is based upon a finite set of representative data.
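A toy sketch of the optimization loop described above: a (1+1) evolution strategy mutates the alert system's parameter vector while an archive keeps the non-dominated (false positive rate, true positive rate) operating points found so far. `evaluate` is a hypothetical stand-in for running the alert system against reference data; nothing here is specific to STCA.

```python
import numpy as np

def dominates(a, b):
    # a, b are (fpr, tpr) tuples; lower FPR and higher TPR are better
    return a != b and a[0] <= b[0] and a[1] >= b[1]

def pareto_front_es(evaluate, theta0, steps=1000, sigma=0.1, rng=np.random.default_rng(0)):
    """evaluate(theta) must return an (fpr, tpr) tuple for parameter vector theta."""
    theta = np.asarray(theta0, float)
    archive = [(evaluate(theta), theta.copy())]
    for _ in range(steps):
        cand = theta + sigma * rng.standard_normal(theta.shape)   # (1+1)-ES mutation
        perf = evaluate(cand)
        if not any(dominates(p, perf) for p, _ in archive):       # candidate is non-dominated
            archive = [(p, t) for p, t in archive if not dominates(perf, p)]
            archive.append((perf, cand.copy()))
            theta = cand                                          # accept the move
    return archive                                                # approximate ROC front
```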

4.
Discrete classification problems abound in pattern recognition and data mining applications. One of the most common discrete rules is the discrete histogram rule. This paper presents exact formulas for the computation of the bias, variance, and RMS of the resubstitution and leave-one-out error estimators for the discrete histogram rule. We also describe an algorithm to compute the exact probability distribution of resubstitution and leave-one-out, as well as their deviations from the true error rate. Using a parametric Zipf model, we compute the exact performance of resubstitution and leave-one-out for varying expected true error, number of samples, and classifier complexity (number of bins). We compare this to approximate performance measures, computed by Monte-Carlo sampling, for 10-repeated 4-fold cross-validation and the 0.632 bootstrap error estimator. Our results show that resubstitution is low-biased but much less variable than leave-one-out, and is effectively the superior error estimator of the two, provided classifier complexity is low. Our results also indicate that the overall performance of resubstitution, as measured by the RMS, can be substantially better than that of the 10-repeated 4-fold cross-validation estimator, and even comparable to the 0.632 bootstrap estimator, provided that classifier complexity is low and the expected error rates are moderate. In addition to the results discussed in the paper, we provide an extensive set of plots that can be accessed on a companion website at http://ee.tamu.edu/edward/exact_discrete.
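An illustrative sketch, not the paper's exact formulas, of the discrete histogram rule and the two error counts it analyses: each bin is labelled by majority vote over the sample, resubstitution counts errors against those labels, and leave-one-out removes the case itself before re-taking the vote in its bin.

```python
import numpy as np

def histogram_rule_errors(x, y, b):
    """x: bin indices in {0, ..., b-1}; y: labels in {0, 1}; b: number of bins."""
    x, y = np.asarray(x), np.asarray(y)
    n = len(y)
    counts = np.zeros((b, 2), dtype=int)
    for xi, yi in zip(x, y):
        counts[xi, yi] += 1
    # resubstitution: classify each bin by majority vote over the full sample
    pred = (counts[:, 1] > counts[:, 0]).astype(int)
    resub = np.mean(pred[x] != y)
    # leave-one-out: remove the point itself before taking its bin's majority
    loo_errors = 0
    for xi, yi in zip(x, y):
        c = counts[xi].copy()
        c[yi] -= 1
        loo_errors += int((1 if c[1] > c[0] else 0) != yi)
    return resub, loo_errors / n
```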

5.
Executing different fingerprint-image matching algorithms on large data sets reveals that the match and non-match similarity scores have no specific underlying distribution function. A nonparametric analysis of fingerprint-image matching algorithms on large data sets is therefore required, without any assumption about such irregularly discrete distribution functions. A precise receiver operating characteristic (ROC) curve based on the true accept rate (TAR) of the match similarity scores and the false accept rate (FAR) of the non-match similarity scores can be constructed. The area under such an ROC curve computed using the trapezoidal rule is equivalent to the Mann-Whitney statistic formed directly from the match and non-match similarity scores. Thereafter, the Z statistic formulated using the areas under the ROC curves, along with their variances and the correlation coefficient, is applied to test the significance of the difference between two ROC curves. Four examples from the extensive testing of commercial fingerprint systems at the National Institute of Standards and Technology are provided. The nonparametric approach presented in this article can also be employed in the analysis of other large biometric data sets.
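A short sketch of the equivalence this abstract relies on: the trapezoidal area under the empirical ROC equals the Mann-Whitney statistic computed directly from the match and non-match similarity scores (with ties counted as one half).

```python
import numpy as np

def auc_mann_whitney(match, nonmatch):
    """P(match score > non-match score) + 0.5 * P(tie), averaged over all score pairs."""
    m = np.asarray(match, float)[:, None]
    n = np.asarray(nonmatch, float)[None, :]
    return np.mean((m > n) + 0.5 * (m == n))

# With TAR(t) = P(match > t) and FAR(t) = P(non-match > t), integrating TAR over
# FAR with the trapezoidal rule across all thresholds gives the same value.
```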

6.
In this paper, we propose a novel algorithm for rule extraction from support vector machines (SVMs), termed SQRex-SVM. The proposed method extracts rules directly from the support vectors (SVs) of a trained SVM using a modified sequential covering algorithm. Rules are generated based on an ordered search of the most discriminative features, as measured by interclass separation. Rule performance is then evaluated using measured rates of true and false positives and the area under the receiver operating characteristic (ROC) curve (AUC). Results are presented on a number of commonly used data sets and show that the rules produced by SQRex-SVM exhibit both improved generalization performance and smaller, more comprehensible rule sets compared to both other SVM rule extraction techniques and direct rule learning techniques.

8.
We propose a general method for error estimation that displays low variance and generally low bias as well. This method is based on “bolstering” the original empirical distribution of the data. It has a direct geometric interpretation and can be easily applied to any classification rule and any number of classes. This method can be used to improve the performance of any error-counting estimation method, such as resubstitution and all cross-validation estimators, particularly in small-sample settings. We point out some similarities shared by our method with a previously proposed technique, known as smoothed error estimation. In some important cases, such as a linear classification rule with a Gaussian bolstering kernel, the integrals in the bolstered error estimate can be computed exactly. In the general case, the bolstered error estimate may be computed by Monte-Carlo sampling; however, our experiments show that a very small number of Monte-Carlo samples is needed. This results in a fast error estimator, in contrast to other resampling techniques such as the bootstrap. We provide an extensive simulation study comparing the proposed method with resubstitution, cross-validation, and bootstrap error estimation, for three popular classification rules (linear discriminant analysis, k-nearest-neighbor, and decision trees), using several sample sizes, from small to moderate. The results indicate that the proposed method vastly improves on resubstitution and cross-validation, especially for small samples, in terms of bias and variance. In that respect, it is competitive with, and on many occasions superior to, bootstrap error estimation, while being tens to hundreds of times faster. We provide a companion web site, which contains (1) the complete set of tables and plots regarding the simulation study, and (2) the C source code used to implement the bolstered error estimators proposed in this paper, as part of a larger library for classification and error estimation, with full documentation and examples. The companion web site can be accessed at http://ee.tamu.edu/~edward/bolster.
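A Monte-Carlo sketch of bolstered resubstitution as described above: each training point is replaced by a Gaussian bolstering kernel centred on it, and the error is averaged over samples drawn from those kernels. The kernel width sigma is a placeholder here, not the paper's kernel-size estimator.

```python
import numpy as np

def bolstered_resub(X, y, predict, sigma=0.5, mc=25, rng=np.random.default_rng(0)):
    """predict: a fitted classifier's prediction function; mc: Monte-Carlo samples per point."""
    X, y = np.asarray(X, float), np.asarray(y)
    err = 0.0
    for xi, yi in zip(X, y):
        samples = xi + sigma * rng.standard_normal((mc, X.shape[1]))  # draw from the kernel
        err += np.mean(predict(samples) != yi)
    return err / len(y)

# Example with a 3-NN rule, assuming scikit-learn is available:
# from sklearn.neighbors import KNeighborsClassifier
# clf = KNeighborsClassifier(3).fit(X, y)
# print(bolstered_resub(X, y, clf.predict))
```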

9.
Given a large set of potential features, it is usually necessary to find a small subset with which to classify. The task of finding an optimal feature set is inherently combinatorial, and therefore suboptimal algorithms are typically used to find feature sets. If feature selection is based directly on classification error, then a feature-selection algorithm must base its decisions on error estimates. This paper addresses the impact of error estimation on feature selection using two performance measures: comparison of the true error of the optimal feature set with the true error of the feature set found by a feature-selection algorithm, and the number of features among the truly optimal feature set that appear in the feature set found by the algorithm. The study considers seven error estimators applied to three standard suboptimal feature-selection algorithms and exhaustive search, and it considers three different feature-label model distributions. It draws two conclusions for the cases considered: (1) depending on the sample size and the classification rule, feature-selection algorithms can produce feature sets whose corresponding classifiers possess errors far in excess of the classifier corresponding to the optimal feature set; and (2) for small samples, differences in performance among the feature-selection algorithms are less significant than performance differences among the error estimators used to implement the algorithms. Moreover, keeping in mind that results depend on the particular classifier-distribution pair, for the error estimators considered in this study, bootstrap and bolstered resubstitution usually outperform cross-validation, and bolstered resubstitution usually performs as well as or better than bootstrap.
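A minimal sketch of the setting studied above: sequential forward selection driven by an error estimate. Swapping a different estimator (resubstitution, bootstrap, bolstered resubstitution) into `estimate_error` changes which feature set the algorithm settles on; cross-validation and linear discriminant analysis are used here only as placeholders, and scikit-learn is assumed available.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

def estimate_error(X, y, features):
    # placeholder error estimator: 5-fold cross-validation error of an LDA classifier
    clf = LinearDiscriminantAnalysis()
    return 1.0 - cross_val_score(clf, X[:, features], y, cv=5).mean()

def forward_selection(X, y, d):
    """Greedily grow a feature set of size d by minimising the estimated error."""
    selected, remaining = [], list(range(X.shape[1]))
    while len(selected) < d:
        best_err, best_f = min((estimate_error(X, y, selected + [f]), f) for f in remaining)
        selected.append(best_f)
        remaining.remove(best_f)
    return selected
```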

10.
Different types of error rates are described, and the state of the art of error rate estimation at the time of Toussaint's (1974) survey is briefly summarised. Developments since then are outlined, and the two major advances, namely bootstrap and average conditional error rate estimation methods, and their extensions, are described in detail.

11.
12.
We derive pointwise exact bootstrap distributions of ROC curves and of the difference between ROC curves for threshold and vertical averaging. From these distributions, pointwise confidence intervals are derived and their performance is measured in terms of coverage accuracy. Improvements over techniques currently in use are obtained, in particular at the extremes of ROC curves, where we show that the drastic falls in coverage accuracy typically seen there can be avoided.
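A sketch of a pointwise vertical-averaging bootstrap for the ROC, in the spirit of the abstract above but using plain Monte-Carlo resampling rather than the exact distributions derived in the paper: cases are resampled within each class, the TPR is recorded on a fixed FPR grid, and percentile intervals are taken pointwise.

```python
import numpy as np

def roc_tpr_at_fpr(pos, neg, fpr_grid):
    thresholds = np.quantile(neg, 1.0 - np.asarray(fpr_grid))   # thresholds giving each FPR
    return np.array([np.mean(pos > t) for t in thresholds])

def bootstrap_tpr_band(pos, neg, fpr_grid, B=2000, level=0.95, rng=np.random.default_rng(0)):
    pos, neg = np.asarray(pos, float), np.asarray(neg, float)
    tprs = np.array([roc_tpr_at_fpr(rng.choice(pos, len(pos)),
                                    rng.choice(neg, len(neg)), fpr_grid)
                     for _ in range(B)])
    lower, upper = np.quantile(tprs, [(1 - level) / 2, (1 + level) / 2], axis=0)
    return lower, upper          # pointwise percentile confidence band for the TPR
```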

13.
This paper is concerned with statistical inference for the coefficient of the linear regression model when the error term follows an autoregressive (AR) model. Past studies have reported severe size distortions when the data are trending and the autocorrelation of the error term is high. In this paper, we consider a test based on the bias-corrected bootstrap, where bias-corrected parameter estimators for the AR and regression coefficients are used. For bias correction, the jackknife and bootstrap methods are employed. Monte Carlo simulations are conducted to compare the size and power properties of the bias-corrected bootstrap test. It is found that the bias-corrected bootstrap test shows substantially improved size properties and exhibits excellent power for most of the cases considered. It also appears that bootstrap bias correction leads to better size and higher power values than jackknife bias correction. These results are found to be robust to the choice of parameter estimation methods. JEL classifications: C12, C15, C63
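A sketch of the bootstrap bias-correction step alone, applied to an AR(1) coefficient estimated from regression residuals: the bias of the least-squares estimate is approximated by resampling from the fitted AR(1) model and subtracted off. The full test (resampling from the corrected model to obtain critical values) and the jackknife variant are omitted; this is an illustration, not the paper's procedure.

```python
import numpy as np

def ar1_coef(e):
    e = np.asarray(e, float)
    return np.sum(e[1:] * e[:-1]) / np.sum(e[:-1] ** 2)

def bootstrap_bias_corrected_ar1(e, B=500, rng=np.random.default_rng(0)):
    e = np.asarray(e, float)
    n, rho = len(e), ar1_coef(e)
    innov = e[1:] - rho * e[:-1]
    innov = innov - innov.mean()                     # centred residual innovations
    boot = np.empty(B)
    for b in range(B):
        eps = rng.choice(innov, size=n, replace=True)
        sim = np.zeros(n)
        for t in range(1, n):                        # simulate from the fitted AR(1)
            sim[t] = rho * sim[t - 1] + eps[t]
        boot[b] = ar1_coef(sim)
    return rho - (boot.mean() - rho)                 # estimate minus estimated bias
```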

14.
Glenn, Richard R., and Suresh, 《Computers & Security》, 2006, 25(8): 600-615
Network Denial-of-Service (DoS) attacks that disable network services by flooding them with spurious packets are on the rise. Criminals with large networks (botnets) of compromised nodes (zombies) use the threat of DoS attacks to extort legitimate companies. To fight these threats and ensure network reliability, early detection of these attacks is critical. Many methods have been developed with limited success to date. This paper presents an approach that identifies change points in the time series of network packet arrival rates. The proposed process has two stages: (i) statistical analysis that finds the rate of increase of network traffic, and (ii) wavelet analysis of the network statistics that quickly detects the sudden increases in packet arrival rates characteristic of botnet attacks. Most intrusion detection methods are tested using data sets from special security testing configurations, which leads to unacceptable false positive rates when they are used in the real world. We test our approach using data from both network simulations and a large operational network. The true and false positive detection rates are determined for both data sets, and receiver operating curves based on these rates are used to find optimal parameters for our approach. Evaluation using operational data demonstrates the effectiveness of our approach.
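A toy sketch of the second stage described above: a Haar wavelet detail of a packet-arrival-rate series highlights abrupt jumps, and a robust threshold on that detail signal flags candidate change points. The window and threshold choices are illustrative, not those of the paper.

```python
import numpy as np

def haar_detail(rate):
    r = np.asarray(rate, float)
    if len(r) % 2:                                   # pad to an even length
        r = np.append(r, r[-1])
    return (r[1::2] - r[0::2]) / np.sqrt(2.0)        # level-1 Haar detail coefficients

def flag_rate_jumps(rate, k=5.0):
    d = haar_detail(rate)
    cut = k * np.median(np.abs(d)) / 0.6745          # MAD-based robust threshold
    return np.where(d > cut)[0] * 2                  # indices (original scale) of sudden increases

# rates = np.r_[np.random.poisson(100, 200), np.random.poisson(400, 50)]
# print(flag_rate_jumps(rates))
```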

15.
Standard errors for bagged and random forest estimators
Bagging and random forests are widely used ensemble methods. Each forms an ensemble of models by randomly perturbing the fitting of a base learner. Estimation of the standard error of the resulting regression function is considered, and three estimators are discussed. One, based on the jackknife, is applicable to bagged estimators and can be computed using the bagged ensemble itself. The other two estimators target the bootstrap standard error estimator and require fitting multiple ensemble estimators, one for each bootstrap sample. It is shown that these bootstrap ensemble sizes can be small, which reduces the computation involved in forming the estimator. The estimators are studied using both simulated and real data.
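A sketch of the jackknife-after-bagging idea mentioned above: the standard error at a query point is estimated from the ensemble itself by comparing, for each training case i, the average prediction of the models whose bootstrap samples happened to exclude i with the overall bagged prediction. Finite-ensemble bias corrections are omitted.

```python
import numpy as np

def jackknife_after_bagging_var(preds, inbag):
    """preds: (B,) predictions of each bagged model at one query point.
    inbag: (B, n) counts of how often each training case entered each bootstrap sample.
    The ensemble should be large enough that every case is excluded by some sample."""
    preds = np.asarray(preds, float)
    inbag = np.asarray(inbag)
    B, n = inbag.shape
    t_bar = preds.mean()
    t_bar_minus_i = np.array([preds[inbag[:, i] == 0].mean() for i in range(n)])
    return (n - 1) / n * np.sum((t_bar_minus_i - t_bar) ** 2)   # jackknife variance estimate
```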

16.
When traditional methods are applied to the detection of high-dimensional sparse data, perturbation of the high-dimensional features leads to large data errors; a deep-learning-based combination recommendation algorithm for high-dimensional sparse data is therefore proposed. Phase-space reconstruction is used to rebuild the features of the high-dimensional sparse data; based on the reconstruction, nonlinear statistical sequence analysis is applied for regression analysis and point-cloud structure reorganization of the data, on the basis of which the combined feature quantities of the high-dimensional sparse data are extracted. From these features, feature-extraction techniques are used to obtain the average mutual-information feature quantities, and association-rule mining is combined with principal component analysis to mine the similarity attribute-category components of the data. Finally, deep learning is used for adaptive optimization in the combination recommendation process, realizing combination recommendation of high-dimensional sparse data. Simulation results show that the proposed algorithm gives good attribute-category discrimination and strong feature-resolution ability when recommending high-dimensional sparse data, improving the detection and recognition of the data.

17.
This article outlines a Bayesian bootstrap method for case-based imprecision estimates in Bayes classification. We argue that this approach is an important complement to methods such as k-fold cross-validation that are based on overall error rates. It is shown how case-based imprecision estimates may be used to improve Bayes classifiers under asymmetrical loss functions. In addition, other approaches to making use of case-based imprecision estimates are discussed and illustrated on two real-world data sets. Contrary to the common assumption, Bayesian bootstrap simulations indicate that the uncertainty associated with the output of a Bayes classifier is often far from normally distributed.
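A minimal sketch of a case-based imprecision estimate via the Bayesian bootstrap (Rubin's Dirichlet weighting of the training cases): each weight draw induces a different estimate of the class posterior for a query case, and the spread of those estimates is the imprecision. A single discrete feature is used purely for illustration; this is not the article's classifier.

```python
import numpy as np

def posterior_draws(x_train, y_train, x_query, draws=2000, rng=np.random.default_rng(0)):
    """Draws from the Bayesian-bootstrap distribution of P(y = 1 | x = x_query).
    Assumes at least one training case matches x_query."""
    x_train, y_train = np.asarray(x_train), np.asarray(y_train)
    match = x_train == x_query
    w = rng.dirichlet(np.ones(len(y_train)), size=draws)   # Bayesian bootstrap case weights
    numerator = (w * (match & (y_train == 1))).sum(axis=1)
    denominator = (w * match).sum(axis=1)
    return numerator / denominator

# p = posterior_draws(x, y, x_query)
# print(p.mean(), np.quantile(p, [0.05, 0.95]))   # point estimate and imprecision interval
```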

18.
Vessel structures such as the retinal vasculature are important features for computer-aided diagnosis. In this paper, a probabilistic tracking method is proposed to detect blood vessels in retinal images. During the tracking process, vessel edge points are detected iteratively using local grey-level statistics and the vessel's continuity properties. At a given step, a statistical sampling scheme is adopted to select a number of vessel edge point candidates in a local study area. Local cross-sectional vessel intensity profiles are modelled by a Gaussian-shaped curve. A Bayesian method with the maximum a posteriori (MAP) probability criterion is then used to identify the local vessel structure and select the edge points from these candidates. Evaluation is performed on both simulated vascular images and real retinal images. Different geometric shapes and noise levels are used for the computer-simulated images, whereas the real retinal images are taken from the REVIEW database. Performance is evaluated using the Segmentation Matching Factor (SMF) as a quality measure. Our approach performed better than Sun's and Chaudhuri's methods. ROC curves are also plotted, showing effective detection of retinal blood vessels (true positive rate) with fewer false detections (false positive rate) than Sun's method.

19.
Combat identification is one example where incorrect automatic target recognition (ATR) output labels may have substantial decision costs. For example, incorrectly labeling hostile targets vs. friendly non-targets may have high costs, yet these costs are difficult to quantify. One way to increase decision confidence is through fusion of data from multiple sources or from multiple looks through time. Numerous methods have been published to determine a Bayes-optimal fusion decision when decision costs are well known. This research presents a novel mathematical programming ATR evaluation framework. A new objective function inclusive of time is introduced to optimize and compare ATR systems. Constraints are developed to enforce both decision maker preferences and traditional engineering measures of performance. This research merges rejection and receiver operating characteristic (ROC) analysis by incorporating rejection and ROC thresholds as decision variables. The rejection thresholds specify non-declaration regions, while the ROC thresholds explore viable true positive and false positive tradeoffs for output target labels. This methodology yields an optimal ATR system subject to decision maker constraints without using explicit costs for each type of output decision. A sample application is included for the fusion of two channels of collected polarized radar data for 10 different ground targets. A Boolean logic fusion method and a probabilistic neural network fusion method are optimized and compared. Sensitivity analysis of significant performance parameters then reveals preferred regions for each of the fusion algorithms.

20.
Receiver Operating Characteristic (ROC) analysis is one of the most popular tools for the visual assessment and understanding of classifier performance. In this paper we present a new representation of regression models in the so-called regression ROC (RROC) space. The basic idea is to plot over-estimation against under-estimation. The curves are drawn simply by adjusting a shift, a constant that is added to (or subtracted from) the predictions and plays a role similar to that of a threshold in classification. From here, we develop the notions of optimal operating condition, convexity, and dominance, and explore several evaluation metrics that can be shown graphically, such as the area over the RROC curve (AOC). In particular, we show a novel and significant result: the AOC is equivalent to the error variance. We illustrate the application of RROC curves to resource estimation, namely the estimation of software project effort.
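A short sketch of an RROC curve as described above: for each shift s added to the predictions, total over-estimation is plotted against total under-estimation; the abstract's result is that the area over the curve (AOC) is equivalent to the error variance.

```python
import numpy as np

def rroc_points(y_true, y_pred, shifts):
    """Return (over, under) totals for each shift; over >= 0 on the x-axis, under <= 0 on the y-axis."""
    e = np.asarray(y_pred, float) - np.asarray(y_true, float)
    over = np.array([np.sum(np.maximum(e + s, 0.0)) for s in shifts])
    under = np.array([np.sum(np.minimum(e + s, 0.0)) for s in shifts])
    return over, under

# shifts = np.linspace(-10, 10, 201)
# over, under = rroc_points(y_true, y_pred, shifts)   # plotting under against over traces the RROC curve
```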
