Similar literature
1.
A need exists for an unbiased measure of the accuracy of feed-forward neural networks used for classification. Receiver operating characteristic (ROC) analysis is suited for this measure, and has been used to assess the performance of several different network weight sets. The area under an ROC curve and its standard error were used to compare different network weight sets, and to follow the performance of a network during the course of training. The ROC is not sensitive to the prior probabilities of examples in the testing set nor to the system's decision bias. The area under an ROC curve is a readily understood measure, and should be used to evaluate neural networks and to report results of learning experiments. Examples are provided from experiments with data from the biotechnology domain.
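As a rough illustration of the quantity discussed in this abstract, the sketch below computes the area under the ROC curve as the Mann-Whitney statistic and attaches a standard error using the Hanley-McNeil approximation. The abstract does not say which standard-error estimator the authors used, so that choice, the function name `auc_with_standard_error`, and the toy scores are all assumptions made for illustration.

```python
import math


def auc_with_standard_error(pos_scores, neg_scores):
    """Return (AUC, standard error) for two lists of classifier scores.

    AUC is the fraction of (positive, negative) pairs ranked correctly,
    counting ties as half. The standard error follows the Hanley-McNeil
    approximation, which is an assumption here, not the paper's stated method.
    """
    n_pos, n_neg = len(pos_scores), len(neg_scores)
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    auc = wins / (n_pos * n_neg)

    # Hanley-McNeil terms Q1 and Q2.
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc * auc / (1.0 + auc)
    variance = (
        auc * (1.0 - auc)
        + (n_pos - 1) * (q1 - auc * auc)
        + (n_neg - 1) * (q2 - auc * auc)
    ) / (n_pos * n_neg)
    return auc, math.sqrt(max(variance, 0.0))


if __name__ == "__main__":
    # Hypothetical network outputs for positive and negative test examples.
    auc, se = auc_with_standard_error([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])
    print(f"AUC = {auc:.3f}, SE = {se:.3f}")
```

Because the statistic depends only on how positives rank against negatives, changing the class proportions in the test set or shifting the decision threshold leaves it unchanged, which matches the insensitivity the abstract describes.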

2.
In this paper, a measure of competence based on random classification (MCR) for classifier ensembles is presented. The measure selects dynamically (i.e. for each test example) a subset of classifiers from the ensemble that perform better than a random classifier. Therefore, weak (incompetent) classifiers that would adversely affect the performance of a classification system are eliminated. When all classifiers in the ensemble are evaluated as incompetent, the classification accuracy of the system can be increased by using the random classifier instead. Theoretical justification for using the measure with the majority voting rule is given. Two MCR based systems were developed and their performance was compared against six multiple classifier systems using data sets taken from the UCI Machine Learning Repository and Ludmila Kuncheva Collection. The systems developed typically had the highest classification accuracies regardless of the ensemble type used (homogeneous or heterogeneous).

3.
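The Artificial Immune Recognition System (AIRS) is a well-known immune-network classifier that has been applied successfully to many classification problems with good performance. To analyze how the choice of distance measure affects AIRS, the algorithm was implemented with three distance measures, namely Euclidean distance, Manhattan distance, and RBF kernel-space distance, and the three resulting variants were tested on the Iris, Heart, and Wine data sets. Classification accuracy and antibody population size were compared across the three data sets. The results show that AIRS with Manhattan distance achieved the highest classification accuracy on Iris and Heart, whereas AIRS with kernel-space distance achieved the highest accuracy on Wine. In terms of antibody population size, the kernel-space distance yielded the smallest population. The comparison indicates that the choice of distance measure has a considerable influence on the classification performance of AIRS.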
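The three distance measures named in this abstract can be sketched as follows. The kernel-space distance uses the standard identity d(x, y)^2 = k(x, x) - 2 k(x, y) + k(y, y) with an RBF kernel k(x, y) = exp(-gamma * ||x - y||^2). The function names and the `gamma` parameter are assumptions for illustration, and the abstract does not state how the authors parameterized the kernel.

```python
import math


def euclidean(x, y):
    """Straight-line distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def manhattan(x, y):
    """Sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(x, y))


def rbf_kernel_distance(x, y, gamma=1.0):
    """Distance between x and y in the feature space induced by an RBF kernel.

    With k(x, x) = k(y, y) = 1 the squared distance reduces to 2 - 2 k(x, y).
    gamma is a hypothetical kernel width parameter.
    """
    squared = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.sqrt(max(0.0, 2.0 - 2.0 * math.exp(-gamma * squared)))


if __name__ == "__main__":
    a, b = [0.0, 0.0], [3.0, 4.0]
    print(euclidean(a, b), manhattan(a, b), rbf_kernel_distance(a, b, gamma=0.1))
```

Unlike the other two, the kernel-space distance is bounded above by sqrt(2), so very distant points all look roughly equally far apart; this is one plausible reason the measures could behave differently across data sets.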

4.
Change detection based on the comparison of independently classified images (i.e. post-classification comparison) is well-known to be negatively affected by classification errors of individual maps. Incorporating spatial-temporal contextual information in the classification helps to reduce the classification errors, thus improving change detection results. In this paper, spatial-temporal Markov Random Fields (MRF) models were used to integrate spatial-temporal information with spectral information for multi-temporal classification in an attempt to mitigate the impacts of classification errors on change detection. One important component in spatial-temporal MRF models is the specification of transition probabilities. Traditionally, a global transition probability model is used that assumes spatial stationarity of transition probabilities across an image scene, which may be invalid if areas have varying transition probabilities. By relaxing the stationarity assumption, we developed two local transition probability models to make the transition model locally adaptive to spatially varying transition probabilities. The first model called locally adjusted global transition model adapts to the local variation by multiplying a pixel-wise probability of change with the global transition model. The second model called pixel-wise transition model was developed as a fully local model based on the estimation of the pixel-wise joint probabilities. When applied to the forest change detection in Paraguay, the two local models showed significant improvements in the accuracy of identifying the change from forest to non-forest compared with traditional models. This indicates that the local transition probability models can present temporal information more accurately in change detection algorithms based on spatial-temporal classification of multi-temporal images. 
The comparison between the two local transition models showed that the fully local model better captured the spatial heterogeneity of the transition probabilities and achieved more stable and consistent results over different regions of a large image scene.

5.
Two important factors that impact a classification model’s performance are imbalanced data and unequal misclassification cost consequences. These are especially important considerations for neural network models developed to estimate the posterior probabilities of group membership used in classification decisions. This paper explores the issues of asymmetric misclassification costs and unbalanced group sizes on neural network classification performance using an artificial data approach that is capable of generating more complex datasets than used in prior studies and which adds new insights to the problem and the results. A different performance measure, one capable of directly measuring the consistency of classification performance with the Bayes decision rule, is used. The results show that both asymmetric misclassification costs and imbalanced group sizes have significant effects on neural network classification performance both independently and via interaction effects. These are not always intuitive; they supplement prior findings, and raise issues for the future.

6.
This paper describes how probabilistic methods provide a means to integrate analysis of remotely sensed imagery and geo-information processing. In a case study from southern Spain, geological map units were used to improve land-cover classification from Landsat TM imagery. Overall classification accuracy improved from 76% to 90% (1984) and from 64% to 69% (1995) when using stratification according to geology combined with iterative estimation of prior probabilities. Differences between the two years were mainly due to extremely dry conditions during the 1995 growing season. Per-pixel probabilities of class successions and entropy values calculated from the classification's posterior probability vectors served to quantify uncertainty in a post-classification comparison. It is concluded that iterative estimation of prior probabilities provides a practical approach to improve classification accuracy. Posterior probabilities of class membership provide useful information about the magnitude and spatial distribution of classification uncertainty.
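The two per-pixel uncertainty quantities mentioned in this abstract can be illustrated with a short sketch: Shannon entropy of a posterior probability vector, and the probability of a class succession between two dates. The succession probability is computed here as a simple product, which assumes the two classifications are independent; that assumption, the function names, and the example vectors are not taken from the paper.

```python
import math


def posterior_entropy(posteriors, base=2.0):
    """Shannon entropy of one pixel's posterior probability vector.

    The vector is renormalized first, so unnormalized scores are accepted.
    Zero entropy means the classifier is certain; log(n_classes) is the maximum.
    """
    total = sum(posteriors)
    entropy = 0.0
    for p in posteriors:
        q = p / total
        if q > 0.0:
            entropy -= q * math.log(q, base)
    return entropy


def succession_probability(post_t1, post_t2, class_t1, class_t2):
    """Probability that a pixel was class_t1 at date 1 and class_t2 at date 2.

    Assumes the two dates were classified independently (an illustrative
    simplification, not necessarily the paper's formulation).
    """
    return post_t1[class_t1] * post_t2[class_t2]


if __name__ == "__main__":
    # Hypothetical posteriors over three land-cover classes for one pixel.
    p_1984 = [0.7, 0.2, 0.1]
    p_1995 = [0.3, 0.6, 0.1]
    print(posterior_entropy(p_1984), posterior_entropy(p_1995))
    print(succession_probability(p_1984, p_1995, 0, 1))
```

Mapping the entropy value per pixel gives a picture of where the classification is least certain, which is the kind of spatial uncertainty information the abstract refers to.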

7.
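Terrain classification based on fusion of visible and infrared data (cited 1 time: 0 self-citations, 1 by others)
Gu Yingjie (顾迎节), Jin Zhong (金忠). Computer Engineering (《计算机工程》), 2013, 39(2): 187-191
To address the poor performance of single-sensor terrain classification, a terrain classification method based on the fusion of visible and infrared data is proposed. Features are extracted separately from the visible and infrared images, and a nearest-neighbor classifier and a minimum-distance classifier are used to estimate posterior probabilities. The posterior probabilities obtained from the different features and classifiers are then combined as a weighted sum: the feature weights are computed from divergence, and the classifier weights are determined experimentally. In the minimum-distance posterior estimate, the Mahalanobis distance replaces the Euclidean distance. Experimental results show recognition rates of 99.33% for cement roads and 96.67% for sand roads, both higher than those of comparable methods.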
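The weighted combination of posterior probabilities described in this abstract might look like the sketch below. The weights here are plain hypothetical numbers; the paper derives feature weights from divergence and classifier weights from experiments, neither of which is reproduced. The function name `fuse_posteriors` and the example values are assumptions.

```python
def fuse_posteriors(posterior_sets, weights):
    """Combine several posterior vectors into one by a weighted sum.

    posterior_sets: one posterior vector per (feature, classifier) source,
                    all over the same ordered list of classes.
    weights:        one non-negative weight per source (hypothetical values).
    Returns (index of the winning class, renormalized fused posterior).
    """
    n_classes = len(posterior_sets[0])
    fused = [0.0] * n_classes
    for posterior, weight in zip(posterior_sets, weights):
        for k, p in enumerate(posterior):
            fused[k] += weight * p
    total = sum(fused)
    fused = [value / total for value in fused]
    best = max(range(n_classes), key=fused.__getitem__)
    return best, fused


if __name__ == "__main__":
    # Hypothetical posteriors over (cement road, sand road) from two sources,
    # e.g. a visible-image feature and an infrared-image feature.
    visible = [0.6, 0.4]
    infrared = [0.2, 0.8]
    print(fuse_posteriors([visible, infrared], [0.5, 0.5]))
    print(fuse_posteriors([visible, infrared], [0.9, 0.1]))
```

With equal weights the infrared source dominates and the second class wins; shifting weight toward the visible source flips the decision, which shows why the choice of weights matters.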

8.
A method of classification accuracy evaluation for a cloud and precipitation classifier applied to geostationary meteorological satellite data is presented. The method has been developed to evaluate the accuracy of a rather precise classification algorithm. The algorithm produces nine classes, four of which involve precipitation. The classes are: (1) clear or insignificant cloud, (2) low thin cloud with no rain, (3) low or middle thin cloud with no rain, (4) low or middle thick cloud with no rain, (5) middle or high cloud with no rain, (6) middle or high cloud with the possibility of rain, (7) middle or high cloud with light–moderate precipitation, (8) middle–high cloud with moderate–heavy precipitation, (9) heavy thunderstorm. The evaluation classifier has been tested for its accuracy (ground truth) using comparison between actual meteorological weather reports and classification results derived from the algorithm applied. For the estimation of classification accuracy, the omission/commission method is applied between the observed and the classification-produced values. The classifier used has proved to be very reliable for classifying major cloud types and precipitation, tested during the synoptic situation of depression systems approaching the south Balkan Peninsula from the west. In that synoptic situation, different intensities of rainfall as well as heavy thunderstorm were present, and the results are very satisfactory. The method can be used to evaluate classification results produced by algorithms applied to meteorological satellite data, classifying precipitation areas as well as the heaviness of precipitation.

9.
The incorporation of user-supplied information has become mandatory for the improvement of QoS in network systems. There is the question about accommodation of new users of a service, given that information about former users of a service is available. In the present work, we followed two approaches to derive information about new users in the network design and control processes, where both are based on prototype generation for the answers of former users to a QoS related questionnaire. In the first approach, attempts were made to map user attributes to prototypes. The second approach used a mapping from partial answers to a prototype. As a result, the first approach appeared to be infeasible, while the second showed good results. In the resulting trade-off between number of prototypes and classification accuracy, it is possible, for example, with 8 prototypes for around 1000 users to predict the answers of new users by using only 30% of the answers of former users, while reducing accuracy by only 13% at the same time.

10.
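Cui Minjun (崔敏君), Duan Liguo (段利国), Li Aiping (李爱萍). Computer Science (《计算机科学》), 2016, 43(1): 94-97, 102
Question-answer pairs in social media can supply answers for automatic question-answering systems, but some answers are of low quality, so methods for evaluating answer quality are worth studying. Existing evaluation methods ignore question-category features and apply one uniform evaluation method to all question types. A hierarchical classification model is therefore proposed. First, the question type is analyzed. Then four kinds of features are extracted: textual features, non-textual features, language translatability, and the number of links in the answer. Because the discriminative influence of a feature varies with question type, logistic regression is used to evaluate answer quality separately for each question type, with good experimental results. Finally, the main features affecting answer quality for each question type are analyzed.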

11.
A detailed comparison and assessment of the performance of features extracted from space-borne interferometric SAR data and classified with different types of classifiers is presented. Multi-seasonal ERS-1 and ERS-2 SAR data of the Czech Republic is classified automatically into four different land-cover classes. An exhaustive search in the space of all possible feature subsets out of an overall number of 14 features taken from local statistics, fractal analysis and co-occurrence matrices is presented. The subset performance is evaluated using the Jeffreys-Matusita distance in the feature space and the classification performance measured on a validation set independent from the classifier's training set. Classifiers investigated are maximum-likelihood, fuzzy ARTMAP and multilayer perceptron. The exhaustive search shows the importance and irrelevance of individual features depending on the classifier used. Furthermore, the size of the best subsets ranges from three to six features only, thus decreasing overall computation time. The classifier performance is assessed by measuring overall accuracy and tau statistics. The overall classification accuracy of 88.8% for the maximum-likelihood method and 91.35% for the multilayer perceptron on the validation set is further improved to 90.9% by use of a simple Bayesian context classifier which operates on class likelihoods or to 93.2% by operating on multilayer perceptron outputs.

12.
A modified maximum-likelihood (ML) classifier was applied to increase the accuracy of land-cover classification over a complex mountain landscape. The traditional ML classifier is a robust parametric approach in remote-sensing image classification. However, it is difficult to improve classification accuracy when using the traditional ML classifier in complex landscapes such as mountainous regions. In this study, we demonstrated a modified ML classifier that uses the non-equal prior probabilities derived from digital elevation model (DEM) ancillary data and a Gaussian mixed model (GMM) to delineate land-cover types within forest stands. We designed and compared four experiments using Landsat Thematic Mapper (TM) images covering the Culai Hill region of the eastern territory of China: (1) traditional ML classification with equal prior probability, (2) modified ML classification with non-equal prior probability derived from elevation information, (3) Gaussian mixed classifier (GMC) with equal prior probability, and (4) GMC with non-equal prior probability. Overall, the highest accuracy (80.5%) was obtained using the GMC with variable prior probabilities. The GMC with equal prior probabilities and the ML using non-equal prior probabilities yielded maps with accuracy of 74.7% and 78.0%, respectively, values significantly higher than that obtained using the conventional ML method. This implies that use of modified prior probabilities and GMM analysis has considerable potential to increase the accuracy of land-use and land-cover classification using TM imagery for complex landscapes such as the Culai Hill region.
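The effect of replacing equal priors with non-equal priors in a maximum-likelihood classifier can be shown with a one-dimensional Gaussian sketch. The class names, means, standard deviations, and priors below are invented for illustration; the paper derives its priors from DEM elevation data and works with multispectral TM pixels, neither of which is modeled here.

```python
import math


def gaussian_log_likelihood(x, mean, std):
    """Log density of a univariate normal distribution at x."""
    return -0.5 * math.log(2.0 * math.pi * std * std) - (x - mean) ** 2 / (2.0 * std * std)


def classify(x, class_params, priors):
    """Pick the class maximizing log-likelihood plus log-prior.

    class_params: {label: (mean, std)} per-class Gaussian parameters.
    priors:       {label: prior probability}; equal values give the
                  traditional ML rule, unequal values the modified rule.
    """
    scores = {
        label: gaussian_log_likelihood(x, mean, std) + math.log(priors[label])
        for label, (mean, std) in class_params.items()
    }
    return max(scores, key=scores.get)


if __name__ == "__main__":
    # Hypothetical spectral statistics for two land-cover classes.
    params = {"forest": (0.0, 1.0), "shrub": (2.0, 1.0)}
    pixel = 1.1
    print(classify(pixel, params, {"forest": 0.5, "shrub": 0.5}))
    # Hypothetical priors for an elevation zone where forest dominates.
    print(classify(pixel, params, {"forest": 0.9, "shrub": 0.1}))
```

The same pixel value is assigned to different classes under the two sets of priors, which is the mechanism by which ancillary information such as elevation can change the map.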

13.
Different checkpointing strategies are combined with recovery models of different refinement levels in the database systems. The complexity of the resulting model increases with its accuracy in representing a realistic system. Three different analytic approaches are used depending on the complexity of the model: analytic, numerical and simulation. A Markovian queuing model is developed, resulting in a combined Poisson and load-dependent checkpointing strategy with stochastic recovery. A state-space analysis approach is used to derive semianalytic expressions for the performance variables in terms of a set of unknown boundary state probabilities. An efficient numerical algorithm for evaluating unknown probabilities is outlined. The validity of the numerical solution is checked against simulation results and shown to be of acceptable accuracy, particularly in the stable operating range. Simulations have shown that realistic load-dependent checkpointing results in performance close to the optimal deterministic checkpointing. Furthermore, the stochastic recovery model is an accurate representation of a realistic recovery

14.
Question-answering (QA) models find answers to a given question. The need to find answers automatically is growing, and the task is both important and challenging on large-scale QA data sets. In this paper, we deal with the QA pair matching approach in QA models, which finds the most relevant question and its recommended answer for a given question. Existing studies of this approach search either the entire data set or the data sets within a category that the question writer specifies manually. In contrast, we aim to automatically find the category to which the question belongs by employing a text classification model and to find the answer corresponding to the question within that category. Thanks to the text classification model, we can effectively reduce the search space for finding the answers to a given question. Therefore, the proposed model improves the accuracy of the QA matching model and significantly reduces the model inference time. Furthermore, to improve the performance of finding similar sentences in each category, we present an ensemble embedding model for sentences, which improves performance compared to the individual embedding models. Using real-world QA data sets, we evaluate the performance of the proposed QA matching model. As a result, the accuracy of our final ensemble embedding model based on the text classification model is 81.18%, which outperforms the existing models by 9.81 to 14.16 percentage points. Moreover, in terms of the model inference speed, our model is faster than the existing models by a factor of 2.61 to 5.07 due to the effective reduction of search spaces by the text classification model.

15.
Motor-imagery tasks generate event related synchronization and de-synchronization in certain subject-specific frequency ranges of the subject’s ElectroEncephaloGraphy (EEG) signals. The selection of frequency ranges for each subject is important for obtaining better classification accuracy of motor-imagery based Brain Computer Interface (BCI). Further, the spatial filters extracted corresponding to the selected spectral ranges also influence the classification accuracy. In this paper, a subject-specific spatio-spectral filter selection approach using a cognitive fuzzy inference system for classification of the motor-imagery tasks in a two step approach is presented. The cognitive fuzzy inference system (CFIS) employs an evolving interval type-2 system to classify the non-stationary features. The classifier employs a meta-cognitive sequential algorithm to determine both the structure and parameters of the CFIS. In the first step, the CFIS classifier is used to find the desired spectral filters by eliminating those frequency bands that do not affect the classification performance. In the second step, CFIS is used to eliminate those spatial filters which do not affect the performance. The performance of CFIS based spatio-spectral scheme has been evaluated using two publicly available BCI competition data sets and compared with other existing algorithms like FBCSP, DCSP and BSSFO. The results indicate that the proposed approach outperforms the CSP method by approximately 15–18% and other algorithms like FBCSP, DCSP by 8–10%. Compared to a recently proposed algorithm BSSFO, it achieves an improvement of 2%, but is simpler in comparison to BSSFO. The main impact of the work is its ability to handle non-stationarity using interval type-2 sets and provide good classification performance. In general, the proposed CFIS algorithm can be applied in the field of expert and intelligent systems where it is necessary to deal with non-stationary signals.

16.
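The local characteristics of real industrial process data are generally complex, which hinders feature extraction and limits fault classification accuracy. To address this problem, this paper proposes an integrated local Fisher discriminant analysis (ILFDA) model that mines the local structural features of the data along both the variable and the sample dimensions, improving fault classification performance and reducing modeling difficulty. First, the complex system is divided into blocks according to the structural principles of the process, which captures local information in the variable dimension and excludes the influence of irrelevant variables. Second, to capture local information in the sample dimension, a local Fisher discriminant analysis (LFDA) model is built in each variable sub-block, and each local model is assigned a weight so that the influence of different sub-blocks on the current fault can be measured more accurately. Finally, the classification results of the sub-blocks are fused using a classification-performance weighting strategy. Simulation results on the Tennessee Eastman (TE) process confirm that the proposed ILFDA method achieves better fault classification.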

17.
We study the case of “inspect-all” policy, using off-line quality inspections to prevent non-conforming items from reaching the final consumer, in domains where an item is rejected upon first “failure” classification. Given a set of inspections with known inspection costs and error probabilities of two types (classifying conforming items as non-conforming and vice versa), the goal is to find a sequenced subset of inspections that maximizes the expected overall profit, taking into account the revenue from delivering conforming items, the penalty of delivering non-conforming ones, and the overall cost of the inspections used. Our model allows an additional degree of freedom, in comparison to prior work in this domain, enabling the selection of the inspection sequence along with the selection of which inspections to use. We present an efficient branch and bound algorithm for finding the optimal solution, and two types of heuristics: greedy-based and preliminary sort-based, differing in their accuracy and calculation time. The optimal and heuristic methods are extensively evaluated, using a factorial experimental design that includes 65 610 problem instances. For each instance we compared the methods' performance in terms of reaching optimality, deviation from the optimal solution and calculation time. The results reflect a substantial influence of the sequence over the expected profit. An interesting finding is that the suggested preliminary sort-based heuristics achieve a relatively accurate solution in a reasonable calculation time and outperform the commonly used greedy-based heuristics. The usefulness of the different methods is illustrated using sample problems from the biometric inspection security domain.

18.
In recent years, the application of artificial intelligence methods to credit risk assessment has brought an improvement over classic methods. Small improvements in credit scoring and bankruptcy prediction systems can yield large profits, so any improvement is of great interest to banks and financial institutions. Recent work shows that ensembles of classifiers achieve the best results for this kind of task. This paper extends previous work on selecting the best base classifier for ensembles on credit data sets. It is shown that a very simple base classifier, based on imprecise probabilities and uncertainty measures, attains a better trade-off among aspects of interest for this type of study, such as accuracy and area under the ROC curve (AUC). The AUC can be considered a more appropriate measure in this setting, where different types of errors have different costs or consequences. The results presented here make this simple classifier an interesting choice of base classifier in ensembles for credit scoring and bankruptcy prediction, and show that the individual performance of a classifier is not the only key criterion for selecting it for an ensemble scheme.

19.
In this paper, the fusion of probabilistic knowledge-based classification rules and learning automata theory is proposed and as a result we present a set of probabilistic classification rules with self-learning capability. The probabilities of the classification rules change dynamically guided by a supervised reinforcement process aimed at obtaining an optimum classification accuracy. This novel classifier is applied to the automatic recognition of digital images corresponding to visual landmarks for the autonomous navigation of an unmanned aerial vehicle (UAV) developed by the authors. The classification accuracy of the proposed classifier and its comparison with well-established pattern recognition methods is finally reported.

20.
We describe a method of combining classification and compression into a single vector quantizer by incorporating a Bayes risk term into the distortion measure used in the quantizer design algorithm. Once trained, the quantizer can operate to minimize the Bayes risk weighted distortion measure if there is a model providing the required posterior probabilities, or it can operate in a suboptimal fashion by minimizing the squared error only. Comparisons are made with other vector quantizer based classifiers, including the independent design of quantization and minimum Bayes risk classification and Kohonen's LVQ. A variety of examples demonstrate that the proposed method can provide classification ability close to or superior to learning VQ while simultaneously providing superior compression performance.
