Similar literature
 20 similar documents found
1.
Objective: To classify patients by age based upon information extracted from their electrocardiograms (ECGs), and to develop and compare the performance of Bayesian classifiers. Methods and material: We present a methodology for classifying patients according to statistical features extracted from their ECG signals using a genetically evolved Bayesian network classifier. Continuous signal feature variables are converted to a discrete symbolic form by thresholding, to lower the dimensionality of the signal. This simplifies calculation of conditional probability tables for the classifier, and makes the tables smaller. Two methods of network discovery from data were developed and compared: the first uses a greedy hill-climb search and the second uses evolutionary computing with a genetic algorithm (GA). Results and conclusions: The evolved Bayesian network performed better (86.25% AUC) than both the one developed using the greedy algorithm (65% AUC) and the naïve Bayesian classifier (84.75% AUC). The methodology for evolving the Bayesian classifier can be used to evolve Bayesian networks in general, thereby identifying the dependencies among the variables of interest. Those dependencies are assumed to be non-existent by naïve Bayesian classifiers. Such a classifier can then be used in medical applications for diagnosis and prediction.
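The thresholding step and the conditional probability tables it feeds can be sketched as follows. This is a minimal illustration, not the authors' implementation: the feature values, the cut points, the age-group labels and the Laplace smoothing constant `alpha` are all assumptions made for the example.

```python
from collections import Counter, defaultdict

def discretize(value, thresholds):
    """Map a continuous value to a symbol index: the number of thresholds it meets or exceeds."""
    return sum(value >= t for t in thresholds)

def conditional_probability_table(symbols, labels, n_symbols, alpha=1.0):
    """Estimate P(symbol | class) with Laplace smoothing (alpha is an assumed prior count)."""
    counts = defaultdict(Counter)
    for s, y in zip(symbols, labels):
        counts[y][s] += 1
    table = {}
    for y, c in counts.items():
        total = sum(c.values()) + alpha * n_symbols
        table[y] = [(c[s] + alpha) / total for s in range(n_symbols)]
    return table

# Hypothetical values of one ECG-derived feature and hypothetical age-group labels.
values = [0.62, 0.71, 0.95, 1.02, 0.66, 0.99]
labels = ["young", "young", "old", "old", "young", "old"]
thresholds = [0.7, 0.9]  # assumed cut points, giving three symbols: 0, 1, 2

symbols = [discretize(v, thresholds) for v in values]
cpt = conditional_probability_table(symbols, labels, n_symbols=3)
```

With two thresholds each feature needs only a three-entry row per class, which is the size reduction the abstract refers to.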

2.
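王影  王浩  俞奎  姚宏亮 《计算机科学》 (Computer Science) 2012,39(1):185-189
Existing Bayesian network classifiers based on node ordering ignore the information between the already-selected variables in the node sequence and the class label, which makes it hard to improve classifier accuracy further. To address this problem, a simple and efficient learning algorithm for Bayesian network classifiers is proposed: the L1-regularized Bayesian network classifier (L1-BNC). By adjusting the constraint value in the Lasso method, the algorithm makes full use of the regression residual information and combines the information between the selected variables in the node sequence and the class label to form a good topological ordering of the variables (the L1 regularization path); based on this ordering, the K2 algorithm is used to generate a good Bayesian network classifier. Experiments show that L1-BNC outperforms existing Bayesian network classifiers in classification accuracy. L1-BNC was also compared with the SVM, KNN and J48 classification algorithms, and it outperforms them on most data sets.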

3.
A data driven ensemble classifier for credit scoring analysis   Total citations: 2, self-citations: 0, citations by others: 2
This study focuses on predicting whether a credit applicant can be categorized as good, bad or borderline from information initially supplied. This is essentially a classification task for credit scoring. Given its importance, many researchers have recently worked on an ensemble of classifiers. However, to the best of our knowledge, unrepresentative samples drastically reduce the accuracy of the deployment classifier. Few have attempted to preprocess the input samples into more homogeneous cluster groups and then fit the ensemble classifier accordingly. For this reason, we introduce the concept of class-wise classification as a preprocessing step in order to obtain an efficient ensemble classifier. This strategy would work better than a direct ensemble of classifiers without the preprocessing step. The proposed ensemble classifier is constructed by incorporating several data mining techniques, mainly involving optimal associate binning to discretize continuous values; neural network, support vector machine, and Bayesian network are used to augment the ensemble classifier. In particular, the Markov blanket concept of Bayesian network allows for a natural form of feature selection, which provides a basis for mining association rules. The learned knowledge is represented in multiple forms, including causal diagram and constrained association rules. The data driven nature of the proposed system distinguishes it from existing hybrid/ensemble credit scoring systems.

4.
Ying Yu  Hao Huang 《Expert Systems》2022,39(1):e12821
With the objective of automatically detecting diseases from symptoms in free-text data, this paper proposes a methodology to extract symptom-diagnosis knowledge from online medical textual data in the Q&A domain: (1) a term frequency-inverse document frequency and PRECISION method is adopted to retrieve symptom words from unstructured text; (2) a variable precision rough set based genetic algorithm is applied to reduce redundant symptom words, and a rough set based rule is utilized to add discriminative symptom words that help discriminate diseases sharing similar symptoms; (3) by employing fuzzy linguistic variables to express the risk level of disease or severity level of symptoms, a knowledge base with fuzzy belief structure is generated. Using data extracted from a Chinese medical Q&A forum for training and testing, some classical gastrointestinal diseases serve as a case study to evaluate the efficiency of the proposed methodology. Subsequently, performance comparisons are made between the proposed methodology and some other classifiers, such as the decision tree algorithms including ID3 and J45, and the Bayesian network classifier. The comparative results demonstrate that the proposed methodology outperforms the decision tree algorithms and the Bayesian network classifier.

5.
Many pattern classification algorithms such as Support Vector Machines (SVMs), Multi-Layer Perceptrons (MLPs), and K-Nearest Neighbors (KNNs) require data to consist of purely numerical variables. However, many real-world data sets consist of both categorical and numerical variables. In this paper we suggest an effective method of converting mixed data of categorical and numerical variables into data of purely numerical variables for binary classification. Since the suggested method is based on the theory of learning Bayesian Network Classifiers (BNCs), it is computationally efficient and robust to noise and data loss. The suggested method is also expected to extract sufficient information for estimating a minimum-error-rate (MER) classifier. Simulations on artificial data sets and real-world data sets are conducted to demonstrate the competitiveness of the suggested method when the number of values in each categorical variable is large and BNCs accurately model the data.

6.
Support vector learning for fuzzy rule-based classification systems   Total citations: 11, self-citations: 0, citations by others: 11
To design a fuzzy rule-based classification system (fuzzy classifier) with good generalization ability in a high dimensional feature space has been an active research topic for a long time. As a powerful machine learning approach for pattern recognition problems, the support vector machine (SVM) is known to have good generalization ability. More importantly, an SVM can work very well on a high- (or even infinite) dimensional feature space. This paper investigates the connection between fuzzy classifiers and kernel machines, establishes a link between fuzzy rules and kernels, and proposes a learning algorithm for fuzzy classifiers. We first show that a fuzzy classifier implicitly defines a translation invariant kernel under the assumption that all membership functions associated with the same input variable are generated from location transformation of a reference function. Fuzzy inference on the IF-part of a fuzzy rule can be viewed as evaluating the kernel function. The kernel function is then proven to be a Mercer kernel if the reference functions meet a certain spectral requirement. The corresponding fuzzy classifier is named positive definite fuzzy classifier (PDFC). A PDFC can be built from the given training samples based on a support vector learning approach with the IF-part fuzzy rules given by the support vectors. Since the learning process minimizes an upper bound on the expected risk (expected prediction error) instead of the empirical risk (training error), the resulting PDFC usually has good generalization. Moreover, because of the sparsity properties of the SVMs, the number of fuzzy rules is irrelevant to the dimension of input space. In this sense, we avoid the "curse of dimensionality." Finally, PDFCs with different reference functions are constructed using the support vector learning approach. The performance of the PDFCs is illustrated by extensive experimental results. Comparisons with other methods are also provided.
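The link between the IF-part of a rule and a translation invariant kernel can be illustrated with a small sketch. It assumes a Gaussian reference function and product inference, and the rule centers and weights are made up; in the paper these would come from the support vectors of a trained SVM, which is not reproduced here.

```python
import math

def gaussian_reference(t, width=1.0):
    """Assumed reference function; each membership function is a shifted copy of it."""
    return math.exp(-(t / width) ** 2)

def firing_strength(x, center):
    """IF-part of one rule under product inference: prod_j a(x_j - center_j).
    It depends only on x - center, so it acts as a translation invariant kernel."""
    return math.prod(gaussian_reference(xj - cj) for xj, cj in zip(x, center))

def decision(x, rules, bias=0.0):
    """rules: list of (center, weight). Returns the sign of the weighted sum of firing strengths."""
    score = bias + sum(w * firing_strength(x, c) for c, w in rules)
    return 1 if score >= 0 else -1

# Hypothetical rule base: one rule per class, centered on a representative sample.
rules = [([0.0, 0.0], 1.0), ([3.0, 3.0], -1.0)]
```

Shifting both the input and the rule center by the same amount leaves the firing strength unchanged, which is the translation invariance the abstract describes.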

7.
This paper presents a weed/crop classification method using computer vision and morphological analysis. Supervised and unsupervised learning methods are then applied to extract dominant morphological characteristics of weeds present in corn and soybean fields. The novelty of the presented technique resides in the feature extraction process, which is based on spatial localization of vegetation in fields. Features from the weed leaf area distribution are extracted from the cultivation inter-rows, then features from the crop are inferred from the mixture model equation. Those extracted features are then passed to a naive Bayesian classifier and a Gaussian mixture clustering algorithm to discriminate weed from crop plants. The presented technique correctly classifies an average of 94% of corn and soybean plants and 85% of the weeds (multiple species) without any prior knowledge of the species present in the field.

8.
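A Bayesian network personal credit evaluation model based on the minimum overall risk criterion   Total citations: 1, self-citations: 0, citations by others: 1
A new credit evaluation model is proposed that combines the minimum overall risk (MOR) criterion with a Bayesian network classifier. The MOR-based Bayesian network credit evaluation model was tested on two real data sets using 10-fold cross-validation and compared with a Bayesian network classifier based on the minimum probability of error (MPE) criterion. The results show that the MOR-based Bayesian network classification model can effectively reduce credit evaluation risk.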
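The difference between the two decision criteria can be sketched as below. This assumes that the minimum overall risk criterion corresponds to the standard Bayes minimum-expected-loss rule; the posterior probabilities and the loss values are invented for illustration and do not come from the paper.

```python
def min_error_decision(posterior):
    """MPE rule: pick the class with the highest posterior probability."""
    return max(posterior, key=posterior.get)

def min_risk_decision(posterior, loss):
    """Minimum-risk rule: pick the action with the lowest expected loss.
    loss[action][true_class] is the cost of choosing `action` when `true_class` holds."""
    risk = {a: sum(loss[a][c] * p for c, p in posterior.items()) for a in loss}
    return min(risk, key=risk.get), risk

# Hypothetical classifier output for one applicant.
posterior = {"good": 0.7, "bad": 0.3}
# Assumed costs: accepting a bad applicant is five times worse than rejecting a good one.
loss = {
    "good": {"good": 0.0, "bad": 5.0},
    "bad": {"good": 1.0, "bad": 0.0},
}

mpe_choice = min_error_decision(posterior)
mor_choice, risks = min_risk_decision(posterior, loss)
```

Here the MPE rule labels the applicant "good", while the risk-based rule labels the same applicant "bad" because the expected loss of accepting (1.5) exceeds that of rejecting (0.7).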

9.
To learn a Bayesian network classifier, continuous attributes usually need to be discretized. However, discretizing continuous attributes can lose information, introduce noise, and make the class variable less sensitive to changes in the attributes. In this paper, we use a Gaussian kernel function with a smoothing parameter to estimate the density of the attributes. A Bayesian network classifier with continuous attributes is established by dependency extension of naive Bayes classifiers. We also analyze the information that each attribute provides about the class as a basis for this dependency extension. Experimental studies on UCI data sets show that Bayesian network classifiers using the Gaussian kernel function provide good classification accuracy compared to other approaches when dealing with continuous attributes.
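A minimal sketch of the kernel-density idea follows. It covers only the naive Bayes starting point, with each class-conditional density estimated by a Gaussian kernel. The dependency extension described in the abstract is not reproduced, and the training data and the smoothing parameter `h` are assumptions.

```python
import math

def gaussian_kde(x, samples, h):
    """Kernel density estimate at x using a Gaussian kernel with smoothing parameter h."""
    norm = 1.0 / (len(samples) * h * math.sqrt(2.0 * math.pi))
    return norm * sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in samples)

def classify(x, train, h=0.5):
    """Naive Bayes with kernel-estimated densities. train maps class -> list of feature vectors."""
    n_total = sum(len(rows) for rows in train.values())
    scores = {}
    for label, rows in train.items():
        log_score = math.log(len(rows) / n_total)  # class prior
        for j, xj in enumerate(x):
            column = [row[j] for row in rows]
            # The tiny constant guards against log(0) far from all samples.
            log_score += math.log(gaussian_kde(xj, column, h) + 1e-300)
        scores[label] = log_score
    return max(scores, key=scores.get)

# Hypothetical two-class, two-attribute training set.
train = {
    "a": [[1.0, 2.0], [1.2, 1.8], [0.9, 2.1]],
    "b": [[3.0, 0.5], [3.2, 0.7], [2.8, 0.4]],
}
```

No discretization is involved: each attribute contributes a smooth density value, so small changes in an attribute change the class score continuously.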

10.
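Representative classifiers in the Bayesian classifier family are analyzed. The concept of predictive power between variables and a method for estimating it are given; on this basis, a structure learning method for Bayesian network classifiers based on inter-variable predictive power is established, and classification experiments are carried out on UCI data. The experimental results show that the method learns Bayesian network classifiers effectively, tends to yield simpler Bayesian network classifiers, and has strong classification ability.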

11.
In this paper, we describe three Bayesian classifiers for mineral potential mapping: (a) a naive Bayesian classifier that assumes complete conditional independence of input predictor patterns, (b) an augmented naive Bayesian classifier that recognizes and accounts for conditional dependencies amongst input predictor patterns and (c) a selective naive classifier that uses only conditionally independent predictor patterns. We also describe methods for training the classifiers, which involve determining dependencies amongst predictor patterns and estimating the conditional probability of each predictor pattern given the target deposit-type. The output of a trained classifier determines the extent to which an input feature vector belongs to either the mineralized class or the barren class and can be mapped to generate a favorability map. The procedures are demonstrated by an application to base metal potential mapping in the Proterozoic Aravalli Province (western India). The results indicate that although the naive Bayesian classifier performs well and shows significant tolerance for the violation of the conditional independence assumption, the augmented naive Bayesian classifier performs better and exhibits finer generalization capability. The results also indicate that the rejection of conditionally dependent predictor patterns degrades the performance of a naive classifier.

12.
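王中锋  王志海 《计算机学报》 (Chinese Journal of Computers) 2012,35(2):2364-2374
Bayesian network classifiers trained with a discriminative learning strategy usually achieve high accuracy, but the performance of discriminative parameter learning algorithms is limited on network structures that contain redundant edges. To further improve the classification accuracy of Bayesian network classifiers in practical applications, this paper quantitatively describes the relationship between the network structure and the distribution of the variables in the real data, and proposes a forest-structured Bayesian network classifier without redundant edges together with its learning algorithm, FAN (Forest-Augmented Naive Bayes Algorithm). The FAN algorithm uses the partial derivatives of the conditional log-likelihood function to optimize structure learning. Experimental results show that commonly used restricted Bayesian network classifiers usually contain some redundant edges, which tend to degrade the performance of discriminative parameter learning algorithms; that the forest-structured Bayesian network classifier reduces redundant edges in the structure and is better suited to training parameters with a discriminative learning strategy; and that the FAN algorithm, which applies the partial derivatives of the conditional log-likelihood function, improves classification accuracy on most of the experimental data sets.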

13.
We present an approach for MPEG variable bit rate (VBR) video modeling and classification using fuzzy techniques. We demonstrate that a type-2 fuzzy membership function, i.e., a Gaussian MF with uncertain variance, is most appropriate to model the log-value of I/P/B frame sizes in MPEG VBR video. The fuzzy c-means (FCM) method is used to obtain the mean and standard deviation (std) of I/P/B frame sizes when the frame category is unknown. We propose to use type-2 fuzzy logic classifiers (FLCs) to classify video traffic using compressed data. Five fuzzy classifiers and a Bayesian classifier are designed for video traffic classification, and the fuzzy classifiers are compared against the Bayesian classifier. Simulation results show that a type-2 fuzzy classifier in which the input is modeled as a type-2 fuzzy set and antecedent membership functions are modeled as type-2 fuzzy sets performs the best of the five classifiers when the testing video product is not included in the training products and a steepest descent algorithm is used to tune its parameters.

14.
This paper proposes an approach that detects surface defects with three-dimensional characteristics on scale-covered steel blocks. The reflection properties of the flawless surface change strongly. Light sectioning is used to acquire the surface range data of the steel block. These sections are arbitrarily located within a range of a few millimeters due to vibrations of the steel block on the conveyor. After the recovery of the depth map, segments of the surface are classified according to a set of extracted features by means of Bayesian network classifiers. To establish the structure of the Bayesian network, a floating search algorithm is applied, which achieves a good tradeoff between classification performance and computational efficiency for structure learning. This search algorithm enables conditional exclusions of previously added attributes and/or arcs from the network. The experiments show that the selective unrestricted Bayesian network classifier outperforms the naïve Bayes and the tree-augmented naïve Bayes decision rules in terms of classification rate. More than 98% of the surface segments have been classified correctly.

15.
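An adaptive neuro-fuzzy inference system with mixed data inputs   Total citations: 1, self-citations: 0, citations by others: 1
Most existing data modeling methods rely on quantitative numerical information. For modeling problems with mixed numerical and categorical inputs, multiple sub-models are usually built according to combinations of the categorical variables; with several categorical inputs, this easily leads to unevenly distributed sub-model data and long training times. To address these problems, an adaptive neuro-fuzzy inference system model with mixed data inputs is proposed. Building on the adaptive fuzzy inference system, it introduces a firing-strength transfer matrix and a consequent influence matrix, identifies the model structure by subtractive clustering based on the Gower distance, and trains the model parameters with a hybrid learning algorithm, so that the mixed numerical and categorical data act on both the antecedent and consequent parameters of the fuzzy rules and jointly affect the model output. Simulation experiments analyze the effect of categorical data on the rule consequents and the effect of the structure identification algorithm on the number of fuzzy rules. Comparison with several other mixed-data modeling methods shows that the proposed model has higher prediction accuracy and computational efficiency.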

16.
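Classification accuracy is the most important performance measure of a classifier, and feature subset selection is an effective way to improve it. Existing feature subset selection methods mainly target static classifiers, and research on feature subset selection for dynamic classifiers is lacking. A dynamic naive Bayesian network classifier with continuous attributes and an evaluation criterion for dynamic classification accuracy are first given; on this basis, a feature subset selection method for the dynamic naive Bayesian network classifier is established, and experiments and analysis are carried out on real macroeconomic time-series data.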

17.
AdaBoost-based algorithm for network intrusion detection.   Total citations: 1, self-citations: 0, citations by others: 1
Network intrusion detection aims at distinguishing the attacks on the Internet from normal use of the Internet. It is an indispensable part of the information security system. Due to the variety of network behaviors and the rapid development of attack fashions, it is necessary to develop fast machine-learning-based intrusion detection algorithms with high detection rates and low false-alarm rates. In this correspondence, we propose an intrusion detection algorithm based on the AdaBoost algorithm. In the algorithm, decision stumps are used as weak classifiers. The decision rules are provided for both categorical and continuous features. By combining the weak classifiers for continuous features and the weak classifiers for categorical features into a strong classifier, the relations between these two different types of features are handled naturally, without any forced conversions between continuous and categorical features. Adaptable initial weights and a simple strategy for avoiding overfitting are adopted to improve the performance of the algorithm. Experimental results show that our algorithm has low computational complexity and error rates, as compared with algorithms of higher computational complexity, as tested on the benchmark sample data.
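The combination of stumps over continuous and categorical features can be sketched with textbook AdaBoost. This is an illustration under stated assumptions, not the paper's algorithm: it uses uniform initial weights rather than the adaptable ones mentioned above, omits the overfitting strategy, and the connection records (duration, protocol) and labels are invented.

```python
import math

def stump_predict(stump, x):
    """A stump tests one feature: a numeric threshold or a categorical equality."""
    j, kind, value, sign = stump
    hit = (x[j] <= value) if kind == "num" else (x[j] == value)
    return sign if hit else -sign

def candidate_stumps(X):
    """Enumerate every (feature, test value, polarity) stump over the training data."""
    for j in range(len(X[0])):
        kind = "num" if isinstance(X[0][j], (int, float)) else "cat"
        for value in sorted({row[j] for row in X}):
            for sign in (1, -1):
                yield (j, kind, value, sign)

def adaboost(X, y, rounds=3):
    n = len(X)
    w = [1.0 / n] * n  # uniform initial weights (an assumption of this sketch)
    ensemble = []
    for _ in range(rounds):
        best, best_err = None, float("inf")
        for stump in candidate_stumps(X):
            err = sum(wi for wi, xi, yi in zip(w, X, y) if stump_predict(stump, xi) != yi)
            if err < best_err:
                best, best_err = stump, err
        best_err = min(max(best_err, 1e-10), 1.0 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1.0 - best_err) / best_err)
        ensemble.append((alpha, best))
        # Raise the weight of misclassified samples, lower the rest, then renormalize.
        w = [wi * math.exp(-alpha * yi * stump_predict(best, xi)) for wi, xi, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, x):
    score = sum(alpha * stump_predict(s, x) for alpha, s in ensemble)
    return 1 if score >= 0 else -1

# Hypothetical records: [duration in seconds, protocol]; +1 = attack, -1 = normal.
X = [[0.1, "tcp"], [0.2, "udp"], [5.0, "tcp"], [6.0, "icmp"], [0.15, "icmp"], [7.0, "udp"]]
y = [-1, -1, 1, 1, 1, 1]
model = adaboost(X, y, rounds=3)
```

No single stump separates this toy data, but three rounds combine two duration thresholds with one protocol test, so neither feature type has to be converted into the other.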

18.
The fuzzy min–max neural network classifier is a supervised learning method. This classifier takes the hybrid neural networks and fuzzy systems approach. All input variables in the network are required to correspond to continuously valued variables, and this can be a significant constraint in many real-world situations where there are not only quantitative but also categorical data. The usual way of dealing with this type of variable is to replace the categorical values by numerical values and treat them as if they were continuously valued. But this method implicitly defines a possibly unsuitable metric for the categories. A number of different procedures have been proposed to tackle the problem. In this article, we present a new method. The procedure extends the fuzzy min–max neural network input to categorical variables by introducing new fuzzy sets, a new operation, and a new architecture. This provides for greater flexibility and wider application. The proposed method is then applied to missing data imputation in voting intention polls. The micro data (the set of the respondents' individual answers to the questions) of this type of poll are especially suited for evaluating the method since they include a large number of numerical and categorical attributes.

19.
This paper presents the application of an adaptive neuro-fuzzy inference system (ANFIS) model for estimation of vigilance level by using electroencephalogram (EEG) signals recorded during transition from wakefulness to sleep. The developed ANFIS model combines the adaptive capabilities of neural networks with the qualitative approach of fuzzy logic. This study comprises three stages. In the first stage, three types of EEG signals (alert signal, drowsy signal and sleep signal) were obtained from 30 healthy subjects. In the second stage, for feature extraction, the obtained EEG signals were decomposed into sub-bands using the discrete wavelet transform (DWT). Then, the entropy of each sub-band was calculated using the Shannon entropy algorithm. In the third stage, the ANFIS was trained with the back-propagation gradient descent method in combination with the least squares method. The extracted features of the three types of EEG signals were used as input patterns of the three ANFIS classifiers. In order to improve estimation accuracy, a fourth ANFIS classifier (combining ANFIS) was trained using the outputs of the three ANFIS classifiers as input data. The performance of the ANFIS model was tested using EEG data obtained from 12 healthy subjects that had not been used for training. The results confirmed that the developed ANFIS classifier has potential for estimation of vigilance level by using EEG signals.
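The feature extraction stage (sub-band decomposition followed by a per-band entropy) can be sketched as follows. The abstract does not name the wavelet or the exact entropy definition, so the Haar wavelet and the entropy of the normalized coefficient energies used here are assumptions, and the input signal is synthetic.

```python
import math

def haar_step(signal):
    """One level of the Haar DWT: returns (approximation, detail) coefficients."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal) - 1, 2)]
    return approx, detail

def sub_bands(signal, levels):
    """Detail bands D1..Dn followed by the final approximation band."""
    bands, approx = [], list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        bands.append(detail)
    bands.append(approx)
    return bands

def shannon_entropy(coeffs):
    """Shannon entropy (in bits) of the normalized energy distribution of the coefficients."""
    energy = [c * c for c in coeffs]
    total = sum(energy)
    if total == 0.0:
        return 0.0
    return -sum((e / total) * math.log2(e / total) for e in energy if e > 0.0)

# Synthetic 16-sample signal standing in for one EEG segment.
signal = [math.sin(2.0 * math.pi * k / 8.0) + 0.3 * math.sin(2.0 * math.pi * k / 2.5) for k in range(16)]
features = [shannon_entropy(band) for band in sub_bands(signal, levels=3)]
```

Each segment is reduced to one entropy value per sub-band (four values here), which is the kind of compact input pattern a classifier such as ANFIS can be trained on.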

20.
Multiple classifier systems (MCS) are attracting increasing interest in the field of pattern recognition and machine learning. Recently, MCS have also been introduced in the remote sensing field, where the importance of classifier diversity for image classification problems has not been examined. In this article, Satellite Pour l'Observation de la Terre (SPOT) IV panchromatic and multispectral satellite images are classified into six land cover classes using five base classifiers: contextual classifier, k-nearest neighbour classifier, Mahalanobis classifier, maximum likelihood classifier and minimum distance classifier. The five base classifiers are trained with the same feature sets throughout the experiments, and a posteriori probability, derived from the confusion matrix of these base classifiers, is applied to five Bayesian decision rules (product rule, sum rule, maximum rule, minimum rule and median rule) for constructing different combinations of classifier ensembles. The performance of these classifier ensembles is evaluated for overall accuracy and kappa statistics. Three statistical tests, McNemar's test, Cochran's Q test and Looney's F-test, are used to examine the diversity of the classification results of the base classifiers compared to the results of the classifier ensembles. The experimental comparison reveals that (a) significant diversity amongst the base classifiers cannot enhance the performance of classifier ensembles; (b) accuracy improvement of classifier ensembles can only be found by using base classifiers with similar and low accuracy; (c) increasing the number of base classifiers cannot improve the overall accuracy of the MCS and (d) none of the Bayesian decision rules outperforms the others.
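The five combination rules named above can be sketched in a few lines. The posterior values below are invented for three hypothetical base classifiers and three hypothetical land cover classes; in the study they are derived from the base classifiers' confusion matrices, which is not reproduced here.

```python
from math import prod
from statistics import median

# Each rule fuses the per-classifier posteriors of one class into a single support value.
RULES = {
    "product": prod,
    "sum": sum,
    "maximum": max,
    "minimum": min,
    "median": median,
}

def combine(posteriors, rule):
    """posteriors: one dict per base classifier mapping class -> P(class | x).
    Returns the class with the largest fused support under the chosen rule."""
    fuse = RULES[rule]
    classes = posteriors[0].keys()
    support = {c: fuse([p[c] for p in posteriors]) for c in classes}
    return max(support, key=support.get)

# Hypothetical posteriors for a single pixel.
posteriors = [
    {"water": 0.60, "forest": 0.30, "urban": 0.10},
    {"water": 0.05, "forest": 0.55, "urban": 0.40},
    {"water": 0.50, "forest": 0.40, "urban": 0.10},
]

decisions = {rule: combine(posteriors, rule) for rule in RULES}
```

On this example the product, sum and minimum rules choose "forest" while the maximum and median rules choose "water", which shows why the rules have to be compared empirically.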
