Similar Documents
 20 similar documents found (search time: 15 ms)
1.
Considerable attention has been given to the relative performance of the various commonly used discriminant analysis algorithms. This performance has been studied under varying conditions. This author and others have been particularly interested in the behavior of the algorithms as dimension is varied. Here we consider three basic questions: which algorithms perform better in high dimensions, when it pays to add or delete a dimension, and how discriminant algorithms are best implemented in high dimensions.

One of the more interesting results has been the relatively good performance of non-parametric Bayes-theorem-type algorithms compared to parametric (linear and quadratic) algorithms. Surprisingly, this effect occurs even when the underlying distributions are “ideal” for the parametric algorithms, provided, at least, that the true covariance matrices are not too close to singular. Monte Carlo results presented here further confirm this unexpected behavior and add to the existing literature (particularly Van Ness(9) and Van Ness et al.(11)) by studying a different class of underlying Gaussian distributions. These and earlier results point to certain procedures, discussed here, that should be used in selecting the density-estimation windows for non-parametric algorithms to improve their performance. Measures of the effect of adding dimensions on the various algorithms are given graphically. A summary of some of the conclusions about several of the more common algorithms is included.
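For concreteness, a minimal sketch of the kind of non-parametric Bayes-theorem classifier the comparison involves: a Gaussian Parzen-window density estimate per class, with the most probable class chosen under equal priors. The function name and fixed bandwidth are illustrative assumptions, not details from the paper:

```python
import numpy as np

def parzen_classify(x, class_samples, h=1.0):
    """Non-parametric Bayes-rule classifier: estimate each class's density
    at x with a Gaussian Parzen window of bandwidth h and pick the class
    with the largest estimate (equal priors assumed)."""
    scores = []
    for samples in class_samples:
        z = (samples - x) / h                      # (n, d) scaled offsets
        scores.append(np.exp(-0.5 * (z ** 2).sum(axis=1)).mean())
    return int(np.argmax(scores))
```

The window width h is exactly the kind of density-estimation parameter whose selection the abstract says matters for performance.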


2.
This paper presents the colored farthest-neighbor graph (CFNG), a new method for finding clusters of similar objects. The method is useful because it works both for objects with coordinates and for objects without them; the only requirement is that the distance between any two objects be computable. In other words, the objects must belong to a metric space. The CFNG uses graph coloring to improve on an existing technique by Rovetta and Masulli. As with their technique, it uses recursive partitioning to build a hierarchy of clusters. In recursive partitioning, clusters are sometimes split prematurely, and one contribution of this paper is a way to reduce the occurrence of such premature splits, which also arise when other partitioning methods are used to find clusters.

3.
A classification scheme based on the temporal characteristics of a given crop is described. In its present form, the technique requires one training field representative of the crop under consideration. This training field is used to determine analytically the time behavior of the crop in the LACIE (Large Area Crop Inventory Experiment) segment. The crop's temporal profile, generated in each of the Landsat channels, is compared with that of every pixel in the segment to decide the pixel's category (crop/non-crop). Classification results have been compared with ground truth for 34 sites in the U.S. Corn Belt. Compared with the inventory method currently in use, this technique offers the potential for a more automated way of generating a near-harvest crop inventory from satellite data.
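The pixel-labeling step described above — comparing each pixel's temporal profile against the training field's profile — can be sketched as follows. The function name, Euclidean distance measure, and threshold are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def classify_pixels(pixel_series, training_profile, threshold):
    """Flag each pixel as crop (True) when its temporal profile (one
    reflectance value per acquisition date) is within `threshold` of the
    training field's mean profile, per-channel values stacked over time."""
    d = np.linalg.norm(np.asarray(pixel_series) - np.asarray(training_profile),
                       axis=-1)
    return d <= threshold
```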

4.
In cluster analysis, modes can be detected as regions where the underlying probability density function is locally concave. This paper investigates the possibility of using relaxation in order to improve the discrimination between the modes which are concave and the valleys which are convex.

5.
Q-mode stepwise informational clustering was applied for unsupervised pattern recognition of the elemental profiles (Cr, Zn, Mg, Al, Cd) of hair samples from 42 silicosis patients and 41 healthy subjects in the Baiyin mining area. A dendrogram with clear class separation was obtained, and the 83 samples were discriminated with 98.8% accuracy.

6.
We aim to describe a new non-parametric methodology to support the clinician during the diagnostic process of oral videocapillaroscopy to evaluate peripheral microcirculation. Our methodology, based mainly on wavelet analysis and mathematical morphology to preprocess the images, segments them by minimizing the within-class luminosity variance of both capillaries and background. Experiments were carried out on a set of real microphotographs to validate this approach against handmade segmentations provided by physicians. Using a leave-one-patient-out approach, we showed that our methodology is robust according to precision–recall criteria (average precision and recall equal to 0.924 and 0.923, respectively) and that it performs comparably to a physician in terms of the Jaccard index (mean and standard deviation equal to 0.858 and 0.064, respectively).
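Minimizing the within-class luminosity variance of the two segments (capillaries vs. background) is the criterion behind Otsu-style thresholding. A minimal sketch of that criterion on a flat array of grey levels — the helper name and the exhaustive threshold search are illustrative, and the paper's wavelet/morphology preprocessing is omitted:

```python
import numpy as np

def min_within_class_variance_threshold(pixels):
    """Pick the grey level t that minimizes the weighted within-class
    variance of pixels <= t (background) and pixels > t (foreground)."""
    pixels = np.asarray(pixels, dtype=float)
    best_t, best_var = None, np.inf
    for t in np.unique(pixels)[:-1]:       # every candidate split level
        fg, bg = pixels[pixels > t], pixels[pixels <= t]
        w_fg = fg.size / pixels.size
        wvar = (1.0 - w_fg) * bg.var() + w_fg * fg.var()
        if wvar < best_var:
            best_t, best_var = t, wvar
    return best_t
```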

7.
A clustering procedure called HICAP (HIstogram Cluster Analysis Procedure) was developed to perform an unsupervised classification of multidimensional image data. The clustering approach used in HICAP is based upon an algorithm described by Narendra and Goldberg to classify four-dimensional Landsat Multispectral Scanner data. HICAP incorporates two major modifications to the scheme by Narendra and Goldberg. The first modification is that HICAP is generalized to process up to 32-bit data with an arbitrary number of dimensions. The second modification is that HICAP uses more efficient algorithms to implement the clustering approach described by Narendra and Goldberg.(1) This means that the HICAP classification requires less computation, although it is otherwise identical to the original classification. The computational savings afforded by HICAP increase with the number of dimensions in the data.

8.
Electronic auction markets collect large amounts of auction field data. This enables a structural estimation of the bid distributions and the possibility of deriving optimal reservation prices. In this paper we propose a new approach to setting reservation prices. In contrast to traditional auction theory, we use the buyer's stated risk of not getting a winning bid as the key criterion for setting an optimal reservation price. The reservation price for a given probability can then be derived from the distribution function of the observed drop-out bids. To model this function accurately, we propose a nonparametric technique based on kernel distribution function estimators and the use of order statistics. We improve our estimator with additional information observable about bidders and about qualitative differences of goods in past auction rounds (e.g. different delivery times). This makes the technique applicable to RFQs and multi-attribute auctions with qualitatively differentiated offers.
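A sketch of the core idea, assuming Gaussian kernels and ignoring the covariate and order-statistic refinements: estimate the drop-out-bid distribution function with a kernel CDF estimator, then invert it at the buyer's chosen win probability. All names, the fixed bandwidth, and the bisection inversion are illustrative, not from the paper:

```python
from math import erf, sqrt

def kernel_cdf(bids, x, h=1.0):
    """Kernel estimate of the drop-out-bid distribution function F(x):
    an average of Gaussian kernels integrated up to x."""
    return sum(0.5 * (1.0 + erf((x - b) / (h * sqrt(2.0))))
               for b in bids) / len(bids)

def reservation_price(bids, win_prob, h=1.0):
    """Smallest price r with estimated F(r) >= win_prob, i.e. the price
    matching the buyer's accepted probability of receiving a winning bid;
    found by bisection on the monotone kernel CDF."""
    lo, hi = min(bids) - 5.0 * h, max(bids) + 5.0 * h
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if kernel_cdf(bids, mid, h) < win_prob:
            lo = mid
        else:
            hi = mid
    return hi
```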

9.
A New Strategy for Recognizing Irregular Quasi-circular Objects in Images
Existing recognition methods for circular and elliptical objects in images are reviewed and, to address their shortcomings, a new pattern-recognition strategy for irregular quasi-circular objects is proposed, using log cross-section images as a case study. The strategy combines cluster analysis and fuzzy recognition, proceeding coarse-to-fine with layer-by-layer classification and a separately designed classifier at each layer. It successfully handles the classification of uncertain quantities and achieves recognition of irregular quasi-circular objects.

10.
We present an algorithm to solve the graph isomorphism problem for the purpose of object recognition. Objects, such as those which exist in a robot workspace, may be represented by labelled graphs (graphs with attributes on their nodes and/or edges). Thereafter, object recognition is achieved by matching pairs of these graphs. Assuming that all objects are sufficiently different that their representative graphs are distinct, then given a new graph, the algorithm efficiently finds the isomorphic stored graph (if it exists). The algorithm consists of three phases: preprocessing, link construction, and ambiguity resolution. Results from experiments on a wide variety and sizes of graphs are reported, including experiments on recognising graphs that represent protein molecules. The algorithm works for all types of graphs except for a class of highly ambiguous graphs which includes strongly regular graphs. However, members of this class are detected in polynomial time, which leaves the option of switching to a higher-complexity algorithm if desired.

11.
The LANDSAT multispectral scanner (MSS) data have been analyzed with a view toward classification to identify wheat. The notion of spectral signature of a crop, a commonly used basis for classification, has been found to be inadequate. Data analysis has revealed that the MSS data from agricultural sites are essentially two-dimensional, and that the data from different sites and different acquisitions lie on parallel planes in the four-dimensional feature space. These results have been exploited to gain new insight into the data and to develop alternate models for classification. In particular, it has been found that the temporal pattern of change in the spectral response of a crop constitutes its signature and provides a basis for crop classification.

12.
Design of a Printed Chinese Character Recognition System
A design for a printed Chinese character recognition system is presented. It comprises four main steps: scanned input; fuzzy enhancement and clustering-based segmentation; binarization of the image data; and character matching via a parallel neural network.

13.
Algorithms that are operationally efficient and give a good partition of a finite set produce solutions that are not necessarily optimal. The main aim of this paper is a synthetic study of optimality properties in spaces formed by partitions of a finite set. We formalize, and take as a model for this study, a family of particularly efficient techniques of the “cluster centers” type. The proposed algorithm operates on groups of points, or “kernels”; these kernels adapt and evolve into interesting clusters. After developing the notions of “strong” and “weak” patterns, along with the computational aspects, we illustrate the results with an artificial example and two applications: one in mineral geology, the other in medicine, to determine biological profiles.

14.
Combinatorial maps explicitly encode orientations of edges around vertices, and have been used in many fields. In this paper, we address the problem of searching for patterns in model maps by putting forward the concept of symbol graph. A symbol graph will be constructed and stored for each model map in the preprocessing. Furthermore, an algorithm for submap isomorphism is presented based on symbol sequence searching in the symbol graphs. The computational complexity of this algorithm is quadratic in the worst case if we neglect the preprocessing step.

15.
Most dominant point detection methods require heuristically chosen control parameters, one of the most common being the maximum deviation. This paper uses a theoretical bound on the maximum deviation of pixels obtained by digitizing a line segment to construct a general framework that makes most dominant point detection methods non-parametric. The derived analytical bound can serve as a natural benchmark for line-fitting algorithms, so dominant point detection methods can be made parameter-independent and non-heuristic. Most methods can easily incorporate the bound, as demonstrated here with three categorically different dominant point detection methods. This non-parametric approach retains the characteristics of the digital curve while providing good fitting performance and compression ratios for all three methods on a variety of digital, non-digital, and noisy curves.
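A common family of dominant point detectors splits a digital curve recursively at the pixel of maximum deviation from the chord (Ramer–Douglas–Peucker style); the paper's contribution is replacing the user-chosen deviation tolerance with an analytical digitization bound. The abstract does not reproduce that bound, so this sketch stands in a half pixel-diagonal (√2/2) as an illustrative digitization-scale tolerance; the function names are likewise assumptions:

```python
import math

def point_line_dist(p, a, b):
    """Perpendicular distance from pixel p to the chord through a and b."""
    (x, y), (x1, y1), (x2, y2) = p, a, b
    num = abs((y2 - y1) * x - (x2 - x1) * y + x2 * y1 - y2 * x1)
    den = math.hypot(x2 - x1, y2 - y1)
    return num / den if den else math.hypot(x - x1, y - y1)

def dominant_points(curve, tol=math.sqrt(2) / 2):
    """Keep the endpoints; while the pixel of maximum deviation from the
    current chord exceeds tol, split there and recurse on both halves."""
    if len(curve) < 3:
        return list(curve)
    i, d = max(((k, point_line_dist(curve[k], curve[0], curve[-1]))
                for k in range(1, len(curve) - 1)), key=lambda t: t[1])
    if d <= tol:
        return [curve[0], curve[-1]]
    left = dominant_points(curve[:i + 1], tol)
    right = dominant_points(curve[i:], tol)
    return left[:-1] + right     # drop duplicated split point
```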

16.
This paper examines a d-dimensional extension of the Cox-Lewis statistic and investigates its power as a function of dimensionality in discriminating among random, aggregated and regular arrangements of points in d-dimensions. It was motivated by the Clustering Tendency problem, which attempts to prevent the inappropriate application of clustering algorithms and other exploratory procedures. After reviewing the literature, the d-dimensional Cox-Lewis statistic is defined and its distribution under a randomness hypothesis of a Poisson spatial point process is given. Analytical expressions for the densities of the Cox-Lewis statistic under lattice regularity and under extreme clustering are also provided. The powers of Neyman-Pearson tests of hypotheses based on the Cox-Lewis statistic are derived and compared. Power is a unimodal function of dimensionality in the test of lattice regularity, with the minimum occurring at 12 dimensions. The power of the Cox-Lewis statistic is also examined under hard-core regularity and under Neyman-Scott clustering with Monte Carlo simulations. The Cox-Lewis statistic leads to one-sided tests for regularity having reasonable power and provides a sharper discrimination between random and clustered data than other statistics. The choice of sampling window is a critical factor. The Cox-Lewis statistic shows great promise for assessing the gross structure of data.

17.
18.
One problem in clustering (classification) analysis is whether the original variables should be transformed in some way before they are used by the clustering algorithm. More often than not, they do require some transformation. The purpose of the transformation may be a desire for more compact clusters in the space of the transformed variables, to account for the different nature and/or units of the variables involved, to allow for the different or equal ‘importance’ of different variables, to minimize the number of variables used, etc. Among linear transformations of variables we distinguish two groups: those which change only the scales of the variables (often called weighting procedures), and those which also rotate the space of variables (a good example is the method of principal components(1)). This paper addresses the former group.

One strong reason for using weighted variables (as opposed to their linear combinations) is that the classification results can then be interpreted in terms of the original (physical) variables. Unfortunately, weighting the variables can ‘spoil’ the compactness of the clusters in the space of the weighted variables if the weighting procedure ‘does not care’ about the results of clustering (in other words, if the weighting is done prior to, and independently of, the clustering).

This paper suggests a method of weighting the variables which is part of the classification procedure itself and thus guarantees an improvement in cluster clarity. The variable weights and the object clusters produced by the algorithm correspond to a local minimum of a classification criterion, so the resulting weights can be interpreted as a measure of the ‘importance’ of the variables for classification. These weights are compared with such popular weighting procedures as the equal-variance(6) and Mahalanobis-distance(7) methods. Two examples of the algorithm's performance are presented.

19.
A Color-Based Image Retrieval Method
Traditional color-based image retrieval relies on color histograms, which are difficult to combine with other kinds of image information, reducing retrieval accuracy. To improve accuracy, an image retrieval method based on a color clustering table is proposed. The method first defines the color clustering table and clusters the image's colors; the clustered color information is then used to construct the table, which serves as the feature for retrieval, and a procedure for obtaining the table is given. Finally, simulation experiments were carried out with the method. The results show that retrieving images by their clustering results via the color clustering table makes it easy to combine color information with other information.

20.
As humans, we have innate faculties that allow us to efficiently segment groups of objects. Computers, to some degree, can be programmed with similar categorical capabilities, which stem from exploratory data analysis. Among the various forms of exploratory data analysis, clustering provides insight into the structure and relationships of input samples drawn from a number of distributions. To determine these relationships, many clustering methods rely on one or more human inputs, the most important being the number of distributions, c, to seek. This work investigates a technique for estimating the number of clusters from a general type of data called relational data. Several numerical examples are presented to illustrate the effectiveness of the proposed method.


Copyright©北京勤云科技发展有限公司  京ICP备09084417号