Similar Documents
20 similar documents found (search time: 15 ms)
1.
2.
The problem of selecting a subset of relevant features is classic and arises in many branches of science, including pattern recognition. In this paper, we propose a new feature selection criterion based on low-loss nearest neighbor classification, and a novel feature selection algorithm that optimizes the margin of nearest neighbor classification by minimizing its loss function. We also present a theoretical analysis based on an energy-based model, and experiments on several benchmark real-world data sets and on facial data sets for gender classification show that the proposed feature selection method outperforms other classic ones.

3.
Discriminant adaptive nearest neighbor classification   (Cited: 11; self-citations: 0; other citations: 11)
Nearest neighbour classification expects the class conditional probabilities to be locally constant, and suffers from bias in high dimensions. We propose a locally adaptive form of nearest neighbour classification to ameliorate this curse of dimensionality. We use local linear discriminant analysis to estimate an effective metric for computing neighbourhoods. We determine the local decision boundaries from centroid information, then shrink neighbourhoods in directions orthogonal to these boundaries and elongate them parallel to the boundaries. Thereafter, any neighbourhood-based classifier can be employed using the modified neighbourhoods, in which the posterior probabilities tend to be more homogeneous. We also propose a method for global dimension reduction that combines local dimension information. In a number of examples, the methods demonstrate the potential for substantial improvements over nearest neighbour classification.
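The core idea of such locally adaptive metrics can be sketched as follows. This is a hedged, single-pass simplification (the paper's procedure is iterated and more refined): estimate a local between-class scatter from the query's Euclidean neighborhood and use it to reweight distances, so that neighborhoods stretch across the local decision boundary. The function name, the parameter `eps`, and the one-shot structure are all illustrative assumptions, not the authors' exact algorithm.

```python
import numpy as np

def adaptive_nn_predict(X, y, query, k=10, eps=1.0):
    """Classify `query` with a locally adapted metric (sketch)."""
    # Find the k Euclidean nearest neighbors of the query.
    d = np.linalg.norm(X - query, axis=1)
    idx = np.argsort(d)[:k]
    Xl, yl = X[idx], y[idx]
    mu = Xl.mean(axis=0)
    # Local between-class scatter built from the local class centroids.
    B = np.zeros((X.shape[1], X.shape[1]))
    for c in np.unique(yl):
        mc = Xl[yl == c].mean(axis=0)
        B += np.mean(yl == c) * np.outer(mc - mu, mc - mu)
    # Sigma = B + eps*I inflates distances measured across the local
    # decision boundary, leaving directions parallel to it relatively cheap.
    Sigma = B + eps * np.eye(X.shape[1])
    D = X - query
    dm = np.einsum('ij,jk,ik->i', D, Sigma, D)  # Mahalanobis-style distances
    return y[np.argmin(dm)]
```

Any neighborhood-based rule can then be run with `dm` in place of plain Euclidean distance.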

4.
A variant of nearest-neighbor (NN) pattern classification and supervised learning by learning vector quantization (LVQ) is described. The decision surface mapping method (DSM) is a fast supervised learning algorithm and is a member of the LVQ family of algorithms. A relatively small number of prototypes are selected from a training set of correctly classified samples. The training set is then used to adapt these prototypes to map the decision surface separating the classes. This algorithm is compared with NN pattern classification, learning vector quantization, and a two-layer perceptron trained by error backpropagation. When the class boundaries are sharply defined (i.e., no classification error in the training set), the DSM algorithm outperforms these methods with respect to error rates, learning rates, and the number of prototypes required to describe class boundaries.
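A minimal sketch of LVQ-style prototype adaptation in the spirit of DSM (assumed form, not the paper's exact update rule): only misclassified training samples trigger an update, pushing the offending prototype away from the sample and pulling the nearest same-class prototype toward it. The function name and the learning rate `alpha` are illustrative.

```python
import numpy as np

def dsm_train(X, y, protos, proto_labels, alpha=0.3, epochs=20):
    """Adapt prototypes toward the decision surface (sketch)."""
    P = protos.astype(float).copy()
    for _ in range(epochs):
        for x, c in zip(X, y):
            d = np.linalg.norm(P - x, axis=1)
            j = np.argmin(d)
            if proto_labels[j] != c:
                # Push the wrong-class prototype away from the sample...
                P[j] -= alpha * (x - P[j])
                # ...and pull the nearest correct-class prototype closer.
                same = np.where(proto_labels == c)[0]
                k = same[np.argmin(d[same])]
                P[k] += alpha * (x - P[k])
    return P
```

Classification afterwards is plain 1-NN against the adapted prototypes, so a handful of prototypes can stand in for the full training set.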

5.
Text classification automatically assigns a document to a set of predefined categories or topics, and the document representation strongly affects the learner's performance. Aiming at Kazakh text classification, we design and implement stemming for Kazakh text according to Kazakh grammar rules to complete the preprocessing step. We propose a sample-distance formula based on the nearest support vector machine, which avoids having to choose the parameter k, and classify Kazakh text with a special combination of the SVM and KNN classification algorithms (SV-NN). Classification experiments on a self-built Kazakh text corpus demonstrate the effectiveness of the proposed algorithm and confirm the theoretical results.

6.
The shell nearest neighbor classification algorithm overcomes the neighbor-selection bias that k-nearest-neighbor classification may exhibit, so it outperforms k-NN on large data sets. To further improve its performance, we propose a shell nearest neighbor classification algorithm weighted by Relief features. The algorithm computes the feature weights of the training set with Relief and uses these weights to improve both the distance measure and the voting mechanism. Experimental results show that the algorithm outperforms both k-NN and shell nearest neighbor classification on small and large data sets.
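The Relief weighting step mentioned above can be sketched as follows, under the usual Relief formulation: a feature is rewarded when it separates a sample from its nearest miss (different class) and penalized when it separates the sample from its nearest hit (same class). The function name and the `n_iter`/`seed` parameters are illustrative; this is not the paper's exact weighted variant.

```python
import numpy as np

def relief_weights(X, y, n_iter=50, seed=0):
    """Estimate per-feature weights with a minimal Relief pass (sketch)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_iter):
        i = rng.integers(n)
        dist = np.abs(X - X[i]).sum(axis=1)  # L1 distance to sample i
        dist[i] = np.inf
        same = (y == y[i])
        same[i] = False
        hit = np.argmin(np.where(same, dist, np.inf))    # nearest same-class
        miss = np.argmin(np.where(~same, dist, np.inf))  # nearest other-class
        # Reward separation from the miss, penalize separation from the hit.
        w += np.abs(X[i] - X[miss]) - np.abs(X[i] - X[hit])
    return w / n_iter
```

The resulting weights can then scale each feature in the distance measure and the votes, as the abstract describes.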

7.
Given the results returned by an image search engine, the image set is organized so that visually similar images are grouped together, providing the user with an effective browsing interface. To reduce computation time, a nearest neighbor search algorithm based on key dimensions is proposed. Experiments confirm the effectiveness of the algorithm.

8.
Adaptive quasiconformal kernel nearest neighbor classification   (Cited: 1; self-citations: 0; other citations: 1)
Nearest neighbor classification assumes locally constant class conditional probabilities. This assumption becomes invalid in high dimensions due to the curse-of-dimensionality. Severe bias can be introduced under these conditions when using the nearest neighbor rule. We propose an adaptive nearest neighbor classification method to try to minimize bias. We use quasiconformal transformed kernels to compute neighborhoods over which the class probabilities tend to be more homogeneous. As a result, better classification performance can be expected. The efficacy of our method is validated and compared against other competing techniques using a variety of data sets.

9.
Efficient prototype reordering in nearest neighbor classification   (Cited: 2; self-citations: 0; other citations: 2)
Sanghamitra, Ujjwal. Pattern Recognition, 2002, 35(12): 2791-2799
The nearest neighbor rule is one of the most commonly used supervised classification procedures due to its inherent simplicity and intuitive appeal. However, it suffers from the major limitation of requiring n distance computations, where n is the size of the training data (or prototypes), for computing the nearest neighbor of a point. In this paper we suggest a simple approach based on rearranging the training data set in a certain order, such that the number of distance computations is significantly reduced while the classification accuracy of the original rule remains unaffected. The method requires the storage of at most n distances in addition to the prototypes. The superiority of the proposed method over some other methods is clearly established in terms of the number of distances computed, the time required for finding the nearest neighbor, the number of operations required in the overhead computation, and memory requirements. The variation of the performance of the proposed method with the size of the test data is also demonstrated.
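One classic way to realize this kind of saving (a sketch of the general idea, not necessarily the paper's exact ordering scheme) is to precompute and store each prototype's distance to a fixed reference point, sort prototypes by that distance, and use the triangle inequality |d(q, ref) − d(p, ref)| ≤ d(q, p) as a free lower bound that lets many exact distance computations be skipped. All names here are illustrative.

```python
import numpy as np

def nn_with_pruning(protos, labels, query):
    """Exact 1-NN that skips prototypes via a triangle-inequality bound."""
    ref = protos[0]                                  # fixed reference point
    dref = np.linalg.norm(protos - ref, axis=1)      # precomputed, stored once
    order = np.argsort(dref)                         # the "reordering"
    dq = np.linalg.norm(query - ref)
    best, best_i, computed = np.inf, -1, 0
    for i in order:
        # Lower bound on d(query, protos[i]); if it cannot beat the
        # current best, the exact distance need not be computed.
        if abs(dq - dref[i]) >= best:
            continue
        dist = np.linalg.norm(protos[i] - query)
        computed += 1
        if dist < best:
            best, best_i = dist, i
    return labels[best_i], computed
```

The answer is identical to brute-force 1-NN; only `computed`, the number of exact distances evaluated, shrinks.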

10.
11.
In this paper, two novel classifiers based on a locally nearest neighborhood rule, called nearest neighbor line and nearest neighbor plane, are presented for pattern classification. Compared with nearest feature line and nearest feature plane, the proposed methods incur much lower computational cost and achieve competitive performance.

12.
Cluster analysis is an important data mining method, and the K-means clustering algorithm is of great practical value in the field. However, K-means requires the number of clusters to be set manually and easily falls into local optima. To address these drawbacks, a K-means clustering algorithm based on shared nearest neighbors (KSNN) is proposed. KSNN searches for center points in the data set and uses them to determine the number of clusters, supplying this parameter to K-means. This removes the need to set the cluster count manually and gives good global convergence. Experiments show that KSNN achieves better clustering than K-means, particle swarm K-means (PSO), and the multi-center clustering algorithm (MCA).

13.
LDA/SVM driven nearest neighbor classification   (Cited: 3; self-citations: 0; other citations: 3)
Nearest neighbor (NN) classification relies on the assumption that class conditional probabilities are locally constant. This assumption becomes false in high dimensions with finite samples due to the curse of dimensionality. The NN rule introduces severe bias under these conditions. We propose a locally adaptive neighborhood morphing classification method to try to minimize bias. We use local support vector machine learning to estimate an effective metric for producing neighborhoods that are elongated along less discriminant feature dimensions and constricted along most discriminant ones. As a result, the class conditional probabilities can be expected to be approximately constant in the modified neighborhoods, whereby better classification performance can be achieved. The efficacy of our method is validated and compared against other competing techniques using a number of datasets.

14.
The authors show that systems built on a simple statistical technique and a large training database can be automatically optimized to produce classification accuracies of 99% in the domain of handwritten digits. It is also shown that the performance of these systems scales consistently with the size of the training database: the error rate is cut by more than half for every tenfold increase in the size of the training set from 10 to 100,000 examples. Three distance metrics for the standard nearest neighbor classification system are investigated: a simple Hamming distance metric, a pixel distance metric, and a metric based on the extraction of penstroke features. Systems employing these metrics were trained and tested on a standard, publicly available database of nearly 225,000 digits provided by the National Institute of Standards and Technology. Additionally, a confidence metric is both introduced by the authors and also discovered and optimized by the system. The new confidence measure proves to be superior to the commonly used nearest neighbor distance.

15.
Multimedia Tools and Applications - Product quantization is a widely used lossy compression technique that can generate high quantization levels by a compact codebook set. It has been conducted in...

16.
To address the low precision, large time intervals, and "ping-pong handover" problems of mobile-phone signaling data, a method based on naive Bayes classification (NBC) is proposed to identify residents' trip origins and destinations (OD) from mobile positioning data. First, using one month of continuously recorded travel-activity data from 80 volunteers, the conditional probability distributions of the moving and staying states are estimated, stratified by home-work distance. Second, two feature indicators characterizing a user's moving or staying state are defined: the direction angle and the diameter of the minimum covering circle. Finally, the probability of a user being in a moving or staying state is computed according to the NBC principle, and sequences of two or more consecutive moving states are aggregated into trip ODs. Analysis of China Mobile positioning data from Xiamen shows that the mean absolute percentage error (MAPE) of the per-capita trip count obtained by the proposed method is 7.79%, indicating high accuracy, and the resulting trip ODs reflect real travel patterns well.

17.
Density estimates based on k-nearest neighbors have useful applications in nonparametric discriminant analysis. In classification problems, optimal values of k are usually estimated by minimizing the cross-validated misclassification rates. However, these cross-validation techniques allow only one value of k for each population density estimate, while in a classification problem, the optimum value of k for a class may also depend on its competing population densities. Further, it is computationally difficult to minimize the cross-validated error rate when there are several competing populations. Moreover, in addition to depending on the entire training data set, a good choice of k should also depend on the specific observation to be classified. Therefore, instead of using a single value of k for each population density estimate, it is more useful in practice to consider the results for multiple values of k to arrive at the final decision. This paper presents one such approach along with a graphical device, which gives more information about classification results for various choices of k and the related statistical uncertainties present there. The utility of this proposed methodology has been illustrated using some benchmark data sets.
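The multiple-k idea can be illustrated with a minimal sketch (an assumed aggregation scheme, far simpler than the paper's method and without its graphical device): run k-NN for several values of k and let each run cast one vote for its winning class.

```python
import numpy as np

def multi_k_vote(X, y, query, ks=(1, 3, 5)):
    """Aggregate k-NN decisions over several k instead of one tuned k (sketch)."""
    d = np.linalg.norm(X - query, axis=1)
    order = np.argsort(d)                       # neighbors sorted once, reused per k
    votes = {}
    for k in ks:
        labels, counts = np.unique(y[order[:k]], return_counts=True)
        winner = labels[np.argmax(counts)]      # majority class for this k
        votes[winner] = votes.get(winner, 0) + 1
    return max(votes, key=votes.get)            # class winning the most k-values
```

Sorting the neighbors once makes the extra values of k essentially free, which is part of the appeal over cross-validating a single k.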

18.
Nearest neighbor (NN) classification assumes locally constant class conditional probabilities, and suffers from bias in high dimensions with a small sample set. In this paper, we propose a novel cam weighted distance to ameliorate the curse of dimensionality. Different from the existing neighborhood-based methods which only analyze a small space emanating from the query sample, the proposed nearest neighbor classification using the cam weighted distance (CamNN) optimizes the distance measure based on the analysis of inter-prototype relationship. Our motivation comes from the observation that the prototypes are not isolated. Prototypes with different surroundings should have different effects in the classification. The proposed cam weighted distance is orientation and scale adaptive to take advantage of the relevant information of inter-prototype relationship, so that a better classification performance can be achieved. Experiments show that CamNN significantly outperforms one nearest neighbor classification (1-NN) and k-nearest neighbor classification (k-NN) in most benchmarks, while its computational complexity is comparable with that of 1-NN classification.

19.
Modern computers provide excellent opportunities for performing fast computations. They are equipped with powerful microprocessors and large memories; however, programs are not necessarily able to exploit those resources effectively. In this paper, we present the way in which we have implemented a nearest neighbor classification. We show how performance can be improved by exploiting the ability of superscalar processors to issue multiple instructions per cycle and by using the memory hierarchy adequately. This is accomplished by the use of floating-point arithmetic, which usually outperforms integer arithmetic, and of block (tiled) algorithms, which exploit the data locality of programs, allowing efficient use of the data stored in the cache memory. Our results are validated with both an analytical model and empirical results. We show that regular codes can run faster than more complex irregular codes on standard data sets.
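The blocking idea can be sketched as follows. This is a hedged illustration of tiling in general, not the authors' implementation (which targets cache behavior at the C level): the training set is processed in fixed-size chunks so that each chunk can stay resident in fast memory while every test point is compared against it, and squared distances within a tile reduce to one matrix product.

```python
import numpy as np

def nn_blocked(train, test, block=256):
    """Index of each test point's nearest training point, tile by tile (sketch)."""
    n_test = test.shape[0]
    best = np.full(n_test, np.inf)
    best_idx = np.zeros(n_test, dtype=int)
    for start in range(0, train.shape[0], block):
        chunk = train[start:start + block]
        # Squared Euclidean distances for the whole tile at once:
        # ||t||^2 + ||c||^2 - 2 t.c, one matrix product per tile.
        d2 = (np.sum(test**2, axis=1)[:, None]
              + np.sum(chunk**2, axis=1)[None, :]
              - 2.0 * test @ chunk.T)
        j = np.argmin(d2, axis=1)
        m = d2[np.arange(n_test), j]
        upd = m < best                 # keep the running minimum across tiles
        best[upd] = m[upd]
        best_idx[upd] = j[upd] + start
    return best_idx
```

The result is identical to an unblocked scan; only the memory access pattern (and hence cache behavior) changes with `block`.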

20.
Graph-based image segmentation techniques generally represent the problem in terms of a graph. In this work, we present a novel graph, called the directional nearest neighbor graph. The construction principle of this graph is that each node corresponding to a pixel in the image is connected to a fixed number of nearest neighbors measured by color value and the connected neighbors are distributed in four directions. Compared with the classical grid graph and the nearest neighbor graph, our method can capture low-level texture information using a less-connected edge topology. To test the performance of the proposed method, a comparison with other graph-based methods is carried out on synthetic and real-world images. Results show an improved segmentation for texture objects as well as a lower computational load.


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号