Similar documents
20 similar documents found (search time: 0 ms)
1.
The K nearest neighbors approach is a viable technique in time series analysis when dealing with ill-conditioned and possibly chaotic processes. Such problems are frequently encountered in, e.g., finance and production economics. More often than not, the observed processes are distorted by nonnormal disturbances, incomplete measurements, etc., which undermine the identification, estimation and performance of multivariate techniques. If outliers can be duly recognized, many crisp statistical techniques may perform adequately as such. Geno-mathematical programming provides a connection between statistical time series theory and fuzzy regression models that may be utilized, e.g., in the detection of outliers. In this paper we propose a fuzzy distance measure for detecting outliers via geno-mathematical parametrization. Fuzzy KNN is connected as a linkable library to the genetic hybrid algorithm (GHA) of the author, in order to facilitate the determination of the LR-type fuzzy number for automatic outlier detection in time series data. We demonstrate that GHA[Fuzzy KNN] provides a platform for automatically detecting outliers in both simulated and real world data.

2.
Classification of weld flaws with imbalanced class data   Cited by: 1 (self-citations: 0, citations by others: 1)
This paper presents research results of our investigation of the imbalanced data problem in the classification of different types of weld flaws, a multi-class classification problem. The one-against-all scheme is adopted to carry out multi-class classification and three algorithms including minimum distance, nearest neighbors, and fuzzy nearest neighbors are employed as the classifiers. The effectiveness of 22 data preprocessing methods for dealing with imbalanced data is evaluated in terms of eight evaluation criteria to determine whether any method would emerge to dominate the others. The test results indicate that: (1) nearest neighbor classifiers outperform the minimum distance classifier; (2) some data preprocessing methods do not improve any criterion and they vary from one classifier to another; (3) the combination of using the AHC_KM data preprocessing method with the 1-NN classifier is the best because they together produce the best performance in six of eight evaluation criteria; and (4) the most difficult weld flaw type to recognize is crack.

3.
Parallel Computing, 1990, 14(3): 261-275
We geometrically classify multi-layer perceptron (MLP) solutions in two ways: the hyperplane partitioning interpretation and the hidden-unit representation of the pattern set. We show these classifications to be invariant under orthogonal transformations and translations in the space of the hidden units. These solutions can be enumerated for any given Boolean mapping problem. Using a geometrical argument we derive the total number of solutions available to a minimal network for the parity problem. A lower bound is computed for the scaling of the number of solutions with input vector dimension, when a fixed fraction of patterns is removed from the full training set. The generalization probability is shown to decrease with the exponential of the problem size for the parity problem. We suggest that this, and hidden layer scaling problems, are serious drawbacks to the scaling-up of MLPs to larger tasks.

4.
Cited by: 1 (self-citations: 1, citations by others: 0)
One of the most important queries in spatio-temporal databases that aim at managing moving objects efficiently is the continuous K-nearest neighbor (CKNN) query. A CKNN query retrieves the K-nearest neighbors (KNNs) of a moving user at each time instant within a user-given time interval [t_s, t_e]. In this paper, we investigate how to process a CKNN query efficiently. Unlike previous related work, our work relaxes the assumption that an object moves with a fixed velocity by allowing the velocity of the object to vary within a known range. Because of this uncertainty in the velocity of each object, processing a CKNN query becomes much more complicated. We discuss the complications incurred by this uncertainty and propose a cost-effective P2 KNN algorithm to find the objects that could be the KNNs at each time instant within the given query time interval. In addition, a probability-based model is designed to quantify the possibility of each object being one of the KNNs. Comprehensive experiments demonstrate the efficiency and the effectiveness of the proposed approach.
Chiang Lee (corresponding author)

5.
Schapire and Singer's improved version of AdaBoost for handling weak hypotheses with confidence-rated predictions represents an important advance in the theory and practice of boosting. Its success results from a more efficient use of the information in weak hypotheses during updating. Instead of simple binary voting, a weak hypothesis is allowed to vote for or against a classification with a variable strength or confidence. The Pool Adjacent Violators (PAV) algorithm is a method for converting a score into a probability. We show how PAV may be applied to a weak hypothesis to yield a new weak hypothesis which is, in a sense, an ideal confidence-rated prediction, and that this leads to an optimal updating for AdaBoost. The result is a new algorithm which we term PAV-AdaBoost. We give several examples illustrating problems for which this new algorithm provides advantages in performance. Editor: Robert Schapire.
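
As an illustration of the score-to-probability step mentioned in this abstract, here is a minimal sketch of Pool Adjacent Violators in Python. It is not the PAV-AdaBoost algorithm itself; the function name `pav` and the choice of 0/1 labels are assumptions made for the example, and tied scores are not pooled.

```python
def pav(scores, labels):
    """Pool Adjacent Violators: fit a non-decreasing map from score to probability.

    scores: real-valued classifier scores (assumed distinct in this sketch).
    labels: 0/1 outcomes, one per score.
    Returns (score, fitted probability) pairs sorted by score.
    """
    pairs = sorted(zip(scores, labels))
    # Each block is [sum of labels, count]; its mean is the fitted probability.
    blocks = []
    for _, y in pairs:
        blocks.append([float(y), 1])
        # Merge backwards while the previous block's mean exceeds the newest one.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return [(score, p) for (score, _), p in zip(pairs, fitted)]
```

For example, scores `[0.1, 0.2, 0.3, 0.4]` with labels `[0, 1, 0, 1]` contain one violation (the 1 followed by a 0), so the two middle points are pooled to a probability of 0.5.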

6.
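A weight-optimization method is proposed that combines dynamic adjustment of the learning rate with an additional gradient-variation term and a momentum term. An absolute error function is also introduced to improve the BP algorithm for multilayer perceptrons, and the improved algorithm is applied to learning from sample cases of rotating-machinery fault diagnosis. Simulation results show that the improved BP algorithm significantly speeds up network training, that the learning process converges well, and that it correctly diagnoses the faults present, so the method has practical value.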

7.
Peer-to-peer systems have been widely used for sharing and exchanging data and resources among numerous computer nodes. Various data objects identifiable by high-dimensional feature vectors, such as text, images, and genome sequences, are starting to leverage P2P technology. Most existing work has focused on queries over data objects with one or a few attributes and is thus not applicable to high-dimensional data objects. In this study, we investigate the K nearest neighbors (KNN) query on high-dimensional data objects in P2P systems. We propose an efficient query algorithm and solutions that address various technical challenges raised by high dimensionality, such as search space resolution and incremental search space refinement. An extensive simulation using both synthetic and real data sets demonstrates that our proposal efficiently supports KNN queries on high-dimensional data in P2P systems.

8.
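Sun Lin, Qin Xiaoying, Xu Jiucheng, Xue Zhan'ao. Journal of Software (《软件学报》), 2022, 33(4): 1390-1411
Density peak clustering (DPC) is a simple and effective clustering method. In practice, however, DPC has difficulty selecting the correct cluster centers for data sets whose clusters differ greatly in density or contain multiple density peaks; in addition, the point-assignment method of DPC suffers from a domino effect. To address these problems, a density peak clustering algorithm based on K-nearest neighbors (KNN) and an optimized assignment strategy is proposed. First, candidate cluster centers are determined from KNN, the local density of points, and boundary points; a path distance is defined to reflect the similarity between candidate cluster centers, and, based on the path distance, a density factor and a distance factor are proposed to quantify how likely each candidate is to be a cluster center, from which the cluster centers are determined. Then, to improve the accuracy of point assignment, a similarity is constructed from shared neighbors, the nearest neighbor of higher density, density differences, and the distances between KNNs, and the concepts of neighborhood, similar set, and similar domain are introduced to assist the assignment; an initial clustering result is determined from similar domains and boundary points, and an intermediate clustering result is obtained from the cluster centers. Finally, based on the intermediate clustering result and the similar sets, each cluster is divided into multiple layers from the cluster center to the cluster boundary, and a point-assignment strategy is designed for each layer; for the points in a given layer, a positive value is proposed, based on the similar domain and the positive domain, to determine the order of assignment, and each point is assigned to the dominant cluster in its positive domain, yielding the final clustering result. Simulations were carried out on 11 synthetic data sets and 27 real data sets...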

9.
In this note, by considering the notion of a (weak) dual hyper K-ideal, we obtain some related results. We then determine the relationships between (weak) dual hyper K-ideals and (weak) hyper K-ideals. Finally, we give a characterization of hyper K-algebras of order 3 or 4 based on the (weak) dual hyper K-ideals.

10.
This paper discusses the application of a class of feed-forward Artificial Neural Networks (ANNs) known as Multi-Layer Perceptrons (MLPs) to two vision problems: recognition and pose estimation of 3D objects from a single 2D perspective view; and handwritten digit recognition. In both cases, a multi-MLP classification scheme is developed that combines the decisions of several classifiers. These classifiers operate on the same feature set for the 3D recognition problem, whereas different feature types are used for the handwritten digit recognition. The backpropagation learning rule is used to train the MLPs. Application of the MLP architecture to other vision problems is also briefly discussed.

11.
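Su Le, Chai Jinxiang, Xia Shihong. Journal of Software (《软件学报》), 2016, 27(S2): 172-183
A method based on local pose priors is proposed for real-time online capture of 3D human motion from depth images. The key idea is to automatically extract semantically meaningful virtual sparse 3D markers from the captured depth images, use them to quickly retrieve the K nearest pose neighbors from a pre-built heterogeneous 3D human pose database and construct a local pose prior model, and then reconstruct the 3D human pose sequence online in real time by iteratively solving for the maximum a posteriori estimate. Experimental results show that the method tracks and reconstructs stable, accurate 3D human motion pose sequences in real time and, requiring only an automatic calibration of individual body parameters, can track performers whose body sizes differ considerably; the frame rate is about 25 fps. The method can therefore be applied to 3D game and film production, human-computer interaction control, and similar fields.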

12.
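Lei Xiaofeng, Xie Kunqing, Lin Fan, Xia Zhengyi. Journal of Software (《软件学报》), 2008, 19(7): 1683-1692
The K-Means clustering algorithm is guaranteed to converge only to a local optimum, which makes the clustering result highly sensitive to the choice of initial representative points. Much research has aimed to reduce this sensitivity. However, the local optimality and result sensitivity of K-Means form the very basis of the K-MeanSCAN clustering algorithm. K-MeanSCAN samples the data set several times and pre-clusters each sample with K-Means to produce multiple different clustering results; sub-clusters from different clustering results necessarily intersect. The core idea of the algorithm is to use these intersections to build a weighted connectivity graph over the sub-clusters and to merge sub-clusters according to connectivity. Theory and experiments show that K-MeanSCAN can substantially improve both the quality of the clustering results and the efficiency of the algorithm.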

13.
On the strict logic foundation of fuzzy reasoning   Cited by: 2 (self-citations: 0, citations by others: 2)
This paper focuses on the logic foundation of fuzzy reasoning. First, a new complete first-order fuzzy predicate calculus system K* corresponding to the formal system L* is built. Based on the many-sort system Kms* corresponding to K*, the triple I methods of FMP and FMT for fuzzy reasoning and their consistency are formalized; thus fuzzy reasoning is put completely and rigorously into the logic framework of fuzzy logic. The author is indebted to the anonymous referee for useful comments which have helped to improve the paper.

14.
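A fast k-nearest-neighbor search algorithm for 3D scattered data   Cited by: 31 (self-citations: 0, citations by others: 31)
A new fast search algorithm is proposed. First, a space-partitioning strategy divides the data space into many cubic subspaces of equal size; the size of the cubes determines the speed of the nearest-point search. Then, taking into account the extent of the data set, the total number of points, and the number of nearest points k, a new method for estimating the cube edge length is given. Experiments on a large amount of real data show that the algorithm quickly yields a cube edge length that comes close to the best search speed.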
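
The cube-partitioning idea in this abstract can be sketched roughly as follows. This is an illustrative Python sketch, not the authors' algorithm: the abstract does not give the edge-length estimate, so the cube edge `cell` is simply a parameter here, and the names `build_grid` and `knn_grid` are hypothetical.

```python
import math
from collections import defaultdict


def build_grid(points, cell):
    """Bucket 3D point indices by the integer coordinates of their cube."""
    grid = defaultdict(list)
    for idx, p in enumerate(points):
        grid[tuple(math.floor(c / cell) for c in p)].append(idx)
    return grid


def knn_grid(points, grid, cell, query, k):
    """Return indices of the k points nearest to query, closest first."""
    cx, cy, cz = (math.floor(c / cell) for c in query)
    found = []  # (distance, index) for every point visited so far
    r = 0
    while True:
        # Visit only the cubes on the shell at Chebyshev radius r.
        for dx in range(-r, r + 1):
            for dy in range(-r, r + 1):
                for dz in range(-r, r + 1):
                    if max(abs(dx), abs(dy), abs(dz)) != r:
                        continue
                    for idx in grid.get((cx + dx, cy + dy, cz + dz), ()):
                        found.append((math.dist(points[idx], query), idx))
        found.sort()
        # Every unvisited point is at least r * cell away from the query,
        # so the k best are final once the k-th distance is within that bound.
        if len(found) >= k and found[k - 1][0] <= r * cell:
            return [idx for _, idx in found[:k]]
        if len(found) == len(points):  # fewer than k points, or all visited
            return [idx for _, idx in found[:k]]
        r += 1
```

A small `cell` means many empty cubes to scan, while a large `cell` means many distance computations per cube, which is why the abstract treats the edge length as the quantity that governs search speed.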

15.
For classifying multispectral satellite images, a multilayer perceptron (MLP) is trained using either (i) ground truth data or (ii) the output of a K-means clustering program or (iii) both, as applied to certain representative parts of the given data set. In the second case, different sets of clustered image outputs, which have been checked against actual ground truth data wherever available, are used for testing the MLP. The cover classes are, typically, different types of (a) vegetation (including forests and agriculture); (b) soil (including mountains, highways and rocky terrain); and (c) water bodies (including lakes). Since the extent of ground truth may not be sufficient for training neural networks, the proposed procedure (of using clustered output images) is believed to be novel and advantageous. Moreover, it is found that the MLP offers an accuracy of more than 99% when applied to the multispectral satellite images in our library. As importantly, comparison with some recent results shows that the proposed application of the MLP leads to a more accurate and faster classification of multispectral image data.

16.
In this paper we investigate multi-layer perceptron networks in the task domain of Boolean functions. We demystify the multi-layer perceptron network by showing that it just divides the input space into regions constrained by hyperplanes. We use this information to construct minimal training sets. Despite using minimal training sets, the learning time of multi-layer perceptron networks with backpropagation scales exponentially for complex Boolean functions. But modular neural networks which consist of independently trained subnetworks scale very well. We conjecture that the next generation of neural networks will be genetic neural networks which evolve their structure. We confirm Minsky and Papert: “The future of neural networks is tied not to the search for some single, universal scheme to solve all problems at once, but to the evolution of a many-faceted technology of network design.”

17.
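Incomplete spatial data affect the results, and the reliability, of spatial decision-making, analysis, and reasoning. Traditional methods for detecting incomplete data rely only on statistical theory and do not consider the spatial characteristics of spatial data, so they cannot be used directly to detect incomplete spatial data. A neighborhood-based method for detecting incomplete spatial data, the NNBiSDD algorithm, is proposed; within the k-neighborhood of a spatial entity, NNBiSDD applies the "three standard deviations" rule to detect incomplete spatial data. Finally, a practical example verifies the effectiveness and reliability of the NNBiSDD algorithm.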
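
One plausible reading of a three-standard-deviations rule applied inside a k-neighborhood is sketched below in Python. This is not the NNBiSDD algorithm, whose details the abstract does not give; the `(x, y, value)` layout, the Euclidean neighborhood, and the name `flag_suspects` are all assumptions for illustration.

```python
import math
import statistics


def flag_suspects(entities, k):
    """Flag entities whose value is more than 3 sigma from their k-neighborhood mean.

    entities: list of (x, y, value) tuples (an assumed layout).
    Returns the indices of the flagged entities.
    """
    flagged = []
    for i, (x, y, value) in enumerate(entities):
        # k spatially nearest other entities, by Euclidean distance.
        neighbors = sorted(
            (math.dist((x, y), (ox, oy)), ov)
            for j, (ox, oy, ov) in enumerate(entities)
            if j != i
        )[:k]
        values = [ov for _, ov in neighbors]
        mean = statistics.fmean(values)
        sigma = statistics.pstdev(values)
        # Compare against the local neighborhood, not the global distribution.
        if abs(value - mean) > 3 * sigma:
            flagged.append(i)
    return flagged
```

Computing the mean and deviation only over spatial neighbors is what distinguishes this from a purely statistical check: a value can be unremarkable globally yet inconsistent with its surroundings.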

18.
High-dimensional problems arising from robot motion planning, biology, data mining, and geographic information systems often require the computation of k nearest neighbor (knn) graphs. The knn graph of a data set is obtained by connecting each point to its k closest points. As the research in the above-mentioned fields progressively addresses problems of unprecedented complexity, the demand for computing knn graphs based on arbitrary distance metrics and large high-dimensional data sets increases, exceeding resources available to a single machine. In this work we efficiently distribute the computation of knn graphs for clusters of processors with message passing. Extensions to our distributed framework include the computation of graphs based on other proximity queries, such as approximate knn or range queries. Our experiments show nearly linear speedup with over 100 processors and indicate that similar speedup can be obtained with several hundred processors.
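
The definition of a knn graph given in this abstract can be made concrete with a brute-force, single-machine sketch in Python. It is only a baseline for illustration, not the distributed method described above; the function name `knn_graph` and the pluggable `dist` argument are assumptions.

```python
import heapq
import math


def knn_graph(points, k, dist=math.dist):
    """Connect each point to its k closest points under an arbitrary metric.

    Returns {i: [indices of the k nearest neighbors of points[i], closest first]}.
    """
    graph = {}
    for i, p in enumerate(points):
        # Distances from point i to every other point, paired with the index.
        candidates = ((dist(p, q), j) for j, q in enumerate(points) if j != i)
        graph[i] = [j for _, j in heapq.nsmallest(k, candidates)]
    return graph
```

Each row of the graph is computed independently of the others, which suggests why the problem lends itself to being split across processors; the brute-force version above costs a full pass over the data for every point.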

19.
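Item-based collaborative filtering recommendation algorithms are widely applied in e-commerce; their core is computing the similarity between items. Traditional item-similarity algorithms compute similarity only from the differences in the ratings given by users common to both items. When data are sparse, there are very few ratings from common users, and the performance of such algorithms degrades severely. To address this problem, the concept of controversy similarity is proposed from the perspective of the overall ratings of items: controversy similarity measures the similarity between items in terms of the difference in their rating variances. The controversy feature is fused into the traditional similarity algorithm based on common-user ratings, giving a collaborative filtering recommendation algorithm that incorporates item controversy features and mitigates the inaccurate similarity computation of traditional algorithms on sparse data. Experimental results show that the algorithm clearly improves recommendation quality in sparse-data settings.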

20.
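Improvement and implementation of the K-nearest neighbor algorithm   Cited by: 1 (self-citations: 0, citations by others: 1)
When the k-nearest neighbor algorithm is used for classification, accuracy decreases if the attribute set contains irrelevant or weakly relevant attributes. The k-nearest neighbor classifier is studied and its shortcomings are analyzed, and an algorithm is proposed that combines k-nearest neighbor classifiers built on random attribute subsets. Multiple k-nearest neighbor classifiers are built on random attribute subsets and their outputs are combined by simple voting, which effectively improves the accuracy of the k-nearest neighbor classifier.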
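
The combination scheme described in this abstract (several k-nearest neighbor classifiers, each restricted to a random attribute subset, combined by simple voting) might look roughly like the Python sketch below. The function names, the Euclidean distance, and the `seed` parameter are assumptions for illustration; the abstract does not specify how subsets are sized or drawn.

```python
import math
import random
from collections import Counter


def knn_predict(train, labels, x, k, attrs):
    """Classify x by majority vote of its k nearest neighbors, using only attrs."""
    def distance(a, b):
        return math.sqrt(sum((a[i] - b[i]) ** 2 for i in attrs))

    nearest = sorted(range(len(train)), key=lambda j: distance(train[j], x))[:k]
    return Counter(labels[j] for j in nearest).most_common(1)[0][0]


def ensemble_predict(train, labels, x, k, n_members, subset_size, seed=0):
    """Combine n_members k-NN classifiers, each on a random attribute subset."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    votes = Counter()
    for _ in range(n_members):
        attrs = rng.sample(range(len(x)), subset_size)
        votes[knn_predict(train, labels, x, k, attrs)] += 1
    # Simple (unweighted) voting over the member classifiers.
    return votes.most_common(1)[0][0]
```

Because each member sees only part of the attribute set, an irrelevant attribute can distort some members but not all of them, and the vote dilutes its effect.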
