Similar literature (20 results)
1.
The detection of spatial outliers helps extract important and valuable information from large spatial datasets. Most of the existing work in outlier detection views the condition of being an outlier as a binary property. However, for many scenarios, it is more meaningful to assign a degree of being an outlier to each object. The temporal dimension should also be taken into consideration. In this paper, we formally introduce a new notion of spatial outliers. We discuss the spatiotemporal outlier detection problem, and we design a methodology to discover these outliers effectively. We introduce a new index called the fuzzy outlier index, FoI, which expresses the degree to which a spatial object belongs to a spatiotemporal neighbourhood. The proposed outlier detection method can be applied to phenomena evolving over time, such as moving objects, pedestrian modelling or credit card fraud.

2.
We are concerned with the issue of detecting outliers and change points from time series. In the area of data mining, there has been increased interest in these issues since outlier detection is related to fraud detection, rare event discovery, etc., while change-point detection is related to event/trend change detection, activity monitoring, etc. Although, in most previous work, outlier detection and change point detection have not been related explicitly, this paper presents a unifying framework for dealing with both of them. In this framework, a probabilistic model of time series is incrementally learned using an online discounting learning algorithm, which can track a drifting data source adaptively by forgetting out-of-date statistics gradually. A score for any given data point is calculated in terms of its deviation from the learned model, with a higher score indicating a higher possibility of being an outlier. By taking an average of the scores over a window of a fixed length and sliding the window, we may obtain a new time series consisting of moving-averaged scores. Change point detection is then reduced to the issue of detecting outliers in that time series. We compare the performance of our framework with those of conventional methods to demonstrate its validity through simulation and experimental applications to incident detection in network security.
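The scoring pipeline described in this abstract can be sketched in a few lines. This is only an illustration under stated assumptions, not the paper's model: a single Gaussian stands in for the probabilistic time-series model, and the names `discounted_scores`, `moving_average`, the discounting rate `r` and the variance floor `var_floor` are all hypothetical.

```python
import math


def discounted_scores(series, r=0.1, var_floor=1e-6):
    """Score each point by its negative log-likelihood under a Gaussian
    whose mean and variance are learned online with discounting rate r.
    (Assumption: a plain Gaussian replaces the paper's time-series model.)"""
    mean, var = series[0], 1.0
    scores = []
    for x in series:
        # Score first, using only the model learned from earlier points.
        v = max(var, var_floor)
        scores.append(0.5 * math.log(2 * math.pi * v) + (x - mean) ** 2 / (2 * v))
        # Discounted update: old statistics are weighted by (1 - r),
        # so out-of-date information is forgotten gradually.
        mean = (1 - r) * mean + r * x
        var = (1 - r) * var + r * (x - mean) ** 2
    return scores


def moving_average(values, window):
    """Average the scores over a sliding window of fixed length."""
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


if __name__ == "__main__":
    data = [0.0] * 30 + [10.0] + [0.0] * 30
    s = discounted_scores(data)
    print("most anomalous index:", s.index(max(s)))
    # The smoothed series can itself be scored again to look for change points.
    print("smoothed length:", len(moving_average(s, 5)))
```

Feeding the moving-averaged scores back into `discounted_scores` mirrors the reduction of change-point detection to outlier detection on the smoothed series.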

3.
Data envelopment analysis (DEA) uses extreme observations to identify superior performance, making it vulnerable to outliers. This paper develops a unified model to identify both efficient and inefficient outliers in DEA. Finding both types is important since many post analyses, after measuring efficiency, depend on the entire distribution of efficiency estimates. Thus, outliers that are distinguished by poor performance can significantly alter the results. Besides allowing the identification of outliers, the method described is consistent with a relaxed set of DEA axioms. Several examples demonstrate the need for identifying both efficient and inefficient outliers and the effectiveness of the proposed method. Applications of the model reveal that observations with low efficiency estimates are not necessarily outliers. In addition, a strategy to accelerate the computation is proposed that can apply to influential observation detection.

4.
Robust TSK fuzzy modeling for function approximation with outliers   Total citations: 3 (self-citations: 0; citations by others: 3)
The Takagi-Sugeno-Kang (TSK) type of fuzzy models has attracted great attention from the fuzzy modeling community due to its good performance in various applications. Most approaches for modeling TSK fuzzy rules define their fuzzy subspaces based on the idea of training data being close enough instead of having similar functions. Besides, training data sets often contain outliers, which seriously affect least-square error minimization clustering and learning algorithms. A robust TSK fuzzy modeling approach is presented. In the approach, a clustering algorithm termed robust fuzzy regression agglomeration (RFRA) is proposed to define fuzzy subspaces in a fuzzy regression manner with robust capability against outliers. To obtain a more precise model, a robust fine-tuning algorithm is then employed. Various examples are used to verify the effectiveness of the proposed approach. In the simulation results, the proposed robust TSK fuzzy modeling showed superior performance over other approaches.

5.
This study proposes a hybrid robust approach for constructing Takagi–Sugeno–Kang (TSK) fuzzy models with outliers. The approach consists of a robust fuzzy C-regression model (RFCRM) clustering algorithm in the coarse-tuning phase and an annealing robust back-propagation (ARBP) learning algorithm in the fine-tuning phase. The RFCRM clustering algorithm is modified from the fuzzy C-regression models (FCRM) clustering algorithm by incorporating a robust mechanism and considering input data distribution and robust similarity measure into the FCRM clustering algorithm. Due to the use of robust mechanisms and the consideration of input data distribution, the fuzzy subspaces and the parameters of functions in the consequent parts are simultaneously identified by the proposed RFCRM clustering algorithm and the obtained model will not be significantly affected by outliers. Furthermore, the robust similarity measure is used in the clustering process to reduce the redundant clusters. Consequently, the RFCRM clustering algorithm can generate a better initialization for the TSK fuzzy models in the coarse-tuning phase. Then, an ARBP algorithm is employed to obtain a more precise model in the fine-tuning phase. From our simulation results, it is clearly evident that the proposed robust TSK fuzzy model approach is superior to existing approaches in learning speed and in approximation accuracy.

6.
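To address the problem that the number of clusters in the fuzzy C-means (FCM) algorithm must be set in advance, a new fuzzy clustering validity index is proposed. First, the variance of each attribute within a cluster is computed; attributes with smaller variance are assigned larger weights and attributes with larger variance are assigned smaller weights, which yields an attribute-weighted FCM algorithm. Then, intra-cluster compactness and inter-cluster separation are computed from the membership matrix produced by the improved FCM algorithm. Finally, a new clustering validity index is defined from the intra-cluster compactness and inter-cluster separation. Experimental results show that the index can find the number of clusters that matches the natural distribution of the data. The attribute-weighted FCM algorithm can identify the relative importance of different attributes and improve the accuracy of the clustering results, and the validity index defined from the membership matrix of the improved FCM algorithm can find the correct number of clusters, making the clustering process unsupervised.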
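The attribute-weighting step can be illustrated with a small sketch. The abstract does not give the weighting formula, so the normalised inverse-variance rule below is an assumption, and `attribute_weights` and `eps` are hypothetical names chosen for illustration.

```python
def attribute_weights(cluster, eps=1e-12):
    """Return one weight per attribute for the points of a single cluster.
    Assumption: weights are inverse variances normalised to sum to 1, so
    low-variance attributes get large weights and high-variance ones small."""
    n = len(cluster)
    variances = []
    for column in zip(*cluster):  # iterate attribute by attribute
        mean = sum(column) / n
        variances.append(sum((v - mean) ** 2 for v in column) / n)
    inverse = [1.0 / (v + eps) for v in variances]  # eps guards zero variance
    total = sum(inverse)
    return [w / total for w in inverse]


if __name__ == "__main__":
    # First attribute is tight, second attribute is widely spread.
    points = [(1.0, 0.0), (1.1, 5.0), (0.9, 10.0)]
    print(attribute_weights(points))
```

Such weights would typically enter the distance used by FCM, so that tightly grouped attributes dominate the assignment of memberships.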

7.
A cluster validity index for fuzzy clustering   Total citations: 1 (self-citations: 0; citations by others: 1)
A new cluster validity index is proposed for the validation of partitions of object data produced by the fuzzy c-means algorithm. The proposed validity index uses a variation measure and a separation measure between two fuzzy clusters. A good fuzzy partition is expected to have a low degree of variation and a large separation distance. Testing of the proposed index and nine previously formulated indices on well-known data sets shows the superior effectiveness and reliability of the proposed index in comparison to other indices and the robustness of the proposed index in noisy environments.

8.
9.
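To address the sensitivity of the traditional support vector machine to noise and outliers, a fuzzy technique is used to remove noise and outliers from the samples. The fuzzy membership of each sample is determined from the affinity among samples, and a threshold is determined through training, so that the noise and outliers that interfere with finding the optimal separating hyperplane are removed. Experimental results show that, compared with the traditional support vector machine, the method improves the noise robustness of the support vector machine, and that, without loss of accuracy, the one-class classification method under linear programming saves considerable time compared with quadratic programming.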

10.
11.
12.
Compared with optical satellite images, synthetic aperture radar (SAR) images are less influenced by weather conditions such as cloud and haze. With the support of SAR image time series, a framework of change detection based on spatiotemporal fuzzy clustering is presented. This framework mainly consists of three components: (1) pixel-level SAR image time-series modelling, based on scale invariant feature transform (SIFT); (2) probability analysis of change nodes, in which an iterative binary partition-mean square error model of the series is calculated to ascertain change nodes; (3) spatiotemporal fuzzy clustering, which is used to determine the types of change. To validate the method, 26 SAR images of the study area between 2004 and 2010 are utilized to monitor annual changes of cultivated land to construction land, and comparative experiments are conducted to evaluate the detection accuracy. Experimental results showed that the proposed framework could effectively extract the change nodes and change pixels, with correctness of 84.52% and completeness of 82.64%, outperforming the traditional fuzzy clustering method, as well as traditional classification methods.

13.
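In the classical fuzzy C-means (FCM) algorithm, the number of clusters must be given in advance, otherwise the algorithm cannot run, which limits the range of application of FCM to some extent. To address this problem, a new fuzzy clustering validity index is proposed. First, the membership matrix is obtained by running the FCM algorithm; then, intra-cluster compactness and inter-cluster overlap are computed from the membership matrix; finally, a new clustering validity index is defined from the intra-cluster compactness and inter-cluster overlap. The index overcomes the drawback that the number of clusters in FCM must be preset, and it can be used to find the number of clusters that best matches the natural distribution of the data. Tests on artificial and real data sets show that the optimal number of clusters is found for three commonly used values of the fuzzy factor: 1.8, 2.0 and 2.2.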

14.
There is great interest in designing research metrics and indices to measure research impact in research institutes. Unfortunately, most of those indices ignore critical design issues, e.g. the disparity between domains, the impact of journals or conferences in which papers are published, normalising the range of the index values to certain intervals, and the scalability of using the index to rank different research entities. In this paper, a new normalised fuzzy index (NFindex) is proposed as a fuzzy-based research impact metric. The proposed index is a scalable index whose values are normalised to percentage levels. NFindex achieves both inter-discipline normalisation and intra-discipline consistency. The capability of NFindex to achieve inter-discipline normalisation enables fair comparison between different research domains regardless of their nature in terms of influence and contribution to other research areas, e.g. natural science. Therefore, NFindex gives a universal normalised single-number metric that can be used by research institutes to solve the problem of inter-discipline scholar ranking. Moreover, it can help universal ranking of universities and research institutes according to their research capabilities and impacts. The obtained results, on diverse research areas, demonstrate the potential of NFindex in terms of both intra-discipline consistency and inter-discipline normalisation.

15.
16.
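An improved fuzzy clustering validity index   Total citations: 1 (self-citations: 0; citations by others: 1)
A clustering validity index can be used both to evaluate the validity of clustering results and to determine the optimal number of clusters. Based on the basic properties of fuzzy clustering, a new fuzzy clustering validity index is proposed. The index combines two important factors, the distribution characteristics of the data set and the membership degrees of the data, to evaluate clustering results, which improves the accuracy of the judgement. Experiments show that the index correctly evaluates fuzzy clustering results and automatically obtains the optimal number of clusters; in particular, it makes accurate judgements when clusters overlap.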

17.
In this article, a cluster validity index and its fuzzification are described, which can provide a measure of goodness of clustering on different partitions of a data set. The maximum value of this index, called the PBM-index, across the hierarchy provides the best partitioning. The index is defined as a product of three factors, maximization of which ensures the formation of a small number of compact clusters with large separation between at least two clusters. We have used both the k-means and the expectation maximization algorithms as underlying crisp clustering techniques. For fuzzy clustering, we have utilized the well-known fuzzy c-means algorithm. Results demonstrating the superiority of the PBM-index in appropriately determining the number of clusters, as compared to three other well-known measures, the Davies-Bouldin index, Dunn's index and the Xie-Beni index, are provided for several artificial and real-life data sets.

18.
We introduce a form of spatiotemporal reasoning that uses homogeneous representations of time and the three dimensions of space. The basis of our approach is Allen's temporal logic on the one hand and general constraint satisfaction algorithms on the other, where we present a new view of constraint reasoning to cope with the affordances of spatiotemporal reasoning as introduced here. As a realization for constraint reasoning, we suggest a massively parallel implementation in the form of Boltzmann machines.

19.
Corner matching in image sequences is an important and difficult problem that serves as a building block of several important applications such as stereo vision. Normally, area-based corner matching techniques use linear measures such as the standard cross correlation coefficient, the zero-mean (normalized) cross correlation coefficient, the sum of absolute differences and the sum of squared differences. Fuzzy logic is a powerful tool to solve many image processing problems because of its ability to deal with ambiguous data. In this paper, we use a similarity measure based on fuzzy correlations in order to establish the corner correspondence between sequence images in the presence of intensity variations and motion blur. The matching approach proposed here needs only to extract one set of corner points as candidates from the left image (first frame), whose positions in the right image (second frame) are determined by matching, not by extracting. Experiments conducted on various image sequences demonstrate the superiority of our algorithm over standard and zero-mean cross correlation as well as one contemporary work using mutual information as a window similarity measure combined with graph matching techniques under non-ideal conditions.
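For reference, the conventional linear window measures named in this abstract can be written down directly. This sketch covers only those baselines, not the fuzzy correlation measure the paper proposes; windows are assumed to be flattened into equal-length lists of intensities, and the function names are illustrative.

```python
import math


def sad(a, b):
    """Sum of absolute differences between two windows (lower is better)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def ssd(a, b):
    """Sum of squared differences between two windows (lower is better)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def zncc(a, b):
    """Zero-mean normalized cross correlation in [-1, 1] (higher is better).
    Returns 0.0 when either window has no intensity variation."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    return sum(x * y for x, y in zip(da, db)) / denom if denom else 0.0


if __name__ == "__main__":
    window = [10, 20, 30, 40]
    brighter = [2 * v + 5 for v in window]  # gain and offset change
    print(zncc(window, brighter), ssd(window, brighter))
```

The example shows why the zero-mean normalized form is the usual baseline under intensity variations: a gain and offset change leaves it at 1 while SAD and SSD grow.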

20.
In the context of resistant learning, outliers are the observations far away from the fitting function that is deduced from a subset of the given observations and whose form is adaptable during the process. This study presents a resistant learning procedure for coping with outliers via a single-hidden-layer feed-forward neural network (SLFN). The smallest trimmed sum of squared residuals principle is adopted as the guidance of the proposed procedure, and the key mechanisms are: an analysis mechanism that excludes any potential outliers at early stages of the process, a modeling mechanism that deduces enough hidden nodes for fitting the reference observations, an estimating mechanism that tunes the associated weights of the SLFN, and a deletion diagnostics mechanism that checks whether the resulting SLFN is stable. The lake data set is used to demonstrate the resistant-learning performance of the proposed procedure.
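The trimmed sum of squared residuals that guides this procedure is simple to state in code: only the h smallest squared residuals are summed, so the worst-fitting observations do not influence the objective. The function name `trimmed_ssr` and the parameter `h` are illustrative assumptions, and the network training itself is not shown.

```python
def trimmed_ssr(y_true, y_pred, h):
    """Sum of the h smallest squared residuals.
    With h == len(y_true) this is the ordinary sum of squared residuals."""
    squared = sorted((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return sum(squared[:h])


if __name__ == "__main__":
    targets = [1, 2, 3, 100]       # the last observation is an outlier
    predictions = [1, 2, 3, 4]
    print(trimmed_ssr(targets, predictions, 3))  # outlier trimmed away
    print(trimmed_ssr(targets, predictions, 4))  # outlier dominates
```

Minimising this quantity over the model parameters, for a suitable h, is what makes the fit resistant to the trimmed observations.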
