Found 20 similar documents (search time: 15 ms)
1.
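Outlier detection, an important research direction in data mining, finds the small number of data objects that differ markedly from the majority in a large data set. High-dimensional settings are an important scenario for outlier detection, and the many irrelevant or noisy features in real-world high-dimensional data pose a major challenge to high-dimensional outlier detection methods based on subspace/feature selection. Pang et al. proposed CINFO, a scheme that couples outlier scoring with feature selection and is more accurate than traditional high-dimensional outlier detection algorithms. CINFO leaves room for improvement in efficiency. This paper improves CINFO by introducing the Extended Isolation Forest (EIF) algorithm, which markedly improves efficiency with almost no loss of accuracy.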
2.
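A new anomaly factor is defined that classifies data into three states: normal, abnormal, and critical. On this basis, an outlier detection model with a dynamic threshold is built. An update method for the dynamic threshold is given, based on a modified Markov assumption. The algorithm performs online, real-time outlier detection without requiring a training set. Simulation experiments show that it keeps a low false-alarm rate while maintaining high detection accuracy.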
3.
Outlier detection techniques play an important role in enhancing the reliability of data communication in wireless sensor networks (WSNs). Many such techniques have been proposed, but most still have limitations: (a) a high false-positive rate, (b) high time complexity, and (c) failure to detect outliers online. Moreover, these approaches mainly focus on either temporal outliers or spatial outliers. This paper therefore introduces novel algorithms that detect both temporal and spatial outliers. Our contributions are twofold: (i) modifying the Hampel Identifier (HI) algorithm to achieve high identification accuracy in temporal outlier detection, and (ii) combining the Gaussian process (GP) model with a graph-based outlier detection technique to improve performance in spatial outlier detection. The results demonstrate that our techniques outperform the state-of-the-art methods in terms of accuracy and work well with various data types.
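For readers unfamiliar with the Hampel Identifier mentioned in this abstract, the following is a minimal sketch of the classic (unmodified) version, not the modified algorithm the paper proposes. The function name, the window size, and the 3-sigma cutoff are illustrative assumptions.

```python
from statistics import median

def hampel_outliers(values, half_window=3, n_sigmas=3.0):
    """Return indices flagged by the classic Hampel identifier (illustrative sketch)."""
    k = 1.4826  # scales the MAD to a standard deviation under a Gaussian assumption
    flagged = []
    for i, x in enumerate(values):
        # Sliding window centred on i, truncated at the series boundaries
        lo = max(0, i - half_window)
        hi = min(len(values), i + half_window + 1)
        window = values[lo:hi]
        med = median(window)
        mad = median([abs(v - med) for v in window])
        # Flag the point if it lies too far from the local median
        if abs(x - med) > n_sigmas * k * mad:
            flagged.append(i)
    return flagged

readings = [1.0, 1.1, 0.9, 1.0, 10.0, 1.05, 0.95, 1.0, 1.1]
print(hampel_outliers(readings))
```

Because the test uses the median and the median absolute deviation rather than the mean and standard deviation, a single spike barely shifts the local statistics it is compared against. When the MAD of a window is zero, any point that differs from the window median is flagged.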
6.
Sourabh Bharti, Kiran K. Pattanaik. International Journal of Communication Systems, 2016, 29(13): 2015-2027
Accuracy of sensed data and reliable delivery are key concerns, in addition to several other network-related issues, in wireless sensor networks (WSNs). Early detection of outliers reduces subsequent unwanted transmissions, thus preserving network resources. Recent techniques for outlier detection in WSNs are computationally expensive and based on message exchange. Message-exchange-based techniques incur communication overhead and are less preferred in WSNs. On the other hand, machine-learning-based outlier detection techniques are computationally expensive for resource-constrained sensor nodes. The novelty of this paper is that it proposes a simple, in-network, real-time outlier detection algorithm that requires no message exchange and is based on Newton's law of gravity. The mechanism is evaluated for its accuracy in detecting outliers, its computational cost, and its influence on network traffic and delay. The outlier detection mechanism achieved almost 100% detection accuracy. Because the mechanism involves no message exchanges, there is a significant reduction in network traffic, energy consumption, and end-to-end delay. An extension of the proposed algorithm for transient data sets is proposed, and analytic evaluation shows that the mechanism is reactive to time series data. Copyright © 2016 John Wiley & Sons, Ltd.
9.
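To address the difficulty of outlier detection on data sets with complex distributions and diverse outlier types, a reverse k-nearest-neighbor tree outlier detection method based on relative distance, RKNMOD (Reversed K-Nearest Neighborhood), is proposed. First, the classical Euclidean distance, the local density of an object, and the object's neighborhood are combined to define a relative distance between objects, which allows both global and local outliers to be detected effectively. Second, building on a minimum spanning tree structure, maximum-edge cutting is used to separate outliers and outlier clusters quickly. Finally, experiments on synthetic data sets and UCI data sets show that the new algorithm achieves higher detection accuracy, offering an effective new approach to outlier detection on data sets with abnormal distributions and diverse outlier types.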
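The minimum-spanning-tree step in this abstract can be illustrated with a small sketch. It is not RKNMOD itself: it assumes plain Euclidean distance instead of the paper's relative distance, cuts only the single longest edge, and uses hypothetical function names.

```python
import math

def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph; returns (weight, u, v) edges."""
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known connection cost of each node to the tree
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((best[u], parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    return edges

def cut_longest_edge(points):
    """Remove the longest MST edge and return the smaller resulting component."""
    edges = sorted(mst_edges(points))
    adjacency = {i: [] for i in range(len(points))}
    for _, u, v in edges[:-1]:          # keep every edge except the longest
        adjacency[u].append(v)
        adjacency[v].append(u)
    start = edges[-1][1]                # one endpoint of the removed edge
    seen, stack = {start}, [start]
    while stack:                        # depth-first walk of that endpoint's component
        a = stack.pop()
        for b in adjacency[a]:
            if b not in seen:
                seen.add(b)
                stack.append(b)
    other = set(range(len(points))) - seen
    return sorted(min(seen, other, key=len))

points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
print(cut_longest_edge(points))
```

Cutting the longest tree edge separates the point or small cluster that is farthest from the bulk of the data, which is why the smaller component is treated here as the outlier candidate set.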
11.
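Outlier detection finds data that are inconsistent with normal data. Noisy data can arise for various reasons. Based on the characteristics of noisy data, an effective outlier detection algorithm is proposed. The data set is clustered with a hierarchical k-means algorithm, and detection starts from the cluster most likely to contain outliers. During detection, an entropy-based distance is used to measure how outlying a data point is, and pruning rules reduce the number of checks, which improves detection efficiency. Simulation results show that the algorithm filters noisy data well.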
12.
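Video global motion estimation based on multi-level analysis of motion vectors (total citations: 3; self-citations: 0; citations by others: 3)
Global motion estimation for video based on the motion vector field has lower computational complexity than pixel-based estimation and is therefore widely used in video segmentation, video compression, and related fields. However, outlier regions such as noise and foreground objects reduce the accuracy of global motion estimation. To improve this accuracy, the paper proposes a global motion estimation algorithm based on multi-level analysis of motion vectors. The algorithm adaptively filters out foreground object regions according to the difference in motion characteristics between local and global motion, detects smooth-texture and periodic regions using a similarity measure between neighboring vectors, and finally filters out isolated noise regions. The global motion parameters are then solved from the inlier regions that remain after filtering. Experimental results show that the method effectively filters out outlier regions and improves the accuracy of global motion estimation.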
13.
To address the difficulty of accurate online detection in massive data collected in real time under strong noise during a control process, an order self-learning Autoregressive Hidden Markov Model (ARHMM) algorithm is proposed for online outlier detection in industrial process control. The algorithm uses an AR model to fit the time series and an HMM as the basic detection tool, which avoids the need to preset a threshold as in traditional detection methods. To update the ARHMM parameters online, the structure of the traditional Brockwell–Dahlhaus–Trindade (BDT) algorithm is changed to a double-iterative structure that iterates over both time and order. To reduce the influence of outliers on the ARHMM parameter update, detection-before-update and detection-based-update strategies are adopted, which also improve the robustness of the algorithm. Simulations on model data and a practical application verify the accuracy, robustness, and online detection capability of the algorithm. The results indicate that the proposed algorithm is better suited to outlier detection in control process data in the process industry.
14.
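Detecting anomalous nodes in wireless sensor networks is a key step in ensuring the accuracy and reliability of network data. Based on graph signal processing theory, this paper proposes a new algorithm for detecting and localizing anomalous nodes in wireless sensor networks. The algorithm first builds a graph signal model of the network and then detects and localizes anomalous nodes through joint vertex-domain and graph-frequency-domain analysis. Specifically, step 1 uses a high-pass graph filter to extract the high-frequency components of the network signal. Step 2 partitions the network into multiple subgraphs and then selects specific frequency components of the subgraph output signals. Step 3 applies a threshold test to the selected subgraph signals to locate the center nodes of suspected anomalous subgraphs. Finally, anomalous nodes are detected and localized by comparing the node set of each subgraph with the set of suspected anomalous nodes. Simulations show that, compared with existing anomaly detection methods for wireless sensor networks, the new algorithm achieves both a higher anomaly detection probability and a higher anomalous-node localization rate.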
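Step 1 of this abstract, high-pass graph filtering, can be sketched as follows. This is only an illustration under stated assumptions: it uses the combinatorial Laplacian L = D - A as the simplest high-pass graph filter and a hypothetical fixed threshold, and it omits the paper's subgraph partitioning and frequency selection. The function names are invented for the example.

```python
def laplacian_highpass(adjacency, signal):
    """Apply y = (D - A) x: for each node, the weighted sum of differences to its neighbours."""
    output = []
    for i, row in enumerate(adjacency):
        output.append(sum(w * (signal[i] - signal[j]) for j, w in enumerate(row)))
    return output

def flag_nodes(adjacency, signal, threshold):
    """Return nodes whose high-pass response exceeds the (hypothetical) threshold."""
    response = laplacian_highpass(adjacency, signal)
    return [i for i, y in enumerate(response) if abs(y) > threshold]

# Ring of six sensors; node 3 reports a value far from its neighbours.
n = 6
ring = [[1 if abs(i - j) in (1, n - 1) else 0 for j in range(n)] for i in range(n)]
readings = [1, 1, 1, 5, 1, 1]

print(laplacian_highpass(ring, readings))
print(flag_nodes(ring, readings, threshold=6))
```

A signal that varies smoothly over the graph produces a response near zero, while a node that disagrees with its neighbours produces a large response at itself and a smaller one at each neighbour, so the threshold has to be set high enough to exclude the neighbours.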
15.
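To address the problem that device differences cause signal deviations that degrade positioning accuracy, an indoor positioning algorithm combining a BP neural network with the weighted centroid localization algorithm is proposed. An outlier detection algorithm cleans the RSSI data from different mobile phones, and the cleaned data serve as the data source for training the BP neural network, yielding a stable nonlinear BP model. On this basis, indoor positioning is performed in combination with an improved indoor positioning algorithm. Experimental results show that the mean, minimum, and maximum errors of the proposed algorithm are 0.58 m, 0.24 m, and 1.06 m, respectively, and that its positioning accuracy is markedly higher than that of existing comparable algorithms.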
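The weighted centroid localization named in this abstract can be sketched in its textbook form. This is not the paper's improved algorithm and does not include the BP neural network. The log-distance path loss parameters (`rssi_at_1m`, `path_loss_exp`) and the inverse-distance weighting are assumptions chosen for illustration.

```python
def rssi_to_distance(rssi, rssi_at_1m=-40.0, path_loss_exp=2.0):
    """Log-distance path loss model; both default parameters are illustrative assumptions."""
    return 10 ** ((rssi_at_1m - rssi) / (10.0 * path_loss_exp))

def weighted_centroid(anchors, rssi_values, power=1.0):
    """Estimate (x, y) as the average of anchor positions weighted by 1 / d**power."""
    weights = [1.0 / rssi_to_distance(r) ** power for r in rssi_values]
    total = sum(weights)
    x = sum(w * ax for w, (ax, _) in zip(weights, anchors)) / total
    y = sum(w * ay for w, (_, ay) in zip(weights, anchors)) / total
    return x, y

anchors = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
print(weighted_centroid(anchors, [-50.0, -50.0, -50.0, -50.0]))  # equal signals: near the centre
print(weighted_centroid(anchors, [-45.0, -55.0, -60.0, -55.0]))  # pulled toward (0, 0)
```

Stronger RSSI implies a shorter estimated distance and therefore a larger weight, so the estimate moves toward the anchors that are heard best. This is also why outlying RSSI readings distort the result and are worth cleaning first.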
16.
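To address the problem that outliers in passive localization and tracking reduce the reliability and stability of filtering, a likelihood-based outlier detection and rejection method is proposed, built on the concept of innovation likelihood. The likelihood obtained in the Kalman filter update is computed and compared against a set threshold in order to detect and reject outliers. Simulation results show that the algorithm effectively handles the effect of outliers on localization and tracking accuracy, considerably improving target localization and tracking accuracy.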
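The idea of gating measurements by innovation likelihood can be sketched with a scalar Kalman filter. This is a minimal illustration, not the paper's passive-tracking filter: it assumes a one-dimensional random-walk state, and the noise variances, initial values, and likelihood threshold are all hypothetical.

```python
import math

def kalman_with_gating(measurements, q=0.01, r=0.1, x0=0.0, p0=1.0, min_likelihood=1e-3):
    """Scalar Kalman filter that skips updates whose innovation likelihood is too low."""
    x, p = x0, p0
    estimates, rejected = [], []
    for k, z in enumerate(measurements):
        p_pred = p + q          # predict: random-walk model, state estimate unchanged
        nu = z - x              # innovation (measurement minus prediction)
        s = p_pred + r          # innovation variance
        likelihood = math.exp(-0.5 * nu * nu / s) / math.sqrt(2.0 * math.pi * s)
        if likelihood < min_likelihood:
            rejected.append(k)  # treat as an outlier: keep the prediction, skip the update
            p = p_pred
        else:
            gain = p_pred / s
            x = x + gain * nu
            p = (1.0 - gain) * p_pred
        estimates.append(x)
    return estimates, rejected

estimates, rejected = kalman_with_gating([1.0, 1.1, 0.9, 1.0, 25.0, 1.05])
print(rejected)
```

The innovation variance `s` widens when the filter is uncertain, so the same absolute deviation is more likely to be accepted early in the run than after the estimate has converged.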
18.
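Implementation of a density-based outlier detection algorithm in high-dimensional space (total citations: 4; self-citations: 0; citations by others: 4)
Outliers are data in a data warehouse that behave abnormally. The properties of outliers in high-dimensional space are studied, and a strategy of projecting high-dimensional data onto a lower-dimensional space before detection is adopted. This addresses the problem that data sparsity in high-dimensional space makes the distance between data points unreliable for judging outliers. The implementation selects dimensions that are closely correlated with one another, defines the distance between data points in terms of nearest neighbors, and uses a density-based outlier detection method, so that outliers can be detected more effectively within local spaces.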
19.
Outlier detection is one of the prominent research domains in the field of data mining and big data analytics. Nowadays, most of the data in healthcare centers are remotely monitored and generated by different wireless sensors. The core objective of outlier detection in this domain is to recognize both truly physiologically anomalous data and anomalies caused by faulty sensors. In a real healthcare monitoring scenario, the various sensors are related to each other, so when detecting outliers in wireless body sensor networks (WBSNs), correlation among different sensor nodes is a major concern. Most existing outlier detection techniques assume that the sensors are linearly correlated, which may not always be the case in real-life applications. The traditional techniques for outlier detection are also not scalable to big data. To address these issues, this paper proposes an approach for outlier detection that is scalable to big data and also handles nonlinearly correlated attributes efficiently. The proposed approach is implemented on the Hadoop MapReduce framework for rapid processing of big data. The approach is evaluated on a simulated WBSN dataset taken from the Physionet library. The results are compared with various existing outlier detection approaches and demonstrate that the proposed approach is more effective at accurately spotting physiological outliers and sensor anomalies.
20.
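To remove data redundancy in hyperspectral imagery and improve the accuracy and efficiency of hyperspectral image processing, a band selection algorithm based on the band index is proposed. The hyperspectral image data are denoised with the wavelet transform, the bands are grouped according to a joint skewness-kurtosis index, and the bands with relatively small band index values are identified and removed as redundant, yielding a minimal band set. The results show that the endmembers selected with this band set agree with those selected from all bands. Redundant bands are removed to the greatest extent without affecting endmember extraction, and the classification accuracy of this band set is close to that of the full band set. The algorithm is feasible and effective for band selection and offers a useful aid for reducing the dimensionality of hyperspectral imagery.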