Similar literature
 Found 20 similar documents; search took 390 ms
1.
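Zhang Yue (张悦), Liu Jie (刘杰), Li Hang (李航). Computer Engineering (《计算机工程》), 2013, 39(3): 46-50, 55
Most existing outlier detection methods require the number of outliers to be set in advance, and an inaccurate setting reduces detection accuracy. To address this problem, a probability-based outlier detection method is proposed. It combines the density-based DBSCAN algorithm with a median-based variance computation to cluster the data set under test, extracts the suspected outliers that belong to no cluster, and analyzes them to determine the final outliers. The data examined by the method are linearly independent of the time factor, neither the number of outliers nor the number of clusters has to be set in advance, and the method is robust to noisy data. Experimental results on the IRIS test data set show that the method identifies outliers effectively.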

2.
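To address the high sensitivity of support vector data description (SVDD) to its penalty parameter, a novel anomaly detection method is proposed, called distribution-entropy-penalized support vector data description (DEP-SVDD). First, the normal samples are taken as the global distribution of the data, and a distance measure between each sample point and the center of the normal-sample distribution is defined in the Gaussian kernel space. Then, based on this distance, the probability that a sample point is normal or anomalous is estimated. Finally, this probability is used to construct a distribution-entropy-based penalty that is applied to the corresponding sample. On 9 real data sets, the proposed method was compared with SVDD, density-weighted SVDD (DW-SVDD), position-regularized SVDD (P-SVDD), K-nearest neighbors (KNN), and isolation forest (iForest). DEP-SVDD achieved the highest classification accuracy on 6 of the data sets, indicating that it performs better than several other anomaly detection methods.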
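The first step of the abstract above, measuring how far a sample lies from the center of the normal samples in Gaussian kernel space, can be sketched with the kernel trick. This is only an illustration: it assumes the center is the feature-space mean of the normal samples, and the names `gaussian_kernel`, `kernel_distance_to_center`, and `sigma` are hypothetical. The probability estimate and the entropy penalty of DEP-SVDD are not specified in the abstract and are not reproduced here.

```python
import math

def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    squared = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared / (2.0 * sigma ** 2))

def kernel_distance_to_center(x, samples, sigma=1.0):
    """Squared feature-space distance from x to the mean of `samples`.

    Expands ||phi(x) - mean(phi(s))||^2 using kernel evaluations only.
    Assumption: the "center" is the mean of the normal samples.
    """
    n = len(samples)
    cross = sum(gaussian_kernel(x, s, sigma) for s in samples) / n
    center = sum(gaussian_kernel(s, t, sigma)
                 for s in samples for t in samples) / (n * n)
    return gaussian_kernel(x, x, sigma) - 2.0 * cross + center

# Toy data: four "normal" points near the origin.
normal = [(0.0, 0.0), (0.1, -0.1), (-0.1, 0.1), (0.05, 0.0)]
near = kernel_distance_to_center((0.0, 0.05), normal)
far = kernel_distance_to_center((3.0, 3.0), normal)
```

A point close to the normal samples yields a small distance and a distant point a larger one, which is the quantity a probability of being normal or anomalous could then be built on.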

3.
Edge detection is an essential task in image processing. In some applications, such as Magnetic Resonance Imaging, the information about an image is available only through its frequency (Fourier) data. In this case, edge detection is particularly challenging, as it requires extracting local information from global data. The problem is exacerbated when the data are noisy. This paper proposes a new edge detection algorithm which combines the concentration edge detection method (Gelb and Tadmor in Appl. Comput. Harmon. Anal. 7:101–135, 1999) with statistical hypothesis testing. The result is a method that achieves a high probability of detection while maintaining a low probability of false detection.

4.
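An anomaly intrusion detection system builds a model of an object's normal behavior during the training phase and compares the object's observed behavior with that model during the testing phase; if the deviation exceeds a given threshold, an intrusion is considered to have occurred. The usual way to build the normal-behavior model is to train the system on intrusion-free data. This approach has shortcomings in practicality and reliability: synthetic training data can be guaranteed attack-free, but they differ greatly from the environment in which the intrusion detection system will actually operate, whereas training data extracted from the real environment cannot be guaranteed to contain no attacks. This paper proposes ADNTD (Anomaly Detection for Noisy Training Data), a network-based anomaly intrusion detection method for impure training data. By filtering out low-probability feature fields in the training data, it removes attack data from the training set and builds a model of normal network behavior, so that good detection results are obtained even when the training data contain attacks. Experimental results show that, when the training data contain attacks, ADNTD performs markedly better than previous systems; when trained on clean data, ADNTD performs comparably to previous systems; and ADNTD trained on data containing attacks still matches the detection performance that previous systems of the same kind achieve when trained on clean data.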

5.
Currently, distributed multiple-sensor detection systems with data fusion are used extensively in both civilian and military applications. The optimality of most detection fusion rules implemented in these systems relies on knowledge of the probability distributions for all distributed sensors. The overall detection performance of the central processor is often worse than expected due to instabilities of the sensors' probability density functions. This paper proposes a new multiple-decision fusion rule for target detection in distributed multiple-sensor systems with data fusion. Unlike published studies, in which the overall decision is based on a single binary decision from each individual sensor and requires knowledge of the sensors' probability distributions, the proposed fusion method derives the overall decision from multiple decisions made by each individual sensor, assuming that the probability distributions are not known. Therefore, the proposed fusion rule is insensitive to instabilities of the sensors' probability distributions. The proposed multiple-decision fusion rule is derived and its overall performance is evaluated. Comparisons with the performance of a single sensor, optimum hard detection, optimum centralized detection, and multiple-threshold decision fusion are also provided. The results show that the proposed multiple-decision fusion rule outperforms the optimum hard detection and the multiple-threshold detection systems, and thus reduces the loss in performance between the optimum centralized detection and the optimum hard detection systems. An extension of the proposed method to target detection when some probability density functions are known, and applications to binary communication systems, are also addressed.

6.
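To improve fault detection and classification, a multimode process fault detection algorithm based on probability-density PCA is proposed. A PCA model is built from the training data of each mode, and the control limit and matching coefficient of each model are computed. A unified control limit across the modes is then computed from the matching coefficients. For newly arriving data, the mode is determined using the probability density. The new data are projected onto the model of the corresponding mode, a unified statistic is computed, and the statistic is compared with the control limit to detect faults in the multimode process. The method was applied to a numerical example and a semiconductor process; simulation results show that the algorithm is highly accurate in both classification and multimode process fault detection.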

7.
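Anomalous signal detection method based on incomplete data   Total citations: 1, self-citations: 0, citations by others: 1
To address the common problem of missing parameters in the input data for anomalous electromagnetic signal detection, an anomalous signal detection method based on incomplete data is proposed. Borrowing ideas from geometry, the method compares data with missing values against normal data, analyzes how likely the incomplete data are to be anomalous, and gives a way to compute the anomaly probability of such data. This anomaly probability directly identifies some anomalous signals and ranks the remaining incomplete data by their likelihood of being anomalous, so that signals with a high anomaly probability can be handled first when resources are limited, optimizing the allocation of processing resources. Experimental results show that the method can provide the anomaly probability of data points with missing values.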

8.
Leaks and spills of hazardous fluids like petroleum endanger the environment, while remediation costs and penalties imposed when petroleum contaminates the ecosystem affect economics heavily. Therefore, it is crucial to detect any possible symptoms of a leak as soon as possible. Most of existing leak detection techniques require specialized equipment to be used, while purely software-based methods rely solely on data analysis and are very desirable since they can be deployed on petrol stations without any changes to the existing infrastructure. Moreover, such techniques can be considered as complementary to the hardware leak detection systems, as they provide additional security level. In this paper we present the TUBE algorithm, which detects fuel leaks from underground storage tanks, using only standard measurements that are normally registered on petrol stations, i.e. the amount of stored, sold, and delivered fuel. The TUBE algorithm is an autonomous solution capable of making decisions independently as well as supporting human-made decisions and thus can be considered as an expert leak detection system. The TUBE algorithm introduces a new data mining technique for trend detection and cleaning data over time series, which can be easily adapted to any other problem domain. A trend detection technique, called tubes, created for the TUBE algorithm is a novel data analysis method that allows to envelop uncertainties and oscillations in data and produce stable trends. Trend interpretation technique described in this paper has been designed especially for fuel leak detection purposes using our industrial experience. This paper includes a step-by-step usage example of the TUBE algorithm and its evaluation according to the United States Environmental Protection Agency requirements for leakage detection systems (the EPA SIR standard). Such an evaluation involves calculating the probability of detection and the probability of false alarm. 
The TUBE algorithm has obtained 98.84% probability of detection and 0.07% probability of false alarm while rejecting 42.22% of analyzed datasets due to their uncertainty. Rejecting datasets from analysis is compliant with the EPA SIR standard; however, a rejection rate higher than 20% is not acceptable. Therefore we have evaluated the two-phase filtering stage of the algorithm in order to find the best combination of filters as a means of data cleaning. Moreover, we have discussed the results pointing at the overall data quality problem, since it is the main cause of rejecting some datasets from the analysis. Finally, the TUBE algorithm has obtained 93.11% probability of detection and 0.73% probability of false alarm for the best combination of all parameters with a 15.56% rejection rate, which is acceptable by the EPA SIR standard. The value of the probability of detection is not fully compliant with the EPA SIR standard, where 95% probability of detection with a probability of false alarm lower than 5% is required. We have found that the requirements for the aforementioned probabilities have been completely fulfilled for datasets representing manifolded tank systems but not for single tank datasets. Such a situation was unexpected since manifolded tank systems are generally claimed to be more complex for analysis, as they are in fact systems of multiple single tanks directly connected. In this paper we have also measured the time and memory complexity of the TUBE algorithm and discussed the issues connected to its deployment on petrol stations using our industrial experience in the topic.
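The acceptance arithmetic quoted in this abstract can be made concrete with a small sketch. It assumes the three limits exactly as the abstract states them (probability of detection of at least 95%, probability of false alarm below 5%, rejection rate of at most 20%); whether each limit is inclusive is an assumption, and the function names and the outcome counts are hypothetical.

```python
def rates_from_counts(true_pos, false_neg, false_pos, true_neg, rejected):
    """Return (P_D, P_FA, rejection rate) from hypothetical outcome counts."""
    analyzed = true_pos + false_neg + false_pos + true_neg
    p_detect = true_pos / (true_pos + false_neg)        # leaking tanks that were flagged
    p_false_alarm = false_pos / (false_pos + true_neg)  # tight tanks that were flagged
    rejection_rate = rejected / (analyzed + rejected)   # data sets left undecided
    return p_detect, p_false_alarm, rejection_rate

def meets_thresholds(p_detect, p_false_alarm, rejection_rate):
    """Check the three limits quoted in the abstract (inclusiveness assumed)."""
    return (p_detect >= 0.95
            and p_false_alarm < 0.05
            and rejection_rate <= 0.20)

# Figures reported in the abstract.
first_run = meets_thresholds(0.9884, 0.0007, 0.4222)  # fails on rejection rate
best_run = meets_thresholds(0.9311, 0.0073, 0.1556)   # fails on detection probability

# Hypothetical counts, only to show how the rates are derived.
example = meets_thresholds(*rates_from_counts(97, 3, 1, 99, 20))
```

Both reported configurations fail one limit each, which matches the abstract's statement that the overall result is not fully compliant.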

9.
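Spatio-temporal anomalous flow detection can reveal abnormal traffic patterns in urban traffic data. Unlike methods that detect single anomalous flows in a time series, a k-nearest-neighbor flow sequence algorithm (kNNFS) is proposed to detect anomalous flow distributions in flow sequences. The algorithm first measures the individual flow observations for each location in each time interval. It then computes the observation frequencies of the individual flows to build a library of flow distribution probabilities for each time interval at each location. Finally, a threshold decides whether the distance, computed with the KL divergence, between a new flow distribution and its k nearest neighbors is an outlier: if the distance is below the threshold, the new distribution is added to the library; otherwise it is an anomalous flow distribution. Simulation analysis shows that kNNFS improves on the DPMM and SETMADA algorithms in both detection accuracy and running time.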
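The decision step described above can be sketched as follows. This is a minimal illustration, not the published algorithm: it assumes the distance to the k nearest neighbors is the mean of the k smallest KL divergences, adds a small `eps` to avoid taking the logarithm of zero, and uses hypothetical names (`kl_divergence`, `knn_distance`, `check_and_update`) and toy values for `k` and `threshold`.

```python
import math

def kl_divergence(p, q, eps=1e-9):
    """KL(p || q) for two discrete distributions over the same bins."""
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0)

def knn_distance(new_dist, library, k):
    """Mean KL divergence from new_dist to its k nearest library entries (assumed aggregate)."""
    nearest = sorted(kl_divergence(new_dist, d) for d in library)[:k]
    return sum(nearest) / len(nearest)

def check_and_update(new_dist, library, k=2, threshold=0.1):
    """Return True if new_dist is anomalous; otherwise add it to the library."""
    if knn_distance(new_dist, library, k) < threshold:
        library.append(new_dist)
        return False
    return True

# Toy library of flow distributions for one location and time interval.
library = [[0.5, 0.3, 0.2], [0.45, 0.35, 0.2], [0.55, 0.25, 0.2]]
normal_flag = check_and_update([0.5, 0.32, 0.18], library)
anomaly_flag = check_and_update([0.05, 0.05, 0.9], library)
```

The first distribution is close to the stored ones and is absorbed into the library; the second differs sharply and is reported as anomalous without being stored.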

10.
We consider the problem of determining a route of a search resource to search visually multiple areas in which targets are expected to be located. It is assumed that the probability a target exists in each area is given as a result of target detection operations and that the probability decreases as time passes. It is necessary to search the areas using a search resource, and identify the exact locations of the targets. We propose heuristic algorithms including a simulated annealing (SA) algorithm for the search sequencing problem. Since the search sequence must be determined as quickly as possible not to delay the search, heuristics for search sequencing should not take too much time. We introduce a new neighborhood generation method and a new parameter for an easier control of the overall computation time in the SA algorithm. A series of computational experiments is performed for evaluating the suggested algorithms, and results are reported.

11.
In this paper a new method of mode separation is proposed. The method is based on mapping of data points from the N-dimensional space onto a sequence so that the majority of points from each mode become successive elements of the sequence. The intervals of points in the sequence belonging to the respective modes of the p.d.f. are then determined from a function generated on this sequence. The nuclei of the modes formed by the elements of these intervals are then used to obtain separating surfaces between the modes and so to partition the data set with multimodal probability density function into unimodal subsets.

12.
Outlier detection is a very useful technique in many applications, where data are generally uncertain and can be described using probability. While it has been studied intensively for deterministic data, outlier detection is still novel in the emerging field of uncertain data. In this paper, we study the semantics of outlier detection on probabilistic data streams and present a new definition of distance-based outliers over a sliding window. We then show the problem of detecting an outlier over a set o...

13.
Metamorphic malware changes its internal structure on each infection while maintaining its function. Although many detection techniques have been proposed, practical and effective metamorphic detection remains a difficult challenge. In this paper, we analyze a previously proposed eigenvector-based method for metamorphic detection. The approach considered here was inspired by a well-known facial recognition technique. We compute eigenvectors using raw byte data extracted from executables belonging to a metamorphic family. These eigenvectors are then used to compute a score for a collection of executable files that includes family viruses and representative examples of benign code. We perform extensive testing to determine the effectiveness of this classification method. Among other results, we show that this eigenvalue-based approach is effective when applied to a family of highly metamorphic code that successfully evades statistical-based detection. We also experiment with computing eigenvectors on extracted opcode sequences, as opposed to raw byte sequences. Our experimental evidence indicates that the use of opcode sequences does not improve the results.

14.
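Chen Yourong (陈友荣), Yu Li (俞立), Dong Qifen (董齐芬), Hong Zhen (洪榛). Journal of Computer Applications (《计算机应用》), 2011, 31(11): 2898-2901
To extend the lifetime of wireless sensor networks, maximum lifetime routing based on ant colony optimization (MLRAC) was studied. The routing scheme uses a link energy consumption model and the nodes' data-sending probabilities to compute the total energy each node consumes in one data collection period. Taking the nodes' initial energy into account, an optimization model for maximum lifetime routing is established. To solve this model, a modified ant colony algorithm is proposed on the basis of the classic ant colony algorithm. It uses a new forwarding probability formula for neighbor nodes, a new pheromone update formula, and a packet probing method; after a number of iterations it obtains the optimal network lifetime and the optimal data-sending probability of each node. Finally, the sink node notifies all nodes in the network by flooding. According to the optimal probabilities received, each node selects a neighbor node that the data packet has not yet passed through and sends the data to it. Simulation experiments show that, after a period of iteration, the lifetime achieved by MLRAC converges to the optimal value. The algorithm extends network lifetime and, under certain conditions, outperforms algorithms such as PEDAP, LET, Ratio-w, and Sum-w.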
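For orientation, the classic ant colony transition rule that the modified algorithm starts from can be sketched as below. This is the textbook rule, not the new forwarding probability formula of MLRAC, which the abstract does not give. The names `forwarding_probabilities` and `choose_next_hop`, the exponents `alpha` and `beta`, and the toy pheromone and heuristic values are all hypothetical.

```python
import random

def forwarding_probabilities(pheromone, heuristic, alpha=1.0, beta=2.0):
    """Classic ACO rule: p_j proportional to pheromone_j**alpha * heuristic_j**beta."""
    weights = {node: (pheromone[node] ** alpha) * (heuristic[node] ** beta)
               for node in pheromone}
    total = sum(weights.values())
    return {node: weight / total for node, weight in weights.items()}

def choose_next_hop(probabilities, visited, rng=random):
    """Pick a neighbor the packet has not yet passed through, or None if none is left."""
    candidates = [node for node in probabilities if node not in visited]
    if not candidates:
        return None
    weights = [probabilities[node] for node in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]

# Toy neighborhood of one sensor node.
pheromone = {"B": 1.0, "C": 2.0, "D": 1.0}
heuristic = {"B": 0.5, "C": 0.5, "D": 1.0}   # e.g. an energy-related desirability (assumed)
probs = forwarding_probabilities(pheromone, heuristic)
next_hop = choose_next_hop(probs, visited={"C", "D"})
```

Excluding visited neighbors mirrors the abstract's statement that a node forwards only to neighbors the data packet has not yet passed through.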

15.
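Initial motion estimation and inlier detection are important factors affecting the localization accuracy of stereo visual odometry. At present, stereo visual odometry systems all use the random sample consensus (RANSAC) method based on 3-point linear motion estimation. This paper analyzes the performance of RANSAC in initial motion estimation: the method is effective at rejecting mismatched points, but for a given number of samples the probability of drawing matched points whose feature extraction error and stereo matching error are both small is low, so the initial motion parameters and matched inliers it yields are not accurate enough. This paper proposes a new method for initial motion estimation and inlier detection based on particle swarm optimization. The method converges quickly, searches effectively for an accurate solution, and obtains high-precision motion parameters and matched inliers. Simulation experiments on stereo visual odometry and experiments on a real intelligent vehicle show that the proposed method is superior to RANSAC in both running time and localization accuracy.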
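The sampling argument in this abstract can be illustrated with the standard RANSAC success formula: if a fraction w of the matches is "good" and each sample draws s matches, the probability that at least one of N samples is entirely good is 1 - (1 - w^s)^N. The sketch below is a generic calculation, not code from the paper; the function names and the example fractions are assumptions.

```python
import math

def success_probability(good_fraction, sample_size, iterations):
    """Probability that at least one of `iterations` samples is all-good."""
    return 1.0 - (1.0 - good_fraction ** sample_size) ** iterations

def required_iterations(good_fraction, sample_size, confidence=0.99):
    """Samples needed to draw an all-good sample with the given confidence.

    Requires 0 < good_fraction < 1 and 0 < confidence < 1.
    """
    return math.ceil(math.log(1.0 - confidence)
                     / math.log(1.0 - good_fraction ** sample_size))

# 3-point sampling with half of the matches being correct.
n_reject_mismatches = required_iterations(0.5, 3)   # 35

# Same budget, but only 10% of matches assumed to have very small errors.
p_precise = success_probability(0.1, 3, n_reject_mismatches)
```

With half of the matches correct, 35 three-point samples suffice to hit a mismatch-free sample with 99% confidence. If only a tenth of the matches are precise enough, the same 35 samples find an all-precise triple with a probability below 5%, which is the weakness the abstract describes.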

16.
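To address problems in parameter estimation for multinomial finite mixture models, namely dependence on initialization, parameters that tend to converge to boundary values, and a tendency to become trapped in local optima, the minimum message length criterion is introduced to optimize the parameter estimation process. On this basis, a clustering algorithm based on the multinomial finite mixture model is used to cluster users' rating behavior, and the cluster membership probabilities obtained from the model are used to improve the Slope One algorithm. Experimental results show that optimizing the multinomial finite mixture model with the minimum message length criterion markedly improves clustering, and that the improved algorithm clearly outperforms the Slope One recommendation algorithm based on user clustering.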
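As background for the improvement described above, the baseline weighted Slope One predictor can be sketched as follows. This is the plain algorithm only; the cluster-membership weighting from the abstract is not specified there and is not reproduced. The function names and the toy ratings are hypothetical.

```python
from collections import defaultdict

def build_deviations(ratings):
    """ratings: {user: {item: rating}} -> (dev, count).

    dev[i][j] is the mean of (r_i - r_j) over users who rated both items;
    count[i][j] is the number of such users.
    """
    diff_sum = defaultdict(lambda: defaultdict(float))
    count = defaultdict(lambda: defaultdict(int))
    for user_ratings in ratings.values():
        for i, r_i in user_ratings.items():
            for j, r_j in user_ratings.items():
                if i != j:
                    diff_sum[i][j] += r_i - r_j
                    count[i][j] += 1
    dev = {i: {j: diff_sum[i][j] / count[i][j] for j in diff_sum[i]}
           for i in diff_sum}
    return dev, count

def predict(user_ratings, target, dev, count):
    """Weighted Slope One prediction of `target` for one user, or None."""
    numerator = denominator = 0.0
    for j, r_j in user_ratings.items():
        c = count.get(target, {}).get(j, 0)
        if j != target and c > 0:
            numerator += (dev[target][j] + r_j) * c
            denominator += c
    return numerator / denominator if denominator else None

ratings = {
    "alice": {"A": 5, "B": 3, "C": 2},
    "bob": {"A": 3, "B": 4},
    "carol": {"B": 2, "C": 5},
}
dev, count = build_deviations(ratings)
carol_a = predict(ratings["carol"], "A", dev, count)
```

One place where membership probabilities could enter is as per-user weights when the deviations are accumulated, but that is a guess about the design rather than something the abstract states.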

17.
With the pervasiveness of online social media and rapid growth of web data, a large amount of multi-media data is available online. However, how to organize them to improve users' experience and facilitate government supervision remains a problem yet to be seriously investigated. Topic detection and tracking, which has been a hot research topic for decades, can cluster web videos into different topics according to their semantic content. However, how to discover topics online and track them across web videos and images has not been fully discussed. In this paper, we formulate topic detection and tracking as an online tracking, detection and learning problem. First, by learning from historical data, including labeled data and plenty of unlabeled data, using a semi-supervised multi-class multi-feature method, we obtain a topic tracker which can also discover novel topics from new stream data. Second, when new data arrive, an online updating method is developed to make the topic tracker adaptable to the evolution of the stream data. We conduct experiments on a public dataset to evaluate the performance of the proposed method, and the results demonstrate its effectiveness for topic detection and tracking.

18.
This study presents a new fault detection scheme based on the probability density function (PDF) of the system output. Unlike classical fault detection and diagnosis methods, the proposed method estimates the distribution of the system output online. To achieve this goal, an algorithm is introduced to estimate the PDF online using fuzzy logic. Furthermore, the convergence of this algorithm is investigated. Then, a residual is constructed that can show the existence of a fault in the system. The main advantages of the proposed method are its robustness against measurement noise and the fact that it needs neither an exact model nor measured data of the inputs and states. Simulation results show that this scheme can detect abrupt faults very well.

19.
It is challenging to use traditional data mining techniques to deal with real-time data stream classifications. Existing mining classifiers need to be updated frequently to adapt to the changes in data streams. To address this issue, in this paper we propose an adaptive ensemble approach for classification and novel class detection in concept drifting data streams. The proposed approach uses traditional mining classifiers and updates the ensemble model automatically so that it represents the most recent concepts in data streams. For novel class detection we consider the idea that data points belonging to the same class should be closer to each other and should be far apart from the data points belonging to other classes. If a data point is well separated from the existing data clusters, it is identified as a novel class instance. We tested the performance of this proposed stream classification model against that of existing mining algorithms using real benchmark datasets from UCI (University of California, Irvine) machine learning repository. The experimental results prove that our approach shows great flexibility and robustness in novel class detection in concept drifting and outperforms traditional classification models in challenging real-life data stream applications.

20.
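To address the detection of small targets with a low signal-to-noise ratio in single-frame infrared images in infrared detection systems, a track-before-detect method based on the marginalized particle filter is proposed. Following the idea of hybrid-state filtering, the method works directly on the raw image data and uses a continuing probability density function and a birth probability density function, each with a fixed number of particles, to derive the probability that a target exists. Linear state variables that do not appear in the measurement equation are marginalized out and time-updated with a Kalman filter. Experimental results demonstrate that the ...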
