20 similar articles found (search time: 15 ms)
1.
2.
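For TCP/IP backbone networks, a modeling and anomaly detection method based on the periodic regularity of service traffic is proposed, using NetFlow technology. By mining the regularity of the backbone network's main service traffic and combining it with time series analysis, the method effectively predicts traffic trends and avoids having to model complex nonlinear traffic trends.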
3.
In this paper we present methodological advances in anomaly detection tailored to discover abnormal traffic patterns under the presence of seasonal trends in data. In our setup we impose specific assumptions on the traffic type and nature; our study features VoIP call counts, for which several traces of real data have been used, but the methodology can be applied to any data following, at least roughly, a non-homogeneous Poisson process (think of highly aggregated traffic flows). A performance study of the proposed methods, covering situations in which the assumptions are fulfilled as well as violated, shows good results in great generality. Finally, a real data example is included showing how the system could be implemented in practice.
4.
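To overcome the Prophet model's failure to account for residual autocorrelation and its limited temporal reasoning ability, and to improve the accuracy of passive infrared (PIR) motion detector results, a prediction model that dynamically weights and combines Prophet and SARIMA is proposed. The characteristics of PIR motion detectors are analyzed and the strengths and weaknesses of several popular prediction algorithms are compared; on this basis, a combined Prophet-SARIMA prediction model is designed to collect statistics on and analyze user behavior. To obtain the best combination, a dynamic weighted combination algorithm is designed, through which the optimal combination of weights can be determined. Experiments verify that the combined Prophet-SARIMA prediction model is more applicable and more accurate for predicting human infrared data.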
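The abstract above does not say how the dynamic weights are computed, so the following is only a minimal sketch of one common scheme: each model's weight at step t is inversely related to its absolute error over a recent window. The function names, the `window` parameter, and the inverse-error rule are assumptions for illustration, not the authors' algorithm; `pred_a` and `pred_b` stand in for forecasts from two models such as Prophet and SARIMA.

```python
def dynamic_weights(actual, pred_a, pred_b, window=3, eps=1e-9):
    """Weight of model A at each step, from errors over the previous `window` steps.

    Hypothetical rule: A's weight is B's share of the recent absolute error,
    so the model that was recently more accurate receives the larger weight.
    """
    weights = []
    for t in range(len(actual)):
        lo = max(0, t - window)
        # Only steps strictly before t are used, so there is no look-ahead.
        err_a = sum(abs(actual[i] - pred_a[i]) for i in range(lo, t))
        err_b = sum(abs(actual[i] - pred_b[i]) for i in range(lo, t))
        if t == 0 or err_a + err_b < eps:
            weights.append(0.5)  # no evidence yet, or both models exact: split evenly
        else:
            weights.append(err_b / (err_a + err_b))
    return weights


def combine(pred_a, pred_b, weights):
    """Weighted combination of the two forecast series."""
    return [w * a + (1 - w) * b for a, b, w in zip(pred_a, pred_b, weights)]


actual = [10.0, 10.0, 10.0, 10.0]
pred_a = [10.0, 10.0, 10.0, 10.0]   # stand-in for one model's forecasts
pred_b = [12.0, 12.0, 12.0, 12.0]   # stand-in for the other model's forecasts
w = dynamic_weights(actual, pred_a, pred_b)
combined = combine(pred_a, pred_b, w)
```

In this toy run the weight moves entirely to the first model after one step, because its recent error is zero while the second model's is not.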
5.
A large class of monitoring problems can be cast as the detection of a change in the parameters of a static or dynamic system, based on the effects of these changes on one or more observed variables. In this paper, the use of random forest models to detect change points in dynamic systems is considered. The approach is based on the embedding of multivariate time series data associated with normal process conditions, followed by the extraction of features from the resulting lagged trajectory matrix. The features are extracted by recasting the data into a binary classification problem, which can be solved with a random forest model. A proximity matrix can be calculated from the model and from this matrix features can be extracted that represent the trajectory of the system in phase space. The results of the study suggest that the random forest approach may afford distinct advantages over a previously proposed linear equivalent, particularly when complex nonlinear systems need to be monitored.
6.
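To address the misjudgments and delays that often occur in traditional motor anomaly detection, a single state parameter of the motor is fitted with a time-series AR model, and a SOM neural network is used to quantize the motor data in an unsupervised manner. The resulting quantized sequence is then combined with the AR curve to obtain the sequence's transition probabilities, so that abnormal changes in a state parameter are detected early. Next, the DBSCAN algorithm mines the feature relationships among the multidimensional parameters to determine whether the motor is abnormal. Finally, an example illustrates the detection process, and a comparison verifies the method's superiority.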
7.
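To address the low construction efficiency and low prediction accuracy of nonlinear time series prediction models in complex systems, a HURST-EMD prediction algorithm based on a combined model is proposed. The EMD algorithm decomposes the nonlinear time series into IMFs that represent the features of the original series; the Hurst exponent is then introduced to merge IMFs of the same class into new components; finally, an LS-SVR-ARIMA model is used for combined prediction. The algorithm includes steps such as sequence classification and integration, which reduce the computational cost of modeling and yield an efficient, accurate prediction model. To validate the model, it is tested on the public Shanghai Composite Index dataset and on real traffic flow data. Experimental results show that the improved HURST-EMD prediction algorithm based on a combined model improves prediction efficiency while achieving better prediction accuracy.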
8.
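Time series data are usually affected during measurement by factors such as intrinsic variability and external interference. For the case where data at different time points are affected to different degrees, an outlier detection method for time series data based on a Gaussian process prediction model is proposed. The monitored data are decomposed into a standard value and a deviation term. In addition to modeling the standard value under ideal conditions, a second Gaussian process is used to describe the heteroscedastic deviation term; variational inference solves the posterior probability problem that arises once the deviation term is introduced; and the tolerance interval set in the posterior distribution is used to identify outliers. The method is validated on network traffic time series data published by Yahoo. The trend of the tolerance interval output by the model across time points is consistent with the deviations of the labeled normal data, and in comparative experiments its anomaly detection F1-score is better than that of the autoregressive integrated moving average (ARIMA) model, the one-class support vector machine, and density-based spatial clustering of applications with noise (DBSCAN). The results show that the model effectively describes the distribution of normal data at each time point, achieves a balanced trade-off between false alarm rate and recall, and avoids the performance problems caused by improperly set model parameters.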
9.
Most temporal data models have concentrated on describing temporal data based on versioning of objects, tuples or attributes. The concept of time series, which is often needed in temporal applications, does not fit well within these models. The goal of this paper is to propose a generalized temporal database model that integrates the modeling of both version-based and time-series based temporal data into a single conceptual framework. The concept of calendar is also integrated into our proposed model. We also discuss how a conceptual Extended-ER design in our model can be mapped to an object-oriented or relational database implementation.
10.
Duc-Son Pham Svetha Venkatesh Mihai Lazarescu Saha Budhaditya 《Data mining and knowledge discovery》2014,28(1):145-189
This paper addresses the anomaly detection problem in large-scale data mining applications using residual subspace analysis. We are specifically concerned with situations where the full data cannot be practically obtained due to physical limitations such as low bandwidth, limited memory, storage, or computing power. Motivated by the recent compressed sensing (CS) theory, we suggest a framework wherein random projection can be used to obtain compressed data, addressing the scalability challenge. Our theoretical contribution shows that the spectral property of the CS data is approximately preserved under such a projection and thus the performance of spectral-based methods for anomaly detection is almost equivalent to the case in which the raw data is completely available. Our second contribution is the construction of the framework to use this result and detect anomalies in the compressed data directly, thus circumventing the problems of data acquisition in large sensor networks. We have conducted extensive experiments to detect anomalies in network and surveillance applications on large datasets, including the benchmark PETS 2007 and 83 GB of real footage from three public train stations. Our results show that our proposed method is scalable, and importantly, its performance is comparable to conventional methods for anomaly detection when the complete data is available.
11.
Smartphones centralize a great deal of users’ private information and are thus a primary target for cyber-attack. The main goal of the attacker is to try to access and exfiltrate the private information stored in the smartphone without detection. In situations where explicit information is lacking, these attackers can still be detected in an automated way by analyzing data streams (continuously sampled information such as an application’s CPU consumption, accelerometer readings, etc.). When clustered, anomaly detection techniques may be applied to the data stream in order to detect attacks in progress. In this paper we utilize an algorithm called pcStream that is well suited for detecting clusters in real world data streams and propose extensions to the pcStream algorithm designed to detect point, contextual, and collective anomalies. We provide a comprehensive evaluation that addresses mobile security issues on a unique dataset collected from 30 volunteers over eight months. Our evaluations show that the pcStream extensions can be used to effectively detect data leakage (point anomalies) and malicious activities (contextual anomalies) associated with malicious applications. Moreover, the algorithm can be used to detect when a device is being used by an unauthorized user (collective anomaly) within approximately 30 s with 1 false positive every two days.
12.
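Because physical attributes such as temperature and light are continuous in space and time, the data sensed by nodes in a densely deployed sensor network are often highly spatio-temporally correlated. The data redundancy caused by this correlation increases the communication burden and shortens the network lifetime. A clustered data collection mechanism based on a prediction model (CDCF) is proposed, which exploits data correlation to reduce communication in wireless sensor networks. The mechanism comprises a time series prediction model based on least-squares curve fitting and a simple, effective error control method. During data collection, the cluster structure accounts for the spatial correlation of the data, and the time series prediction model exploits their temporal correlation. Simulations show that, in a relatively stable network environment, the mechanism needs only 10% to 20% of the communication required to collect the raw data in order to complete data collection for the whole network; the error control method ensures that the error of the data recovered at the base station stays within the user-defined error bound.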
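The combination of least-squares prediction and error control described above can be illustrated with a small simulation. This is a sketch under assumptions, not the CDCF mechanism itself: it fits a straight line to the last few shared values, and the node transmits a reading only when the shared prediction misses it by more than a user-defined bound. The names `fit_line` and `collect`, the linear model, and the `history` and `max_error` parameters are all hypothetical; clustering and spatial correlation are not modeled.

```python
def fit_line(ys):
    """Least-squares line y = a + b*t through the points (0, ys[0]), (1, ys[1]), ..."""
    n = len(ys)
    t_mean = (n - 1) / 2
    y_mean = sum(ys) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(ys))
    b = sxy / sxx if sxx else 0.0
    a = y_mean - b * t_mean
    return a, b


def collect(readings, history=4, max_error=0.5):
    """Simulate a node and a base station that run the same predictor.

    Returns the series reconstructed at the base station and the number of
    readings that actually had to be transmitted.
    """
    shared = []   # values known to both the node and the base station
    sent = 0
    for y in readings:
        if len(shared) < history:
            shared.append(y)          # warm-up: send raw readings
            sent += 1
            continue
        a, b = fit_line(shared[-history:])
        predicted = a + b * history   # extrapolate one step ahead
        if abs(predicted - y) > max_error:
            shared.append(y)          # prediction too far off: transmit the reading
            sent += 1
        else:
            shared.append(predicted)  # within the bound: nothing is sent
    return shared, sent


readings = [20.0 + 0.1 * i for i in range(20)]   # slowly rising temperature
recovered, sent = collect(readings)
worst = max(abs(r - y) for r, y in zip(recovered, readings))
```

By construction every reconstructed value is either the true reading or a prediction within `max_error` of it, which is the kind of guarantee the error control method in the abstract refers to.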
13.
In this research, we propose a novel framework referred to as collective game behavior decomposition where complex collective behavior is assumed to be generated by aggregation of several groups of agents following different strategies and complexity emerges from collaboration and competition of individuals. The strategy of an agent is modeled by certain simple game theory models with limited information. Genetic algorithms are used to obtain the optimal collective behavior decomposition based on history data. The trained model can be used for collective behavior prediction. For modeling individual behavior, two simple games, the minority game and mixed game, are investigated in experiments on real-world stock prices and foreign-exchange rates. Experimental results are presented to show the effectiveness of the newly proposed model.
14.
Periodicity detection in time series databases (Cited by: 7; self-citations: 0; citations by others: 7)
Elfeky M.G. Aref W.G. Elmagarmid A.K. 《Knowledge and Data Engineering, IEEE Transactions on》2005,17(7):875-887
Periodicity mining is used for predicting trends in time series data. Discovering the rate at which the time series is periodic has always been an obstacle for fully automated periodicity mining. Existing periodicity mining algorithms assume that the periodicity rate (or simply the period) is user-specified. This assumption is a considerable limitation, especially in time series data where the period is not known a priori. In this paper, we address the problem of detecting the periodicity rate of a time series database. Two types of periodicities are defined, and a scalable, computationally efficient algorithm is proposed for each type. The algorithms perform in O(n log n) time for a time series of length n. Moreover, the proposed algorithms are extended in order to discover the periodic patterns of unknown periods at the same time without affecting the time complexity. Experimental results show that the proposed algorithms are highly accurate with respect to the discovered periodicity rates and periodic patterns. Real-data experiments demonstrate the practicality of the discovered periodic patterns.
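To make the idea of detecting an unknown period concrete, here is a naive sketch for a symbolic series: for each candidate period p it measures the fraction of positions whose symbol reappears p steps later. This brute-force version is quadratic in the series length and is not the O(n log n) algorithm from the paper; the function names and the `threshold` parameter are assumptions for illustration.

```python
def period_scores(series, max_period=None):
    """Fraction of positions i with series[i] == series[i + p], for each period p."""
    n = len(series)
    max_period = max_period or n // 2
    scores = {}
    for p in range(1, max_period + 1):
        matches = sum(1 for i in range(n - p) if series[i] == series[i + p])
        scores[p] = matches / (n - p)
    return scores


def detect_period(series, threshold=0.9):
    """Smallest candidate period whose match fraction reaches `threshold`, else None."""
    for p, score in sorted(period_scores(series).items()):
        if score >= threshold:
            return p
    return None


periodic = detect_period("abcabcabcabc")
aperiodic = detect_period("abcdefgh")
```

Lowering `threshold` tolerates noisy series in which some symbols deviate from the pattern, at the cost of more false detections.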
15.
16.
17.
In this article we consider the problem of detecting unusual values or outliers from time series data where the process by which the data are created is difficult to model. The main consideration is the fact that data closer in time are more correlated to each other than those farther apart. We propose two variations of a method that uses the median from a neighborhood of a data point and a threshold value to compare the difference between the median and the observed data value. Both variations of the method are fast and can be used for data streams that occur in quick succession such as sensor data on an airplane.
Martin Meckesheimer has been a member of the Applied Statistics Group at Phantom Works, Boeing since 2001. He received a Bachelor of Science Degree in Industrial Engineering from the University of Pittsburgh in 1997, and a Master's Degree in Industrial and Systems Engineering from Ecole Centrale Paris in 1999. Martin earned a Doctorate in Industrial Engineering from The Pennsylvania State University in August 2001, as a student of Professor Russell R. Barton and Dr. Timothy W. Simpson. His primary research interests are in the areas of design of experiments and surrogate modeling.
Sabyasachi Basu received his Ph.D. in Statistics from the University of Wisconsin at Madison in 1990. Since his Ph.D., he has worked in both academia and industry. He has taught and guided Ph.D. students in the Department of Statistics at the Southern Methodist University. He has also worked as a senior marketing statistician at the J. C. Penney Company. Dr. Basu is also an American Society of Quality certified Six Sigma Black Belt. He is currently an Associate Technical Fellow in Statistics and Data Mining at the Boeing Company. In this capacity, he works as a researcher and a technical consultant within Boeing for data mining, statistics and process improvements. He has published more than 20 papers and technical reports. He has also served as a referee for several journals, organized conferences and been invited to present at conferences.
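The median-and-threshold idea in the abstract for entry 17 can be sketched in a few lines. The abstract does not define its two variations, so this shows just one plausible reading: the neighborhood is the k points on either side of the point being tested, excluding the point itself. The function name and the parameters `k` and `threshold` are assumptions for illustration.

```python
from statistics import median


def median_outliers(values, k=2, threshold=3.0):
    """Indices whose value differs from the median of its neighbors by more than `threshold`.

    `values` is a list of numbers; the neighborhood is up to k points before
    and k points after index i, not including values[i] itself.
    """
    flagged = []
    for i, x in enumerate(values):
        neighbors = values[max(0, i - k):i] + values[i + 1:i + 1 + k]
        if neighbors and abs(x - median(neighbors)) > threshold:
            flagged.append(i)
    return flagged


sensor = [1.0, 1.1, 0.9, 10.0, 1.0, 1.2, 0.8]
spikes = median_outliers(sensor)
```

Because the median ignores extreme values, the spike at index 3 does not cause its neighbors to be flagged, which a mean-based window would risk. A streaming variant could use only the k preceding points so that no future data are needed.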
18.
19.
Francesco Gullo Andrea Tagarelli Sergio Greco 《Pattern recognition》2009,42(11):2998-3014
Similarity search and detection is a central problem in time series data processing and management. Most approaches to this problem have been developed around the notion of dynamic time warping, whereas several dimensionality reduction techniques have been proposed to improve the efficiency of similarity searches. Due to the continuous increase in sources of time series data and the cruciality of real-world applications that use such data, we believe there is a challenging demand for supporting similarity detection in time series in a way that is both accurate and fast. Our proposal is to define a concise yet feature-rich representation of time series, on which dynamic time warping can be applied for effective and efficient similarity detection of time series. We present the Derivative time series Segment Approximation (DSA) representation model, which originally features derivative estimation, segmentation and segment approximation to provide both high sensitivity in capturing the main trends of time series and data compression. We extensively compare DSA with state-of-the-art similarity methods and dimensionality reduction techniques in clustering and classification frameworks. Experimental evidence from effectiveness and efficiency tests on various datasets shows that DSA is well-suited to support both accurate and fast similarity detection.
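Dynamic time warping, on which the approach above builds, can be sketched with the standard dynamic-programming recurrence. This is plain DTW on raw values with absolute difference as the local cost; it does not implement the DSA representation (derivative estimation, segmentation, segment approximation), and the function name is an assumption.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] also matches the previous alignment of b[j-1]
                cost[i][j - 1],      # b[j-1] also matches the previous alignment of a[i-1]
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]


stretched = dtw_distance([1, 2, 3], [1, 2, 2, 3])
shifted = dtw_distance([1, 2, 3], [2, 3, 4])
```

The table has (n+1)(m+1) cells, so the running time is quadratic in the sequence lengths. That cost is why a compact representation applied before DTW, as the abstract proposes, can make similarity detection faster.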
20.
Rob J. Hyndman Roman A. Ahmed George Athanasopoulos Han Lin Shang 《Computational statistics & data analysis》2011,55(9):2579-2589
In many applications, there are multiple time series that are hierarchically organized and can be aggregated at several different levels in groups based on products, geography or some other features. We call these “hierarchical time series”. They are commonly forecast using either a “bottom-up” or a “top-down” method. In this paper we propose a new approach to hierarchical forecasting which provides optimal forecasts that are better than forecasts produced by either a top-down or a bottom-up approach. Our method is based on independently forecasting all series at all levels of the hierarchy and then using a regression model to optimally combine and reconcile these forecasts. The resulting revised forecasts add up appropriately across the hierarchy, are unbiased and have minimum variance amongst all combination forecasts under some simple assumptions. We show in a simulation study that our method performs well compared to the top-down approach and the bottom-up method. We demonstrate our proposed method by forecasting Australian tourism demand where the data are disaggregated by purpose of travel and geographical region.
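The reconciliation step described above can be illustrated with an ordinary least squares projection. This is a sketch assuming the simplest form of the regression: the independent base forecasts are regressed on a summing matrix S that maps bottom-level series to every level of the hierarchy, and the fitted values S (SᵀS)⁻¹ Sᵀ ŷ are taken as the reconciled forecasts. The function names, the tiny two-series hierarchy, and the numbers are all made up for illustration.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def reconcile(S, base):
    """OLS reconciliation: returns S (S^T S)^-1 S^T base."""
    rows, cols = len(S), len(S[0])
    StS = [[sum(S[k][i] * S[k][j] for k in range(rows)) for j in range(cols)]
           for i in range(cols)]
    Stb = [sum(S[k][i] * base[k] for k in range(rows)) for i in range(cols)]
    bottom = solve(StS, Stb)  # implied bottom-level forecasts
    return [sum(S[k][j] * bottom[j] for j in range(cols)) for k in range(rows)]


# Hierarchy: total = region A + region B. Rows of S: total, A, B.
S = [[1, 1],
     [1, 0],
     [0, 1]]
base = [100.0, 40.0, 50.0]        # independent forecasts; 40 + 50 != 100
total, region_a, region_b = reconcile(S, base)
```

The base forecasts disagree by 10 units; the projection spreads that discrepancy over all three series so that the reconciled total equals the sum of the reconciled regions.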