Found 20 similar documents (search time: 20 ms)
1.
2.
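For TCP/IP backbone networks, a modeling and anomaly detection method based on the periodic regularity of service traffic is proposed, using NetFlow technology. By mining the regularity of the backbone's main service traffic and combining it with time series analysis, the method effectively predicts traffic trends and avoids having to model the complex nonlinear trend of the traffic.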
3.
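To overcome the Prophet model's neglect of residual autocorrelation and its limited temporal reasoning ability, and to improve the accuracy of passive infrared (PIR) motion detector results, a prediction model that combines Prophet and SARIMA through dynamic weighting is proposed. The characteristics of PIR motion detectors are analyzed and the strengths and weaknesses of several popular prediction algorithms are compared; on this basis, a combined Prophet-SARIMA prediction model is designed to collect statistics on and analyze user behavior. To obtain the best combination, a dynamic weighted combination algorithm is designed that determines the optimal set of weights. Experiments verify that the combined Prophet-SARIMA model is more applicable and more accurate for predicting human infrared data.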
4.
A large class of monitoring problems can be cast as the detection of a change in the parameters of a static or dynamic system, based on the effects of these changes on one or more observed variables. In this paper, the use of random forest models to detect change points in dynamic systems is considered. The approach is based on the embedding of multivariate time series data associated with normal process conditions, followed by the extraction of features from the resulting lagged trajectory matrix. The features are extracted by recasting the data into a binary classification problem, which can be solved with a random forest model. A proximity matrix can be calculated from the model and from this matrix features can be extracted that represent the trajectory of the system in phase space. The results of the study suggest that the random forest approach may afford distinct advantages over a previously proposed linear equivalent, particularly when complex nonlinear systems need to be monitored.
5.
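To address the low efficiency of model construction and the low prediction accuracy of nonlinear time series prediction in complex systems, a HURST-EMD prediction algorithm based on a combined model is proposed. The EMD algorithm decomposes the nonlinear time series into IMFs that represent the characteristics of the original series; the Hurst exponent is then used to merge IMFs of the same type into new components; finally, an LS-SVR-ARIMA model is used for combined prediction. The algorithm includes steps such as classifying and merging the sequences, which reduces the computational cost of modeling and yields an efficient and accurate prediction model. To validate the model, it is tested on the public Shanghai Composite Index dataset and on real traffic flow data. The experimental results show that the improved combined-model HURST-EMD prediction algorithm improves prediction efficiency while achieving better prediction accuracy.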
6.
Duc-Son Pham Svetha Venkatesh Mihai Lazarescu Saha Budhaditya 《Data mining and knowledge discovery》2014,28(1):145-189
This paper addresses the anomaly detection problem in large-scale data mining applications using residual subspace analysis. We are specifically concerned with situations where the full data cannot be practically obtained due to physical limitations such as low bandwidth, limited memory, storage, or computing power. Motivated by the recent compressed sensing (CS) theory, we suggest a framework wherein random projection can be used to obtain compressed data, addressing the scalability challenge. Our theoretical contribution shows that the spectral property of the CS data is approximately preserved under such a projection and thus the performance of spectral-based methods for anomaly detection is almost equivalent to the case in which the raw data is completely available. Our second contribution is the construction of the framework to use this result and detect anomalies in the compressed data directly, thus circumventing the problems of data acquisition in large sensor networks. We have conducted extensive experiments to detect anomalies in network and surveillance applications on large datasets, including the benchmark PETS 2007 and 83 GB of real footage from three public train stations. Our results show that our proposed method is scalable, and importantly, its performance is comparable to conventional methods for anomaly detection when the complete data is available.
7.
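Time series data are usually affected during measurement by factors such as the inherent variability of the measured quantity and external interference. For the case where the data at different time points are affected to different degrees, an outlier detection method for time series data based on a Gaussian process prediction model is proposed. The monitored data are decomposed into two parts, a standard value and a deviation term. Besides modeling the standard value under ideal conditions, a second Gaussian process is used to describe the heteroscedastic deviation term; variational inference solves the posterior computation that arises once the deviation term is introduced; and the tolerance interval set in the posterior distribution is used to identify outliers. The method is validated on the network traffic time series data published by Yahoo. The trend of the tolerance interval output by the model across time points agrees with the deviation of the labeled normal data, and in comparative experiments its anomaly detection F1-score exceeds that of the autoregressive integrated moving average model, the one-class support vector machine, and density-based spatial clustering of applications with noise. The experimental results show that the model effectively describes the distribution of normal data at each time point, achieves a balanced trade-off between false alarm rate and recall, and avoids performance problems caused by poorly chosen model parameters.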
8.
Most temporal data models have concentrated on describing temporal data based on versioning of objects, tuples or attributes. The concept of time series, which is often needed in temporal applications, does not fit well within these models. The goal of this paper is to propose a generalized temporal database model that integrates the modeling of both version-based and time-series based temporal data into a single conceptual framework. The concept of calendar is also integrated into our proposed model. We also discuss how a conceptual Extended-ER design in our model can be mapped to an object-oriented or relational database implementation.
9.
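Because physical attributes such as temperature and light are continuous in space and time, the data sensed by nodes in a densely deployed sensor network are often highly correlated spatially and temporally. The data redundancy produced by this correlation adds communication load and shortens network lifetime. A cluster-based data collection mechanism built on a prediction model (CDCF) is proposed that exploits data correlation to reduce the communication volume of wireless sensor networks. The mechanism comprises a time series prediction model based on least-squares curve fitting and a simple, effective error control method. During data collection, the cluster structure accounts for spatial correlation among the data, and the time series prediction model exploits their temporal correlation. Simulations show that in a relatively stable network environment the mechanism needs only 10%-20% of the communication volume of collecting raw data to complete data collection for the whole network, and that the error control method keeps the error of the data recovered at the base station within a user-defined bound.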
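The temporal part of such a prediction-based collection scheme can be sketched as follows. This is a minimal illustration, not the CDCF mechanism itself: it assumes a straight-line least-squares fit over a sliding window, a node and a sink that run the same predictor, and a transmit-only-on-miss rule. The names `fit_line`, `collect`, `window`, and `epsilon` are hypothetical, and the cluster (spatial) part of the abstract is not modeled.

```python
def fit_line(ys):
    """Least-squares line y = a + b*t over t = 0 .. len(ys)-1."""
    n = len(ys)
    t_mean = (n - 1) / 2
    y_mean = sum(ys) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(ys))
    b = sxy / sxx if sxx else 0.0
    a = y_mean - b * t_mean
    return a, b


def collect(readings, window=4, epsilon=0.5):
    """Simulate one node and its sink.

    Both sides keep the same history. The node transmits a reading only when
    the shared prediction misses it by more than epsilon; otherwise the sink
    uses the prediction. Returns (series seen by the sink, transmissions).
    """
    history = []  # values known to both node and sink
    sent = 0
    for y in readings:
        if len(history) < 2:
            # Not enough points to fit a line yet: always transmit.
            history.append(y)
            sent += 1
            continue
        recent = history[-window:]
        a, b = fit_line(recent)
        predicted = a + b * len(recent)  # extrapolate one step ahead
        if abs(y - predicted) > epsilon:
            history.append(y)            # prediction failed: send real value
            sent += 1
        else:
            history.append(predicted)    # prediction good enough: stay silent
    return history, sent


if __name__ == "__main__":
    series = [0, 1, 2, 3, 10, 11, 12]
    recovered, transmissions = collect(series)
    print(transmissions, recovered)
```

By construction, every value held by the sink is either the true reading or a prediction within `epsilon` of it, which is one simple way to obtain a user-defined error bound of the kind the abstract describes.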
10.
11.
Julien Rabatel Sandra Bringay Pascal Poncelet 《Expert systems with applications》2011,38(6):7003-7015
Today, many industrial companies must face problems raised by maintenance. In particular, the anomaly detection problem is probably one of the most challenging. In this paper we focus on the railway maintenance task and propose to automatically detect anomalies in order to predict potential failures in advance. We first address the problem of characterizing normal behavior. In order to extract interesting patterns, we have developed a method to take into account the contextual criteria associated with railway data (itinerary, weather conditions, etc.). We then measure the compliance of new data, according to extracted knowledge, and provide information about the seriousness and the exact localization of a detected anomaly.
12.
In this article we consider the problem of detecting unusual values or outliers from time series data where the process by
which the data are created is difficult to model. The main consideration is the fact that data closer in time are more correlated
to each other than those farther apart. We propose two variations of a method that uses the median from a neighborhood of
a data point and a threshold value to compare the difference between the median and the observed data value. Both variations
of the method are fast and can be used for data streams that occur in quick succession such as sensor data on an airplane.
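A neighborhood-median detector of this kind might look like the sketch below. The abstract does not say what the two variations are, so the two-sided and one-sided (past-only) windows here are an assumption, as are the choice to exclude the point itself from its neighborhood and the names `median_outliers`, `half_window`, and `threshold`.

```python
from statistics import median


def median_outliers(values, half_window=2, threshold=3.0, two_sided=True):
    """Flag indices whose value is far from the median of nearby points.

    two_sided=True  : neighbors are up to half_window points on each side.
    two_sided=False : neighbors are up to 2*half_window preceding points,
                      which suits streaming data where the future is unknown.
    The point under test is excluded from its own neighborhood (assumption).
    """
    flagged = []
    n = len(values)
    for i in range(n):
        if two_sided:
            lo = max(0, i - half_window)
            hi = min(n, i + half_window + 1)
            neighbors = values[lo:i] + values[i + 1:hi]
        else:
            lo = max(0, i - 2 * half_window)
            neighbors = values[lo:i]
        if not neighbors:
            continue  # nothing to compare against (e.g. first streamed point)
        if abs(values[i] - median(neighbors)) > threshold:
            flagged.append(i)
    return flagged


if __name__ == "__main__":
    data = [1.0, 1.1, 0.9, 10.0, 1.0, 1.2, 1.1]
    print(median_outliers(data))
    print(median_outliers(data, two_sided=False))
```

Because the median ignores a single extreme neighbor, the spike at index 3 does not cause its neighbors to be flagged, which is the usual reason for preferring a median over a mean in this setting.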
Martin Meckesheimer has been a member of the Applied Statistics Group at Phantom Works, Boeing since 2001. He received a Bachelor of Science
Degree in Industrial Engineering from the University of Pittsburgh in 1997, and a Master's Degree in Industrial and Systems
Engineering from Ecole Centrale Paris in 1999. Martin earned a Doctorate in Industrial Engineering from The Pennsylvania State
University in August 2001, as a student of Professor Russell R. Barton and Dr. Timothy W. Simpson. His primary research interests
are in the areas of design of experiments and surrogate modeling.
Sabyasachi Basu received his Ph.D. in Statistics from the University of Wisconsin at Madison in 1990. Since his Ph.D., he has worked in both
academia and in industry. He has taught and guided Ph.D. students in the Department of Statistics at the Southern Methodist
University. He has also worked as a senior marketing statistician at the J. C. Penney Company. Dr. Basu is also an American
Society of Quality certified Six Sigma Black Belt. He is currently an Associate Technical Fellow in Statistics and Data Mining
at the Boeing Company. In this capacity, he works as a researcher and a technical consultant within Boeing for data mining,
statistics and process improvements. He has published more than 20 papers and technical reports. He has also served as journal
referee for several journals, organized conferences and been invited to present at conferences.
13.
Francesco Gullo Andrea Tagarelli Sergio Greco 《Pattern recognition》2009,42(11):2998-3014
Similarity search and detection is a central problem in time series data processing and management. Most approaches to this problem have been developed around the notion of dynamic time warping, whereas several dimensionality reduction techniques have been proposed to improve the efficiency of similarity searches. Due to the continuous increase in sources of time series data and the critical nature of real-world applications that use such data, we believe there is a challenging demand for supporting similarity detection in time series in a way that is both accurate and fast. Our proposal is to define a concise yet feature-rich representation of time series, on which the dynamic time warping can be applied for effective and efficient similarity detection of time series. We present the Derivative time series Segment Approximation (DSA) representation model, which originally features derivative estimation, segmentation and segment approximation to provide both high sensitivity in capturing the main trends of time series and data compression. We extensively compare DSA with state-of-the-art similarity methods and dimensionality reduction techniques in clustering and classification frameworks. Experimental evidence from effectiveness and efficiency tests on various datasets shows that DSA is well-suited to support both accurate and fast similarity detection.
14.
15.
Rob J. Hyndman Roman A. Ahmed George Athanasopoulos Han Lin Shang 《Computational statistics & data analysis》2011,55(9):2579-2589
In many applications, there are multiple time series that are hierarchically organized and can be aggregated at several different levels in groups based on products, geography or some other features. We call these “hierarchical time series”. They are commonly forecast using either a “bottom-up” or a “top-down” method. In this paper we propose a new approach to hierarchical forecasting which provides optimal forecasts that are better than forecasts produced by either a top-down or a bottom-up approach. Our method is based on independently forecasting all series at all levels of the hierarchy and then using a regression model to optimally combine and reconcile these forecasts. The resulting revised forecasts add up appropriately across the hierarchy, are unbiased and have minimum variance amongst all combination forecasts under some simple assumptions. We show in a simulation study that our method performs well compared to the top-down approach and the bottom-up method. We demonstrate our proposed method by forecasting Australian tourism demand where the data are disaggregated by purpose of travel and geographical region.
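The regression-based reconciliation step is commonly written in the ordinary least squares form below. The notation is an assumption introduced here for illustration and is not quoted from the abstract: the vector of base forecasts stacks the independent h-step forecasts for every series in the hierarchy, S is the summing matrix that maps bottom-level series to all levels, and beta_h is the unknown mean of the bottom-level series.

```latex
% Assumed notation (not taken from the abstract):
%   \hat{\mathbf{y}}_h        independent base forecasts for all series, horizon h
%   S                         summing matrix (bottom level -> all levels)
%   \boldsymbol{\beta}_h      unknown bottom-level mean
%   \tilde{\mathbf{y}}_h      reconciled forecasts
\begin{align}
\hat{\mathbf{y}}_h &= S\,\boldsymbol{\beta}_h + \boldsymbol{\varepsilon}_h, \\
\hat{\boldsymbol{\beta}}_h &= (S^{\top} S)^{-1} S^{\top}\,\hat{\mathbf{y}}_h, \\
\tilde{\mathbf{y}}_h &= S\,\hat{\boldsymbol{\beta}}_h
                      = S\,(S^{\top} S)^{-1} S^{\top}\,\hat{\mathbf{y}}_h .
\end{align}
% Example: one total with two children A and B, ordered (Total, A, B)
\[
S = \begin{pmatrix} 1 & 1 \\ 1 & 0 \\ 0 & 1 \end{pmatrix}
\]
```

Because the reconciled forecasts are S times a bottom-level vector, they add up across the hierarchy by construction, whatever the base forecasts were.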
16.
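A two-stage anomaly detection algorithm for multivariate time series is proposed. The algorithm computes the similarity between multivariate time series samples with the bounded coordinate system (BCS) technique and detects anomalies with a distance-based method. In the first stage, the K-means algorithm clusters the data, and the clusters are ordered according to a heuristic rule. In the second stage, a nested-loop algorithm performs anomaly detection on the clustering result, and two pruning rules prune efficiently, improving the algorithm's efficiency. Experiments on two real datasets verify the effectiveness of the algorithm.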
17.
Katsunori Takeda Tetsuo Hattori Tetsuya Izumi Hiromichi Kawano 《Artificial Life and Robotics》2010,15(4):417-420
It is important to detect a structural change in a time series quickly as a trigger to remodeling the forecasting model. The
well-known Chow test has been used as the standard method for detecting change, especially in economics. However, we have
proposed the application of the sequential probability ratio test (SPRT) for detecting the change in single-regression modeled
time-series data. In this article, we show experimental results using SPRT and the Chow test when applied to time-series data
that are based on multiple regression models. We also clarify the effectiveness of SPRT compared with the Chow test in its
ability to detect change early and correctly, and its computational complexity. Moreover, we extend the definition of the
point at which structural change is detected with the SPRT method, and show an improvement in the accuracy of change detection.
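As a rough illustration of sequential change detection with an SPRT, the sketch below applies Wald's classical test to the residuals of a fitted regression model. It is not the authors' formulation: it assumes Gaussian residuals with known standard deviation `sigma`, a hypothesized post-change mean shift `mu1`, and a restart whenever the no-change hypothesis is accepted. All names and default values are hypothetical.

```python
import math


def sprt_change_point(residuals, mu1, sigma, alpha=0.01, beta=0.01):
    """Return the index at which a mean shift is declared, or None.

    H0: residual ~ N(0, sigma^2)    (model still valid)
    H1: residual ~ N(mu1, sigma^2)  (structural change)
    alpha and beta are the target false-alarm and miss probabilities.
    """
    upper = math.log((1 - beta) / alpha)   # cross it: accept H1
    lower = math.log(beta / (1 - alpha))   # cross it: accept H0
    llr = 0.0
    for i, r in enumerate(residuals):
        # Log-likelihood ratio increment for two Gaussians with equal variance.
        llr += (mu1 / sigma ** 2) * (r - mu1 / 2)
        if llr >= upper:
            return i
        if llr <= lower:
            llr = 0.0  # H0 accepted: restart the test and keep monitoring
    return None


if __name__ == "__main__":
    stable = [0.0] * 10
    shifted = [2.0] * 10
    print(sprt_change_point(stable + shifted, mu1=2.0, sigma=1.0))
```

Unlike a fixed-sample test such as the Chow test, the sequential test stops as soon as the accumulated evidence crosses a threshold, which is what makes early detection possible.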
18.
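To study the relationship between rainfall and related meteorological elements in rainy weather, and to identify how those elements change before and after rainfall, a multidimensional time series data mining model is proposed. The model first preprocesses the meteorological time series by dimension selection, removing irrelevant and redundant dimensions; it then applies the proposed extreme-value-slope piecewise linear fitting method to segment the time series, compress the data, and extract feature values; finally, it uses the k-means clustering algorithm to symbolize the processed multidimensional series and applies rule extraction to obtain a rainy-weather model. The experimental results show that the model has good practical value.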
19.
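Refueling time series data contain multidimensional information about refueling behavior, but the data for any given gas station are sparse. Existing mature anomaly detection algorithms find many false anomalies and miss many true ones, so they are not suitable for mining gas station time series data. A deep-learning-based anomaly detection method is proposed to identify vehicles with abnormal refueling behavior. First, an autoencoder extracts features from the data collected at gas stations; then a Seq2Seq model with an embedded bidirectional long short-term memory (Bi-LSTM) network predicts refueling behavior; finally, the threshold for anomalies is defined by comparing predicted values with the original values. Experiments on a refueling dataset and a credit card fraud dataset verify the effectiveness of the method: relative to existing methods, the root mean square error (RMSE) on the refueling dataset is reduced by 21.1%, and the anomaly detection accuracy on the credit card fraud dataset is improved by 1.4%. The proposed model can therefore be applied effectively to detecting vehicles with abnormal refueling behavior, improving the management and operating efficiency of gas stations.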
20.