Similar literature
 Found 17 similar documents (search time: 156 ms)
1.
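A function-based piecewise linear representation method for time series   Cited by: 1 (self-citations: 0; citations by others: 1)
谢福鼎, 王赫楠, 张永, 孙岩. 《计算机科学》 (Computer Science), 2011, 38(11): 153-155, 160
Considering how the temporal characteristics of a time series affect its different segments, and the fact that time series data grow dynamically in practice, a new piecewise linear representation method for time series, FPAA (Function Piecewise Aggregate Approximation), is proposed on the basis of RPAA (Reversed Piecewise Aggregate Approximation) and PAA (Piecewise Aggregate Approximation). By defining a function influence factor, FPAA overcomes the shortcomings of RPAA and PAA. The method has linear time complexity, satisfies the lower-bounding theorem, and supports online segmentation of time series. Experiments show that, compared with PAA and RPAA, the proposed method handles online time series queries more effectively.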

2.
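The DTW (Dynamic Time Warping) algorithm is widely used to align sequence data and measure the distance between sequences, but its high time complexity limits its use on long sequences. A sequence similarity alignment algorithm based on an adaptive search window (ADTW) is proposed. The algorithm uses the Piecewise Aggregate Approximation (PAA) strategy to downsample the sequences into low-resolution sequences, computes the alignment path at low resolution, and predicts the path deviation from the gradient changes in the low-resolution distance matrix in order to limit how far the path search window expands. It then increases the sequence resolution step by step, correcting the path within the search window and computing a new search window at each step, and finally obtains the DTW distance and the similarity alignment path quickly. Compared with FastDTW, ADTW improves computational efficiency by about 20% at the same measurement accuracy, and its time complexity is O(n).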
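To make the role of a search window concrete, here is a minimal sketch of classic DTW restricted to a fixed-width band around the diagonal. It is not the ADTW algorithm from the abstract: the adaptive, multi-resolution window is replaced by a constant `window` parameter, and the function name and the absolute-difference point cost are assumptions made for illustration.

```python
import math


def dtw_distance(a, b, window=None):
    """DTW distance between two numeric sequences.

    `window` (hypothetical parameter) limits how far the warping path may
    stray from the diagonal; None means no limit (full dynamic programme).
    """
    n, m = len(a), len(b)
    if window is None:
        window = max(n, m)
    # The band must be at least |n - m| wide or the end cell is unreachable.
    window = max(window, abs(n - m))

    inf = math.inf
    # cost[i][j] = best cumulative cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(1, n + 1):
        lo = max(1, i - window)
        hi = min(m, i + window)
        for j in range(lo, hi + 1):
            d = abs(a[i - 1] - b[j - 1])  # assumed point-wise cost
            cost[i][j] = d + min(cost[i - 1][j],      # step in a only
                                 cost[i][j - 1],      # step in b only
                                 cost[i - 1][j - 1])  # step in both
    return cost[n][m]
```

A narrower band visits fewer cells, so it is faster but can only return a distance greater than or equal to the unconstrained one. That trade-off is what window-based methods such as FastDTW and ADTW try to manage.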

3.
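For the problem of predicting trends in large volumes of time series data such as stocks and funds, a multi-time-scale time series trend prediction algorithm based on a novel feature model is proposed. First, a feature tree carrying multi-time-scale features is extracted from the original time series; it characterizes the series by capturing both the features of the series at each level and the relationships between levels. Then, clustering is used to mine the hidden states in the feature sequences. Finally, a hidden Markov model (HMM) is used to design a multi-time-scale trend prediction algorithm (MTSTPA) that predicts both the trend at each scale and the length of that trend. In experiments on real stock data sets, prediction accuracy exceeded 60% at every scale; compared with a model without the feature tree, the model using the feature tree had higher prediction efficiency, with accuracy more than 10 percentage points higher at one scale. MTSTPA also outperformed the classical autoregressive moving average (ARMA) model and PHMM (Pattern-based HMM), which verifies its effectiveness.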

4.
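For time series data mining, which converts time series data into discrete symbol sequences, a time series symbolization method based on a sliding window and local features is proposed. The method uses a sliding window to split the time series, represents each segment by several slopes, and finally clusters the slope-represented segments with the K-means clustering algorithm to symbolize the time series. Experiments demonstrate the effectiveness and accuracy of the method.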

5.
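To further improve pattern-based measurement of trend similarity between time series, the time series are converted into piecewise patterns on top of a piecewise linear representation, based on the mean of each segment's subsequence and the sign of the derivative of its linear fitting function. A pattern matching distance is defined from the similarities and differences between patterns and, borrowing the dynamic programming principle of Dynamic Time Warping (DTW), a Dynamic Pattern Matching (DPM) method is proposed. Experimental results show that the method accurately measures the trend similarity of equal-length time series under different compression ratios, at low time cost. Since unequal length is one form of missing data, the relationship between the method's measurement quality and the proportion of missing data deserves further in-depth study.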

6.
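A piecewise linear representation method for time series based on slope-extracted edge points   Cited by: 7 (self-citations: 0; citations by others: 7)
This paper introduces the slope from analytic geometry and proposes SEEP, a novel piecewise linear representation method for time series based on extracting edge points by slope. For time series whose slopes vary within a fairly narrow range, SEEP works very well: its fitting error against the original time series is considerably smaller than that of earlier piecewise linear representation methods. For time series whose slopes vary over a wide range, its fitting error differs little from that of earlier methods. SEEP is also simple to compute and easy to implement, and its time complexity is only O(n).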
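The abstract does not give the exact SEEP rule, so the following is only one plausible reading of "extracting edge points by slope": keep a point as a segment endpoint when the slope leaving it differs from the slope entering it by more than a threshold. The function name, the `threshold` parameter, and the unit time step are all assumptions for illustration. The sketch makes a single pass over the data, which is consistent with the O(n) complexity stated above.

```python
def slope_edge_points(series, threshold):
    """Return indices of points kept as segment endpoints.

    Assumes samples are equally spaced one time unit apart, so the slope
    between neighbours is simply the difference of their values.
    """
    n = len(series)
    if n < 3:
        return list(range(n))

    kept = [0]  # the first point always starts a segment
    for i in range(1, n - 1):
        slope_in = series[i] - series[i - 1]
        slope_out = series[i + 1] - series[i]
        # A large change of slope marks a candidate edge point.
        if abs(slope_out - slope_in) > threshold:
            kept.append(i)
    kept.append(n - 1)  # the last point always ends a segment
    return kept
```

For example, `slope_edge_points([0, 1, 2, 3, 2, 1, 0], 0.5)` keeps indices 0, 3 and 6, so the series is represented by two straight segments joined at the peak.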

7.
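To address the tendency to lose trend feature information when reducing the dimensionality of time series data, a trend-feature-based symbolic aggregate approximation representation for time series is proposed. In addition to retaining the mean of each segment, it describes the trend of the series jointly through a trend distance factor and a trend shape factor for each segment. A distance measure that satisfies lower-bound tightness is also given, so that time series with different trend features are represented better. Experimental results on public data sets show that the method reduces measures such as the classification false-alarm rate and the dimensionality reduction ratio by more than 10% compared with Symbolic Aggregate Approximation (SAX) and the trend-distance-based symbolic approximation representation (SAX_TD), and has better lower-bound tightness. The results show that the algorithm compresses time series while fully preserving their trend shapes, which improves the efficiency of time series data mining.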

8.
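A sequence mining algorithm is proposed that satisfies four kinds of time parameter constraints: sliding window, maximum gap, minimum gap, and maximum span. The algorithm decomposes the search space by partitioning it into different equivalence classes, grows patterns step by step using temporal joins, and needs to scan the sequence database only once during mining. Because the four parameters embedded in sequences are general, the algorithm can discover not only the patterns found by earlier related algorithms but also patterns that other algorithms cannot find.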

9.
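Dimensionality reduction and similarity search in large-scale time series databases   Cited by: 4 (self-citations: 0; citations by others: 4)
李爱国, 覃征. 《计算机学报》 (Chinese Journal of Computers), 2005, 28(9): 1467-1475
A systematic method for similarity queries in time series databases, based on piecewise polynomial representation (PPR), is proposed. PPR is a class of orthogonal transforms based on linear polynomial regression. Indexing time series data with the PPR transform theoretically guarantees no false dismissals. The paper analyzes the computational complexity of PPR and the lower bound of the query threshold, and proposes a quantitative indicator for measuring the query efficiency of time series similarity query algorithms. Comparative experiments against time series similarity query algorithms based on the discrete Fourier transform (DFT) and the discrete wavelet transform (DWT) show that the proposed algorithm achieves high query efficiency with a low-dimensional index structure.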

10.
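Approximate representation of time series based on change points   Cited by: 1 (self-citations: 0; citations by others: 1)
Approximate representation of time series can improve the efficiency and reliability of time series data mining. An approximate representation of time series based on change points is proposed; it is simple and intuitive, gives high approximation quality, and adapts well. Experiments on real data sets from different domains show that, compared with the important-point piecewise representation and the piecewise constant representation of time series, the change-point-based representation has clear advantages in both approximation quality and adaptability.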

11.
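An efficient lower-bounding technique for time series   Cited by: 3 (self-citations: 0; citations by others: 3)
For time series data, a new lower-bounding technique based on dynamic time warping is proposed. The technique first reduces the dimensionality of the original series using the linear representation of piecewise aggregate approximation and, at the same time, generates a grid minimum bounding rectangle approximation of the query series; it then measures the lower-bound distance between the two based on the dynamic time warping distance. Experimental results show that, compared with earlier related techniques, this lower-bounding technique produces larger lower-bound distances, is tighter, prunes the search space more strongly, and runs in less time, which benefits time series data mining.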

12.
Anomaly detection has received much attention due to its various applications. Generally, the first step in discovering anomalies is a data representation method that reduces dimensionality while preserving key information. Anomaly detection based on real-value representation methods is meaningful because of its convenience in numeric operations. A typical real-value representation method is the Piecewise Aggregate Approximation (PAA), which is simple and intuitive and captures the mean values of segments in a sequence. However, if segments have the same or similar average values but different oscillation amplitudes, the PAA method cannot effectively describe a sequence composed of such segments. To address this issue, we propose a representation method called the Piecewise Aggregate Approximation in the Amplitude Domain (AD-PAA). To discover anomalies, a sequence is first partitioned into subsequences by a sliding window. Then, in the AD-PAA method, each subsequence is divided into equal-size subsections according to the amplitude domain. With the mean values of the subsections computed, the amplitude oscillation of a subsequence is embodied effectively. When the AD-PAA method is applied to approximate subsequences, the AD-PAA representation of a sequence is constructed. Anomalies are determined by anomaly scores that are based on similarities among representation results. Experimental results on various data confirm that the proposed method is more accurate than the PAA-based method and other comparison methods. The proposed algorithm is also better at differentiating anomalies.

13.
The problem of similarity search in large time series databases has attracted much attention recently. It is a non-trivial problem because of the inherent high dimensionality of the data. The most promising solutions involve first performing dimensionality reduction on the data, and then indexing the reduced data with a spatial access method. Three major dimensionality reduction techniques have been proposed: Singular Value Decomposition (SVD), the Discrete Fourier transform (DFT), and more recently the Discrete Wavelet Transform (DWT). In this work we introduce a new dimensionality reduction technique which we call Piecewise Aggregate Approximation (PAA). We theoretically and empirically compare it to the other techniques and demonstrate its superiority. In addition to being competitive with or faster than the other methods, our approach has numerous other advantages. It is simple to understand and to implement, it allows more flexible distance measures, including weighted Euclidean queries, and the index can be built in linear time. Received 16 May 2000 / Revised 18 December 2000 / Accepted in revised form 2 January 2001
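As a rough illustration of why PAA is "simple to understand and to implement", the sketch below reduces a series to the means of equal-length frames in one linear pass. It also includes the usual scaled Euclidean distance between two PAA vectors, which never exceeds the Euclidean distance between the original series. The function names are hypothetical, and the sketch assumes the series length is an exact multiple of the number of segments.

```python
def paa(series, n_segments):
    """Reduce `series` to `n_segments` frame means (equal-length frames)."""
    n = len(series)
    if n_segments <= 0 or n % n_segments != 0:
        raise ValueError("this sketch assumes len(series) is a multiple of n_segments")
    width = n // n_segments
    return [sum(series[i:i + width]) / width for i in range(0, n, width)]


def paa_distance(a_paa, b_paa, n):
    """Distance between two PAA vectors built from series of length n.

    Scaling by sqrt(n / w) makes it a lower bound on the Euclidean
    distance between the two original series.
    """
    w = len(a_paa)
    squared = sum((x - y) ** 2 for x, y in zip(a_paa, b_paa))
    return (n / w) ** 0.5 * squared ** 0.5
```

For instance, `paa([1, 2, 3, 4, 5, 6], 3)` gives `[1.5, 3.5, 5.5]`. Because the reduced distance is a lower bound, an index built on PAA vectors can discard candidates without risking false dismissals.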

14.
Novel Online Methods for Time Series Segmentation   Cited by: 1 (self-citations: 0; citations by others: 1)
To efficiently and effectively mine massive amounts of data in the time series, approximate representation of the data is one of the most commonly used strategies. Piecewise Linear Approximation is such an approach, which represents a time series by dividing it into segments and approximating each segment with a straight line. In this paper, we first propose a new segmentation criterion that improves computing efficiency. Based on this criterion, two novel online piecewise linear segmentation methods are developed, the feasible space window method and the stepwise feasible space window method. The former usually produces much fewer segments and is faster and more reliable in the running time than other methods. The latter can reduce the representation error with fewer segments. It achieves the best overall performance on the segmentation results compared with other methods. Extensive experiments on a variety of real-world time series have been conducted to demonstrate the advantages of our methods.

15.
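刘芬, 郭躬德. 《计算机应用》 (Journal of Computer Applications), 2013, 33(1): 192-198
KP_SAX, an improved key-point-based algorithm for Symbolic Aggregate Approximation (SAX), builds on SAX by using key points to measure point distance between time series. It computes time series similarity more effectively, but it does not adequately reflect the pattern information of the time series and so still cannot measure similarity reasonably. To address the shortcomings of SAX and KP_SAX, a SAX-based composite similarity measure for time series is proposed that combines point distance and pattern distance. First, key points are used to subdivide the equal-length segments of Piecewise Aggregate Approximation (PAA) into sub-segments; next, each sub-segment is represented by a triple containing both kinds of distance information; finally, a composite distance formula is used to compute the similarity between time series, and the results reflect the differences between time series more effectively. Experimental results show that the time efficiency of the improved method is only 0.96% lower than that of KP_SAX, while it outperforms both KP_SAX and SAX in its ability to discriminate between time series.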

16.
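李海林, 邬先利. 《计算机应用》 (Journal of Computer Applications), 2018, 38(11): 3204-3210
To address the low efficiency of traditional anomalous-segment detection methods on incremental time series, a time series anomaly detection (TSAD) method based on frequent pattern discovery is proposed. First, the historical input time series data are converted into symbols. Second, the symbolic features are used to find frequent patterns in the historical sequence data set. Finally, longest common subsequence matching is used to measure the similarity between the frequent patterns and the newly added time series data, thereby finding anomalous patterns in the new data. Compared with the hydrological time series anomaly detection method based on sliding-window prediction (TSOD) and the hydrological time series anomaly mining method based on extended symbolic aggregate approximation (ESAA), TSAD achieved a detection rate above 90% on all three types of time series data chosen for the experiments. TSOD reached a detection rate as high as 99% on highly regular sequences but a low rate on sequences with heavy noise, showing a strong dependence on the data. ESAA did not exceed 70% on any of the three data types. The experimental results show that TSAD finds anomalous segments well in time series anomaly detection.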
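The matching step can be pictured with a small sketch of longest common subsequence (LCS) similarity between two symbol strings. This is the textbook dynamic programme, not the TSAD implementation; normalizing by the longer string's length is an assumption made here so that the score falls between 0 and 1, and both function names are hypothetical.

```python
def lcs_length(x, y):
    """Length of the longest common subsequence of two symbol sequences."""
    prev = [0] * (len(y) + 1)  # DP row for the previous symbol of x
    for cx in x:
        curr = [0]
        for j, cy in enumerate(y, start=1):
            if cx == cy:
                curr.append(prev[j - 1] + 1)  # extend a common subsequence
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_similarity(x, y):
    """LCS length normalized by the longer input (assumed normalization)."""
    if not x or not y:
        return 0.0
    return lcs_length(x, y) / max(len(x), len(y))
```

Under this reading, a new symbolized window whose similarity to every frequent pattern stays below some chosen cutoff would be flagged as a candidate anomaly; the cutoff itself is not given in the abstract.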

17.
In order to achieve an optimum and successful operation of an industrial process, it is important firstly to detect upsets, equipment malfunctions or other abnormal events as early as possible and secondly to identify and remove the cause of those events. Univariate and multivariate statistical process control methods have been widely applied in process industries for early fault detection and localization. The primary objective of the proposed research is the design of an anomaly detection and visualization tool that is able to present to the shift operator – and to the various levels of plant operation and company management – an early, global, accurate and consolidated presentation of the operation of major subgroups or of the whole plant, aided by a graphical form. Piecewise Aggregate Approximation (PAA) and Symbolic Aggregate Approximation (SAX) are considered two of the most popular representations for time series data mining, including clustering, classification, pattern discovery and visualization in time series datasets. However, SAX is preferred since it is able to transform a time series into a set of discrete symbols, e.g. into alphabet letters, and is thus far more appropriate for a graphical representation of the corresponding information, especially for the shift operator. The methods are applied to individual time records of each process variable, as well as to entire groups of time records of process variables in combination with Hidden Markov Models. In this way, the proposed visualization tool is not only associated with a process defect, but also allows identifying which specific abnormal situation occurred and whether it has occurred in the past. Case studies based on the benchmark Tennessee Eastman process demonstrate the effectiveness of the proposed approach. The results indicate that the proposed visualization tool captures meaningful information hidden in the observations and shows superior monitoring performance.
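To show how a time series becomes "alphabet letters", here is a minimal sketch of the standard SAX pipeline: z-normalize, take frame means (PAA), then map each mean to a symbol using breakpoints that cut the standard normal distribution into equiprobable regions. It is a generic illustration, not the tool described in the abstract; the function name and default alphabet are assumptions, and the series length is assumed to be a multiple of the number of segments.

```python
from bisect import bisect_right
from statistics import NormalDist, mean, pstdev


def sax(series, n_segments, alphabet="abcd"):
    """Convert a numeric series into a SAX word of length `n_segments`."""
    n = len(series)
    if n_segments <= 0 or n % n_segments != 0:
        raise ValueError("this sketch assumes len(series) is a multiple of n_segments")

    # 1. z-normalize so the values are comparable to a standard normal.
    mu, sigma = mean(series), pstdev(series)
    z = [0.0] * n if sigma == 0 else [(v - mu) / sigma for v in series]

    # 2. PAA: mean of each equal-length frame.
    width = n // n_segments
    frame_means = [sum(z[i:i + width]) / width for i in range(0, n, width)]

    # 3. Breakpoints splitting N(0, 1) into len(alphabet) equiprobable bins.
    k = len(alphabet)
    breakpoints = [NormalDist().inv_cdf(i / k) for i in range(1, k)]

    # 4. Each frame mean becomes the letter of the bin it falls into.
    return "".join(alphabet[bisect_right(breakpoints, m)] for m in frame_means)
```

For example, `sax([0, 0, 1, 1, 2, 2, 3, 3], 4)` yields `"abcd"`: a steadily rising series walks through the alphabet from the lowest bin to the highest, which is the kind of compact, readable summary an operator display could use.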
