Similar documents
 20 similar documents found
1.
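Piecewise linear representation (PLR) is an effective method for reducing the dimensionality of time series. Based on an analysis of how series trends change, this paper proposes a PLR algorithm based on trend turning points. Trend turning points are first defined as the candidate set of segmentation points, and the importance of each turning point is measured by a point-to-region distance. Important trend turning points are then selected as segmentation points according to a given threshold, and the series is represented piecewise linearly. An experimental comparison with six other methods shows that the proposed method achieves good fitting quality and adaptability and, for series with pronounced turning points, is strongly robust to noise.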
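The two steps named in the abstract above (collect trend turning points as candidates, then keep those whose importance exceeds a threshold) can be sketched as follows. This is a minimal illustration, not the authors' algorithm: the abstract does not define its point-to-region distance, so the sketch substitutes the vertical distance from a candidate to the chord joining the last kept point and the next candidate. The function names `turning_points` and `segment` are hypothetical.

```python
def turning_points(series):
    """Indices where the local trend reverses (the candidate set)."""
    idx = []
    for i in range(1, len(series) - 1):
        left = series[i] - series[i - 1]
        right = series[i + 1] - series[i]
        if left * right < 0:  # rise then fall, or fall then rise
            idx.append(i)
    return idx


def segment(series, threshold):
    """Keep candidates whose (assumed) importance reaches the threshold.

    Importance here is the vertical distance to the chord between the
    last kept point and the next candidate; this is a stand-in for the
    paper's point-to-region distance, which the abstract does not define.
    """
    candidates = [0] + turning_points(series) + [len(series) - 1]
    kept = [0]
    for pos in range(1, len(candidates) - 1):
        i = candidates[pos]
        a, b = kept[-1], candidates[pos + 1]
        interp = series[a] + (series[b] - series[a]) * (i - a) / (b - a)
        if abs(series[i] - interp) >= threshold:
            kept.append(i)
    kept.append(len(series) - 1)
    return kept


series = [0, 1, 2, 3, 2.9, 3.1, 4, 5, 1, 0]
print(turning_points(series))  # small wiggle at 3 and 4, sharp peak at 7
print(segment(series, 1.0))    # the wiggle is dropped, the peak is kept
```

With a larger threshold fewer turning points survive, so the threshold trades compression against fitting error.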

2.
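Similarity measurement is the foundation of time series data mining and underpins the efficiency and accuracy of mining tasks. This paper proposes a hierarchical segmentation and similarity measurement method for time series. The method first identifies the extreme points of a time series, segments the series hierarchically according to the characteristics of those extreme points, and on this basis defines a new distance formula to measure the similarity between time series. The new measure is used to cluster time series. Experimental results show that it measures similarity effectively and yields clear clustering results, indicating good practicality and application prospects.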

3.
The previous decade has brought a remarkable increase in interest in applications that deal with querying and mining of time series data. Many of the research efforts in this context have focused on introducing new representation methods for dimensionality reduction or novel similarity measures for the underlying data. In the vast majority of cases, each individual work introducing a particular method has made specific claims and, aside from the occasional theoretical justifications, provided quantitative experimental observations. However, for the most part, the comparative aspects of these experiments were too narrowly focused on demonstrating the benefits of the proposed methods over some of the previously introduced ones. In order to provide a comprehensive validation, we conducted an extensive experimental study re-implementing eight different time series representations and nine similarity measures and their variants, and testing their effectiveness on 38 time series data sets from a wide variety of application domains. In this article, we give an overview of these different techniques and present our comparative experimental findings regarding their effectiveness. In addition to providing a unified validation of some of the existing achievements, our experiments also indicate that, in some cases, certain claims in the literature may be unduly optimistic.

4.
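Time series are characterized by large data volume, high dimensionality, and rapid updates, which makes it difficult for general piecewise linear methods to capture the global trend features of the original series. To address these characteristics, an autonomous piecewise representation method based on a temporal edge operator (APLR_TEO) is proposed, which can effectively capture the shape features of a time series. First, the temporal edge operator is convolved with the original time series and edge extreme points are obtained according to association rules; then, according to the changing characteristics of the time series...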

5.
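Time series similarity measurement plays an important role in mining time series patterns and extracting relationships among time series. This paper analyzes the current mainstream similarity measurement algorithms, points out the shortcomings of each when measuring the similarity of time series data, and proposes a similarity measurement algorithm based on mathematical morphology. Normalized time series are represented as binary images, and the dilation and erosion operations from image processing are introduced to perform morphological transformation analysis on the data. This improves the noise robustness of the similar parts of the series without reducing the differences between the dissimilar parts, thereby improving the classification accuracy of similarity measurement. Classification experiments on eight time series test data sets show that the proposed algorithm effectively improves classification accuracy: compared with the DTW similarity measure, accuracy improves by 8.74% on average and by up to 20%.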

6.
Time series data, due to their numerical and continuous nature, are difficult to process, analyze, and mine. However, these tasks become easier when the data can be transformed into meaningful symbols. Most recent works on time series only address how to identify a given pattern from a time series and do not consider the problem of identifying a suitable set of time points for segmenting the time series in accordance with a given set of pattern templates (e.g., a set of technical patterns for stock analysis). Fixed-length segmentation is an oversimplified approach to this problem; a dynamic approach (with high controllability) is preferable so that the time series can be segmented flexibly and effectively according to the needs of the users and the applications. In view of the fact that this segmentation problem is an optimization problem and evolutionary computation is an appropriate tool to solve it, we propose an evolutionary time series segmentation algorithm. This approach allows a sizeable set of pattern templates to be generated for mining or query. In addition, defining similarity between time series (or time series segments) is of fundamental importance in fitness computation. By identifying the perceptually important points directly from the time domain, time series segments and templates of different lengths can be compared and intuitive pattern matching can be carried out in an effective and efficient manner. Encouraging experimental results are reported from tests that segment both artificial time series generated from combinations of pattern templates and the time series of selected Hong Kong stocks.

7.
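Similarity measurement is one of the basic tasks of time series analysis and the key to similarity matching. To address the inadequacy of Euclidean distance in describing segment trends and the discretization of distance values between corresponding segments in various pattern distances, a time series similarity measure based on shape similarity distance is proposed. Recognition and clustering experiments on standard data sets demonstrate the feasibility and effectiveness of the method.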

8.
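Motivated by the practical needs of chromatographic data in petroleum mud logging, a new piecewise fitting algorithm for time series is proposed. In a single scan of the data, the algorithm selects key points that reflect changes in the series trend according to two constraints, a midline distance threshold and a threshold on how long extreme points persist in non-monotonic sequences, and then fits the time series linearly. Experimental results show that the algorithm preserves the main shape of the original series while removing noise, precisely locates abrupt turning points in monotonic sequences, and detects spikes in the series.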

9.
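Time Series Anomaly Detection   Cited by: 3 (self-citations: 0, citations by others: 3)
Building on the k-nearest-neighbor local anomaly detection algorithm and combining it with a time series segmentation method, an efficient time series anomaly detection algorithm is proposed. The algorithm first uses the important points of the series as segmentation points to compress the time series data at a high ratio, and then uses a local anomaly detection method to detect anomalous patterns in the series. Experiments on electrocardiogram (ECG) data verify the effectiveness and soundness of the algorithm.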

10.
The study demonstrates the superiority of fuzzy-based methods for non-stationary, non-linear time series. The study is based on unequal-length fuzzy sets and uses IF-THEN fuzzy rules to capture the trend prevailing in the series. The proposed model not only predicts the value but can also identify the transition points where the series may change its shape, and it can incorporate a subject expert's opinion into the forecast. The model is tested on three different types of data: enrollment at the University of Alabama, the sales volume of a chemical company, and the gross domestic capital of India (the growth curve). The model is tested on both kinds of series, with and without outliers. The proposed model provides improved predictions with a lower MAPE (mean absolute percentage error) for all the series tested.
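For reference, the MAPE criterion used in the abstract above is the mean of the absolute relative errors, expressed as a percentage. A minimal sketch follows; the function name `mape` and the sample numbers are illustrative and are not taken from the paper.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent.

    Undefined when any actual value is zero, because each error is
    divided by the corresponding actual value.
    """
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


# Illustrative values only: each forecast is off by 10% of the actual value.
print(mape([100, 200], [110, 180]))
```

A lower MAPE means the forecasts are closer to the observed values in relative terms, which is why it allows series with different scales to be compared.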

11.
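Piecewise Linear Representation Based on Time Series Trend Turning Points*   Cited by: 10 (self-citations: 2, citations by others: 8)
Making full use of the time-varying characteristics of time series, and aiming to extract trends effectively and compress the original data, a piecewise linear representation method based on trend turning points is proposed. While extracting trends and compressing the original data, the method can partition the series as its length grows. It is efficient, simple to implement, and intuitive in its results, and it adapts well to data from different domains.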

12.
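李建勋, 马美玲, 郭建华, 严峻. 《计算机应用》 (Journal of Computer Applications), 2019, 39(10): 2955-2959
To address the identification of fake data that conforms to a certain data pattern or regularity, a method for discriminating fake-trend time series based on randomness analysis is proposed. Building on an analysis of the composition of time series, the method first explores simple and complex ways of forging fake-trend time series and decomposes such series into a fake trend part and a fake random part. The fake trend part is then extracted by basis function approximation, and the fake random part is analyzed using randomness theory. Finally, the monobit frequency test and the frequency test within a block are used to check whether the fake random part is actually random, providing a solution for discriminating fake time series with certain trend characteristics. Experimental results show that the method can effectively decompose fake time series and extract the fake trend part, discriminate both simply and complexly forged data, and support authenticity analysis of numerical data obtained by observation or from detection equipment. It further widens the range of fake data that can be discriminated, with an average discrimination accuracy of up to 74.7%.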
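The two randomness checks named in the abstract above are standard tests on a bit sequence, and a minimal sketch of both follows. The statistics are the usual monobit and block-frequency statistics from the NIST SP 800-22 test suite. How the paper converts the residual (the "fake random part") into bits is not stated in the abstract, so the sketch takes a list of 0/1 values as given. The function names are hypothetical.

```python
import math


def monobit_p_value(bits):
    """Monobit frequency test: are ones and zeros roughly balanced?"""
    n = len(bits)
    s = sum(1 if b else -1 for b in bits)  # +1 for each one, -1 for each zero
    s_obs = abs(s) / math.sqrt(n)
    return math.erfc(s_obs / math.sqrt(2))


def block_frequency_statistic(bits, block_size):
    """Chi-square statistic of the frequency test within a block.

    Trailing bits that do not fill a whole block are ignored. Turning
    the statistic into a p-value needs the regularized upper incomplete
    gamma function, which is omitted here to stay within the standard
    library.
    """
    n_blocks = len(bits) // block_size
    total = 0.0
    for k in range(n_blocks):
        block = bits[k * block_size:(k + 1) * block_size]
        proportion = sum(block) / block_size
        total += (proportion - 0.5) ** 2
    return 4.0 * block_size * total


bits = [1, 0, 1, 1, 0, 1, 0, 1, 0, 1]
print(monobit_p_value(bits))
print(block_frequency_statistic([0, 1, 1, 0, 0, 1, 1, 0, 1, 0], 3))
```

A small p-value (commonly below 0.01) suggests the sequence is not random. In the setting of the abstract, that would count as evidence that the "random" part of a series was fabricated.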

13.
The paper presents SwiftSeg, a novel technique for online time series segmentation and piecewise polynomial representation. The segmentation approach is based on a least-squares approximation of time series in sliding and/or growing time windows utilizing a basis of orthogonal polynomials. This allows the definition of fast update steps for the approximating polynomial, where the computational effort depends only on the degree of the approximating polynomial and not on the length of the time window. The coefficients of the orthogonal expansion of the approximating polynomial, obtained by means of the update steps, can be interpreted as optimal (in the least-squares sense) estimators for average, slope, curvature, change of curvature, etc., of the signal in the time window considered. These coefficients, as well as the approximation error, may be used in a very intuitive way to define segmentation criteria. The properties of SwiftSeg are evaluated by means of some artificial and real benchmark time series. It is compared to three different offline and online techniques to assess its accuracy and runtime. It is shown that SwiftSeg, which is suitable for many data streaming applications, offers high accuracy at very low computational costs.
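The idea of a constant-cost update in a growing window can be illustrated with a much simpler stand-in than SwiftSeg itself. The sketch below fits a straight line (degree 1) using running sums instead of an orthogonal polynomial basis, and starts a new segment when the squared error exceeds a threshold. The class `GrowingLineFit`, the function `segment_online`, and the threshold rule are assumptions made for illustration, not the paper's update formulas or criteria.

```python
class GrowingLineFit:
    """Least-squares line over a growing window, updated in O(1) per point."""

    def __init__(self):
        self.n = 0
        self.sx = self.sy = self.sxx = self.sxy = self.syy = 0.0

    def add(self, x, y):
        self.n += 1
        self.sx += x
        self.sy += y
        self.sxx += x * x
        self.sxy += x * y
        self.syy += y * y

    def slope_intercept(self):
        denom = self.n * self.sxx - self.sx * self.sx
        if self.n < 2 or denom == 0:
            raise ValueError("need at least two distinct x values")
        slope = (self.n * self.sxy - self.sx * self.sy) / denom
        intercept = (self.sy - slope * self.sx) / self.n
        return slope, intercept

    def sse(self):
        """Sum of squared residuals of the fitted line."""
        slope, intercept = self.slope_intercept()
        # Follows from the normal equations; clamp rounding noise at zero.
        return max(0.0, self.syy - intercept * self.sy - slope * self.sxy)


def segment_online(values, max_sse):
    """Close the current segment when the line fit error exceeds max_sse."""
    boundaries = [0]
    fit = GrowingLineFit()
    for t, y in enumerate(values):
        fit.add(t, y)
        if fit.n >= 3 and fit.sse() > max_sse:
            boundaries.append(t - 1)        # previous point ends the segment
            fit = GrowingLineFit()
            fit.add(t - 1, values[t - 1])   # and starts the next one
            fit.add(t, y)
    boundaries.append(len(values) - 1)
    return boundaries


print(segment_online([0, 1, 2, 3, 4, 3, 2, 1, 0], max_sse=0.1))
```

Raw running sums can lose precision on long windows with large time indices. Avoiding that kind of problem is one reason a method might prefer an orthogonal basis, though the abstract itself only claims that the update cost is independent of window length.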

14.
Hierarchical image segmentation based on similarity of NDVI time series   Cited by: 1 (self-citations: 0, citations by others: 1)
Although a variety of hierarchical image segmentation procedures for remote sensing imagery have been published, none of them specifically integrates remote sensing time series in spatial or hierarchical segmentation concepts. However, this integration is important for the analysis of ecosystems which are hierarchical in nature, with different ecological processes occurring at different spatial and temporal scales. Therefore, the objective of this paper is to introduce a multi-temporal hierarchical image segmentation (MTHIS) methodology to generate a hierarchical set of segments based on spatial similarity of remote sensing time series. MTHIS employs the similarity of the fast Fourier transform (FFT) components of multi-seasonal time series to group pixels with similar temporal behavior into hierarchical segments at different scales. Use of the FFT allows the distinction between noise and vegetation related signals and increases the computational efficiency. The MTHIS methodology is demonstrated on the area of South Africa in an MTHIS protocol for Normalized Difference Vegetation Index (NDVI) time series. Firstly, the FFT components that express the major spatio-temporal variation in the NDVI time series, the average and annual term, are selected and the segmentation is performed based on these components. Secondly, the results are visualized by means of a boundary stability image that confirms the accuracy of the algorithm to spatially group pixels at different scale levels. Finally, the segmentation optimum is determined based on discrepancy measures which illustrate the correspondence of the applied MTHIS output with landcover-landuse maps describing the actual vegetation. In future research, MTHIS can be used to analyze the spatial and hierarchical structure of any type of remote sensing time series and their relation to ecosystem processes.
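The two components that the abstract above selects, the average and the annual term, can be read directly off the discrete Fourier transform of a per-pixel series. The sketch below is an assumption-laden illustration: it evaluates the two needed DFT coefficients directly instead of calling an FFT routine (to stay within the standard library), it assumes the series covers a whole number of years, and the name `mean_and_annual` is hypothetical.

```python
import cmath
import math


def dft_component(series, k):
    """k-th DFT coefficient of the series, evaluated directly."""
    n = len(series)
    return sum(x * cmath.exp(-2j * math.pi * k * t / n)
               for t, x in enumerate(series))


def mean_and_annual(series, samples_per_year):
    """Return (mean, annual amplitude, annual phase) for one pixel."""
    n = len(series)
    if n == 0 or n % samples_per_year != 0:
        raise ValueError("series must cover a whole number of years")
    years = n // samples_per_year
    mean = dft_component(series, 0).real / n
    annual = dft_component(series, years)  # one cycle per year
    return mean, 2 * abs(annual) / n, cmath.phase(annual)


# Synthetic two-year monthly series: mean 0.5, annual amplitude 0.2.
ndvi = [0.5 + 0.2 * math.cos(2 * math.pi * t / 12) for t in range(24)]
print(mean_and_annual(ndvi, 12))
```

Pixels could then be compared by a distance between these feature tuples. Which distance MTHIS actually uses is not stated in the abstract.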

15.
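Anomaly Detection Algorithm Based on Time Series Pattern Representation*   Cited by: 2 (self-citations: 0, citations by others: 2)
An anomaly detection algorithm (PREOV) is proposed that extracts outliers from time series based on their pattern representation. The pattern representation itself compresses the data, preserves the basic shape of the series, and removes some noise. Extracting outliers on top of the pattern representation can greatly improve the efficiency and accuracy of the algorithm. The algorithm also uses a pruning strategy that further reduces its time complexity. It is computationally simple, easy to implement, requires no training, and supports dynamically growing time series.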

16.
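A Piecewise Linear Representation Method for Time Series Based on Information Entropy   Cited by: 1 (self-citations: 0, citations by others: 1)
Some time series are high-dimensional, large in volume, and updated rapidly, which makes data mining on the original series difficult. To address this, a piecewise linear representation method based on information entropy, PLR_IE, is proposed. The algorithm uses information entropy as the performance index for judging the number of important points, extracts the distribution of the number of important segmentation points from the series, and refits the original time series with the sequence formed by the important points, providing a basis for subsequent data mining. Experimental results show that the method efficiently extracts the main features of the series and fits the original series.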

17.
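王玲, 李泽中. 《控制与决策》 (Control and Decision), 2024, 39(2): 568-576
In existing multivariate time series segmentation algorithms, the selection of segmentation points and the determination of the number of segments usually have to be carried out separately, which greatly increases computational complexity. To solve this problem, an adaptive greedy Gaussian segmentation algorithm for multivariate time series is proposed. The algorithm interprets the data in each segment as independent samples from a different multivariate Gaussian distribution, and thereby converts the segmentation problem into a covariance-regularized maximum likelihood estimation problem. To improve learning efficiency, a greedy search maximizes the likelihood of each segment to approximately find the optimal segmentation points, and during the search an information gain method adaptively obtains the optimal number of segments. This avoids determining the number of segments and selecting the segmentation points separately, and so reduces computational complexity. Experiments on real data sets from several domains show that the proposed method outperforms traditional methods in both segmentation accuracy and running efficiency, and that it can effectively perform anomaly detection on multivariate time series.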
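The greedy likelihood search described in the abstract above can be sketched in a heavily simplified form. The paper works with multivariate data, covariance regularization, and an information gain criterion for the number of segments. The sketch below is univariate, adds a small constant to the variance as a stand-in for regularization, and stops when the likelihood gain of the best split falls below a fixed threshold. All function names and the parameters `reg`, `min_len`, and `min_gain` are assumptions for illustration.

```python
import math


def gaussian_loglik(values, reg=1e-3):
    """Log-likelihood of values under one Gaussian with regularized variance."""
    n = len(values)
    mean = sum(values) / n
    ss = sum((x - mean) ** 2 for x in values)
    var = ss / n + reg  # reg keeps the variance away from zero
    return -0.5 * n * math.log(2 * math.pi * var) - 0.5 * ss / var


def best_split(values, start, end, min_len=2):
    """Best breakpoint inside values[start:end] and its likelihood gain."""
    whole = gaussian_loglik(values[start:end])
    best_t, best_gain = None, 0.0
    for t in range(start + min_len, end - min_len + 1):
        gain = (gaussian_loglik(values[start:t])
                + gaussian_loglik(values[t:end]) - whole)
        if gain > best_gain:
            best_t, best_gain = t, gain
    return best_t, best_gain


def greedy_segment(values, min_gain):
    """Add the single most helpful breakpoint until the gain is too small."""
    breakpoints = [0, len(values)]
    while True:
        candidates = [best_split(values, a, b)
                      for a, b in zip(breakpoints, breakpoints[1:])]
        t, gain = max(candidates, key=lambda c: c[1])
        if t is None or gain < min_gain:
            break
        breakpoints.append(t)
        breakpoints.sort()
    return breakpoints[1:-1]


low = [0.0, 0.1, -0.1, 0.0, 0.1, -0.1]
high = [5.0, 5.1, 4.9, 5.0, 5.1, 4.9]
print(greedy_segment(low + high, min_gain=5.0))
```

Because the stopping rule is evaluated inside the same loop that picks breakpoints, the number of segments and their positions are found together, which is the property the abstract emphasizes.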

18.
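This paper proposes a new vehicle detection algorithm for magnetic sensors. The algorithm first preprocesses the magnetic sensor time series with a variable-length sliding window filter, then uses PLA to extract features from the smoothed time series for vehicle detection, from which the relevant traffic information is obtained. Simulation experiments show that the algorithm effectively reduces the influence of slow-moving large vehicles on the detection results while maintaining high accuracy.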

19.
Recently, the increasing use of time series data has initiated various research and development attempts in the field of data and knowledge management. Time series data is characterized by large data size, high dimensionality, and continuous updating. Moreover, time series data is always considered as a whole instead of as individual numerical fields. A large share of time series data comes from the stock market, and stock time series have their own characteristics compared with other time series. In addition, dimensionality reduction is an essential step before many time series analysis and mining tasks. For these reasons, research is prompted to augment existing technologies and build new representations to manage financial time series data. In this paper, a financial time series is represented according to the importance of its data points. With the concept of data point importance, a tree data structure that supports incremental updating is proposed to represent the time series, and an access method is introduced for retrieving the time series data points from the tree in order of importance. This technique is capable of presenting the time series at different levels of detail and facilitates multi-resolution dimensionality reduction of the time series data. Different data point importance evaluation methods, a new updating method, and two dimensionality reduction approaches are proposed and evaluated by a series of experiments. Finally, the application of the proposed representation in a mobile environment is demonstrated.
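Ordering data points by importance, as the abstract above describes, can be illustrated with a perceptually-important-point style selection: start from the two end points, then repeatedly add the point that lies farthest from the line joining its two already-selected neighbors. This is a sketch under assumptions. Vertical distance is only one of several possible importance measures, the tree structure and incremental update from the paper are not reproduced, and the name `importance_order` is hypothetical.

```python
def importance_order(series, count):
    """Return up to `count` indices, most important first.

    Importance is the vertical distance to the line joining the two
    neighboring points that have already been selected (an assumed
    measure, chosen for simplicity).
    """
    n = len(series)
    if n < 2:
        return list(range(n))
    selected = [0, n - 1]  # kept sorted by position
    order = [0, n - 1]     # kept in order of selection
    while len(order) < min(count, n):
        best_i, best_d = None, -1.0
        for left, right in zip(selected, selected[1:]):
            for i in range(left + 1, right):
                interp = (series[left] + (series[right] - series[left])
                          * (i - left) / (right - left))
                d = abs(series[i] - interp)
                if d > best_d:
                    best_i, best_d = i, d
        if best_i is None:  # no unselected points remain
            break
        order.append(best_i)
        selected.append(best_i)
        selected.sort()
    return order


prices = [1, 2, 8, 3, 1, -1, 4]
print(importance_order(prices, 4))
```

Taking the first k indices of the result gives a k-point approximation of the series, so a single ordering supports every level of detail.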

20.
In this paper, we present a new model to handle four major issues of fuzzy time series forecasting, viz., determination of the effective length of intervals, handling of fuzzy logical relationships (FLRs), determination of a weight for each FLR, and defuzzification of fuzzified time series values. To resolve the problem associated with determining the length of intervals, this study suggests a new time series data discretization technique. After generating the intervals, the historical time series data set is fuzzified based on fuzzy time series theory. The fuzzified time series values are then used to create the FLRs. Most of the existing fuzzy time series models simply ignore the repeated FLRs without any proper justification. Since FLRs represent the patterns of historical events and reflect the possibility that these patterns will appear in the future, simply discarding the repeated FLRs may lose information. Therefore, in this model, it is recommended to consider the repeated FLRs during forecasting. It is also suggested to assign weights to the FLRs based on their severity rather than their patterns of occurrence. For this purpose, a new technique is incorporated in the model. This technique determines the weight for each FLR based on the index of the fuzzy set associated with the current state of the FLR. To handle these weighted FLRs and to obtain the forecasted results, this study proposes a new defuzzification technique. The proposed model is verified and validated with three different time series data sets. Empirical analyses indicate that the proposed model is robust and handles one-factor time series data sets more efficiently than the conventional fuzzy time series models. Experimental results show that the proposed model also outperforms the conventional statistical models.
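The first steps of the pipeline in the abstract above (discretize, fuzzify, build FLRs while keeping repeats) can be sketched as follows. The paper's own discretization technique, weighting scheme, and defuzzification are not specified in the abstract, so the sketch substitutes equal-width intervals and stops after counting FLRs. The names `fuzzify` and `flrs_with_repeats` and the sample values are hypothetical.

```python
from collections import Counter


def fuzzify(series, n_intervals):
    """Map each value to the index (1-based) of its equal-width interval.

    Equal-width intervals are an assumption; the paper proposes its own
    discretization technique, which the abstract does not describe.
    """
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("series must contain at least two distinct values")
    width = (hi - lo) / n_intervals
    labels = []
    for x in series:
        k = min(int((x - lo) / width), n_intervals - 1)  # top edge joins last bin
        labels.append(k + 1)  # fuzzy sets A1..An
    return labels


def flrs_with_repeats(labels):
    """Count each relationship A_i -> A_j, keeping repeated occurrences."""
    return Counter(zip(labels, labels[1:]))


values = [10, 20, 30, 20, 30, 40]  # illustrative data only
labels = fuzzify(values, 3)
print(labels)
print(flrs_with_repeats(labels))
```

Keeping the counts preserves the information that a model which discards repeated FLRs would lose. The index of the current-state fuzzy set is also available here as the first element of each pair, which is the quantity the paper's weighting is based on.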
