Similar literature
Found 20 similar documents (search time: 15 ms)
1.
The previous decade has brought a remarkable increase in interest in applications that deal with querying and mining time series data. Many of the research efforts in this context have focused on introducing new representation methods for dimensionality reduction or novel similarity measures for the underlying data. In the vast majority of cases, each individual work introducing a particular method has made specific claims and, aside from occasional theoretical justifications, provided quantitative experimental observations. However, for the most part, the comparative aspects of these experiments were too narrowly focused on demonstrating the benefits of the proposed methods over some of the previously introduced ones. In order to provide a comprehensive validation, we conducted an extensive experimental study re-implementing eight different time series representations and nine similarity measures and their variants, and testing their effectiveness on 38 time series data sets from a wide variety of application domains. In this article, we give an overview of these different techniques and present our comparative experimental findings regarding their effectiveness. In addition to providing a unified validation of some of the existing achievements, our experiments also indicate that, in some cases, certain claims in the literature may be unduly optimistic.

2.
Time series data, due to their numerical and continuous nature, are difficult to process, analyze, and mine. However, these tasks become easier when the data can be transformed into meaningful symbols. Most recent works on time series only address how to identify a given pattern from a time series and do not consider the problem of identifying a suitable set of time points for segmenting the time series in accordance with a given set of pattern templates (e.g., a set of technical patterns for stock analysis). Fixed-length segmentation is an oversimplified approach to this problem; hence, a dynamic approach (with high controllability) is preferable so that the time series can be segmented flexibly and effectively according to the needs of the users and the applications. Because this segmentation problem is an optimization problem and evolutionary computation is an appropriate tool for solving it, we propose an evolutionary time series segmentation algorithm. This approach allows a sizeable set of pattern templates to be generated for mining or query. In addition, defining similarity between time series (or time series segments) is of fundamental importance in fitness computation. By identifying the perceptually important points directly from the time domain, time series segments and templates of different lengths can be compared and intuitive pattern matching can be carried out in an effective and efficient manner. Encouraging experimental results are reported from tests that segment both artificial time series generated from combinations of pattern templates and the time series of selected Hong Kong stocks.
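
The idea of picking perceptually important points (PIPs) directly from the time domain can be sketched as follows. This is a minimal illustration, assuming the common vertical-distance variant in which the next PIP is the point farthest (vertically) from the chord joining its two neighbouring PIPs; the function name and the choice of distance are assumptions, not details confirmed by the abstract.

```python
def perceptually_important_points(series, n_points):
    """Return the sorted indices of n_points PIPs (vertical-distance variant)."""
    if n_points < 2 or len(series) < 2:
        raise ValueError("need at least two points")
    n_points = min(n_points, len(series))
    pips = [0, len(series) - 1]  # the two endpoints are always kept
    while len(pips) < n_points:
        best_idx, best_dist = None, -1.0
        # Examine every gap between two adjacent PIPs.
        for left, right in zip(pips, pips[1:]):
            for i in range(left + 1, right):
                # Height at position i of the chord joining the neighbouring PIPs.
                t = (i - left) / (right - left)
                chord = series[left] + t * (series[right] - series[left])
                dist = abs(series[i] - chord)
                if dist > best_dist:
                    best_idx, best_dist = i, dist
        if best_idx is None:  # no interior points left to choose from
            break
        pips.append(best_idx)
        pips.sort()
    return pips


# Hypothetical toy series with one peak and one trough.
print(perceptually_important_points([0, 1, 5, 1, 0, -4, 0], 4))
```

Because a segment and a template are each reduced to the same number of PIPs, sequences of different lengths become directly comparable, which is the property the abstract relies on.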

3.
The study demonstrates the superiority of fuzzy-based methods for non-stationary, non-linear time series. The study is based on unequal-length fuzzy sets and uses IF-THEN fuzzy rules to capture the trend prevailing in the series. The proposed model not only predicts the value but can also identify the transition points where the series may change its shape, and it can incorporate a subject expert's opinion into the forecast. The model is tested on three different types of data: enrollment at the University of Alabama, sales volume of a chemical company, and gross domestic capital of India (the growth curve). The model is tested on both kinds of series: with and without outliers. The proposed model provides an improved prediction with a lower MAPE (mean absolute percentage error) for all the series tested.
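
For reference, MAPE can be computed as in the short sketch below. The function name and the sample numbers are illustrative assumptions; they are not data from the study.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    if not actual or len(actual) != len(forecast):
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


# Hypothetical values: each forecast is off by 10% of the actual value.
print(mape([100.0, 200.0], [110.0, 180.0]))
```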

4.
Hierarchical image segmentation based on similarity of NDVI time series (total citations: 1; self-citations: 0; citations by others: 1)
Although a variety of hierarchical image segmentation procedures for remote sensing imagery have been published, none of them specifically integrates remote sensing time series in spatial or hierarchical segmentation concepts. However, this integration is important for the analysis of ecosystems, which are hierarchical in nature, with different ecological processes occurring at different spatial and temporal scales. Therefore, the objective of this paper is to introduce a multi-temporal hierarchical image segmentation (MTHIS) methodology to generate a hierarchical set of segments based on spatial similarity of remote sensing time series. MTHIS employs the similarity of the fast Fourier transform (FFT) components of multi-seasonal time series to group pixels with similar temporal behavior into hierarchical segments at different scales. Use of the FFT allows the distinction between noise and vegetation-related signals and increases the computational efficiency. The MTHIS methodology is demonstrated on the area of South Africa in an MTHIS protocol for Normalized Difference Vegetation Index (NDVI) time series. Firstly, the FFT components that express the major spatio-temporal variation in the NDVI time series, the average and annual term, are selected and the segmentation is performed based on these components. Secondly, the results are visualized by means of a boundary stability image that confirms the accuracy of the algorithm in spatially grouping pixels at different scale levels. Finally, the segmentation optimum is determined based on discrepancy measures which illustrate the correspondence of the applied MTHIS output with landcover-landuse maps describing the actual vegetation. In future research, MTHIS can be used to analyze the spatial and hierarchical structure of any type of remote sensing time series and their relation to ecosystem processes.

5.
The paper presents SwiftSeg, a novel technique for online time series segmentation and piecewise polynomial representation. The segmentation approach is based on a least-squares approximation of time series in sliding and/or growing time windows utilizing a basis of orthogonal polynomials. This allows the definition of fast update steps for the approximating polynomial, where the computational effort depends only on the degree of the approximating polynomial and not on the length of the time window. The coefficients of the orthogonal expansion of the approximating polynomial, obtained by means of the update steps, can be interpreted as optimal (in the least-squares sense) estimators for average, slope, curvature, change of curvature, etc., of the signal in the time window considered. These coefficients, as well as the approximation error, may be used in a very intuitive way to define segmentation criteria. The properties of SwiftSeg are evaluated by means of some artificial and real benchmark time series. It is compared to three different offline and online techniques to assess its accuracy and runtime. It is shown that SwiftSeg, which is suitable for many data streaming applications, offers high accuracy at very low computational costs.
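
The point that the update cost can be independent of the window length can be illustrated with a much simpler, degree-1 analogue. The sketch below is not SwiftSeg itself: it uses plain running sums rather than an orthogonal polynomial basis, and the class and method names are hypothetical. It keeps a sliding window and returns least-squares estimates of the window's average and slope, with a constant amount of work per new sample.

```python
from collections import deque


class SlidingLineFit:
    """Least-squares line over a sliding window, updated in O(1) per sample."""

    def __init__(self, window):
        if window < 2:
            raise ValueError("window must hold at least two samples")
        self.n = window
        self.buf = deque()
        self.sum_y = 0.0   # sum of y_i over the window
        self.sum_iy = 0.0  # sum of i * y_i, with i = 0 for the oldest sample

    def update(self, y):
        if len(self.buf) == self.n:
            oldest = self.buf.popleft()
            # Dropping the oldest sample shifts every remaining index down by 1.
            self.sum_iy -= self.sum_y - oldest
            self.sum_y -= oldest
        self.sum_iy += len(self.buf) * y  # the new sample takes the last index
        self.sum_y += y
        self.buf.append(y)

    def estimate(self):
        """Return (average, slope) of the least-squares line in the window."""
        m = len(self.buf)
        if m < 2:
            raise ValueError("need at least two samples")
        sum_i = m * (m - 1) / 2
        sum_i2 = (m - 1) * m * (2 * m - 1) / 6
        slope = (m * self.sum_iy - sum_i * self.sum_y) / (m * sum_i2 - sum_i ** 2)
        return self.sum_y / m, slope


fit = SlidingLineFit(window=3)
for sample in [1.0, 3.0, 5.0, 7.0, 9.0]:
    fit.update(sample)
print(fit.estimate())  # average and slope of the last three samples
```

A segmentation criterion in this spirit could, for example, start a new segment when the estimated slope changes sign or the residual error exceeds a threshold; the thresholds would be application-specific.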

6.
Recently, the increasing use of time series data has initiated various research and development attempts in the field of data and knowledge management. Time series data are characterized by large data size, high dimensionality and continuous updating. Moreover, time series data are always considered as a whole rather than as individual numerical fields. Indeed, a large share of time series data comes from the stock market, and stock time series have characteristics of their own compared with other time series. Moreover, dimensionality reduction is an essential step before many time series analysis and mining tasks. For these reasons, research is prompted to augment existing technologies and build new representations to manage financial time series data. In this paper, financial time series are represented according to the importance of the data points. Using the concept of data point importance, a tree data structure that supports incremental updating is proposed to represent the time series, and an access method is introduced for retrieving time series data points from the tree in order of their importance. This technique is capable of presenting the time series at different levels of detail and facilitates multi-resolution dimensionality reduction of the time series data. In this paper, different data point importance evaluation methods, a new updating method and two dimensionality reduction approaches are proposed and evaluated by a series of experiments. Finally, the application of the proposed representation in a mobile environment is demonstrated.

7.
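This paper proposes a new vehicle detection algorithm for magnetic sensors. The algorithm first preprocesses the magnetic-sensor time series with a variable-length sliding-window filter, then uses PLA to extract features from the smoothed time series for vehicle detection, thereby obtaining the relevant traffic information. Simulation experiments show that the algorithm effectively reduces the influence of slow-moving large vehicles on the detection results while maintaining high accuracy.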

8.
In this paper, we present a new model to handle four major issues of fuzzy time series forecasting, viz., determination of the effective length of intervals, handling of fuzzy logical relationships (FLRs), determination of the weight for each FLR, and defuzzification of fuzzified time series values. To resolve the problem associated with determining the length of intervals, this study suggests a new time series data discretization technique. After generating the intervals, the historical time series data set is fuzzified based on fuzzy time series theory. The fuzzified time series values are then used to create the FLRs. Most of the existing fuzzy time series models simply ignore repeated FLRs without any proper justification. Since FLRs represent the patterns of historical events and reflect the possibility that these patterns will appear in the future, simply discarding the repeated FLRs may lose information. Therefore, in this model, it is recommended to consider the repeated FLRs during forecasting. It is also suggested to assign weights to the FLRs based on their severity rather than their patterns of occurrence. For this purpose, a new technique is incorporated in the model. This technique determines the weight for each FLR based on the index of the fuzzy set associated with the current state of the FLR. To handle these weighted FLRs and to obtain the forecasted results, this study proposes a new defuzzification technique. The proposed model is verified and validated with three different time series data sets. Empirical analyses indicate that the proposed model is robust and handles one-factor time series data sets more efficiently than the conventional fuzzy time series models. Experimental results show that the proposed model also outperforms the conventional statistical models.
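
The first two steps described above (fuzzifying the series and building FLRs while keeping repeats) can be sketched as follows. This is only an illustration under stated assumptions: equal-width intervals stand in for the paper's own discretization technique, each value is mapped to the single fuzzy set A_k of the interval it falls in, and the function names are hypothetical. The paper's weighting and defuzzification schemes are not reproduced.

```python
from collections import Counter


def fuzzify(values, n_intervals):
    """Map each value to the index k (1-based) of its fuzzy set A_k.

    Assumption: equal-width intervals over [min, max] of the data.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("series is constant; cannot build intervals")
    width = (hi - lo) / n_intervals
    labels = []
    for v in values:
        idx = min(int((v - lo) / width), n_intervals - 1)  # clamp the maximum
        labels.append(idx + 1)
    return labels


def fuzzy_logical_relationships(labels):
    """Count every FLR A_i -> A_j between consecutive states, repeats included."""
    return Counter(zip(labels, labels[1:]))


# Hypothetical series; the FLR A2 -> A3 occurs twice and is kept twice.
labels = fuzzify([10, 20, 30, 20, 30, 40], n_intervals=3)
print(labels)
print(fuzzy_logical_relationships(labels))
```

Keeping the counts rather than a set of distinct FLRs is what preserves the information about repeated patterns that the abstract argues should not be discarded.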

9.
Social tagging is widely practiced in the Web 2.0 era. Users can annotate useful or interesting Web resources with keywords for future reference. Social tagging also facilitates sharing of Web resources. This study reviews the chronological variation of social tagging data and tracks social trends by clustering tag time series. The data corpus in this study is collected from Hemidemi.com. A tag is represented as a time series according to the Web pages it annotates. Time series clustering is then applied to group tag time series with similar patterns and trends in the same time period. Finally, the similarities between clusters in different time periods are calculated to determine which clusters have similar themes, and the trend variation of a specific tag in different time periods is also analyzed. The evaluation shows that the recommendation accuracy of the proposed approach is about 75%. The case discussion also shows that the proposed approach can track social trends.

10.
An iterative segmentation method is presented and illustrated on specific examples. Full control of each iteration step is obtained by combining local and global properties according to a model of the image structure. A consistent convergence criterion is derived from additional image structure properties, and a test is proposed to evaluate the adequacy of the segmentation.

11.
A method that measures the distance between extended objects of nonregular shape is presented. The distance measure is an average of a set of minimal point-to-point distances between the borders of the objects. The set of points is collected with a well-defined criterion based on processing of distance values on a connected medial axis formed between the objects.
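
A simplified version of "an average of minimal point-to-point distances between borders" is sketched below. It is an assumption-laden stand-in: every border point of the first object is used, whereas the method described above selects the points with a criterion based on a medial axis between the objects, which is not reproduced here. The function name is hypothetical.

```python
import math


def mean_min_border_distance(border_a, border_b):
    """Average, over points of border_a, of the distance to the nearest point of border_b."""
    if not border_a or not border_b:
        raise ValueError("both borders must contain at least one point")
    minima = [min(math.dist(p, q) for q in border_b) for p in border_a]
    return sum(minima) / len(minima)


# Two hypothetical vertical border fragments, three units apart.
left = [(0.0, 0.0), (0.0, 1.0)]
right = [(3.0, 0.0), (3.0, 1.0)]
print(mean_min_border_distance(left, right))
```

Note that this simplified measure is not symmetric in general; swapping the two arguments can change the result.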

12.
Fuzzy c-means clustering (FCM) with spatial constraints (FCM_S) is an effective algorithm suitable for image segmentation. Its effectiveness is attributable not only to the introduction of fuzziness for the belongingness of each pixel but also to the exploitation of spatial contextual information. Although the contextual information can raise its insensitivity to noise to some extent, FCM_S still lacks sufficient robustness to noise and outliers and, because it uses the Euclidean distance (L2 norm), is not suitable for revealing non-Euclidean structure in the input data. In this paper, to overcome these problems, we first propose two variants of FCM_S, FCM_S1 and FCM_S2, that aim to simplify its computation, and then extend them, together with FCM_S, to corresponding robust kernelized versions KFCM_S, KFCM_S1 and KFCM_S2 by means of kernel methods. Our main motives for using kernel methods are: inducing a class of robust non-Euclidean distance measures for the original data space to derive new objective functions and thus cluster the non-Euclidean structures in data; enhancing the robustness of the original clustering algorithms to noise and outliers; and still retaining computational simplicity. The experiments on artificial and real-world datasets show that our proposed algorithms, especially those with spatial constraints, are more effective.
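
Why a kernel-induced distance is more robust to outliers than the squared Euclidean distance can be seen in a small sketch. It assumes a Gaussian kernel K(x, y) = exp(-||x - y||^2 / sigma^2); the abstract does not say which kernel or parametrization is used, so both are assumptions. For any kernel with K(x, x) = 1, the squared distance in feature space is ||phi(x) - phi(y)||^2 = 2(1 - K(x, y)), which is bounded above by 2.

```python
import math


def gaussian_kernel(x, y, sigma):
    """Gaussian kernel; the sigma**2 parametrization is an assumption."""
    sq_euclid = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_euclid / sigma ** 2)


def kernel_induced_sq_distance(x, y, sigma):
    """Squared feature-space distance, valid for kernels with K(x, x) = 1."""
    return 2.0 * (1.0 - gaussian_kernel(x, y, sigma))


centre = (0.0, 0.0)
for point in [(1.0, 0.0), (100.0, 0.0)]:  # a nearby pixel value and an outlier
    sq_euclid = sum((a - b) ** 2 for a, b in zip(centre, point))
    print(point, sq_euclid, kernel_induced_sq_distance(centre, point, sigma=1.0))
```

The outlier's squared Euclidean distance grows without bound, while its kernel-induced distance saturates, so it pulls much less on a cluster centre.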

13.
Modeling and forecasting seasonal and trend time series is an important research topic in many areas of industrial and economic activity. In this study, we forecast seasonal and trend time series using a quasi-linear autoregressive model. This quasi-linear autoregressive model belongs to a class of varying coefficient models in which the autoregressive coefficients are constructed by radial basis function networks. A combined genetic optimization and gradient-based optimization algorithm is applied to select proper input variables and model-dependent variables automatically and to optimize the model parameters simultaneously. The model is tested on five monthly time series. We compare the results with those of various other methods, which shows the effectiveness of the proposed approach for seasonal time series.

14.
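Building on the GMBR representation of time series, this paper is the first to propose combining distance-based and density-based time series detection methods. It gives a definition of time series pattern anomalies and uses an "anomaly feature value" to measure how anomalous a time series pattern is. Based on this definition, and building on the brute-force search algorithm, a new time series anomaly detection algorithm, GMBR-DD (Grid Minimum Bounding Rectangle-Discords Detect), is proposed. The algorithm combines distance-based and density-based anomaly detection and can efficiently discover anomalous patterns in time series. The proposed definition of anomalous time series and the anomaly detection algorithm were validated on three sets of experimental data. The experimental results show that the proposed algorithm can effectively detect anomalous changes in time series, providing a good platform and a powerful tool for decision-making.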

15.
A procedure is introduced for the analysis of seasonal trends in time series of Earth observation imagery. Called Seasonal Trend Analysis (STA), the procedure is based on an initial stage of harmonic analysis of each year in the series to extract the annual and semi-annual harmonics. Trends in the parameters of these harmonics over years are then analysed using a robust median-slope procedure. Finally, images of these trends are used to create colour composites highlighting the amplitudes and phases of seasonality trends. The technique specifically rejects high-frequency sub-annual noise and is robust to short-term interannual variability up to a period of 29% of the length of the series. It is, thus, a very effective procedure for focusing on the general nature of longer-term trends in seasonality.
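
A median-slope trend estimate can be sketched as the median of all pairwise slopes, i.e. the Theil-Sen estimator. Whether STA uses exactly this estimator is an assumption here, although its breakdown point of roughly 29% is consistent with the robustness figure quoted above. The function name and the sample values are hypothetical.

```python
from statistics import median


def median_slope(values):
    """Median of the slopes between every pair of points, with unit time steps."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    slopes = [
        (values[j] - values[i]) / (j - i)
        for i in range(n)
        for j in range(i + 1, n)
    ]
    return median(slopes)


# Hypothetical yearly amplitudes: a steady trend of +1 per year with one anomalous year.
print(median_slope([1.0, 2.0, 3.0, 50.0, 5.0, 6.0]))
```

An ordinary least-squares slope would be pulled noticeably by the anomalous year, whereas the median of pairwise slopes is unaffected in this example.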

16.
We propose a special type of time series, which we call an item-set time series, to facilitate the temporal analysis of software version histories, email logs, stock market data, etc. In an item-set time series, each observed data value is a set of discrete items. We formalize the concept of an item-set time series and present efficient algorithms for segmenting a given item-set time series. Segmentation of a time series partitions the time series into a sequence of segments, where each segment is constructed by combining consecutive time points of the time series. Each segment is associated with an item set that is computed from the item sets of the time points in that segment, using a function which we call a measure function. We then define a concept called the segment difference, which measures the difference between the item set of a segment and the item sets of the time points in that segment. The segment difference values are required to construct an optimal segmentation of the time series. We describe novel and efficient algorithms to compute segment difference values for each of the measure functions described in the paper. We outline a dynamic programming based scheme to construct an optimal segmentation of the given item-set time series. We use the item-set time series segmentation techniques to analyze the temporal content of three different data sets: Enron email, stock market data, and a synthetic data set. The experimental results show that an optimal segmentation of item-set time series data captures much more temporal content than a segmentation constructed based on the number of time points in each segment, without examining the item set data at the time points, and can be used to analyze different types of temporal data.
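
The dynamic-programming scheme can be sketched as follows, under assumptions that the abstract does not confirm: the measure function is taken to be set union, the segment difference is taken to be the total number of items in the segment's set that are missing from each time point, and the number of segments k is fixed in advance. All names are hypothetical, and segment costs are computed naively rather than with the efficient algorithms the paper describes.

```python
def segment_difference(item_sets):
    """Assumed difference: items in the segment's union missing from each time point."""
    merged = set().union(*item_sets)  # measure function assumed to be set union
    return sum(len(merged - s) for s in item_sets)


def optimal_segmentation(series, k):
    """Split series into k contiguous segments with minimal total segment difference.

    Returns (total_difference, [(start, end), ...]) with end-exclusive bounds.
    """
    n = len(series)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the series length")
    inf = float("inf")
    cost = {
        (i, j): segment_difference(series[i:j])
        for i in range(n)
        for j in range(i + 1, n + 1)
    }
    # best[s][j]: minimal cost of covering series[:j] with exactly s segments.
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    back = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0
    for s in range(1, k + 1):
        for j in range(s, n + 1):
            for i in range(s - 1, j):
                candidate = best[s - 1][i] + cost[(i, j)]
                if candidate < best[s][j]:
                    best[s][j], back[s][j] = candidate, i
    # Walk the back-pointers from the end to recover the segment bounds.
    bounds, j = [], n
    for s in range(k, 0, -1):
        i = back[s][j]
        bounds.append((i, j))
        j = i
    return best[k][n], bounds[::-1]


# Hypothetical series whose content changes after the second time point.
series = [{"a"}, {"a", "b"}, {"x"}, {"x", "y"}]
print(optimal_segmentation(series, k=2))
```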

17.
The problem of adaptive segmentation of time series with abrupt changes in the spectral characteristics is addressed. Such time series have been encountered in various fields of time series analysis such as speech processing, biomedical signal processing, image analysis and failure detection. Mathematically, these time series can often be modeled by zero-mean Gaussian-distributed autoregressive (AR) processes, where the parameters of the process, including the gain factor, remain constant for certain time intervals and then jump abruptly to new values. Identification of such processes requires adaptive segmentation: the times of parameter jumps have to be estimated thoroughly to constitute boundaries of "homogeneous" segments which can be described by stationary AR processes. In this paper, a new effective method for sequential adaptive segmentation is proposed, which is based on parallel application of two sequential parameter estimation procedures. The detection of a parameter change as well as the estimation of the accurate position of a segment boundary is effectively performed by a sequence of suitable generalized likelihood ratio (GLR) tests. Flow charts as well as a block diagram of the algorithm are presented. The adjustment of the three control parameters of the procedure (the AR model order, a threshold for the GLR test and the length of a "test window") is discussed with respect to various performance features. The results of simulation experiments are presented which demonstrate the good detection properties of the algorithm and in particular an excellent ability to allocate the segment boundaries even within a sequence of short segments. As an application to biomedical signals, the analysis of human electroencephalograms (EEG) is considered and an example is shown.

18.
We present a semi-supervised time series classification method based on co-training which uses the hidden Markov model (HMM) and one nearest neighbor (1-NN) as two learners. To model time series effectively, the time series must be symbolized, and a new granulation-based symbolic representation method is proposed in this paper. First, a granule for each segment of the time series is constructed, and then the segments are clustered by spectral clustering applied to the resulting similarity matrix. Experimental results on four time series datasets from the UCR Time Series Data Mining Archive show that the proposed symbolic representation works successfully for HMM. Compared with the supervised method, the semi-supervised method can construct accurate classifiers with very little labeled data available.

19.
We propose a method to detect the onset of linear trend in a time series and estimate the change point T from the profile of a linear trend test statistic, computed on consecutive overlapping time windows along the time series. We compare our method to two standard methods for trend change detection and evaluate them with Monte Carlo simulations for different time series lengths, autocorrelation strengths, trend slopes and distributions of residuals. The proposed method turns out to estimate T better for small and correlated time series. The methods were also applied to global temperature records, suggesting different turning points.

20.
We present an approach for the joint segmentation and classification of a time series. The segmentation is on the basis of a menu of possible statistical models: each of these must be describable in terms of a sufficient statistic, but there is no need for these sufficient statistics to be the same, and these can be as complex (for example, cepstral features or autoregressive coefficients) as fits. All that is needed is the probability density function (PDF) of each sufficient statistic under its own assumed model (presumably this comes from training data), and it is particularly appealing that there is no need at all for a joint statistical characterization of all the statistics. There is similarly no need for an a priori specification of the number of sections, as the approach uses an appropriate penalization of an over-zealous segmentation. The scheme has two stages. In the first stage, rough segmentations are implemented sequentially using a piecewise generalized likelihood ratio (GLR); in the second stage, the results from the first stage (both forward and backward) are refined. The computational burden is remarkably small, approximately linear in the length of the time series, and the method is accurate in terms of both the discovered number of segments and the segmentation accuracy. A hybrid of the approach with one based on Gibbs sampling is also presented; this combination is somewhat slower but considerably more accurate.
