Similar Articles
1.
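An Effective Dimensionality Reduction Method for Time Series   Total citations: 2, self-citations: 0, other citations: 2
This paper proposes an effective dimensionality reduction method for time series similarity queries. The method uses the fast wavelet transform to decompose a time series into sub-bands of different frequencies, then re-represents the original series with the low-frequency approximation signal obtained from the multiresolution decomposition, thereby mapping a high-dimensional time series into a low-dimensional space. The method supports the Euclidean distance measure and the L-shift Euclidean distance measure, and the algorithm has time complexity O(n).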
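The low-frequency approximation idea can be sketched as follows. This is a minimal illustration that assumes the orthonormal Haar wavelet (the abstract names only a fast wavelet transform, not a specific wavelet); the function names `haar_approximation` and `euclidean` are hypothetical.

```python
import math


def haar_approximation(series, levels):
    """Low-frequency Haar approximation after `levels` decomposition steps."""
    approx = [float(x) for x in series]
    for _ in range(levels):
        if len(approx) < 2 or len(approx) % 2 != 0:
            raise ValueError("series length must be divisible by 2**levels")
        # Orthonormal Haar low-pass step: scaled sum of each adjacent pair.
        approx = [(approx[i] + approx[i + 1]) / math.sqrt(2)
                  for i in range(0, len(approx), 2)]
    return approx


def euclidean(a, b):
    """Plain Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


a = [1, 2, 3, 4, 5, 6, 7, 8]
b = [2, 2, 4, 4, 5, 7, 7, 9]
reduced_a = haar_approximation(a, 2)  # 8 points -> 2 coefficients
reduced_b = haar_approximation(b, 2)
# With an orthonormal transform, the reduced distance never exceeds the
# original distance, so filtering on it cannot cause false dismissals.
print(euclidean(reduced_a, reduced_b) <= euclidean(a, b))
```

Each level halves the data, so the total work is n + n/2 + n/4 + ..., which is consistent with the O(n) complexity stated in the abstract.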

2.
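A Method for Mining Similar Time Series   Total citations: 5, self-citations: 0, other citations: 5
A Markov state transition matrix describes the dynamic behavior of a stochastic process, and a time series can be regarded as the outward manifestation of that behavior. Combining the two effectively yields a new and effective method for mining similar time series.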
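One way to read this idea is sketched below. The abstract does not say how states are defined or how matrices are compared, so the equal-width binning in `to_states` and the Frobenius-style `matrix_distance` are assumptions made for illustration only.

```python
import math


def to_states(series, n_states):
    """Map each value to an equal-width bin index (an assumed discretization)."""
    lo, hi = min(series), max(series)
    width = (hi - lo) / n_states or 1.0  # avoid division by zero on flat series
    return [min(int((x - lo) / width), n_states - 1) for x in series]


def transition_matrix(states, n_states):
    """Estimate row-normalized first-order Markov transition probabilities."""
    counts = [[0] * n_states for _ in range(n_states)]
    for current, following in zip(states, states[1:]):
        counts[current][following] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


def matrix_distance(p, q):
    """Frobenius distance between two transition matrices."""
    return math.sqrt(sum((a - b) ** 2
                         for row_p, row_q in zip(p, q)
                         for a, b in zip(row_p, row_q)))


x = [0, 10, 0, 10, 0]
y = [5, 50, 5, 50, 5]  # same alternating dynamics at a different scale
px = transition_matrix(to_states(x, 2), 2)
py = transition_matrix(to_states(y, 2), 2)
print(matrix_distance(px, py))
```

Two series with the same dynamics but different scales produce the same matrix under this discretization, which is the kind of similarity a transition-matrix view can capture.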

3.
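1 Introduction. Data mining technology has attracted strong interest from artificial intelligence and database researchers worldwide. Its core is the discovery of knowledge from massive data sets to provide a scientific basis for management and decision-making, and the discovery of association rules has long been a popular topic in data mining. Since Agrawal first proposed the Apriori algorithm, many effective algorithms and models have been developed and improved. However, one practical problem that must be faced is that, given massive data and the complex relationships within and among the data, ...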

4.
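Similar Pattern Matching for Time Series Based on the Wavelet Transform   Total citations: 21, self-citations: 1, other citations: 21
This paper proposes a new similar pattern matching method for time series. It uses wavelet analysis to reduce the dimensionality of time series data, represents each original series by its wavelet series, and stores the wavelet series in a multidimensional R-tree index. On top of this index, range query and nearest-neighbor query algorithms are defined using a distance function that expresses similarity. Experimental results show that the method outperforms the traditional similar pattern matching method based on the Fourier transform.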

5.
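This paper proposes a new time series similarity search algorithm based on the Fourier transform. The algorithm uses an efficient indexing method to achieve fast matching and solves the subsequence matching problem over multiple sequences. Numerous test cases verify the generality and effectiveness of the algorithm, which can be applied to a variety of practical problems involving time series.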
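The general principle behind Fourier-based similarity search can be sketched as follows. This is not the paper's algorithm (the abstract gives no details of its index or subsequence handling); it only illustrates the well-known idea of keeping the first few DFT coefficients as an index key. The names `dft_features` and `feature_distance` and the choice `k = 3` are assumptions.

```python
import cmath
import math


def dft_features(series, k):
    """First k DFT coefficients, scaled by 1/sqrt(n) so Parseval's identity holds."""
    n = len(series)
    return [sum(x * cmath.exp(-2j * cmath.pi * f * t / n)
                for t, x in enumerate(series)) / math.sqrt(n)
            for f in range(k)]


def feature_distance(fa, fb):
    """Euclidean distance between two complex feature vectors."""
    return math.sqrt(sum(abs(a - b) ** 2 for a, b in zip(fa, fb)))


def euclidean(a, b):
    """Euclidean distance in the original time domain."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


a = [1, 2, 3, 4, 5, 6, 7, 8]
b = [2, 2, 4, 4, 5, 7, 7, 9]
k = 3  # number of coefficients kept (an arbitrary choice for this sketch)
lower_bound = feature_distance(dft_features(a, k), dft_features(b, k))
# The feature-space distance never exceeds the true distance, so a range
# query on the features returns a superset of the true answers.
print(lower_bound <= euclidean(a, b))
```

Because the bound never overestimates, candidates retrieved from a low-dimensional index only need a final check against the full series to discard false alarms.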

6.
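A Survey of Time Series Representation and Classification Algorithms   Total citations: 1, self-citations: 0, other citations: 1
A time series is a set of random variables ordered in time; it is usually the result of observing some underlying process at equally spaced time intervals and a given sampling rate. Time series data are widespread in business, agriculture, meteorology, the biological sciences, ecology, and many other fields, and discovering useful knowledge from time series has become one of the research hotspots in data mining. For time series representation, the survey mainly covers non-data-adaptive representations, data-adaptive representations, and model-based representations. For time series classification, it focuses on classification algorithms based on time-domain similarity, shape similarity, and change similarity, and it closes with an outlook on future research directions.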

7.
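马超红, 翁小清. 《计算机科学》 (Computer Science), 2018, 45(2): 291-296, 317
In time series data mining, early classification of time series is attracting increasing attention. Because time series are long (their length is also called their dimensionality), choosing a suitable dimensionality reduction method is very important in practical applications of early classification. This paper therefore proposes an early classification method for time series based on piecewise aggregate approximation (PAA). PAA is first applied to reduce the dimensionality of the time series samples, and early classification is then performed on the samples in the low-dimensional space. Experimental results on 43 time series data sets show that the proposed method outperforms existing methods in accuracy, earliness, and reliability.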
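The PAA step can be sketched as below: the series is cut into contiguous segments and each segment is replaced by its mean. The function name `paa` and the way uneven lengths are split are assumptions of this sketch; the early classification stage of the paper is not reproduced.

```python
def paa(series, segments):
    """Piecewise aggregate approximation: mean of each contiguous segment."""
    n = len(series)
    if not 0 < segments <= n:
        raise ValueError("segments must be between 1 and len(series)")
    reduced = []
    for i in range(segments):
        # Integer boundaries; segments differ by at most one point in length.
        start = i * n // segments
        end = (i + 1) * n // segments
        chunk = series[start:end]
        reduced.append(sum(chunk) / len(chunk))
    return reduced


print(paa([1, 2, 3, 4, 5, 6], 3))
```

A classifier can then operate on the `segments`-dimensional vectors instead of the full-length series.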

8.
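The theory and methods of time series analysis are among the research tools for difficult, comprehensive research topics. In recent years many scholars have achieved fruitful results in time series research; some have innovated on existing time series analysis methods and developed new forecasting methods. This paper surveys time series analysis in terms of its basic theory and methods, and also describes research trends and directions of development.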

9.
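Extracting Crop Seasonal Phenology Information from MODIS Time Series Data   Total citations: 3, self-citations: 0, other citations: 3
Crop seasonal phenology consistency is analyzed for the single-cropping rice region of the northern Zhejiang plain using MODIS NDVI time series data. A discrete Fourier transform is applied to the NDVI time series to remove noise, a current land-use map is then used to extract NDVI images of the cultivated land, and crop phenology is studied from the maxima of the time series curves. The results show that the growing period of rice is reflected in the NDVI time series curve better, and with better phenological consistency, than that of wheat and rapeseed, and that 8-day composite data reflect crop phenology in more detail than 16-day composite data. The study confirms the value of MODIS NDVI time series curves for regional crop phenology analysis.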

10.
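This paper studies methods for processing and analyzing fluctuating data. Fluctuating data occur frequently in applications across many industries, so research on how to process such data has significant theoretical and practical value. Some fluctuating data show strong periodic regularity, for example electronic signals of the composite periodic function type; other fluctuating data have no obvious periodicity, for example a data pattern in which values that fluctuate at irregular times along a linear axis are superimposed with white noise.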

11.
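This work focuses on the ability of evolutionary recurrent neural networks to model time series data and correlated data. For two benchmark problems, and using different forms of modeling data, the modeling and prediction performance of feedforward networks and recurrent neural networks is compared; an evolutionary algorithm is then used to train recurrent neural networks of different structures, and their modeling abilities are compared. Simulation results show that recurrent neural networks model and predict temporally correlated data well and that, compared with feedforward networks, they need no prior knowledge of the temporal characteristics of the process and can use the simplest form of modeling data. Compared with conventional gradient descent, the evolutionary algorithm applies well across different recurrent network structures, its training is not troubled by local minima, and a training run of moderate scale can yield a well-performing neural network model.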

12.
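郝石磊, 王志海, 刘海洋. 《软件学报》 (Journal of Software), 2022, 33(5): 1817-1832
Time series classification is an important task in time series data mining and has received increasing attention in recent years. An important component of the problem is the similarity measure between time series. Among the many similarity measures, dynamic time warping is a very effective algorithm that has been widely applied in fields such as video, audio, handwriting recognition, and biological information processing. In essence, dynamic time warping, under boundary and temporal consistency constraints, is a...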
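For reference, the textbook dynamic time warping recurrence can be sketched as follows. This is the classic algorithm, not the method of the paper above (whose abstract is truncated); the optional `window` argument is a Sakoe-Chiba band added here as an assumption.

```python
def dtw_distance(a, b, window=None):
    """Classic DTW cost with absolute difference as the local distance."""
    n, m = len(a), len(b)
    # The band must be at least |n - m| wide for any path to reach the corner.
    w = max(n, m) if window is None else max(window, abs(n - m))
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0  # boundary constraint: paths start at the first pair
    for i in range(1, n + 1):
        for j in range(max(1, i - w), min(m, i + w) + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Monotonic, continuous steps only: up, left, or diagonal.
            cost[i][j] = local + min(cost[i - 1][j],
                                     cost[i][j - 1],
                                     cost[i - 1][j - 1])
    return cost[n][m]  # boundary constraint: paths end at the last pair


print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

With `window=0` on equal-length series the path is forced onto the diagonal and the result reduces to the sum of pointwise absolute differences.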

13.
Characteristic-Based Clustering for Time Series Data   Total citations: 1, self-citations: 0, other citations: 1
With the growing importance of time series clustering research, particularly for similarity searches amongst long time series such as those arising in medicine or finance, it is critical for us to find a way to resolve the outstanding problems that make most clustering methods impractical under certain circumstances. When the time series is very long, some clustering algorithms may fail because the very notion of similarity is dubious in high-dimensional space; many methods cannot handle missing data when the clustering is based on a distance metric. This paper proposes a method for clustering of time series based on their structural characteristics. Unlike other alternatives, this method does not cluster point values using a distance metric; rather, it clusters based on global features extracted from the time series. The feature measures are obtained from each individual series and can be fed into arbitrary clustering algorithms, including an unsupervised neural network algorithm, self-organizing map, or hierarchical clustering algorithm. Global measures describing the time series are obtained by applying statistical operations that best capture the underlying characteristics: trend, seasonality, periodicity, serial correlation, skewness, kurtosis, chaos, nonlinearity, and self-similarity. Since the method clusters using extracted global measures, it reduces the dimensionality of the time series and is much less sensitive to missing or noisy data. We further provide a search mechanism to find the best selection from the feature set that should be used as the clustering inputs. The proposed technique has been tested using benchmark time series datasets previously reported for time series clustering and a set of time series datasets with known characteristics. The empirical results show that our approach is able to yield meaningful clusters. The resulting clusters are similar to those produced by other methods, but with some promising and interesting variations that can be intuitively explained with knowledge of the global characteristics of the time series.

14.
As the basis of data management and analysis, data quality issues have increasingly become a research hotspot in related fields, which contributes to optimization of big data and artificial intelligence technology. Generally, physical failures or technical defects in data collectors and recorders cause anomalies in collected data. These anomalies will strongly impact subsequent data analysis and artificial intelligence processes; thus, data should be processed and cleaned accordingly before application. Existing repairing methods based on smoothing cause a large number of originally correct data points to be over-repaired into wrong values. Constraint-based methods such as sequential dependency and SCREEN cannot accurately repair data under complex conditions since the constraints are relatively simple. A time series data repairing method under multi-speed constraints is further proposed based on the principle of minimum repairing. Then, dynamic programming is used to solve the problem of data anomalies with optimal repairing. Specifically, multiple speed intervals are set to constrain time series data, and a series of candidate repairing points are formed for each data point according to the speed constraints. Next, the optimal repair solution is selected from these candidates based on the dynamic programming method. To study the feasibility of this method, an artificial dataset, two real datasets, and another real dataset with real anomalies are employed for experiments under different anomaly rates and data sizes. Experimental results demonstrate that, compared with the existing methods based on smoothing or constraints, the proposed method has better performance in terms of RMS errors and time cost. In addition, the investigation of clustering and classification accuracy with several datasets reveals the impact of data quality on subsequent data analysis and artificial intelligence. The proposed method can improve the quality of data analysis and artificial intelligence results.
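The notion of a multi-speed constraint can be illustrated with a small detection sketch. It only checks whether the rate of change between consecutive points falls inside at least one allowed speed interval; the candidate generation and dynamic-programming repair described in the abstract are not reproduced. The function name `speed_violations` and the example intervals are hypothetical.

```python
def speed_violations(times, values, intervals):
    """Indices whose speed from the previous point fits no allowed interval.

    `times` must be strictly increasing; `intervals` is a list of
    (min_speed, max_speed) pairs, any one of which may be satisfied.
    """
    flagged = []
    for i in range(1, len(values)):
        speed = (values[i] - values[i - 1]) / (times[i] - times[i - 1])
        if not any(low <= speed <= high for low, high in intervals):
            flagged.append(i)
    return flagged


times = [0, 1, 2, 3]
values = [0, 1, 10, 3]
print(speed_violations(times, values, [(-2, 2)]))
print(speed_violations(times, values, [(-2, 2), (8, 10)]))
```

Note that this naive check also flags the point following a spike, because the return to normal looks like a violation too; deciding which point to change, and by how little, is exactly what a minimum-change repair has to resolve.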

15.
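A Time Series Data Mining Model Based on Wavelet Analysis   Total citations: 2, self-citations: 0, other citations: 2
This paper proposes TSMiner, a time series mining model based on wavelet analysis that supports the entire time series data mining process. The model has five components: visualization of the raw data, data preprocessing, data reduction, pattern discovery, and visualization of the resulting patterns. The model uses wavelets for multi-level visual representation of data, data reduction, and multi-scale pattern discovery. It helps users inspect high-dimensional data, understand intermediate results, and interpret the discovered patterns.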

16.
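高菲, 宋韶旭, 王建民. 《软件学报》 (Journal of Software), 2021, 32(3): 689-711
To further optimize and promote big data and artificial intelligence technology, data quality, as the basis of data management and analysis, has increasingly become a research hotspot in related fields. In general, physical failures or technical defects of data collection and recording devices introduce errors into the collected data, and anomalous errors have a non-negligible impact on subsequent data analysis and artificial intelligence processes; data must therefore be cleaned and repaired accordingly before use. Existing smoothing-based repair...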

17.
The UCR time series archive, introduced in 2002, has become an important resource in the time series data mining community, with at least one thousand published papers making use of at least one data set from the archive. The original incarnation of the archive had sixteen data sets, but since that time it has gone through periodic expansions. The last expansion took place in the summer of 2015, when the archive grew from 45 to 85 data sets. This paper introduces and focuses on the new expansion from 85 to 128 data sets. Beyond expanding this valuable resource, this paper offers pragmatic advice to anyone who may wish to evaluate a new algorithm on the archive. Finally, this paper makes a novel and yet actionable claim: of the hundreds of papers that show an improvement over the standard baseline (1-nearest neighbor classification), a fraction might be mis-attributing the reasons for their improvement. Moreover, the improvements claimed by these papers might have been achievable with a much simpler modification, requiring just a few lines of code.
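The baseline mentioned above, 1-nearest neighbor classification, can be sketched in a few lines. The abstract does not state which distance the baseline uses, so Euclidean distance is an assumption here, and the tiny labeled series are made up purely for illustration; they are not taken from the archive.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def one_nn_predict(train, query):
    """Return the label of the training series closest to `query`."""
    _, label = min(train, key=lambda item: euclidean(item[0], query))
    return label


def accuracy(train, test):
    """Fraction of test series whose predicted label matches the true one."""
    hits = sum(one_nn_predict(train, series) == label for series, label in test)
    return hits / len(test)


# Hypothetical toy data, not from the UCR archive.
train = [([0, 0, 1, 1], "step"), ([1, 1, 1, 1], "flat"), ([0, 1, 0, 1], "zigzag")]
test = [([0, 0, 1, 0.9], "step"), ([0.9, 1, 1, 1.1], "flat")]
print(accuracy(train, test))
```

Swapping `euclidean` for another distance function is the only change needed to evaluate a different similarity measure against the same baseline.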

18.
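Similarity matching methods for deterministic time series do not take data uncertainty into account, yet data collected by sensors in the real world are often uncertain, so existing time series similarity matching methods are unsuitable for these domains. To address this problem, uncertain time series are preprocessed by separating them into a horizontal time dimension and a vertical probability dimension. First, the given uncertain time series is compressed with the Haar wavelet transform. On this basis, the probability dimension of the resulting uncertain time series is processed vertically, and a representative selection method is proposed that picks one deterministic time series using, for example, the maximum-probability method or the mean method. After these two preprocessing steps, the resulting deterministic time series are reduced in dimensionality and indexed, and similarity matching algorithms are proposed for each combination of uncertainty in the query series and in the database time series.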

19.
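This paper first briefly describes the development background of multidimensional time series in recent years, then analyzes their characteristics and modeling methods in detail and gives a brief introduction to the algorithms. Finally, an AR model of multidimensional time series is built, and multidimensional time series are applied to data processing in a concrete online inspection task, with good results.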

20.
Clustering time series is a problem that has applications in a wide variety of fields, and has recently attracted a large amount of research. Time series data are often large and may contain outliers. We show that the simple procedure of clipping the time series (discretising to above or below the median) reduces memory requirements and significantly speeds up clustering without decreasing clustering accuracy. We also demonstrate that clipping increases clustering accuracy when there are outliers in the data, thus serving as a means of outlier detection and a method of identifying model misspecification. We consider simulated data from polynomial, autoregressive moving average and hidden Markov models and show that the estimated parameters of the clipped data used in clustering tend, asymptotically, to those of the unclipped data. We also demonstrate experimentally that, if the series are long enough, the accuracy on clipped data is not significantly less than the accuracy on unclipped data, and if the series contain outliers then clipping results in significantly better clusterings. We then illustrate how using clipped series can be of practical benefit in detecting model misspecification and outliers on two real world data sets: an electricity generation bid data set and an ECG data set.
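Clipping as described above (one bit per point, above or below the median) can be sketched as follows. The helper names `clip` and `hamming` are hypothetical, and treating values equal to the median as "below" is an assumption of this sketch.

```python
import statistics


def clip(series):
    """Discretise each point to 1 if above the series median, else 0."""
    median = statistics.median(series)
    return [1 if x > median else 0 for x in series]


def hamming(a, b):
    """Number of positions at which two clipped series differ."""
    return sum(x != y for x, y in zip(a, b))


clean = [1, 5, 2, 8, 3]
with_outlier = [1, 5, 2, 100, 3]  # same shape, one extreme value
print(clip(clean))
print(clip(with_outlier))
print(hamming(clip(clean), clip(with_outlier)))
```

The extreme value changes neither the median nor the clipped representation in this example, which gives some intuition for the robustness to outliers reported in the abstract.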
