Found 19 similar documents (search time: 76 ms)
1.
2.
3.
4.
5.
6.
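A Slope-Representation-Based Similarity Measure for Time Series (total citations: 5; self-citations: 0; citations by others: 5)
Time series similarity search is an active research topic in data mining, and the choice of similarity distance measure is one of its central problems. For high-dimensional multivariate time series that contain substantial noise and missing data, this paper proposes a similarity measure based on slope representation. Building on piecewise linear segmentation, the method weights the slope differences between two series, which gives it a clearer physical interpretation. The paper also proves that the slope distance fully satisfies the basic criteria for a similarity measure. Examples demonstrate the effectiveness of the algorithm.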
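The idea of weighting slope differences over a piecewise linear segmentation can be illustrated with a minimal sketch. The abstract does not give the paper's formula, so everything here is an assumption: fixed-length segments shared by both series, chord slopes, and weights proportional to segment duration. The names `segment_slopes` and `slope_distance` are hypothetical.

```python
def segment_slopes(series, seg_len):
    """Chord slope and duration weight of each fixed-length segment.

    Assumes equally spaced samples; the segmentation rule is an
    illustrative choice, not the one used in the paper.
    """
    slopes, durations = [], []
    last = len(series) - 1
    for start in range(0, last, seg_len):
        end = min(start + seg_len, last)
        slopes.append((series[end] - series[start]) / (end - start))
        durations.append(end - start)
    total = sum(durations)
    return slopes, [d / total for d in durations]


def slope_distance(a, b, seg_len=4):
    """Duration-weighted sum of absolute slope differences."""
    if len(a) != len(b):
        raise ValueError("this sketch assumes equal-length series")
    slopes_a, weights = segment_slopes(a, seg_len)
    slopes_b, _ = segment_slopes(b, seg_len)
    return sum(w * abs(x - y) for w, x, y in zip(weights, slopes_a, slopes_b))
```

Because only slopes are compared, two series that differ by a constant vertical offset get distance zero under this sketch, for example `slope_distance([0, 1, 2, 3, 4], [5, 6, 7, 8, 9])`.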
7.
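Similarity measurement for time series is an important problem in time series data mining and underlies sequence querying, classification, and prediction. Finding a good measure is crucial to improving the efficiency and accuracy of mining tasks. Apart from a few theoretical discussions, current research almost always follows a fixed pattern: propose specific requirements and supply experimental data. Most experimental methods, however, are either limited in scope or differ in focus. To provide a more comprehensive experimental validation, extensive time series cross-validation experiments were carried out with the 1NN classification algorithm, re-evaluating the elastic measures and comparing them on 28 time series data sets from different application domains. The results show that the method achieves higher accuracy.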
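A 1NN evaluation of this kind can be sketched as leave-one-out cross-validation with a pluggable distance. The sketch below is an assumption-laden illustration, not the paper's protocol: it uses unconstrained dynamic time warping (DTW) as the example elastic measure, and the helper names `dtw`, `euclidean`, and `loo_1nn_accuracy` are hypothetical.

```python
import math


def dtw(a, b):
    """Unconstrained DTW distance with squared pointwise cost."""
    n, m = len(a), len(b)
    prev = [math.inf] * (m + 1)
    prev[0] = 0.0
    for i in range(1, n + 1):
        cur = [math.inf] * (m + 1)
        for j in range(1, m + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            cur[j] = cost + min(prev[j], cur[j - 1], prev[j - 1])
        prev = cur
    return math.sqrt(prev[m])


def euclidean(a, b):
    """Lock-step baseline; requires equal-length series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def loo_1nn_accuracy(data, labels, dist):
    """Leave-one-out accuracy of a 1-nearest-neighbor classifier."""
    correct = 0
    for i, series in enumerate(data):
        nearest = min(
            (j for j in range(len(data)) if j != i),
            key=lambda j: dist(series, data[j]),
        )
        correct += labels[nearest] == labels[i]
    return correct / len(data)
```

Swapping `dist` between `euclidean` and `dtw` on the same data gives the kind of side-by-side comparison the abstract describes, although a real study would use held-out folds and many data sets.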
8.
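Similarity measurement is one of the foundations of time series analysis and the key to similarity matching. To address the shortcomings of Euclidean distance in describing piecewise trends, and the discretization of distance values between corresponding segments in various pattern distances, a time series similarity measure based on shape similarity distance is proposed. Recognition and clustering experiments on standard data sets demonstrate the feasibility and effectiveness of the method.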
9.
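An Event-Based Similarity Measure for Time Series (total citations: 2; self-citations: 0; citations by others: 2)
To better reflect user requirements in time series similarity measurement and to improve its accuracy, an event-based similarity measure for time series (SMBE) is proposed. First, user requirements are defined as events, and the original time series is converted into an event sequence. Next, a similarity measurement model over event sequences (SMBE) is constructed: SMBE defines the similarity between elements of different event sequences, assembles these values into a similarity matrix, and searches the matrix for the optimal path, whose value is taken as the similarity between the sequences. Finally, a clustering method based on SMBE is proposed. Experiments show that, with reasonable parameter settings, a clustering accuracy close to 0.90 can be achieved.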
10.
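Similarity search is an important aspect of knowledge discovery in time series. This paper proposes a new time series similarity search algorithm based on a distance measure. The algorithm uses a piecewise linear representation and an improved pattern distance to measure the distance between series.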
11.
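Addressing whole-series clustering of time series, a new similarity measure based on global features is proposed: 11 global features are extracted from three aspects of a time series (statistical distribution characteristics, nonlinearity, and Fourier spectral transformation) to build a feature vector. Describing the original time series by this feature vector retains most of the original information and also speeds up the clustering computation. Extensive experiments show that the similarity measure based on global feature extraction yields reasonable clustering results, with especially clear benefits for time series from the economic domain. Experiments on two data sets are presented, and the clustering results are evaluated from both subjective and objective perspectives.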
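Describing a series by a short vector of global features can be sketched as follows. The abstract does not list the 11 features, so the five computed here (mean, standard deviation, skewness, lag-1 autocorrelation, and dominant DFT frequency index) are illustrative assumptions, and `global_features` is a hypothetical name.

```python
import cmath
import math


def global_features(x):
    """Map a series (at least 2 points) to a small global feature vector.

    The feature choice is illustrative only; it is not the paper's set.
    """
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    var = sum(c * c for c in centered) / n
    std = math.sqrt(var)
    skew = (sum(c ** 3 for c in centered) / n) / std ** 3 if std > 0 else 0.0
    acf1 = (
        sum(centered[i] * centered[i + 1] for i in range(n - 1)) / (n * var)
        if var > 0
        else 0.0
    )
    # Magnitude spectrum of the centered series for frequencies 1..n//2.
    mags = [
        abs(sum(centered[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(1, n // 2 + 1)
    ]
    dominant = 1 + max(range(len(mags)), key=mags.__getitem__)
    return [mean, std, skew, acf1, float(dominant)]
```

Series of different lengths map to vectors of the same length, so any standard clustering method can then operate on Euclidean distances between feature vectors, which is where the speed-up mentioned above would come from.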
12.
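To reduce the influence of noisy data on finding the optimal sequence, and to avoid both the sensitivity of Euclidean distance to shape and its requirement that sequences have equal length, a time series similarity search algorithm for noisy data is proposed. The SPC method is used to remove noisy data from the sequences; DTW distance is used as the measure, with normalization applied to bring the sequences to the same resolution; and the LB_Keogh lower-bounding function is used to filter the set of candidate sequences. Simulation results show that, when the threshold is small, the algorithm matches sequences containing noisy data well.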
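The combination of normalization, DTW, and LB_Keogh filtering can be sketched as below. Several things are assumptions made for illustration: the SPC denoising step is omitted, sequences are taken to be of equal length, DTW is constrained to a Sakoe-Chiba band of half-width `r`, and the search is nearest-neighbor rather than threshold-based. The function names are hypothetical.

```python
import math


def znorm(x):
    """Z-normalize so sequences are compared on a common scale."""
    mean = sum(x) / len(x)
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / len(x))
    return [(v - mean) / std if std > 0 else 0.0 for v in x]


def lb_keogh(q, c, r):
    """LB_Keogh: distance from q to the band-r envelope built around c."""
    total = 0.0
    for i, qi in enumerate(q):
        window = c[max(0, i - r): i + r + 1]
        upper, lower = max(window), min(window)
        if qi > upper:
            total += (qi - upper) ** 2
        elif qi < lower:
            total += (lower - qi) ** 2
    return math.sqrt(total)


def dtw_band(a, b, r):
    """DTW restricted to |i - j| <= r, for equal-length sequences."""
    n = len(a)
    if len(b) != n:
        raise ValueError("this sketch assumes equal-length sequences")
    prev = [math.inf] * (n + 1)
    prev[0] = 0.0
    for i in range(1, n + 1):
        cur = [math.inf] * (n + 1)
        for j in range(max(1, i - r), min(n, i + r) + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            cur[j] = cost + min(prev[j], cur[j - 1], prev[j - 1])
        prev = cur
    return math.sqrt(prev[n])


def search(query, candidates, r=1):
    """Return (best index, best distance, number of full DTW computations)."""
    q = znorm(query)
    best_idx, best_dist, full_dtw = -1, math.inf, 0
    for idx, cand in enumerate(candidates):
        c = znorm(cand)
        if lb_keogh(q, c, r) >= best_dist:
            continue  # the lower bound already rules this candidate out
        full_dtw += 1
        d = dtw_band(q, c, r)
        if d < best_dist:
            best_idx, best_dist = idx, d
    return best_idx, best_dist, full_dtw
```

Because LB_Keogh never exceeds the band-constrained DTW distance for the same `r`, skipping a candidate whose bound is already at least the best distance so far cannot discard the true nearest neighbor.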
13.
For more than a decade, researchers have actively explored the area of image/video analysis and retrieval. Yet one fundamental
problem remains largely unsolved: how to measure perceptual similarity between two objects. For this purpose, most researchers
employ a Minkowski-type metric. Unfortunately, the Minkowski metric does not reliably find similarities in objects that are
obviously alike. Through mining a large set of visual data, our team has discovered a perceptual distance function. We call
the discovered function the dynamic partial function (DPF). When we empirically compare DPF to Minkowski-type distance functions in image retrieval and in video shot-transition
detection using our image features, DPF performs significantly better. The effectiveness of DPF can be explained by similarity theories in cognitive psychology.
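A minimal sketch of the contrast between a Minkowski-type metric and a partial distance follows. It assumes the commonly cited form of DPF, in which only the `m` smallest per-feature differences enter the sum; the abstract itself does not state the formula, and the name `dpf` and its parameters are illustrative.

```python
def dpf(x, y, m, r=2.0):
    """Partial distance over the m smallest per-feature differences.

    With m == len(x) this reduces to the Minkowski distance of order r.
    """
    deltas = sorted(abs(a - b) for a, b in zip(x, y))
    return sum(d ** r for d in deltas[:m]) ** (1.0 / r)
```

For `x = [1, 2, 3, 100]` and `y = [1, 2, 4, 0]`, a single outlying feature dominates the full Minkowski distance (`m=4`), while `m=3` ignores it and the two objects come out close.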
14.
Querying time series data based on similarity (total citations: 3; self-citations: 0; citations by others: 3)
We study similarity queries for time series data where similarity is defined, in a fairly general way, in terms of a distance function and a set of affine transformations on the Fourier series representation of a sequence. We identify a safe set of transformations supporting a wide variety of comparisons and show that this set is rich enough to formulate operations such as moving average and time scaling. We also show that queries expressed using safe transformations can efficiently be computed without prior knowledge of the transformations. We present a query processing algorithm that uses the underlying multidimensional index built over the data set to efficiently answer similarity queries. Our experiments show that the performance of this algorithm is competitive with that of processing ordinary (exact match) queries using the index, and much faster than sequential scanning. We propose a generalization of this algorithm for simultaneously handling multiple transformations at a time, and give experimental results on the performance of the generalized algorithm.
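One way to see why a moving average can be expressed as a transformation on Fourier coefficients is the convolution theorem: a circular moving average in the time domain equals an elementwise multiplication in the frequency domain. The sketch below assumes an unnormalized DFT and a circular (wrap-around) window; the paper's exact conventions are not given in the abstract, and the function names are hypothetical.

```python
import cmath
import math


def dft(x):
    """Unnormalized discrete Fourier transform (O(n^2), for illustration)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def circular_moving_average(x, w):
    """Mean of the current and previous w - 1 samples, wrapping around."""
    n = len(x)
    return [sum(x[(t - i) % n] for i in range(w)) / w for t in range(n)]


def moving_average_multiplier(n, w):
    """Per-coefficient multiplier that applies the w-point moving average."""
    kernel = [1.0 / w] * w + [0.0] * (n - w)
    return dft(kernel)
```

With `a = moving_average_multiplier(len(x), w)`, the coefficients `dft(circular_moving_average(x, w))` match `a[k] * dft(x)[k]` up to rounding, so the smoothing can be applied to stored coefficients without going back to the raw series.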
15.
16.
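To better capture the shape characteristics of time series and to find a similarity measure better suited to longer series, a sequence similarity measure based on hierarchical dynamic time warping is proposed as an improvement on the dynamic time warping algorithm. The time series is segmented at multiple levels, and corresponding hierarchical segment subsequences are sampled uniformly from the segments. Each hierarchical segment subsequence is then abstracted as a point in three-dimensional space (reflecting the mean, length, and trend of the subsequence) for similarity measurement, and the similarity measures of all levels are finally combined into the result. Experiments show that, with reasonable parameter settings, the method achieves high accuracy and efficiency in sequence similarity measurement.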
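The abstraction of segments into three-dimensional points and the combination across levels can be sketched roughly as follows. The details are assumptions: equal-width segmentation at each level, chord slope as the trend, Euclidean distance between points, and a plain average over levels. None of these choices is confirmed by the abstract, and all names are hypothetical.

```python
import math


def segment_points(series, k):
    """Split into k near-equal segments; map each to (mean, length, trend).

    Requires len(series) >= k so that no segment is empty.
    """
    n = len(series)
    bounds = [round(i * n / k) for i in range(k + 1)]
    points = []
    for start, end in zip(bounds, bounds[1:]):
        seg = series[start:end]
        mean = sum(seg) / len(seg)
        trend = (seg[-1] - seg[0]) / (len(seg) - 1) if len(seg) > 1 else 0.0
        points.append((mean, float(len(seg)), trend))
    return points


def dtw_points(p, q):
    """DTW over sequences of 3-D points with Euclidean point cost."""
    n, m = len(p), len(q)
    prev = [math.inf] * (m + 1)
    prev[0] = 0.0
    for i in range(1, n + 1):
        cur = [math.inf] * (m + 1)
        for j in range(1, m + 1):
            cost = math.dist(p[i - 1], q[j - 1])
            cur[j] = cost + min(prev[j], cur[j - 1], prev[j - 1])
        prev = cur
    return prev[m]


def hierarchical_dtw(a, b, levels=(2, 4)):
    """Average the point-level DTW distance over several segmentations."""
    return sum(
        dtw_points(segment_points(a, k), segment_points(b, k)) for k in levels
    ) / len(levels)
```

Warping over a handful of segment points instead of every sample is what makes this kind of approach attractive for long series: the quadratic DTW cost applies to the number of segments, not the series length.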
17.
18.
19.
Efficient query filtering for streaming time series with applications to semisupervised learning of time series classifiers (total citations: 1; self-citations: 1; citations by others: 1)
Li Wei, Eamonn Keogh, Helga Van Herle, Agenor Mafra-Neto, Russell J. Abbott. Knowledge and Information Systems, 2007, 11(3): 313-344
In this paper, we define time series query filtering, the problem of monitoring the streaming time series for a set of predefined patterns. This problem is of great practical
importance given the massive volume of streaming time series available through sensors, medical patient records, financial
indices and space telemetry. Since the data may arrive at a high rate and the number of predefined patterns can be relatively
large, it may be impossible for the comparison algorithm to keep up. We propose a novel technique that exploits the commonality
among the predefined patterns to allow monitoring at higher bandwidths, while maintaining a guarantee of no false dismissals.
Our approach is based on the widely used envelope-based lower-bounding technique. As we will demonstrate on extensive experiments
in diverse domains, our approach achieves tremendous improvements in performance in the offline case, and significant improvements
in the fastest possible arrival rate of the data stream that can be processed with guaranteed no false dismissals. As a further
demonstration of the utility of our approach, we demonstrate that it can make semisupervised learning of time series classifiers
tractable.
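The way an envelope can exploit commonality among patterns while guaranteeing no false dismissals can be sketched as follows. The sketch assumes Euclidean distance, equal-length patterns, and a single envelope over the whole pattern set; it is an illustration of envelope-based lower bounding in general, not the authors' algorithm, and all names are hypothetical.

```python
import math


def build_envelope(patterns):
    """Pointwise upper and lower envelope over equal-length patterns."""
    upper = [max(col) for col in zip(*patterns)]
    lower = [min(col) for col in zip(*patterns)]
    return upper, lower


def lb_envelope(window, upper, lower):
    """Lower bound on the Euclidean distance from window to ANY pattern."""
    total = 0.0
    for v, u, l in zip(window, upper, lower):
        if v > u:
            total += (v - u) ** 2
        elif v < l:
            total += (l - v) ** 2
    return math.sqrt(total)


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def matching_patterns(window, patterns, envelope, threshold):
    """Indices of patterns within threshold of the incoming window."""
    upper, lower = envelope
    if lb_envelope(window, upper, lower) > threshold:
        return []  # one cheap test rejects every pattern at once
    return [i for i, p in enumerate(patterns) if euclidean(window, p) <= threshold]
```

Every pattern lies inside the envelope, so the bound never exceeds the true distance to any of them. A window rejected by the single envelope test therefore cannot match any pattern, which is the no-false-dismissal property.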
Li Wei is a Ph.D. candidate in the Department of Computer Science & Engineering at the University of California, Riverside. She
received her B.S. and M.S. degrees from Fudan University, China. Her research interests include data mining and information
retrieval.
Eamonn Keogh is an Assistant Professor of computer science at the University of California, Riverside. His research interests include
data mining, machine learning and information retrieval. Several of his papers have won best paper awards, including papers
at SIGKDD and SIGMOD. Dr. Keogh is the recipient of a 5-year NSF Career Award for “Efficient Discovery of Previously Unknown Patterns and Relationships in Massive Time Series Databases”.
Helga Van Herle is an Assistant Clinical Professor of medicine at the Division of Cardiology of the Geffen School of Medicine at UCLA. She
received her M.D. from UCLA in 1993; completed her residency in internal medicine at the New York Hospital (Cornell University;
1993–1996) and her cardiology fellowship at UCLA (1997–2001). Dr. Van Herle holds an M.Sc. in bioengineering from Columbia
University (1987) and a B.Sc. in chemical engineering from UCLA (1985).
Agenor Mafra-Neto, Ph.D., is the CEO of ISCA Technologies, Inc., in California and the founder of ISCA Technologies, LTDA, in Brazil. His research
interests include the analysis of insect behavior and communication systems, the manipulation of insect behavior, and the
automation of pest monitoring and pest control. Dr. Mafra-Neto is currently coordinating the deployment of area-wide smart
sensor and effector networks to micromanage agricultural and public health pests in the field in an automatic fashion.
Russell J. Abbott is a Professor of computer science at California State University, Los Angeles, and a member of the staff at the Aerospace
Corporation, El Segundo, CA. His primary interests are in the field of complex systems. He is currently organizing a workshop
to bring together people working in the fields of complex systems and systems engineering.