
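Time Series Symmetric Pattern Mining (时间序列对称模式挖掘)
Cite this article: LI Pan-Pan, SONG Shao-Xu, WANG Jian-Min. Time Series Symmetric Pattern Mining [J]. Journal of Software (软件学报), 2022, 33(3): 968-984.
Authors: LI Pan-Pan (李盼盼), SONG Shao-Xu (宋韶旭), WANG Jian-Min (王建民)
Affiliations: School of Software, Tsinghua University, Haidian, Beijing 100084, China; School of Software, Tsinghua University, Haidian, Beijing 100084, China; National Engineering Laboratory for Big Data Software, Beijing 100084, China; Beijing National Research Center for Information Science and Technology (Tsinghua University), Beijing 100084, China
Funding: National Key Research and Development Program of China (2019YFB1705301, 2019YFB1707001); National Natural Science Foundation of China (62072265, 62021002, 71690231); 2020 Emerging Platform Software Project of the Ministry of Industry and Information Technology
Abstract (translated from the Chinese): As informatization and industrialization converge, the Internet of Things and the Industrial Internet are developing rapidly, producing large volumes of industrial big data, of which time series are representative. Time series contain many valuable patterns, and symmetric patterns in particular occur widely across all kinds of time series. Mining symmetric patterns is valuable for research in behavior analysis, trajectory tracking, anomaly detection, and related fields, but time series data often reach tens or even hundreds of GB. Mining symmetric patterns with a direct nested-query algorithm...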

Keywords: time series; symmetric pattern; distance measure; dynamic programming
Received: 2021/7/1
Revised: 2021/7/31

Time Series Symmetric Pattern Mining
LI Pan-Pan, SONG Shao-Xu, WANG Jian-Min. Time Series Symmetric Pattern Mining [J]. Journal of Software, 2022, 33(3): 968-984.
Authors: LI Pan-Pan, SONG Shao-Xu, WANG Jian-Min
Affiliation: School of Software, Tsinghua University, Beijing 100084, China; School of Software, Tsinghua University, Beijing 100084, China; National Engineering Laboratory for Big Data Software, Beijing 100084, China; Beijing National Research Center for Information Science and Technology (Tsinghua University), Beijing 100084, China
Abstract: With the integration of informatization and industrialization, the Internet of Things and the Industrial Internet are booming, producing large volumes of industrial big data, of which time series are representative. Time series contain many valuable patterns, and symmetric patterns occur widely across them. Mining symmetric patterns is valuable for behavior analysis, trajectory tracking, anomaly detection, and related fields. However, time series data often amount to tens or even hundreds of gigabytes, so a direct nested-query algorithm may take months or even years, and typical acceleration techniques such as indexing, lower bounding, and the triangle inequality yield a speedup of at most one or two orders of magnitude. Inspired by the dynamic time warping algorithm, this paper therefore proposes a method that mines all symmetric patterns of a time series within O(w×|T|) time complexity. Specifically, given a constraint on symmetric pattern length, symmetric subsequences are computed by interval dynamic programming, and the largest set of non-overlapping symmetric subsequences is then selected with a greedy strategy. The paper also studies an algorithm for mining symmetric patterns in time series data streams, which dynamically adjusts the window size according to the characteristics of the data in the window so that symmetric patterns are not cut off. Experiments with one synthetic data set and three real data sets at different data volumes show that, compared with other symmetric pattern mining methods, the proposed method performs well in both mining results and time overhead.
Keywords: time series; symmetric pattern; distance measurement; dynamic programming
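
The abstract names two steps: interval dynamic programming under a pattern-length constraint, then greedy selection of non-overlapping symmetric subsequences. The Python sketch below illustrates that general idea only and is not the authors' algorithm. The mirror cost recurrence (a dynamic-time-warping-style pairing of the two ends of an interval, moving inward), the `threshold` and `min_len` parameters, and all function names are assumptions made for illustration. The table holds one cost per (start, length) pair, so the work grows with the length limit times the series length.

```python
from typing import List, Sequence, Tuple


def mirror_costs(series: Sequence[float], max_len: int) -> List[List[float]]:
    """cost[i][d]: warped distance between series[i..i+d] and its mirror image.

    Assumed recurrence (illustrative, not taken from the paper): pair the two
    end points, then continue with the cheapest of the three inner intervals.
    """
    n = len(series)
    cost = [[0.0] * max_len for _ in range(n)]  # d = 0: single point, cost 0
    for d in range(1, max_len):                 # intervals of growing length
        for i in range(n - d):
            j = i + d
            inner = cost[i + 1][d - 2] if d >= 2 else 0.0   # drop both ends
            best = min(inner, cost[i + 1][d - 1], cost[i][d - 1])
            cost[i][d] = abs(series[i] - series[j]) + best
    return cost


def greedy_non_overlapping(cands: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Classic earliest-end-first selection: keeps as many disjoint intervals as possible."""
    chosen: List[Tuple[int, int]] = []
    last_end = -1
    for start, end in sorted(cands, key=lambda iv: iv[1]):
        if start > last_end:
            chosen.append((start, end))
            last_end = end
    return chosen


def mine_symmetric(series: Sequence[float], max_len: int, min_len: int,
                   threshold: float) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) index pairs of selected symmetric subsequences."""
    cost = mirror_costs(series, max_len)
    cands = [(i, i + d)
             for i in range(len(series))
             for d in range(min_len - 1, max_len)
             if i + d < len(series) and cost[i][d] <= threshold]
    return greedy_non_overlapping(cands)


if __name__ == "__main__":
    data = [0, 1, 2, 1, 0, 5, 3, 4, 3]
    print(mine_symmetric(data, max_len=5, min_len=3, threshold=0.0))
    # prints [(1, 3), (6, 8)]
```

Earliest-end-first selection maximizes the number of disjoint intervals, which is why the short pattern at indices 1 to 3 is kept in preference to the longer one at indices 0 to 4 that contains it. A variant that prefers longer patterns would need a different tie-breaking rule.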