
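Time Series Discriminative Feature Dictionary Construction Algorithm (一种时间序列鉴别性特征字典构建算法)
Cite this article:ZHANG Wei,WANG Zhi-Hai,YUAN Ji-Dong,HAO Shi-Lei.Time Series Discriminative Feature Dictionary Construction Algorithm[J].Journal of Software (软件学报),2020,31(10):3216-3237.
Authors:ZHANG Wei  WANG Zhi-Hai  YUAN Ji-Dong  HAO Shi-Lei
Affiliation:School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China
Funding:Fundamental Research Funds for the Central Universities (2018JBM014); National Natural Science Foundation of China (61702030, 61672086); Beijing Natural Science Foundation (4182052); Beijing Excellent Talents Program (2017000020124G056)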
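Abstract:Time series data are generated widely across many fields of science, technology, and the economy. The fixed-length word extraction algorithm based on symbolic Fourier approximation (SFA) and sliding windows is currently one of the most effective feature generation algorithms for constructing time series feature dictionaries. During feature generation, however, it cannot dynamically choose the optimal number of Fourier values to retain for different sliding window lengths, and the dictionary construction process lacks an effective algorithm for selecting discriminative features from the massive number of generated features. To address this, a discriminative feature dictionary construction algorithm is proposed. First, a variable-length word extraction method based on Fourier approximation is proposed that learns the optimal word length for sliding windows of different lengths. Second, a new evaluation metric for feature discriminability is constructed, and the generated features are selected according to its dynamic threshold. Experimental results show that a logistic regression model built on the constructed feature dictionary not only achieves high classification accuracy but can also effectively identify the discriminative features used in prediction.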

Keywords:time series classification  feature generation  discriminative feature selection  feature dictionary learning
Received:2018/10/23
Revised:2019/1/1

Time Series Discriminative Feature Dictionary Construction Algorithm
ZHANG Wei,WANG Zhi-Hai,YUAN Ji-Dong,HAO Shi-Lei.Time Series Discriminative Feature Dictionary Construction Algorithm[J].Journal of Software,2020,31(10):3216-3237.
Authors:ZHANG Wei  WANG Zhi-Hai  YUAN Ji-Dong  HAO Shi-Lei
Affiliation:School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China
Abstract:Time series data are widely generated in many fields of science, technology, and the economy. The fixed-length word extraction algorithm based on Symbolic Fourier Approximation (SFA) and a sliding window is one of the most effective feature generation algorithms for constructing time series feature dictionaries, but it has two clear shortcomings. First, it cannot dynamically select the optimal number of Fourier values to retain for different sliding window lengths. Second, it lacks an effective algorithm for selecting discriminative features from the massive number of generated features. To this end, a new variable-length feature dictionary construction algorithm is proposed in this study. First, a variable-length word extraction method based on SFA is proposed, which dynamically selects the optimal number of Fourier values for each sliding window length. Second, a new indicator for evaluating feature discriminability is designed, and the generated features are selected according to its dynamic threshold. Experimental results show that a logistic regression model based on the proposed time series dictionary achieves high classification accuracy and can identify the discriminative features used in prediction.
Keywords:time series classification  feature generation  discriminant feature selection  feature dictionary learning
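
The fixed-length SFA word extraction that the abstract takes as its starting point can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm: all function names and parameters (`bag_of_words`, `n_values`, `alphabet_size`, and so on) are hypothetical, the equi-depth breakpoints are learned from a single series rather than from a training set, and the paper's variable-length word selection and discriminability indicator are not reproduced because the abstract does not specify them.

```python
import cmath
from bisect import bisect_right
from collections import Counter


def sliding_windows(series, window):
    """Return every contiguous subsequence of length `window`."""
    return [series[i:i + window] for i in range(len(series) - window + 1)]


def fourier_values(segment, n_values):
    """First `n_values` real/imaginary DFT values of a mean-centred segment."""
    n = len(segment)
    mean = sum(segment) / n
    centred = [x - mean for x in segment]
    values = []
    k = 1  # skip k = 0: that coefficient is zero after mean-centring
    while len(values) < n_values:
        coeff = sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, x in enumerate(centred))
        values.extend([coeff.real, coeff.imag])
        k += 1
    return values[:n_values]


def learn_breakpoints(rows, alphabet_size):
    """Equi-depth breakpoints per Fourier value (an assumed simplification)."""
    breakpoints = []
    for column in zip(*rows):
        ordered = sorted(column)
        step = len(ordered) / alphabet_size
        breakpoints.append([ordered[int(step * b)]
                            for b in range(1, alphabet_size)])
    return breakpoints


def to_word(values, breakpoints):
    """Map each Fourier value to the letter of the bin it falls into."""
    return "".join(chr(ord("a") + bisect_right(bp, v))
                   for v, bp in zip(values, breakpoints))


def bag_of_words(series, window, n_values, alphabet_size=4):
    """Histogram of SFA-style words for one series and one window length."""
    rows = [fourier_values(w, n_values)
            for w in sliding_windows(series, window)]
    breakpoints = learn_breakpoints(rows, alphabet_size)
    return Counter(to_word(r, breakpoints) for r in rows)
```

In this sketch `n_values` is fixed by the caller for every window length, which is the limitation the abstract describes; calling `bag_of_words(series, window=16, n_values=4)` yields one word of length 4 per window position.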
This article is indexed in 万方数据 (Wanfang Data) and other databases.