Found 20 similar documents (search time: 15 ms)
1.
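Time series clustering is an important topic in data mining research. Most existing clustering algorithms apply k-means to low-dimensional data and cannot effectively cluster high-dimensional multivariate time series (MTS) data. An efficient MTS clustering algorithm, PCA-CLUSTER, is proposed: it first reduces the dimensionality of the MTS data with principal component analysis, then selects the principal component sequences of the MTS data for K-nearest-neighbor clustering. Theoretical analysis and experimental results show that the algorithm can effectively solve the MTS clustering problem.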
2.
Coşkun Hamzaçebi 《Information Sciences》2008,178(23):4550-4559
In this study, an artificial neural network (ANN) structure is proposed for seasonal time series forecasting. The proposed structure uses the seasonal period of the time series to determine the number of input and output neurons. The model was tested on four real-world time series. The results of the proposed ANN were compared with those of traditional statistical models and other ANN architectures. This comparison shows that the proposed model yields lower prediction error than the other methods. The proposed model is especially suitable when the seasonality in the time series is strong; if the seasonality is weak, different network structures may be more suitable.
3.
4.
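To improve the efficiency of similarity measurement for multivariate time series, a principal component analysis (PCA) method based on the extended Frobenius norm (Eros) is adopted. A principal component similarity factor is constructed from the principal components and eigenvalues and used to compare the similarity between multivariate time series matrices. To verify the effectiveness of the method, experiments were carried out on three data sets (two real, one synthetic). The results show that the method has certain advantages over the earlier Euclidean distance (ED) and dynamic time warping (DTW) similarity measures.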
5.
6.
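Application of Principal Component Analysis Combined with Neural Networks to Multivariate Series Prediction   Cited by: 1 (self-citations: 0, citations by others: 1)
Current research on forecasting methods focuses mainly on univariate time series. This paper establishes a methodological framework for modeling and predicting multivariate nonlinear time series. First, both the linear and the nonlinear correlations between series states are considered, and an initial delay window is built to contain sufficient predictive information. Then, principal component analysis (PCA) is used to find the directions of maximum variance of the different variables in the data space; PCA is extended to extract the combined information of multiple variables and to reconstruct the multivariate input state phase space. Finally, a neural network approximates the functional mappings among the variables and between the current and future states, achieving multivariate prediction. Prediction simulations on the Rössler chaotic equations and on rainfall and temperature series from Dalian demonstrate the effectiveness of the method, which offers a new approach to multivariate time series analysis.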
7.
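To address the effect of high dimensionality on the process and results of multivariate time series data mining, and the limitations of traditional principal component analysis in representing multivariate time series features, a feature representation method based on variable correlation is proposed. A covariance matrix describes the distribution characteristics and variable correlations of each multivariate time series; principal component analysis is then applied to the combined covariance matrix, achieving dimensionality reduction and feature representation of the multivariate time series. Experimental results show that the proposed method not only improves the quality of multivariate time series data mining but can also mine multivariate time series of unequal length quickly and effectively.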
8.
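A least squares support vector machine (LS-SVM) based on independent component analysis (ICA) is proposed for independent multi-step-ahead time series forecasting. ICA is used to estimate the independent components (ICs) of the predictor variables, and the time series is reconstructed from the noise-free ICs. The k-nearest neighbor (k-NN) method is used to reduce the size of the training set, a new distance function is proposed to lower the computational complexity of LS-SVM training, and the predicted values are post-processed with constraints. Comparative forecasting experiments on several time series are carried out with the ICA-based LS-SVM, the ordinary LS-SVM, and a back-propagation neural network (BP-ANN). The experimental results show that the ICA-based LS-SVM predicts better than the ordinary LS-SVM and the BP-ANN.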
9.
Modeling and forecasting seasonal and trend time series is an important research topic in many areas of industrial and economic activity. In this study, we forecast seasonal and trend time series using a quasi-linear autoregressive model. This model belongs to a class of varying-coefficient models in which the autoregressive coefficients are constructed by radial basis function networks. A combined genetic and gradient-based optimization algorithm is applied to select suitable input variables and model-dependent variables automatically while optimizing the model parameters. The model is tested on five monthly time series. Comparison with various other methods shows the effectiveness of the proposed approach for seasonal time series.
10.
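To address the large fitting error of piecewise linear representation (first-order linear functions) and piecewise constant representation (zero-order functions) in approximating time series, a piecewise higher-order function representation for time series is proposed. A candidate set of higher-order functions is built, and the optimal function is selected by a fitting-error criterion; constraints at the breakpoints are introduced to guarantee continuity at the segmentation points. For choosing segmentation points, thresholds on the change in observed values and on the segment interval are set, so that important points are preserved while the compression ratio is maintained. Experimental results show that the algorithm fits the original series better than piecewise linear and piecewise constant representations.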
11.
12.
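Multivariate Time Series Prediction Based on Reservoir Principal Component Analysis   Cited by: 1 (self-citations: 0, citations by others: 1)
A nonlinear PCA method based on the reservoir of an echo state network is proposed and applied to multivariate time series prediction. Because correlations among multidimensional input variables degrade modeling performance, the reservoir is used to transform the nonlinear features of the input in the original space into linear features in a high-dimensional space. Linear PCA is then applied there to find the directions of maximum variance of the input in reservoir space and to extract effective combined multivariate information. After reservoir PCA, the input has a dynamic linear mapping to the prediction target, so linear methods can be used for modeling. Simulation results demonstrate the effectiveness of the method.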
13.
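In the maintenance operations of high-end manufacturing enterprises, spare-part demand arises randomly and includes many zero-demand periods. The corresponding demand data are also scarce and show intermittent, lumpy patterns, so existing time series forecasting methods struggle to predict spare-part demand trends. To solve this problem, a predictability assessment and joint forecasting method for intermittent time series is proposed. First, a new intermittent similarity index is proposed: by counting the frequency and positions of "0" elements in two series and combining measures such as the maximal information coefficient and the average demand interval, it evaluates the trend information and fluctuation patterns of the series and quantifies the predictability of intermittent series. Second, based on this index, an intermittent-similarity hierarchical clustering method is built to adaptively select series with high similarity and strong predictability and to discard extremely sparse, unpredictable ones. In addition, structural information among series is exploited and a multi-output support vector regression (M-SVR) model is built, enabling joint forecasting of intermittent series from small samples. Finally, experiments are run on two public data sets (the UCI gift retail data set and the Huawei computer parts data set) and on a real after-sales parts data set from a large manufacturing enterprise. The results show that, compared with several typical time series forecasting methods, the proposed method effectively uncovers the predictability of various intermittent series and improves forecasting accuracy for small-sample intermittent series, offering a new solution for spare-part demand forecasting in manufacturing.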
14.
Fuzzy time series models have been successfully employed in predicting stock prices and foreign exchange rates. In this paper, we propose a new fuzzy time series model, termed distance-based fuzzy time series (DBFTS), to predict the exchange rate. Unlike existing fuzzy time series models, which require an exact match of the fuzzy logic relationships (FLRs), the distance-based model uses the distance between two FLRs in selecting prediction rules. To predict the exchange rate, a two-factor distance-based fuzzy time series model is constructed. The first factor of the model is the exchange rate itself, and the second factor comprises many candidate variables affecting the fluctuation of exchange rates. Using the exchange rate data released by the Central Bank of Taiwan, we conducted several experiments on exchange rate forecasting. The experimental results showed that the distance-based fuzzy time series outperformed the random walk model and the artificial neural network model in terms of mean square error.
15.
Traditional methodologies for time series prediction take the series to be predicted and split it into training, validation, and test sets. The first serves to construct forecasting models, the second is used for model selection, and the third is used to evaluate the final model. Different time series approaches such as ARIMA and exponential smoothing, as well as regression techniques such as neural networks and support vector regression, have been successfully used to develop forecasting models. A problem that has not yet received proper attention, however, is how to update such forecasting models when new data arrives, i.e. when a new event of the considered time series occurs. This paper presents a strategy to update support vector regression based forecasting models for time series with seasonal patterns. The basic idea of this updating strategy is to add the most recent data to the training set every time a predefined number of observations has accumulated. This way, information in new data is taken into account in model construction. The proposed strategy outperforms the respective static version in almost all time series studied in this work, considering three different error measures.
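The updating idea in this abstract (extend the training set and refit once a predefined number of new observations has arrived) can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the per-season mean forecaster stands in for support vector regression, and the names `fit_seasonal_mean`, `rolling_forecast`, `n_train`, `period`, and `update_every` are hypothetical.

```python
def fit_seasonal_mean(history, period):
    """Stand-in model (NOT SVR): the mean of each seasonal position."""
    sums = [0.0] * period
    counts = [0] * period
    for t, y in enumerate(history):
        sums[t % period] += y
        counts[t % period] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]


def rolling_forecast(series, n_train, period, update_every):
    """One-step-ahead forecasts; refit after every `update_every` new observations."""
    train = list(series[:n_train])
    model = fit_seasonal_mean(train, period)
    pending = []    # observations seen since the last refit
    forecasts = []
    for t in range(n_train, len(series)):
        forecasts.append(model[t % period])   # forecast before seeing series[t]
        pending.append(series[t])
        if len(pending) == update_every:
            train.extend(pending)             # add the most recent data
            pending.clear()
            model = fit_seasonal_mean(train, period)
    return forecasts


series = [1, 2, 1, 2, 3, 4, 3, 4]
updated = rolling_forecast(series, n_train=4, period=2, update_every=2)
static = rolling_forecast(series, n_train=4, period=2, update_every=10**9)
```

With a very large `update_every` the model is never refit, which gives the static baseline against which an updating strategy would be compared.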
16.
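Reducing the miss rate and the false detection rate is one of the difficult problems in network traffic anomaly detection. This paper proposes a multiple-time-series data mining method for analyzing traffic anomaly features in large-scale communication networks. The time series formed by multiple network traffic feature parameters are analyzed as a whole, and multiple-time-series data mining produces effective association rules related to traffic anomalies, giving an accurate description of the security threats to the whole communication network. The method is validated on Abilene network data.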
17.
Pritpal Singh, Bhogeswar Borah 《Engineering Applications of Artificial Intelligence》2013,26(10):2443-2457
In this paper, we present a new model to handle four major issues of fuzzy time series forecasting, viz., determination of the effective length of intervals, handling of fuzzy logical relationships (FLRs), determination of the weight for each FLR, and defuzzification of fuzzified time series values. To resolve the problem of determining the length of intervals, this study suggests a new time series data discretization technique. After the intervals are generated, the historical time series data set is fuzzified based on fuzzy time series theory. The fuzzified time series values are then used to create the FLRs. Most existing fuzzy time series models simply ignore repeated FLRs without any proper justification. Because FLRs represent the patterns of historical events and reflect the possibility that such patterns will appear in the future, simply discarding the repeated FLRs may cause information loss. Therefore, this model considers the repeated FLRs during forecasting. It also assigns weights to the FLRs based on their severity rather than their patterns of occurrence. For this purpose, a new technique is incorporated in the model, which determines the weight for each FLR based on the index of the fuzzy set associated with the current state of the FLR. To handle these weighted FLRs and obtain the forecasted results, this study proposes a new defuzzification technique. The proposed model is verified and validated with three different time series data sets. Empirical analyses indicate that the proposed model handles one-factor time series data sets more efficiently than conventional fuzzy time series models. Experimental results show that the proposed model also outperforms conventional statistical models.
18.
19.
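To better capture the shape features of time series and to find a similarity measure better suited to longer series, the dynamic time warping algorithm is improved and a sequence similarity measure based on hierarchical dynamic time warping is proposed. The time series is segmented at multiple levels, corresponding hierarchical segment subsequences are sampled uniformly from the segments, and each subsequence is abstracted as a point in three-dimensional space (reflecting the mean, length, and trend of the subsequence) for similarity measurement. The similarity measures of all levels are then combined into the final result. Experiments show that, with reasonable parameter settings, the method achieves high accuracy and efficiency in measuring sequence similarity.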
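For orientation, the baseline that this abstract builds on, classic dynamic time warping, can be sketched as below. This is only the standard dynamic-programming recurrence with absolute difference as the local cost; it is not the hierarchical variant the abstract proposes, and the function name `dtw_distance` is an assumption.

```python
from math import inf


def dtw_distance(a, b):
    """Classic DTW distance between two numeric sequences."""
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])       # local cost
            cost[i][j] = d + min(
                cost[i - 1][j],       # a[i-1] aligned with an already-used b[j-1]
                cost[i][j - 1],       # b[j-1] aligned with an already-used a[i-1]
                cost[i - 1][j - 1],   # both sequences advance
            )
    return cost[n][m]
```

The table has `len(a) * len(b)` cells, so the cost grows quadratically with series length, which is the efficiency concern for long series that motivates segment-based variants.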
20.
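To address the inability of existing fuzzy time series forecasting algorithms to adapt when new relationships appear during prediction, a fuzzy time series forecasting algorithm based on interval similarity (ISFTS) is proposed. First, on the basis of fuzzy theory, the intervals of the universe of discourse are partitioned a second time with a mean-based method, and fuzzy sets are defined on these intervals to fuzzify the historical data. Next, third-order fuzzy logical relationships are established and a formula for the similarity of logical relationships is introduced; the trend value of future data is computed to obtain the fuzzy forecast. Finally, the fuzzy forecast is defuzzified to obtain a crisp forecast value. Because ISFTS predicts the trend of data change, it overcomes the deficiency of the logical relationships in current forecasting algorithms. Simulation results show that ISFTS has smaller prediction error than comparable algorithms and outperforms them on all three metrics: mean absolute percentage error (MAPE), mean absolute error (MAE), and root mean square error (RMSE). ISFTS is therefore more adaptable in time series forecasting, especially when the volume of data is large.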
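The three error metrics named in this abstract have standard definitions, sketched below for reference. The function names and the sample numbers are illustrative assumptions, not data from the paper; MAPE is returned in percent and is undefined when an actual value is zero.

```python
from math import sqrt


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Illustrative values only
actual = [100.0, 200.0]
predicted = [110.0, 180.0]
errors = (mae(actual, predicted), rmse(actual, predicted), mape(actual, predicted))
```

RMSE penalizes large individual errors more heavily than MAE, while MAPE is scale-free, which is why the three are often reported together.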