Found 20 similar articles (search time: 15 ms)
1.
《Expert systems with applications》2014,41(9):4349-4359
Kernel-based algorithms have proven successful in many nonlinear modeling applications. However, the computational complexity of classical kernel-based methods grows superlinearly with the number of training data, which is too expensive for online applications. To solve this problem, the paper presents an information-theoretic method to train a sparse version of a kernel learning algorithm. A concept named instantaneous mutual information is investigated to measure the system reliability of the estimated output. This measure is used as a criterion to determine the novelty of each training sample, and informative samples are selected to form a compact dictionary that represents the whole data set. Furthermore, we propose a robust learning scheme for training the kernel learning algorithm with an adaptive learning rate. This ensures the convergence of the learning algorithm and makes it converge to the steady state faster. We illustrate the performance of the proposed algorithm and compare it with some recent kernel algorithms in several experiments.
2.
Abdullah Mueen, Nikan Chavoshi, Noor Abu-El-Rub, Hossein Hamooni, Amanda Minnich, Jonathan MacCarthy 《Knowledge and Information Systems》2018,54(1):237-263
Dynamic time warping (DTW) distance has been effectively used in mining time series data in a multitude of domains. However, in its original formulation DTW is extremely inefficient in comparing long sparse time series, which contain mostly zeros and some unevenly spaced nonzero observations. The original DTW distance does not take advantage of this sparsity, leading to redundant calculations and a prohibitively large computational cost for long time series. We derive a new time warping similarity measure (AWarp) for sparse time series that works on the run-length encoded representation of sparse time series. The complexity of AWarp is quadratic in the number of observations as opposed to the range of time of the time series. Therefore, AWarp can be several orders of magnitude faster than DTW on sparse time series. AWarp is exact for binary-valued time series and a close approximation of the original DTW distance for any-valued series. We discuss useful variants of AWarp: bounded (both upper and lower), constrained, and multidimensional. We show applications of AWarp to three data mining tasks, namely clustering, classification, and outlier detection, which are otherwise not feasible using classic DTW, while producing equivalent results. Potential areas of application include bot detection, human activity classification, search trend analysis, seismic analysis, and unusual review pattern mining.
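To make the cost argument in this abstract concrete, the following is a minimal sketch of the two ingredients it mentions: the classic quadratic DTW recurrence and a run-length encoding of zero runs. It is not AWarp itself, whose recurrence the abstract does not give. The function names `dtw_distance` and `run_length_encode_zeros`, the squared pointwise cost, and the `(value, count)` pair layout are assumptions chosen for illustration.

```python
def dtw_distance(x, y):
    """Classic DTW with squared pointwise cost; O(len(x) * len(y)) time."""
    inf = float("inf")
    n, m = len(x), len(y)
    # cost[i][j] holds the best cumulative cost of aligning x[:i] with y[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (x[i - 1] - y[j - 1]) ** 2
            cost[i][j] = d + min(cost[i - 1][j],      # repeat y[j-1]
                                 cost[i][j - 1],      # repeat x[i-1]
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m] ** 0.5


def run_length_encode_zeros(series):
    """Collapse each run of zeros into (0, run_length); keep nonzeros as (value, 1)."""
    encoded = []
    zeros = 0
    for v in series:
        if v == 0:
            zeros += 1
            continue
        if zeros:
            encoded.append((0, zeros))
            zeros = 0
        encoded.append((v, 1))
    if zeros:
        encoded.append((0, zeros))
    return encoded
```

For a series that is mostly zeros, the encoded list has roughly as many entries as there are nonzero observations, which is the length a method operating on the encoded form would pay for, rather than the full time range that `dtw_distance` iterates over.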
3.
Marcin Michalak 《Pattern Analysis & Applications》2011,14(3):283-293
This short article describes two kernel algorithms for regression function estimation. One of them is called HASKE and has its own heuristic for evaluating the h parameter. The second is a hybrid algorithm that connects the SVM and HASKE in such a way that the definition of the local neighborhood is based on the definition of the h-neighborhood from HASKE. Both are used as predictors for time series.
4.
Neural Computing and Applications - Based on support vector machine (SVM), incremental SVM was proposed, which has a strong ability to deal with various classification and regression problems...
5.
6.
Using the classical Parzen window (PW) estimate as the target function, the sparse kernel density estimator is constructed in a forward-constrained regression (FCR) manner. The proposed algorithm selects significant kernels one at a time, while the leave-one-out (LOO) test score is minimized subject to a simple positivity constraint in each forward stage. The model parameter estimation in each forward stage is simply the solution of the jackknife parameter estimator for a single parameter, subject to the same positivity constraint check. For each selected kernel, the associated kernel width is updated via the Gauss-Newton method with the model parameter estimate fixed. The proposed approach is simple to implement and the associated computational cost is very low. Numerical examples are employed to demonstrate the efficacy of the proposed approach.
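The Parzen window estimate that this abstract uses as its regression target can be sketched as follows. This shows only the dense target function, not the forward-constrained sparse construction. The Gaussian kernel, the one-dimensional data, and the names `parzen_window_density` and `h` are assumptions made for illustration.

```python
import math


def parzen_window_density(x, samples, h):
    """Gaussian Parzen window estimate of the density at x with bandwidth h."""
    if h <= 0:
        raise ValueError("bandwidth h must be positive")
    if not samples:
        raise ValueError("samples must be non-empty")
    # Every sample contributes one kernel with equal weight 1 / len(samples).
    norm = 1.0 / (len(samples) * h * math.sqrt(2.0 * math.pi))
    return norm * sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in samples)
```

Because every training sample carries a kernel, evaluating this estimate costs time linear in the sample count, which is the cost a sparse estimator with only a few selected kernels is meant to reduce.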
7.
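Outliers in time series data directly affect the results of data mining algorithms and can even cause them to fail. The traditional density-based spatial clustering of applications with noise (DBSCAN) algorithm can be used to identify outliers, but it is sensitive to its parameters, has high time complexity, and has limited accuracy. Based on the characteristics of time series data, a variance-based clustering algorithm for outlier identification that can automatically perform multiple identification passes is proposed. The method converts the traditional neighborhood density into variance and mean, and the density threshold into the variance and threshold within a time window. On the basis of definitions of outlier data, outlier-cluster data, and abnormal-cluster data, it gives the decision rules for outlier identification. In addition, because a single identification pass cannot remove all outliers, the algorithm is extended to a multi-pass outlier identification algorithm by defining a termination condition for repeated identification. Application in an aerospace data mining project verifies that the algorithm has good generality and low time complexity, and that it can perform multiple identification passes to improve accuracy.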
8.
José B. Aragão Jr. 《Computers & Electrical Engineering》2010,36(3):536-544
Voice over IP (VoIP) applications require a buffer at the receiver to minimize packet loss due to late arrival. Several algorithms are available in the literature to estimate the playout buffer delay. Classic estimation algorithms are non-adaptive, i.e., they differ from more recent approaches mainly in the absence of learning mechanisms. This paper introduces two new formulations of adaptive algorithms for online learning and prediction of the playout buffer delay, the first based on the standard Box-Jenkins autoregressive model and the second based on feedforward and recurrent neural networks. The obtained results indicate that the proposed algorithms present better overall performance than the classic ones.
9.
Hongqiao Wang, Fuchun Sun, Yanning Cai, Zongtao Zhao 《Soft Computing - A Fusion of Foundations, Methodologies and Applications》2010,14(9):931-944
The kernel method has proved to be an effective machine learning tool in many fields. Support vector machines with various kernel functions may perform differently, as the kernels belong to two different types, local kernels and global kernels. The composite kernel, which can bring more stable results and good precision in classification and regression, is therefore an inevitable choice. To reduce the computational complexity of the kernel machine's online modeling, an unbiased least squares support vector regression (LSSVR) model with a composite kernel is proposed. The bias term of LSSVR is eliminated by modifying the form of the structural risk in this model, which greatly simplifies the calculation of the regression coefficients. At the same time, by introducing the composite kernel into the LSSVM, the model can easily adapt to the irregular variation of chaotic time series. For real-time performance, an online learning algorithm based on Cholesky factorization is designed according to the characteristics of the extended kernel function matrix. Experimental results indicate that the unbiased composite kernel LSSVR is effective and suitable for online time series with both steep and smooth variations, as it tracks the dynamic character of the series well, with good prediction precision and better generalization and stability. The algorithm also saves much computation time compared with methods that use matrix inversion, although it takes slightly more time than using a single kernel.
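The computational point of this abstract, that dropping the bias term leaves a symmetric positive definite system solvable by Cholesky factorization instead of matrix inversion, can be sketched in batch form. This is an illustration under assumptions, not the paper's method: it uses a single RBF kernel rather than the composite kernel, it solves the ridge-type system (K + I/gamma) alpha = y rather than the paper's extended kernel matrix, and it omits the online update. All function names and the parameters `gamma` and `sigma` are hypothetical.

```python
import math


def rbf_kernel(a, b, sigma=1.0):
    """Gaussian (RBF) kernel between two scalars."""
    return math.exp(-((a - b) ** 2) / (2.0 * sigma ** 2))


def cholesky(a):
    """Return lower-triangular L with L * L^T = a, for symmetric positive definite a."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def solve_cholesky(low, b):
    """Solve (L * L^T) x = b by forward then backward substitution."""
    n = len(b)
    z = [0.0] * n
    for i in range(n):  # forward: L z = b
        z[i] = (b[i] - sum(low[i][k] * z[k] for k in range(i))) / low[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):  # backward: L^T x = z
        x[i] = (z[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x


def fit_bias_free_lssvr(xs, ys, gamma=100.0, sigma=1.0):
    """Assumed bias-free formulation: solve (K + I / gamma) alpha = y."""
    n = len(xs)
    a = [[rbf_kernel(xs[i], xs[j], sigma) + (1.0 / gamma if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    return solve_cholesky(cholesky(a), ys)


def predict(xs, alpha, x_new, sigma=1.0):
    """Kernel expansion without a bias term."""
    return sum(a * rbf_kernel(x, x_new, sigma) for a, x in zip(alpha, xs))
```

With a bias term, the standard LSSVR system has a bordered matrix that is not positive definite, so Cholesky factorization cannot be applied to it directly; removing the bias is what makes the triangular solves above applicable.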
10.
Abdullah Mueen, Eamonn Keogh, Qiang Zhu, Sydney S. Cash, M. Brandon Westover, Nima Bigdely-Shamlo 《Data mining and knowledge discovery》2011,22(1-2):73-105
Time series motifs are sets of very similar subsequences of a long time series. They are of interest in their own right, and are also used as inputs in several higher-level data mining algorithms including classification, clustering, rule discovery and summarization. In spite of extensive research in recent years, finding time series motifs exactly in massive databases is an open problem. Previous efforts either found approximate motifs or considered relatively small datasets residing in main memory. In this work, we build on previous work on pivot-based indexing to introduce a disk-aware algorithm that finds time series motifs exactly in multi-gigabyte databases containing on the order of tens of millions of time series. We have evaluated our algorithm on datasets from diverse areas including medicine, anthropology, computer networking and image processing, and show that we can find interesting and meaningful motifs in datasets that are many orders of magnitude larger than anything considered before.
11.
12.
13.
Wang Zhijin, Su Qiankun, Chao Guoqing, Cai Bing, Huang Yaohui, Fu Yonggang 《Applied Intelligence》2022,52(13):14595-14606
Applied Intelligence - Share turnover is a key indicator for investing in the stock market, which represents how easy or difficult it is to trade a stock. Several techniques have been proposed to...
14.
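To jointly account for instantaneous and lagged causal effects among latent variables in causal discovery algorithms based on latent variable models, a temporal latent variable model built on dynamic Bayesian networks is constructed, and a corresponding causal discovery algorithm is proposed. Factor analysis is used to estimate the factor loading matrix of the measurement model, a structural vector autoregressive model is applied to estimate the autoregressive matrix, and the non-Gaussianity of the data is exploited to learn, in turn, the instantaneous effect matrix and the lagged effect matrix among the latent variables, from which the causal network structure of the temporal latent variable model is built. Experimental results verify the effectiveness of the algorithm.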
15.
Hisashi Shimodaira 《Expert systems with applications》1996,10(3-4):429-434
This paper explores a method of improving the predictive performance of the multi-layer feedforward neural network in time series prediction. For the similar-data selective learning method, we propose weighting the distance by a power function of the correlation coefficients of the time series (CSDS method). The results of numerical experiments show that, for a time series whose nature is rather choppy or chaotic, appropriate use of the CSDS method considerably improves predictive performance, and its performance is considerably better than that of previously proposed methods.
16.
Fuzzy time series approaches are used when observations of time series contain uncertainty. Moreover, these approaches do not require the assumptions needed for traditional time series approaches. Generally, fuzzy time series methods consist of three stages, namely, fuzzification, determination of fuzzy relations, and defuzzification. Artificial intelligence algorithms are frequently used in these stages with genetic algorithms being the most popular of these algorithms owing to their rich operators and good performance. However, the mutation operator of a GA may cause some negative results in the solution set. Thus, we propose a modified genetic algorithm to find optimal interval lengths and control the effects of the mutation operator. The results of applying our new approach to real datasets show superior forecasting performance when compared with those obtained by other techniques.
17.
Depei Bao 《Applied Intelligence》2008,29(1):1-11
Traditional financial analysis systems utilize low-level price data as their analytical basis. For example, a decision-making system for stock predictions regards raw price data as the training set for classifications or rule inductions. However, the financial market is a complex and dynamic system with noisy, non-stationary and chaotic data series. Raw price data are too random to characterize determinants in the market, which prevents reliable predictions. On the other hand, high-level representation models, which represent data on the basis of human knowledge of the problem domain, can reduce the randomness in the raw data. In this paper, we present a high-level representation model that is easy to translate from low-level data into a machine representation. It is a generalized model in that it can accommodate multiple financial analytical techniques and intelligent trading systems. To demonstrate this, we further combine the representation with a probabilistic model for automatic stock trades and provide promising results.
An erratum to this article can be found at
18.
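To overcome problems that traditional algorithms cannot avoid, such as the curse of dimensionality and overfitting, a method for correcting and selecting kernel functions based on the temporal correlation of time series data is proposed using the support vector machine (SVM), and its effectiveness is verified on real sulfur dioxide (SO2) data. Experimental results show that the time series kernel function fits the test data set better and improves the generalization ability of the model to some extent.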
19.
Michael P. Clements 《Computational statistics & data analysis》2007,51(7):3580-3594
The calculation of interval forecasts for highly persistent autoregressive (AR) time series based on the bootstrap is considered. Three methods are considered for countering the small-sample bias of least-squares estimation for processes which have roots close to the unit circle: a bootstrap bias-corrected OLS estimator; the use of the Roy-Fuller estimator in place of OLS; and the use of the Andrews-Chen estimator in place of OLS. All three methods of bias correction yield superior results to the bootstrap in the absence of bias correction. Of the three correction methods, the bootstrap prediction intervals based on the Roy-Fuller estimator are generally superior to the other two. The small-sample performance of bootstrap prediction intervals based on the Roy-Fuller estimator is investigated when the order of the AR model is unknown and has to be determined using an information criterion.
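As a point of reference for the uncorrected baseline this abstract compares against, a residual-bootstrap percentile interval for an AR(1) model can be sketched as follows. The sketch conditions on the plain OLS coefficient, so it applies none of the three bias corrections discussed and ignores parameter uncertainty. The zero-intercept AR(1) form and the names `fit_ar1` and `bootstrap_interval` are assumptions made for illustration.

```python
import random


def fit_ar1(series):
    """OLS estimate of phi in y_t = phi * y_{t-1} + e_t (no intercept), plus residuals."""
    num = sum(series[t] * series[t - 1] for t in range(1, len(series)))
    den = sum(series[t - 1] ** 2 for t in range(1, len(series)))
    phi = num / den
    residuals = [series[t] - phi * series[t - 1] for t in range(1, len(series))]
    return phi, residuals


def bootstrap_interval(series, horizon=1, level=0.9, draws=2000, seed=0):
    """Percentile interval for y_{T+horizon}, resampling centered OLS residuals."""
    rng = random.Random(seed)
    phi, residuals = fit_ar1(series)
    mean_res = sum(residuals) / len(residuals)
    centered = [r - mean_res for r in residuals]
    endpoints = []
    for _ in range(draws):
        y = series[-1]
        for _ in range(horizon):
            # Simulate one future step with a resampled shock.
            y = phi * y + rng.choice(centered)
        endpoints.append(y)
    endpoints.sort()
    lo_idx = int((1.0 - level) / 2.0 * (draws - 1))
    hi_idx = int((1.0 + level) / 2.0 * (draws - 1))
    return endpoints[lo_idx], endpoints[hi_idx]
```

When the true root is close to the unit circle, the OLS estimate of `phi` is biased in small samples, so intervals built this way are centered on a biased forecast. That is the shortcoming the bias-corrected estimators in the abstract are meant to address.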
20.
Otávio A. S. Carpinteiro, João P. R. R. Leite, Carlos A. M. Pinheiro, Isaías Lima 《Artificial Intelligence Review》2012,38(2):163-171
This paper presents the study of three forecasting models: a multilayer perceptron, a support vector machine, and a hierarchical model. The hierarchical model is made up of a self-organizing map and a support vector machine, the latter on top of the former. The models are trained and assessed on a time series of a Brazilian stock market fund. The results from the experiments show that the performance of the hierarchical model is better than that of the support vector machine, and much better than that of the multilayer perceptron.