Similar Literature
20 similar articles found
1.
The fuzzy time series has recently received increasing attention because of its capability of dealing with vague and incomplete data. A variety of models have been developed either to improve forecasting accuracy or to reduce computation overhead. However, the issues of controlling uncertainty in forecasting, partitioning intervals effectively, and achieving consistent forecasting accuracy across different interval lengths have rarely been investigated. This paper proposes a novel deterministic forecasting model to manage these crucial issues. In addition, an important parameter, the maximum length of a subsequence in a fuzzy time series resulting in a certain state, is deterministically quantified. Experimental results using the University of Alabama's enrollment data demonstrate that the proposed forecasting model outperforms existing models in terms of accuracy, robustness, and reliability. Moreover, the model adheres to the consistency principle that a shorter interval length leads to more accurate results.
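To make the fuzzy time series pipeline concrete, here is a minimal first-order sketch in the spirit of the classic Chen (1996) scheme: equal-length partitioning, fuzzification, fuzzy logical relationships, and midpoint defuzzification. The partition bounds and interval count below are illustrative assumptions; this is not the deterministic model proposed in the paper.

```python
# Minimal first-order fuzzy time series forecast (Chen-style sketch).
# Partition bounds, interval count, and defuzzification are assumptions.

def partition(lo, hi, n):
    """Split the universe of discourse [lo, hi] into n equal intervals."""
    w = (hi - lo) / n
    return [(lo + i * w, lo + (i + 1) * w) for i in range(n)]

def fuzzify(x, intervals):
    """Index of the interval containing x (last interval is right-closed)."""
    for i, (a, b) in enumerate(intervals):
        if a <= x < b:
            return i
    return len(intervals) - 1

def forecast(series, intervals):
    """Build first-order relationships A_i -> {A_j} and forecast the next
    value as the mean midpoint of the consequent intervals."""
    states = [fuzzify(x, intervals) for x in series]
    rules = {}
    for s, t in zip(states, states[1:]):
        rules.setdefault(s, set()).add(t)
    last = states[-1]
    mids = [sum(intervals[j]) / 2 for j in sorted(rules.get(last, {last}))]
    return sum(mids) / len(mids)

enroll = [13055, 13563, 13867, 14696, 15460, 15311, 15603, 15861, 16807]
iv = partition(13000, 17000, 8)
print(round(forecast(enroll, iv)))  # midpoint forecast for the next year
```

The consistency principle discussed above can be probed with this sketch by rerunning `forecast` with larger `n` (shorter intervals) and comparing errors.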

2.
The aim of this paper is to investigate the problem of finding the efficient number of clusters in fuzzy time series. The clustering process has been discussed in the existing literature, and a number of methods have been suggested; however, these methods have several drawbacks, especially the lack of cluster shape and quantity optimization. There are two critical dimensions in fuzzy time series clustering: the selection of a proper interval for the fuzzy clusters and the optimization of the membership degrees among the fuzzy cluster set. Existing methods for interval selection assume that the data have a short-tailed distribution and establish cluster intervals of identical length (e.g. Song and Chissom, 1994; Chen, 1996; Yolcu et al., 2009). However, time series data (particularly in economic research) are rarely short-tailed and mostly follow long-tailed distributions because of boom-bust market behavior. This paper proposes a novel clustering method named histogram damping partition (HDP), which defines sub-clusters on standard-deviation intervals and truncates the histogram of the data with a constraint based on the coefficient of variation. The HDP approach can be used in the clustering stage of many different kinds of fuzzy time series models.
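As a toy illustration of partitioning on standard-deviation intervals, the sketch below cuts the universe at mean plus or minus whole standard deviations and truncates to the observed range so a long tail does not inflate the outer intervals. The specific breakpoints and the role given to the coefficient of variation are assumptions for illustration, not the HDP algorithm itself.

```python
# Toy standard-deviation partition with truncation; NOT the HDP method,
# only an illustration of unequal clusters on sigma intervals.
import statistics

def std_dev_partition(data, k=2):
    """Breakpoints at mean +/- 1..k standard deviations, truncated to the
    observed range; also returns the coefficient of variation as a
    long-tail diagnostic (an assumed usage here)."""
    mu = statistics.fmean(data)
    sd = statistics.pstdev(data)
    cv = sd / mu  # dispersion relative to the mean
    pts = [mu + j * sd for j in range(-k, k + 1)]
    pts = [p for p in pts if min(data) <= p <= max(data)]
    edges = [min(data)] + pts + [max(data)]
    return sorted(set(edges)), cv
```

On a long-tailed sample such as `[1, 2, 3, 4, 5, 100]`, the lower sigma breakpoints fall outside the data range and are dropped, leaving few, wide intervals near the tail, which is the failure mode equal-length partitions suffer from.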

3.
Applying complex, informed priority rules can pose a challenge for traditional operator-driven systems; however, computer-integrated manufacturing systems may benefit significantly from such rules, for example state-dependent priority rules. A state-dependent priority rule can be defined as a list of IF-THEN priority rules that are applied when certain system conditions are satisfied. Here, we propose a genetic algorithm-based learning system for constructing interval-based, state-dependent priority rules for each interval of queue lengths in dynamic job shops. Our approach builds such rules by pairing priority rules with intervals of queue lengths, and determines the priority rules and their corresponding interval lengths for a given objective. A genetic algorithm is developed to match queue-length intervals with appropriate priority rules during simulation, and a system simulation evaluates the efficiency of the resulting rules. The experiments show that the interval-based, state-dependent priority rules obtained by the proposed approach considerably outperform standard priority rules, including shortest processing time (SPT), minimum slack time (MST), earliest due date (EDD), modified due date (MDD), cost over time (COVERT), and critical ratio (CR), on total tardiness for most of the problems.
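The core idea of an interval-based, state-dependent rule list can be sketched as a dispatch table mapping queue-length intervals to priority rules. The interval boundaries and the particular rule assignments below are illustrative assumptions; in the paper they are what the genetic algorithm learns.

```python
# Sketch of IF queue-length-in-interval THEN priority-rule dispatching.
# Interval bounds and rule pairings are hypothetical, not learned values.

def spt(job):   # shortest processing time
    return job["proc"]

def edd(job):   # earliest due date
    return job["due"]

def mst(job):   # minimum slack time
    return job["due"] - job["proc"]

# IF queue length in [lo, hi) THEN rank waiting jobs with this rule.
RULE_TABLE = [((0, 4), edd), ((4, 10), mst), ((10, 10**9), spt)]

def next_job(queue):
    """Pick the next job using the rule active for the current queue length."""
    n = len(queue)
    for (lo, hi), rule in RULE_TABLE:
        if lo <= n < hi:
            return min(queue, key=rule)
    return min(queue, key=spt)  # fallback
```

A genetic algorithm would then encode `RULE_TABLE` (boundaries plus rule indices) as a chromosome and score each candidate table by simulated total tardiness.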

4.
Fuzzy time series have received increasing attention for their notable ability to handle the uncertainty and vagueness inherent in data collection, and many models have been devoted to improving forecasting accuracy or reducing the computational cost of forecasting. However, little research has addressed controlling forecasting uncertainty, partitioning intervals effectively, or achieving consistent accuracy across different interval partitions. To address the shortcomings of existing forecasting models, this paper proposes a new model that enhances forecasting performance and can handle two-factor forecasting problems. In the new model, a fuzzy c-means algorithm is applied to partition the fuzzy time series into intervals; the partitioning takes the properties of the data points into account and produces intervals of unequal size. Finally, simulation experiments on real observed data show that the proposed model outperforms other existing forecasting models in forecasting accuracy.

5.
Fuzzy time series forecasting methods have been applied in several domains, such as stock market prices, temperature, sales, crop production, and academic enrollments. In this paper, we introduce a model for forecasting problems with two factors. The proposed model is designed using fuzzy time series and an artificial neural network. In a fuzzy time series forecasting model, the lengths of the intervals in the universe of discourse always affect the forecasting results; therefore, an artificial neural network-based technique is employed to determine the intervals of the historical time series data sets by clustering them into different groups. The historical time series data sets are then fuzzified, and high-order fuzzy logical relationships are established among the fuzzified values based on the fuzzy time series method. The paper also introduces rules for interval weighting to defuzzify the fuzzified time series data sets. Experimental results show that the proposed model achieves higher accuracy than existing two-factor fuzzy time series models.

6.
In time series databases, most existing similarity-search methods focus on improving the efficiency of the algorithms, while little research addresses similarity search over time series composed of imprecise data, which are often represented as interval numbers. By identifying the important interval numbers in an interval-number time series, its dimensionality can be reduced substantially. For time series composed of interval numbers, this paper proposes an indexing method based on low-resolution clustering. Experiments show that the method speeds up the search over interval-number time series without producing false dismissals.

7.
Using normal distribution assumptions, one can obtain confidence intervals for variance components in a variety of applications. A normal-based interval, which has exact coverage probability under normality, is usually constructed from a pivot so that the endpoints of the interval depend on the data as well as the distribution of the pivotal quantity. Alternatively, one can employ a point estimation technique to form a large-sample (or approximate) confidence interval. A commonly used approach to estimate variance components is the restricted maximum likelihood (REML) method. The endpoints of a REML-based confidence interval depend on the data and the asymptotic distribution of the REML estimator. In this paper, simulation studies are conducted to evaluate the performance of the normal-based and the REML-based intervals for the intraclass correlation coefficient under non-normal distribution assumptions. Simulated coverage probabilities and expected lengths provide guidance as to which interval procedure is favored for a particular scenario. Estimating the kurtosis of the underlying distribution plays a central role in implementing the REML-based procedure. An empirical example is given to illustrate the usefulness of the REML-based confidence intervals under non-normality.

8.
The objective of this study is to explore ways of determining the useful lengths of intervals in fuzzy time series. It is suggested that ratios, instead of equal lengths of intervals, can more properly represent the intervals among observations. Ratio-based lengths of intervals are, therefore, proposed to improve fuzzy time series forecasting. Algebraic growth data, such as enrollments and the stock index, and exponential growth data, such as inventory demand, are chosen as the forecasting targets, before forecasting based on the various lengths of intervals is performed. Furthermore, sensitivity analyses are also carried out for various percentiles. The ratio-based lengths of intervals are found to outperform the effective lengths of intervals, as well as the arbitrary ones in regard to the different statistical measures. The empirical analysis suggests that the ratio-based lengths of intervals can also be used to improve fuzzy time series forecasting.
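The contrast between equal-length and ratio-based intervals can be sketched directly: instead of a fixed width, each boundary grows by a fixed ratio, so intervals widen with the magnitude of the data. The ratio value used below is an assumption for illustration; the paper selects it via percentile-based sensitivity analysis.

```python
# Ratio-based partitioning sketch: boundaries grow geometrically.
# The ratio (10%) is an assumed illustrative value.

def ratio_partition(lo, hi, ratio):
    """Boundaries lo, lo*(1+ratio), lo*(1+ratio)^2, ... capped at hi.
    Requires lo > 0, which suits growth data like enrollments or prices."""
    assert lo > 0 and ratio > 0
    edges = [lo]
    while edges[-1] * (1 + ratio) < hi:
        edges.append(edges[-1] * (1 + ratio))
    edges.append(hi)
    return edges
```

For growth data this keeps the *relative* width of every interval the same, which is the sense in which ratios "more properly represent the intervals among observations".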

9.
In this paper, a novel kind of threshold similarity query is introduced. Given a query time series and its query threshold, the query reports a threshold for the queried time series above which its time intervals are most similar to those of the query series above its threshold, with the similarity between the two time interval sequences required to lie within a user-specified range. We present an efficient two-step method to answer the query. The first step dramatically narrows the search space to a small subspace without false dismissals, and the second searches iteratively within that subspace. In more detail, a lower-bounding distance function is described that guarantees no false dismissals during the first step. Furthermore, we use binary search to quickly locate the solution within the subspace, based on the continuity and monotonicity of the length function of time intervals, both of which are proved in this paper. We applied our method to traffic data and discovered some useful knowledge. We also carried out experiments on diverse time series data to compare our method with a brute-force method. The results were excellent: our method accelerated the search by a factor of 10 to 150.
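The binary-search step exploits the fact that the length function of the time intervals is continuous and monotone in the threshold. As a generic stand-in for the paper's actual query, the sketch below finds the smallest threshold at which a monotonically non-increasing length function drops to a target value; the function and target are hypothetical.

```python
# Generic binary search over a continuous, non-increasing function of the
# threshold; a stand-in illustration, not the paper's query procedure.

def find_threshold(length_of, target, lo, hi, eps=1e-6):
    """Smallest t in [lo, hi] with length_of(t) <= target, assuming
    length_of is non-increasing (so the predicate is monotone in t)."""
    while hi - lo > eps:
        mid = (lo + hi) / 2
        if length_of(mid) <= target:
            hi = mid  # predicate holds; the answer is at mid or below
        else:
            lo = mid  # predicate fails; the answer is above mid
    return hi
```

Monotonicity is what licenses halving the search space at every step, which is where the reported 10x to 150x speedups over brute force come from.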

10.
The prediction of future events is important in many applications. Prediction is based on episode rules, which are composed of events and two time constraints requiring all the events in the episode rule, and all the events in the predicate of the rule, to occur within a time interval, respectively. In an event stream, a sequence of events that matches the predicate of the rule while satisfying the specified time constraint is called an occurrence of the predicate. After finding an occurrence, the consequent event, which will occur within a time interval, can be predicted. However, the time intervals computed from some occurrences can be contained in the time intervals computed from other occurrences and thus become redundant. As a result, designing an efficient and effective event predictor in a stream environment is challenging. In this paper, an effective scheme is proposed to avoid matching the predicate events corresponding to redundant time intervals. Based on this scheme, we consider two methodologies, forward retrieval and backward retrieval, for efficiently matching predicate events over event streams. The forward-retrieval approach constructs a queue structure to incrementally maintain parts of the matched results as events arrive, thereby avoiding backward scans of the event stream. The backward-retrieval approach maintains recently arrived events in a tree structure; the matching of predicate events is triggered by identifiable events and achieved by an efficient retrieval on the tree structure, which avoids exhaustive scans of the arrived events. A series of experiments shows that each of the proposed approaches has its advantages on particular data distributions and parameter settings.

11.
The problem of anomaly detection in time series has received much attention in the past two decades. However, existing techniques either cannot locate where the anomalies lie within an anomalous time series or require users to provide the length of the potential anomalies. To address these limitations, we propose a self-learning online anomaly detection algorithm that automatically identifies anomalous time series as well as the exact locations where the anomalies occur within them. In addition, detecting anomalies in multivariate time series is difficult for several reasons. First, anomalies may occur in only a subset of the dimensions (variables). Second, the locations and lengths of anomalous subsequences may differ across dimensions. Third, some anomalies may look normal in each individual dimension yet be anomalous when combinations of dimensions are considered. To mitigate these problems, we introduce a multivariate anomaly detection algorithm that detects anomalies and identifies the dimensions and locations of the anomalous subsequences. We evaluate our approaches on several real-world datasets, including two CPU manufacturing datasets from Intel, and demonstrate that our approach successfully detects the correct anomalies without requiring any prior knowledge about the data.
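The localization idea, reporting *where* an anomaly occurs rather than only flagging the series, can be illustrated with a deliberately simple baseline: score every sliding window by its distance to the per-position median window and report the worst one. This is a toy stand-in, not the self-learning online algorithm described above; the window length is assumed known here, which is exactly the requirement the paper removes.

```python
# Toy anomaly localization via distance to the median window.
# Illustrative baseline only; the paper's algorithm learns the length.
import statistics

def locate_anomaly(series, w):
    """Return [start, end) of the sliding window of length w that deviates
    most from the element-wise median of all windows."""
    windows = [series[i:i + w] for i in range(len(series) - w + 1)]
    med = [statistics.median(col) for col in zip(*windows)]
    def dist(win):
        return sum((a - b) ** 2 for a, b in zip(win, med))
    scores = [dist(win) for win in windows]
    start = max(range(len(scores)), key=scores.__getitem__)
    return start, start + w
```

A multivariate version would run such scoring per dimension and then combine the scores, which is where the three challenges listed above (subset of dimensions, differing locations, jointly anomalous combinations) come into play.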

12.
In this paper, a new forecasting model for academic enrollments is presented, based on two computational methods: fuzzy time series and particle swarm optimization. Most fuzzy time series forecasting methods model only the global behavior of the series in the past data. To improve forecasting accuracy, the global information of the fuzzy logical relationships is aggregated with the local information of the latest fuzzy fluctuation to obtain the forecast value. A forecasting model based on fuzzy time series and particle swarm optimization is then developed to adjust the lengths of the intervals in the universe of discourse. In an empirical study of forecasting enrollments at the University of Alabama, the experimental results show that the proposed model yields lower forecasting errors than other existing models in both the training and testing phases.

13.
A great deal of research has produced time series models with high forecasting precision at the numerical level. In the real world, however, high numerical precision may not be necessary for human perception, reasoning, and decision-making. A time series model that mirrors the human ability to perceive and process abstract entities (rather than numeric ones) is better suited to some decision-making problems. In this regard, information granules and granular computing play a central role. For example, if the change range (interval) of stock prices over a certain future period is regarded as an information granule, then a model that forecasts such change ranges helps stock investors make reasonable decisions better than one that forecasts a specific numerical stock price. In this paper, we propose a new modeling approach to realize interval prediction, in which the idea of information granules and granular computing is integrated with the classical Chen's method. The proposed method first segments an original numeric time series into a collection of time windows, and then builds fuzzy granules, expressed as fuzzy sets, over each time window by exploiting the principle of justifiable granularity. Finally, a fuzzy granular model is constructed by mining the fuzzy logical relationships of adjacent granules; the constructed model carries out interval prediction through a degranulation operation. Two benchmark time series are used to validate the feasibility and effectiveness of the proposed approach, and the obtained results demonstrate its effectiveness. Moreover, for modeling and prediction of large-scale time series, the proposed approach has the clear advantage of reducing modeling overhead and simplifying forecasting.

14.
The exploration of repeated patterns with different lengths, also called variable-length motifs, has received a great amount of attention in recent years. However, existing algorithms to detect variable-length motifs in large-scale time series are very time-consuming. In this paper, we introduce a time- and space-efficient approximate variable-length motif discovery algorithm, Distance-Propagation Sequitur (DP-Sequitur), for detecting variable-length motifs in large-scale time series data (e.g. over one hundred million in length). The discovered motifs can be ranked by different metrics such as frequency or similarity, and can benefit a wide variety of real-world applications. We demonstrate that our approach can discover motifs in time series with over one hundred million points in just minutes, which is significantly faster than the fastest existing algorithm to date. We demonstrate the superiority of our algorithm over the state-of-the-art using several real world time series datasets.

15.
In this article we derive likelihood-based confidence intervals for the risk ratio using over-reported two-sample binary data obtained using a double-sampling scheme. The risk ratio is defined as the ratio of two proportion parameters. By maximizing the full likelihood function, we obtain closed-form maximum likelihood estimators for all model parameters. In addition, we derive four confidence intervals: a naive Wald interval, a modified Wald interval, a Fieller-type interval, and an Agresti-Coull interval. All four confidence intervals are illustrated using cervical cancer data. Finally, we conduct simulation studies to assess and compare the coverage probabilities and average lengths of the four interval estimators. We conclude that the modified Wald interval, unlike the other three intervals, produces close-to-nominal confidence intervals under various simulation scenarios examined here and, therefore, is preferred in practice.
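For orientation, here is the textbook naive Wald interval for a risk ratio from two independent binomial samples, built on the log scale. This is the ordinary single-sampling version, not the double-sampling estimators derived in the paper, and the inputs below are made-up counts.

```python
# Textbook Wald interval for a risk ratio p1/p2 on the log scale.
# Single-sampling version; the paper's double-sampling variants differ.
import math

def wald_rr_ci(x1, n1, x2, n2, z=1.96):
    """95% (by default) Wald CI for (x1/n1) / (x2/n2); requires x1, x2 > 0."""
    rr = (x1 / n1) / (x2 / n2)
    # Delta-method standard error of log(rr)
    se = math.sqrt(1 / x1 - 1 / n1 + 1 / x2 - 1 / n2)
    return rr * math.exp(-z * se), rr * math.exp(z * se)
```

The paper's simulation comparison asks exactly how far such a naive interval's coverage drifts from nominal once the binary data are over-reported, which is what motivates the modified Wald construction.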

16.
Partitioning the universe of discourse and determining effective intervals are critical for forecasting with fuzzy time series. The equal-length intervals used in most of the existing literature are convenient but subjective. In this paper, we study how to partition the universe of discourse into intervals of unequal length to improve forecasting quality. First, we calculate the prototypes of the data using fuzzy clustering and then form subsets according to the prototypes. An unequal-length partitioning method is proposed, and we show that the resulting intervals carry well-defined semantics. To verify the suitability and effectiveness of the approach, we apply the proposed method to forecasting enrollments of the University of Alabama and monthly values of Germany's DAX stock index. Empirical results show that unequal-length partitioning can greatly improve forecast accuracy. Furthermore, the proposed method is very robust and stable for forecasting with fuzzy time series.
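The step from prototypes to unequal-length intervals can be sketched simply: cut the universe at the midpoints between adjacent prototypes, so each interval means "close to this prototype". Plain sorted prototypes stand in here for the paper's fuzzy clustering output; the numbers are illustrative.

```python
# Unequal intervals from cluster prototypes: cut at midpoints between
# adjacent prototypes. Prototypes here stand in for fuzzy-clustering output.

def intervals_from_prototypes(prototypes, lo, hi):
    """Return intervals covering [lo, hi], one per prototype, separated at
    the midpoints between adjacent prototypes."""
    p = sorted(prototypes)
    cuts = [(a + b) / 2 for a, b in zip(p, p[1:])]
    edges = [lo] + cuts + [hi]
    return list(zip(edges, edges[1:]))
```

Because each interval is the region nearest its prototype, the intervals inherit the semantics of the clusters, which is the "well-defined semantics" claim above.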

17.
An interval time series (ITS) is a time series in which each period is described by an interval. In finance, an ITS can describe the temporal evolution of the high and low prices of an asset. These price intervals are related to the concept of volatility and are worth considering when placing buy or sell orders. This article reviews two approaches to forecasting ITS. The first approach uses univariate or multivariate forecasting methods; the possible cointegrating relation between the high and low values is analyzed for multivariate models, and the equivalence of the VAR models is shown for the minimum and maximum time series, as well as for the center and radius time series. The second approach adapts classic forecasting methods to ITS using interval arithmetic; these methods include exponential smoothing, the k-NN algorithm, and the multilayer perceptron. The performance of these approaches is studied on two financial ITS. As a result, evidence of the predictability of ITS is found, especially in the interval range. This fact opens a new path in volatility forecasting.
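To make the interval-arithmetic adaptation concrete, here is simple exponential smoothing applied to an ITS: the smoothing recursion runs on the lower and upper bounds jointly, which for a scalar alpha in [0, 1] reduces to smoothing each bound (so the bounds cannot cross). The alpha value is an assumption; the article's methods also cover k-NN and multilayer perceptrons, not shown here.

```python
# Simple exponential smoothing on an interval time series via
# interval arithmetic. Alpha is an assumed illustrative value.

def interval_ses(its, alpha=0.5):
    """its: list of (low, high) pairs; returns the one-step-ahead
    forecast interval (low, high)."""
    lo, hi = its[0]
    for l, h in its[1:]:
        lo = alpha * l + (1 - alpha) * lo
        hi = alpha * h + (1 - alpha) * hi
    return lo, hi
```

Forecasting the (low, high) pair directly, rather than a single closing price, is what ties the forecast to the interval range and hence to volatility.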

18.
Many forecasting models based on the concept of fuzzy time series have been proposed in past decades. Two main factors, the lengths of the intervals and the content of the forecast rules, affect the forecasting accuracy of such models, and how to find proper settings for these factors has become an interesting research topic. Some forecasting models combining heuristic methods or evolutionary algorithms (such as genetic algorithms and simulated annealing) with fuzzy time series have been proposed, but their results are not satisfactory. In this paper, we use particle swarm optimization to find proper settings for the two main factors. A new hybrid forecasting model combining particle swarm optimization with fuzzy time series is proposed to improve forecasting accuracy. The experimental results of forecasting enrollments of the University of Alabama show that the new model outperforms existing models and obtains better-quality solutions based on both first-order and high-order fuzzy time series.

19.
Fuzzy time series approaches are used when the observations of a time series contain uncertainty; moreover, they do not require the assumptions needed by traditional time series approaches. Generally, fuzzy time series methods consist of three stages: fuzzification, determination of fuzzy relations, and defuzzification. Artificial intelligence algorithms are frequently used in these stages, with genetic algorithms being the most popular owing to their rich operators and good performance. However, the mutation operator of a genetic algorithm may introduce negative results into the solution set. We therefore propose a modified genetic algorithm that finds optimal interval lengths while controlling the effects of the mutation operator. The results of applying our new approach to real datasets show superior forecasting performance compared with other techniques.

20.
Towards a new approach for mining frequent itemsets on data stream
Mining frequent patterns over streaming data is a challenging new problem for the data mining community, since data arrive sequentially in the form of continuous rapid streams. In this paper, we propose a new approach for mining frequent itemsets. Our approach has the following advantages: an efficient representation of items and a novel data structure for maintaining frequent patterns, coupled with a fast pruning strategy. At any time, users can request the frequent itemsets over an arbitrary time interval. Furthermore, our approach produces an approximate answer with the assurance that it will not bypass user-defined frequency and temporal thresholds. Finally, the proposed method is analyzed through a series of experiments on different datasets.
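The frequency guarantee described above, an approximate answer that never misses a truly frequent item, has the same flavor as classic approximate stream counting. As an illustration of that style of guarantee (not the paper's method), here is the well-known lossy counting algorithm of Manku and Motwani for single items: no item with true frequency at least s*N is missed, and maintained counts undershoot by at most eps*N.

```python
# Lossy counting (Manku & Motwani) for frequent items on a stream.
# Illustrates the approximation guarantee only; NOT the paper's approach.

def lossy_count(stream, eps):
    width = int(1 / eps)             # bucket width
    counts, deltas, bucket = {}, {}, 1
    for n, item in enumerate(stream, 1):
        if item in counts:
            counts[item] += 1
        else:
            counts[item], deltas[item] = 1, bucket - 1  # max undercount
        if n % width == 0:           # prune at each bucket boundary
            for it in [k for k in counts if counts[k] + deltas[k] <= bucket]:
                del counts[it], deltas[it]
            bucket += 1
    return counts, deltas

def frequent(stream, s, eps):
    """Items with estimated frequency above the support threshold s."""
    counts, _ = lossy_count(stream, eps)
    n = len(stream)
    return {it for it, c in counts.items() if c >= (s - eps) * n}
```

Extending this guarantee from single items to itemsets, and supporting queries over arbitrary time intervals, is where the stream-mining problem above becomes hard.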
