Similar literature
 Found 20 similar articles; search took 31 ms
1.
We study how to perform model selection for time series data where millions of candidate ARMA models may be eligible for selection. We propose a feasible computing method based on the Gibbs sampler. With this method, model selection is performed through a random sample generation algorithm, and, given a model of fixed dimension, parameter estimation is done by maximum likelihood. Our method takes into account several computing difficulties encountered in estimating ARMA models. Under some regularity conditions, the method is found to select the best candidate model with probability 1 in the limit. We then propose several empirical rules for implementing our computing method in applications. Finally, a simulation study and an example on modelling China's Consumer Price Index (CPI) data are presented for the purpose of illustration and verification.

2.
Model selection and model combination are general problems in many areas. In particular, when we have several different candidate models and have also gathered a new data set, we want to construct a more accurate and precise model to help predict future events. In this paper, we propose a new data-guided model combination method based on decomposition and aggregation. With the aid of influence diagrams, we analyze the dependence among candidate models and apply latent factors to characterize such dependence. After analyzing model structures in this framework, we derive an optimal composite model. Two widely used data analysis tools, namely Principal Component Analysis (PCA) and Independent Component Analysis (ICA), are applied to extract factors from the class of candidate models. Once the factors are ready, they are sorted and aggregated to produce composite models. During factor aggregation, another important issue, namely factor selection, is also touched on. Finally, a numerical study shows how this method works, and an application using physical data is also presented. Editor: Dan Roth

3.
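Internet Network Traffic Forecasting Based on the FARIMA Model   Cited by: 30 (self-citations: 3, citations by others: 27)
Recent network research has found that Internet traffic exhibits both long-range and short-range dependence, so it is necessary to build a traffic model that can describe and forecast both properties. This paper presents a method for modeling and forecasting with the FARIMA model; experiments show that the method is very effective on real Internet network traces. It also provides a method and concrete steps for simplifying FARIMA model fitting, which greatly shortens model identification time and makes the approach practical for real network forecasting.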

4.
Mixture models are ubiquitous in applied science. In many real-world applications, the number of mixture components needs to be estimated from the data. A popular approach consists of using information criteria to perform model selection. Another approach, which has become very popular over the past few years, consists of using Dirichlet process mixture (DPM) models. Both approaches are computationally intensive. The use of information criteria requires computing the maximum likelihood parameter estimates for each candidate model, whereas DPMs are usually trained using Markov chain Monte Carlo (MCMC) or variational Bayes (VB) methods. We propose here original batch and recursive expectation-maximization algorithms to estimate the parameters of DPMs. The performance of our algorithms is demonstrated on several applications, including image segmentation and image classification tasks. Our algorithms are computationally much more efficient than MCMC and VB and outperform VB on an example.

5.
Clustering problems are central to many knowledge discovery and data mining tasks. However, most existing clustering methods can only work with fixed-dimensional representations of data patterns. In this paper, we study the clustering of data patterns that are represented as sequences or time series, possibly of different lengths. We propose a model-based approach to this problem using mixtures of autoregressive moving average (ARMA) models. We derive an expectation-maximization (EM) algorithm for learning the mixing coefficients as well as the parameters of the component models. To address the model selection problem, we use the Bayesian information criterion (BIC) to determine the number of clusters in the data. Experiments are conducted on a number of simulated and real datasets. The results show that our method compares favorably with previously proposed methods for similar time series clustering tasks.

6.
Discovering the genetic basis of common human diseases will be assisted by large-scale association studies with a large number of individuals and genetic markers, such as single-nucleotide polymorphisms (SNPs). The potential size of the data and the resulting model space require the development of efficient methodology to unravel associations between epidemiological outcomes and SNPs in dense genetic maps. We apply an evolutionary algorithm (EA) to construct models consisting of logic trees. These trees are Boolean expressions involving nodes that contain strings of SNPs in high linkage disequilibrium (LD), that is, SNPs that are highly correlated with each other. At each generation of the algorithm, a population of logic tree models is modified using selection, crossover, and mutation moves. Logic trees are selected for the next generation using a fitness function based on the marginal likelihood in a Bayesian regression framework. Mutation and crossover moves use LD measures to propose changes to the trees, and facilitate the movement through the model space. We demonstrate our method on data from a candidate gene study of quantitative genetic variation.

7.
This study addresses the problem of modeling the variation of the grounding resistance during the year. An AutoRegressive Moving Average (ARMA) model is fitted (off-line) to the provided actual data using the Corrected Akaike Information Criterion (AICC). The developed model is shown to fit the data successfully. Difficulties occur when the provided data include noise or errors, and also when online/adaptive modeling is required. In both cases, and under the assumption that the provided data can be represented by an ARMA model, simultaneous order and parameter estimation of ARMA models in the presence of noise is necessary. In this paper, a new method based on multi-model partitioning theory, which is also applicable to online/adaptive operation, is used to solve this problem. The simulations show that the proposed method selects the correct ARMA model order and estimates the parameters accurately in very few steps, even with a small sample size. For validation purposes, the method is compared with three other established order selection criteria and yields very good results. The proposed method can be extremely useful to electrical engineering designers, since the variation of the grounding resistance during the year significantly affects power system performance and must be taken into account.

8.
Rational transfer functions are standard models for radar targets and adaptive beamforming. Fitting these models essentially involves estimating the transfer function “poles and zeros.” A key preliminary step in this estimation process is to determine the numbers of poles and zeros, or equivalently to determine the order of the corresponding ARMA model. A pattern-based method of order selection using matrix ranks is proposed for input/output (I/O) ARMA models, where ARMA model inputs and outputs are each observed in additive noise with known variances. This I/O ARMA model encompasses two distinct scenarios: observational studies in which all observations, of both inputs and outputs, are subject to error, and controlled experiments in which outputs are observed with error while inputs are known without error. The proposed rank pattern method exploits the eigenvalue structure of the covariance matrices associated with the observed data and performs well for short data records at moderate SNRs.

9.
Optimal Choice of AR and MA Parts in Autoregressive Moving Average Models   Cited by: 2 (self-citations: 0, citations by others: 2)
This paper deals with the Bayesian method of choosing the best model for a given one-dimensional series among a finite number of candidates belonging to autoregressive (AR), moving average (MA), ARMA, and other families. The series could be either a sequence of observations in time, as in speech applications, or a sequence of pixel intensities of a two-dimensional image. The observation set is not restricted to be Gaussian. We first derive an optimum decision rule for assigning the given observation set to one of the candidate models so as to minimize the average probability of error in the decision. We also derive an optimal decision rule so as to minimize the average value of the loss function. Then we simplify the decision rule when the candidate models are different Gaussian ARMA models of different orders. We discuss the consistency of the optimal decision rule and compare it with the other decision rules in the literature for comparing dynamical models.

10.
In this paper we propose a parametric and a non-parametric identification algorithm for dynamic errors-in-variables models. We show that the two-dimensional process composed of the input-output data admits a finite-order ARMA representation. The non-parametric method uses the ARMA structure to compute a consistent estimate of the joint spectrum of the input and the output. A Frisch scheme is then employed to extract an estimate of the joint spectrum of the noise-free input-output data, which in turn is used to estimate the transfer function of the system. The parametric method exploits the ARMA structure to give estimates of the system parameters. The performance of the algorithms is illustrated using the results of a numerical simulation study.

11.
The analysis of relationships among variables in data generating systems is one of the important problems in machine learning. In this paper, we propose an approach for estimating a graphical representation of variables in data generating processes, based on the non-Gaussianity of external influences and an autoregressive moving-average (ARMA) model. The presented model consists of two parts, i.e., a classical structural-equation model for instantaneous effects and an ARMA model for lagged effects in processes, and is estimated through an analysis that uses the non-Gaussianity of the residual processes. Like the recently proposed non-Gaussianity-based method named LiNGAM analysis, estimation by the proposed method has identifiability and consistency. We also address the relation of the structure estimated by our method to Granger causality. Finally, we demonstrate analyses of data containing both instantaneous causality and Granger (temporal) causality using our proposed method, where the datasets for the demonstration cover both artificial and real physical systems.

12.
In order to select the best predictive neural-network architecture from a set of several candidate networks, we propose a general Bayesian nonlinear regression model comparison procedure, based on the maximization of an expected utility criterion. This criterion selects the model under which the training set achieves the highest level of internal consistency, through the predictive probability distribution of each model. The density of this distribution is computed as the model posterior predictive density and is asymptotically approximated from the assumed Gaussian likelihood of the data set and the related conjugate prior density of the parameters. The use of such a conjugate prior allows the analytic calculation of the parameter posterior and predictive posterior densities, in an empirical Bayes-like approach. This Bayesian selection procedure allows us to compare general nonlinear regression models, and in particular feedforward neural networks, in addition to embedded models, as is usual with asymptotic comparison tests.

13.
We present a novel interactive learning‐based method for curating datasets using user‐defined criteria for training and refining Generative Adversarial Networks. We employ a novel batch‐mode active learning strategy to progressively select small batches of candidate exemplars for which the user is asked to indicate whether they match the, possibly subjective, selection criteria. After each batch, a classifier that models the user's intent is refined and subsequently used to select the next batch of candidates. After the selection process ends, the final classifier, trained with limited but adaptively selected training data, is used to sift through the large collection of input exemplars to extract a sufficiently large subset for training or refining the generative model that matches the user's selection criteria. A key distinguishing feature of our system is that we do not assume that the user can always make a firm binary decision (i.e., “meets” or “does not meet” the selection criteria) for each candidate exemplar, and we allow the user to label an exemplar as “undecided”. We rely on a non‐binary query‐by‐committee strategy to distinguish between the user's uncertainty and the trained classifier's uncertainty, and develop a novel disagreement distance metric to encourage a diverse candidate set. In addition, a number of optimization strategies are employed to achieve an interactive experience. We demonstrate our interactive curation system on several applications related to training or refining generative models: training a Generative Adversarial Network that meets user‐defined criteria, adjusting the output distribution of an existing generative model, and removing unwanted samples from a generative model.

14.
The aim of this paper is to propose a new hybrid data mining model, based on a combination of various feature selection and ensemble learning classification algorithms, in order to support the decision-making process. The model is built in several stages. In the first stage, the initial dataset is preprocessed; apart from applying different preprocessing techniques, we paid great attention to feature selection. Five different feature selection algorithms were applied, and their results, based on ROC and accuracy measures of the logistic regression algorithm, were combined using different voting types. We also proposed a new voting method, called if_any, that outperformed all other voting methods, as well as the results of any single feature selection algorithm. In the next stage, four different classification algorithms, including generalized linear model, support vector machine, naive Bayes and decision tree, were applied to the dataset obtained in the feature selection process. These classifiers were combined into eight different ensemble models using the soft voting method. On a real dataset, the experimental results show that the hybrid model based on features selected by the if_any voting method and the ensemble GLM + DT model achieves the highest performance and outperforms all other ensemble and single-classifier models.

15.
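Feature selection algorithms are an important tool for microarray data analysis, and their classification performance and stability are critical to that analysis. To improve both, an ensemble feature selection algorithm for high-dimensional microarray data is proposed, which compensates for the insufficient information carried by a single gene subset. The algorithm first uses the signal-to-noise ratio method to select several discriminative genes; then, for each discriminative gene, it uses the conditional information correlation coefficient to evaluate the correlation between candidate genes and that gene, generating multiple correlated gene subsets; finally, it integrates the multiple similar gene subsets through ensemble learning. Experimental results show that the classification performance and stability of the proposed ensemble feature selection algorithm are, in most cases, better than those of methods that select only a single gene subset.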

16.
We propose a sequential test procedure for detecting transients in a stochastic process that can be expressed as an autoregressive moving average (ARMA) model. Preliminary analysis shows that if an ARMA(p,q) time series exhibits a transient behavior, then its residuals behave as an ARMA(Q,Q) process, where Qp + q. Based on this fact, we derive a new sequential test to determine when a transient behavior occurs in a given ARMA time series. Simulation experiments conducted in this study show that the proposed test can detect the occurrence of a transient in the ARMA model. We also apply the proposed method to detect transient changes in the pH of an erythromycin salt.

17.
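Distributed dynamic trust models have been widely studied as an access management mechanism suited to cloud computing environments. However, many existing trust models neglect to evaluate the reliability of trust data, so the model fails when recommendation trust is unreliable. To address this problem, this paper proposes DDTM-TR, a new distributed dynamic trust management model that takes trust reliability into account. DDTM-TR first uses reliability to evaluate trust, reducing the influence of unreliable data on the computation of direct trust, recommendation trust, and comprehensive trust. It then selects several candidate nodes, computes their comprehensive trust, and randomly chooses a candidate node to interact with, using the computed comprehensive trust as the selection probability. Finally, after the interaction ends, it corrects the node's reliability according to feedback on interaction satisfaction. Simulation experiments show that DDTM-TR outperforms the compared models in handling both malicious services and malicious recommendations, and that its feedback algorithm further reduces the failure rate of its judgments.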

18.
19.
A procedure is proposed for computing the autocovariances and the ARMA representations of the squares, and higher-order powers, of Markov-switching GARCH models. It is shown that many interesting subclasses of the general model can be discriminated in view of their autocovariance structures. Explicit derivation of the autocovariances allows for parameter estimation in the general model, via a GMM procedure. It can also be used to determine how many ARMA representations are needed to identify the Markov-switching GARCH parameters. A Monte Carlo study and an application to the Standard & Poor index are presented.

20.
Migrating organisational services, data and applications to the cloud is an important strategic decision for organisations, due to the large number of benefits introduced by the use of cloud computing, such as cost reduction and on-demand resources. Despite these benefits, however, there are challenges and risks for cloud adoption related to (amongst others) data leakage, insecure APIs and shared technology vulnerabilities. These challenges need to be understood and analysed in the context of an organisation’s security and privacy goals and relevant cloud computing deployment models. Although the literature provides a large number of references to works that consider cloud computing security issues, no work has been provided, to our knowledge, that supports the elicitation of security and privacy requirements and the selection of an appropriate cloud deployment model based on such requirements. This work contributes towards filling this gap. In particular, we propose a requirements engineering framework to support the elicitation of security and privacy requirements and the selection of an appropriate deployment model based on the elicited requirements. Our framework provides a modelling language that builds on concepts from requirements, security, privacy and cloud engineering, and a systematic process. We use a real case study, based on the Greek National Gazette, to demonstrate the applicability of our work.
