Similar Documents
1.
The statistical properties of the likelihood ratio test statistic (LRTS) for mixture-of-experts models are addressed in this paper. This question is essential when estimating the number of experts in the model. Our purpose is to extend the existing results for simple mixture models (Liu and Shao, 2003 [8]) and mixtures of multilayer perceptrons (Olteanu and Rynkiewicz, 2008 [9]). We first study a simple example that embodies all the difficulties arising in such models. We find that in the most general case the LRTS diverges but, with additional assumptions, its behavior can be fully characterized.
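As an illustrative sketch only (not the authors' construction), the code below computes an LRTS for a two-component Gaussian mixture against a single Gaussian, fitting the mixture with a few EM steps. All function names, parameter choices, and the synthetic data are hypothetical.

```python
import math
import random

def loglik_single(xs):
    """Log-likelihood of the single-Gaussian MLE fit."""
    n = len(xs)
    mu = sum(xs) / n
    var = sum((x - mu) ** 2 for x in xs) / n
    return sum(-0.5 * math.log(2 * math.pi * var)
               - (x - mu) ** 2 / (2 * var) for x in xs)

def loglik_mixture(xs, pi, mu1, mu2, var):
    """Log-likelihood of a 2-component equal-variance Gaussian mixture."""
    return sum(math.log(
        pi * math.exp(-(x - mu1) ** 2 / (2 * var)) +
        (1 - pi) * math.exp(-(x - mu2) ** 2 / (2 * var)))
        - 0.5 * math.log(2 * math.pi * var) for x in xs)

def em_mixture(xs, iters=50):
    """A few EM iterations for the 2-component mixture."""
    n = len(xs)
    mu = sum(xs) / n
    var = sum((x - mu) ** 2 for x in xs) / n
    sd = math.sqrt(var)
    pi, mu1, mu2 = 0.5, mu - 0.5 * sd, mu + 0.5 * sd
    for _ in range(iters):
        # E-step: responsibility of component 1 for each point
        r = []
        for x in xs:
            a = pi * math.exp(-(x - mu1) ** 2 / (2 * var))
            b = (1 - pi) * math.exp(-(x - mu2) ** 2 / (2 * var))
            r.append(a / (a + b))
        # M-step: re-estimate weights, means, and shared variance
        s = sum(r)
        pi = s / n
        mu1 = sum(ri * x for ri, x in zip(r, xs)) / s
        mu2 = sum((1 - ri) * x for ri, x in zip(r, xs)) / (n - s)
        var = sum(ri * (x - mu1) ** 2 + (1 - ri) * (x - mu2) ** 2
                  for ri, x in zip(r, xs)) / n
    return pi, mu1, mu2, var

random.seed(0)
xs = ([random.gauss(0, 1) for _ in range(200)] +
      [random.gauss(3, 1) for _ in range(200)])
pi, mu1, mu2, var = em_mixture(xs)
lrts = 2 * (loglik_mixture(xs, pi, mu1, mu2, var) - loglik_single(xs))
```

On this well-separated synthetic sample, the LRTS is large and positive; under the null (a single component), its limiting behavior is precisely what the paper studies.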

2.

In the analysis and prediction of real-world systems, two of the key problems are nonstationarity (often in the form of switching between regimes) and overfitting (particularly serious for noisy processes). This article addresses these problems using gated experts, consisting of a nonlinear gating network and several competing experts, also nonlinear. Each expert learns to predict the conditional mean, and each expert adapts its width to match the noise level in its regime. The gating network learns to predict the probability of each expert given the input. This article focuses on the case where the gating network bases its decision on information from the inputs. This can be contrasted to hidden Markov models, where the decision is based on the previous state (i.e., on the output of the gating network at the previous time step), as well as to averaging over several predictors. In contrast, gated experts softly partition the input space. This article discusses the underlying statistical assumptions, derives the weight update rules, and compares the performance of gated experts to standard methods on three time series: (1) a computer-generated series obtained by randomly switching between two nonlinear processes; (2) a time series from the Santa Fe Time Series Competition (the light intensity of a laser in a chaotic state); and (3) the daily electricity demand of France (a real-world multivariate problem with structure on several timescales). The main results are: (1) the gating network correctly discovers the different regimes of the process; (2) the widths associated with each expert are important for the segmentation task and can be used to characterize the subprocesses; and (3) there is less overfitting compared to single networks (homogeneous multilayer perceptrons), since the experts learn to match their variances to the local noise levels. This can be viewed as matching the local complexity of the model to the local complexity of the data.
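A minimal sketch of the gating idea, assuming linear gate logits and two hypothetical linear experts (not the article's architecture): the gate maps input-dependent logits to probabilities via a softmax, and the forecast is the probability-weighted sum of the expert means.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def gated_prediction(x, gate_params, experts):
    """Mixture forecast: gate probabilities depend on the input x,
    and the prediction is the gate-weighted sum of expert means."""
    logits = [w * x + b for (w, b) in gate_params]
    probs = softmax(logits)
    means = [f(x) for f in experts]
    return sum(p * m for p, m in zip(probs, means)), probs

# Two hypothetical experts specialized in different regimes of x.
experts = [lambda x: 2.0 * x,           # regime 1: steep linear map
           lambda x: 0.5 * x + 1.0]     # regime 2: shallow linear map
gate_params = [(4.0, 0.0), (-4.0, 0.0)]  # gate favors expert 1 for x > 0

yhat, probs = gated_prediction(1.0, gate_params, experts)
```

The combined forecast is always a convex combination of the expert means; the per-expert widths described in the abstract would additionally scale each expert's contribution to the likelihood.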

3.
Jiang W. Neural Computation, 2000, 12(6): 1293-1301
The mixtures-of-experts (ME) methodology provides a tool for classification when experts of logistic regression models or Bernoulli models are mixed according to a set of local weights. We show that the Vapnik-Chervonenkis (VC) dimension of the ME architecture is bounded below by the number of experts m and above by O(m^4 s^2), where s is the dimension of the input. For mixtures of Bernoulli experts with a scalar input, we show that the lower bound m is attained, in which case we obtain the exact result that the VC dimension is equal to the number of experts.

4.
Forecasting air-pollutant levels is an important issue, due to their adverse effects on public health, and often a legislative necessity. The advantage of Bayesian methods is their ability to provide density predictions, which can easily be transformed into ordinal or binary predictions given a set of thresholds. We develop a Bayesian approach to forecasting PM\(_{10}\) and O\(_3\) levels that efficiently deals with large numbers of input parameters, and test whether it outperforms classical models and experts. The new approach is used to fit models for PM\(_{10}\) and O\(_3\) level forecasting that can be used in daily practice. We also introduce a novel approach for comparing models to experts based on estimated cost matrices. The results for diverse air quality monitoring sites across Slovenia show that Bayesian models outperform classical models in both PM\(_{10}\) and O\(_3\) predictions. The proposed models perform better than experts in PM\(_{10}\) and are on par with experts in O\(_3\) predictions (where experts already base their predictions on the output of a statistical model). A Bayesian approach, especially using Gaussian processes, offers several advantages: superior performance, robustness to overfitting, more information, and the ability to efficiently adapt to different cost matrices.
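A small illustration of the cost-matrix comparison idea: given a forecaster's confusion matrix and a matrix of misclassification costs, the expected cost per forecast is a single comparable number. The confusion counts, class labels, and costs below are invented for the example and are not taken from the paper.

```python
def expected_cost(confusion, costs):
    """Average misclassification cost.
    confusion: counts with rows = true class, cols = predicted class.
    costs: per-cell cost matrix of the same shape."""
    total = sum(sum(row) for row in confusion)
    return sum(confusion[i][j] * costs[i][j]
               for i in range(len(confusion))
               for j in range(len(confusion[i]))) / total

# Hypothetical 2-class setting: non-exceedance vs. exceedance of a
# PM10 threshold; missing an exceedance is assumed far more costly.
confusion = [[80, 10],   # true non-exceedance: 80 correct, 10 false alarms
             [5, 5]]     # true exceedance: 5 missed, 5 caught
costs = [[0, 1],
         [10, 0]]

cost = expected_cost(confusion, costs)
```

Two forecasters (e.g., a model and a human expert) can then be ranked by this expected cost under any cost matrix of interest.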

5.
A novel class of nonlinear models is studied based on local mixtures of autoregressive Poisson time series. The proposed model has the following construction: at any given time period, there exist a certain number of Poisson regression models, denoted as experts, where the vector of covariates may include lags of the dependent variable. Additionally, the existence of a latent multinomial variable is assumed, whose distribution depends on the same covariates as the experts. The latent variable determines which Poisson regression is observed. This structure is a special case of the mixtures-of-experts class of models, which is considerably flexible in modelling the conditional mean function. A formal treatment of conditions to guarantee the asymptotic normality of the maximum likelihood estimator is presented, under both stationarity and nonstationarity. The performance of common model selection criteria in selecting the number of experts is explored via Monte Carlo simulations. Finally, an application to a real data set is presented, in order to illustrate the ability of the proposed structure to flexibly model the conditional distribution function.
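To make the construction concrete, here is a simulation sketch of such a model with two hypothetical Poisson experts and a multinomial-logit gate; the parameter values, the choice of covariate, and all helper names are assumptions for illustration, not the paper's specification.

```python
import math
import random

def sample_poisson(rng, lam):
    """Poisson draw by multiplicative inversion (fine for moderate lam)."""
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

def simulate_poisson_me(T, betas, gammas, seed=0):
    """Simulate a mixture of autoregressive Poisson experts.
    betas[k] and gammas[k] are (intercept, slope) pairs for expert k's
    log-mean and for the gate's multinomial logit; the shared covariate
    is log(1 + lagged count)."""
    rng = random.Random(seed)
    ys = [1]
    for _ in range(T - 1):
        x = math.log1p(ys[-1])
        # Gate: multinomial logit over experts, evaluated at the covariate.
        logits = [g0 + g1 * x for (g0, g1) in gammas]
        m = max(logits)
        exps = [math.exp(l - m) for l in logits]
        total = sum(exps)
        probs = [e / total for e in exps]
        # Draw the latent expert index from the gate distribution.
        u, k, acc = rng.random(), 0, probs[0]
        while u > acc and k < len(probs) - 1:
            k += 1
            acc += probs[k]
        # Observe a count from the selected expert's Poisson regression.
        lam = math.exp(betas[k][0] + betas[k][1] * x)
        ys.append(sample_poisson(rng, lam))
    return ys

ys = simulate_poisson_me(200,
                         betas=[(0.2, 0.5), (1.5, 0.0)],
                         gammas=[(0.0, 1.0), (0.0, -1.0)])
```

Fitting such a model by maximum likelihood, and the asymptotics of that estimator, are what the abstract's formal treatment addresses.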

6.
The importance of medical image segmentation is increasing in fields like treatment planning and computer-aided diagnosis. For high-quality automatic segmentations, algorithms based on statistical shape models (SSMs) are often used. They segment the image in an iterative way. However, segmentation experts and other users can only assess the final segmentation results, as the segmentation is performed in a "black box" manner. Users cannot gain deeper knowledge of how a (possibly bad) output was produced. Moreover, they do not see whether the final output is the result of a stabilized process. We present a novel Visual Analytics method which offers this desired deeper insight into the image segmentation. Our approach combines interactive visualization and automatic data analysis. It allows the expert to assess the quality development (convergence) of the model at both the global (full organ) and local (organ areas, landmarks) level. Thereby, local patterns in time and space, e.g., non-converging parts of the organ during the segmentation, can be identified. Localizing and specifying such problems helps the experts who create segmentation algorithms to identify algorithm drawbacks, and may point out ways to improve the algorithms systematically. We apply our approach to real-world data, showing its usefulness for the analysis of the segmentation process with statistical shape models.

7.
In computer vision applications, models are often used to gain information about real-world objects. In order to determine model parameters that match the image content, displacement experts serve as an update function to refine initial model parameter estimations. However, building robust displacement experts is a non-trivial task, especially in unconstrained environments. Therefore, we provide the fitting algorithm not only with the original image but with a multi-band image representation that reflects the location of several facial components. To demonstrate its robustness in real-world scenarios, we integrate the Labeled Faces In The Wild database, which consists of images that have been taken outside lab environments.

8.
We consider a class of nonlinear models based on mixtures of local autoregressive time series. At any given time point, we have a certain number of linear models, denoted as experts, where the vector of covariates may include lags of the dependent variable. Additionally, we assume the existence of a latent multinomial variable, whose distribution depends on the same covariates as the experts, that determines which linear process is observed. This structure, denoted as mixture-of-experts (ME), is considerably flexible in modeling the conditional mean function, as shown by Jiang and Tanner. We present a formal treatment of conditions to guarantee the asymptotic normality of the maximum likelihood estimator (MLE), under stationarity and nonstationarity, and under correct model specification and model misspecification. The performance of common model selection criteria in selecting the number of experts is explored via Monte Carlo simulations. Finally, we present applications to simulated and real data sets, to illustrate the ability of the proposed structure to model not only the conditional mean, but also the whole conditional density.

9.
Important decisions are often based on a distributed process of information processing, from a knowledge base that is itself distributed among agents. The simplest such situation is that where a decision-maker seeks the recommendations of experts. Because experts may have vested interests in the consequences of their recommendations, decision-makers usually seek the advice of experts they trust. Trust, however, is a commodity that is usually built through repeated face time and social interaction and thus cannot easily be built in a global world where we have immediate internet access to a vast pool of experts. In this article, we integrate findings from experimental psychology and formal tools from Artificial Intelligence to offer a preliminary roadmap for solving the problem of trust in this computer-mediated environment. We conclude the article by considering a diverse array of extended applications of such a solution.

10.
Finite mixture models have been applied for different computer vision, image processing and pattern recognition tasks. The majority of the work done concerning finite mixture models has focused on mixtures for continuous data. However, many applications involve and generate discrete data for which discrete mixtures are better suited. In this paper, we investigate the problem of discrete data modeling using finite mixture models. We propose a novel, well motivated mixture that we call the multinomial generalized Dirichlet mixture. The novel model is compared with other discrete mixtures. We designed experiments involving spatial color image databases modeling and summarization, and text classification to show the robustness, flexibility and merits of our approach.

11.
Computers & Geosciences, 2006, 32(8): 1040-1051
Conventional statistical methods are often ineffective for evaluating spatial regression models. One reason is that spatial regression models usually have more parameters or smaller sample sizes than a simple model, so their degrees of freedom are reduced. Thus, it is often impractical to evaluate them with traditional tests. Another reason, theoretically associated with statistical methods, is that statistical criteria depend crucially on assumptions such as normality, independence, and homogeneity. This may create problems because the assumptions themselves are open to question. In view of these problems, this paper proposes an alternative empirical evaluation method. To illustrate the idea, a few hedonic regression models for a house and land price data set are evaluated, including a simple ordinary linear regression model and three spatial models. Their performance in predicting the price of house and land is examined. With a cross-validation technique, the price at each sample point is predicted with a model estimated from all samples except the one in question. Empirical criteria are then established whereby the predicted prices are compared with the real, observed prices. The proposed method provides objective guidance for selecting a suitable model specification for a data set. Moreover, the method can be seen as an alternative way to test the significance of the spatial relationships in spatial regression models.
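The cross-validation procedure described above can be sketched as follows, using leave-one-out prediction for a toy univariate regression; the data and the two competing models are illustrative, not the paper's hedonic specifications.

```python
import random

def fit_ols(xs, ys):
    """Closed-form simple linear regression (one predictor)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys)) /
         sum((x - mx) ** 2 for x in xs))
    a = my - b * mx
    return a, b

def loo_mse(xs, ys, predict_factory):
    """Leave-one-out cross-validation: refit on all points but one,
    predict the held-out point, and average the squared errors."""
    errs = []
    for i in range(len(xs)):
        xtr = xs[:i] + xs[i + 1:]
        ytr = ys[:i] + ys[i + 1:]
        pred = predict_factory(xtr, ytr)
        errs.append((pred(xs[i]) - ys[i]) ** 2)
    return sum(errs) / len(errs)

def linear_factory(xtr, ytr):
    a, b = fit_ols(xtr, ytr)
    return lambda x: a + b * x

def mean_factory(xtr, ytr):
    m = sum(ytr) / len(ytr)
    return lambda x: m

# Synthetic data with a strong linear signal.
random.seed(1)
xs = [i / 10 for i in range(30)]
ys = [2.0 * x + 1.0 + random.gauss(0, 0.1) for x in xs]

mse_linear = loo_mse(xs, ys, linear_factory)
mse_mean = loo_mse(xs, ys, mean_factory)
```

Comparing out-of-sample errors in this way needs no distributional assumptions, which is exactly the appeal of the empirical criterion over traditional tests.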

12.
This paper is about developing a group user model able to predict unknown features (attributes, preferences, or behaviors) of any interlocutor, specifically for systems where there are features that cannot be modeled by a domain expert within the human-computer interaction. In such cases, statistical models are applied instead of stereotype user models. The time consumption of these models is high, and when a bounded-response-time requirement is added, the most common solution involves summarizing knowledge. Summarization involves deleting knowledge from the knowledge base, and probably losing accuracy in the medium term. This proposal provides all the advantages of statistical user models and avoids knowledge loss by using an R-Tree structure and various search spaces (universes of users) of diverse granularity to solve inferences with enhanced success rates. Along with the formalization and evaluation of the approach, its main advantages are discussed, and a perspective on its future evolution is provided. In addition, this paper provides a framework to evaluate statistical user models and to enable performance comparison among different statistical user models.

13.
On a pattern-oriented model for intrusion detection
Operational security problems, which are often the result of access authorization misuse, can lead to intrusion in secure computer systems. We motivate the need for pattern-oriented intrusion detection, and present a model that tracks both data and privilege flows within secure systems to detect context-dependent intrusions caused by operational security problems. The model allows the uniform representation of various types of intrusion patterns, such as those caused by unintended use of foreign programs and input data, imprudent choice of default privileges, and use of weak protection mechanisms. As with all pattern-oriented models, this model cannot be used to detect new, unanticipated intrusion patterns that could be detected by statistical models. For this reason, we expect that this model will complement, not replace, statistical models for intrusion detection.
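As a deliberately simplified stand-in for the idea (the actual model tracks data and privilege flows; the event strings and the escalation pattern below are invented), a pattern can be checked as an ordered subsequence of an audit trail:

```python
def matches_pattern(events, pattern):
    """Return True if the pattern occurs as an ordered subsequence
    of the event list -- a toy proxy for matching a suspicious
    data/privilege flow against an audit trail."""
    it = iter(events)
    # 'step in it' advances the iterator, so order is enforced.
    return all(step in it for step in pattern)

# Hypothetical audit events and a hypothetical escalation pattern:
# a file is written, made setuid, then executed.
audit = ["login:alice", "write:/tmp/run.sh",
         "chmod+s:/tmp/run.sh", "exec:/tmp/run.sh"]
benign = ["login:alice", "read:/etc/motd", "logout:alice"]
pattern = ["write:/tmp/run.sh", "chmod+s:/tmp/run.sh", "exec:/tmp/run.sh"]
```

A real pattern-oriented detector would, as the abstract notes, represent such flows uniformly and in a context-dependent way rather than by literal string matching.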

14.
In this paper the performability analysis of fault-tolerant computer systems using a hierarchical decomposition technique is presented. A special class of queueing network (QN) models, the so-called BCMP networks [4], and generalized stochastic Petri nets (GSPN) [1], which are often used to model performance and reliability separately, have been combined in order to preserve the best modelling features of both.

A conceptual model is decomposed into GSPN and BCMP submodels, which are solved in isolation. Then, the remaining GSPN portion of the model is aggregated with flow-equivalents of the BCMP models, in order to compute performability measures. The BCMP submodels are replaced by simple GSPN constructs, thereby preserving the first and second moments of the throughput. A simple example of a data communication system in which failed transmissions are corrected is presented.


15.
16.
Domain experts typically have detailed knowledge of the concepts that are used in their domain; however, they often lack the technical skills needed to translate that knowledge into model-driven engineering (MDE) idioms and technologies. Flexible or bottom-up modelling has been introduced to assist with the involvement of domain experts by promoting the use of simple drawing tools. In traditional MDE the engineering process starts with the definition of a metamodel, which is used for the instantiation of models. In bottom-up MDE example models are defined at the beginning, letting the domain experts and language engineers focus on expressing the concepts rather than spending time on technical details of the metamodelling infrastructure. The metamodel is then created manually or inferred automatically. The flexibility that bottom-up MDE offers comes at the cost of having nodes in the example models left untyped. As a result, concepts that might be important for the definition of the domain may be ignored, while the example models cannot be adequately re-used in future iterations of the language definition process. In this paper, we propose a novel approach that assists in the inference of the types of untyped model elements using Constraint Programming. We evaluate the proposed approach on a number of example models to measure the performance of the prediction mechanism and the benefits it offers. The reduction in the effort needed to complete the missing types reaches up to 91.45% compared to the scenario where the language engineers had to identify and complete the types without guidance.

17.
Modern computer graphics applications usually require high-resolution object models for realistic rendering. However, it is expensive and difficult to deform such models in real time. In order to reduce the computational cost during deformations, a dense model is often manipulated through a simplified structure, called a cage, which envelops the model. However, cages are usually built interactively by users, which is tedious and time-consuming. In this paper, we introduce a novel method that can build cages automatically for both 2D polygons and 3D triangular meshes. The method consists of two steps: 1) simplifying the input model with quadric error metrics and quadratic programming to build a coarse cage; 2) removing the self-intersections of the coarse cage with Delaunay partitions. With this new method, a user can build a cage to envelop an input model either entirely or partially, with the approximate vertex number the user specifies. Experimental results show that, compared to other cage-building methods with the same number of vertices, cages built by our method are more similar to the input models. Thus, the dense models can be manipulated with higher accuracy through our cages.

18.
Autoregressive-moving average (ARMA) models are often used for the purpose of forecasting a time series. As an aid to choosing a model, use is made of the autocorrelation function, which is estimated from the data. If the only interest in the model is for forecasting purposes, then it is not necessary to compute the autocorrelation function associated with the chosen model. For this reason, a method for computing the autocorrelation function is not usually included in the software used for identifying ARMA models. However, there are applications of ARMA models where it is important to compute the autocovariance function.

This paper contains an algorithm and a listing of a FORTRAN program that computes the autocovariance directly from the solution of the difference equations that govern its behavior.
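For the simplest special case, the difference-equation approach can be illustrated with an AR(1) process, where the autocovariances satisfy gamma_k = phi * gamma_{k-1} with gamma_0 = sigma^2 / (1 - phi^2). This is only a sketch of the principle, not the paper's FORTRAN algorithm for general ARMA models.

```python
def ar1_autocovariance(phi, sigma2, max_lag):
    """Autocovariances of the AR(1) process y_t = phi*y_{t-1} + e_t,
    Var(e_t) = sigma2, obtained by iterating the difference equation
    gamma_k = phi * gamma_{k-1} from gamma_0 = sigma2 / (1 - phi**2)."""
    gamma = [sigma2 / (1 - phi ** 2)]
    for _ in range(max_lag):
        gamma.append(phi * gamma[-1])
    return gamma

g = ar1_autocovariance(0.6, 1.0, 5)
```

The recursion reproduces the closed form gamma_k = phi**k * gamma_0; for general ARMA(p, q) models the same idea requires solving the first max(p, q+1) equations jointly before the pure recursion takes over.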


19.
Spam has become a major issue in computer security because it is a channel for threats such as computer viruses, worms, and phishing. More than 85% of received e-mails are spam. Historical approaches to combating these messages, including simple techniques such as sender blacklisting or the use of e-mail signatures, are no longer completely reliable. Currently, many solutions feature machine-learning algorithms trained using statistical representations of the terms that usually appear in e-mails. Still, these methods are merely syntactic and are unable to account for the underlying semantics of terms within the messages. In this paper, we explore the use of semantics in spam filtering by representing e-mails with a recently introduced Information Retrieval model: the enhanced Topic-based Vector Space Model (eTVSM). This model is capable of representing linguistic phenomena using a semantic ontology. Based upon this representation, we apply several well-known machine-learning models and show that the proposed method can detect the internal semantics of spam messages.
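As a baseline illustration of the purely syntactic, term-based approach the paper improves upon (not the eTVSM itself), here is a tiny multinomial naive Bayes filter over bag-of-words counts; the training messages and labels are invented.

```python
import math
from collections import Counter

def train_nb(docs, labels):
    """Multinomial naive Bayes over word counts (labels: 1=spam, 0=ham)."""
    counts = {0: Counter(), 1: Counter()}
    class_n = Counter(labels)
    for doc, y in zip(docs, labels):
        counts[y].update(doc.lower().split())
    vocab = set(counts[0]) | set(counts[1])
    return counts, class_n, vocab

def predict_nb(model, doc):
    """Pick the class with the higher log-posterior, Laplace-smoothed."""
    counts, class_n, vocab = model
    total = sum(class_n.values())
    scores = {}
    for y in (0, 1):
        score = math.log(class_n[y] / total)          # log prior
        denom = sum(counts[y].values()) + len(vocab)  # smoothed normalizer
        for w in doc.lower().split():
            score += math.log((counts[y][w] + 1) / denom)
        scores[y] = score
    return max(scores, key=scores.get)

docs = ["cheap pills buy now", "win money now",
        "meeting agenda attached", "project review notes"]
labels = [1, 1, 0, 0]
model = train_nb(docs, labels)
```

Such a filter sees only surface terms; a semantically equivalent rewording using unseen synonyms defeats it, which is the motivation for ontology-based representations like the eTVSM.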

20.
We have been developing interactive computer software for the systematic support of modeling and simulation of intelligent control systems, based on a human-friendly systems methodology. The support system has universal application in data analysis, system structuring, statistical and fuzzy modeling, and simulation, with the aid of human-computer interfaces that acquire knowledge or judgments from domain experts. This paper presents our soft systems methodology and its implementation on the computer to develop intelligent process control systems. New technical proposals include a modeling method for fuzzy implication inference models and a design method for model predictive controllers.
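A minimal sketch of rule-based fuzzy inference in the spirit described, using zero-order Sugeno rules with triangular memberships rather than the paper's fuzzy implication models; the rule shapes and outputs are invented for illustration.

```python
def tri(a, b, c):
    """Triangular membership function with support [a, c] and peak at b."""
    def mu(x):
        if x <= a or x >= c:
            return 0.0
        return (x - a) / (b - a) if x <= b else (c - x) / (c - b)
    return mu

def sugeno_infer(x, rules):
    """Zero-order Sugeno inference: each rule is (membership, output);
    the crisp output is the firing-strength-weighted average."""
    weights = [mu(x) for mu, _ in rules]
    total = sum(weights)
    if total == 0:
        return 0.0
    return sum(w * out for w, (_, out) in zip(weights, rules)) / total

# Hypothetical controller: map a temperature error to a valve command.
rules = [(tri(-10, -5, 0), -1.0),   # "error negative -> close valve"
         (tri(-5, 0, 5), 0.0),      # "error near zero -> hold"
         (tri(0, 5, 10), 1.0)]      # "error positive -> open valve"

u = sugeno_infer(2.5, rules)
```

Overlapping memberships make the control surface vary smoothly with the input, which is the practical appeal of fuzzy modeling in process control.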
