Similar literature
 Found 20 similar documents (search time: 31 ms)
1.
The geostatistical modeling of continuous variables relies heavily on the multivariate Gaussian distribution. It is remarkably tractable. The multivariate Gaussian distribution is adopted for K multiple variables (often K is between 2 and 10) and for N multiple locations (often N is in the tens of millions). Our focus is on the relationship between the K variables. Each variable is transformed to be univariate Gaussian, but the multivariate nature of the data is not necessarily Gaussian after univariate transformation. If multiple data variables are deemed non-Gaussian, then additional steps need to be taken such as linearization by alternating conditional expectation (ACE) or multivariate transformation by the stepwise conditional transformation (SCT). Although all L-variate distributions (1 < L ≤ K) should be checked, the bivariate distributions are practically important; there are relatively few data in practice to investigate higher order distributions. A quantitative measure of departure from the bivariate Gaussian distribution is established based on quadrants and the distribution of differences from the theoretically expected distribution. Although approximate, the measure of departure is useful for comparing different distributions and guiding the geostatistician to look closer at some data variables. A scatnscores program is shown that will plot all K(K–1)/2 bivariate cross plots associated with K variables. The correlation coefficients, number of data, degree of departure from the bivariate Gaussian distribution, and bivariate Gaussian probability contours associated with specified cumulative probabilities are shown. The data ID numbers can also be shown to help identify outlier or problematic data.
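The quadrant idea in this abstract can be illustrated with a rough sketch. It is not the paper's measure or the scatnscores program: it only compares the observed share of normal-score pairs in each sign quadrant with the share a standard bivariate Gaussian would give for the same correlation, using the standard result P(X > 0, Y > 0) = 1/4 + arcsin(ρ)/(2π). All function names are hypothetical, and the data are assumed to be normal scores centred on zero.

```python
import math
import random


def pearson_correlation(pairs):
    """Sample Pearson correlation of a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    return sxy / math.sqrt(sxx * syy)


def expected_quadrant_probs(rho):
    """Quadrant probabilities of a standard bivariate Gaussian with correlation rho."""
    same = 0.25 + math.asin(rho) / (2.0 * math.pi)  # quadrants (+,+) and (-,-)
    diff = 0.5 - same                               # quadrants (+,-) and (-,+)
    return {"++": same, "--": same, "+-": diff, "-+": diff}


def observed_quadrant_fracs(pairs):
    """Fraction of pairs falling in each sign quadrant (split at zero)."""
    counts = {"++": 0, "--": 0, "+-": 0, "-+": 0}
    for x, y in pairs:
        key = ("+" if x >= 0 else "-") + ("+" if y >= 0 else "-")
        counts[key] += 1
    return {k: c / len(pairs) for k, c in counts.items()}


def quadrant_departure(pairs):
    """Sum of absolute differences between observed and expected quadrant shares.

    An illustrative score only (assumption); the paper's measure may differ.
    """
    expected = expected_quadrant_probs(pearson_correlation(pairs))
    observed = observed_quadrant_fracs(pairs)
    return sum(abs(observed[k] - expected[k]) for k in expected)


def sample_bivariate_gaussian(n, rho, rng):
    """Draw n pairs from a standard bivariate Gaussian with correlation rho."""
    pairs = []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        pairs.append((z1, rho * z1 + math.sqrt(1.0 - rho ** 2) * z2))
    return pairs
```

For truly bivariate Gaussian pairs the score should be close to zero; larger values would only suggest that a pair of variables deserves a closer look.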

2.
The bivariate distributions are useful in simultaneous modeling of two random variables. These distributions provide a way to model models. The bivariate families of distributions are not widely explored and in this article a new family of bivariate distributions is proposed. The new family will extend the univariate transmuted family of distributions and will be helpful in modeling complex joint phenomena. Statistical properties of the new family of distributions are explored which include marginal and conditional distributions, conditional moments, product and ratio moments, bivariate reliability and bivariate hazard rate functions. The maximum likelihood estimation (MLE) for parameters of the family is also carried out. The proposed bivariate family of distributions is studied for the Weibull baseline distributions giving rise to bivariate transmuted Weibull (BTW) distribution. The new bivariate transmuted Weibull distribution is explored in detail. Statistical properties of the new BTW distribution are studied which include the marginal and conditional distributions, product, ratio and conditional moments. The hazard rate function of the BTW distribution is obtained. Parameter estimation of the BTW distribution is also done. Finally, real data application of the BTW distribution is given. It is observed that the proposed BTW distribution is a suitable fit for the data used.
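For orientation, the univariate transmuted family that this abstract builds on is usually written G(x) = (1 + λ)F(x) − λF(x)², with |λ| ≤ 1 and F the baseline CDF. The sketch below applies that univariate map to a Weibull baseline. The bivariate construction is not given in the abstract and is not attempted here; the function names and parameterization are assumptions.

```python
import math


def weibull_cdf(x, shape, scale):
    """Baseline Weibull CDF, F(x) = 1 - exp(-(x/scale)^shape) for x > 0."""
    if x <= 0:
        return 0.0
    return 1.0 - math.exp(-((x / scale) ** shape))


def transmuted_cdf(base_cdf_value, lam):
    """Univariate transmutation map G = (1 + lam) * F - lam * F^2, with |lam| <= 1."""
    if not -1.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [-1, 1]")
    return (1.0 + lam) * base_cdf_value - lam * base_cdf_value ** 2


def transmuted_weibull_cdf(x, shape, scale, lam):
    """Univariate transmuted Weibull CDF (hypothetical helper)."""
    return transmuted_cdf(weibull_cdf(x, shape, scale), lam)
```

Setting `lam = 0` recovers the baseline Weibull CDF, which is a quick sanity check on any implementation.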

3.
Probability distributions have been in use for modeling of random phenomena in various areas of life. Generalization of probability distributions has been the area of interest of several authors in the recent years. Several situations arise where joint modeling of two random phenomena is required. In such cases the bivariate distributions are needed. Development of the bivariate distributions necessitates certain conditions, in a field where little work has been performed. This paper deals with a bivariate beta-inverse Weibull distribution. The marginal and conditional distributions from the proposed distribution have been obtained. Expansions for the joint and conditional density functions for the proposed distribution have been obtained. The properties, including product, marginal and conditional moments, joint moment generating function and joint hazard rate function of the proposed bivariate distribution have been studied. Numerical study for the dependence function has been implemented to see the effect of various parameters on the dependence of variables. Estimation of the parameters of the proposed bivariate distribution has been done by using the maximum likelihood method of estimation. Simulation and real data application of the distribution are presented.

4.
This paper is concerned with ANOVA-like tests in the context of mixed discrete and continuous data. The likelihood ratio approach is used to obtain a location test in the mixed data setting after specifying a general location model for the joint distribution of the mixed discrete and continuous variables. The approach allows the problem to be treated from a multivariate perspective to simultaneously test both the discrete and continuous parameters of the model, thus avoiding the problem of multiple significance testing. Moreover, associations among variables are accounted for, resulting in improved power performance of the test. Unlike existing distance-based alternatives which rely on asymptotic theory, the likelihood ratio test is exact. In addition, it can be viewed as an extension to the mixed data setting of the classical multivariate ANOVA. We compare its performance against those of currently available tests via Monte Carlo simulations. Two real-data examples are presented to illustrate the methodology.

5.
In this paper we consider the Marshall-Olkin bivariate Weibull distribution. The Marshall-Olkin bivariate Weibull distribution is a singular distribution, both of whose marginals are univariate Weibull distributions. This is a generalization of the Marshall-Olkin bivariate exponential distribution. The cumulative joint distribution of the Marshall-Olkin bivariate Weibull distribution is a mixture of an absolutely continuous distribution function and a singular distribution function. This distribution has four unknown parameters and it is observed that the maximum likelihood estimators of the unknown parameters cannot be obtained in explicit forms. In this paper we discuss the computation of the maximum likelihood estimators of the unknown parameters using the EM algorithm. We perform some simulations to see the performances of the EM algorithm and re-analyze one data set for illustrative purposes.
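The singular component mentioned above is easiest to see by simulation. The sketch below uses the usual shock construction, assumed here rather than taken from the abstract: three independent Weibull variables U0, U1, U2 with a common shape α and survival functions exp(−λᵢ x^α), with X1 = min(U0, U1) and X2 = min(U0, U2). Under that assumption the four parameters are α, λ0, λ1, λ2, and ties X1 = X2 occur with probability λ0/(λ0 + λ1 + λ2). It does not implement the EM algorithm discussed in the paper.

```python
import random


def weibull_draw(shape, rate, rng):
    """Draw U with survival function exp(-rate * u**shape) by inversion."""
    return (rng.expovariate(1.0) / rate) ** (1.0 / shape)


def mo_bivariate_weibull_draw(shape, lam0, lam1, lam2, rng):
    """One draw (X1, X2) from the shock construction (assumed parameterization)."""
    u0 = weibull_draw(shape, lam0, rng)  # common shock shared by both components
    u1 = weibull_draw(shape, lam1, rng)
    u2 = weibull_draw(shape, lam2, rng)
    return min(u0, u1), min(u0, u2)


def tie_fraction(n, shape, lam0, lam1, lam2, rng):
    """Share of simulated pairs with X1 == X2 (the singular part)."""
    ties = 0
    for _ in range(n):
        x1, x2 = mo_bivariate_weibull_draw(shape, lam0, lam1, lam2, rng)
        if x1 == x2:
            ties += 1
    return ties / n
```

With λ0 = 1, λ1 = 1 and λ2 = 2, for example, roughly a quarter of the simulated pairs should be exact ties, which is the mass sitting on the line x1 = x2.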

6.
The curse of dimensionality is severe when modeling high-dimensional discrete data: the number of possible combinations of the variables explodes exponentially. We propose an architecture for modeling high-dimensional data that requires resources (parameters and computations) that grow at most as the square of the number of variables, using a multilayer neural network to represent the joint distribution of the variables as the product of conditional distributions. The neural network can be interpreted as a graphical model without hidden random variables, but in which the conditional distributions are tied through the hidden units. The connectivity of the neural network can be pruned by using dependency tests between the variables (thus reducing significantly the number of parameters). Experiments on modeling the distribution of several discrete data sets show statistically significant improvements over other methods such as naive Bayes and comparable Bayesian networks and show that significant improvements can be obtained by pruning the network.

7.
We consider bivariate distributions that are specified in terms of a parametric copula function and nonparametric or semiparametric marginal distributions. The performance of two semiparametric estimation procedures based on censored data is discussed: maximum likelihood (ML) and two-stage pseudolikelihood (PML) estimation. The two-stage procedure involves less computation and it is of interest to see whether it is significantly less efficient than the full maximum likelihood approach. We also consider cases where the copula model is misspecified, in which case PML may be better. Extensive simulation studies demonstrate that in the absence of covariates, two-stage estimation is highly efficient and has significant robustness advantages for estimating marginal distributions. In some settings, involving covariates and a high degree of association between responses, ML is more efficient. For the estimation of association, PML does not offer an advantage.

8.
INteger-valued AutoRegressive (INAR) processes are common choices for modeling non-negative discrete valued time series. In this framework and motivated by the frequent occurrence of multivariate count time series data in several different disciplines, a generalized specification of the bivariate INAR(1) (BINAR(1)) model is considered. In this new, full BINAR(1) process, dependence between the two series stems from two sources simultaneously. The main focus is on the specific parametric case that arises under the assumption of a bivariate Poisson distribution for the innovations of the process. As is shown, such an assumption gives rise to a Hermite BINAR(1) process. The method of conditional maximum likelihood is suggested for the estimation of its unknown parameters. A short application on financial count data illustrates the model.
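A simulation sketch may help make the two sources of dependence concrete. It rests on assumptions not stated in the abstract: that the "full" specification means a 2×2 matrix of binomial-thinning probabilities (so each series also thins the other's previous value), and that the bivariate Poisson innovations are built by trivariate reduction with a shared Poisson term. The names `simulate_binar1`, `thin` and `poisson` are hypothetical, and no estimation is attempted.

```python
import math
import random


def poisson(lam, rng):
    """Poisson draw by Knuth's multiplication method (adequate for small lam)."""
    if lam <= 0:
        return 0
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def thin(count, prob, rng):
    """Binomial thinning: each of `count` units survives with probability `prob`."""
    return sum(1 for _ in range(count) if rng.random() < prob)


def simulate_binar1(n_steps, a, lam, lam0, rng, start=(0, 0)):
    """Simulate a bivariate INAR(1) path under the assumptions described above.

    a    : 2x2 thinning probabilities; off-diagonal entries are the cross terms
    lam  : (lam1, lam2) series-specific innovation rates
    lam0 : rate of the shared term, which is the covariance of the innovations
    """
    x1, x2 = start
    path = []
    for _ in range(n_steps):
        common = poisson(lam0, rng)
        r1 = poisson(lam[0], rng) + common
        r2 = poisson(lam[1], rng) + common
        new1 = thin(x1, a[0][0], rng) + thin(x2, a[0][1], rng) + r1
        new2 = thin(x1, a[1][0], rng) + thin(x2, a[1][1], rng) + r2
        x1, x2 = new1, new2
        path.append((x1, x2))
    return path
```

In this sketch, setting the off-diagonal entries of `a` to zero leaves the shared innovation term as the only link between the two series.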

9.
Inference in Bayesian networks with large domains of discrete variables requires significant computational effort. In order to reduce the computational effort, current approaches often assume that discrete variables have some bounded number of values or are represented at an appropriate size of clusters. In this paper, we introduce decision-tree structured conditional probability representations that can efficiently handle a large domain of discrete and continuous variables. These representations can partition the large number of values into some reasonable number of clusters and lead to more robust parameter estimation. Very rapid computation and the ability to treat both discrete and continuous variables are accomplished via a modified belief propagation algorithm. Being able to compute various types of reasoning from a single Bayesian network eliminates development and maintenance issues associated with the use of distinct models for different types of reasoning. Application to real-world steel production process data is presented.

10.
Modelling environmental systems becomes a challenge when dealing directly with continuous and discrete data simultaneously. The aim in regression is to give a prediction of a response variable given the value of some feature variables. Multiple linear regression models, commonly used in environmental science, have a number of limitations: (1) all feature variables must be instantiated to obtain a prediction, and (2) the inclusion of categorical variables usually yields more complicated models. Hybrid Bayesian networks are an appropriate approach to solve regression problems without such limitations, and they also provide additional advantages. This methodology is applied to modelling landscape–socioeconomy relationships for different types of data (continuous, discrete or hybrid). Three models relating socioeconomy and landscape are proposed, and two scenarios of socioeconomic change are introduced in each one to obtain a prediction. This proposal can be easily applied to other areas in environmental modelling.

11.
This paper considers a class of distributions arising from the difference of two discrete random variables belonging to the Panjer family of distributions. Some distributional properties and computation of probabilities are discussed. Goodness of fit and tests of hypotheses involving the likelihood ratio, score and Wald tests have been considered. As an illustration, an application to paired count data is given.

12.
A variety of methods of modelling overdispersed count data are compared. The methods are classified into three main categories. The first category are ad hoc methods (i.e. pseudo-likelihood, (extended) quasi-likelihood, double exponential family distributions). The second category are discretized continuous distributions and the third category are observational level random effects models (i.e. mixture models comprising explicit and non-explicit continuous mixture models and finite mixture models). The main focus of the paper is a family of mixed Poisson distributions defined so that its mean μ is an explicit parameter of the distribution. This allows easier interpretation when μ is modelled using explanatory variables and provides a more orthogonal parameterization to ease model fitting. Specific three parameter distributions considered are the Sichel and Delaporte distributions. A new four parameter distribution, the Poisson-shifted generalized inverse Gaussian distribution is introduced, which includes the Sichel and Delaporte distributions as a special and a limiting case respectively. A general formula for the derivative of the likelihood with respect to μ, applicable to the whole family of mixed Poisson distributions considered, is given. Within the framework introduced here all parameters of the distributions are modelled as parametric and/or nonparametric (smooth) functions of explanatory variables. This provides a very flexible way of modelling count data. Maximum (penalized) likelihood estimation is used to fit the (non)parametric models.

13.
Many real life decision making problems can be modeled as discrete stochastic multi-attribute decision making (MADM) problems. A novel method for discrete stochastic MADM problems is developed based on the ideal and nadir solutions as in the classical TOPSIS method. In a stochastic MADM problem, the evaluations of the alternatives with respect to the different attributes are represented by discrete stochastic variables. According to stochastic dominance rules, the probability distributions of the ideal and nadir variates, both of which are discrete stochastic variables, are defined and determined for a set of discrete stochastic variables. A metric is proposed to measure the distance between two discrete stochastic variables. The ideal solution is a vector of ideal variates and the nadir solution is a vector of nadir variates for the multiple attributes. As in the classical TOPSIS method, the relative closeness of an alternative is determined by its distances from the ideal and nadir solutions. The rankings of the alternatives are determined using the relative closeness. Examples are presented to illustrate the effectiveness of the proposed method. Through the examples, several significant advantages of the proposed method over some existing methods are discussed.
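As background for the relative-closeness idea, here is a minimal sketch of the classical, deterministic TOPSIS calculation that the abstract takes as its starting point. It is not the stochastic method proposed in the paper: evaluations are plain numbers, all attributes are assumed to be benefit criteria (larger is better), and the function name is hypothetical.

```python
import math


def topsis_closeness(matrix, weights):
    """Relative closeness of each alternative in classical TOPSIS.

    matrix  : rows are alternatives, columns are attribute scores
    weights : one weight per attribute
    Assumes every attribute is a benefit criterion and that the alternatives
    are not all identical (otherwise both distances are zero).
    """
    n_attr = len(weights)
    # Vector-normalize each column, then apply the attribute weights.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_attr)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n_attr)]
                for row in matrix]
    # Ideal and nadir solutions: best and worst weighted value per attribute.
    ideal = [max(row[j] for row in weighted) for j in range(n_attr)]
    nadir = [min(row[j] for row in weighted) for j in range(n_attr)]
    scores = []
    for row in weighted:
        d_plus = math.sqrt(sum((v - i) ** 2 for v, i in zip(row, ideal)))
        d_minus = math.sqrt(sum((v - m) ** 2 for v, m in zip(row, nadir)))
        scores.append(d_minus / (d_plus + d_minus))
    return scores
```

An alternative that is best on every attribute scores 1 and one that is worst on every attribute scores 0; the stochastic variant described above replaces the numeric distances with a metric between discrete random variables.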

14.
This paper presents modeling and control of nonlinear hybrid systems using multiple linearized models. Each linearized model is a local representation of all locations of the hybrid system. These models are then combined using Bayes theorem to describe the nonlinear hybrid system. The multiple models, which consist of continuous as well as discrete variables, are used for synthesis of a model predictive control (MPC) law. The discrete-time equivalent of the model predicts the hybrid system behavior over the prediction horizon. The MPC formulation takes on a similar form as that used for control of a continuous variable system. Although implementation of the control law requires solution of an online mixed integer nonlinear program, the optimization problem has a fixed structure with certain computational advantages. We demonstrate performance and computational efficiency of the modeling and control scheme using simulations on a benchmark three-spherical tank system and a hydraulic process plant.

15.
In this paper we analyze the problem of learning and updating of uncertainty in Dirichlet models, where updating refers to determining the conditional distribution of a single variable when some evidence is known. We first obtain the most general family of prior-posterior distributions which is conjugate to a Dirichlet likelihood and we identify those hyperparameters that are influenced by data values. Next, we describe some methods to assess the prior hyperparameters and we give a numerical method to estimate the Dirichlet parameters in a Bayesian context, based on the posterior mode. We also give formulas for updating uncertainty by determining the conditional probabilities of single variables when the values of other variables are known. A time series approach is presented for dealing with the cases in which samples are not identically distributed, that is, the Dirichlet parameters change from sample to sample. This typically occurs when the population is observed at different times. Finally, two examples are given that illustrate the learning and updating processes and the time series approach.

16.
Statistical edge detection: learning and evaluating edge cues   (Total citations: 9; self-citations: 0; citations by others: 9)
We formulate edge detection as statistical inference. This statistical edge detection is data driven, unlike standard methods for edge detection which are model based. For any set of edge detection filters (implementing local edge cues), we use presegmented images to learn the probability distributions of filter responses conditioned on whether they are evaluated on or off an edge. Edge detection is formulated as a discrimination task specified by a likelihood ratio test on the filter responses. This approach emphasizes the necessity of modeling the image background (the off-edges). We represent the conditional probability distributions nonparametrically and illustrate them on two different data sets of 100 (Sowerby) and 50 (South Florida) images. Multiple edge cues, including chrominance and multiple-scale, are combined by using their joint distributions. Hence, this cue combination is optimal in the statistical sense. We evaluate the effectiveness of different visual cues using the Chernoff information and Receiver Operator Characteristic (ROC) curves. This shows that our approach gives quantitatively better results than the Canny edge detector when the image background contains significant clutter. In addition, it enables us to determine the effectiveness of different edge cues and gives quantitative measures for the advantages of multilevel processing, for the use of chrominance, and for the relative effectiveness of different detectors. Furthermore, we show that we can learn these conditional distributions on one data set and adapt them to the other with only slight degradation of performance without knowing the ground truth on the second data set. This shows that our results are not purely domain specific. We apply the same approach to the spatial grouping of edge cues and obtain analogies to nonmaximal suppression and hysteresis.
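The likelihood ratio test on filter responses can be sketched for a single scalar cue. This is a simplified illustration under assumptions of our own: responses are binned into a fixed-width histogram, add-one smoothing keeps empty bins from producing infinite ratios, and the joint distributions over multiple cues used in the paper are not reproduced. All names are hypothetical.

```python
import math


def bin_index(response, n_bins, lo, hi):
    """Map a filter response in [lo, hi] to a histogram bin, clamping outliers."""
    idx = int((response - lo) / (hi - lo) * n_bins)
    return max(0, min(idx, n_bins - 1))


def learn_histogram(responses, n_bins, lo, hi):
    """Nonparametric estimate of P(response | class) with add-one smoothing."""
    counts = [1] * n_bins  # smoothing choice is an assumption, not from the paper
    for r in responses:
        counts[bin_index(r, n_bins, lo, hi)] += 1
    total = sum(counts)
    return [c / total for c in counts]


def log_likelihood_ratio(response, p_on, p_off, lo, hi):
    """log P(response | on edge) - log P(response | off edge)."""
    idx = bin_index(response, len(p_on), lo, hi)
    return math.log(p_on[idx]) - math.log(p_off[idx])


def is_edge(response, p_on, p_off, lo, hi, threshold=0.0):
    """Label a pixel as edge when the log-likelihood ratio exceeds a threshold."""
    return log_likelihood_ratio(response, p_on, p_off, lo, hi) > threshold
```

In use, `p_on` and `p_off` would be learned from filter responses at labelled on-edge and off-edge pixels of presegmented images, and the threshold would trade off false positives against misses.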

17.
Several univariate proportional reversed hazard models have been proposed in the literature. Recently, Kundu and Gupta (2010) proposed a class of bivariate models with proportional reversed hazard marginals. It is observed that the proposed bivariate proportional reversed hazard models have a singular component. In this paper we introduce the multivariate proportional reversed hazard models in the same manner. Moreover, it is observed that the proposed multivariate proportional reversed hazard model can be obtained from the Marshall–Olkin copula. The multivariate proportional reversed hazard models also have a singular component, and their marginals have proportional reversed hazard distributions. The multivariate ageing and the dependence properties are discussed in detail. We further provide some dependence measures specifically for the bivariate case. The maximum likelihood estimators of the unknown parameters cannot be expressed in explicit forms. We propose to use the EM algorithm to compute the maximum likelihood estimators. One trivariate data set has been analysed for illustrative purposes.

18.
This article presents a computer program to directly simulate continuous regionalized variables such as mineral grades over a block support. Simulation is performed in the scope of the discrete Gaussian model and does not rely on a block discretization. The realizations can be made conditional to point-support data by use of simple or ordinary kriging, depending on whether or not the average value of the data is considered known. The proposed program can account for an information effect (misclassifications between ore and waste) by co-simulating the true block-support grades together with the grades that will be predicted at the production stage to discriminate between ore and waste.

19.
A procedure is presented for finding a discrete approximation to a continuous multivariate density function. It is based on a previously developed algorithm [2] for determining the L1 optimal discrete approximation to a univariate density. Results of approximating continuous bivariate density functions, which represent distributions of the parameters of a pharmacokinetic model, show good agreement between the mean and covariance matrix of the approximated and approximating densities. The distribution of a predicted drug concentration was also calculated using a continuous density and discrete approximations with both 25 and 81 points. The expected values of the predicted concentration, as well as selected percentile points, obtained using each density are in close agreement.

20.
This paper examines the most widely used reliability models. The models discussed fall into two categories, the data domain and the time domain. Besides tracing the historical development of the various models, their advantages and disadvantages are analyzed. This includes models based on discrete as well as continuous probability distributions. How well a given model performs its purpose in a specific economic environment will determine the usefulness of the model. Each of the models is examined with actual data as to the applicability of the error finding process.

