A semiparametric model for compositional data analysis in presence of covariates on the simplex |
| |
Authors: | Malini Iyengar, Dipak K. Dey |
| |
Affiliation: | (1) SmithKline Beecham Pharm, 19426 Collegeville, PA, USA; (2) Department of Statistics, University of Connecticut, 06269-4120 Storrs, CT, USA |
| |
Abstract: | Compositional data occur as natural realizations of multivariate observations comprising element proportions of some whole
quantity. Such observations predominate in disciplines like geology, biology, ecology, economics and chemistry. Because of the
unit-sum constraint on compositional data, specialized statistical methods are required for analyzing them. Dirichlet distributions
were originally used to study compositional data, even though this family of distributions is not appropriate (see Aitchison,
1986) because of its extreme independence properties. Aitchison (1982) endeavored to provide a viable alternative to existing
methods by employing the logistic normal distribution to analyze such constrained data. However, this family does not include the
Dirichlet class and is therefore unable to address the issue of extreme independence. In this paper, the generalized Liouville
family is investigated for modeling compositional data in the presence of covariates. This class permits distributions that admit
negative or mixed correlation, and it also contains non-Dirichlet distributions with non-positive correlation, thereby overcoming
deficits in the Dirichlet class. Semiparametric Bayesian methods are proposed to estimate the probability density. Predictive
distributions are used to assess the performance of the model. The methods are illustrated on a real data set. |
| |
Keywords: | Compositional data; Markov chain Monte Carlo methods; posterior predictive distribution; semiparametric density estimation |
This article is indexed in SpringerLink and other databases. |
|