
1.

Resampling methods for ranked set samples

Reza Modarres Terrence P. Hui 《Computational statistics & data analysis》2006,51(2):1039-1050

When measuring units is expensive or time-consuming, while ranking them can be done easily, it is known that ranked set sampling (RSS) is preferred to simple random sampling (SRS). Available results for RSS are developed under specific parametric assumptions or are asymptotic in nature, with few results available for finite-size samples when the underlying distribution of the observed data is unknown. We investigate the use of resampling techniques to draw inferences on population characteristics. To obtain standard error and confidence interval estimates we discuss and compare three methods of resampling a given ranked set sample. Chen et al. (2004. Ranked Set Sampling: Theory and Applications. Springer, New York) suggest a natural method to obtain bootstrap samples from each row of an RSS. We prove that this method is consistent for a location estimator. We propose two other methods that are designed to obtain more stratified resamples from the given sample. Algorithms are provided for these methods. We recommend a method that obtains a bootstrap RSS from the observations. We prove several properties of this method, including consistency for a location parameter. We define two types of L-estimators for RSS and obtain expressions for their exact moments. We discuss an application to obtain confidence intervals for the Winsorized mean of an RSS.
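A minimal sketch of the row-wise idea mentioned in the abstract above, assuming a balanced RSS with set size k, m cycles, and perfect ranking. The function names, the Gaussian population, and the use of the plain sample mean as the location estimator are illustrative assumptions, not the authors' algorithms; the two more stratified resampling methods proposed in the paper are not shown.

```python
import random
import statistics


def ranked_set_sample(draw, k, m, rng):
    """Balanced RSS: row r holds the m measured r-th order statistics.

    `draw(rng)` returns one unit from the population (hypothetical helper).
    Ranking is assumed to be perfect, which is a simplification.
    """
    rows = [[] for _ in range(k)]
    for _ in range(m):                      # m cycles
        for r in range(k):                  # one fresh set of k units per rank
            units = sorted(draw(rng) for _ in range(k))
            rows[r].append(units[r])        # only the rank-r unit is measured
    return rows


def rss_mean(rows):
    """Mean of all measured units in a balanced RSS."""
    return statistics.fmean([x for row in rows for x in row])


def rowwise_bootstrap_se(rows, n_boot, rng):
    """Resample with replacement within each row, then take the SD of the means."""
    estimates = []
    for _ in range(n_boot):
        resample = [rng.choices(row, k=len(row)) for row in rows]
        estimates.append(rss_mean(resample))
    return statistics.stdev(estimates)


rng = random.Random(0)
rows = ranked_set_sample(lambda r: r.gauss(0.0, 1.0), k=3, m=10, rng=rng)
print(rss_mean(rows), rowwise_bootstrap_se(rows, n_boot=200, rng=rng))
```

Resampling inside each row keeps the rank strata intact, which is why the row is the natural resampling unit in this sketch.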

2.

An empirical assessment of ranking accuracy in ranked set sampling

Haiying Chen Douglas A. Wolfe 《Computational statistics & data analysis》2006,51(2):1411-1419

Ranked set sampling (RSS) involves ranking of potential sampling units on the variable of interest using judgment or an auxiliary variable to aid in sample selection. Its effectiveness depends on the success in this ranking. We provide an empirical assessment of RSS ranking accuracy in estimation of a population proportion.

3.

A bootstrap test for symmetry based on ranked set samples

Reza Drikvandi Reza Modarres Abdullah H. Jalilian 《Computational statistics & data analysis》2011,55(4):1807-1814

To test the hypothesis of symmetry about an unknown median we propose the maximum of a partial sum process based on ranked set samples. We discuss the properties of the test statistic and investigate a modified bootstrap ranked set sample procedure to obtain its sampling distribution. The power of the new test statistic is compared with two existing tests in a simulation study.

4.

Algorithms and applications for answering ranked queries using ranked views

Vagelis Hristidis Yannis Papakonstantinou 《The VLDB Journal The International Journal on Very Large Data Bases》2004,13(1):49-70

Ranked queries return the top objects of a database according to a preference function. We present and evaluate (experimentally and theoretically) a core algorithm that answers ranked queries in an efficient pipelined manner using materialized ranked views. We use and extend the core algorithm in the described PREFER and MERGE systems. PREFER precomputes a set of materialized views that provide guaranteed query performance. We present an algorithm that selects a near optimal set of views under space constraints. We also describe multiple optimizations and implementation aspects of the downloadable version of PREFER. Then we discuss MERGE, which operates at a metabroker and answers ranked queries by retrieving a minimal number of objects from sources that offer ranked queries. A speculative version of the pipelining algorithm is described. Received: 10 June 2002; Accepted: 11 June 2002; Published online: 30 September 2003. Edited by: A. Mendelzon. Work supported by NSF Grant No. 9734548.

5.

Evaluating the precision of eight spatial sampling schemes in estimating regional means of simulated yield for two crops

《Environmental Modelling & Software》2016

We compared the precision of simple random sampling (SimRS) and seven types of stratified random sampling (StrRS) schemes in estimating the regional mean of water-limited yields for two crops (winter wheat and silage maize) that were simulated by fourteen crop models. We found that the precision gains of StrRS varied considerably across stratification methods and crop models. Precision gains for compact geographical stratification were positive, stable and consistent across crop models. Stratification with soil water holding capacity had very high precision gains for twelve models, but resulted in negative gains for two models. Increasing the sample size monotonically decreased the sampling errors for all the sampling schemes. We conclude that compact geographical stratification can modestly but consistently improve the precision in estimating regional mean yields. Using the most influential environmental variable for stratification can notably improve the sampling precision, especially when the sensitivity behavior of a crop model is known.

6.

An adaptive surrogate modeling-based sampling strategy for parameter optimization and distribution estimation (ASMO-PODE)

《Environmental Modelling & Software》2017

Parameter distribution estimation has long been a hot issue for the uncertainty quantification of environmental models. Traditional approaches such as MCMC (Markov Chain Monte Carlo) are prohibitively expensive to apply to large, complex dynamic models because of their high computational cost. To reduce the number of model evaluations required, we propose an adaptive surrogate modeling-based sampling strategy for parameter distribution estimation, named ASMO-PODE (Adaptive Surrogate Modeling-based Optimization – Parameter Optimization and Distribution Estimation). ASMO-PODE can provide an estimate of the parameter distribution using as little as one percent of the model evaluations required by a regular MCMC approach. The effectiveness and efficiency of the ASMO-PODE approach have been evaluated with two test problems and one land surface model, the Common Land Model. The results demonstrate that the ASMO-PODE method is an economical way to perform parameter optimization and distribution estimation.

7.

Understanding and comparisons of different sampling approaches for the Fourier Amplitudes Sensitivity Test (FAST)

Chonggang Xu 《Computational statistics & data analysis》2011,55(1):184-198

The Fourier Amplitude Sensitivity Test (FAST) is one of the most popular uncertainty and sensitivity analysis techniques. It uses a periodic sampling approach and a Fourier transformation to decompose the variance of a model output into partial variances contributed by different model parameters. Until now, FAST analysis has been mainly confined to the estimation of partial variances contributed by the main effects of model parameters, and has not allowed for those contributed by specific interactions among parameters. In this paper, we theoretically show that FAST analysis can be used to estimate partial variances contributed by both main effects and interaction effects of model parameters using different sampling approaches (i.e., traditional search-curve based sampling, simple random sampling and random balance design sampling). We also analytically calculate the potential errors and biases in the estimation of partial variances. Hypothesis tests are constructed to reduce the effect of sampling errors on the estimation of partial variances. Our results show that compared to simple random sampling and random balance design sampling, sensitivity indices (ratios of partial variances to variance of a specific model output) estimated by search-curve based sampling generally have higher precision but larger underestimations. Compared to simple random sampling, random balance design sampling generally provides higher estimation precision for partial variances contributed by the main effects of parameters. The theoretical derivation of partial variances contributed by higher-order interactions and the calculation of their corresponding estimation errors in different sampling schemes can help us better understand the FAST method and provide a fundamental basis for FAST applications and further improvements.
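As a rough illustration of the classical search-curve variant only, the sketch below estimates main-effect sensitivity indices for parameters assumed uniform on (0, 1). The frequencies 11 and 35, the number of harmonics, the sample-size rule, and the toy model y = x1 + 2*x2 are all assumptions chosen for the example; the interaction-effect estimators, random balance design sampling, and hypothesis tests discussed in the paper are not implemented here.

```python
import math


def fast_main_effects(model, freqs, harmonics=4):
    """Main-effect indices from one periodic search curve (classical FAST).

    model: function of a list of parameter values in (0, 1).
    freqs: one integer frequency per parameter (assumed free of low-order
           interference; choosing them is up to the user in this sketch).
    """
    n = 2 * harmonics * max(freqs) + 1          # sample size (assumed rule)
    s = [-math.pi + 2.0 * math.pi * (k + 0.5) / n for k in range(n)]

    def curve(w, sk):
        # Triangular wave that covers (0, 1) uniformly as s varies.
        return 0.5 + math.asin(math.sin(w * sk)) / math.pi

    y = [model([curve(w, sk) for w in freqs]) for sk in s]
    mean = sum(y) / n
    total_var = sum((v - mean) ** 2 for v in y) / n

    indices = []
    for w in freqs:
        partial = 0.0
        for p in range(1, harmonics + 1):
            a = sum(v * math.cos(p * w * sk) for v, sk in zip(y, s)) / n
            b = sum(v * math.sin(p * w * sk) for v, sk in zip(y, s)) / n
            partial += 2.0 * (a * a + b * b)    # variance at harmonic p*w
        indices.append(partial / total_var)
    return indices


# Toy additive model: the exact main-effect indices are 0.2 and 0.8.
print(fast_main_effects(lambda x: x[0] + 2.0 * x[1], freqs=[11, 35]))
```

Because only a finite number of harmonics is summed, the estimates fall slightly below the exact values, which is consistent with the underestimation of search-curve sampling noted in the abstract.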

8.

On the sampling distribution of resubstitution and leave-one-out error estimators for linear classifiers

Amin Zollanvari Edward R. Dougherty 《Pattern recognition》2009,42(11):2705-2723

Error estimation is a problem of high current interest in many areas of application. This paper concerns the classical problem of determining the performance of error estimators in small-sample settings under a Gaussianity parametric assumption. We provide here for the first time the exact sampling distribution of the resubstitution and leave-one-out error estimators for linear discriminant analysis (LDA) in the univariate case, which is valid for any sample size and combination of parameters (including unequal variances and sample sizes for each class). In the multivariate case, we provide a quasi-binomial approximation to the distribution of both the resubstitution and leave-one-out error estimators for LDA, under a common but otherwise arbitrary class covariance matrix, which is assumed to be known in the design of the LDA discriminant. We provide numerical examples, using both synthetic and real data, that indicate that these approximations are accurate, provided that LDA classification error is not too large.

9.

Approximating probability distribution of circuit performance function for parametric yield estimation using transferable belief model

XiaoBin Xu DongHua Zhou YinDong Ji ChengLin Wen 《Science China: Information Sciences (English edition)》2013,56(11):1-19

This paper applies the transferable belief model (TBM) interpretation of the Dempster-Shafer theory of evidence to approximate the distribution of a circuit performance function for parametric yield estimation. Treating the input parameters of the performance function as credal variables defined on a continuous frame of real numbers, the suggested approach constructs random set-type evidence for these parameters. The corresponding random set of the function output is obtained by the extension principle of random sets. Within the TBM framework, the random set of the function output in the credal state can be transformed to a pignistic state where it is represented by the pignistic cumulative distribution. As an approximation to the actual cumulative distribution, it can be used to estimate yield according to circuit response specifications. The advantage of the proposed method over Monte Carlo (MC) methods lies in its ability to run the simulation process just once to obtain a usable approximate yield value with a deterministic estimation error. Given the same error, the new method needs fewer calculations than MC methods. Examples of a high-speed railway track circuit and a numerical eight-dimensional quadratic function are included to demonstrate the efficiency of this technique.

10.

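Probability distribution models for the random iteration algorithm and their applications

章立亮 (Zhang Liliang) 《电脑与信息技术》 (Computer and Information Technology) 2006,14(5):11-13,41

For iterated function systems with probabilities, and considering the influence of the associated probabilities on controlling the attractor image, this paper proposes several probability distribution models. Applying these models makes it possible to control both the local detail and the overall shape of the attractor image. Taking the simulation of trees as an example, computer numerical experiments demonstrate the control achieved by the proposed models. When used for computer simulation of natural scenery, the method is computationally simple, easy to operate, and gives good results.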
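To make the role of the associated probabilities concrete, here is a small sketch of the random iteration algorithm on an iterated function system with probabilities. The three half-scale maps toward the corners of a triangle (a Sierpinski gasket), the starting point, and the burn-in length are assumptions chosen for brevity; they are not the tree models or the probability distribution models of the paper.

```python
import random

# Hypothetical IFS: three contractions, each halving the distance to a vertex.
VERTICES = [(0.0, 0.0), (1.0, 0.0), (0.5, 1.0)]


def random_iteration(probabilities, n_points, rng, burn_in=20):
    """Generate attractor points by picking one map at random per step."""
    x, y = 0.25, 0.25
    points = []
    for step in range(n_points + burn_in):
        vx, vy = rng.choices(VERTICES, weights=probabilities)[0]
        x, y = (x + vx) / 2.0, (y + vy) / 2.0
        if step >= burn_in:                 # discard the first few iterates
            points.append((x, y))
    return points


points = random_iteration([0.6, 0.2, 0.2], n_points=2000, rng=random.Random(1))
lower_left = sum(1 for x, y in points if x < 0.5 and y < 0.5)
print(lower_left, len(points))
```

The attractor itself does not depend on the probabilities, but the density of plotted points does: raising the probability of one map fills in its part of the image faster, which is the kind of local control the abstract describes.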

11.

A hybrid fuzzy-statistical clustering approach for estimating the time of changes in fixed and variable sampling control charts

Adel Alaeddini Mehdi Ghazanfari 《Information Sciences》2009,179(11):1769-47

Control charts are the most popular Statistical Process Control (SPC) tools used to monitor process changes. When a control chart produces an out-of-control signal, it means that the process has changed. However, control chart signals do not indicate the real time of the process changes, which is essential for identifying and removing assignable causes and ultimately improving the process. Identifying the real time of the process change is known as the change-point estimation problem. Most of the traditional change-point methods are based on maximum likelihood estimators (MLE), which need strict statistical assumptions. In this paper, first, we introduce clustering as a potential tool for change-point estimation. Next, we discuss the challenges of employing clustering methods for change-point estimation. Afterwards, based on the concepts of fuzzy clustering and statistical methods, we develop a novel hybrid approach which is able to effectively estimate change-points in processes with either fixed or variable sample size. Using extensive simulation studies, we also show that the proposed approach performs well in all considered conditions in comparison to powerful statistical methods and popular fuzzy clustering techniques. The proposed approach can be employed for processes with either normal or non-normal distributions. It is also applicable to both phase-I and phase-II. Finally, it can estimate the true values of both in- and out-of-control states’ parameters.

12.

Elliptic Gabriel graph for finding neighbors in a point set and its application to normal vector estimation

Joon C. Park Byoung K. Choi 《Computer aided design》2006,38(6):619-626

Point-based shape representation has received increased attention in recent years, mainly due to its simplicity. One of the most fundamental operations for point set processing is to find the neighbors of each point. Mesh structures and neighborhood graphs are commonly used for this purpose. However, though meshes are very popular in the field of computer graphics, neighbor relations encoded in a mesh are often distorted. Likewise, neighborhood graphs, such as the minimum spanning tree (MST), relative neighborhood graph (RNG), and Gabriel graph (GG), are also imperfect as they usually give too few neighbors for a given point. In this paper, we introduce a generalization of the Gabriel graph, named the elliptic Gabriel graph (EGG), which takes an elliptic influence region instead of the circular region in GG. In order to determine the appropriate aspect ratio of the elliptic influence region of EGG, this paper also presents an analysis of the relationship between the aspect ratio of the elliptic influence region and the average valence of the resulting neighborhood. Analytic and empirical test results are included.
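For reference, the ordinary Gabriel graph that the paper generalizes can be sketched as follows: two points are neighbors when no third point lies strictly inside the circle whose diameter is the segment joining them. This brute-force 2D version is an illustration under that standard definition only; the elliptic influence region of EGG and its aspect-ratio parameter are not reproduced, since the abstract does not give their exact form.

```python
from itertools import combinations


def gabriel_graph(points):
    """Return Gabriel graph edges as index pairs (brute force, O(n^3))."""
    edges = []
    for i, j in combinations(range(len(points)), 2):
        (px, py), (qx, qy) = points[i], points[j]
        cx, cy = (px + qx) / 2.0, (py + qy) / 2.0           # circle center
        r2 = ((px - qx) ** 2 + (py - qy) ** 2) / 4.0        # squared radius
        blocked = any(
            (x - cx) ** 2 + (y - cy) ** 2 < r2
            for k, (x, y) in enumerate(points)
            if k not in (i, j)
        )
        if not blocked:
            edges.append((i, j))
    return edges


# The third point sits inside the circle on the first two, so that edge is dropped.
print(gabriel_graph([(0.0, 0.0), (2.0, 0.0), (1.0, 0.1)]))
```

The example shows how easily a nearby point removes a long edge, which is one way to see why the abstract says such graphs tend to give too few neighbors.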

13.

Using mega-fuzzification and data trend estimation in small data set learning for early FMS scheduling knowledge

Der-Chiang Li Chih-Sen Wu Tung-I Tsai Fengming M. Chang 《Computers & Operations Research》2006

Provided with plenty of data (experience), data mining techniques are widely used to extract suitable management skills from the data. Nevertheless, in the early stages of a manufacturing system, only scarce data can be obtained, and the scheduling knowledge built from them is usually fragile. Using small data sets, this research aims to improve the accuracy of machine learning for flexible manufacturing system (FMS) scheduling. The study develops a data trend estimation technique and combines it with mega-fuzzification and adaptive-network-based fuzzy inference systems (ANFIS). The results of the simulated FMS scheduling problem indicate that learning accuracy can be significantly improved using the proposed method involving a very small data set.

14.

On BFC-MSMIP strategies for scenario cluster partitioning, and twin node family branching selection and bounding for multistage stochastic mixed integer programming

Laureano F. Escudero María Araceli Garín María Merino Gloria Pérez 《Computers & Operations Research》2010,37(4):738-753

In the branch-and-fix coordination (BFC-MSMIP) algorithm for solving large-scale multistage stochastic mixed integer programming problems, we find it crucial to decide the stages where the nonanticipativity constraints are explicitly considered in the model. This information is materialized when the full model is broken down into a scenario cluster partition with smaller subproblems. In this paper we present a scheme for obtaining strong bounds and branching strategies for the Twin Node Families to increase the efficiency of the procedure BFC-MSMIP, based on the information provided by the nonanticipativity constraints that are explicitly considered in the problem. Some computational experience is reported to support the efficiency of the new scheme.

15.

The distribution of sums, products and ratios for Lawrance and Lewis's bivariate exponential random variables

Saralees Nadarajah M. Masoom Ali 《Computational statistics & data analysis》2006,50(12):3449-3463

We derive the exact distributions of R=X+Y, P=XY and W=X/(X+Y) and the corresponding moment properties when X and Y follow Lawrance and Lewis's bivariate exponential distribution. The expressions turn out to involve special functions. We also provide extensive tabulations of the percentage points associated with the distributions. These tables, obtained using intensive computing power, will be of use to practitioners of the bivariate exponential distribution.

16.

On the hazard function of Birnbaum-Saunders distribution and associated inference

Debasis Kundu Nandini Kannan 《Computational statistics & data analysis》2008,52(5):2692-2702

In this paper, we discuss the shape of the hazard function of the Birnbaum-Saunders distribution. Specifically, we establish that the hazard function of the Birnbaum-Saunders distribution is upside-down shaped (it increases to a maximum and then decreases) for all values of the shape parameter. Because in reliability and survival analysis it is often of interest to determine the point at which the hazard function reaches its maximum, we propose different estimators of that point and evaluate their performance using Monte Carlo simulations. Next, we analyze a data set and illustrate all the inferential methods developed here and finally make some concluding remarks.
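The shape claim can be checked numerically with a short sketch. It uses the standard two-parameter Birnbaum-Saunders density and distribution function (shape alpha, scale beta) and a plain grid search for the maximum; the grid limits and the choice alpha = 1 are assumptions for illustration, and the grid search is not one of the estimators proposed in the paper. For small alpha and large t the survival function underflows, so the grid would need adjusting.

```python
import math


def bs_hazard(t, alpha, beta=1.0):
    """Hazard h(t) = f(t) / (1 - F(t)) of the Birnbaum-Saunders distribution."""
    z = (math.sqrt(t / beta) - math.sqrt(beta / t)) / alpha
    density = (
        ((beta / t) ** 0.5 + (beta / t) ** 1.5)
        / (2.0 * alpha * beta * math.sqrt(2.0 * math.pi))
        * math.exp(-0.5 * z * z)
    )
    survival = 0.5 * math.erfc(z / math.sqrt(2.0))   # 1 - Phi(z), stable in the tail
    return density / survival


def hazard_peak(alpha, beta=1.0, t_max=20.0, n=4000):
    """Grid search for the point where the hazard is largest (a crude sketch)."""
    grid = [t_max * (i + 1) / n for i in range(n)]
    t_best = max(grid, key=lambda t: bs_hazard(t, alpha, beta))
    return t_best, bs_hazard(t_best, alpha, beta)


t_peak, h_peak = hazard_peak(alpha=1.0)
print(t_peak, h_peak, bs_hazard(20.0, 1.0))
```

For alpha = 1 the maximum falls well inside the grid and the hazard at the right end is lower than at the peak, in line with the increasing-then-decreasing shape stated in the abstract.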

17.

On estimation and influence diagnostics for zero-inflated negative binomial regression models

Aldo M. Garay 《Computational statistics & data analysis》2011,55(3):1304-1318

The zero-inflated negative binomial model is used to account for overdispersion detected in data that are initially analyzed under the zero-inflated Poisson model. A frequentist analysis, a jackknife estimator and a non-parametric bootstrap for parameter estimation of zero-inflated negative binomial regression models are considered. In addition, an EM-type algorithm is developed for performing maximum likelihood estimation. Then, the appropriate matrices for assessing local influence on the parameter estimates under different perturbation schemes and some ways to perform global influence analysis are derived. In order to study departures from the error assumption as well as the presence of outliers, residual analysis based on the standardized Pearson residuals is discussed. The relevance of the approach is illustrated with a real data set, where it is shown that zero-inflated negative binomial regression models seem to fit the data better than the Poisson counterpart.

18.

On the choice of importance distributions for unconstrained and constrained state estimation using particle filter

J. Prakash Sachin C. Patwardhan Sirish L. Shah 《Journal of Process Control》2011,21(1):3-16

Recursive state estimation of constrained nonlinear dynamical systems has attracted the attention of many researchers in recent years. For nonlinear/non-Gaussian state estimation problems, particle filters have been widely used (Arulampalam et al. [1]). As pointed out by Daum [2], particle filters require a proposal distribution and the choice of proposal distribution is the key design issue. In this paper, a novel approach for generating the proposal distribution based on a constrained Extended Kalman filter (C-EKF), constrained Unscented Kalman filter (C-UKF) and constrained Ensemble Kalman filter (C-EnKF) is proposed. The efficacy of the proposed state estimation algorithms using a particle filter is illustrated via a successful implementation on a simulated gas-phase reactor, involving constraints on estimated state variables, and on another example problem, which involves constraints on the process noise (Rao et al. [10]). We also propose a state estimation scheme for estimating state variables in an autonomous hybrid system using a particle filter with an Unscented Kalman filter as a proposal and an unconstrained Ensemble Kalman filter (EnKF) as a proposal. The efficacy of the proposed state estimation scheme for an autonomous hybrid system is demonstrated by conducting simulation studies on a three-tank hybrid system. The simulation studies underline the crucial role played by the choice of proposal distribution in the formulation of particle filters.

19.

On demand synchronization and load distribution for database grid-based Web applications

Wen-Syan Li Kemal Altintas Murat Kantarcıoğlu 《Data & Knowledge Engineering》2004,51(3):V2388-323

With the availability of content delivery networks (CDN), many database-driven Web applications rely on data centers that host applications and database contents for better performance and higher reliability. However, this raises additional issues associated with database/data center synchronization, query/transaction routing, load balancing, and application result correctness/precision. In this paper, we investigate these issues in the context of data center synchronization for such load and precision critical Web applications in a distributed data center infrastructure. We develop a scalable scheme for adaptive synchronization of data centers to maintain the load and application precision requirements. A prototype has been built for the evaluation of the proposed scheme. The experimental results show the effectiveness of the proposed scheme in maintaining both application result precision and load distribution, and in adapting to traffic patterns and system capacity limits.

20.

Level estimation, classification and probability distribution architectures for trading the EUR/USD exchange rate

Andreas Lindemann Christian L. Dunis Paulo Lisboa 《Neural computing & applications》2005,14(3):256-271

Dunis and Williams (Derivatives: use, trading and regulation 8(3):211–239, 2002; Applied quantitative methods for trading and investment. Wiley, Chichester, 2003) have shown the superiority of a Multi-layer perceptron network (MLP), outperforming its benchmark models such as a moving average convergence divergence technical model (MACD), an autoregressive moving average model (ARMA) and a logistic regression model (LOGIT) on a Euro/Dollar (EUR/USD) time series. The motivation for this paper is to investigate the use of different neural network architectures. This is done by benchmarking three different neural network designs representing a level estimator, a classification model and a probability distribution predictor. More specifically, we present the Multi-layer perceptron network, the Softmax cross entropy model and the Gaussian mixture model and benchmark their respective performance on the Euro/Dollar (EUR/USD) time series as reported by Dunis and Williams. As it turns out, the Multi-layer perceptron does best when used without confirmation filters and leverage, while the Softmax cross entropy model and the Gaussian mixture model outperform the Multi-layer perceptron when using more sophisticated trading strategies and leverage. This might be due to the ability of both models using probability distributions to successfully identify trades with a high Sharpe ratio.
