Similar Literature
 Found 20 similar documents; search time: 46 ms
1.
When measuring units is expensive or time-consuming, while ranking them can be done easily, ranked set sampling (RSS) is known to be preferable to simple random sampling (SRS). Available results for RSS are developed under specific parametric assumptions or are asymptotic in nature, with few results available for finite-size samples when the underlying distribution of the observed data is unknown. We investigate the use of resampling techniques to draw inferences on population characteristics. To obtain standard error and confidence interval estimates we discuss and compare three methods of resampling a given ranked set sample. Chen et al. (2004. Ranked Set Sampling: Theory and Applications. Springer, New York) suggest a natural method to obtain bootstrap samples from each row of an RSS. We prove that this method is consistent for a location estimator. We propose two other methods that are designed to obtain more stratified resamples from the given sample. Algorithms are provided for these methods. We recommend a method that obtains a bootstrap RSS from the observations. We prove several properties of this method, including consistency for a location parameter. We define two types of L-estimators for RSS and obtain expressions for their exact moments. We discuss an application to obtain confidence intervals for the Winsorized mean of an RSS.
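
The row-wise resampling idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names (`draw_rss`, `rss_mean`, `rowwise_bootstrap_se`) are hypothetical, ranking is assumed to be perfect, and the estimator is assumed to be the mean of the row means. Each row holds the measurements taken at one rank, and the bootstrap resamples with replacement inside each row independently.

```python
import random
import statistics

def draw_rss(population, k, m, rng):
    """Draw a balanced ranked set sample with set size k and m cycles.

    Assumption: ranking is perfect (units are ranked on the measured value).
    Returns k rows; row r holds the m measurements taken at rank r.
    """
    rows = [[] for _ in range(k)]
    for _ in range(m):
        for r in range(k):
            candidates = sorted(rng.sample(population, k))
            rows[r].append(candidates[r])  # only the r-th ranked unit is measured
    return rows

def rss_mean(rows):
    """Mean of the row means (equals the overall mean for a balanced RSS)."""
    return statistics.mean(statistics.mean(row) for row in rows)

def rowwise_bootstrap_se(rows, n_boot, rng):
    """Bootstrap standard error of rss_mean, resampling within each row."""
    estimates = []
    for _ in range(n_boot):
        resample = [rng.choices(row, k=len(row)) for row in rows]
        estimates.append(rss_mean(resample))
    return statistics.stdev(estimates)
```

A call such as `rowwise_bootstrap_se(draw_rss(pop, 3, 20, rng), 500, rng)` gives a standard error estimate for the RSS mean; keeping the resampling inside each row preserves the rank stratification of the original sample.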

2.
Ranked set sampling (RSS) involves ranking of potential sampling units on the variable of interest using judgment or an auxiliary variable to aid in sample selection. Its effectiveness depends on the success in this ranking. We provide an empirical assessment of RSS ranking accuracy in estimation of a population proportion.

3.
We conceptualized security-related stress (SRS) and proposed a theoretical model linking SRS, discrete emotions, coping response, and information security policy (ISP) compliance. We used an experience sampling design, wherein 138 professionals completed surveys. We observed that SRS had a positive association with frustration and fatigue, and these negative emotions were associated with neutralization of ISP violations. Additionally, frustration and fatigue make employees more likely to follow through on their rationalizations of ISP violations through decreased ISP compliance. Our findings provide evidence that neutralization is not a completely stable phenomenon but can vary within individuals from one time point to another.

4.
To test the hypothesis of symmetry about an unknown median we propose the maximum of a partial sum process based on ranked set samples. We discuss the properties of the test statistic and investigate a modified bootstrap ranked set sample procedure to obtain its sampling distribution. The power of the new test statistic is compared with two existing tests in a simulation study.

5.
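Research on Dynamic Reducts in Rough Sets   Total citations: 8, self-citations: 0, citations by others: 8
The idea of dynamic reducts is explained, and multi-level formalized dynamic reducts are discussed in detail. The sampling problem is studied, problems in Bazan's approach are pointed out, and a new method for sampling computation is proposed. A reduct precision coefficient is introduced into the estimation of sampling so that the computation method adapts to dynamic reducts at every level, and a complete system of dynamic reducts is constructed.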

6.
The development of an efficient ground sampling strategy is critical to assess uncertainties associated with moderate- or coarse-resolution remote-sensing products. This work presents a comparison of estimating spatial means from fine spatial resolution images using spatial random sampling (SRS), Block Kriging (BK), and Means of Surface with Nonhomogeneity (MSN) at 1 km² spatial scale. Towards this goal, we focus on the sampling strategies for ground data measurements and provide an assessment of the MODIS LAI product validated by the spatial means estimated by the above-mentioned three methods. The results of this study indicate that: (1) owing to its effective stratification strategy and its minimum mean squared estimation error criterion, MSN demonstrates the lowest mean squared estimation error for estimating the means of stratified nonhomogeneous surfaces; (2) BK is efficient in estimating the means of homogeneous surfaces without bias and with minimum mean squared estimation errors. The MODIS LAI product is assessed using the means estimated by SRS, BK, and MSN based on Landsat 8 OLI and SPOT HRV fine-resolution LAI maps. For heterogeneous surfaces, MSN results in low RMSE and high accuracy of the MODIS LAI product compared with BK and SRS, whereas for homogeneous surfaces, the statistical parameters produced by these three methods are similar. These results reveal that MSN is an effective method for estimating the spatial means for heterogeneous surfaces. There are differences in the accuracies of the MODIS LAI product assessed by these three methods.

7.
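In imbalanced datasets, samples affect the decision boundary differently depending on where they lie, yet traditional sampling methods do not treat samples differently according to their position. To address this, a boundary hybrid sampling algorithm based on heterogeneous k-distance (BHSK) is proposed for imbalanced data. The boundary set is identified using the heterogeneous k-distance; the boundary minority-class samples are then subdivided into three categories according to their support, and a different oversampling method and oversampling rate is applied to each, so that minority-class samples are oversampled according to their differing importance, generating...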

8.
A Robust Method for Linear Estimation of the Fundamental Matrix   Total citations: 4, self-citations: 0, citations by others: 4
宋汉辰  张小义  吴玲达 《计算机工程》2005,31(15):178-179,185
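Building on the normalized eight-point algorithm for fundamental matrix estimation, and addressing the extreme sensitivity of linear algorithms to outlier matches in the matched point set of the reference images, this work recasts robust linear estimation of the fundamental matrix as an optimal sampling problem over the matched point set, and obtains a stable and reliable solution for the fundamental matrix using the criterion of the least median of the sum of squared epipolar distances. Experiments show that the algorithm considerably improves both the accuracy and the stability of the fundamental matrix.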

9.
10.
陈俞  赵素云  李雪峰  陈红  李翠平 《软件学报》2017,28(11):2825-2835
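Because of its high time and space complexity, traditional attribute reduction can hardly be applied to large-scale datasets. Introducing random sampling into traditional fuzzy rough sets greatly improves the efficiency of attribute reduction. First, a definition of statistical attribute reduction is proposed on the basis of the statistical lower approximation. A reduct here is not a reduct in the original sense, but an attribute subset that keeps unchanged the statistical discernibility degree defined through the statistical lower approximation. Then, sampling is used to compute a sample estimate of the statistical discernibility degree; based on this estimate, attributes can be ranked by statistical significance, which makes it possible to design a fast rank-based reduction algorithm suited to large-scale data. Owing to the random sample set and the notion of statistical approximation, the algorithm lowers the computational complexity of reduction in both time and space while keeping the information content of the dataset almost unchanged. Finally, numerical experiments compare the random-sampling-based rank reduction algorithm with two traditional attribute reduction algorithms in three respects: time consumed in computing the reduct, space consumed in computing the reduct, and reduction quality. The comparison confirms the time and space advantages of the random-sampling-based rank reduction algorithm.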

11.
In order to select a sample in a finite population of N units with given inclusion probabilities, it is possible to define a sampling design on at most N samples that have a positive probability of being selected. Designs defined on minimal sets of samples are called minimum support designs. It is shown that, for any vector of inclusion probabilities, systematic sampling always provides a minimum support design. This property makes it possible to extensively compute the sampling design and the joint inclusion probabilities. Random systematic sampling can be viewed as the random choice of a minimum support design. However, even if the population is randomly sorted, a simple example shows that some joint inclusion probabilities can be equal to zero. Another way of randomly selecting a minimum support design is proposed, in such a way that all the samples have a positive probability of being selected, and all the joint inclusion probabilities are positive.
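
Systematic sampling with unequal inclusion probabilities, as referred to in this abstract, can be sketched as follows. This is an illustration under stated assumptions rather than the paper's algorithm: the name `systematic_sample` is hypothetical, the inclusion probabilities `pi` are assumed to lie in (0, 1] and to sum to an integer, and `u` is a start point in [0, 1). Unit k is selected when one of the points u, u + 1, u + 2, ... falls into its interval of cumulative probability.

```python
import math

def systematic_sample(pi, u):
    """Systematic sampling with unequal inclusion probabilities.

    pi: inclusion probabilities, each in (0, 1], assumed to sum to an integer.
    u:  random start in [0, 1).
    Returns the indices of the selected units.
    """
    selected = []
    lower = 0.0
    for unit, p in enumerate(pi):
        upper = lower + p
        # Number of integers j with lower <= u + j < upper.
        hits = math.ceil(upper - u) - math.ceil(lower - u)
        if hits >= 1:
            selected.append(unit)
        lower = upper
    return selected
```

With `pi = [0.5, 0.5, 0.5, 0.5]`, every start `u` below 0.5 selects units 0 and 2 and every other start selects units 1 and 3. Only two samples have positive probability, and units 0 and 1 are never selected together, which is the kind of zero joint inclusion probability the abstract mentions.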

12.
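Research on an Audit Stratified Sampling Algorithm Based on Clustering   Total citations: 1, self-citations: 0, citations by others: 1
For stratified sampling, the most complex sampling algorithm in audit sampling, this work starts from the notion of clustering in data mining and applies the clustering idea to the stratified sampling algorithm used in audit sampling, providing a new way to implement the algorithm. AICPA39 does not give a concrete implementation of stratified sampling, and scholars in China have previously implemented it from a statistical perspective. The implementation from a computer science perspective is analyzed and compared with the statistical implementation, which is a useful new exploration of how to implement stratified sampling.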

13.
This paper presents an efficient method for sampling the illumination functions in higher order radiosity algorithms. In such algorithms, the illumination function is not assumed to be constant across each patch, but it is approximated by a function which is at least C¹ continuous. Our median cut sampling algorithm is inspired by the observation that many form factors are computed at higher precision than is necessary. While a high sampling rate is necessary in regions of high illumination, dark areas can be sampled at a much lower rate to compute the received radiosity within a given precision. We adaptively subdivide the emitter into regions of approximately equal influence on the result. Form factors are evaluated by the disk approximation and a ray-tracing-based test for occlusion detection. The implementation of a higher order radiosity system using B-splines as the radiosity function is described. The median cut algorithm can also be used for radiosity algorithms based on the constant radiosity assumption.

14.
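A New Method for Determining Case Feature Weights Based on Similarity Rough Sets   Total citations: 9, self-citations: 0, citations by others: 9
To address the poor objectivity and algorithmic complexity of existing methods for determining case feature weights, the weight determination method based on traditional rough sets is first introduced and refined. Next, because the method based on traditional rough sets introduces errors into case similarity measurement and thereby reduces the accuracy of case-based reasoning, the indiscernibility relation of traditional rough sets is generalized to a similarity relation, and a method for determining case feature weights based on similarity rough sets is proposed. The basic definitions of similarity rough sets are given, together with two theorems for computing feature weights from the discernibility matrix with this method. Finally, an example demonstrates the effectiveness of the method.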

15.
Distributed and Parallel Databases - Stratified random sampling (SRS) is a widely used sampling technique for approximate query processing. We consider SRS on continuously arriving data streams and...

16.
《Graphical Models》2014,76(1):17-29
This paper proposes a novel method for the computation of hierarchical Poisson disk samplings on polygonal surfaces. The algorithm generates a pointerless hierarchical structure such that each level is a uniform Poisson disk sampling and a subset of the next level. As the main result, given a dynamically varying importance sampling function defined over a surface, the hierarchy is capable of generating adaptive samplings with blue noise characteristics, temporal coherence and real-time computation. Classical algorithms produce hierarchies in tight ratios, which is a serious bottleneck, especially for a large number of samples. Instead, our method uses sparse ratios and decreases the adaptation error of the hierarchy through a fast optimization process. Therefore, we save a considerable amount of time (up to 74% in our experiments) while preserving the good blue noise properties. We present applications in non-photorealistic rendering (NPR), more specifically, surface stippling effects. First, we apply our method by taking illumination to be the importance sampling to shade the surface, and second, we dynamically deform a surface with a predefined stippled texture.

17.
Sequential analysis as a sampling technique facilitates efficient statistical inference by considering fewer observations than the fixed sampling method. The optimal stopping rule dictates the sample size and also the statistical inference deduced thereafter. In this research we propose three variants of the existing multistage sampling procedures and name them (i) Jump and Crawl (JC), (ii) Batch Crawl and Jump (BCJ) and (iii) Batch Jump and Crawl (BJC) sequential sampling methods. We use the (i) normal, (ii) exponential, (iii) gamma and (iv) extreme value distributions for the point estimation problems under bounded risk conditions. We highlight the efficacy of using the right adaptive sampling plan for the bounded risk problems for these four distributions, considering two different loss functions, namely (i) squared error loss (SEL) and (ii) linear exponential (LINEX) loss functions. Our proposed methods are compared with existing sequential sampling techniques, and the importance of this study is highlighted using extensive theoretical simulation runs.

18.
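To address the inability of traditional K-means clustering to handle large-scale data effectively, an accelerated K-means clustering method based on random sampling (K-means Clustering Algorithm Based on Random Sampling, Kmeans_RS) is proposed to improve the efficiency of traditional K-means clustering. First, a random sample is drawn from the large-scale dataset to obtain a smaller working set; traditional K-means clustering is run on the working set to obtain the cluster centers and radii, which gives the sampling result. Then the remaining samples are assigned to clusters by measuring their relationship to the sampling result already obtained. Through random sampling, the method greatly reduces the size of the problem on which K-means clustering is run, which effectively improves clustering efficiency and makes it possible to cluster large-scale data. Experimental results show that, on large-scale datasets, Kmeans_RS substantially improves clustering efficiency while preserving clustering quality.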
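
The two-stage procedure in this abstract (cluster a random working set, then place the remaining points) can be sketched roughly as below. This is not the Kmeans_RS implementation: all function names are hypothetical, plain Lloyd iterations stand in for "traditional K-means", and assigning each remaining point to its nearest center is an assumption, since the abstract does not say how the relationship to the sampling result (centers and radii) is measured.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centers):
    """Index of the center closest to point."""
    return min(range(len(centers)), key=lambda i: sq_dist(point, centers[i]))

def kmeans(points, k, rng, iters=20):
    """Plain Lloyd iterations with k initial centers drawn from the points."""
    centers = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centers)].append(p)
        # An empty group keeps its previous center.
        centers = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
    return centers

def kmeans_on_sample(points, k, sample_size, rng):
    """Cluster a random working set, then label every point (assumed rule:
    nearest center) using the centers found on the working set."""
    working_set = rng.sample(points, sample_size)
    centers = kmeans(working_set, k, rng)
    labels = [nearest(p, centers) for p in points]
    return centers, labels
```

The iterative part runs only on `sample_size` points, and the full dataset is touched once in the final labeling pass, which is where the efficiency gain described in the abstract would come from.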

19.
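An Improved Stratified Sampling Technique and Its Performance   Total citations: 2, self-citations: 2, citations by others: 0
Packet sampling is a key technique in high-speed network traffic measurement and management. This paper improves stratified sampling by introducing five stratified sampling parameters (the stratification feature, the number of strata L, the stratum boundaries, the allocation of sample size across strata, and the within-stratum sampling strategy), reconfiguring them, and giving a brief theoretical discussion. Simple linear estimation is used to infer the original flow data, and the Φ deviation test is used to compare the accuracy of the improved stratified sampling technique with that of other sampling techniques in measuring the packet length distribution. The results show that the improved stratified sampling technique is far more accurate than the other sampling schemes in measuring the packet length distribution, which improves measurement precision.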

20.
朱君鹏  李晖  陈梅  戴震宇 《计算机科学》2018,45(11):249-255
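As an effective statistical analysis method, sampling is often used in large-scale graph data analysis to improve performance. Most existing graph sampling algorithms over-sample either high-degree or low-degree nodes, which considerably degrades their performance. Complex networks are scale-free: node degrees follow a power-law distribution, and individual nodes differ greatly from one another. Building on sampling methods that use a node selection strategy, and combining them with an approximate node degree distribution strategy, an efficient and unbiased stratified graph sampling algorithm, SNS, is designed and implemented. Experiments on three real graph datasets show that SNS preserves more topological properties than other graph sampling algorithms and runs more efficiently than FFS. SNS outperforms existing algorithms both in degree unbiasedness and in how closely the topological properties of the sampled graph approximate those of the original.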
