Similar articles
Found 20 similar articles (search time: 31 ms)
1.
When data are dichotomous, this paper notes the utility of inverse sampling in establishing equivalence with respect to the risk ratio. This paper develops an exact equivalence test that accounts for the risk ratio under inverse sampling and further discusses the relationship between the exact equivalence test and the exact conditional confidence limits. Also included are an exact and two asymptotic procedures for calculation of the minimum required number of index subjects for a desired power 1 − β at a given α-level. Finally, this paper provides a table that summarizes the minimum required number of index subjects for powers equal to 0.90 and 0.80 in application of the proposed exact equivalence test at the 0.05 level in a variety of situations.

2.
This paper proposes Mantel-Haenszel-type statistics for testing whether a new treatment is at least as effective as the standard treatment in comparative binomial trials. The null hypotheses considered are of a specified nonzero difference and of a ratio not equal to unity. It is shown that we may also use these tests for testing the equivalence of two treatments.

3.
Conventional clinical trials involve tests of hypotheses with statistics computed from values of dependent variables alone. An alternative is to test hypotheses with statistics computed from benefit/harm scores that measure longitudinal associations between dose and values of the dependent variables. The proposed standardized measure of benefit/harm quantifies the strength of evidence that a patient did either better or worse while on treatment. A benefit/harm score, particularly when obtained from a randomized, N-of-1 trial, indicates a beneficial or harmful treatment effect for the individual. Benefit/harm scores from samples of patients are evaluated with standard statistical tests, with or without group comparisons, to make inferences about populations. The proposed alternative strategy can yield within-patient indicators of treatment effect that are more reliable, valid, comprehensive, and detailed. This, in turn, could help make many population-based clinical trials more informative, cost-effective, and clinically useful for participants. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

4.
In this paper we develop multiple hypotheses testing procedures to compare a new treatment with a set of standard treatments in a clinical trial. The aim is to classify the new treatment with respect to each of the standards, by specifying those to which the new treatment is superior, those to which the new treatment is equivalent and those to which one can establish neither superiority nor equivalence. We propose several stepwise procedures and compare them with respect to their familywise error rates and power. The step-down methods SD1 and SD2 test for superiority first, followed by tests for equivalence for those comparisons where we cannot establish superiority. The step-up methods SU1 and SU2 test for equivalence first, followed by tests for superiority for those comparisons where we can establish at least equivalence. The methods SD3 and SU3 apply the tests for superiority and equivalence in pairs. All the methods require that we specify a threshold value delta > 0 in advance for defining equivalence. In applications where it is not possible to specify a value delta, we can use the method SD1 by testing for superiority first, followed by one-sided confidence limits on the efficacy differences for those comparisons where we cannot establish superiority.

5.
Usually, the purpose of a clinical trial is to demonstrate the superiority of a (new) treatment in comparison to another treatment with regard to a well-defined criterion of efficacy. However, aspects other than improved efficacy might be regarded as advantages of a new therapy, e.g. fewer or less severe adverse events, simpler applicability, or a lower price. In this case, it may be sufficient to show a "comparable" efficacy (therapeutic equivalence). Unfortunately, equivalence studies can lead to severe problems of interpretation in case of insufficient methodological planning. In general, more detailed information must be available in advance than for the common (superiority) trials. Very carefully designed trials are necessary to evaluate the therapeutic equivalence of treatments.

6.
BACKGROUND: The number of antidepressant drugs available in the market has grown rapidly in the last few years. The present paper underlines some of the pre-clinical and clinical problems that call for close attention from the regulatory authorities when approving new drugs. METHODS: We present here a review of the literature. RESULTS: A wide heterogeneity in the action of the various antidepressants precludes any single theory about the pathogenesis and therapy of depression. Antidepressant activity, in fact, may be achieved by acting on a number of different monoaminergic mechanisms. The variety in the neurochemical effects of antidepressants is not reflected in clinical trials, which tend to stereotypy. In many cases clinical trials aim at demonstrating equivalence rather than differences in efficacy. Regulatory authorities should, therefore, pay attention in accepting the equivalence of effects of a new drug in relation to a reference one: most clinical trials of new antidepressant drugs do not have the power to detect clinically relevant differences. CONCLUSIONS: Unconventional new pre-clinical tests are needed to generate antidepressants with a different mechanism of action. Clinical studies are needed to promote objective comparative evaluation of the cost, benefits and toxic effects of new antidepressants.

7.
Proof pile load tests are an important means to cope with uncertainties in the design and construction of pile foundations. In this paper, a systematic method to incorporate the results of proof load tests not conducted to failure into the design of pile foundations is developed. In addition, illustrative acceptance criteria for driven piles based on proof load tests are proposed for use in a reliability-based design. Finally, modifications to conventional proof test procedures are studied so that the value derived from proof tests can be maximized. Whether or not a proof test is conducted to failure, its results can be used to update the probability distribution of the pile capacity using the method proposed in this paper. Hence, contributions of the proof test can be included in foundation design in a logical manner by considering several load test parameters such as the number of tests, the test load, the factor of safety, and test results. This adds value to proof load tests and warrants improvements in the procedures for acceptance of pile foundations using proof load tests. A larger test load for proof tests, say 1.5 times the predicted pile capacity, is recommended since it will yield more information about the capacity statistics and thus allow for more economical designs.

8.
In repeated measures studies, we are often interested in comparing group effects in which groups are associated with a certain order relation. We propose testing procedures for ordered group effects using the generalized estimating equations (GEE) approach of Liang and Zeger (1986, Biometrika 73, 13-22). The order-constrained GEE estimators of group effects are approximated by the isotonic regression of the unconstrained GEE estimators. Based on these constrained estimators, we construct test statistics for detecting ordered group effects. The limiting distributions of the test statistics are mixtures of chi-square distributions. A Monte Carlo experiment shows improved performances of the proposed tests over the usual chi-square tests in detecting ordered group effects. The proposed test procedures are illustrated by familial polyposis supplementation trial data.

9.
Exploratory data analyses in medical research usually involve many potential risk factors. Typically, one performs numerous hypothesis tests to identify variables that are of prognostic value. Because of the multiplicity of tests, one must control the overall false positive rate. The Bonferroni adjustment is simple to use, but may be overly conservative when applied to correlated tests. We propose an exact adjustment method, based on the joint permutational distributions of the test statistics, in settings where the acquired sample size only allows analysis of a single feature at a time. We demonstrate our method with two examples.
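
The contrast drawn in entry 9 between Bonferroni and a permutation-based adjustment can be illustrated with a minimal sketch. This is not the authors' exact method: it swaps in a Monte Carlo single-step max-statistic adjustment (random label shuffles instead of full enumeration), uses the absolute difference in group means as the test statistic, and assumes the features are on comparable scales. The function names, `n_perm`, and `seed` are hypothetical choices made for illustration.

```python
import random
from statistics import mean


def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def max_stat_adjusted(features, labels, n_perm=2000, seed=0):
    """Permutation-adjusted p-values from the joint null distribution.

    features: list of columns, each a list with one value per subject.
    labels:   list of 0/1 group labels, one per subject.
    Assumption: Monte Carlo shuffles approximate the exact permutation
    distribution; the statistic is |mean(group 1) - mean(group 0)|.
    """
    rng = random.Random(seed)

    def stats(lab):
        out = []
        for col in features:
            g1 = [v for v, l in zip(col, lab) if l == 1]
            g0 = [v for v, l in zip(col, lab) if l == 0]
            out.append(abs(mean(g1) - mean(g0)))
        return out

    observed = stats(labels)
    exceed = [0] * len(features)
    shuffled = list(labels)
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        # The maximum over all features captures their correlation.
        max_perm = max(stats(shuffled))
        for j, obs in enumerate(observed):
            if max_perm >= obs:
                exceed[j] += 1
    # Add-one correction keeps the estimate strictly positive.
    return [(c + 1) / (n_perm + 1) for c in exceed]
```

Because every shuffle is applied to all features at once, the adjustment reflects the dependence between tests, which is where Bonferroni tends to be conservative.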

10.
Test statistics for the homogeneity of the risk difference for a series of 2 x 2 tables when the data are sparse are proposed. A weighted least squares statistic is commonly used to test for equality of the risk difference over the tables; however, when the data are sparse, this statistic can have anticonservative Type I error rates. Simulation is used to compare the proposed test statistics to the weighted least squares statistic. The weighted least squares statistic has the most anticonservative Type I error rates of all the statistics compared. We suggest the use of one of our proposed test statistics instead of the weighted least squares statistic.
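
For readers unfamiliar with the weighted least squares statistic that entry 10 uses as its baseline, the sketch below shows the usual textbook form: inverse-variance weights, a pooled risk difference, and a weighted sum of squared deviations referred to a chi-square distribution with K − 1 degrees of freedom. This is an assumption about the baseline, not the paper's proposed statistics, and the function name and input layout are hypothetical.

```python
def wls_homogeneity(tables):
    """Weighted least squares test of a common risk difference.

    tables: list of (x1, n1, x2, n2) counts, one tuple per 2 x 2 table,
            where x/n are events/subjects in each of the two groups.
    Returns (statistic, degrees_of_freedom).
    """
    diffs, weights = [], []
    for x1, n1, x2, n2 in tables:
        p1, p2 = x1 / n1, x2 / n2
        var = p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2
        if var == 0:
            # With sparse data the estimated variance can be zero,
            # so the inverse-variance weight is undefined.
            raise ValueError("zero variance estimate: table too sparse")
        diffs.append(p1 - p2)
        weights.append(1.0 / var)
    # Pooled (weighted) risk difference across tables.
    pooled = sum(w * d for w, d in zip(weights, diffs)) / sum(weights)
    stat = sum(w * (d - pooled) ** 2 for w, d in zip(weights, diffs))
    return stat, len(tables) - 1
```

The `ValueError` branch hints at why sparse tables are troublesome: the weights depend on estimated variances that become unstable or degenerate when cell counts are small.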

11.
Evidence of group matching frequently takes the form of a nonsignificant test of statistical difference. Theoretical hypotheses of no difference are also tested in this way. These practices are flawed in that null hypothesis statistical testing provides evidence against the null hypothesis and failing to reject H₀ is not evidence supportive of it. Tests of statistical equivalence are needed. This article corrects the inferential confidence interval (ICI) reduction factor introduced by W. W. Tryon (2001) and uses it to extend his discussion of statistical equivalence. This method is shown to be algebraically equivalent to D. J. Schuirmann's (1987) use of 2 one-sided t tests, a highly regarded and accepted method of testing for statistical equivalence. The ICI method provides an intuitive graphic method for inferring statistical difference as well as equivalence. Trivial difference occurs when a test of difference and a test of equivalence are both passed. Statistical indeterminacy results when both tests are failed. Hybrid confidence intervals are introduced that impose ICI limits on standard confidence intervals. These intervals are recommended as replacements for error bars because they facilitate inferences. (PsycINFO Database Record (c) 2010 APA, all rights reserved)
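
The two one-sided tests procedure mentioned in entry 11 can be sketched as follows. To stay within the Python standard library, this sketch replaces the t distribution with a normal approximation, so it is only reasonable for fairly large samples; it does not implement the ICI method itself. The function name, the equivalence margin `delta`, and the default `alpha` are hypothetical choices.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def tost_equivalence(x, y, delta, alpha=0.05):
    """Two one-sided tests for equivalence of two means (z approximation).

    Equivalence is declared when both one-sided nulls are rejected, i.e.
    the mean difference is shown to lie inside (-delta, +delta).
    Returns (difference, p_value, is_equivalent).
    """
    diff = mean(x) - mean(y)
    se = sqrt(variance(x) / len(x) + variance(y) / len(y))
    norm = NormalDist()
    # H01: diff <= -delta, rejected when (diff + delta) / se is large.
    p_lower = 1.0 - norm.cdf((diff + delta) / se)
    # H02: diff >= +delta, rejected when (diff - delta) / se is small.
    p_upper = norm.cdf((diff - delta) / se)
    # The overall p-value is the larger of the two one-sided p-values.
    p_value = max(p_lower, p_upper)
    return diff, p_value, p_value < alpha
```

Note how the burden of proof is reversed relative to a difference test: a small sample or a narrow margin makes equivalence harder to claim, not easier.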

12.
Likelihood-ratio statistics are proposed to test for heterogeneity in nucleotide substitution rate among regions of a DNA sequence. The tests examine three-sequence phylogenies, and two specific tests are proposed: a test to detect rate heterogeneity among genic regions within a sequence, over all evolutionary lineages; and a test to detect rate heterogeneity among regions in a specific evolutionary lineage. Simulations examine the ability of tests to detect a single region that varies in nucleotide substitution rate relative to the remainder of the sequence. A 50-bp region with a fivefold substitution-rate increase can be detected ≥ 90% of the time when it is found in all three lineages of the phylogeny, and a 50-bp region of fivefold rate increase can be detected with approximately 70% power when it is found in only one evolutionary lineage. Simulation also examines the effect of transition- and transversion-rate differences. The tests are applied to published DNA sequences. While the tests are powerful, significant results can be difficult to interpret biologically.

13.
We propose a new, less costly, design to test the equivalence of digital versus analogue mammography in terms of sensitivity and specificity. Because breast cancer is a rare event among asymptomatic women, the sample size for testing equivalence of sensitivity is larger than that for testing equivalence of specificity. Hence calculations of sample size are based on sensitivity. With the proposed design it is possible to achieve the same power as a completely paired design by increasing the number of less costly analogue mammograms and not giving the more expensive digital mammograms to some randomly selected subjects who are negative on the analogue mammogram. The key idea is that subjects who are negative on the analogue mammogram are unlikely to have cancer and hence contribute less information for estimating sensitivity than subjects who are positive on the analogue mammogram. To ascertain disease state among subjects not biopsied, we propose another analogue mammogram at a later time determined by a natural history model. The design differs from a double sampling design because it compares two imperfect tests instead of combining information from a perfect and imperfect test.

14.
A 1-sided exact test, based on the unconditional distribution of the common Z statistic, is proposed for the hypothesis of equal success probabilities in a 2 × 2 comparative trials contingency table. The need for an exact test is justified by the fact that the Type I error probabilities of the large-sample (normal) test may turn out to be more than twice the nominal significance level. Power comparisons reveal that the new test performs considerably better than Fisher's exact test and, in some cases, is even more powerful than the randomized conditional test. (6 ref) (PsycINFO Database Record (c) 2010 APA, all rights reserved)

15.
Methods are proposed for comparing two diagnostic tests for the same data where a threshold for positive for each test is specified. One method contrasts the diagnostic tests' estimated risks. A second method compares the two tests' kappa coefficients. When thresholds for positive test results are specified a priori, maximum likelihood estimators and their asymptotic variances are derived and test statistics are presented for both case-control and naturalistic methods of sampling. The bootstrap is proposed as a method to assess differences in risk estimators when thresholds for positive test results are chosen by scanning the data. Examples are given to illustrate the methods.

16.
In this study we demonstrate an approach to replacing validated selection tests to which job applicants may have prior access. This approach, labeled construct equivalence, allows for replacing valid tests currently in use with new, experimental tests that have been shown to measure the same constructs. We demonstrated the construct equivalence approach by collecting data from over 2,000 applicants for four different positions in a large petrochemical company. We investigated the equivalence of the experimental and the current tests by using correlational analyses, structural modeling, and analyses of hiring decisions. Results indicated that the experimental and current tests measure the same constructs and that replacing the current tests with the experimental tests would treat ethnic and sex subgroups consistently. Construct equivalence was shown to be a viable approach to test substitution. (PsycINFO Database Record (c) 2010 APA, all rights reserved)

17.
In this era of evidence-based medicine, diagnostic tests cannot escape close scrutiny of their effectiveness. Sensitivity and specificity have until now played a central role in the evaluation of diagnostic tests. These terms are not without their shortcomings when it comes to the characterisation of a test's true worth for patients. Randomised clinical trials are increasingly used for evaluation of medical tests and outlining of strategy. The indirect relationship between test results and health outcome creates additional challenges for designers of such trials.

18.
This article describes the basic evaluation process and test methodology employed when temperature extremes for clothing systems must be considered as part of the U.S. Army's Health Hazard Assessment for material in the development and acquisition process. The goals of the evaluation are to select clothing systems that minimize the hazards of heat strain and to predict the heat strain for persons wearing such clothing. Clothing evaluations begin with biophysical assessments that determine the thermal characteristics (vapor permeability and insulation) for textiles via guarded hot plate tests and for clothing systems via thermal manikin tests. The results from biophysical tests can be used to select the textile and/or clothing with the best thermal characteristics. The data from manikin evaluations also can be used in prediction modeling. Human physiological testing is best done in a controlled laboratory environment, although for realism and user acceptability field trials may also be conducted. Proven test and measurement methods must be employed, and tests must control for confounding variables; subjects serve as their own controls, and test environment and procedures are consistent between trials. The process and test methodology described can be applied to the evaluation of civilian clothing systems as well as to the military systems for which they were developed.

19.
Visual field test results are crucial to the accuracy and efficiency of diagnosing blinding diseases such as glaucoma. Herein, a method of integrating self-organizing neural networks and empirical heuristics is used to perform visual field tests via a dynamic test strategy, which can lead to a reduction in the number of trials in a perimetric test. Experiments performed using clinical test records show that we are able to reduce by 20% to 30% the number of trials per test without much adverse effect on the accuracy of the tests.

20.
We present some practical extensions and applications of a strategy proposed by Thall, Simon and Estey for designing and monitoring single-arm clinical trials with multiple outcomes. We show by application how the strategy may be applied to construct designs for phase IIA activity trials and phase II equivalence trials. We also show how it may be extended to incorporate the use of mixture priors in settings where a Dirichlet distribution does not adequately quantify prior experience, randomized phase II selection trials involving two or more experimental treatments, and trials with group-sequential monitoring for applications involving multiple institutions.


Copyright © 北京勤云科技发展有限公司 (Beijing Qinyun Technology Development Co., Ltd.)  京ICP备09084417号