Similar literature
 Found 20 similar articles (search time: 31 ms)
1.
The purpose of this study was to determine how well men recall reproductive information. By using a questionnaire, the authors surveyed men who had undergone orchiopexy for undescended testes and a group of matched control men, all of whom had had surgery at the Children's Hospital of Pittsburgh in Pittsburgh, Pennsylvania (n = 77), and their spouses. Subjects were a random subset of a larger (n = 1,212) male fertility study, which has been ongoing since 1992. In 1994, the spouses of men who participated in the study completed a short telephone survey that contained questions previously asked of their partners. Pearson correlations and kappa statistics were calculated to evaluate the accuracy of male recall of reproductive information. For the continuous measures, such as time to conception and frequency of intercourse, the correlations were high to moderate (r = 0.84 (p < 0.001) and r = 0.45 (p < 0.001), respectively). Agreement between the men and their spouses on the majority of dichotomous (yes/no) questions, such as those concerning the use of birth control, as measured by the kappa statistic, was moderate to very good (K ranged from 0.14 to 0.69). Statistics were similar for formerly cryptorchid and control men. Male participants' responses to questions about their reproductive histories were accurate as compared with the responses given by their spouses. In this sample from a large cohort study, men appeared to recall reproductive information with acceptable accuracy.
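
As an illustration of the kappa statistic used for the yes/no questions above, the following is a minimal sketch of Cohen's kappa for two paired sets of answers (for example, a man and his spouse). The function name `cohen_kappa` and the toy answers are hypothetical and are not taken from the study.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two paired lists of categorical answers.

    Hypothetical helper for illustration; not the study's own code.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters need the same non-zero number of answers")
    n = len(rater_a)
    # Observed agreement: share of pairs with identical answers.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the two raters' marginal proportions,
    # summed over categories.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


# Toy data (invented): four paired yes/no answers.
men = ["yes", "yes", "no", "no"]
spouses = ["yes", "no", "no", "no"]
print(cohen_kappa(men, spouses))
```

Kappa rescales raw percent agreement by the agreement expected from the marginal frequencies alone, which is why it can be much lower than percent agreement when one answer dominates.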

2.
The European Community Respiratory Health Survey (ECRHS), a multinational survey, assesses and compares the prevalence of asthma among subjects, aged 20 to 44, in several European areas. In Spain, some participating centers have used mail and telephone as methods of questionnaire administration. The objective of the present study was to determine whether the validity and reliability of the questionnaire differed by method of administration. Reliability of the questionnaire was measured with the kappa index and the odds ratio of agreement, and validity with the sensitivity and specificity. This study found differences in the reliability of the questionnaires, although these differences were more related to the questions themselves than to the method of administration. Among men, but not women, mailed questionnaires were more sensitive and telephone questionnaires more specific. We hypothesize that these differences in validity arose from self-selection: subjects with more severe symptoms replied earlier and therefore answered the mailed questionnaire. Combining different methods of administration was useful as it increased participation and was an adequate procedure to obtain information of good quality.

3.
A questionnaire for farm managers was designed to obtain information regarding biosecurity on Ontario commercial broiler chicken and turkey operations, and was then pre-tested. The questions that could be validated were verifiable by seeing the facility, by using farm records or by interviewing technical personnel other than the survey respondent. The survey was validated using a convenience sample of 24 farms from two companies. For 15 questions with dichotomous responses, the sensitivity ranged from 16.7 to 100%; the specificity ranged from 0 to 100%. For example, fences and gates seen during the farm visit were not accurately reported on the survey (poor sensitivity). Chance-corrected agreement was low (kappa < 0.4) for 34 questions, fair to good (0.4 < kappa < 0.8) for 25 questions, and excellent (kappa > 0.8) for seven questions. The percent agreement for questions where only one of the possible options was observed on validation ranged from 60.9 to 100%. Five questions with continuous numeric variables were analysed. A difference was observed (P < 0.1) between the survey and validation data for three questions regarding the number of birds, the bird sources and the downtime between flocks. In spite of pre-testing, the lack of clear wording and the absence of definitions for technical terms appeared to reduce validity. Response bias seems to be an issue with biosecurity surveys. The value of validating questionnaires before their use in epidemiologic research is confirmed.

4.
The authors report the inter-interviewer reliability of two brief questionnaires developed to measure the effects of innovations in methadone maintenance. The instruments were designed to answer the research questions, but to intrude only minimally into the clinical assessment and treatment processes. The Initial Interview, completed at the time of admission, yielded information on 23 variables, and the Followup Interview, completed as soon as possible after the first anniversary of admission, yielded information on 20 variables. To assess reliability, a repeat interview was conducted by a different interviewer immediately after the first interview was completed. Repeat interviews were conducted with 19 subjects who completed the Initial Interview and 30 who completed the Followup Interview. Exact agreement was found in all the pairs of responses from the Initial Interview for 5 of the 6 categorical variables and 6 of the 17 quantitative variables. For the remaining 11 quantitative variables, the intraclass correlation coefficients ranged from .700 to .999. Exact agreement was found in all pairs of responses from the Followup Interview for 2 of the 4 categorical variables and 8 of the 16 quantitative variables. For each of the remaining categorical variables, the kappa statistic was significant (.73 and .49). For the remaining 8 quantitative variables, the intraclass correlation coefficients ranged from .750 to .999. The findings signify satisfactory interviewer reliability of the instruments. These brief instruments could easily be adapted for use in other treatment evaluation studies where brevity in data collection is considered desirable.

5.
OBJECTIVES: To investigate the validity and reproducibility of a method of morphometric assessment of enamel demineralisation. METHODS: An in vitro investigation was carried out on 22 human teeth. One investigator coated the crowns of the teeth with an acid-resistant varnish, leaving a small window on the buccal surface. This was incrementally occluded by varnish over a 14-day period, during which the teeth were placed in a demineralising gel at pH 4.5. After varnish removal, a second investigator blindly quantitated the demineralised area by three methods of examination: direct visual, microscopic and from photographs. The microscopic and photographic measurements were carried out using a morphometric assessment with a 121-dot array. Photographs and assessments were repeated after 1 week. The readings were analysed using the kappa statistic, the limits of agreement and the coefficient of repeatability. RESULTS: Photographic assessments demonstrated excellent agreement for grid positioning (kappa > 0.81) and substantial agreement for reading reproducibility (kappa = 0.61-0.80). The coefficients of repeatability were found to be the same for repeat readings of the same slide and the repeated slides (5.0 mm²). They were higher for the microscopic technique (6.8 mm²) and for the direct visual technique (7.8 mm²). The limits of agreement are presented graphically. CONCLUSIONS: The photographic technique used was a reproducible method of measuring artificial enamel demineralisation. Measurement from photographs was more reproducible than direct measurement with the naked eye. The subjectivity of the index accounts for most of the variation, and more objective means of assessing enamel demineralisation need to be found.
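
The limits of agreement mentioned above can be sketched as follows, assuming the common Bland-Altman form (mean difference plus or minus 1.96 sample standard deviations of the paired differences). The abstract does not state the exact formula used, so this form, the function name `limits_of_agreement` and the toy readings are assumptions for illustration only.

```python
import statistics


def limits_of_agreement(first, second, z=1.96):
    """Return (lower, upper) limits of agreement for paired readings.

    Assumed Bland-Altman form: mean(d) +/- z * sd(d), where d is the
    list of paired differences. Hypothetical helper, not the study's code.
    """
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need at least two paired readings")
    diffs = [a - b for a, b in zip(first, second)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation
    return mean_diff - z * sd_diff, mean_diff + z * sd_diff


# Toy data (invented): areas in mm² read twice, one week apart.
week_0 = [10.0, 12.0, 14.0, 16.0]
week_1 = [9.0, 13.0, 13.0, 17.0]
print(limits_of_agreement(week_0, week_1))
```

A narrow interval centred near zero indicates that repeat readings rarely differ by much, which is the sense of "reproducible" in the results above.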

6.
OBJECTIVES: To describe a new severity of illness index for inflammatory skin disease called the Dermatology Index of Disease Severity (DIDS), and to show its preliminary use and reliability in staging disease in patients with psoriasis and dermatitis. DESIGN: Interobserver rating study using the DIDS with as many as 10 observers independently rating the same patient at a single point in time. SETTING: Ambulatory care clinics at an academic medical center with patients from various socioeconomic backgrounds. PATIENTS: Thirty-four patients with psoriasis and 15 patients with dermatitis were included in the study. MAIN OUTCOME MEASURES: The severity of illness for each patient was rated as 1 of 5 stages: 0, no evidence of clinical disease; I, limited disease; II, mild disease; III, moderate disease; and IV, severe disease. The degree of interobserver concordance was measured by the Cohen kappa statistic. RESULTS: All 5 stages were represented in the study of patients with psoriasis. The overall kappa statistic was 0.76, which is defined as substantial interobserver concordance. The use of the instrument in dermatitis showed good consensus in staging, where the kappa statistic was 0.41. CONCLUSION: We introduce an easy and efficient instrument for staging the severity of illness in inflammatory cutaneous diseases. The reliability of the DIDS is demonstrated in patients with psoriasis and in patients with dermatitis.

7.
This paper reviews data in the literature on quality of life and its assessment in chronic obstructive lung disease and bronchial asthma. The authors mention the most frequently used types of questionnaires and the results achieved when using them. General questionnaires include the Sickness Impact Profile and the Short Form-36 (SF-36), the short version of a very detailed questionnaire, which has 36 questions with sub-questions. Specific questionnaires are focused on certain questions concerning different diseases. These questionnaires include the SGRQ (St. George's Respiratory Questionnaire), which is used mainly in chronic obstructive lung disease. The CRQ (Chronic Respiratory Questionnaire) was also developed for this disease, but its section on dyspnoea is not standardized. For evaluation of the quality of life of asthmatic patients several questionnaires exist, in particular for children. Several questions call for further standardization. The value of questionnaires is, however, beyond doubt. They elucidate aspects of the patient's situation that do not emerge even from detailed functional examination of the lungs or immunological examination. They appraise the patient's physical and mental functions, restriction of activity, sense of comfort and general evaluation of health. Thus "classical" evaluation methods are extended by what are, for now, non-traditional ways of appraising diseases which have a high prevalence and thus also a great impact in the population.

8.
This study describes the extent of agreement in classification of chest radiographs using the International Labor Organization (ILO) classification among six readers from the United States and Canada. A set of 119 radiographs was created and read by three Canadian and three US readers. The two ratings of interest were profusion (scored from 0/- to 3/+) and pleural abnormalities consistent with pneumoconiosis (scored with the ILO system, then collapsed into a yes/no). We used a number of approaches to evaluate interreader agreement on profusion and pleural changes, determining concordance, observed agreement, kappa statistic, and a new measure to approximate sensitivity and specificity. This study found that five of six readers had fair to good agreement for pleural findings and for profusion as a dichotomous variable (≥ 1/0 vs ≤ 0/1) using the kappa statistic, while a sixth reader had poor agreement. We found that concordance, expressed as percent agreement, was higher for normal radiographs than for ones that showed disease, and describe the use of the kappa statistic to control for this finding. This analysis adds to the existing literature with the use of the kappa statistic, and by presenting a new measure for "underreading" and "overreading" tendencies.

9.
To investigate the reliability of nominal scales, Kraemer proposed a measurement model from which kappa coefficients could be derived. More recently she suggested a matrix of coefficients as a comprehensive summary of reliability, contrasting this with use of a single summary kappa statistic. The main diagonal of the matrix consists of binary kappa coefficients for each category which measure the reliability of each category relative to all others, while the off-diagonal elements are correlation coefficients for pairs of categories. The off-diagonal elements were suggested as measures of confusion between categories. Schouten also suggested coefficients to assess confusion between pairs of categories, which might be used as alternative off-diagonal elements in a summary matrix. The two types of off-diagonal element will be compared. It will be shown that Schouten's coefficients can be expressed in terms of the parameters of Kraemer's measurement model and that they are more easily interpreted as measures of confusion. First, the maximum value for Schouten's coefficient is one. Secondly, for any pair of categories, Schouten's coefficient equals the proportionate reduction in the probability of classifying a subject in one category of the pair having previously classified them in the other. Thirdly, where the coefficient for a pair of categories is less than the summary kappa statistic, it will be shown that combining these two categories will increase the value of the summary kappa statistic. The methods of analysis are applied to data from a study of the reliability of psychiatric diagnosis and used to identify pairs of classifications between which there is substantial confusion.

10.
The outcome at age 2 years of preterm babies recruited into a three-arm randomised controlled trial of prophylactic volume expansion was ascertained in two ways: from a neurodevelopmental assessment performed by a paediatrician and from responses on a brief questionnaire completed by the child's personal health visitor. Of 776 babies recruited into the trial, 604 survived to the age of 2 years and the findings of a paediatric assessment were available for all survivors. Questionnaires were sent to the health visitors of 601 of the survivors; 513 (85.4%) were returned. There was sufficient information on the returned questionnaires to categorise 449 of the children as normal, impaired, moderately disabled or severely disabled. We were unable to detect a response bias by severity of disability. Agreement on individual questions ranged between 86.3% and 98.4%. There was some mismatch in the reporting of vision (weighted kappa = 0.71) and hearing (weighted kappa = 0.73), with differences in perception of level of severity of sensory loss. Health visitors tended to underestimate the child's functional level compared with the paediatrician. However, of 56 children classified as severely disabled by the paediatrician, 48 were classified similarly and eight as moderately disabled on the basis of the questionnaire. The end point of the trial was death or severe disability at 2 years of age. There was close similarity in the trial results whether based on the paediatric assessment or on the questionnaire. Further refinement of the questionnaire is needed, but this methodology may be useful in ascertaining the frequency of severe disability in large cohorts of babies.
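
For ordered categories such as the severity levels above, a weighted kappa gives partial credit to near-misses. The sketch below assumes linear weights, which the abstract does not specify; the function name `weighted_kappa` and the toy ratings are hypothetical.

```python
def weighted_kappa(rater_a, rater_b, n_categories):
    """Linearly weighted kappa for ordinal ratings coded 0..n_categories-1.

    Linear weights are an assumption; other schemes (e.g. quadratic) exist.
    Hypothetical helper for illustration, not the trial's own code.
    """
    n = len(rater_a)
    if n == 0 or n != len(rater_b) or n_categories < 2:
        raise ValueError("need paired ratings and at least two categories")
    k = n_categories

    def weight(i, j):
        # 1 for exact agreement, falling linearly to 0 at maximal distance.
        return 1.0 - abs(i - j) / (k - 1)

    # Weighted observed agreement across the rated subjects.
    observed = sum(weight(a, b) for a, b in zip(rater_a, rater_b)) / n
    # Weighted chance agreement from each rater's marginal proportions.
    p_a = [rater_a.count(c) / n for c in range(k)]
    p_b = [rater_b.count(c) / n for c in range(k)]
    expected = sum(
        weight(i, j) * p_a[i] * p_b[j] for i in range(k) for j in range(k)
    )
    if expected == 1.0:
        raise ValueError("weighted kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


# Toy data (invented): 0 = normal, 1 = moderate loss, 2 = severe loss.
paediatrician = [0, 1, 2, 2]
health_visitor = [0, 1, 2, 1]
print(weighted_kappa(paediatrician, health_visitor, 3))
```

With weights of this kind, a disagreement between adjacent severity levels lowers the statistic less than a disagreement between the extremes.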

11.
BACKGROUND: To assess the overuse and underuse of medical procedures, various methods have been developed, but their reproducibility has not been evaluated. This study estimates the reproducibility of one commonly used method. METHODS: We performed a parallel, three-way replication of the RAND-University of California at Los Angeles appropriateness method as applied to two medical procedures, coronary revascularization and hysterectomy. Three nine-member multidisciplinary panels of experts were composed for each procedure by stratified random sampling from a list of experts nominated by the relevant specialty societies. Each panel independently rated the same set of clinical scenarios in terms of the appropriateness of the relevant procedure on a risk-benefit scale ranging from 1 to 9. Final ratings were used to classify the procedure in each scenario as necessary or not necessary (to evaluate underuse) and inappropriate or not inappropriate (to evaluate overuse). Reproducibility was measured by overall agreement and by the kappa statistic. The criteria for underuse and overuse derived from these ratings were then applied to real populations of patients who had undergone coronary revascularization or hysterectomy. RESULTS: The rates of agreement among the three coronary-revascularization panels were 95, 94, and 96 percent for inappropriate-use scenarios and 93, 92, and 92 percent for necessary-use scenarios. Agreement among the three hysterectomy panels was 88, 70, and 74 percent for inappropriate-use scenarios. Scenarios involving necessary use of hysterectomy were not assessed. The three-way kappa statistic to detect overuse was 0.52 for coronary revascularization and 0.51 for hysterectomy. The three-way kappa statistic to detect underuse of coronary revascularization was 0.83. Application of individual panels' criteria to real populations of patients resulted in a 100 percent variation in the proportion of cases classified as inappropriate and a 20 percent variation in the proportion of cases classified as necessary. CONCLUSIONS: The appropriateness method is far from perfect. Appropriateness criteria may be useful in comparing levels of appropriate procedures among populations but should not by themselves be used to direct care for individual patients.

12.
OBJECTIVE: To show that the elderly at risk rating scale (EARRS) satisfies the requirements of an assessment tool for routine health checks in people over 75 and would also be suitable as a method of collecting epidemiological data on the needs of the elderly in a locality. DESIGN: Development and validation of a questionnaire based on a modification of the Winchester rating scale, by a series of prospective, comparative studies before the use of the instrument in a community survey. SETTING: Elderly care day hospital and the community. SUBJECTS: Elderly patients referred to an elderly care day hospital; population survey of subjects over 75 living at home. MAIN OUTCOME MEASURES: Reliability of responses using the kappa statistic; comparison of the scale with the Barthel index of daily living. RESULTS: EARRS has satisfactory validity and reliability when repeated by the same observer or a different observer, with a mean weighted kappa score above 0.80 in both instances. As a measure of disability in the community, it is better than the Barthel score in that it avoids the ceiling effect. The score is correlated with age, social situation, and receipt of support services, and individual questions scale appropriately to adverse outcomes. CONCLUSION: The EARRS satisfies the requirements of an assessment tool for health checks in the elderly. It is suitable for both population surveys and routine practice in primary care, has proved popular with practice nurses, and is easy to complete.

13.
This study assesses the reliability of a self-reported health questionnaire completed by 413 subjects aged 25-74 yr in the Erie County Periodontal Disease (ECPD) Study. Specific questions on general and oral health conditions were completed by each subject during a first visit and at a follow-up examination 2 yr later, and the two sets of responses were compared. Results showed that the overall measure of agreement between the two visits was substantial (average kappa = 0.80). Variation by gender and age was minimal. Questions regarding allergy to medications, oral treatment, reason for tooth extraction, health symptoms and history of systemic diseases exhibited high levels of agreement (kappa ranged from 0.71 to 0.90). Information on vitamin and mineral intake yielded kappa = 0.63. Oral conditions scored the lowest but were still acceptable (kappa = 0.57). These findings indicate that there were no significant discrepancies in self-reported responses to the health questionnaire used in the ECPD Study. Although the information provided by the subject may not be as accurate as laboratory testing, it is nevertheless a reliable source of information which can be utilized cost-effectively in research studies.

14.
BACKGROUND: The aim of this study was to examine some psychometric properties of a new questionnaire measuring patients' satisfaction with the quality of care during a stay in a rehabilitation unit. The instrument (called SAT-16) is composed of 16 four-level items and 2 open-ended questions. The construct validity of the 16-item section was already demonstrated in a previous study based on factorial analysis. In this study the concurrent validity, further aspects of the construct validity and test-retest reliability were analyzed. METHODS: The SAT-16 was administered to 339 inpatients, admitted consecutively to a Rehabilitation Center. RESULTS: 262 questionnaires (77%) were returned, of which 221 had all items filled in. The SAT-16 correlated well with two other measures of satisfaction (CSQ-8 and global satisfaction regarding the hospital stay). The answers to the two open-ended questions proved consistent with those to the 16 closed-ended questions. The high values for the indices of test-retest reliability (ICC and kappa) are evidence of the stability of the scores in two repeated administrations. CONCLUSIONS: The SAT-16 was found to have good psychometric characteristics. It can be proposed as a valid instrument for use in clinical practice for the continuous quality improvement of inpatient medical rehabilitation programmes.

15.
BACKGROUND AND OBJECTIVES: The objective was to assess reliability of self-reported sexual histories among sexually transmitted disease clinic attendees who enrolled in a study in 1994. GOAL OF THIS STUDY: Knowledge about the reliability of sexual data is important to decide whether these measures of sexual behavior can be used in epidemiologic studies of sexually transmitted diseases. STUDY DESIGN: In 288 attendees, degree of agreement was assessed in responses to an identical set of sexual questions asked independently by a medical doctor and a public health nurse and in responses made by members of the same couple (n = 50) to a public health nurse. RESULTS: In the test-retest comparison, high agreement was found for most questions: kappa-values and exact agreement ranged from 0.73 to 0.96 and 54% to 99%, respectively. Participants interviewed by the medical doctor reported significantly lower numbers of partners and a higher age at first intercourse. Stratified analyses showed variability in agreement across subgroups. Most consistently, women provided more reliable reports than men. In the comparison of couples, substantial agreement was found for the municipality where they met (88% agreement; kappa = 0.72) and contraceptive method (87% agreement; kappa = 0.60), but only moderate agreement was found for frequency of sexual intercourse (26% agreement; kappa = 0.50). CONCLUSION: The authors conclude that data on sexual behavior can be collected reliably among sexually transmitted disease clinic attendees, although reporting bias does occur. The frequency of sexual intercourse was not sufficiently reliable and should be interpreted as an estimate only.

16.
BACKGROUND: There is currently no validated questionnaire that assesses both the presence and severity of dyspepsia. AIM: To develop the Leeds Dyspepsia Questionnaire (LDQ) as a measure of the presence and severity of dyspepsia, and to assess the validity, reliability and responsiveness of this instrument. METHODS: Unselected patients attending either a hospital dyspepsia clinic or a general practice surgery were interviewed by a trained gastroenterologist or a general practitioner on the presence and severity of dyspepsia. This opinion was compared with the results of the nurse-administered LDQ. Test-retest reliability was assessed by the same research nurse re-administering the LDQ 4-7 days after the initial visit in a subgroup of hospital patients. In a further subgroup of patients one researcher interviewed the patients and a second researcher re-administered the LDQ within 30 min to evaluate inter-rater reliability. The responsiveness of the LDQ was measured by repeating it in patients with endoscopically proven peptic ulcer or oesophagitis 1 month after receiving appropriate therapy. RESULTS: The LDQ was administered to 99 general practice and 215 hospital patients. In the GP population 41/98 (42%) had dyspepsia according to the GP and the LDQ had a sensitivity of 80% (95% CI: 65-91%) and a specificity of 79% (95% CI: 66-89%). The weighted kappa statistic for the agreement between the LDQ and the clinician for the severity of dyspepsia was 0.58 in the GP population and 0.49 in hospital patients. The kappa statistic for test-retest reliability was 0.83 in 107 patients. The LDQ had excellent inter-rater reliability with a kappa statistic of 0.90 in 42 patients. The median LDQ score fell from 22.5 (range 9-36) to 4.5 (range 0-27) in 12 patients 1 month after receiving appropriate therapy (Wilcoxon signed rank test, P < 0.0001). CONCLUSION: The LDQ is a valid, reliable and responsive instrument for measuring the presence and severity of dyspepsia.
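
Sensitivity and specificity of a questionnaire against a clinician's opinion, as reported above, come from a 2×2 table. The sketch below uses invented counts, not the LDQ data, and the function name `sensitivity_specificity` is hypothetical.

```python
def sensitivity_specificity(true_pos, false_neg, true_neg, false_pos):
    """Return (sensitivity, specificity) from 2x2 table counts.

    The reference standard (here, the clinician's opinion) defines who is
    truly positive or negative. Hypothetical helper for illustration.
    """
    if true_pos + false_neg == 0 or true_neg + false_pos == 0:
        raise ValueError("need at least one reference-positive and one reference-negative")
    # Sensitivity: share of reference-positive patients the questionnaire flags.
    sensitivity = true_pos / (true_pos + false_neg)
    # Specificity: share of reference-negative patients the questionnaire clears.
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Toy counts (invented): 10 reference-positive and 20 reference-negative patients.
print(sensitivity_specificity(true_pos=8, false_neg=2, true_neg=15, false_pos=5))
```

Unlike kappa, these two measures treat one rating as the reference standard, which is why they are reported as validity rather than reliability.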

17.
The reliability of information about mothers' and fathers' education, weight and height at birth, history of diarrhoea, duration of exclusive breast feeding and age of introduction of cows' milk products, selected from a structured questionnaire used in home interviews, was examined in a sample of 38 cases and 38 controls from a study related to the risk factors of insulin-dependent diabetes mellitus. The questions were repeated by telephone. The agreement between the answers from both interviews was verified using the kappa statistic (categorical variables) and the intra-class correlation coefficient (quantitative variables). The results enable one to conclude that the information is reproducible.

18.
PURPOSE: To assess the use of the Nidek 3Dx simultaneous stereophotography camera in diabetic patients, comparing the detection of clinically significant macular edema by fundus biomicroscopy to detection by the Nidek 3Dx simultaneous fundus stereophotograph. METHODS: Two hundred eight eyes of 123 diabetic patients at the Wilmer Retinal Vascular Center were examined for this prospective study between August 1993 and October 1993. Each patient was examined by one of three retina specialists by contact lens biomicroscopy for clinically significant macular edema and foveal center thickening. Nidek 3Dx fundus stereophotographs were obtained and graded independently for clinically significant macular edema and foveal center thickening by a fourth ophthalmologist masked from the clinical examination findings. Percent agreement, kappa statistic, and weighted kappa statistic were determined for the two diagnostic methods. RESULTS: One hundred eighty-four (88%) of the 208 stereophotographs were of sufficient quality to detect clinically significant macular edema; 175 (84%) of the 208 stereophotographs detected foveal center thickening. The agreement between the clinician and the photographic grading, measured by weighted kappa, was 0.52 for clinically significant macular edema and 0.58 for foveal center thickening, representing fair to good agreement beyond chance. Agreement was improved when normal fundus Nidek stereophotographs were available as standards for comparison. CONCLUSIONS: The Nidek 3Dx camera is suitable for photographic detection of clinically significant macular edema and may have a potential advantage over conventional cameras by achieving good-quality, gradable stereophotographs in a large proportion of photographed eyes.

19.
PURPOSE: To assess the interobserver agreement on the diagnosis and classification of cutaneous melanoma. MATERIALS AND METHODS: A set of 140 slides of cutaneous melanoma, including a small subset of benign pigmented skin lesions, were circulated to four experienced histopathologists. The kappa statistic for multiple ratings per subject was calculated using the method described by Fleiss. RESULTS: The kappa value on the diagnosis of cutaneous melanoma versus benign lesions was 0.61. There was some discordance on the diagnosis in 37 of 140 cases (26%). For the histopathologic classification of cutaneous melanoma, the highest kappa values were attained for Breslow thickness (kappa = 0.76) and presence of ulceration (kappa = 0.87). The agreement was generally poor for other histologic features, such as level of dermal invasion (kappa = 0.38), presence of regression (kappa = 0.27), and lymphocytic infiltration (kappa = 0.27). CONCLUSION: Our study suggests considerable disagreement among pathologists on the diagnosis of melanoma versus other pigmented lesions. Tumor thickness and presence of ulceration are the most reproducible histologic features of cutaneous melanoma.
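
When more than two raters assess each subject, as with the four histopathologists above, a multi-rater kappa in the style of Fleiss can be computed from per-subject category counts. The sketch below implements the standard Fleiss formula; the function name `fleiss_kappa` and the toy counts are hypothetical, and whether the study used exactly this variant is an assumption.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa from a subjects x categories table of rating counts.

    counts[i][j] is the number of raters who put subject i in category j;
    every row must sum to the same number of raters (at least 2).
    Hypothetical helper for illustration, not the study's own code.
    """
    n_subjects = len(counts)
    if n_subjects == 0:
        raise ValueError("need at least one subject")
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("every subject needs the same number (>= 2) of ratings")
    n_categories = len(counts[0])
    # Per-subject agreement: share of rater pairs that agree on that subject.
    per_subject = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    mean_agreement = sum(per_subject) / n_subjects
    # Chance agreement from the overall share of ratings in each category.
    shares = [
        sum(row[j] for row in counts) / (n_subjects * n_raters)
        for j in range(n_categories)
    ]
    chance = sum(p * p for p in shares)
    if chance == 1.0:
        raise ValueError("kappa is undefined when all ratings fall in one category")
    return (mean_agreement - chance) / (1.0 - chance)


# Toy data (invented): 3 slides, 4 raters, categories [melanoma, benign].
print(fleiss_kappa([[4, 0], [3, 1], [0, 4]]))
```

Each row records only how many raters chose each category, so the raters need not be the same individuals for every slide.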

20.
AIM: Responses to respiratory questionnaires are often used to identify individuals with asthma symptoms and may also be used to identify asymptomatic individuals. This study investigates the repeat responses over four years to such a questionnaire in a population of adult New Zealanders. METHODS: Seven hundred and twenty-three asthmatics in three areas of New Zealand were sent two almost identical questionnaires, approximately four years apart. All of them had answered yes to at least one of the three questions under study in the first survey. RESULTS: Following the second asthma questionnaire only 487 (67.4%) answered yes to at least one of the survey questions. Similarly, 51.1% of those who had reported having nocturnal shortness of breath in the first survey did so in the second survey, 69.9% of those who reported having had an asthma attack in the first survey did so in the second survey, and finally 74.8% of those who reported using asthma medication in the first survey did so in the second survey. CONCLUSION: Even in a previously identified symptomatic asthmatic group, a large proportion did not report respiratory symptoms and asthma medication use four years later. This implies that the true prevalence pool of susceptibles is likely to be far greater than is identified in surveys of the 12-month period prevalence of asthma symptoms. This has implications not only for the design of epidemiological studies (e.g., it poses problems for the selection of a control group of non-asthmatics in prevalence case-control studies), but also for the planning of health services and educational programmes for people with asthma.


Copyright © Beijing Qinyun Technology Development Co., Ltd. (北京勤云科技发展有限公司)  京ICP备09084417号