Overview of the epidemiology methods and applications: strengths and limitations of observational study designs.
The impact of study design on the results of medical research has long been the subject of substantial debate and of a smaller body of empirical research. Examples come from many disciplines within clinical and public health research. Among the early major contributions in the 1970s was work by Mosteller and colleagues (Gilbert et al., 1977), who noted that innovations in surgery and anesthesia showed greater gains over standard therapy in nonrandomized, controlled trials than in randomized, controlled trials. More recently, we and others have evaluated the impact of design in medical and surgical research and concluded that, in nonrandomized trials, the mean gain of new therapies over established therapies was biased by study design (Colditz et al., 1989; Miller et al., 1989). Benson and Hartz (2000) restricted their comparison to studies reported after 1985. On the basis of 136 reports of 19 diverse treatments, they concluded that in only 2 of the 19 analyses did the combined data from the observational studies lie outside the 95% confidence interval for the combined data from the randomized trials. A similar study drew only on data reported from 1991 to 1995 and showed remarkably similar results among observational studies and randomized, controlled trials (Concato et al., 2000). These more recent data suggest that advances in study design and analytic methods may reduce bias in some evaluations of medical and public health interventions. Such methods apply not only to the original studies, but also to the approaches used to combine results quantitatively, such as random effects meta-regression and Bayesian meta-analysis (Normand, 1999).
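The kind of comparison made by Benson and Hartz can be illustrated with a minimal sketch: pool each set of studies with a random effects model, then ask whether the pooled observational estimate falls inside the 95% confidence interval of the pooled randomized estimate. The DerSimonian-Laird estimator is used here as one common random effects method; it is an assumption of this sketch, not necessarily the method used in the cited studies, and all study values below are hypothetical.

```python
import math

def random_effects_pool(estimates, variances):
    """DerSimonian-Laird random effects pooling.

    estimates: study effects on an additive scale (e.g. log relative risks)
    variances: the corresponding within-study variances
    Returns (pooled estimate, standard error, between-study variance tau^2).
    """
    k = len(estimates)
    w = [1.0 / v for v in variances]  # inverse-variance (fixed effect) weights
    fixed = sum(wi * y for wi, y in zip(w, estimates)) / sum(w)
    # Cochran's Q measures heterogeneity around the fixed effect estimate.
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, estimates))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random effects weights add tau^2 to each within-study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * y for wi, y in zip(w_star, estimates)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2

def ci95(estimate, se):
    """Normal-approximation 95% confidence interval."""
    return estimate - 1.96 * se, estimate + 1.96 * se

# Hypothetical log relative risks and variances, for illustration only.
rct_log_rr = [-0.25, -0.10, -0.30]
rct_var = [0.02, 0.03, 0.025]
obs_log_rr = [-0.35, -0.20, -0.28, -0.15]
obs_var = [0.01, 0.015, 0.02, 0.012]

rct_est, rct_se, rct_tau2 = random_effects_pool(rct_log_rr, rct_var)
obs_est, obs_se, obs_tau2 = random_effects_pool(obs_log_rr, obs_var)
rct_lo, rct_hi = ci95(rct_est, rct_se)

# Does the pooled observational estimate lie inside the randomized-trial CI?
obs_inside_rct_ci = rct_lo <= obs_est <= rct_hi
print(f"RCT pooled RR: {math.exp(rct_est):.2f} "
      f"(95% CI {math.exp(rct_lo):.2f} to {math.exp(rct_hi):.2f})")
print(f"Observational pooled RR: {math.exp(obs_est):.2f}; "
      f"inside RCT interval: {obs_inside_rct_ci}")
```

Pooling is done on the log scale because relative risks are approximately normally distributed only after a log transformation; the results are exponentiated for display.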
With thorough data analysis, design issues can be understood, and their impact or bias can be estimated, on average, and then ideally accounted for in the interpretation of data. Before discussing dietary data, let us first consider some of the more clearly delineated preventive exposures. Issues of study design have been addressed in terms of combining randomized trials and observational studies in evaluating preventive interventions such as Bacillus Calmette-Guérin vaccination (Colditz et al., 1994) and mammography screening (Demissie et al., 1998). When one is interpreting the apparent heterogeneity in the results, it is important to step back and ask what relationship is being evaluated under each study design. For example, a randomized, controlled trial uses the intention-to-treat analysis to preserve the merit of randomization. Such an analysis does not evaluate the exposure-disease relationship, but rather examines the impact of offering a new therapy versus an alternative therapy (regardless of adherence to the intervention, control, or placebo). On the other hand, a case-control study or a prospective cohort study of a screening test will evaluate its impact among those participants who were screened as compared with those who were never screened. In prevention studies, the design raises major issues of the timing of the exposure in the natural history of disease and of the adherence to therapy by healthy research volunteers. Case-control studies of preventive interventions such as screening mammography and prospective population-based studies of Pap smears have capitalized on the variation in time since the last screen to evaluate the protective interval for a screening test (IARC Work Group, 1986).
In contrast, a trial must choose a level of exposure, such as annual mammography screening or colonoscopy every 10 years, regardless of the evolving evidence on the duration of protection after a negative screening test. Continuing with the mammography example, a detailed study by Demissie and colleagues (1998) combined data from seven randomized trials and six case-control studies that investigated the association between participation in breast cancer screening programs and breast cancer mortality. The authors showed that if one assumes that noncompliance with mammography approaches 30% and that 20% of the control group is screened, then, after adjusting for nonadherence, the benefit of mammography in terms of reduced mortality is comparable in randomized, controlled trials and epidemiologic studies (Demissie et al., 1998). Thus, the different designs fundamentally measure different constructs of the impact of screening. Zelen (1988) considered the challenges of primary prevention trials and addressed both compliance and models of carcinogenesis as major impediments to the use of randomized, controlled trials to evaluate cancer prevention strategies. It is important to contrast these issues in treatment trials and prevention trials. In treatment trials, recently diagnosed patients, who are often in life-threatening situations, are typically offered the option to participate in a trial of a new therapy compared with standard therapy or placebo. Compliance or adherence to therapy is usually very high among these highly motivated patients, and their outcomes generally occur in the short to mid term. In contrast, prevention trials recruit large numbers of healthy participants, offer them a therapy, and then follow them over many years because the chronic diseases being prevented are relatively rare.
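The effect of nonadherence on an intention-to-treat comparison, as in the mammography adjustment described above, can be worked through with a minimal sketch. It assumes that screening changes risk only in people who are actually screened and that nonadherence is unrelated to baseline risk; both are simplifying assumptions of this illustration. The 30% noncompliance and 20% control-group screening figures come from the text, whereas the relative risk of 0.70 among the screened is a hypothetical value chosen for the example, and the function name is likewise illustrative.

```python
def itt_relative_risk(rr_screened, noncompliance, control_screened):
    """Relative risk an intention-to-treat analysis would observe.

    rr_screened: relative risk among people actually screened (hypothetical input)
    noncompliance: fraction of the intervention arm that is never screened
    control_screened: fraction of the control arm that gets screened anyway
    Risks are expressed relative to an unscreened person's risk (set to 1.0).
    """
    # Each arm is a mixture of screened and unscreened participants.
    risk_intervention = (1 - noncompliance) * rr_screened + noncompliance * 1.0
    risk_control = control_screened * rr_screened + (1 - control_screened) * 1.0
    return risk_intervention / risk_control

# 30% noncompliance and 20% screening among controls, as in the text;
# rr_screened = 0.70 is a hypothetical benefit among those screened.
observed = itt_relative_risk(rr_screened=0.70, noncompliance=0.30,
                             control_screened=0.20)
print(f"Relative risk among the screened: 0.70")
print(f"Relative risk seen by intention to treat: {observed:.2f}")
```

Under these assumptions a 30% reduction among those screened appears as a reduction of roughly 16% in the intention-to-treat comparison, which illustrates why unadjusted trial estimates and estimates comparing screened with unscreened participants can differ even when the underlying benefit is the same.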
With substantial noncompliance (often in the range of 20% to 40% over the duration of the trial), an intention-to-treat analysis no longer gives an unbiased estimate of the effect of the exposure; it typically underestimates the magnitude of the association that is seen in observational studies, in which those participants who have had exposure to a particular lifestyle component are compared with those without such an exposure. There are additional challenges for nutritional interventions, including the timing of diet as a preventive agent in the disease process and the range of nutrient intakes in the population. In retrospective case-control studies, recall bias of past diet is an additional issue with which to contend. Unlike smoking or screening tests, in which the exposure is finite and can be completely stopped and started, one's diet, physical activity, and weight change cannot go to zero for prolonged periods and sustain life. The range of nutrient intake is a major issue when enrolling participants into prevention trials and observational studies. Health-conscious volunteers are more often identified and screened as eligible for a trial. The epidemiology of diet and colon cancer has been extensively studied. For example, Cho et al. (2004) conducted a combined analysis of prospective dietary studies of calcium and vitamin D intake using data from 10 cohorts. The dose-response relationship for calcium showed that the greatest benefit of increasing calcium intake was for those participants who had a reported daily intake below 1000 mg/d. Increasing the intake of those individuals with low intake to the level of 1000 mg/d would yield a 20% reduction in risk. Beyond this level of intake, there was little additional reduction in the risk of colon cancer. In the Women's Health Initiative, participants had a mean calcium intake of 1150 mg/d at baseline and increased this intake in the intervention arm to 2250 mg/d on average.
An increase of this magnitude, starting from this baseline, showed little association with risk in the combined, prospective, cohort studies and was not related to risk in the randomized trial (Wactawski-Wende et al., 2006). Similar findings apply to the interpretation of the vitamin D intervention and highlight the role of dietary intake at randomization when evaluating dietary components. Returning to the time frame of exposure in the carcinogenic process, the null randomized, controlled trials of fiber (Alberts et al., 2000) and fruit and vegetables (Schatzkin et al., 2000) for the prevention of polyp recurrence amply illustrate Zelen's concerns regarding the timing of the preventive intervention in the disease process. The DNA damage accumulated across the colonic mucosa at the time of detecting the "eligibility polyp" was certainly not limited to the removed polyp. Rather, these observations raise the question of the stage in the disease process at which fiber may play a role in protecting against colon cancer. This contrasts with the richness of epidemiologic studies that can address exposure over the life course and relate such exposure to disease risk. Perhaps the best-known example is the radiation effects follow-up cohort in Japan, in which a radiation dose was estimated for each woman who had been exposed to the effects of the atomic bombs on Hiroshima and Nagasaki and who was followed up over 40 years. The results of this study showed a clear and strong relationship between higher exposure and increased risk of breast cancer among those participants who were exposed before the age of 20 years (Land et al., 2003). Retrospective assessment of diet after disease diagnosis has been demonstrated to introduce bias into the evaluation of exposure-disease relationships. For example, Giovannucci et al. (1993) evaluated retrospective recall of diet after breast cancer diagnosis within the ongoing Nurses' Health Study.
In contrast with the prospective analysis, in which no relationship between dietary fat and breast cancer was observed, the retrospective analysis yielded a positive relationship for total fat and saturated fat (Giovannucci et al., 1993). For the top quintile versus the bottom quintile of reported intake, the retrospective assessment yielded odds ratios of 1.43 for total fat and 1.38 for saturated fat. Therefore, the magnitude of bias was sufficient to distort evaluation of the diet-disease relationship.
Department of Surgery, Washington University School of Medicine, St. Louis, Missouri.
Pub Type(s): Journal Article