Tuesday, 01 March 2011 01:48

Options in Study Design


The epidemiologist is interested in relationships between variables, chiefly exposure and outcome variables. Typically, epidemiologists want to ascertain whether the occurrence of disease is related to the presence of a particular agent (exposure) in the population. The ways in which these relationships are studied may vary considerably. One can identify all persons who are exposed to that agent and follow them up to measure the incidence of disease, comparing such incidence with disease occurrence in a suitable unexposed population. Alternatively, one can simply sample from among the exposed and unexposed, without having a complete enumeration of them. Or, as a third alternative, one can identify all people who develop a disease of interest in a defined time period (“cases”) and a suitable group of disease-free individuals (a sample of the source population of cases), and ascertain whether the patterns of exposure differ between the two groups. Follow-up of study participants is one option (in so-called longitudinal studies): in this situation, a time lag exists between the occurrence of exposure and disease onset. One alternative option is a cross-section of the population, where both exposure and disease are measured at the same point in time.

In this article, attention is given to the common study designs—cohort, case-referent (case-control) and cross-sectional. To set the stage for this discussion, consider a large viscose rayon factory in a small town. An investigation into whether carbon disulphide exposure increases the risk of cardiovascular disease is started. The investigation has several design choices, some more and some less obvious. A first strategy is to identify all workers who have been exposed to carbon disulphide and follow them up for cardiovascular mortality.

Cohort Studies

A cohort study encompasses research participants sharing a common event, the exposure. A classical cohort study identifies a defined group of exposed people, and then everyone is followed up and their morbidity and/or mortality experience is registered. Apart from a common qualitative exposure, the cohort should also be defined on other eligibility criteria, such as age range, gender (male or female or both), minimum duration and intensity of exposure, freedom from other exposures, and the like, to enhance the study’s validity and efficiency. At entrance, all cohort members should be free of the disease under study, according to the empirical set of criteria used to measure the disease.

If, for example, in the cohort study on the effects of carbon disulphide on coronary morbidity, coronary heart disease is empirically measured as clinical infarctions, those who, at the baseline, have had a history of coronary infarction must be excluded from the cohort. By contrast, electrocardiographic abnormalities without a history of infarction can be accepted. However, if the appearance of new electrocardiographic changes is the empirical outcome measure, the cohort members should also have normal electrocardiograms at the baseline.

The morbidity (in terms of incidence) or the mortality of an exposed cohort should be compared to a reference cohort which ideally should be as similar as possible to the exposed cohort in all relevant aspects, except for the exposure, to determine the relative risk of illness or death from exposure. Using a similar but unexposed cohort as a provider of the reference experience is preferable to the common (mal)practice of comparing the morbidity or mortality of the exposed cohort to age-standardized national figures, because the general population falls short on fulfilling even the most elementary requirements for comparison validity. The Standardized Morbidity (or Mortality) Ratio (SMR), resulting from such a comparison, usually generates an underestimate of the true risk ratio because of a bias operating in the exposed cohort, leading to the lack of comparability between the two populations. This comparison bias has been named the “Healthy Worker Effect”. However, it is really not a true “effect”, but a bias from negative confounding, which in turn has arisen from health-selective turnover in an employed population. (People with poor health tend to move out from, or never enter, “exposed” cohorts, their end destination often being the unemployed section of the general population.)

Because an “exposed” cohort is defined as having a certain exposure, only effects caused by that single exposure (or mix of exposures) can be studied simultaneously. On the other hand, the cohort design permits the study of several diseases at the same time. One can also study concomitantly different manifestations of the same disease—for example, angina, ECG changes, clinical myocardial infarctions and coronary mortality. While well-suited to test specific hypotheses (e.g., “exposure to carbon disulphide causes coronary heart disease”), a cohort study also provides answers to the more general question: “What diseases are caused by this exposure?”

For example, in a cohort study investigating the risk to foundry workers of dying from lung cancer, the mortality data are obtained from the national register of causes of death. Although the study is designed to determine whether foundry dust causes lung cancer, the data source also gives, with the same effort, information on all other causes of death. Therefore, other possible health risks can be studied at the same time.

The timing of a cohort study can either be retrospective (historical) or prospective (concurrent). In both instances the design structure is the same. A full enumeration of exposed people occurs at some point or period in time, and the outcome is measured for all individuals through a defined end point in time. The difference between prospective and retrospective is in the timing of the study. If retrospective, the end point has already occurred; if prospective, one has to wait for it.

In the retrospective design, the cohort is defined at some point in the past (for example, those exposed on 1 January 1961, or those taking on exposed work between 1961 and 1970). The morbidity and/or mortality of all cohort members is then followed to the present. Although “all” means that those who have left the job must also be traced, in practice 100 per cent coverage can rarely be achieved. However, the more complete the follow-up, the more valid the study.

In the prospective design, the cohort is defined at the present, or during some future period, and the morbidity is then followed into the future.

When doing cohort studies, enough time must be allowed for follow-up so that the end points of concern have sufficient time to manifest. Sometimes historical records are available for only a short period into the past. It is nevertheless desirable to take advantage of this data source, because it shortens the period of prospective follow-up needed before results from the study become available. In these situations, a combination of the retrospective and the prospective cohort study designs can be efficient. The general layout of frequency tables presenting cohort data is shown in table 1.

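Table 1. The general layout of frequency tables presenting cohort data

| Component of disease rate  | Exposed cohort | Unexposed cohort |
|----------------------------|----------------|------------------|
| Cases of illness or death  | c1             | c0               |
| Number of people in cohort | N1             | N0               |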

The observed proportion of diseased in the exposed cohort is calculated as:

$$ \frac{c_1}{N_1} $$

and that of the reference cohort as:

$$ \frac{c_0}{N_0} $$

The rate ratio then is expressed as:

$$ RR = \frac{c_1 / N_1}{c_0 / N_0} $$

N0 and N1 are usually expressed in person-time units instead of as the number of people in the populations. Person-years are computed for each individual separately. Different people often enter the cohort during a period of time, not at the same date. Hence their follow-up times start at different dates. Likewise, after their death, or after the event of interest has occurred, they are no longer “at risk” and should not continue to contribute person-years to the denominator.
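The person-year bookkeeping described above can be sketched in a few lines. This is only an illustration: the function name `person_years`, the three workers and their dates are all hypothetical, and the 365.25-day year is a simplifying assumption.

```python
from datetime import date

def person_years(entry, exit_, event=None):
    """Years at risk from cohort entry until the event or the end of follow-up,
    whichever comes first (365.25-day years, a simplifying assumption)."""
    end = min(exit_, event) if event is not None else exit_
    return max((end - entry).days, 0) / 365.25

# Hypothetical workers: (entry into cohort, end of follow-up, date of event or None)
workers = [
    (date(1961, 1, 1), date(1990, 12, 31), None),               # followed to the end
    (date(1965, 7, 1), date(1990, 12, 31), date(1980, 6, 30)),  # event in 1980
    (date(1970, 3, 15), date(1990, 12, 31), date(1975, 3, 15)), # event in 1975
]

# Each worker contributes time only while still "at risk".
total = sum(person_years(*w) for w in workers)
print(f"Total person-years at risk: {total:.1f}")
```

Summing the individual contributions gives the denominator (N1 or N0) in person-time units.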

If the RR is greater than 1, the morbidity of the exposed cohort is higher than that of the reference cohort, and vice versa. The RR is a point estimate and a confidence interval (CI) should be computed for it. The larger the study, the narrower the confidence interval will become. If RR = 1 is not included in the confidence interval (e.g., the 95% CI is 1.4 to 5.8), the result can be considered as “statistically significant” at the chosen level of probability (in this example, α = 0.05).
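As a rough sketch of how such a point estimate and interval might be computed, the snippet below uses a Wald-type interval on the logarithmic scale for person-time data. That choice of method is an assumption (the text does not prescribe one), and the counts and person-years are hypothetical.

```python
import math

def rate_ratio_ci(c1, n1, c0, n0, z=1.96):
    """Rate ratio with an approximate confidence interval.

    c1, c0: cases in the exposed and reference cohorts
    n1, n0: person-years in the exposed and reference cohorts
    The standard error of ln(RR) is taken as sqrt(1/c1 + 1/c0),
    a common large-sample approximation for person-time data.
    """
    rr = (c1 / n1) / (c0 / n0)
    se_log = math.sqrt(1 / c1 + 1 / c0)
    return rr, rr * math.exp(-z * se_log), rr * math.exp(z * se_log)

# Hypothetical figures for illustration only
rr, lower, upper = rate_ratio_ci(c1=30, n1=10_000, c0=15, n0=12_000)
print(f"RR = {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
print("RR = 1 excluded from the interval:", not (lower <= 1 <= upper))
```

With small numbers of cases, an exact method would be preferable to this approximation.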

If the general population is used as the reference population, c0 is substituted by the “expected” figure, E(c1), derived from the age-standardized morbidity or mortality rates of that population (i.e., the number of cases that would have occurred in the cohort, had the exposure of interest not taken place). This yields the Standardized Mortality (or Morbidity) Ratio, SMR. Thus,

$$ SMR = \frac{c_1}{E(c_1)} \times 100 $$

Also for the SMR, a confidence interval should be computed. It is better to give this measure in a publication than a p-value, because statistical significance testing is meaningless if the general population is the reference category. Such comparison entails a considerable bias (the healthy worker effect noted above), and statistical significance testing, originally developed for experimental research, is misleading in the presence of systematic error.

Suppose the question is whether quartz dust causes lung cancer. Usually, quartz dust occurs together with other carcinogens—such as radon daughters and diesel exhaust in mines, or polyaromatic hydrocarbons in foundries. Granite quarries do not expose the stone workers to these other carcinogens. Therefore the problem is best studied among stone workers employed in granite quarries.

Suppose then that all 2,000 workers, having been employed by 20 quarries between 1951 and 1960, are enrolled in the cohort and their cancer incidence (alternatively only mortality) is followed starting at ten years after first exposure (to allow for an induction time) and ending in 1990. This is a 20- to 30-year (depending on the year of entry) or, say, on average, a 25-year follow-up of the cancer mortality (or morbidity) among 1,000 of the quarry workers who were specifically granite workers. The exposure history of each cohort member must be recorded. Those who have left the quarries must be traced and their later exposure history recorded. In countries where all inhabitants have unique registration numbers, this is a straightforward procedure, governed chiefly by national data protection laws. Where no such system exists, tracing employees for follow up purposes can be extremely difficult. Where appropriate death or disease registries exist, the mortality from all causes, all cancers and specific cancer sites can be obtained from the national register of causes of death. (For cancer mortality, the national cancer registry is a better source because it contains more accurate diagnoses. In addition, incidence (or, morbidity) data can also be obtained.) The death rates (or cancer incidence rates) can be compared to “expected numbers”, computed from national rates using the person-years of the exposed cohort as a basis.

Suppose that 70 fatal cases of lung cancer are found in the cohort, whereas the expected number (the number which would have occurred had there been no exposure) is 35. Then:

c1 = 70, E(c1) = 35

Thus, the SMR = 200, which indicates a twofold increase in risk of dying from lung cancer among the exposed. If detailed exposure data are available, the cancer mortality can be studied as a function of different latency times (say, 10, 15, 20 years), work in different types of quarries (different kinds of granite), different historical periods, different exposure intensities and so on. However, 70 cases cannot be subdivided into too many categories, because the number falling into each one rapidly becomes too small for statistical analysis.
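The SMR for the quarry example, together with a confidence interval, could be computed along the following lines. The interval uses Byar's approximation to the exact Poisson limits. This method is an assumption chosen for illustration, not one named in the text, and the function name is hypothetical.

```python
import math

def smr_with_ci(observed, expected, z=1.96):
    """SMR (as a percentage) with Byar's approximate confidence limits.

    The observed count is treated as a Poisson variable and the
    expected count as fixed.
    """
    smr = 100 * observed / expected
    lower_count = observed * (
        1 - 1 / (9 * observed) - z / (3 * math.sqrt(observed))
    ) ** 3
    o1 = observed + 1
    upper_count = o1 * (1 - 1 / (9 * o1) + z / (3 * math.sqrt(o1))) ** 3
    return smr, 100 * lower_count / expected, 100 * upper_count / expected

# Figures from the quarry example: 70 observed, 35 expected lung cancer deaths
smr, lower, upper = smr_with_ci(observed=70, expected=35)
print(f"SMR = {smr:.0f}, approximate 95% CI {lower:.0f} to {upper:.0f}")
```

The same function applied to a subcategory with only a handful of cases gives a much wider interval, which illustrates why 70 cases cannot be split into too many strata.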

Both types of cohort designs have advantages and disadvantages. A retrospective study can, as a rule, measure only mortality, because data for milder manifestations usually are lacking. Cancer registries are an exception, and perhaps a few others, such as stroke registries and hospital discharge registries, in that incidence data also are available. Assessing past exposure is always a problem and the exposure data are usually rather weak in retrospective studies. This can lead to effect masking. On the other hand, since the cases have already occurred, the results of the study become available much sooner; in, say, two to three years.

A prospective cohort study can be better planned to comply with the researcher’s needs, and exposure data can be collected accurately and systematically. Several different manifestations of a disease can be measured. Measurements of both exposure and outcome can be repeated, and all measurements can be standardized and their validity checked. However, if the disease has a long latency (such as cancer), much time (even 20 to 30 years) will need to pass before the results of the study can be obtained. Much can happen during this time: for example, turnover of researchers, improvements in techniques for measuring exposure, and remodelling or closure of the plants chosen for study. All these circumstances endanger the success of the study. The costs of a prospective study are also usually higher than those of a retrospective study, but this is mostly due to the much greater number of measurements (repeated exposure monitoring, clinical examinations and so on), and not to more expensive death registration. Therefore the costs per unit of information do not necessarily exceed those of a retrospective study. In view of all this, prospective studies are more suited for diseases with rather short latency, requiring short follow-up, while retrospective studies are better for diseases with a long latency.

Case-Control (or Case-Referent) Studies

Let us go back to the viscose rayon plant. A retrospective cohort study may not be feasible if the rosters of the exposed workers have been lost, while a prospective cohort study would yield sound results only after a very long time. An alternative would then be the comparison between those who died from coronary heart disease in the town, in the course of a defined time period, and a sample of the total population in the same age group.

The classical case-control (or, case-referent) design is based on sampling from a dynamic (open, characterized by a turnover of membership) population. This population can be that of a whole country, a district or a municipality (as in our example), or it can be the administratively defined population from which patients are admitted to a hospital. The defined population provides both the cases and the controls (or referents).

The technique is to gather all the cases of the disease in question that exist at a point in time (prevalent cases), or have occurred during a defined period of time (incident cases). The cases thus can be drawn from morbidity or mortality registries, or be gathered directly from hospitals or other sources having valid diagnostics. The controls are drawn as a sample from the same population, either from among non-cases or from the entire population. Another option is to select patients with another disease as controls, but then these patients must be representative of the population from which the cases came. There may be one or more controls (i.e., referents) for each case. The sampling approach differs from cohort studies, which examine the entire population. It goes without saying that the gains in terms of the lower costs of case-control designs are considerable, but it is important that the sample is representative of the whole population from which the cases originated (i.e., the “study base”)—otherwise the study can be biased.

When cases and controls have been identified, their exposure histories are gathered by questionnaires, interviews or, in some instances, from existing records (e.g., payroll records from which work histories can be deduced). The data can be obtained either from the participants themselves or, if they are deceased, from close relatives. To ensure symmetrical recall, it is important that the proportion of dead and live cases and referents be equal, because close relatives usually give a less detailed exposure history than the participants themselves. Information about the exposure pattern among cases is compared to that among controls, providing an estimate of the odds ratio (OR), an indirect measure of the risk among the exposed to incur the disease relative to that of the unexposed.

Because the case-control design relies on the exposure information obtained from patients with a certain disease (i.e., cases) along with a sample of non-diseased people (i.e., controls) from the population from which the cases originated, the connection with exposures can be investigated for only one disease. By contrast, this design allows the concomitant study of the effect of several different exposures. The case-referent study is well suited to address specific research questions (e.g., “Is coronary heart disease caused by exposure to carbon disulphide?”), but it also can help to answer the more general question: “What exposures can cause this disease?”

The question of whether exposure to organic solvents causes primary liver cancer is raised (as an example) in Europe. Cases of primary liver cancer, a comparatively rare disease in Europe, are best gathered from a national cancer registry. Suppose that all cancer cases occurring during three years form the case series. The population base for the study is then a three-year follow-up of the entire population in the European country in question. The controls are drawn as a sample of persons without liver cancer from the same population. For reasons of convenience (meaning that the same source can be used for sampling the controls) patients with another cancer type, not related to solvent exposure, can be used as controls. Colon cancer has no known relation to solvent exposure; hence this cancer type can be included among the controls. (Using cancer controls minimizes recall bias in that the accuracy of the history given by cases and controls is, on average, symmetrical. However, if some presently unknown connection between colon cancer and exposure to solvents were revealed later, this type of control would cause an underestimation of the true risk—not an exaggeration of it.)

For each case of liver cancer, two controls are drawn in order to achieve greater statistical power. (One could draw even more controls, but available funds may be a limiting factor. If funds were not limited, perhaps as many as four controls would be optimal. Beyond four, the law of diminishing returns applies.)

After obtaining appropriate permission from data protection authorities, the cases and controls, or their close relatives, are approached, usually by means of a mailed questionnaire, asking for a detailed occupational history with special emphasis on a chronological list of the names of all employers, the departments of work, the job tasks in different employment, and the period of employment in each respective task. These data can be obtained from relatives with some difficulty; however, specific chemicals or trade names usually are not well recalled by relatives. The questionnaire also should include questions on possible confounding data, such as alcohol use, exposure to foodstuffs containing aflatoxins, and hepatitis B and C infection. In order to obtain a sufficiently high response rate, two reminders are sent to non-respondents at three-week intervals. This usually results in a final response rate in excess of 70%.

The occupational history is then reviewed by an industrial hygienist, without knowledge of the respondent’s case or control status, and exposure is classified into high, medium, low, none, and unknown exposure to solvents. The ten years of exposure immediately preceding the cancer diagnosis are disregarded, because it is not biologically plausible that initiator-type carcinogens can be the cause of the cancer if the latency time is that short (although promoters, in fact, could). At this stage it is also possible to differentiate between different types of solvent exposure. Because a complete employment history has been given, it is also possible to explore other exposures, although the initial study hypothesis did not include these.

Odds ratios can then be computed for exposure to any solvent, specific solvents, solvent mixtures, different categories of exposure intensity, and for different time windows in relation to cancer diagnosis. It is advisable to exclude from analysis those with unknown exposure.
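The diminishing returns from drawing more than about four controls per case can be illustrated with a standard approximation: under the null hypothesis and with equal cost per subject, the statistical efficiency of k controls per case relative to an unlimited number of controls is roughly k / (k + 1). This formula is not given in the text and is offered here only as a sketch of the reasoning.

```python
def relative_efficiency(k):
    """Approximate efficiency of k controls per case relative to an
    unlimited number of controls (null hypothesis, equal costs assumed)."""
    return k / (k + 1)

previous = 0.0
for k in range(1, 7):
    eff = relative_efficiency(k)
    # The gain from each additional control shrinks as k grows.
    print(f"{k} control(s) per case: efficiency {eff:.3f}, gain {eff - previous:.3f}")
    previous = eff
```

Going from one to two controls per case adds far more than going from four to five, which is the pattern the parenthetical remark about four controls describes.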

The cases and controls can be sampled and analysed either as independent series or as matched groups. Matching means that controls are selected for each case based on certain characteristics or attributes, to form pairs (or sets, if more than one control is chosen for each case). Matching is usually done on one or more factors such as age, vital status, smoking history, calendar time of case diagnosis, and the like. In our example, cases and controls are matched on age and vital status. (Vital status is important, because patients themselves usually give a more accurate exposure history than close relatives, and symmetry is essential for validity reasons.) Today, the recommendation is to be restrictive with matching, because this procedure can introduce negative (effect-masking) confounding.

If one control is matched to one case, the design is called a matched-pair design. Provided the costs of studying more controls are not prohibitive, more than one referent per case improves the stability of the estimate of the OR, which makes the study more size efficient.

The layout of the results of an unmatched case-control study is shown in table 2.

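Table 2. Sample layout of case-control data

| Exposure classification | Exposed | Unexposed |
|-------------------------|---------|-----------|
| Cases                   | c1      | c0        |
| Controls (referents)    | n1      | n0        |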

From this table, the odds of exposure among the cases, and the odds of exposure among the population (the controls), can be computed and divided to yield the exposure odds ratio, OR. For the cases, the exposure odds is c1 / c0, and for the controls it is n1 / n0. The estimate of the OR is then:

$$ OR = \frac{c_1 / c_0}{n_1 / n_0} = \frac{c_1 \, n_0}{c_0 \, n_1} $$

If relatively more cases than controls have been exposed, the OR is in excess of 1 and vice versa. Confidence intervals must be calculated and provided for the OR, in the same manner as for the RR.

By way of a further example, an occupational health centre of a large company serves 8,000 employees exposed to a variety of dusts and other chemical agents. We are interested in the connection between mixed dust exposure and chronic bronchitis. The study involves follow-up of this population for one year. We have set the diagnostic criteria for chronic bronchitis as “morning cough and phlegm production for three months during two consecutive years”. Criteria for “positive” dust exposure are defined before the study begins. Each patient visiting the health centre and fulfilling these criteria during a one-year period is a case, and the next patient seeking medical advice for non-pulmonary problems is defined as a control. Suppose that 100 cases and 100 controls become enrolled during the study period. Let 40 cases and 15 controls be classified as having been exposed to dust. Then

c1 = 40, c0 = 60, n1 = 15, and n0 = 85.

Consequently,

$$ OR = \frac{40 / 60}{15 / 85} \approx 3.78 $$
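For the chronic bronchitis example, the odds ratio and an approximate confidence interval could be obtained as sketched below. The interval uses Woolf's logit method, which is an assumption made for illustration; the text only states that an interval should be computed.

```python
import math

def odds_ratio_ci(c1, c0, n1, n0, z=1.96):
    """Exposure odds ratio with Woolf's (logit) confidence interval.

    c1, c0: exposed and unexposed cases
    n1, n0: exposed and unexposed controls
    """
    odds_ratio = (c1 / c0) / (n1 / n0)
    se_log = math.sqrt(1 / c1 + 1 / c0 + 1 / n1 + 1 / n0)
    lower = odds_ratio * math.exp(-z * se_log)
    upper = odds_ratio * math.exp(z * se_log)
    return odds_ratio, lower, upper

# Counts from the chronic bronchitis example in the text
odds_ratio, lower, upper = odds_ratio_ci(c1=40, c0=60, n1=15, n0=85)
print(f"OR = {odds_ratio:.2f}, approximate 95% CI {lower:.2f} to {upper:.2f}")
```

Because the whole interval lies above 1, the association between dust exposure and chronic bronchitis in this example would be judged statistically significant at the 5% level, before any consideration of confounding.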


In the foregoing example, no consideration has been given to the possibility of confounding, which may lead to a distortion of the OR due to systematic differences between cases and controls in a variable like age. One way to reduce this bias is to match controls to cases on age or other suspect factors. This results in a data layout depicted in table 3.

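Table 3. Layout of case-control data if one control is matched to each case

|                     | Referents: Exposure (+) | Referents: Exposure (-) |
|---------------------|-------------------------|-------------------------|
| Cases: Exposure (+) | f++                     | f+-                     |
| Cases: Exposure (-) | f-+                     | f--                     |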


The analysis focuses on the discordant pairs: that is, “case exposed, control unexposed” (f+-); and “case unexposed, control exposed” (f-+). When both members of a pair are exposed or unexposed, the pair is disregarded. The OR in a matched-pair study design is defined as

$$ OR = \frac{f_{+-}}{f_{-+}} $$

In a study on the association between nasal cancer and wood dust exposure, there were altogether 164 case-control pairs. In only one pair, both the case and the control had been exposed, and in 150 pairs, neither the case nor the control had been exposed. These pairs are not further considered. The case, but not the control, had been exposed in 12 pairs, and the control, but not the case, in one pair. Hence,

$$ OR = \frac{12}{1} = 12 $$

and because unity is not included in the confidence interval computed for this estimate, the result is statistically significant; that is, there is a statistically significant association between nasal cancer and wood dust exposure.
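One way to obtain an interval for a matched-pair OR is sketched below. It treats the number of “case exposed, control unexposed” pairs among the discordant pairs as binomial, computes exact (Clopper-Pearson) limits for that proportion by bisection, and converts them to odds. This method is an assumption chosen for illustration and may differ from the one used for the figures in the original study; the function names are hypothetical.

```python
import math

def binom_tail_ge(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

def solve_increasing(f, target, iterations=60):
    """Bisection on [0, 1] for an increasing function f with f(p) = target."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def matched_pair_or(f_pm, f_mp, alpha=0.05):
    """OR = f_pm / f_mp with an exact conditional interval.

    f_pm: pairs with case exposed, control unexposed
    f_mp: pairs with case unexposed, control exposed (must be > 0)
    """
    n = f_pm + f_mp
    p_low = solve_increasing(lambda p: binom_tail_ge(f_pm, n, p), alpha / 2)
    p_high = solve_increasing(lambda p: binom_tail_ge(f_pm + 1, n, p), 1 - alpha / 2)
    return f_pm / f_mp, p_low / (1 - p_low), p_high / (1 - p_high)

# Discordant pairs from the nasal cancer example: 12 and 1
odds_ratio, lower, upper = matched_pair_or(12, 1)
print(f"OR = {odds_ratio:.0f}, exact 95% CI {lower:.2f} to {upper:.0f}")
print("Unity excluded:", lower > 1)
```

With only 13 discordant pairs the interval is very wide, even though it excludes unity.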

Case-control studies are more efficient than cohort studies when the disease is rare; they may in fact provide the only option. However, common diseases also can be studied by this method. If the exposure is rare, an exposure-based cohort is the preferable or only feasible epidemiological design. Of course, cohort studies also can be carried out on common exposures. The choice between cohort and case-control designs when both the exposure and disease are common is usually decided taking validity considerations into account.

Because case-control studies rely on retrospective exposure data, usually based on the participants’ recall, their weak point is the inaccuracy and crudeness of the exposure information, which results in effect-masking through non-differential (symmetrical) misclassification of exposure status. Moreover, sometimes the recall can be asymmetrical between cases and controls, cases usually believed to remember “better” (i.e., recall bias).

Selective recall can cause an effect-magnifying bias through differential (asymmetrical) misclassification of exposure status. The advantages of case-control studies lie in their cost-effectiveness and their ability to provide a solution to a problem relatively quickly. Because of the sampling strategy, they allow the investigation of very large target populations (e.g., through national cancer registries), thereby increasing the statistical power of the study. In countries where data protection legislation or lack of good population and morbidity registries hinders the execution of cohort studies, hospital-based case-control studies may be the only practical way to conduct epidemiological research.

Case-control sampling within a cohort (nested case-control study designs)

A cohort study also can be designed for sampling instead of complete follow-up. This design has previously been called a “nested” case-control study. A sampling approach within the cohort sets different requirements on cohort eligibility, because the comparisons are now made within the same cohort. This should therefore include not only heavily exposed workers, but also less-exposed and even unexposed workers, in order to provide exposure contrasts within itself. It is important to realize this difference in eligibility requirements when assembling the cohort. If a full cohort analysis is first carried out on a cohort whose eligibility criteria were on “much” exposure, and a “nested” case-control study is done later on the same cohort, the study becomes insensitive. This introduces effect-masking because the exposure contrasts are insufficient “by design” by virtue of a lack of variability in exposure experience among members of the cohort.

However, provided the cohort has a broad range of exposure experience, the nested case-control approach is very attractive. One gathers all the cases arising in the cohort over the follow-up period to form the case series, while only a sample of the non-cases is drawn for the control series. The researchers then, as in the traditional case-control design, gather detailed information on the exposure experience by interviewing cases and controls (or, their close relatives), by scrutinizing the employers’ personnel rolls, by constructing a job exposure matrix, or by combining two or more of these approaches. The controls can either be matched to the cases or they can be treated as an independent series.

The sampling approach can be less costly compared to exhaustive information procurement on each member of the cohort. In particular, because only a sample of controls is studied, more resources can be devoted to detailed and accurate exposure assessment for each case and control. However, the same statistical power problems prevail as in classical cohort studies. To achieve adequate statistical power, the cohort must always comprise an “adequate” number of exposed cases depending on the magnitude of the risk that should be detected.

Cross-sectional study designs

In a scientific sense, a cross-sectional design is a cross-section of the study population, without any consideration given to time. Both exposure and morbidity (prevalence) are measured at the same point in time.

From the aetiological point of view, this study design is weak, partly because it deals with prevalence as opposed to incidence. Prevalence is a composite measure, depending both on the incidence and duration of the disease. This also restricts the use of cross-sectional studies to diseases of long duration. Even more serious is the strong negative bias caused by the health-dependent elimination from the exposed group of those people more sensitive to the effects of exposure. Therefore aetiological problems are best solved by longitudinal designs. Indeed, cross-sectional studies do not permit any conclusions about whether exposure preceded disease, or vice versa. The cross-section is aetiologically meaningful only if a true time relation exists between the exposure and the outcome, meaning that present exposure must have immediate effects. However, the exposure can be cross-sectionally measured so that it represents a longer past time period (e.g., the blood lead level), while the outcome measure is one of prevalence (e.g., nerve conduction velocities). The study then is a mixture of a longitudinal and a cross-sectional design rather than a mere cross-section of the study population.
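The dependence of prevalence on incidence and duration can be written out explicitly. As a sketch, assuming a steady-state population in which the incidence rate and the mean disease duration are constant (assumptions that the text does not state), with $P$ the prevalence proportion, $I$ the incidence rate and $\bar{D}$ the mean duration of disease:

```latex
\frac{P}{1 - P} = I \times \bar{D},
\qquad\text{so that for a rare disease}\qquad
P \approx I \times \bar{D}.
```

Two groups with the same incidence can therefore show different prevalences simply because the disease lasts longer in one of them, which is one reason prevalence is a weak basis for aetiological inference.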

Cross-sectional descriptive surveys

Cross-sectional surveys are often useful for practical and administrative, rather than for scientific, purposes. Epidemiological principles can be applied to systematic surveillance activities in the occupational health setting, such as:

  • observation of morbidity in relation to occupation, work area, or certain exposures
  • regular surveys of workers exposed to known occupational hazards
  • examination of workers coming into contact with new health hazards
  • biological monitoring programmes
  • exposure surveys to identify and quantify hazards
  • screening programmes of different worker groups
  • assessing the proportion of workers in need of prevention or regular control (e.g., blood pressure, coronary heart disease).


It is important to choose representative, valid, and specific morbidity indicators for all types of surveys. A survey or a screening programme can use only a rather small number of tests, in contrast to clinical diagnostics, and therefore the predictive value of the screening test is important. Insensitive methods fail to detect the disease of interest, while highly sensitive methods produce too many false positive results. It is not worthwhile to screen for rare diseases in an occupational setting. All case-finding (i.e., screening) activities also require a mechanism for taking care of people having “positive” findings, both in terms of diagnostics and therapy. Otherwise only frustration will result, with the potential to do more harm than good.
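The role of predictive value, and the reason screening for rare diseases is seldom worthwhile, can be illustrated with Bayes' rule. The sensitivity, specificity and prevalence figures below are hypothetical and chosen only to show the pattern.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Probability that a person with a positive test truly has the disease."""
    true_positives = sensitivity * prevalence
    false_positives = (1 - specificity) * (1 - prevalence)
    return true_positives / (true_positives + false_positives)

# Hypothetical test: 90% sensitivity, 95% specificity
ppv_rare = positive_predictive_value(prevalence=0.001, sensitivity=0.90, specificity=0.95)
ppv_common = positive_predictive_value(prevalence=0.10, sensitivity=0.90, specificity=0.95)

print(f"Prevalence 0.1%: PPV = {ppv_rare:.1%}")
print(f"Prevalence 10%:  PPV = {ppv_common:.1%}")
```

With the same test, nearly all positive findings are false when the disease is rare, whereas about two in three are true when the disease affects one worker in ten.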



Last modified on Thursday, 13 October 2011 20:25