The importance of confounding is that it can suggest an association where none exists, or can mask a true association (Figure 1).
Figure 1. The principle of confounding; the confounder makes the exposure more likely and in some way independently modifies the outcome, making it appear that there is an association between the exposure and the outcome when there is none, or masking a true association
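This mechanism can be illustrated numerically. In the following sketch (Python, with invented probabilities), a confounder makes the exposure more likely and independently raises the risk of the outcome; the exposure itself has no effect, yet the crude comparison suggests one, while the association disappears within strata of the confounder:

```python
# Hypothetical probabilities: the exposure has NO effect on the outcome;
# only the confounder (say, older age) matters.
p_conf = 0.5                          # P(confounder present)
p_exp = {True: 0.8, False: 0.2}       # P(exposure | confounder)
p_out = {True: 0.30, False: 0.10}     # P(outcome | confounder); exposure plays no part

def risk(exposed):
    """Overall P(outcome | exposure status), mixing over the confounder."""
    num = den = 0.0
    for c in (True, False):
        p_c = p_conf if c else 1 - p_conf
        p_e = p_exp[c] if exposed else 1 - p_exp[c]
        num += p_c * p_e * p_out[c]   # outcome depends only on the confounder
        den += p_c * p_e
    return num / den

print(f"crude risk ratio: {risk(True) / risk(False):.2f}")   # 1.86, despite no true effect
print(f"risk ratio within each stratum: "
      f"{p_out[True]/p_out[True]:.2f} and {p_out[False]/p_out[False]:.2f}")  # 1.00 in both
```

With these (arbitrary) numbers the exposed appear to have nearly twice the risk of the unexposed, entirely because they are more often in the high-risk stratum.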
It commonly occurs in observational studies, but can also occur in randomized studies, especially, but not only, if they are poorly designed.
For example, if by chance more elderly people are randomized to an active intervention than to placebo, and if age is independently more likely to be associated with a beneficial outcome, the intervention may falsely appear to be beneficial.
Because observational studies do not randomize participants to ensure equivalent groups for comparison (or to eliminate imbalances due to chance), confounders are common.
It is possible to reduce the effects of known or suspected confounders by analyzing the data in ways that adjust for them. However, there is always the possibility of unknown confounders, which cannot be taken into account. It is therefore not uncommon for the results of observational studies to be overturned when subsequent randomized trials fail to confirm them.
In a seminal example, early findings of a supposed beneficial effect of hormone replacement therapy in cardiovascular disease were reversed; when studies were grouped by whether they had adjusted for socioeconomic status or education, the reduced risk was seen only among the studies that had not adjusted for these factors, suggesting confounding.
Retrospective, non-randomized studies of patients taking digoxin showed increased death rates, even after adjustment for plausible confounders; however, in a prospective randomized study mortality was not increased. This suggests that the observational data were subject to confounding: for example, those who had been taking digoxin in the observational studies were probably sicker and therefore more likely to die.
Other comparisons within a study may give information about the potential role of confounders. For example, in a register-based retrospective nationwide cohort study of 848,786 pregnancies, using the Danish Medical Birth Registry, there was an apparent association between the use of selective serotonin reuptake inhibitors (SSRIs) during pregnancy in 4183 women and an increased risk of certain congenital defects. However, multivariable logistic regression models reduced the significance of an association; furthermore, there were similar risks in a group of women who had stopped taking SSRIs during pregnancy, strongly suggesting that the apparent association was due to an unidentified confounder. Analysis of the effect of dose as a continuous variable showed that there was no dose-response association, further evidence that there was no true association.
In a systematic review of epidemiological (case-control and cohort) studies of the effectiveness of statins in reducing the risk of Parkinson’s disease (PD), Bykov and colleagues investigated the impact of confounding. Six of the 10 included studies collectively showed a protective effect of statins (relative risk 0.75; 95% CI: 0.60–0.92); however, these studies did not adjust for serum cholesterol concentrations, which are inversely related to the risk of PD. In the four studies that did adjust for cholesterol, the beneficial effect of statins fell by 28% (7–65%), shifting the point estimate to a non-significant harmful effect of statins (RR 1.04; 95% CI: 0.68–1.59).
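For readers who wish to reproduce figures of this kind, a relative risk and its 95% confidence interval can be computed from a 2×2 table. Below is a minimal Python sketch using the standard log-based (Katz) approximation; the counts are invented for illustration and are not taken from the review:

```python
import math

def risk_ratio(a, b, c, d):
    """Relative risk and 95% CI from a 2x2 table:
    exposed:   a with the outcome, b without
    unexposed: c with the outcome, d without
    """
    rr = (a / (a + b)) / (c / (c + d))
    # standard error of log(RR), Katz log method
    se = math.sqrt(1/a - 1/(a + b) + 1/c - 1/(c + d))
    lower = math.exp(math.log(rr) - 1.96 * se)
    upper = math.exp(math.log(rr) + 1.96 * se)
    return rr, lower, upper

rr, lo, hi = risk_ratio(15, 85, 20, 80)          # hypothetical counts
print(f"RR {rr:.2f}; 95% CI {lo:.2f}-{hi:.2f}")  # RR 0.75; 95% CI 0.41-1.38
```

Note that with these invented counts the interval crosses 1, so the same point estimate of 0.75 would not be statistically significant, illustrating how the precision of an estimate depends on the numbers studied.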
Randomization is the best way to reduce the risk of confounding. However, it may not be enough, particularly when it is anticipated that imbalances in prognostic factors may occur despite randomization, or when imbalances occur by chance. Stratification and statistical adjustment can reduce the risk of confounding in such cases.
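As a sketch of how stratification works in practice, the Mantel-Haenszel summary risk ratio pools stratum-specific 2×2 tables. With hypothetical counts in which an imbalanced prognostic factor drives the outcome, the crude risk ratio is misleading while the stratified estimate is not (Python, invented numbers):

```python
def mantel_haenszel_rr(strata):
    """Mantel-Haenszel summary risk ratio.
    Each stratum is (a, n1, c, n0): cases and totals in the
    exposed (a of n1) and unexposed (c of n0) groups."""
    num = sum(a * n0 / (n1 + n0) for a, n1, c, n0 in strata)
    den = sum(c * n1 / (n1 + n0) for a, n1, c, n0 in strata)
    return num / den

# Hypothetical data: within each stratum the exposed and unexposed risks
# are identical, but the high-risk stratum is mostly exposed.
strata = [
    (24, 80, 6, 20),   # high-risk stratum: 30% risk in both groups
    (2, 20, 8, 80),    # low-risk stratum: 10% risk in both groups
]
a = sum(s[0] for s in strata); n1 = sum(s[1] for s in strata)
c = sum(s[2] for s in strata); n0 = sum(s[3] for s in strata)
print(f"crude RR: {(a / n1) / (c / n0):.2f}")                    # 1.86
print(f"Mantel-Haenszel RR: {mantel_haenszel_rr(strata):.2f}")   # 1.00
```

The crude estimate suggests harm only because the high-risk stratum contributes most of the exposed people; weighting the stratum-specific comparisons recovers the true null result.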
An extension of this is the use of propensity scores, in which potential confounders are used to build a statistical model that assigns each person a number, their propensity score, estimating how likely they were to receive the exposure given their measured characteristics; comparing exposed and unexposed people with similar scores therefore balances those characteristics. The use of propensity scores in a study of metformin suggested that the risk of cancers is lower in metformin users; however, the randomized trial evidence, as far as it goes, shows no convincing effect of metformin, illustrating the difficulty of adequately controlling for confounding.
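A minimal sketch of the idea, assuming a single measured confounder (all data invented): the propensity score is estimated as the observed probability of exposure given the covariate, and outcomes are then compared within score strata, where the covariate is balanced. Real analyses would model the score from many covariates (typically by logistic regression); this toy version only illustrates the principle:

```python
from collections import defaultdict

# Invented records of (covariate, exposed, outcome): the covariate drives
# both exposure and outcome; the exposure itself has no effect.
records = (
    [(1, 1, 1)] * 24 + [(1, 1, 0)] * 56 + [(1, 0, 1)] * 6 + [(1, 0, 0)] * 14 +
    [(0, 1, 1)] * 2 + [(0, 1, 0)] * 18 + [(0, 0, 1)] * 8 + [(0, 0, 0)] * 72
)

# Crude comparison: exposed people look at much higher risk.
exp = [o for _, e, o in records if e]
une = [o for _, e, o in records if not e]
print(f"crude: exposed risk {sum(exp)/len(exp):.2f} "
      f"vs unexposed {sum(une)/len(une):.2f}")        # 0.26 vs 0.14

# 1. Estimate each person's propensity score: P(exposed | covariate).
counts = defaultdict(lambda: [0, 0])    # covariate -> [n exposed, n total]
for cov, exposed, _ in records:
    counts[cov][0] += exposed
    counts[cov][1] += 1
ps = {cov: e / n for cov, (e, n) in counts.items()}

# 2. Compare outcome rates within propensity-score strata,
#    where the covariate is balanced between exposure groups.
strata = defaultdict(lambda: {1: [0, 0], 0: [0, 0]})  # score -> exposure -> [cases, n]
for cov, exposed, outcome in records:
    cell = strata[ps[cov]][exposed]
    cell[0] += outcome
    cell[1] += 1
for score in sorted(strata):
    (a, n1), (c, n0) = strata[score][1], strata[score][0]
    print(f"PS {score:.1f}: exposed risk {a/n1:.2f} "
          f"vs unexposed {c/n0:.2f}")    # equal risks within each stratum
```

Within each propensity-score stratum the exposed and unexposed risks are identical, exposing the crude association as confounding; but, as the metformin example shows, this only works for confounders that were measured and included in the model.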
On the other hand, a very large effect size can outweigh the combined effects of plausible confounders: even if they have not been ruled out by the design of the study, a sufficiently large observed effect can swamp them. For example, the observable effects of general anaesthesia are unlikely to be explicable by confounding, placebo effects, or biases of any kind; in such cases randomized trials may not even be necessary. In observational studies, associations have to be dramatic before one can be confident that plausible confounders have been ruled out; this is true of both beneficial and harmful associations.