Early Stopping Bias

A distortion of treatment effect estimates that arises when a clinical trial is stopped early, whether for benefit, futility, or safety, following an interim analysis. 

Background

Stopping rules are a common feature of clinical trials, implemented primarily for ethical and safety reasons. They ensure that a study does not continue when an intervention is clearly better (stopping for benefit), is unlikely to show a benefit (stopping for futility), or poses unacceptable risk (stopping for safety).

Although stopping rules are an ethical imperative, they can affect the interpretation of treatment effects. Trials terminated early are known as truncated trials, and those terminated because of an apparently beneficial treatment effect (stopped for benefit) are particularly susceptible to early stopping bias. Such trials risk exaggerating the treatment effect, especially when the total number of events is small.

The consequences of early stopping for benefit extend beyond statistical distortion. Compelling early findings can trigger a cascade of effects, including media attention, publication in high-impact journals, and rapid translation into clinical practice and guidelines. While additional evidence may weaken or even contradict these early findings, there may be reluctance to reverse recommendations in the face of new data, and the gathering of such data in further trials may be considered unethical.

Early stopping for futility, in contrast, aims to prevent participants from receiving an ineffective intervention, while conserving time and resources. However, stopping for this reason may underestimate the treatment effect, particularly when interim results are imprecise or not representative of future data, and may lead to inappropriately stopping a study for an intervention with a modest treatment effect.

Early stopping for safety is often considered the most ethically straightforward reason to terminate a trial, as patient safety and protection must supersede statistical considerations. Nevertheless, early stopping for participant safety may still introduce bias and overestimate the risk of harm or adverse events when event counts are low.

Finally, some trials stop early for reasons unrelated to the study’s own findings, such as loss of funding, slow recruitment, difficulties accessing the intervention, or the emergence of external evidence that makes the research question less relevant. However, when a trial terminates for operational or feasibility reasons, bias is unlikely to be introduced by stopping, and so will not be further considered in this article.

Examples

Stopping for benefit

β-blocker bisoprolol

Poldermans et al. (1999) is an example of a trial stopped early for benefit. The trial evaluated the β-blocker bisoprolol for patients with vascular disease undergoing non-cardiac surgery. After the trial was terminated early, guideline recommendations supported the use of bisoprolol in the perioperative setting. However, later studies and pooled analyses revealed that perioperative bisoprolol in non-cardiac surgery was associated with an increased risk of disabling strokes, casting doubt on the reliability of the early-stopped trial. Despite this contradictory evidence, perioperative β-blockers continued to be used for a time, possibly reflecting commercial influences and a reluctance to overturn established practice.

rhAPC

In 2001, a trial of rhAPC (recombinant human activated protein C) in severe sepsis was stopped early after interim analyses suggested reduced mortality. A subsequent trial raised concerns about increased bleeding risk and failed to confirm the mortality benefit. Despite this, the drug continued to be recommended for several years before being withdrawn from the market in 2010–11, illustrating the long-term impact of early-stopped evidence on practice.

Stopping for safety/harm

The Women’s Health Initiative (WHI) trial evaluated the long-term effects of combined estrogen and progestin therapy (a form of hormone replacement therapy, HRT) in postmenopausal women. An interim safety review found that women in the treatment group had a higher risk of breast cancer and no cardiovascular benefit. The Data and Safety Monitoring Board recommended stopping the trial early in 2002 to protect participants, leading to a rapid global decline in HRT use. Later reviews showed that the WHI findings were influenced by factors such as participants beginning hormone therapy more than 10 years after menopause, and by the specific HRT formulation used. This later research has demonstrated that benefits and risks vary by timing of therapy initiation and by type of HRT, with some regimens showing more favourable safety profiles. On the balance of evidence, HRT can benefit specific patient groups. This illustrates how early stopping for harm, though ethically necessary in trial settings, can contribute to uncertainty or oversimplified interpretations of treatment effects.

Stopping for futility

The GUIDE-IT trial tested whether using a blood marker (NT-proBNP) to guide treatment for people with heart failure could improve outcomes. The study was stopped early because an interim analysis suggested the approach was unlikely to show benefit, as both groups had similar rates of hospitalisation or cardiovascular death. Although stopping was reasonable, ending the trial before it reached its planned size risked underestimating the intervention’s true effectiveness. A later meta-analysis, which included the GUIDE-IT trial, did find evidence of benefit, although this was not seen in sensitivity analyses restricted to low-risk-of-bias studies.

Impact

It is important to be aware of and account for early stopping bias when investigating the effect of interventions. Early stopping for benefit is likely to overestimate the treatment effect; stopping for futility or patient safety may underestimate it. Stopping for reasons unrelated to the study’s own findings is unlikely to introduce bias.

Bassler et al.’s systematic review and meta-regression analysis concluded that RCTs stopped early for benefit overestimate treatment effects. This overestimation was most pronounced when only a small number of outcome events had occurred, especially fewer than 200. Stopping rules for benefit should therefore be very strict about the magnitude and plausibility of the evidence required, particularly when fewer than 500 events have accumulated. Another review found that trials stopped for benefit had a median of 66 accrued events prior to stopping, with smaller event numbers yielding the largest treatment effects.
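The selection effect behind this overestimation can be illustrated with a small simulation. The sketch below is not taken from any of the cited reviews: the event risks, look schedule, sample sizes, and the z > 2.5 boundary are all assumptions chosen for illustration. It simulates many two-arm trials with the same true risk ratio and compares the estimates from trials that crossed the benefit boundary at an interim look with the true value.

```python
import math
import random
import statistics

# All values below are illustrative assumptions, not figures from the cited reviews.
TRUE_RISK_CONTROL = 0.30                 # assumed control-arm event risk
TRUE_RISK_TREATED = 0.24                 # assumed treated-arm event risk (true RR = 0.80)
LOOK_SIZES = [100, 200, 300, 400, 500]   # patients per arm at each interim look
FINAL_SIZE = 1000                        # patients per arm at the planned final analysis
Z_STOP = 2.5                             # hypothetical benefit boundary, not a recommended rule


def z_for_benefit(events_c, events_t, n):
    """Two-proportion z statistic (equal arms of size n); positive favours treatment."""
    pooled = (events_c + events_t) / (2 * n)
    se = math.sqrt(2 * pooled * (1 - pooled) / n)
    return ((events_c - events_t) / n) / se if se > 0 else 0.0


def run_trial(rng):
    """Simulate one trial; return (estimated risk ratio, stopped_early)."""
    events_c = events_t = n = 0
    for target in LOOK_SIZES + [FINAL_SIZE]:
        new_patients = target - n
        events_c += sum(rng.random() < TRUE_RISK_CONTROL for _ in range(new_patients))
        events_t += sum(rng.random() < TRUE_RISK_TREATED for _ in range(new_patients))
        n = target
        # Only interim looks can trigger early stopping for benefit.
        if target != FINAL_SIZE and z_for_benefit(events_c, events_t, n) > Z_STOP:
            break
    # Arms are the same size, so the risk ratio is the ratio of event counts.
    return events_t / events_c, n < FINAL_SIZE


def summarise(n_trials=2000, seed=1):
    """Compare estimates from truncated trials with those run to completion."""
    rng = random.Random(seed)
    results = [run_trial(rng) for _ in range(n_trials)]
    stopped = [rr for rr, early in results if early]
    completed = [rr for rr, early in results if not early]
    return {
        "true_rr": TRUE_RISK_TREATED / TRUE_RISK_CONTROL,
        "n_stopped_early": len(stopped),
        "median_rr_stopped_early": statistics.median(stopped),
        "median_rr_completed": statistics.median(completed),
    }


summary = summarise()
print(summary)
```

Under these assumptions, a trial can only cross the boundary at an early look if its observed difference is larger than the true difference, so the truncated trials as a group report a risk ratio further from 1 than the truth. Moving the first look later, or raising the boundary, reduces how often this happens.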

Stopping for futility may underestimate treatment effects. A review by Walter et al. concluded that stopping early is probably reasonable when the interim results are likely to reflect future trends. Statistical measures such as conditional power should be calculated, and the assumptions made about future data should be stated explicitly.
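As a rough illustration of why those assumptions matter, conditional power can be computed under a standard normal approximation (the B-value formulation). The sketch below is a minimal example, not the method used in any trial cited here; the function name, the interim values, and the one-sided critical value of 1.96 are assumptions. The `drift` argument is the expected final z statistic under whichever assumption about future data is being made.

```python
import math
from statistics import NormalDist


def conditional_power(z_interim, info_fraction, drift, z_crit=1.96):
    """Probability that the final z statistic exceeds z_crit, given the interim result.

    z_interim     : z statistic observed at the interim analysis
    info_fraction : proportion of planned information (e.g. events) accrued, in (0, 1)
    drift         : assumed expected value of the final z statistic
    z_crit        : critical value for the final analysis (assumed 1.96 here)
    """
    if not 0 < info_fraction < 1:
        raise ValueError("info_fraction must be strictly between 0 and 1")
    b_value = z_interim * math.sqrt(info_fraction)
    remaining = 1 - info_fraction
    expected_final = b_value + drift * remaining
    return 1 - NormalDist().cdf((z_crit - expected_final) / math.sqrt(remaining))


# Hypothetical interim look: half the information accrued, weak observed effect.
z_obs, t = 0.5, 0.5

# Assumption 1: future data follow the trend observed so far.
cp_current_trend = conditional_power(z_obs, t, drift=z_obs / math.sqrt(t))

# Assumption 2: future data follow the effect the trial was designed to detect
# (here a design with two-sided alpha 0.05 and 90% power).
design_drift = NormalDist().inv_cdf(0.975) + NormalDist().inv_cdf(0.90)
cp_design = conditional_power(z_obs, t, drift=design_drift)

print(f"current trend: {cp_current_trend:.3f}, design effect: {cp_design:.3f}")
```

The same interim result gives very different conditional power under the two assumptions, which is why the assumption used should be reported alongside any futility decision.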

Stopping for safety in studies may lead to underestimation of treatment benefit or overestimation of risk. As with all early stopping, reporting of methods and rationale for stopping should be clear. Trials stopped early for harms need to be interpreted cautiously, and related evidence should be taken into account when evaluating results.

Preventive steps

To prevent or limit the impact of early stopping bias, whilst upholding the ethical and safety reasons to end a study early, the following steps can be considered:

Pre-specify stopping boundaries, particularly when stopping for benefit, requiring a large number of accumulated events before considering early stopping.

When deciding whether to stop for futility, use conditional power analyses to estimate the probability of obtaining a statistically significant result if the trial continues to completion, noting that these analyses should be based on realistic assumptions about future data.

Report full details of interim analyses, stopping boundaries, and decision-making processes, including whether an independent data committee was involved in the decision and whether interim analyses were planned or ad hoc, to allow critical appraisal of the trial. See Item 23b of the 2025 CONSORT reporting guidelines.

Promote caution in guideline development when evidence is based on truncated trials, and consider the need to confirm results in subsequent trials or pooled reviews.
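A pre-specified rule for stopping for benefit can be written down unambiguously before the trial starts. The sketch below is one hypothetical way to encode such a rule: it combines a minimum event count with a stringent interim p-value threshold in the style of the Haybittle-Peto boundary. The class name, the default of 500 events (echoing the figure discussed under Impact), and the 0.001 threshold are illustrative assumptions, not recommendations from the sources.

```python
from dataclasses import dataclass
from statistics import NormalDist


@dataclass(frozen=True)
class BenefitStoppingRule:
    """Hypothetical pre-specified rule for stopping a trial early for benefit."""

    min_events: int = 500                # assumed minimum number of accrued outcome events
    interim_p_threshold: float = 0.001   # assumed two-sided p-value required at interim looks

    def allows_stopping(self, total_events: int, z_interim: float) -> bool:
        """Return True only if both pre-specified conditions are met.

        z_interim is the interim z statistic, with positive values favouring treatment.
        """
        if total_events < self.min_events:
            return False
        p_two_sided = 2 * (1 - NormalDist().cdf(abs(z_interim)))
        return z_interim > 0 and p_two_sided < self.interim_p_threshold


rule = BenefitStoppingRule()
print(rule.allows_stopping(total_events=120, z_interim=3.5))  # too few events
print(rule.allows_stopping(total_events=650, z_interim=3.5))  # both conditions met
print(rule.allows_stopping(total_events=650, z_interim=2.5))  # evidence not strong enough
```

Fixing the rule in the protocol, rather than deciding at the time of the interim analysis, is what allows readers to check whether the trial stopped according to plan.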

Sources

Abraham et al. (2005). Drotrecogin alfa (activated) for adults with severe sepsis and a low risk of death. The New England journal of medicine, 353(13), 1332–1341. https://doi.org/10.1056/NEJMoa050935 

Bassler D et al. & STOPIT-2 Study Group (2010). Stopping randomized trials early for benefit and estimation of treatment effects: systematic review and meta-regression analysis. JAMA, 303(12), 1180–1187. https://doi.org/10.1001/jama.2010.310

Bernard, GR et al. & Recombinant human protein C Worldwide Evaluation in Severe Sepsis (PROWESS) study group (2001). Efficacy and safety of recombinant human activated protein C for severe sepsis. The New England journal of medicine, 344(10), 699–709. https://doi.org/10.1056/NEJM200103083441001 

Felker GM et al. Effect of Natriuretic Peptide-Guided Therapy on Hospitalization or Cardiovascular Mortality in High-Risk Patients With Heart Failure and Reduced Ejection Fraction: A Randomized Clinical Trial. JAMA. 2017 Aug 22;318(8):713-720. doi: 10.1001/jama.2017.10565. PMID: 28829876; PMCID: PMC5605776. https://pubmed.ncbi.nlm.nih.gov/28829876/

Guyatt, GH et al. (2012). Problems of stopping trials early. BMJ (Clinical research ed.), 344, e3863. https://doi.org/10.1136/bmj.e3863 

Hopewell S et al. (2025). CONSORT 2025 explanation and elaboration: updated guideline for reporting randomised trials. BMJ (Clinical research ed.), 389, e081124. https://doi.org/10.1136/bmj-2024-081124 

Jitlal M et al. Stopping clinical trials early for futility: retrospective analysis of several randomised clinical studies. Br J Cancer 107, 910–917 (2012). https://doi.org/10.1038/bjc.2012.344

McLellan J et al. Natriuretic peptide-guided treatment for heart failure: a systematic review and meta-analysis. BMJ Evid Based Med. 2020 Feb;25(1):33-37. doi: 10.1136/bmjebm-2019-111208. Epub 2019 Jul 20. PMID: 31326896; PMCID: PMC7029248. https://pmc.ncbi.nlm.nih.gov/articles/PMC7029248/

Montori VM et al., Randomized trials stopped early for benefit: a systematic review. JAMA. 2005 Nov 2;294(17):2203-9. doi: 10.1001/jama.294.17.2203. PMID: 16264162. https://pubmed.ncbi.nlm.nih.gov/16264162/ 

Poldermans D et al. (1999). The effect of bisoprolol on perioperative mortality and myocardial infarction in high-risk patients undergoing vascular surgery. Dutch Echocardiographic Cardiac Risk Evaluation Applying Stress Echocardiography Study Group. The New England journal of medicine, 341(24), 1789–1794. https://doi.org/10.1056/NEJM199912093412402

Rossouw JE et al. & Writing Group for the Women’s Health Initiative Investigators (2002). Risks and benefits of estrogen plus progestin in healthy postmenopausal women: principal results from the Women’s Health Initiative randomized controlled trial. JAMA, 288(3), 321–333. https://doi.org/10.1001/jama.288.3.321

Stute P et al. (2023). Reappraising 21 years of the WHI study: Putting the findings in context for clinical practice. Maturitas, 174, 8–13. https://doi.org/10.1016/j.maturitas.2023.04.271 

Walter SD et al. (2019). Randomised trials with provision for early stopping for benefit (or harm): The impact on the estimated treatment effect. Statistics in medicine, 38(14), 2524–2543. https://doi.org/10.1002/sim.8142

Walter SD et al. A systematic survey of randomised trials that stopped early for reasons of futility. BMC Med Res Methodol 20, 10 (2020). https://doi.org/10.1186/s12874-020-0899-1 

