Verification bias

Verification bias arises in a diagnostic accuracy study when confirmation of the diagnosis is not consistent across participants: some receive confirmation by one reference standard, whilst others receive no confirmation or are confirmed by a different reference standard.


Many reference tests are invasive, expensive, or carry a procedural risk (e.g. angiography, biopsy, surgery), so patients and clinicians may be less likely to pursue further tests if a preliminary test is negative. Verification may be partial, where only those with a positive index test receive the reference standard, or differential, where a different reference test is used depending on the index test result.
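
The effect of partial verification on the usual accuracy estimates can be illustrated with a small worked example. The sketch below uses hypothetical counts (a cohort of 1,000 people, 10% prevalence, an index test with 80% sensitivity and 90% specificity) and assumes, purely for illustration, that every test-positive participant but only 1 in 10 test-negative participants receives the reference standard. None of these numbers come from a real study.

```python
def sensitivity_specificity(tp, fp, fn, tn):
    """Return (sensitivity, specificity) from the four cells of a 2x2 table."""
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical full cohort (illustrative numbers, not from a real study).
TP, FP, FN, TN = 80, 90, 20, 810

# Assumption: all test positives are verified, but only 1 in 10 test negatives.
VERIFY_ONE_IN = 10

# What we would see if everyone received the reference standard.
true_sens, true_spec = sensitivity_specificity(TP, FP, FN, TN)

# Naive analysis restricted to the participants who were actually verified.
naive_sens, naive_spec = sensitivity_specificity(
    TP, FP, FN // VERIFY_ONE_IN, TN // VERIFY_ONE_IN
)

print(f"Everyone verified:      sensitivity {true_sens:.2f}, specificity {true_spec:.2f}")
print(f"Verified subgroup only: sensitivity {naive_sens:.2f}, specificity {naive_spec:.2f}")
```

Under these assumptions the naive analysis overstates sensitivity (about 0.98 against 0.80) and understates specificity (about 0.47 against 0.90), because most false negatives and true negatives never enter the 2x2 table.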


In a study assessing the accuracy of D-dimer testing for diagnosing pulmonary embolism (PE), patients who had a positive result were further assessed with ventilation-perfusion scans (reference standard 1), whereas patients who had negative D-dimer results were evaluated with routine clinical follow up (reference standard 2).

Patients who had asymptomatic pulmonary embolisms but false negative D-dimer results (they had the disease but tested negative) may not have been diagnosed by routine follow up (symptoms may have resolved in the interim). Using clinical follow up as a second reference standard may therefore have resulted in patients with false negative results being misclassified as true negatives (not having the disease), overestimating the accuracy of the test.
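
The misclassification described above can be made concrete with hypothetical numbers. The sketch below uses an invented cohort (not data from the D-dimer study) and assumes that test positives are verified by a reference standard treated as perfect, while clinical follow up detects only half of the cases missed by the index test. Both assumptions are illustrative only.

```python
def sensitivity_specificity(tp, fp, fn, tn):
    """Return (sensitivity, specificity) from the four cells of a 2x2 table."""
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical true disease status of the cohort (illustrative numbers).
TP, FP, FN, TN = 80, 90, 20, 810

# Assumption: follow up (reference standard 2) detects only half of the
# diseased participants who had a negative index test.
FOLLOW_UP_DETECTION = 0.5

detected_fn = round(FN * FOLLOW_UP_DETECTION)  # correctly recorded as false negatives
missed_fn = FN - detected_fn                   # wrongly recorded as true negatives

true_sens, true_spec = sensitivity_specificity(TP, FP, FN, TN)
observed_sens, observed_spec = sensitivity_specificity(
    TP, FP, detected_fn, TN + missed_fn
)

print(f"True accuracy:     sensitivity {true_sens:.3f}, specificity {true_spec:.3f}")
print(f"Observed accuracy: sensitivity {observed_sens:.3f}, specificity {observed_spec:.3f}")
```

With these invented counts, observed sensitivity rises from 0.800 to about 0.889 and observed specificity from 0.900 to about 0.901, so the test appears more accurate than it is.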


An assessment of the effect of biases on diagnostic accuracy studies showed that studies that relied on two or more reference standards to verify the results of the index test reported diagnostic odds ratios that were on average 60% higher than those in studies that used a single reference standard. This can occur because different reference standards vary in how they define the target condition and therefore differ in accuracy. More generally, the direction of the effect is difficult to predict, as verification bias can make a test appear either more accurate or less accurate than it is.

Studies in which the reference standard is an expensive or invasive test are particularly prone to verification bias. For instance, studies assessing the diagnostic accuracy of the faecal occult blood test (FOBT) often employ a design in which only those who test positive on FOBT receive a confirmatory invasive test, colonoscopy. Although these designs may have been chosen because of ethical or funding restrictions, they introduce verification bias. A meta-analysis comparing diagnostic accuracy studies of FOBT for colorectal cancer without verification bias against those with verification bias found that ‘the pooled sensitivity of FOBT without verification bias was significantly lower than those studies with this bias. The pooled specificity of the studies without verification bias was also higher’.

Preventive steps

Ideally, in a diagnostic accuracy study, all patients should receive the same reference test. However, obtaining a reference test for every patient may not be ethical, practical, or cost-effective, which can lead to verification bias. One way to reduce verification bias in clinical studies is to perform the reference test in a random sample of study participants. Some statistical methods have been developed to correct for verification bias, but these should be used with caution.
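
One commonly cited correction for partial verification is the approach of Begg and Greenes, which estimates the probability of disease for each index test result from the verified participants and then reweights by the number of people with each test result in the whole study. The sketch below is a minimal illustration of that idea, not a validated implementation: the function name and all counts are hypothetical, and it assumes that the decision to verify depends only on the index test result. As the simulation work by Cronin and Vickers indicates, such corrections can be unreliable when few false negatives are verified.

```python
def begg_greenes(n_pos, n_neg, tp_v, fp_v, fn_v, tn_v):
    """Return corrected (sensitivity, specificity) under partial verification.

    n_pos, n_neg : all participants with a positive / negative index test
    tp_v, fp_v   : verified test positives with / without the disease
    fn_v, tn_v   : verified test negatives with / without the disease
    Assumes verification depends only on the index test result.
    """
    # Probability of disease given each test result, from verified participants.
    p_dis_given_pos = tp_v / (tp_v + fp_v)
    p_dis_given_neg = fn_v / (fn_v + tn_v)

    # Expected numbers in the whole study, verified or not.
    dis_pos = n_pos * p_dis_given_pos
    dis_neg = n_neg * p_dis_given_neg
    nodis_pos = n_pos - dis_pos
    nodis_neg = n_neg - dis_neg

    sensitivity = dis_pos / (dis_pos + dis_neg)
    specificity = nodis_neg / (nodis_neg + nodis_pos)
    return sensitivity, specificity


# Hypothetical study: 170 test positives (all verified) and 830 test
# negatives (only 83 verified). Counts are invented for illustration.
tp_v, fp_v, fn_v, tn_v = 80, 90, 2, 81

naive_sens = tp_v / (tp_v + fn_v)
naive_spec = tn_v / (tn_v + fp_v)
corrected_sens, corrected_spec = begg_greenes(170, 830, tp_v, fp_v, fn_v, tn_v)

print(f"Naive:     sensitivity {naive_sens:.2f}, specificity {naive_spec:.2f}")
print(f"Corrected: sensitivity {corrected_sens:.2f}, specificity {corrected_spec:.2f}")
```

With these invented counts the corrected estimates are 0.80 and 0.90, against naive estimates of about 0.98 and 0.47. The corrected sensitivity here rests on only two verified false negatives, which is the situation in which such estimates become unstable.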


Begg CB, Greenes RA. Assessment of diagnostic tests when disease verification is subject to selection bias. Biometrics. 1983;39(1):207–215.

Cronin AM, Vickers AJ. Statistical methods to correct for verification bias in diagnostic studies are inadequate when there are few false negatives: a simulation study. BMC Med Res Methodol. 2008 Nov 11;8:75. doi: 10.1186/1471-2288-8-75.

de Groot JA et al. Verification problems in diagnostic accuracy studies: consequences and solutions. BMJ. 2011 Aug 2;343:d4770.

de Groot JA et al. Adjusting for partial verification or workup bias in meta-analyses of diagnostic accuracy studies. Am J Epidemiol. 2012 Apr 15;175(8):847-53.

Rosman AS, Korsten MA. Effect of verification bias on the sensitivity of fecal occult blood testing: a meta-analysis. J Gen Intern Med. 2010;25:1211–1221.

Rutjes AW et al. Evidence of bias and variation in diagnostic accuracy studies. CMAJ. 2006 Feb 14;174(4):469-76.

Whiting PF et al. QUADAS-2: a revised tool for the quality assessment of diagnostic accuracy studies. Ann Intern Med. 2011;155:529-536.
