Background
Many reference tests are invasive, expensive, or carry a procedural risk (e.g. angiography, biopsy, surgery), so patients and clinicians may be less likely to pursue further testing if a preliminary test is negative. Verification may be partial, where only those with a positive test receive the reference standard, or differential, where a different reference test is used depending on the result of the index test.
Example
In a study assessing the accuracy of D-dimer testing for diagnosing pulmonary embolism (PE), patients who had a positive result were further assessed with ventilation-perfusion scans (reference standard 1), whereas patients who had a negative D-dimer result were evaluated with routine clinical follow-up (reference standard 2).
Patients who had an asymptomatic pulmonary embolism but a false negative D-dimer result (they had the disease but tested negative) may not have been diagnosed by routine follow-up (symptoms may have resolved in the interim). Using clinical follow-up as a second reference standard may therefore have resulted in patients with false negative results being misclassified as true negatives (not having the disease), overestimating the accuracy of the test.
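The misclassification described above can be illustrated with a small worked example. The sketch below uses an entirely hypothetical cohort (the counts, the function names, and the assumption that the scan classifies every test-positive patient correctly are illustrative and do not come from the D-dimer study). It shows how false negatives that clinical follow-up fails to detect are recorded as true negatives.

```python
# Hypothetical cohort: all numbers are illustrative assumptions, not study data.

def sensitivity(tp, fn):
    """Proportion of diseased patients with a positive index test."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """Proportion of non-diseased patients with a negative index test."""
    return tn / (tn + fp)

def differential_verification(tp, fp, fn, tn, missed_found_by_followup):
    """Return the 2x2 counts as recorded when test-negative patients are
    verified by clinical follow-up that identifies only some missed cases.
    Assumes the scan classifies all test-positive patients correctly."""
    undetected = fn - missed_found_by_followup
    # Undetected false negatives are recorded as true negatives.
    return tp, fp, missed_found_by_followup, tn + undetected

# True status: 100 of 1000 patients have PE; the test misses 10 of them.
tp, fp, fn, tn = 90, 360, 10, 540
true_sens = sensitivity(tp, fn)
true_spec = specificity(tn, fp)

# Assume follow-up picks up only 4 of the 10 missed cases.
o_tp, o_fp, o_fn, o_tn = differential_verification(
    tp, fp, fn, tn, missed_found_by_followup=4
)
obs_sens = sensitivity(o_tp, o_fn)
obs_spec = specificity(o_tn, o_fp)

print(f"true sensitivity     {true_sens:.3f}, specificity {true_spec:.3f}")
print(f"observed sensitivity {obs_sens:.3f}, specificity {obs_spec:.3f}")
```

Under these assumed numbers, sensitivity appears to rise from 0.90 to about 0.96 and specificity rises slightly, so the test looks more accurate than it is.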
Impact
An assessment of the effect of biases on diagnostic accuracy studies showed that studies relying on two or more reference standards to verify index test results reported diagnostic odds ratios that were on average 60% higher than those in studies using a single reference standard. This can occur because different reference standards vary in how they define the target condition and therefore differ in accuracy. For an individual study, however, the direction of the effect is difficult to predict, as verification bias can make a test appear more accurate or less accurate than it is.
Studies in which the reference standard is an expensive and/or invasive test are particularly prone to verification bias. For instance, studies assessing the diagnostic accuracy of the faecal occult blood test (FOBT) often employ a design in which only those who test positive on FOBT receive a confirmatory invasive test, colonoscopy. Although these designs may have been chosen because of ethical or funding restrictions, they introduce verification bias. A comparison of diagnostic accuracy studies of FOBT for colorectal cancer with and without verification bias found that ‘the pooled sensitivity of FOBT without verification bias was significantly lower than those studies with this bias. The pooled specificity of the studies without verification bias was also higher.’
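The direction of this effect can be reproduced with simple arithmetic. The following is a minimal sketch using hypothetical screening numbers (the prevalence, accuracy values, verification fraction, and function names are assumptions chosen for illustration, not FOBT data). It computes the expected 2x2 counts, then the naive estimates obtained when all test-positive patients but only a fraction of test-negative patients receive the reference test.

```python
# Hypothetical screening scenario: all values are illustrative assumptions.

def expected_counts(n, prevalence, sens, spec):
    """Expected true 2x2 counts for n patients."""
    diseased = n * prevalence
    healthy = n - diseased
    tp = diseased * sens
    fn = diseased - tp
    tn = healthy * spec
    fp = healthy - tn
    return tp, fp, fn, tn

def naive_estimates(tp, fp, fn, tn, frac_neg_verified):
    """Sensitivity and specificity computed from verified patients only,
    when every test-positive patient but only a fraction of test-negative
    patients receive the reference test."""
    v_fn = fn * frac_neg_verified
    v_tn = tn * frac_neg_verified
    naive_sens = tp / (tp + v_fn)
    naive_spec = v_tn / (v_tn + fp)
    return naive_sens, naive_spec

# Assumed truth: 1% prevalence, sensitivity 0.50, specificity 0.90.
tp, fp, fn, tn = expected_counts(10_000, 0.01, 0.50, 0.90)

# Assume only 10% of test-negative patients are verified.
naive_sens, naive_spec = naive_estimates(tp, fp, fn, tn, frac_neg_verified=0.10)

print(f"naive sensitivity {naive_sens:.3f} (assumed truth 0.50)")
print(f"naive specificity {naive_spec:.3f} (assumed truth 0.90)")
```

With these assumptions the naive sensitivity is about 0.91 and the naive specificity about 0.47, the same pattern as in the quoted comparison: sensitivity is inflated and specificity deflated when verification depends on the index test result.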
Preventive steps
Ideally, in a diagnostic accuracy study, all patients should receive the same reference test. However, performing the reference test in every patient may not be ethical, practical, or cost-effective, and incomplete verification can lead to verification bias. One way to reduce verification bias in clinical studies is to perform the reference test in a random sample of study participants. Some statistical methods have been developed to correct for verification bias, but these should be used with caution.
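As an illustration of how random sampling makes a correction possible, the sketch below applies inverse-probability weighting: each verified patient is weighted by the reciprocal of the probability that a patient with the same index test result was verified. This is one simple approach, shown under the assumption that verification depends only on the index test result; it is not necessarily the method used in any particular study. The counts, the `ipw_corrected` function, and the verification probabilities are hypothetical.

```python
# Hypothetical verified sample: all values are illustrative assumptions.

def ipw_corrected(verified, p_verify):
    """Inverse-probability-weighted sensitivity and specificity.

    verified: dict mapping (test_result, has_disease) to the number of
              verified patients in that cell.
    p_verify: dict mapping test_result to the probability of verification.
    Valid only if verification depends solely on the index test result.
    """
    weighted = {
        cell: count / p_verify[cell[0]] for cell, count in verified.items()
    }
    tp = weighted[("pos", True)]
    fp = weighted[("pos", False)]
    fn = weighted[("neg", True)]
    tn = weighted[("neg", False)]
    return tp / (tp + fn), tn / (tn + fp)

# All test-positive patients verified; a 10% random sample of test-negatives.
verified = {
    ("pos", True): 50,
    ("pos", False): 990,
    ("neg", True): 5,
    ("neg", False): 891,
}
p_verify = {"pos": 1.0, "neg": 0.10}

# Naive estimates ignore the sampling fractions.
naive_sens = 50 / (50 + 5)
naive_spec = 891 / (891 + 990)

corr_sens, corr_spec = ipw_corrected(verified, p_verify)
print(f"naive:     sensitivity {naive_sens:.3f}, specificity {naive_spec:.3f}")
print(f"corrected: sensitivity {corr_sens:.3f}, specificity {corr_spec:.3f}")
```

In this constructed example the weighted estimates recover a sensitivity of 0.50 and a specificity of 0.90. In practice the verified test-negative group is often small, so corrected estimates can be imprecise, and the correction fails if verification also depends on unrecorded factors such as symptoms; this is one reason such methods should be used with caution.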