Background
In a diagnostic test accuracy study, the person interpreting the results of one test should ideally be unaware of the results of the other test (whether the index test or the reference standard). For example, if an investigation assesses doctors’ accuracy in diagnosing ankle fractures based on physical diagnosis, the doctors should not know the X-ray results. Knowledge of the results of another test is more likely to influence interpreters when the test requires a subjective reading or assessment (e.g. imaging tests or clinical evaluation) than when it is an objective measure (e.g. automated laboratory tests).
The availability of (or lack of) relevant clinical or patient information (e.g. age, gender, symptom severity and location) may impact the performance and/or interpretation of a test result. This is referred to as clinical review bias. For example, the site of symptoms frequently guides how imaging results are examined. Therefore, it is essential when assessing the diagnostic accuracy of an index test that the clinical information available is the same as what would be available in practice.
It can then be difficult to separate the diagnostic value of the pre-existing clinical information from the added value of the index test unless a comparison is made that isolates the incremental value of the new test. This should be considered when choosing the comparator in the review objectives. In addition, studies have shown that the availability of clinical information to the person interpreting the index test results increases sensitivity with less effect on specificity (Whiting 2004, Loy 2004).
Example
A systematic review of the diagnostic accuracy of carcinoembryonic antigen (CEA) to detect colorectal cancer recurrence found that in 39 of the 42 included studies it was unclear whether clinicians were blinded to the results of the CEA test (index test), and in 1 study there was no blinding of the index test. This may have influenced the interpretation of the reference test, which was clinical diagnostics (e.g. clinical features, pathology, radiology, colonoscopy).
A systematic review of endometrial biomarkers for the non‐invasive diagnosis of endometriosis found that in 42 of 54 included studies, the index test operators were not blind to the results of the reference standard, and in 8 studies, blinding was unclear.
A review assessing reference standard-related bias in studies of radiographers’ reading of plain radiographs showed that reading performance is inflated when the observer is aware of the reference standard report before commenting on the radiograph.
Impact
A systematic review of the sources of bias in diagnostic test accuracy studies showed that sensitivity and overall accuracy were higher when the person interpreting the reference standard was aware of the index test results, and found some evidence that sensitivity was higher when the person interpreting the index test was not blinded to the results of the reference standard. In some studies, however, lack of blinding had no effect on the diagnostic accuracy measure. This may be related to the nature of the index or reference test (e.g. whether it is an objective measure). The authors also showed that the availability of clinical information to the person interpreting the results of the index test increases sensitivity, while the effect on specificity varied.
A study assessing the impact of design biases on diagnostic accuracy estimates reported an overestimation of the diagnostic odds ratio by approximately 30% in studies interpreting the reference test with knowledge of the outcomes of the index test, compared to studies with adequate blinding (relative diagnostic odds ratio, 1.3; 95% CI, 1.0-1.9).
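To make the scale of a relative diagnostic odds ratio of about 1.3 concrete, the sketch below computes the diagnostic odds ratio (DOR) from two 2x2 tables and takes their ratio. All counts are hypothetical and chosen only for illustration; they are not taken from the study cited above, which estimated the relative DOR across many studies rather than from a single pair of tables. The function and variable names are likewise assumptions made for this example.

```python
def sensitivity(tp, fn):
    """Proportion of diseased participants with a positive test."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Proportion of non-diseased participants with a negative test."""
    return tn / (tn + fp)


def diagnostic_odds_ratio(tp, fp, fn, tn):
    """DOR = (TP * TN) / (FP * FN), computed from a 2x2 table."""
    return (tp * tn) / (fp * fn)


# Hypothetical 2x2 counts (100 diseased, 100 non-diseased in each study).
# In the unblinded study, four extra diseased participants are classified
# as positive, mimicking an inflated sensitivity.
blinded = {"tp": 80, "fp": 20, "fn": 20, "tn": 80}
unblinded = {"tp": 84, "fp": 20, "fn": 16, "tn": 80}

sens_blinded = sensitivity(blinded["tp"], blinded["fn"])        # 0.80
sens_unblinded = sensitivity(unblinded["tp"], unblinded["fn"])  # 0.84
spec_blinded = specificity(blinded["tn"], blinded["fp"])        # 0.80
spec_unblinded = specificity(unblinded["tn"], unblinded["fp"])  # 0.80

dor_blinded = diagnostic_odds_ratio(**blinded)      # 6400 / 400 = 16.0
dor_unblinded = diagnostic_odds_ratio(**unblinded)  # 6720 / 320 = 21.0

# Relative DOR: unblinded DOR divided by blinded DOR.
relative_dor = dor_unblinded / dor_blinded          # 21.0 / 16.0 = 1.3125

print(f"Sensitivity: {sens_blinded:.2f} (blinded) vs {sens_unblinded:.2f} (unblinded)")
print(f"DOR: {dor_blinded} (blinded) vs {dor_unblinded} (unblinded)")
print(f"Relative DOR: {relative_dor}")
```

In this hypothetical case, a rise in sensitivity from 0.80 to 0.84 with unchanged specificity is enough to give a relative DOR of roughly 1.3, which shows how a modest shift in test reading can produce an overestimate of that size.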
Preventive steps
Primary studies of diagnostic test accuracy should blind outcome assessors to the results of the other test; that is, assessors of the results of the index test should be blinded to the results of the reference test and vice versa. This is particularly important where test results require a subjective interpretation. In line with the STARD guidelines for diagnostic test accuracy study reporting, it should also be clearly reported in the methods whether outcome assessors were masked to index and reference standard results. Where masking was not possible, a clear statement as to the potential impact of the lack of masking on the results (sensitivity and specificity) should be included when reporting the results of the study.
To avoid clinical review bias, the clinical and patient history information available when interpreting the results of the index test should be the same as what would be available in practice.