Catalogue of Bias

Review biases

Occurs in diagnostic test accuracy studies when the person interpreting the results of the index test has knowledge of the results of the reference standard (test review bias), or when the person interpreting the results of the reference standard has knowledge of the results of the index test (diagnostic review bias). Clinical review bias occurs when relevant clinical and patient information is available to the person interpreting the index test or reference standard result. These are all forms of observer bias.


Background

In a diagnostic test accuracy study, the person interpreting one test should ideally be unaware of the results of the other test (whether the index test or the reference standard). For example, if an investigation assesses doctors’ accuracy in diagnosing ankle fractures by physical examination, the doctors should not know the X-ray results. Interpreters are more likely to be influenced by knowledge of another test’s result when the test requires a subjective reading or assessment (e.g. imaging tests or clinical evaluation) than when it yields an objective measure (e.g. automated laboratory tests).

The availability of (or lack of) relevant clinical or patient information (e.g. age, gender, symptom severity and location) may impact the performance and/or interpretation of a test result. This is referred to as clinical review bias. For example, the site of symptoms frequently guides how imaging results are examined. Therefore, it is essential when assessing the diagnostic accuracy of an index test that the clinical information available is the same as what would be available in practice.

When clinical information is available, it can be difficult to separate the diagnostic value of that pre-existing information from the added value of the index test, unless a comparison is made that isolates the incremental value of the new test. This should be considered when choosing the comparator in the review objectives. In addition, studies have shown that making clinical information available to the person interpreting the index test results increases sensitivity, with less effect on specificity (Whiting 2004, Loy 2004).

Example

A systematic review of the diagnostic accuracy of carcinoembryonic antigen (CEA) to detect colorectal cancer recurrence found that in 39 of the 42 included studies it was unclear whether clinicians were blinded to the results of the CEA test (index test), and in 1 study there was no blinding to the index test results. This may have influenced the interpretation of the reference test, which was a clinical diagnosis (e.g. based on clinical features, pathology, radiology, colonoscopy).

A systematic review of endometrial biomarkers for the non-invasive diagnosis of endometriosis found that in 42 of 54 included studies the index test operators were not blinded to the results of the reference standard, and in 8 studies it was unclear whether they were blinded.

A review assessing reference standard-related bias in studies of radiographers’ reading of plain radiographs showed that reading performance is inflated when the observer is aware of the reference standard report before commenting on the radiograph.

Impact

A systematic review of the sources of bias in diagnostic test accuracy studies showed that sensitivity and overall accuracy were higher when the person interpreting the reference standard was aware of the index test results. It also found some evidence that sensitivity was higher when the person interpreting the index test was not blinded to the results of the reference standard. In some studies, lack of blinding had no effect on the diagnostic accuracy measure, which may be related to the nature of the index or reference test (e.g. whether it is an objective measure). The authors also showed that making clinical information available to the person interpreting the results of the index test increased sensitivity, while the effect on specificity varied.
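To make the accuracy measures above concrete, the following minimal Python sketch computes sensitivity and specificity from the four cells of a 2x2 table. The function name and all counts are hypothetical, chosen only to illustrate the reported pattern (higher sensitivity with little change in specificity when readers are unblinded); they are not data from any of the cited reviews.

```python
def sensitivity_specificity(tp, fp, fn, tn):
    """Return (sensitivity, specificity) from 2x2 table counts.

    tp/fn: diseased participants with a positive/negative index test.
    fp/tn: non-diseased participants with a positive/negative index test.
    """
    sensitivity = tp / (tp + fn)  # proportion of diseased correctly detected
    specificity = tn / (tn + fp)  # proportion of non-diseased correctly ruled out
    return sensitivity, specificity


# Hypothetical counts: 100 diseased and 100 non-diseased participants.
blinded = sensitivity_specificity(tp=70, fp=10, fn=30, tn=90)
unblinded = sensitivity_specificity(tp=80, fp=10, fn=20, tn=90)

print(f"Blinded reading:   sensitivity={blinded[0]:.2f}, specificity={blinded[1]:.2f}")
print(f"Unblinded reading: sensitivity={unblinded[0]:.2f}, specificity={unblinded[1]:.2f}")
```

In this illustration the unblinded reader reclassifies ten false negatives as true positives, which raises sensitivity from 0.70 to 0.80 and leaves specificity at 0.90.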

A study assessing the impact of design biases on diagnostic accuracy estimates reported an overestimation of the diagnostic odds ratio by approximately 30% in studies interpreting the reference test with knowledge of the outcomes of the index test, compared to studies with adequate blinding (relative diagnostic odds ratio, 1.3; 95% CI, 1.0-1.9).
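As a rough illustration of what a relative diagnostic odds ratio of about 1.3 means, the sketch below computes the diagnostic odds ratio (DOR) for two hypothetical 2x2 tables and takes their ratio. The counts and function name are assumptions made for this example only. The cited study estimated its relative DOR across many studies with a regression model, so this simple two-table ratio shows the arithmetic of the measure, not the study's method.

```python
def diagnostic_odds_ratio(tp, fp, fn, tn):
    """DOR = (TP * TN) / (FP * FN).

    Undefined when fp or fn is zero; this sketch does not handle that case.
    """
    return (tp * tn) / (fp * fn)


# Hypothetical counts for a blinded and an unblinded reading of the same test.
dor_blinded = diagnostic_odds_ratio(tp=70, fp=10, fn=30, tn=90)
dor_unblinded = diagnostic_odds_ratio(tp=75, fp=10, fn=25, tn=90)

# Relative DOR: values above 1 mean the unblinded design yields a higher
# apparent accuracy than the blinded design.
relative_dor = dor_unblinded / dor_blinded

print(f"DOR blinded:   {dor_blinded:.1f}")
print(f"DOR unblinded: {dor_unblinded:.1f}")
print(f"Relative DOR:  {relative_dor:.2f}")
```

With these assumed counts the DOR rises from 21.0 to 27.0, a relative DOR of about 1.29, which corresponds to an overestimation of roughly 30%.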

Preventive steps

Primary studies of diagnostic test accuracy should blind outcome assessors to the results of the other test; that is, assessors of the results of the index test should be blinded to the results of the reference test and vice versa. This is particularly important where test results require a subjective interpretation. In line with the STARD guidelines for diagnostic test accuracy study reporting, it should also be clearly reported in the methods whether outcome assessors were masked to index and reference standard results. Where masking was not possible, a clear statement as to the potential impact of the lack of masking on the results (sensitivity and specificity) should be included when reporting the results of the study.

To avoid clinical review bias, the clinical and patient history information available when interpreting the results of the index test should be the same as what would be available in practice.

Cite as

Catalogue Of Bias Collaboration. Plüddemann A, Heneghan C. Review biases. In Catalogue of Bias 2023. https://catalogofbias.org/biases/review-biases/


Related biases

Observer bias


Sources

Boone D et al. Systematic review: bias in imaging studies – the effect of manipulating clinical context, recall bias and reporting intensity. Eur Radiol. 2012 Mar;22(3):495-505.

STARD Group, et al. STARD 2015: an updated list of essential items for reporting diagnostic accuracy studies. BMJ. 2015 Oct 28;351:h5527.

Brealey SD et al. Evidence of reference standard related bias in studies of plain radiograph reading performance: a meta-regression. Br J Radiol. 2007 Jun;80(954):406-13.

Gupta D et al. Endometrial biomarkers for the non-invasive diagnosis of endometriosis. Cochrane Database Syst Rev. 2016 Apr 20;4:CD012165.

Lijmer JG et al. Empirical evidence of design-related bias in studies of diagnostic tests. JAMA. 1999 Sep 15;282(11):1061-6.

Rutjes AW et al. Evidence of bias and variation in diagnostic accuracy studies. CMAJ. 2006 Feb 14;174(4):469-76.

Sørensen et al. The diagnostic accuracy of carcinoembryonic antigen to detect colorectal cancer recurrence – A systematic review. Int J Surg. 2016 Jan;25:134-44.

QUADAS-2 Steering Group, et al. A systematic review classifies sources of bias and variation in diagnostic test accuracy studies. J Clin Epidemiol. 2013 Oct;66(10):1093-104.