Information Bias

Bias that arises from systematic differences in the collection, recall, recording or handling of information used in a study.  

Background

Information bias is any systematic difference from the truth that arises in the collection, recall, recording and handling of information in a study, including how missing data is dealt with.

Major types of information bias are misclassification bias, observer bias, recall bias and reporting bias. Information bias is likely in observational studies, particularly those with retrospective designs, but it can also affect experimental studies.

Example

Chang et al 2010 investigated information bias in the self-reporting of personal computer use within a study looking at computer use and musculoskeletal symptoms.

Over a period of 3 weeks, young adults reported the duration of computer use each day, as well as musculoskeletal symptoms. Usage-monitor software installed on participants’ computers provided the reference measure. When the self-reported data were compared with the reference data, the correlation varied widely, with Spearman’s coefficients ranging from -0.22 to 0.8.

For reference-measured usage of less than 3.6 hours/day, self-reports tended to overestimate the duration of computer use. Conversely, for reference-measured usage of more than 3.6 hours/day, self-reports tended to underestimate the duration of computer use. This illustrates that information bias can operate in more than one direction even within a single study group.
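The pattern described above can be sketched with a small calculation. The paired values below are hypothetical and are not data from Chang et al; only the 3.6 hours/day cut-point comes from the example. The sketch shows how an overall mean error close to zero can hide opposing errors in light and heavy users.

```python
from statistics import mean

# Hypothetical paired measurements in hours/day (NOT data from Chang et al).
reference = [1.0, 2.0, 3.0, 3.5, 4.0, 5.0, 6.0, 8.0]    # usage-monitor software
self_report = [2.0, 2.5, 3.5, 4.0, 3.5, 4.5, 5.0, 6.0]  # participant estimates

THRESHOLD = 3.6  # cut-point reported in the example above


def mean_error(pairs):
    """Mean of (self-report - reference); positive means over-reporting."""
    return mean(s - r for r, s in pairs)


pairs = list(zip(reference, self_report))
light_users = [(r, s) for r, s in pairs if r < THRESHOLD]
heavy_users = [(r, s) for r, s in pairs if r > THRESHOLD]

overall_error = mean_error(pairs)      # small, because the errors partly cancel
light_error = mean_error(light_users)  # positive: light users over-report
heavy_error = mean_error(heavy_users)  # negative: heavy users under-report

print(f"overall: {overall_error:+.2f} h, "
      f"light users: {light_error:+.2f} h, "
      f"heavy users: {heavy_error:+.2f} h")
```

With these illustrative numbers, the overall error is much smaller in magnitude than the error in either subgroup, so examining only the pooled difference would understate the problem.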

Impact

All types of study can be subject to information bias. Observational studies may be at greater risk, particularly those relying on self-reports and retrospective data collection. Although randomisation in intervention studies reduces the risk of bias and confounding, it cannot entirely eradicate these (Shahar & Shahar 2009).

Missing data can be a major cause of information bias where certain groups of people are more likely to have missing data. An example where differential recording may occur is smoking data within medical records. Although smoking status is, overall, quite well recorded in primary care records, those with no record of smoking status are more likely to be ex-smokers or non-smokers than current smokers. Marston et al (2014) showed that the overestimate might be as much as 8% for smoking status. Those who quit smoking at a young age or a long time ago were more likely to be misclassified as non-smokers.
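A minimal sketch of how differential recording can distort an estimate is given below. All counts are hypothetical and are not taken from Marston et al; the only assumption carried over from the text is that ex-smokers and non-smokers are less likely than current smokers to have a recorded smoking status.

```python
# Hypothetical population of 1000 patients (NOT data from Marston et al).
true_counts = {"current": 200, "ex": 300, "never": 500}

# Hypothetical numbers with a recorded smoking status; recording is assumed
# to be more complete for current smokers than for ex- or never-smokers.
recorded_counts = {"current": 190, "ex": 240, "never": 375}

# True prevalence of current smoking in the whole population.
true_prevalence = true_counts["current"] / sum(true_counts.values())

# Prevalence estimated only from patients with a recorded status
# (a "complete-case" estimate).
complete_case_prevalence = (
    recorded_counts["current"] / sum(recorded_counts.values())
)

print(f"true prevalence:          {true_prevalence:.1%}")
print(f"complete-case prevalence: {complete_case_prevalence:.1%}")
```

Because the patients without a record are disproportionately ex-smokers and never-smokers, the estimate based on recorded patients alone is higher than the true prevalence of current smoking in this illustration.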

Non-differential (random) misclassification of measures (where errors in measurements occur equally in all comparison groups) will tend to lead to an underestimation of effect. Differential information bias (where there are different levels of inaccuracy between comparison groups) could work in either direction, resulting in an overestimate or underestimate of the true effect (Kesmodel, 2018). Bias is more likely when the exposure is dichotomised.
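The direction of these effects can be illustrated with expected counts in a case-control 2x2 table. This is a sketch under stated assumptions: the true counts, sensitivities and specificities are hypothetical, and the function names are introduced here for illustration only.

```python
def odds_ratio(case_exp, case_unexp, ctrl_exp, ctrl_unexp):
    """Odds ratio of exposure in cases relative to controls."""
    return (case_exp * ctrl_unexp) / (case_unexp * ctrl_exp)


def misclassify(exposed, unexposed, sensitivity, specificity):
    """Expected counts classified as exposed / unexposed when exposure is
    measured with the given sensitivity and specificity."""
    seen_exposed = exposed * sensitivity + unexposed * (1 - specificity)
    seen_unexposed = exposed * (1 - sensitivity) + unexposed * specificity
    return seen_exposed, seen_unexposed


# Hypothetical true exposure counts: (exposed, unexposed).
cases = (200, 100)
controls = (100, 200)
true_or = odds_ratio(*cases, *controls)

# Non-differential: the same measurement error in cases and controls.
nondiff_or = odds_ratio(*misclassify(*cases, 0.8, 0.9),
                        *misclassify(*controls, 0.8, 0.9))

# Differential: cases recall exposure better than controls.
diff_over_or = odds_ratio(*misclassify(*cases, 0.95, 0.9),
                          *misclassify(*controls, 0.7, 0.9))

# Differential: cases recall exposure worse than controls.
diff_under_or = odds_ratio(*misclassify(*cases, 0.6, 0.9),
                           *misclassify(*controls, 0.95, 0.9))

print(f"true OR:                 {true_or:.2f}")        # 4.00
print(f"non-differential:        {nondiff_or:.2f}")     # 2.62
print(f"differential (over):     {diff_over_or:.2f}")   # 4.67
print(f"differential (under):    {diff_under_or:.2f}")  # 1.23
```

In this illustration the non-differential error pulls the odds ratio towards 1, whereas the two differential scenarios move it in opposite directions depending on which group reports exposure more accurately.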

Preventive steps

Strategies to avoid information bias include choosing an appropriate study design, following well-designed protocols for data collection and handling, and the appropriate definition of exposures and outcomes.

An important step in minimising information bias is to ensure that blinding of intervention status (or exposure status in observational studies) is maintained whilst outcomes are measured and recorded. If this is not possible, then the participants and investigators should be blinded to the main hypotheses of the research.

Where possible, the information should be collected prospectively, using standardised methods and devices.  Where interviewers collect data, questions should be posed neutrally.

Other techniques include utilising medical or work records to ascertain information, or to corroborate self-reported measures.

Close attention to these details should occur at the design stage, as little can be done to address incorrect data once it has been collected. In the case of missing data, techniques such as multiple imputation may be possible if data is thought to be missing at random (Dziura et al 2013).