Big is not always beautiful: the Apple Heart Study


Ami Banerjee blogs about the Apple Heart Study and what the results mean

What a gift. During our fourth away day working on the Catalogue of Bias resource, a systematic compendium of all the possible biases in health research and practice, the Apple Heart Study was published in the New England Journal of Medicine.

This trial recruited 419 297 participants over 8 months. The scale and speed of recruitment comfortably make this the largest and fastest-recruiting trial to date.

Atrial fibrillation (AF) is the world’s commonest heart rhythm problem and causes a significant preventable burden of stroke worldwide, but a substantial proportion of cases is either undiagnosed or diagnosed only after a stroke. Over the last few years, there has been growing interest in new ways of detecting AF at scale. Enter, stage left, the Apple Watch, which has an optical sensor with an irregular pulse notification algorithm.

The Apple Heart Study prospectively recruited adults over the age of 22 in an open-label design without any comparator. Participants who received a notification from the Apple Watch app were prompted to start a telemedicine consultation. Those with urgent symptoms were encouraged to attend the emergency department or urgent care. Those without urgent symptoms were sent an ECG patch to wear for up to 7 days. The ECG patches were returned by post to be checked by two clinicians.

The primary outcome was AF for more than 30 seconds on ECG patch monitoring in a participant who received an irregular pulse notification. Participants also completed a survey at 90 days.

Of 2161 (0.5%) individuals who received an irregular pulse notification, only 945 (44%) were included in the first visit, 658 (30%) had an ECG shipped, 450 (21%) returned an ECG which could be analysed, 372 (17%) completed a 90-day survey, 96 (18%) had a second visit, and 254 (12%) completed the end-of-study survey. No lack of ascertainment, compliance or detection biases here. The percentages presented here are lower than those reported in figure 1 of the NEJM publication because I have used all the data from those who had an irregular pulse notification (intention-to-test group) as the denominator, as opposed to analysing just those who engaged with further testing and follow-up (per-protocol test group).
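The denominator choice can be checked with a few lines of arithmetic. The sketch below is only an illustration using counts quoted in this post for a subset of stages; the variable names are mine, and the "of previous stage" column is one hypothetical way of forming a per-protocol-style percentage, not necessarily how figure 1 of the paper computes its values.

```python
# Counts quoted in this post (a subset of the stages).
enrolled = 419_297
notified = 2161  # received an irregular pulse notification

# Ordered stages after notification, with the number of participants at each.
stages = [
    ("first visit", 945),
    ("ECG patch shipped", 658),
    ("analysable ECG returned", 450),
    ("90-day survey", 372),
    ("end-of-study survey", 254),
]

def pct(count, denominator):
    """Percentage rounded to the nearest whole number."""
    return round(100 * count / denominator)

# Intention-to-test: every stage is divided by all notified participants.
intention_to_test = {name: pct(n, notified) for name, n in stages}

# Illustrative alternative (an assumption, not the paper's method):
# each of the first three stages divided by the stage immediately before it.
funnel = [("notified", notified)] + stages[:3]
of_previous_stage = {
    funnel[i][0]: pct(funnel[i][1], funnel[i - 1][1])
    for i in range(1, len(funnel))
}

notified_pct = round(100 * notified / enrolled, 1)

print(f"notified: {notified_pct}% of enrolled")
for name, n in stages:
    line = f"{name}: {intention_to_test[name]}% of notified"
    if name in of_previous_stage:
        line += f", {of_previous_stage[name]}% of previous stage"
    print(line)
```

With all notified participants as the denominator, the percentages shrink at every step, which is why the figures in this post look less flattering than ones conditioned on continued engagement.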

AF was identified in 153/450 participants who returned ECG patches, resulting in a diagnostic yield of AF on ECG patches of 34% overall (35% in the over 65 years age group and 18% in those younger than 40 years).
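As a rough check on how much the yield depends on who is counted, here is a minimal sketch using only the counts quoted above. The second figure, which divides by everyone who received a notification, is my own illustrative calculation rather than a result reported by the authors.

```python
af_on_patch = 153      # participants with AF identified on the ECG patch
patches_returned = 450 # analysable ECG patches returned
notified = 2161        # received an irregular pulse notification

# Yield among those who returned an analysable patch (the reported figure).
yield_returned = round(100 * af_on_patch / patches_returned)

# Illustrative only: confirmed AF as a share of everyone notified,
# treating non-returners as unconfirmed rather than as AF-free.
confirmed_of_notified = round(100 * af_on_patch / notified)

print(f"AF yield among returned patches: {yield_returned}%")
print(f"Confirmed AF among all notified: {confirmed_of_notified}%")
```

The gap between the two numbers reflects missing follow-up, not evidence that the remaining participants were free of AF.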

The authors report positive predictive values of 71% for an individual tachogram (a 1-minute recording from the Apple Watch) and 84% for an irregular pulse notification, i.e. 84% of participants with an irregular pulse notification had AF. The authors acknowledge that “the positive predictive values were measured for participants who had already received an irregular pulse notification and are therefore only an estimate of the positive predictive value of an initial notification in the overall cohort”, representing a potential spin bias.

In topical areas such as digital technology, wearables and big data, hot stuff bias and confirmation bias are also major issues. While the world still concentrates on big pharma as the source of industry sponsorship bias in trials and evidence-based healthcare, our eyes are diverted from digital technology giants, which have, on average, an order of magnitude more net worth. The size of the study population tells us less about the study, and more about Apple, the undisputed global heavyweight of digital technology companies, currently valued at $961 billion, and the appeal of its products to consumers.

Ami Banerjee, Associate Professor in Clinical Data Science and Honorary Consultant Cardiologist, UCL

Conflicts of interest: Advisory boards for Pfizer, AstraZeneca and Boehringer Ingelheim, and Trustee for South Asian Health Foundation.

Reference: Large-Scale Assessment of a Smartwatch to Identify Atrial Fibrillation. N Engl J Med 2019; 381:1909-1917. DOI: 10.1056/NEJMoa1901183