Clinical Use of the Apple Watch

Apple Inc. released the first Apple Watch in April 2015. At that time, it was marketed mostly as a fitness and fashion accessory. It has since become one of the most popular pieces of consumer health tech on the market, and data generated by the device are commonly encountered in the cardiology and electrophysiology clinic, where they often require action from clinicians. It is therefore important for providers to be aware of the benefits, limitations, and evidence related to use of the device.

In cardiology, physicians primarily encounter the Apple Watch through results from the irregular rhythm notification system or an abnormal electrical tracing generated by the electrocardiogram (ECG) technology. These features were released in September 2018 with the Apple Watch Series 4. The irregular rhythm notification can be turned on in the Apple Watch application on the user’s phone and used with any Series of the watch, while the ECG feature is only available on Series 4 and up.

In June of 2022, Apple announced a new feature, now available, meant to allow users with a known history of atrial fibrillation (AF) to determine what percentage of the time they spend in AF.

Irregular Rhythm Notification:

FDA Approval:

The FDA has approved this feature as Class II Software as a Medical Device, and it is authorized only for over-the-counter use (not prescription use). As a Class II device, the Irregular Rhythm Notification is considered to have a moderate level of risk, which means that the FDA has established specific controls and requirements to provide a reasonable assurance of safety and effectiveness for this feature. It is approved for use in people without a known history of AF.

How does it work:

The Apple Watch uses an optical heart rate sensor, which employs a technology called photoplethysmography (PPG) to measure heart rate. PPG involves shining light, usually from green LEDs, onto the skin and measuring the amount of light that is reflected back. As blood volume in the wrist changes with each heartbeat, the amount of light absorbed and reflected varies, allowing the watch to detect the user’s pulse.

When the user is NOT active (which the watch determines from its accelerometer), it attempts to measure the PPG signal over a 1-minute stretch and stores the result as a tachogram (a plot of the time between heartbeats over that minute). If the signal is not clean enough to capture a certain frequency of beats, the tachogram is discarded and not used. Apple uses the tachogram to assess the regularity of the time between beats, which should be fairly stable in sinus rhythm but will be irregular in rhythms like AF. The threshold for what constitutes ‘irregularity’ between beats is determined by a proprietary algorithm developed by Apple and not available to the public.

Once the watch identifies an irregular tachogram, this triggers a cascade of checks that attempt to run as frequently as every 15 minutes (these are subject to the same limitations noted above regarding signal clarity and the need for the user to be inactive). Only if 5 of 6 tachograms are classified as irregular is the user notified of the irregular rhythm, so an irregular rhythm must persist for a minimum of 75 minutes to reach this threshold. It is also important to remember that, before this cascade is initiated, the watch checks tachograms only every 2-4 hours, not continuously.
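The notification logic described above can be sketched in a few lines of Python. This is only an illustration: Apple's irregularity classifier and threshold are proprietary, so the coefficient-of-variation test, the 0.10 cutoff, the sliding-window simplification, and the names is_irregular and should_notify are all assumptions made for this example. It also assumes the tachograms passed in have already passed the signal-quality check.

```python
from statistics import mean, pstdev

# Hypothetical cutoff: Apple's real irregularity threshold is not public.
CV_THRESHOLD = 0.10


def is_irregular(intervals_s):
    """Flag a clean tachogram (beat-to-beat intervals in seconds) as irregular.

    The coefficient of variation is a stand-in for Apple's undisclosed
    classifier. Expects at least two intervals.
    """
    return pstdev(intervals_s) / mean(intervals_s) > CV_THRESHOLD


def should_notify(tachograms, needed=5, window=6):
    """Return True if `needed` of any `window` consecutive tachograms are irregular."""
    flags = [is_irregular(t) for t in tachograms]
    return any(
        sum(flags[i:i + window]) >= needed
        for i in range(max(1, len(flags) - window + 1))
    )


# Illustrative interval lists, not recorded data.
sinus = [0.80, 0.82, 0.79, 0.81, 0.80]
af = [0.60, 1.10, 0.75, 0.95, 0.50]
print(should_notify([af] * 5 + [sinus]))      # five of six irregular
print(should_notify([af] * 4 + [sinus] * 2))  # only four of six irregular
```

The 5-of-6 rule trades speed for specificity: a single noisy or ectopic-laden minute cannot trigger a notification on its own.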

Data leading to approval:

This product went through a variety of preclinical and clinical testing.  Some of these results are available via Apple and their public FDA filing.  Regarding the preclinical testing, Apple states publicly that: 

“Prior to clinical testing, studies were conducted to develop the PPG-based detection algorithm and to evaluate algorithm performance across a variety of conditions and user behaviors. Among these were deep breathing, riding in a car, hand tremors and motion, reduced hand or wrist perfusion, overnight wear, rapid ventricular response in individuals with AFib, and other arrhythmias. These studies were performed in 2300 control subjects and more than 500 subjects with AF… Because PPG relies on light absorptivity, the arrhythmia detection algorithm was tested across a variety of skin types and tones to ensure that sensor platform adjustments for skin tone were sufficient in the context of the algorithms used to detect arrhythmias… Validation efforts demonstrated no significant difference in algorithm sensitivity or specificity across skin types or tones.”

The primary clinical study used to validate this feature was the Apple Heart Study (AHS), published in the NEJM in 2019. Apple presented data from a pre-specified sub-study of the AHS to the FDA. All participants in the AHS consented to have their data made available for this sub-study. The sub-study consisted of the participants in the larger AHS who received an irregular rhythm notification during the study period. These individuals were then sent a 7-day ambulatory patch ECG recorder (ePatch). While the ePatch was worn, each subsequent irregular tachogram was compared to the simultaneous patch signal, which an independent cardiologist classified as ‘sinus rhythm’, ‘AF’, ‘Unreadable’ or ‘Other Irregular Rhythm’. A total of 269 subjects were included in this sub-study (full reported details can be found in the FDA filing), but only 226 provided both ePatch and tachogram data that could be analyzed. Of these, 25.2% received an irregular rhythm notification, and there were 2,634 irregular tachograms for comparison (out of 10,432 total tachograms generated). The individual tachogram PPV for AF detection was 66.6%, but the notification-level (5 of 6 tachograms) PPV was 78.9%. Overall, 98.2% of the patients who received a notification while wearing the ePatch had AF or another clinically relevant arrhythmia identified, which underscores the importance of further investigation if an irregular rhythm notification is triggered.
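For readers less familiar with the metric, the tachogram-level and notification-level PPVs are the same calculation applied to different units of analysis. A minimal sketch follows; the function name and the counts are hypothetical and are not the Apple Heart Study figures.

```python
def positive_predictive_value(true_positives, false_positives):
    """PPV = TP / (TP + FP): share of positive calls confirmed by the reference."""
    flagged = true_positives + false_positives
    if flagged == 0:
        raise ValueError("no positive calls to evaluate")
    return true_positives / flagged


# Hypothetical counts chosen only to illustrate the arithmetic.
print(f"{positive_predictive_value(30, 10):.1%}")  # 75.0%
```

Because PPV depends on how common AF is in the tested group, a figure measured in people who already received a notification should not be assumed to carry over to the general population.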

ECG Detection Software:

FDA Approval:

The FDA has approved this feature as Class II Software as a Medical Device, and it is authorized only for over-the-counter use (not prescription use). As a Class II device, the ECG feature is considered to have a moderate level of risk, which means that the FDA has established specific controls and requirements to provide a reasonable assurance of safety and effectiveness for this feature. It is specifically noted in the approval letter that the device is not intended to provide a formal diagnosis.

How does it work:

This application was introduced with the Apple Watch Series 4, and has been available on all series in production since that time. The ECG technology consists of electrodes integrated into the back crystal and the Digital Crown of the watch. These electrodes are responsible for measuring and cataloging the electrical signals generated by the user’s heart. Once the ECG app is open on the watch, the user places their finger on the Digital Crown, creating a closed circuit between their finger and the electrodes on the back of the watch. The watch records the ECG signal (analogous to lead I of the clinical ECG) for a duration of 30 seconds. During this time, it’s essential for the user to remain still, as movement or noise can interfere with the ECG reading. The user can then view the ECG waveform from within the app on their phone, which also allows them to generate a PDF of the 30-second rhythm strip to share with parties of their choosing. The ECG app utilizes machine learning to classify the 30-second tracing as one of the following: sinus rhythm; atrial fibrillation; inconclusive due to high or low heart rate or other arrhythmia; or inconclusive as a result of a poor-quality tracing.

Data leading to approval:

The ECG detection algorithm also underwent extensive preclinical validation testing. The sensors and classification algorithm were tested across various ethnicities, wrist circumferences, BMI ranges, ages, non-AFib arrhythmias, degrees of band tightness, postures, and exercise states. A number of factors were associated with a higher burden of ‘unreadable’ ECGs, such as exercise, sweating and low BMI. Additional factors are listed in the source below.

Source: https://www.apple.com/healthcare/docs/site/Apple_Watch_Arrhythmia_Detection.pdf

After preclinical testing, there were two ‘multicenter studies’ sponsored by Apple, which were meant to validate the app’s ability to (i) generate an ECG strip similar to lead I of a 12-lead clinical ECG and (ii) classify the rhythm as sinus rhythm or AF.

The first study was used to support the first version of the app (1.0). It enrolled 602 patients, 301 in an AF cohort and 287 in a sinus rhythm cohort (14 excluded). Study participants were asked “to record three single lead ECGs using the ECG app as study staff simultaneously recorded three 12-lead ECGs using an FDA-cleared clinical device (GE Healthcare CardioSoft ECG device).” For (i), three certified cardiac technicians looked at 140 samples (70 AF and 70 sinus rhythm) and compared the Apple Watch ECG to lead I of the clinical ECG. For (ii), three blinded cardiologists classified the 12-lead ECG as ‘sinus rhythm’, ‘AF’, ‘other’ (not sinus rhythm or AF within a HR of 50-100) or ‘unreadable’. The ECG app likewise classified the app-generated rhythm as ‘sinus rhythm’, ‘AF’, ‘unclassifiable’ or ‘unreadable’. For classifiable ECGs, the sensitivity and specificity were then calculated.
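As a reminder of how these two endpoints are derived from paired app and 12-lead classifications, here is a minimal sketch. The function name and the counts are hypothetical and are not the study data.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    tp / fn: reference-AF recordings the app called AF / not AF.
    tn / fp: reference-sinus recordings the app called sinus / AF.
    """
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical counts chosen only to illustrate the arithmetic.
sens, spec = sensitivity_specificity(tp=95, fn=5, tn=98, fp=2)
print(f"sensitivity={sens:.1%}, specificity={spec:.1%}")
```

Sensitivity is computed only within the reference-AF group and specificity only within the reference-sinus group, so any recordings excluded before this step never enter either denominator.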

The second study was used to support the second version of the app (2.0), which expanded the classifiable heart rate from 50-100 to 50-150 bpm and introduced two new classifications: ‘sinus rhythm with high HR’ (>100 bpm) and ‘AF with high HR’ (>100 bpm). It enrolled 546 patients, 304 in the AF cohort and 241 in the sinus rhythm cohort (1 excluded). The end points of this study were similar to the first (sensitivity and specificity for AF), but included HRs up to 150 bpm. This version of the app was also tasked with attempting to make a diagnosis of supraventricular tachycardia (though the details of what was classified as supraventricular tachycardia for this purpose are not specifically laid out). Two blinded cardiologists independently classified the 12-lead ECGs in this study.
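The difference between the two app versions' heart rate windows can be made concrete with a small sketch. It assumes an upstream rhythm call is already available (the real classifier is machine-learning based and not public); the function name, the 'sinus'/'af' inputs, and the inclusive handling of the 50, 100 and 150 bpm boundaries are all assumptions for illustration.

```python
def label_recording(rhythm, heart_rate_bpm, app_version=2):
    """Map an upstream rhythm call ('sinus' or 'af') plus heart rate to a label.

    Version 1 classifies only 50-100 bpm; version 2 extends this to 150 bpm
    and adds the 'with high HR' labels. Boundary handling is an assumption.
    """
    upper = 100 if app_version == 1 else 150
    if rhythm not in ("sinus", "af") or not 50 <= heart_rate_bpm <= upper:
        return "inconclusive"
    name = "sinus rhythm" if rhythm == "sinus" else "AF"
    if heart_rate_bpm > 100:
        return f"{name} with high HR"
    return name


print(label_recording("af", 120, app_version=1))  # outside the version 1 window
print(label_recording("af", 120, app_version=2))  # inside the version 2 window
```

In practice this means a tracing of AF with rapid ventricular response that version 1 could not classify may receive an AF label under version 2.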

Results: 

For the first study, technicians rated 99.2% of all tracings (that weren’t excluded due to artifact, n=13) as morphologically equivalent to lead I of the 12-lead clinical ECG.

Source: https://www.apple.com/healthcare/docs/site/Apple_Watch_Arrhythmia_Detection.pdf

The sensitivity and specificity of the ECG algorithm were 98.3% and 99.6%, respectively (when not including the unclassifiable ECGs, n=13). When unclassifiable ECGs were included, sensitivity and specificity were 95.5% and 97.1%, respectively.

Source: https://www.apple.com/healthcare/docs/site/Apple_Watch_Arrhythmia_Detection.pdf

In the second study, the sensitivity and specificity were 98.5% and 99.3%, respectively, for the expanded HR ECG classifications of AF and sinus rhythm. The algorithm was tested at rest and after exercise, without a significant difference in sensitivity or specificity.

Source: https://www.apple.com/healthcare/docs/site/Apple_Watch_Arrhythmia_Detection.pdf

Atrial Fibrillation History Feature:

FDA Approval:  

The FDA has approved this feature as Class II Software as a Medical Device, and it is authorized only for over-the-counter use (not prescription use). As a Class II device, the AF History feature is considered to have a moderate level of risk, which means that the FDA has established specific controls and requirements to provide a reasonable assurance of safety and effectiveness for this feature. It is approved for users with a known history of AF.

How does it work:

Just like the irregular rhythm notification (see above), this feature uses the PPG LED sensor on the back of the watch. It also uses the same tachogram algorithm with regard to how frequently it checks for irregular rhythms. Apple’s FDA filing notes that the feature uses a proprietary convolutional neural network optimized for sensitivity to classify a rhythm as AF. The user must wear the watch for at least 5 days a week and 12 hours a day for the algorithm to provide data. If these criteria are met, then every Monday the watch breaks down AF data in different ways:

  • Weekly: time over the past 7 days spent in AF as a percentage of time wearing the watch
  • Day of the week: percentage estimate of the amount of time the user was in AF per day of the week averaged over the past 42 days
  • Time of day: percentage estimate of the amount of time the user was in AF at a certain time of day (broken down into 4-hour blocks) averaged over the past 42 days

This information can be exported to healthcare providers or others the user chooses to share it with.
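The weekly figure and its wear-time gate can be sketched as follows. Apple does not publish how the estimate is aggregated, so this is an assumption-laden illustration: the function name, the (hours_worn, hours_in_af) input format, and the reading of the criteria as "at least 5 days with at least 12 hours of wear each" are all choices made for this example.

```python
def weekly_af_burden(days, min_days=5, min_hours=12):
    """Estimate weekly AF burden as a percentage of wear time.

    `days` is a list of (hours_worn, hours_in_af) tuples, one per day.
    Returns None when the wear-time criteria are not met.
    """
    qualifying = sum(1 for worn, _ in days if worn >= min_hours)
    if qualifying < min_days:
        return None  # not enough wear time to report an estimate
    total_worn = sum(worn for worn, _ in days)
    total_af = sum(af for _, af in days)
    return 100 * total_af / total_worn


# Hypothetical week: five 16-hour days with 2 hours of AF each, two days unworn.
week = [(16, 2)] * 5 + [(0, 0)] * 2
print(weekly_af_burden(week))  # 12.5
```

Note that the denominator is time wearing the watch rather than total time, so AF occurring while the watch is off the wrist is invisible to the estimate.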

Data leading to approval:

Apple states in its FDA filing that the device algorithm was trained and tested in the preclinical setting on over 2,500 subjects and 3 million pulse rate recordings.  Similar to the irregular rhythm notification algorithm, they state that the studies “recruited demographically diverse populations with broad representation of age, sex, BMI, race, and skin tones.”  In the preclinical phase, the AF History algorithm was found to have a sensitivity and specificity for AF of 97.0% and 99.0%, respectively.

The feature was tested clinically in an enrolled population of 413 participants with a mix of paroxysmal and permanent AF, who concurrently wore an ECG patch for 13 days. The main endpoint was the agreement between the watch’s AF burden estimate and the burden measured by the clinically proven patch, assessed with the Bland-Altman limits of agreement method. The average difference in weekly percentage burden among the subjects contributing data was 0.67% (limits of agreement -11.4% to 12.8%). 93% of subjects had a difference within 5%, and 96% had a difference within 10%.
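For readers unfamiliar with the method, a Bland-Altman analysis summarizes paired measurements by the mean difference (bias) and the range expected to contain about 95% of differences. The sketch below uses the standard formulation (bias plus or minus 1.96 standard deviations); whether the study applied any further adjustments is not stated here, and the function name and burden values are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(watch, patch):
    """Return (bias, lower_loa, upper_loa) for paired measurements.

    bias is the mean of (watch - patch); the 95% limits of agreement
    are bias +/- 1.96 * sample SD of the differences.
    """
    diffs = [w - p for w, p in zip(watch, patch)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread


# Hypothetical weekly AF burden percentages, not study data.
watch = [12.0, 48.5, 3.0, 99.0, 25.0]
patch = [10.0, 50.0, 2.5, 100.0, 22.0]
bias, lower, upper = bland_altman(watch, patch)
print(f"bias={bias:.2f}%, limits of agreement=({lower:.2f}%, {upper:.2f}%)")
```

A small bias with wide limits, as reported above, means the watch is accurate on average but an individual patient's weekly estimate can still differ from the patch by several percentage points.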
