Non-invasive continuous health monitoring can greatly improve the efficiency of our healthcare systems and help save many lives. As we’ve seen in a previous post, Atrial Fibrillation (or AFib), an abnormal heart rhythm, increases the risk of stroke and heart failure. With proper prevention and diagnosis, we can prevent many of the deaths that result from cardiovascular diseases, the number one cause of death worldwide.

One promising approach is to use everyday wearables to track heart rhythm. This gives cardiologists the chance to monitor problems remotely, in a way that is non-invasive and easy for the patient.

Automation plays a big role. Smart software and algorithms are responsible for alerting physicians if something is wrong. In this post we’ll develop one such algorithm: a neural network that can detect AFib from a few seconds of a single-lead ECG recording, like the one you can take with an Apple Watch.

We’ll roughly follow the paper “Convolutional Recurrent Neural Networks for Electrocardiogram Classification”. We’ll use PyTorch and the PhysioNet dataset.

## The PhysioNet dataset

The publicly available PhysioNet/CinC Challenge 2017 dataset contains 8,528 single-lead ECG recordings, ranging in length from 9 to 61 seconds and sampled at 300 Hz.

Each recording is labeled with one of the classes “normal rhythm”, “AF rhythm”, “other rhythm”, and “noisy recording”. From now on we’ll refer to these classes as “N”, “A”, “O”, and “~” respectively.

If you’d like to follow along, go ahead and download the training data. The data are provided as WFDB-compliant Matlab V4 files (each recording consists of a .mat file with the ECG data and a .hea file with the waveform information). The REFERENCE.csv file contains the labels.
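As a rough sketch of how the labels could be read, the snippet below parses rows of the form `record,label` into a dictionary and counts the classes. It assumes REFERENCE.csv has two comma-separated columns and no header row; the helper name `load_labels` is hypothetical, and the in-memory sample simply reuses the four records plotted later in this post.

```python
import csv
import io
from collections import Counter


def load_labels(fileobj):
    """Parse rows like 'A00001,N' into a {record_name: label} dict.

    Assumes two comma-separated columns and no header row.
    """
    return {row[0]: row[1] for row in csv.reader(fileobj) if len(row) >= 2}


# Tiny in-memory stand-in for REFERENCE.csv (the four records plotted in this post)
sample = io.StringIO("A00001,N\nA00004,A\nA00038,O\nA00022,~\n")
labels = load_labels(sample)

# Count how many recordings fall into each class
counts = Counter(labels.values())
print(counts)

# With the real file (the path is an assumption about where you unpacked the data):
# with open('training2017/REFERENCE.csv', newline='') as fh:
#     labels = load_labels(fh)
```

Counting the classes on the full file is a quick way to see how imbalanced the dataset is before training.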

For a detailed description of the dataset, see this paper.

Let’s plot 10 seconds of a sample recording for each class:

```python
import numpy as np
import scipy.io as sio
import matplotlib.pyplot as plt

# plot 10 seconds
secs = 10
# maximum length of ECG recording is 61 seconds
max_length = 61
# our data was recorded at 300Hz
freq = 300

# one example record per class: (label, record name)
files = [('N', 'A00001'),
         ('A', 'A00004'),
         ('O', 'A00038'),
         ('~', 'A00022')]

fig = plt.figure(figsize=(20, 16))
for i, (cls, f) in enumerate(files):
    # the single-lead signal is the first row of the 'val' array
    data = sio.loadmat('training2017/' + f + '.mat')['val'][0]
    # keep at most the first `secs` seconds (a recording can be shorter)
    segment = data[:secs * freq]
    ax = fig.add_subplot(2, 2, i + 1)
    # sample index / freq gives seconds; raw values / 1000 are plotted as mV
    ax.plot(np.arange(len(segment)) / freq, segment / 1000)
    ax.set_title('Class: ' + cls, fontsize=30)
    ax.set(xlim=[0, secs], xticks=np.arange(0, secs + 1, 1),
           ylim=[-1, 2], yticks=np.arange(-0.5, 1.2, 0.5))
    ax.tick_params(axis='x', labelsize=20)
    ax.tick_params(axis='y', labelsize=20)
    ax.set_xlabel('Time (s)', fontsize=25)
    ax.set_ylabel('Potential (mV)', fontsize=25)

plt.tight_layout()
plt.show()
```

## Preparing the data

Our neural network expects all the input data to be of the same size. Since our recordings have variable lengths, we’ll pad the ECGs with zeros to create sequences of consistent lengths.

```python
# reuses np, sio, plt, max_length and freq from the previous snippet

def zero_pad(data, length):
    """Return a copy of `data` padded with zeros (or truncated) to `length` samples."""
    extended = np.zeros(length)
    siglength = min(length, data.shape[0])
    extended[:siglength] = data[:siglength]
    return extended

# plot a sample
data = sio.loadmat('training2017/A00001.mat')['val'][0]
# 61 seconds is the maximum length in our dataset
data = zero_pad(data, max_length * freq)

plt.figure(figsize=(15, 5))
plt.title('Sample zero-padded ECG recording', fontsize=30)
plt.xlabel('Time (s)', fontsize=25)
plt.ylabel('Potential (mV)', fontsize=25)
plt.plot(np.arange(0, data.shape[0]) / freq, data / 1000)
plt.show()
```
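The same padding idea extends to a whole set of recordings. The following is a minimal sketch, not part of the original code: the helper `pad_batch` is a hypothetical name, and random noise stands in for real ECGs so the example runs without the dataset.

```python
import numpy as np


def pad_batch(recordings, length):
    """Zero-pad (or truncate) each 1-D recording to `length` and stack them row-wise."""
    batch = np.zeros((len(recordings), length))
    for i, rec in enumerate(recordings):
        n = min(length, len(rec))
        batch[i, :n] = rec[:n]
    return batch


freq = 300  # sampling rate in Hz
rng = np.random.default_rng(0)
# Synthetic stand-ins for recordings of 9, 30 and 61 seconds
recordings = [rng.standard_normal(secs * freq) for secs in (9, 30, 61)]

batch = pad_batch(recordings, 61 * freq)
print(batch.shape)  # (3, 18300)
```

Each row now has the same length, so the array can be handed to a model as a single batch.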

Electrocardiograms are representations of the periodic cardiac cycle. To make sense of an ECG we need to look at the frequency information, and how it changes over time.

### STFT and the spectrogram

We can apply the discrete Fourier transform (DFT) to extract the frequency information and obtain the power spectrum, but this discards the time information. Instead, we can apply a form of the Fourier transform called the short-time Fourier transform (STFT). Rather than taking the DFT of the whole signal in one go, we split the signal into small pieces and take the DFT of each one individually, in a sliding-window fashion.

We then plot the result as a spectrogram, which shows frequency, power, and time in a single image. It is a two-dimensional graph, with time on one axis and frequency on the other, and a third dimension represented by color. The amplitude of a particular frequency at a particular time is encoded in the color, with dark blues corresponding to low amplitudes and brighter colors, up through red, corresponding to progressively stronger amplitudes.
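To make the sliding-window idea concrete, here is a small sketch using `scipy.signal.spectrogram` on a synthetic signal (not an ECG) whose frequency changes halfway through. The window length `nperseg=128` and overlap `noverlap=64` are illustrative assumptions, not values taken from this post or the paper.

```python
import numpy as np
from scipy import signal

freq = 300  # sampling rate in Hz, as in the dataset
t = np.arange(0, 10, 1 / freq)

# Synthetic test signal: 10 Hz for the first 5 seconds, then 40 Hz
x = np.where(t < 5, np.sin(2 * np.pi * 10 * t), np.sin(2 * np.pi * 40 * t))

# f: frequency bins (Hz), seg_times: segment centers (s),
# Sxx: power per (frequency, time) cell
f, seg_times, Sxx = signal.spectrogram(x, fs=freq, nperseg=128, noverlap=64)

# Dominant frequency in the first and in the last time segment
first_peak = f[np.argmax(Sxx[:, 0])]
last_peak = f[np.argmax(Sxx[:, -1])]
print(Sxx.shape, first_peak, last_peak)
```

A DFT of the whole signal would show both frequencies but not when each one occurs; here the dominant frequency moves from roughly 10 Hz in the first segment to roughly 40 Hz in the last. Passing `seg_times`, `f` and `Sxx` to `plt.pcolormesh` is one way to draw the image.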

To compress the dynamic range of the spectrogram values, it is also useful to apply a logarithmic transform. Other reports indicate that the logarithmic transform considerably increases classification accuracy.
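A possible way to apply the logarithmic transform is sketched below. The helper `log_spectrogram`, its window parameters, the choice of the natural logarithm, and the small offset `eps` are all assumptions made for illustration. The offset matters for zero-padded recordings: the padded region has zero power, and the logarithm of zero is undefined.

```python
import numpy as np
from scipy import signal


def log_spectrogram(x, fs, nperseg=64, noverlap=32, eps=1e-10):
    """Spectrogram followed by a log transform; `eps` keeps log() finite at zero power."""
    f, t, Sxx = signal.spectrogram(x, fs=fs, nperseg=nperseg, noverlap=noverlap)
    return f, t, np.log(Sxx + eps)


freq = 300  # sampling rate in Hz
rng = np.random.default_rng(0)
# 2 seconds of noise followed by 1 second of zeros, mimicking a zero-padded recording
x = np.concatenate([rng.standard_normal(2 * freq), np.zeros(freq)])

f, t, log_Sxx = log_spectrogram(x, freq)
print(log_Sxx.shape, log_Sxx.min(), log_Sxx.max())
```

In the padded region every value equals `log(eps)`, which gives the network a constant floor instead of infinities.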