Another wave of enthusiasm hit the media in recent weeks, unfolding the seemingly endless opportunities of artificial intelligence (AI) in the service of medicine. This time the message was that a “new AI neural network approach detected heart failure from a single heartbeat with 100% diagnostic accuracy” – an energetic and simple claim that was shared and reshared across social media and beyond. Lay readers fueled the rolling snowball with comments and views suggesting that this Goliath of medicine had finally been defeated by the magic capabilities of AI. But are we really done with heart failure?
The paper that gave grounds for these discussions was published online in Biomedical Signal Processing and Control by Mihaela Porumb et al. (1). The authors implemented, in a very elegant manner, a new approach to the analysis of the electrocardiogram (ECG) using hierarchical neural networks that mimic the human visual system, called Convolutional Neural Networks (CNNs or ConvNets) (2). This class of deep neural networks allows for image recognition and classification and is widely used for object or face recognition. In the work by Porumb et al., the face patterns were replaced by ECG traces, and the system was trained to classify each beat into either a “normal” or a “chronic heart failure (CHF)” category. In total, 490,505 beats were used to train (50%), validate (25%) and test (25%) the model. The outcomes showed a low misclassification rate: 1% false positive and 3% false negative results. The diagnostic accuracy was an impressively high 97.8±0.2%, with an area under the curve of 0.98±0.01. Interestingly, nearly 72% of all misclassified CHF heartbeats belonged to the same subject. Furthermore, when the single-beat analysis was replaced by analysis of a 5-minute ECG segment, the accuracy increased further to 99% and the misclassification rate dropped to 0.1%. Just great!
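To make the underlying approach concrete, the sketch below shows, in Python with Keras, how a small one-dimensional CNN might be set up to classify individual ECG beats into a binary “normal” vs “CHF” category and evaluated on a 50/25/25 train/validation/test split, mirroring the proportions reported in the paper. The architecture, beat length and placeholder data are illustrative assumptions only and do not reproduce the network of Porumb et al.

```python
# Minimal sketch (not the authors' architecture): a 1-D CNN that labels
# single ECG beats as "normal" (0) or "CHF" (1).
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

BEAT_LEN = 256  # assumed number of samples per segmented heartbeat


def build_beat_classifier() -> tf.keras.Model:
    model = models.Sequential([
        layers.Input(shape=(BEAT_LEN, 1)),            # one single-lead ECG beat
        layers.Conv1D(16, kernel_size=5, activation="relu"),
        layers.MaxPooling1D(2),
        layers.Conv1D(32, kernel_size=5, activation="relu"),
        layers.MaxPooling1D(2),
        layers.Flatten(),
        layers.Dense(64, activation="relu"),
        layers.Dense(1, activation="sigmoid"),        # P(beat comes from a CHF recording)
    ])
    model.compile(optimizer="adam",
                  loss="binary_crossentropy",
                  metrics=["accuracy", tf.keras.metrics.AUC(name="auc")])
    return model


# Placeholder beats and labels stand in for the segmented ECG data;
# the 50/25/25 split follows the proportions described in the paper.
beats = np.random.randn(1000, BEAT_LEN, 1).astype("float32")
labels = np.random.randint(0, 2, size=1000)

n = len(beats)
x_train, y_train = beats[: n // 2], labels[: n // 2]
x_val, y_val = beats[n // 2 : 3 * n // 4], labels[n // 2 : 3 * n // 4]
x_test, y_test = beats[3 * n // 4 :], labels[3 * n // 4 :]

model = build_beat_classifier()
model.fit(x_train, y_train, validation_data=(x_val, y_val), epochs=5, batch_size=64)
print(model.evaluate(x_test, y_test, return_dict=True))  # accuracy and AUC on held-out beats
```

In such a setup the reported 5-minute analysis simply corresponds to feeding a longer input segment (or aggregating per-beat predictions) rather than a single beat; the per-beat version is shown here only because it is the headline result.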

Watching the news, one notes that the enthusiasm around deep data analysis resembles the early days of genetic engineering. Back then, too, the general public was presented with groundbreaking outcomes in animal models that were soon to be implemented in everyday clinical practice. This never happened: the road from lab to bedside is complex and time-consuming, and so it will be for AI. What we can do to accelerate it is to apply rigorous scientific methodology to testing AI tools in clinical practice, defining the indications and reaching for improvements in hard clinical endpoints.