Take-home messages
1. Non-invasive diagnostics by means of artificial intelligence are feasible today.
2. ECG features are an ideal substrate to advance data analysis techniques.
Impact on practice statement
Promising AI-driven ECG applications and diagnostics may transform future clinical practice in cardiovascular conditions.
Introduction
The electrocardiogram (ECG) is a universal tool in clinical medicine that has been widely used by clinicians for more than 100 years. Its low cost and ubiquitous availability have enabled clinicians to identify numerous structural and electrical heart abnormalities, or to direct further investigations.
The impressive recent advances in artificial intelligence (AI), particularly in the field of medicine, have provided clinicians with insights into data acquisition and analysis leading to advanced non-invasive diagnostics. Within ECG diagnostics in particular, AI analysis by means of deep-learning convolutional algorithms has enabled rapid interpretation, utilising ECG features as an ideal substrate for this process [1]. Several groups across the globe now have large digital ECG databases linked with clinical datasets, which have in turn revealed the utility of AI in identifying signatures and patterns that are unrecognisable by conventional ECG interpretation [2,3]. Accordingly, neural networks that can identify these patterns have been used to detect various heart conditions and pathologies such as left ventricular (LV) systolic dysfunction, atrial fibrillation (AF) and arrhythmia syndromes, as well as hypertrophic cardiomyopathy (HCM) [3-5].
Despite reaching several milestones, AI applications may also pose some challenges in interpretation. Experience in data acquisition and analysis is required, including data quality control, as is external validation to expose the limitations of a given model. This article gives an overview of the integration of AI-based ECG applications in various cardiovascular conditions, providing information on diagnostic and therapeutic measures; it discusses challenges and pitfalls, and provides insights for future applications.
Artificial intelligence models in ECG diagnostics
Although AI applications have been used for several decades in various fields, most AI applications in cardiovascular disease have only recently been employed. There are essentially two ways AI can be applied in ECG diagnostics: automated ECG interpretation, which has been available for many years (and has been continuously improving), and, more recently, extraction and analysis of raw data, which has provided information that is beyond the perception of the human eye and therefore beyond classic interpretation [6]. Complex processes define cardiac signals on a conventional ECG, such as filtration to enhance their amplitude, but there are also influences from body electrical activity, anatomic variations, cardiac rotation, etc. Therefore, a large amount of data in digital form is required to create models such as machine learning (ML) models, which aim to perform complex mathematical tasks and optimise a “solution” using the large computational power and software that are currently available [2].
Deep learning (DL) is a subset of ML that uses neural networks with many layers to mimic the learning process of the human brain. More specifically, each “neuron” represents an equation with parameters that are adjusted during network training, so that the representation of the input is learned by the network itself. A frequently used model is a subtype of neural networks called convolutional neural networks (CNNs), which are particularly useful for finding patterns in images and signal data such as ECGs [7]. CNNs have evolved from deep neural networks, and either supervised or unsupervised techniques can be used to train them. A neural network consists of two serial components, the feature extraction layers and the subsampling (pooling) layers; the latter take the feature extraction output as their input to further analyse the data and generate the ultimate output (Figure 1) [8].
Figure 1. Generic development of a convolutional neural network using the 12-lead ECG. The ECG analogue signal is converted to a digital recording, resulting in a list of numerical values corresponding to the amplitude of the signal, feeding sequential layers of convolutions until the final model output is reached.
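The two components described above can be illustrated with a minimal sketch in plain Python. It is not the architecture of any model cited in this article: the toy trace, the hand-picked kernel and the function names conv1d, relu and max_pool are assumptions made purely for illustration, and a real CNN would learn many kernels from data rather than use a fixed one.

```python
def conv1d(signal, kernel):
    """Slide the kernel along the signal (valid mode, no padding).

    This is the cross-correlation that deep-learning libraries call convolution.
    """
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def relu(values):
    """Keep positive responses and set negative ones to zero."""
    return [max(0.0, v) for v in values]


def max_pool(values, size):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(values[i:i + size]) for i in range(0, len(values) - size + 1, size)]


# Toy single-lead trace (arbitrary units) with one sharp, QRS-like deflection.
ecg = [0.0, 0.0, 0.1, 0.0, -0.1, 1.0, -0.3, 0.0, 0.1, 0.0, 0.0, 0.0]

# Hypothetical hand-picked kernel that responds to a rapid upstroke.
kernel = [-1.0, 1.0]

features = relu(conv1d(ecg, kernel))  # feature extraction step
pooled = max_pool(features, 2)        # subsampling (pooling) step
print(pooled)
```

The pooled output is shorter than the input yet still peaks at the position of the sharp deflection, which is the sense in which pooling condenses the extracted features before later layers use them.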
During the learning phase, both the inputs and the outputs are presented to the network and the model “discovers” certain rules. This process often requires enormous datasets and substantial computing power. The ultimate goal would be for developed networks to be capable of identifying novel relationships independent of features selected by a human. In the last phase, the parameters are fixed, and an algorithm that can be applied unchanged to any input is established. Structurally, an artificial neural network consists of multiple units called nodes or neurons that are organised into layers. Each layer uses the feature output of the preceding layer as its own input to complete the analysis and create the ultimate output [2].
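As a minimal sketch of this learning phase, the following trains a single “neuron” (a logistic unit) by gradient descent. The one-dimensional feature values and labels are invented for illustration, and the learning rate and number of epochs are arbitrary assumptions; the networks discussed in this article have far more parameters but are adjusted on the same principle.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def mean_log_loss(w, b, xs, ys):
    """Average cross-entropy between predicted probabilities and labels."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(w * x + b)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(xs)


def train(xs, ys, lr=0.5, epochs=2000):
    """Adjust the weight and bias to reduce the loss on labelled examples."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Hypothetical feature values (inputs) with their known labels (outputs).
xs = [0.2, 0.4, 0.5, 1.5, 1.7, 2.0]
ys = [0, 0, 0, 1, 1, 1]

w, b = train(xs, ys)
print("loss before:", mean_log_loss(0.0, 0.0, xs, ys))
print("loss after: ", mean_log_loss(w, b, xs, ys))
```

Once training stops, w and b are fixed and the same function can be applied to any new input, which corresponds to the last phase described above.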
Machine-learning techniques, alongside the availability of large ECG datasets, have enabled the systematic extraction of related features and their link with specific cardiac disease. More specifically, an ECG is converted into a time series, sampled such that each sample represents the signal amplitude at a certain time point. For instance, the output may be a “yes/no” answer on the presence of atrial fibrillation (or of systolic dysfunction), predicted from an ECG acquired in sinus rhythm. Therefore, given that the training of ML models requires only labelled data, ML seems like a reasonable next step in reviewing and expanding those data [9].
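The conversion to a time series can be sketched as follows. The sampling rate of 500 Hz and the 10-second duration are assumptions chosen as typical values for a digital 12-lead recording, not figures taken from the studies cited here, and the all-zero signal and its label are placeholders.

```python
SAMPLING_RATE_HZ = 500  # assumption: a commonly used rate for digital ECGs
DURATION_S = 10         # assumption: a standard 10-second recording
N_LEADS = 12


def sample_times(n_samples, rate_hz):
    """Time point (in seconds) that each sample index corresponds to."""
    return [i / rate_hz for i in range(n_samples)]


samples_per_lead = SAMPLING_RATE_HZ * DURATION_S
total_values = samples_per_lead * N_LEADS
times = sample_times(samples_per_lead, SAMPLING_RATE_HZ)

# One labelled training example: a placeholder 12 x 5000 grid of amplitudes
# paired with a yes/no label (1 = condition present, 0 = absent).
signal = [[0.0] * samples_per_lead for _ in range(N_LEADS)]
labelled_example = (signal, 1)

print(samples_per_lead, total_values, times[1], times[-1])
```

Under these assumptions a single ECG becomes 60,000 numbers, which gives a sense of why large datasets and considerable computing power are needed for training.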
ECG diagnostics in left ventricular systolic dysfunction
LV systolic dysfunction can be associated with the development of heart failure symptoms and is prognostically important. Such dysfunction is normally identified using echocardiography (or other appropriate imaging modalities) in individuals with relevant symptoms. However, such reductions in systolic function can be asymptomatic, thus limiting the opportunity for their detection [10]. Nevertheless, early detection and treatment are associated with improved morbidity outcomes [11]. As such, the ability to determine the presence of systolic dysfunction prior to symptom development is important.
An AI algorithm developed from ~45,000 individuals and tested on a separate group of >50,000 individuals has been able to accurately identify LV dysfunction (defined as an LV ejection fraction [LVEF] ≤35%) with an area under the curve (AUC) of 0.93 [12]. Indeed, even in those with normal LV function at the time of assessment, those flagged by the algorithm were significantly more likely to develop LV systolic dysfunction over the following three years than those considered to be negative by the algorithm. The predictive value of this algorithm has subsequently been validated in a prospective dataset with similar findings [13]. Kwon et al have shown similar results in a Korean cohort of ~55,000 ECGs with a model able to detect both reduced (LVEF ≤40%) and mid-range systolic function (LVEF ≤50%) [14]. Other groups have also developed algorithms with similar efficacy for detecting systolic dysfunction [15]. As a result, such AI models, in combination with readily available clinical and biomarker data, may provide an appropriate screening tool for LV dysfunction.
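For readers less familiar with the AUC, it can be read as the probability that a randomly chosen individual with the condition receives a higher model score than a randomly chosen individual without it. The sketch below computes it by direct pairwise comparison; the scores and labels are invented for illustration and are unrelated to the cited studies.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    A tie between a positive and a negative score counts as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical model outputs: 3 individuals with the condition, 5 without.
scores = [0.90, 0.80, 0.35, 0.60, 0.30, 0.20, 0.10, 0.40]
labels = [1, 1, 1, 0, 0, 0, 0, 0]

print(round(auc(scores, labels), 3))
```

An AUC of 0.5 corresponds to chance and 1.0 to perfect separation. The pairwise form is quadratic in the number of individuals, so it is suited to explanation rather than to datasets of the size discussed here.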
ECG diagnostics in hypertrophic cardiomyopathy
AI, particularly deep learning with CNNs, has shown promise in detecting HCM via ECGs. Traditional ECG features suggestive of HCM have been hampered by their inconsistency, with up to 10% of those with HCM having apparently normal ECGs [16,17]. While screening with echocardiography is a reliable method, it is comparatively costly and time-consuming. Development of AI-based ECG diagnostics may be able to avoid the limitations of traditional algorithms by not relying on specific ECG criteria, such as LV hypertrophy (LVH), which often presents variably among HCM patients [18].
An AI algorithm developed by a team from the Mayo Clinic using ECGs from ~2,500 individuals with HCM, and ~51,000 controls, demonstrated impressive accuracy (AUC 0.95). This finding was similar when limited to ECGs with either traditional LVH criteria or those with apparently normal ECGs [18]. This model's robust performance, particularly in younger patients and irrespective of mutation status, underscores its potential for broad application in HCM screening. Of potential importance, the high negative predictive value (NPV) suggests that this tool may be useful to rule out HCM in large-scale screening assessments [18]. More broadly, it is possible for AI to accurately determine the presence of LVH from an ECG. An AI algorithm developed from 21,000 individuals with paired ECG and echocardiography data demonstrated a greater sensitivity for the same specificity when compared to assessment of the ECG by a cardiologist [19].
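The link between a high NPV and large-scale screening follows from Bayes' rule: when a condition is rare, most negative results are true negatives. The sketch below makes this explicit. The sensitivity, specificity and prevalence used are hypothetical round numbers chosen for illustration and are not the operating point reported in the cited study.

```python
def negative_predictive_value(sensitivity, specificity, prevalence):
    """Probability that an individual with a negative result is truly unaffected."""
    true_negatives = specificity * (1.0 - prevalence)
    false_negatives = (1.0 - sensitivity) * prevalence
    return true_negatives / (true_negatives + false_negatives)


def positive_predictive_value(sensitivity, specificity, prevalence):
    """Probability that an individual with a positive result is truly affected."""
    true_positives = sensitivity * prevalence
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    return true_positives / (true_positives + false_positives)


# Hypothetical operating point and screening prevalence (assumed 1 in 500).
sens, spec, prev = 0.87, 0.90, 0.002

npv = negative_predictive_value(sens, spec, prev)
ppv = positive_predictive_value(sens, spec, prev)
print(f"NPV = {npv:.4f}, PPV = {ppv:.4f}")
```

With these assumed inputs the NPV exceeds 99.9% while the PPV stays below 2%, which is why such a tool is better suited to ruling a rare condition out than to confirming it, and why positive results would still need confirmatory imaging.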
ECG diagnostics in atrial fibrillation
Similar methods have been shown to be successful in helping to identify patients at high risk of AF or those already suffering from symptomless paroxysms of the arrhythmia. Utilising over 600,000 ECGs, AI demonstrated a 79% accuracy and an AUC of 0.87 in detecting paroxysmal AF by identifying subclinical changes not visible during active episodes [2]. Indeed, when ECGs from within one month of the patient being diagnosed with AF were examined, the accuracy of the AI algorithm was even higher (AUC 0.90, accuracy 83%). However, when this model was compared against an established clinical prediction model for AF (CHARGE-AF), the two were similarly predictive of AF risk over time (C-statistics for AI 0.69 [95% confidence interval {CI}: 0.66 to 0.72] and CHARGE-AF 0.69 [95% CI: 0.66 to 0.71]) [6]. As such, while models with high precision can be developed, and while AI may indeed provide an important option for “point of care” risk assessment using a single test, we should not forget the value of traditional prediction models.
Given the important pathological association between AF and stroke, determining stroke risk in those with AF is an important avenue for AI to consider. In this setting, AI models have been shown to outperform the CHA2DS2-VASc score: in one study, three different supervised machine learning models showed modest predictive ability for stroke in those with AF (AUC of 0.60 to 0.66), whereas the CHA2DS2-VASc score had an AUC of only 0.52 [20]. Despite these promising findings, stroke prediction in those with AF remains inaccurate even when AI models are combined with established clinical models.
ECG diagnostics in other cardiac pathologies
The ECG plays a role, to a greater or lesser extent, in the diagnosis of a range of other cardiac pathologies. AI models have been developed to determine the presence of long QT syndrome (LQTS) even in the presence of a normal corrected QT interval. In an analysis of ECGs from ~2,000 individuals from a specialised genetic heart rhythm clinic, the developed algorithm was able to separate those with confirmed LQTS and a QT corrected for heart rate (QTc) <450 ms from those without LQTS (AUC 0.86) [21]. The ability of AI algorithms to detect LQTS has been confirmed by other studies [22]. The ability to identify the presence of LQTS despite a normal QT interval has important screening implications for clinical practice.
Valvular heart disease is a common group of pathologies with important implications for mortality and morbidity [23]. Traditional diagnosis is normally based on clinical auscultation followed by echocardiographic assessment. However, auscultation is a variable skill which may not identify all patients with valvular heart disease [24]. AI algorithms have been developed which identify moderate-severe aortic stenosis (AS) with a high degree of precision (AUC ~0.90), with similar findings when used on an external validation dataset [25,26]. Interestingly, those flagged as positive by one AI model but who were found not to have significant AS were twice as likely to progress to moderate-severe AS in the following 15 years compared to those identified as negative by the algorithm [25]. Similar results have been achieved when identifying individuals with clinically significant mitral valve disease [27]. Indeed, one algorithm developed using ECGs from ~77,000 individuals had a high degree of accuracy for all left-sided valvular pathologies [28].
Validation in practice; challenges and limitations
Despite all these advances, it is important to acknowledge several potential challenges and limitations. The first is that ECGs obtained in routine, real-world settings might be of poor quality. Also, even though some models might perform well in a certain population, it is imperative that they undergo rigorous evaluation for external validity in diverse populations. The development of AI-ECG models needs large datasets for training, validation and testing, hence multicentre collaborations seem vital. Moreover, the legal and regulatory aspects of incorporating AI-based diagnoses include whether and to what extent the clinician may opt to take them into consideration, as well as which regulatory bodies must grant approval. A legal framework to regulate AI-based decision-making is warranted.
Conclusions
Impressive recent advances, particularly in the medical field, have provided clinicians with insights into data acquisition and analysis leading to advanced non-invasive diagnostics via AI. Within ECG diagnostics in particular, analysis by means of deep-learning CNNs has enabled rapid interpretation, utilising ECG features as an ideal substrate for this process in various pathologies, including arrhythmias, LV systolic dysfunction, cardiomyopathy and valve disease. However, as with any medical tool, AI in ECG diagnostics requires thorough validation, clinician training and an appropriate legal framework before being integrated into medical practice.