K. Hii. Reorientation of an Atrial Model to Simulate 12-lead ECG Signals: An Overfitting and Data Augmentation Problem. Institute of Biomedical Engineering, Karlsruhe Institute of Technology (KIT). Bachelor's thesis. 2020.
The study of different atrial flutter mechanisms remains a very complex subject. Without invasive mapping techniques, that is, by observing the 12-lead ECG signals alone, it is impossible to differentiate between atrial flutter mechanisms. In this thesis, a more sophisticated approach, a radial basis neural network classifier, is implemented to classify atrial flutter signals according to their mechanism. However, a good classifier depends on two things: a large amount of data, and data that does not cause overfitting. Data from previous studies were used as a benchmark against which to assess the performance of the classifier as the available dataset was enlarged. One way to feed the classifier new data is to use data augmentation methods. In our study, we simulated different rotations of the atrial model around all 3 axes to generate new 12-lead ECG signals. We also investigated the potential problem of overfitting in the process. We started with a correlation analysis of the ECG signals to get an idea of how much the signals change with each rotation. Here, the signals at the initial position were compared with those at each of the rotated positions. We found that within a range of ±10°, the correlation coefficients were in most cases higher than 0.75, so these rotated signals might not be useful for machine learning applications. We implemented different scenarios to investigate which division into train and test datasets would improve the classifier accuracy or trigger an undesired overfitting problem. We found that adding rotations of the atrial model as a means of data augmentation improved the performance of the classifier for some mechanisms. This, however, held only if part of the data from the same atrial model was also used for training. From this, we learned that we could achieve individual increases in accuracy of 12 to 25% for atrial models whose data was partially used in the train set.
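The rotation and correlation steps described above can be sketched in a few lines of Python. This is only an illustration under strong assumptions and not the simulation pipeline of the thesis: the atrial source is reduced to a single time-varying dipole, the 12 leads are represented by a hypothetical random lead-vector matrix, and the names rotation_matrix, lead_correlations and simulate are introduced here for the example.

```python
import numpy as np


def rotation_matrix(rx_deg, ry_deg, rz_deg):
    """Rotation about the x, y and z axes (applied in that order), angles in degrees."""
    rx, ry, rz = np.deg2rad([rx_deg, ry_deg, rz_deg])
    Rx = np.array([[1, 0, 0],
                   [0, np.cos(rx), -np.sin(rx)],
                   [0, np.sin(rx), np.cos(rx)]])
    Ry = np.array([[np.cos(ry), 0, np.sin(ry)],
                   [0, 1, 0],
                   [-np.sin(ry), 0, np.cos(ry)]])
    Rz = np.array([[np.cos(rz), -np.sin(rz), 0],
                   [np.sin(rz), np.cos(rz), 0],
                   [0, 0, 1]])
    return Rz @ Ry @ Rx


def lead_correlations(reference, rotated):
    """Pearson correlation per lead; both inputs have shape (n_leads, n_samples)."""
    ref = reference - reference.mean(axis=1, keepdims=True)
    rot = rotated - rotated.mean(axis=1, keepdims=True)
    num = (ref * rot).sum(axis=1)
    den = np.sqrt((ref ** 2).sum(axis=1) * (rot ** 2).sum(axis=1))
    return num / den


# Toy stand-in for the forward simulation (assumption): each lead signal is the
# projection of a rotating dipole onto a fixed, hypothetical lead vector.
rng = np.random.default_rng(0)
lead_vectors = rng.normal(size=(12, 3))            # hypothetical 12-lead matrix
t = np.linspace(0.0, 1.0, 500)
dipole = np.stack([np.sin(2 * np.pi * 4 * t),
                   np.cos(2 * np.pi * 4 * t),
                   0.3 * np.sin(2 * np.pi * 8 * t)])  # shape (3, n_samples)


def simulate(angles_deg):
    """Rotate the source by (rx, ry, rz) degrees and return the 12 lead signals."""
    return lead_vectors @ (rotation_matrix(*angles_deg) @ dipole)


reference = simulate((0, 0, 0))
for angle in (5, 10, 30):
    r = lead_correlations(reference, simulate((angle, 0, 0)))
    print(f"rotation of {angle} deg about x: lowest lead correlation = {r.min():.3f}")
```

In such a sketch, rotations whose per-lead correlations all stay above a chosen threshold (0.75 in the analysis above) could be flagged as adding little new information to the training data.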
The initial idea was to train the classifier with some atrial models and test it with 'unseen' atrial models. However, the classifier did not perform as well on unknown atrial models: the observed accuracies were lower than the benchmark. We concluded that augmenting the dataset of only 2 atrial models was not enough to improve the overall classifier accuracy, and that more atrial geometries are needed to investigate the possible improvements they could bring.
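The difference between testing on 'unseen' atrial models and testing on models whose data is partly in the train set can be made concrete with a small index-splitting sketch. It does not reproduce the scenarios of the thesis; the function names, the model labels "A", "B", "C" and the 50% fraction are hypothetical and chosen only for illustration.

```python
import numpy as np


def split_by_model(model_ids, test_models):
    """Hold out every sample of the given atrial models ('unseen' scenario)."""
    model_ids = np.asarray(model_ids)
    test_mask = np.isin(model_ids, list(test_models))
    return np.flatnonzero(~test_mask), np.flatnonzero(test_mask)


def split_with_partial_overlap(model_ids, test_models, train_fraction, seed=0):
    """Move a random fraction of each test model's samples into the train set."""
    rng = np.random.default_rng(seed)
    model_ids = np.asarray(model_ids)
    train_idx, test_idx = split_by_model(model_ids, test_models)
    moved = []
    for model in test_models:
        idx = test_idx[model_ids[test_idx] == model]
        n_move = int(round(train_fraction * len(idx)))
        moved.append(rng.choice(idx, size=n_move, replace=False))
    moved = np.concatenate(moved) if moved else np.array([], dtype=int)
    return np.concatenate([train_idx, moved]), np.setdiff1d(test_idx, moved)


# Hypothetical dataset: 3 atrial models with 4 rotated simulations each.
model_ids = np.repeat(["A", "B", "C"], 4)

train_unseen, test_unseen = split_by_model(model_ids, ["C"])
train_partial, test_partial = split_with_partial_overlap(model_ids, ["C"], 0.5)

print("unseen scenario, models in train set: ", sorted(set(model_ids[train_unseen])))
print("partial scenario, models in train set:", sorted(set(model_ids[train_partial])))
```

Because rotated signals from the same atrial model can be highly correlated, accuracy measured in the partial-overlap scenario may overestimate how well the classifier generalises to a new geometry; the model-wise split is the stricter check.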