Perinatal asphyxia, characterized by insufficient oxygen supply to the fetus, is a critical condition that can result in severe neurological and developmental impairments (Wang et al 2021). Immediate identification and intervention are crucial for mitigating its adverse effects. The current standard method for assessing fetal well-being during labour is cardiotocography (CTG), which measures fetal heart rate and uterine contractions (Chen et al 2011). Diagnosis of perinatal asphyxia is guided by the FIGO guidelines (Ayres-de Campos et al 2015). However, concerns have been raised regarding the accuracy of CTG due to subjective interpretation and potential errors, leading to false positives and false negatives (Ojala et al 2006, Benton et al 2020). Moreover, the bulky design of CTG devices requires stationary operation, and transducer placement demands expert knowledge. These limitations necessitate the exploration of novel technologies that can provide more precise and reliable information regarding fetal well-being. Non-invasive fetal electrocardiography (NI-fECG) is a promising sensing technology for fetal heart monitoring that is being developed to refine the determination of perinatal asphyxia (Oudijk et al 2004, Jezewski et al 2017). It records fetal cardiac activity unobtrusively and non-invasively via electrodes on the maternal abdomen. Different electrode positions allow the signal to be acquired at different angles, which are displayed as channels. There is a large number of possible electrode configurations, yet no standardized positioning has been defined. Minimizing the number of electrodes increases comfort for the pregnant woman. Compared to CTG, NI-fECG offers high fetal heart rate accuracy, is less influenced by fetal movements, and is more accurate for women with high BMI (Sänger et al 2012, Hayes-Gill et al 2020, Liu et al 2023).
NI-fECG provides an unobtrusive solution for long-term fetal monitoring as a self-applied wearable device, addressing the challenges of limited access and time-consuming prenatal care.
Although NI-fECG technology is gaining acceptance, and its potential beyond fetal heart rate monitoring is being explored (Jaeger et al 2022), its implementation in clinical practice is still limited (Wakefield et al 2022). Reliable heart rate extraction under real conditions has yet to be demonstrated, and there is currently no definition of normative values (Smith et al 2018). Extracting the fECG is challenging because wearable NI-fECG recordings have increased noise levels and a low signal-to-noise ratio (SNR): the recorded mixture contains fetal and maternal ECG, uterine muscle signals, and other confounders. Several algorithms for fetal QRS detection exist. However, previous comparisons of those algorithms lacked standardization (Hasan et al 2009, Clifford et al 2014, Andreotti et al 2016, Li et al 2017, Kahankova et al 2020, Vaidya and Chaitra 2020). Additionally, there is a need for an algorithm that exhibits stable and reliable performance in the presence of noisy recordings.
In this paper, we introduce Power-MF, a new fetal QRS detection algorithm designed to be robust against noise. Power-MF is based on a combination of power spectral density (PSD) and matched filter techniques. Further, we objectively benchmark Power-MF against the state-of-the-art for fetal QRS detection on the two most recently published datasets, the Non-invasive Multimodal Foetal ECG-Doppler Dataset for Antenatal Cardiology Research (NInFEA) (Sulas et al 2021) and the Abdominal and Direct Fetal ECG Database (ADFECG) (Matonia et al 2020). For benchmarking, we selected three relevant open-source algorithms. We analyze the algorithms' performance with respect to electrode configurations suitable for wearable NI-fECG devices.
A graphical abstract of this paper is shown in figure 1.
Figure 1. Graphical abstract of this paper, showing the benchmark process of Power-MF against state-of-the-art fetal QRS detection algorithms. Further, the background, research goal, methods, and key findings are described.
We conducted a literature search in central databases to identify the currently most relevant fetal QRS detection algorithms. We also took into account algorithms published with open-source datasets and algorithms mentioned in review papers. From this collection, we decided to focus on three algorithms by Behar et al (2014), Varanini et al (2014), and Sulas et al (2021). These algorithms were published after 2013, and their source code is publicly available. Further, they have been evaluated on a complete public dataset with the most commonly used metrics, sensitivity (Se), positive predictive value (PPV), and F1-score, and achieved good results. Table 1 shows the selected algorithms, including their self-stated accuracy and the respective evaluation datasets. In the following, the algorithms are described in detail.
Table 1. Self-stated performance of state-of-the-art fetal QRS detection algorithms. The algorithms are outlined along with their respective publications and the acronyms we defined for this work. The performance metrics include Se, PPV, and F1-score. Decimal places are reported as stated in the corresponding publication.
| Publication | Algorithm acronym | Evaluation dataset | Se in % | PPV in % | F1-score in % |
| Varanini et al (2014) | Varanini | CinC2013 | 99.1 | 98.9 | 98.9^a |
| Behar et al (2014) | Behar | CinC2013 | 95.9 | 96.0 | 96.0 |
| Sulas et al (2021) | Sulas | NInFEA | 97 | 81 | 88^a |

2.1. Sulas (2021)
The algorithm by Sulas et al (2021) (Sulas) has a self-reported Se of 97% and PPV of 81% and was evaluated on the NInFEA dataset (Sulas et al 2021). In the first step, the signal is preprocessed with a bandpass filter between 0.05 Hz and 250 Hz to suppress low- and high-frequency components. Then, the maternal QRS complexes are detected from the thoracic reference channels, and the maternal ECG is suppressed using a deflation algorithm based on periodic component analysis (πCA). The deflation algorithm comprises a decomposition step followed by wavelet denoising and reconstruction of the signal to remove the maternal component (Sameni and Clifford 2010).
To enhance the fetal signal, an independent component analysis (ICA) step is performed on the residual signals by means of the joint approximate diagonalization of eigenmatrices (JADE) algorithm based on the work of Cardoso and Souloumiac (1993). A channel selection based on a matched filter is performed to find the channel that contains the strongest fetal signal after ICA. A generic template of a fetal QRS complex is used for this purpose. To find the fetal QRS complexes, peak detection is performed on the matched filter output of each channel. The RR intervals of each channel are computed, and the channel with the smallest standard deviation of RR intervals is selected.
In order to correct the detected fetal QRS, another πCA step is performed with the temporary fetal QRS as an input. A new template is created based on their positions by averaging multiple fetal QRS. Using the new template, a matched filter is applied. The peak detection results applied to the output signal indicate the final fetal QRS.
2.2. Behar (2014)
The algorithm by Behar et al (2014) (Behar) was published during the PhysioNet/Computing in Cardiology Challenge 2013 (CinC2013) (Goldberger et al 2000). According to the self-reported performance, it achieved a Se of 95.9%, PPV of 96.0%, and an F1-score of 96.0% on CinC2013. In this approach, template subtraction (TS) and ICA are performed in various sequences (TS, TS-ICA, ICA, ICA-TS), and from those, the channel with the smoothest heart rate variability is selected.
First, a notch filter followed by a Butterworth low-pass filter is applied to remove powerline interference and high-frequency components above 200 Hz. Furthermore, baseline wander below 3 Hz is removed using a Butterworth high-pass filter.
On each preprocessed channel, maternal QRS detection is performed with a QRS detector similar to Pan and Tompkins (1985) with a refractory period of 250 ms. Further, a channel selection approach is applied to select the channel with the most plausible maternal QRS sequence as a reference.
Maternal ECG suppression is applied in four different ways, and the best fetal QRS sequence out of all four approaches is selected. The first approach, referred to as 'TS', consists of a simple TS, in which several maternal beats are averaged within a fixed-width window around the maternal R-peak. In the second approach, 'TS-ICA', an ICA step is applied after TS, using the JADE algorithm based on the work of Cardoso and Souloumiac (1993). In the third approach, 'ICA', only an ICA step is performed. In the fourth approach, 'ICA-TS', ICA is followed by a TS. On the residuals of all four methods, peak detection is performed using a QRS detector with an adjusted refractory period of 150 ms to account for the higher fetal heart rate. Out of all fetal QRS sequences, the best one is selected using a beat comparison measure. The final step consists of a smoothing process that removes extra detected fetal QRS and inserts missed fetal QRS based on physiologically reasonable heart rates.
2.3. Varanini (2014)
The algorithm by Varanini et al (2014) (Varanini) has a high self-reported Se of 99.1% and PPV of 98.9% on the CinC2013 dataset, resulting in an F1-score of 98.9%. The algorithm uses ICA and singular value decomposition (SVD) for detecting and suppressing maternal ECG. Another ICA step is performed on the residual signal to enhance the fECG.
First, the signals are preprocessed to remove impulsive artifacts using a median filter. In the second step, baseline wandering is removed with a low pass Butterworth filter in forward and backward directions to achieve a zero-phase shift in the baseline estimate. The difference between the baseline and original signals is then used as the detrended signal. Finally, the power line interference is removed using notch filters.
After preprocessing, a FastICA algorithm according to Hyvarinen (1999) with deflationary orthogonalization and the hyperbolic cosine as contrast function is performed to separate the maternal ECG from the signal. Then, the channel with the best maternal ECG signal is selected, and the maternal QRS are detected using an adaptive threshold on the absolute derivative of the selected maternal channel.
To suppress the maternal ECG, the maternal R-peaks are approximated by an SVD-based method. A template of the maternal ECG is created by weighting the maternal R-peaks with a trapezoidal window and decomposing it with SVD. Afterward, the maternal beats are reconstructed using eigenvectors corresponding to the three largest eigenvalues, which are subtracted from the signal. Maternal ECG cancellation is performed on all channels.
Another ICA step is performed to further enhance and separate the weak fetal signal after maternal ECG suppression. For fetal QRS detection, the signal is filtered with a derivative filter consisting of a comb filter followed by a moving average. After filtering, the QRS detector is applied, similar to the maternal QRS detector but with an adapted RR-interval size. From the detected fetal QRS, the RR intervals are calculated, and a segment is identified in which a good SNR is assumed. This segment is characterized by constant RR intervals. From the beginning/end of this segment, a second QRS detector is initiated in the forward/backward direction. This detector searches for maximum points in the weighted derivative signal. The weights depend on the predicted RR intervals, i.e. the weight is higher in the area where the next QRS is expected. The lengths of the predicted RR intervals are initialized from the values in the start segment and adjusted from beat to beat using a least mean square algorithm. The best RR sequence is selected from each channel based on a-priori knowledge of typical fetal RR values to obtain plausible R-peaks.
3.1. Power-MF fetal QRS detection
In this section, we present Power-MF, a fetal QRS detection algorithm that utilizes PSD and matched filter techniques. The main goal of Power-MF is to improve robustness against noisy signal segments, a common issue in fetal QRS detection. The algorithm continues the work presented by Varanini et al (2014).
Previous state-of-the-art algorithms, such as Varanini, have shown good performance in clean signal segments, see table 1. However, their performance degrades when dealing with noisy segments. By inspecting the false detections of Varanini, we observed inaccuracies in noisy segments where fetal peaks are not clearly visible, see figure 2. Power-MF addresses this issue by incorporating a fetal QRS detection method based on a matched filter. Matched filters have previously been shown to be robust to noise in adult QRS detection, and we hypothesize that they will also perform well in the presence of noise in fECG signals (Eskofier et al 2008, Smigiel and Marciniak 2017, Jamshidian-Tehrani and Sameni 2018). Additionally, Power-MF focuses on identifying the channel with the strongest fECG component for QRS detection. We use the PSD of the fECG to distinguish it from the maternal ECG and other background noise. By focusing on the channel with the strongest fECG component, Power-MF aims to improve the SNR and increase the accuracy of QRS detection. Power-MF employs Varanini's steps for preprocessing and maternal ECG cancellation, see section 2.3. Figure 3 shows the algorithm steps of Power-MF and Varanini.
Figure 2. Processing steps of the fetal QRS detection algorithm by Varanini et al (2014). Varanini et al (2014) falsely detects the fetal peaks in the noisy parts of the signal (127.0–129.5 s). The dashed vertical lines indicate the ground truth fetal peak annotations. The circles in the bottom plot indicate the fetal peaks detected by Varanini et al (2014). The grey areas mark the acceptance interval of 50 ms. The signal is the fourth channel of recording 6 from the dataset ADFECG B2 Labour. The electrode signal is displayed in arbitrary units.
Figure 3. Overview of the algorithm steps of Power-MF and Varanini et al (2014). Power-MF and Varanini et al (2014) share the preprocessing and maternal ECG cancellation. Their fetal QRS detection methodology differs in channel selection and fetal QRS detection. The labels (a)–(i) correspond to different time points during the algorithm. Graphical visualizations of (a)–(i) based on an example signal are included in the supplementary material.
3.1.1. Channel selection based on PSD
We assume that in the channel containing the fECG, the fetal QRS complexes occur at a specific frequency corresponding to the fetal heart rate. Therefore, the channel with the highest PSD in the range of the expected fetal heart rate is selected.
First, the signal is preprocessed to highlight the fetal peaks. To this end, the filtered absolute derivative of the signal is calculated for each channel. A comb filter with an 8 ms delay is used as the derivative filter, followed by a moving average with a 5 ms window length. The absolute values of the derivative are calculated and filtered with a Butterworth bandpass between 0.7 Hz and 8.0 Hz. The derivative highlights the high-frequency components caused by the steep slopes of the QRS. The bandpass is then used to smooth the signal. The filter parameters were determined empirically, using the CinC2013 dataset exclusively.
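The first part of this preprocessing can be sketched as follows. This is a minimal pure-Python illustration, not the authors' implementation: it assumes that the comb filter is the simple difference y[n] = x[n] − x[n−d], with d equal to 8 ms expressed in samples, and that the moving average is causal. The function names are hypothetical, and the subsequent 0.7 Hz to 8.0 Hz Butterworth bandpass is omitted.

```python
def comb_derivative(x, fs, delay_ms=8.0):
    """Difference filter y[n] = x[n] - x[n-d], with d = delay_ms in samples.

    Assumption: this simple feed-forward form stands in for the comb filter.
    """
    d = max(1, round(delay_ms * fs / 1000.0))
    return [x[n] - x[n - d] if n >= d else 0.0 for n in range(len(x))]


def moving_average(x, fs, window_ms=5.0):
    """Causal moving average over window_ms (at least one sample)."""
    w = max(1, round(window_ms * fs / 1000.0))
    out, acc = [], 0.0
    for n, value in enumerate(x):
        acc += value
        if n >= w:
            acc -= x[n - w]  # drop the sample that left the window
        out.append(acc / min(n + 1, w))
    return out


def absolute_derivative(x, fs):
    """Comb derivative, then moving average, then absolute value."""
    return [abs(v) for v in moving_average(comb_derivative(x, fs), fs)]
```

At a sampling frequency of 1000 Hz, the delay corresponds to 8 samples and the averaging window to 5 samples, so a steep edge in the input produces a short, smoothed pulse in the output.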
The PSD was obtained by calculating Welch's PSD estimate (Welch 1967) with a Gaussian window and a 50% overlap. The window size was chosen relative to the sampling frequency so that each window captures at least 15 cardiac cycles, assuming a lower fetal heart rate boundary of 110 bpm (15 cycles at 110 bpm span about 8.2 s). The PSD of the channel with the clearest fetal signal is assumed to contain a peak at a specific frequency corresponding to the fetal heart rate. The channel with the most prominent peak between 1.8 Hz and 3.0 Hz in the PSD is selected, corresponding to a fetal heart rate between 108 bpm and 180 bpm. The healthy fetal heart rate is between 110 bpm and 160 bpm. The upper limit was chosen slightly higher to include boundary cases. The lower limit was shifted only slightly to avoid reaching the range of the maternal heart rate in case the maternal ECG was not completely suppressed.
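The selection rule can be illustrated with a simplified sketch. It is written under several assumptions that are not taken from the paper: the PSD is evaluated by a direct Fourier sum on a frequency grid inside the band rather than by an FFT-based Welch routine, the Gaussian window's standard deviation is a quarter of the segment length, the segment length `seg_len` is supplied by the caller, and "most prominent peak" is approximated by the largest in-band PSD value. All function and parameter names are hypothetical.

```python
import cmath
import math


def welch_band_psd(x, fs, seg_len, freqs, sigma_frac=0.25):
    """Welch-style PSD (Gaussian window, 50% overlap) at the given frequencies.

    Values are correct up to a constant scale factor, which is sufficient
    for comparing channels with each other.
    """
    centre = (seg_len - 1) / 2.0
    sigma = sigma_frac * seg_len  # assumed window width
    win = [math.exp(-0.5 * ((n - centre) / sigma) ** 2) for n in range(seg_len)]
    norm = fs * sum(w * w for w in win)
    starts = range(0, len(x) - seg_len + 1, max(1, seg_len // 2))
    if len(starts) == 0:
        raise ValueError("signal is shorter than one segment")
    psd = [0.0] * len(freqs)
    for s in starts:
        seg = x[s:s + seg_len]
        mean = sum(seg) / seg_len  # remove the segment mean before windowing
        for k, f in enumerate(freqs):
            acc = sum(w * (v - mean) * cmath.exp(-2j * math.pi * f * n / fs)
                      for n, (w, v) in enumerate(zip(win, seg)))
            psd[k] += abs(acc) ** 2 / norm
    return [p / len(starts) for p in psd]


def select_channel(channels, fs, seg_len, band=(1.8, 3.0), df=0.05):
    """Return the index of the channel with the largest PSD value in the band."""
    n_freqs = int(round((band[1] - band[0]) / df)) + 1
    freqs = [band[0] + i * df for i in range(n_freqs)]
    peaks = [max(welch_band_psd(ch, fs, seg_len, freqs)) for ch in channels]
    return peaks.index(max(peaks))
```

The band limits follow from dividing the heart rate in bpm by 60: 108 bpm corresponds to 1.8 Hz and 180 bpm to 3.0 Hz. A channel dominated by a residual maternal component near 1.2 Hz (72 bpm) therefore contributes little power inside the band.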
Figure 4 shows the PSD for all channels of a signal. In the highlighted dashed area between 1.8 Hz and 3.0 Hz only one channel has a clear peak at 2.1 Hz, corresponding to the channel with the best fECG representation.
Figure 4. Channel selection of Power-MF based on PSD. PSD was estimated using Welch's method. A distinct peak occurs for Channel 1 at about 2.1 Hz. The graphs of the other three channels do not show a clear peak. The highlighted dashed area between 1.8 Hz and 3.0 Hz corresponds to a fetal heart rate between 108 bpm and 180 bpm and indicates the range in which the fetal peak is expected. The example shows a signal with four channels.
3.1.2. Fetal QRS detection based on a matched filter
A matched filter maximizes the detection of a target signal by correlating the input signal with a known reference waveform called a template. In the first step, a fetal QRS template is generated. The template is generated for each processed signal separately and uses only the selected channel. For this purpose, all local maxima in the absolute derivative of the signal are determined, which have a certain minimum peak distance to their neighbor peak. Around each detected maximum, the signal is cropped to the size of the median RR interval in the recording, resulting in an array of waveforms centered around preliminary QRS peaks. The template is then calculated as the median of these waveforms. Then, the fetal ECG-enhanced signal of the selected channel is filtered with the time-reversed template. The resulting signal has peaks at locations where there is a high correlation between the signal and the template, i.e. where a fetal QRS waveform is expected.
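The template construction and filtering can be sketched as follows. This is a minimal illustration under stated assumptions rather than the published implementation: the preliminary peak positions are passed in as sample indices, the crop window is given as a hypothetical `half_width` parameter (in Power-MF it would follow from the median RR interval), and filtering with the time-reversed template is written directly as a sliding correlation, which is mathematically equivalent.

```python
from statistics import median


def build_template(signal, peaks, half_width):
    """Median waveform over windows of 2*half_width+1 samples around peaks."""
    windows = [signal[p - half_width:p + half_width + 1]
               for p in peaks
               if p - half_width >= 0 and p + half_width < len(signal)]
    if not windows:
        raise ValueError("no complete window around any peak")
    # Sample-wise median across all cropped waveforms.
    return [median(column) for column in zip(*windows)]


def matched_filter(signal, template):
    """Sliding correlation of signal and template, aligned to the template centre.

    Positions where the template does not fully overlap the signal stay at 0.
    """
    half = len(template) // 2
    out = [0.0] * len(signal)
    for n in range(half, len(signal) - (len(template) - half - 1)):
        segment = signal[n - half:n - half + len(template)]
        out[n] = sum(s * t for s, t in zip(segment, template))
    return out
```

Using the median rather than the mean keeps the template stable when a few preliminary peaks are false detections, since outlying waveforms have little influence on the sample-wise median.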
3.1.3. Parameter optimization of Power-MF
The creation of a fetal QRS template involves the optimization of the minimum peak distance in Power-MF. The minimum peak distance is treated as a hyperparameter and is optimized in a training step on an independent dataset, adjusting the minimum peak distance to achieve the highest F1-score. Specific physiologically meaningful values are systematically tested, and the chosen value is used in all subsequent steps.
3.1.3.1. Data
For parameter optimization of Power-MF, we used the CinC2013 dataset. It contains signals with a length of one minute and comprises four channels. They were collected with different devices, resolutions, and configurations. All recordings have a sampling frequency of 1000 Hz. The dataset contains three subsets, but only subset A is publicly available and contains reference R-peak annotations. In this work, subset A was used to optimize the parameters of Power-MF. Due to inaccurate reference annotations, recordings a33, a38, a52, a54, and a71 were excluded, as suggested in previous publications (Behar et al 2014, Varanini et al 2014).
3.1.3.2. Evaluation and Results
For the minimum peak distance optimization, physiologically meaningful values from 290 ms to 360 ms in 10 ms steps were tested. For each value, the local maxima were computed on all recordings of CinC2013 and compared with the ground truth fetal R-peak annotations. The minimum peak distance is set to the value leading to the highest F1-score.
The algorithm achieved the highest mean F1-score of 94.25% with a minimum peak distance of 340 ms on the training dataset (CinC2013). Consequently, this value was used in all subsequent steps. Table 2 shows all results of the parameter optimization.
Table 2. Results of the parameter optimization of Power-MF on the CinC2013 dataset. The optimal minimum distance between two peaks was determined for the peak detection of the fetal QRS detection step. Mean F1-scores (Min, Max) in %. The highest mean F1-score was achieved at 340 ms (highlighted in bold).
| Minimum peak distance | Mean F1-score in % | Min/max F1-score in % |
| 290 ms | 93.63 | 28.98/100.0 |
| 300 ms | 93.89 | 29.39/100.0 |
| 310 ms | 94.04 | 28.78/100.0 |
| 320 ms | 94.09 | 28.46/100.0 |
| 330 ms | 94.24 | 27.38/100.0 |
| **340 ms** | **94.25** | **29.01/100.0** |
| 350 ms | 93.88 | 25.00/100.0 |
| 360 ms | 92.76 | 27.34/100.0 |

3.2. Benchmarking Power-MF against the state-of-the-art
3.2.1. Data
The most recently published open-source datasets NInFEA (Sulas et al 2021) and ADFECG (Matonia et al 2020), the latter consisting of the two subsets B1 Pregnancy and B2 Labour, were selected as test data for the algorithm benchmarking. An overview of the datasets is given in table 3. All datasets contain ground truth annotations. To the best of our knowledge, further publicly available datasets do not have reference annotations or are simulated data (Moor et al 1997, Andreotti et al 2016, Behar et al 2019).
Table 3. Characteristics of the datasets Non-invasive Multimodal Foetal ECG-Doppler Dataset for Antenatal Cardiology Research (NInFEA) and Abdominal and Direct Fetal ECG Database (ADFECG), that are used in this study.
| Dataset | Year | Number of signals | Number of channels | Pregnancy week | Signal length | Sampling frequency |
| NInFEA (Sulas et al 2021) | 2021 | 60 | 27 channels (22 abdominal, 3 thoracic, 2 back) | 21st–27th | varies between 7.5 s and 2 min | 2048 Hz |
| ADFECG B1 Pregnancy (Matonia et al 2020) | 2020 | 10 | 4 abdominal channels | 32nd–42nd | 20 min | 500 Hz |
| ADFECG B2 Labour (Matonia et al 2020) | 2020 | 12 | 4 abdominal channels | 32nd–42nd during labour | 5 min | 500 Hz |

For ADFECG, the labels were acquired through automated fetal QRS detection and were subsequently verified by clinical experts (Matonia et al 2020). We used all leads as they were published.
For NInFEA, a synchronized pulse-wave Doppler sonography (PWD) of the fetal heart was acquired simultaneously and clinically annotated for heartbeat references. The PWD signal annotations are V-peaks, characteristic points representing the blood flow through the aortic valve. From the physiological perspective, blood flow through the aorta is immediately preceded by the depolarization and contraction of the ventricles. It can, therefore, be assumed that a V-peak in the PWD signal directly follows an R-peak. The V-peaks in the PWD signal have been annotated by expert clinicians during signal acquisition. We use all 22 abdominal channels of the NInFEA dataset for our study. Additionally, we included five electrode configurations based on market-available wearable devices, as proposed by Sulas et al (2021), see figure 5. We excluded the 9th channel of recording 34 in NInFEA due to corrupted signal values.
Figure 5. Electrode configurations of the NInFEA dataset. This overview shows the electrode configurations of the NInFEA dataset, as proposed by Sulas et al (2021). Configurations (a)–(e) model market-available wearable NI-fECG devices.
3.2.2. Algorithm preparation
The proposed algorithm Power-MF covers all steps of the fetal QRS detection pipeline. Its preprocessing and maternal ECG suppression overlap with Varanini, as these have been shown to be robust. Power-MF differs from Varanini in the channel selection and fetal QRS detection steps. A complete overview of the algorithm steps of Power-MF and Varanini is given in figure 3.
The algorithms we employed for benchmarking required some adjustments to function properly on the datasets we selected. The source codes of Behar and Varanini are publicly available on the CinC2013 website. For Sulas, we used the authors' publicly available repository on GitHub, which depends on the OSET toolbox. The test datasets have sampling frequencies of 2048 Hz (NInFEA) and 500 Hz (ADFECG). Since both Behar and Varanini were developed for CinC2013, they were published with a sampling frequency set to 1000 Hz. Sulas, on the other hand, was published with a sampling frequency set to 2048 Hz. All algorithms take the sampling frequency of the signals as an input parameter. For each dataset on which we evaluated the algorithms, this input parameter was set to the sampling frequency of that dataset. This step was necessary to ensure conformity between datasets and algorithms. Although we did not modify the core methodology behind the algorithms or the data directly, we cannot rule out the possibility that the algorithms behave differently, as they may depend on the sampling frequency. It was out of the scope of this work to test whether additional modifications of these algorithms could improve their performance on the unseen datasets.
In the algorithm Behar, the authors defined specific parameters to improve their performance on the CinC2013 dataset. More precisely, the identified time series are flagged as implausible if less than 85 fetal QRS or more than 200 fetal QRS in the 60 s long recordings are detected or if the standard deviation of the detected RR intervals is above 17 ms. For our study, these parameters were removed, ensuring no signals longer than 60 s are rejected.
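The rejection rule described above can be written down in a few lines. The following is an illustrative sketch, not the original code of Behar et al (2014): the function name is hypothetical, beat times are assumed to be given in seconds, and the sample standard deviation is assumed because the text does not state which estimator is used.

```python
from statistics import stdev


def is_implausible(qrs_times_s, min_beats=85, max_beats=200, max_rr_std_ms=17.0):
    """Flag a fetal QRS sequence from a 60 s recording as implausible.

    Mirrors the three criteria named in the text: too few beats,
    too many beats, or an RR-interval standard deviation above 17 ms.
    """
    n_beats = len(qrs_times_s)
    if n_beats < min_beats or n_beats > max_beats:
        return True
    rr_ms = [1000.0 * (b - a) for a, b in zip(qrs_times_s, qrs_times_s[1:])]
    return stdev(rr_ms) > max_rr_std_ms  # assumption: sample standard deviation
```

The beat-count limits only make sense for one-minute recordings. A 20 min ADFECG recording at 140 bpm, for example, contains about 2800 beats and would always be flagged, which illustrates why these parameters were removed for the benchmark.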
Sulas was adapted to detect maternal QRS from abdominal channels, as not all datasets contain thoracic reference channels. All algorithms were implemented in Matlab.
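3.2.3. Performance metrics
For performance evaluation, we computed Se, PPV, and F1-score as follows:

$$\mathrm{Se} = \frac{TP}{TP + FN} \tag{1}$$

$$\mathrm{PPV} = \frac{TP}{TP + FP} \tag{2}$$

$$\mathrm{F1} = \frac{2 \cdot \mathrm{Se} \cdot \mathrm{PPV}}{\mathrm{Se} + \mathrm{PPV}} \tag{3}$$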
The available reference fetal heartbeat annotations serve as ground truth. True positives (TP) are the number of correctly detected QRS, false positives (FP) are the number of wrongly detected QRS, and false negatives (FN) are the number of missed QRS.
For adults, a detected QRS is considered to be a true positive if it is within 150 ms of the reference annotation (Di Marco and Chiari 2011, Heryan et al 2021). For ADFECG, we used an interval of 50 ms due to the higher fetal heart rate, as suggested by Behar et al (2016). For the evaluation of the NInFEA dataset, though, a different evaluation method was chosen: In the original publication by Sulas et al (2021), a true positive is defined as a detected QRS that has a distance less than 200 ms to the annotation. As described before, from a physiological perspective, the QRS is expected to occur shortly before the V-peak. Therefore, we consider a QRS to be a true positive if it is within 200 ms before the annotation.
Se (1) can be interpreted as the percentage of correctly detected true fetal QRS out of the total number of true fetal QRS, i.e. it indicates how successful an algorithm is at finding the true fetal QRS. PPV (2) is the percentage of correctly detected true fetal QRS out of all detected fetal QRS. This value indicates how well the algorithm can detect true fetal QRS out of all detections. F1-score (3) is the harmonic mean of Se and PPV and is used to summarize the overall performance of a detector.
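The evaluation can be sketched as follows. This is a minimal illustration under stated assumptions: detections are matched to reference annotations greedily and one-to-one, which is one possible matching strategy and is not specified in the text, and all function and parameter names are hypothetical. Times are in seconds.

```python
def match_detections(detected, reference, before_s, after_s):
    """Greedy one-to-one matching of detections to reference annotations.

    A detection d matches a reference r if r - before_s <= d <= r + after_s.
    Returns the counts (TP, FP, FN).
    """
    detected = sorted(detected)
    used = [False] * len(detected)
    tp = 0
    for r in sorted(reference):
        for i, d in enumerate(detected):
            if not used[i] and r - before_s <= d <= r + after_s:
                used[i] = True  # each detection may be matched only once
                tp += 1
                break
    return tp, len(detected) - tp, len(reference) - tp


def scores(tp, fp, fn):
    """Se, PPV and F1-score in percent; F1 is the harmonic mean of Se and PPV."""
    se = 100.0 * tp / (tp + fn) if tp + fn else 0.0
    ppv = 100.0 * tp / (tp + fp) if tp + fp else 0.0
    f1 = 2.0 * se * ppv / (se + ppv) if se + ppv else 0.0
    return se, ppv, f1
```

For ADFECG, the symmetric 50 ms acceptance interval corresponds to `before_s=0.05, after_s=0.05`. For NInFEA, where the QRS is expected before the annotated V-peak, the one-sided interval corresponds to `before_s=0.2, after_s=0.0`.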
4.1. Performance on NInFEA dataset
Table 4 shows detailed results for the five wearable electrode configurations and for all 22 abdominal electrodes, as introduced in figure 5. Our proposed algorithm Power-MF outperforms state-of-the-art algorithms on three of six electrode configurations (see figure 5) of the NInFEA dataset. Power-MF achieves an F1-score of 84.5% ± 17.7%, 89.3% ± 14.8%, and 90.5% ± 13.4% for the electrode configurations (b), (d) and (e). For electrode configurations (a) and (c), Varanini achieves the highest F1-score with 85.8% ± 21.0% and 89.5% ± 18.4%, respectively. On all abdominal electrodes, the algorithm Sulas performed best with a mean F1-score of 93.0% ± 12.4%.
Table 4. Mean F1-scores ± standard deviation (min, max) in % for Power-MF and state-of-the-art fetal QRS detection algorithms on the NInFEA dataset for five different electrode configurations (see figure 5) and all abdominal electrodes, as proposed by Sulas et al (2021). The results are averaged over all recordings. Minimum and maximum F1-scores are given in brackets. The best result for each configuration is highlighted in bold.
NInFEA electrode configuration (a)(b)(c)(d)(e)All abdominal electrodes Power-MF