1. Abstract
This paper describes a novel ECG segmentation method based on the recurrent neural network (RNN) with long short-term memory (LSTM) layers.
Each ECG sample is classified into one of the four categories: P-wave, QRS-wave, T-wave, and neutral (others).
T-wave segmentation can achieve an accuracy of 90%.
2. METHODOLOGY
2.1 Data Sets
The QT Database (QTDB) from PhysioNet is used. Signals are sampled at 250 Hz. In total there are 105 two-channel ECG recordings, each 15 minutes in duration. Every recording is divided into segments of 500 data points; in total, there are 64,040 sets of 500 ECG data points.
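The segmentation of a recording into 500-point sets can be sketched as follows. This is a minimal illustration assuming non-overlapping windows on a single channel; the paper does not specify the exact windowing scheme, and `split_into_windows` is a hypothetical helper name.

```python
import numpy as np

def split_into_windows(recording, window=500):
    """Drop any trailing partial window and reshape into (n_windows, window)."""
    n = len(recording) // window
    return recording[: n * window].reshape(n, window)

# Example: a 15-minute recording at 250 Hz has 225,000 samples -> 450 windows.
rec = np.zeros(15 * 60 * 250)
print(split_into_windows(rec).shape)  # (450, 500)
```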
2.2 Extracted Features
Each segment of 500 data points is represented as a 500 x 4 matrix. The four features per data point are the raw ECG value, the local average around the data point, and the first and second derivatives at the data point.
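A sketch of the feature extraction, assuming a simple moving average for the local average and central differences for the derivatives; the paper does not state the averaging window size, so `avg_radius` is an assumed parameter.

```python
import numpy as np

def extract_features(window, avg_radius=2):
    """Stack raw value, local average, first and second derivatives -> (len, 4).

    `avg_radius` (half-width of the local-average window) is an assumption;
    the source does not specify it.
    """
    raw = window.astype(float)
    kernel = np.ones(2 * avg_radius + 1) / (2 * avg_radius + 1)
    local_avg = np.convolve(raw, kernel, mode="same")
    d1 = np.gradient(raw)   # first derivative (central differences)
    d2 = np.gradient(d1)    # second derivative
    return np.stack([raw, local_avg, d1, d2], axis=1)

x = extract_features(np.sin(np.linspace(0.0, 6.28, 500)))
print(x.shape)  # (500, 4)
```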
2.3 Bidirectional Long Short-Term Memory Recurrent Neural Network
The input sequence x = (x_1, x_2, ..., x_T) is fed to the network; the network computes the hidden vector sequence h = (h_1, h_2, ..., h_T) and the output vector sequence y = (y_1, y_2, ..., y_T). For t from 1 to T, where T is the number of timestamps:

h_t = H(W_{xh}x_t + W_{hh}h_{t-1} + b_h)
y_t = W_{hy}h_t + b_y

W denotes weight matrices, b denotes bias vectors, and H denotes the hidden layer function.
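The recurrence above can be sketched in NumPy. This is a minimal illustration with H = tanh (a plain RNN cell); in the paper H is an LSTM cell, which adds gates and a memory cell. All weights are random placeholders, not trained parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
T, d_in, d_h, d_out = 500, 4, 125, 4      # timestamps, input, hidden, output dims
W_xh = rng.normal(scale=0.1, size=(d_h, d_in))
W_hh = rng.normal(scale=0.1, size=(d_h, d_h))
W_hy = rng.normal(scale=0.1, size=(d_out, d_h))
b_h = np.zeros(d_h)
b_y = np.zeros(d_out)

x = rng.normal(size=(T, d_in))            # one 500 x 4 input segment
h = np.zeros(d_h)
ys = []
for t in range(T):
    h = np.tanh(W_xh @ x[t] + W_hh @ h + b_h)   # h_t = H(W_xh x_t + W_hh h_{t-1} + b_h)
    ys.append(W_hy @ h + b_y)                   # y_t = W_hy h_t + b_y
y = np.stack(ys)
print(y.shape)  # (500, 4)
```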
Brief introduction to LSTM and Bidirectional LSTM: an LSTM cell replaces the simple hidden unit with input, forget, and output gates and a memory cell, allowing the network to learn long-range dependencies; a bidirectional LSTM (BLSTM) runs one LSTM forward and one backward over the sequence and combines their hidden states, so each output depends on both past and future context.
2.4 Network Architecture and Post-Processing Step
The network consists of an input layer, two bidirectional LSTM layers, and an output layer.
The input layer, x = (x_1, x_2, ..., x_{500}), takes the raw ECG signal of size 500 plus three additional extracted features per data point, so the input is a 500 x 4 time series.
Each hidden layer is a BLSTM layer. At every timestamp there are two hidden LSTM layers: a forward hidden layer and a backward hidden layer. Each BLSTM layer is composed of 250 forward and 250 backward hidden LSTM units, and each hidden LSTM layer has 125 LSTM cells.
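The architecture can be sketched in Keras; the framework is an assumption, as the paper does not name one. Following the text, each direction of each BLSTM layer uses 125 LSTM cells, and the output is a per-timestep 4-way softmax.

```python
import numpy as np
import tensorflow as tf

# Two stacked BLSTM layers (125 cells per direction, per the text),
# then a per-timestep dense softmax over the 4 wave classes.
model = tf.keras.Sequential([
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(125, return_sequences=True)),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(125, return_sequences=True)),
    tf.keras.layers.Dense(4, activation="softmax"),
])

# A dummy batch of two 500 x 4 segments builds the model and shows the shapes.
out = model(np.zeros((2, 500, 4), dtype="float32"))
print(out.shape)  # (2, 500, 4)
```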
The output layer classifies every data point in the time series into one of the four categories; its output dimension is 500 x 4, producing a probability distribution over the four classes at each of the 500 timestamps. Each y_t defines a probability distribution over the |K| possible states, where K = {1, 2, 3, 4}. A softmax output layer is used and the error function is minimized, where \bar{y}_t is the vector of output activations before they have been normalized with the softmax function.
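The softmax normalization of the pre-activations \bar{y}_t at one timestep can be illustrated as follows; the activation values are placeholders.

```python
import numpy as np

def softmax(v):
    e = np.exp(v - v.max())  # subtract the max for numerical stability
    return e / e.sum()

y_bar_t = np.array([0.5, 2.0, -1.0, 0.1])  # activations before softmax
p = softmax(y_bar_t)                       # probabilities over the 4 classes
print(p.argmax())  # 1 -> the class with the highest activation
```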
A filter of size 17 is applied to the final output: if the outputs at the beginning and the end of the filter window belong to the same cardiac wave class, every output under the window is assigned to that class.
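The post-processing step can be sketched as a sliding window over the predicted labels; the exact sliding scheme (stride, in-place vs. copied updates) is an assumption, and `smooth_labels` is a hypothetical name.

```python
def smooth_labels(labels, width=17):
    """If the first and last labels in a window of `width` agree,
    overwrite everything in between with that label."""
    out = list(labels)
    for start in range(len(out) - width + 1):
        end = start + width - 1
        if out[start] == out[end]:
            for i in range(start + 1, end):
                out[i] = out[start]
    return out

# A spurious one-sample blip inside a run of class 1 is removed:
labels = [1] * 10 + [3] + [1] * 10
print(smooth_labels(labels))  # all 1s
```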
3. Training Experiment
The network is trained with the Adam optimizer for 68 epochs using a mini-batch procedure with batch size 250. The results show 94.6% accuracy on the training set, 93.8% on the validation set, and 93.7% on the test set.
Supervised ECG Interval Segmentation Using LSTM Neural Network