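Extracting ECG signals from an EDF file for the time ranges listed in a TXT file, then collecting them into a CSV
First, part of the TXT file looks like this:
And the EDF file looks like this: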
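We use the time and duration fields in the TXT file to determine each event's start and end time, then pull the signal within that range from the EDF file (the EDF time column holds the time nodes, and each node is checked against the time ranges) and label those samples with the corresponding type. For each time range we also take a negative sample of at most 10 s and label that signal as normal. Everything is finally collected and written to a CSV file.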
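The window arithmetic described above can be sketched on its own, without touching the EDF file. This is only an illustration under assumptions: the helper names `to_seconds`, `event_window`, and `negative_window` are hypothetical, the time field is assumed to be an `H:M:S` string with integer seconds, and the negative window is assumed to start right after the previous event ends. Snapping to the EDF time grid is left out here.

```python
def to_seconds(clock):
    """Convert an 'H:M:S' string (assumed format) to seconds."""
    h, m, s = (int(part) for part in clock.split(":"))
    return h * 3600 + m * 60 + s


def event_window(clock, duration):
    """Return (start, end) in seconds for one annotated event."""
    start = to_seconds(clock)
    return start, start + int(duration)


def negative_window(prev_end, next_start, max_len=10):
    """Return a 'normal' window of at most max_len seconds after the
    previous event, ending no later than the next event's start.
    Returns None when the two events touch or overlap."""
    gap = next_start - prev_end
    if gap <= 0:
        return None
    return prev_end, prev_end + min(gap, max_len)


# Hypothetical event: starts at 00:01:30 and lasts 20 s
print(event_window("00:01:30", "20"))   # (90, 110)
print(negative_window(110, 115))        # (110, 115): the gap is shorter than 10 s
print(negative_window(110, 300))        # (110, 120): capped at 10 s
```

Capping the window with `min(gap, max_len)` keeps the negative sample from running into the next event when two events are less than 10 s apart.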
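The complete script is as follows:
import csv
import os
import mne

# Annotation TXT, EDF recording, and output CSV
txt_path = "/alpha/.../aa.txt"
edf_path = "/alpha/.../edf/aa.edf"
save_path = "/alpha/.../save.csv"

# Step (in seconds) that start and end times are snapped to
mul = 0.006833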


def start_adjust(t):
    # Snap a start time down to the nearest multiple of mul
    rem = t % mul
    if rem == 0:
        return t
    else:
        return t - rem


def end_adjust(t):
    # Snap an end time up to the nearest multiple of mul
    rem = t % mul
    if rem == 0:
        return t
    else:
        return t + (mul - rem)


raw = mne.io.read_raw_edf(edf_path, preload=False)
raw2 = raw.copy()
raw3 = raw.copy()
df = raw.to_data_frame()
print(df)

# Get the values of each channel (column 0 is time, column 1 is the picked channel)
raw.pick_channels(["chan 1"])
eeg = raw.to_data_frame()
eeg1 = list(eeg.values[:, 1])
raw2.pick_channels(["chan 2"])
eeg = raw2.to_data_frame()
eeg2 = list(eeg.values[:, 1])
raw3.pick_channels(["chan 3"])
eeg = raw3.to_data_frame()
eeg3 = list(eeg.values[:, 1])
# print(eeg1[-1], eeg2[-1], eeg3[-1])

# Get the time values (compared against seconds below)
times = list(df["time"])
# print(list(df["time"])[-5:])

# Write the header row
with open(save_path, "w", encoding="utf-8", newline='') as f2:
    writer = csv.writer(f2)
    writer.writerow(["duration", "chan1", "chan2", "chan3", "type"])

with open(txt_path, "r", encoding="utf-8") as f1:
    tmp = []  # holds the end time of the previous event
    for idx, line in enumerate(f1.readlines()):
        # Skip the three header lines
        if idx in [0, 1, 2]:
            continue
        line = line.strip("\n")
        a = line.split()
        # Skip blank or incomplete lines; a[0] is the time, a[1] the type, a[2] the duration
        if len(a) < 3:
            continue
        # print(a)
        # Convert h:m:s to seconds
        start = int(a[0].split(":")[0])*60*60 + int(a[0].split(":")[1])*60 + int(a[0].split(":")[-1])
        start = start_adjust(start)
        print(start)
        end = start + int(a[2])
        end = end_adjust(end)
        # print(end)
        tmp.append(end)
        # Negative window: at most 10 s right after the previous event,
        # ending no later than the start of the current event
        neg = []
        if len(tmp) == 2:
            dura = start - tmp[0]
            if dura > 10:
                dura = 10
            if dura > 0:
                neg = [tmp[0], tmp[0] + dura]
            tmp.pop(0)
        pos_eeg = []
        neg_eeg = []
        for i, t in enumerate(times):
            if start <= t <= end:
                pos_eeg.append([eeg1[i], eeg2[i], eeg3[i]])
            if neg and neg[0] <= t <= neg[1]:
                neg_eeg.append([eeg1[i], eeg2[i], eeg3[i]])
        with open(save_path, "a", encoding="utf-8", newline='') as f3:
            writer = csv.writer(f3)
            # Negative samples first; the window is written only on the first row
            for i, e in enumerate(neg_eeg):
                if i == 0:
                    data = [str(neg[0]) + ":" + str(neg[1]), e[0], e[1], e[2], "normal"]
                else:
                    data = ["", e[0], e[1], e[2], "normal"]
                writer.writerow(data)
            # Then the positive samples, labeled with the event type
            for i, e in enumerate(pos_eeg):
                if i == 0:
                    data = [str(start) + ":" + str(end), e[0], e[1], e[2], a[1]]
                else:
                    data = ["", e[0], e[1], e[2], a[1]]
                writer.writerow(data)

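The script scans the whole time list once per event. As a possible alternative, here is a minimal sketch that finds the same inclusive window with the standard-library `bisect` module. It assumes the time list is sorted in ascending order; the helper name `window_indices` and the sample values are made up for illustration.

```python
from bisect import bisect_left, bisect_right


def window_indices(times, lo, hi):
    """Return (i, j) so that times[i:j] holds exactly the samples
    with lo <= t <= hi. Assumes times is sorted ascending."""
    return bisect_left(times, lo), bisect_right(times, hi)


# Made-up time nodes and one made-up channel
times = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
chan1 = [10, 11, 12, 13, 14, 15]

i, j = window_indices(times, 0.5, 2.0)
print(chan1[i:j])   # [11, 12, 13, 14]
```

Slicing each channel list with the same `i:j` pair gives the rows for one window without visiting samples outside it.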
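The final CSV looks like this:
For example, this is the first time range:
And the second:
Note that the duration column in this table is expressed in seconds, no longer in the hours:minutes:seconds format used in the TXT file.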
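If the original clock format is needed again, the seconds values can be converted back. The sketch below is an illustration only: `format_hms` and `format_window` are hypothetical helpers, values are rounded to whole seconds, and the cell is assumed to have the `start:end` shape the script writes.

```python
def format_hms(seconds):
    """Format a number of seconds as HH:MM:SS, rounded to whole seconds."""
    total = int(round(seconds))
    h, rem = divmod(total, 3600)
    m, s = divmod(rem, 60)
    return f"{h:02d}:{m:02d}:{s:02d}"


def format_window(cell):
    """Convert a 'start:end' cell in seconds into two HH:MM:SS strings."""
    start, end = cell.split(":")
    return format_hms(float(start)), format_hms(float(end))


print(format_hms(3725))              # 01:02:05
print(format_window("90.0:110.0"))   # ('00:01:30', '00:01:50')
```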
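Possible errors:
Reading an EDF file with mne.io.read_raw_edf may raise a ValueError. If that happens, download EDFbrowser, open each EDF file through the Edit EDF/BDF header option under Tools (called EDF header editor in some newer versions), and click save. Once every file has been processed this way, all of them can be read normally.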
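With many EDF files, it can help to find out first which ones actually need the header repair. The following is a minimal sketch, not part of the original script: `find_unreadable` is a hypothetical helper, and the reader is passed in as a parameter so that the helper itself does not depend on MNE.

```python
def find_unreadable(paths, reader):
    """Return (path, message) pairs for files whose reader raised ValueError."""
    failed = []
    for path in paths:
        try:
            reader(path)
        except ValueError as exc:
            failed.append((path, str(exc)))
    return failed


# With MNE installed, the reader could be, for example:
#   reader = lambda p: mne.io.read_raw_edf(p, preload=False)
# Any callable that raises ValueError on bad input works for a quick check:
print(find_unreadable(["1", "x"], int))
```

Only the files reported here would then need to be opened and saved again in EDFbrowser.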