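I am trying to do a WhatsApp chat analysis in Python, and I want to convert the exported chat into a DataFrame with date, hour, person, and message columns. The data looks like this:
'[8/23/17, 1:45:10 AM] Guillermina: Guten Morgen',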
'[8/23/17, 1:47:05 AM] Kester Stieldorf: Good morning :) was in Düsseldorf one hour ago ;)',
'[8/23/17, 1:47:16 AM] Guillermina: Hahahaha',
'[8/23/17, 1:47:19 AM] Guillermina: What?',
'[8/23/17, 1:47:36 AM] Kester Stieldorf: Yeah had to pick something up',
Here is what I have tried so far:
import re

pieces = [x.strip('\n') for x in file_read.split('\n')]
beg_pattern = r'\d+/\d+/\d+,\s+\d+:\d+\s+\w+\.\w+\.'
pattern = r'\d+/(\d+/\d+),\s+\d+:\d+\s+\w+\.\w+\.\s+-\s+(\w+|\w+\s+\w+|\w+\s+\w+\s+\w+|\w+\s+\w+\.\s+\w+|\w+\s+\w+-\w+|\w+\'\w+\s+\w+|\+\d+\s+\(\W+\d+\)\s+\d+-\d+\W+|\W+\+\d+\s+\d+\s+\d+\s+\d+\W+|\W+\+\d+\s+\d+\w+\W+):(.*)'
reg = re.compile(beg_pattern)
regex = re.compile(pattern)
remove_blanks = [x for x in pieces if reg.match(x)]
blanks = [x for x in pieces if not reg.match(x)]
grouped_data = []
for x in remove_blanks:
    grouped_data.extend(regex.findall(x))
grouped_data_list = [list(x) for x in grouped_data]
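But it does not work. I am fairly sure the problem is with re.compile(), because when I output the results from reg and regex, they come back as empty lists. How can I fix this?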
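
One likely cause, judging only from the sample lines: each line starts with a bracketed timestamp that includes seconds and an `AM`/`PM` marker, whereas `beg_pattern` expects an unbracketed timestamp without seconds, followed by something like `a.m.` and a ` - ` separator. Since `reg.match` anchors at the start of the string, it would fail on every line, leaving `remove_blanks` empty. The following is a minimal sketch, not a definitive fix, of a pattern written for the bracketed layout. The names `LINE_RE`, `parse_chat`, and `sample` are hypothetical, and the sketch assumes the raw export has one message per line without the surrounding quotes and commas shown above.

```python
import re

# Assumed line layout, inferred from the sample in the question:
#   [M/D/YY, H:MM:SS AM] Sender Name: message text
LINE_RE = re.compile(
    r"\[(?P<date>\d{1,2}/\d{1,2}/\d{2,4}),\s+"      # date inside the bracket
    r"(?P<time>\d{1,2}:\d{2}:\d{2}\s+[AP]M)\]\s+"   # time with seconds and AM/PM
    r"(?P<person>[^:]+):\s*"                        # sender: everything up to the first colon
    r"(?P<message>.*)"                              # rest of the line
)


def parse_chat(text):
    """Return a list of dicts with date, time, person, and message keys."""
    rows = []
    for line in text.splitlines():
        # search() rather than match(), so a stray leading character
        # before the bracket does not prevent the line from matching
        m = LINE_RE.search(line)
        if m:
            rows.append(m.groupdict())
        elif rows and line.strip():
            # Assumption: a non-matching, non-empty line continues
            # the previous message (multi-line messages)
            rows[-1]["message"] += "\n" + line
    return rows


sample = (
    "[8/23/17, 1:45:10 AM] Guillermina: Guten Morgen\n"
    "[8/23/17, 1:47:05 AM] Kester Stieldorf: Good morning :) was in Düsseldorf one hour ago ;)\n"
    "[8/23/17, 1:47:16 AM] Guillermina: Hahahaha"
)

rows = parse_chat(sample)
for row in rows:
    print(row)
```

If pandas is installed, `pandas.DataFrame(rows)` should turn this list of dicts into a DataFrame with the four columns. Named groups are used here instead of a long alternation of name shapes, so any sender name without a colon in it is accepted.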