A Python script for processing a dataset

一只猪眼看世界

Article link: https://blog.csdn.net/qq_28136307/article/details/88121948

def segment(src_path, dst_path='./segments_new'):
    """Renumber segments, restarting the counter at 0000 for each utterance group.

    Each input line is expected to hold at least four whitespace-separated
    fields: <old-segment-id> <utt> <start> <end>.
    """
    segment_id = 0
    prev_utt_id = None
    # Context managers close both files even if a line fails to parse.
    with open(src_path, 'r') as rf, \
            open(dst_path, 'a', encoding='UTF-8-sig') as wf:
        for line in rf:
            # split() also strips the trailing newline, so `end` no longer
            # carries a '\n' that would produce blank lines in the output.
            fields = line.split()
            if len(fields) < 4:
                continue  # skip blank or malformed lines
            utt, start, end = fields[1:4]
            utt_id = utt.split('_')[1]
            if utt_id != prev_utt_id:
                # A new utterance group starts: reset the counter.
                prev_utt_id = utt_id
                segment_id = 0
            segment_id_str = "{}_{}".format(utt, str(segment_id).zfill(4))
            print(segment_id_str, utt, start, end)
            segment_id += 1
            wf.write(segment_id_str + ' ' + utt + ' ' + start + ' ' + end + '\n')


segment('./segments')
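
To make the renumbering easier to see, here is a minimal in-memory sketch of the same idea that needs no files on disk. It assumes each line has four whitespace-separated fields and that the second field contains an underscore. The helper name `renumber_segments` and the sample IDs (`old1`, `spk_001`, and so on) are hypothetical and chosen only for illustration.

```python
def renumber_segments(lines):
    """Return lines whose first field is <utt>_<4-digit counter>.

    The counter restarts at 0000 whenever the second underscore-separated
    piece of <utt> changes from one line to the next.
    """
    out = []
    counter = 0
    prev_utt_id = None
    for line in lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        utt, start, end = fields[1:4]
        utt_id = utt.split('_')[1]
        if utt_id != prev_utt_id:
            prev_utt_id = utt_id
            counter = 0
        new_id = "{}_{}".format(utt, str(counter).zfill(4))
        out.append(' '.join([new_id, utt, start, end]))
        counter += 1
    return out


# Hypothetical sample input: two segments of one utterance, one of another.
sample = [
    "old1 spk_001 0.00 1.50\n",
    "old2 spk_001 1.50 3.20\n",
    "old3 spk_002 0.00 2.10\n",
]
result = renumber_segments(sample)
for row in result:
    print(row)
# spk_001_0000 spk_001 0.00 1.50
# spk_001_0001 spk_001 1.50 3.20
# spk_002_0000 spk_002 0.00 2.10
```

Returning a list instead of writing directly to a file keeps the logic easy to check on a few lines before running it over a full dataset.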