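Data Processing (1): Generating Video Label Files

This post covers data preparation for action recognition: generating the annotation files for a video dataset.

1. As an example, we use ten classes selected from UCF101.

2. Rename the video files under each class folder. For each run, only filename and new_path need to be changed.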

# -*- coding: utf-8 -*-
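# Purpose: rename the files in a folder to 1, 2, 3, ... (numbering starts at 1)

import os  # file and directory operations

# filename = "D://postgraduate//anomalydetection//dataset//zengqiang//walktest//"  # folder path; walktest is the class name
# filename = "D://pycharm  projects//mmaction2//TAM//ucf101//ApplyEyeMakeup//"  # folder path; ApplyEyeMakeup is the class name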
# filename = "D://pycharm  projects//mmaction2//TAM//ucf101//ApplyLipstick//"
# filename = "D://pycharm  projects//mmaction2//TAM//ucf101//Archery//"
# filename = "D://pycharm  projects//mmaction2//TAM//ucf101//BabyCrawling//"
# filename = "D://pycharm  projects//mmaction2//TAM//ucf101//BalanceBeam//"
# filename = "D://pycharm  projects//mmaction2//TAM//ucf101//BandMarching//"
# filename = "D://pycharm  projects//mmaction2//TAM//ucf101//BaseballPitch//"
# filename = "D://pycharm  projects//mmaction2//TAM//ucf101//Basketball//"
# filename = "D://pycharm  projects//mmaction2//TAM//ucf101//BasketballDunk//"
filename = "D://pycharm  projects//mmaction2//TAM//ucf101//BenchPress//"


list_path = os.listdir(filename)   # list the file names in the folder

count = 1
for index in list_path:
    path = os.path.join(filename, index)  # original file path
    # new_path = "D://pycharm  projects//mmaction2//TAM//ucf101//ApplyEyeMakeup//" + '//' + f'{count}.avi'  # use .avi or .mp4 to match your videos
    # new_path = "D://pycharm  projects//mmaction2//TAM//ucf101//ApplyLipstick//" + '//' + f'{count}.avi'
    # new_path = "D://pycharm  projects//mmaction2//TAM//ucf101//Archery//" + '//' + f'{count}.avi'
    # new_path = "D://pycharm  projects//mmaction2//TAM//ucf101//BabyCrawling//" + '//' + f'{count}.avi'
    # new_path = "D://pycharm  projects//mmaction2//TAM//ucf101//BalanceBeam//" + '//' + f'{count}.avi'
    # new_path = "D://pycharm  projects//mmaction2//TAM//ucf101//BandMarching//" + '//' + f'{count}.avi'
    # new_path = "D://pycharm  projects//mmaction2//TAM//ucf101//BaseballPitch//" + '//' + f'{count}.avi'
    # new_path = "D://pycharm  projects//mmaction2//TAM//ucf101//Basketball//" + '//' + f'{count}.avi'
    # new_path = "D://pycharm  projects//mmaction2//TAM//ucf101//BasketballDunk//" + '//' + f'{count}.avi'
    new_path = "D://pycharm  projects//mmaction2//TAM//ucf101//BenchPress//" + '//' + f'{count}.avi'

    print(new_path)
    os.rename(path, new_path)
    count += 1

print('Renaming complete')

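Run the script once for each class. Afterwards, the videos in every class folder are named 1.avi, 2.avi, and so on.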
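Editing the path by hand for every class is easy to get wrong, and renaming in place can fail if a folder already contains a file called, say, 3.avi. The sketch below is one possible way to handle all class folders in a single pass. The function names `rename_sequentially` and `rename_all_classes`, the temporary naming scheme, and the choice to process files in sorted order are assumptions made for illustration; they are not part of the original script.

```python
import os
import uuid


def rename_sequentially(folder, ext=".avi"):
    """Rename every file in `folder` to 1<ext>, 2<ext>, ... in sorted order."""
    names = sorted(
        n for n in os.listdir(folder)
        if os.path.isfile(os.path.join(folder, n))
    )
    # Phase 1: move each file to a unique temporary name, so that a target
    # such as "3.avi" can never collide with a file that already has that name.
    tag = uuid.uuid4().hex
    temp_names = []
    for i, name in enumerate(names):
        temp = f"{tag}_{i}.tmp"
        os.rename(os.path.join(folder, name), os.path.join(folder, temp))
        temp_names.append(temp)
    # Phase 2: assign the final sequential names, counting from 1.
    new_names = []
    for count, temp in enumerate(temp_names, start=1):
        final = f"{count}{ext}"
        os.rename(os.path.join(folder, temp), os.path.join(folder, final))
        new_names.append(final)
    return new_names


def rename_all_classes(dataset_root, class_names, ext=".avi"):
    """Apply rename_sequentially to each class folder under dataset_root."""
    return {
        name: rename_sequentially(os.path.join(dataset_root, name), ext)
        for name in class_names
    }
```

Calling `rename_all_classes` with the dataset root and the list of ten class folder names would replace the ten manual runs. The two-phase rename also makes it safe to run the function again on a folder that has already been renamed.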

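3. Read in the dataset and the class label text file. The code below generates the training, validation, and test sets with a 6:2:2 split.

The classInd.txt annotation file contains: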

1 ApplyEyeMakeup
2 ApplyLipstick
3 Archery
4 BabyCrawling
5 BalanceBeam
6 BandMarching
7 BaseballPitch
8 Basketball
9 BasketballDunk
10 BenchPress
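Each line of the class index file pairs a 1-based number with a class folder name. As a minimal sketch of how such lines can be turned into a lookup table, the helper below parses them into a dictionary. The name `parse_class_index` is hypothetical and the sample lines are copied from the list above.

```python
def parse_class_index(lines):
    """Map class name -> integer index from lines such as '1 ApplyEyeMakeup'."""
    mapping = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines, e.g. a trailing newline at end of file
        index, name = line.split(maxsplit=1)
        mapping[name] = int(index)
    return mapping


sample = ["1 ApplyEyeMakeup", "2 ApplyLipstick", "3 Archery", ""]
class_to_index = parse_class_index(sample)
print(class_to_index["Archery"])  # 3
```

Looking classes up by name, as here, avoids relying on the line order of the file matching the order of a separate class list.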
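# Purpose: split the (video) dataset
# To use your own dataset, change four things: the dataset path dataset_root, the label file path labels_file, the number of classes num_classes, and the class folder names class_names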
import os
import random
from collections import defaultdict
# D:\pycharm  projects\mmaction2\TAM\ucf101
dataset_root = 'D://pycharmprojects//mmaction2//TAM//ucf101'
num_classes = 10  # number of classes (10 in this example)

class_videos = defaultdict(list)

# Set the class names, then walk each class folder to collect its video file names
class_names = [
    "ApplyEyeMakeup",
    "ApplyLipstick",
    "Archery",
    "BabyCrawling",
    "BalanceBeam",
    "BandMarching",
    "BaseballPitch",
    "Basketball",
    "BasketballDunk",
    "BenchPress"
]

for class_id, class_name in enumerate(class_names, start=1):
    class_folder = os.path.join(dataset_root, class_name)
    print(f"Class folder: {class_folder}")
    print(f"Exists? {os.path.exists(class_folder)}")
    videos = os.listdir(class_folder)
    class_videos[class_id] = videos

# Shuffle the video list of each class
for class_id, videos in class_videos.items():
    random.shuffle(videos)

# Compute the split indices (the remaining 20% becomes the test set)
train_ratio = 0.6
val_ratio = 0.2

train_videos, val_videos, test_videos = defaultdict(list), defaultdict(list), defaultdict(list)

for class_id, videos in class_videos.items():
    total_videos = len(videos)
    train_split = int(total_videos * train_ratio)
    val_split = int(total_videos * (train_ratio + val_ratio))

    train_videos[class_id] = videos[:train_split]
    val_videos[class_id] = videos[train_split:val_split]
    test_videos[class_id] = videos[val_split:]

# Create the txt files for the training, validation, and test sets
# def write_txt(file_name, videos_dict):
#     with open(file_name, 'w') as file:
#         for class_id, videos in videos_dict.items():
#             class_name = class_names[class_id - 1]  # Adjust for 0-based indexing
#             for video in videos:
#                 file.write(f'{class_name}/{video}\n')
#
# write_txt('train.txt', train_videos)
# write_txt('val.txt', val_videos)
# write_txt('test.txt', test_videos)
def write_txt(file_name, videos_dict, labels_file):
    with open(labels_file, 'r') as labels:
        class_labels = labels.read().splitlines()

    with open(file_name, 'w') as file:
        for class_id, videos in videos_dict.items():
            class_info = class_labels[class_id - 1].split(' ', 1)
            class_name = class_info[1]  # class name (second field)
            label = class_info[0]  # numeric class index (first field)
            for video in videos:
                file.write(f'{class_name}/{video}  {label}\n')  # write the video path followed by the class index


# Call the function to create the text files for the training, validation, and test sets
# D:\pycharmprojects\mmaction2\TAM\ucfTrainTestlist\classInd.txt
# labels_file = 'your_labels_file.txt'  # replace with the path to your own class label file
labels_file = 'D:/pycharmprojects/mmaction2/TAM/ucfTrainTestlist/classInd.txt'
write_txt('train.txt', train_videos, labels_file)
write_txt('val.txt', val_videos, labels_file)
write_txt('test.txt', test_videos, labels_file)
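Because the split points are computed with `int(...)`, the per-class counts are rounded down and any leftover videos fall into the test set. The sketch below repeats the same arithmetic on two made-up class sizes to show the effect. The function name `split_videos` and the `seed` parameter are assumptions added for illustration (the script above shuffles without a fixed seed, so its splits differ from run to run).

```python
import random


def split_videos(videos, train_ratio=0.6, val_ratio=0.2, seed=None):
    """Shuffle a copy of `videos` and cut it into train/val/test lists."""
    shuffled = list(videos)
    random.Random(seed).shuffle(shuffled)
    train_end = int(len(shuffled) * train_ratio)
    val_end = int(len(shuffled) * (train_ratio + val_ratio))
    return shuffled[:train_end], shuffled[train_end:val_end], shuffled[val_end:]


for total in (25, 7):
    videos = [f"{i}.avi" for i in range(1, total + 1)]
    train, val, test = split_videos(videos, seed=0)
    print(total, len(train), len(val), len(test))
# prints:
# 25 15 5 5
# 7 4 1 2
```

With 25 videos the split is exactly 6:2:2. With 7 videos it becomes 4:1:2, so very small classes can end up with a noticeably skewed split. Passing a fixed `seed` is one way to make the split reproducible.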

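The generated train.txt, val.txt, and test.txt files each contain one line per video: the video path relative to the dataset root, followed by the class index.

With that, the video annotation files are complete.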
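The line layout follows directly from the f-string in `write_txt`: class folder, a slash, the file name, two spaces, and the class index. A small sketch that builds such lines is shown here. The helper name `format_annotation_line` and the file names are illustrative only.

```python
def format_annotation_line(class_name, video, label):
    """Build one annotation line the same way write_txt does."""
    return f"{class_name}/{video}  {label}"


print(format_annotation_line("ApplyEyeMakeup", "1.avi", 1))
# ApplyEyeMakeup/1.avi  1
print(format_annotation_line("BenchPress", "12.avi", 10))
# BenchPress/12.avi  10
```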
