TensorFlow 2.0 Video Classification (2): Preprocessing the UCF-101 Dataset


老光头_ME2CS



Article link: https://blog.csdn.net/Forrest97/article/details/105947125


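This post describes how to preprocess the UCF-101 dataset: downloading the data, building the training and test sets, and extracting video frames with ffmpeg. It focuses on keeping clips cut from the same source video from appearing in both the training and test sets, and provides the related code and references.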


Downloading the UCF-101 data

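Video download URL: http://crcv.ucf.edu/data/UCF101/UCF101.rar
After extraction, the archive has the standard directory layout of a classification dataset: each second-level directory is named after a human action class and contains the videos for that class. Each clip is a few seconds long (lengths vary from clip to clip), 320×240 pixels, at 25 fps.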
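To make the layout concrete, the sketch below walks an extracted dataset root and counts the `.avi` clips in each class directory. The function name `count_clips_per_class` and the root folder name `UCF-101` are assumptions for illustration; they are not part of the original scripts.

```python
import os
from collections import Counter


def count_clips_per_class(root):
    """Return a Counter mapping class directory name -> number of .avi clips."""
    counts = Counter()
    for class_name in sorted(os.listdir(root)):
        class_dir = os.path.join(root, class_name)
        if not os.path.isdir(class_dir):
            continue  # skip stray files sitting next to the class folders
        counts[class_name] = sum(
            1 for name in os.listdir(class_dir) if name.lower().endswith('.avi')
        )
    return counts


if __name__ == '__main__':
    root = 'UCF-101'  # assumed name of the extracted folder
    if os.path.isdir(root):
        counts = count_clips_per_class(root)
        print(len(counts), 'classes,', sum(counts.values()), 'clips')
```

Running it right after extraction is a quick way to confirm that every class folder is present and non-empty before any files are moved.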
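Note: within a single action class, several clips may be cut from the same long source video, so their people, backgrounds, and other features are nearly identical. If such clips end up split between the train and test sets, the measured accuracy is unrealistically inflated. To avoid this, UCF provides official train/test split list files.
Download URL for the split list files:
https://www.crcv.ucf.edu/data/UCF101/UCF101TrainTestSplits-RecognitionTask.zip
After extraction, the archive contains three recommended split schemes.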
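The leakage concern can be checked directly. UCF-101 clip names follow the pattern `v_<Class>_gXX_cYY.avi`, where `gXX` identifies the source-video group and `cYY` the clip within that group. The sketch below parses the group out of each split-list entry and returns any (class, group) pair that appears in both lists; for the official split files the result is expected to be empty. The helper names `group_key` and `shared_groups` are hypothetical and not part of the original scripts.

```python
import re

# Pattern for UCF-101 clip names such as v_ApplyEyeMakeup_g08_c01.avi
CLIP_RE = re.compile(r'^v_(?P<cls>.+)_g(?P<group>\d+)_c(?P<clip>\d+)\.avi$')


def group_key(entry):
    """Map a split-file entry to its (class, group) pair.

    Accepts lines with or without a trailing class index,
    e.g. 'ApplyEyeMakeup/v_ApplyEyeMakeup_g08_c01.avi 1'.
    """
    path = entry.strip().split(' ')[0]
    name = path.split('/')[-1]
    match = CLIP_RE.match(name)
    if match is None:
        raise ValueError('unexpected clip name: ' + name)
    return match.group('cls'), match.group('group')


def shared_groups(train_entries, test_entries):
    """Return the (class, group) pairs present in both lists."""
    train_groups = {group_key(e) for e in train_entries if e.strip()}
    test_groups = {group_key(e) for e in test_entries if e.strip()}
    return train_groups & test_groups
```

Passing the lines read from `trainlist01.txt` and `testlist01.txt` to `shared_groups` is one way to confirm a split before moving any files.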

Building the train and test directories

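Run the script 1_move_files.py.
Place the extracted UCF-101 and ucfTrainTestlist folders in the same directory, then run the code below from that directory to re-split and move the original videos.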

import os
import os.path

def get_train_test_lists(version='01'):
    """
    Using one of the train/test split versions ('01', '02', or '03'),
    read the list files and return the filename breakdowns we'll later
    use to move everything.
    """
    # Get our files based on version.
    test_file = os.path.join('ucfTrainTestlist', 'testlist' + version + '.txt')
    train_file = os.path.join('ucfTrainTestlist', 'trainlist' + version + '.txt')

    # Build the test list: one relative video path per line; skip blank lines.
    with open(test_file) as fin:
        test_list = [row.strip() for row in fin if row.strip()]

    # Build the train list. Each line is "<path> <class index>",
    # so keep only the path and drop the class index.
    with open(train_file) as fin:
        train_list = [row.strip().split(' ')[0] for row in fin if row.strip()]

    # Set the groups in a dictionary.
    file_groups = {
        'train': train_list,
        'test': test_list