Setting Up the Project Environment
Creating a New Colaboratory Notebook
Mounting Google Drive
Mounting Google Drive makes it easy to upload and download files in the later steps. The code to mount it is as follows:
from google.colab import drive
drive.mount('/content/drive/')
After signing in to your Google account and authorizing access, the connection succeeds.
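To confirm the mount worked, you can list the contents of your Drive (this assumes the default MyDrive mount point):

!ls /content/drive/MyDrive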
Installing the Required Python Packages
Colaboratory ships with Python 3.8.10 and already includes many commonly used packages, so install only what you need according to the official Requirements. Note that for any package where the official documentation provides a specific install command, it is best to use that command.
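As a rough sketch of what this looks like in practice, the commands below install a few of the dependencies SlowFast typically needs; treat this package list as an example only and defer to the official Requirements for the authoritative set:

!pip3 install simplejson av psutil opencv-python moviepy tensorboard pytorchvideo
!pip3 install 'git+https://github.com/facebookresearch/fvcore'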
One of these dependencies, Detectron2, is an object-detection project that SlowFast relies on. It is installed as follows:
!git clone https://github.com/facebookresearch/detectron2 detectron2_repo
!pip3 install -e detectron2_repo
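A quick sanity check that the editable install is importable (if the import fails right after installation, restarting the Colab runtime usually resolves it):

import detectron2
print(detectron2.__version__)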
Installing the Project
Download the SlowFast code and install it. The rest of this guide assumes the repository ends up at /content/drive/MyDrive/slowfast, so clone it into your Drive directory:
%cd /content/drive/MyDrive
!git clone https://github.com/facebookresearch/slowfast
%cd slowfast
!python3 setup.py build develop
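Assuming setup.py finished without errors, you can check that the build succeeded by importing the package from a new cell:

import slowfast
print(slowfast.__file__)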
Configuring Runtime Parameters
If you only want to run the demo, the configuration file to use is demo/AVA/SLOWFAST_32x2_R101_50_50.yaml. Open SLOWFAST_32x2_R101_50_50.yaml and change the DEMO section at the end to the following:
DEMO:
  ENABLE: True
  LABEL_FILE_PATH: "/content/drive/MyDrive/slowfast/demo/AVA/ava.json"  # path to the label file
  # WEBCAM: 0
  INPUT_VIDEO: "/content/drive/MyDrive/slowfast/input/2.mp4"  # path to the input video
  OUTPUT_FILE: "/content/drive/MyDrive/slowfast/output/2.mp4"  # path to the output video
  DETECTRON2_CFG: "COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml"
  DETECTRON2_WEIGHTS: detectron2://COCO-Detection/faster_rcnn_R_50_FPN_3x/137849458/model_final_280758.pkl
Based on the configuration we just modified, create two folders, input and output, under /content/drive/MyDrive/slowfast/ to hold the input and output files respectively. Then create a file named ava.json under /content/drive/MyDrive/slowfast/demo/AVA/ with the following content:
{"bend/bow (at the waist)": 0, "crawl": 1, "crouch/kneel": 2, "dance": 3, "fall down": 4, "get up": 5, "jump/leap": 6, "lie/sleep": 7, "martial art": 8, "run/jog": 9, "sit": 10, "stand": 11, "swim": 12, "walk": 13, "answer phone": 14, "brush teeth": 15, "carry/hold (an object)": 16, "catch (an object)": 17, "chop": 18, "climb (e.g., a mountain)": 19, "clink glass": 20, "close (e.g., a door, a box)": 21, "cook": 22, "cut": 23, "dig": 24, "dress/put on clothing": 25, "drink": 26, "drive (e.g., a car, a truck)": 27, "eat": 28, "enter": 29, "exit": 30, "extract": 31, "fishing": 32, "hit (an object)": 33, "kick (an object)": 34, "lift/pick up": 35, "listen (e.g., to music)": 36, "open (e.g., a window, a car door)": 37, "paint": 38, "play board game": 39, "play musical instrument": 40, "play with pets": 41, "point to (an object)": 42, "press": 43, "pull (an object)": 44, "push (an object)": 45, "put down": 46, "read": 47, "ride (e.g., a bike, a car, a horse)": 48, "row boat": 49, "sail boat": 50, "shoot": 51, "shovel": 52, "smoke": 53, "stir": 54, "take a photo": 55, "text on/look at a cellphone": 56, "throw": 57, "touch (an object)": 58, "turn (e.g., a screwdriver)": 59, "watch (e.g., TV)": 60, "work on a computer": 61, "write": 62, "fight/hit (a person)": 63, "give/serve (an object) to (a person)": 64, "grab (a person)": 65, "hand clap": 66, "hand shake": 67, "hand wave": 68, "hug (a person)": 69, "kick (a person)": 70, "kiss (a person)": 71, "lift (a person)": 72, "listen to (a person)": 73, "play with kids": 74, "push (another person)": 75, "sing to (e.g., self, a person, a group)": 76, "take (an object) from (a person)": 77, "talk to (e.g., self, a person, a group)": 78, "watch (a person)": 79}
Running the Demo
Upload the video you want to recognize to /content/drive/MyDrive/slowfast/input.
From the project root /content/drive/MyDrive/slowfast, run the program with the following command:
!python3 tools/run_net.py --cfg demo/AVA/SLOWFAST_32x2_R101_50_50.yaml
The final result is saved to /content/drive/MyDrive/slowfast/output.
Problems Encountered While Running
Even when following the steps above, you will usually hit various problems along the way. Below is a summary of the issues I ran into and how I resolved them, for reference.
Problem 1
Error: PIL not found
Solution: in the setup.py file under /content/drive/MyDrive/slowfast, change PIL to Pillow.
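If you prefer to patch this from a notebook cell, a one-line sed works as well (this assumes the only occurrence of PIL in setup.py is the dependency entry, so check the file first):

!sed -i 's/PIL/Pillow/g' /content/drive/MyDrive/slowfast/setup.py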
Problem 2
Error: ImportError: cannot import name 'cat_all_gather' from 'pytorchvideo.layers.distributed'
Solution: replace the contents of the distributed.py file in the directory named in the error message with the following:
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.

"""Distributed helpers."""

import torch
import torch.distributed as dist
from torch._C._distributed_c10d import ProcessGroup
from torch.autograd.function import Function

_LOCAL_PROCESS_GROUP = None


def get_world_size() -> int:
    """
    Simple wrapper for correctly getting worldsize in both distributed
    / non-distributed settings
    """
    return (
        torch.distributed.get_world_size()
        if torch.distributed.is_available() and torch.distributed.is_initialized()
        else 1
    )


def cat_all_gather(tensors, local=False):
    """Performs the concatenated all_reduce operation on the provided tensors."""
    if local:
        gather_sz = get_local_size()
    else:
        gather_sz = torch.distributed.get_world_size()
    tensors_gather = [torch.ones_like(tensors) for _ in range(gather_sz)]
    torch.distributed.all_gather(
        tensors_gather,
        tensors,
        async_op=False,
        group=_LOCAL_PROCESS_GROUP if local else None,
    )
    output = torch.cat(tensors_gather, dim=0)
    return output


def init_distributed_training(cfg):
    """
    Initialize variables needed for distributed training.
    """
    if cfg.NUM_GPUS <= 1:
        return
    num_gpus_per_machine = cfg.NUM_GPUS
    num_machines = dist.get_world_size() // num_gpus_per_machine
    for i in range(num_machines):
        ranks_on_i = list(
            range(i * num_gpus_per_machine, (i + 1) * num_gpus_per_machine)
        )
        pg = dist.new_group(ranks_on_i)
        if i == cfg.SHARD_ID:
            global _LOCAL_PROCESS_GROUP
            _LOCAL_PROCESS_GROUP = pg


def get_local_size() -> int:
    """
    Returns:
        The size of the per-machine process group,
        i.e. the number of processes per machine.
    """
    if not dist.is_available():
        return 1
    if not dist.is_initialized():
        return 1
    return dist.get_world_size(group=_LOCAL_PROCESS_GROUP)


def get_local_rank() -> int:
    """
    Returns:
        The rank of the current process within the local (per-machine) process group.
    """
    if not dist.is_available():
        return 0
    if not dist.is_initialized():
        return 0
    assert _LOCAL_PROCESS_GROUP is not None
    return dist.get_rank(group=_LOCAL_PROCESS_GROUP)


def get_local_process_group() -> ProcessGroup:
    assert _LOCAL_PROCESS_GROUP is not None
    return _LOCAL_PROCESS_GROUP


class GroupGather(Function):
    """
    GroupGather performs all gather on each of the local process/ GPU groups.
    """

    @staticmethod
    def forward(ctx, input, num_sync_devices, num_groups):
        """
        Perform forwarding, gathering the stats across different process/ GPU
        group.
        """
        ctx.num_sync_devices = num_sync_devices
        ctx.num_groups = num_groups

        input_list = [torch.zeros_like(input) for k in range(get_local_size())]
        dist.all_gather(
            input_list, input, async_op=False, group=get_local_process_group()
        )

        inputs = torch.stack(input_list, dim=0)
        if num_groups > 1:
            rank = get_local_rank()
            group_idx = rank // num_sync_devices
            inputs = inputs[
                group_idx * num_sync_devices : (group_idx + 1) * num_sync_devices
            ]
        inputs = torch.sum(inputs, dim=0)
        return inputs

    @staticmethod
    def backward(ctx, grad_output):
        """
        Perform backwarding, gathering the gradients across different process/ GPU
        group.
        """
        grad_output_list = [
            torch.zeros_like(grad_output) for k in range(get_local_size())
        ]
        dist.all_gather(
            grad_output_list,
            grad_output,
            async_op=False,
            group=get_local_process_group(),
        )

        grads = torch.stack(grad_output_list, dim=0)
        if ctx.num_groups > 1:
            rank = get_local_rank()
            group_idx = rank // ctx.num_sync_devices
            grads = grads[
                group_idx
                * ctx.num_sync_devices : (group_idx + 1)
                * ctx.num_sync_devices
            ]
        grads = torch.sum(grads, dim=0)
        return grads, None, None
You can first create distributed.py in a folder on your Google Drive, paste in the code above, and then copy it into place with a Linux command:
cp -f /content/drive/MyDrive/distributed.py /usr/local/lib/python3.8/dist-packages/pytorchvideo/layers/
Problem 3
Error: ModuleNotFoundError: No module named 'pytorchvideo.losses'
Solution: first change to the /usr/local/lib/python3.8/dist-packages/pytorchvideo/ directory:
cd /usr/local/lib/python3.8/dist-packages/pytorchvideo/
Create a losses folder:
mkdir -p losses
Inside the losses folder, create a file named soft_target_cross_entropy.py with the following content:
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.

import torch
import torch.nn as nn
import torch.nn.functional as F
from pytorchvideo.layers.utils import set_attributes
from pytorchvideo.transforms.functional import convert_to_one_hot


class SoftTargetCrossEntropyLoss(nn.Module):
    """
    Adapted from Classy Vision: ./classy_vision/losses/soft_target_cross_entropy_loss.py.
    This allows the targets for the cross entropy loss to be multi-label.
    """

    def __init__(
        self,
        ignore_index: int = -100,
        reduction: str = "mean",
        normalize_targets: bool = True,
    ) -> None:
        """
        Args:
            ignore_index (int): sample should be ignored for loss if the class is this value.
            reduction (str): specifies reduction to apply to the output.
            normalize_targets (bool): whether the targets should be normalized to a sum of 1
                based on the total count of positive targets for a given sample.
        """
        super().__init__()
        set_attributes(self, locals())
        assert isinstance(self.normalize_targets, bool)
        if self.reduction not in ["mean", "none"]:
            raise NotImplementedError(
                'reduction type "{}" not implemented'.format(self.reduction)
            )
        self.eps = torch.finfo(torch.float32).eps

    def forward(self, input: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
        """
        Args:
            input (torch.Tensor): the shape of the tensor is N x C, where N is the number of
                samples and C is the number of classes. The tensor is raw input without
                softmax/sigmoid.
            target (torch.Tensor): the shape of the tensor is N x C or N. If the shape is N, we
                will convert the target to one hot vectors.
        """
        # Check if targets are inputted as class integers
        if target.ndim == 1:
            assert (
                input.shape[0] == target.shape[0]
            ), "SoftTargetCrossEntropyLoss requires input and target to have same batch size!"
            target = convert_to_one_hot(target.view(-1, 1), input.shape[1])

        assert input.shape == target.shape, (
            "SoftTargetCrossEntropyLoss requires input and target to be same "
            f"shape: {input.shape} != {target.shape}"
        )

        # Samples where the targets are ignore_index do not contribute to the loss
        N, C = target.shape
        valid_mask = torch.ones((N, 1), dtype=torch.float).to(input.device)
        if 0 <= self.ignore_index <= C - 1:
            drop_idx = target[:, self.ignore_index] > 0
            valid_mask[drop_idx] = 0
        valid_targets = target.float() * valid_mask
        if self.normalize_targets:
            valid_targets /= self.eps + valid_targets.sum(dim=1, keepdim=True)
        per_sample_per_target_loss = -valid_targets * F.log_softmax(input, -1)

        per_sample_loss = torch.sum(per_sample_per_target_loss, -1)
        # Perform reduction
        if self.reduction == "mean":
            # Normalize based on the number of samples with > 0 non-ignored targets
            loss = per_sample_loss.sum() / torch.sum(
                (torch.sum(valid_mask, -1) > 0)
            ).clamp(min=1)
        elif self.reduction == "none":
            loss = per_sample_loss

        return loss
As before, you can create soft_target_cross_entropy.py in a folder on your Google Drive, paste in the code above, and then copy it into place with a Linux command:
cp -f /content/drive/MyDrive/soft_target_cross_entropy.py /usr/local/lib/python3.8/dist-packages/pytorchvideo/losses/
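Depending on your environment, the new losses directory may also need an empty __init__.py before it is treated as a regular package (on Python 3 the import often works without one); if the import still fails, create it:

!touch /usr/local/lib/python3.8/dist-packages/pytorchvideo/losses/__init__.py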
Problem 4
Error: KeyError: 'Non-existent config key: TENSORBOARD.MODEL_VIS.TOPK'
Solution: comment out the following three lines in the configuration file demo/AVA/SLOWFAST_32x2_R101_50_50.yaml:
# TENSORBOARD:
#   MODEL_VIS:
#     TOPK: 2
Problem 5
Error: ModuleNotFoundError: No module named 'xxx'
Solution: install the missing package with the following command:
!pip3 install xxx