Docker TorchServe model deployment workflow: deploying YOLO-FaceV2 on WSL as an example

wangzitaoX

Last modified 2024-03-02 13:57:47


Article tags: docker YOLO pytorch

First published 2023-06-08 12:47:37

Article link: https://blog.csdn.net/wangzitaotao/article/details/131101852


Docker TorchServe model deployment

I. Set up WSL and install Docker

Official WSL tutorials:
1. https://learn.microsoft.com/zh-cn/windows/wsl/?source=recommendations
2. https://learn.microsoft.com/zh-cn/windows/wsl/setup/environment?source=recommendations

Install Docker, taking care to pick a Docker version that suits your system:
https://docs.docker.com/

II. Configure the Docker environment

1. Pull the official image

The available tags are listed at https://hub.docker.com/r/pytorch/torchserve/tags

docker pull pytorch/torchserve:0.7.0-gpu

(Screenshot of the Docker Hub tags page: the image tag is on the left, the pull command on the right.)

The Dockerfile layers that correspond to this image:
https://hub.docker.com/layers/pytorch/torchserve/0.7.0-gpu/images/sha256a8a5fb048b20fb71fed43d47caf370e5f4e15f27c219234734d8bb7d7870c158?context=explore

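2. Start the Docker container and map local paths into it

The YOLO-FaceV2 project sits in a local Windows folder. From inside WSL the same folder is reached through its /mnt/c/... path, and that WSL path is the one used in the commands below.

# docker run command for pytorch/torchserve:0.7.0-gpu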
docker run --rm -it --gpus all -p 8080:8080 -p 8081:8081 -v /mnt/c/data/CrowdCounting/serve/YOLO-FaceV2-master:/home/model-server/extra-files -v /mnt/c/data/CrowdCounting/serve/YOLO-FaceV2-master/model-store:/home/model-server/model-store pytorch/torchserve:0.7.0-gpu

# Volume mapping option of docker run; add more -v mappings as you need them
-v /mnt/c/data/CrowdCounting/serve/YOLO-FaceV2-master:/home/model-server/extra-files
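The relationship between the Windows folder and the path given to -v can be sketched in a few lines of Python. This is only an illustration: it assumes the default WSL automount, where drive C: appears under /mnt/c, and the helper names windows_to_wsl_path and volume_option are made up for this example.

```python
from pathlib import PureWindowsPath


def windows_to_wsl_path(windows_path):
    """Translate a Windows path such as C:\\data\\x into /mnt/c/data/x.

    Assumes the default WSL automount location (/mnt/<drive letter>).
    """
    path = PureWindowsPath(windows_path)
    drive = path.drive.rstrip(":").lower()
    if not drive:
        raise ValueError(f"expected a path with a drive letter: {windows_path!r}")
    # path.parts[0] is the anchor ("C:\\"); the remaining parts are the folders.
    return "/".join(["/mnt", drive, *path.parts[1:]])


def volume_option(windows_path, container_path):
    """Build the two command-line tokens of one `docker run -v` mapping."""
    return ["-v", f"{windows_to_wsl_path(windows_path)}:{container_path}"]
```

For example, volume_option(r"C:\data\CrowdCounting\serve\YOLO-FaceV2-master", "/home/model-server/extra-files") produces the same -v mapping as the one shown above.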

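3. List Docker images and containers

Once the container is running it occupies the current terminal, so open a new terminal to list images and containers or to enter the container.

# List all images
docker images
# List running containers; one image can start several containers
docker ps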

4. Enter the Docker container

# Replace cc4313027126 with your own container ID
docker exec -it cc4313027126 /bin/bash

5. Install the Python dependencies the model needs inside the container

PyTorch previous-versions page: https://pytorch.org/get-started/previous-versions/

# Install torch from the official wheel index
pip install torch==1.10.1+cu111 torchvision==0.11.2+cu111 torchaudio==0.10.1 -f https://download.pytorch.org/whl/cu111/torch_stable.html
# Install the remaining dependencies from the Tsinghua PyPI mirror
pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple

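6. If you changed anything inside the container, save the customized container as an image

# 50e43f173778 is the container ID and yoloface:cu111 is the name of the new image.
# Run this from another terminal while the container is still running:
# it was started with --rm, so it is deleted as soon as it exits.
docker commit 50e43f173778 yoloface:cu111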

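7. Complete sequence for a first-time Docker setup

For a first-time setup, especially one that changes the container configuration and installs Python packages from requirements.txt, follow steps 1 to 6 strictly in order and do not skip any of them! Otherwise the container may fail to start, and you will have to delete the container you configured and start over from scratch!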

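8. Restarting and entering the container after the image has been fully configured and saved

Once the image has been fully configured and saved, and there is nothing left to change in the container and no more Python packages to install, follow steps 2 to 4 in order to restart and enter the container. Steps 1, 5, and 6 are no longer needed.

Note that the image to start is now the one you just saved, yoloface:cu111, not the original pytorch/torchserve:0.7.0-gpu, so the run command from step 2 becomes:

# docker run command for yoloface:cu111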
docker run --rm -it --gpus all -p 8080:8080 -p 8081:8081 -v /mnt/c/data/CrowdCounting/serve/YOLO-FaceV2-master:/home/model-server/extra-files -v /mnt/c/data/CrowdCounting/serve/YOLO-FaceV2-master/model-store:/home/model-server/model-store yoloface:cu111

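III. Write the handler file and generate the .mar file

The command that generates the .mar file is below; pay attention to its format.

# Example command for generating the .mar file; adjust the arguments to your own setup
# Note: set the paths correctly, or you will trigger a whole series of errors!
# Don't ask me how I know; .mar does not believe in tears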
torch-model-archiver --model-name yolofacev2 --version 1.0 --model-file experimental.py --serialized-file best.pt --handler handler.py --extra-files "models.zip, utils.zip"

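# Note the space between the zip names inside the quotes of --extra-files "models.zip, utils.zip": if the space is there and you get a "zip file not found" error, remove the space

Make sure every argument matches the relative path of the file it refers to, relative to the directory where you run the command.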
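Since wrong paths are the main source of trouble here, one option is to check every referenced file before calling the archiver. The sketch below is not from the original workflow; build_archiver_command is a hypothetical helper that only assembles the same options used in the command above and refuses to continue when a file is missing.

```python
from pathlib import Path


def build_archiver_command(workdir, model_name, version, model_file,
                           serialized_file, handler, extra_files):
    """Return the torch-model-archiver argument list, or raise if a file is missing.

    All file names are interpreted relative to workdir, the directory the
    command is meant to be run from.
    """
    workdir = Path(workdir)
    referenced = [model_file, serialized_file, handler, *extra_files]
    missing = [name for name in referenced if not (workdir / name).is_file()]
    if missing:
        raise FileNotFoundError(f"not found under {workdir}: {missing}")
    return [
        "torch-model-archiver",
        "--model-name", model_name,
        "--version", version,
        "--model-file", model_file,
        "--serialized-file", serialized_file,
        "--handler", handler,
        # Joined with a bare comma, i.e. the no-space form of the list.
        "--extra-files", ",".join(extra_files),
    ]
```

The returned list could then be passed to subprocess.run(command, cwd=workdir, check=True) inside the container, where torch-model-archiver is installed.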
Are you kidding? How do you generate a .mar without a handler file?

Well, here it comes!

1. Modify the initialize function of the handler file

Throughout the changes that follow, please note (it is worth saying three times): set the paths correctly, or you will trigger a whole series of errors!

    def initialize(self, context):
        properties = context.system_properties
        logger.info(f"Cuda available: {torch.cuda.is_available()}")
        logger.info(f"GPU available: {torch.cuda.device_count()}")
        use_cuda = torch.cuda.is_available() and torch.cuda.device_count() > 0
        self.map_location = 'cuda' if use_cuda else 'cpu'
        self.device = torch.device(self.map_location + ':' +
                                   str(properties.get('gpu_id')
                                       ) if use_cuda else 'cpu')
        self.manifest = context.manifest
        model_dir = properties.get('model_dir')
        logger.info("==================model_dir==================="
                    " %s loaded successfully", model_dir)
        self.model_pt_path = None
        if "serializedFile" in self.manifest["model"]:
            serialized_file = self.manifest["model"]["serializedFile"]
            self.model_pt_path = os.path.join(model_dir, serialized_file)
        model_file = self.manifest['model']['modelFile']
        logger.info("Model file %s loaded successfully", self.model_pt_path)

Do not change the code above casually. It comes from the initialize function of BaseHandler, and its main job is to resolve model_dir, model_file, and serialized_file. Look familiar? See here:

torch-model-archiver \
--model-file experimental.py \
--serialized-file best.pt \
--model-name yolofacev2 \
--version 1.0 \
--handler handler.py \
--extra-files "models.zip, utils.zip"

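model_file and serialized_file both appeared in the command that generates the .mar file; they are responsible for loading the model definition and the model weights.
model_dir is a temporary path created inside the container when TorchServe loads the model, and it looks like this:

model_dir: /home/model-server/tmp/models/59324bc14e6c48d5821e157886545f1b

The key point is that model_dir holds every file that was passed in through the .mar generation command.
So you can locate this temporary path yourself and check whether the files you meant to pass in really arrived in model_dir. And if they did not? You know what to do!

Suddenly Docker feels transparent.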
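Instead of browsing the temporary path by hand, the same check can be done from Python, for instance by logging the result inside initialize. This is a minimal sketch under that assumption; list_model_dir and missing_files are hypothetical helpers, not part of TorchServe.

```python
import os


def list_model_dir(model_dir):
    """Return every file under model_dir as a sorted list of relative paths."""
    found = []
    for root, _dirs, files in os.walk(model_dir):
        for name in files:
            full = os.path.join(root, name)
            # Use forward slashes so the output looks the same on every platform.
            found.append(os.path.relpath(full, model_dir).replace(os.sep, "/"))
    return sorted(found)


def missing_files(model_dir, expected):
    """Return the expected relative paths that are absent from model_dir."""
    present = set(list_model_dir(model_dir))
    return [name for name in expected if name not in present]
```

Inside the handler, a call such as logger.info("model_dir contents: %s", list_model_dir(model_dir)) would put the listing into the TorchServe log.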

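2. Lots of model files, lots of files to load, a very long --extra-files list: what now? Zip archives to the rescue!

The --extra-files option of the .mar generation command accepts zip archives, like this:

# other options omitted
torch-model-archiver \
--extra-files "models.zip, utils.zip"

The archives then have to be opened and extracted in initialize. They are extracted into model_dir, so after extraction the folders they contain sit directly inside model_dir.

Note: each zip MUST contain the complete folder itself, not just the files inside it; only then does extraction produce the layout described above.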
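One way to be sure the archive contains the folder itself is to build it with Python's zipfile module and store every path relative to the folder's parent. This is a sketch of that idea, not the author's method; zip_folder is a hypothetical helper name.

```python
import os
import zipfile


def zip_folder(folder, zip_path):
    """Zip `folder` so that the folder itself is the top-level entry of the archive."""
    folder = os.path.abspath(folder)
    parent = os.path.dirname(folder)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for root, _dirs, files in os.walk(folder):
            for name in files:
                full = os.path.join(root, name)
                # Store paths relative to the parent, e.g. "utils/general.py",
                # so that extraction recreates the "utils" folder.
                archive.write(full, os.path.relpath(full, parent))
    return zip_path
```

Calling zip_folder("utils", "utils.zip") and zip_folder("models", "models.zip") from the project root would give archives that extract to utils/... and models/... respectively. Empty directories are not stored by this sketch.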

The code that opens and extracts the archives is below; put it in the initialize function of the handler file.

with zipfile.ZipFile(model_dir + '/models.zip', 'r') as zip_ref:
    zip_ref.extractall(model_dir)
with zipfile.ZipFile(model_dir + '/utils.zip', 'r') as zip_ref:
    zip_ref.extractall(model_dir)
self.load_yoloface_model()

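Now take a close look at the load_yoloface_model function.
Only after the zips have been extracted can load_yoloface_model import the files that were packed inside them, such as model files and config files. Mind the timing of those imports, or else... you know!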

    def load_yoloface_model(self):
        from experimental import attempt_load
        from utils.datasets import letterbox
        from utils.general import check_img_size, non_max_suppression, scale_coords, xyxy2xywh
        self.letterbox = letterbox
        self.check_img_size = check_img_size
        self.non_max_suppression = non_max_suppression
        self.scale_coords = scale_coords
        self.xyxy2xywh = xyxy2xywh
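The ordering constraint (extract first, import afterwards) can be reproduced outside TorchServe with a small experiment. The sketch below is only an illustration: extract_and_import is a hypothetical helper, and adding the target directory to sys.path is an assumption made so the example runs on its own.

```python
import importlib
import sys
import zipfile


def extract_and_import(zip_path, target_dir, module_name):
    """Extract zip_path into target_dir, then import module_name from it."""
    with zipfile.ZipFile(zip_path, "r") as archive:
        archive.extractall(target_dir)
    # Make the extracted packages importable; the import is attempted only
    # after the files exist on disk.
    if target_dir not in sys.path:
        sys.path.insert(0, target_dir)
    importlib.invalidate_caches()
    return importlib.import_module(module_name)
```

Trying to import the module before the archive has been extracted raises ModuleNotFoundError, which is the failure the deferred imports inside load_yoloface_model avoid.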

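3. Modify the preprocess function of the handler file, using images as the example

The data that preprocess receives is the image as it was encoded for the HTTP request, so it has to be decoded back into an image, as follows: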

    def preprocess(self, data):
        # Initialize
        # stride = int(model.stride.max())  # model stride
        logger.info("debug--%d", len(data))
        images = []
        for row in data:
            image = row.get("data") or row.get("body")
            if isinstance(image, str):
                # The image arrived as a base64-encoded string: decode it to bytes.
                image = base64.b64decode(image)
            # Plain `if` (not `elif`), so freshly decoded base64 bytes are opened as an image too.
            if isinstance(image, (bytearray, bytes)):
                image = Image.open(io.BytesIO(image))
                image = cv2.cvtColor(np.asarray(image), cv2.COLOR_RGB2BGR)
            else:
                # if the image is a list
                image = image.get('instances')[0]
                image = np.divide(torch.HalfTensor(image), 255)

            img0 = image
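The first half of that decoding, turning whatever arrived into raw image bytes, can be tested without TorchServe. The sketch below assumes that a text payload is base64-encoded image data; payload_to_bytes is a hypothetical helper, and opening the bytes with PIL is left out so that the example needs only the standard library.

```python
import base64


def payload_to_bytes(payload):
    """Return the raw image bytes for a str (base64) or bytes-like payload."""
    if isinstance(payload, str):
        # A text payload is assumed to be base64-encoded image data.
        payload = base64.b64decode(payload)
    # Deliberately a second `if`: decoded base64 data continues down the bytes path.
    if isinstance(payload, (bytes, bytearray)):
        return bytes(payload)
    raise TypeError(f"unsupported payload type: {type(payload).__name__}")
```

The returned bytes are what Image.open(io.BytesIO(...)) would receive in the handler.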

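4. The complete handler file, for reference only
What? You think my code is ugly? Sorry, for some reason I can't hear you. If it runs, it's good enough for me!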

# -*- coding: utf-8 -*-
import datetime
import os
import cv2
import sys
import zipfile
import numpy as np
import logging
import base64
import torch
import io
from PIL import Image
from ts.torch_handler.base_handler import BaseHandler

logger = logging.getLogger(__name__)
start_up_time = datetime.datetime.now()
filename = ".//log_" + str(start_up_time).replace(':', '') + '.txt'
logging.basicConfig(filename=filename, level=logging.INFO,
                        format='[%(asctime)s.%(msecs)03d] %(message)s', datefmt='%H:%M:%S')
logging.getLogger().addHandler(logging.StreamHandler(sys.stdout))


class FaceDetectHandler(BaseHandler):

    def __init__(self):
        super().__init__()
        self.imgsz = 640
        self.iou_thres = 0.3
        self.conf_thres = 0.1
        self.xyxy2xywh = None
        self.scale_coords = None
        self.non_max_suppression = None
        self.check_img_size = None
        self.letterbox = None

    def load_yoloface_model(self):
        from experimental import attempt_load
        from utils.datasets import letterbox
        from utils.general import check_img_size, non_max_suppression, scale_coords, xyxy2xywh
        self.letterbox = letterbox
        self.check_img_size = check_img_size
        self.non_max_suppression = non_max_suppression
        self.scale_coords = scale_coords
        self.xyxy2xywh = xyxy2xywh
        with torch.no_grad():
            self.model = attempt_load(self.model_pt_path, map_location=self.device)  # load FP32 model

    def initialize(self, context):
        properties = context.system_properties
        logger.info(f"Cuda available: {torch.cuda.is_available()}")
        logger.info(f"GPU available: {torch.cuda.device_count()}")
        use_cuda = torch.cuda.is_available() and torch.cuda.device_count() > 0
        self.map_location = 'cuda' if use_cuda else 'cpu'
        self.device = torch.device(self.map_location + ':' +
                                   str(properties.get('gpu_id')
                                       ) if use_cuda else 'cpu')
        self.manifest = context.manifest
        model_dir = properties.get('model_dir')
        logger.info("==================model_dir==========================="
                    " %s loaded successfully", model_dir)
        self.model_pt_path = None
        if "serializedFile" in self.manifest["model"]:
            serialized_file = self.manifest["model"]["serializedFile"]
            self.model_pt_path = os.path.join(model_dir, serialized_file)
        model_file = self.manifest['model']['modelFile']
        logger.info("Model file %s loaded successfully", self.model_pt_path)

        with zipfile.ZipFile(model_dir + '/models.zip', 'r') as zip_ref:
            zip_ref.extractall(model_dir)
        with zipfile.ZipFile(model_dir + '/utils.zip', 'r') as zip_ref:
            zip_ref.extractall(model_dir)
        self.load_yoloface_model()

    def dynamic_resize(self, shape, stride=64):
        max_size = max(shape[0], shape[1])
        if max_size % stride != 0:
            max_size = (int(max_size / stride) + 1) * stride
        return max_size

    def preprocess(self, data):
        logger.info("debug--%d", len(data))
        images = []
        for row in data:
            image = row.get("data") or row.get("body")
            if isinstance(image, str):
                # The image arrived as a base64-encoded string: decode it to bytes.
                image = base64.b64decode(image)
            # Plain `if` (not `elif`), so freshly decoded base64 bytes are opened as an image too.
            if isinstance(image, (bytearray, bytes)):
                image = Image.open(io.BytesIO(image))
                image = cv2.cvtColor(np.asarray(image), cv2.COLOR_RGB2BGR)
            else:
                # if the image is a list
                image = image.get('instances')[0]
                image = np.divide(torch.HalfTensor(image), 255)

            img0 = image
            imgsz = self.imgsz
            if imgsz <= 0:  # original size
                imgsz = self.dynamic_resize(image.shape)
            imgsz = self.check_img_size(imgsz, s=64)  # check img_size
            # YOLOv5-style resize: letterbox (scale, then pad, keeping the aspect ratio)
            # (683, 1024, 3) -> (448, 640, 3)
            img = self.letterbox(image, imgsz)[0]
            # Convert
            # (448, 640, 3) -> (3, 448, 640)
            img = img[:, :, ::-1].transpose(2, 0, 1)  # BGR to RGB, HWC to CHW
            img = np.ascontiguousarray(img)
            img = torch.from_numpy(img).to(self.device)
            img = img.float()  # uint8 to fp16/32
            img /= 255.0  # 0 - 255 to 0.0 - 1.0
            if img.ndimension() == 3:
                img = img.unsqueeze(0)
            images.append([img, img0])

        return images

    def inference(self, data, *args, **kwargs):
        model = self.model
        # Run inference
        # Each img_4t is a normalized tensor, e.g. of shape (1, 3, 448, 640)
        bbox_sets = []
        with torch.no_grad():  # inference only, so no gradients are tracked
            for img_4t, img0_3c in data:
                pred = model(img_4t)[0]
                bbox_sets.append([img_4t, img0_3c, pred])
        return bbox_sets

    def postprocess(self, data):
        boxes = [[] for _ in range(len(data))]
        for i, bbox_sets in enumerate(data):
            img_4t, img0_3c, pred = bbox_sets[0], bbox_sets[1], bbox_sets[2]
            # Apply NMS
            pred = self.non_max_suppression(pred, self.conf_thres, self.iou_thres)[0]
            h, w, c = img0_3c.shape
            if pred is not None:
                pred[:, :4] = self.scale_coords(img_4t.shape[2:], pred[:, :4], img0_3c.shape).round()
                for j in range(pred.size()[0]):
                    *xyxy, conf, cls = pred[j]
                    xyxy = torch.Tensor(xyxy).to(self.device)
                    # xywh = (xyxy2xywh(torch.tensor(xyxy).view(1, 4)) / gn).view(-1)  # normalized xywh
                    xywh = (self.xyxy2xywh(torch.as_tensor(xyxy).view(1, 4)) / 1.0).view(-1)
                    xywh = xywh.data.cpu().numpy()
                    conf = pred[j, 4].cpu().numpy()
                    # landmarks = (pred[j, 5:15].view(1, 10) / gn_lks).view(-1).tolist()
                    # class_num = pred[j, 15].cpu().numpy()
                    x1 = int(xywh[0] - 0.5 * xywh[2])
                    y1 = int(xywh[1] - 0.5 * xywh[3])
                    x2 = int(xywh[0] + 0.5 * xywh[2])
                    y2 = int(xywh[1] + 0.5 * xywh[3])
                    # boxes.append([x1, y1, x2 - x1, y2 - y1, conf])
                    boxes[i].append({
                        "x1": x1,
                        "y1": y1,
                        "x2": x2,
                        "y2": y2,
                        "confidence": conf.item()
                    })
        return boxes
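The corner arithmetic at the end of postprocess can be checked on its own. The sketch below is not part of the handler; center_to_corners and make_box are hypothetical names that repeat the same formulas and the same dictionary keys on plain Python numbers.

```python
def center_to_corners(cx, cy, w, h):
    """Convert a (center x, center y, width, height) box to integer corners."""
    # int() truncates toward zero, exactly as in the handler code.
    x1 = int(cx - 0.5 * w)
    y1 = int(cy - 0.5 * h)
    x2 = int(cx + 0.5 * w)
    y2 = int(cy + 0.5 * h)
    return x1, y1, x2, y2


def make_box(cx, cy, w, h, confidence):
    """Build one detection entry with the same keys the handler returns."""
    x1, y1, x2, y2 = center_to_corners(cx, cy, w, h)
    return {"x1": x1, "y1": y1, "x2": x2, "y2": y2, "confidence": float(confidence)}
```

For a box centered at (100, 50) with width 40 and height 20, center_to_corners returns (80, 40, 120, 60). The handler's response is one list of such dictionaries per input image.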

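IV. After the .mar file is generated, restart and test

Restart TorchServe, pointing it at the location where the .mar file is stored. It is best to put the .mar file in model-store: the archive named in --models is looked up in the directory given by --model-store. (I had forgotten the reason when I first wrote this and will add more if I find anything else.)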

torchserve --stop
torchserve --start --ncs --model-store model-store --models yolofacev2.mar

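After restarting TorchServe, open a new terminal; you may need to enter the Docker container again. In practice you have to try both ways: on some WSL setups the connection test only works from inside the container, on others only from outside it. Judge by what actually happens and go with whichever way raises no errors.

# Query the model through the management API
curl http://localhost:8081/models/yolofacev2
# Test commands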
curl http://127.0.0.1:8080/predictions/yolofacev2 -T data/images/zidane.jpg
curl http://127.0.0.1:8080/predictions/yolofacev2 -T data/images/bus.jpg
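The same test can be driven from Python with only the standard library. This is a sketch under a few assumptions: the server is reachable at 127.0.0.1:8080 as in the curl commands, the upload uses PUT to mirror curl -T, and the Content-Type header is set to application/octet-stream so that the body is treated as raw bytes. The function names are made up for this example.

```python
import urllib.request


def build_prediction_request(image_bytes, model_name="yolofacev2",
                             host="http://127.0.0.1:8080"):
    """Build an HTTP PUT request that uploads image bytes, like `curl -T`."""
    url = f"{host}/predictions/{model_name}"
    return urllib.request.Request(
        url,
        data=image_bytes,
        # Without an explicit header urllib would label the body as form data.
        headers={"Content-Type": "application/octet-stream"},
        method="PUT",
    )


def predict(image_path):
    """Send one image file to the running server and return the response text."""
    with open(image_path, "rb") as f:
        request = build_prediction_request(f.read())
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8")
```

With the server running, predict("data/images/zidane.jpg") should return the JSON text produced by the handler's postprocess step.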