OpenMMLab Study Notes: Summary

Label-Studio and SAM

  • SAM (Segment Anything) is Meta AI's "segment anything" model.
  • Label Studio is an excellent annotation tool covering dataset labeling for image classification, object detection, segmentation, and more.

Environment setup

conda create -n rtmdet-sam python=3.9 -y
conda activate rtmdet-sam

In practice you don't actually need 3.9; a 3.8 environment works fine too, but it's best not to go below 3.8 or you'll hit problems later (see the numpy conflict below). After creating the virtual environment, pull the source:

git clone https://github.com/open-mmlab/playground

Then install PyTorch:

# Linux and Windows CUDA 11.3
pip install torch==1.10.1+cu113 torchvision==0.11.2+cu113 torchaudio==0.10.1 -f https://download.pytorch.org/whl/cu113/torch_stable.html


# Linux and Windows CPU only
pip install torch==1.10.1+cpu torchvision==0.11.2+cpu torchaudio==0.10.1 -f https://download.pytorch.org/whl/cpu/torch_stable.html

# OSX
pip install torch==1.10.1 torchvision==0.11.2 torchaudio==0.10.1
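
A quick sanity check that the install worked and (for the CUDA builds) that PyTorch can see the GPU:

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"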

Prepare the SAM pretrained weights and the visualization packages:

cd path/to/playground/label_anything
# Before proceeding to the next step in Windows, you need to complete the following command line.
# conda install pycocotools -c conda-forge
pip install opencv-python pycocotools matplotlib onnxruntime onnx
pip install git+https://github.com/facebookresearch/segment-anything.git
wget https://dl.fbaipublicfiles.com/segment_anything/sam_vit_b_01ec64.pth

# For better segmentation results, use the sam_vit_h_4b8939.pth weights
# wget https://dl.fbaipublicfiles.com/segment_anything/sam_vit_l_0b3195.pth
# wget https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth

This downloads the smallest base model for Segment Anything, about 375 MB. For the full API, see the facebookresearch/segment-anything repo on GitHub. The main class is SamAutomaticMaskGenerator, and its core methods look roughly like this:

    def _generate_masks(self, image: np.ndarray) -> MaskData:
        orig_size = image.shape[:2]
        crop_boxes, layer_idxs = generate_crop_boxes(
            orig_size, self.crop_n_layers, self.crop_overlap_ratio
        )

        # Iterate over image crops
        data = MaskData()
        for crop_box, layer_idx in zip(crop_boxes, layer_idxs):
            crop_data = self._process_crop(image, crop_box, layer_idx, orig_size)
            data.cat(crop_data)

        # Remove duplicate masks between crops
        if len(crop_boxes) > 1:
            # Prefer masks from smaller crops
            scores = 1 / box_area(data["crop_boxes"])
            scores = scores.to(data["boxes"].device)
            keep_by_nms = batched_nms(
                data["boxes"].float(),
                scores,
                torch.zeros_like(data["boxes"][:, 0]),  # categories
                iou_threshold=self.crop_nms_thresh,
            )
            data.filter(keep_by_nms)

        data.to_numpy()
        return data

    def _process_crop(
        self,
        image: np.ndarray,
        crop_box: List[int],
        crop_layer_idx: int,
        orig_size: Tuple[int, ...],
    ) -> MaskData:
        # Crop the image and calculate embeddings
        x0, y0, x1, y1 = crop_box
        cropped_im = image[y0:y1, x0:x1, :]
        cropped_im_size = cropped_im.shape[:2]
        self.predictor.set_image(cropped_im)

        # Get points for this crop
        points_scale = np.array(cropped_im_size)[None, ::-1]
        points_for_image = self.point_grids[crop_layer_idx] * points_scale

        # Generate masks for this crop in batches
        data = MaskData()
        for (points,) in batch_iterator(self.points_per_batch, points_for_image):
            batch_data = self._process_batch(points, cropped_im_size, crop_box, orig_size)
            data.cat(batch_data)
            del batch_data
        self.predictor.reset_image()

        # Remove duplicates within this crop.
        keep_by_nms = batched_nms(
            data["boxes"].float(),
            data["iou_preds"],
            torch.zeros_like(data["boxes"][:, 0]),  # categories
            iou_threshold=self.box_nms_thresh,
        )
        data.filter(keep_by_nms)

        # Return to the original image frame
        data["boxes"] = uncrop_boxes_xyxy(data["boxes"], crop_box)
        data["points"] = uncrop_points(data["points"], crop_box)
        data["crop_boxes"] = torch.tensor([crop_box for _ in range(len(data["rles"]))])

        return data

These are two method excerpts from the front half of the SamAutomaticMaskGenerator pipeline. As you can see, the overall flow is fairly simple: run recognition over the whole image (crop by crop), then collect the masks. The repo also ships a demo showing how to call it:

import sys
sys.path.append("..")
import cv2
import matplotlib.pyplot as plt
from segment_anything import sam_model_registry, SamAutomaticMaskGenerator, SamPredictor

sam_checkpoint = "sam_vit_h_4b8939.pth"  # with the vit_b weights downloaded above, use "sam_vit_b_01ec64.pth" and model_type = "vit_b"
model_type = "vit_h"

device = "cuda"

sam = sam_model_registry[model_type](checkpoint=sam_checkpoint)
sam.to(device=device)

mask_generator = SamAutomaticMaskGenerator(sam)

# load a test image; OpenCV reads BGR, SAM expects RGB
image = cv2.imread("your_image.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

masks = mask_generator.generate(image)

plt.figure(figsize=(20, 20))
plt.imshow(image)
show_anns(masks)  # plotting helper from the official notebook, sketched below
plt.axis('off')
plt.show()

[Image: SAM automatic mask generation result]
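
Note that show_anns is not part of the segment_anything package; it's a plotting helper defined in the official example notebook. A minimal sketch of an equivalent helper, assuming only the 'segmentation' (H×W boolean array) and 'area' keys that generate() returns for each mask:

import numpy as np
import matplotlib.pyplot as plt

def show_anns(anns):
    """Overlay each mask on the current axes in a random translucent color."""
    if len(anns) == 0:
        return
    # draw large masks first so smaller ones remain visible on top
    sorted_anns = sorted(anns, key=lambda a: a["area"], reverse=True)
    ax = plt.gca()
    ax.set_autoscale_on(False)
    h, w = sorted_anns[0]["segmentation"].shape
    overlay = np.zeros((h, w, 4))  # RGBA, fully transparent to start
    for ann in sorted_anns:
        overlay[ann["segmentation"]] = np.concatenate([np.random.random(3), [0.35]])
    ax.imshow(overlay)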

I'll skip any further explanation of Segment Anything and continue with Label Studio. Next, install the Label Studio backend and its algorithm component:

# sudo apt install libpq-dev python3-dev # Note: If using Label Studio 1.7.2 version, you need to install libpq-dev and python3-dev dependencies.

# Installing label-studio may take some time. If you cannot find the version, please use the official source.
pip install label-studio==1.7.3
pip install label-studio-ml==1.0.9

At this point a fairly serious conflict shows up, and it's the reason Python >= 3.8 was required earlier. label-studio pulls numpy up to 1.24.1 or newer, but on Python below 3.8 the newest numpy pip can offer is only around 1.20, so the install fails outright; with a suitable Python version there's no problem. The install order can't be reversed either. The two packages seem to be written by different teams: label-studio-ml wants numpy below 1.24, so installing it second downgrades some of what label-studio just pulled in. Downgrades like that are usually harmless for Python packages, but if you install label-studio-ml first, pip raises a forward-compatibility error and the install fails all the same.
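
After both installs, it's worth confirming what pip actually resolved:

pip show numpy label-studio label-studio-ml | grep -E "^(Name|Version)"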

Once the installs look good, start the model service first:

cd path/to/playground/label_anything

label-studio-ml start sam --port 8003 --with \
  sam_config=vit_b \
  sam_checkpoint_file=./sam_vit_b_01ec64.pth \
  out_mask=True \
  out_bbox=True \
  device=cuda:0
# device=cuda:0 is for using GPU inference. If you want to use CPU inference, replace cuda:0 with cpu.
# out_poly=True returns the annotation of the bounding polygon.
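
To confirm the backend is actually listening, the label-studio-ml server exposes a /health endpoint in the 1.0.x series (exact response fields vary by version, so treat this as a hint to verify):

curl http://localhost:8003/health
# expect a small JSON payload indicating the service is up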

It defaults to port 8003, not a commonly used port, and custom settings are available, which is user-friendly. The Label Studio web service itself is more puzzling; likewise, it starts with:

label-studio start

The official docs don't spell out how to change the port, nor do they describe the other settings, yet the default is 8080, a very common port that will obviously collide with other services; I had to stop my nginx before running it. Port aside, it at least binds to 0.0.0.0 rather than 127.0.0.1; otherwise, on a cloud deployment, you'd be unable to reach it at all without going into the source to change the config.
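
That said, the CLI does appear to accept a port override even though the docs I read didn't call it out; verify against your installed version:

label-studio start --port 8081
# or via environment variable:
# export LABEL_STUDIO_PORT=8081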

If nothing has errored out so far, you should see the same UI as the official one:

[Image: Label Studio login page]

From a quick test, the login form only does regex-style validation: any email with a plausible length and the right symbols, plus any password at all, will register and log you in.

Everything after that happens in the web UI; just follow the docs step by step. There is one labeling-interface XML config to set:

<View>
  <Image name="image" value="$image" zoom="true"/>
  <KeyPointLabels name="KeyPointLabels" toName="image">
    <Label value="cat" smart="true" background="#e51515" showInline="true"/>
    <Label value="person" smart="true" background="#412cdd" showInline="true"/>
  </KeyPointLabels>
  <RectangleLabels name="RectangleLabels" toName="image">
  	<Label value="cat" background="#FF0000"/>
  	<Label value="person" background="#0d14d3"/>
  </RectangleLabels>
  <PolygonLabels name="PolygonLabels" toName="image">
  	<Label value="cat" background="#FF0000"/>
  	<Label value="person" background="#0d14d3"/>
  </PolygonLabels>
  <BrushLabels name="BrushLabels" toName="image">
  	<Label value="cat" background="#FF0000"/>
  	<Label value="person" background="#0d14d3"/>
  </BrushLabels>
</View>

The official explanation reads:

  • In the above XML, we have configured the annotations, where KeyPointLabels are for keypoint annotations, BrushLabels are for Mask annotations, PolygonLabels are for bounding polygon annotations, and RectangleLabels are for rectangle annotations.
  • This example uses two categories, cat and person. If community users want to add more categories, they need to add the corresponding categories in KeyPointLabels, BrushLabels, PolygonLabels, and RectangleLabels respectively.

In other words, these correspond to the labels you define: here it's cat and person, but you could just as well add car, truck, and so on. There are four annotation types: BrushLabels, PolygonLabels, RectangleLabels, and KeyPointLabels (see the snippet below for adding a category).
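
For example, adding a hypothetical car category (the color value here is arbitrary) just means one extra Label line in each of the four blocks:

<KeyPointLabels name="KeyPointLabels" toName="image">
  <Label value="cat" smart="true" background="#e51515" showInline="true"/>
  <Label value="person" smart="true" background="#412cdd" showInline="true"/>
  <Label value="car" smart="true" background="#00a650" showInline="true"/>
</KeyPointLabels>
<!-- ...and likewise add <Label value="car" background="#00a650"/> under
     RectangleLabels, PolygonLabels, and BrushLabels -->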

In my tests, though, only RectangleLabels and KeyPointLabels worked well; PolygonLabels and BrushLabels had no effect for me: no best-fit region was found and nothing responded at all. It might be an operation problem on my end, or something that hasn't been updated properly. The docs don't include demo GIFs either, so I'll go look for a video on Bilibili later.

MMPOSE
