[Change Detection] Building Change Detection with UNet on LEVIR-CD: Hands-On Training and ONNX Inference

Author: 你的陈某某

Last modified: 2025-05-08 19:16:09

Category: Change Detection. Tags: change detection, UNet, deep learning

First published: 2024-08-24 16:36:17

Article link: https://blog.csdn.net/weixin_45679938/article/details/141498683

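This post covers the following:

1. Introduction to the LEVIR-CD dataset and where to download it
2. Runtime environment setup
3. Model training and prediction with likyoo's change detection code
4. ONNX inference and visualization

Runtime environment: Python 3.8, torch 1.12.0+cu113, onnxruntime-gpu 1.12.0
likyoo's change detection source code: https://github.com/likyoo/change_detection.pytorch
Experience so far: the environment is simple to configure and training is fast.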

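1 Introduction to the LEVIR-CD Dataset

1.1 Overview

LEVIR-CD consists of 637 pairs of very-high-resolution (VHR, 0.5 m/pixel) Google Earth (GE) image patches, each 1024 × 1024 pixels. These bitemporal images span 5 to 14 years and show significant land-use change, especially building growth. LEVIR-CD covers many building types, such as villa residences, high-rise apartments, small garages, and large warehouses. The dataset focuses on building-related change, including building growth (a change from soil, grass, hardened ground, or a building under construction to a new built-up area) and building decline. The bitemporal images were annotated by remote sensing image interpretation experts with binary labels (1 for change, 0 for no change). Each sample was annotated by one annotator and then double-checked by another to produce high-quality annotations.
Data source: https://justchenhao.github.io/LEVIR/
Paper: https://www.mdpi.com/2072-4292/12/10/1662
Fast download link: https://aistudio.baidu.com/datasetdetail/104390/1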
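
As a quick sanity check on the binary labels described above, the following sketch converts a label mask to 0/1 values and reports the fraction of changed pixels. It assumes the label has been loaded as a single-channel array in which any nonzero value (commonly 255 in PNG masks) marks change. The function name `change_ratio` is made up for this illustration and is not part of the dataset tools.

```python
import numpy as np


def change_ratio(label):
    """Return the fraction of pixels marked as changed in a label mask."""
    # Assumption: any nonzero pixel means "change", zero means "no change".
    binary = (np.asarray(label) > 0).astype(np.uint8)
    return float(binary.mean())


# Synthetic 4x4 label with a 2x2 changed block (255 = change).
demo = np.zeros((4, 4), dtype=np.uint8)
demo[1:3, 1:3] = 255
print(change_ratio(demo))  # 0.25
```

For a real sample, the array could come from cv2.imread(path, cv2.IMREAD_GRAYSCALE) on a label image.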

1.2 Example

[Image: LEVIR-CD example]

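2 Runtime Environment Setup

2.1 Base Environment Setup

For Anaconda and CUDA installation, see:
[Ultra-detailed] Getting YOLOv8 Running: Deep Learning Environment Setup 1 - Anaconda Installation
[Ultra-detailed] Getting YOLOv8 Running: Deep Learning Environment Setup 2 - CUDA Installation

For creating the Python environment and switching package mirrors, see:
[Ultra-detailed] Getting YOLOv8 Running: Deep Learning Environment Setup 3 - YOLOv8 Installation

2.2 Environment Setup for likyoo's Change Detection Code

2.2.1 Downloading the Code

[Image: code download]

2.2.2 Installing the Environment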

# 1 Create the environment
conda create -n likyoo python=3.8
conda activate likyoo

# 2 Install torch
# Option 1:
conda install pytorch==1.12.0 torchvision==0.13.0 torchaudio==0.12.0 cudatoolkit=11.3 -c pytorch
# Option 2:
pip install torch==1.12.0+cu113 torchvision==0.13.0+cu113 torchaudio==0.12.0 --extra-index-url https://download.pytorch.org/whl/cu113

# 3 Verify that the GPU build of torch is installed (run these lines in a Python interpreter, not in the shell)
import torch
print(torch.__version__)  # print the torch version
print(torch.cuda.is_available())  # True means the GPU build is working
print(torch.version.cuda)  # CUDA version torch was built with
print(torch.backends.cudnn.version())  # cuDNN version

# 4 Install the remaining dependencies
cd ./change_detection.pytorch-main
pip install -r requirements.txt
pip install six  # training fails with a missing-module error without this library

3 Model Training and Prediction

3.1 Model Architecture

[Image: model architecture]

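3.1 Model Training

The training script is local_test.py.

Choose a segmentation architecture; the available ones are listed in __init__.py of cdp (change_detection_pytorch).
Modify the training data paths.
The number of epochs is set by MAX_EPOCH on line 72 (default 60), and the batch size defaults to 8. [For this post the batch size was changed to 32, which gave better test results than 8, used about 8 GB of GPU memory, and trained quickly: 60 epochs took less than an hour on an RTX 4080.]

Training error: RuntimeError: Attempted to set the storage of a tensor on device "cpu" to a storage on different device "cuda:0". This is no longer allowed; the devices must match.

Fix:
In hub.py, edit the last line and delete ", map_location=map_location".
[Image: the edited line in hub.py]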

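3.2 Model Prediction

Create a new script named predict.py and copy in the following [check that the input paths are correct, otherwise the script raises an error]: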

import os
import cv2
import numpy as np
import torch
import albumentations as A
import change_detection_pytorch as cdp  # must be importable so torch.load can unpickle the saved model

DEVICE = 'cuda' if torch.cuda.is_available() else 'cpu'

# The trained model is cdp.Unet(encoder_name="resnet34", encoder_weights="imagenet",
# in_channels=3, classes=2, siam_encoder=True, fusion_form='concat').
# best_model.pth stores the whole model object, so torch.load restores it directly
# and the network does not need to be rebuilt first.
model_path = 'best_model.pth'  # change 1: path to the trained model
model = torch.load(model_path)
model.to(DEVICE)
model.eval()

test_transform = A.Compose([A.Normalize()])


def load_image(path):
    """Read an image and return a normalized (1, 3, H, W) float tensor on DEVICE."""
    img = cv2.imread(path)
    if img is None:
        raise FileNotFoundError('Cannot read image: {}'.format(path))
    # Note: cv2.imread returns BGR and no channel swap is done here, whereas the
    # ONNX script in section 4.2.1 feeds RGB to the model. If the two scripts give
    # different masks, add cv2.cvtColor(img, cv2.COLOR_BGR2RGB) here.
    img = test_transform(image=img)['image']
    img = np.expand_dims(img.transpose(2, 0, 1), 0)  # HWC -> CHW -> NCHW
    return torch.tensor(img, dtype=torch.float32).to(DEVICE)


img1 = load_image('E:/datasets/LEVIR-CD/test/A/test_7.png')  # change 2: image A
img2 = load_image('E:/datasets/LEVIR-CD/test/B/test_7.png')  # change 3: image B

with torch.no_grad():  # no gradients are needed for inference
    pre = model(img1, img2)
# argmax over the class axis gives 0/1 per pixel; scale to 0/255 for saving
pre = (torch.argmax(pre, dim=1).cpu().numpy() * 255).astype(np.uint8)
os.makedirs('./result', exist_ok=True)  # cv2.imwrite does not create directories
cv2.imwrite('./result/test_7_pre.png', pre[0])  # change 4: output path

3.3 Results

Result at epoch=60:
[Image: prediction result at epoch 60]
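
To put a number on a visual result like this, a predicted mask can be compared with its ground-truth label. The sketch below computes precision, recall, F1, and IoU for the change class. It is an illustration and not the evaluation code of the training script; the helper name `binary_change_metrics` is hypothetical, and it assumes nonzero pixels mean change in both masks.

```python
import numpy as np


def binary_change_metrics(pred, label):
    """Compute precision, recall, F1 and IoU for the 'change' class."""
    # Assumption: nonzero pixels mean "change" in both masks.
    p = np.asarray(pred) > 0
    g = np.asarray(label) > 0
    tp = int(np.logical_and(p, g).sum())    # predicted change, truly changed
    fp = int(np.logical_and(p, ~g).sum())   # predicted change, actually unchanged
    fn = int(np.logical_and(~p, g).sum())   # missed change
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 0.0
    iou = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "iou": iou}


# Tiny synthetic example: one true positive, one false positive, one false negative.
pred = np.array([[255, 255], [0, 0]], dtype=np.uint8)
label = np.array([[255, 0], [255, 0]], dtype=np.uint8)
print(binary_change_metrics(pred, label))
```

With real files, both masks could be read with cv2.imread(path, cv2.IMREAD_GRAYSCALE) before being passed in.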

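4 ONNX Inference and Visualization

4.1 Exporting Static and Dynamic ONNX Files

Create a new script named export.py to export the ONNX files and copy in the following: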

import torch
import change_detection_pytorch as cdp  # must be importable so torch.load can unpickle the saved model

DEVICE = 'cuda:0' if torch.cuda.is_available() else 'cpu'

# The trained model is cdp.Unet(encoder_name="resnet34", encoder_weights="imagenet",
# in_channels=3, classes=2, siam_encoder=True, fusion_form='concat').
# best_model.pth stores the whole model object, so it is loaded directly
# and the network does not need to be rebuilt first.
model_path = "best_model.pth"  # change 1: path to the trained model
model = torch.load(model_path)
model.to(DEVICE)
model.eval()  # export in inference mode

# Dummy inputs that fix the traced shapes: one image pair of 3 x 1024 x 1024
input1 = torch.randn(1, 3, 1024, 1024, device=DEVICE)
input2 = torch.randn(1, 3, 1024, 1024, device=DEVICE)

# Save the static ONNX model (fixed batch size and image size)
torch.onnx.export(model, (input1, input2),
                   "cd.onnx",
                   input_names=["images1", "images2"],
                   output_names=["output"],
                   verbose=True, opset_version=12)

# Save the dynamic ONNX model (batch size, height, and width may vary)
torch.onnx.export(model, (input1, input2),
                   "cd_dy.onnx",
                   input_names=["images1", "images2"],
                   output_names=["output"],
                   verbose=True, opset_version=12,
                   dynamic_axes={
                       "images1": {0: "batch_size", 2: "input_height", 3: "input_width"},
                       "images2": {0: "batch_size", 2: "input_height", 3: "input_width"},
                       "output": {0: "batch_size", 2: "output_height", 3: "output_width"}
                   })
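
With the dynamic model, the input height and width are no longer fixed. A UNet with a five-stage encoder such as resnet34 downsamples by a factor of 32, though, so sizes that are not multiples of 32 generally lead to shape mismatches at the skip connections. One possible way to handle this (an assumption of this example, not something the scripts in this post do) is to pad both images at the bottom and right to the next multiple of 32 and crop the mask back afterwards. `pad_to_multiple` is a hypothetical helper name.

```python
import numpy as np


def pad_to_multiple(img, multiple=32):
    """Zero-pad an HWC image at the bottom/right so H and W are multiples of `multiple`."""
    h, w = img.shape[:2]
    pad_h = (-h) % multiple  # rows needed to reach the next multiple
    pad_w = (-w) % multiple  # columns needed to reach the next multiple
    padded = np.pad(img, ((0, pad_h), (0, pad_w), (0, 0)), mode="constant")
    return padded, (h, w)


img = np.ones((1000, 750, 3), dtype=np.uint8)
padded, (h, w) = pad_to_multiple(img)
print(padded.shape)  # (1024, 768, 3)
# After inference, crop the predicted mask back to the original size: mask[:h, :w]
```

Both images of a pair need the same padding so that the two inputs keep identical shapes.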

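Inspect the model structure
https://netron.app/
Static ONNX:

Dynamic ONNX:

Note: the output shape [1,2,1024,1024] means two 1024*1024 probability maps, one for class 0 and one for class 1 (two classes). Converting them to a mask therefore requires an argmax in post-processing, which returns the class with the highest probability at each pixel.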
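
The argmax step described in the note can be illustrated on a tiny stand-in array. The values below are invented for the example and use a 2 x 2 image instead of 1024 x 1024; only the axis layout (batch, class, height, width) matches the model output.

```python
import numpy as np

# Stand-in for the model output: batch of 1, 2 classes, 2x2 pixels.
preds = np.array([[[[0.9, 0.2],
                    [0.6, 0.1]],    # channel 0: "no change" scores
                   [[0.1, 0.8],
                    [0.4, 0.9]]]],  # channel 1: "change" scores
                 dtype=np.float32)

# argmax over the class axis (axis=1) gives a class index per pixel.
class_map = np.argmax(preds, axis=1)          # shape (1, 2, 2), values 0 or 1
mask = (class_map[0] * 255).astype(np.uint8)  # 0 = unchanged, 255 = changed
print(mask)
```

The right-hand column comes out as 255 because the change score is higher there, and the left-hand column stays 0.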

4.2 ONNX Inference and Visualization

4.2.1 Running ONNX Inference

import os
import cv2
import time
import argparse
import numpy as np
import onnxruntime as ort  # needed for ONNX Runtime inference: pip install onnxruntime-gpu==1.12.0 -i https://pypi.tuna.tsinghua.edu.cn/simple


class CD(object):
    def __init__(self, onnx_model, in_shape=1024):
        self.in_shape = in_shape  # model input size
        # ImageNet mean and std for images scaled to 0-1, listed in BGR order
        # because cv2.imread returns BGR images
        self.mean = [0.406, 0.456, 0.485]
        self.std = [0.225, 0.224, 0.229]

        # Build the ONNX Runtime session; use CUDA when available, otherwise CPU
        self.ort_session = ort.InferenceSession(onnx_model,
                                providers=['CUDAExecutionProvider', 'CPUExecutionProvider']
                                if ort.get_device() == 'GPU' else ['CPUExecutionProvider'])

    # Normalization
    def normalize(self, image, mean, std):
        # The mean and std are for the 0-1 range, so scale the 0-255 image first
        image = image.astype(np.float32) / 255.0
        mean = np.array(mean, dtype=np.float32)
        std = np.array(std, dtype=np.float32)
        return (image - mean) / std  # broadcasts over the last (BGR channel) axis

    def preprocess(self, img_a, img_b):
        # Resize to in_shape x in_shape if either side differs
        if img_a.shape[0] != self.in_shape or img_a.shape[1] != self.in_shape:
            img_a = cv2.resize(img_a, (self.in_shape, self.in_shape), interpolation=cv2.INTER_LINEAR)
        if img_b.shape[0] != self.in_shape or img_b.shape[1] != self.in_shape:
            img_b = cv2.resize(img_b, (self.in_shape, self.in_shape), interpolation=cv2.INTER_LINEAR)

        # Apply normalization
        img_a = self.normalize(img_a, self.mean, self.std)
        img_b = self.normalize(img_b, self.mean, self.std)
        # (1024, 1024, 3) --> (3, 1024, 1024); [::-1] reverses the channel axis, BGR --> RGB
        img_a = np.ascontiguousarray(np.einsum('HWC->CHW', img_a)[::-1], dtype=np.single)
        img_b = np.ascontiguousarray(np.einsum('HWC->CHW', img_b)[::-1], dtype=np.single)  # np.single is the same as np.float32
        img_a = img_a[None] if len(img_a.shape) == 3 else img_a  # add batch axis: (1, 3, 1024, 1024)
        img_b = img_b[None] if len(img_b.shape) == 3 else img_b

        return img_a, img_b

    # Inference
    def infer(self, img_a, img_b):
        im1, im2 = self.preprocess(img_a, img_b)  # --> two arrays of shape (1, 3, in_shape, in_shape)

        inputs = self.ort_session.get_inputs()
        preds = self.ort_session.run(None, {inputs[0].name: im1, inputs[1].name: im2})[0]
        # argmax over the class axis gives 0/1 per pixel; scale to 0/255
        out_img = (np.argmax(preds, axis=1)[0] * 255).astype("uint8")

        return out_img


if __name__ == '__main__':
    # Create an argument parser to handle command-line arguments
    parser = argparse.ArgumentParser()
    parser.add_argument('--model', type=str, default='cd.onnx', help='Path to ONNX model')
    parser.add_argument('--source_A', type=str, default='E:/datasets/LEVIR-CD/test/A/test_7.png', help='Period-A image')
    parser.add_argument('--source_B', type=str, default='E:/datasets/LEVIR-CD/test/B/test_7.png', help='Period-B image')
    parser.add_argument('--in_shape', type=int, default=1024, help='Model input image size')
    args = parser.parse_args()

    # Instantiate the change detection model
    cd = CD(args.model, args.in_shape)

    t1 = time.time()
    # Read images with OpenCV
    img_a = cv2.imread(args.source_A)
    img_b = cv2.imread(args.source_B)
    if img_a is None or img_b is None:
        raise FileNotFoundError('Cannot read the input images; check --source_A and --source_B')

    # Inference and output
    out = cd.infer(img_a, img_b)

    # Save the result
    os.makedirs('./result', exist_ok=True)  # cv2.imwrite does not create directories
    cv2.imwrite('./result/test_7_res.png', out)
    print('Total time: {}'.format(time.time() - t1))
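
For visualization, one simple option is to tint the detected change pixels on top of the period-B image. The sketch below is a minimal illustration under stated assumptions and not necessarily the method used for the figures in this post: `overlay_mask` is a hypothetical helper, the image is BGR as returned by cv2.imread, and the mask has the same height and width as the image with nonzero values marking change.

```python
import numpy as np


def overlay_mask(image_bgr, mask, color=(0, 0, 255), alpha=0.5):
    """Blend `color` (BGR, red by default) into the pixels where `mask` is nonzero."""
    out = image_bgr.astype(np.float32)
    changed = np.asarray(mask) > 0
    # Weighted blend on changed pixels only; other pixels stay untouched.
    out[changed] = (1 - alpha) * out[changed] + alpha * np.array(color, dtype=np.float32)
    return out.astype(np.uint8)  # fractional values are truncated


# Synthetic 2x2 gray image with one changed pixel in the top-left corner.
image = np.full((2, 2, 3), 100, dtype=np.uint8)
mask = np.array([[255, 0], [0, 0]], dtype=np.uint8)
vis = overlay_mask(image, mask)
print(vis[0, 0], vis[0, 1])
```

In the script above, this could be applied to img_b and out and the result saved with cv2.imwrite, provided img_b is first resized to the mask size when the two differ.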