[Artificial Intelligence II] Lab 3: Semantic Segmentation on the Pascal VOC2012 Dataset with FCN


I: Objectives and Requirements

1: Understand what image segmentation means.

2: Learn how to build and train a segmentation model with FCN.

3: Learn how to perform semantic segmentation on the Pascal VOC2012 dataset with FCN.

II: Experiment Content

1: Perform semantic segmentation on the Pascal VOC2012 dataset with an FCN.

2: Visually compare the segmentation results of FCN8s, FCN16s, and FCN32s with the ground truth.

3: Try adjusting network parameters to improve the model's segmentation accuracy.

III: Experiment Environment

The environment used in this experiment is listed below.

Operating system: Ubuntu (Linux)

Programming language: Python 3.11.4

Third-party dependencies: torch, torchvision, tqdm, ninja, Pillow (PIL); standard-library modules such as argparse, os, sys, and __future__ are also used
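For reference, the third-party packages can be installed with a command along the following lines (a typical setup, assuming pip; PIL is provided by the Pillow package, and torch/torchvision should match the installed CUDA version):

pip install torch torchvision tqdm ninja Pillow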

IV: Method and Workflow

1: Write the data-loading code and download the Pascal VOC2012 dataset.

2: Use the modular implementations of the FCN family, including FCN8s, FCN16s, and FCN32s.

3: Use the training and validation code for FCN on this dataset, so that performance on the training and validation sets is reported during training.

4: Write code to visually compare the original images with the FCN outputs.

5: Try to improve the FCN model to increase segmentation accuracy.

V: Experiment Results (visualization of the training process and part of the results)

1: Training FCN8s

Set up the training command: FCN8s as the segmentation model, vgg16 as the backbone, an initial learning rate of 0.0001, 30 training epochs, and Pascal VOC 2012 as the dataset.

The full command is:

cd awesome-semantic-segmentation-pytorch/scripts

python train.py --model fcn8s --backbone vgg16 --dataset pascal_voc --lr 0.0001 --epochs 30

Reported training time: Total training time: 0:36:34.115804 (0.1998s / it)

The training log was saved to: /home/ubuntu/awesome-semantic-segmentation-pytorch/runs/logs/fcn8s_vgg16_pascal_voc_log.txt
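As a rough consistency check (assuming the default batch size of 4 and the 1,464-image VOC 2012 train split): 1464 / 4 = 366 iterations per epoch, so 30 epochs ≈ 10,980 iterations, and 10,980 × 0.1998 s ≈ 2,194 s ≈ 36.6 minutes, which matches the reported total.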

The learning-rate curve during training is shown below:

The training loss curve is shown below:

The validation mIoU curve is shown below:

The validation pixel accuracy curve is shown below:
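The two validation metrics used throughout are pixel accuracy (pixAcc) and mean intersection-over-union (mIoU). A minimal sketch of how they can be computed from integer class maps (for illustration only; the repository's SegmentationMetric may differ in details such as label handling and accumulation across batches):

import numpy as np

def pixacc_miou(pred, label, num_classes, ignore_index=-1):
    # pred and label are integer class maps of the same shape
    mask = label != ignore_index
    pix_acc = float((pred[mask] == label[mask]).mean())
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, label == c)[mask].sum()
        union = np.logical_or(pred == c, label == c)[mask].sum()
        if union > 0:                      # skip classes absent from both prediction and label
            ious.append(inter / union)
    return pix_acc, float(np.mean(ious))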

2: Training FCN16s

Set up the training command: FCN16s as the segmentation model, vgg16 as the backbone, an initial learning rate of 0.0001, 30 training epochs, and Pascal VOC 2012 as the dataset.

The full command is:

python train.py --model fcn16s --backbone vgg16 --dataset pascal_voc --lr 0.0001 --epochs 30

Reported training time: Total training time: 0:38:59.804016 (0.2131s / it)

The training log was saved to: /home/ubuntu/awesome-semantic-segmentation-pytorch/runs/logs/fcn16s_vgg16_pascal_voc_log.txt

The learning-rate curve during training is shown below:

The training loss curve is shown below:

The validation mIoU curve is shown below:

The validation pixel accuracy curve is shown below:

3: Training FCN32s

Set up the training command: FCN32s as the segmentation model, vgg16 as the backbone, an initial learning rate of 0.0001, 30 training epochs, and Pascal VOC 2012 as the dataset.

The full command is:

python train.py --model fcn32s --backbone vgg16 --dataset pascal_voc --lr 0.0001 --epochs 30

Reported training time: Total training time: 0:43:39.373219 (0.2386s / it)

The training log was saved to: /home/ubuntu/awesome-semantic-segmentation-pytorch/runs/logs/fcn32s_vgg16_pascal_voc_log.txt

The learning-rate curve during training is shown below:

The training loss curve is shown below:

The validation mIoU curve is shown below:

The validation pixel accuracy curve is shown below:

4: Evaluating FCN8s

The full command is:

python eval.py --model fcn8s --backbone vgg16 --dataset pascal_voc


The mIoU over the validation set is shown below:

For samples drawn from the dataset, the segmentation results are compared with the original images below:

5: Evaluating FCN16s

The full command is:

python eval.py --model fcn16s --backbone vgg16 --dataset pascal_voc

The mIoU over the validation set is shown below:

The pixel accuracy over the validation set is shown below:

For samples drawn from the dataset, the segmentation results are compared with the original images below:

6: Evaluating FCN32s

The full command is:

python eval.py --model fcn32s --backbone vgg16 --dataset pascal_voc

The mIoU over the validation set is shown below:

The pixel accuracy over the validation set is shown below:

For samples drawn from the dataset, the segmentation results are compared with the original images below:

7: Prediction with FCN8s

The full command is:

python demo.py --model fcn8s_vgg16_voc --input-pic /home/ubuntu/awesome-semantic-segmentation-pytorch/tests/sheep.jpg

The predicted segmentation of the sample image is compared with the original below:

8: Prediction with FCN16s

The full command is:

python demo.py --model fcn16s_vgg16_voc --input-pic /home/ubuntu/awesome-semantic-segmentation-pytorch/tests/sheep.jpg

The predicted segmentation of the sample image is compared with the original below:

9: Prediction with FCN32s

The full command is:

python demo.py --model fcn32s_vgg16_voc --input-pic /home/ubuntu/awesome-semantic-segmentation-pytorch/tests/sheep.jpg

The predicted segmentation of the sample image is compared with the original below:

10: Optimizing FCN (ResNet backbones)

According to the FCN page in the official PyTorch documentation (FCN | PyTorch), the official models use ResNet-50 and ResNet-101 as the backbone. Therefore, in this experiment, swapping the backbone is adopted directly as the way to improve FCN.
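A minimal sketch of loading such a pretrained model through torch.hub (the same approach as the full prediction code in part 8 of Section VIII):

import torch

# load the official pretrained FCN with a ResNet-50 backbone; use 'fcn_resnet101' for the ResNet-101 variant
model = torch.hub.load('pytorch/vision:v0.10.0', 'fcn_resnet50', pretrained=True)
model.eval()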

With resnet50 as the backbone, the predicted segmentation of the sample image is compared with the original below:

With resnet101 as the backbone, the predicted segmentation of the sample image is compared with the original below:

The figures above show that the FCN models provided by PyTorch give a substantial improvement in prediction quality.

VI: Conclusions

1: Visually comparing the segmentation results of different models gives an intuitive view of their performance differences and relative strengths and weaknesses, which provides a useful reference for model selection and optimization.

2: The FCN architecture consists of a fully convolutional part and an upsampling (deconvolution) part. The fully convolutional part is a CNN that extracts features; the upsampling part restores the feature maps to the original resolution to produce the semantic segmentation map. An FCN accepts a color image of arbitrary size and produces an output of the same spatial size, with one channel per target class plus one for background. In other words, FCN recovers the class of every pixel from abstract features, extending classification from the image level to the pixel level.
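A minimal shape check illustrating this property, using torchvision's FCN as an example (an illustrative sketch only; VOC has 20 foreground classes plus background, i.e. 21 output channels):

import torch
from torchvision.models.segmentation import fcn_resnet50

# randomly initialized head; the ImageNet-pretrained backbone weights may be downloaded on first use
model = fcn_resnet50(num_classes=21).eval()
x = torch.randn(1, 3, 480, 480)      # any input size works
with torch.no_grad():
    out = model(x)['out']
print(out.shape)                      # torch.Size([1, 21, 480, 480]) -- same spatial size as the input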

3: Through its skip architecture, FCN combines the prediction from the last layer (rich in global information) with predictions from shallower layers (rich in local detail), making local predictions while respecting the global structure.
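A condensed sketch of this skip fusion, following the FCN8s forward pass shown in part 5 of Section VIII:

import torch.nn.functional as F

def fcn8s_fuse(score_fr, score_pool4, score_pool3, input_size):
    # score_fr: coarse class scores from pool5; score_pool4 / score_pool3: scores from shallower layers
    up2 = F.interpolate(score_fr, score_pool4.shape[2:], mode='bilinear', align_corners=True)
    fuse_pool4 = up2 + score_pool4      # add pool4 detail to the global prediction
    up4 = F.interpolate(fuse_pool4, score_pool3.shape[2:], mode='bilinear', align_corners=True)
    fuse_pool3 = up4 + score_pool3      # add pool3 detail
    return F.interpolate(fuse_pool3, input_size, mode='bilinear', align_corners=True)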

4: FCN also has drawbacks: first, the results are still not fine-grained enough and are insensitive to detail; second, relationships between pixels are not modeled, so the output lacks spatial consistency.

VII: Problems Encountered and Solutions

Problem 1: The Pascal VOC 2012 dataset is fairly large, so downloading it was slow.

Solution 1: No real fix; waiting about half an hour was the only option.

Problem 2: Training, evaluation, and prediction for FCN are hard to implement in a single script.

Solution 2: Several usable repositories were found on GitHub, such as awesome-semantic-segmentation-pytorch and Amazing-Semantic-Segmentation, as well as the official fcn.berkeleyvision.org. For ease of use, awesome-semantic-segmentation-pytorch was chosen as the main framework for this experiment.

Problem 3: In the awesome-semantic-segmentation-pytorch project, when FCN is used with a ResNet backbone, the ResNet download link is broken.

Solution 3: No fix was found within the project itself; the pretrained model integrated in PyTorch (torch.hub) was used for prediction instead.

VIII: Source Code

The full source code is available at https://github.com/Tramac/awesome-semantic-segmentation-pytorch; only the code most relevant to this experiment is listed here.

1: Training code

import argparse

import time

import datetime

import os

import shutil

import sys

cur_path = os.path.abspath(os.path.dirname(__file__))

root_path = os.path.split(cur_path)[0]

sys.path.append(root_path)

import torch

import torch.nn as nn

import torch.utils.data as data

import torch.backends.cudnn as cudnn

from torchvision import transforms

from core.data.dataloader import get_segmentation_dataset

from core.models.model_zoo import get_segmentation_model

from core.utils.loss import get_segmentation_loss

from core.utils.distributed import *

from core.utils.logger import setup_logger

from core.utils.lr_scheduler import WarmupPolyLR

from core.utils.score import SegmentationMetric

def parse_args():

    parser = argparse.ArgumentParser(description='Semantic Segmentation Training With Pytorch')

    # model and dataset

    parser.add_argument('--model', type=str, default='fcn',

                        choices=['fcn32s', 'fcn16s', 'fcn8s', 'fcn', 'psp', 'deeplabv3',

                            'deeplabv3_plus', 'danet', 'denseaspp', 'bisenet', 'encnet',

                            'dunet', 'icnet', 'enet', 'ocnet', 'psanet', 'cgnet', 'espnet',

                            'lednet', 'dfanet','swnet'],

                        help='model name (default: fcn32s)')

    parser.add_argument('--backbone', type=str, default='resnet50',

                        choices=['vgg16', 'resnet18', 'resnet50', 'resnet101', 'resnet152',

                            'densenet121', 'densenet161', 'densenet169', 'densenet201'],

                        help='backbone name (default: vgg16)')

    parser.add_argument('--dataset', type=str, default='pascal_voc',

                        choices=['pascal_voc', 'pascal_aug', 'ade20k', 'citys', 'sbu'],

                        help='dataset name (default: pascal_voc)')

    parser.add_argument('--base-size', type=int, default=520,

                        help='base image size')

    parser.add_argument('--crop-size', type=int, default=480,

                        help='crop image size')

    parser.add_argument('--workers', '-j', type=int, default=4,

                        metavar='N', help='dataloader threads')

    # training hyper params

    parser.add_argument('--jpu', action='store_true', default=False,

                        help='JPU')

    parser.add_argument('--use-ohem', type=bool, default=False,

                        help='OHEM Loss for cityscapes dataset')

    parser.add_argument('--aux', action='store_true', default=False,

                        help='Auxiliary loss')

    parser.add_argument('--aux-weight', type=float, default=0.4,

                        help='auxiliary loss weight')

    parser.add_argument('--batch-size', type=int, default=4, metavar='N',

                        help='input batch size for training (default: 8)')

    parser.add_argument('--start_epoch', type=int, default=0,

                        metavar='N', help='start epochs (default:0)')

    parser.add_argument('--epochs', type=int, default=50, metavar='N',

                        help='number of epochs to train (default: 50)')

    parser.add_argument('--lr', type=float, default=1e-4, metavar='LR',

                        help='learning rate (default: 1e-4)')

    parser.add_argument('--momentum', type=float, default=0.9, metavar='M',

                        help='momentum (default: 0.9)')

    parser.add_argument('--weight-decay', type=float, default=1e-4, metavar='M',

                        help='w-decay (default: 5e-4)')

    parser.add_argument('--warmup-iters', type=int, default=0,

                        help='warmup iters')

    parser.add_argument('--warmup-factor', type=float, default=1.0 / 3,

                        help='lr = warmup_factor * lr')

    parser.add_argument('--warmup-method', type=str, default='linear',

                        help='method of warmup')

    # cuda setting

    parser.add_argument('--no-cuda', action='store_true', default=False,

                        help='disables CUDA training')

    parser.add_argument('--local_rank', type=int, default=0)

    # checkpoint and log

    parser.add_argument('--resume', type=str, default=None,

                        help='put the path to resuming file if needed')

    parser.add_argument('--save-dir', default='~/.torch/models',

                        help='Directory for saving checkpoint models')

    parser.add_argument('--save-epoch', type=int, default=10,

                        help='save model every checkpoint-epoch')

    parser.add_argument('--log-dir', default='../runs/logs/',

                        help='Directory for saving checkpoint models')

    parser.add_argument('--log-iter', type=int, default=10,

                        help='print log every log-iter')

    # evaluation only

    parser.add_argument('--val-epoch', type=int, default=1,

                        help='run validation every val-epoch')

    parser.add_argument('--skip-val', action='store_true', default=False,

                        help='skip validation during training')

    args = parser.parse_args()

    # default settings for epochs, batch_size and lr

    if args.epochs is None:

        epoches = {

            'coco': 30,

            'pascal_aug': 80,

            'pascal_voc': 50,

            'pcontext': 80,

            'ade20k': 160,

            'citys': 120,

            'sbu': 160,

        }

        args.epochs = epoches[args.dataset.lower()]

    if args.lr is None:

        lrs = {

            'coco': 0.004,

            'pascal_aug': 0.001,

            'pascal_voc': 0.0001,

            'pcontext': 0.001,

            'ade20k': 0.01,

            'citys': 0.01,

            'sbu': 0.001,

        }

        args.lr = lrs[args.dataset.lower()] / 8 * args.batch_size

    return args

class Trainer(object):

    def __init__(self, args):

        self.args = args

        self.device = torch.device(args.device)

        # image transform

        input_transform = transforms.Compose([

            transforms.ToTensor(),

            transforms.Normalize([.485, .456, .406], [.229, .224, .225]),

        ])

        # dataset and dataloader

        data_kwargs = {'transform': input_transform, 'base_size': args.base_size, 'crop_size': args.crop_size}

        train_dataset = get_segmentation_dataset(args.dataset, split='train', mode='train', **data_kwargs)

        val_dataset = get_segmentation_dataset(args.dataset, split='val', mode='val', **data_kwargs)

        args.iters_per_epoch = len(train_dataset) // (args.num_gpus * args.batch_size)

        args.max_iters = args.epochs * args.iters_per_epoch

        train_sampler = make_data_sampler(train_dataset, shuffle=True, distributed=args.distributed)

        train_batch_sampler = make_batch_data_sampler(train_sampler, args.batch_size, args.max_iters)

        val_sampler = make_data_sampler(val_dataset, False, args.distributed)

        val_batch_sampler = make_batch_data_sampler(val_sampler, args.batch_size)

        self.train_loader = data.DataLoader(dataset=train_dataset,

                                            batch_sampler=train_batch_sampler,

                                            num_workers=args.workers,

                                            pin_memory=True)

        self.val_loader = data.DataLoader(dataset=val_dataset,

                                          batch_sampler=val_batch_sampler,

                                          num_workers=args.workers,

                                          pin_memory=True)

        # create network

        BatchNorm2d = nn.SyncBatchNorm if args.distributed else nn.BatchNorm2d

        self.model = get_segmentation_model(model=args.model, dataset=args.dataset, backbone=args.backbone,

                                            aux=args.aux, jpu=args.jpu, norm_layer=BatchNorm2d).to(self.device)

        # resume checkpoint if needed

        if args.resume:

            if os.path.isfile(args.resume):

                name, ext = os.path.splitext(args.resume)

                assert ext in ('.pkl', '.pth'), 'Sorry only .pth and .pkl files supported.'

                print('Resuming training, loading {}...'.format(args.resume))

                self.model.load_state_dict(torch.load(args.resume, map_location=lambda storage, loc: storage))

        # create criterion

        self.criterion = get_segmentation_loss(args.model, use_ohem=args.use_ohem, aux=args.aux,

                                               aux_weight=args.aux_weight, ignore_index=-1).to(self.device)

        # optimizer, for model just includes pretrained, head and auxlayer

        params_list = list()

        if hasattr(self.model, 'pretrained'):

            params_list.append({'params': self.model.pretrained.parameters(), 'lr': args.lr})

        if hasattr(self.model, 'exclusive'):

            for module in self.model.exclusive:

                params_list.append({'params': getattr(self.model, module).parameters(), 'lr': args.lr * 10})

        self.optimizer = torch.optim.SGD(params_list,

                                         lr=args.lr,

                                         momentum=args.momentum,

                                         weight_decay=args.weight_decay)

        # lr scheduling

        self.lr_scheduler = WarmupPolyLR(self.optimizer,

                                         max_iters=args.max_iters,

                                         power=0.9,

                                         warmup_factor=args.warmup_factor,

                                         warmup_iters=args.warmup_iters,

                                         warmup_method=args.warmup_method)

        if args.distributed:

            self.model = nn.parallel.DistributedDataParallel(self.model, device_ids=[args.local_rank],

                                                             output_device=args.local_rank)

        # evaluation metrics

        self.metric = SegmentationMetric(train_dataset.num_class)

        self.best_pred = 0.0

    def train(self):

        save_to_disk = get_rank() == 0

        epochs, max_iters = self.args.epochs, self.args.max_iters

        log_per_iters, val_per_iters = self.args.log_iter, self.args.val_epoch * self.args.iters_per_epoch

        save_per_iters = self.args.save_epoch * self.args.iters_per_epoch

        start_time = time.time()

        logger.info('Start training, Total Epochs: {:d} = Total Iterations {:d}'.format(epochs, max_iters))

        self.model.train()

        for iteration, (images, targets, _) in enumerate(self.train_loader):

            iteration = iteration + 1

            self.lr_scheduler.step()

            images = images.to(self.device)

            targets = targets.to(self.device)

            outputs = self.model(images)

            loss_dict = self.criterion(outputs, targets)

            losses = sum(loss for loss in loss_dict.values())

            # reduce losses over all GPUs for logging purposes

            loss_dict_reduced = reduce_loss_dict(loss_dict)

            losses_reduced = sum(loss for loss in loss_dict_reduced.values())

            self.optimizer.zero_grad()

            losses.backward()

            self.optimizer.step()

            eta_seconds = ((time.time() - start_time) / iteration) * (max_iters - iteration)

            eta_string = str(datetime.timedelta(seconds=int(eta_seconds)))

            if iteration % log_per_iters == 0 and save_to_disk:

                logger.info(

                    "Iters: {:d}/{:d} || Lr: {:.6f} || Loss: {:.4f} || Cost Time: {} || Estimated Time: {}".format(

                        iteration, max_iters, self.optimizer.param_groups[0]['lr'], losses_reduced.item(),

                        str(datetime.timedelta(seconds=int(time.time() - start_time))), eta_string))

            if iteration % save_per_iters == 0 and save_to_disk:

                save_checkpoint(self.model, self.args, is_best=False)

            if not self.args.skip_val and iteration % val_per_iters == 0:

                self.validation()

                self.model.train()

        save_checkpoint(self.model, self.args, is_best=False)

        total_training_time = time.time() - start_time

        total_training_str = str(datetime.timedelta(seconds=total_training_time))

        logger.info(

            "Total training time: {} ({:.4f}s / it)".format(

                total_training_str, total_training_time / max_iters))

    def validation(self):

        # total_inter, total_union, total_correct, total_label = 0, 0, 0, 0

        is_best = False

        self.metric.reset()

        if self.args.distributed:

            model = self.model.module

        else:

            model = self.model

        torch.cuda.empty_cache()  # TODO check if it helps

        model.eval()

        for i, (image, target, filename) in enumerate(self.val_loader):

            image = image.to(self.device)

            target = target.to(self.device)

            with torch.no_grad():

                outputs = model(image)

            self.metric.update(outputs[0], target)

            pixAcc, mIoU = self.metric.get()

            logger.info("Sample: {:d}, Validation pixAcc: {:.3f}, mIoU: {:.3f}".format(i + 1, pixAcc, mIoU))

        new_pred = (pixAcc + mIoU) / 2

        if new_pred > self.best_pred:

            is_best = True

            self.best_pred = new_pred

        save_checkpoint(self.model, self.args, is_best)

        synchronize()

def save_checkpoint(model, args, is_best=False):

    """Save Checkpoint"""

    directory = os.path.expanduser(args.save_dir)

    if not os.path.exists(directory):

        os.makedirs(directory)

    filename = '{}_{}_{}.pth'.format(args.model, args.backbone, args.dataset)

    filename = os.path.join(directory, filename)

    if args.distributed:

        model = model.module

    torch.save(model.state_dict(), filename)

    if is_best:

        best_filename = '{}_{}_{}_best_model.pth'.format(args.model, args.backbone, args.dataset)

        best_filename = os.path.join(directory, best_filename)

        shutil.copyfile(filename, best_filename)

if __name__ == '__main__':

    args = parse_args()

    # reference maskrcnn-benchmark

    num_gpus = int(os.environ["WORLD_SIZE"]) if "WORLD_SIZE" in os.environ else 1

    args.num_gpus = num_gpus

    args.distributed = num_gpus > 1

    if not args.no_cuda and torch.cuda.is_available():

        cudnn.benchmark = True

        args.device = "cuda"

    else:

        args.distributed = False

        args.device = "cpu"

    if args.distributed:

        torch.cuda.set_device(args.local_rank)

        torch.distributed.init_process_group(backend="nccl", init_method="env://")

        synchronize()

    args.lr = args.lr * num_gpus

    logger = setup_logger("semantic_segmentation", args.log_dir, get_rank(), filename='{}_{}_{}_log.txt'.format(

        args.model, args.backbone, args.dataset))

    logger.info("Using {} GPUs".format(num_gpus))

    logger.info(args)

    trainer = Trainer(args)

    trainer.train()

    torch.cuda.empty_cache()

2: Evaluation code

from __future__ import print_function

import os

import sys

cur_path = os.path.abspath(os.path.dirname(__file__))

root_path = os.path.split(cur_path)[0]

sys.path.append(root_path)

import torch

import torch.nn as nn

import torch.utils.data as data

import torch.backends.cudnn as cudnn

from torchvision import transforms

from core.data.dataloader import get_segmentation_dataset

from core.models.model_zoo import get_segmentation_model

from core.utils.score import SegmentationMetric

from core.utils.visualize import get_color_pallete

from core.utils.logger import setup_logger

from core.utils.distributed import synchronize, get_rank, make_data_sampler, make_batch_data_sampler

from train import parse_args

class Evaluator(object):

    def __init__(self, args):

        self.args = args

        self.device = torch.device(args.device)

        # image transform

        input_transform = transforms.Compose([

            transforms.ToTensor(),

            transforms.Normalize([.485, .456, .406], [.229, .224, .225]),

        ])

        # dataset and dataloader

        val_dataset = get_segmentation_dataset(args.dataset, split='val', mode='testval', transform=input_transform)

        val_sampler = make_data_sampler(val_dataset, False, args.distributed)

        val_batch_sampler = make_batch_data_sampler(val_sampler, images_per_batch=1)

        self.val_loader = data.DataLoader(dataset=val_dataset,

                                          batch_sampler=val_batch_sampler,

                                          num_workers=args.workers,

                                          pin_memory=True)

        # create network

        BatchNorm2d = nn.SyncBatchNorm if args.distributed else nn.BatchNorm2d

        self.model = get_segmentation_model(model=args.model, dataset=args.dataset, backbone=args.backbone,

                                            aux=args.aux, pretrained=True, pretrained_base=False,

                                            local_rank=args.local_rank,

                                            norm_layer=BatchNorm2d).to(self.device)

        if args.distributed:

            self.model = nn.parallel.DistributedDataParallel(self.model,

                device_ids=[args.local_rank], output_device=args.local_rank)

        self.model.to(self.device)

        self.metric = SegmentationMetric(val_dataset.num_class)

    def eval(self):

        self.metric.reset()

        self.model.eval()

        if self.args.distributed:

            model = self.model.module

        else:

            model = self.model

        logger.info("Start validation, Total sample: {:d}".format(len(self.val_loader)))

        for i, (image, target, filename) in enumerate(self.val_loader):

            image = image.to(self.device)

            target = target.to(self.device)

            with torch.no_grad():

                outputs = model(image)

            self.metric.update(outputs[0], target)

            pixAcc, mIoU = self.metric.get()

            logger.info("Sample: {:d}, validation pixAcc: {:.3f}, mIoU: {:.3f}".format(

                i + 1, pixAcc * 100, mIoU * 100))

            if self.args.save_pred:

                pred = torch.argmax(outputs[0], 1)

                pred = pred.cpu().data.numpy()

                predict = pred.squeeze(0)

                mask = get_color_pallete(predict, self.args.dataset)

                mask.save(os.path.join(outdir, os.path.splitext(filename[0])[0] + '.png'))

        synchronize()

if __name__ == '__main__':

    args = parse_args()

    num_gpus = int(os.environ["WORLD_SIZE"]) if "WORLD_SIZE" in os.environ else 1

    args.distributed = num_gpus > 1

    if not args.no_cuda and torch.cuda.is_available():

        cudnn.benchmark = True

        args.device = "cuda"

    else:

        args.distributed = False

        args.device = "cpu"

    if args.distributed:

        torch.cuda.set_device(args.local_rank)

        torch.distributed.init_process_group(backend="nccl", init_method="env://")

        synchronize()

    # TODO: optim code

    args.save_pred = True

    if args.save_pred:

        outdir = '../runs/pred_pic/{}_{}_{}'.format(args.model, args.backbone, args.dataset)

        if not os.path.exists(outdir):

            os.makedirs(outdir)

    logger = setup_logger("semantic_segmentation", args.log_dir, get_rank(),

                          filename='{}_{}_{}_log.txt'.format(args.model, args.backbone, args.dataset), mode='a+')

    evaluator = Evaluator(args)

    evaluator.eval()

    torch.cuda.empty_cache()

3: Prediction (demo) code

import os

import sys

import argparse

import torch

cur_path = os.path.abspath(os.path.dirname(__file__))

root_path = os.path.split(cur_path)[0]

sys.path.append(root_path)

from torchvision import transforms

from PIL import Image

from core.utils.visualize import get_color_pallete

from core.models import get_model

parser = argparse.ArgumentParser(

    description='Predict segmentation result from a given image')

parser.add_argument('--model', type=str, default='fcn32s_vgg16_voc',

                    help='model name (default: fcn32_vgg16)')

parser.add_argument('--dataset', type=str, default='pascal_aug', choices=['pascal_voc', 'pascal_aug', 'ade20k', 'citys'],

                    help='dataset name (default: pascal_voc)')

parser.add_argument('--save-folder', default='~/.torch/models',

                    help='Directory for saving checkpoint models')

parser.add_argument('--input-pic', type=str, default='../datasets/voc/VOC2012/JPEGImages/2007_000032.jpg',

                    help='path to the input picture')

parser.add_argument('--outdir', default='./eval', type=str,

                    help='path to save the predict result')

parser.add_argument('--local_rank', type=int, default=0)

args = parser.parse_args()

def demo(config):

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    # output folder

    if not os.path.exists(config.outdir):

        os.makedirs(config.outdir)

    # image transform

    transform = transforms.Compose([

        transforms.ToTensor(),

        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),

    ])

    image = Image.open(config.input_pic).convert('RGB')

    images = transform(image).unsqueeze(0).to(device)

    model = get_model(args.model, local_rank=args.local_rank, pretrained=True, root=args.save_folder).to(device)

    print('Finished loading model!')

    model.eval()

    with torch.no_grad():

        output = model(images)

    pred = torch.argmax(output[0], 1).squeeze(0).cpu().data.numpy()

    mask = get_color_pallete(pred, args.dataset)

    outname = os.path.splitext(os.path.split(args.input_pic)[-1])[0] + '.png'

    mask.save(os.path.join(args.outdir, outname))

if __name__ == '__main__':

    demo(args)

4: Code to compare the segmentation result with the original image

from PIL import Image

def merge_images(original_image_path, segmentation_image_path, output_image_path):

    # Open original and segmentation images

    original_image = Image.open(original_image_path)

    segmentation_image = Image.open(segmentation_image_path)

    # Resize segmentation image if needed to match original image size

    if original_image.size != segmentation_image.size:

        segmentation_image = segmentation_image.resize(original_image.size, Image.NEAREST)

    # Create a new image with same width and double height

    merged_image = Image.new('RGB', (original_image.width, original_image.height * 2))

    # Paste original image at the top

    merged_image.paste(original_image, (0, 0))

    # Paste segmentation image at the bottom

    merged_image.paste(segmentation_image, (0, original_image.height))

    # Save the merged image

    merged_image.save(output_image_path)

original = r"/home/ubuntu/awesome-semantic-segmentation-pytorch/tests/sheep.jpg"

segmentation = r"/home/ubuntu/awesome-semantic-segmentation-pytorch/scripts/eval/sheep.png"

merge_images(original, segmentation, '/home/ubuntu/zzz-merged.jpg')

5: FCN network architecture code

import os

import torch

import torch.nn as nn

import torch.nn.functional as F

from .base_models.vgg import vgg16

__all__ = ['get_fcn32s', 'get_fcn16s', 'get_fcn8s',

           'get_fcn32s_vgg16_voc', 'get_fcn16s_vgg16_voc', 'get_fcn8s_vgg16_voc']

class FCN32s(nn.Module):

    """There are some difference from original fcn"""

    def __init__(self, nclass, backbone='vgg16', aux=False, pretrained_base=True,

                 norm_layer=nn.BatchNorm2d, **kwargs):

        super(FCN32s, self).__init__()

        self.aux = aux

        if backbone == 'vgg16':

            self.pretrained = vgg16(pretrained=pretrained_base).features

        else:

            raise RuntimeError('unknown backbone: {}'.format(backbone))

        self.head = _FCNHead(512, nclass, norm_layer)

        if aux:

            self.auxlayer = _FCNHead(512, nclass, norm_layer)

        self.__setattr__('exclusive', ['head', 'auxlayer'] if aux else ['head'])

    def forward(self, x):

        size = x.size()[2:]

        pool5 = self.pretrained(x)

        outputs = []

        out = self.head(pool5)

        out = F.interpolate(out, size, mode='bilinear', align_corners=True)

        outputs.append(out)

        if self.aux:

            auxout = self.auxlayer(pool5)

            auxout = F.interpolate(auxout, size, mode='bilinear', align_corners=True)

            outputs.append(auxout)

        return tuple(outputs)

class FCN16s(nn.Module):

    def __init__(self, nclass, backbone='vgg16', aux=False, pretrained_base=True, norm_layer=nn.BatchNorm2d, **kwargs):

        super(FCN16s, self).__init__()

        self.aux = aux

        if backbone == 'vgg16':

            self.pretrained = vgg16(pretrained=pretrained_base).features

        else:

            raise RuntimeError('unknown backbone: {}'.format(backbone))

        self.pool4 = nn.Sequential(*self.pretrained[:24])

        self.pool5 = nn.Sequential(*self.pretrained[24:])

        self.head = _FCNHead(512, nclass, norm_layer)

        self.score_pool4 = nn.Conv2d(512, nclass, 1)

        if aux:

            self.auxlayer = _FCNHead(512, nclass, norm_layer)

        self.__setattr__('exclusive', ['head', 'score_pool4', 'auxlayer'] if aux else ['head', 'score_pool4'])

    def forward(self, x):

        pool4 = self.pool4(x)

        pool5 = self.pool5(pool4)

        outputs = []

        score_fr = self.head(pool5)

        score_pool4 = self.score_pool4(pool4)

        upscore2 = F.interpolate(score_fr, score_pool4.size()[2:], mode='bilinear', align_corners=True)

        fuse_pool4 = upscore2 + score_pool4

        out = F.interpolate(fuse_pool4, x.size()[2:], mode='bilinear', align_corners=True)

        outputs.append(out)

        if self.aux:

            auxout = self.auxlayer(pool5)

            auxout = F.interpolate(auxout, x.size()[2:], mode='bilinear', align_corners=True)

            outputs.append(auxout)

        return tuple(outputs)

class FCN8s(nn.Module):

    def __init__(self, nclass, backbone='vgg16', aux=False, pretrained_base=True, norm_layer=nn.BatchNorm2d, **kwargs):

        super(FCN8s, self).__init__()

        self.aux = aux

        if backbone == 'vgg16':

            self.pretrained = vgg16(pretrained=pretrained_base).features

        else:

            raise RuntimeError('unknown backbone: {}'.format(backbone))

        self.pool3 = nn.Sequential(*self.pretrained[:17])

        self.pool4 = nn.Sequential(*self.pretrained[17:24])

        self.pool5 = nn.Sequential(*self.pretrained[24:])

        self.head = _FCNHead(512, nclass, norm_layer)

        self.score_pool3 = nn.Conv2d(256, nclass, 1)

        self.score_pool4 = nn.Conv2d(512, nclass, 1)

        if aux:

            self.auxlayer = _FCNHead(512, nclass, norm_layer)

        self.__setattr__('exclusive',

                         ['head', 'score_pool3', 'score_pool4', 'auxlayer'] if aux else ['head', 'score_pool3',

                                                                                         'score_pool4'])

    def forward(self, x):

        pool3 = self.pool3(x)

        pool4 = self.pool4(pool3)

        pool5 = self.pool5(pool4)

        outputs = []

        score_fr = self.head(pool5)

        score_pool4 = self.score_pool4(pool4)

        score_pool3 = self.score_pool3(pool3)

        upscore2 = F.interpolate(score_fr, score_pool4.size()[2:], mode='bilinear', align_corners=True)

        fuse_pool4 = upscore2 + score_pool4

        upscore_pool4 = F.interpolate(fuse_pool4, score_pool3.size()[2:], mode='bilinear', align_corners=True)

        fuse_pool3 = upscore_pool4 + score_pool3

        out = F.interpolate(fuse_pool3, x.size()[2:], mode='bilinear', align_corners=True)

        outputs.append(out)

        if self.aux:

            auxout = self.auxlayer(pool5)

            auxout = F.interpolate(auxout, x.size()[2:], mode='bilinear', align_corners=True)

            outputs.append(auxout)

        return tuple(outputs)

class _FCNHead(nn.Module):

    def __init__(self, in_channels, channels, norm_layer=nn.BatchNorm2d, **kwargs):

        super(_FCNHead, self).__init__()

        inter_channels = in_channels // 4

        self.block = nn.Sequential(

            nn.Conv2d(in_channels, inter_channels, 3, padding=1, bias=False),

            norm_layer(inter_channels),

            nn.ReLU(inplace=True),

            nn.Dropout(0.1),

            nn.Conv2d(inter_channels, channels, 1)

        )

    def forward(self, x):

        return self.block(x)

def get_fcn32s(dataset='pascal_voc', backbone='vgg16', pretrained=False, root='~/.torch/models',

               pretrained_base=True, **kwargs):

    acronyms = {

        'pascal_voc': 'pascal_voc',

        'pascal_aug': 'pascal_aug',

        'ade20k': 'ade',

        'coco': 'coco',

        'citys': 'citys',

        'sbu': 'sbu',

    }

    from ..data.dataloader import datasets

    model = FCN32s(datasets[dataset].NUM_CLASS, backbone=backbone, pretrained_base=pretrained_base, **kwargs)

    if pretrained:

        from .model_store import get_model_file

        device = torch.device(kwargs['local_rank'])

        model.load_state_dict(torch.load(get_model_file('fcn32s_%s_%s' % (backbone, acronyms[dataset]), root=root),

                              map_location=device))

    return model

def get_fcn16s(dataset='pascal_voc', backbone='vgg16', pretrained=False, root='~/.torch/models',

               pretrained_base=True, **kwargs):

    acronyms = {

        'pascal_voc': 'pascal_voc',

        'pascal_aug': 'pascal_aug',

        'ade20k': 'ade',

        'coco': 'coco',

        'citys': 'citys',

        'sbu': 'sbu',

    }

    from ..data.dataloader import datasets

    model = FCN16s(datasets[dataset].NUM_CLASS, backbone=backbone, pretrained_base=pretrained_base, **kwargs)

    if pretrained:

        from .model_store import get_model_file

        device = torch.device(kwargs['local_rank'])

        model.load_state_dict(torch.load(get_model_file('fcn16s_%s_%s' % (backbone, acronyms[dataset]), root=root),

                              map_location=device))

    return model

def get_fcn8s(dataset='pascal_voc', backbone='vgg16', pretrained=False, root='~/.torch/models',

              pretrained_base=True, **kwargs):

    acronyms = {

        'pascal_voc': 'pascal_voc',

        'pascal_aug': 'pascal_aug',

        'ade20k': 'ade',

        'coco': 'coco',

        'citys': 'citys',

        'sbu': 'sbu',

    }

    from ..data.dataloader import datasets

    model = FCN8s(datasets[dataset].NUM_CLASS, backbone=backbone, pretrained_base=pretrained_base, **kwargs)

    if pretrained:

        from .model_store import get_model_file

        device = torch.device(kwargs['local_rank'])

        model.load_state_dict(torch.load(get_model_file('fcn8s_%s_%s' % (backbone, acronyms[dataset]), root=root),

                              map_location=device))

    return model

def get_fcn32s_vgg16_voc(**kwargs):

    return get_fcn32s('pascal_voc', 'vgg16', **kwargs)

def get_fcn16s_vgg16_voc(**kwargs):

    return get_fcn16s('pascal_voc', 'vgg16', **kwargs)

def get_fcn8s_vgg16_voc(**kwargs):

    return get_fcn8s('pascal_voc', 'vgg16', **kwargs)

if __name__ == '__main__':

    model = FCN16s(21)

    print(model)

6: VGG backbone code

import torch

import torch.nn as nn

import torch.utils.model_zoo as model_zoo

__all__ = [

    'VGG', 'vgg11', 'vgg11_bn', 'vgg13', 'vgg13_bn', 'vgg16', 'vgg16_bn',

    'vgg19_bn', 'vgg19',

]

model_urls = {

    'vgg11': 'https://download.pytorch.org/models/vgg11-bbd30ac9.pth',

    'vgg13': 'https://download.pytorch.org/models/vgg13-c768596a.pth',

    'vgg16': 'https://download.pytorch.org/models/vgg16-397923af.pth',

    'vgg19': 'https://download.pytorch.org/models/vgg19-dcbb9e9d.pth',

    'vgg11_bn': 'https://download.pytorch.org/models/vgg11_bn-6002323d.pth',

    'vgg13_bn': 'https://download.pytorch.org/models/vgg13_bn-abd245e5.pth',

    'vgg16_bn': 'https://download.pytorch.org/models/vgg16_bn-6c64b313.pth',

    'vgg19_bn': 'https://download.pytorch.org/models/vgg19_bn-c79401a0.pth',

}

class VGG(nn.Module):

    def __init__(self, features, num_classes=1000, init_weights=True):

        super(VGG, self).__init__()

        self.features = features

        self.avgpool = nn.AdaptiveAvgPool2d((7, 7))

        self.classifier = nn.Sequential(

            nn.Linear(512 * 7 * 7, 4096),

            nn.ReLU(True),

            nn.Dropout(),

            nn.Linear(4096, 4096),

            nn.ReLU(True),

            nn.Dropout(),

            nn.Linear(4096, num_classes)

        )

        if init_weights:

            self._initialize_weights()

    def forward(self, x):

        x = self.features(x)

        x = self.avgpool(x)

        x = x.view(x.size(0), -1)

        x = self.classifier(x)

        return x

    def _initialize_weights(self):

        for m in self.modules():

            if isinstance(m, nn.Conv2d):

                nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')

                if m.bias is not None:

                    nn.init.constant_(m.bias, 0)

            elif isinstance(m, nn.BatchNorm2d):

                nn.init.constant_(m.weight, 1)

                nn.init.constant_(m.bias, 0)

            elif isinstance(m, nn.Linear):

                nn.init.normal_(m.weight, 0, 0.01)

                nn.init.constant_(m.bias, 0)

def make_layers(cfg, batch_norm=False):

    layers = []

    in_channels = 3

    for v in cfg:

        if v == 'M':

            layers += [nn.MaxPool2d(kernel_size=2, stride=2)]

        else:

            conv2d = nn.Conv2d(in_channels, v, kernel_size=3, padding=1)

            if batch_norm:

                layers += (conv2d, nn.BatchNorm2d(v), nn.ReLU(inplace=True))

            else:

                layers += [conv2d, nn.ReLU(inplace=True)]

            in_channels = v

    return nn.Sequential(*layers)

cfg = {

    'A': [64, 'M', 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512, 'M'],

    'B': [64, 64, 'M', 128, 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512, 'M'],

    'D': [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 'M', 512, 512, 512, 'M', 512, 512, 512, 'M'],

    'E': [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 256, 'M', 512, 512, 512, 512, 'M', 512, 512, 512, 512, 'M'],

}

def vgg11(pretrained=False, **kwargs):

    """VGG 11-layer model (configuration "A")

    Args:

        pretrained (bool): If True, returns a model pre-trained on ImageNet

    """

    if pretrained:

        kwargs['init_weights'] = False

    model = VGG(make_layers(cfg['A']), **kwargs)

    if pretrained:

        model.load_state_dict(model_zoo.load_url(model_urls['vgg11']))

    return model

def vgg11_bn(pretrained=False, **kwargs):

    """VGG 11-layer model (configuration "A") with batch normalization

    Args:

        pretrained (bool): If True, returns a model pre-trained on ImageNet

    """

    if pretrained:

        kwargs['init_weights'] = False

    model = VGG(make_layers(cfg['A'], batch_norm=True), **kwargs)

    if pretrained:

        model.load_state_dict(model_zoo.load_url(model_urls['vgg11_bn']))

    return model

def vgg13(pretrained=False, **kwargs):

    """VGG 13-layer model (configuration "B")

    Args:

        pretrained (bool): If True, returns a model pre-trained on ImageNet

    """

    if pretrained:

        kwargs['init_weights'] = False

    model = VGG(make_layers(cfg['B']), **kwargs)

    if pretrained:

        model.load_state_dict(model_zoo.load_url(model_urls['vgg13']))

    return model

def vgg13_bn(pretrained=False, **kwargs):

    """VGG 13-layer model (configuration "B") with batch normalization

    Args:

        pretrained (bool): If True, returns a model pre-trained on ImageNet

    """

    if pretrained:

        kwargs['init_weights'] = False

    model = VGG(make_layers(cfg['B'], batch_norm=True), **kwargs)

    if pretrained:

        model.load_state_dict(model_zoo.load_url(model_urls['vgg13_bn']))

    return model

def vgg16(pretrained=False, **kwargs):

    """VGG 16-layer model (configuration "D")

    Args:

        pretrained (bool): If True, returns a model pre-trained on ImageNet

    """

    if pretrained:

        kwargs['init_weights'] = False

    model = VGG(make_layers(cfg['D']), **kwargs)

    if pretrained:

        model.load_state_dict(model_zoo.load_url(model_urls['vgg16']))

    return model

def vgg16_bn(pretrained=False, **kwargs):

    """VGG 16-layer model (configuration "D") with batch normalization

    Args:

        pretrained (bool): If True, returns a model pre-trained on ImageNet

    """

    if pretrained:

        kwargs['init_weights'] = False

    model = VGG(make_layers(cfg['D'], batch_norm=True), **kwargs)

    if pretrained:

        model.load_state_dict(model_zoo.load_url(model_urls['vgg16_bn']))

    return model

def vgg19(pretrained=False, **kwargs):

    """VGG 19-layer model (configuration "E")

    Args:

        pretrained (bool): If True, returns a model pre-trained on ImageNet

    """

    if pretrained:

        kwargs['init_weights'] = False

    model = VGG(make_layers(cfg['E']), **kwargs)

    if pretrained:

        model.load_state_dict(model_zoo.load_url(model_urls['vgg19']))

    return model

def vgg19_bn(pretrained=False, **kwargs):

    """VGG 19-layer model (configuration 'E') with batch normalization

    Args:

        pretrained (bool): If True, returns a model pre-trained on ImageNet

    """

    if pretrained:

        kwargs['init_weights'] = False

    model = VGG(make_layers(cfg['E'], batch_norm=True), **kwargs)

    if pretrained:

        model.load_state_dict(model_zoo.load_url(model_urls['vgg19_bn']))

    return model

if __name__ == '__main__':

    img = torch.randn((4, 3, 480, 480))

    model = vgg16(pretrained=False)

    out = model(img)

7: Training log visualization code

import re

import matplotlib.pyplot as plt

# Read the log file

log_file_path = r"awesome-semantic-segmentation-pytorch/runs/logs/fcn32s_vgg16_pascal_voc_log.txt"

with open(log_file_path, 'r') as f:

    log_data = f.readlines()

# Parse the log contents

pixAcc_data = []

mIoU_data = []

iteration_data = []

lr_data = []

loss_data = []

for line in log_data:

    match = re.search(r'Validation pixAcc: (\d+\.\d+), mIoU: (\d+\.\d+)', line)

    if match:

        pixAcc_data.append(float(match.group(1)))

        mIoU_data.append(float(match.group(2)))

    re_match = re.search(r'Iters: (\d+)/\d+ \|\| Lr: (\d+\.\d+) \|\| Loss: (\d+\.\d+)', line)

    if re_match:

        iteration_data.append(int(re_match.group(1)))

        lr_data.append(float(re_match.group(2)))

        loss_data.append(float(re_match.group(3)))

# Plot the curves

iterations = range(1, len(pixAcc_data) + 1)

# Pixel accuracy plot

plt.figure(figsize=(10, 5))

plt.plot(iterations, pixAcc_data, label='Pixel Accuracy', marker='o')

plt.xlabel('Iterations')

plt.ylabel('Pixel Accuracy')

plt.title('Pixel Accuracy')

plt.legend()

plt.grid(True)

plt.savefig(r"/home/ubuntu/zzz-pixAcc.jpg")

# mIoU plot

plt.figure(figsize=(10, 5))

plt.plot(iterations, mIoU_data, label='mIoU', marker='o')

plt.xlabel('Iterations')

plt.ylabel('mIoU')

plt.title('Mean IoU')

plt.legend()

plt.grid(True)

plt.savefig(r"/home/ubuntu/zzz-mIoU.jpg")  

# Learning rate plot

plt.figure(figsize=(10, 5))

plt.plot(iteration_data, lr_data, label='Learning Rate', marker='o')

plt.xlabel('Iterations')

plt.ylabel('Learning Rate')

plt.title('Learning Rate')

plt.legend()

plt.grid(True)

plt.savefig(r"/home/ubuntu/zzz-lr.jpg")

# Loss plot

plt.figure(figsize=(10, 5))

plt.plot(iteration_data, loss_data, label='Loss', marker='o')

plt.xlabel('Iterations')

plt.ylabel('Loss')

plt.title('Loss')

plt.legend()

plt.grid(True)

plt.savefig(r"/home/ubuntu/zzz-loss.jpg")

print("Finished!")

8: Prediction code using the official PyTorch model (torch.hub)

import torch

model = torch.hub.load('pytorch/vision:v0.10.0', 'fcn_resnet50', pretrained=True)

model.eval()

# Use a local example image (no download is needed, so urllib from the original hub example is not imported)

filename = r"/home/ubuntu/awesome-semantic-segmentation-pytorch/tests/sheep.jpg"

# sample execution (requires torchvision)

from PIL import Image

from torchvision import transforms

input_image = Image.open(filename)

input_image = input_image.convert("RGB")

preprocess = transforms.Compose([

    transforms.ToTensor(),

    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),

])

input_tensor = preprocess(input_image)

input_batch = input_tensor.unsqueeze(0) # create a mini-batch as expected by the model

# move the input and model to GPU for speed if available

if torch.cuda.is_available():

    input_batch = input_batch.to('cuda')

    model.to('cuda')

with torch.no_grad():

    output = model(input_batch)['out'][0]

output_predictions = output.argmax(0)

# create a color pallette, selecting a color for each class

palette = torch.tensor([2 ** 25 - 1, 2 ** 15 - 1, 2 ** 21 - 1])

colors = torch.as_tensor([i for i in range(21)])[:, None] * palette

colors = (colors % 255).numpy().astype("uint8")

# plot the semantic segmentation predictions of 21 classes in each color

r = Image.fromarray(output_predictions.byte().cpu().numpy()).resize(input_image.size)

r.putpalette(colors)

import matplotlib.pyplot as plt

r.save('zzz-semantic_segmentation_result.png')
