3-unet-camvid

U-Net image segmentation

Import packages and the dataset

%reload_ext autoreload
%autoreload 2
%matplotlib inline
from fastai.vision import *
from fastai.callbacks.hooks import *
from fastai.utils.mem import *
path = untar_data(URLs.CAMVID)
path.ls()
[WindowsPath('C:/Users/Wither8848/.fastai/data/camvid/codes.txt'),
 WindowsPath('C:/Users/Wither8848/.fastai/data/camvid/images'),
 WindowsPath('C:/Users/Wither8848/.fastai/data/camvid/labels'),
 WindowsPath('C:/Users/Wither8848/.fastai/data/camvid/valid.txt')]
path_lbl = path/'labels'
path_img = path/'images'

Inspect the dataset

View an image

fnames = get_image_files(path_img)
fnames[:3]
[WindowsPath('C:/Users/Wither8848/.fastai/data/camvid/images/0001TP_006690.png'),
 WindowsPath('C:/Users/Wither8848/.fastai/data/camvid/images/0001TP_006720.png'),
 WindowsPath('C:/Users/Wither8848/.fastai/data/camvid/images/0001TP_006750.png')]
img_f = fnames[0]
img = open_image(img_f)
img.show(figsize=(5,5))

[Image: output_9_0.png]

View the labels

lbl_names = get_image_files(path_lbl)
lbl_names[:3]
[WindowsPath('C:/Users/Wither8848/.fastai/data/camvid/labels/0001TP_006690_P.png'),
 WindowsPath('C:/Users/Wither8848/.fastai/data/camvid/labels/0001TP_006720_P.png'),
 WindowsPath('C:/Users/Wither8848/.fastai/data/camvid/labels/0001TP_006750_P.png')]
print(fnames[0].suffix)
fnames[0].stem
.png
'0001TP_006690'
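  • Each label file is named after its image with '_P' appended to the stem, so a small function can locate the label for any image automatically.
  1. f'...' is a formatted string literal (f-string), similar to % formatting.
  2. On a Path object, .stem gives the file name without its extension and .suffix gives the extension.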
get_y_fn = lambda x: path_lbl/f'{x.stem}_P{x.suffix}'
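The naming rule can be checked without fastai or the dataset on disk. The following is a minimal sketch using only pathlib; the directory names and the helper label_path_for are hypothetical and chosen for illustration.

```python
from pathlib import PurePosixPath

# Hypothetical directory and file names, used only to illustrate the rule.
labels_dir = PurePosixPath("camvid/labels")
image_path = PurePosixPath("camvid/images/0001TP_006690.png")

def label_path_for(image_path, labels_dir):
    """Return the label path: same stem plus '_P', same suffix."""
    return labels_dir / f"{image_path.stem}_P{image_path.suffix}"

label_path = label_path_for(image_path, labels_dir)
print(image_path.stem)    # 0001TP_006690
print(image_path.suffix)  # .png
print(label_path)         # camvid/labels/0001TP_006690_P.png
```

PurePosixPath is used here so the printed path looks the same on every operating system; the notebook itself works with real Path objects.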
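  • open_mask is used here because the label is a segmentation mask (one integer class code per pixel) rather than an ordinary image; opened as a regular image it would be hard to make out.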
mask = open_mask(get_y_fn(img_f))
mask.show(figsize=(5,5), alpha=1)

[Image: output_17_0.png]

src_size = np.array(mask.shape[1:])
src_size,mask.data
(array([720, 960]),
 tensor([[[ 4,  4,  4,  ...,  4,  4,  4],
          [ 4,  4,  4,  ...,  4,  4,  4],
          [ 4,  4,  4,  ...,  4,  4,  4],
          ...,
          [19, 19, 19,  ..., 30, 30, 30],
          [19, 19, 19,  ..., 30, 30, 30],
          [19, 19, 19,  ..., 30, 30, 30]]]))
  • The mapping from mask values to class names is stored in codes.txt; for example, 4 corresponds to Building.
codes = np.loadtxt(path/'codes.txt', dtype=str); codes
array(['Animal', 'Archway', 'Bicyclist', 'Bridge', 'Building', 'Car', 'CartLuggagePram', 'Child', 'Column_Pole',
       'Fence', 'LaneMkgsDriv', 'LaneMkgsNonDriv', 'Misc_Text', 'MotorcycleScooter', 'OtherMoving', 'ParkingBlock',
       'Pedestrian', 'Road', 'RoadShoulder', 'Sidewalk', 'SignSymbol', 'Sky', 'SUVPickupTruck', 'TrafficCone',
       'TrafficLight', 'Train', 'Tree', 'Truck_Bus', 'Tunnel', 'VegetationMisc', 'Void', 'Wall'], dtype='<U17')
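To make the mapping concrete, here is a small sketch that decodes a toy mask into class names. The class list is copied from the output above; the toy mask values are made up for illustration and do not come from a real label file.

```python
# Class names copied from the codes.txt output above; the list index is the mask value.
codes = ['Animal', 'Archway', 'Bicyclist', 'Bridge', 'Building', 'Car',
         'CartLuggagePram', 'Child', 'Column_Pole', 'Fence', 'LaneMkgsDriv',
         'LaneMkgsNonDriv', 'Misc_Text', 'MotorcycleScooter', 'OtherMoving',
         'ParkingBlock', 'Pedestrian', 'Road', 'RoadShoulder', 'Sidewalk',
         'SignSymbol', 'Sky', 'SUVPickupTruck', 'TrafficCone', 'TrafficLight',
         'Train', 'Tree', 'Truck_Bus', 'Tunnel', 'VegetationMisc', 'Void', 'Wall']

# A toy 2x3 "mask" with hypothetical values.
toy_mask = [[4, 4, 21],
            [19, 17, 30]]

# Replace every integer code with its class name.
decoded = [[codes[value] for value in row] for row in toy_mask]
for row in decoded:
    print(row)
```

The values 4, 19 and 30 that appear in the mask tensor above therefore read as Building, Sidewalk and Void.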

Build the dataset

size = src_size//2

free = gpu_mem_get_free_no_cache()
# the max size of bs depends on the available GPU RAM
if free > 8200: bs=8
else:           bs=4
print(f"using bs={bs}, have {free}MB of GPU RAM free")
using bs=4, have 6813MB of GPU RAM free
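  • A segmentation item list is used here. The validation set is taken from the provided valid.txt: the dataset comes from video, so a fixed list keeps the validation frames from being consecutive with the training frames.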
src = (SegmentationItemList.from_folder(path_img)
       .split_by_fname_file('../valid.txt')
       .label_from_func(get_y_fn, classes=codes))
data = (src.transform(get_transforms(), size=size, tfm_y=True)
        .databunch(bs=bs)
        .normalize(imagenet_stats))
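The idea behind splitting by a file of names can be sketched in plain Python. This is an illustration of the principle only, not fastai's implementation; the helper split_by_name_list and all file names are hypothetical.

```python
def split_by_name_list(filenames, valid_names):
    """Partition filenames into (train, valid) by membership in valid_names."""
    valid_set = set(valid_names)
    train = [f for f in filenames if f not in valid_set]
    valid = [f for f in filenames if f in valid_set]
    return train, valid

# Hypothetical frame names from two video sequences.
frames = ["seqA_0001.png", "seqA_0002.png", "seqA_0003.png",
          "seqB_0001.png", "seqB_0002.png"]
# Hypothetical contents of a validation list: one whole sequence is held out.
valid_list = ["seqB_0001.png", "seqB_0002.png"]

train_files, valid_files = split_by_name_list(frames, valid_list)
print(train_files)
print(valid_files)
```

With a random split, neighbouring frames that look almost identical could land on both sides, which would make the validation score look better than it should.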
  • View a batch from the dataset
data.show_batch(2, figsize=(10,7))

[Image: output_27_0.png]

Build the model

name2id = {v:k for k,v in enumerate(codes)}
void_code = name2id['Void']

def acc_camvid(input, target):
    target = target.squeeze(1)
    mask = target != void_code
    return (input.argmax(dim=1)[mask]==target[mask]).float().mean()
metrics=acc_camvid
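What acc_camvid measures can be restated on a toy example without tensors. The sketch below assumes the predictions have already been reduced to one class id per pixel (the argmax step) and flattened into lists; the helper masked_accuracy and the pixel values are hypothetical.

```python
def masked_accuracy(pred_classes, target_classes, ignore_code):
    """Fraction of correct predictions among pixels whose target is not ignore_code."""
    kept = [(p, t) for p, t in zip(pred_classes, target_classes) if t != ignore_code]
    correct = sum(1 for p, t in kept if p == t)
    return correct / len(kept)

void = 30  # index of 'Void' in codes.txt
# Hypothetical flattened pixels: two of the six targets are Void and get skipped.
target = [4, 4, 30, 17, 30, 21]
pred   = [4, 5,  4, 17, 21, 21]

accuracy = masked_accuracy(pred, target, void)
print(accuracy)  # 3 correct out of 4 non-Void pixels
```

Pixels labelled Void carry no usable ground truth, so leaving them out keeps them from counting as errors.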
wd=1e-2
learn = unet_learner(data, models.resnet34, metrics=metrics, wd=wd)
lr_find(learn)
learn.recorder.plot()

[Image: output_31_2.png]

lr=3e-3
learn.fit_one_cycle(10, slice(lr), pct_start=0.9)
epoch  train_loss  valid_loss  acc_camvid  time
0      1.069481    0.792951    0.815857    01:54
1      0.744278    0.600508    0.847082    01:51
2      0.649365    0.559110    0.848990    01:49
3      0.598212    0.453965    0.875559    01:51
4      0.617887    0.492955    0.866971    01:49
5      0.573497    0.563069    0.854965    01:51
6      0.552920    0.494836    0.874791    01:48
7      0.519904    0.403810    0.891915    01:53
8      0.554230    0.464974    0.883040    01:52
9      0.427734    0.338918    0.903289    01:53
learn.save('stage-1')
learn.load('stage-1');
learn.show_results(rows=3, figsize=(8,9))

[Image: output_35_0.png]

lr=3e-3
learn.unfreeze()
lrs = slice(lr/400,lr/4)
learn.fit_one_cycle(12, lrs, pct_start=0.8)
epoch  train_loss  valid_loss  acc_camvid  time
0      0.378398    0.326795    0.906832    02:02
1      0.378235    0.313857    0.910155    01:54
2      0.375283    0.307506    0.911930    01:56
3      0.364699    0.321208    0.909612    01:56
4      0.353212    0.285177    0.920163    01:56
5      0.347013    0.309324    0.912606    01:53
6      0.333858    0.323652    0.905763    01:55
7      0.332646    0.302886    0.914157    01:57
8      0.328257    0.286069    0.922094    01:56
9      0.320706    0.287675    0.922618    01:55
10     0.294463    0.279678    0.922595    01:53
11     0.266878    0.272097    0.925978    01:54
  • The loss curve should rise a little at first and then fall; if it only ever falls, try a slightly higher learning rate.
learn.recorder.plot_losses()
learn.recorder.plot_lr()

[Image: output_38_0.png]

[Image: output_38_1.png]

learn.save('stage-2');

Train further at the original resolution

learn.destroy()

size = src_size

free = gpu_mem_get_free_no_cache()
# the max size of bs depends on the available GPU RAM
if free > 8200: bs=3
else:           bs=1
print(f"using bs={bs}, have {free}MB of GPU RAM free")
this Learner object self-destroyed - it still exists, but no longer usable
using bs=1, have 6027MB of GPU RAM free
data = (src.transform(get_transforms(), size=size, tfm_y=True)
        .databunch(bs=bs)
        .normalize(imagenet_stats))
  • Use 16-bit floating point to speed up computation; the results may even turn out better.
learn = unet_learner(data, models.resnet34, metrics=metrics, wd=wd).to_fp16()
learn.load('stage-2');
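As a rough illustration of what 16-bit floats trade away, the sketch below uses only the standard struct module. It does not reproduce what to_fp16() does during training; it only shows that a half-precision value takes half the storage of a single-precision one and keeps fewer digits. The helper to_half is hypothetical.

```python
import struct

def to_half(x):
    """Round-trip a Python float through IEEE 754 half precision (16 bits)."""
    return struct.unpack('e', struct.pack('e', x))[0]

half_bytes = struct.calcsize('e')    # bytes per 16-bit float
single_bytes = struct.calcsize('f')  # bytes per 32-bit float

print(half_bytes, single_bytes)
print(to_half(1.0))  # exactly representable
print(to_half(0.1))  # rounded to the nearest half-precision value
```

Halving the storage per value is what lets larger inputs or batches fit in the same GPU memory, at the cost of coarser rounding.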
lr_find(learn)
learn.recorder.plot()
LR Finder is complete, type {learner_name}.recorder.plot() to see the graph.

[Image: output_45_2.png]

lr=1e-3
learn.fit_one_cycle(10, slice(lr), pct_start=0.8)
epoch  train_loss  valid_loss  acc_camvid  time
0      0.423464    0.330359    0.911409    04:13
1      0.373553    0.334004    0.907827    04:08
2      0.354978    0.316964    0.912270    04:08
3      0.393460    0.316810    0.917438    04:08
4      0.342301    0.332319    0.917796    04:10
5      0.343624    0.313800    0.919534    04:07
6      0.328565    0.312378    0.909933    04:14
7      0.336849    0.317563    0.913902    04:20
8      0.269040    0.300068    0.918740    04:22
9      0.238536    0.281892    0.923772    04:10
learn.save('stage-1-big')
learn.load('stage-1-big');
learn.unfreeze()
lrs = slice(1e-6,lr/10)
learn.fit_one_cycle(10, lrs)
learn.save('stage-2-big')
epoch  train_loss  valid_loss  acc_camvid  time
0      0.256014    0.284205    0.923452    04:41
1      0.232119    0.283108    0.923677    04:29
2      0.235852    0.288952    0.920897    04:26
3      0.219901    0.261550    0.928312    04:24
4      0.227803    0.264939    0.927502    04:23
5      0.207069    0.275547    0.927656    04:25
6      0.199358    0.262479    0.928366    04:28
7      0.200243    0.259015    0.930449    04:31
8      0.197228    0.250665    0.932849    04:32
9      0.194318    0.265213    0.928627    04:31
learn.load('stage-2-big');
learn.show_results(rows=3, figsize=(10,10))

[Image: output_51_0.png]
