TensorFlow U-Net image segmentation: a beginner's notes on learning semantic segmentation through Kaggle, using the TGS Salt Identification Challenge (segmenting salt deposits below the Earth's surface)

1. Problem overview

tgs-salt-identification-challenge

Several areas on Earth with large accumulations of oil and gas also have huge salt deposits below the surface. Unfortunately, knowing exactly where those salt deposits are is very difficult: experts still have to interpret seismic images and label the salt bodies by hand, and the accuracy of those interpretations can affect the safety of oil and gas company drilling crews. In this competition, participants are asked to locate the salt in each seismic image.

2. EDA

intro-to-seismic-salt-and-how-to-geophysics This notebook introduces the geophysics background and finds that about 39% of the training images have an empty mask.


basic-data-visualization-using-pytorch-dataset briefly displays the images and their corresponding masks using torch's data utilities.


train-dataset-visualization overlays each image with its mask in a single view.


fake-incorrect-training-masks

The analysis suggests that some masks may be labeled incorrectly (rectangular boxes drawn over the salt), but others replied that when the geologists could not make a confident call, they simply drew a rough rectangle as the annotation.


3. Leak

In this competition the organizers also provided the depth of each seismic image, and participants discovered that images at the same depth were actually tiles cut from a larger mosaic, so they reverse-engineered the data and stitched the tiles back together. However, the organizers later removed this leak from the private test set, so the gain was essentially limited to the public leaderboard. The 4th Place Solution used the leak for post-processing, correcting predictions with hand-written rules: about +0.01 on the public LB but only about +0.001 on the private LB.


4. Metric

The metric for this competition is based on IoU (Intersection over Union).

explanation-of-scoring-metric explains the metric in detail.

fast-iou-scoring-metric-in-pytorch-and-numpy and fast-iou-metric-in-numpy-and-tensorflow provide fast IoU implementations.
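
The per-image leaderboard score is not raw IoU but an average of precision over IoU thresholds from 0.5 to 0.95. A small numpy sketch of that reading of the metric (my own illustration, not the code from those kernels):

```python
import numpy as np

def iou_score(pred_mask, true_mask, eps=1e-7):
    """Plain IoU between two binary masks of the same shape."""
    pred_mask = pred_mask.astype(bool)
    true_mask = true_mask.astype(bool)
    intersection = np.logical_and(pred_mask, true_mask).sum()
    union = np.logical_or(pred_mask, true_mask).sum()
    return (intersection + eps) / (union + eps)

def per_image_score(pred_mask, true_mask):
    """Simplified per-image score: fraction of thresholds in 0.5..0.95 that the IoU clears.
    An empty ground truth is scored all-or-nothing against an empty prediction."""
    if true_mask.sum() == 0:
        return float(pred_mask.sum() == 0)
    thresholds = np.arange(0.5, 1.0, 0.05)
    return float(np.mean(iou_score(pred_mask, true_mask) > thresholds))
```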

5. Augmentation

To convert the original 101×101 tiles to the 128×128 network input, some kernels pad and some resize.
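
Either route is a one-liner; a sketch of both (the reflect-padding amounts and interpolation mode are my choices, not prescribed by the kernels):

```python
import numpy as np
import cv2

def to_128_by_pad(img):
    # reflect-pad a 101x101 tile to 128x128 (13 pixels on one side, 14 on the other)
    return np.pad(img, ((13, 14), (13, 14)), mode='reflect')

def to_128_by_resize(img):
    # bilinear resize; predictions must be resized back to 101x101 before scoring
    return cv2.resize(img, (128, 128), interpolation=cv2.INTER_LINEAR)
```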

The set of augmentations that help segmentation tasks seems relatively limited.

u-net-dropout-augmentation-stratification uses flips.

simple-tricks shows that flips can also be used for TTA (test-time augmentation).
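
Flip TTA just means predicting on the mirrored image as well and un-flipping the result before averaging; a minimal Keras-style sketch (assuming `model.predict` maps (N, H, W, 1) images to (N, H, W, 1) mask probabilities):

```python
def predict_with_flip_tta(model, images):
    # average predictions over the original and the horizontally flipped batch
    preds = model.predict(images)
    preds_flipped = model.predict(images[:, :, ::-1, :])
    return (preds + preds_flipped[:, :, ::-1, :]) / 2.0
```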

Augmentation that works

Heng (蛙神) demonstrates the right way to do random crops.

1st Place Solution

  • HorizontalFlip(p=0.5)
  • RandomBrightness(p=0.2, limit=0.2)
  • RandomContrast(p=0.1, limit=0.2)
  • ShiftScaleRotate(shift_limit=0.1625, scale_limit=0.6, rotate_limit=0, p=0.7)
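
This list maps almost one-to-one onto an albumentations pipeline. A sketch of how it might be expressed (the winners do not say which library or version they used; RandomBrightness/RandomContrast are the older albumentations names):

```python
import albumentations as A

train_aug = A.Compose([
    A.HorizontalFlip(p=0.5),
    A.RandomBrightness(limit=0.2, p=0.2),
    A.RandomContrast(limit=0.2, p=0.1),
    A.ShiftScaleRotate(shift_limit=0.1625, scale_limit=0.6, rotate_limit=0, p=0.7),
])

# Apply jointly to an image and its mask so the two stay aligned:
# augmented = train_aug(image=image, mask=mask)
# image, mask = augmented['image'], augmented['mask']
```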

part of 8th place (Private LB) solution

They observed that brightness is inconsistent across tiles, so each image is normalized with np.clip(img - np.median(img) + 127, 0, 255).

Augmentations used: random gamma, brightness, shift, scale, rotate, horizontal flip, contrast.

11th place solution

  • Random invert, mean subtraction, and derivatives instead of raw input did not work
  • Random cutout (even more unexpected, but some papers indicate that it’s helpful for segmentation, because implicitly it causes model to learn “inpainting” input tiles)
  • Random gamma-correction (makes sense, since tiles seem to be postprocessed in a way that changes brightness)
  • Random fixed-size crop
  • Random horizontal flip

14th place solution

First pad to 148×148 using skimage's biharmonic inpainting, then random crop to 128×128.
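
A sketch of that preprocessing with hypothetical helper names (skimage's `inpaint_biharmonic` fills the border instead of constant or reflect padding):

```python
import numpy as np
from skimage.restoration import inpaint_biharmonic

def inpaint_pad_to_148(img):
    # place the 101x101 tile in a 148x148 canvas and inpaint the unknown border
    canvas = np.zeros((148, 148), dtype=float)
    unknown = np.ones((148, 148), dtype=bool)
    canvas[23:124, 23:124] = img
    unknown[23:124, 23:124] = False
    return inpaint_biharmonic(canvas, unknown)

def random_crop_128(img, mask, rng=np.random):
    # take the same random 128x128 crop from the image and its mask
    y = rng.randint(0, img.shape[0] - 127)
    x = rng.randint(0, img.shape[1] - 127)
    return img[y:y + 128, x:x + 128], mask[y:y + 128, x:x + 128]
```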

Augmentations: flips in the left-right direction and random linear color transforms.

30th: strong baseline

  • Hard augmentations surprisingly work best: Contrast, Brightness, Gamma, Blur, Horizontal Flip, Shift and ShiftScale up to 50 pixels.
  • Last 10 epochs with no augmentations (HorizontalFlip only) helped a bit.

6. Models & training (Notebooks)

At that point Kaggle was still largely dominated by Keras. Here is a recap of how the shared notebooks improved step by step from a trained-from-scratch U-Net.

1. intro-to-seismic-salt-and-how-to-geophysics builds a trained-from-scratch U-Net, probably the most common model for semantic segmentation (0.65535).

The U-Net is basically looking like an Auto-Encoder with shortcuts.
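
A minimal sketch of that idea, an encoder and decoder joined by skip connections, written with the Keras functional API (an illustration, not the notebook's exact architecture):

```python
from tensorflow.keras import layers, models

def conv_block(x, filters):
    x = layers.Conv2D(filters, 3, padding='same', activation='relu')(x)
    x = layers.Conv2D(filters, 3, padding='same', activation='relu')(x)
    return x

def tiny_unet(input_shape=(128, 128, 1)):
    inputs = layers.Input(input_shape)
    # encoder: each level halves the resolution and keeps a skip tensor
    c1 = conv_block(inputs, 16); p1 = layers.MaxPooling2D()(c1)
    c2 = conv_block(p1, 32);     p2 = layers.MaxPooling2D()(c2)
    c3 = conv_block(p2, 64)      # bottleneck
    # decoder: upsample and concatenate the matching encoder feature map (the "shortcuts")
    u2 = layers.Concatenate()([layers.UpSampling2D()(c3), c2]); d2 = conv_block(u2, 32)
    u1 = layers.Concatenate()([layers.UpSampling2D()(d2), c1]); d1 = conv_block(u1, 16)
    outputs = layers.Conv2D(1, 1, activation='sigmoid')(d1)
    return models.Model(inputs, outputs)

model = tiny_unet()
model.compile(optimizer='adam', loss='binary_crossentropy')
```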


https://www.kaggle.com/bguberfain/unet-with-depth adds a depth feature on top of the basic U-Net from the first kernel (0.70118).

2. u-net-dropout-augmentation-stratification adds augmentation and dropout on top of the previous kernel and searches for the best binarization threshold, improving the score (0.74744).

u-net-with-simple-resnet-blocks-forked additionally replaces predictions with very low mask coverage with an empty mask (0.80988).
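
Both tweaks are simple post-processing; a sketch with hypothetical names (`per_image_score` is the metric sketch from section 4, not code from these kernels):

```python
import numpy as np

def best_threshold(val_probs, val_masks, score_fn):
    # grid-search the binarization threshold on held-out predictions
    thresholds = np.linspace(0.3, 0.7, 41)
    scores = [np.mean([score_fn(p > t, m) for p, m in zip(val_probs, val_masks)])
              for t in thresholds]
    return thresholds[int(np.argmax(scores))]

def zero_small_masks(pred_mask, min_pixels=20):
    # if the predicted salt area is tiny, submit an empty mask instead
    return pred_mask if pred_mask.sum() >= min_pixels else np.zeros_like(pred_mask)
```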

3. u-net-with-simple-resnet-blocks adds ResNet-style blocks on top of the previous kernel (0.81394).

u-net-with-simple-resnet-blocks-v2-new-loss switches the loss to lovasz_hinge and removes the last dropout layer, improving the score (0.83410).

introduction-to-u-net-with-simple-resnet-blocks changes the ReLU inside the Lovasz loss to ELU (0.83434); unet-with-simple-resnet-blocks further tunes the learning rate and increases the number of epochs and the batch size (0.84898).
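
The ReLU-to-ELU change is a one-line edit inside the usual Lovasz hinge implementation; sketched below in PyTorch following Berman's reference code (treat it as an illustration of where the swap happens):

```python
import torch
import torch.nn.functional as F

def lovasz_grad(gt_sorted):
    # gradient of the Lovasz extension w.r.t. sorted errors (Berman et al.)
    gts = gt_sorted.sum()
    intersection = gts - gt_sorted.float().cumsum(0)
    union = gts + (1 - gt_sorted).float().cumsum(0)
    jaccard = 1.0 - intersection / union
    if len(gt_sorted) > 1:
        jaccard[1:] = jaccard[1:] - jaccard[:-1]
    return jaccard

def lovasz_hinge_flat(logits, labels, use_elu=True):
    # logits, labels: flattened 1-D tensors, labels in {0, 1}
    signs = 2.0 * labels.float() - 1.0
    errors = 1.0 - logits * signs
    errors_sorted, perm = torch.sort(errors, dim=0, descending=True)
    grad = lovasz_grad(labels[perm])
    # the reference code uses F.relu(errors_sorted); the tweak replaces it with elu + 1
    acts = F.elu(errors_sorted) + 1.0 if use_elu else F.relu(errors_sorted)
    return torch.dot(acts, grad)
```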

4. unet-resnet34-in-keras and pretrained-resnet34-in-keras

switch the backbone to ResNet34 with pretrained weights.

using-resnet50-pretrained-model-in-keras

goes further with a pretrained ResNet50 backbone.

getting-0-87-on-private-lb-using-kaggle-kernel

goes further still with a pretrained Xception encoder, a ResNet-style decoder, pseudo-labelling, and SWA (about 0.87).
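
SWA here boils down to averaging the weights of several checkpoints from the tail of training; a minimal Keras sketch assuming all checkpoints share one architecture:

```python
import numpy as np
from tensorflow.keras.models import load_model

def swa_average(model, checkpoint_paths):
    # load several snapshots of the same architecture and average their weights
    weight_sets = [load_model(p, compile=False).get_weights() for p in checkpoint_paths]
    averaged = [np.mean(ws, axis=0) for ws in zip(*weight_sets)]
    model.set_weights(averaged)
    return model

# Note: with BatchNorm layers the running statistics should be recomputed with
# one forward pass over the training data after averaging.
```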

Other useful notebooks and posts

unet-resnetblock-hypercolumn-deep-supervision-fold proposes building a binary classification model to predict which masks are empty.

deeplabv3 is a DeepLabv3 baseline.

goto-pytorch-fix-for-v0-3 and tgs-fastai-resnet34-unet are PyTorch and fastai baselines, though a year and a half ago those two frameworks were not yet mainstream in Kaggle notebooks.

https://www.kaggle.com/c/tgs-salt-identification-challenge/discussion/65226 https://www.kaggle.com/c/tgs-salt-identification-challenge/discussion/66568 In these posts the author describes in detail how he improved his model step by step.

https://www.kaggle.com/c/tgs-salt-identification-challenge/discussion/63715 https://www.kaggle.com/c/tgs-salt-identification-challenge/discussion/65933 Heng (蛙神) mentions that a deep semi-supervised learning component can be added to classify whether an image's mask is empty.
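
A common way to realize this is to hang a small salt/no-salt classification head off the encoder bottleneck and train it jointly with the mask head; a hedged Keras-style sketch (the tensor arguments are assumed to come from an existing U-Net graph such as the one sketched in section 6, and the loss weights are illustrative):

```python
from tensorflow.keras import layers, models

def unet_with_empty_head(encoder_input, encoder_bottleneck, decoder_output):
    # segmentation head: per-pixel salt probability
    mask = layers.Conv2D(1, 1, activation='sigmoid', name='mask')(decoder_output)
    # auxiliary head: does this tile contain any salt at all?
    cls = layers.GlobalAveragePooling2D()(encoder_bottleneck)
    cls = layers.Dense(1, activation='sigmoid', name='has_salt')(cls)
    model = models.Model(encoder_input, [mask, cls])
    model.compile(optimizer='adam',
                  loss={'mask': 'binary_crossentropy', 'has_salt': 'binary_crossentropy'},
                  loss_weights={'mask': 1.0, 'has_salt': 0.5})
    return model
```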


https://www.kaggle.com/c/tgs-salt-identification-challenge/discussion/65347

This post introduces snapshot ensembling and cyclic learning rates.
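
Snapshot ensembling restarts a cosine learning-rate schedule every cycle and keeps one checkpoint per cycle, ensembling them at prediction time; a PyTorch-flavoured sketch (`train_one_epoch` is a hypothetical callback standing in for a normal training loop):

```python
import math

def train_snapshots(model, optimizer, train_one_epoch,
                    n_cycles=4, epochs_per_cycle=50, lr_max=0.01, lr_min=0.001):
    snapshots = []
    for cycle in range(n_cycles):
        for epoch in range(epochs_per_cycle):
            # cosine decay within the cycle, from lr_max down to lr_min, then restart
            lr = lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * epoch / epochs_per_cycle))
            for group in optimizer.param_groups:
                group['lr'] = lr
            train_one_epoch(model, optimizer)
        # one snapshot at the end (LR minimum) of each cycle
        snapshots.append({k: v.clone() for k, v in model.state_dict().items()})
    return snapshots
```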


7. Models & training (gold solutions)

1st Place Solution

Base models

b.e.s. model

Input: 101 -> resize to 192 -> pad to 224
Encoder: ResNeXt50 pretrained on ImageNet
Decoder: conv3x3 + BN, Upsampling, scSE
Training overview:
Optimizer: RMSprop. Batch size: 24
Loss: BCE+Dice. Reduce LR on plateau starting from 0.0001
Loss: Lovasz. Reduce LR on plateau starting from 0.00005
Loss: Lovasz. 4 snapshots with cosine annealing LR, 80 epochs each, LR starting from 0.0001
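
scSE shows up in almost every decoder described in this section; a compact PyTorch sketch of the module as it is usually written (not the winners' exact code):

```python
import torch.nn as nn

class SCSEBlock(nn.Module):
    """Concurrent spatial and channel squeeze-and-excitation."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        # channel SE: global pooling -> bottleneck MLP -> per-channel gate
        self.cse = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )
        # spatial SE: 1x1 conv -> per-pixel gate
        self.sse = nn.Sequential(nn.Conv2d(channels, 1, 1), nn.Sigmoid())

    def forward(self, x):
        return x * self.cse(x) + x * self.sse(x)
```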

phalanx model

ResNet34 (architecture is similar to resnet_34_pad_128 described below) with input: 101 -> resize to 202 -> pad to 256

High-confidence pseudo labels were then added on top of the baseline, further improving the score.

Retrained models

resnet_34_pad_128

Input: 101 -> pad to 128
Encoder: ResNet34 + scSE (conv7x7 -> conv3x3 and remove first max pooling)
Center Block: Feature Pyramid Attention (remove 7x7)
Decoder: conv3x3, transposed convolution, scSE + hyper columns
Loss: Lovasz

resnet_34_resize_128

Input: 101 -> resize to 128
Encoder: ResNet34 + scSE (remove first max pooling)
Center Block: conv3x3, Global Convolutional Network
Decoder: Global Attention Upsample (implemented like senet -> like scSE, conv3x3 -> GCN) + deep supervision
Loss: BCE for classification and Lovasz for segmentation
Optimizer: SGD. Batch size: 32.
Pretrain on pseudolabels for 150 epochs (50 epochs per cycle with cosine annealing, LR 0.01 -> 0.001)
Finetune on train data. 5 folds, 4 snapshots with cosine annealing LR, 50 epochs each, LR 0.01 -> 0.001

4th Place Solution

input: 101 random pad to 128*128, random LRflip;

encoder: resnet34, se-resnext50, resnext101_ibna, se-resnet101, se-resnet152, senet154;
decoder: scse, hypercolumn (not used in network with resnext101ibna, seresnext101 backbone), ibn block, dropout;
Deep supervision structure with Lovasz softmax (a great idea from Heng);
SGD: momentum 0.9, weight decay 0.0002, LR from 0.01 to 0.001 (changed every epoch);
LR schedule: cosine annealing with snapshot ensembling (shared by Peter), 50 epochs/cycle, 7 cycles/fold, 10 folds;

Pseudo labels were also used.

5th place solution / 5th place solution (Zhihu)

By stitching the tiles back together, they found that an all-zero mask does not mean the tile contains no salt at all; it only means there is no salt/non-salt boundary within that tile.

  1. Used elu + 1 instead of the default relu inside the Lovasz loss; although the validation loss stayed around 1.0 during training (vs. roughly 0.3 training loss), the actual results were better
  2. Adjusted which features go into the hypercolumn
  3. Implemented the scSE unit slightly differently
  4. Implemented the Object Context Module and applied it to the SE-ResNeXt50 model

They also used cosine annealing LR and the snapshot ensemble method.

part of 8th place (Private LB) solution

model

Best performing backbones: SeNet154, SeResNext101, SeResNext50, DPN92 (from top to bottom)
Decoder: U-Net like decoder with ScSe, CBAM and Hypercolumn

train method

Batch size: 8
Adam for 200 epochs.
LR schedule: 0.0001 for 100 epochs, 0.00001 for 100 epochs (cycle #1)
Cyclic LR for 40 more epochs with a 10-epoch cycle and the RMSprop optimizer (cycles #2, #3, #4, #5).
On each training cycle, one checkpoint was made.
Hard example mining was performed as well.

9th place solution(single model)

Switching from se_resnext50 to SENet154 gave a significant improvement.

Other things that worked

  • AdamW with the Noam scheduler
  • cutout
  • SWA after the training on the best loss, pixel accuracy, metric and the last models (+0.004).

11th place solution

First team member's model: U-Net-like architecture

Backbone: SE ResNeXt-50, pretrained on ImageNet
Decoder features (inspired by Heng’s helpful posts and discussions):
Spatial and Channel Squeeze Excitation gating
Hypercolumns
Deep supervision (zero/nonzero mask)

Optimizer:

SGD: momentum 0.9, weight decay 0.0001
Batch size: 16
Starting LR determined using procedure similar to LR find from fast.ai course - 5e-2
LR schedule - cosine annealing from maximum LR, cycle length - 50 epochs, 10 cycles per experiment
Best snapshots according to metric were saved independently for each cycle, final solution uses 2 best cycles per fold

Second team member's model: modified U-Net

Backbone: SE-ResNeXt-50, pretrained on ImageNet
Decoder features:
Dilated convolutions with dilation from 1 to 5
Hypercolumns
ASP OCModule before last Convolution layer
Deep supervision (zero/nonzero mask, nonzero mask segmentation)
Dropout


Optimizer:

SGD: momentum = 0.9, weight decay = 0.0001
Batch size: 16
Lr schedule. Pretrain for 32 epochs with lr = 0.01. Then SGDR was applied for 4 cycles with cosine annealing: lr from 0.01 to 0.0001. Each cycle lasts for 64 epochs.

14th place solution

SE-ResNeXt50 encoder. Standard decoder blocks enriched with custom-built FPN-style layers.


Models Training

Loss: Lovasz hinge loss with elu + 1. See details here
Optimizer: SGD with LR 0.01, momentum 0.9, weight_decay 0.0001

Train stages:

EarlyStopping with patience 100; ReduceLROnPlateau with patience=30, factor=0.64, min_lr=1e-8; Lovasz * 0.75 + BCE empty * 0.25.
Cosine annealing learning rate 300 epochs, 50 per cycle; Lovasz * 0.5 + BCE empty * 0.5.

part of the 15th place solution

They report that the improvement mainly came from adding a topology-aware loss, the spatial and channel squeeze & excitation module, and the guided upsampling module.

A small U-shape model (~8MB) with

  • Pyramid pooling module, (dilated) residual blocks and auxiliary training
  • Spatial and channel squeeze & excitation module
  • Guided upsampling module
  • Optimizer: Adam
  • Weight decay: 1e-4
  • Initial learning rate: 1e-3

8. Loss

https://www.kaggle.com/alexanderliao/u-net-bn-aug-strat-focal-loss-fixed introduces the Dice loss, Lovasz Softmax loss, and Focal loss; the author also discussed these topics on Zhihu, see 有关语义分割的奇技淫巧有哪些? (What are some tricks for semantic segmentation?).

9th place solution (single model): modified the Lovasz loss into a symmetric version, which gives a good boost on the LB (+0.008 on public, +0.02 on private).

11th place solution: a combination of a classification loss (BCE) and a segmentation loss (BCE + Lovasz Hinge * 0.5).

30th: strong baseline: uses BCE for the early epochs and then switches to Lovasz.

9. Post-processing

The predicted masks are submitted in RLE (run-length encoding) format.

https://www.kaggle.com/adamhart/faster-rle and https://www.kaggle.com/danmoller/even-faster-rle provide fast RLE implementations.
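
Kaggle's RLE for this competition numbers pixels top-to-bottom, then left-to-right (Fortran order), starting from 1; a compact numpy encoder (a plain version, simpler than the linked fast kernels):

```python
import numpy as np

def rle_encode(mask):
    # mask: 2-D binary array; returns the space-separated "start length" pairs
    pixels = mask.flatten(order='F')
    pixels = np.concatenate([[0], pixels, [0]])
    runs = np.where(pixels[1:] != pixels[:-1])[0] + 1   # 1-indexed change points
    runs[1::2] -= runs[::2]                              # convert end positions to lengths
    return ' '.join(str(x) for x in runs)
```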

https://www.kaggle.com/meaninglesslives/apply-crf CRF (Conditional Random Field) post-processing may improve the score.
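
The usual recipe uses pydensecrf: the network's soft predictions become the unary term, and Gaussian/bilateral pairwise terms pull the mask toward image edges. A sketch with illustrative, untuned parameters (not the linked kernel's exact code):

```python
import numpy as np
import pydensecrf.densecrf as dcrf
from pydensecrf.utils import unary_from_softmax

def crf_refine(image_rgb, prob_salt, n_iter=5):
    # image_rgb: HxWx3 uint8, prob_salt: HxW float in [0, 1]
    h, w = prob_salt.shape
    softmax = np.stack([1.0 - prob_salt, prob_salt]).astype(np.float32)  # (2, H, W)
    d = dcrf.DenseCRF2D(w, h, 2)
    d.setUnaryEnergy(unary_from_softmax(softmax))
    d.addPairwiseGaussian(sxy=3, compat=3)                                    # smoothness term
    d.addPairwiseBilateral(sxy=60, srgb=10,
                           rgbim=np.ascontiguousarray(image_rgb), compat=5)   # appearance term
    q = np.array(d.inference(n_iter)).reshape((2, h, w))
    return q[1] > q[0]
```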


10. Write-ups on Zhihu

5th place solution (Zhihu)

有关语义分割的奇技淫巧有哪些? (What are some tricks for semantic segmentation?)


11. Summary

This was the first competition where Kaggle provided GPUs in notebooks, and from this competition on, newcomers to CV competitions no longer needed a lab background or a strong teammate to carry them; by following the forum carefully, they could compete independently and even place well. The 1st place winner said this was also his first exposure to image segmentation, and many strong Kagglers first made a name for themselves here. Reading the forum and the notebooks in chronological order, you can watch the community, led by Heng (蛙神), go from a trained-from-scratch U-Net (0.65) all the way to 0.89+, with every step of the improvement visible, which is quite interesting.
