Colab code for downloading and setting up the dataset:
%%time
!pip install -U -q kaggle
!mkdir -p ~/.kaggle
!echo '{"username":"pupil1","key":"ae776d041bf94ae1bfa9a3843797ad6d"}' > ~/.kaggle/kaggle.json
!chmod 600 ~/.kaggle/kaggle.json
!mkdir -p understanding_cloud_organization
!kaggle competitions download -c understanding_cloud_organization
!mv *.zip understanding_cloud_organization/
!mv *.csv understanding_cloud_organization/
# Unpack each archive straight into its own folder (-q keeps the Colab output short).
!unzip -q /content/understanding_cloud_organization/train_images.zip -d /content/understanding_cloud_organization/train_images
!unzip -q /content/understanding_cloud_organization/train.csv.zip -d /content/understanding_cloud_organization
!unzip -q /content/understanding_cloud_organization/test_images.zip -d /content/understanding_cloud_organization/test_images
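After the cell above finishes, it can help to confirm that the extraction produced what the rest of the notebook expects. The following is a minimal sketch, not part of the original setup: `summarize_dataset` is a hypothetical helper, and it assumes the folder layout created above (`train_images/`, `test_images/`, and CSV files at the top level of the dataset folder).

```python
from pathlib import Path


def summarize_dataset(root):
    """Count extracted .jpg files per split and list the CSV files found.

    `root` is assumed to be the dataset folder created by the setup cell.
    """
    root = Path(root)
    summary = {}
    for split in ("train_images", "test_images"):
        folder = root / split
        # A missing folder is reported as 0 images instead of raising.
        summary[split] = len(list(folder.glob("*.jpg"))) if folder.is_dir() else 0
    summary["csv_files"] = (
        sorted(p.name for p in root.glob("*.csv")) if root.is_dir() else []
    )
    return summary


# Only runs the check when the Colab dataset folder actually exists.
dataset_root = Path("/content/understanding_cloud_organization")
if dataset_root.is_dir():
    print(summarize_dataset(dataset_root))
```

A count of 0 for either split usually means the corresponding archive was not downloaded or was unpacked somewhere else.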
According to the description in [2]:
The remaining area, which has not been covered by two succeeding orbits, is marked black.
So any black region in an image is an area that neither of the two succeeding orbits covered.
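To get a feel for how much of an image is lost to this uncovered area, one could measure the share of black pixels. This is a minimal sketch under stated assumptions: `black_fraction` is a hypothetical helper, the input is a flat sequence of (R, G, B) tuples however the image was loaded, and the `tol` parameter is an assumption added because JPEG compression may leave "black" pixels slightly above zero.

```python
def black_fraction(pixels, tol=0):
    """Return the fraction of pixels whose R, G and B values are all <= tol.

    `pixels` is a flat iterable of (r, g, b) tuples; `tol` is an assumed
    tolerance for near-black values introduced by compression.
    """
    pixels = list(pixels)
    if not pixels:
        raise ValueError("no pixels given")
    black = sum(1 for p in pixels if max(p) <= tol)
    return black / len(pixels)


# Toy 4x6 image: the two leftmost columns stand in for the uncovered area.
demo = [(0, 0, 0) if col < 2 else (200, 200, 200)
        for row in range(4) for col in range(6)]
print(black_fraction(demo))
```

Images with a large black fraction carry less usable signal, which is one reason the black area comes up repeatedly in the discussion threads listed below.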
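The list below is from the point of view of the pupil1 account: links that have changed color have been read, and threads with essentially no value are not included.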
| Link | Notes |
| --- | --- |
| Train with crops, Predict with full images | The poster's score is not high |
| How effective is pseudo-labeling? | (Read) Semi-supervised learning |
| [LB 0.628] simple segmentation approach; threshold is high? | How the threshold is used |
| Overlapping Labels in Train Data?; Can a pixel be considered as multiple classes? | (Read) According to the first link, a pixel can belong to multiple classes: "Each image was labeled by several people (2-4), so the labels can overlap. In addition, there was no restriction that the labels from a single labeler cannot overlap. To create the masks for this competition, we simply used the union of all labels for each class. So naturally there will be some overlap." |
| AdamAccumulate | (Read) Mentions a version compatibility problem with AdamAccumulate |
| Hints for late joiners? | (Read) Suggests reusing solutions from the steel competition |
| Bounding Boxes instead of Segmentation | (Read) From the comments: the organizers discourage the object detection approach, but the poster believes linear models generalize more easily than nonlinear ones and therefore sticks with Bounding Boxes (object detection) |
| use linknet | unet > linknet > fpn |
| Correct Dice Metric | (Read) Discusses how the error function works |
| Instance Segmentation->Request for list of past competition | Reference material |
| Information: Bad image list; Corrupt and Mislabeled Images | Some corrupted data |
| Question about the black area in the image | Has very good visualizations |
| ResNet34 implementation of Unet works but ResNet 50 and 101 fails? | (Read) If changing the model runs out of memory, reduce batch_size |
| Flowers are easy to pick ? | Introduces some tree-based algorithms |
| Single model performance | Best single model |
| A best description of Generating mask from encoded pixel | Covers encoded pixels |
| Adding TTA to the model before optimisation could help; Augmentations Strategies for this Competition. TTA? | Uses test-time augmentation (TTA) |
| Augmentations thred; Augmentations released version 0.4.0 | Discussion of image augmentation |
| Questions about the origin of the data | Discusses the snapshot feature |
| More Tricks to Train w/ Bigger Batches (pytorch); Some tricks to train faster (pytorch); A trick to use bigger batches for training: gradient accumulation | Discusses training tricks |
| Simple Descriptions of Cloud Types / Labeling Process | Discusses telling the classes apart by eye |
| Fast data loading [Experiments] | Fast data loading |
| Deeper, Stronger, Better? | Finds that resnet18 works, while resnext50_32x4d and efficientnet-b5 do not |
| Beware of Pandas value_counts method for validation split | Points out incorrect pandas usage in several pieces of code |
| Efficient Net B4-B7 | The comments mention gradient accumulation as the fix for a small batch_size |
| Improving code quality with utility scripts; Utility scripts for Keras users; Using High-level frameworks is not learner friendly | Code promotion |
| Object Detection vs Instance Segmentation | Many concepts |
| Hybrid convolutional and bidirectional LSTM or RNN | Uses an RNN |
| EfficientNets are now available in pytorch segmentation model repo. | Did not understand what this is for; to revisit later |
| New method to tackle severe label noise | A paper on handling label noise |
| FPN or Unet: Which one is better? | Mentions FPN and Unet |
| Some thoughts on this competition | Some thoughts from a kernel grandmaster |
| what is the label to be taken for overlapping masks? for example, in the image 0011165.jpg, Fish and Flower masks overlap each other for some region. | Overlapping masks |
| Must read material | Some reference material |
| Ideas for merging ensemble's predictions; How to effectively ensemble models with Keras | Discusses model ensembling |
| Instance Segmentation->How to predict classes | Discusses how to change the UNET output to multi-class |
| What does it mean to use a pretrained resnet encoder with UNET? | Discusses what it means for UNET to use a pretrained resnet encoder |
| Regular image segmentation approach | Mentions that semantic segmentation tasks always come with two datasets |
| Discussing post processing | Discusses post-processing |
| Weakly supervised segmentation | Weakly supervised segmentation |
| Must-see Kernels and topics - Understanding Clouds from Satellite Images | The poster's own summary of resources |
| RLE Decode in C++ | Mentions the RLE technique |
| Hints from a late joiner's persepctive | Mentions post-processing |
| Impact of using classier for removing the masks | Considers removing mask encodings |
| A Late Joiner's Understanding and Notes | Needs a careful read |
| LPT: See what's going on with that commit ? | Introduces a useful tool for visualizing training |
| Knock Knock can send you email notification (or slack notification) | A tool that emails you a notification when training finishes |
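Two of the threads above fit together: masks are stored as encoded pixels (RLE), and each class mask is the union of several labelers' regions, so one pixel can belong to more than one class. The sketch below illustrates both ideas on a toy 3x4 image. It is an illustration under assumptions, not the competition's reference code: `rle_decode` and `union_mask` are hypothetical helpers, and the encoding is assumed to be the common Kaggle convention of "start length" pairs with 1-based starts and column-major (top to bottom, then left to right) pixel numbering.

```python
def rle_decode(rle, height, width):
    """Decode 'start length start length ...' into a set of (row, col) pixels.

    Assumes 1-based starts and column-major pixel numbering.
    """
    pixels = set()
    if not rle:
        return pixels  # an empty string means an empty mask
    numbers = [int(tok) for tok in rle.split()]
    if len(numbers) % 2:
        raise ValueError("RLE must contain start/length pairs")
    for start, length in zip(numbers[0::2], numbers[1::2]):
        for idx in range(start - 1, start - 1 + length):
            col, row = divmod(idx, height)
            if col >= width:
                raise ValueError("run extends past the image")
            pixels.add((row, col))
    return pixels


def union_mask(rles, height, width):
    """Union of several labelers' regions for one class."""
    mask = set()
    for rle in rles:
        mask |= rle_decode(rle, height, width)
    return mask


H, W = 3, 4
# Two labelers marked Fish, one marked Flower (toy values, not real data).
fish = union_mask(["1 3", "3 2"], H, W)
flower = union_mask(["4 3"], H, W)
overlap = fish & flower  # pixels that carry both class labels
print(sorted(fish), sorted(flower), sorted(overlap))
```

Because the classes are not mutually exclusive, a model for this kind of data would typically predict one independent mask per class rather than a single label per pixel.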
Some statistics from [1]:
Useful Stats:
no. of empty mask = 7055
no. of non-empty mask = 7737
no. of non-empty mask for Fish = 1864
no. of non-empty mask for Flower = 1509
no. of non-empty mask for Gravel = 1982
no. of non-empty mask for Sugar = 2382
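The counts above are easier to compare as proportions. The short sketch below only does arithmetic on the numbers quoted from [1]; the variable names are made up for illustration.

```python
# Counts copied from the statistics above.
empty = 7055
non_empty = 7737
per_class = {"Fish": 1864, "Flower": 1509, "Gravel": 1982, "Sugar": 2382}

total = empty + non_empty
# Sanity check: the per-class counts should add up to the non-empty total.
assert sum(per_class.values()) == non_empty

print(f"empty masks: {empty / total:.1%} of {total}")
for name, count in per_class.items():
    print(f"{name}: {count / non_empty:.1%} of non-empty masks")
```

Sugar is the most frequent class and Flower the least, and a little under half of all masks are empty, which is why handling empty masks matters so much for the score.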
References:
[1] Public TestSet Distribution via LB probing
[2] https://www.kaggle.com/c/understanding_cloud_organization/data