DeepLab, Object Detection, Segmentation

Paper Reading - Semantic Image Segmentation With Deep Convolutional Nets and Fully Connected CRFs
1. Project
2. Notes

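Abstract:
The paper combines CNNs with probabilistic graphical models to address pixel-level classification, i.e. semantic image segmentation. Their invariance properties make CNNs well suited to high-level tasks such as image classification, but the output of the final CNN layer is not localized precisely enough for accurate object segmentation. The paper therefore combines the responses of the final CNN layer with a fully connected CRF to improve semantic segmentation quality.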
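As a rough illustration of how a fully connected CRF can refine CNN outputs, here is a minimal sketch of naive mean-field inference with a Potts label compatibility and two Gaussian kernels (an appearance kernel over position and color, and a smoothness kernel over position). It is not the DeepLab implementation: the function name, the kernel weights, and the bandwidths are illustrative assumptions, and the brute-force O(N^2) kernel matrix is only practical for tiny images, whereas real systems rely on efficient high-dimensional filtering.

```python
import numpy as np

def dense_crf_mean_field(probs, image, n_iters=5, w_app=3.0, w_smooth=1.0,
                         sigma_alpha=3.0, sigma_beta=0.1, sigma_gamma=1.0):
    """Refine per-pixel class probabilities with a toy fully connected CRF.

    probs: (H, W, L) softmax scores from a CNN (assumed input).
    image: (H, W) or (H, W, C) float image, values roughly in [0, 1].
    All weights and bandwidths are illustrative defaults, not tuned values.
    """
    h, w, n_labels = probs.shape
    n = h * w

    # Per-pixel features: position (row, col) and color.
    ys, xs = np.mgrid[0:h, 0:w]
    pos = np.stack([ys.ravel(), xs.ravel()], axis=1).astype(float)  # (N, 2)
    col = np.asarray(image, dtype=float).reshape(n, -1)             # (N, C)

    # Squared distances between every pair of pixels (brute force).
    d_pos = ((pos[:, None, :] - pos[None, :, :]) ** 2).sum(axis=-1)
    d_col = ((col[:, None, :] - col[None, :, :]) ** 2).sum(axis=-1)

    # Appearance kernel (nearby pixels with similar color) plus
    # smoothness kernel (nearby pixels regardless of color).
    kernel = (w_app * np.exp(-d_pos / (2 * sigma_alpha ** 2)
                             - d_col / (2 * sigma_beta ** 2))
              + w_smooth * np.exp(-d_pos / (2 * sigma_gamma ** 2)))
    np.fill_diagonal(kernel, 0.0)  # a pixel sends no message to itself

    flat = probs.reshape(n, n_labels)
    unary = -np.log(np.clip(flat, 1e-8, 1.0))  # unary potential = -log P
    q = flat.copy()

    for _ in range(n_iters):
        # With a Potts compatibility, agreeing with similar pixels lowers
        # the energy, so the message acts as support for each label.
        message = kernel @ q                       # (N, L)
        logits = -unary + message
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        q = np.exp(logits)
        q /= q.sum(axis=1, keepdims=True)

    return q.reshape(h, w, n_labels)
```

On a small two-region image, a single pixel whose CNN score disagrees with its similarly colored neighbors is typically pulled back to the neighbors' label after a few iterations, while labels do not leak across the strong color edge.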

The DeepLab source code can be hosted on a web page to run a demo.

1. DeepLab GitHub code

2. Script for running online

3. Running DeepLab on Cityscapes Semantic Segmentation Dataset
4. TensorFlow DeepLab Model Zoo
5. YouTube video demo: DeepLab v3 Xception Cityscapes


Object Detection

1. Mask R-CNN YouTube video
Road scenes: bounding boxes and accuracy for people and vehicles

2. GitHub code for this part


Cityscapes Benchmark

Ranking of benchmark evaluation results


Semantic Segmentation Prepared for CSC2541: Visual Perception for Autonomous Driving

Segmentation

Papers
    U-Net
Foreground Object Segmentation
Semantic Segmentation
    DeepLab
    DeepLab v2
    DeepLab v3
    DeepLabv3+
    CRF-RNN
    BoxSup
    DeconvNet
    SegNet
    ParseNet
    DecoupledNet
    ScribbleSup
    ENet
    PixelNet
    RefineNet
    ICNet
    LinkNet
Instance Segmentation
    MaskLab
    Human Instance Segmentation
Specific Segmentation
Segment Proposal
Scene Labeling / Scene Parsing
    PSPNet
    Benchmarks
    Challenges
Human Parsing
Video Object Segmentation
    Challenge
Projects
3D Segmentation
Leaderboard
Blogs
Talks

Papers

Deep Joint Task Learning for Generic Object Extraction

Highly Efficient Forward and Backward Propagation of Convolutional Neural Networks for Pixelwise Classification

Segmentation from Natural Language Expressions

Semantic Object Parsing with Graph LSTM

Fine Hand Segmentation using Convolutional Neural Networks

Feedback Neural Network for Weakly Supervised Geo-Semantic Segmentation

FusionNet: A deep fully residual convolutional neural network for image segmentation in connectomics

A deep learning model integrating FCNNs and CRFs for brain tumor segmentation

Texture segmentation with Fully Convolutional Networks

Fast LIDAR-based Road Detection Using Convolutional Neural Networks

Deep Value Networks Learn to Evaluate and Iteratively Refine Structured Outputs

arxiv: https://arxiv.org/abs/1703.04363
demo: https://gyglim.github.io/deep-value-net/

Annotating Object Instances with a Polygon-RNN

intro: CVPR 2017. CVPR Best Paper Honorable Mention Award. University of Toronto
project page: http://www.cs.toronto.edu/polyrnn/
arxiv: https://arxiv.org/abs/1704.05548

Semantic Segmentation via Structured Patch Prediction, Context CRF and Guidance CRF

intro: CVPR 2017
paper: http://openaccess.thecvf.com/content_cvpr_2017/papers/Shen_Semantic_Segmentation_via_CVPR_2017_paper.pdf
github(Caffe): https://github.com/FalongShen/SegModel

Nighttime sky/cloud image segmentation

intro: ICIP 2017
arxiv: https://arxiv.org/abs/1705.10583

Distantly Supervised Road Segmentation

intro: ICCV workshop CVRSUAD2017. Indiana University & Preferred Networks
arxiv: https://arxiv.org/abs/1708.06118

Ω-Net: Fully Automatic, Multi-View Cardiac MR Detection, Orientation, and Segmentation with Deep Neural Networks

Ω-Net (Omega-Net): Fully Automatic, Multi-View Cardiac MR Detection, Orientation, and Segmentation with Deep Neural Networks

https://arxiv.org/abs/1711.01094

Superpixel clustering with deep features for unsupervised road segmentation

intro: Preferred Networks, Inc & Indiana University
arxiv: https://arxiv.org/abs/1711.05998

Learning to Segment Human by Watching YouTube

intro: TPAMI 2017
arxiv: https://arxiv.org/abs/1710.01457

W-Net: A Deep Model for Fully Unsupervised Image Segmentation

https://arxiv.org/abs/1711.08506

End-to-end detection-segmentation network with ROI convolution

intro: ISBI 2018
arxiv: https://arxiv.org/abs/1801.02722

A Foreground Inference Network for Video Surveillance Using Multi-View Receptive Field

https://arxiv.org/abs/1801.06593

Piecewise Flat Embedding for Image Segmentation

https://arxiv.org/abs/1802.03248

A Pyramid CNN for Dense-Leaves Segmentation

intro: Computer and Robot Vision, Toronto, May 2018
arxiv: https://arxiv.org/abs/1804.01646

U-Net

U-Net: Convolutional Networks for Biomedical Image Segmentation

intro: conditionally accepted at MICCAI 2015
project page: http://lmb.informatik.uni-freiburg.de/people/ronneber/u-net/
arxiv: http://arxiv.org/abs/1505.04597
code+data: http://lmb.informatik.uni-freiburg.de/people/ronneber/u-net/u-net-release-2015-10-02.tar.gz
github: https://github.com/orobix/retina-unet
github: https://github.com/jakeret/tf_unet
notes: http://zongwei.leanote.com/post/Pa

DeepUNet: A Deep Fully Convolutional Network for Pixel-level Sea-Land Segmentation

https://arxiv.org/abs/1709.00201

TernausNet: U-Net with VGG11 Encoder Pre-Trained on ImageNet for Image Segmentation

intro: Lyft Inc. & MIT
intro: part of the winning solution (1st out of 735) in the Kaggle: Carvana Image Masking Challenge
arxiv: https://arxiv.org/abs/1801.05746
github: https://github.com/ternaus/TernausNet

Capsules for Object Segmentation

keywords: convolutional-deconvolutional capsule network, SegCaps, U-Net
arxiv: https://arxiv.org/abs/1804.04241

Deep Object Co-Segmentation

https://arxiv.org/abs/1804.06423

Foreground Object Segmentation

Pixel Objectness

project page: http://vision.cs.utexas.edu/projects/pixelobjectness/
arxiv: https://arxiv.org/abs/1701.05349
github: https://github.com/suyogduttjain/pixelobjectness

A Deep Convolutional Neural Network for Background Subtraction

arxiv: https://arxiv.org/abs/1702.01731

Semantic Segmentation

Fully Convolutional Networks for Semantic Segmentation

intro: CVPR 2015, PAMI 2016
keywords: deconvolutional layer, crop layer
arxiv: http://arxiv.org/abs/1411.4038
arxiv(PAMI 2016): http://arxiv.org/abs/1605.06211
slides: https://docs.google.com/presentation/d/1VeWFMpZ8XN7OC3URZP4WdXvOGYckoFWGVN7hApoXVnc
slides: http://tutorial.caffe.berkeleyvision.org/caffe-cvpr15-pixels.pdf
talk: http://techtalks.tv/talks/fully-convolutional-networks-for-semantic-segmentation/61606/
github(official): https://github.com/shelhamer/fcn.berkeleyvision.org
github: https://github.com/BVLC/caffe/wiki/Model-Zoo#fcn
github: https://github.com/MarvinTeichmann/tensorflow-fcn
github(Chainer): https://github.com/wkentaro/fcn
github: https://github.com/wkentaro/pytorch-fcn
github: https://github.com/shekkizh/FCN.tensorflow
notes: http://zhangliliang.com/2014/11/28/paper-note-fcn-segment/

From Image-level to Pixel-level Labeling with Convolutional Networks

intro: CVPR 2015
intro: “Weakly Supervised Semantic Segmentation with Convolutional Networks”
intro: performs semantic segmentation based only on image-level annotations in a multiple instance learning framework
arxiv: http://arxiv.org/abs/1411.6228
paper: http://ronan.collobert.com/pub/matos/2015_semisupsemseg_cvpr.pdf

Feedforward semantic segmentation with zoom-out features

intro: CVPR 2015. Toyota Technological Institute at Chicago
paper: http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Mostajabi_Feedforward_Semantic_Segmentation_2015_CVPR_paper.pdf
bitbucket: https://bitbucket.org/m_mostajabi/zoom-out-release
video: https://www.youtube.com/watch?v=HvgvX1LXQa8

DeepLab

Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs

intro: ICLR 2015. DeepLab
arxiv: http://arxiv.org/abs/1412.7062
bitbucket: https://bitbucket.org/deeplab/deeplab-public/
github: https://github.com/TheLegendAli/DeepLab-Context

Weakly- and Semi-Supervised Learning of a DCNN for Semantic Image Segmentation

intro: DeepLab
arxiv: http://arxiv.org/abs/1502.02734
bitbucket: https://bitbucket.org/deeplab/deeplab-public/
github: https://github.com/TheLegendAli/DeepLab-Context

DeepLab v2

DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs

intro: TPAMI
intro: 79.7% mIOU in the test set, PASCAL VOC-2012 semantic image segmentation task
intro: Updated version of our previous ICLR 2015 paper
project page: http://liangchiehchen.com/projects/DeepLab.html
arxiv: https://arxiv.org/abs/1606.00915
bitbucket: https://bitbucket.org/aquariusjay/deeplab-public-ver2
github: https://github.com/DrSleep/tensorflow-deeplab-resnet
github: https://github.com/isht7/pytorch-deeplab-resnet
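The mIOU figure quoted for this entry is a mean intersection-over-union score. As a hedged sketch of how such a score is commonly computed from a confusion matrix (not the official PASCAL VOC evaluation code), the snippet below assumes integer label maps, an `ignore_index` of 255 for unlabeled pixels, and predictions that lie in `[0, num_classes)`; the function name and parameters are illustrative. Benchmark scores are usually accumulated over the whole test set rather than averaged per image.

```python
import numpy as np

def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean intersection-over-union between two integer label maps.

    pred, target: arrays of the same shape holding class indices.
    Pixels whose target equals ignore_index are skipped (assumed convention).
    Classes absent from both pred and target are left out of the mean.
    """
    pred = np.asarray(pred).ravel().astype(np.int64)
    target = np.asarray(target).ravel().astype(np.int64)

    valid = target != ignore_index
    pred, target = pred[valid], target[valid]

    # Confusion matrix: rows are ground-truth classes, columns are predictions.
    conf = np.bincount(num_classes * target + pred,
                       minlength=num_classes ** 2)
    conf = conf.reshape(num_classes, num_classes)

    intersection = np.diag(conf)
    union = conf.sum(axis=0) + conf.sum(axis=1) - intersection

    present = union > 0
    return float((intersection[present] / union[present]).mean())
```

For example, with predictions `[0, 0, 1, 1]` against ground truth `[0, 1, 1, 1]`, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12.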

DeepLabv2 (ResNet-101)

http://liangchiehchen.com/projects/DeepLabv2_resnet.html

DeepLab v3

Rethinking Atrous Convolution for Semantic Image Segmentation

intro: Google. DeepLabv3
arxiv: https://arxiv.org/abs/1706.05587

DeepLabv3+

Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation

intro: Google Inc.
arxiv: https://arxiv.org/abs/1802.02611
github: https://github.com/tensorflow/models/tree/master/research/deeplab
blog: https://research.googleblog.com/2018/03/semantic-image-segmentation-with.html

CRF-RNN

Conditional Random Fields as Recurrent Neural Networks

intro: ICCV 2015. Oxford / Stanford / Baidu
project page: http://www.robots.ox.ac.uk/~szheng/CRFasRNN.html
arxiv: http://arxiv.org/abs/1502.03240
github: https://github.com/torrvision/crfasrnn
demo: http://www.robots.ox.ac.uk/~szheng/crfasrnndemo
github: https://github.com/martinkersner/train-CRF-RNN

BoxSup

BoxSup: Exploiting Bounding Boxes to Supervise Convolutional Networks for Semantic Segmentation

arxiv: http://arxiv.org/abs/1503.01640

Efficient piecewise training of deep structured models for semantic segmentation

intro: CVPR 2016
arxiv: http://arxiv.org/abs/1504.01013

DeconvNet

Learning Deconvolution Network for Semantic Segmentation

intro: ICCV 2015. DeconvNet
intro: two-stage training: train the network with easy examples first and fine-tune the trained network with more challenging examples later
project page: http://cvlab.postech.ac.kr/research/deconvnet/
arxiv: http://arxiv.org/abs/1505.04366
slides: http://web.cs.hacettepe.edu.tr/~aykut/classes/spring2016/bil722/slides/w06-deconvnet.pdf
gitxiv: http://gitxiv.com/posts/9tpJKNTYksN5eWcHz/learning-deconvolution-network-for-semantic-segmentation
github: https://github.com/HyeonwooNoh/DeconvNet
github: https://github.com/HyeonwooNoh/caffe

SegNet

SegNet: A Deep Convolutional Encoder-Decoder Architecture for Robust Semantic Pixel-Wise Labelling

arxiv: http://arxiv.org/abs/1505.07293
github: https://github.com/alexgkendall/caffe-segnet
github: https://github.com/pfnet-research/chainer-segnet

SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation

homepage: http://mi.eng.cam.ac.uk/projects/segnet/
arxiv: http://arxiv.org/abs/1511.00561
github: https://github.com/alexgkendall/caffe-segnet
tutorial: http://mi.eng.cam.ac.uk/projects/segnet/tutorial.html

SegNet: Pixel-Wise Semantic Labelling Using a Deep Networks

youtube: https://www.youtube.com/watch?v=xfNYAly1iXo
mirror: http://pan.baidu.com/s/1gdUzDlD

Getting Started with SegNet

blog: http://mi.eng.cam.ac.uk/projects/segnet/tutorial.html
github: https://github.com/alexgkendall/SegNet-Tutorial

ParseNet

ParseNet: Looking Wider to See Better

intro: ICLR 2016
arxiv: http://arxiv.org/abs/1506.04579
github: https://github.com/weiliu89/caffe/tree/fcn
caffe model zoo: https://github.com/BVLC/caffe/wiki/Model-Zoo#parsenet-looking-wider-to-see-better

DecoupledNet

Decoupled Deep Neural Network for Semi-supervised Semantic Segmentation

intro: ICLR 2016
project(paper+code): http://cvlab.postech.ac.kr/research/decouplednet/
arxiv: http://arxiv.org/abs/1506.04924
github: https://github.com/HyeonwooNoh/DecoupledNet

Semantic Image Segmentation via Deep Parsing Network

intro: ICCV 2015. CUHK
keywords: Deep Parsing Network (DPN), Markov Random Field (MRF)
homepage: http://personal.ie.cuhk.edu.hk/~lz013/projects/DPN.html
arxiv: http://arxiv.org/abs/1509.02634
paper: http://www.cv-foundation.org/openaccess/content_iccv_2015/papers/Liu_Semantic_Image_Segmentation_ICCV_2015_paper.pdf
slides: http://personal.ie.cuhk.edu.hk/~pluo/pdf/presentation_dpn.pdf

Multi-Scale Context Aggregation by Dilated Convolutions

intro: ICLR 2016.
intro: Dilated Convolution for Semantic Image Segmentation
homepage: http://vladlen.info/publications/multi-scale-context-aggregation-by-dilated-convolutions/
arxiv: http://arxiv.org/abs/1511.07122
github: https://github.com/fyu/dilation
github: https://github.com/nicolov/segmentation_keras
notes: http://www.inference.vc/dilated-convolutions-and-kronecker-factorisation/
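To make the dilated (atrous) convolution idea concrete, here is a minimal single-channel sketch in NumPy. It is an illustration under stated assumptions rather than the authors' code: the function name is hypothetical, there is no padding or stride, and, as in most deep learning libraries, the operation is a cross-correlation. A k x k kernel with dilation d covers an effective window of (k - 1) * d + 1 pixels per side without adding parameters.

```python
import numpy as np

def dilated_conv2d(x, kernel, dilation=1):
    """'Valid' 2-D cross-correlation with kernel taps spaced `dilation` apart.

    x: (H, W) input array; kernel: (kh, kw) weights.
    Returns an array of shape (H - eff_h + 1, W - eff_w + 1), where
    eff_h = (kh - 1) * dilation + 1 and likewise for eff_w.
    """
    x = np.asarray(x, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    kh, kw = kernel.shape

    # Effective extent of the kernel once gaps are inserted between taps.
    eff_h = (kh - 1) * dilation + 1
    eff_w = (kw - 1) * dilation + 1
    out_h = x.shape[0] - eff_h + 1
    out_w = x.shape[1] - eff_w + 1
    if out_h <= 0 or out_w <= 0:
        raise ValueError("input is smaller than the dilated kernel")

    out = np.zeros((out_h, out_w))
    for i in range(kh):
        for j in range(kw):
            # Each tap reads a shifted window of the input; the shift grows
            # with the dilation, which enlarges the receptive field.
            rows = slice(i * dilation, i * dilation + out_h)
            cols = slice(j * dilation, j * dilation + out_w)
            out += kernel[i, j] * x[rows, cols]
    return out
```

With a 7 x 7 input and a 3 x 3 kernel, dilation 1 yields a 5 x 5 output, while dilation 2 yields a 3 x 3 output because each output now looks at a 5 x 5 window.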

Instance-aware Semantic Segmentation via Multi-task Network Cascades

intro: CVPR 2016 oral. 1st-place winner of MS COCO 2015 segmentation competition
keywords: RoI warping layer, Multi-task Network Cascades (MNC)
arxiv: http://arxiv.org/abs/1512.04412
github: https://github.com/daijifeng001/MNC

Object Segmentation on SpaceNet via Multi-task Network Cascades (MNC)

blog: https://medium.com/the-downlinq/object-segmentation-on-spacenet-via-multi-task-network-cascades-mnc-f1c89d790b42
github: https://github.com/lncohn/pascal_to_spacenet

Learning Transferrable Knowledge for Semantic Segmentation with Deep Convolutional Neural Network

intro: TransferNet
project page: http://cvlab.postech.ac.kr/research/transfernet/
arxiv: http://arxiv.org/abs/1512.07928
github: https://github.com/maga33/TransferNet

Combining the Best of Convolutional Layers and Recurrent Layers: A Hybrid Network for Semantic Segmentation

arxiv: http://arxiv.org/abs/1603.04871

Seed, Expand and Constrain: Three Principles for Weakly-Supervised Image Segmentation

intro: ECCV 2016
arxiv: https://arxiv.org/abs/1603.06098
github: https://github.com/kolesman/SEC

ScribbleSup

ScribbleSup: Scribble-Supervised Convolutional Networks for Semantic Segmentation

project page: http://research.microsoft.com/en-us/um/people/jifdai/downloads/scribble_sup/
arxiv: http://arxiv.org/abs/1604.05144

Laplacian Reconstruction and Refinement for Semantic Segmentation

Laplacian Pyramid Reconstruction and Refinement for Semantic Segmentation

intro: ECCV 2016
arxiv: https://arxiv.org/abs/1605.02264
paper: https://www.ics.uci.edu/~fowlkes/papers/gf-eccv16.pdf
github(MatConvNet): https://github.com/golnazghiasi/LRR

Natural Scene Image Segmentation Based on Multi-Layer Feature Extraction

arxiv: http://arxiv.org/abs/1605.07586

Convolutional Random Walk Networks for Semantic Image Segmentation

arxiv: http://arxiv.org/abs/1605.07681

ENet

ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation

arxiv: http://arxiv.org/abs/1606.02147
github: https://github.com/e-lab/ENet-training
github(Caffe): https://github.com/TimoSaemann/ENet
github: https://github.com/PavlosMelissinos/enet-keras
github: https://github.com/kwotsin/TensorFlow-ENet
blog: http://culurciello.github.io/tech/2016/06/20/training-enet.html

Fully Convolutional Networks for Dense Semantic Labelling of High-Resolution Aerial Imagery

arxiv: http://arxiv.org/abs/1606.02585

Deep Learning Markov Random Field for Semantic Segmentation

arxiv: http://arxiv.org/abs/1606.07230

Region-based semantic segmentation with end-to-end training

intro: ECCV 2016
arxiv: http://arxiv.org/abs/1607.07671
github: https://github.com/nightrome/matconvnet-calvin

Built-in Foreground/Background Prior for Weakly-Supervised Semantic Segmentation

intro: ECCV 2016
arxiv: http://arxiv.org/abs/1609.00446

PixelNet

PixelNet: Towards a General Pixel-level Architecture

intro: semantic segmentation, edge detection
arxiv: http://arxiv.org/abs/1609.06694

Exploiting Depth from Single Monocular Images for Object Detection and Semantic Segmentation

intro: IEEE T. Image Processing
intro: propose an RGB-D semantic segmentation method which applies a multi-task training scheme: semantic label prediction and depth value regression
arxiv: https://arxiv.org/abs/1610.01706

PixelNet: Representation of the pixels, by the pixels, and for the pixels

intro: CMU & Adobe Research
project page: http://www.cs.cmu.edu/~aayushb/pixelNet/
arxiv: https://arxiv.org/abs/1702.06506
github(Caffe): https://github.com/aayushbansal/PixelNet

Semantic Segmentation of Earth Observation Data Using Multimodal and Multi-scale Deep Networks

arxiv: http://arxiv.org/abs/1609.06846

Deep Structured Features for Semantic Segmentation

arxiv: http://arxiv.org/abs/1609.07916

CNN-aware Binary Map for General Semantic Segmentation

intro: ICIP 2016 Best Paper / Student Paper Finalist
arxiv: https://arxiv.org/abs/1609.09220

Efficient Convolutional Neural Network with Binary Quantization Layer

arxiv: https://arxiv.org/abs/1611.06764

Mixed context networks for semantic segmentation

intro: Hikvision Research Institute
arxiv: https://arxiv.org/abs/1610.05854

High-Resolution Semantic Labeling with Convolutional Neural Networks

arxiv: https://arxiv.org/abs/1611.01962

Gated Feedback Refinement Network for Dense Image Labeling

intro: CVPR 2017
paper: http://www.cs.umanitoba.ca/~ywang/papers/cvpr17.pdf

RefineNet

RefineNet: Multi-Path Refinement Networks with Identity Mappings for High-Resolution Semantic Segmentation

RefineNet: Multi-Path Refinement Networks for High-Resolution Semantic Segmentation

intro: CVPR 2017. IoU 83.4% on PASCAL VOC 2012
arxiv: https://arxiv.org/abs/1611.06612
github: https://github.com/guosheng/refinenet
leaderboard: http://host.robots.ox.ac.uk:8080/leaderboard/displaylb.php?challengeid=11&compid=6#KEY_Multipath-RefineNet-Res152

Full-Resolution Residual Networks for Semantic Segmentation in Street Scenes

keywords: Full-Resolution Residual Units (FRRU), Full-Resolution Residual Networks (FRRNs)
arxiv: https://arxiv.org/abs/1611.08323
github(Theano/Lasagne): https://github.com/TobyPDE/FRRN
youtube: https://www.youtube.com/watch?v=PNzQ4PNZSzc

Semantic Segmentation using Adversarial Networks

intro: Facebook AI Research & INRIA. NIPS Workshop on Adversarial Training, Dec 2016, Barcelona, Spain
arxiv: https://arxiv.org/abs/1611.08408
github(Chainer): https://github.com/oyam/Semantic-Segmentation-using-Adversarial-Networks

Improving Fully Convolution Network for Semantic Segmentation

arxiv: https://arxiv.org/abs/1611.08986

The One Hundred Layers Tiramisu: Fully Convolutional DenseNets for Semantic Segmentation

intro: Montreal Institute for Learning Algorithms & Ecole Polytechnique de Montreal
arxiv: https://arxiv.org/abs/1611.09326
github: https://github.com/SimJeg/FC-DenseNet
github: https://github.com/titu1994/Fully-Connected-DenseNets-Semantic-Segmentation
github(Keras): https://github.com/0bserver07/One-Hundred-Layers-Tiramisu

Training Bit Fully Convolutional Network for Fast Semantic Segmentation

intro: Megvii
arxiv: https://arxiv.org/abs/1612.00212

Classification With an Edge: Improving Semantic Image Segmentation with Boundary Detection

intro: “an end-to-end trainable deep convolutional neural network (DCNN) for semantic segmentation with built-in awareness of semantically meaningful boundaries.”
arxiv: https://arxiv.org/abs/1612.01337

Diverse Sampling for Self-Supervised Learning of Semantic Segmentation

arxiv: https://arxiv.org/abs/1612.01991

Mining Pixels: Weakly Supervised Semantic Segmentation Using Image Labels

intro: Nankai University & University of Oxford & NUS
arxiv: https://arxiv.org/abs/1612.02101

FCNs in the Wild: Pixel-level Adversarial and Constraint-based Adaptation

arxiv: https://arxiv.org/abs/1612.02649

Understanding Convolution for Semantic Segmentation

intro: UCSD & CMU & UIUC & TuSimple
arxiv: https://arxiv.org/abs/1702.08502
github(MXNet): https://github.com/TuSimple/TuSimple-DUC
pretrained-models: https://drive.google.com/drive/folders/0B72xLTlRb0SoREhISlhibFZTRmM

Label Refinement Network for Coarse-to-Fine Semantic Segmentation

https://www.arxiv.org/abs/1703.00551

Predicting Deeper into the Future of Semantic Segmentation

intro: Facebook AI Research
arxiv: https://arxiv.org/abs/1703.07684

Object Region Mining with Adversarial Erasing: A Simple Classification to Semantic Segmentation Approach

intro: CVPR 2017 (oral)
keywords: Adversarial Erasing (AE)
arxiv: https://arxiv.org/abs/1703.08448

Guided Perturbations: Self Corrective Behavior in Convolutional Neural Networks

intro: University of Maryland & GE Global Research Center
arxiv: https://arxiv.org/abs/1703.07928

Not All Pixels Are Equal: Difficulty-aware Semantic Segmentation via Deep Layer Cascade

intro: CVPR 2017 spotlight paper
arxiv: https://arxiv.org/abs/1704.01344

Large Kernel Matters – Improve Semantic Segmentation by Global Convolutional Network

https://arxiv.org/abs/1703.02719

Loss Max-Pooling for Semantic Image Segmentation

intro: CVPR 2017
arxiv: https://arxiv.org/abs/1704.02966

Reformulating Level Sets as Deep Recurrent Neural Network Approach to Semantic Segmentation

https://arxiv.org/abs/1704.03593

A Review on Deep Learning Techniques Applied to Semantic Segmentation

https://arxiv.org/abs/1704.06857

Joint Semantic and Motion Segmentation for dynamic scenes using Deep Convolutional Networks

intro: International Institute of Information Technology & Max Planck Institute For Intelligent Systems
arxiv: https://arxiv.org/abs/1704.08331

ICNet

ICNet for Real-Time Semantic Segmentation on High-Resolution Images

intro: CUHK & Sensetime
project page: https://hszhao.github.io/projects/icnet/
arxiv: https://arxiv.org/abs/1704.08545
github: https://github.com/hszhao/ICNet
video: https://www.youtube.com/watch?v=qWl9idsCuLQ

LinkNet

Feature Forwarding: Exploiting Encoder Representations for Efficient Semantic Segmentation

LinkNet: Exploiting Encoder Representations for Efficient Semantic Segmentation

project page: https://codeac29.github.io/projects/linknet/
arxiv: https://arxiv.org/abs/1707.03718
github: https://github.com/e-lab/LinkNet

Pixel Deconvolutional Networks

intro: Washington State University
arxiv: https://arxiv.org/abs/1705.06820

Incorporating Network Built-in Priors in Weakly-supervised Semantic Segmentation

intro: IEEE TPAMI
arxiv: https://arxiv.org/abs/1706.02189

Deep Semantic Segmentation for Automated Driving: Taxonomy, Roadmap and Challenges

intro: IEEE ITSC 2017
arxiv: https://arxiv.org/abs/1707.02432

Semantic Segmentation with Reverse Attention

intro: BMVC 2017 oral. University of Southern California
arxiv: https://arxiv.org/abs/1707.06426

Stacked Deconvolutional Network for Semantic Segmentation

https://arxiv.org/abs/1708.04943

Learning Dilation Factors for Semantic Segmentation of Street Scenes

intro: GCPR 2017
arxiv: https://arxiv.org/abs/1709.01956

A Self-aware Sampling Scheme to Efficiently Train Fully Convolutional Networks for Semantic Segmentation

https://arxiv.org/abs/1709.02764

One-Shot Learning for Semantic Segmentation

intro: BMVC 2017
arxiv: https://arxiv.org/abs/1709.03410
github: https://github.com/lzzcd001/OSLSM

An Adaptive Sampling Scheme to Efficiently Train Fully Convolutional Networks for Semantic Segmentation

https://arxiv.org/abs/1709.02764

Semantic Segmentation from Limited Training Data

https://arxiv.org/abs/1709.07665

Unsupervised Domain Adaptation for Semantic Segmentation with GANs

https://arxiv.org/abs/1711.06969

Neuron-level Selective Context Aggregation for Scene Segmentation

https://arxiv.org/abs/1711.08278

Road Extraction by Deep Residual U-Net

https://arxiv.org/abs/1711.10684

Mix-and-Match Tuning for Self-Supervised Semantic Segmentation

intro: AAAI 2018
project page: http://mmlab.ie.cuhk.edu.hk/projects/M&M/
arxiv: https://arxiv.org/abs/1712.00661
github: https://github.com/XiaohangZhan/mix-and-match/
github: https://github.com/liuziwei7/mix-and-match

Error Correction for Dense Semantic Image Labeling

https://arxiv.org/abs/1712.03812

Semantic Segmentation via Highly Fused Convolutional Network with Multiple Soft Cost Functions

https://arxiv.org/abs/1801.01317

RTSeg: Real-time Semantic Segmentation Comparative Study

arxiv: https://arxiv.org/abs/1803.02758
github: https://github.com/MSiam/TFSegmentation

ShuffleSeg: Real-time Semantic Segmentation Network

intro: Cairo University
arxiv: https://arxiv.org/abs/1803.03816

Dynamic-structured Semantic Propagation Network

intro: CVPR 2018
arxiv: https://arxiv.org/abs/1803.06067

ESPNet: Efficient Spatial Pyramid of Dilated Convolutions for Semantic Segmentation

project page: https://sacmehta.github.io/ESPNet/
arxiv: https://arxiv.org/abs/1803.06815
github: https://github.com/sacmehta/ESPNet

Context Encoding for Semantic Segmentation

intro: CVPR 2018
arxiv: https://arxiv.org/abs/1803.08904
github: https://github.com/zhanghang1989/PyTorch-Encoding

Adaptive Affinity Field for Semantic Segmentation

intro: UC Berkeley / ICSI
arxiv: https://arxiv.org/abs/1803.10335

Predicting Future Instance Segmentations by Forecasting Convolutional Features

intro: Facebook AI Research & Univ. Grenoble Alpes
arxiv: https://arxiv.org/abs/1803.11496

Fully Convolutional Adaptation Networks for Semantic Segmentation

intro: CVPR 2018, Rank 1 in Segmentation Track of Visual Domain Adaptation Challenge 2017
keywords: Fully Convolutional Adaptation Networks (FCAN), Appearance Adaptation Networks (AAN) and Representation Adaptation Networks (RAN)
arxiv: https://arxiv.org/abs/1804.08286

Learning a Discriminative Feature Network for Semantic Segmentation

intro: CVPR 2018
arxiv: https://arxiv.org/abs/1804.09337

Instance Segmentation

Simultaneous Detection and Segmentation

intro: ECCV 2014
author: Bharath Hariharan, Pablo Arbelaez, Ross Girshick, Jitendra Malik
arxiv: http://arxiv.org/abs/1407.1808
github(Matlab): https://github.com/bharath272/sds_eccv2014

Convolutional Feature Masking for Joint Object and Stuff Segmentation

intro: CVPR 2015
keywords: masking layers
arxiv: https://arxiv.org/abs/1412.1283
paper: http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Dai_Convolutional_Feature_Masking_2015_CVPR_paper.pdf

Proposal-free Network for Instance-level Object Segmentation

paper: http://arxiv.org/abs/1509.02636

Hypercolumns for object segmentation and fine-grained localization

intro: CVPR 2015
arxiv: https://arxiv.org/abs/1411.5752
paper: http://www.cs.berkeley.edu/~bharath2/pubs/pdfs/BharathCVPR2015.pdf

SDS using hypercolumns

github: https://github.com/bharath272/sds

Learning to decompose for object detection and instance segmentation

intro: ICLR 2016 Workshop
keywords: CNN / RNN, MNIST, KITTI
arxiv: http://arxiv.org/abs/1511.06449

Recurrent Instance Segmentation

intro: ECCV 2016
project page: http://romera-paredes.com/ris
arxiv: http://arxiv.org/abs/1511.08250
github(Torch): https://github.com/bernard24/ris
poster: http://www.eccv2016.org/files/posters/P-4B-46.pdf
youtube: https://www.youtube.com/watch?v=l_WD2OWOqBk

Instance-sensitive Fully Convolutional Networks

intro: ECCV 2016. instance segment proposal
arxiv: http://arxiv.org/abs/1603.08678

Amodal Instance Segmentation

intro: ECCV 2016
arxiv: http://arxiv.org/abs/1604.08202

Bridging Category-level and Instance-level Semantic Image Segmentation

keywords: online bootstrapping
arxiv: http://arxiv.org/abs/1605.06885

Bottom-up Instance Segmentation using Deep Higher-Order CRFs

intro: BMVC 2016
arxiv: http://arxiv.org/abs/1609.02583

DeepCut: Object Segmentation from Bounding Box Annotations using Convolutional Neural Networks

arxiv: http://arxiv.org/abs/1605.07866

End-to-End Instance Segmentation and Counting with Recurrent Attention

intro: ReInspect
arxiv: http://arxiv.org/abs/1605.09410

Translation-aware Fully Convolutional Instance Segmentation

Fully Convolutional Instance-aware Semantic Segmentation

intro: CVPR 2017 Spotlight paper. winning entry of COCO segmentation challenge 2016
keywords: TA-FCN / FCIS
arxiv: https://arxiv.org/abs/1611.07709
github: https://github.com/msracver/FCIS
slides: https://onedrive.live.com/?cid=f371d9563727b96f&id=F371D9563727B96F%2197213&authkey=%21AEYOyOirjIutSVk

InstanceCut: from Edges to Instances with MultiCut

arxiv: https://arxiv.org/abs/1611.08272

Deep Watershed Transform for Instance Segmentation

arxiv: https://arxiv.org/abs/1611.08303

Object Detection Free Instance Segmentation With Labeling Transformations

arxiv: https://arxiv.org/abs/1611.08991

Shape-aware Instance Segmentation

arxiv: https://arxiv.org/abs/1612.03129

Interpretable Structure-Evolving LSTM

intro: CMU & Sun Yat-sen University & National University of Singapore & Adobe Research
intro: CVPR 2017 spotlight paper
arxiv: https://arxiv.org/abs/1703.03055

Mask R-CNN

intro: ICCV 2017 Best paper award. Facebook AI Research
arxiv: https://arxiv.org/abs/1703.06870
github: https://github.com/TuSimple/mx-maskrcnn
github(Keras+TensorFlow): https://github.com/matterport/Mask_RCNN

Semantic Instance Segmentation via Deep Metric Learning

https://arxiv.org/abs/1703.10277

Pose2Instance: Harnessing Keypoints for Person Instance Segmentation

https://arxiv.org/abs/1704.01152

Pixelwise Instance Segmentation with a Dynamically Instantiated Network

intro: CVPR 2017
arxiv: https://arxiv.org/abs/1704.02386

Instance-Level Salient Object Segmentation

intro: CVPR 2017
arxiv: https://arxiv.org/abs/1704.03604

Semantic Instance Segmentation with a Discriminative Loss Function

intro: Published at “Deep Learning for Robotic Vision”, workshop at CVPR 2017. KU Leuven
arxiv: https://arxiv.org/abs/1708.02551

SceneCut: Joint Geometric and Object Segmentation for Indoor Scenes

https://arxiv.org/abs/1709.07158

S4 Net: Single Stage Salient-Instance Segmentation

arxiv: https://arxiv.org/abs/1711.07618
github: https://github.com/RuochenFan/S4Net

Deep Extreme Cut: From Extreme Points to Object Segmentation

https://arxiv.org/abs/1711.09081

Learning to Segment Every Thing

intro: UC Berkeley & Facebook AI Research
keywords: MaskX R-CNN
arxiv: https://arxiv.org/abs/1711.10370

Recurrent Neural Networks for Semantic Instance Segmentation

project page: https://imatge-upc.github.io/rsis/
arxiv: https://arxiv.org/abs/1712.00617
github: https://github.com/imatge-upc/rsis

MaskLab

MaskLab: Instance Segmentation by Refining Object Detection with Semantic and Direction Features

https://arxiv.org/abs/1712.04837

Recurrent Pixel Embedding for Instance Grouping

intro: learning to embed pixels and group them into boundaries, object proposals, semantic segments and instances.
project page: http://www.ics.uci.edu/~skong2/SMMMSG.html
arxiv: https://arxiv.org/abs/1712.08273
github: https://github.com/aimerykong/Recurrent-Pixel-Embedding-for-Instance-Grouping
slides: http://www.ics.uci.edu/~skong2/slides/pixel_embedding_for_grouping_public_version.pdf
poster: http://www.ics.uci.edu/~skong2/slides/pixel_embedding_for_grouping_poster.pdf

Annotation-Free and One-Shot Learning for Instance Segmentation of Homogeneous Object Clusters

https://arxiv.org/abs/1802.00383

Path Aggregation Network for Instance Segmentation

intro: CVPR 2018
arxiv: https://arxiv.org/abs/1803.01534

Learning to Segment via Cut-and-Paste

intro: Google
keywords: weakly-supervised, adversarial learning setup
arxiv: https://arxiv.org/abs/1803.06414

Learning to Cluster for Proposal-Free Instance Segmentation

https://arxiv.org/abs/1803.06459

Human Instance Segmentation

PersonLab: Person Pose Estimation and Instance Segmentation with a Bottom-Up, Part-Based, Geometric Embedding Model

intro: Google, Inc.
keywords: Person detection and pose estimation, segmentation and grouping
arxiv: https://arxiv.org/abs/1803.08225

Pose2Seg: Human Instance Segmentation Without Detection

intro: Tsinghua University & Tencent AI Lab & Cardiff University
arxiv: https://arxiv.org/abs/1803.10683

Specific Segmentation

A CNN Cascade for Landmark Guided Semantic Part Segmentation

project page: http://aaronsplace.co.uk/
paper: https://aaronsplace.co.uk/papers/jackson2016guided/jackson2016guided.pdf

End-to-end semantic face segmentation with conditional random fields as convolutional, recurrent and adversarial networks

arxiv: https://arxiv.org/abs/1703.03305

Face Parsing via Recurrent Propagation

intro: BMVC 2017
arxiv: https://arxiv.org/abs/1708.01936

Face Parsing via a Fully-Convolutional Continuous CRF Neural Network

https://arxiv.org/abs/1708.03736

Boundary-sensitive Network for Portrait Segmentation

https://arxiv.org/abs/1712.08675

Segment Proposal

Learning to Segment Object Candidates

intro: Facebook AI Research (FAIR)
intro: DeepMask. learning segmentation proposals
arxiv: http://arxiv.org/abs/1506.06204
github: https://github.com/facebookresearch/deepmask
github: https://github.com/abbypa/NNProject_DeepMask

Learning to Refine Object Segments

intro: ECCV 2016. Facebook AI Research (FAIR)
intro: SharpMask. an extension of DeepMask which generates higher-fidelity masks using an additional top-down refinement step.
arxiv: http://arxiv.org/abs/1603.08695
github: https://github.com/facebookresearch/deepmask

FastMask: Segment Object Multi-scale Candidates in One Shot

intro: CVPR 2017. University of California & Fudan University & Megvii Inc.
arxiv: https://arxiv.org/abs/1612.08843
github: https://github.com/voidrank/FastMask

Scene Labeling / Scene Parsing

Indoor Semantic Segmentation using depth information

arxiv: http://arxiv.org/abs/1301.3572

Recurrent Convolutional Neural Networks for Scene Parsing

arxiv: http://arxiv.org/abs/1306.2795
slides: http://people.ee.duke.edu/~lcarin/Yizhe8.14.2015.pdf
github: https://github.com/NP-coder/CLPS1520Project
github: https://github.com/rkargon/Scene-Labeling

Learning hierarchical features for scene labeling

paper: http://yann.lecun.com/exdb/publis/pdf/farabet-pami-13.pdf

Multi-modal unsupervised feature learning for rgb-d scene labeling

intro: ECCV 2014
paper: http://www3.ntu.edu.sg/home/wanggang/WangECCV2014.pdf

Scene Labeling with LSTM Recurrent Neural Networks

paper: http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Byeon_Scene_Labeling_With_2015_CVPR_paper.pdf

Attend, Infer, Repeat: Fast Scene Understanding with Generative Models

arxiv: http://arxiv.org/abs/1603.08575
notes: http://www.shortscience.org/paper?bibtexKey=journals/corr/EslamiHWTKH16

“Semantic Segmentation for Scene Understanding: Algorithms and Implementations” tutorial

intro: 2016 Embedded Vision Summit
youtube: https://www.youtube.com/watch?v=pQ318oCGJGY

Semantic Understanding of Scenes through the ADE20K Dataset

arxiv: https://arxiv.org/abs/1608.05442

Learning Deep Representations for Scene Labeling with Semantic Context Guided Supervision

intro: CUHK
arxiv: https://arxiv.org/abs/1706.02493

Spatial As Deep: Spatial CNN for Traffic Scene Understanding

intro: AAAI 2018
arxiv: https://arxiv.org/abs/1712.06080

Multi-Path Feedback Recurrent Neural Network for Scene Parsing

arxiv: http://arxiv.org/abs/1608.07706

Scene Labeling using Recurrent Neural Networks with Explicit Long Range Contextual Dependency

arxiv: https://arxiv.org/abs/1611.07485

PSPNet

Pyramid Scene Parsing Network

intro: CVPR 2017
intro: mIoU of 85.4% on PASCAL VOC 2012 and 80.2% on Cityscapes; ranked 1st in the ImageNet Scene Parsing Challenge 2016
project page: http://appsrv.cse.cuhk.edu.hk/~hszhao/projects/pspnet/index.html
arxiv: https://arxiv.org/abs/1612.01105
slides: http://image-net.org/challenges/talks/2016/SenseCUSceneParsing.pdf
github: https://github.com/hszhao/PSPNet
github: https://github.com/Vladkryvoruchko/PSPNet-Keras-tensorflow

Open Vocabulary Scene Parsing

https://arxiv.org/abs/1703.08769

Deep Contextual Recurrent Residual Networks for Scene Labeling

https://arxiv.org/abs/1704.03594

Fast Scene Understanding for Autonomous Driving

intro: Published at “Deep Learning for Vehicle Perception”, workshop at the IEEE Symposium on Intelligent Vehicles 2017
arxiv: https://arxiv.org/abs/1708.02550

FoveaNet: Perspective-aware Urban Scene Parsing

https://arxiv.org/abs/1708.02421

BlitzNet: A Real-Time Deep Network for Scene Understanding

intro: INRIA
arxiv: https://arxiv.org/abs/1708.02813

Semantic Foggy Scene Understanding with Synthetic Data

https://arxiv.org/abs/1708.07819

Restricted Deformable Convolution based Road Scene Semantic Segmentation Using Surround View Cameras

https://arxiv.org/abs/1801.00708

Dense Recurrent Neural Networks for Scene Labeling

https://arxiv.org/abs/1801.06831

Benchmarks

MIT Scene Parsing Benchmark

homepage: http://sceneparsing.csail.mit.edu/
github(devkit): https://github.com/CSAILVision/sceneparsing

Semantic Understanding of Urban Street Scenes: Benchmark Suite

https://www.cityscapes-dataset.com/benchmarks/

Challenges

Large-scale Scene Understanding Challenge

homepage: http://lsun.cs.princeton.edu/

Places2 Challenge

http://places2.csail.mit.edu/challenge.html

Human Parsing

Human Parsing with Contextualized Convolutional Neural Network

intro: ICCV 2015
paper: http://www.cv-foundation.org/openaccess/content_iccv_2015/html/Liang_Human_Parsing_With_ICCV_2015_paper.html

Look into Person: Self-supervised Structure-sensitive Learning and A New Benchmark for Human Parsing

intro: CVPR 2017. SYSU & CMU
keywords: Look Into Person (LIP)
project page: http://hcp.sysu.edu.cn/lip/
arxiv: https://arxiv.org/abs/1703.05446
github: https://github.com/Engineering-Course/LIP_SSL

Cross-domain Human Parsing via Adversarial Feature and Label Adaptation

intro: AAAI 2018
arxiv: https://arxiv.org/abs/1801.01260

Fusing Hierarchical Convolutional Features for Human Body Segmentation and Clothing Fashion Classification

intro: Wuhan University
arxiv: https://arxiv.org/abs/1803.03415

Video Object Segmentation

Fast object segmentation in unconstrained video

project page: http://calvin.inf.ed.ac.uk/software/fast-video-segmentation/
paper: http://calvin.inf.ed.ac.uk/wp-content/uploads/Publications/papazoglouICCV2013-camera-ready.pdf

Recurrent Fully Convolutional Networks for Video Segmentation

arxiv: https://arxiv.org/abs/1606.00487

Object Detection, Tracking, and Motion Segmentation for Object-level Video Segmentation

arxiv: http://arxiv.org/abs/1608.03066

Clockwork Convnets for Video Semantic Segmentation

intro: ECCV 2016 Workshops
intro: evaluated on the Youtube-Objects, NYUD, and Cityscapes video datasets
arxiv: http://arxiv.org/abs/1608.03609
github: https://github.com/shelhamer/clockwork-fcn

STFCN: Spatio-Temporal FCN for Semantic Video Segmentation

arxiv: http://arxiv.org/abs/1608.05971

One-Shot Video Object Segmentation

intro: OSVOS
project: http://www.vision.ee.ethz.ch/~cvlsegmentation/osvos/
arxiv: https://arxiv.org/abs/1611.05198
github(official): https://github.com/kmaninis/OSVOS-caffe
github(official): https://github.com/scaelles/OSVOS-TensorFlow
github(official): https://github.com/kmaninis/OSVOS-PyTorch

Video Object Segmentation Without Temporal Information

https://arxiv.org/abs/1709.06031

Convolutional Gated Recurrent Networks for Video Segmentation

arxiv: https://arxiv.org/abs/1611.05435

Learning Video Object Segmentation from Static Images

arxiv: https://arxiv.org/abs/1612.02646

Semantic Video Segmentation by Gated Recurrent Flow Propagation

arxiv: https://arxiv.org/abs/1612.08871

FusionSeg: Learning to combine motion and appearance for fully automatic segmention of generic objects in videos

project page: http://vision.cs.utexas.edu/projects/fusionseg/
arxiv: https://arxiv.org/abs/1701.05384
github: https://github.com/suyogduttjain/fusionseg

Unsupervised learning from video to detect foreground objects in single images

https://arxiv.org/abs/1703.10901

Semantically-Guided Video Object Segmentation

https://arxiv.org/abs/1704.01926

Learning Video Object Segmentation with Visual Memory

https://arxiv.org/abs/1704.05737

Flow-free Video Object Segmentation

https://arxiv.org/abs/1706.09544

Online Adaptation of Convolutional Neural Networks for Video Object Segmentation

https://arxiv.org/abs/1706.09364

Video Object Segmentation using Tracked Object Proposals

intro: CVPR-2017 workshop, DAVIS-2017 Challenge
arxiv: https://arxiv.org/abs/1707.06545

Video Object Segmentation with Re-identification

intro: CVPR 2017 Workshop, DAVIS Challenge on Video Object Segmentation 2017 (Winning Entry)
arxiv: https://arxiv.org/abs/1708.00197

Pixel-Level Matching for Video Object Segmentation using Convolutional Neural Networks

intro: ICCV 2017
arxiv: https://arxiv.org/abs/1708.05137

MaskRNN: Instance Level Video Object Segmentation

intro: NIPS 2017
arxiv: https://arxiv.org/abs/1803.11187

SegFlow: Joint Learning for Video Object Segmentation and Optical Flow

project page: https://sites.google.com/site/yihsuantsai/research/iccv17-segflow
arxiv: https://arxiv.org/abs/1709.06750
github: https://github.com/JingchunCheng/SegFlow

Video Semantic Object Segmentation by Self-Adaptation of DCNN

https://arxiv.org/abs/1711.08180

Learning to Segment Moving Objects

https://arxiv.org/abs/1712.01127

Instance Embedding Transfer to Unsupervised Video Object Segmentation

intro: University of Southern California & Google Inc
arxiv: https://arxiv.org/abs/1801.00908
blog: https://medium.com/@barvinograd1/instance-embedding-instance-segmentation-without-proposals-31946a7c53e1

Panoptic Segmentation

intro: Facebook AI Research (FAIR) & Heidelberg University
arxiv: https://arxiv.org/abs/1801.00868

Efficient Video Object Segmentation via Network Modulation

intro: Snap Inc. & Northwestern University & Google Inc.
arxiv: https://arxiv.org/abs/1802.01218

Video Object Segmentation with Joint Re-identification and Attention-Aware Mask Propagation

intro: CUHK
arxiv: https://arxiv.org/abs/1803.04242

Video Object Segmentation with Language Referring Expressions

https://arxiv.org/abs/1803.08006

Dynamic Video Segmentation Network

intro: CVPR 2018
arxiv: https://arxiv.org/abs/1804.00931

Low-Latency Video Semantic Segmentation

intro: CVPR 2018 Spotlight
arxiv: https://arxiv.org/abs/1804.00389

Blazingly Fast Video Object Segmentation with Pixel-Wise Metric Learning

intro: CVPR 2018
arxiv: https://arxiv.org/abs/1804.03131

Challenge

DAVIS: Densely Annotated VIdeo Segmentation

homepage: http://davischallenge.org/
arxiv: https://arxiv.org/abs/1704.00675

DAVIS Challenge on Video Object Segmentation 2017

http://davischallenge.org/challenge2017/publications.html

Projects

TF Image Segmentation: Image Segmentation framework

intro: Image Segmentation framework based on Tensorflow and TF-Slim library
github: https://github.com/warmspringwinds/tf-image-segmentation

KittiSeg: A Kitti Road Segmentation model implemented in tensorflow.

keywords: MultiNet
intro: KittiSeg performs segmentation of roads by utilizing an FCN based model.
github: https://github.com/MarvinTeichmann/KittiSeg

Semantic Segmentation Architectures Implemented in PyTorch

intro: Segnet/FCN/U-Net/Link-Net
github: https://github.com/meetshah1995/pytorch-semseg

PyTorch for Semantic Segmentation

https://github.com/ZijunDeng/pytorch-semantic-segmentation

3D Segmentation

PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation

intro: Stanford University
project page: http://stanford.edu/~rqi/pointnet/
arxiv: https://arxiv.org/abs/1612.00593
github: https://github.com/charlesq34/pointnet

DA-RNN: Semantic Mapping with Data Associated Recurrent Neural Networks

https://arxiv.org/abs/1703.03098

SqueezeSeg: Convolutional Neural Nets with Recurrent CRF for Real-Time Road-Object Segmentation from 3D LiDAR Point Cloud

intro: UC Berkeley
arxiv: https://arxiv.org/abs/1710.07368

SEGCloud: Semantic Segmentation of 3D Point Clouds

intro: International Conference on 3D Vision (3DV) 2017 (Spotlight). Stanford University
homepage: http://segcloud.stanford.edu/
arxiv: https://arxiv.org/abs/1710.07563

Leaderboard

Segmentation Results: VOC2012 BETA: Competition “comp6” (train on own data)

http://host.robots.ox.ac.uk:8080/leaderboard/displaylb.php?cls=mean&challengeid=11&compid=6

Blogs

Mobile Real-time Video Segmentation

https://research.googleblog.com/2018/03/mobile-real-time-video-segmentation.html

Deep Learning for Natural Image Segmentation Priors

http://cs.brown.edu/courses/csci2951-t/finals/ghope/

Image Segmentation Using DIGITS 5

https://devblogs.nvidia.com/parallelforall/image-segmentation-using-digits-5/

Image Segmentation with Tensorflow using CNNs and Conditional Random Fields

http://warmspringwinds.github.io/tensorflow/tf-slim/2016/12/18/image-segmentation-with-tensorflow-using-cnns-and-conditional-random-fields/

Fully Convolutional Networks (FCNs) for Image Segmentation

blog: http://warmspringwinds.github.io/tensorflow/tf-slim/2017/01/23/fully-convolutional-networks-(fcns)-for-image-segmentation/
ipn: https://github.com/warmspringwinds/tensorflow_notes/blob/master/fully_convolutional_networks.ipynb

Image segmentation with Neural Net

blog: https://medium.com/@m.zaradzki/image-segmentation-with-neural-net-d5094d571b1e#.s5f711g1q
github: https://github.com/mzaradzki/neuralnets/tree/master/vgg_segmentation_keras

A 2017 Guide to Semantic Segmentation with Deep Learning

http://blog.qure.ai/notes/semantic-segmentation-deep-learning-review

Talks

Deep learning for image segmentation

intro: PyData Warsaw - Mateusz Opala & Michał Jamroż
youtube: https://www.youtube.com/watch?v=W6r_a5crqGI

