Hierarchically Gated Deep Networks for Semantic Segmentation.
Guo-Jun Qi
This paper combines LSTMs with CNNs, adding a hierarchical gating structure to the CNN that decides whether contexts belong to the same scale.
Looking at the development of semantic segmentation from 2015 to 2016, the multi-scale problem has been the focus of many papers, and it is also the problem this paper addresses.
Abstract
Semantic segmentation aims to parse the scene structure of images by annotating the labels to each pixel so that images can be segmented into different regions. While image structures usually have various scales, it is difficult to use a single scale to model the spatial contexts for all individual pixels. Multi-scale Convolutional Neural Networks (CNNs) and their variants have made striking success for modeling the global scene structure for an image. However, they are limited in labeling fine-grained local structures like pixels and patches, since spatial contexts might be blindly mixed up without appropriately customizing their scales. To address this challenge, we develop a novel paradigm of multiscale deep network to model spatial contexts surrounding different pixels at various scales. It builds multiple layers of memory cells, learning feature representations for individual pixels at their customized scales by hierarchically absorbing relevant spatial contexts via memory gates between layers. Such Hierarchically Gated Deep Networks (HGDNs) can customize a suitable scale for each pixel, thereby delivering better performance on labeling scene structures of various scales. We conduct the experiments on two datasets, and show competitive results compared with the other multi-scale deep networks on the semantic segmentation task.
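The core idea in the abstract, as I read it, is a stack of per-pixel memory cells where a learned memory gate at each layer decides whether the spatial context at that layer's scale should be absorbed into the pixel's representation. A minimal sketch of that gating mechanism is below; the function name `hgdn_cell`, the weight matrices `W_g`/`W_c`, and the exact gate equation are my own illustrative assumptions, not the paper's precise formulation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def hgdn_cell(features, W_g, W_c):
    """Hedged sketch of hierarchically gated context absorption.

    features: list of (H, W, D) arrays, one per layer, with contexts at
        progressively larger scales at higher layers.
    W_g, W_c: lists of (D, D) weight matrices for the memory gates
        (illustrative shapes, assumed rather than taken from the paper).
    Returns a (H, W, D) per-pixel memory after all layers.
    """
    memory = features[0]  # initialize the memory cell with the finest scale
    for l in range(1, len(features)):
        context = features[l]
        # Memory gate: per pixel, decide whether the layer-l context is at
        # a relevant scale and should be absorbed into the memory cell.
        g = sigmoid(memory @ W_g[l - 1] + context @ W_c[l - 1])
        # Gated update: absorb the context where the gate opens, otherwise
        # keep the memory from the finer scales.
        memory = g * context + (1.0 - g) * memory
    return memory
```

Because the gate is a sigmoid, each update is an elementwise convex combination of the current memory and the new context, which is how a pixel can effectively "stop" absorbing context once its customized scale is reached.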