CVPR2018-video object segmentation--3

Referring Image Segmentation via Recurrent Refinement Networks

Skim: motivation

This work segments the object in an image that a natural language description refers to, which requires combining natural language processing with visual understanding. The authors aim to generate high-quality segmentation masks by incorporating multi-scale information to refine the segmentation results.

Temporal Deformable Residual Networks for Action Segmentation in Videos

Skim: motivation

This work tackles action segmentation in videos. It proposes a temporal deformable residual network (TDRN) that analyses video intervals at multiple temporal scales. TDRN consists of two temporal streams: i) a residual stream that analyses video information at its full temporal resolution, and ii) a pooling/unpooling stream that captures long-range video information at different scales. The former facilitates local, fine-scale action segmentation, while the latter uses multiscale context to improve frame classification accuracy.
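
To make the two-stream idea concrete, here is a toy sketch on a 1-D sequence of per-frame scores. It is not the TDRN architecture: the function names, the max-pooling choice, the nearest-neighbour unpooling, and the fusion by addition are all assumptions made for illustration, and the deformable temporal convolutions of the real model are left out entirely.

```python
def max_pool_time(x, size=2):
    """Coarsen a per-frame sequence by taking the max over windows of `size` frames."""
    return [max(x[i:i + size]) for i in range(0, len(x), size)]


def unpool_time(x, size, length):
    """Bring a pooled sequence back to `length` frames by repeating each value."""
    out = []
    for value in x:
        out.extend([value] * size)
    return out[:length]


def two_stream(x, size=2):
    """Toy fusion: full-resolution stream plus context from the pooled stream."""
    coarse = max_pool_time(x, size)               # pooling: longer-range summary
    context = unpool_time(coarse, size, len(x))   # unpooling: back to full length
    return [fine + ctx for fine, ctx in zip(x, context)]


frame_scores = [1, 3, 2, 5, 4]
fused = two_stream(frame_scores)
```

The full-resolution values keep frame-level detail, while the pooled-then-unpooled values give every frame a summary of its neighbourhood.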

Takeaways
  1. What is temporal convolution? Is it the same as 3D convolution? How well does it work in practice?
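
On the first part of that question: as the term is commonly used in action segmentation, a temporal convolution slides a kernel along the time axis of per-frame feature vectors only, whereas a 3D convolution also slides over the two spatial axes of the raw frames. The sketch below is a minimal pure-Python version with zero padding and a single output channel. The name `temporal_conv1d` and the kernel layout are assumptions for illustration, not code from the paper.

```python
def temporal_conv1d(features, kernel):
    """1-D convolution over time.

    features: list of T frames, each a list of C channel values.
    kernel:   list of K taps, each a list of C weights (one output channel).
    Returns a list of T outputs, using zero padding at both ends.
    """
    num_frames = len(features)
    pad = len(kernel) // 2
    outputs = []
    for t in range(num_frames):
        acc = 0.0
        for k, tap in enumerate(kernel):
            src = t + k - pad                  # frame index this tap looks at
            if 0 <= src < num_frames:          # outside the video counts as zero
                for c, weight in enumerate(tap):
                    acc += weight * features[src][c]
        outputs.append(acc)
    return outputs


# Four frames with one feature channel, smoothed by a 3-tap kernel.
frames = [[1.0], [2.0], [3.0], [4.0]]
smoothed = temporal_conv1d(frames, [[1.0], [1.0], [1.0]])
```

Because the kernel never moves across image height or width, the cost depends on the number of frames and feature channels, not on the spatial resolution.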

Context Encoding for Semantic Segmentation

Research Background

Given an image, semantic segmentation assigns per-pixel predictions of object categories, which provides a comprehensive scene description including each object's category, location, and shape. FCN-based frameworks adopt several strategies, e.g., dilated convolution and multi-resolution pyramid-based feature representations, to enlarge the receptive field while alleviating the loss of spatial resolution. However, contextual information is not well explored.
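
How much dilation enlarges the receptive field can be checked with a little arithmetic. For a stack of stride-1 convolutions, the receptive field along one axis is 1 plus the sum of (kernel size - 1) x dilation over the layers. The helper below is only a sketch of that formula; the name `receptive_field` and the example layer settings are made up for illustration.

```python
def receptive_field(layers):
    """Receptive field (along one axis) of stacked stride-1 convolutions.

    layers: list of (kernel_size, dilation) pairs, applied in order.
    """
    field = 1
    for kernel_size, dilation in layers:
        field += (kernel_size - 1) * dilation
    return field


plain = receptive_field([(3, 1), (3, 1), (3, 1)])    # three ordinary 3x3 layers
dilated = receptive_field([(3, 1), (3, 2), (3, 4)])  # same layers, dilation 1, 2, 4
```

With the same number of weights and no downsampling, the dilated stack covers roughly twice the extent of the plain one.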

Motivation and Proposed approach

This work argues that being aware of the scene context and narrowing the list of probable categories makes semantic segmentation much easier. Classic computer vision approaches encode global context information by capturing feature statistics. This work extends the Encoding Layer from earlier work to capture global feature statistics for understanding semantic context.
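
A rough sketch of what "capturing global feature statistics" can look like, following the residual-encoding formulation as I understand it from the Encoding Layer work: each feature is compared with a set of codewords, softly assigned to them according to a scaled squared distance, and the weighted residuals are summed over the whole image. In practice the codewords and smoothing factors are learned; here they are fixed, made-up values, and the function name is hypothetical.

```python
import math


def encoding_layer(features, codewords, smoothing):
    """Aggregate N features (each of dim D) into K residual encoders (each of dim D).

    features:  list of N feature vectors.
    codewords: list of K codeword vectors (learned in practice, fixed here).
    smoothing: list of K positive smoothing factors (learned in practice).
    """
    num_codes, dim = len(codewords), len(codewords[0])
    encoded = [[0.0] * dim for _ in range(num_codes)]
    for x in features:
        residuals = [[xi - di for xi, di in zip(x, d)] for d in codewords]
        logits = [-s * sum(r * r for r in res)
                  for s, res in zip(smoothing, residuals)]
        top = max(logits)                        # subtract max for a stable softmax
        exps = [math.exp(v - top) for v in logits]
        total = sum(exps)
        for k in range(num_codes):
            weight = exps[k] / total             # soft assignment to codeword k
            for j in range(dim):
                encoded[k][j] += weight * residuals[k][j]
    return encoded


# Two 1-D features and two codewords placed exactly on them.
enc = encoding_layer([[0.0], [2.0]], [[0.0], [2.0]], [1.0, 1.0])
```

The output has a fixed size of K x D regardless of how many pixels the image has, which is what makes it usable as a global context descriptor.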

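Strengths and takeaways
  1. Borrowing from work on texture classification, the paper learns a context embedding for semantic segmentation. Compared with earlier context models, this study seems to be the first to actually achieve context awareness. Lesson: first work out what approach could solve the problem without worrying about how hard it is to implement, then look for work to borrow from in this field or others to realize the idea.
  2. The semantic encoding loss proposed in the paper gives large and small objects equal contribution. A similar idea could be used to improve small-object performance in the video object segmentation task.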
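
Why such a loss treats large and small objects alike can be seen in a small sketch. It assumes the loss is a binary cross-entropy over per-class presence in the image, which is how the "equal contribution" remark is read here. The helper names and the toy mask are illustrative, not the paper's code.

```python
import math


def presence_targets(mask, num_classes):
    """1.0 for every class that appears anywhere in the label mask, else 0.0."""
    present = {label for row in mask for label in row}
    return [1.0 if c in present else 0.0 for c in range(num_classes)]


def binary_cross_entropy(probs, targets, eps=1e-7):
    """Mean binary cross-entropy over classes."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)    # keep log() finite
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total / len(probs)


# Class 0 fills almost the whole mask; class 2 covers a single pixel.
mask = [[0, 0, 0],
        [0, 0, 2]]
targets = presence_targets(mask, num_classes=3)
loss = binary_cross_entropy([0.5, 0.5, 0.5], targets)
```

The one-pixel class and the class covering five pixels each contribute exactly one term to the loss, whereas a per-pixel loss would weight them five to one.
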
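Potential weaknesses
  1. If a class is present, its corresponding feature map is highlighted. Does that mean the number of channels in this layer equals the number of classes?
  2. It feels like the context embedding could be explored in more depth.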