CVPR2018-video object segmentation--3

乐兮山南水北

Category: Paper Reading. Tags: CVPR2018, video object segmentation

Article link: https://blog.csdn.net/u012494820/article/details/82781929

Referring Image Segmentation via Recurrent Refinement Networks

Skimmed: motivation

This work segments an object in an image from a natural language description, which requires combining natural language processing with visual understanding. The authors aim to generate high-quality segmentation masks by incorporating multi-scale information to refine the segmentation results.

Temporal Deformable Residual Networks for Action Segmentation in Videos

Skimmed: motivation

This work tackles action segmentation in videos. It proposes a temporal deformable residual network (TDRN) that analyses video intervals at multiple temporal scales. TDRN consists of two temporal streams: (i) a residual stream that analyses video information at its full temporal resolution, and (ii) a pooling/unpooling stream that captures long-range video information at different scales. The former facilitates local, fine-scale action segmentation, while the latter uses multiscale context to improve frame classification accuracy.

Takeaways

What is temporal convolution? Is it the same as 3D convolution? How well does it work in practice?
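As a rough answer to the first two questions: in the temporal convolutional networks commonly used for action segmentation, a temporal convolution slides a 1D filter along the time axis of a sequence of per-frame feature vectors, whereas a 3D convolution slides a filter over time and both spatial axes of the video volume. The sketch below is a minimal plain-Python illustration under that assumption, not the TDRN implementation; `temporal_conv1d`, the toy frames, and the kernel are all hypothetical.

```python
def temporal_conv1d(frames, kernel):
    """Valid 1D convolution along the time axis.

    frames: list of T per-frame feature vectors, each with C_in floats.
    kernel: nested list of shape K x C_in x C_out (K = temporal extent).
    Returns a list of T - K + 1 vectors, each with C_out floats.
    """
    k = len(kernel)
    c_in = len(kernel[0])
    c_out = len(kernel[0][0])
    out = []
    for t in range(len(frames) - k + 1):
        y = [0.0] * c_out
        # The filter only moves along time; there are no spatial axes here,
        # which is the difference from a 3D convolution over (T, H, W).
        for dt in range(k):
            for ci in range(c_in):
                x = frames[t + dt][ci]
                for co in range(c_out):
                    y[co] += x * kernel[dt][ci][co]
        out.append(y)
    return out


# Hypothetical toy input: 5 frames, 2 features per frame.
frames = [[1.0, 0.0], [2.0, 1.0], [3.0, 0.0], [4.0, 1.0], [5.0, 0.0]]
# Hypothetical kernel spanning 3 frames, 2 input channels -> 1 output channel.
# It averages channel 0 over the window and ignores channel 1.
kernel = [[[1.0 / 3.0], [0.0]] for _ in range(3)]
smoothed = temporal_conv1d(frames, kernel)
```

Each output step mixes information from three neighbouring frames, so stacking such layers (or pooling between them) widens the temporal context without touching the spatial layout of the frames.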

Context Encoding for Semantic Segmentation

Research Background

Given an image, semantic segmentation assigns per-pixel predictions of object categories, which provides a comprehensive scene description including object category, location and shape. FCN-based frameworks adopt several strategies, e.g., dilated convolution and multi-resolution pyramid-based feature representations, to enlarge the receptive field so as to alleviate the loss of spatial resolution. However, contextual information is not well explored.

Motivation and Proposed approach

This work argues that being aware of the scene context and narrowing the list of probable categories can make semantic segmentation much easier. Classic computer vision approaches encode global context information by capturing feature statistics. This work extends the Encoding Layer (from earlier work) to capture global feature statistics for understanding semantic context.
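To make "capturing global feature statistics" concrete, here is a minimal plain-Python sketch of an encoding layer as I understand it: each feature is compared with a set of learned codewords, the residuals are weighted by a soft assignment, and the weighted residuals are summed over all positions. The function name, argument names, and toy values are hypothetical, and the normalization and final aggregation over codewords used in the paper are left out.

```python
import math


def encoding_layer(features, codewords, smoothing):
    """Aggregate soft-assigned residuals between features and codewords.

    features:  list of N feature vectors (D floats each), one per position.
    codewords: list of K codeword vectors (D floats each), learned in practice.
    smoothing: list of K positive smoothing factors, learned in practice.
    Returns K vectors of D floats: one aggregated residual per codeword.
    """
    num_codes = len(codewords)
    dim = len(codewords[0])
    encoded = [[0.0] * dim for _ in range(num_codes)]
    for x in features:
        # Residual of this feature with respect to every codeword.
        residuals = [[xi - di for xi, di in zip(x, d)] for d in codewords]
        # Soft-assignment scores: closer codewords get larger weights.
        scores = [-s * sum(r * r for r in res)
                  for s, res in zip(smoothing, residuals)]
        top = max(scores)  # subtract the max for a numerically stable softmax
        exps = [math.exp(sc - top) for sc in scores]
        total = sum(exps)
        for k in range(num_codes):
            weight = exps[k] / total
            for j in range(dim):
                encoded[k][j] += weight * residuals[k][j]
    return encoded


# Hypothetical toy input: two 2-D features and a single codeword at the origin.
single = encoding_layer([[1.0, 2.0], [3.0, 4.0]], [[0.0, 0.0]], [1.0])
```

With a single codeword every weight is 1, so the output reduces to the plain sum of residuals; with several codewords the output is an orderless summary of the whole feature map, which is what lets it describe scene-level context.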

Strengths and takeaways

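Borrowing from work on texture classification, the paper learns a context embedding for semantic segmentation. Compared with earlier context models, this study seems to be the first to achieve genuine context awareness. Lesson: first work out what approach could solve the problem without worrying about how hard it is to implement, then look for work to borrow from, in this field or others, to realize the idea.
The semantic encoding loss proposed in this paper contributes equally for large and small objects; a similar idea could be used to improve small-object performance in the video object segmentation task.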
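The point about large and small objects can be sketched as follows, assuming the semantic encoding loss is a binary cross-entropy over per-image class presence rather than over pixels. This is an illustration in plain Python, not the authors' code; `presence_target`, `bce_with_logits`, and the toy mask are hypothetical.

```python
import math


def presence_target(mask, num_classes):
    """Turn a label mask into a per-class presence vector (1.0 if present)."""
    present = {label for row in mask for label in row}
    return [1.0 if c in present else 0.0 for c in range(num_classes)]


def bce_with_logits(logits, targets):
    """Mean binary cross-entropy over classes, computed from raw logits."""
    total = 0.0
    for z, y in zip(logits, targets):
        # Numerically stable form of -[y*log(sigmoid(z)) + (1-y)*log(1-sigmoid(z))].
        total += max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
    return total / len(logits)


# Hypothetical 4x4 label mask: class 1 covers 8 pixels, class 2 only 1 pixel.
mask = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 1, 1],
]
target = presence_target(mask, num_classes=4)
# Hypothetical logits from a scene-level prediction head, one per class.
loss = bce_with_logits([0.0, 0.0, 0.0, 0.0], target)
```

Because the target records only whether a class appears, the one-pixel object and the eight-pixel object each contribute exactly one term to the loss, whereas a per-pixel loss would weight them by area.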

Potential weaknesses

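If a class is present, its corresponding feature map is highlighted; does that mean the number of channels in this layer equals the number of classes?
It feels like the context embedding could be taken further.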
