Original note link: https://mp.weixin.qq.com/s?__biz=Mzg4MjgxMjgyMg==&mid=2247486705&idx=1&sn=2e9c8be25d079fcf9dca9a9b67a90651&chksm=cf51be08f826371edc3226955f1acff0bb560f5256611b43ea91f9db6a579d5313b97ec598e5#rd
↑ Open the link above to read the full note
CVPR 2023 | Hidden Gems: 4D Radar Scene Flow Learning Using Cross-Modal Supervision
Reading notes on mmWave radar perception papers: CVPR 2023, Hidden Gems: 4D Radar Scene Flow Learning Using Cross-Modal Supervision
0 Abstract
- This paper
  - proposes a novel approach to 4D radar-based scene flow estimation via cross-modal learning
- Motivation
  - Co-located sensing redundancy in modern autonomous vehicles
    ✅ provides various forms of supervision cues for radar scene flow estimation
- Methods
  - presents a multi-task model architecture for the identified cross-modal learning problem
  - proposes specific loss functions to engage scene flow estimation with multiple cross-modal constraints for effective model training
- Experiments
  - SOTA performance
  - proves effective in inferring more accurate 4D radar scene flow via cross-modal supervised learning
  - shown to be useful for two subtasks
    ✅ motion segmentation
    ✅ ego-motion estimation
- Code: https://github.com/Toytiny/CMFlow
1 Introduction
- Scene flow estimation
  - Definition of scene flow estimation
    ✅ obtaining a 3D motion vector field of the static and dynamic environment relative to an ego-agent
  - Importance of scene flow in the context of self-driving
    ✅ provides motion cues for various downstream tasks
- Current scene flow estimation approaches
  - rely on fully-supervised or weakly-supervised learning, or on self-supervised signals
  - Challenges of these approaches
    ✅ the labor-intensive process of annotating scene flow for supervised learning
    ✅ the often subpar performance of self-supervised learning methods
- Specific challenges in 4D radar scene flow learning
  - Rise of 4D automotive radars
    ✅ resistant to adverse conditions and able to measure object velocity
  - Complications with 4D radar point clouds
    ❌ sparsity and noise in the point clouds complicate the scene flow annotation process for supervised learning
- Solution: Hidden Gems
  - exploiting cross-modal supervision signals in autonomous vehicles
    ✅ modern autonomous vehicles are equipped with multiple sensors that provide complementary and redundant perception results
  - The authors use this co-located perception redundancy to provide multiple supervision cues that improve radar scene flow learning.
    🚩 The primary research question: how to retrieve and apply cross-modal supervision signals from co-located sensors on a vehicle to improve radar scene flow learning
    ✅ exploits useful supervision signals from the odometer (GPS/INS), LiDAR, and RGB camera
    ✅ Train: multi-modal data; Test: radar data only (see the sketch after this list)
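How this train/test asymmetry looks in code, as a minimal sketch: the cross-modal signals enter only through the loss terms, so the co-located sensors can be dropped at inference. The function names and the `cues` dictionary are illustrative assumptions, not CMFlow's actual API.

```python
import torch
import torch.nn as nn

def training_step(model: nn.Module,
                  radar_src: torch.Tensor,
                  radar_tgt: torch.Tensor,
                  cues: dict) -> torch.Tensor:
    """One cross-modal training step (hypothetical loss bookkeeping).

    The forward pass consumes radar only; odometer/LiDAR/camera cues
    appear purely as extra loss terms during training.
    """
    flow = model(radar_src, radar_tgt)       # predicted radar scene flow, (N, 3)
    loss = flow.new_zeros(())
    for loss_fn, target in cues.values():    # e.g. odometry / LiDAR / camera cues
        loss = loss + loss_fn(flow, target)
    return loss

@torch.no_grad()
def inference(model: nn.Module,
              radar_src: torch.Tensor,
              radar_tgt: torch.Tensor) -> torch.Tensor:
    """At test time only two consecutive radar point clouds are needed."""
    return model(radar_src, radar_tgt)
```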
- Contributions
  - 1. the first work on 4D radar scene flow learning using cross-modal supervision
  - 2. a multi-task model architecture and dedicated loss functions for this cross-modal learning problem
  - 3. SOTA performance, with demonstrated effectiveness in downstream tasks
2 Related Work
- Scene flow
  - Scene flow was first defined as the 3D uplift of optical flow
  - Traditional approaches estimate scene flow from RGB or RGB-D images
    ✅ based on prior-knowledge assumptions, or by training deep networks in a supervised or unsupervised way
  - Some methods directly infer point-wise scene flow from sparse 3D point clouds
    ✅ these methods may rely on online optimization
    ✅ DL-based methods are now dominant for point cloud-based scene flow estimation
- Deep scene flow on point clouds
  - Current SOTA methods leverage large amounts of data for training (supervised)
    ✅ fully-supervised training with GT flow: labor-intensive and costly scene flow annotation
    ✅ training on simulated datasets: may result in poor generalization
  - Self-supervised learning frameworks avoid the labor of annotation and the pitfalls of synthetic data
    ✅ exploit supervision signals from the input data itself
    ❌ performance is limited: no real labels are used to supervise the models
  - A trade-off between annotation effort and performance
    ✅ combine ego-motion with manually annotated background segmentation labels (ego-motion yields a free rigid-flow cue for static points; see the sketch below)
    ✅ ego-motion is easily obtained from odometry sensors
    ❌ however, the segmentation labels are still manually annotated and expensive
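To see why ego-motion is such a cheap cue: for every static point, the scene flow induced purely by ego-motion is fixed by the relative ego pose between the two frames, so odometry readings become free pseudo-labels once static points are identified. A minimal sketch, assuming a source-to-target pose convention (the function name is hypothetical):

```python
import numpy as np

def ego_motion_flow(coords_src: np.ndarray, R: np.ndarray, t: np.ndarray) -> np.ndarray:
    """Rigid flow that ego-motion induces on static points.

    coords_src: (N, 3) point coordinates in the source frame.
    R (3, 3) and t (3,): relative ego pose from the source frame to the
    target frame, e.g. read off a GPS/INS odometry sensor.
    Returns (N, 3) flow vectors f_i = R @ c_i + t - c_i.
    """
    return coords_src @ R.T + t - coords_src
```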
- Radar scene flow
  - Previous works cannot be directly extended to the sparse and noisy radar point clouds
    ❌ they mostly estimate scene flow on dense point clouds captured by LiDAR or rendered from stereo images
  - A recent work proposes a self-supervised pipeline for radar scene flow estimation
    ❌ however, the lack of real supervision signals limits its scene flow estimation performance
- This work's proposal
  - Solution to the supervision problem:
    ✅ retrieve supervision signals from co-located sensors automatically
    ✅ without resorting to any human intervention during training
  - Other modalities are required only during the training stage, not during inference.
3 Method
3.1 Problem Definition
Defines the task of scene flow estimation.
- Scene flow estimation
  - aims to solve for a motion field that describes the non-rigid transformations induced by both the motion of the ego-vehicle and the dynamic objects in the scene
- The inputs of point cloud-based scene flow: two consecutive point clouds
  - the source one $\mathbf{P}^s=\left\{\mathbf{p}_i^s=\left\{\mathbf{c}_i^s, \mathbf{x}_i^s\right\}\right\}_{i=1}^N$
  - the target one $\mathbf{P}^t=\left\{\mathbf{p}_i^t=\left\{\mathbf{c}_i^t, \mathbf{x}_i^t\right\}\right\}_{i=1}^M$
  - $\mathbf{c}_i^s, \mathbf{c}_i^t \in \mathbb{R}^3$: the 3D coordinates of each point
  - $\mathbf{x}_i^s, \mathbf{x}_i^t \in \mathbb{R}^C$: the raw features of each point
- The output: point-wise 3D motion vectors $\mathbf{F}$
  - $\mathbf{F}=\left\{\mathbf{f}_i \in \mathbb{R}^3\right\}_{i=1}^N$ aligns each point in $\mathbf{P}^s$ with its corresponding position $\mathbf{c}_i^{\prime}=\mathbf{c}_i^s+\mathbf{f}_i$ in the target frame
- Note: $\mathbf{P}^s$ and $\mathbf{P}^t$ are not necessarily the same size ($N \neq M$ in general), so there is no exact point-to-point correspondence between them (see the shape-level sketch below).
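To make the interface concrete, here is a shape-level sketch under the definitions above; the toy network body is a hypothetical stand-in, not CMFlow's actual architecture:

```python
import torch
import torch.nn as nn

class ToySceneFlowNet(nn.Module):
    """Maps (P^s, P^t) to per-source-point flow F of shape (N, 3)."""
    def __init__(self, feat_dim: int, hidden: int = 64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(2 * (3 + feat_dim), hidden), nn.ReLU(),
            nn.Linear(hidden, 3),
        )

    def forward(self, c_s, x_s, c_t, x_t):
        # c_s: (N, 3), x_s: (N, C); c_t: (M, 3), x_t: (M, C); N != M is allowed.
        tgt_global = torch.cat([c_t, x_t], dim=-1).mean(dim=0)   # (3 + C,) summary of P^t
        src = torch.cat([c_s, x_s], dim=-1)                      # (N, 3 + C)
        fused = torch.cat([src, tgt_global.expand(src.size(0), -1)], dim=-1)
        return self.mlp(fused)                                   # F: (N, 3)

# Usage: warp the source frame with the predicted flow, c'_i = c_i^s + f_i.
c_s, x_s = torch.randn(128, 3), torch.randn(128, 4)   # N = 128 source radar points
c_t, x_t = torch.randn(96, 3), torch.randn(96, 4)     # M = 96 target radar points
flow = ToySceneFlowNet(feat_dim=4)(c_s, x_s, c_t, x_t)
c_warped = c_s + flow
```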