mmWave Perception Paper Reading Notes: CVPR 2023, Hidden Gems: 4D Radar Scene Flow Learning Using Cross-Modal Supervision

Original notes link: https://mp.weixin.qq.com/s?__biz=Mzg4MjgxMjgyMg==&mid=2247486705&idx=1&sn=2e9c8be25d079fcf9dca9a9b67a90651&chksm=cf51be08f826371edc3226955f1acff0bb560f5256611b43ea91f9db6a579d5313b97ec598e5#rd
↑ Open the link above to read the full article

CVPR 2023 | Hidden Gems: 4D Radar Scene Flow Learning Using Cross-Modal Supervision


0 Abstract

  • This paper

    • proposes a novel approach to 4D radar-based scene flow estimation via cross-modal learning
  • Motivation

    • Co-located sensing redundancy in modern autonomous vehicles

      ✅ provides various forms of supervision cues for radar scene flow estimation.

  • Methods

    • presents a multi-task model architecture for the identified cross-modal learning problem
    • proposes specific loss functions to engage scene flow estimation using multiple cross-modal constraints for effective model training
  • Experiments

    • SOTA performance

    • proves effective in inferring more accurate 4D radar scene flow using cross-modal supervised learning

    • shown to be useful for two subtasks

      motion segmentation

      ego-motion estimation

Code: https://github.com/Toytiny/CMFlow


1 Introduction

  • Scene flow estimation

    • Definition of scene flow estimation

      Obtaining a 3D motion vector field of static and dynamic environments relative to an ego-agent

    • Importance of scene flow in the context of self-driving

      ✅ Provides motion cues for various tasks

  • Current scene flow estimation approaches

    • rely on fully-supervised or weakly-supervised learning, or on self-supervised signals

    • Challenges of these approaches

      ✅ labor-intensive process of scene flow annotations for supervised learning

      ✅ the often subpar performance of self-supervised learning methods

  • Specific challenges in 4D radar scene flow learning

    • Rise of 4D automotive radars

      ✅ resistant to adverse conditions and have the ability to measure object velocity.

    • Complications with 4D radar point clouds

      ❌ sparsity and noise in the point clouds which complicate the scene flow annotation process for supervised learning

  • Solution: Hidden Gems

    • exploiting cross-modal supervision signals in autonomous vehicles

      ✅ Modern autonomous vehicles are equipped with multiple sensors that provide complementary and redundant perception results.

    • The authors aim to use this co-located perception redundancy to provide multiple supervision cues to improve radar scene flow learning.

      🚩 The primary research question: how to retrieve and apply cross-modal supervision signals from co-located sensors on a vehicle to improve radar scene flow learning

      ✅ exploiting useful supervision signals from odometer (GPS/INS), LiDAR, and RGB camera

      ✅ Train: multi-modal data; Test: only radar data (see the sketch after this list)

  • Contributions

    • 1 the first work on 4D radar scene flow learning using cross-modal supervision
    • 2 propose a multi-task model architecture & loss functions
    • 3 demonstrate the SOTA performance and its effectiveness in downstream tasks
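
The train-time/test-time asymmetry above (multi-modal supervision, radar-only inference) can be made concrete with a minimal sketch. The code below is not the authors' CMFlow implementation; it is a toy PyTorch-style illustration under assumed shapes, where `RadarSceneFlowNet`, `training_step`, and `inference` are hypothetical names, and the random pseudo flow labels stand in for supervision signals that would be retrieved from odometry/LiDAR/camera.

```python
import torch
import torch.nn as nn

class RadarSceneFlowNet(nn.Module):
    """Toy per-point regressor; the real CMFlow model is far more elaborate."""
    def __init__(self, feat_dim=3):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(2 * (3 + feat_dim), 64), nn.ReLU(), nn.Linear(64, 3)
        )

    def forward(self, src, tgt):
        # src, tgt: (N, 3 + feat_dim) radar points; naive truncation-based
        # pairing, purely for illustration.
        n = min(src.shape[0], tgt.shape[0])
        return self.mlp(torch.cat([src[:n], tgt[:n]], dim=-1))  # (n, 3) flow

def training_step(model, radar_src, radar_tgt, pseudo_flow, optimizer):
    """Training: only radar points enter the model, but the loss is computed
    against pseudo flow labels derived from co-located sensors."""
    optimizer.zero_grad()
    pred = model(radar_src, radar_tgt)
    loss = nn.functional.smooth_l1_loss(pred, pseudo_flow[: pred.shape[0]])
    loss.backward()
    optimizer.step()
    return loss.item()

@torch.no_grad()
def inference(model, radar_src, radar_tgt):
    """Inference: radar data only; no other modality is required."""
    return model(radar_src, radar_tgt)

# Usage with random tensors standing in for real data:
model = RadarSceneFlowNet()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
src, tgt = torch.randn(128, 6), torch.randn(120, 6)
pseudo_flow = torch.randn(128, 3)   # would come from cross-modal cues in practice
training_step(model, src, tgt, pseudo_flow, opt)
flow = inference(model, src, tgt)   # (120, 3)
```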

2 Related Work

  • Scene flow
    • Scene flow was first defined as a 3D uplift of optical flow

    • Traditional approaches to scene flow from either RGB or RGB-D images:

      ✅ based on prior knowledge assumptions or by training deep networks in a supervised or unsupervised way

    • Some methods directly infer point-wise scene flow from 3D sparse point clouds

      ✅ These methods may rely on online optimization

      DL-based methods have been dominant for point cloud-based scene flow estimation

  • Deep scene flow on point clouds
    • Current SOTA methods: leveraging large amounts of data for training (Supervised)

      fully-supervised manner with GT flow: labor-intensive and costly scene flow annotations

      simulated dataset for training: may result in poor generalization

    • Self-supervised learning frameworks to avoid the labor and pitfalls of synthetic data.

      ✅ Exploit supervision signals from the input data

      performance is limited: no real labels are used to supervise their models

    • A trade-off between annotation effort and performance

      ✅ Combine the ego-motion and manually annotated background segmentation labels

      ✅ ego-motion is easily assessed from odometry sensors

      ❌ However, the segmentation labels are still manually annotated and expensive

  • Radar scene flow
    • Previous works cannot be directly extended to the sparse and noisy radar point clouds

      ❌ they mostly estimate scene flow on dense point clouds captured by LiDAR or rendered from stereo images

    • recent work proposes a self-supervised pipeline for radar scene flow estimation.

      ❌ However, the lack of real supervision signals limits its scene flow estimation performance

  • This paper's proposal
    • Solution of the supervision problem:

      retrieve supervision signals from co-located sensors in an automatic manner

      without resorting to any human intervention during training

    • The method only requires other modalities during the training stage, not during inference.

3 Method

3.1 Problem Definition

Defines the task of scene flow estimation.

  • Scene flow estimation:

    • aims to solve for a motion field that describes the non-rigid transformations induced by both the motion of the ego-vehicle and the dynamic objects in the scene
  • The inputs of point cloud-based scene flow: two consecutive point clouds

    • the source one $\mathbf{P}^s=\left\{\mathbf{p}_i^s=\left\{\mathbf{c}_i^s, \mathbf{x}_i^s\right\}\right\}_{i=1}^N$
    • the target one $\mathbf{P}^t=\left\{\mathbf{p}_i^t=\left\{\mathbf{c}_i^t, \mathbf{x}_i^t\right\}\right\}_{i=1}^M$
    • $\mathbf{c}_i^s, \mathbf{c}_i^t \in \mathbb{R}^3$: 3D coordinates of each point
    • $\mathbf{x}_i^s, \mathbf{x}_i^t \in \mathbb{R}^C$: raw point features of each point
  • The outputs: point-wise 3D motion vectors $\mathbf{F}$

    • $\mathbf{F}=\left\{\mathbf{f}_i \in \mathbb{R}^3\right\}_{i=1}^N$ aligns each point in $\mathbf{P}^s$ to its corresponding position $\mathbf{c}_i^{\prime}=\mathbf{c}_i^s+\mathbf{f}_i$ in the target frame

Note: $\mathbf{P}^s$ and $\mathbf{P}^t$
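
As a concrete illustration of these definitions, the following NumPy sketch (with made-up sizes, not taken from the paper's dataset) builds $\mathbf{P}^s$, $\mathbf{P}^t$, and a flow field $\mathbf{F}$, then warps the source coordinates by $\mathbf{c}_i^{\prime}=\mathbf{c}_i^s+\mathbf{f}_i$.

```python
import numpy as np

# Toy dimensions for illustration only.
N, M, C = 128, 120, 2   # source points, target points, raw feature channels

# Source / target radar point clouds: 3D coordinates plus raw point features.
coords_src = np.random.randn(N, 3).astype(np.float32)   # c_i^s
feats_src  = np.random.randn(N, C).astype(np.float32)   # x_i^s
coords_tgt = np.random.randn(M, 3).astype(np.float32)   # c_i^t
feats_tgt  = np.random.randn(M, C).astype(np.float32)   # x_i^t

# Scene flow F: one 3D motion vector per source point.
flow = np.zeros((N, 3), dtype=np.float32)                # f_i

# Warping: each source point is moved to its estimated position in the
# target frame, c_i' = c_i^s + f_i.
coords_warped = coords_src + flow
assert coords_warped.shape == (N, 3)
```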
