Evaluating on the KITTI Depth Completion Dataset

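Preface

This post records my trial run of the KITTI depth completion evaluation tool, together with a brief explanation of its output.

Running it on Ubuntu 20.04, I hit two pitfalls (reportedly neither occurs on Ubuntu 18.04); the fixes are listed as well.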

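Dataset

The KITTI Depth Completion Evaluation dataset consists of the following four parts:
[Figure: depth_completion]
The first is the annotated dataset, which is needed only for training a deep learning model.

The second is the velodyne dataset, which is not needed if you only want to run the evaluation.

The third contains the selected validation set val_selection_cropped and the test sets test_depth_prediction_anonymous and test_depth_completion_anonymous. The validation set contains image (RGB images), velodyne_raw (velodyne maps), intrinsics (camera intrinsics?), and groundtruth_depth (ground truth depth maps); the test sets contain only image, velodyne_raw, and intrinsics.

The fourth is the evaluation tool.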

Evaluation metrics

The following is quoted from the official website:

Our evaluation table ranks all methods according to the root mean squared error (RMSE) of the inverse depth maps.
However, we also provide some other metrics:
iRMSE:  Root mean squared error of the inverse depth [1/km]
iMAE:    Mean absolute error of the inverse depth [1/km]
RMSE:   Root mean squared error [mm]
MAE:     Mean absolute error [mm]

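It states that methods are ranked by the RMSE of the inverse depth maps.

As for why inverse depth maps (maps made up of the reciprocals of the depths) are used, see "What is inverse depth (in odometry) and why would I use it?". The reasoning is as follows: scenes often contain objects infinitely far from the camera, whose depth is infinite. Infinite values in a depth map hurt numerical stability; taking the reciprocal of the distance solves this, because every such infinite value becomes 0.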

Environment

The environment is an Ubuntu 20.04 docker container. Install the required packages with the following commands:

apt-get update -y
apt install -y build-essential wget vim libtool zlib1g-dev

Install libpng 1.2.54:

wget http://archive.ubuntu.com/ubuntu/pool/main/libp/libpng/libpng_1.2.54.orig.tar.xz
tar xvf libpng_1.2.54.orig.tar.xz 
cd libpng-1.2.54
./autogen.sh
./configure
make -j8 
sudo make install
sudo ldconfig

Install png++ 0.2.9:

wget http://download.savannah.nongnu.org/releases/pngpp/png++-0.2.9.tar.gz
tar -xzvf png++-0.2.9.tar.gz -C /usr/src/
cd /usr/src/png++-0.2.9/
make
make test
sudo make install

If make fails with the following error:

make -C example
make[1]: Entering directory '/d/depth_devkit/devkit/cpp/png++-0.2.9_bak/example'
g++     pixel_generator.cpp   -o pixel_generator
pixel_generator.cpp:35:10: fatal error: png.hpp: No such file or directory
   35 | #include <png.hpp>
      |          ^~~~~~~~~
compilation terminated.
make[1]: *** [<builtin>: pixel_generator] Error 1
make[1]: Leaving directory '/d/depth_devkit/devkit/cpp/png++-0.2.9_bak/example'
make: *** [Makefile:119: examples] Error 2

manually edit example/pixel_generator.cpp and change line 35 from:

#include <png.hpp>

to:

#include "png.hpp"

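Build

Modifying io_depth.h

Compiling and running the original code as-is produces one compile error and one runtime error. The required changes are as follows:

Argument types of min

The compile error before the change: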

io_depth.h:244:78: error: no matching function for call to 'min(float, double)'
  244 |             float d_err = std::min(fabs(getDepth(u,v)-D_gt.getDepth(u,v)),5.0)/5.0;
      |

Change line 244 from:

float d_err = std::min(fabs(getDepth(u,v)-D_gt.getDepth(u,v)),5.0)/5.0;

to:

float d_err = std::min(fabs(getDepth(u,v)-D_gt.getDepth(u,v)),5.0f)/5.0f;
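The error arises because `std::min` is a template whose two arguments must have the same type `T`; with one `float` and one `double` argument, `T` cannot be deduced. The sketch below uses hypothetical functions `clampedError` and `clampedErrorExplicit` (stand-ins, not devkit code) to show two equivalent fixes.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical stand-in for the expression in io_depth.h:
// absolute error clamped to 5 and normalized to [0, 1].
inline float clampedError(float depth, float depth_gt) {
  // std::min(std::fabs(depth - depth_gt), 5.0) would not compile:
  // the arguments are float and double, so T cannot be deduced.
  // Fix 1: make both arguments float by adding the 'f' suffix.
  return std::min(std::fabs(depth - depth_gt), 5.0f) / 5.0f;
}

inline float clampedErrorExplicit(float depth, float depth_gt) {
  // Fix 2: name the template argument so the double literal is converted.
  return std::min<float>(std::fabs(depth - depth_gt), 5.0) / 5.0f;
}
```

Both variants give the same result; the suffix form is the smaller change to the original line.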

Function return type

The runtime error before the change:

Starting depth evaluation..
Found 1000 groundtruth and 1000 prediction files!
Processing: 2011_09_26_drive_0002_sync_groundtruth_depth_0000000005_image_02
Segmentation fault

Change line 109 from:

  inline bool setInvalid (const int32_t u,const int32_t v) {
    data_[v*width_+u] = -1;
  }

to:

  inline void setInvalid (const int32_t u,const int32_t v) {
    data_[v*width_+u] = -1;
  }
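Letting control reach the end of a function with a non-`void` return type without a `return` statement is undefined behavior in C++, which is a plausible explanation for the segmentation fault with newer compilers (GCC reports it under `-Wreturn-type` when compiling with `-Wall`). A minimal sketch of the corrected pattern follows, using a hypothetical `TinyDepthImage` struct rather than the devkit's class, and assuming that negative values mark invalid pixels.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical miniature of a depth image; not the devkit's DepthImage.
struct TinyDepthImage {
  std::int32_t width_;
  std::int32_t height_;
  std::vector<float> data_;

  TinyDepthImage(std::int32_t width, std::int32_t height)
      : width_(width), height_(height), data_(width * height, 0.0f) {}

  // Nothing is returned, so the return type is void.
  inline void setInvalid(const std::int32_t u, const std::int32_t v) {
    data_[v * width_ + u] = -1;
  }

  // Assumption: negative values mark invalid pixels.
  inline bool isValid(const std::int32_t u, const std::int32_t v) const {
    return data_[v * width_ + u] >= 0;
  }
};
```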

Build and run

cd depth_devkit/devkit/cpp
./make.sh
./evaluate_depth ../../../data_depth_selection/depth_selection/val_selection_cropped/groundtruth_depth/ ../../../data_depth_selection/depth_selection/val_selection_cropped/velodyne_raw

ERROR: Couldn't generate/store output statistics!

If the following error appears during execution, the cause may be insufficient permissions:

ERROR: Couldn't generate/store output statistics!
An error occured while processing your results.
Please make sure that the data in your result directory has the right format (compare to prediction/sparseConv_val)

Prefixing ./evaluate_depth with sudo resolves it.

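Results

Running evaluate_depth generates five directories (depth_ipol, depth_orig, errors_img, errors_out, and image_0) and one plain-text file, stats_depth.txt. Each of the five directories holds data for only the first 20 images, whereas the text file is computed from all images. Their meanings are listed below.

image_0

The original ground truth images.

[Figure: image_0]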

depth_orig

The predicted depth maps.

Do not be surprised to find images with groundtruth in their names in this directory, because the code uses:

std::string prefix = std::string(namelist_gt[i]->d_name).substr(0, lastindex);

as the id of each image pair, and then uses that id directly as the file name when saving:

D_orig.writeColor(prediction_dir + "/depth_orig/" + prefix + ".png", max_depth);
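The following sketch shows how such a prefix could be derived from a ground truth file name. It assumes that `lastindex` is the position of the last `.` in the name; the excerpt above does not show how it is computed, so this is a guess. The helper name `stripExtension` is hypothetical.

```cpp
#include <string>

// Hypothetical helper: drop the extension from a file name.
// Assumes lastindex is the position of the last '.'; if there is no '.',
// find_last_of returns npos and substr returns the whole name.
inline std::string stripExtension(const std::string& file_name) {
  const std::string::size_type lastindex = file_name.find_last_of('.');
  return file_name.substr(0, lastindex);
}
```

Under that assumption, a ground truth file ending in `_image_02.png` yields a prefix ending in `_image_02`, and the prediction output inherits the `groundtruth` part of the name.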

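From near to far, the colors run (black,) blue, red, purple, green, cyan, yellow, white, where black marks an invalid distance. Visualized:

[Figure: writeFalseColors]

The image above was generated by the following code.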

void writeFalseColors (const std::string file_name, float max_val) {

// color map
float map[8][4] = {{0,0,0,114},{0,0,1,185},{1,0,0,114},{1,0,1,174},
                   {0,1,0,114},{0,1,1,185},{1,1,0,114},{1,1,1,0}};

// create color png image
png::image< png::rgb_pixel > image(width_,height_);

// for all pixels do
for (int32_t v=0; v<height_; v++) {
  for (int32_t u=0; u<width_; u++) {
    float   wd7 = width_/7.0f;
    int32_t i = static_cast<int32_t>((static_cast<float>(u)/width_)*7.0f);
    float w = 1 - (u-i*wd7)/wd7;
    uint8_t r = (uint8_t)((w*map[i][0]+(1.0-w)*map[i+1][0]) * 255.0);
    uint8_t g = (uint8_t)((w*map[i][1]+(1.0-w)*map[i+1][1]) * 255.0);
    uint8_t b = (uint8_t)((w*map[i][2]+(1.0-w)*map[i+1][2]) * 255.0);

    // set pixel
    image.set_pixel(u,v,png::rgb_pixel(r,g,b));
  }
}

// write to file
image.write(file_name);
}
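The interpolation in the snippet above can be isolated into a function that maps a normalized value in [0, 1] to an RGB color. This is a simplified sketch with uniform segment widths, mirroring the visualization code above rather than the devkit's actual depth coloring; the names `Rgb` and `falseColor` are illustrative, and the fourth column of the color map, which the snippet above does not use, is omitted.

```cpp
#include <algorithm>
#include <cstdint>

struct Rgb {
  std::uint8_t r, g, b;
};

// Map t in [0, 1] to a color by linear interpolation between 8 anchors:
// black, blue, red, magenta, green, cyan, yellow, white.
inline Rgb falseColor(float t) {
  static const float kMap[8][3] = {{0, 0, 0}, {0, 0, 1}, {1, 0, 0}, {1, 0, 1},
                                   {0, 1, 0}, {0, 1, 1}, {1, 1, 0}, {1, 1, 1}};
  t = std::min(std::max(t, 0.0f), 1.0f);             // clamp to [0, 1]
  const float pos = t * 7.0f;                        // position along 7 segments
  const int i = std::min(static_cast<int>(pos), 6);  // segment index, 0..6
  const float w = 1.0f - (pos - static_cast<float>(i));  // weight of anchor i
  Rgb c;
  c.r = static_cast<std::uint8_t>((w * kMap[i][0] + (1.0f - w) * kMap[i + 1][0]) * 255.0f);
  c.g = static_cast<std::uint8_t>((w * kMap[i][1] + (1.0f - w) * kMap[i + 1][1]) * 255.0f);
  c.b = static_cast<std::uint8_t>((w * kMap[i][2] + (1.0f - w) * kMap[i + 1][2]) * 255.0f);
  return c;
}
```

A value of 0 maps to black and a value of 1 maps to white, with the remaining anchors spaced evenly in between.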

An example depth_orig image:
[Figure: depth_orig]

For comparison, the photo taken by the RGB camera:
[Figure: image]

depth_ipol

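The depth maps obtained by interpolating the predicted depth maps. Picture two pixels that have depth values with many pixels without depth between them: those pixels are filled with the smaller of the two depths on either side. Pixels without depth along the top, bottom, left, and right borders are extrapolated from the nearest pixel that has a depth.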
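The rule can be sketched for a single image row as follows. This is a simplified illustration, not the devkit's implementation: the function name `fillRow` is hypothetical, negative values are assumed to mark pixels without depth, and only one row is handled rather than a full 2-D image.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Fill invalid entries (negative values) of one image row in place.
// Gaps between two valid pixels take the smaller of the two depths;
// gaps touching the border copy the nearest valid depth.
// Rows without any valid pixel are left unchanged.
inline void fillRow(std::vector<float>& row) {
  const std::size_t n = row.size();
  std::size_t i = 0;
  while (i < n) {
    if (row[i] >= 0.0f) {
      ++i;
      continue;
    }
    // [i, j) is a maximal run of invalid pixels.
    std::size_t j = i;
    while (j < n && row[j] < 0.0f) ++j;
    const bool has_left = (i > 0);
    const bool has_right = (j < n);
    if (has_left || has_right) {
      float fill;
      if (has_left && has_right) {
        fill = std::min(row[i - 1], row[j]);  // interpolate: smaller depth
      } else if (has_left) {
        fill = row[i - 1];                    // extrapolate to the right border
      } else {
        fill = row[j];                        // extrapolate to the left border
      }
      for (std::size_t k = i; k < j; ++k) row[k] = fill;
    }
    i = j;
  }
}
```

Taking the smaller depth fills gaps with the nearer of the two neighboring surfaces.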

As with depth_orig, the file names are taken from prefix, so having groundtruth in the name is expected:

D_ipol.writeColor(prediction_dir + "/depth_ipol/" + prefix + ".png", max_depth);

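From near to far, the colors run (black,) blue, red, purple, green, cyan, yellow, white, the same as for depth_orig.

An example depth_ipol image:

[Figure: depth_ipol]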

errors_img

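The depth differences converted to colors and saved as images.

With log_colors enabled, the error magnitude from low to high runs dark blue, light blue, yellow, orange, red, dark red.

[Figure: errorImage_log_colors_true]

With log_colors disabled, the lowest error is black and the highest is white.

[Figure: errorImage_log_colors_false]

Calling the following code with log_colors=true and log_colors=false respectively generates the two images above.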

png::image<png::rgb_pixel> errorImage (DepthImage &D_gt,bool log_colors=false) {
for (int32_t u=1; u<width()-1; u++) {
  int32_t i = static_cast<float>(u-1)/(width()-2)*10.0f;
  float d_err = static_cast<float>(u-1)/(width()-3);
  std::cout << "u: " << u << ", i: " << i << ", d_err: " << d_err << std::endl;
}
png::image<png::rgb_pixel> image(width(),height());
for (int32_t v=1; v<height()-1; v++) {
  for (int32_t u=1; u<width()-1; u++) {
    if (true) {
      png::rgb_pixel val;
      if (log_colors) {
        int32_t i = static_cast<float>(u-1)/(width()-2)*10.0f; //[0,10)
        val.red   = (uint8_t)LC[i][2];
        val.green = (uint8_t)LC[i][3];
        val.blue  = (uint8_t)LC[i][4];
      } else {
        float d_err = static_cast<float>(u-1)/(width()-3); //[0,1]
        val.red   = (uint8_t)(d_err*255.0);
        val.green = (uint8_t)(d_err*255.0);
        val.blue  = (uint8_t)(d_err*255.0);
      }
      
      image.set_pixel(u,v,val);
    }
  }
}
return image;
}

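An errors_img image generated with log_colors enabled:
[Figure: errors_img_log_colors_true]

An errors_img image generated with log_colors disabled:

[Figure: errors_img_log_colors_false]

Note that the top of both images is black. This is because pixels without ground truth are skipped when errors_img is generated: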

png::image<png::rgb_pixel> errorImage (DepthImage &D_gt,bool log_colors=false) {
  png::image<png::rgb_pixel> image(width(),height());
  for (int32_t v=1; v<height()-1; v++) {
    for (int32_t u=1; u<width()-1; u++) {
      if (D_gt.isValid(u,v)) {
        //...
      }
    }
  }
  return image;
}

To check whether a pixel has a ground truth value, look at the corresponding image in image_0.
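For reference, KITTI depth maps are stored as 16-bit PNG files in which a raw value of 0 marks a pixel without a measurement, and any other value divided by 256 gives the depth in meters. A minimal sketch of that decoding follows. The helper names `decodeDepth` and `isValidDepth` are hypothetical, and representing invalid pixels as -1 is an assumption chosen to match `setInvalid` in io_depth.h.

```cpp
#include <cstdint>

// Hypothetical helper: convert a raw 16-bit PNG value to meters.
// Raw 0 means "no measurement" and is mapped to -1 (invalid).
inline float decodeDepth(std::uint16_t raw) {
  if (raw == 0) return -1.0f;
  return static_cast<float>(raw) / 256.0f;
}

// Assumption: negative depths mark invalid pixels.
inline bool isValidDepth(float depth_m) {
  return depth_m >= 0.0f;
}
```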

errors_out

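The errors between the interpolated depth maps and the ground truth, comprising the following nine metrics.

mae

The mean of the absolute depth differences.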

for (uint32_t u = 0; u < width; u++) {
  for (uint32_t v = 0; v < height; v++) {
    if (D_gt.isValid(u, v)) {
      const float d_err = fabs(depth_gt_m - depth_ipol_m);
      errors[0] += d_err;
    }
  }
}
errors[0] /= (float)num_pixels;

rmse

The root mean square of the depth differences.

for (uint32_t u = 0; u < width; u++) {
  for (uint32_t v = 0; v < height; v++) {
    if (D_gt.isValid(u, v)) {
        const float d_err = fabs(depth_gt_m - depth_ipol_m);
        const float d_err_squared = d_err * d_err;
        errors[1] += d_err_squared;
    }
  }
}

errors[1] /= (float)num_pixels;
errors[1] = sqrt(errors[1]);

inverse mae

The mean of the absolute differences between the inverse depths.

for (uint32_t u = 0; u < width; u++) {
  for (uint32_t v = 0; v < height; v++) {
    if (D_gt.isValid(u, v)) {
      const float d_err_inv = fabs( 1.0 / depth_gt_m - 1.0 / depth_ipol_m);
      errors[2] += d_err_inv;
    }
  }
}

errors[2] /= (float)num_pixels;

inverse rmse

The root mean square of the differences between the inverse depths.

for (uint32_t u = 0; u < width; u++) {
  for (uint32_t v = 0; v < height; v++) {
    if (D_gt.isValid(u, v)) {
      const float d_err_inv = fabs( 1.0 / depth_gt_m - 1.0 / depth_ipol_m);
      const float d_err_inv_squared = d_err_inv * d_err_inv;
      errors[3] += d_err_inv_squared;
    }
  }
}

errors[3] /= (float)num_pixels;
errors[3] = sqrt(errors[3]);

log mae

The mean of the absolute differences between the log depths.

for (uint32_t u = 0; u < width; u++) {
  for (uint32_t v = 0; v < height; v++) {
    if (D_gt.isValid(u, v)) {
      const float d_err_log = fabs(log(depth_gt_m) - log(depth_ipol_m));
      errors[4] += d_err_log;
    }
  }
}

errors[4] /= (float)num_pixels;

log rmse

The root mean square of the differences between the log depths.

for (uint32_t u = 0; u < width; u++) {
  for (uint32_t v = 0; v < height; v++) {
    if (D_gt.isValid(u, v)) {
      const float d_err_log = fabs(log(depth_gt_m) - log(depth_ipol_m));
      const float d_err_log_squared = d_err_log * d_err_log;
      errors[5] += d_err_log_squared;
    }
  }
}

const float normalizedSquaredLog = errors[5] / (float)num_pixels;
errors[5] = sqrt(normalizedSquaredLog);

scale invariant log

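First take the differences between the log depths and average their squares; call this normalizedSquaredLog.

Then sum the signed differences between the log depths; call this logSum. The result is the square root of normalizedSquaredLog minus the square of the mean log difference (logSum divided by the number of pixels).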

for (uint32_t u = 0; u < width; u++) {
  for (uint32_t v = 0; v < height; v++) {
    if (D_gt.isValid(u, v)) {
      const float d_err_log = fabs(log(depth_gt_m) - log(depth_ipol_m));
      const float d_err_log_squared = d_err_log * d_err_log;
      errors[5] += d_err_log_squared;
      logSum += (log(depth_gt_m) - log(depth_ipol_m));
    }
  }
}

const float normalizedSquaredLog = errors[5] / (float)num_pixels;
errors[6] = sqrt(normalizedSquaredLog - (logSum*logSum / ((float)num_pixels*(float)num_pixels)));
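Written as a formula, the code above computes the following, assuming `num_pixels` counts the valid pixels. The symbols $d_i$ and $n$ are introduced here for illustration: $d_i = \log(\mathrm{depth\_gt}_i) - \log(\mathrm{depth\_ipol}_i)$ for the $i$-th valid pixel, and $n$ is `num_pixels`.

```latex
\[
\mathrm{SILog} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} d_i^{2} - \left(\frac{1}{n}\sum_{i=1}^{n} d_i\right)^{2}}
\]
```

This is the standard deviation of the log differences, so a prediction that is off by a constant scale factor (a constant $d_i$) scores 0.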

abs relative

The relative error is the absolute depth difference divided by the ground truth depth; the metric is its mean.

for (uint32_t u = 0; u < width; u++) {
  for (uint32_t v = 0; v < height; v++) {
    if (D_gt.isValid(u, v)) {
      const float d_err = fabs(depth_gt_m - depth_ipol_m);
      errors[7] += d_err/depth_gt_m;
    }
  }
}

errors[7] /= (float)num_pixels;

squared relative

The relative error is the absolute depth difference divided by the ground truth depth; the metric is the mean of its square.

for (uint32_t u = 0; u < width; u++) {
  for (uint32_t v = 0; v < height; v++) {
    if (D_gt.isValid(u, v)) {
      const float d_err = fabs(depth_gt_m - depth_ipol_m);
      const float d_err_squared = d_err * d_err;
      errors[8] += d_err_squared/(depth_gt_m*depth_gt_m);
    }
  }
}

errors[8] /= (float)num_pixels;
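The nine excerpts above can be combined into one self-contained function. This is a sketch under stated assumptions rather than the devkit's code: the name `depthErrors` is hypothetical, depths are in meters, a negative ground truth value means invalid, predictions are strictly positive, both inputs have the same length, and `double` is used instead of `float`.

```cpp
#include <algorithm>
#include <array>
#include <cmath>
#include <cstddef>
#include <vector>

// Index order follows the errors[] array in the excerpts above:
// 0 mae, 1 rmse, 2 inverse mae, 3 inverse rmse, 4 log mae, 5 log rmse,
// 6 scale invariant log, 7 abs relative, 8 squared relative.
inline std::array<double, 9> depthErrors(const std::vector<double>& gt,
                                         const std::vector<double>& pred) {
  std::array<double, 9> e{};
  double log_sum = 0.0;
  std::size_t n = 0;
  for (std::size_t k = 0; k < gt.size(); ++k) {
    if (gt[k] < 0.0) continue;  // skip pixels without ground truth
    const double d = std::fabs(gt[k] - pred[k]);
    const double d_inv = std::fabs(1.0 / gt[k] - 1.0 / pred[k]);
    const double d_log = std::log(gt[k]) - std::log(pred[k]);
    e[0] += d;
    e[1] += d * d;
    e[2] += d_inv;
    e[3] += d_inv * d_inv;
    e[4] += std::fabs(d_log);
    e[5] += d_log * d_log;
    log_sum += d_log;
    e[7] += d / gt[k];
    e[8] += (d * d) / (gt[k] * gt[k]);
    ++n;
  }
  if (n == 0) return e;  // no valid pixels: all zeros
  const double inv_n = 1.0 / static_cast<double>(n);
  for (double& v : e) v *= inv_n;  // sums become means
  const double mean_log = log_sum * inv_n;
  // e[5] still holds the mean squared log difference at this point.
  // The clamp to 0 guards against tiny negative values from rounding.
  e[6] = std::sqrt(std::max(e[5] - mean_log * mean_log, 0.0));
  e[1] = std::sqrt(e[1]);
  e[3] = std::sqrt(e[3]);
  e[5] = std::sqrt(e[5]);
  return e;
}
```

Note that the scale invariant log must be computed before the square root is applied to the log rmse, because both start from the same mean of squared log differences.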

stats_depth.txt

The mean, minimum, and maximum of each of the 9 metrics computed in errors_out, giving 9*3=27 values.

mean mae: 2.487007 
min  mae: 0.362612 
max  mae: 5.655302 
mean rmse: 6.278716 
min  rmse: 1.000000 
max  rmse: 12.744315 
mean inverse mae: 0.009926 
min  inverse mae: 0.001883 
max  inverse mae: 0.035303 
mean inverse rmse: 0.023858 
min  inverse rmse: 0.003904 
max  inverse rmse: 0.075682 
mean log mae: nan 
min  log mae: 0.029891 
max  log mae: 0.327116 
mean log rmse: -nan 
min  log rmse: 0.095481 
max  log rmse: 0.579777 
mean scale invariant log: -nan 
min  scale invariant log: 0.095078 
max  scale invariant log: 0.547631 
mean abs relative: 0.116020 
min  abs relative: 0.023609 
max  abs relative: 0.260368 
mean squared relative: 0.071436 
min  squared relative: 0.008656 
max  squared relative: 0.457794 

log

Starting depth evaluation..
Found 1000 groundtruth and 1000 prediction files!
Processing: 2011_09_26_drive_0002_sync_groundtruth_depth_0000000005_image_02
Processing: 2011_09_26_drive_0002_sync_groundtruth_depth_0000000008_image_03
Processing: 2011_09_26_drive_0002_sync_groundtruth_depth_0000000011_image_02
Processing: 2011_09_26_drive_0002_sync_groundtruth_depth_0000000014_image_03
Processing: 2011_09_26_drive_0002_sync_groundtruth_depth_0000000017_image_02
...
Done. 
Your evaluation results are:
mean mae: 2.48701
mean rmse: 6.27872
mean inverse mae: 0.00992581
mean inverse rmse: 0.0238582
mean log mae: nan
mean log rmse: -nan
mean scale invariant log: -nan
mean abs relative: 0.11602
mean squared relative: 0.0714363
Your evaluation results are available at:
../../../data_depth_selection/depth_selection/val_selection_cropped/velodyne_raw/stats_depth.txt