Setting Up DeepLab v2

1. Clone the code

Refer to http://blog.csdn.net/Xmo_jiao/article/details/77897109

2. Make the dataset

Refer to https://blog.csdn.net/u014451076/article/details/79700653
Note: When converting the ground truth from RGB to grayscale, keep the white boundary in the original VOC train and val labels. The boundary influences the result (i.e., it gives a higher mIoU).
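
The conversion in the note above can be sketched as follows. This is a minimal illustration, not the script from the referenced post: the function names voc_colormap and rgb_label_to_gray are hypothetical, the palette is the standard PASCAL VOC bit-shift colormap, and it is an assumption that the value 255 is treated as an ignore label during training and evaluation. Any pixel whose color is not one of the 21 class colors, including the white boundary (224, 224, 192), is written as 255 rather than merged into the background.

```python
import numpy as np


def voc_colormap(n=256):
    """Build the standard PASCAL VOC palette (index -> RGB)."""
    cmap = np.zeros((n, 3), dtype=np.uint8)
    for i in range(n):
        c, r, g, b = i, 0, 0, 0
        for j in range(8):
            r |= ((c >> 0) & 1) << (7 - j)
            g |= ((c >> 1) & 1) << (7 - j)
            b |= ((c >> 2) & 1) << (7 - j)
            c >>= 3
        cmap[i] = (r, g, b)
    return cmap


def rgb_label_to_gray(rgb, num_classes=21, boundary=255):
    """Map an H x W x 3 RGB label image to an H x W class-index image.

    Pixels that match none of the class colors (the white boundary among
    them) keep the value `boundary` instead of becoming background.
    """
    cmap = voc_colormap()
    rgb = np.asarray(rgb, dtype=np.uint32)
    packed = (rgb[..., 0] << 16) | (rgb[..., 1] << 8) | rgb[..., 2]
    gray = np.full(packed.shape, boundary, dtype=np.uint8)
    for idx in range(num_classes):
        r, g, b = (int(v) for v in cmap[idx])
        gray[packed == ((r << 16) | (g << 8) | b)] = idx
    return gray
```

If the labels are still the original palette PNGs, reading them in palette mode already yields these indices, so this lookup is only needed after the labels have been expanded to RGB.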

3. Compile caffe-deeplab-v2

We need to compile caffe-deeplab-v2.
The author of DeepLab v2 used cuDNN 4.0, while the version we usually have installed is >= 5.0, so the code must be modified before it will compile.
Refer to http://blog.csdn.net/tianrolin/article/details/71246472

4. Train

Refer to http://blog.csdn.net/Xmo_jiao/article/details/77897109
The basic procedure is:

  • modify some paths in run_pascal.sh
  • modify train_aug.txt and val.txt in the list directory
  • run run_pascal.sh to train
  • run run_pascal.sh to test; the results are saved as .mat files
  • run deeplabv2/matlab/EvalSegResult.m to evaluate
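
Before starting a training run, it can help to confirm that every file named in the edited list files actually exists. The sketch below is a hypothetical helper, not part of DeepLab. It assumes each line of train_aug.txt or val.txt holds one or more whitespace-separated paths relative to the dataset root (for example an image path followed by a label path); adjust it if your lists use a different layout.

```python
import os


def find_missing_files(list_path, data_root):
    """Return (line_number, relative_path) for every listed file not found.

    Assumes whitespace-separated paths relative to `data_root`; a leading
    slash on a listed path is stripped before joining.
    """
    missing = []
    with open(list_path) as f:
        for line_no, line in enumerate(f, start=1):
            for rel in line.split():
                full = os.path.join(data_root, rel.lstrip("/"))
                if not os.path.isfile(full):
                    missing.append((line_no, rel))
    return missing
```

An empty return value means every listed path resolved to a file under the given root.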

5. Evaluate (deeplab-vgg16-MS-LargeFOV)

Modify deeplab-public-ver2/matlab/my_script/EvalSegResults.m to test. One example is below:

SetupEnv;

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% You do not need to change values below
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

if is_server
  if strcmp(dataset, 'voc12')
    VOC_root_folder = '/path/to/data/pascal';
  elseif strcmp(dataset, 'coco')
    VOC_root_folder = '/rmt/data/coco';
  else
    error('Wrong dataset');
  end
else
  if strcmp(dataset, 'voc12')  
    VOC_root_folder = '~/dataset/PASCAL/VOCdevkit';
  elseif strcmp(dataset, 'coco')
    VOC_root_folder = '~/dataset/coco';
  else
    error('Wrong dataset');
  end
end

if has_postprocess
  if learn_crf
    post_folder = sprintf('post_densecrf_W%d_XStd%d_RStd%d_PosW%d_PosXStd%d_ModelType%d_Epoch%d', bi_w, bi_x_std, bi_r_std, pos_w, pos_x_std, model_type, epoch); 
  else
    post_folder = sprintf('post_densecrf_W%d_XStd%d_RStd%d_PosW%d_PosXStd%d', bi_w, bi_x_std, bi_r_std, pos_w, pos_x_std); 
  end
else
  post_folder = 'post_none';
end

%output_mat_folder = fullfile('/path/to/deeplab_v2/voc2012', feature_name, model_name, testset, feature_type);
output_mat_folder = '/path/to/deeplab_v2/voc2012/features/deeplab_largeFOV/val/fc8';

save_root_folder = fullfile('/path/to/deeplab_v2/voc2012', 'res', feature_name, model_name, testset, feature_type, post_folder);

fprintf(1, 'Saving to %s\n', save_root_folder);

if strcmp(dataset, 'voc12')
  seg_res_dir = [save_root_folder '/results/VOC2012/'];
  seg_root = fullfile(VOC_root_folder, 'VOC2012');
  gt_dir   = fullfile(VOC_root_folder, 'VOC2012', 'SegmentationClass');
elseif strcmp(dataset, 'coco')
  seg_res_dir = [save_root_folder '/results/COCO2014/'];
  seg_root = fullfile(VOC_root_folder, '');
  gt_dir   = fullfile(VOC_root_folder, '', 'SegmentationClass');
end

save_result_folder = fullfile(seg_res_dir, 'Segmentation', [id '_' testset '_cls']);

if ~exist(save_result_folder, 'dir')
    mkdir(save_result_folder);
end

if strcmp(dataset, 'voc12')
  VOCopts = GetVOCopts(seg_root, seg_res_dir, trainset, testset, 'VOC2012');
elseif strcmp(dataset, 'coco')
  VOCopts = GetVOCopts(seg_root, seg_res_dir, trainset, testset, '');
end

if is_mat
  % crop the results
  load('pascal_seg_colormap.mat');

  output_dir = dir(fullfile(output_mat_folder, '*.mat'));

  for i = 1 : numel(output_dir)
    if mod(i, 100) == 0
        fprintf(1, 'processing %d (%d)...\n', i, numel(output_dir));
    end

    data = load(fullfile(output_mat_folder, output_dir(i).name));
    % load .mat
    raw_result = data.data;
    raw_result = permute(raw_result, [2 1 3]);

    img_fn = output_dir(i).name(1:end-4);
    img_fn = strrep(img_fn, '_blob_0', '');

    if strcmp(dataset, 'voc12')
      img = imread(fullfile(VOC_root_folder, 'VOC2012', 'JPEGImages', [img_fn, '.jpg']));
    elseif strcmp(dataset, 'coco')
      img = imread(fullfile(VOC_root_folder, 'JPEGImages', [img_fn, '.jpg']));
    end

    % get the original size from the original image so as to clip
    img_row = size(img, 1);
    img_col = size(img, 2);

    % clip
    result = raw_result(1:img_row, 1:img_col, :);

    if ~is_argmax
      [~, result] = max(result, [], 3);
      result = uint8(result) - 1;
    else
      result = uint8(result);
    end

    if debug
        gt = imread(fullfile(gt_dir, [img_fn, '.png']));
        figure(1), 
        subplot(221),imshow(img), title('img');
        subplot(222),imshow(gt, colormap), title('gt');
        subplot(224), imshow(result,colormap), title('predict');
    end

    imwrite(result, colormap, fullfile(save_result_folder, [img_fn, '.png']));
  end
end

% get iou score
if strcmp(testset, 'val')
  [accuracies, avacc, conf, rawcounts] = MyVOCevalseg(VOCopts, id);
else
  fprintf(1, 'This is test set. No evaluation. Just saved as png\n');
end 
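
For readers who prefer to inspect the .mat outputs outside MATLAB, the core of the loop above (swap the first two axes, crop to the original image size, take the argmax over the channel axis) can be sketched in NumPy. The function name scores_to_label_map is hypothetical, and the sketch assumes the score array has already been loaded with the same axis order that the MATLAB script sees. NumPy's argmax is zero-based, so the "- 1" from the MATLAB code is not needed.

```python
import numpy as np


def scores_to_label_map(raw_scores, img_rows, img_cols):
    """Turn a padded score volume into a class-index map.

    raw_scores: array with axes (width, height, channels), mirroring the
    data that the MATLAB script permutes with [2 1 3].
    """
    # Swap the first two axes so the array is (rows, cols, channels).
    scores = np.transpose(raw_scores, (1, 0, 2))
    # Crop away the padding added to reach the network input size.
    scores = scores[:img_rows, :img_cols, :]
    # Pick the highest-scoring class for every pixel (0-based indices).
    return np.argmax(scores, axis=2).astype(np.uint8)
```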

The result of the evaluation script is below (without CRF).

processing 100 (1449)...
processing 200 (1449)...
processing 300 (1449)...
processing 400 (1449)...
processing 500 (1449)...
processing 600 (1449)...
processing 700 (1449)...
processing 800 (1449)...
processing 900 (1449)...
processing 1000 (1449)...
processing 1100 (1449)...
processing 1200 (1449)...
processing 1300 (1449)...
processing 1400 (1449)...
test confusion: 112/1449
test confusion: 237/1449
test confusion: 341/1449
test confusion: 458/1449
test confusion: 576/1449
test confusion: 700/1449
test confusion: 806/1449
test confusion: 912/1449
test confusion: 1022/1449
test confusion: 1127/1449
test confusion: 1236/1449
test confusion: 1348/1449
Percentage of pixels correctly labelled overall: 92.531%
Accuracy for each class (pixel accuracy)
      background: 96.219%
       aeroplane: 93.273%
         bicycle: 81.932%
            bird: 88.940%
            boat: 83.684%
          bottle: 81.487%
             bus: 90.519%
             car: 89.931%
             cat: 93.166%
           chair: 49.704%
             cow: 79.759%
     diningtable: 57.427%
             dog: 86.444%
           horse: 85.098%
       motorbike: 86.893%
          person: 87.983%
     pottedplant: 63.491%
           sheep: 82.238%
            sofa: 52.055%
           train: 85.916%
       tvmonitor: 78.807%
-------------------------
Mean Class Accuracy: 80.713%
Accuracy for each class (intersection/union measure)
      background: 92.067%
       aeroplane: 82.498%
         bicycle: 36.758%
            bird: 79.323%
            boat: 63.057%
          bottle: 69.035%
             bus: 86.545%
             car: 80.722%
             cat: 82.480%
           chair: 34.429%
             cow: 71.885%
     diningtable: 51.089%
             dog: 76.410%
           horse: 71.342%
       motorbike: 74.273%
          person: 79.867%
     pottedplant: 49.074%
           sheep: 75.517%
            sofa: 43.021%
           train: 79.163%
       tvmonitor: 64.390%
-------------------------
Average accuracy: 68.712%

6. Run_densecrf.py

  • first, run make in deeplab_v2/densecrf
    You may run into this error:
collect2: error: ld returned 1 exit status
make[1]: *** [prog_refine_pascal_v4] Error 1
make[1]: Leaving directory `/path/to/deeplab_v2/deeplab-public-ver2/densecrf'
make: *** [all] Error 2

To solve this problem, change line 31 of the Makefile to:

$(CC) refine_pascal_v4/dense_inference.cpp -o prog_refine_pascal_v4     $(CFLAGS) -L. -lDenseCRF -lmatio -lhdf5 -I./util/

After this change, make succeeds.

  • Use deeplab_v2/densecrf/my_script/SaveJpgToPPM.m to convert JPEGImages to PPMImages
  • run deeplab_v2/voc2012/run_densecrf.py to get the post-processed results, which are saved in .bin format. An example of run_densecrf.py is below.
#!/bin/bash 

###########################################
# You can either use this script to generate the DenseCRF post-processed results
# or use the densecrf_layer (wrapper) in Caffe
###########################################
DATASET=voc12
LOAD_MAT_FILE=1

MODEL_NAME=deeplab_largeFOV

TEST_SET=val           #val, test

# the features  folder stores the features computed by the model trained on the train set
# the features2 folder stores the features computed by the model trained on the trainval set
FEATURE_NAME=features #features, features2
FEATURE_TYPE=fc8

# specify the parameters
MAX_ITER=10

Bi_W=4
Bi_X_STD=49
Bi_Y_STD=49
Bi_R_STD=5
Bi_G_STD=5 
Bi_B_STD=5

POS_W=3
POS_X_STD=3
POS_Y_STD=3


#######################################
# MODIFY THE PATH FOR YOUR SETTING
#######################################
SAVE_DIR=/path/to/deeplab_v2/voc2012/res/${FEATURE_NAME}/${MODEL_NAME}/${TEST_SET}/${FEATURE_TYPE}/post_densecrf_W${Bi_W}_XStd${Bi_X_STD}_RStd${Bi_R_STD}_PosW${POS_W}_PosXStd${POS_X_STD}

echo "SAVE TO ${SAVE_DIR}"

CRF_DIR=/path/to/deeplab_v2/deeplab-public-ver2/densecrf

if [ ${DATASET} == "voc12" ]
then
    IMG_DIR_NAME=pascal/VOC2012
elif [ ${DATASET} == "coco" ]
then
    IMG_DIR_NAME=coco
elif [ ${DATASET} == "voc10_part" ]
then
    IMG_DIR_NAME=pascal/VOCdevkit/VOC2012
fi

# NOTE THAT the densecrf code only loads ppm images
IMG_DIR=/path/to/data/${IMG_DIR_NAME}/PPMImages

if [ ${LOAD_MAT_FILE} == 1 ]
then
    # the features are saved in .mat format
    CRF_BIN=${CRF_DIR}/prog_refine_pascal_v4
    FEATURE_DIR=/path/to/deeplab_v2/voc2012/${FEATURE_NAME}/${MODEL_NAME}/${TEST_SET}/${FEATURE_TYPE}
else
    # the features are saved in .bin format (SaveMatAsBin.m in densecrf/my_script has already been run)
    CRF_BIN=${CRF_DIR}/prog_refine_pascal
    FEATURE_DIR=/rmt/work/deeplab/exper/${DATASET}/${FEATURE_NAME}/${MODEL_NAME}/${TEST_SET}/${FEATURE_TYPE}/bin
fi

mkdir -p ${SAVE_DIR}

#echo ${CRF_BIN} -id ${IMG_DIR} -fd ${FEATURE_DIR} -sd ${SAVE_DIR} -i ${MAX_ITER} -px ${POS_X_STD} -py ${POS_Y_STD} -pw ${POS_W} -bx ${Bi_X_STD} -by ${Bi_Y_STD} -br ${Bi_R_STD} -bg ${Bi_G_STD} -bb ${Bi_B_STD} -bw ${Bi_W}

# run the program
${CRF_BIN} -id ${IMG_DIR} -fd ${FEATURE_DIR} -sd ${SAVE_DIR} -i ${MAX_ITER} -px ${POS_X_STD} -py ${POS_Y_STD} -pw ${POS_W} -bx ${Bi_X_STD} -by ${Bi_Y_STD} -br ${Bi_R_STD} -bg ${Bi_G_STD} -bb ${Bi_B_STD} -bw ${Bi_W}
  • Evaluate the result after CRF (deeplab-vgg16-MS-LargeFOV)
pixel accuracy:  0.933649579242
mean accuracy:  0.811556028833
mIoU:  0.712235214969
fw IU:  0.880878180695
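
The four numbers above are the kind of scores usually derived from a class confusion matrix. The post does not say which script produced them, so the sketch below is only an assumption about the definitions: it uses the common formulas for pixel accuracy, mean class accuracy, mean IoU, and frequency-weighted IU, and the function name segmentation_scores is hypothetical.

```python
import numpy as np


def segmentation_scores(conf):
    """Compute summary scores from a confusion matrix.

    conf[i, j] = number of pixels whose ground-truth class is i and whose
    predicted class is j. Classes absent from the ground truth are skipped.
    """
    conf = np.asarray(conf, dtype=np.float64)
    tp = np.diag(conf)                 # correctly labelled pixels per class
    gt_total = conf.sum(axis=1)        # ground-truth pixels per class
    pred_total = conf.sum(axis=0)      # predicted pixels per class
    union = gt_total + pred_total - tp
    present = gt_total > 0

    iou = tp[present] / union[present]
    freq = gt_total[present] / conf.sum()
    return {
        "pixel_accuracy": tp.sum() / conf.sum(),
        "mean_accuracy": (tp[present] / gt_total[present]).mean(),
        "mIoU": iou.mean(),
        "fw_IU": (freq * iou).sum(),
    }
```

Pixels carrying the ignore label (255 for the VOC boundary) should be left out when the confusion matrix is accumulated, otherwise they count as errors.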