PointNet (part_seg) Part Segmentation Network Code Walkthrough: train.py (Part 2)

Latest recommended article published on 2024-04-01 16:28:27

HHYi

Category: Deep Learning. Tags: tensorflow, neural network

Article link: https://blog.csdn.net/HYilalala/article/details/117227826

Paper: PointNet network
Code: https://github.com/charlesq34/pointnet
For Windows 10 environment setup, see this article: Running PointNet on Windows (complete guide)

Network architecture diagram:

[Figure: PointNet network architecture]

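I. Preparing the Training Set

1) Download the training set
Data download: h5 files (HDF5_data)
Location: …\part_seg\hdf5_data

2) Resample the data (downsampling or upsampling)
Reason: the raw point clouds of different objects differ in size, say 3000 points for an airplane and 2500 for a chair. PointNet cannot take a different number of points for each object in a batch, so every cloud has to be downsampled or upsampled to a fixed size. The input size is usually set to 2048 points.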
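The resampling idea can be sketched in a few lines of NumPy. This is only an illustration under my own assumptions: it uses plain random sampling (without replacement when downsampling, with replacement when upsampling), and `resample_points` is a hypothetical helper, not a function from the PointNet repository.

```python
import numpy as np

def resample_points(points, num_point=2048, rng=None):
    """Return exactly num_point rows of an (N, 3) array (hypothetical helper)."""
    rng = np.random.default_rng() if rng is None else rng
    n = points.shape[0]
    # Downsample without repeats when there are enough points,
    # otherwise upsample by drawing some points more than once.
    idx = rng.choice(n, size=num_point, replace=n < num_point)
    return points[idx]

airplane = np.random.rand(3000, 3)  # more points than needed -> downsampled
sparse = np.random.rand(1500, 3)    # fewer points than needed -> upsampled
batch = np.stack([resample_points(airplane), resample_points(sparse)])
print(batch.shape)  # (2, 2048, 3)
```

After resampling, clouds of any original size stack into one fixed-shape batch, which is what the network's placeholders expect.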

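3) Layout of the training set folder
After downloading, the HDF5_data folder looks as shown below:
[Figure: contents of the HDF5_data folder]
4) Parsing the h5 data in the training set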

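II. Code Walkthrough

1) Before running train.py, it is best to set the batch size to 4

Reason: I recommend changing batch to 4 (otherwise my GPU reports out of memory; if you have enough GPU memory, ignore this) and training for 200 epochs (the number of epochs is of course up to you).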

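# DEFAULT SETTINGS
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--gpu', type=int, default=1, help='GPU to use [default: GPU 1]')
# The original help text listed a default of 32; lowered to 4 here to fit GPU memory
parser.add_argument('--batch', type=int, default=4, help='Batch Size during training [default: 4]')
# The original help text listed a default of 50; raised to 200 here
parser.add_argument('--epoch', type=int, default=200, help='Epoch to run [default: 200]')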
parser.add_argument('--point_num', type=int, default=2048, help='Point Number [256/512/1024/2048]')
parser.add_argument('--output_dir', type=str, default='train_results', help='Directory that stores all training logs and trained models')
parser.add_argument('--wd', type=float, default=0, help='Weight Decay [Default: 0.0]')
FLAGS = parser.parse_args()
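As a quick illustration of how these defaults interact with the command line, here is a stripped-down sketch that keeps only two of the flags. Passing an explicit list to `parse_args` stands in for typing `python train.py --batch 8`; the variable name `flags` is just for this example.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--batch', type=int, default=4, help='Batch size during training')
parser.add_argument('--epoch', type=int, default=200, help='Number of epochs to run')

# An explicit argument list stands in for the command line:
#   python train.py --batch 8
flags = parser.parse_args(['--batch', '8'])
print(flags.batch, flags.epoch)  # 8 200
```

So editing the defaults in the file is not strictly necessary: a flag given on the command line overrides its default, and flags that are omitted keep theirs.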

2) Walkthrough of the get_transform (input transform) module

def get_transform(point_cloud, is_training, bn_decay=None, K = 3):
    """ Transform Net, input is BxNx3 gray image
        Return:
            Transformation matrix of size 3xK

        With batch_size 4, epoch 200 and point_num 2048:
        point_cloud: (4,2048,3)
        is_training: False during evaluation (eval_one_epoch), True during training (train_one_epoch)
    """
    batch_size = point_cloud.get_shape()[0].value # 4
    num_point = point_cloud.get_shape()[1].value # 2048
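    # input_image has shape [4x2048x3x1]: the point cloud is treated as a 2D image of height 2048 and width 3 with a single channel
    input_image = tf.expand_dims(point_cloud, -1)
    '''
    The first kernel has size [1,3], which covers exactly one row of the "2D image", i.e. one point (its three coordinates).
    It has 64 output channels, so the output tensor has shape [4x2048x1X64].
    The second kernel has size [1,1]; a 1x1 convolution only changes the channel count, so the output tensor has shape [4x2048x1X128].
    tf_util.conv2d is a thin wrapper around convolution; at its core it calls tf.nn.conv2d.
    '''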
    net = tf_util.conv2d(input_image, 64, [1,3], padding='VALID', stride=[1,1],
                         bn=True, is_training=is_training, scope='tconv1', bn_decay=bn_decay)
    net = tf_util.conv2d(net, 128, [1,1], padding='VALID', stride=[1,1],
                         bn=True, is_training=is_training, scope='tconv3', bn_decay=bn_decay) # [4x2048x1X128]
    net = tf_util.conv2d(net, 1024, [1,1], padding='VALID', stride=[1,1],
                         bn=True, is_training=is_training, scope='tconv4', bn_decay=bn_decay) # [4x2048x1X1024]
    net = tf_util.max_pool2d(net, [num_point,1], padding='VALID', scope='tmaxpool') # [4x1x1X1024]
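    '''
    The convolutions and max pooling above extract one global feature per point cloud in the batch:
    the [2048x1] pooling window collapses all points, leaving [4x1x1X1024].
    Below, the global feature is reduced to 3xK values and reshaped to 3x3, which gives the transform matrix.
    '''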
    net = tf.reshape(net, [batch_size, -1]) # [4X1024]
    net = tf_util.fully_connected(net, 128, bn=True, is_training=is_training, scope='tfc1', bn_decay=bn_decay) # [4x128]
    net = tf_util.fully_connected(net, 128, bn=True, is_training=is_training, scope='tfc2', bn_decay=bn_decay) # [4x128]

    with tf.variable_scope('transform_XYZ') as sc:
        assert(K==3) # K must be 3 here; otherwise an AssertionError stops execution
        weights = tf.get_variable('weights', [128, 3*K], initializer=tf.constant_initializer(0.0), dtype=tf.float32) # [128X9]
        biases = tf.get_variable('biases', [3*K], initializer=tf.constant_initializer(0.0), dtype=tf.float32) + tf.constant([1,0,0,0,1,0,0,0,1], dtype=tf.float32) # [9]
        transform = tf.matmul(net, weights) # [4X9]
        transform = tf.nn.bias_add(transform, biases) # [4X9]

    #transform = tf_util.fully_connected(net, 3*K, activation_fn=None, scope='tfc3')
    transform = tf.reshape(transform, [batch_size, 3, K]) # [4X3X3]
    return transform # [4X3X3]
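One detail worth spelling out is the initialization: the weights start at zero and the identity matrix is added to the biases, so the T-Net initially outputs the identity transform and the input points pass through unchanged. The NumPy sketch below mimics only this last step; `net` is a random stand-in for the output of `tfc2`, not the real network activations.

```python
import numpy as np

batch_size, feat_dim, K = 4, 128, 3
net = np.random.rand(batch_size, feat_dim).astype(np.float32)  # stand-in for the tfc2 output

# Zero-initialized weights and biases, plus a flattened 3x3 identity added to the biases
weights = np.zeros((feat_dim, 3 * K), dtype=np.float32)
biases = np.zeros(3 * K, dtype=np.float32) + np.eye(3, dtype=np.float32).flatten()

transform = (net @ weights + biases).reshape(batch_size, 3, K)  # [4x3x3]

points = np.random.rand(batch_size, 2048, 3).astype(np.float32)
transformed = points @ transform                                 # batched matmul, [4x2048x3]
print(np.allclose(transformed, points))  # True
```

Training then moves the weights away from zero, so the transform departs from the identity only as far as the data pushes it.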

3) The get_transform_K (feature transform) module

def get_transform_K(inputs, is_training, bn_decay=None, K = 3):
    """ Transform Net, input is BxNx1xK gray image
        Return:
            Transformation matrix of size KxK
        With batch_size 4, epoch 200 and point_num 2048:
        inputs: [4X2048X1X128]
        is_training: False during evaluation (eval_one_epoch), True during training (train_one_epoch)
    """
    batch_size = inputs.get_shape()[0].value # 4
    num_point = inputs.get_shape()[1].value # 2048

    net = tf_util.conv2d(inputs, 256, [1,1], padding='VALID', stride=[1,1],
                         bn=True, is_training=is_training, scope='tconv1', bn_decay=bn_decay) # [4X2048X1X256]
    net = tf_util.conv2d(net, 1024, [1,1], padding='VALID', stride=[1,1],
                         bn=True, is_training=is_training, scope='tconv2', bn_decay=bn_decay) # [4X2048X1X1024]
    net = tf_util.max_pool2d(net, [num_point,1], padding='VALID', scope='tmaxpool') # [4X1X1X1024]

    net = tf.reshape(net, [batch_size, -1]) # [4X1024]
    net = tf_util.fully_connected(net, 512, bn=True, is_training=is_training, scope='tfc1', bn_decay=bn_decay) # [4X512]
    net = tf_util.fully_connected(net, 256, bn=True, is_training=is_training, scope='tfc2', bn_decay=bn_decay) # [4X256]

    with tf.variable_scope('transform_feat') as sc:
        weights = tf.get_variable('weights', [256, K*K], initializer=tf.constant_initializer(0.0), dtype=tf.float32) # [256X16384]
        biases = tf.get_variable('biases', [K*K], initializer=tf.constant_initializer(0.0), dtype=tf.float32) + tf.constant(np.eye(K).flatten(), dtype=tf.float32) # [16384]
        transform = tf.matmul(net, weights) # [4X16384]
        transform = tf.nn.bias_add(transform, biases) # [4X16384]

    #transform = tf_util.fully_connected(net, 3*K, activation_fn=None, scope='tfc3')
    transform = tf.reshape(transform, [batch_size, K, K]) # [4X128X128]
    return transform # [4X128X128]
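Both T-Nets follow the same pattern: convolutions that act on each point separately, followed by a max pool over all points. A rough NumPy sketch of why this yields an order-independent global feature is given below. It is a simplification under my own assumptions: the weights `w1` and `w2` are random placeholders, and biases and batch normalization are left out.

```python
import numpy as np

rng = np.random.default_rng(0)
points = rng.random((2048, 3))        # one cloud, N x 3
w1 = rng.standard_normal((3, 64))     # plays the role of a [1,3] kernel: 3 coordinates -> 64 channels
w2 = rng.standard_normal((64, 128))   # plays the role of a [1,1] kernel: 64 -> 128 channels

def global_feature(pts):
    h = np.maximum(pts @ w1, 0.0)     # the same weights are applied to every point, then ReLU
    h = np.maximum(h @ w2, 0.0)
    return h.max(axis=0)              # max over the N points -> one 128-dim vector

shuffled = points[rng.permutation(len(points))]
print(global_feature(points).shape)                                   # (128,)
print(np.allclose(global_feature(points), global_feature(shuffled)))  # True
```

Because every point goes through the same weights and the max ignores order, shuffling the points leaves the pooled feature unchanged.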

4) The get_model module

def get_model(point_cloud, input_label, is_training, cat_num, part_num, \
		batch_size, num_point, weight_decay, bn_decay=None):
    """ ConvNet baseline, input is BxNx3 gray image
        With batch_size 4, epoch 200 and point_num 2048,
        the inputs are:
        point_cloud:(4,2048,3)
        input_label:(4,16)
        cat_num:16
        part_num:50
        batch_size:4
        num_point:2048
        weight_decay:0.0
    """
    end_points = {}

    with tf.variable_scope('transform_net1') as sc:
        K = 3
        '''get_transform implements the first T-Net, i.e. the input transform'''
        transform = get_transform(point_cloud, is_training, bn_decay, K = 3) # [4X3X3]
    point_cloud_transformed = tf.matmul(point_cloud, transform) # [4X2048X3]

    input_image = tf.expand_dims(point_cloud_transformed, -1) # [4X2048X3X1]
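    # The first kernel has size [1,3], covering exactly one row of the "2D image", i.e. one point (three coordinates); with 64 output channels the output tensor is BxNx1x64
    out1 = tf_util.conv2d(input_image, 64, [1,K], padding='VALID', stride=[1,1],
                         bn=True, is_training=is_training, scope='conv1', bn_decay=bn_decay) # [4X2048X1X64]
    # The second kernel has size [1,1]; a 1x1 convolution only changes the channel count, so the output tensor is BxNx1x128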
    out2 = tf_util.conv2d(out1, 128, [1,1], padding='VALID', stride=[1,1],
                         bn=True, is_training=is_training, scope='conv2', bn_decay=bn_decay) # [4X2048X1X128]
    out3 = tf_util.conv2d(out2, 128, [1,1], padding='VALID', stride=[1,1],
                         bn=True, is_training=is_training, scope='conv3', bn_decay=bn_decay) # [4X2048X1X128]

    with tf.variable_scope('transform_net2') as sc:
        K = 128
        '''get_transform_K is the second T-Net, i.e. the feature transform'''
        transform = get_transform_K(out3, is_training, bn_decay, K)  # [4X128X128]

    end_points['transform'] = transform # [4X128X128]

    squeezed_out3 = tf.reshape(out3, [batch_size, num_point, 128]) # [4X2048X128]
    net_transformed = tf.matmul(squeezed_out3, transform) # [4X2048X128]
    net_transformed = tf.expand_dims(net_transformed, [2]) # [4X2048X1X128]

    out4 = tf_util.conv2d(net_transformed, 512, [1,1], padding='VALID', stride=[1,1],
                         bn=True, is_training=is_training, scope='conv4', bn_decay=bn_decay) # [4X2048X1X512]
    out5 = tf_util.conv2d(out4, 2048, [1,1], padding='VALID', stride=[1,1],
                         bn=True, is_training=is_training, scope='conv5', bn_decay=bn_decay) # [4X2048X1X2048]
    out_max = tf_util.max_pool2d(out5, [num_point,1], padding='VALID', scope='maxpool') # [4X1X1X2048]

    # classification network
    net = tf.reshape(out_max, [batch_size, -1]) # [4X2048]
    net = tf_util.fully_connected(net, 256, bn=True, is_training=is_training, scope='cla/fc1', bn_decay=bn_decay)  # [4X256]
    net = tf_util.fully_connected(net, 256, bn=True, is_training=is_training, scope='cla/fc2', bn_decay=bn_decay)  # [4X256]
    net = tf_util.dropout(net, keep_prob=0.7, is_training=is_training, scope='cla/dp1') # [4X256]
    net = tf_util.fully_connected(net, cat_num, activation_fn=None, scope='cla/fc3') # [4X16]

    # segmentation network
    '''Concatenate out_max with one_hot_label_expand'''
    one_hot_label_expand = tf.reshape(input_label, [batch_size, 1, 1, cat_num]) # [4X1X1X16]
    out_max = tf.concat(axis=3, values=[out_max, one_hot_label_expand]) # [4X1X1X2064]

    expand = tf.tile(out_max, [1, num_point, 1, 1]) # [4X2048X1X2064]
    '''
    input:
        expand: [4X2048X1X2064]
        out1: [4X2048X1X64]
        out2: [4X2048X1X128]
        out3: [4X2048X1X128]
        out4: [4X2048X1X512]
        out5: [4X2048X1X2048]
    return:
        net: [4X16]
        net2: [4X2048X50]
        end_points: dict whose 'transform' entry is [4X128X128]
    '''
    concat = tf.concat(axis=3, values=[expand, out1, out2, out3, out4, out5]) # [4X2048X1X4944]

    net2 = tf_util.conv2d(concat, 256, [1,1], padding='VALID', stride=[1,1], bn_decay=bn_decay,
                        bn=True, is_training=is_training, scope='seg/conv1', weight_decay=weight_decay) # [4X2048X1X256]
    net2 = tf_util.dropout(net2, keep_prob=0.8, is_training=is_training, scope='seg/dp1') # [4X2048X1X256]
    net2 = tf_util.conv2d(net2, 256, [1,1], padding='VALID', stride=[1,1], bn_decay=bn_decay,
                        bn=True, is_training=is_training, scope='seg/conv2', weight_decay=weight_decay) # [4X2048X1X256]
    net2 = tf_util.dropout(net2, keep_prob=0.8, is_training=is_training, scope='seg/dp2') # [4X2048X1X256]
    net2 = tf_util.conv2d(net2, 128, [1,1], padding='VALID', stride=[1,1], bn_decay=bn_decay,
                        bn=True, is_training=is_training, scope='seg/conv3', weight_decay=weight_decay) # [4X2048X1X128]
    net2 = tf_util.conv2d(net2, part_num, [1,1], padding='VALID', stride=[1,1], activation_fn=None, 
                        bn=False, scope='seg/conv4', weight_decay=weight_decay) # [4X2048X1X50]

    net2 = tf.reshape(net2, [batch_size, num_point, part_num]) # [4X2048X50]

    return net, net2, end_points
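The channel counts in the segmentation branch are easy to check by hand. The sketch below repeats the tile and concat steps with NumPy arrays of zeros; only the shapes matter, and the number of points is shrunk to 8 to keep the arrays small (the listing uses 2048).

```python
import numpy as np

B, N, cat_num = 4, 8, 16  # N kept small here; the listing uses 2048
out_max = np.zeros((B, 1, 1, 2048), dtype=np.float32)    # pooled global feature
one_hot = np.zeros((B, 1, 1, cat_num), dtype=np.float32)  # one-hot category label

glob = np.concatenate([out_max, one_hot], axis=3)  # [B,1,1,2064]
expand = np.tile(glob, (1, N, 1, 1))               # copied to every point: [B,N,1,2064]

# Per-point features out1..out5 with their channel counts
per_point = [np.zeros((B, N, 1, c), dtype=np.float32) for c in (64, 128, 128, 512, 2048)]
concat = np.concatenate([expand] + per_point, axis=3)
print(concat.shape)  # (4, 8, 1, 4944)
```

The 4944 channels are 2064 (global feature plus label) + 64 + 128 + 128 + 512 + 2048, so every point sees both its local features and the shape-level context.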

5) train.py

def train():
    with tf.Graph().as_default():
        with tf.device('/gpu:'+str(FLAGS.gpu)):
            pointclouds_ph, input_label_ph, labels_ph, seg_ph = placeholder_inputs()
            is_training_ph = tf.placeholder(tf.bool, shape=())
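            '''
            batch: global step counter, initialized to 0 (the optimizer increments it on every step)
            learning_rate: exponentially decayed learning rate
            bn_decay: batch normalization decay rate
            '''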
            batch = tf.Variable(0, trainable=False)
            learning_rate = tf.train.exponential_decay(
                            BASE_LEARNING_RATE,     # base learning rate
                            batch * batch_size,     # global_var indicating the number of steps
                            DECAY_STEP,             # step size
                            DECAY_RATE,             # decay rate
                            staircase=True          # Stair-case or continuous decreasing
                            )
            learning_rate = tf.maximum(learning_rate, LEARNING_RATE_CLIP)
        
            bn_momentum = tf.train.exponential_decay(
                      BN_INIT_DECAY,
                      batch*batch_size,
                      BN_DECAY_DECAY_STEP,
                      BN_DECAY_DECAY_RATE,
                      staircase=True)

            bn_decay = tf.minimum(BN_DECAY_CLIP, 1 - bn_momentum)
            lr_op = tf.summary.scalar('learning_rate', learning_rate)
            batch_op = tf.summary.scalar('batch_number', batch)
            bn_decay_op = tf.summary.scalar('bn_decay', bn_decay)

            '''labels_pred: [4X16]  seg_pred: [4X2048X50]  end_points: dict whose 'transform' entry is [4X128X128]'''
            labels_pred, seg_pred, end_points = model.get_model(pointclouds_ph, input_label_ph, \
                    is_training=is_training_ph, bn_decay=bn_decay, cat_num=NUM_CATEGORIES, \
                    part_num=NUM_PART_CATS, batch_size=batch_size, num_point=point_num, weight_decay=FLAGS.wd)

            # model.py defines both classification net and segmentation net, which share the common global feature extractor network.
            # In model.get_loss, we define the total loss to be weighted sum of the classification and segmentation losses.
            # Here, we only train for segmentation network. Thus, we set weight to be 1.0.
            loss, label_loss, per_instance_label_loss, seg_loss, per_instance_seg_loss, per_instance_seg_pred_res  \
                = model.get_loss(labels_pred, seg_pred, labels_ph, seg_ph, 1.0, end_points)

            total_training_loss_ph = tf.placeholder(tf.float32, shape=())
            total_testing_loss_ph = tf.placeholder(tf.float32, shape=())

            label_training_loss_ph = tf.placeholder(tf.float32, shape=())
            label_testing_loss_ph = tf.placeholder(tf.float32, shape=())

            seg_training_loss_ph = tf.placeholder(tf.float32, shape=())
            seg_testing_loss_ph = tf.placeholder(tf.float32, shape=())

            label_training_acc_ph = tf.placeholder(tf.float32, shape=())
            label_testing_acc_ph = tf.placeholder(tf.float32, shape=())
            label_testing_acc_avg_cat_ph = tf.placeholder(tf.float32, shape=())

            seg_training_acc_ph = tf.placeholder(tf.float32, shape=())
            seg_testing_acc_ph = tf.placeholder(tf.float32, shape=())
            seg_testing_acc_avg_cat_ph = tf.placeholder(tf.float32, shape=())

            total_train_loss_sum_op = tf.summary.scalar('total_training_loss', total_training_loss_ph)
            total_test_loss_sum_op = tf.summary.scalar('total_testing_loss', total_testing_loss_ph)

            label_train_loss_sum_op = tf.summary.scalar('label_training_loss', label_training_loss_ph)
            label_test_loss_sum_op = tf.summary.scalar('label_testing_loss', label_testing_loss_ph)

            seg_train_loss_sum_op = tf.summary.scalar('seg_training_loss', seg_training_loss_ph)
            seg_test_loss_sum_op = tf.summary.scalar('seg_testing_loss', seg_testing_loss_ph)

            label_train_acc_sum_op = tf.summary.scalar('label_training_acc', label_training_acc_ph)
            label_test_acc_sum_op = tf.summary.scalar('label_testing_acc', label_testing_acc_ph)
            label_test_acc_avg_cat_op = tf.summary.scalar('label_testing_acc_avg_cat', label_testing_acc_avg_cat_ph)

            seg_train_acc_sum_op = tf.summary.scalar('seg_training_acc', seg_training_acc_ph)
            seg_test_acc_sum_op = tf.summary.scalar('seg_testing_acc', seg_testing_acc_ph)
            seg_test_acc_avg_cat_op = tf.summary.scalar('seg_testing_acc_avg_cat', seg_testing_acc_avg_cat_ph)

            train_variables = tf.trainable_variables()

            # Adam optimizer driven by the decayed learning rate
            trainer = tf.train.AdamOptimizer(learning_rate)
            # Minimize the loss over the trainable variables; each step also increments batch
            train_op = trainer.minimize(loss, var_list=train_variables, global_step=batch)

        saver = tf.train.Saver()  # saves and restores model checkpoints

        config = tf.ConfigProto()
        config.gpu_options.allow_growth = True
        config.allow_soft_placement = True
        sess = tf.Session(config=config)
        
        init = tf.global_variables_initializer() # initializer for all global variables
        sess.run(init)  # the graph is built; run the initializer in the session

        train_writer = tf.summary.FileWriter(SUMMARIES_FOLDER + '/train', sess.graph)
        test_writer = tf.summary.FileWriter(SUMMARIES_FOLDER + '/test')

        train_file_list = provider.getDataFiles(TRAINING_FILE_LIST)
        num_train_file = len(train_file_list)
        test_file_list = provider.getDataFiles(TESTING_FILE_LIST)
        num_test_file = len(test_file_list)

        fcmd = open(os.path.join(LOG_STORAGE_PATH, 'cmd.txt'), 'w')
        fcmd.write(str(FLAGS))
        fcmd.close()

        # write logs to the disk
        flog = open(os.path.join(LOG_STORAGE_PATH, 'log.txt'), 'w')

        def train_one_epoch(train_file_idx, epoch_num):
            is_training = True

            for i in range(num_train_file):
                cur_train_filename = os.path.join(hdf5_data_dir, train_file_list[train_file_idx[i]])
                printout(flog, 'Loading train file ' + cur_train_filename)

                cur_data, cur_labels, cur_seg = provider.loadDataFile_with_seg(cur_train_filename)
                cur_data, cur_labels, order = provider.shuffle_data(cur_data, np.squeeze(cur_labels))
                cur_seg = cur_seg[order, ...]

                cur_labels_one_hot = convert_label_to_one_hot(cur_labels)

                num_data = len(cur_labels)
                num_batch = num_data // batch_size

                total_loss = 0.0
                total_label_loss = 0.0
                total_seg_loss = 0.0
                total_label_acc = 0.0
                total_seg_acc = 0.0

                for j in range(num_batch):
                    begidx = j * batch_size
                    endidx = (j + 1) * batch_size

                    feed_dict = {
                            pointclouds_ph: cur_data[begidx: endidx, ...], 
                            labels_ph: cur_labels[begidx: endidx, ...], 
                            input_label_ph: cur_labels_one_hot[begidx: endidx, ...], 
                            seg_ph: cur_seg[begidx: endidx, ...],
                            is_training_ph: is_training, 
                            }

                    _, loss_val, label_loss_val, seg_loss_val, per_instance_label_loss_val, \
                            per_instance_seg_loss_val, label_pred_val, seg_pred_val, pred_seg_res \
                            = sess.run([train_op, loss, label_loss, seg_loss, per_instance_label_loss, \
                            per_instance_seg_loss, labels_pred, seg_pred, per_instance_seg_pred_res], \
                            feed_dict=feed_dict)

                    per_instance_part_acc = np.mean(pred_seg_res == cur_seg[begidx: endidx, ...], axis=1)
                    average_part_acc = np.mean(per_instance_part_acc)

                    total_loss += loss_val
                    total_label_loss += label_loss_val
                    total_seg_loss += seg_loss_val
                    
                    per_instance_label_pred = np.argmax(label_pred_val, axis=1)
                    total_label_acc += np.mean(np.float32(per_instance_label_pred == cur_labels[begidx: endidx, ...]))
                    total_seg_acc += average_part_acc

                total_loss = total_loss * 1.0 / num_batch
                total_label_loss = total_label_loss * 1.0 / num_batch
                total_seg_loss = total_seg_loss * 1.0 / num_batch
                total_label_acc = total_label_acc * 1.0 / num_batch
                total_seg_acc = total_seg_acc * 1.0 / num_batch

                lr_sum, bn_decay_sum, batch_sum, train_loss_sum, train_label_acc_sum, \
                        train_label_loss_sum, train_seg_loss_sum, train_seg_acc_sum = sess.run(\
                        [lr_op, bn_decay_op, batch_op, total_train_loss_sum_op, label_train_acc_sum_op, \
                        label_train_loss_sum_op, seg_train_loss_sum_op, seg_train_acc_sum_op], \
                        feed_dict={total_training_loss_ph: total_loss, label_training_loss_ph: total_label_loss, \
                        seg_training_loss_ph: total_seg_loss, label_training_acc_ph: total_label_acc, \
                        seg_training_acc_ph: total_seg_acc})

                train_writer.add_summary(train_loss_sum, i + epoch_num * num_train_file)
                train_writer.add_summary(train_label_loss_sum, i + epoch_num * num_train_file)
                train_writer.add_summary(train_seg_loss_sum, i + epoch_num * num_train_file)
                train_writer.add_summary(lr_sum, i + epoch_num * num_train_file)
                train_writer.add_summary(bn_decay_sum, i + epoch_num * num_train_file)
                train_writer.add_summary(train_label_acc_sum, i + epoch_num * num_train_file)
                train_writer.add_summary(train_seg_acc_sum, i + epoch_num * num_train_file)
                train_writer.add_summary(batch_sum, i + epoch_num * num_train_file)

                printout(flog, '\tTraining Total Mean_loss: %f' % total_loss)
                printout(flog, '\t\tTraining Label Mean_loss: %f' % total_label_loss)
                printout(flog, '\t\tTraining Label Accuracy: %f' % total_label_acc)
                printout(flog, '\t\tTraining Seg Mean_loss: %f' % total_seg_loss)
                printout(flog, '\t\tTraining Seg Accuracy: %f' % total_seg_acc)

        def eval_one_epoch(epoch_num):
            is_training = False

            total_loss = 0.0
            total_label_loss = 0.0
            total_seg_loss = 0.0
            total_label_acc = 0.0
            total_seg_acc = 0.0
            total_seen = 0

            # NUM_CATEGORIES = 16
            total_label_acc_per_cat = np.zeros((NUM_CATEGORIES)).astype(np.float32)  # per category: number of correctly classified objects
            total_seg_acc_per_cat = np.zeros((NUM_CATEGORIES)).astype(np.float32)  # per category: summed per-object segmentation accuracy
            total_seen_per_cat = np.zeros((NUM_CATEGORIES)).astype(np.int32)  # per category: number of objects seen

            for i in range(num_test_file):
                cur_test_filename = os.path.join(hdf5_data_dir, test_file_list[i])
                printout(flog, 'Loading test file ' + cur_test_filename)

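                '''
                cur_data (2048,2048,3): point clouds of the test set
                cur_labels (2048,1): category index (one of 16) of each object; squeezed to (2048,) below, its one-hot form is (2048,16)
                cur_seg (2048,2048): part label (one of 50) of each point
                '''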
                cur_data, cur_labels, cur_seg = provider.loadDataFile_with_seg(cur_test_filename)
                cur_labels = np.squeeze(cur_labels)

                # Convert the labels to one-hot form
                cur_labels_one_hot = convert_label_to_one_hot(cur_labels)

                num_data = len(cur_labels)
                num_batch = num_data // batch_size

                for j in range(num_batch): # run batch by batch
                    begidx = j * batch_size # start index
                    endidx = (j + 1) * batch_size # end index (exclusive)
                    feed_dict = {
                            pointclouds_ph: cur_data[begidx: endidx, ...], 
                            labels_ph: cur_labels[begidx: endidx, ...], 
                            input_label_ph: cur_labels_one_hot[begidx: endidx, ...], 
                            seg_ph: cur_seg[begidx: endidx, ...],
                            is_training_ph: is_training, 
                            }

                    loss_val, label_loss_val, seg_loss_val, per_instance_label_loss_val, \
                            per_instance_seg_loss_val, label_pred_val, seg_pred_val, pred_seg_res \
                            = sess.run([loss, label_loss, seg_loss, per_instance_label_loss, \
                            per_instance_seg_loss, labels_pred, seg_pred, per_instance_seg_pred_res], \
                            feed_dict=feed_dict)
                    # Part segmentation accuracy of each object
                    per_instance_part_acc = np.mean(pred_seg_res == cur_seg[begidx: endidx, ...], axis=1)
                    # Mean part accuracy over the 4 objects in this batch
                    average_part_acc = np.mean(per_instance_part_acc)

                    total_seen += 1
                    total_loss += loss_val
                    total_label_loss += label_loss_val
                    total_seg_loss += seg_loss_val
                    # Predicted category label of each of the 4 objects in this batch
                    per_instance_label_pred = np.argmax(label_pred_val, axis=1)
                    # Classification accuracy of the batch, accumulated over batches
                    total_label_acc += np.mean(np.float32(per_instance_label_pred == cur_labels[begidx: endidx, ...]))
                    # Accumulate the mean part segmentation accuracy
                    total_seg_acc += average_part_acc

                    for shape_idx in range(begidx, endidx):
                        # Number of tested objects in each category
                        total_seen_per_cat[cur_labels[shape_idx]] += 1
                        # Number of correctly classified objects per category: add 1 at the true category when the prediction matches it
                        total_label_acc_per_cat[cur_labels[shape_idx]] += np.int32(per_instance_label_pred[shape_idx-begidx] == cur_labels[shape_idx])
                        # Accumulate each object's segmentation accuracy
                        total_seg_acc_per_cat[cur_labels[shape_idx]] += per_instance_part_acc[shape_idx - begidx]

            total_loss = total_loss * 1.0 / total_seen
            total_label_loss = total_label_loss * 1.0 / total_seen
            total_seg_loss = total_seg_loss * 1.0 / total_seen
            total_label_acc = total_label_acc * 1.0 / total_seen
            total_seg_acc = total_seg_acc * 1.0 / total_seen

            test_loss_sum, test_label_acc_sum, test_label_loss_sum, test_seg_loss_sum, test_seg_acc_sum = sess.run(\
                    [total_test_loss_sum_op, label_test_acc_sum_op, label_test_loss_sum_op, seg_test_loss_sum_op, seg_test_acc_sum_op], \
                    feed_dict={total_testing_loss_ph: total_loss, label_testing_loss_ph: total_label_loss, \
                    seg_testing_loss_ph: total_seg_loss, label_testing_acc_ph: total_label_acc, seg_testing_acc_ph: total_seg_acc})

            test_writer.add_summary(test_loss_sum, (epoch_num+1) * num_train_file-1)
            test_writer.add_summary(test_label_loss_sum, (epoch_num+1) * num_train_file-1)
            test_writer.add_summary(test_seg_loss_sum, (epoch_num+1) * num_train_file-1)
            test_writer.add_summary(test_label_acc_sum, (epoch_num+1) * num_train_file-1)
            test_writer.add_summary(test_seg_acc_sum, (epoch_num+1) * num_train_file-1)

            printout(flog, '\tTesting Total Mean_loss: %f' % total_loss)
            printout(flog, '\t\tTesting Label Mean_loss: %f' % total_label_loss)
            printout(flog, '\t\tTesting Label Accuracy: %f' % total_label_acc)
            printout(flog, '\t\tTesting Seg Mean_loss: %f' % total_seg_loss)
            printout(flog, '\t\tTesting Seg Accuracy: %f' % total_seg_acc)

            for cat_idx in range(NUM_CATEGORIES):
                if total_seen_per_cat[cat_idx] > 0:
                    printout(flog, '\n\t\tCategory %s Object Number: %d' % (all_obj_cats[cat_idx][0], total_seen_per_cat[cat_idx]))
                    printout(flog, '\t\tCategory %s Label Accuracy: %f' % (all_obj_cats[cat_idx][0], total_label_acc_per_cat[cat_idx]/total_seen_per_cat[cat_idx]))
                    printout(flog, '\t\tCategory %s Seg Accuracy: %f' % (all_obj_cats[cat_idx][0], total_seg_acc_per_cat[cat_idx]/total_seen_per_cat[cat_idx]))

        if not os.path.exists(MODEL_STORAGE_PATH):
            os.mkdir(MODEL_STORAGE_PATH)

        for epoch in range(TRAINING_EPOCHES): # number of training epochs
            printout(flog, '\n<<< Testing on the test dataset ...')
            eval_one_epoch(epoch)

            printout(flog, '\n>>> Training for the epoch %d/%d ...' % (epoch, TRAINING_EPOCHES))

            train_file_idx = np.arange(0, len(train_file_list))
            np.random.shuffle(train_file_idx) # shuffle the file order
            # train_one_epoch differs from eval_one_epoch by additionally running the optimizer step
            train_one_epoch(train_file_idx, epoch)

            if (epoch+1) % 10 == 0:
                # Save the model checkpoint
                cp_filename = saver.save(sess, os.path.join(MODEL_STORAGE_PATH, 'epoch_' + str(epoch+1)+'.ckpt'))
                printout(flog, 'Successfully store the checkpoint model into ' + cp_filename)

            flog.flush()

        flog.close()
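The two schedules at the top of train() are easier to follow with numbers plugged in. With staircase=True, tf.train.exponential_decay computes base * decay_rate ** floor(global_step / decay_steps). The plain-Python sketch below reproduces that formula together with the clipping done by tf.maximum and tf.minimum. All constant values here are made up for illustration; the real ones are defined near the top of train.py and are not shown in this post.

```python
def staircase_decay(base, global_step, decay_steps, decay_rate):
    """base * decay_rate ** floor(global_step / decay_steps)."""
    return base * decay_rate ** (global_step // decay_steps)

# Illustrative values only (assumptions, not the constants from train.py)
BASE_LEARNING_RATE, DECAY_STEP, DECAY_RATE, LEARNING_RATE_CLIP = 0.001, 1000, 0.5, 1e-5
BN_INIT_DECAY, BN_DECAY_DECAY_STEP, BN_DECAY_DECAY_RATE, BN_DECAY_CLIP = 0.5, 2000, 0.5, 0.99
batch_size = 4

def schedules(batch):
    seen = batch * batch_size  # number of samples processed so far
    lr = max(staircase_decay(BASE_LEARNING_RATE, seen, DECAY_STEP, DECAY_RATE),
             LEARNING_RATE_CLIP)
    bn_momentum = staircase_decay(BN_INIT_DECAY, seen, BN_DECAY_DECAY_STEP, BN_DECAY_DECAY_RATE)
    bn_decay = min(BN_DECAY_CLIP, 1 - bn_momentum)
    return lr, bn_decay

for step in (0, 250, 500, 5000):
    print(step, schedules(step))
```

The learning rate falls in steps until it hits its lower clip, while bn_decay rises in steps until it hits its upper clip.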

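III. Training Results (Obtaining the Trained Model)

After training for 200 epochs, the results can be found under …/part_seg/train_results, as shown below:
[Figure: contents of the train_results folder]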

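train_results/logs: stores the log data written during training, i.e. the Accuracy and IoU values of each object for every epoch
1. Total Point: point_num (number of points in the instance)
2. Ground Truth: objnames[cur_gt_label] (true category of the instance)
3. Predict: objnames[label_pred_val] (predicted category)
4. Accuracy: seg_acc = np.mean(seg_pred_val == seg) (fraction of points that are segmented correctly)
5. IoU: avg_iou (the IoU of each part is summed and averaged): avg_iou = total_iou / len(iou_oids)
6. IoU details: for each part, n_pred, n_gt, n_intersect, n_union and (n_intersect / n_union), one part per line
7. Overall Accuracy over the 2874 instances = total_acc / total_seen (the per-instance accuracies summed, divided by the number of instances)
8. Overall IoU = total_acc_iou / total_seen, where total_acc_iou accumulates avg_iou over the instances (total_acc_iou += avg_iou)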
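Items 4 and 5 can be reproduced on a toy example. The helper below is my own sketch of the per-instance computation, not code copied from the repository; in particular, counting a part that is absent from both prediction and ground truth as IoU 1 is an assumption of this sketch.

```python
import numpy as np

def shape_metrics(seg_pred, seg_gt, part_ids):
    """Point accuracy and mean part IoU for one shape (illustrative helper)."""
    seg_acc = np.mean(seg_pred == seg_gt)
    total_iou = 0.0
    for oid in part_ids:
        n_intersect = np.sum((seg_pred == oid) & (seg_gt == oid))
        n_union = np.sum((seg_pred == oid) | (seg_gt == oid))
        # Assumption of this sketch: a part absent from both arrays counts as IoU 1
        total_iou += 1.0 if n_union == 0 else n_intersect / n_union
    return seg_acc, total_iou / len(part_ids)

seg_gt = np.array([0, 0, 0, 1, 1, 2])    # ground-truth part label of 6 points
seg_pred = np.array([0, 0, 1, 1, 1, 2])  # one point of part 0 mislabeled as part 1
acc, avg_iou = shape_metrics(seg_pred, seg_gt, part_ids=[0, 1, 2])
print(acc, avg_iou)
```

Here 5 of 6 points are correct, while the part IoUs are 2/3, 2/3 and 1. This shows why IoU is the stricter of the two numbers: one wrong point lowers the IoU of two parts at once.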
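train_results/summaries: the training results can be viewed as charts with the TensorBoard visualization tool, as follows:
1. Open the Anaconda Prompt and activate the virtual environment
2. Run tensorboard --logdir=<path> (use the absolute path of the summaries folder)
3. Open the local server address it prints in a browser

Visualization (accuracy, IoU and the Graphs view are available):
train_results/trained_models: stores the trained models; the best one is usually chosen for testing.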
