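Paper: PointNet
Code: https://github.com/charlesq34/pointnet
For environment setup on Windows 10, see the article "Running PointNet on Windows (complete guide)"
Network diagram:
I. Preparing the training set
1) Download the training set
Data download: h5 files (HDF5_data)
Location: …\part_seg\hdf5_data
2) Resampling the data (downsampling or upsampling)
Reason: different objects have raw point clouds of different sizes, say 3000 points for an airplane and 2500 for a chair. The PointNet network cannot take a different number of points for each object, so every cloud must be downsampled or upsampled to one fixed size. The input size is usually set to 2048 points.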
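The resampling step described above can be sketched with NumPy. This is only an illustration under my own assumptions, not the preprocessing used to build the HDF5 files: `resample_points` is a hypothetical helper, the point clouds are random stand-ins, and the 1500-point shape is made up to show the upsampling case.

```python
import numpy as np

def resample_points(points, num_point=2048, rng=None):
    """Return exactly num_point rows of an (N, 3) array by random sampling."""
    rng = np.random.default_rng() if rng is None else rng
    n = points.shape[0]
    # Enough points: downsample without replacement.
    # Too few points: upsample by drawing with replacement (some points repeat).
    replace = n < num_point
    idx = rng.choice(n, size=num_point, replace=replace)
    return points[idx]

rng = np.random.default_rng(0)
airplane = rng.random((3000, 3))   # stand-in for a 3000-point shape
small = rng.random((1500, 3))      # hypothetical shape with fewer than 2048 points

batch = np.stack([resample_points(airplane, rng=rng),
                  resample_points(small, rng=rng)])
print(batch.shape)  # (2, 2048, 3)
```

Once every cloud has the same number of points, clouds of different objects can be stacked into one batch tensor.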
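3) Training set folder layout
After download, the HDF5_data folder looks as shown in the figure below:
4) Structure of the h5 data in the training set
II. Code walkthrough
1) Before running train.py, set the batch size to 4
Reason: I recommend changing batch to 4 (otherwise my GPU reports out of memory; ignore this if your GPU has enough memory) and training for 200 epochs (the number of epochs can be set to whatever suits you).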
# DEFAULT SETTINGS
parser = argparse.ArgumentParser()
parser.add_argument('--gpu', type=int, default=1, help='GPU to use [default: GPU 1]')
parser.add_argument('--batch', type=int, default=4, help='Batch Size during training [default: 4]')
parser.add_argument('--epoch', type=int, default=200, help='Epoch to run [default: 200]')
parser.add_argument('--point_num', type=int, default=2048, help='Point Number [256/512/1024/2048]')
parser.add_argument('--output_dir', type=str, default='train_results', help='Directory that stores all training logs and trained models')
parser.add_argument('--wd', type=float, default=0, help='Weight Decay [Default: 0.0]')
FLAGS = parser.parse_args()
2) Walkthrough of the get_transform (input transform) module
def get_transform(point_cloud, is_training, bn_decay=None, K = 3):
""" Transform Net, input is BxNx3 gray image
Return:
Transformation matrix of size 3xK
With batch_size 4, epoch 200 and point_num 2048:
point_cloud: (4,2048,3)
is_training: False when processing the validation set (eval_one_epoch), True when processing the training set (train_one_epoch)
"""
batch_size = point_cloud.get_shape()[0].value # 4
num_point = point_cloud.get_shape()[1].value # 2048
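# input_image has shape [4x2048x3x1]: the point cloud is treated as a 2D image with 2048 rows, 3 columns and 1 channel
input_image = tf.expand_dims(point_cloud, -1)
'''
The first kernel has size [1,3], which covers exactly one row of the "2D image", i.e. one point (its 3D coordinates). With 64 output channels,
the output tensor has shape [4x2048x1X64].
The second kernel has size [1,1]. A 1x1 convolution only changes the channel count, so the output tensor has shape [4x2048x1X128].
conv2d is a thin wrapper around convolution whose core is a call to tf.nn.conv2d.
'''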
net = tf_util.conv2d(input_image, 64, [1,3], padding='VALID', stride=[1,1],
bn=True, is_training=is_training, scope='tconv1', bn_decay=bn_decay)
net = tf_util.conv2d(net, 128, [1,1], padding='VALID', stride=[1,1],
bn=True, is_training=is_training, scope='tconv3', bn_decay=bn_decay) # [4x2048x1X128]
net = tf_util.conv2d(net, 1024, [1,1], padding='VALID', stride=[1,1],
bn=True, is_training=is_training, scope='tconv4', bn_decay=bn_decay) # [4x2048x1X1024]
net = tf_util.max_pool2d(net, [num_point,1], padding='VALID', scope='tmaxpool') # [4x1x1X1024]
'''
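The convolutions and max pooling above extract a global feature for each point cloud in the batch: [4x1x1X1024], i.e. one 1024-dimensional vector per cloud.
Below, the global feature is reduced to 3xK values and reshaped to 3x3, which gives the transform matrix.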
'''
net = tf.reshape(net, [batch_size, -1]) # [4X1024]
net = tf_util.fully_connected(net, 128, bn=True, is_training=is_training, scope='tfc1', bn_decay=bn_decay) # [4x128]
net = tf_util.fully_connected(net, 128, bn=True, is_training=is_training, scope='tfc2', bn_decay=bn_decay) # [4x128]
with tf.variable_scope('transform_XYZ') as sc:
assert(K==3) # assert that K is 3; otherwise execution stops with an AssertionError
weights = tf.get_variable('weights', [128, 3*K], initializer=tf.constant_initializer(0.0), dtype=tf.float32) # [128X9]
biases = tf.get_variable('biases', [3*K], initializer=tf.constant_initializer(0.0), dtype=tf.float32) + tf.constant([1,0,0,0,1,0,0,0,1], dtype=tf.float32) # [9]
transform = tf.matmul(net, weights) # [4X9]
transform = tf.nn.bias_add(transform, biases) # [4X9]
#transform = tf_util.fully_connected(net, 3*K, activation_fn=None, scope='tfc3')
transform = tf.reshape(transform, [batch_size, 3, K]) # [4X3X3]
return transform # [4X3X3]
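One detail of the transform_XYZ block is worth spelling out: the weights start at zero and the bias is zero plus a flattened 3x3 identity, so before any training the T-Net outputs the identity matrix and the point cloud passes through unchanged. A minimal NumPy sketch of that initial state, where `net` and `points` are random stand-ins rather than real activations or data:

```python
import numpy as np

batch_size, K = 4, 3
rng = np.random.default_rng(0)
net = rng.standard_normal((batch_size, 128)).astype(np.float32)  # stand-in for the tfc2 output

weights = np.zeros((128, 3 * K), dtype=np.float32)   # constant_initializer(0.0)
biases = np.zeros(3 * K, dtype=np.float32) + np.eye(3, dtype=np.float32).flatten()  # 0 + [1,0,0,0,1,0,0,0,1]

# Same steps as matmul + bias_add + reshape in get_transform
transform = (net @ weights + biases).reshape(batch_size, 3, K)   # [4, 3, 3]

points = rng.standard_normal((batch_size, 2048, 3)).astype(np.float32)
transformed = points @ transform   # batched matmul, [4, 2048, 3]

print(np.allclose(transform, np.eye(3)))   # True: identity for every cloud in the batch
print(np.allclose(transformed, points))    # True: the input is unchanged at initialization
```

Starting from the identity means the learned transform only departs from "do nothing" as far as training pushes it.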
3) The get_transform_K (feature transform) module
def get_transform_K(inputs, is_training, bn_decay=None, K = 3):
""" Transform Net, input is BxNx1xK gray image
Return:
Transformation matrix of size KxK
With batch_size 4, epoch 200 and point_num 2048:
inputs: [4X2048X1X128]
is_training: False when processing the validation set (eval_one_epoch), True when processing the training set (train_one_epoch)
"""
batch_size = inputs.get_shape()[0].value # 4
num_point = inputs.get_shape()[1].value # 2048
net = tf_util.conv2d(inputs, 256, [1,1], padding='VALID', stride=[1,1],
bn=True, is_training=is_training, scope='tconv1', bn_decay=bn_decay) # [4X2048X1X256]
net = tf_util.conv2d(net, 1024, [1,1], padding='VALID', stride=[1,1],
bn=True, is_training=is_training, scope='tconv2', bn_decay=bn_decay) # [4X2048X1X1024]
net = tf_util.max_pool2d(net, [num_point,1], padding='VALID', scope='tmaxpool') # [4X1X1X1024]
net = tf.reshape(net, [batch_size, -1]) # [4X1024]
net = tf_util.fully_connected(net, 512, bn=True, is_training=is_training, scope='tfc1', bn_decay=bn_decay) # [4X512]
net = tf_util.fully_connected(net, 256, bn=True, is_training=is_training, scope='tfc2', bn_decay=bn_decay) # [4X256]
with tf.variable_scope('transform_feat') as sc:
weights = tf.get_variable('weights', [256, K*K], initializer=tf.constant_initializer(0.0), dtype=tf.float32) # [256X16384]
biases = tf.get_variable('biases', [K*K], initializer=tf.constant_initializer(0.0), dtype=tf.float32) + tf.constant(np.eye(K).flatten(), dtype=tf.float32) # [16384]
transform = tf.matmul(net, weights) # [4X16384]
transform = tf.nn.bias_add(transform, biases) # [4X16384]
#transform = tf_util.fully_connected(net, 3*K, activation_fn=None, scope='tfc3')
transform = tf.reshape(transform, [batch_size, K, K]) # [4X128X128]
return transform # [4X128X128]
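Both T-Nets collapse the per-point features with max_pool2d over num_point. A small NumPy sketch of why this makes the pooled feature independent of the order of the points. It is a simplification under my own assumptions: `W`, `b` and `global_feature` are hypothetical, and one shared linear layer with ReLU stands in for the whole stack of convolutions.

```python
import numpy as np

rng = np.random.default_rng(0)
points = rng.standard_normal((2048, 3))

# A [1,3] or [1,1] convolution applies the same weights to every point,
# so it can be modelled as one matrix shared by all rows.
W = rng.standard_normal((3, 64))
b = rng.standard_normal(64)

def global_feature(pts):
    per_point = np.maximum(pts @ W + b, 0.0)   # shared layer + ReLU, [N, 64]
    return per_point.max(axis=0)               # max over the N points, [64]

shuffled = points[rng.permutation(len(points))]
print(np.allclose(global_feature(points), global_feature(shuffled)))  # True
```

Because each point is processed independently and max ignores ordering, shuffling the rows of the cloud leaves the pooled vector unchanged.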
4) The get_model module
def get_model(point_cloud, input_label, is_training, cat_num, part_num, \
batch_size, num_point, weight_decay, bn_decay=None):
""" ConvNet baseline, input is BxNx3 gray image
With batch_size 4, epoch 200 and point_num 2048:
Inputs:
point_cloud:(4,2048,3)
input_label:(4,16)
cat_num:16
part_num:50
batch_size:4
num_point:2048
weight_decay:0.0
"""
end_points = {}
with tf.variable_scope('transform_net1') as sc:
K = 3
'''get_transform implements the first T-Net, i.e. the input transform'''
transform = get_transform(point_cloud, is_training, bn_decay, K = 3) # [4X3X3]
point_cloud_transformed = tf.matmul(point_cloud, transform) # [4X2048X3]
input_image = tf.expand_dims(point_cloud_transformed, -1) # [4X2048X3X1]
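# The first kernel has size [1,3], which covers exactly one row of the "2D image", i.e. one point (its 3D coordinates); with 64 output channels the output tensor has shape BxNx1x64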
out1 = tf_util.conv2d(input_image, 64, [1,K], padding='VALID', stride=[1,1],
bn=True, is_training=is_training, scope='conv1', bn_decay=bn_decay) # [4X2048X1X64]
# The second kernel has size [1,1]; a 1x1 convolution only changes the channel count, so the output tensor has shape BxNx1x128
out2 = tf_util.conv2d(out1, 128, [1,1], padding='VALID', stride=[1,1],
bn=True, is_training=is_training, scope='conv2', bn_decay=bn_decay) # [4X2048X1X128]
out3 = tf_util.conv2d(out2, 128, [1,1], padding='VALID', stride=[1,1],
bn=True, is_training=is_training, scope='conv3', bn_decay=bn_decay) # [4X2048X1X128]
with tf.variable_scope('transform_net2') as sc:
K = 128
'''get_transform_K is the second T-Net, i.e. the feature transform'''
transform = get_transform_K(out3, is_training, bn_decay, K) # [4X128X128]
end_points['transform'] = transform # [4X128X128]
squeezed_out3 = tf.reshape(out3, [batch_size, num_point, 128]) # [4X2048X128]
net_transformed = tf.matmul(squeezed_out3, transform) # [4X2048X128]
net_transformed = tf.expand_dims(net_transformed, [2]) # [4X2048X1X128]
out4 = tf_util.conv2d(net_transformed, 512, [1,1], padding='VALID', stride=[1,1],
bn=True, is_training=is_training, scope='conv4', bn_decay=bn_decay) # [4X2048X1X512]
out5 = tf_util.conv2d(out4, 2048, [1,1], padding='VALID', stride=[1,1],
bn=True, is_training=is_training, scope='conv5', bn_decay=bn_decay) # [4X2048X1X2048]
out_max = tf_util.max_pool2d(out5, [num_point,1], padding='VALID', scope='maxpool') # [4X1X1X2048]
# classification network
net = tf.reshape(out_max, [batch_size, -1]) # [4X2048]
net = tf_util.fully_connected(net, 256, bn=True, is_training=is_training, scope='cla/fc1', bn_decay=bn_decay) # [4X256]
net = tf_util.fully_connected(net, 256, bn=True, is_training=is_training, scope='cla/fc2', bn_decay=bn_decay) # [4X256]
net = tf_util.dropout(net, keep_prob=0.7, is_training=is_training, scope='cla/dp1') # [4X256]
net = tf_util.fully_connected(net, cat_num, activation_fn=None, scope='cla/fc3') # [4X16]
# segmentation network
'''Concatenate out_max with one_hot_label_expand'''
one_hot_label_expand = tf.reshape(input_label, [batch_size, 1, 1, cat_num]) # [4X1X1X16]
out_max = tf.concat(axis=3, values=[out_max, one_hot_label_expand]) # [4X1X1X2064]
expand = tf.tile(out_max, [1, num_point, 1, 1]) # [4X2048X1X2064]
'''
input:
expand:[4X2048X1X2064]
out1:[4X2048X1X64]
out2:[4X2048X1X128]
out3:[4X2048X1X128]
out4:[4X2048X1X512]
out5:[4X2048X1X2048]
return:
net:[4X16]
net2:[4X2048X50]
end_points['transform']:[4X128X128]
'''
concat = tf.concat(axis=3, values=[expand, out1, out2, out3, out4, out5]) # [4X2048X1X4944]
net2 = tf_util.conv2d(concat, 256, [1,1], padding='VALID', stride=[1,1], bn_decay=bn_decay,
bn=True, is_training=is_training, scope='seg/conv1', weight_decay=weight_decay) # [4X2048X1X256]
net2 = tf_util.dropout(net2, keep_prob=0.8, is_training=is_training, scope='seg/dp1') # [4X2048X1X256]
net2 = tf_util.conv2d(net2, 256, [1,1], padding='VALID', stride=[1,1], bn_decay=bn_decay,
bn=True, is_training=is_training, scope='seg/conv2', weight_decay=weight_decay) # [4X2048X1X256]
net2 = tf_util.dropout(net2, keep_prob=0.8, is_training=is_training, scope='seg/dp2') # [4X2048X1X256]
net2 = tf_util.conv2d(net2, 128, [1,1], padding='VALID', stride=[1,1], bn_decay=bn_decay,
bn=True, is_training=is_training, scope='seg/conv3', weight_decay=weight_decay) # [4X2048X1X128]
net2 = tf_util.conv2d(net2, part_num, [1,1], padding='VALID', stride=[1,1], activation_fn=None,
bn=False, scope='seg/conv4', weight_decay=weight_decay) # [4X2048X1X50]
net2 = tf.reshape(net2, [batch_size, num_point, part_num]) # [4X2048X50]
return net, net2, end_points
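The channel counts 2064 and 4944 in the segmentation branch follow directly from the concatenations. A shape-only NumPy check, with zero arrays in place of real activations and num_point shrunk from 2048 to 8 purely to keep the arrays small:

```python
import numpy as np

batch_size, num_point, cat_num = 4, 8, 16   # num_point reduced for the sketch

out_max = np.zeros((batch_size, 1, 1, 2048), dtype=np.float32)    # pooled global feature
one_hot = np.zeros((batch_size, 1, 1, cat_num), dtype=np.float32) # one_hot_label_expand
out_max = np.concatenate([out_max, one_hot], axis=3)              # 2048 + 16 = 2064 channels
expand = np.tile(out_max, (1, num_point, 1, 1))                   # copy to every point

widths = [64, 128, 128, 512, 2048]          # channels of out1 .. out5
outs = [np.zeros((batch_size, num_point, 1, c), dtype=np.float32) for c in widths]
concat = np.concatenate([expand] + outs, axis=3)

print(expand.shape)   # (4, 8, 1, 2064)
print(concat.shape)   # (4, 8, 1, 4944)
```

Each point therefore sees its own local features from every layer next to the same global feature and category label, which is what lets the 1x1 convolutions that follow predict a part label per point.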
5)train.py
def train():
with tf.Graph().as_default():
with tf.device('/gpu:'+str(FLAGS.gpu)):
pointclouds_ph, input_label_ph, labels_ph, seg_ph = placeholder_inputs()
is_training_ph = tf.placeholder(tf.bool, shape=())
'''
batch: step counter variable, initialized to 0
learning_rate: exponentially decayed learning rate
bn_decay: batch-normalization decay rate
'''
batch = tf.Variable(0, trainable=False)
learning_rate = tf.train.exponential_decay(
BASE_LEARNING_RATE, # base learning rate
batch * batch_size, # global_var indicating the number of steps
DECAY_STEP, # step size
DECAY_RATE, # decay rate
staircase=True # Stair-case or continuous decreasing
)
learning_rate = tf.maximum(learning_rate, LEARNING_RATE_CLIP)
bn_momentum = tf.train.exponential_decay(
BN_INIT_DECAY,
batch*batch_size,
BN_DECAY_DECAY_STEP,
BN_DECAY_DECAY_RATE,
staircase=True)
bn_decay = tf.minimum(BN_DECAY_CLIP, 1 - bn_momentum)
lr_op = tf.summary.scalar('learning_rate', learning_rate)
batch_op = tf.summary.scalar('batch_number', batch)
bn_decay_op = tf.summary.scalar('bn_decay', bn_decay)
'''labels_pred: [4X16], seg_pred: [4X2048X50], end_points['transform']: [4X128X128]'''
labels_pred, seg_pred, end_points = model.get_model(pointclouds_ph, input_label_ph, \
is_training=is_training_ph, bn_decay=bn_decay, cat_num=NUM_CATEGORIES, \
part_num=NUM_PART_CATS, batch_size=batch_size, num_point=point_num, weight_decay=FLAGS.wd)
# model.py defines both classification net and segmentation net, which share the common global feature extractor network.
# In model.get_loss, we define the total loss to be weighted sum of the classification and segmentation losses.
# Here, we only train for segmentation network. Thus, we set weight to be 1.0.
loss, label_loss, per_instance_label_loss, seg_loss, per_instance_seg_loss, per_instance_seg_pred_res \
= model.get_loss(labels_pred, seg_pred, labels_ph, seg_ph, 1.0, end_points)
total_training_loss_ph = tf.placeholder(tf.float32, shape=())
total_testing_loss_ph = tf.placeholder(tf.float32, shape=())
label_training_loss_ph = tf.placeholder(tf.float32, shape=())
label_testing_loss_ph = tf.placeholder(tf.float32, shape=())
seg_training_loss_ph = tf.placeholder(tf.float32, shape=())
seg_testing_loss_ph = tf.placeholder(tf.float32, shape=())
label_training_acc_ph = tf.placeholder(tf.float32, shape=())
label_testing_acc_ph = tf.placeholder(tf.float32, shape=())
label_testing_acc_avg_cat_ph = tf.placeholder(tf.float32, shape=())
seg_training_acc_ph = tf.placeholder(tf.float32, shape=())
seg_testing_acc_ph = tf.placeholder(tf.float32, shape=())
seg_testing_acc_avg_cat_ph = tf.placeholder(tf.float32, shape=())
total_train_loss_sum_op = tf.summary.scalar('total_training_loss', total_training_loss_ph)
total_test_loss_sum_op = tf.summary.scalar('total_testing_loss', total_testing_loss_ph)
label_train_loss_sum_op = tf.summary.scalar('label_training_loss', label_training_loss_ph)
label_test_loss_sum_op = tf.summary.scalar('label_testing_loss', label_testing_loss_ph)
seg_train_loss_sum_op = tf.summary.scalar('seg_training_loss', seg_training_loss_ph)
seg_test_loss_sum_op = tf.summary.scalar('seg_testing_loss', seg_testing_loss_ph)
label_train_acc_sum_op = tf.summary.scalar('label_training_acc', label_training_acc_ph)
label_test_acc_sum_op = tf.summary.scalar('label_testing_acc', label_testing_acc_ph)
label_test_acc_avg_cat_op = tf.summary.scalar('label_testing_acc_avg_cat', label_testing_acc_avg_cat_ph)
seg_train_acc_sum_op = tf.summary.scalar('seg_training_acc', seg_training_acc_ph)
seg_test_acc_sum_op = tf.summary.scalar('seg_testing_acc', seg_testing_acc_ph)
seg_test_acc_avg_cat_op = tf.summary.scalar('seg_testing_acc_avg_cat', seg_testing_acc_avg_cat_ph)
train_variables = tf.trainable_variables()
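# Adam is the optimization algorithm; train_variables holds the trainable parameters
trainer = tf.train.AdamOptimizer(learning_rate)
# Build the op that performs one optimization step and increments batch
train_op = trainer.minimize(loss, var_list=train_variables, global_step=batch)
saver = tf.train.Saver() # saves and restores model checkpoints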
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
config.allow_soft_placement = True
sess = tf.Session(config=config)
init = tf.global_variables_initializer() # op that initializes all global variables
sess.run(init) # the graph is built; run the initializer in the session
train_writer = tf.summary.FileWriter(SUMMARIES_FOLDER + '/train', sess.graph)
test_writer = tf.summary.FileWriter(SUMMARIES_FOLDER + '/test')
train_file_list = provider.getDataFiles(TRAINING_FILE_LIST)
num_train_file = len(train_file_list)
test_file_list = provider.getDataFiles(TESTING_FILE_LIST)
num_test_file = len(test_file_list)
fcmd = open(os.path.join(LOG_STORAGE_PATH, 'cmd.txt'), 'w')
fcmd.write(str(FLAGS))
fcmd.close()
# write logs to the disk
flog = open(os.path.join(LOG_STORAGE_PATH, 'log.txt'), 'w')
def train_one_epoch(train_file_idx, epoch_num):
is_training = True
for i in range(num_train_file):
cur_train_filename = os.path.join(hdf5_data_dir, train_file_list[train_file_idx[i]])
printout(flog, 'Loading train file ' + cur_train_filename)
cur_data, cur_labels, cur_seg = provider.loadDataFile_with_seg(cur_train_filename)
cur_data, cur_labels, order = provider.shuffle_data(cur_data, np.squeeze(cur_labels))
cur_seg = cur_seg[order, ...]
cur_labels_one_hot = convert_label_to_one_hot(cur_labels)
num_data = len(cur_labels)
num_batch = num_data // batch_size
total_loss = 0.0
total_label_loss = 0.0
total_seg_loss = 0.0
total_label_acc = 0.0
total_seg_acc = 0.0
for j in range(num_batch):
begidx = j * batch_size
endidx = (j + 1) * batch_size
feed_dict = {
pointclouds_ph: cur_data[begidx: endidx, ...],
labels_ph: cur_labels[begidx: endidx, ...],
input_label_ph: cur_labels_one_hot[begidx: endidx, ...],
seg_ph: cur_seg[begidx: endidx, ...],
is_training_ph: is_training,
}
_, loss_val, label_loss_val, seg_loss_val, per_instance_label_loss_val, \
per_instance_seg_loss_val, label_pred_val, seg_pred_val, pred_seg_res \
= sess.run([train_op, loss, label_loss, seg_loss, per_instance_label_loss, \
per_instance_seg_loss, labels_pred, seg_pred, per_instance_seg_pred_res], \
feed_dict=feed_dict)
per_instance_part_acc = np.mean(pred_seg_res == cur_seg[begidx: endidx, ...], axis=1)
average_part_acc = np.mean(per_instance_part_acc)
total_loss += loss_val
total_label_loss += label_loss_val
total_seg_loss += seg_loss_val
per_instance_label_pred = np.argmax(label_pred_val, axis=1)
total_label_acc += np.mean(np.float32(per_instance_label_pred == cur_labels[begidx: endidx, ...]))
total_seg_acc += average_part_acc
total_loss = total_loss * 1.0 / num_batch
total_label_loss = total_label_loss * 1.0 / num_batch
total_seg_loss = total_seg_loss * 1.0 / num_batch
total_label_acc = total_label_acc * 1.0 / num_batch
total_seg_acc = total_seg_acc * 1.0 / num_batch
lr_sum, bn_decay_sum, batch_sum, train_loss_sum, train_label_acc_sum, \
train_label_loss_sum, train_seg_loss_sum, train_seg_acc_sum = sess.run(\
[lr_op, bn_decay_op, batch_op, total_train_loss_sum_op, label_train_acc_sum_op, \
label_train_loss_sum_op, seg_train_loss_sum_op, seg_train_acc_sum_op], \
feed_dict={total_training_loss_ph: total_loss, label_training_loss_ph: total_label_loss, \
seg_training_loss_ph: total_seg_loss, label_training_acc_ph: total_label_acc, \
seg_training_acc_ph: total_seg_acc})
train_writer.add_summary(train_loss_sum, i + epoch_num * num_train_file)
train_writer.add_summary(train_label_loss_sum, i + epoch_num * num_train_file)
train_writer.add_summary(train_seg_loss_sum, i + epoch_num * num_train_file)
train_writer.add_summary(lr_sum, i + epoch_num * num_train_file)
train_writer.add_summary(bn_decay_sum, i + epoch_num * num_train_file)
train_writer.add_summary(train_label_acc_sum, i + epoch_num * num_train_file)
train_writer.add_summary(train_seg_acc_sum, i + epoch_num * num_train_file)
train_writer.add_summary(batch_sum, i + epoch_num * num_train_file)
printout(flog, '\tTraining Total Mean_loss: %f' % total_loss)
printout(flog, '\t\tTraining Label Mean_loss: %f' % total_label_loss)
printout(flog, '\t\tTraining Label Accuracy: %f' % total_label_acc)
printout(flog, '\t\tTraining Seg Mean_loss: %f' % total_seg_loss)
printout(flog, '\t\tTraining Seg Accuracy: %f' % total_seg_acc)
def eval_one_epoch(epoch_num):
is_training = False
total_loss = 0.0
total_label_loss = 0.0
total_seg_loss = 0.0
total_label_acc = 0.0
total_seg_acc = 0.0
total_seen = 0
# NUM_CATEGORIES = 16
total_label_acc_per_cat = np.zeros((NUM_CATEGORIES)).astype(np.float32) # per-category count of correctly classified shapes
total_seg_acc_per_cat = np.zeros((NUM_CATEGORIES)).astype(np.float32) # per-category sum of per-shape segmentation accuracies
total_seen_per_cat = np.zeros((NUM_CATEGORIES)).astype(np.int32) # number of shapes seen per category
for i in range(num_test_file):
cur_test_filename = os.path.join(hdf5_data_dir, test_file_list[i])
printout(flog, 'Loading test file ' + cur_test_filename)
'''
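cur_data (2048,2048,3): point clouds from the test file
cur_labels: the object category of each point cloud, one of 16 classes; it is squeezed to integer IDs and then converted to one-hot form (2048,16) below
cur_seg (2048,2048): the part label of each point, one of 50 part classes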
'''
cur_data, cur_labels, cur_seg = provider.loadDataFile_with_seg(cur_test_filename)
cur_labels = np.squeeze(cur_labels)
# Convert all labels to one-hot form
cur_labels_one_hot = convert_label_to_one_hot(cur_labels)
num_data = len(cur_labels)
num_batch = num_data // batch_size
for j in range(num_batch): # run batch by batch
begidx = j * batch_size # start index of the batch
endidx = (j + 1) * batch_size # end index of the batch (exclusive)
feed_dict = {
pointclouds_ph: cur_data[begidx: endidx, ...],
labels_ph: cur_labels[begidx: endidx, ...],
input_label_ph: cur_labels_one_hot[begidx: endidx, ...],
seg_ph: cur_seg[begidx: endidx, ...],
is_training_ph: is_training,
}
loss_val, label_loss_val, seg_loss_val, per_instance_label_loss_val, \
per_instance_seg_loss_val, label_pred_val, seg_pred_val, pred_seg_res \
= sess.run([loss, label_loss, seg_loss, per_instance_label_loss, \
per_instance_seg_loss, labels_pred, seg_pred, per_instance_seg_pred_res], \
feed_dict=feed_dict)
# Part-segmentation accuracy of each shape
per_instance_part_acc = np.mean(pred_seg_res == cur_seg[begidx: endidx, ...], axis=1)
# Mean part accuracy over the 4 shapes in this batch
average_part_acc = np.mean(per_instance_part_acc)
total_seen += 1
total_loss += loss_val
total_label_loss += label_loss_val
total_seg_loss += seg_loss_val
# Predicted category label for each of the 4 shapes in the batch
per_instance_label_pred = np.argmax(label_pred_val, axis=1)
# Accuracy of the predicted labels, averaged over the batch and accumulated
total_label_acc += np.mean(np.float32(per_instance_label_pred == cur_labels[begidx: endidx, ...]))
# Accumulate the mean part-segmentation accuracy
total_seg_acc += average_part_acc
for shape_idx in range(begidx, endidx):
# Number of shapes tested so far in each category
total_seen_per_cat[cur_labels[shape_idx]] += 1
# Per-category count of correct labels: add 1 to the category's slot when the predicted label matches the ground truth
total_label_acc_per_cat[cur_labels[shape_idx]] += np.int32(per_instance_label_pred[shape_idx-begidx] == cur_labels[shape_idx])
# Accumulate each shape's segmentation accuracy under its category
total_seg_acc_per_cat[cur_labels[shape_idx]] += per_instance_part_acc[shape_idx - begidx]
total_loss = total_loss * 1.0 / total_seen
total_label_loss = total_label_loss * 1.0 / total_seen
total_seg_loss = total_seg_loss * 1.0 / total_seen
total_label_acc = total_label_acc * 1.0 / total_seen
total_seg_acc = total_seg_acc * 1.0 / total_seen
test_loss_sum, test_label_acc_sum, test_label_loss_sum, test_seg_loss_sum, test_seg_acc_sum = sess.run(\
[total_test_loss_sum_op, label_test_acc_sum_op, label_test_loss_sum_op, seg_test_loss_sum_op, seg_test_acc_sum_op], \
feed_dict={total_testing_loss_ph: total_loss, label_testing_loss_ph: total_label_loss, \
seg_testing_loss_ph: total_seg_loss, label_testing_acc_ph: total_label_acc, seg_testing_acc_ph: total_seg_acc})
test_writer.add_summary(test_loss_sum, (epoch_num+1) * num_train_file-1)
test_writer.add_summary(test_label_loss_sum, (epoch_num+1) * num_train_file-1)
test_writer.add_summary(test_seg_loss_sum, (epoch_num+1) * num_train_file-1)
test_writer.add_summary(test_label_acc_sum, (epoch_num+1) * num_train_file-1)
test_writer.add_summary(test_seg_acc_sum, (epoch_num+1) * num_train_file-1)
printout(flog, '\tTesting Total Mean_loss: %f' % total_loss)
printout(flog, '\t\tTesting Label Mean_loss: %f' % total_label_loss)
printout(flog, '\t\tTesting Label Accuracy: %f' % total_label_acc)
printout(flog, '\t\tTesting Seg Mean_loss: %f' % total_seg_loss)
printout(flog, '\t\tTesting Seg Accuracy: %f' % total_seg_acc)
for cat_idx in range(NUM_CATEGORIES):
if total_seen_per_cat[cat_idx] > 0:
printout(flog, '\n\t\tCategory %s Object Number: %d' % (all_obj_cats[cat_idx][0], total_seen_per_cat[cat_idx]))
printout(flog, '\t\tCategory %s Label Accuracy: %f' % (all_obj_cats[cat_idx][0], total_label_acc_per_cat[cat_idx]/total_seen_per_cat[cat_idx]))
printout(flog, '\t\tCategory %s Seg Accuracy: %f' % (all_obj_cats[cat_idx][0], total_seg_acc_per_cat[cat_idx]/total_seen_per_cat[cat_idx]))
if not os.path.exists(MODEL_STORAGE_PATH):
os.mkdir(MODEL_STORAGE_PATH)
for epoch in range(TRAINING_EPOCHES): # number of training epochs
printout(flog, '\n<<< Testing on the test dataset ...')
eval_one_epoch(epoch)
printout(flog, '\n>>> Training for the epoch %d/%d ...' % (epoch, TRAINING_EPOCHES))
train_file_idx = np.arange(0, len(train_file_list))
np.random.shuffle(train_file_idx) # shuffle the order of the training files
# train_one_epoch does what eval_one_epoch does, plus the optimizer step (train_op)
train_one_epoch(train_file_idx, epoch)
if (epoch+1) % 10 == 0:
# Save the trained model
cp_filename = saver.save(sess, os.path.join(MODEL_STORAGE_PATH, 'epoch_' + str(epoch+1)+'.ckpt'))
printout(flog, 'Successfully store the checkpoint model into ' + cp_filename)
flog.flush()
flog.close()
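The two schedules built at the top of train() can be reproduced in plain Python to see how they behave. This is a sketch under stated assumptions: `exponential_decay` re-implements the formula of tf.train.exponential_decay with staircase=True, and every constant value below is an illustrative placeholder of mine, not a value read from train.py.

```python
# Illustrative placeholder values (assumptions, not taken from train.py)
BASE_LEARNING_RATE = 0.001
DECAY_STEP = 1000
DECAY_RATE = 0.5
LEARNING_RATE_CLIP = 1e-5
BN_INIT_DECAY = 0.5
BN_DECAY_DECAY_STEP = 2000
BN_DECAY_DECAY_RATE = 0.5
BN_DECAY_CLIP = 0.99

def exponential_decay(base, global_step, decay_steps, decay_rate, staircase=True):
    """base * decay_rate ** (global_step / decay_steps), floored when staircase=True."""
    exponent = global_step // decay_steps if staircase else global_step / decay_steps
    return base * decay_rate ** exponent

def learning_rate_at(step, batch_size=4):
    # step plays the role of the `batch` variable; step * batch_size counts samples seen
    lr = exponential_decay(BASE_LEARNING_RATE, step * batch_size, DECAY_STEP, DECAY_RATE)
    return max(lr, LEARNING_RATE_CLIP)        # never drop below the clip value

def bn_decay_at(step, batch_size=4):
    bn_momentum = exponential_decay(BN_INIT_DECAY, step * batch_size,
                                    BN_DECAY_DECAY_STEP, BN_DECAY_DECAY_RATE)
    return min(BN_DECAY_CLIP, 1 - bn_momentum)  # grows toward the clip value

for step in (0, 249, 250, 500, 10**6):
    print(step, learning_rate_at(step), bn_decay_at(step))
```

With staircase=True the learning rate stays flat and halves each time another DECAY_STEP samples have been seen, while bn_decay moves the opposite way, rising from 1 - BN_INIT_DECAY toward BN_DECAY_CLIP.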
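III. Training results (obtaining the trained model)
After training for 200 epochs, the results can be found under …/part_seg/train_results, as shown in the figure below:
- train_results/logs: stores the log data from training, with the Accuracy and IoU values of each object for every epoch
1. Total Point: point_num (number of points in the instance)
2. Ground Truth: objnames[cur_gt_label] (category of the instance)
3. Predict: objnames[label_pred_val] (predicted category)
4. Accuracy: seg_acc = np.mean(seg_pred_val == seg) (fraction of the points that are segmented correctly)
5. IoU: avg_iou = total_iou / len(iou_oids) (the IoU of each part is summed, then averaged)
6. IoU details: n_pred, n_gt, n_intersect, n_union and n_intersect / n_union for each part, one line per part
7. Overall Accuracy over the 2874 instances: total_acc / total_seen (accumulated accuracy divided by the number of instances seen)
8. Overall IoU: total_acc_iou / total_seen, where total_acc_iou accumulates the per-instance values (total_acc_iou += avg_iou)
- train_results/summaries: can be viewed as charts with the TensorBoard visualization tool, as follows:
1. Open Anaconda Prompt and activate the virtual environment
2. Run tensorboard --logdir=<path> (the absolute path of the summaries folder)
3. Open the local server address that it prints in a browser
Visualization (accuracy, IoU, and the Graphs view can be inspected):
- train_results/trained_models: holds the trained models; the best one is usually chosen for testing.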
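The Accuracy and IoU entries in the logs can be computed per instance along the following lines. This is a sketch that reuses the names from the log fields (seg_acc, n_pred, n_gt, n_intersect, n_union, avg_iou); the function `part_accuracy_and_iou` is hypothetical, and treating a part that is absent from both prediction and ground truth as IoU 1 is my assumption.

```python
import numpy as np

def part_accuracy_and_iou(seg_pred, seg_gt, part_ids):
    """Point accuracy and mean part IoU for one shape."""
    seg_acc = np.mean(seg_pred == seg_gt)          # fraction of correctly labelled points
    total_iou = 0.0
    for oid in part_ids:
        n_pred = np.sum(seg_pred == oid)           # points predicted as this part
        n_gt = np.sum(seg_gt == oid)               # points that truly belong to this part
        n_intersect = np.sum((seg_pred == oid) & (seg_gt == oid))
        n_union = n_pred + n_gt - n_intersect
        # Assumption: a part missing from both prediction and ground truth counts as IoU 1
        total_iou += 1.0 if n_union == 0 else n_intersect / n_union
    return seg_acc, total_iou / len(part_ids)

# Toy shape with 8 points and two parts (0 and 1)
seg_gt = np.array([0, 0, 0, 1, 1, 1, 1, 1])
seg_pred = np.array([0, 0, 1, 1, 1, 1, 1, 0])
acc, iou = part_accuracy_and_iou(seg_pred, seg_gt, part_ids=[0, 1])
print(acc)   # 0.75
print(iou)   # (2/4 + 4/6) / 2, about 0.583
```

Six of the eight points match, so the accuracy is 0.75, while the IoU is lower because each wrong point counts against both the part it was assigned to and the part it belongs to.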