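Preface
1. For a detailed introduction to R-CNN, see my other blog post:
https://blog.csdn.net/u014796085/article/details/83352821
2. The project is fairly large and depends on its datasets, so the code in this post is not meant for readers who need the complete project. I only split the main parts of the project into pieces and explain them here, as a reference for anyone with questions about a particular part and as a record of my own code work for later review.
3. The program is adapted from an existing GitHub project:
https://github.com/yangxue0827/RCNN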
Code implementation and walkthrough
Pre-training on the large training set
import tflearn
from tflearn.layers.core import input_data, dropout, fully_connected
from tflearn.layers.conv import conv_2d, max_pool_2d
from tflearn.layers.normalization import local_response_normalization
from tflearn.layers.estimator import regression
# `config` is the project's own settings module; it provides IMAGE_SIZE.

# Building 'AlexNet'
def create_alexnet(num_classes):
    network = input_data(shape=[None, config.IMAGE_SIZE, config.IMAGE_SIZE, 3])
    # conv_2d arguments: 4-D input tensor, number of filters, filter size, stride
    network = conv_2d(network, 96, 11, strides=4, activation='relu')
    network = max_pool_2d(network, 3, strides=2)
    # Local response normalization
    network = local_response_normalization(network)
    network = conv_2d(network, 256, 5, activation='relu')
    network = max_pool_2d(network, 3, strides=2)
    network = local_response_normalization(network)
    network = conv_2d(network, 384, 3, activation='relu')
    network = conv_2d(network, 384, 3, activation='relu')
    network = conv_2d(network, 256, 3, activation='relu')
    network = max_pool_2d(network, 3, strides=2)
    network = local_response_normalization(network)
    network = fully_connected(network, 4096, activation='tanh')
    network = dropout(network, 0.5)
    network = fully_connected(network, 4096, activation='tanh')
    network = dropout(network, 0.5)
    network = fully_connected(network, num_classes, activation='softmax')
    momentum = tflearn.Momentum(learning_rate=0.001, lr_decay=0.95, decay_step=200)
    network = regression(network, optimizer=momentum,
                         loss='categorical_crossentropy')
    return network
This defines the AlexNet network. Here num_classes is the number of classes in the large training set.
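To make the layer stack above easier to follow, here is a minimal sketch that traces the spatial size of the feature maps through the convolution and pooling layers. It assumes TFLearn's default padding='same' for conv_2d and max_pool_2d (no padding argument is passed in the network definition), so each layer outputs ceil(input / stride). The helper names same_out and trace_alexnet_shapes, and the 227-pixel input, are illustrative assumptions and not part of the project.

```python
import math

def same_out(size, stride):
    """Spatial size after a 'same'-padded conv/pool layer with the given stride."""
    return math.ceil(size / stride)

def trace_alexnet_shapes(image_size):
    """Return (layer name, height/width, channels) for each conv/pool layer."""
    # (name, stride, output channels; None means the channel count is unchanged)
    layers = [
        ("conv1 11x11", 4, 96),
        ("pool1 3x3", 2, None),
        ("conv2 5x5", 1, 256),
        ("pool2 3x3", 2, None),
        ("conv3 3x3", 1, 384),
        ("conv4 3x3", 1, 384),
        ("conv5 3x3", 1, 256),
        ("pool5 3x3", 2, None),
    ]
    size, channels = image_size, 3
    shapes = []
    for name, stride, out_channels in layers:
        size = same_out(size, stride)
        if out_channels is not None:
            channels = out_channels
        shapes.append((name, size, channels))
    return shapes

# Assumed input size; the real value comes from config.IMAGE_SIZE.
shapes = trace_alexnet_shapes(227)
for name, size, channels in shapes:
    print(f"{name:12s} -> {size} x {size} x {channels}")

# Local response normalization keeps the shape, so the last pooling output
# is what gets flattened into the first fully connected layer.
_, size, channels = shapes[-1]
flat_features = size * size * channels
print("inputs to the first fully connected layer:", flat_features)
```

Under these assumptions a 227 x 227 input ends at 8 x 8 x 256, that is 16384 values fed into the first 4096-unit layer. With 'valid' padding the sizes would be smaller, so treat the numbers as a property of this sketch and not of the project.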
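import codecs
import pickle
import cv2
import numpy as np
# `config` and `prep` are the project's own modules (settings and image preprocessing).

def load_data(datafile, num_class, save=False, save_path='dataset.pkl'):
    fr = codecs.open(datafile, 'r', 'utf-8')
    train_list = fr.readlines()
    labels = []
    images = []
    # For each training sample (one "path class_index" line)
    for line in train_list:
        tmp = line.strip().split(' ')
        fpath = tmp[0]
        img = cv2.imread(fpath)
        # Resize the sample to 227x227 and store it as a float32 array
        img = prep.resize_image(img, config.IMAGE_SIZE, config.IMAGE_SIZE)
        np_img = np.asarray(img, dtype="float32")
        images.append(np_img)
        # Build the one-hot label from the class index
        index = int(tmp[1])
        label = np.zeros(num_class)
        label[index] = 1
        labels.append(label)
    if save:
        # Serialize the dataset to disk
        with open(save_path, 'wb') as f:
            pickle.dump((images, labels), f)
    fr.close()
    return images, labels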
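Sample preprocessing: every sample is resized, converted to a float32 array, and paired with a one-hot label; the whole set can optionally be pickled to disk.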
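The list-file format that load_data expects (an image path and a class index separated by a space) and the pickle step can be illustrated without any images. The sketch below is dependency-free: it uses plain Python lists where the original uses np.zeros, an in-memory buffer where the original writes dataset.pkl, and made-up file paths. The helper name parse_list_line is hypothetical.

```python
import io
import pickle

def parse_list_line(line, num_class):
    """Split one 'path index' line into the image path and a one-hot label."""
    fields = line.strip().split(' ')
    fpath, index = fields[0], int(fields[1])
    label = [0.0] * num_class   # plain-list stand-in for np.zeros(num_class)
    label[index] = 1.0
    return fpath, label

# Hypothetical list-file contents; real paths depend on the dataset layout.
sample_lines = [
    "images/example_0001.jpg 0\n",
    "images/example_0002.jpg 2\n",
]
parsed = [parse_list_line(line, num_class=3) for line in sample_lines]
paths = [path for path, _ in parsed]
labels = [label for _, label in parsed]

# Round trip through pickle, using an in-memory buffer instead of a file.
buffer = io.BytesIO()
pickle.dump((paths, labels), buffer)
buffer.seek(0)
restored_paths, restored_labels = pickle.load(buffer)
print(restored_paths)
print(restored_labels)
```

Note that a path containing a space would break the split(' ') parsing, in this sketch and in load_data alike.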
import os
import tflearn

def train(network, X, Y, save_model_path):
    # Training
    model = tflearn.DNN(network, checkpoint_path='model_alexnet',
                        max_checkpoints=1, tensorboard_verbose=2, tensorboard_dir='output')
    # Resume from a previously saved model if one exists
    if os.path.isfile(save_model_path + '.index'):
        model.load(save_model_path)
        print('load model...')
    model.fit(X, Y, n_epoch=200, validation_set=0.1, shuffle=True,
              show_metric=True, batch_size=64, snapshot_step=200,
              snapshot_epoch=False, run_id='alexnet_oxflowers17')  # epoch = 1000
    # Save the model
    model.save(save_model_path)
    print('save model...')
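This trains the model. X and Y are the samples and their labels. Training runs for 200 epochs (n_epoch=200), 10% of the training set is held out as a validation set (used to compute the model's prediction accuracy), and each training step processes a batch of 64 samples. If a model was already saved at save_model_path, it is loaded first so that training resumes from it.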
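To get a feel for these numbers, the following sketch estimates how many batch steps one epoch takes and how far the learning rate has decayed by the end of training. It rests on several assumptions: a hypothetical sample count of 1360 (the size of the 17-category Oxford flowers set that the run_id hints at), a validation split rounded down, and the standard exponential decay formula lr = base_lr * decay_rate^(step / decay_step) for the Momentum settings used in create_alexnet. The function names are illustrative, and whether the decay is smooth or staircase in the actual run is not confirmed by the post.

```python
import math

def steps_per_epoch(num_samples, validation_fraction=0.1, batch_size=64):
    """Batch steps in one epoch after holding out the validation split."""
    # Assumption: the validation share is rounded down to a whole sample count.
    num_train = num_samples - int(num_samples * validation_fraction)
    return math.ceil(num_train / batch_size)

def decayed_lr(step, base_lr=0.001, decay_rate=0.95, decay_step=200, staircase=False):
    """Exponentially decayed learning rate at a given global step."""
    exponent = step / decay_step
    if staircase:
        # Staircase mode only lowers the rate once every decay_step steps.
        exponent = math.floor(exponent)
    return base_lr * decay_rate ** exponent

num_samples = 1360            # hypothetical dataset size
per_epoch = steps_per_epoch(num_samples)
total_steps = per_epoch * 200  # n_epoch=200

print("steps per epoch:", per_epoch)
print("total steps:", total_steps)
print("learning rate at the last step:", decayed_lr(total_steps))
```

With snapshot_step=200 a checkpoint is also written every 200 steps, which under these assumptions is once every ten epochs.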
Fine-tuning on the small dataset
# Use an already trained AlexNet with the last layer redesigned
def create_alexnet(num_classes, restore=False):
    # Building 'AlexNet'
    network = input_data(shape=[None, config.IMAGE_SIZE, config.IMAGE_SIZE, 3])
    network = conv_2d(network, 96, 11, strides=4, activation='relu')
    network = max_pool_2d(network, 3, strides=2)
    network = local_response_normalization(network)
    network = conv_2d(network, 256, 5, activation='relu')
    network = max_pool_2d(netw