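Most tutorials online cover classification and object detection; few cover training a regression model.
Today I use DIGITS to try out a regression problem.
Regression is much like classification, except that the network's raw output is used directly as the result. The official example predicts the direction of an image gradient, and the input images are 50x50.
Here is a brief record of the process: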
1 Install the Image Gradients extensions
Install DIGITS first, then run the following commands:
# export DIGITS_ROOT=<your DIGITS installation directory>
# sudo pip install -e $DIGITS_ROOT
Then install the data and view plug-ins for Image Gradients:
$ pip install $DIGITS_ROOT/plugins/data/imageGradients
$ pip install $DIGITS_ROOT/plugins/view/imageGradients
2 Create the data
There are two ways to create the data: through the web app, or with a script.
Method 1:
In the Datasets tab, choose New Dataset > Images > Gradients.
Method 2:
Create the LMDBs with a script:
$ ./digits/dataset/images/generic/test_lmdb_creator.py -x 50 -y 50 -c 1000 /tmp/my_dataset
Here is the core part of that script:
def create_lmdbs(folder, image_width=None, image_height=None, image_count=None):
    """
    Creates LMDBs for generic inference
    Returns the filename for a test image
    Creates these files in "folder":
        train_images/
        train_labels/
        val_images/
        val_labels/
        mean.binaryproto
        test.png
    """
    if image_width is None:
        image_width = IMAGE_SIZE
    if image_height is None:
        image_height = IMAGE_SIZE
    if image_count is None:
        train_image_count = TRAIN_IMAGE_COUNT
    else:
        train_image_count = image_count
    val_image_count = VAL_IMAGE_COUNT
    # Used to calculate the gradients later
    yy, xx = np.mgrid[:image_height, :image_width].astype('float')
    for phase, image_count in [
            ('train', train_image_count),
            ('val', val_image_count)]:
        image_db = lmdb.open(os.path.join(folder, '%s_images' % phase),
                             map_async=True,
                             max_dbs=0)
        label_db = lmdb.open(os.path.join(folder, '%s_labels' % phase),
                             map_async=True,
                             max_dbs=0)
        image_sum = np.zeros((image_height, image_width), 'float64')
        for i in xrange(image_count):
            xslope, yslope = np.random.random_sample(2) - 0.5
            a = xslope * 255 / image_width
            b = yslope * 255 / image_height
            # a and b are the random x and y slopes rescaled to intensity per pixel;
            # the 2-value label stored below is the unscaled pair (xslope, yslope)
            # image is the generated gradient image
            image = a * (xx - image_width / 2) + b * (yy - image_height / 2) + 127.5
            image_sum += image
            image = image.astype('uint8')
            pil_img = PIL.Image.fromarray(image)
            # create image Datum
            image_datum = caffe_pb2.Datum()
            image_datum.height = image.shape[0]
            image_datum.width = image.shape[1]
            image_datum.channels = 1
            s = StringIO()
            pil_img.save(s, format='PNG')
            image_datum.data = s.getvalue()
            image_datum.encoded = True
            _write_to_lmdb(image_db, str(i), image_datum.SerializeToString())
            # create label Datum
            label_datum = caffe_pb2.Datum()
            label_datum.channels, label_datum.height, label_datum.width = 1, 1, 2
            label_datum.float_data.extend(np.array([xslope, yslope]).flat)
            _write_to_lmdb(label_db, str(i), label_datum.SerializeToString())
        # close databases
        image_db.close()
        label_db.close()
        # save mean
        mean_image = (image_sum / image_count).astype('uint8')
        _save_mean(mean_image, os.path.join(folder, '%s_mean.png' % phase))
        _save_mean(mean_image, os.path.join(folder, '%s_mean.binaryproto' % phase))
    # create test image
    # The network should be able to easily produce two numbers >1
    xslope, yslope = 0.5, 0.5
    a = xslope * 255 / image_width
    b = yslope * 255 / image_height
    test_image = a * (xx - image_width / 2) + b * (yy - image_height / 2) + 127.5
    test_image = test_image.astype('uint8')
    pil_img = PIL.Image.fromarray(test_image)
    test_image_filename = os.path.join(folder, 'test.png')
    pil_img.save(test_image_filename)
    return test_image_filename
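The code above uses numpy.mgrid, which returns two 2-D arrays: xx holds each pixel's column index (0 to width - 1) and yy holds each pixel's row index (0 to height - 1). Then
image = a * (xx - image_width / 2) + b * (yy - image_height / 2) + 127.5
produces a gradient image determined by a and b. This is a little hard to follow at first: the expression is a plane whose value is mid-gray (127.5) at the image center, and whose brightness changes by a per pixel along x and by b per pixel along y.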
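To make the mgrid step concrete, here is a minimal standalone sketch of the same formula. It is written for Python 3 with NumPy only, and the function name make_gradient_image is my own, not part of DIGITS.

```python
import numpy as np

def make_gradient_image(xslope, yslope, width=50, height=50):
    """Return a float image whose brightness is a plane tilted by the two slopes.

    Hypothetical helper that mirrors the formula in the DIGITS script.
    """
    # yy[r, c] == r (row index) and xx[r, c] == c (column index)
    yy, xx = np.mgrid[:height, :width].astype("float")
    a = xslope * 255 / width    # intensity change per pixel along x
    b = yslope * 255 / height   # intensity change per pixel along y
    # Plane that passes through mid-gray (127.5) at the image center
    return a * (xx - width / 2) + b * (yy - height / 2) + 127.5

image = make_gradient_image(0.5, 0.5)
print(image.shape)                 # image dimensions (height, width)
print(image[25, 25])               # value at the center
print(image[0, 1] - image[0, 0])   # brightness step between neighboring columns
```

Because both slopes are drawn from [-0.5, 0.5), the plane stays within the 0 to 255 range before it is cast to uint8.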
Then, in DIGITS, choose New Dataset > Images > Other to load the LMDBs created by the script.
3 Create the model
Choose New Model > Images > Gradients.
Using Caffe
layer {
  name: "scale"
  type: "Power"
  bottom: "data"
  top: "scale"
  power_param {
    scale: 0.004
  }
}
layer {
  name: "hidden"
  type: "InnerProduct"
  bottom: "scale"
  top: "output"
  inner_product_param {
    num_output: 2
  }
}
layer {
  name: "loss"
  type: "EuclideanLoss"
  bottom: "output"
  bottom: "label"
  top: "loss"
  exclude { stage: "deploy" }
}
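The official documentation also has Torch7 and TensorFlow examples, which I did not test. This network has only one learned layer: the input data passes through a Power layer that rescales it (scale 0.004 is roughly 1/255, so pixel values end up in about [0, 1]), then goes straight into a fully connected layer that outputs two values, x and y. The loss is the Euclidean loss between label and output.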
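For readers who want to see what these three layers compute, here is a rough NumPy sketch of the forward pass. It is not Caffe code: the function names are mine, the weights are random placeholders rather than trained values, and the batch is random noise standing in for real images. The formulas follow Caffe's documented definitions of the Power layer, y = (shift + scale * x) ^ power, and of EuclideanLoss, the sum of squared differences divided by 2N.

```python
import numpy as np

def power_layer(x, power=1.0, scale=1.0, shift=0.0):
    """Power layer: y = (shift + scale * x) ** power."""
    return (shift + scale * x) ** power

def inner_product(x, weights, bias):
    """Fully connected layer on flattened inputs: y = x W^T + b."""
    flat = x.reshape(x.shape[0], -1)
    return flat @ weights.T + bias

def euclidean_loss(output, label):
    """Sum of squared differences divided by 2N (N = batch size)."""
    n = output.shape[0]
    return np.sum((output - label) ** 2) / (2.0 * n)

rng = np.random.default_rng(0)
batch = rng.uniform(0, 255, size=(4, 1, 50, 50))   # placeholder images, shape (N, C, H, W)
labels = rng.uniform(-0.5, 0.5, size=(4, 2))       # placeholder (xslope, yslope) labels
weights = rng.normal(0, 0.01, size=(2, 50 * 50))   # untrained weights, num_output = 2
bias = np.zeros(2)

scaled = power_layer(batch, scale=0.004)           # the "scale" layer
output = inner_product(scaled, weights, bias)      # the "hidden" layer
loss = euclidean_loss(output, labels)              # the "loss" layer
print(output.shape, loss)
```

Training adjusts weights and bias to drive this loss down; at deploy time the loss layer is excluded and the two output values are the prediction.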
4 Verification
5 Test
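The script writes test.png with slopes (0.5, 0.5), so a well-trained network should output values close to that pair. As a sanity check that does not involve DIGITS at all, the slopes can also be recovered with an ordinary least-squares plane fit. The sketch below is only an illustration under my own assumptions: recover_slopes is a hypothetical helper, and the test image is rebuilt in floating point instead of being read from the PNG.

```python
import numpy as np

def recover_slopes(image):
    """Fit a plane to the image and return (xslope, yslope) in the script's units."""
    height, width = image.shape
    yy, xx = np.mgrid[:height, :width].astype("float")
    # Design matrix columns: centered x, centered y, constant offset
    design = np.column_stack([
        (xx - width / 2).ravel(),
        (yy - height / 2).ravel(),
        np.ones(height * width),
    ])
    target = image.ravel().astype("float")
    (a, b, _offset), *_ = np.linalg.lstsq(design, target, rcond=None)
    # Undo the 255 / size scaling applied when the image was generated
    return a * width / 255, b * height / 255

# Rebuild the 50x50 test image (slopes 0.5, 0.5) in floating point
yy, xx = np.mgrid[:50, :50].astype("float")
a = 0.5 * 255 / 50
test_image = a * (xx - 25) + a * (yy - 25) + 127.5

xslope, yslope = recover_slopes(test_image)
print(xslope, yslope)
```

If the network's prediction for test.png is far from what this fit gives, the dataset or the label ordering is worth checking first.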
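The process itself is not complicated, but setting up the environment was a bit of a hassle. The TensorFlow I had installed was version 1.11.0, and there were many dependency conflicts. After installing all of this, TensorFlow no longer worked.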