[2017/07] Experiment Log: SSSP

1. Background

Following up on the previous report.

The goal is to use deep learning to predict, for two given nodes, the predecessor (parent) along the shortest path between them. (Of course the shortest path may not be unique, so the parent is not necessarily unique either; more on that later.)

Shortest-path algorithms generally record each node's current predecessor and recover the path from those records at the end. So perhaps we could train one model per node and iterate the path out step by step from the predicted predecessors... admittedly a naive approach, but it is worth trying as a first step.

After discussing the idea with my advisor, we decided to start with a grid, a fairly regular shape (think of it as a chessboard): each node has at most four edges (up/down/left/right, or north/south/east/west, etc.), so the four directions can serve as four channels, analogous to the three RGB channels of an image. I use a grid with 400 nodes, so the input size is 20 × 20 × 4.

For the 4 channels, I set north as dimension 0 and go clockwise: east is dimension 1, south is 2, west is 3. If an edge exists in a direction, the value is 1; if not, the value is inf.
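Concretely, a grid under this encoding is a (20, 20, 4) tensor. A tiny illustration (not part of the experiment code):

import numpy as np

inf = 0x3fffffff                 # sentinel for "no edge", as in the code below
e = np.full((20, 20, 4), inf)    # start with no edges at all
# channel order is clockwise from north: [north, east, south, west]
e[5, 5, 1] = 1                   # node (5, 5) has an edge east to (5, 6)
e[5, 5, 2] = 1                   # node (5, 5) has an edge south to (6, 5)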

The source is (0,0). Imagine a chessboard in front of you where some edges are missing but every node is still reachable. Take the top-left corner as the source, traverse the board with BFS, and record each node's distance and predecessor; the recorded predecessors then form a shortest-path tree from the source to every other node.
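Once BFS has filled in the predecessors, any single shortest path can be read off by walking parents back to the source. A minimal sketch (a hypothetical helper, using the `fa` array from the generation code below, where the source's parent is stored as (-1, -1)):

def trace_path(fa, target):
    # follow predecessors from `target` back to the source (0, 0)
    path = [tuple(target)]
    i, j = target
    while fa[i, j, 0] != -1:     # the source is the only node with parent (-1, -1)
        i, j = int(fa[i, j, 0]), int(fa[i, j, 1])
        path.append((i, j))
    return path[::-1]            # return in source-to-target order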


2. Experiment

Step 1: generate data

Randomly generate 1 million grids (20×20×4): an existing edge has value 1, a missing edge has value inf. Traverse each grid with BFS to obtain the predecessor of every node; see the code below for details.

Step 2: train a CNN with 4 convolutional layers and one fully connected layer. The label is the predecessor of a randomly chosen target node; here I picked node 151. After about three minibatches the accuracy had essentially converged and then just fluctuated.

At first I used 400 class labels (one per node); later I realized that 4 labels, one per direction, should be enough, although the resulting accuracy is about the same either way.
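The switch from 400 node-id labels to 4 direction labels amounts to the following conversion (a vectorized sketch of what the loop in the training script below does; `y_parent` is assumed to hold the flattened id, row*20 + col, of node 151's parent, which for a connected grid always matches one of the four neighbours):

import numpy as np

def direction_label(y_parent, end=151):
    # class k iff the parent sits in direction k of `end`:
    # [north, east, south, west] -> parent id = end + [-20, +1, +20, -1][k]
    offsets = np.array([-20, +1, +20, -1])
    return np.argmax(y_parent[:, None] == end + offsets[None, :], axis=1)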


3. Summary

Since each node's predecessor usually has only four possibilities (fewer on the boundary), the chance of guessing it correctly is quite high when the grid is dense. So I generated another 20,000 grids with edge density 0.7; the training accuracy was lower than with density 0.8.
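To quantify how much of that accuracy is just guessing, a simple baseline is the accuracy of always predicting the most frequent direction class (a sketch, reusing `y_end` as built in the training script below):

import numpy as np

labels = np.argmax(y_end, axis=1)          # direction class per sample
counts = np.bincount(labels, minlength=4)
print("majority-class baseline:", counts.max() / float(len(labels)))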


4. Next steps

After discussing with my advisor, the plan is to use deconvolution, so the network can output one 20×20 channel (node ids 0~399), or two 20×20 channels (node coordinates). The output would then be every node's predecessor at once, which yields a full shortest-path tree.
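A minimal sketch of such a deconvolution head (an assumption about the planned architecture, not something actually trained): starting from the 2×2×256 feature map `conv4` inside conv_net in the training script below, three tf.nn.conv2d_transpose layers upsample back to 20×20, ending in a single channel that regresses each node's predecessor id:

batch = tf.shape(conv4)[0]
# filter shape for conv2d_transpose is [h, w, out_channels, in_channels]
w_up1 = tf.Variable(tf.random_normal([3, 3, 128, 256]))
up1 = tf.nn.relu(tf.nn.conv2d_transpose(conv4, w_up1,
        output_shape=tf.stack([batch, 5, 5, 128]),
        strides=[1, 2, 2, 1], padding='VALID'))    # 2x2 -> 5x5
w_up2 = tf.Variable(tf.random_normal([3, 3, 64, 128]))
up2 = tf.nn.relu(tf.nn.conv2d_transpose(up1, w_up2,
        output_shape=tf.stack([batch, 10, 10, 64]),
        strides=[1, 2, 2, 1], padding='SAME'))     # 5x5 -> 10x10
w_up3 = tf.Variable(tf.random_normal([3, 3, 1, 64]))
parent_map = tf.nn.conv2d_transpose(up2, w_up3,
        output_shape=tf.stack([batch, 20, 20, 1]),
        strides=[1, 2, 2, 1], padding='SAME')      # 10x10 -> 20x20
# parent_map[:, i, j, 0] would regress node (i, j)'s predecessor id (0~399)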

Accuracy could then be measured via the predicted shortest-path tree: compute the distance to a given node along it, and compare with the distance recorded when the data was generated. Since the shortest path may not be unique, comparing distances is more meaningful than comparing paths.
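A sketch of that metric (a hypothetical helper; it assumes the network's output has been decoded into a (20, 20, 2) array `pred_fa` of predicted parent coordinates, and that the BFS distance map `true_dis` was also saved during data generation):

import numpy as np

def distance_accuracy(pred_fa, true_dis, N=20):
    correct = 0
    for i in range(N):
        for j in range(N):
            # walk predicted parents back to the source, counting steps;
            # the step cap keeps cycles in a bad prediction from looping forever
            steps, x, y = 0, i, j
            while (x, y) != (0, 0) and steps <= N * N:
                x, y = int(pred_fa[x, y, 0]), int(pred_fa[x, y, 1])
                if not (0 <= x < N and 0 <= y < N):
                    break
                steps += 1
            if (x, y) == (0, 0) and steps == true_dis[i, j]:
                correct += 1
    return correct / float(N * N)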

PS: the grid edges generated here are not undirected, although they should be; this will be fixed in the next experiment, as sketched below.
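A minimal sketch of that fix (an assumption about the next experiment, not code that was run): keep an edge only when both of its directed halves exist, so the four channels stay mutually consistent:

import numpy as np

def symmetrize(e, inf=0x3fffffff):
    # e: (N, N, 4) grid with channels [north, east, south, west]
    out = e.copy()
    # north of (i, j) must agree with south of (i-1, j)
    both = np.maximum(out[1:, :, 0], out[:-1, :, 2])   # inf if either half is missing
    out[1:, :, 0] = both
    out[:-1, :, 2] = both
    # east of (i, j) must agree with west of (i, j+1)
    both = np.maximum(out[:, :-1, 1], out[:, 1:, 3])
    out[:, :-1, 1] = both
    out[:, 1:, 3] = both
    return out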

Code:

Data generation:

""" 
2017-7-21
experiment 1
try to use CNN to predict the parents in SSSP
generate 1000000 20*20*4 grids(4 means each nodes maybe have 4 degree, 20 means the number of nodes
shape as below:
					      north
		     the start<-##########
						##########
					west##########east
						##########
						##########
						  south
Checking whether a grid is connected or not by BFS, and save the connected one along with parents in the single source shortest paths  
"""
import numpy as np
import random
import Queue
np.set_printoptions(threshold=np.inf)  

# sentinel value for a non-existent edge
inf = 0x3fffffff
# grid side length
N = 20
# [north, east, south, west]; if an edge exists, its value is the weight
D = 4
# in this experiment every edge weight is 1
weight = 1
# edge density of the grid
density = 0.8

def generate_mp(size=N, dime=D, dens=density):
	# init edge matrix with random digits in [0, 9]
	e = np.random.random_integers(0, 9, (size, size, dime))

	# keep each edge with probability `dens`
	e[e < 10*(1-dens)] = inf
	e[e != inf] = weight

	# remove edges that would leave the grid
	# north
	e[0, :, 0] = inf
	# east
	e[:, size-1, 1] = inf
	# south
	e[size-1, :, 2] = inf
	# west
	e[:, 0, 3] = inf

	# init father matrix; (-2, -2) means "no parent yet"
	fa = np.zeros((size, size, 2))
	fa += -2

	# init visited flags
	visited = np.zeros((size, size))

	# init distance map
	dis = np.zeros((size, size))
	dis += inf
	# source node
	s = [0, 0]

	# move offsets for north, east, south, west respectively
	dx = [-1, 0, 1, 0]
	dy = [0, 1, 0, -1]

	q = Queue.Queue(0)
	q.put(s)
	visited[0, 0] = 1
	dis[0, 0] = 0
	fa[0, 0, :] = [-1, -1]  # the source has no parent
	while q.empty() == False:
		u = q.get()
		for k in range(dime):
			nx = u[0] + dx[k]
			ny = u[1] + dy[k]
			v = [nx, ny]
			# boundary edges were set to inf above, so (nx, ny) stays in range
			if e[u[0], u[1], k] != inf and visited[nx, ny] == 0:
				visited[nx, ny] = 1
				dis[nx, ny] = dis[u[0], u[1]] + weight
				fa[nx, ny, :] = u
				q.put(v)
	# only keep fully connected grids
	if np.sum(visited) == size*size:
		output = {}
		output["map"] = e
		output["father"] = fa
		return output
	return None
num = 0
X = np.zeros((1, N*N*D))
y = np.zeros((1, N*N))
first = 1
while num < 1000000:
	new_out = generate_mp(N, D, density)
	if new_out is not None:
		x_ = new_out["map"].reshape(1, -1)
		y_temp = new_out["father"].reshape(N*N, 2)
		# encode each parent (row, col) as a single node id: row*N + col
		y_ = y_temp[:, 0] * N + y_temp[:, 1]
		if first == 1:
			X += x_
			y += y_
			first = 0
			num += 1
			continue
		X = np.vstack((X, x_))
		y = np.vstack((y, y_))
		num += 1
		if num % 1000 == 0:
			print num
		if num % 10000 == 0:
			# dump every 10000 grids to a separate file, then start a new chunk
			np.save("X" + str(num/10000) + ".npy", X)
			np.save("y" + str(num/10000) + ".npy", y)
			print X.shape, y.shape
			first = 1
			X = np.zeros((1, N*N*D))
			y = np.zeros((1, N*N))

Model training:

'''
A convolutional network is trained on single-source shortest paths (SSSP) in a grid.
We want the CNN to predict the father on the shortest path from the start to a
given node; covering every node this way may need N*N - 1 models. Sounds terrible!
The training data was generated randomly by running "generate_data.py", which
gives one million grids along with plenty of fathers.
Author: Line290
Date: 2017-07-21
'''
from __future__ import print_function
import numpy as np
import tensorflow as tf

# Import data: X1..X20 hold 10000 grids each, 200000 in total
X = np.load("X1.npy")
y = np.load("y1.npy").astype(np.int32)
for i in range(2, 21):
	X_load = np.load("X" + str(i) + ".npy")
	y_load = np.load("y" + str(i) + ".npy").astype(np.int32)
	X = np.vstack((X, X_load))
	y = np.vstack((y, y_load))
# print np.shape(X), np.shape(y)
# This is the node we prepare to reach; there are N*N - 1 candidates in total.
# `end` is a 0-based flattened index in [0, 399]; 151 = 7*20 + 11, i.e. (7, 11).
end = 151
end_ = [7, 11]
# the number of grids
N = X.shape[0]

# extract the column for node `end` and convert the parent's node id
# into one of 4 direction classes: [north, east, south, west] of `end`
y_end = np.zeros((N, 4))
y_temp = y[:, end]
for i in range(N):
    if y_temp[i] + 20 == end:    # parent is the northern neighbour (end - 20)
        y_end[i][0] = 1
    elif y_temp[i] - 1 == end:   # parent is the eastern neighbour (end + 1)
        y_end[i][1] = 1
    elif y_temp[i] - 20 == end:  # parent is the southern neighbour (end + 20)
        y_end[i][2] = 1
    elif y_temp[i] + 1 == end:   # parent is the western neighbour (end - 1)
        y_end[i][3] = 1
# # the earlier labelling scheme: one-hot over all 400 nodes
# y_end = np.zeros((N, 400))
# y_end[range(N), list(y_temp)] = 1

# Partition the data into train and test sets
X_train = X[:190000]
X_test = X[190000:]
y_train = y_end[:190000]
y_test = y_end[190000:]

#Parameters
learning_rate = 0.001
training_iters = 1000000
capacity = X_train.shape[0]
batch_size = 256
display_step = 1

#Network Parameters
n_input = 1600 # grid shape: 20*20*4
n_classes = 4 # 4 direction classes (north, east, south, west)
dropout = 0.75

#tf Graph input
x = tf.placeholder(tf.float32, [None, n_input])
y = tf.placeholder(tf.float32, [None, n_classes])
keep_prob = tf.placeholder(tf.float32) #dropout (keep probability)


# Create some wrappers for simplicity
def conv2d(x, W, b, strides=1):
    # Conv2D wrapper, with bias and relu activation
    x = tf.nn.conv2d(x, W, strides=[1, strides, strides, 1], padding='SAME')
    x = tf.nn.bias_add(x, b)
    return tf.nn.relu(x)


def maxpool2d(x, k=2):
    # MaxPool2D wrapper
    return tf.nn.max_pool(x, ksize=[1, k, k, 1], strides=[1, k, k, 1],
                          padding='SAME')


# Create model
def conv_net(x, weights, biases, dropout):
    # Reshape input picture
    x = tf.reshape(x, shape=[-1, 20, 20, 4])

    # Convolution Layer
    conv1 = conv2d(x, weights['wc1'], biases['bc1'])
    # Max Pooling (down-sampling)
    conv1 = maxpool2d(conv1, k=2)

    # Convolution Layer
    conv2 = conv2d(conv1, weights['wc2'], biases['bc2'])
    # Max Pooling (down-sampling)
    conv2 = maxpool2d(conv2, k=2)

    # Convolution Layer
    conv3 = conv2d(conv2, weights['wc3'], biases['bc3'])
    # Max Pooling (down-sampling)
    conv3 = maxpool2d(conv3, k=2)

    # Convolution Layer
    conv4 = conv2d(conv3, weights['wc4'], biases['bc4'])
    # Max Pooling (down-sampling)
    conv4 = maxpool2d(conv4, k=2)

    # Fully connected layer
    # Reshape conv4 output to fit fully connected layer input
    fc1 = tf.reshape(conv4, [-1, weights['wd1'].get_shape().as_list()[0]])
    fc1 = tf.add(tf.matmul(fc1, weights['wd1']), biases['bd1'])
    fc1 = tf.nn.relu(fc1)
    # Apply Dropout
    fc1 = tf.nn.dropout(fc1, dropout)

    # Output, class prediction
    out = tf.add(tf.matmul(fc1, weights['out']), biases['out'])
    return out

# Store layers weight & bias
weights = {
    # 3x3 conv, 4 inputs, 32 outputs
    'wc1': tf.Variable(tf.random_normal([3, 3, 4, 32])),
    # 3x3 conv, 32 inputs, 64 outputs
    'wc2': tf.Variable(tf.random_normal([3, 3, 32, 64])),
    # 3x3 conv, 64 inputs, 128 outputs
    'wc3': tf.Variable(tf.random_normal([3, 3, 64, 128])),

    # 1x1 conv, 128 inputs, 256 outputs
    'wc4': tf.Variable(tf.random_normal([1, 1, 128, 256])),

    # fully connected, 2*2*256 inputs (20 -> 10 -> 5 -> 3 -> 2 after four 2x2 poolings), 2048 outputs
    'wd1': tf.Variable(tf.random_normal([2*2*256, 2048])),
    # 2048 inputs, 4 outputs (class prediction)
    'out': tf.Variable(tf.random_normal([2048, n_classes]))
}

biases = {
    'bc1': tf.Variable(tf.random_normal([32])),
    'bc2': tf.Variable(tf.random_normal([64])),
    'bc3': tf.Variable(tf.random_normal([128])),    
    'bc4': tf.Variable(tf.random_normal([256])),    
    'bd1': tf.Variable(tf.random_normal([2048])),
    'out': tf.Variable(tf.random_normal([n_classes]))
}

# Construct model
pred = conv_net(x, weights, biases, keep_prob)

# Define loss and optimizer
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)

# Evaluate model
correct_pred = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

# Initializing the variables
init = tf.global_variables_initializer()

# Launch the graph
with tf.Session() as sess:
    sess.run(init)
    step = 1
    # start = 0
    # end = start + batch_size
    # Keep training until reach max iterations
    while step * batch_size < training_iters:
        # Run optimization op (backprop)
        # batch_x = X_train[start:end,:]
        # batch_y = y_train[start:end,:]
        indices = np.random.choice(capacity, batch_size)
        batch_x = X_train[indices, :]
        batch_y = y_train[indices, :]
        sess.run(optimizer, feed_dict={x: batch_x, y: batch_y,
                                       keep_prob: dropout})
        if step % display_step == 0:
            # Calculate batch loss and accuracy
            loss, acc = sess.run([cost, accuracy], feed_dict={x: batch_x,
                                                              y: batch_y,
                                                              keep_prob: 1.})
            print("Iter " + str(step*batch_size) + ", Minibatch Loss= " + \
                  "{:.6f}".format(loss) + ", Training Accuracy= " + \
                  "{:.5f}".format(acc))
        step += 1
        # start = end
        # end = end + batch_size
    print("Optimization Finished!")

    # Calculate accuracy on the 10000 held-out test grids
    print("Testing Accuracy:", \
        sess.run(accuracy, feed_dict={x: X_test,
                                      y: y_test,
                                      keep_prob: 1.}))



