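My understanding:
1. TensorFlow follows a similar idea to Spark: both execute lazily.
Until tf.Session().run(...) is called, every operation only adds nodes to the Graph; nothing is computed during that stage.
An op is actually executed only when tf.Session().run(op) is called on it.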
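2. tf.Session().run(x)
When x is an operation (mul, square, train, ...), that op is executed and its result is returned; an op with no output, such as a training op, returns None.
When x is a Variable or a Constant, the value of that variable/constant is returned (the official docs call this a fetch).
3. In tf.Session().run(x), x can also be a list; the return value is then a list holding the result of each element, in the same order.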
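4. tf.placeholder reserves a node in the Graph ahead of time; its value is supplied later, at run time, through the feed_dict dictionary argument (the official docs call this a feed).
5. A Constant is initialized when its node is created, whereas a Variable is initialized only when sess.run(tf.global_variables_initializer()) is called.
6. Operators such as * and + in tf behave as they do on numpy.array: w*b multiplies element by element, and only tf.matmul(w,b) is a matrix product.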
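Here is a minimal sketch of points 1 to 6. It rests on a few assumptions of mine: it imports tensorflow.compat.v1 so the same Session-based API also runs on TensorFlow 2.x (on 1.x, a plain `import tensorflow as tf` gives the same names), it builds everything inside an explicit tf.Graph, and the node names (a, v, x, init_op, ...) are made up for illustration.

```python
import numpy as np
import tensorflow.compat.v1 as tf  # assumption: TF >= 1.14 or TF 2.x

# Building inside an explicit graph keeps this in graph (lazy) mode.
graph = tf.Graph()
with graph.as_default():
    a = tf.constant([[1.0, 2.0], [3.0, 4.0]])     # value fixed when the node is created
    v = tf.Variable(tf.ones([2, 2]))              # has no value until the initializer runs
    x = tf.placeholder(tf.float32, shape=[2, 2])  # value supplied later through feed_dict
    elementwise = a * x                           # element-by-element product
    matrix_product = tf.matmul(a, x)              # matrix product
    init_op = tf.global_variables_initializer()   # an operation with no output

# Nothing has been computed so far; the names above are only graph nodes.
with tf.Session(graph=graph) as sess:
    init_result = sess.run(init_op)               # operation without output -> None
    feed = {x: 2.0 * np.eye(2, dtype=np.float32)}
    results = sess.run([elementwise, matrix_product], feed_dict=feed)
    v_value = sess.run(v)                         # fetch the current value of a Variable

print(init_result)   # None
print(results[0])    # [[2. 0.] [0. 8.]]
print(results[1])    # [[2. 4.] [6. 8.]]
print(v_value)       # all ones
```

Fetching elementwise and matrix_product in a single run call evaluates the graph once for both, instead of once per fetch.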
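Two implementations follow: one iterates over the full data set at every step, the other iterates over batches.
1) A standard linear regression: the loss function is the mean squared error and the optimizer is gradient descent.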
# TensorFlow 1.x API (on 2.x: import tensorflow.compat.v1 as tf, then tf.disable_v2_behavior())
import tensorflow as tf
import numpy as np

# prepare training data
M = 100  # number of training samples
N = 2    # number of features
w_data = np.array([[1.0], [3.0]])
b_data = 10
x_data = np.random.randn(M, N).astype(np.float32)
# the last term is random noise
y_data = (x_data.dot(w_data) + b_data + np.random.randn(M, 1) * 0.33).astype(np.float32)

# define model and graph
w = tf.Variable(tf.random_uniform([N, 1], -1, 1))
b = tf.Variable(tf.random_uniform([1], -1, 1))
y = tf.matmul(x_data, w) + b
loss = tf.reduce_mean(tf.square(y - y_data))

# choose optimizer
train_op = tf.train.GradientDescentOptimizer(0.1).minimize(loss)

# create a session to run the graph
with tf.Session() as sess:
    # Any tf.Variable must be initialized first; otherwise sess.run(var)
    # fails with an "uninitialized value" error.
    sess.run(tf.global_variables_initializer())
    for i in range(201):
        sess.run(train_op)
        if i % 20 == 0:
            # sess.run([w, b]) fetches both values in one call
            w_val, b_val = sess.run([w, b])
            print(w_val.T, b_val)
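To make it clearer what train_op does at each step, here is a rough NumPy-only sketch of the same full-batch gradient descent on the mean squared error. It is not what TensorFlow runs internally; the gradients are written out by hand, and the fixed seed and variable names are my own choices for illustration.

```python
import numpy as np

rng = np.random.RandomState(0)  # fixed seed, only so the run is repeatable

# same synthetic data layout as the TensorFlow version
M, N = 100, 2
w_true = np.array([[1.0], [3.0]])
b_true = 10.0
x = rng.randn(M, N)
y = x @ w_true + b_true + rng.randn(M, 1) * 0.33

# random starting point in [-1, 1), like tf.random_uniform(..., -1, 1)
w = rng.uniform(-1, 1, size=(N, 1))
b = rng.uniform(-1, 1, size=(1,))
learning_rate = 0.1

for step in range(201):
    residual = x @ w + b - y                 # prediction error, shape (M, 1)
    loss = np.mean(residual ** 2)            # mean squared error
    grad_w = (2.0 / M) * (x.T @ residual)    # d(loss)/dw
    grad_b = 2.0 * residual.mean()           # d(loss)/db
    w -= learning_rate * grad_w              # gradient descent update
    b -= learning_rate * grad_b

print(w.T, b)  # should end up close to [[1, 3]] and [10]
```

Each loop iteration corresponds to one sess.run(train_op) call: compute the loss gradient over all M samples, then move w and b a step of size learning_rate against it.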
2) The version below trains in batches: when the data set is too large, each iteration uses only batch_size samples instead of all of the data.
Here batch_size = 1.
# TensorFlow 1.x API (on 2.x: import tensorflow.compat.v1 as tf, then tf.disable_v2_behavior())
import tensorflow as tf
import numpy as np

# prepare training data
M = 100  # number of training samples
N = 2    # feature dimension of each sample
w_data = np.array([[1.0], [3.0]])
b_data = 10
x_data = np.random.randn(M, N).astype(np.float32)  # randn returns float64, so cast to float32
# the last term is random noise
y_data = (x_data.dot(w_data) + b_data + np.random.randn(M, 1) * 0.33).astype(np.float32)

# define model graph and loss function
batch_size = 1
# tf tensors are used much like np.array
X = tf.placeholder(tf.float32, [batch_size, N])  # declares a graph node; its value is fed at run time
Y = tf.placeholder(tf.float32, [batch_size, 1])
w = tf.Variable(tf.random_uniform([N, 1], -1, 1))
b = tf.Variable(tf.random_uniform([1], -1, 1))
loss = tf.reduce_mean(tf.square(Y - tf.matmul(X, w) - b))

# choose optimizer and training op
train_op = tf.train.GradientDescentOptimizer(0.1).minimize(loss)

# run the model in a session
with tf.Session() as sess:
    # initialize all global variables in the graph
    sess.run(tf.global_variables_initializer())
    # integer division, so that range() gets an int on Python 3 as well
    for epoch in range(200 * batch_size // M):
        i = 0
        while i < M:
            sess.run(train_op, feed_dict={X: x_data[i: i + batch_size],
                                          Y: y_data[i: i + batch_size]})
            i += batch_size
        print("epoch: {}, w: {}, b: {}".format(epoch, sess.run(w).T, sess.run(b)))
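The slicing in the while loop above can also be pulled out into a small helper. The sketch below is NumPy-only; iterate_minibatches is a hypothetical name of mine, not a TensorFlow function, and the optional shuffling is an addition that the original loop does not do.

```python
import numpy as np

def iterate_minibatches(x, y, batch_size, rng=None):
    """Yield (x_batch, y_batch) pairs that together cover the data once."""
    indices = np.arange(len(x))
    if rng is not None:
        rng.shuffle(indices)  # optional: visit the samples in random order
    for start in range(0, len(x), batch_size):
        batch = indices[start:start + batch_size]
        yield x[batch], y[batch]

# tiny demo: 10 samples with 2 features each, split into batches of 4
x_demo = np.arange(20, dtype=np.float32).reshape(10, 2)
y_demo = np.arange(10, dtype=np.float32).reshape(10, 1)
batches = list(iterate_minibatches(x_demo, y_demo, batch_size=4))
batch_sizes = [len(x_batch) for x_batch, _ in batches]
print(batch_sizes)  # [4, 4, 2]
```

Note that the last batch is shorter whenever batch_size does not divide the sample count. Placeholders declared with a fixed shape [batch_size, N], as above, would reject such a batch; declaring them with shape [None, N] lets the batch dimension vary.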