I've been implementing a seq2seq model recently, which is a good opportunity to organize and review the related TensorFlow and basic RNN model knowledge. The series:
(1) TensorFlow introductory notes
(3) The attention mechanism
(4) A detailed seq2seq example
- Background
The core of TensorFlow is the computation graph, whose nodes are operations (ops).
A TensorFlow program generally involves two steps:
1. Build the computation graph (the ops)
2. Run it in a session
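To make this two-step split concrete, here is a minimal sketch (the variable names are mine, chosen for illustration): building an op only records the computation; the value is produced only when a session runs it.

```python
import tensorflow as tf

# Step 1: build the graph; nothing is computed yet
a = tf.constant(2)
b = tf.constant(3)
c = a + b
print(c)  # prints a Tensor object, not 5

# Step 2: run the graph inside a session to get the value
with tf.Session() as sess:
    print(sess.run(c))  # 5
```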
- Environment
Python 3.6 + TensorFlow 1.8.0
The examples below are based on examples from GitHub and will be updated from time to time.
- Two equivalent ways to open a session
Both require the computation graph to be fully built before the session is launched:

```python
import tensorflow as tf

y = tf.constant(23)

# Method 1: the context manager closes the session automatically
with tf.Session() as sess:
    print(sess.run(y))

# Method 2: open and close the session explicitly
sess = tf.Session()
print(sess.run(y))
sess.close()
```
A more flexible option is `tf.InteractiveSession()`, which installs itself as the default session so tensors can be evaluated directly:

```python
sess = tf.InteractiveSession()
print(y.eval())  # .eval() runs against the default session
sess.close()
```
- Constants and variables
```python
import tensorflow as tf

y = tf.constant(23, name='y')    # a constant
y2 = tf.Variable(23, name='y2')  # a variable
y3 = tf.placeholder(tf.int32)    # a placeholder, similar to a formal parameter

init = tf.global_variables_initializer()  # the variable-initialization op
sess = tf.Session()
sess.run(init)
print(sess.run(y), sess.run(y2))

# A placeholder cannot be fetched by itself; the graph needs at least
# one op applied to it, here an identity op
z = tf.identity(y3)
value = 23
print(sess.run(z, feed_dict={y3: value}))
```
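To make the constant/variable distinction concrete, here is a small sketch (the `counter` variable is my own illustration, not from the original examples): a constant is fixed once created, while a variable can be updated between runs.

```python
import tensorflow as tf

counter = tf.Variable(0, name='counter')
increment = tf.assign(counter, counter + 1)  # an op that updates the variable

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(3):
        print(sess.run(increment))  # 1, 2, 3
```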
- Setting parameters
```python
# Arguments: flag name, default value, description
tf.app.flags.DEFINE_string('str_name', 'value', "descrip1")
FLAGS = tf.app.flags.FLAGS
```
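A typical way to consume the flag, assuming the script is launched from the command line (the `main` entry point here is a placeholder of my own):

```python
import tensorflow as tf

tf.app.flags.DEFINE_string('str_name', 'value', "descrip1")
FLAGS = tf.app.flags.FLAGS

def main(_):
    # prints 'value' by default, or whatever --str_name=... was passed
    print(FLAGS.str_name)

if __name__ == '__main__':
    tf.app.run()  # parses the flags, then calls main
```

Running `python script.py --str_name=hello` would then print `hello`.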
- Addition and multiplication
```python
x = tf.placeholder(tf.int32)
y = tf.placeholder(tf.int32)
add = tf.add(x, y)
mul = tf.multiply(x, y)

with tf.Session() as sess:
    print(sess.run(add, feed_dict={x: 2, y: 3}))
    print(sess.run(mul, feed_dict={x: 2, y: 3}))
```
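As a side note, the arithmetic operators are overloaded on tensors, so the same graph can be written more tersely; a quick sketch:

```python
import tensorflow as tf

x = tf.placeholder(tf.int32)
y = tf.placeholder(tf.int32)
add = x + y  # builds the same op as tf.add(x, y)
mul = x * y  # builds the same op as tf.multiply(x, y)

with tf.Session() as sess:
    print(sess.run([add, mul], feed_dict={x: 2, y: 3}))  # [5, 6]
```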
- Matrix operations
```python
matrix1 = tf.constant([[3., 3.]])      # shape (1, 2)
matrix2 = tf.constant([[2.], [2.]])    # shape (2, 1)
product = tf.matmul(matrix1, matrix2)  # matrix product, shape (1, 1)
add = tf.add(matrix1, matrix2)         # broadcasting add, shape (2, 2)

with tf.Session() as sess:
    result1 = sess.run(product)
    result2 = sess.run(add)
    print(result1, result2)
```
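Note that `tf.matmul` is a true matrix product while `tf.multiply` is element-wise; with these shapes the element-wise version also broadcasts to 2x2. A quick check:

```python
import tensorflow as tf

matrix1 = tf.constant([[3., 3.]])    # shape (1, 2)
matrix2 = tf.constant([[2.], [2.]])  # shape (2, 1)

# element-wise product; the shapes broadcast to (2, 2)
elementwise = tf.multiply(matrix1, matrix2)

with tf.Session() as sess:
    print(sess.run(elementwise))  # [[6. 6.] [6. 6.]]
```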
- A linear function
Computes `Y = WX + b` with randomly drawn constants:

```python
import numpy as np
import tensorflow as tf

def linear_function():
    np.random.seed(1)
    X = tf.constant(np.random.randn(3, 1), name='X')  # shape (3, 1)
    W = tf.constant(np.random.randn(4, 3), name='W')  # shape (4, 3)
    b = tf.constant(np.random.randn(4, 1), name='b')  # shape (4, 1)
    Y = tf.add(tf.matmul(W, X), b)                    # Y = WX + b, shape (4, 1)
    sess = tf.Session()
    result = sess.run(Y)
    sess.close()
    return result

print(str(linear_function()))
```
- The sigmoid function
```python
def sigmoid(z):
    x = tf.placeholder(tf.float32, name="x")
    sigmoid = tf.sigmoid(x)
    with tf.Session() as sess:
        result = sess.run(sigmoid, feed_dict={x: z})
    return result

print(sigmoid(2))
```
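For reference, the function being evaluated is

$$\sigma(z) = \frac{1}{1 + e^{-z}},$$

so `sigmoid(2)` is roughly 0.88.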
- Converting labels to a one-hot matrix
```python
def one_hot_matrix(labels, C):
    depth = tf.constant(value=C, name='C')
    one_hot_matrix = tf.one_hot(labels, depth, axis=0)
    sess = tf.Session()
    one_hot = sess.run(one_hot_matrix)
    sess.close()
    return one_hot

labels = np.array([1, 2, 3, 0, 2, 1])
# 4x6 matrix: the four rows correspond to classes 0, 1, 2, 3;
# a 1 marks the position where each label appears
one_hot = one_hot_matrix(labels, C=4)
print(str(one_hot))
```
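A quick check of how `axis` changes the orientation (this variant is my own, for comparison): with the default `axis=-1`, the same labels produce the 6x4 transpose, one row per label.

```python
import numpy as np
import tensorflow as tf

labels = np.array([1, 2, 3, 0, 2, 1])
one_hot_rows = tf.one_hot(labels, depth=4)  # default axis=-1, shape (6, 4)

with tf.Session() as sess:
    print(sess.run(one_hot_rows))
```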
- Saving a model
```python
# Save the model every 50 training steps
if i % 50 == 0:
    saver.save(sess, self.ckpt_path + self.model_name + '.ckpt', global_step=i)
```

Each save automatically produces three files:
- model_name.ckpt-iteration_step.data-00000-of-00001
- model_name.ckpt-iteration_step.index
- model_name.ckpt-iteration_step.meta
plus a `checkpoint` file that records the name of the latest model, with contents:

```
model_checkpoint_path: "model_name.ckpt-iteration_step"
all_model_checkpoint_paths: "model_name.ckpt-0"
```
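The `saver` used above is assumed to be a `tf.train.Saver` created once after the graph is built; a minimal sketch:

```python
import tensorflow as tf

w = tf.Variable(0.0, name='w')  # the Saver needs at least one variable in the graph

# keeps only the most recent checkpoints (max_to_keep defaults to 5)
saver = tf.train.Saver(max_to_keep=5)
```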
- Restoring a previously saved model
```python
def restore_last_session(ckpt_path):
    saver = tf.train.Saver()
    sess = tf.Session()
    ckpt = tf.train.get_checkpoint_state(ckpt_path)
    if ckpt and ckpt.model_checkpoint_path:
        saver.restore(sess, ckpt.model_checkpoint_path)
    return sess
```
`ckpt.model_checkpoint_path` automatically reads the name of the latest model from the checkpoint file:

```python
sess = restore_last_session('yourpath')
```
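If the graph has not been rebuilt in code, the `.meta` file saved above can be used to recover the graph structure first; a sketch, where the checkpoint filename is hypothetical:

```python
import tensorflow as tf

sess = tf.Session()
# rebuild the graph from the .meta file, then restore the weights
saver = tf.train.import_meta_graph('yourpath/model_name.ckpt-50.meta')  # hypothetical filename
saver.restore(sess, tf.train.latest_checkpoint('yourpath'))
```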
- Linear regression
```python
import tensorflow as tf
import numpy
import matplotlib.pyplot as plt

rng = numpy.random

# Parameters
learning_rate = 0.01
training_epochs = 2000
display_step = 50

# Training Data
train_X = numpy.asarray([3.3, 4.4, 5.5, 6.71, 6.93, 4.168, 9.779, 6.182, 7.59,
                         2.167, 7.042, 10.791, 5.313, 7.997, 5.654, 9.27, 3.1])
train_Y = numpy.asarray([1.7, 2.76, 2.09, 3.19, 1.694, 1.573, 3.366, 2.596, 2.53,
                         1.221, 2.827, 3.465, 1.65, 2.904, 2.42, 2.94, 1.3])
n_samples = train_X.shape[0]

# tf Graph Input
X = tf.placeholder("float")
Y = tf.placeholder("float")

# Set model weights
W = tf.Variable(rng.randn(), name="weight")
b = tf.Variable(rng.randn(), name="bias")

# Construct a linear model
activation = tf.add(tf.multiply(X, W), b)

# Minimize the squared errors
cost = tf.reduce_sum(tf.pow(activation - Y, 2)) / (2 * n_samples)  # L2 loss
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)  # gradient descent

# Initializing the variables
init = tf.global_variables_initializer()

# Launch the graph
with tf.Session() as sess:
    sess.run(init)

    # Fit all training data
    for epoch in range(training_epochs):
        for (x, y) in zip(train_X, train_Y):
            sess.run(optimizer, feed_dict={X: x, Y: y})

        # Display logs per epoch step
        if epoch % display_step == 0:
            print("Epoch:", '%04d' % (epoch + 1), "cost=",
                  "{:.9f}".format(sess.run(cost, feed_dict={X: train_X, Y: train_Y})),
                  "W=", sess.run(W), "b=", sess.run(b))

    print("Optimization Finished!")
    print("cost=", sess.run(cost, feed_dict={X: train_X, Y: train_Y}),
          "W=", sess.run(W), "b=", sess.run(b))

    # Graphic display
    plt.plot(train_X, train_Y, 'ro', label='Original data')
    plt.plot(train_X, sess.run(W) * train_X + sess.run(b), label='Fitted line')
    plt.legend()
    plt.show()
```
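For reference, the cost minimized above is the halved mean squared error over the $m$ = `n_samples` training points:

$$J(W, b) = \frac{1}{2m} \sum_{i=1}^{m} \left( W x^{(i)} + b - y^{(i)} \right)^2$$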
- References:
tensorflow 从入门到上天教程一
Tensorflow简单例子
