Chapter 9 Up and Running with TensorFlow

Reading notes on O'Reilly's Hands-On Machine Learning with Scikit-Learn and TensorFlow.

TensorFlow first defines in Python a graph of computations to perform, and then takes that graph and runs it efficiently using optimized C++ code.

It is possible to break up the graph into several chunks and run them in parallel across multiple CPUs or GPUs. TensorFlow also supports distributed computing, so you can train colossal neural networks on humongous training sets in a reasonable amount of time by splitting the computations across hundreds of servers.

9.1 Installation

Create a virtual environment with the virtualenv command, then activate it:

$ cd $ML_PATH              # your ML working directory (e.g., $HOME/ml)
$ virtualenv env           # create the environment (skip if it already exists)
$ source env/bin/activate

Install TensorFlow

$ pip3 install --upgrade tensorflow

Test your installation

$ python3 -c 'import tensorflow; print(tensorflow.__version__)'
1.0.0

9.2 Creating Your First Graph and Running It in a Session

Create a graph

import tensorflow as tf
x=tf.Variable(3,name="x")
y=tf.Variable(4,name="y")
f=x*x*y+y+2

A TensorFlow session takes care of placing the operations onto devices such as CPUs and GPUs and running them, and it holds all the variable values.

The following code creates a session, initializes the variables, evaluates f, and then closes the session:

sess=tf.Session()
sess.run(x.initializer)
sess.run(y.initializer)
result=sess.run(f)
print(result)#42
sess.close()

Alternative ways of accomplishing the same thing:

with tf.Session() as sess:
    x.initializer.run()    # equivalent to tf.get_default_session().run(x.initializer)
    y.initializer.run()
    result=f.eval()        # equivalent to tf.get_default_session().run(f)
print(result)

init=tf.global_variables_initializer()# prepare an init node
with tf.Session() as sess:
    init.run()# actually initialize all the variables
    result=f.eval()
print(result)

sess=tf.InteractiveSession()# sets itself as the default session
init.run()
result=f.eval()
print(result)
sess.close()

A TensorFlow program is typically split into two parts: the first part builds a computation graph (this is called the construction phase), and the second part runs it (this is the execution phase). The construction phase typically builds a computation graph representing the ML model and the computations required to train it. The execution phase generally runs a loop that evaluates a training step repeatedly (for example, one step per mini-batch), gradually improving the model parameters.
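
As a tiny illustration of the two phases (my own sketch, not from the book), the following builds a one-node "training step" and then runs it in a loop:

import tensorflow as tf

# construction phase: build the graph, nothing is computed yet
x = tf.Variable(0, name="x")
training_step = tf.assign(x, x + 1)   # stands in for a real training op
init = tf.global_variables_initializer()

# execution phase: run the graph
with tf.Session() as sess:
    sess.run(init)
    for step in range(3):             # e.g., one step per mini-batch
        sess.run(training_step)
    print(x.eval())                   # 3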

9.3 Managing Graphs

Any node you create is automatically added to the default graph:

x1=tf.Variable(1)
x1.graph is tf.get_default_graph()#True

You can manage multiple independent graphs by creating a new Graph and temporarily making it the default graph inside a with block:

graph=tf.Graph()
with graph.as_default():
    x2=tf.Variable(2)
x2.graph is graph#True
x2.graph is tf.get_default_graph()#False

Resetting the default graph:

tf.reset_default_graph()
x1.graph is tf.get_default_graph()#False

9.4 Lifecycle of a Node Value

When you evaluate a node, TensorFlow automatically determines the set of nodes that it depends on and it evaluates these nodes first.

w=tf.constant(3)
x=w+2
y=x+5
z=x*3

with tf.Session() as sess:
    print(y.eval())#10
    print(z.eval())#15

All node values are dropped between graph runs, except variable values, which are maintained by the session across graph runs. A variable starts its life when its initializer is run, and it ends when the session is closed.

It is more efficient to evaluate y and z in a single graph run, so that w and x are not evaluated twice:

with tf.Session() as sess:
    y_val,z_val=sess.run([y,z])
    print(y_val)#10
    print(z_val)#15
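
To see the variable lifetime described above (a small sketch of my own, assuming a fresh graph):

counter = tf.Variable(0, name="counter")
increment = tf.assign_add(counter, 1)

with tf.Session() as sess:
    sess.run(counter.initializer)  # the variable starts its life here
    sess.run(increment)            # first graph run
    sess.run(increment)            # second graph run
    print(counter.eval())          # 2: the value was kept between runs
# the session is closed here, so the variable value is gone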

In single-process TensorFlow, multiple sessions do not share any state, even if they reuse the same graph (each session would have its own copy of every variable).
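
A quick way to check this (my own sketch): the same variable initialized in two sessions gives two independent copies.

w = tf.Variable(10)
set_w = tf.assign(w, 20)

sess1 = tf.Session()
sess2 = tf.Session()
sess1.run(w.initializer)
sess2.run(w.initializer)
sess1.run(set_w)
print(sess1.run(w))  # 20
print(sess2.run(w))  # 10: sess2 has its own copy of w
sess1.close()
sess2.close()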

9.5 Linear Regression with TensorFlow

Using the Normal Equation to compute $\hat{\theta}$:

import numpy as np
from sklearn.datasets import fetch_california_housing

housing =fetch_california_housing()
m,n=housing.data.shape
housing_data_plus_bias= np.c_[np.ones((m,1)),housing.data]

X=tf.constant(housing_data_plus_bias,dtype=tf.float32,name="X")
y=tf.constant(housing.target.reshape(-1,1),dtype=tf.float32,name="y")
# -1 means "unspecified"
XT=tf.transpose(X)
theta=tf.matmul(tf.matmul(tf.matrix_inverse(tf.matmul(XT,X)),XT),y)

with tf.Session() as sess:
    theta_value=theta.eval()
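
As a sanity check (my own addition, not in the book's listing), the same Normal Equation with plain NumPy should give almost the same result, up to float32 precision:

X_np = housing_data_plus_bias
y_np = housing.target.reshape(-1, 1)
theta_numpy = np.linalg.inv(X_np.T.dot(X_np)).dot(X_np.T).dot(y_np)
print(theta_value)   # from the TensorFlow run above
print(theta_numpy)   # should be very close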

9.6 Implementing Gradient Descent

When using Gradient Descent, remember that it is important to first normalize the input feature vectors, or else training may be much slower.

from sklearn.preprocessing import StandardScaler
scaler=StandardScaler()
scaled_housing_data=scaler.fit_transform(housing.data)
scaled_housing_data_plus_bias=np.c_[np.ones((m,1)),scaled_housing_data]

9.6.1 Manually Computing the Gradients

n_epochs=1000
learning_rate=0.01

X=tf.constant(scaled_housing_data_plus_bias,dtype=tf.float32,name="X")
y=tf.constant(housing.target.reshape(-1,1),dtype=tf.float32,name="y")
theta=tf.Variable(tf.random_uniform([n+1,1],-1.0,1.0),name="theta")
#creates a node in the graph that will generate a tensor containing random
#values, given its shape and value range, much like NumPy's rand() function.
y_pred=tf.matmul(X,theta,name="predictions")
error=y_pred-y
mse=tf.reduce_mean(tf.square(error),name="mse")
#Computes the mean of elements across dimensions of a tensor.
#Reduces `input_tensor` along the dimensions given in `axis`.
gradients=2/m*tf.matmul(tf.transpose(X),error)#Equation 4-6
training_op=tf.assign(theta,theta-learning_rate*gradients)
#creates a node that will assign a new value to a variable.

init=tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)
    
    for epoch in range(n_epochs):
        if epoch%100==0:
            print("Epoch",epoch,"MSE=",mse.eval())
        sess.run(training_op)
    best_theta=theta.eval()
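
For reference, the gradients line above implements Equation 4-6, the gradient vector of the MSE cost function:

$\nabla_{\boldsymbol{\theta}}\,\mathrm{MSE}(\boldsymbol{\theta}) = \frac{2}{m}\mathbf{X}^T(\mathbf{X}\boldsymbol{\theta} - \mathbf{y})$

where error = Xθ − y in the code.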

9.6.2 Using autodiff

Symbolic differentiation can automatically find the equations for the partial derivatives, but the resulting code is not necessarily very efficient (see Appendix D, Autodiff).

TensorFlow's autodiff feature can compute the gradients automatically and efficiently. Replace the gradients=... line in Section 9.6.1 with the following line:

gradients = tf.gradients(mse, [theta])[0]

The gradients() function takes an op (in this case mse) and a list of variables (in this case just theta), and it creates a list of ops (one per variable) to compute the gradients of the op with regard to each variable. So the gradients node will compute the gradient vector of the MSE with regard to theta.
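
A tiny check of autodiff (my own sketch, assuming a fresh graph): the gradient of a^3 with respect to a at a = 3 should be 3a^2 = 27.

a = tf.Variable(3.0)
f = a * a * a                      # f = a^3
grad = tf.gradients(f, [a])[0]     # df/da = 3a^2
with tf.Session() as sess:
    sess.run(a.initializer)
    print(grad.eval())             # 27.0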

9.6.3 Using an Optimizer

Replace the gradients=... and training_op=... lines in Section 9.6.1 with the following lines:

optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
training_op = optimizer.minimize(mse)
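
Note that minimize() both computes and applies the gradients; a rough sketch (my addition) of the equivalent two-step form:

grads_and_vars = optimizer.compute_gradients(mse)   # list of (gradient, variable) pairs
training_op = optimizer.apply_gradients(grads_and_vars)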

To use a momentum optimizer instead:

optimizer = tf.train.MomentumOptimizer(learning_rate=learning_rate,
                                       momentum=0.9)

9.7 Feeding Data to the Training Algorithm

Placeholder nodes don't actually perform any computation; they just output the data you tell them to output at runtime. They are typically used to pass the training data to TensorFlow during training. If you don't specify a value at runtime for a placeholder, you get an exception.

A=tf.placeholder(tf.float32,shape=(None,3))#None means "any size"
B=A+5
with tf.Session() as sess:
    B_val_1=B.eval(feed_dict={A:[[1,2,3]]})
    #pass a feed_dict to the eval() method that specifies the value of A.
    B_val_2=B.eval(feed_dict={A:[[4,5,6],[7,8,9]]})
print(B_val_1)
print(B_val_2)
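
Since B = A + 5 element-wise, the printed results should be roughly (my own run, not shown in the notes):

[[6. 7. 8.]]
[[ 9. 10. 11.]
 [12. 13. 14.]]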

Mini-batch Gradient Descent:

n_epochs=1000
learning_rate=0.01
batch_size=100
n_batches=int(np.ceil(m/batch_size))

X=tf.placeholder(dtype=tf.float32,shape=(None,n+1),name="X")
y=tf.placeholder(dtype=tf.float32,shape=(None,1),name="y")
theta=tf.Variable(tf.random_uniform([n+1,1],-1.0,1.0),name="theta")
y_pred=tf.matmul(X,theta,name="predictions")
error=y_pred-y
mse=tf.reduce_mean(tf.square(error),name="mse")
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
training_op = optimizer.minimize(mse)

init=tf.global_variables_initializer()
def fetch_batch(epoch,batch_index,batch_size):
    np.random.seed(epoch * n_batches + batch_index)  # not shown in the book
    indices = np.random.randint(m, size=batch_size)  # not shown
    X_batch = scaled_housing_data_plus_bias[indices] # not shown
    y_batch = housing.target.reshape(-1, 1)[indices] # not shown
    return X_batch,y_batch
with tf.Session() as sess:
    sess.run(init)
    
    for epoch in range(n_epochs):
        for batch_index in range(n_batches):
            X_batch,y_batch=fetch_batch(epoch,batch_index,batch_size)
            sess.run(training_op,feed_dict={X:X_batch,y:y_batch})
        if epoch%100==0:
            #evaluate the MSE on the last mini-batch of the epoch
            print("Epoch",epoch,"MSE=",mse.eval(feed_dict={X:X_batch,y:y_batch}))
    best_theta=theta.eval()

We don't need to pass the value of X and y when evaluating theta since it does not depend on either of them.

9.8 Saving and Restoring Models

TensorFlow makes saving and restoring a model very easy. Just create a Saver node at the end of the construction phase (after all variable nodes are created); then, in the execution phase, call its save() method whenever you want to save the model, passing it the session and the path of the checkpoint file:

[...]
theta = tf.Variable(tf.random_uniform([n + 1, 1], -1.0, 1.0), name="theta")
[...]
init = tf.global_variables_initializer()
saver = tf.train.Saver()

with tf.Session() as sess:
    sess.run(init)
    for epoch in range(n_epochs):
        if epoch % 100 == 0: # checkpoint every 100 epochs
            save_path = saver.save(sess, "/tmp/my_model.ckpt")
        sess.run(training_op)
    best_theta = theta.eval()
    save_path = saver.save(sess, "/tmp/my_model_final.ckpt")

Restoring a model is just as easy: create a Saver at the end of the construction phase just like before, but at the beginning of the execution phase, instead of initializing the variables using the init node, call the restore() method of the Saver object:

with tf.Session() as sess:
    saver.restore(sess, "/tmp/my_model_final.ckpt")
    [...]

By default a Saver saves and restores all variables under their own names, but if you need more control, you can specify which variables to save or restore and which names to use. For example, the following Saver will save or restore only the theta variable under the name weights:

saver = tf.train.Saver({"weights": theta})
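
By default the saver also writes the graph structure to a .meta file, so you can restore a model without rebuilding the graph by hand; a sketch (my addition, assuming the checkpoint saved above):

saver = tf.train.import_meta_graph("/tmp/my_model_final.ckpt.meta")  # loads the graph structure
theta = tf.get_default_graph().get_tensor_by_name("theta:0")
with tf.Session() as sess:
    saver.restore(sess, "/tmp/my_model_final.ckpt")                  # restores the variable values
    print(theta.eval())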

9.9 Visualizing the Graph and Training Curves Using TensorBoard

from datetime import datetime
now=datetime.utcnow().strftime("%Y%m%d%H%M%S")
root_logdir="tf_logs"
logdir="{}/run-{}/".format(root_logdir,now)

n_epochs=1000
learning_rate=0.01
batch_size=100
n_batches=int(np.ceil(m/batch_size))

X=tf.placeholder(dtype=tf.float32,shape=(None,n+1),name="X")
y=tf.placeholder(dtype=tf.float32,shape=(None,1),name="y")
theta=tf.Variable(tf.random_uniform([n+1,1],-1.0,1.0),name="theta")
y_pred=tf.matmul(X,theta,name="predictions")
error=y_pred-y
mse=tf.reduce_mean(tf.square(error),name="mse")
gradients=2/m*tf.matmul(tf.transpose(X),error)#Equation 4-6
training_op=tf.assign(theta,theta-learning_rate*gradients)

#creates a node that will evaluate the MSE value and write it
#to a TensorBoard-compatible binary log string called a summary. 
mse_summary=tf.summary.scalar('MSE',mse)
#write summaries to logfiles in the log directory
file_writer=tf.summary.FileWriter(logdir,tf.get_default_graph())

init=tf.global_variables_initializer()
def fetch_batch(epoch,batch_index,batch_size):
    np.random.seed(epoch * n_batches + batch_index)  # not shown in the book
    indices = np.random.randint(m, size=batch_size)  # not shown
    X_batch = scaled_housing_data_plus_bias[indices] # not shown
    y_batch = housing.target.reshape(-1, 1)[indices] # not shown
    return X_batch,y_batch
with tf.Session() as sess:
    sess.run(init)
    
    for epoch in range(n_epochs):
        for batch_index in range(n_batches):
            X_batch,y_batch=fetch_batch(epoch,batch_index,batch_size)
            if batch_index % 10 == 0:
                summary_str = mse_summary.eval(feed_dict={X: X_batch, y: y_batch})
                step = epoch * n_batches + batch_index
                file_writer.add_summary(summary_str, step)
            sess.run(training_op,feed_dict={X:X_batch,y:y_batch})
        if epoch%100==0:
            #evaluate the MSE on the last mini-batch of the epoch
            print("Epoch",epoch,"MSE=",mse.eval(feed_dict={X:X_batch,y:y_batch}))
    best_theta=theta.eval()
    file_writer.close()

Start the TensorBoard web server, which listens on port 6006. Use the full path to the log directory if TensorBoard fails to find your runs:

$ tensorboard --logdir C:\ProgramData\Anaconda3\envs\tensorflow\HML\tf_logs\

9.10 Name Scopes

Create name scopes to group related nodes. Here error and mse are defined within the name scope loss:

with tf.name_scope("loss") as scope:
    error=y_pred-y
    mse=tf.reduce_mean(tf.square(error),name="mse")
print(error.op.name)
#The error tensor is not given an explicit name, so it takes the name of the subtraction op
#loss/sub
print(mse.op.name)
#loss/mse

9.11 Modularity

A Rectified Linear Unit (ReLU) computes a linear function of the inputs and outputs the result if it is positive, and 0 otherwise.

Equation 9-1. Rectified linear unit
$h_{\mathbf{w},b}(\mathbf{X}) = \max(\mathbf{X} \cdot \mathbf{w} + b, 0)$
Create a graph that adds the output of five ReLUs. The first ReLU contains nodes named “weights”, “bias”, “z”, and “relu” (plus many more nodes with their default name, such as “MatMul”); the second ReLU contains nodes named “weights_1”, “bias_1”, and so on; the third ReLU contains nodes named “weights_2”, “bias_2”, and so on.

def relu(X):
    w_shape=(int(X.get_shape()[1]),1)
    w=tf.Variable(tf.random_normal(w_shape),name="weights")
    b=tf.Variable(0.0,name="bias")
    z=tf.add(tf.matmul(X,w),b,name="z")
    return tf.maximum(z,0.,name="relu")

n_features=3
X=tf.placeholder(tf.float32,shape=(None,n_features),name="X")
relus=[relu(X) for i in range(5)]
output=tf.add_n(relus, name="output")
#compute the sum of a list of tensors

When you create a node, TensorFlow checks whether its name already exists, and if it does it appends an underscore followed by an index to make the name unique.
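
For example (a small sketch of my own):

a = tf.constant(1, name="value")
b = tf.constant(2, name="value")   # name already taken
print(a.op.name)  # value
print(b.op.name)  # value_1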

Moving all the content of the relu() function inside a name scope makes the graph much clearer. Only the name scopes are suffixed with _1, _2, and so on; the node names "weights", "bias", and "z" stay the same under the different name scopes.

def relu(X):
    with tf.name_scope("relu"):
        [...]
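
Spelled out (combining the relu() body above with the name scope), this is roughly:

def relu(X):
    with tf.name_scope("relu"):
        w_shape=(int(X.get_shape()[1]),1)
        w=tf.Variable(tf.random_normal(w_shape),name="weights")
        b=tf.Variable(0.0,name="bias")
        z=tf.add(tf.matmul(X,w),b,name="z")
        return tf.maximum(z,0.,name="relu")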

9.12 Sharing Variables

If you want to share a variable between various components of your graph, there are several options:

  • One simple option is to create it first, then pass it as a parameter to the functions that need it:
def relu(X,threshold):
    with tf.name_scope("relu"):
        w_shape=(int(X.get_shape()[1]),1)
        w=tf.Variable(tf.random_normal(w_shape),name="weights")
        b=tf.Variable(0.0,name="bias")
        z=tf.add(tf.matmul(X,w),b,name="z")
        return tf.maximum(z,threshold,name="max")

n_features=3
threshold=tf.Variable(0.0,name="threshold")
X=tf.placeholder(tf.float32,shape=(None,n_features),name="X")
relus=[relu(X,threshold) for i in range(5)]
output=tf.add_n(relus, name="output")
  • Another option is to create a Python dictionary containing all the variables in the model and pass it around to every function.
  • Another option is to create a class for each module (e.g., a ReLU class using a class variable to handle the shared parameter).
  • Yet another option is to set the shared variable as an attribute of the relu() function upon the first call:
def relu(X):
    with tf.variable_scope("relu"):
        if not hasattr(relu,"threshold"):
            relu.threshold= tf.Variable(0.0,name="threshold")
        w_shape=(int(X.get_shape()[1]),1)
        w=tf.Variable(tf.random_normal(w_shape),name="weights")
        b=tf.Variable(0.0,name="bias")
        z=tf.add(tf.matmul(X,w),b,name="z")
        return tf.maximum(z,relu.threshold,name="max")
  • Finally, you can use the get_variable() function to create the shared variable if it does not exist yet, or reuse it if it already exists. Whether it creates or reuses is controlled by the reuse attribute of the current variable_scope(), which is False by default.
#create
with tf.variable_scope("relu"):
    threshold=tf.get_variable("threshold",shape=(),
                              initializer=tf.constant_initializer(0.0))
#reuse
with tf.variable_scope("relu", reuse=True):
    threshold = tf.get_variable("threshold")

or

#reuse
with tf.variable_scope("relu") as scope:
    scope.reuse_variables()
    threshold = tf.get_variable("threshold")

Combining creation and reuse, we have:

import tensorflow as tf
n_features=3
def relu(X):
    with tf.variable_scope("relu", reuse=True):
        threshold = tf.get_variable("threshold") # reuse existing variable
        print(threshold.op.name)
        w_shape=(int(X.get_shape()[1]),1)
        w=tf.Variable(tf.random_normal(w_shape),name="weights")
        print(w.op.name)
        b=tf.Variable(0.0,name="bias")
        z=tf.add(tf.matmul(X,w),b,name="z")
        return tf.maximum(z, threshold, name="max")
X = tf.placeholder(tf.float32, shape=(None, n_features), name="X")
with tf.variable_scope("relu"): # create the variable
    threshold = tf.get_variable("threshold", shape=(),initializer=tf.constant_initializer(0.0))
relus = [relu(X) for relu_index in range(5)]
output = tf.add_n(relus, name="output")

output:

relu/threshold
relu_1/weights
relu/threshold
relu_2/weights
relu/threshold
relu_3/weights
relu/threshold
relu_4/weights
relu/threshold
relu_5/weights

A cleaner solution is to move the creation of threshold into relu():

import tensorflow as tf
n_features=3
def relu(X):
    threshold = tf.get_variable("threshold", shape=(),
                                initializer=tf.constant_initializer(0.0))
    print(threshold.op.name)
    w_shape=(int(X.get_shape()[1]),1)
    w=tf.Variable(tf.random_normal(w_shape),name="weights")
    print(w.op.name)
    b=tf.Variable(0.0,name="bias")
    z=tf.add(tf.matmul(X,w),b,name="z")
    return tf.maximum(z, threshold, name="max")
X = tf.placeholder(tf.float32, shape=(None, n_features), name="X")
relus = []
for relu_index in range(5):
    with tf.variable_scope("relu", reuse=(relu_index >= 1)) as scope:
        relus.append(relu(X))
output = tf.add_n(relus, name="output")

output:

relu/threshold
relu/weights
relu/threshold
relu_1/weights
relu/threshold
relu_2/weights
relu/threshold
relu_3/weights
relu/threshold
relu_4/weights