Linear and Logistic Regression
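These notes cover TensorFlow’s optimizers, tf.data, and the general steps for building a machine learning model with TensorFlow. The steps are the same for every model, so only Linear Regression is recorded here. On the TensorFlow usage side, the notes mainly cover optimizers, TF control flow, and tf.data.
Overview: TensorFlow separates the definition of computations from their execution.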
Phase 1: assemble a graph
Phase 2: use a session to execute operations in the graph.
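The two phases can be illustrated without TensorFlow at all. The sketch below is a toy in plain Python, not TensorFlow's implementation: the hypothetical names `Node`, `placeholder`, `constant` and `run` exist only for this example. Building the expression only records operations; nothing is computed until `run` is called with concrete inputs.

```python
# Toy illustration of "define first, run later" (not TensorFlow's actual API).
# Node, placeholder, constant and run are hypothetical names for this sketch.

class Node:
    def __init__(self, op, inputs=(), value=None, name=None):
        self.op = op            # 'placeholder', 'const', 'add' or 'mul'
        self.inputs = inputs    # upstream nodes
        self.value = value      # only used by 'const'
        self.name = name        # only used by 'placeholder'

    def __add__(self, other):
        return Node('add', (self, other))

    def __mul__(self, other):
        return Node('mul', (self, other))


def placeholder(name):
    return Node('placeholder', name=name)


def constant(value):
    return Node('const', value=value)


def run(node, feed):
    """Phase 2: evaluate a node, looking up placeholders in feed."""
    if node.op == 'placeholder':
        return feed[node.name]
    if node.op == 'const':
        return node.value
    left, right = (run(n, feed) for n in node.inputs)
    return left + right if node.op == 'add' else left * right


# Phase 1: assemble the graph. No arithmetic happens here.
x = placeholder('x')
y = constant(3.0) * x + constant(1.0)

# Phase 2: execute the graph with concrete inputs.
result = run(y, {'x': 2.0})
print(result)  # 7.0
```

The same graph `y` can be run again with a different feed, which is roughly the role a session and `feed_dict` play in the TensorFlow code in these notes.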
Linear Regression
The linear regression example uses world birth rate (the explanatory variable X) and life expectancy (the dependent variable Y) to fit a linear equation between the two variables.
Model
Inference: Y_predicted = w * X + b
Mean squared error: E[(Y - Y_predicted)^2]
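To make the two formulas concrete, here is a small plain-Python sketch. The three data points are made up for illustration (they are not from the birth/life dataset), and `predict` and `mean_squared_error` are hypothetical helper names.

```python
# Made-up (birth rate, life expectancy) pairs, for illustration only.
samples = [(1.0, 80.0), (2.0, 75.0), (4.0, 65.0)]


def predict(w, b, x):
    # Inference: Y_predicted = w * X + b
    return w * x + b


def mean_squared_error(w, b, samples):
    # Average of (Y - Y_predicted)^2 over all samples
    return sum((y - predict(w, b, x)) ** 2 for x, y in samples) / len(samples)


# This line passes through all three made-up points, so the error is zero.
mse_fit = mean_squared_error(-5.0, 85.0, samples)
# The initial guess w = 0, b = 0 predicts 0 everywhere and scores much worse.
mse_zero = mean_squared_error(0.0, 0.0, samples)
print(mse_fit, mse_zero)
```

Training amounts to moving `w` and `b` from the poor initial guess toward values with a lower error.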
code
""" Solution for simple linear regression example using placeholders
Created by Chip Huyen (chiphuyen@cs.stanford.edu)
CS20: "TensorFlow for Deep Learning Research"
cs20.stanford.edu
Lecture 03
"""
import os
os.environ['TF_CPP_MIN_LOG_LEVEL']='2'
import time
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
import utils
DATA_FILE = 'data/birth_life_2010.txt'
# Step 1: read in data from the .txt file
data, n_samples = utils.read_birth_life_data(DATA_FILE)
# Step 2: create placeholders for X (birth rate) and Y (life expectancy)
X = tf.placeholder(tf.float32, name='X')
Y = tf.placeholder(tf.float32, name='Y')
# Step 3: create weight and bias, initialized to 0
w = tf.get_variable('weights', initializer=tf.constant(0.0))
b = tf.get_variable('bias', initializer=tf.constant(0.0))
# Step 4: build model to predict Y
Y_predicted = w * X + b
# Step 5: use the squared error as the loss function
# you can use either mean squared error or Huber loss
loss = tf.square(Y - Y_predicted, name='loss')
# loss = utils.huber_loss(Y, Y_predicted)
# Step 6: using gradient descent with learning rate of 0.001 to minimize loss
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.001).minimize(loss)
start = time.time()
writer = tf.summary.FileWriter('./graphs/linear_reg', tf.get_default_graph())
with tf.Session() as sess:
    # Step 7: initialize the necessary variables, in this case, w and b
    sess.run(tf.global_variables_initializer())
    # Step 8: train the model for 100 epochs
    for i in range(100):
        total_loss = 0
        for x, y in data:
            # Session executes optimizer and fetches the value of loss
            _, l = sess.run([optimizer, loss], feed_dict={X: x, Y: y})
            total_loss += l
        print('Epoch {0}: {1}'.format(i, total_loss/n_samples))
    # close the writer when you're done using it
    writer.close()
    # Step 9: output the values of w and b
    w_out, b_out = sess.run([w, b])
print('Took: %f seconds' %(time.time() - start))
# plot the results
plt.plot(data[:,0], data[:,1], 'bo', label='Real data')
plt.plot(data[:,0], data[:,0] * w_out + b_out, 'r', label='Predicted data')
plt.legend()
plt.show()
How does TensorFlow know which variables to update? (Optimizers)
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01).minimize(loss)
_, l = sess.run([optimizer, loss], feed_dict={X: x, Y:y})
The session looks at all trainable variables that loss depends on and updates them.
You can specify whether a variable should be trained. By default, all variables are trainable:
tf.Variable(initial_value=None, trainable=True,...)
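The effect of the `trainable` flag can be sketched in plain Python. This is an analogy under stated assumptions, not TensorFlow code: `sgd_step` is a hypothetical helper that performs one gradient-descent step on the squared error for a single sample and leaves any parameter marked non-trainable untouched.

```python
# Plain-Python sketch (not TensorFlow code): one gradient-descent step on the
# squared error (y - (w * x + b)) ** 2 that skips non-trainable parameters.

def sgd_step(params, trainable, x, y, learning_rate):
    error = y - (params['w'] * x + params['b'])
    # Derivatives of the squared error with respect to w and b.
    grads = {'w': -2.0 * error * x, 'b': -2.0 * error}
    return {
        name: (value - learning_rate * grads[name]) if trainable[name] else value
        for name, value in params.items()
    }


params = {'w': 0.0, 'b': 0.0}
# 'b' is frozen here, the analogue of creating it with trainable=False.
updated = sgd_step(params, {'w': True, 'b': False},
                   x=2.0, y=10.0, learning_rate=0.01)
print(updated)
```

Only `w` moves (from 0.0 to about 0.4); `b` stays at its initial value because it is excluded from the update.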
List of optimizers in TF
“Advanced” optimizers work better when tuned, but are generally harder to tune. (I don't fully understand this point yet.)
- tf.train.GradientDescentOptimizer
- tf.train.AdagradOptimizer
- tf.train.MomentumOptimizer
- tf.train.AdamOptimizer
- tf.train.FtrlOptimizer
- tf.train.RMSPropOptimizer
- …
Huber loss
Robust to outliers. If the difference between the predicted value and the real value is small, square it; if it is large, take its absolute value.
# Implementing Huber loss
# tf.cond(pred, fn1, fn2, name=None)
def huber_loss(labels, predictions, delta=14.0):
    # residual must be a scalar here, since tf.cond takes a scalar predicate
    residual = tf.abs(labels - predictions)
    # quadratic branch for small residuals
    def f1(): return 0.5 * tf.square(residual)
    # linear branch for large residuals
    def f2(): return delta * residual - 0.5 * tf.square(delta)
    return tf.cond(residual < delta, f1, f2)
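To see what the two branches compute, here is the same formula in plain Python, applied to made-up numbers. This scalar version is only a sketch for checking the arithmetic by hand; it is not a replacement for the TensorFlow function above.

```python
def huber_loss(label, prediction, delta=14.0):
    residual = abs(label - prediction)
    if residual < delta:
        return 0.5 * residual ** 2                 # quadratic branch
    return delta * residual - 0.5 * delta ** 2     # linear branch


small = huber_loss(70.0, 72.0)   # residual 2 < 14: quadratic branch
large = huber_loss(70.0, 90.0)   # residual 20 >= 14: linear branch
squared = (70.0 - 90.0) ** 2     # plain squared error for comparison
print(small, large, squared)     # 2.0 182.0 400.0
```

For the outlier, the Huber loss (182.0) grows linearly in the residual and stays well below the squared error (400.0), which is why it is less sensitive to outliers. The two branches agree at residual = delta, so the loss is continuous there.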
TF Control Flow
Since TF builds the graph before computation, we have to specify all possible subgraphs beforehand (for example, both branches of tf.cond). PyTorch’s dynamic graphs and TF’s eager execution help overcome this.
tf.data
Pros and Cons of Placeholder
- **Pro:** placeholders keep data processing outside TensorFlow, making it easy to do in Python.
- **Con:** users often end up processing their data in a single thread, creating a data bottleneck that slows execution down.
data, n_samples = utils.read_birth_life_data(DATA_FILE)
X = tf.placeholder(tf.float32, name='X')
Y = tf.placeholder(tf.float32, name='Y')
…
with tf.Session() as sess:
    …
    # Step 8: train the model
    for i in range(100): # run 100 epochs
        for x, y in data:
            # Session runs the optimizer op to minimize loss
            sess.run(optimizer, feed_dict={X: x, Y: y})
Instead of doing inference with placeholders and feeding in data later, do inference directly with data
tf.data.Dataset (store data in a tf.data.Dataset)
- tf.data.Dataset.from_tensor_slices((features, labels))
- tf.data.Dataset.from_generator(gen, output_types, output_shapes)
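For `from_generator`, the `gen` argument is a Python generator function. The sketch below shows what such a generator might look like; `birth_life_generator` is a hypothetical name, and the tab-separated layout with one header line is an assumption made for illustration, not a description of the actual data file.

```python
# Hypothetical generator of the kind that could be passed as `gen`.
# Assumed input format (for illustration only): one header line, then
# tab-separated rows of country, birth rate, life expectancy.

def birth_life_generator(lines):
    for line in lines[1:]:                      # skip the header line
        _, birth_rate, life_expectancy = line.rstrip('\n').split('\t')
        yield float(birth_rate), float(life_expectancy)


# Made-up rows standing in for the contents of a data file.
rows = [
    'Country\tBirth rate\tLife expectancy\n',
    'A\t1.5\t80.0\n',
    'B\t3.0\t70.0\n',
]
pairs = list(birth_life_generator(rows))
print(pairs)  # [(1.5, 80.0), (3.0, 70.0)]
```

Because `from_generator` expects a callable that takes no arguments, a generator with parameters like this one would typically be wrapped, for example as `lambda: birth_life_generator(rows)`.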
#tf.data.Dataset.from_tensor_slices((features, labels))
dataset = tf.data.Dataset.from_tensor_slices((data[:,0], data[:,1]))
print(dataset.output_types) #>> (tf.float32, tf.float32)
print(dataset.output_shapes) #>> (TensorShape([]), TensorShape([]))
You can also create a Dataset from files:
tf.data.TextLineDataset(filenames)
tf.data.FixedLengthRecordDataset(filenames)
tf.data.TFRecordDataset(filenames)
tf.data.Iterator (create an iterator to iterate through the samples in a Dataset)
- iterator = dataset.make_one_shot_iterator()
  Iterates through the dataset exactly once. No initialization needed.
- iterator = dataset.make_initializable_iterator()
  Iterates through the dataset as many times as we want. Needs to be re-initialized at the start of each epoch.
iterator = dataset.make_one_shot_iterator()
X, Y = iterator.get_next() # X is the birth rate, Y is the life expectancy
with tf.Session() as sess:
    print(sess.run([X, Y])) # >> [1.822, 74.82825]
    print(sess.run([X, Y])) # >> [3.869, 70.81949]
    print(sess.run([X, Y])) # >> [3.911, 72.15066]
iterator = dataset.make_initializable_iterator()
...
for i in range(100):
    sess.run(iterator.initializer)
    total_loss = 0
    try:
        while True:
            sess.run([optimizer])
    except tf.errors.OutOfRangeError:
        pass
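The difference between the two iterator types, and the try/except epoch loop, has a close analogue in ordinary Python iterators. The sketch below is that analogy only, on made-up data: `StopIteration` stands in for `tf.errors.OutOfRangeError`, and creating a fresh iterator stands in for running `iterator.initializer`.

```python
# Plain-Python analogy (not tf.data): StopIteration plays the role of
# tf.errors.OutOfRangeError, and calling iter() again plays the role of
# running iterator.initializer.

samples = [(1.0, 80.0), (2.0, 75.0), (4.0, 65.0)]  # made-up data

# One-shot: a single pass, after which the iterator stays exhausted.
one_shot = iter(samples)
first_pass = list(one_shot)
second_pass = list(one_shot)        # nothing left to read

# Re-initializable: create a fresh iterator at the start of every epoch.
seen_per_epoch = []
for epoch in range(3):
    iterator = iter(samples)        # analogue of sess.run(iterator.initializer)
    count = 0
    try:
        while True:
            next(iterator)          # analogue of sess.run([optimizer])
            count += 1
    except StopIteration:           # analogue of tf.errors.OutOfRangeError
        pass
    seen_per_epoch.append(count)

print(len(first_pass), len(second_pass), seen_per_epoch)  # 3 0 [3, 3, 3]
```

Without the re-initialization at the top of each epoch, every epoch after the first would see no data, which is why the one-shot iterator is not suitable for multi-epoch training.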
Should we always use tf.data?
- For prototyping, feed_dict can be faster and easier to write (pythonic)
- tf.data is tricky to use when you have complicated preprocessing or multiple data sources
- NLP data is normally just a sequence of integers. In this case, transferring the data over to GPU is pretty quick, so the speedup of tf.data isn’t that large