TensorFlow Basics

This post covers TensorFlow basics: collections among the graph operators, common numerical computations, and the LOW LEVEL API (data -> tensor, computation, Session). It works through creating, initializing, and using Tensors and Variables, and the concepts of Graphs and Sessions. It also includes notes on Chapters 1-2 of the TensorFlow cookbook, covering Tensors, Placeholders, Variables, Sessions, loss functions, and back propagation.

1. Graph operators
1.1 collection

tf.add_to_collection(name, value): adds value to the collection with the given name
tf.get_collection(key): returns the list of values in the collection with the given name

 tf.add_to_collection('losses', cross_entropy_mean)

  # The total loss is defined as the cross entropy loss plus all of the weight
  # decay terms (L2 loss).
  return tf.add_n(tf.get_collection('losses'), name='total_loss')

tf.add_n(inputs): Adds all input tensors element-wise.
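As a minimal sketch of this collection pattern (the loss terms below are made-up constants):

import tensorflow as tf

# register two illustrative loss terms under the same collection name
tf.add_to_collection('losses', tf.constant(1.0))
tf.add_to_collection('losses', tf.constant(2.5))
total_loss = tf.add_n(tf.get_collection('losses'), name='total_loss')
with tf.Session() as sess:
    print(sess.run(total_loss))  # 3.5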

Common numerical operations
  • Absolute value: tf.abs
  • Maximum: tf.reduce_max
    • tf.maximum(a, b): returns the element-wise maximum of a and b
    • tf.minimum(a, b): returns the element-wise minimum of a and b
    • tf.argmax(a, dimension): returns the index of the maximum value along the given dimension of a
import tensorflow as tf

a = [1, 5, 3]

f1 = tf.maximum(a, 3)
f2 = tf.minimum(a, 3)
f3 = tf.argmax(a, 0)
f4 = tf.argmin(a, 0)

with tf.Session() as sess:
    print(sess.run(f1))  # or f1.eval()
    print(sess.run(f2))
    print(sess.run(f3))
    print(sess.run(f4))
#### Results
[3 5 3]
[1 3 3]
1
0
condition
  • tf.where
tf.where(
    condition,
    x=None,
    y=None,
    name=None
)
# The condition tensor acts as a mask that chooses, based on the value at each element, whether the corresponding element / row in the output should be taken from x (if true) or y (if false).
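A small sketch of tf.where used as such a mask (the tensors are made up):

import tensorflow as tf

condition = tf.constant([True, False, True])
x = tf.constant([1, 2, 3])
y = tf.constant([10, 20, 30])
masked = tf.where(condition, x, y)  # picks from x where True, from y where False
with tf.Session() as sess:
    print(sess.run(masked))  # [ 1 20  3]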

LOW LEVEL API

1. Introduction
1.1 data -> tensor
  • rank: number of dimensions
  • shape: a tuple of integers specifying the array’s length along each dimension
  • A tensor consists of a set of primitive values shaped into an array of any number of dimensions
  • TensorFlow uses numpy arrays to represent tensor values.
1.2 computation
  • building the computational graph (tf.Graph)
  • running the computational graph (tf.Session)
  • Graph: A computational graph is a series of TensorFlow operations arranged into a graph
    • Operations(“ops”): nodes of graph
    • Tensors: edges in the graph
    • tf.Tensors do not have values, they are just handles to elements in the computation graph
1.3 Session
  • can pass multiple tensors to tf.Session.run. The run method transparently handles any combination of tuples or dictionaries
sess = tf.Session()
print(sess.run({'ab':(a, b), 'total':total}))
# {'total': 7.0, 'ab': (3.0, 4.0)}
  • a tensor has one consistent value during a single call to Session.run
vec = tf.random_uniform(shape=(3,))  # produces a new random value on each run, unlike tf.constant
out1 = vec + 1
out2 = vec + 2
print(sess.run(vec))
print(sess.run(vec))
print(sess.run((out1, out2)))
## results
[ 0.52917576  0.64076328  0.68353939]
[ 0.66192627  0.89126778  0.06254101]
(
  array([ 1.88408756,  1.87149239,  1.84057522], dtype=float32),
  array([ 2.88408756,  2.87149239,  2.84057522], dtype=float32)
)
1.4 Feeding
  • feed_dict argument can be used to overwrite any tensor in the graph
  • The only difference between placeholders and other tf.Tensors is that placeholders throw an error if no value is fed to them.
a = tf.constant(1)
b = tf.constant(2)
total = a + b
sess.run(total, feed_dict={a: 4})
>> 6
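The same feeding pattern with explicit placeholders, as a minimal sketch (reusing the sess from above):

x = tf.placeholder(tf.float32)
y = tf.placeholder(tf.float32)
z = x + y
print(sess.run(z, feed_dict={x: 3.0, y: 4.5}))            # 7.5
print(sess.run(z, feed_dict={x: [1., 3.], y: [2., 4.]}))  # [3. 7.]
# running z without feeding x and y raises an error, unlike constants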
1.5 layers
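This subsection has no notes; as a sketch, a dense layer from the low-level API guide looks roughly like this (shapes and values are illustrative):

x = tf.placeholder(tf.float32, shape=[None, 3])
linear_model = tf.layers.Dense(units=1)      # a layer owns its own weight/bias variables
y = linear_model(x)
sess.run(tf.global_variables_initializer())  # layer variables must be initialized
print(sess.run(y, feed_dict={x: [[1, 2, 3], [4, 5, 6]]}))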
2. Tensor
  • tf.Variable
  • tf.constant
  • tf.placeholder
  • tf.SparseTensor
2.1 shape
zeros = tf.zeros(my_matrix.shape[1])  # a 1-D zero tensor as long as my_matrix has columns
2.2 data type
float_tensor = tf.cast(tf.constant([1, 2, 3]), dtype=tf.float32)
3. Variable
  • represent shared, persistent state manipulated by your program.
  • A tf.Variable represents a tensor whose value can be changed by running ops on it.
  • Unlike tf.Tensor objects, a tf.Variable exists outside the context of a single session.run call.
3.1 creating
  • defaults: dtype tf.float32, initializer tf.glorot_uniform_initializer
my_variable = tf.get_variable("my_variable", [1, 2, 3])
# initialize with a specified dtype
my_int_variable = tf.get_variable("my_int_variable", [1, 2, 3], dtype=tf.int32,
  initializer=tf.zeros_initializer)
# initialize using specified value
other_variable = tf.get_variable("other_variable", dtype=tf.int32,
  initializer=tf.constant([23, 42]))
3.2 variable collections
  • named lists of tensors or other objects
  • every tf.Variable gets placed in the following two collections
    • tf.GraphKeys.GLOBAL_VARIABLES: variables that can be shared across multiple devices
    • tf.GraphKeys.TRAINABLE_VARIABLES: variables for which TensorFlow will calculate gradients.
  • if you want a variable not to be trainable:
my_local = tf.get_variable("my_local", shape=(),
                           collections=[tf.GraphKeys.LOCAL_VARIABLES])
# or
my_non_trainable = tf.get_variable("my_non_trainable",
                                   shape=(),
                                   trainable=False)
  • using your own collection
# no need to explicitly create a collection
tf.add_to_collection("my_collection_name", my_local)
# retrieve a list of all the variables
tf.get_collection("my_collection_name")
3.3 initializing
  • tf.global_variables_initializer
    • initializing all variables in the tf.GraphKeys.GLOBAL_VARIABLES collection
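A minimal sketch (the variable name is illustrative; sess is the session defined earlier):

w = tf.get_variable("w", shape=[2, 2])                 # placed in GLOBAL_VARIABLES by default
sess.run(tf.global_variables_initializer())            # initializes everything in that collection
print(sess.run(tf.report_uninitialized_variables()))   # empty once all variables are initialized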
3.4 using, assigning
  • treat it like a normal tf.Tensor
  • To assign a value to a variable, use the methods assign, assign_add
v = tf.get_variable("v", shape=(), initializer=tf.zeros_initializer())
assignment = v.assign_add(1)
sess.run(tf.global_variables_initializer())
sess.run(assignment)  # or assignment.op.run(), or assignment.eval()
3.5 sharing
  • Implicitly wrapping tf.Variable objects within tf.variable_scope objects.
def conv_relu(input, kernel_shape, bias_shape):
    # Create variable named "weights".
    weights = tf.get_variable("weights", kernel_shape,
        initializer=tf.random_normal_initializer())
    # Create variable named "biases".
    biases = tf.get_variable("biases", bias_shape,
        initializer=tf.constant_initializer(0.0))
    conv = tf.nn.conv2d(input, weights,
        strides=[1, 1, 1, 1], padding='SAME')
    return tf.nn.relu(conv + biases)


def my_image_filter(input_images):
    with tf.variable_scope("conv1"):
        # Variables created here will be named "conv1/weights", "conv1/biases".
        relu1 = conv_relu(input_images, [5, 5, 32, 32], [32])
    with tf.variable_scope("conv2"):
        # Variables created here will be named "conv2/weights", "conv2/biases".
        return conv_relu(relu1, [5, 5, 32, 32], [32])
  • reusing
 def my_variable_sharing(): 
    # variable sharing
    # opt1
    with tf.variable_scope("model") as scope:
        output1 = my_image_filter(input1)
        scope.reuse_variables()
        output2 = my_image_filter(input2)

    # opt2
    with tf.variable_scope("model") as scope:
        output1 = my_image_filter(input1)
    with tf.variable_scope(scope, reuse=True):
        output2 = my_image_filter(input2)
4. Graphs and Sessions
  • tf.Operation node
  • tf.Tensor edge
4.1 Naming operation
  • tf.Tensor objects are implicitly named after the tf.Operation that produces the tensor as output. A tensor name has the form "<OP_NAME>:<i>" where:
    • "<OP_NAME>" is the name of the operation that produces it.
    • "<i>" is an integer representing the index of that tensor among the operation’s outputs.
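For example (the op name is made up):

c = tf.constant([1.0, 2.0], name="my_const")
print(c.op.name)  # my_const
print(c.name)     # my_const:0  -> output 0 of the op "my_const"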
4.2 Tensor-like objects
  • operations take one or more tf.Tensor objects as arguments
  • these functions will accept a tensor-like object in place of a tf.Tensor, and implicitly convert it to a tf.Tensor using the tf.convert_to_tensor method
  • Tensor-like objects
    • tf.Variable
    • numpy.ndarray
    • list
    • scalar Python types
  • TensorFlow creates a new tf.Tensor each time you use the same tensor-like object; if the object is large and used many times, you may run out of memory.
    • to avoid this, manually call tf.convert_to_tensor on the tensor-like object once and reuse the returned tf.Tensor.
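A small sketch of the convert-once pattern (the array is made up):

big_array = np.random.rand(1000, 1000)
# convert once and reuse the resulting tf.Tensor instead of passing the ndarray repeatedly
big_tensor = tf.convert_to_tensor(big_array, dtype=tf.float32)
out1 = big_tensor + 1.0
out2 = big_tensor * 2.0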
4.3 Session

to do

5. Save

to do

TensorFlow Cookbook notes, Chap 1-2

Chap1

Tensor
  • primary data structure
zero_tsr = tf.zeros([row_dim, col_dim])
ones_tsr = tf.ones([row_dim, col_dim])
# Create a tensor filled with a constant value
filled_tsr = tf.fill([row_dim, col_dim], 42)
constant_tsr = tf.constant([1,2,3])
  • declare as variables or feed as placeholders
  • sequence tensors
    • tf.linspace includes the stop value
    • tf.range excludes the limit value
linear_tsr = tf.linspace(start=0.0, stop=1.0, num=3)
integer_seq_tsr = tf.range(start=6, limit=15, delta=3)
y_vals = np.repeat(10., 100)  
x_vals = np.random.normal(1, 0.1, 100)
  • random tensor
    • uniform distribution
    • normal distribution
randunif_tsr = tf.random_uniform([row_dim, col_dim], minval=0, maxval=1)
randnorm_tsr = tf.random_normal([row_dim, col_dim], mean=0.0, stddev=1.0)
  • random entries of arrays
shuffled_output = tf.random_shuffle(input_tensor)
cropped_output = tf.random_crop(input_tensor, crop_size)
Placeholders and Variables
  • Variables are the parameters of the algorithm and TensorFlow keeps track of how to change these to optimize the algorithm.
  • Placeholders are objects that allow you to feed in data of a specific type and shape and depend on the results of the computational graph, such as the expected outcome of a computation.
  • Placeholders are just holding the position for data to be fed into the graph. Placeholders get data from a feed_dict argument in the session. To put a placeholder in the graph, we must perform at least one operation on the placeholder
my_var = tf.Variable(tf.zeros([row_dim, col_dim]))
x_data = tf.placeholder(tf.float32, shape=(3, 5))
add1 = x_data + 1.  # any op built on the placeholder
for x_val in x_vals:  # each x_val must match the placeholder shape (3, 5)
    print(sess.run(add1, feed_dict={x_data: x_val}))
# leave the number of columns unspecified
x_data = tf.placeholder(tf.float32, shape=(3, None))
notes
  • use tf.get_variable instead of tf.Variable in work env
    • it will make it way easier to refactor your code if you need to share variables at any time, e.g. in a multi-gpu setting
    • tf.Variable always creates a new variable, whereas tf.get_variable fetches an existing variable with those parameters from the graph and creates a new one only if it does not exist.
    • default: xavier initializer
W = tf.get_variable("W", shape=[784, 256], initializer=tf.contrib.layers.xavier_initializer())
with tf.variable_scope("one"):
    a = tf.get_variable("v", [1]) #a.name == "one/v:0"
with tf.variable_scope("one"):
    b = tf.get_variable("v", [1]) #ValueError: Variable one/v already exists
with tf.variable_scope("one", reuse = True):
    c = tf.get_variable("v", [1]) #c.name == "one/v:0"

with tf.variable_scope("two"):
    d = tf.get_variable("v", [1]) #d.name == "two/v:0"
    e = tf.Variable(1, name = "v", expected_shape = [1]) #e.name == "two/v_1:0"

assert(a is c)  #Assertion is true, they refer to the same object.
assert(a is d)  #AssertionError: they are different objects
assert(d is e)  #AssertionError: they are different objects
Matrices
  • tf.diag
  • tf.convert_to_tensor
identity_matrix = tf.diag([1.0, 1.0, 1.0])
A = tf.truncated_normal([2, 3])
B = tf.fill([2,3], 5.0)
C = tf.random_uniform([3,2])
D = tf.convert_to_tensor(np.array([[1., 2., 3.], [-3., -7., -1.], [0., 5., -2.]]))
print(sess.run(identity_matrix))
  • tf.matmul(A,B) mat multiplication
  • tf.transpose(C) transpose
  • tf.matrix_determinant()
  • tf.matrix_inverse()
  • tf.self_adjoint_eig(): eigenvalues and eigenvectors
    • outputs the eigenvalues in the first row
    • and the eigenvectors in the remaining rows
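A quick sketch of these ops on made-up matrices (note: in TF 1.x, tf.self_adjoint_eig returns an (eigenvalues, eigenvectors) tuple):

M1 = tf.constant([[1., 2.], [3., 4.]])
M2 = tf.constant([[0., 1.], [1., 0.]])
print(sess.run(tf.matmul(M1, M2)))           # matrix product
print(sess.run(tf.transpose(M1)))            # transpose
print(sess.run(tf.matrix_determinant(M1)))   # -2.0
print(sess.run(tf.matrix_inverse(M1)))
print(sess.run(tf.self_adjoint_eig(tf.constant([[2., 0.], [0., 3.]]))))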
Operations
  • add, sub, mul, div, mod
    • div() returns the same type as its inputs, so integer division returns the floor of the division
    • truediv() always returns floating-point division
    • floordiv() rounds the result down to the nearest integer
print(sess.run(tf.div(3,4)))
0
print(sess.run(tf.truediv(3,4)))
0.75
print(sess.run(tf.floordiv(3.0,4.0)))
0.0
  • customize
def custom_polynomial(value):
    # 3x^2 - x + 10
    return tf.subtract(3 * tf.square(value), value) + 10

print(sess.run(custom_polynomial(11)))
362
Activation function
  • relu rectified linear unit
    • relu6
# max(0,x)
print(sess.run(tf.nn.relu([-3., 3., 10.])))
[ 0. 3. 10.]
# min(max(0,x), 6)
print(sess.run(tf.nn.relu6([-3., 3., 10.])))
[ 0. 3. 6.]
  • sigmoid
    • not zero centered, require zero-mean the data
print(sess.run(tf.nn.sigmoid([-1., 0., 1.])))
[ 0.26894143  0.5  0.7310586 ]
  • hyperbolic tangent
    • range between -1 and 1
# (exp(x) - exp(-x)) / (exp(x) + exp(-x))
print(sess.run(tf.nn.tanh([-1., 0., 1.])))
[-0.76159418  0.  0.76159418]
  • softsign, softplus, ELU
    (figures: reludiff, sigdiff)
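A short sketch of these three activations on an illustrative input:

# softsign: x / (|x| + 1)
print(sess.run(tf.nn.softsign([-1., 0., 1.])))   # [-0.5  0.   0.5]
# softplus: log(exp(x) + 1), a smooth version of ReLU
print(sess.run(tf.nn.softplus([-1., 0., 1.])))   # [ 0.31326169  0.69314718  1.31326163]
# ELU: exp(x) - 1 for x < 0, x otherwise
print(sess.run(tf.nn.elu([-1., 0., 1.])))        # [-0.63212055  0.  1.]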

Chap2

Session
import tensorflow as tf
sess = tf.Session()
  • to rerun the same TensorFlow script, reset the default graph first
from tensorflow.python.framework import ops
ops.reset_default_graph()
sess = tf.Session()
Loss Function
  • L2 norm
    • it is very curved near the target
    • algorithms can use this curvature to converge more slowly the closer they get to the target
    • tf.nn.l2_loss() computes half the L2 norm
l2_y_vals = tf.square(target - x_vals)
l2_y_out = sess.run(l2_y_vals)
  • L1 norm
    • The L1 norm is better for outliers than the L2 norm because it is not as steep for larger values
    • L1 norm is not smooth at the target and this can result in algorithms not converging well
l1_y_vals = tf.abs(target - x_vals)
  • Pseudo-Huber
    • a continuous and smooth approximation to the Huber loss
    • attempts to take the best of the L1 and L2 norms by being convex near the target and less steep for extreme values
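A sketch using the standard Pseudo-Huber formula delta^2 * (sqrt(1 + ((y - y_hat)/delta)^2) - 1), with a made-up delta and the same target/x_vals as the L1/L2 snippets above:

delta1 = tf.constant(0.25)
phuber1_y_vals = tf.multiply(tf.square(delta1),
                             tf.sqrt(1. + tf.square((target - x_vals) / delta1)) - 1.)
phuber1_y_out = sess.run(phuber1_y_vals)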
  • Hinge loss
hinge_y_vals = tf.maximum(0., 1. - tf.multiply(target, x_vals))
hinge_y_out = sess.run(hinge_y_vals)
  • cross entropy
# this might be for two classes?
# from cs231n, loss = - y log P(y|x)
# y is the label, one-hot, so the result is loss = -sum(log(a_i))
xentropy_y_vals = (- tf.multiply(target, tf.log(x_vals))
                   - tf.multiply((1. - target), tf.log(1. - x_vals)))
xentropy_y_out = sess.run(xentropy_y_vals)
  • sigmoid cross entropy
  • weighted cross entropy
  • softmax cross entropy
  • sparse softmax cross-entropy (a sketch of these variants follows below)
    (figures: rloss, closs)
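A rough sketch of these built-in variants (the logits and labels below are made up; only the calls are shown):

logits = tf.constant([0.5, -1.0, 2.0])
labels = tf.constant([1., 0., 1.])
# sigmoid cross entropy: unscaled logits, per-element binary labels
sig_xent = tf.nn.sigmoid_cross_entropy_with_logits(labels=labels, logits=logits)
# weighted cross entropy: same, with a positive-class weight
weighted_xent = tf.nn.weighted_cross_entropy_with_logits(targets=labels, logits=logits, pos_weight=0.5)
# softmax cross entropy: one-hot labels over mutually exclusive classes
class_logits = tf.constant([[1., -3., 10.]])
soft_xent = tf.nn.softmax_cross_entropy_with_logits(labels=tf.constant([[0.1, 0.02, 0.88]]), logits=class_logits)
# sparse softmax cross entropy: an integer class index instead of a one-hot vector
sparse_xent = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=tf.constant([2]), logits=class_logits)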
  • metrics
    • stable: whether the loss is smooth near the target
    • robust: whether it is insensitive to outliers
  • batches
    • some loss functions expect batches of data, so add a batch dimension with tf.expand_dims
my_output_expanded = tf.expand_dims(my_output, 0)
y_target_expanded = tf.expand_dims(y_target, 0)
Back Propagation
  • minimize loss function
  • MomentumOptimizer()
  • AdagradOptimizer()
my_opt = tf.train.GradientDescentOptimizer(0.05)
train_step = my_opt.minimize(xentropy)
for i in range(1400):
    rand_index = np.random.choice(100)
    rand_x = [x_vals[rand_index]]
    rand_y = [y_vals[rand_index]]
    # run one optimization step on the sampled example
    sess.run(train_step, feed_dict={x_data: rand_x, y_target: rand_y})
Batch and Stochastic Training
  • mean loss
    • tf.reduce_mean()
loss = tf.reduce_mean(tf.square(my_output - y_target))
  • record loss
loss_batch = []
for i in range(100):
    rand_index = np.random.choice(100, size=batch_size)
    rand_x = np.transpose([x_vals[rand_index]])
    rand_y = np.transpose([y_vals[rand_index]])
    sess.run(train_step, feed_dict={x_data: rand_x, y_target: rand_y})
    if (i+1) % 5 == 0:
        print('Step #' + str(i+1) + ' A = ' + str(sess.run(A)))
        temp_loss = sess.run(loss, feed_dict={x_data: rand_x, y_target: rand_y})
        print('Loss = ' + str(temp_loss))
        loss_batch.append(temp_loss)
Evaluation
  • regression: an aggregate measure of the distance between predictions and actual targets
  • classification: a measure of how close our predictions are to the actual categories
  • split train and validation
train_indices = np.random.choice(len(x_vals), round(len(x_vals)*0.8), replace=False)
test_indices = np.array(list(set(range(len(x_vals))) - set(train_indices)))
  • prediction operation
y_prediction = tf.squeeze(tf.round(tf.nn.sigmoid(tf.add(x_data, A))))
correct_prediction = tf.equal(y_prediction, y_target)
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
acc_value_test = sess.run(accuracy, feed_dict={x_data: [x_vals_test], y_target: [y_vals_test]})
acc_value_train = sess.run(accuracy, feed_dict={x_data: [x_vals_train], y_target: [y_vals_train]})
print('Accuracy on train set: ' + str(acc_value_train))
print('Accuracy on test set: ' + str(acc_value_test))
Accuracy on train set: 0.925
Accuracy on test set: 0.95

Others

1. broadcasting
  • Broadcasting is the process of making arrays with different shapes have compatible shapes for arithmetic operations.
|1 2 3|             |1 2 3|   |7 8 9|   | 8 10 12|
|4 5 6| + |7 8 9| = |4 5 6| + |7 8 9| = |11 13 15|

|7|      |7 7 7|
|8| ==>  |8 8 8|
|9|      |9 9 9|
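The same behavior in TensorFlow, as a quick sketch (reusing the sess from earlier):

mat = tf.constant([[1., 2., 3.], [4., 5., 6.]])   # shape (2, 3)
row = tf.constant([7., 8., 9.])                   # shape (3,), broadcast over both rows
col = tf.constant([[7.], [8.]])                   # shape (2, 1), broadcast over columns
print(sess.run(mat + row))   # [[ 8. 10. 12.] [11. 13. 15.]]
print(sess.run(mat + col))   # [[ 8.  9. 10.] [12. 13. 14.]]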

Miscellaneous

Thoughts
  1. activation function
  2. loss function
  3. optimize method(gradient)
  4. update method(batch, mini)
Scope
  • place a manually defined layer inside a named scope so that it is identifiable and collapsible/expandable on the computational graph
with tf.name_scope('Custom_Layer') as scope:
    custom_layer1 = custom_layer(mov_avg_layer)
image
  • 4 dimensions
    • image number, height, width, and channel
  • conv2d
    • slides a window over the input and takes an element-wise product with a filter we specify
    • an input tensor of shape [batch, in_height, in_width, in_channels]
    • a filter / kernel tensor of shape [filter_height, filter_width, in_channels, out_channels]
conv2d(
    input,
    filter,
    strides,
    padding,
    use_cudnn_on_gpu=True,
    data_format='NHWC',
    name=None
)
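A minimal sketch of a conv2d call with made-up shapes:

# one 8x8 single-channel "image": [batch, in_height, in_width, in_channels]
x_img = tf.random_uniform([1, 8, 8, 1])
# 2x2 filter mapping 1 input channel to 1 output channel
conv_filter = tf.constant(0.25, shape=[2, 2, 1, 1])
conv_out = tf.nn.conv2d(x_img, conv_filter, strides=[1, 2, 2, 1], padding='SAME')
print(sess.run(conv_out).shape)   # (1, 4, 4, 1)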
  • squeeze: drops the extra size-1 dimensions of our image
    • because matrix multiplication only operates on two-dimensional matrices
  • crop randomly cropping an image
cropped_image = tf.random_crop(my_image, [height//2, width//2, 3])