TensorFlow Basics


Pinned post by feitianlzk


Category: Tools

Article link: https://blog.csdn.net/feitianlzk/article/details/79010130


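This article covers TensorFlow basics, including collections under Graph operators, numerical computation, and use of the LOW LEVEL API (data -> tensor, computation, Session). It discusses in detail how to create, initialize, and use Tensors and Variables, along with the concepts of Graphs and Sessions. It also includes notes on Chap1-2 of the TensorFlow cookbook, covering Tensors, Placeholders, Variables, Session, Loss Function, and Back Propagation.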


- 1. Graph operators
  - 1.1 collection
  - Common numerical operations
  - condition
- LOW LEVEL API
- TensorFlow Cookbook notes, Chap1-2

1. Graph operators

1.1 collection

tf.add_to_collection(name, value): stores value in the collection with the given name
tf.get_collection(key): returns the list of values in the collection with the given name

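# The original is a fragment of a loss function; total_loss is a hypothetical wrapper name.
def total_loss(cross_entropy_mean):
    tf.add_to_collection('losses', cross_entropy_mean)

    # The total loss is defined as the cross entropy loss plus all of the weight
    # decay terms (L2 loss).
    return tf.add_n(tf.get_collection('losses'), name='total_loss')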

tf.add_n(inputs): Adds all input tensors element-wise.
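
The collection pattern above can be mimicked in plain Python to see what it does. This is only a sketch, not the TensorFlow implementation: the module-level dictionary and the three helper functions are hypothetical stand-ins that borrow the TensorFlow names, and tensors are replaced by flat lists.

```python
from collections import defaultdict

# Hypothetical stand-in for the graph's collections: name -> list of values.
_collections = defaultdict(list)

def add_to_collection(name, value):
    """Append value to the list stored under name."""
    _collections[name].append(value)

def get_collection(key):
    """Return a copy of the list stored under key (empty if nothing was added)."""
    return list(_collections[key])

def add_n(inputs):
    """Element-wise sum of equally sized lists, like tf.add_n on 1-D tensors."""
    return [sum(column) for column in zip(*inputs)]

add_to_collection('losses', [0.5, 1.0])    # e.g. a cross-entropy term
add_to_collection('losses', [0.25, 0.25])  # e.g. a weight-decay term
total_loss = add_n(get_collection('losses'))
print(total_loss)  # [0.75, 1.25]
```

No collection has to be created up front: the first add_to_collection call under a new name creates it.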

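Common numerical operations

Absolute value: tf.abs
Maximum: tf.reduce_max
- tf.maximum: tf.maximum(a, b) returns the element-wise maximum of a and b.
- tf.minimum: tf.minimum(a, b) returns the element-wise minimum of a and b.
- tf.argmax: tf.argmax(a, dimension) returns the index of the largest value of a along the given dimension.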

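import tensorflow as tf

a = [1, 5, 3]

f1 = tf.maximum(a, 3)  # element-wise maximum against the scalar 3
f2 = tf.minimum(a, 3)  # element-wise minimum against the scalar 3
f3 = tf.argmax(a, 0)   # index of the largest element along dimension 0
f4 = tf.argmin(a, 0)   # index of the smallest element along dimension 0

with tf.Session() as sess:
    print(sess.run(f1))  # equivalently: print(f1.eval())
    print(sess.run(f2))
    print(sess.run(f3))
    print(sess.run(f4))
#### Results
[3 5 3]
[1 3 3]
1
0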
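
For comparison, the same four results can be reproduced without TensorFlow. The helpers below are a plain-Python sketch for 1-D lists only; they share names with the TensorFlow ops but are hypothetical re-implementations.

```python
a = [1, 5, 3]

def maximum(values, other):
    """Element-wise maximum of a list against a scalar."""
    return [max(v, other) for v in values]

def minimum(values, other):
    """Element-wise minimum of a list against a scalar."""
    return [min(v, other) for v in values]

def argmax(values):
    """Index of the first largest element."""
    return max(range(len(values)), key=values.__getitem__)

def argmin(values):
    """Index of the first smallest element."""
    return min(range(len(values)), key=values.__getitem__)

print(maximum(a, 3))  # [3, 5, 3]
print(minimum(a, 3))  # [1, 3, 3]
print(argmax(a))      # 1
print(argmin(a))      # 0
```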

condition

tf.where

tf.where(
    condition,
    x=None,
    y=None,
    name=None
)
# The condition tensor acts as a mask that chooses, based on the value at each element, whether the corresponding element / row in the output should be taken from x (if true) or y (if false).
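
The masking behaviour described in the comment can be sketched in plain Python for 1-D inputs. The where function here is a hypothetical stand-in, not tf.where itself, and it assumes condition, x and y all have the same length.

```python
def where(condition, x, y):
    """Pick x[i] where condition[i] is true, otherwise y[i]."""
    return [xv if c else yv for c, xv, yv in zip(condition, x, y)]

values = [-2.0, 0.5, -0.1, 3.0]
mask = [v > 0 for v in values]            # [False, True, False, True]
clipped = where(mask, values, [0.0] * len(values))
print(clipped)  # [0.0, 0.5, 0.0, 3.0]
```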

LOW LEVEL API

1. Introduction

1.1 data -> tensor

rank: number of dimensions
shape: a tuple of integers specifying the array’s length along each dimension
A tensor consists of a set of primitive values shaped into an array of any number of dimensions
TensorFlow uses numpy arrays to represent tensor values.
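
Rank and shape can be illustrated on nested Python lists. The two helpers below are hypothetical (they are not TensorFlow functions) and assume the nesting is regular, i.e. every sub-list at a given depth has the same length.

```python
def shape_of(value):
    """Shape of a regular nested list as a tuple; a scalar has shape ()."""
    dims = []
    while isinstance(value, list):
        dims.append(len(value))
        value = value[0] if value else None
    return tuple(dims)

def rank_of(value):
    """Rank is the number of dimensions, i.e. the length of the shape."""
    return len(shape_of(value))

print(shape_of(3.0), rank_of(3.0))                            # () 0
print(shape_of([1., 2., 3.]), rank_of([1., 2., 3.]))          # (3,) 1
print(shape_of([[1., 2., 3.], [4., 5., 6.]]))                 # (2, 3)
print(shape_of([[[1., 2., 3.]], [[7., 8., 9.]]]))             # (2, 1, 3)
```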

1.2 computation

building the computational graph (tf.Graph)
running the computational graph (tf.Session)
Graph: A computational graph is a series of TensorFlow operations arranged into a graph
- Operations(“ops”): nodes of graph
- Tensors: edges in the graph
- tf.Tensors do not have values, they are just handles to elements in the computation graph

1.3 Session

can pass multiple tensors to tf.Session.run. The run method transparently handles any combination of tuples or dictionaries

a = tf.constant(3.0)
b = tf.constant(4.0)
total = a + b
sess = tf.Session()
print(sess.run({'ab': (a, b), 'total': total}))
# {'total': 7.0, 'ab': (3.0, 4.0)}

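During a single call to tf.Session.run, every tf.Tensor has one consistent value; separate calls re-evaluate the graph.

vec = tf.random_uniform(shape=(3,))  # produces a new random 3-element vector on each run call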
out1 = vec + 1
out2 = vec + 2
print(sess.run(vec))
print(sess.run(vec))
print(sess.run((out1, out2)))
## results
[ 0.52917576  0.64076328  0.68353939]
[ 0.66192627  0.89126778  0.06254101]
(
  array([ 1.88408756,  1.87149239,  1.84057522], dtype=float32),
  array([ 2.88408756,  2.87149239,  2.84057522], dtype=float32)
)
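
One way to picture why out1 and out2 agree within a run is a per-run cache: each node is evaluated at most once per call. The classes and the run function below are a hypothetical plain-Python sketch under that assumption, not how TensorFlow is actually implemented.

```python
import random

class RandomUniform:
    """Hypothetical source node: draws a fresh value every time it is evaluated."""
    def evaluate(self, cache):
        return random.random()

class AddConstant:
    """Hypothetical op node: adds a constant to the value of another node."""
    def __init__(self, source, constant):
        self.source = source
        self.constant = constant

    def evaluate(self, cache):
        return fetch(self.source, cache) + self.constant

def fetch(node, cache):
    """Evaluate node at most once per run by memoising its value in cache."""
    if node not in cache:
        cache[node] = node.evaluate(cache)
    return cache[node]

def run(*nodes):
    """Evaluate the requested nodes with a cache that lives for this call only."""
    cache = {}
    return tuple(fetch(node, cache) for node in nodes)

vec = RandomUniform()
out1 = AddConstant(vec, 1)
out2 = AddConstant(vec, 2)

print(run(vec)[0])        # one random value
print(run(vec)[0])        # a different draw, because this is a new run
r1, r2 = run(out1, out2)  # both see the same draw of vec
print(r2 - r1)            # 1.0 up to rounding
```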

1.4 Feeding

feed_dict argument can be used to overwrite any tensor in the graph
The only difference between placeholders and other tf.Tensors is that placeholders throw an error if no value is fed to them.

a = tf.constant(1)
b = tf.constant(2)
total = a + b
sess.run(total, feed_dict={a: 4})
>> 6
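
The feeding rule can be sketched with a tiny evaluator in which a fed value takes precedence over whatever the node would otherwise produce. Constant, Placeholder, Add and run are hypothetical plain-Python stand-ins that only mirror the behaviour described above.

```python
class Constant:
    def __init__(self, value):
        self.value = value

class Placeholder:
    """Has no value of its own; it must appear in feed_dict."""

class Add:
    def __init__(self, left, right):
        self.left = left
        self.right = right

def run(node, feed_dict=None):
    """Evaluate node; any node present in feed_dict is overridden by the fed value."""
    feed = feed_dict or {}
    if node in feed:
        return feed[node]
    if isinstance(node, Constant):
        return node.value
    if isinstance(node, Placeholder):
        raise ValueError("a placeholder must be fed a value")
    if isinstance(node, Add):
        return run(node.left, feed) + run(node.right, feed)
    raise TypeError("unknown node type")

a = Constant(1)
b = Constant(2)
total = Add(a, b)
print(run(total))                    # 3
print(run(total, feed_dict={a: 4}))  # 6, the constant a is overridden

p = Placeholder()
print(run(Add(p, b), feed_dict={p: 10}))  # 12
```

Calling run(Add(p, b)) without feeding p raises ValueError, which is the one difference between the placeholder and the other nodes in this sketch.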

1.5 layers

2. Tensor

tf.Variable
tf.constant
tf.placeholder
tf.SparseTensor

2.1 shape

zeros = tf.zeros(my_matrix.shape[1])

2.2 data type

float_tensor = tf.cast(tf.constant([1, 2, 3]), dtype=tf.float32)

3. Variable

represent shared, persistent state manipulated by your program.
A tf.Variable represents a tensor whose value can be changed by running ops on it.
Unlike tf.Tensor objects, a tf.Variable exists outside the context of a single session.run call.

3.1 creating

By default, tf.get_variable creates a tf.float32 variable initialized with tf.glorot_uniform_initializer.

my_variable = tf.get_variable("my_variable", [1, 2, 3])
# initialize with a specified dtype and initializer
my_int_variable = tf.get_variable("my_int_variable", [1, 2, 3], dtype=tf.int32,
  initializer=tf.zeros_initializer)
# initialize using specified value
other_variable = tf.get_variable("other_variable", dtype=tf.int32,
  initializer=tf.constant([23, 42]))

3.2 variable collections

Collections are named lists of tensors or other objects.
By default, every tf.Variable is placed in the following two collections:
- tf.GraphKeys.GLOBAL_VARIABLES: variables that can be shared across multiple devices
- tf.GraphKeys.TRAINABLE_VARIABLES: variables for which TensorFlow will calculate gradients.
To make a variable non-trainable, add it to tf.GraphKeys.LOCAL_VARIABLES instead, or pass trainable=False:

my_local = tf.get_variable("my_local", shape=(),
                           collections=[tf.GraphKeys.LOCAL_VARIABLES])
# or
my_non_trainable = tf.get_variable("my_non_trainable",
                                   shape=(),
                                   trainable=False)

use your own collection

# no need to explicitly create a collection
tf.add_to_collection("my_collection_name", my_local)
# retrieve a list of all the variables
tf.get_collection("my_collection_name")

3.3 initializing

tf.global_variables_initializer
- initializing all variables in the tf.GraphKeys.GLOBAL_VARIABLES collection

3.4 using, assigning

treat it like a normal tf.Tensor
To assign a value to a variable, use the methods assign, assign_add

v = tf.get_variable("v", shape=(), initializer=tf.zeros_initializer())
assignment = v.assign_add(1)
sess.run(tf.global_variables_initializer())  # initialize v in the session used below
sess.run(assignment)  # or assignment.op.run(), or assignment.eval(), given a default session

Implicitly wrapping tf.Variable objects within tf.variable_scope objects.

def conv_relu(input, kernel_shape, bias_shape):
    # Create variable named "weights".
    weights = tf.get_variable("weights", kernel_shape,
        initializer=tf.random_normal_initializer())
    # Create variable named "biases".
    biases = tf.get_variable("biases", bias_shape,
        initializer=tf.constant_initializer(0.0))
    conv = tf.nn.conv2d(input, weights,
        strides=[1, 1, 1, 1], padding='SAME')
    return tf.nn.relu(conv + biases)


def my_image_filter(input_images):
    with tf.variable_scope("conv1"):
        # Variables created here will be named "conv1/weights", "conv1/biases".
        relu1 = conv_relu(input_images, [5, 5, 32, 32], [32])
    with tf.variable_scope("conv2"):
        # Variables created here will be named "conv2/weights", "conv2/biases".
        return conv_relu(relu1, [5, 5, 32, 32], [32])

reusing

def my_variable_sharing(input1, input2):
    # Two alternative ways to share variables between calls; use one or the other.
    # opt1: switch the scope to reuse mode after the first call
    with tf.variable_scope("model") as scope:
        output1 = my_image_filter(input1)
        scope.reuse_variables()
        output2 = my_image_filter(input2)

    # opt2: reopen the same scope with reuse=True
    with tf.variable_scope("model") as scope:
        output1 = my_image_filter(input1)
    with tf.variable_scope(scope, reuse=True):
        output2 = my_image_filter(input2)

4. Graphs and Sessions

tf.Operation node
tf.Tensor edge

4.1 Naming operation

tf.Tensor objects are implicitly named after the tf.Operation that produces the tensor as output. A tensor name has the form "<OP_NAME>:<i>" where:
- "<OP_NAME>" is the name of the operation that produces it.
- "<i>" is an integer representing the index of that tensor among the operation’s outputs.
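
The naming rule can be made concrete with a small parser. parse_tensor_name is a hypothetical helper written for these notes, not a TensorFlow function; it simply splits a name of the form "<OP_NAME>:<i>" at the last colon.

```python
def parse_tensor_name(name):
    """Split 'op_name:index' into (op_name, index)."""
    op_name, separator, index = name.rpartition(':')
    if not separator or not op_name or not index.isdigit():
        raise ValueError("expected '<OP_NAME>:<i>', got %r" % name)
    return op_name, int(index)

print(parse_tensor_name("conv1/weights:0"))  # ('conv1/weights', 0)
print(parse_tensor_name("split:2"))          # ('split', 2)
```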

4.2 Tensor-like objects

operations take one or more tf.Tensor objects as arguments
these functions will accept a tensor-like object in place of a tf.Tensor, and implicitly convert it to a tf.Tensor using the tf.convert_to_tensor method
Tensor-like objects
- tf.Variable
- numpy.ndarray
- list
- scalar Python types
TensorFlow creates a new tf.Tensor each time you use the same tensor-like object, so using a large object multiple times may exhaust memory.
- To avoid this, call tf.convert_to_tensor on the tensor-like object once and use the returned tf.Tensor instead.

4.3 Session

to do

5. Save

to do

TensorFlow Cookbook notes, Chap1-2

Chap1

Tensor

primary data structure

zero_tsr = tf.zeros([row_dim, col_dim])
ones_tsr = tf.ones([row_dim, col_dim])
# Create a constant filled tensor. Use the following
filled_tsr = tf.fill([row_dim, col_dim], 42)
constant_tsr = tf.constant([1,2,3])

declare as variables or feed as placeholders
sequence tensor
- tf.linspace: the stop value is included
- tf.range: the limit value is excluded

linear_tsr = tf.linspace(start=0.0, stop=1.0, num=3)    # [0.0, 0.5, 1.0]
integer_seq_tsr = tf.range(start=6, limit=15, delta=3)  # [6, 9, 12]
y_vals = np.repeat(10., 100)  
x_vals = np.random.normal(1, 0.1, 100)

random tensor
- uniform distribution
- normal distribution

randunif_tsr = tf.random_uniform([row_dim, col_dim],
                                 minval=0, maxval=1)
randnorm_tsr = tf.random_normal([row_dim, col_dim],
                                mean=0.0, stddev=1.0)

random entries of arrays

shuffled_output = tf.random_shuffle(input_tensor)
cropped_output = tf.random_crop(input_tensor, crop_size)

Placeholders and Variables

Variables are the parameters of the algorithm and TensorFlow keeps track of how to change these to optimize the algorithm.
Placeholders are objects that allow you to feed in data of a specific type and shape and depend on the results of the computational graph, such as the expected outcome of a computation.
Placeholders are just holding the position for data to be fed into the graph. Placeholders get data from a feed_dict argument in the session. To put a placeholder in the graph, we must perform at least one operation on the placeholder

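my_var = tf.Variable(tf.zeros([row_dim, col_dim]))
x_data = tf.placeholder(tf.float32, shape=(3, 5))
# add1 (not shown in these notes) is an op built from x_data; each x_val must have shape (3, 5)
for x_val in x_vals:
    print(sess.run(add1, feed_dict={x_data: x_val}))
# None leaves the number of columns unspecified
x_data = tf.placeholder(tf.float32, shape=(3, None))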

notes

prefer tf.get_variable over tf.Variable in production code
- it makes the code much easier to refactor if you later need to share variables, e.g. in a multi-GPU setting
- tf.Variable always creates a new variable, whereas tf.get_variable returns the existing variable with that name when its scope is in reuse mode, and otherwise creates a new one.
- default initializer: Xavier (Glorot uniform)

W = tf.get_variable("W", shape=[784, 256], initializer=tf.contrib.layers.xavier_initializer())

with tf.variable_scope("one"):
    a = tf.get_variable("v", [1]) #a.name == "one/v:0"
with tf.variable_scope("one"):
    b = tf.get_variable("v", [1]) #ValueError: Variable one/v already exists
with tf.variable_scope("one", reuse = True):
    c = tf.get_variable("v", [1]) #c.name == "one/v:0"

with tf.variable_scope("two"):
    d = tf.get_variable("v", [1]) #d.name == "two/v:0"
    e = tf.Variable(1, name = "v", expected_shape = [1]) #e.name == "two/v_1:0"

assert(a is c)  #Assertion is true, they refer to the same object.
assert(a is d)  #AssertionError: they are different objects
assert(d is e)  #AssertionError: they are different objects
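
The reuse rules exercised above can be sketched as a name-keyed registry. VariableStore is a hypothetical plain-Python class, not part of TensorFlow; it only mirrors the behaviour the snippet shows: creating an existing name without reuse fails, and reuse returns the very same object.

```python
class VariableStore:
    """Hypothetical registry mapping 'scope/name' to a mutable value holder."""

    def __init__(self):
        self._variables = {}

    def get_variable(self, scope, name, initial=0.0, reuse=False):
        full_name = scope + "/" + name
        if reuse:
            if full_name not in self._variables:
                raise ValueError("Variable %s does not exist" % full_name)
            return self._variables[full_name]
        if full_name in self._variables:
            raise ValueError("Variable %s already exists" % full_name)
        self._variables[full_name] = [initial]  # a list acts as a mutable box
        return self._variables[full_name]

store = VariableStore()
a = store.get_variable("one", "v")
try:
    store.get_variable("one", "v")          # same name, reuse not requested
except ValueError as error:
    print(error)                            # Variable one/v already exists
c = store.get_variable("one", "v", reuse=True)
d = store.get_variable("two", "v")

print(a is c)  # True: reuse returns the same object
print(a is d)  # False: a different scope gives a different variable
```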

Matrices

tf.diag
tf.convert_to_tensor

identity_matrix = tf.diag([1.0, 1.0, 1.0])
A = tf.truncated_normal([2, 3])
B = tf.fill([2,3], 5.0)
C = tf.random_uniform([3,2])
D = tf.convert_to_tensor(np.array([[1., 2., 3.], [-3., -7., -1.], [0., 5., -2.]]))
print(sess.run(identity_matrix))

tf.matmul(A, B): matrix multiplication
tf.transpose(C): transpose
tf.matrix_determinant(): determinant
tf.matrix_inverse(): inverse
tf.self_adjoint_eig(): eigenvalues and eigenvectors
- outputs the eigenvalues in the first row
- and the eigenvectors in the remaining rows

Operations

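add, sub, mul, div, mod
- div() returns the same type as its inputs, so for integer inputs it returns the floor of the division
- truediv() casts integers to floats before dividing
- floordiv() returns the quotient rounded down to the nearest integer (for float inputs the result is still a float)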

print(sess.run(tf.div(3,4)))
0
print(sess.run(tf.truediv(3,4)))
0.75
print(sess.run(tf.floordiv(3.0,4.0)))
0.0
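
Python 3's own operators show the same three behaviours, which may help when reading the outputs above. The div_like helper is a hypothetical illustration of "the result keeps the input type"; it is not the TensorFlow implementation.

```python
def div_like(x, y):
    """Floor division for two ints, true division otherwise."""
    if isinstance(x, int) and isinstance(y, int):
        return x // y
    return x / y

print(div_like(3, 4))      # 0     (integers in, integer out)
print(div_like(3.0, 4.0))  # 0.75  (floats in, float out)
print(3 / 4)               # 0.75  (true division always returns a float)
print(3.0 // 4.0)          # 0.0   (floor division keeps the float type)
```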

customize

def custom_polynomial(value):
    # 3x^2 - x + 10 (tf.subtract is the TensorFlow 1.x name of the older tf.sub)
    return tf.subtract(3 * tf.square(value), value) + 10
print(sess.run(custom_polynomial(11)))
362

Activation function

relu rectified linear unit
- relu6

# max(0,x)
print(sess.run(tf.nn.relu([-3., 3., 10.])))
[ 0. 3. 10.]
# min(max(0,x), 6)
print(sess.run(tf.nn.relu6([-3., 3., 10.])))
[ 0. 3. 6.]

sigmoid
- output is not zero-centered, so the data should be zero-meaned before use

print(sess.run(tf.nn.sigmoid([-1., 0., 1.])))
[ 0.26894143 0.5 0.7310586 ]

hyperbolic tangent
- range between -1 and 1

# (exp(x) - exp(-x)) / (exp(x) + exp(-x))
print(sess.run(tf.nn.tanh([-1., 0., 1.])))
[-0.76159418 0. 0.76159418 ]

softsign, softplus, ELU
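
The activation functions in this section have short closed forms. The scalar versions below are a plain-Python sketch using the standard definitions; the function names match the TensorFlow ops only for readability, and the ELU here assumes a scale of 1.

```python
import math

def relu(x):
    return max(0.0, x)

def relu6(x):
    return min(max(0.0, x), 6.0)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x):
    return (math.exp(x) - math.exp(-x)) / (math.exp(x) + math.exp(-x))

def softsign(x):
    return x / (abs(x) + 1.0)

def softplus(x):
    return math.log(math.exp(x) + 1.0)

def elu(x):
    # Identity for positive inputs, exp(x) - 1 otherwise.
    return x if x > 0 else math.exp(x) - 1.0

inputs = [-3.0, 3.0, 10.0]
print([relu(x) for x in inputs])   # [0.0, 3.0, 10.0]
print([relu6(x) for x in inputs])  # [0.0, 3.0, 6.0]
print(sigmoid(0.0))                # 0.5
print(softsign(1.0))               # 0.5
```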

Chap2

Session

import tensorflow as tf
sess = tf.Session()

To rerun the same TensorFlow script, reset the default graph first

from tensorflow.python.framework import ops
ops.reset_default_graph()
sess = tf.Session()

Loss Function

L2 norm
- it is very curved near the target
- so algorithms take smaller steps and converge more slowly as they approach the target
- tf.nn.l2_loss() computes half the squared L2 norm: sum(t ** 2) / 2

l2_y_vals = tf.square(target - x_vals)
l2_y_out = sess.run(l2_y_vals)

L1 norm
- The L1 norm is better for outliers than the L2 norm because it is not as steep for larger values
- L1 norm is not smooth at the target and this can result in algorithms not converging well

l1_y_vals = tf.abs(target - x_vals)

Pseudo-Huber
- a continuous and smooth approximation to the Huber loss
- attempts to take the best of the L1 and L2 norms by being convex near the target and less steep for extreme values
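
To compare the three losses numerically, here is a scalar plain-Python sketch. The pseudo-Huber form used is the standard one, delta^2 * (sqrt(1 + (error / delta)^2) - 1); the function names and the default delta of 1.0 are assumptions made for this illustration.

```python
import math

def l2_loss(target, prediction):
    return (target - prediction) ** 2

def l1_loss(target, prediction):
    return abs(target - prediction)

def pseudo_huber_loss(target, prediction, delta=1.0):
    error = (target - prediction) / delta
    return delta ** 2 * (math.sqrt(1.0 + error ** 2) - 1.0)

for prediction in (0.1, 1.0, 10.0):
    print(prediction,
          l2_loss(0.0, prediction),
          l1_loss(0.0, prediction),
          pseudo_huber_loss(0.0, prediction))
```

Near the target the pseudo-Huber value is close to half the squared error (quadratic, like L2); far from it the value grows roughly linearly (like L1), which is the trade-off described above.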
Hinge loss

hinge_y_vals = tf.maximum(0., 1. - tf.multiply(target, x_vals))
hinge_y_out = sess.run(hinge_y_vals)

cross entropy

# Binary (two-class) cross entropy.
# From cs231n: loss = -y log P(y|x)
# y is a one-hot label, so the sum reduces to loss = -log(a_i) for the true class i
xentropy_y_vals = (- tf.multiply(target, tf.log(x_vals))
                   - tf.multiply((1. - target), tf.log(1. - x_vals)))
xentropy_y_out = sess.run(xentropy_y_vals)
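
Scalar versions of the hinge and binary cross-entropy formulas make their behaviour easy to check by hand. This is a plain-Python sketch with hypothetical function names; it assumes hinge targets are -1 or 1 and that the cross-entropy prediction is a probability strictly between 0 and 1.

```python
import math

def hinge_loss(target, prediction):
    """max(0, 1 - target * prediction), with target in {-1, 1}."""
    return max(0.0, 1.0 - target * prediction)

def binary_cross_entropy(target, probability):
    """-t*log(p) - (1 - t)*log(1 - p), with target in {0, 1}."""
    return (-target * math.log(probability)
            - (1.0 - target) * math.log(1.0 - probability))

print(hinge_loss(1.0, 2.0))    # 0.0, correct side with margin
print(hinge_loss(1.0, 0.5))    # 0.5, correct side but inside the margin
print(hinge_loss(-1.0, 0.5))   # 1.5, wrong side
print(binary_cross_entropy(1.0, 0.9) < binary_cross_entropy(1.0, 0.1))  # True
```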

sigmoid cross entropy
weighted cross entropy
softmax cross entropy
sparse softmax cross-entropy
metrics
- stable: whether smooth near target
- robust: whether sensitive to outliers
batches
- specific loss function expects batches of data

my_output_expanded = tf.expand_dims(my_output, 0)
y_target_expanded = tf.expand_dims(y_target, 0)

Back Propagation

minimize loss function
MomentumOptimizer()
AdagradOptimizer()

my_opt = tf.train.GradientDescentOptimizer(0.05)
train_step = my_opt.minimize(xentropy)

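for i in range(1400):
    rand_index = np.random.choice(100)
    rand_x = [x_vals[rand_index]]
    rand_y = [y_vals[rand_index]]
    # feed one randomly chosen sample per step (x_data and y_target are the placeholders)
    sess.run(train_step, feed_dict={x_data: rand_x, y_target: rand_y})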

Batch and Stochastic Training

mean loss
- tf.reduce_mean()

loss = tf.reduce_mean(tf.square(my_output - y_target))

record loss

loss_batch = []
for i in range(100):
    rand_index = np.random.choice(100, size=batch_size)
    rand_x = np.transpose([x_vals[rand_index]])
    rand_y = np.transpose([y_vals[rand_index]])
    sess.run(train_step, feed_dict={x_data: rand_x, y_target: rand_y})
    # report and record the loss every 5 steps
    if (i+1)%5==0:
        print('Step #' + str(i+1) + ' A = ' + str(sess.run(A)))
        temp_loss = sess.run(loss, feed_dict={x_data: rand_x, y_target: rand_y})
        print('Loss = ' + str(temp_loss))
        loss_batch.append(temp_loss)
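
The difference between stochastic and batch training can be sketched without TensorFlow. The example below assumes a one-parameter model prediction = A * x with a mean squared loss, data shaped like the x_vals and y_vals defined earlier in these notes, and hand-written gradient descent; the train function, the learning rate and the step count are illustrative choices, not values from the source.

```python
import random

rng = random.Random(0)
x_vals = [rng.gauss(1.0, 0.1) for _ in range(100)]
y_vals = [10.0] * 100

def train(batch_size, steps=200, learning_rate=0.05):
    """Fit A in prediction = A * x by gradient descent on the mean squared error."""
    A = 0.0
    losses = []
    for _ in range(steps):
        indices = [rng.randrange(len(x_vals)) for _ in range(batch_size)]
        errors = [A * x_vals[i] - y_vals[i] for i in indices]
        loss = sum(e * e for e in errors) / batch_size
        # d(loss)/dA is the batch mean of 2 * error * x
        gradient = sum(2.0 * e * x_vals[i] for e, i in zip(errors, indices)) / batch_size
        A -= learning_rate * gradient
        losses.append(loss)
    return A, losses

A_stochastic, loss_stochastic = train(batch_size=1)
A_batch, loss_batch = train(batch_size=20)
print('stochastic A =', A_stochastic)
print('batch A =', A_batch)
```

Both runs should end with A close to 10. Averaging the gradient over 20 samples typically gives a smoother loss curve than updating on one sample at a time, at the cost of more computation per step.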

Evaluation

regression: an aggregate measure of the distance between predictions and actual targets
classification: a measure of how close our predictions are to the truth
split train and validation

train_indices = np.random.choice(len(x_vals), round(len(x_vals)*0.8), replace=False)
test_indices = np.array(list(set(range(len(x_vals))) - set(train_indices)))
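
The same 80/20 index split can be written with the standard library only. This is a sketch: the data here is placeholder integers, and rng, x_train and x_test are names chosen for the illustration.

```python
import random

rng = random.Random(0)
x_vals = list(range(100))  # placeholder data, one value per sample

n_train = round(len(x_vals) * 0.8)
# Sample without replacement so no index is picked twice.
train_indices = rng.sample(range(len(x_vals)), n_train)
# Everything not chosen for training goes to the test set.
test_indices = sorted(set(range(len(x_vals))) - set(train_indices))

x_train = [x_vals[i] for i in train_indices]
x_test = [x_vals[i] for i in test_indices]
print(len(x_train), len(x_test))  # 80 20
```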

prediction operation

y_prediction = tf.squeeze(tf.round(tf.nn.sigmoid(tf.add(x_data, A))))
correct_prediction = tf.equal(y_prediction, y_target)
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

acc_value_test = sess.run(accuracy, feed_dict={x_data: [x_vals_test], y_target: [y_vals_test]})
acc_value_train = sess.run(accuracy, feed_dict={x_data: [x_vals_train], y_target: [y_vals_train]})
print('Accuracy on train set: ' + str(acc_value_train))
print('Accuracy on test set: ' + str(acc_value_test))
Accuracy on train set: 0.925
Accuracy on test set: 0.95
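
The prediction and accuracy ops can be traced by hand on a few numbers. The sketch below assumes the same model as above, round(sigmoid(x + A)); the sample inputs, the value of A and the targets are made up for illustration.

```python
import math

def predict(x, A):
    """Round sigmoid(x + A) to the nearest class label, 0 or 1."""
    return round(1.0 / (1.0 + math.exp(-(x + A))))

def accuracy(predictions, targets):
    """Fraction of predictions that equal their target."""
    matches = sum(1 for p, t in zip(predictions, targets) if p == t)
    return matches / len(targets)

A = 0.5
x_samples = [-2.0, -1.0, 0.5, 3.0]
targets = [0, 1, 1, 1]

predictions = [predict(x, A) for x in x_samples]
print(predictions)                     # [0, 0, 1, 1]
print(accuracy(predictions, targets))  # 0.75
```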

Others

1. broadcasting

Broadcasting is the process of making arrays with different shapes have compatible shapes for arithmetic operations.

|1 2 3| + |7 8 9|
|4 5 6|
------------
|1 2 3| + |7 8 9| = |8  10 12|
|4 5 6|   |7 8 9|   |11 13 15|
-------------
 |7| ==> |7 7 7|
 |8|     |8 8 8|
 |9|     |9 9 9|
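
The diagrams above correspond to the following plain-Python sketch. The three helpers are hypothetical and handle only the two cases drawn: a row vector repeated down the rows, and a column vector repeated across the columns.

```python
def add_row(matrix, row):
    """Add the same row vector to every row of the matrix."""
    return [[m + r for m, r in zip(matrix_row, row)] for matrix_row in matrix]

def expand_column(column, width):
    """Repeat each column entry across width columns."""
    return [[value] * width for value in column]

def add_column(matrix, column):
    """Add column[i] to every entry of row i."""
    return [[m + c for m in matrix_row] for matrix_row, c in zip(matrix, column)]

matrix = [[1, 2, 3],
          [4, 5, 6]]
print(add_row(matrix, [7, 8, 9]))    # [[8, 10, 12], [11, 13, 15]]
print(expand_column([7, 8, 9], 3))   # [[7, 7, 7], [8, 8, 8], [9, 9, 9]]
print(add_column(matrix, [10, 20]))  # [[11, 12, 13], [24, 25, 26]]
```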

Miscellaneous

Thoughts

activation function
loss function
optimize method(gradient)
update method(batch, mini)

Scope

place manual layer with a named scope
so that it is identifiable and collapsible/expandable on the computational graph

with tf.name_scope('Custom_Layer') as scope:
    custom_layer1 = custom_layer(mov_avg_layer)

image

4 dimensions
- image number, height, width, and channel
conv2d
- takes a piecewise product of the window and a filter we specify
- an input tensor of shape [batch, in_height, in_width, in_channels]
- a filter / kernel tensor of shape [filter_height, filter_width, in_channels, out_channels]

conv2d(
    input,
    filter,
    strides,
    padding,
    use_cudnn_on_gpu=True,
    data_format='NHWC',
    name=None
)
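
Given the input and filter shapes listed above, the output shape follows from the stride and padding. The helper below is a hypothetical plain-Python sketch that assumes NHWC layout, no dilation, and the usual rules: with 'SAME' padding each spatial size becomes ceil(in / stride), with 'VALID' it becomes ceil((in - filter + 1) / stride).

```python
import math

def conv2d_output_shape(input_shape, filter_shape, strides, padding):
    """Return (batch, out_height, out_width, out_channels) for NHWC input."""
    batch, in_height, in_width, in_channels = input_shape
    filter_height, filter_width, filter_in, out_channels = filter_shape
    if filter_in != in_channels:
        raise ValueError("filter in_channels must match input in_channels")
    _, stride_h, stride_w, _ = strides
    if padding == 'SAME':
        out_height = math.ceil(in_height / stride_h)
        out_width = math.ceil(in_width / stride_w)
    elif padding == 'VALID':
        out_height = math.ceil((in_height - filter_height + 1) / stride_h)
        out_width = math.ceil((in_width - filter_width + 1) / stride_w)
    else:
        raise ValueError("padding must be 'SAME' or 'VALID'")
    return (batch, out_height, out_width, out_channels)

print(conv2d_output_shape([1, 28, 28, 32], [5, 5, 32, 32], [1, 1, 1, 1], 'SAME'))   # (1, 28, 28, 32)
print(conv2d_output_shape([1, 28, 28, 32], [5, 5, 32, 32], [1, 1, 1, 1], 'VALID'))  # (1, 24, 24, 32)
print(conv2d_output_shape([1, 28, 28, 32], [5, 5, 32, 32], [1, 2, 2, 1], 'SAME'))   # (1, 14, 14, 32)
```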

squeeze: drops the extra dimensions of our image that are of size 1
- needed because matrix multiplication only operates on two-dimensional matrices
crop: randomly crops an image

cropped_image = tf.random_crop(my_image, [height // 2, width // 2, 3])
