The model in Note 2 was designed from an everyday way of thinking, which is not how neural networks are usually designed. Note 3 looks at how to optimize that model so that its logic is clearer and it runs more efficiently.
4.1 Viewing variable values while the program is running
>>> name = "adam"
>>> print("Name: %s" % name)
Name: adam
>>>
>>> x = 101
>>> y = 12.35
>>> print("x = %d" % x)
x = 101
>>> print("y = %f" % y)
y = 12.350000
>>>
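A small aside (my own addition, not from the book): the same %-style formatting also accepts width and precision specifiers, and several values can go into one format string:

>>> print("y = %.2f" % 12.35)
y = 12.35
>>> print("x = %5d, y = %8.3f" % (101, 12.35))
x =   101, y =   12.350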
4.7 Normalizing the trainable parameters with the softmax function
I thought the book was wrong, and mistakenly concluded that softmax was ineffective
This section claims: *after applying the softmax function, if we then train with the same learning rate and loop count, the number of training steps needed to reach the same error rate drops noticeably.*
At first I assumed that with softmax the weights would quickly converge to [0.6, 0.3, 0.1], but my experiment turned out nothing like that: in my own runs, the softmax-normalized weights kept fluctuating widely.
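For reference, softmax maps any real vector to a set of positive weights that sum to 1, which is how the book keeps the trainable parameters normalized. A minimal NumPy sketch (my illustration) of what tf.nn.softmax computes:

# softmax_sketch.py (my illustration, not from the book)
import numpy as np

def softmax(v):
    e = np.exp(v - np.max(v))  # subtract the max for numerical stability
    return e / e.sum()

print(softmax(np.zeros(3)))       # [0.3333 0.3333 0.3333] -- all-zero weights share equally
print(softmax([1.0, 0.3, -0.8]))  # larger entries get exponentially larger shares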
Following the example given in the book:
# code_4.6_score1f.py
import tensorflow as tf

x = tf.placeholder(shape=[3], dtype=tf.float32)
yTrain = tf.placeholder(shape=[], dtype=tf.float32)
w = tf.Variable(tf.zeros([3]), dtype=tf.float32)
wn = tf.nn.softmax(w)  # normalize the weights so they are positive and sum to 1
n = x * wn
y = tf.reduce_sum(n)   # weighted sum of the three scores
loss = tf.abs(y - yTrain)
optimizer = tf.train.RMSPropOptimizer(0.1)
train = optimizer.minimize(loss)
sess = tf.Session()
init = tf.global_variables_initializer()
sess.run(init)
for i in range(2):
    result = sess.run([train, x, w, wn, y, yTrain, loss], feed_dict={x: [90, 80, 70], yTrain: 85})
    print(result[3])  # result[3] is wn, the softmax-normalized weights
    result = sess.run([train, x, w, wn, y, yTrain, loss], feed_dict={x: [98, 95, 87], yTrain: 96})
    print(result[3])
The output is as follows:
[0.33333334 0.33333334 0.33333334]
[0.413998 0.32727832 0.2587237 ]
[0.44992 0.32819405 0.22188595]
[0.5284719 0.2905868 0.18094125]
Judging by this output, after four training steps the parameters already seem to be approaching the analytic solution [0.6, 0.3, 0.1].
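Why is [0.6, 0.3, 0.1] the analytic solution? Because it reproduces both training targets exactly; a quick check (my own, not from the book):

# check_solution.py (my illustration)
w = [0.6, 0.3, 0.1]
print(sum(xi * wi for xi, wi in zip([90, 80, 70], w)))  # 85.0, matches yTrain = 85
print(sum(xi * wi for xi, wi in zip([98, 95, 87], w)))  # 96.0, matches yTrain = 96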
But when the iteration count is increased to range(100) (two training steps per iteration, 200 steps in total), the last ten outputs look like this:
[0.53597116 0.38605216 0.07797671]
[0.6009958 0.3307964 0.06820776]
[0.57949114 0.34662864 0.07388028]
[0.514625 0.40151832 0.08385672]
[0.53671014 0.3865417 0.07674818]
[0.6015758 0.33125973 0.06716447]
[0.5801556 0.34711993 0.07272442]
[0.5154343 0.40204716 0.0825186 ]
[0.53747654 0.38697198 0.07555145]
[0.6022332 0.33163568 0.06613111]
The weights are clearly oscillating. Increasing the iteration count to range(1000) (2,000 steps in total), the last ten outputs are:
[0.5909161 0.34312806 0.06595593]
[0.52686936 0.39814836 0.07498237]
[0.5489259 0.3824051 0.06866898]
[0.6131694 0.32687482 0.05995575]
[0.5922006 0.34290916 0.0648903 ]
[0.5282114 0.39799386 0.07379475]
[0.55027235 0.3821475 0.06758016]
[0.61447376 0.32654396 0.0589823 ]
[0.5935523 0.34261104 0.06383662]
[0.5296246 0.39775392 0.07262148]
Still unstable. Increasing the iteration count to range(5000) (10,000 steps in total), the last ten outputs are:
[0.581127 0.35069135 0.06818163]
[0.51661265 0.40605828 0.07732906]
[0.5385668 0.39057976 0.07085335]
[0.6031867 0.3347785 0.06203476]
[0.58199143 0.3508892 0.06711936]
[0.5175219 0.4063409 0.07613724]
[0.5394755 0.39075965 0.06976488]
[0.60406595 0.33486596 0.06106813]
[0.5829101 0.3510191 0.06607077]
[0.5184833 0.4065536 0.07496312]
This run is just as baffling. At this point it was hard to say what softmax was actually contributing.
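One plausible explanation (my reading, not the book's): the loss is an absolute value, so its gradient magnitude does not shrink as y approaches yTrain, and with a step size of 0.1 the optimizer keeps overshooting the optimum and falls into a cycle. A quick way to test that hypothesis is to sweep the learning rate; the sketch below assumes the same TF 1.x environment as the book's code, and final_weights is a hypothetical helper of my own:

# lr_sweep.py (my experiment sketch, not from the book)
import tensorflow as tf

def final_weights(lr, iterations):
    # rebuild the softmax model with the given learning rate and
    # return the softmax-normalized weights after training
    tf.reset_default_graph()
    x = tf.placeholder(shape=[3], dtype=tf.float32)
    yTrain = tf.placeholder(shape=[], dtype=tf.float32)
    w = tf.Variable(tf.zeros([3]), dtype=tf.float32)
    wn = tf.nn.softmax(w)
    loss = tf.abs(tf.reduce_sum(x * wn) - yTrain)
    train = tf.train.RMSPropOptimizer(lr).minimize(loss)
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        for _ in range(iterations):
            sess.run(train, feed_dict={x: [90, 80, 70], yTrain: 85})
            result = sess.run([train, wn], feed_dict={x: [98, 95, 87], yTrain: 96})
        return result[1]

for lr in (0.1, 0.01, 0.001):
    print(lr, final_weights(lr, 5000))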
For comparison, here is the book's earlier code, which does not use softmax:
# code_3.3_score1c.py
import tensorflow as tf

x1 = tf.placeholder(dtype=tf.float32)
x2 = tf.placeholder(dtype=tf.float32)
x3 = tf.placeholder(dtype=tf.float32)
yTrain = tf.placeholder(dtype=tf.float32)
w1 = tf.Variable(0.1, dtype=tf.float32)
w2 = tf.Variable(0.1, dtype=tf.float32)
w3 = tf.Variable(0.1, dtype=tf.float32)
n1 = x1 * w1
n2 = x2 * w2
n3 = x3 * w3
y = n1 + n2 + n3   # plain weighted sum; the weights are not normalized
loss = tf.abs(y - yTrain)
optimizer = tf.train.RMSPropOptimizer(0.001)
train = optimizer.minimize(loss)
sess = tf.Session()
init = tf.global_variables_initializer()
sess.run(init)
flag = 0
tmp = 999

def train_v1():
    # print the results around every 1000th iteration
    global flag, tmp  # both are reassigned below, so they must be declared global
    for i in range(10000):
        result = sess.run([train, x1, x2, x3, w1, w2, w3, y, yTrain, loss],
                          feed_dict={x1: 90, x2: 80, x3: 70, yTrain: 85})
        if i == tmp:
            flag = 1
            print("i = %d" % i)
            print(result)
        result = sess.run([train, x1, x2, x3, w1, w2, w3, y, yTrain, loss],
                          feed_dict={x1: 98, x2: 95, x3: 87, yTrain: 96})
        if i == tmp:
            print(result)
        result = sess.run([train, x1, x2, x3, w1, w2, w3, y, yTrain, loss],
                          feed_dict={x1: 70, x2: 90, x3: 80, yTrain: 77})
        if i == tmp:
            print(result)
        if flag == 1:
            tmp = tmp + 1000
            flag = 0

def train_v2():
    for i in range(5000):
        result = sess.run([train, x1, x2, x3, w1, w2, w3, y, yTrain, loss],
                          feed_dict={x1: 90, x2: 80, x3: 70, yTrain: 85})
        print(result)
        result = sess.run([train, x1, x2, x3, w1, w2, w3, y, yTrain, loss],
                          feed_dict={x1: 98, x2: 95, x3: 87, yTrain: 96})
        print(result)

def train_v3():
    for i in range(5000):
        result = sess.run([train, x1, x2, x3, w1, w2, w3, y, yTrain, loss],
                          feed_dict={x1: 92, x2: 98, x3: 90, yTrain: 94})
        print(result)
        result = sess.run([train, x1, x2, x3, w1, w2, w3, y, yTrain, loss],
                          feed_dict={x1: 92, x2: 99, x3: 98, yTrain: 96})
        print(result)

if __name__ == '__main__':
    train_v2()
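Incidentally, the flag/tmp bookkeeping in train_v1 is just an awkward way of printing the results of iterations 999, 1999, 2999, and so on; a plain modulus test (my rewrite, not from the book) does the same job:

# inside train_v1's loop, replacing the flag/tmp bookkeeping
if i % 1000 == 999:  # fires at i = 999, 1999, 2999, ...
    print("i = %d" % i)
    print(result)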
The last ten training results are as follows:
[None, array(90., dtype=float32), array(80., dtype=float32), array(70., dtype=float32), 0.58388305, 0.28717414, 0.1325421, 85.02325, array(85., dtype=float32), 0.023246765]
[None, array(98., dtype=float32), array(95., dtype=float32), array(87., dtype=float32), 0.5828438, 0.2860972, 0.13144642, 96.03325, array(96., dtype=float32), 0.0332489]
[None, array(90., dtype=float32), array(80., dtype=float32), array(70., dtype=float32), 0.5838025, 0.28701225, 0.132338, 84.54497, array(85., dtype=float32), 0.45503235]
[None, array(98., dtype=float32), array(95., dtype=float32), array(87., dtype=float32), 0.5848418, 0.2880892, 0.13343368, 95.99221, array(96., dtype=float32), 0.007789612]
[None, array(90., dtype=float32), array(80., dtype=float32), array(70., dtype=float32), 0.58388305, 0.28717414, 0.1325421, 85.02325, array(85., dtype=float32), 0.023246765]
[None, array(98., dtype=float32), array(95., dtype=float32), array(87., dtype=float32), 0.5828438, 0.2860972, 0.13144642, 96.03325, array(96., dtype=float32), 0.0332489]
[None, array(90., dtype=float32), array(80., dtype=float32), array(70., dtype=float32), 0.5838025, 0.28701225, 0.132338, 84.54497, array(85., dtype=float32), 0.45503235]
[None, array(98., dtype=float32), array(95., dtype=float32), array(87., dtype=float32), 0.5848418, 0.2880892, 0.13343368, 95.99221, array(96., dtype=float32), 0.007789612]
[None, array(90., dtype=float32), array(80., dtype=float32), array(70., dtype=float32), 0.58388305, 0.28717414, 0.1325421, 85.02325, array(85., dtype=float32), 0.023246765]
[None, array(98., dtype=float32), array(95., dtype=float32), array(87., dtype=float32), 0.5828438, 0.2860972, 0.13144642, 96.03325, array(96., dtype=float32), 0.0332489]
We see the loss here is fairly stable. Well, what counts as stable? A few steps still have a loss around 0.5, so it cannot really be called stable either. This kind of training truly feels like black magic.
It turns out I simply misread the book's description; the book is right, and softmax does work
Once I changed the learning rate in code_4.6_score1f.py to 0.001, the same value used in code_3.3_score1c.py, the trained weights really did become much more stable.
Concretely: with the learning rate (the argument to tf.train.RMSPropOptimizer()) in the first script (code_4.6_score1f.py) changed to 0.001 and the iteration count set to range(5000) (10,000 steps in total), the last ten outputs are:
[0.59985083 0.3001836 0.09996562]
[0.6004757 0.29970306 0.09982121]
[0.6002519 0.29983714 0.09991095]
[0.599627 0.3003176 0.10005541]
[0.599851 0.30018356 0.09996543]
[0.6004759 0.29970303 0.09982101]
[0.6002521 0.2998371 0.09991075]
[0.5996272 0.30031753 0.1000552 ]
[0.5998512 0.30018353 0.09996524]
[0.6004761 0.299703 0.09982082]
These final results are all very close to [0.6, 0.3, 0.1], which shows that softmax really is effective.
After applying the softmax function, training with the *same learning rate and loop count* reaches the same error rate in noticeably fewer training steps. The crucial qualifier is *the same learning rate and loop count*: the comparison only holds when the softmax and non-softmax versions share those settings.
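To check the claim directly, one can count how many training steps each model needs before the loss stays below a threshold on every sample, with both models using the same learning rate of 0.001. A sketch under those assumptions (the 0.5 threshold, the steps_until helper, and the step-counting structure are my own choices, not the book's):

# steps_to_converge.py (my illustration)
import tensorflow as tf

SAMPLES = [([90, 80, 70], 85), ([98, 95, 87], 96)]

def steps_until(loss_threshold, use_softmax, max_steps=20000):
    # return how many passes over SAMPLES are needed until the loss
    # is below loss_threshold on every sample, or None if it never is
    tf.reset_default_graph()
    x = tf.placeholder(shape=[3], dtype=tf.float32)
    yTrain = tf.placeholder(shape=[], dtype=tf.float32)
    w = tf.Variable(tf.zeros([3]) if use_softmax else tf.fill([3], 0.1))
    wn = tf.nn.softmax(w) if use_softmax else w
    loss = tf.abs(tf.reduce_sum(x * wn) - yTrain)
    train = tf.train.RMSPropOptimizer(0.001).minimize(loss)
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        for step in range(max_steps):
            losses = []
            for xs, yt in SAMPLES:
                _, lv = sess.run([train, loss], feed_dict={x: xs, yTrain: yt})
                losses.append(lv)
            if max(losses) < loss_threshold:
                return step + 1
    return None

print("with softmax:   ", steps_until(0.5, use_softmax=True))
print("without softmax:", steps_until(0.5, use_softmax=False))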