Principles and Methods of Network Optimization (4): Optimizers Provided by TensorFlow

Meruz

Column: TensorFlow Deep Learning Algorithm Principles and Programming Practice. Tags: machine learning, optimizers, neural networks

Article link: https://blog.csdn.net/weixin_43002202/article/details/89400781

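Building on the previous section, this section presents several commonly used optimizers that TensorFlow provides and that can be called directly.

Ⅰ. tf.train.Optimizer: the base class for optimizers. This class is rarely used directly; instead you use its subclasses, such as GradientDescentOptimizer and AdagradOptimizer.

Ⅱ. tf.train.GradientDescentOptimizer: the optimizer that implements the gradient descent algorithm.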

__init__(learning_rate, use_locking=False, name='GradientDescent')

Purpose: creates a gradient descent optimizer object.
Parameters:
learning_rate: A Tensor or a floating point value. The learning rate to use.
use_locking: If True, use locks for update operations.
name: Optional name; defaults to "GradientDescent".

compute_gradients(loss, var_list=None, gate_gradients=GATE_OP, aggregation_method=None, colocate_gradients_with_ops=False, grad_loss=None)

Purpose: computes the gradients of the loss with respect to the variables in var_list. It returns a list of (gradient, variable) pairs, where each gradient is the gradient for the corresponding variable. This is the first part of minimize().
Parameters:
loss: The value to minimize.
var_list: The variables to compute gradients for. Defaults to the variables collected under GraphKeys.TRAINABLE_VARIABLES.
gate_gradients: How to gate the computation of gradients. Can be GATE_NONE, GATE_OP, or GATE_GRAPH.
aggregation_method: Specifies the method used to combine gradient terms. Valid values are defined in the class AggregationMethod.
colocate_gradients_with_ops: If True, try colocating gradients with the corresponding op.
grad_loss: Optional. A Tensor holding the gradient computed for loss.

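apply_gradients(grads_and_vars, global_step=None, name=None)

Purpose: applies the gradients to the variables, that is, updates each variable according to the optimizer's rule (for this class, a gradient descent step). This is the second part of minimize(). It returns an operation that applies the gradients.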

minimize(loss, global_step=None, var_list=None, gate_gradients=GATE_OP, aggregation_method=None, colocate_gradients_with_ops=False, name=None, grad_loss=None)

Purpose: a very commonly used method that reduces loss by updating the variables in var_list. It simply combines compute_gradients() and apply_gradients().
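To make the relationship between the three methods concrete, here is a minimal plain-Python sketch that does not use TensorFlow. The function names mirror the methods above, but the signatures, the dictionary of variables, and the hand-derived gradients of the mean squared error are all assumptions made for illustration; this is not how TensorFlow implements them.

```python
# Plain-Python sketch of how minimize() splits into two steps.
# The functions below are illustrative stand-ins, not the TensorFlow API.

def compute_gradients(variables, xs, ys):
    """Return a list of (gradient, variable_name) pairs for the mean squared
    error of the model y = w * x + b."""
    n = len(xs)
    errors = [variables["w"] * x + variables["b"] - y for x, y in zip(xs, ys)]
    grad_w = sum(2.0 * e * x for e, x in zip(errors, xs)) / n
    grad_b = sum(2.0 * e for e in errors) / n
    return [(grad_w, "w"), (grad_b, "b")]


def apply_gradients(variables, grads_and_vars, learning_rate):
    """Move each variable one step against its gradient."""
    for grad, name in grads_and_vars:
        variables[name] -= learning_rate * grad


def minimize(variables, xs, ys, learning_rate):
    """One training step: compute the gradients, then apply them."""
    grads_and_vars = compute_gradients(variables, xs, ys)
    apply_gradients(variables, grads_and_vars, learning_rate)


# Fit y = 0.1 * x + 0.3 starting from arbitrary initial values.
xs = [i / 10.0 for i in range(10)]
ys = [0.1 * x + 0.3 for x in xs]
variables = {"w": -0.5, "b": 0.0}
for _ in range(2000):
    minimize(variables, xs, ys, learning_rate=0.5)
print(variables)
```

Keeping the two steps separate is useful when the gradients need to be inspected or modified (for example, clipped) before they are applied.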

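# Load the tensorflow and numpy modules, and use numpy to create the data.
# Note: this example uses the TensorFlow 1.x API. Under TensorFlow 2.x, replace the
# import below with "import tensorflow.compat.v1 as tf" and call tf.disable_v2_behavior().
import tensorflow as tf
import numpy as np

# Create the data: 100 points on the line y = 0.1 * x + 0.3
x_data = np.random.rand(100).astype(np.float32)
y_data = x_data*0.1 + 0.3

# Use tf.Variable to create the parameters that describe y
Weights = tf.Variable(tf.random_uniform([1], -1.0, 1.0))
biases = tf.Variable(tf.zeros([1]))

# Build the model
y = Weights*x_data + biases

# Compute the error (mean squared error)
loss = tf.reduce_mean(tf.square(y-y_data))

# Gradient descent: use the optimizer to update the parameters
optimizer = tf.train.GradientDescentOptimizer(0.5)
train = optimizer.minimize(loss)

# So far we have only built the structure of the network without running it.
# Before using it, all previously defined Variables must be initialized.
init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)

    for step in range(201):
        sess.run(train)
        # Print the current parameters every 20 steps
        if step % 20 == 0:
            print(step, sess.run(Weights), sess.run(biases))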

Ⅲ. tf.train.AdagradOptimizer

__init__(learning_rate, initial_accumulator_value=0.1, use_locking=False, name='Adagrad')

learning_rate: A Tensor or a floating point value. The learning rate.
initial_accumulator_value: A floating point value. Starting value for the accumulators, must be positive.
use_locking: If True use locks for update operations.
name: Optional name prefix for the operations created when applying gradients. Defaults to "Adagrad".
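The role of initial_accumulator_value is easier to see in the update rule itself. The following is a minimal plain-Python sketch of one Adagrad step on a single scalar variable, assuming the standard form of the algorithm (accumulate squared gradients, then scale the step by the inverse square root of the accumulator). The function adagrad_step and the toy objective f(x) = x^2 are hypothetical and written only for illustration; this is not TensorFlow's code.

```python
import math


def adagrad_step(var, accum, grad, learning_rate):
    """One Adagrad update for a scalar variable.

    accum is the running sum of squared gradients; it starts at
    initial_accumulator_value, which must be positive so that the
    division below is well defined.
    """
    accum = accum + grad * grad
    var = var - learning_rate * grad / math.sqrt(accum)
    return var, accum


# Minimize f(x) = x^2, whose gradient is 2 * x.
x, accum = 1.0, 0.1  # 0.1 plays the role of initial_accumulator_value
for _ in range(100):
    x, accum = adagrad_step(x, accum, 2.0 * x, learning_rate=0.5)
print(x, accum)
```

Because the accumulator only grows, the effective step size for each variable shrinks over time.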

Ⅳ. tf.train.RMSPropOptimizer

__init__(
    learning_rate,
    decay=0.9,
    momentum=0.0,
    epsilon=1e-10,
    use_locking=False,
    centered=False,
    name="RMSProp"
)

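learning_rate: The learning rate.
decay: The decay rate; defaults to 0.9.
epsilon: A small constant for numerical stability; defaults to 1e-10.
use_locking: If True, use locks for update operations; defaults to False.

Apart from tuning the learning rate, it is recommended to keep the optimizer's other parameters at their defaults.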
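As a rough illustration of how decay, momentum, and epsilon interact, here is a plain-Python sketch of one (uncentered) RMSProp step on a scalar variable, following the commonly documented form of the update. The function rmsprop_step, the toy objective f(x) = x^2, and the choice to start the running mean square at 0.0 are assumptions made for this sketch; the library's own initialization and implementation may differ.

```python
import math


def rmsprop_step(var, mean_square, mom, grad, learning_rate,
                 decay=0.9, momentum=0.0, epsilon=1e-10):
    """One uncentered RMSProp update for a scalar variable."""
    # Exponentially decaying average of squared gradients.
    mean_square = decay * mean_square + (1.0 - decay) * grad * grad
    # Step scaled by the root of that average; epsilon avoids division by zero.
    mom = momentum * mom + learning_rate * grad / math.sqrt(mean_square + epsilon)
    var = var - mom
    return var, mean_square, mom


# Minimize f(x) = x^2, whose gradient is 2 * x.
x, mean_square, mom = 1.0, 0.0, 0.0
for _ in range(300):
    x, mean_square, mom = rmsprop_step(x, mean_square, mom, 2.0 * x,
                                       learning_rate=0.01)
print(x)
```

Since the gradient is divided by its own recent magnitude, the step length is governed mostly by learning_rate, which is consistent with the advice to tune the learning rate and leave the other parameters alone.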

Ⅴ. tf.train.AdamOptimizer

__init__(
    learning_rate=0.001,
    beta1=0.9,
    beta2=0.999,
    epsilon=1e-08,
    use_locking=False,
    name='Adam'
)

learning_rate: A Tensor or a floating point value. The learning rate.
beta1: A float value or a constant float tensor. The exponential decay rate for the 1st moment estimates.
beta2: A float value or a constant float tensor. The exponential decay rate for the 2nd moment estimates.
epsilon: A small constant for numerical stability. This epsilon is "epsilon hat" in the Kingma and Ba paper (in the formula just before Section 2.1), not the epsilon in Algorithm 1 of the paper.
use_locking: If True, use locks for update operations.
name: Optional name for the operations created when applying gradients. Defaults to "Adam".
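The parameters above map onto the Adam update as follows. This is a minimal plain-Python sketch of one step on a scalar variable, written in the "epsilon hat" form mentioned in the epsilon description, where the bias correction is folded into the step size. The function adam_step and the toy objective f(x) = x^2 are hypothetical and serve only as an illustration, not as TensorFlow's implementation.

```python
import math


def adam_step(var, m, v, t, grad, learning_rate=0.001,
              beta1=0.9, beta2=0.999, epsilon=1e-08):
    """One Adam update for a scalar variable ("epsilon hat" form)."""
    t = t + 1
    # Exponentially decaying estimates of the 1st and 2nd moments of the gradient.
    m = beta1 * m + (1.0 - beta1) * grad
    v = beta2 * v + (1.0 - beta2) * grad * grad
    # Bias correction folded into the step size.
    lr_t = learning_rate * math.sqrt(1.0 - beta2 ** t) / (1.0 - beta1 ** t)
    var = var - lr_t * m / (math.sqrt(v) + epsilon)
    return var, m, v, t


# Minimize f(x) = x^2, whose gradient is 2 * x.
x, m, v, t = 1.0, 0.0, 0.0, 0
for _ in range(500):
    x, m, v, t = adam_step(x, m, v, t, 2.0 * x, learning_rate=0.1)
print(x)
```

On the very first step the two moment estimates cancel, so the variable moves by almost exactly learning_rate regardless of the gradient's magnitude.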

Comments and corrections are welcome!
