DQN

Latest recommended article published 2023-04-26 19:43:44

weixin_41012946

Tags: DRL

Article link: https://blog.csdn.net/weixin_41012946/article/details/97813321

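1. tf.summary.FileWriter
Specifies the log directory where the graph is saved (as an event file).
Format: tf.summary.FileWriter(path, sess.graph)
Its add_summary() method can be called to save data from the training process to the location given to the FileWriter.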

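2. In TensorFlow (1.x API), defining a variable and initializing it are separate steps, and every assignment or computation involving graph variables has to go through the run method of tf.Session. To initialize all graph variables at once, use tf.global_variables_initializer. A Session is the object TensorFlow uses to control execution and produce outputs: sess.run() executes a chosen part of the graph and returns the result of the operation you asked for.
When training our own neural network, we almost always add the line sess.run(tf.global_variables_initializer()). According to the official description, it initializes the model's parameters, that is, all global variables.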

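3. os.path.isfile()
Checks whether a path refers to an existing regular file. The path may be absolute or relative; a relative path is resolved against the current working directory, so passing an absolute path is the safer choice.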
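As a small illustration of the check above, the sketch below creates a temporary file and tests a few paths. The file name replay_buffer.pickle is only an assumption made for the demo, not a name taken from the original project.

```python
import os
import tempfile

# Create a throwaway directory with one file so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp_dir:
    file_path = os.path.join(tmp_dir, "replay_buffer.pickle")  # hypothetical name
    with open(file_path, "wb") as f:
        f.write(b"demo")

    is_file = os.path.isfile(file_path)                        # existing regular file
    dir_is_file = os.path.isfile(tmp_dir)                      # a directory is not a file
    missing_is_file = os.path.isfile(file_path + ".missing")   # path does not exist

    # Relative paths work too; they are resolved against the current directory.
    old_cwd = os.getcwd()
    os.chdir(tmp_dir)
    try:
        relative_is_file = os.path.isfile("replay_buffer.pickle")
    finally:
        os.chdir(old_cwd)

print(is_file, dir_is_file, missing_is_file, relative_is_file)
```

A check like this is typically placed before loading a saved buffer, so that a missing file can be handled without raising an exception.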

4.replay_buffer.load(preliminary_replay_buffer_path):
Load the buffer from a file.
@param file_name

5.replay_buffer.return_size_byte()
Return the number of bytes occupied by the buffer in memory
@param value a string representing the type of value
it can be: byte, kilobyte, megabyte, gigabyte.
@return an integer representing the number of bytes

6.replay_buffer.return_size()
Return the number of elements inside the buffer
@return an integer representing the number of elements

7.send_action(action):
Send an action to the UAV.

8.r = rospy.Rate(10)
rospy.sleep(1.0)
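The sleep method of the Rate class is mainly used to keep a loop running at a fixed frequency: it takes into account the time elapsed since the previous sleep, so the loop as a whole holds the specified rate (10 Hz here). By contrast, rospy.sleep(1.0) simply pauses for a fixed 1.0 second.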
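The idea of sleeping only for the remainder of each period can be sketched in plain Python. FixedRate and FakeClock below are hypothetical stand-ins written for this illustration; they are not rospy's implementation, and the fake clock is used only so that the result does not depend on real timing.

```python
import time


class FixedRate:
    """Hypothetical stand-in showing the idea behind a fixed-rate sleep."""

    def __init__(self, hz, clock=time.monotonic, sleeper=time.sleep):
        self.period = 1.0 / hz
        self.clock = clock
        self.sleeper = sleeper
        self.last_time = clock()

    def sleep(self):
        # Sleep only for what is left of the period after the loop body ran.
        wait = self.period - (self.clock() - self.last_time)
        if wait > 0:
            self.sleeper(wait)
        # Move the deadline forward by one period to avoid accumulating drift.
        self.last_time += self.period
        # If the loop overran by more than a full period, resynchronise.
        if self.clock() - self.last_time > self.period:
            self.last_time = self.clock()


class FakeClock:
    """Deterministic clock so the demo does not depend on real time."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now

    def advance(self, seconds):
        self.now += seconds


clock = FakeClock()
slept = []


def fake_sleep(seconds):
    slept.append(round(seconds, 3))
    clock.advance(seconds)


rate = FixedRate(10, clock=clock, sleeper=fake_sleep)  # 10 Hz, period 0.1 s
for work_time in (0.03, 0.07):
    clock.advance(work_time)  # pretend the loop body took this long
    rate.sleep()

print(slept)  # [0.07, 0.03]
```

Each iteration takes 0.1 s in total regardless of how long the loop body ran, which is what separates a rate-based sleep from a fixed-duration one.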

9.np.stack([image_t] * images_stack_size, axis=2)
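When the replay buffer is empty, the same image is repeated images_stack_size times (4 here) to build the initial stack of images.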
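A minimal sketch of this stacking step follows. The 84x84 grayscale frame is an assumption chosen for illustration; the original post does not state the image size.

```python
import numpy as np

images_stack_size = 4
# Assumed frame size: 84x84, single channel (not stated in the original post).
image_t = np.full((84, 84), 7, dtype=np.uint8)

# Repeat the same frame along a new last axis to create the initial stack.
image_stack = np.stack([image_t] * images_stack_size, axis=2)

print(image_stack.shape)  # (84, 84, 4)
```

Because axis=2 places the new axis last, each slice image_stack[:, :, k] is one copy of the original frame.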

10.get_random_action()
Choose a random action for the UAV.

11.get_done_reward()
Get the done status and the reward after completing an action.
@return resp contains the reward and the done status

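12. Usage of numpy.expand_dims(a, axis)
It inserts a new axis of length 1 at position axis; the original data sits at index 0 along that new axis.
For example, for a 1-D array with 2 elements, axis=0 gives shape (1, 2) and axis=1 gives shape (2, 1).
Likewise, for an array of shape (2, 3), axis=0 gives shape (1, 2, 3) and axis=1 gives shape (2, 1, 3).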
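The shapes listed above can be checked directly. The variable names below are chosen only for this illustration.

```python
import numpy as np

a = np.array([1, 2])   # shape (2,)
b = np.zeros((2, 3))   # shape (2, 3)

a_axis0 = np.expand_dims(a, axis=0)  # shape (1, 2)
a_axis1 = np.expand_dims(a, axis=1)  # shape (2, 1)
b_axis0 = np.expand_dims(b, axis=0)  # shape (1, 2, 3)
b_axis1 = np.expand_dims(b, axis=1)  # shape (2, 1, 3)

print(a_axis0.shape, a_axis1.shape, b_axis0.shape, b_axis1.shape)
```

A common use in this setting is adding a leading batch axis with axis=0 before passing a single observation to a network that expects batched input.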

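13. The function np.append(arr, values, axis=None)
Purpose: appends values to the original array and returns the result as a new array.
Parameters:
arr: the array to which values are appended
values: the values to append to arr (array_like)
axis: optional. If axis is not given, arr and values are both flattened to 1-D first. Note: if axis is given, arr and values must have the same number of dimensions, and their shapes must match on every axis except the one being appended along; otherwise an error is raised: ValueError: arrays must have same number of dimensions
More on understanding axis
The maximum value of axis is the number of dimensions of arr minus 1: if arr is 1-D the maximum axis is 0, if arr is 2-D the maximum axis is 1, and so on.
When arr is 2-D (think of a single-channel image), axis=0 appends values as new rows and axis=1 appends values as new columns.
When arr is 3-D (think of a multi-channel image), axis=0 and axis=1 behave as above, and axis=2 appends values along the depth (channel) direction.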
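The cases described above can be sketched as follows. The last part, which drops the oldest frame and appends a new one along the depth axis, is an assumed usage pattern for a frame stack and is not taken from the original code.

```python
import numpy as np

arr = np.array([[1, 2], [3, 4]])

flat = np.append(arr, [5, 6])               # no axis: both inputs are flattened first
rows = np.append(arr, [[5, 6]], axis=0)     # appends one row, shape (3, 2)
cols = np.append(arr, [[5], [6]], axis=1)   # appends one column, shape (2, 3)

# A 1-D values array against a 2-D arr fails when axis is given.
try:
    np.append(arr, [5, 6], axis=0)
    dimension_error = False
except ValueError:
    dimension_error = True

# Assumed usage: slide a stack of 4 frames by dropping the oldest one
# and appending the newest along the depth axis.
stack = np.zeros((2, 2, 4))
new_frame = np.ones((2, 2, 1))
stack = np.append(stack[:, :, 1:], new_frame, axis=2)

print(flat, rows.shape, cols.shape, dimension_error, stack.shape)
```

Note that np.append always returns a new array; arr itself is left unchanged.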

14.replay_buffer.add_experience( image_t, action, reward, image_t1, done)
Add a new experience to the buffer. The components of the experience are stored as a tuple (instead of a list) because tuples are immutable, which saves memory.
@param image_t the image at time t
@param action_t taken at time t
@param reward_t obtained at time t
@param image_t1 at time t+1
@param done_t1 boolean indicating if t+1 is terminal

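1. Class ExperienceReplayBuffer implements an experience replay FIFO buffer for DQN and provides methods for adding and retrieving experiences. It is based on the Python deque class. At initialization you define the buffer capacity, along with the image shape and the number of stored frames (typically 3-5). @param capacity an integer specifying the size of the buffer.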
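To make the FIFO behaviour concrete, here is a minimal sketch of a deque-based buffer. SketchReplayBuffer and its sample method are hypothetical and written only for this illustration; the real ExperienceReplayBuffer class may differ in its constructor arguments and internals.

```python
import random
from collections import deque


class SketchReplayBuffer:
    """Hypothetical, simplified stand-in for a deque-based replay buffer."""

    def __init__(self, capacity):
        # With maxlen set, the deque silently drops the oldest item when full.
        self.buffer = deque(maxlen=capacity)

    def add_experience(self, image_t, action_t, reward_t, image_t1, done_t1):
        # Store each experience as an immutable tuple.
        self.buffer.append((image_t, action_t, reward_t, image_t1, done_t1))

    def return_size(self):
        return len(self.buffer)

    def sample(self, batch_size):
        # Hypothetical helper: uniform random mini-batch without replacement.
        return random.sample(list(self.buffer), batch_size)


replay_buffer = SketchReplayBuffer(capacity=3)
for step in range(5):
    # Strings stand in for images; the action is simply the step index here.
    replay_buffer.add_experience(f"img_{step}", step, 0.0, f"img_{step + 1}", False)

size = replay_buffer.return_size()
oldest_action = replay_buffer.buffer[0][1]
batch = replay_buffer.sample(2)

print(size, oldest_action, len(batch))  # 3 2 2
```

After five insertions into a buffer of capacity 3, only the last three experiences remain, which is the first-in, first-out behaviour described above.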

2.rospy.Subscriber("/quadrotor/ardrone/bottom/ardrone/bottom/image_raw", ROSImage, image_callback,queue_size=30) # Store the last 30 messages before discarding them

3.Load Neural Networks weights from memory if a valid checkpoint path is passed
