self.s = tf.placeholder(tf.float32, [None, N_S], 'S')
Dimension of state s: N_S
self.v_target = tf.placeholder(tf.float32, [None, 1], 'Vtarget')
Dimension of the value: 1
State:
self.s = tf.placeholder(tf.float32, [1, n_features], "state")
Shape of state s: 1 x n_features
In CartPole the state is four-dimensional (n_features = 4).
Function signature:
tf.placeholder(
    dtype,
    shape=None,
    name=None
)
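Parameters:
dtype: the data type of the elements to be fed. Usually a numeric type such as tf.float32 or tf.float64.
shape: the shape of the tensor to be fed (optional).
The default is None, which leaves the shape unconstrained, so a tensor of any shape can be fed;
a multi-dimensional shape can also be given, for example [2, 3];
[None, 3] means the number of rows is not fixed (any number of rows is accepted) and the number of columns is 3.
name: an optional name for the operation.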
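The shape rules above can be sketched in plain Python. This is only an illustration of the matching logic, not TensorFlow's own code: the helper name is_compatible is hypothetical, and it assumes a declared shape is either None or a list whose entries are integers or None.

```python
def is_compatible(declared, actual):
    """Return True if a value of shape `actual` fits a placeholder
    declared with shape `declared` (hypothetical helper, not a TF API)."""
    # shape=None: nothing is constrained, any shape is accepted.
    if declared is None:
        return True
    # The number of dimensions (rank) must match.
    if len(declared) != len(actual):
        return False
    # A None entry matches any size; an integer entry must match exactly.
    return all(d is None or d == a for d, a in zip(declared, actual))


print(is_compatible([None, 3], (5, 3)))   # any number of rows, 3 columns
print(is_compatible([None, 3], (5, 4)))   # wrong number of columns
print(is_compatible(None, (2, 3)))        # unconstrained placeholder
print(is_compatible([None], (7,)))        # 1-D, any length
```

Note that [None, 3] still requires a two-dimensional value: a 1-D array of length 3 does not match it.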
Examples:
Variable number of rows (None), S_DIM columns: shape [None, S_DIM]
Variable number of rows (None), 1 column: shape [None, 1]
Variable number of rows (None), N_S columns: shape [None, N_S]
self.s = tf.placeholder(tf.float32, [None, N_S], 'S')
self.a_his = tf.placeholder(tf.int32, [None, ], 'A')
self.v_target = tf.placeholder(tf.float32, [None, 1], 'Vtarget')
self.s = tf.placeholder(tf.float32, [None, self.n_features], name='s') # input
self.q_target = tf.placeholder(tf.float32, [None, self.n_actions], name='Q_target')
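A placeholder only holds a value once it is fed through feed_dict when the graph is run. The following is a minimal sketch of that step, under some assumptions: it goes through tf.compat.v1 so that it can also run on TensorFlow 2.x, N_S = 4 is an assumed value (the CartPole state size), and the all-ones arrays are dummy data used only to show that the same placeholder accepts different batch sizes.

```python
import numpy as np
import tensorflow as tf

tf1 = tf.compat.v1   # TF 1.x style API, also available in TF 2.x

N_S = 4              # assumed state dimension (CartPole)

graph = tf.Graph()
with graph.as_default():
    # Rows are left open (None); columns are fixed to N_S.
    s = tf1.placeholder(tf.float32, [None, N_S], 'S')
    # A simple op that depends on the placeholder: sum over each row.
    row_sum = tf.reduce_sum(s, axis=1)

with tf1.Session(graph=graph) as sess:
    # Feed a batch of 1 state, then a batch of 5 states.
    out_one = sess.run(row_sum, feed_dict={s: np.ones((1, N_S), dtype=np.float32)})
    out_five = sess.run(row_sum, feed_dict={s: np.ones((5, N_S), dtype=np.float32)})

print(s.shape.as_list())   # static shape, with None for the open dimension
print(out_one.shape, out_five.shape)
```

The static shape keeps None for the batch dimension, while each run produces an output whose length follows the batch that was fed.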