tensorlayer learning log 17_chapter7_7.2

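Installing gym[atari] failed with an error saying the files could not be built with make. Installation went smoothly on Mac and Linux, but on Windows I only got it working with this specific command: `pip install --no-index -f https://github.com/Kojoley/atari-py/releases atari_py`. After running the script I learned that it takes about 20000 episodes of training before any effect shows.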

Pong from Chapter 7~~

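import time
import gym
import numpy as np
import tensorflow as tf
import tensorlayer as tl
from tensorlayer.layers import InputLayer, DenseLayer

def prepro(I):
    """Turn a 210x160x3 Pong frame into a flat 6400-element (80x80) 0/1 vector."""
    I = I[35:195]           # crop to the playing field
    I = I[::2, ::2, 0]      # downsample by 2 and keep only the first colour channel
    I[I == 144] = 0         # erase background (type 1)
    I[I == 109] = 0         # erase background (type 2)
    I[I != 0] = 1           # everything left (paddles, ball) becomes 1
    # np.float was removed in NumPy 1.24; float32 matches the placeholder below
    return I.astype(np.float32).ravel()

# Policy network: 6400 inputs -> 200 ReLU units -> 3 action logits
image_size = 80
D = image_size * image_size
t_states = tf.placeholder(tf.float32, shape=[None, D])
network = InputLayer(t_states, name='input')
network = DenseLayer(network, n_units=200, act=tf.nn.relu, name='hidden')
network = DenseLayer(network, n_units=3, name='output')
probs = network.outputs                 # despite the name, these are unnormalized logits
sampling_prob = tf.nn.softmax(probs)    # action probabilities used for sampling

# Hyperparameters
batch_size = 10          # episodes per parameter update
learning_rate = 1e-4
gamma = 0.99             # reward discount factor
decay_rate = 0.99        # RMSProp decay
render = False
# resume = True
model_file_name = "model_pong72"

# Policy-gradient loss: cross-entropy of the taken actions, weighted by discounted rewards
t_actions = tf.placeholder(tf.int32, shape=[None])
t_discount_rewards = tf.placeholder(tf.float32, shape=[None])
loss = tl.rein.cross_entropy_reward_loss(probs, t_actions, t_discount_rewards)
train_op = tf.train.RMSPropOptimizer(learning_rate, decay=decay_rate).minimize(loss)

# np.set_printoptions(threshold=np.nan)
env = gym.make("Pong-v0")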
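
The code above stops before the training loop, so it never shows what gets fed into `t_discount_rewards`. Below is a rough NumPy-only sketch of how the per-step rewards of one episode could be turned into discounted returns using `gamma`. The function names `discount_rewards_sketch` and `standardize` are my own illustrations, not part of the original script, and resetting the running sum whenever a point is scored is an assumption based on how Pong rewards work (+1 or -1 at the end of each rally, 0 otherwise). As far as I know TensorLayer has its own helper for this in `tl.rein`, so treat this only as a stand-in for understanding the idea.

```python
import numpy as np

def discount_rewards_sketch(rewards, gamma=0.99):
    """Hypothetical helper: discounted return for every step of one episode.

    Assumption: a nonzero reward marks the end of a rally in Pong, so the
    running sum is reset there and credit does not leak across rallies.
    """
    rewards = np.asarray(rewards, dtype=np.float32)
    discounted = np.zeros_like(rewards)
    running = 0.0
    # Walk backwards so each step accumulates the rewards that follow it.
    for t in reversed(range(len(rewards))):
        if rewards[t] != 0:
            running = 0.0
        running = running * gamma + rewards[t]
        discounted[t] = running
    return discounted

def standardize(x):
    """Hypothetical helper: shift to zero mean and scale to unit std."""
    x = x - x.mean()
    std = x.std()
    return x / std if std > 0 else x

# Toy episode: we win the first rally and lose the second one.
rewards = [0, 0, 1, 0, 0, -1]
returns = discount_rewards_sketch(rewards, gamma=0.99)
normalized = standardize(returns)
print(returns)
print(normalized)
```

Steps closer to a scored point get a return of larger magnitude, and standardizing the values is a common trick to reduce the variance of the gradient. An array like `normalized` is the kind of thing that would be passed to `t_discount_rewards` together with the recorded states and actions when running `train_op`.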