xiting123 - CSDN Blog

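Repost: Understanding Q-learning and deep Q-learning (DQN)

Q-learning: Based on the estimates in the Q table, a2 has the larger value in s1, so following the decision rule described earlier we take a2 in s1 and arrive at s2. At this point we begin updating the Q table used for decisions. We do not actually take any action yet; instead, we imagine taking each action in s2 and compare their Q values. Suppose Q(s2, a2) is larger than Q(s2, a1): we then multiply the larger value Q(s2, a2) by a decay value gamma (for example 0.9) and add the reward obtained on reaching s2 ...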

2021-12-28 15:44:38 1211
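The update step described in the Q-learning excerpt above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the Q-table values, the reward of 0, the learning rate `ALPHA`, and the helper name `q_learning_update` are all hypothetical, and the final blending step uses the standard tabular Q-learning rule because the excerpt is cut off before it reaches that point. Only gamma = 0.9 and the "take the larger Q value in s2" step come from the excerpt.

```python
GAMMA = 0.9  # decay value mentioned in the excerpt
ALPHA = 0.1  # learning rate: an assumption, not given in the excerpt


def q_learning_update(q, state, action, reward, next_state,
                      alpha=ALPHA, gamma=GAMMA):
    """Apply one tabular Q-learning update in place and return the target."""
    # Imagine every action in the next state and keep the largest Q value.
    best_next = max(q[next_state].values())
    # Discount it by gamma and add the reward received on arrival.
    target = reward + gamma * best_next
    # Standard Q-learning step: move the old estimate toward the target.
    q[state][action] += alpha * (target - q[state][action])
    return target


# Hypothetical Q table with two states and two actions (illustrative numbers).
q_table = {
    "s1": {"a1": -2.0, "a2": 1.0},
    "s2": {"a1": -4.0, "a2": 2.0},
}

# Greedy choice in s1: a2 has the larger value, as in the excerpt.
action = max(q_table["s1"], key=q_table["s1"].get)
target = q_learning_update(q_table, "s1", action, reward=0.0, next_state="s2")
print(action, target, q_table["s1"][action])
```

Note that the `max` over the actions in s2 is only used to build the target; no action is actually taken in s2 during the update, which matches the "imagine" step in the excerpt.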

Original: Warning when verifying the Python environment: Warning: This Python interpreter is in a conda environment, but the environment has not

The following warning appears: Warning: This Python interpreter is in a conda environment, but the environment has not been activated. Libraries may fail to load. To activate this environment please see https://conda.io/activation Step 1: find the Python installation path conda info ...

2021-12-04 21:25:50 1020
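As a rough companion to the conda warning excerpt above, the following Python sketch prints the interpreter path and guesses whether the interpreter sits in a conda environment that has not been activated. It is a heuristic under stated assumptions, not necessarily the check the interpreter itself performs: the helper name `conda_env_status` is hypothetical, and it relies on conda environments containing a `conda-meta` directory and on `conda activate` exporting the `CONDA_PREFIX` variable.

```python
import os
import sys


def conda_env_status(prefix=sys.prefix, environ=os.environ):
    """Return (in_conda_env, activated) for the interpreter rooted at prefix."""
    # conda environments keep their package metadata in a conda-meta directory.
    in_conda_env = os.path.isdir(os.path.join(prefix, "conda-meta"))
    # `conda activate` exports CONDA_PREFIX pointing at the active environment.
    active_prefix = environ.get("CONDA_PREFIX")
    activated = (
        in_conda_env
        and active_prefix is not None
        and os.path.normcase(os.path.abspath(active_prefix))
        == os.path.normcase(os.path.abspath(prefix))
    )
    return in_conda_env, activated


print("interpreter:", sys.executable)
print("(in conda env, activated):", conda_env_status())
```

If the first value is true and the second is false, the situation resembles the one in the warning, and activating the environment as described at https://conda.io/activation is the likely remedy.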
