1. Reproduction target
1.1 Related links
1.2 Simulation environment: lunarlander
action:
a[0] do nothing
a[1] fire the left engine
a[2] fire the main engine
a[3] fire the right engine
state:
s[0] is the horizontal coordinate
s[1] is the vertical coordinate
s[2] is the horizontal speed
s[3] is the vertical speed
s[4] is the angle
s[5] is the angular speed
s[6] 1 if first leg has contact, else 0
s[7] 1 if second leg has contact, else 0
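The action and state indices listed above can be given readable names. The following is a minimal sketch in plain Python; the names `LunarAction`, `STATE_FIELDS`, and `label_state` are hypothetical and not taken from the reproduced code, and the numbers in the example state are made up for illustration. Note that the logs below use a single `landed_legs` variable; how it is derived from s[6] and s[7] is not shown in these notes, so the sketch keeps the two leg flags separate.

```python
from enum import IntEnum


class LunarAction(IntEnum):
    """Discrete actions a[0]..a[3]; member names are illustrative only."""
    NOOP = 0
    LEFT_ENGINE = 1
    MAIN_ENGINE = 2
    RIGHT_ENGINE = 3


# Hypothetical labels for s[0]..s[7], in the order given in the state list.
STATE_FIELDS = (
    "x", "y", "vx", "vy", "angle", "v_angle", "leg1_contact", "leg2_contact",
)


def label_state(s):
    """Return a dict that maps each field name to the matching entry of s."""
    if len(s) != len(STATE_FIELDS):
        raise ValueError(f"expected {len(STATE_FIELDS)} values, got {len(s)}")
    return dict(zip(STATE_FIELDS, s))


# Made-up observation, only to show the mapping.
example = label_state([0.0, 1.4, 0.1, -0.5, 0.02, 0.0, 0, 0])
print(example["vy"], LunarAction(2).name)
```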
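1.3 Simulation environment: cartpole
action:
push the cart to the left
push the cart to the right
state:
horizontal position of the cart
instantaneous velocity of the cart
angle between the pole and the vertical
angular velocity of the pole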
2. lunarlander reproduction results
The run output is as follows:
2.1 Step 9
starting causal discovery
(angle') caused by (v_angle) with assurrance 0.99644
(angle') caused by (landed_legs) with assurrance 0.66270
(angle') caused by (engine) with assurrance 0.67312
(angle') caused by (angle) with assurrance 0.99992
(angle') caused by (vx) with assurrance 0.27607
(angle') caused by (vy) with assurrance 0.65637
(angle') caused by (x) with assurrance 0.83167
(angle') caused by (y) with assurrance 0.96726
(crash) caused by (v_angle) with assurrance 0.59318
(crash) caused by (landed_legs) with assurrance 0.99256
(crash) caused by (engine) with assurrance 0.08763
(crash) caused by (angle) with assurrance 0.85457
(crash) caused by (vy) with assurrance 0.91271
(crash) caused by (vx) with assurrance 0.58051
(crash) caused by (x) with assurrance 0.96912
(fuel_cost) caused by (angle) with assurrance 0.00000
(crash) caused by (y) with assurrance 0.66427
(fuel_cost) caused by (engine) with assurrance 1.00000
(fuel_cost) caused by (landed_legs) with assurrance 0.00000
(fuel_cost) caused by (vx) with assurrance 0.00000
(fuel_cost) caused by (v_angle) with assurrance 0.00000
(fuel_cost) caused by (vy) with assurrance 0.00000
(fuel_cost) caused by (x) with assurrance 0.00000
(fuel_cost) caused by (y) with assurrance 0.00000
(landed_legs') caused by (angle) with assurrance 0.69358
(landed_legs') caused by (engine) with assurrance 0.88777
(landed_legs') caused by (landed_legs) with assurrance 0.99966
(landed_legs') caused by (vx) with assurrance 0.85345
(landed_legs') caused by (v_angle) with assurrance 0.43689
(rest) caused by (angle) with assurrance 0.00000
(rest) caused by (engine) with assurrance 0.00000
(rest) caused by (landed_legs) with assurrance 0.00000
(rest) caused by (v_angle) with assurrance 0.00000
(rest) caused by (vx) with assurrance 0.00000
(rest) caused by (vy) with assurrance 0.00000
(landed_legs') caused by (vy) with assurrance 0.93282
(rest) caused by (x) with assurrance 0.00000
(rest) caused by (y) with assurrance 0.00000
(landed_legs') caused by (x) with assurrance 0.05350
(landed_legs') caused by (y) with assurrance 0.77181
(v_angle') caused by (angle) with assurrance 0.76077
(v_angle') caused by (engine) with assurrance 0.96332
(v_angle') caused by (landed_legs) with assurrance 0.99181
(v_angle') caused by (v_angle) with assurrance 0.99550
(v_angle') caused by (vx) with assurrance 0.16669
(v_angle') caused by (vy) with assurrance 0.85722
(v_angle') caused by (x) with assurrance 0.08066
(v_angle') caused by (y) with assurrance 0.08966
(vx') caused by (angle) with assurrance 0.92614
(vx') caused by (engine) with assurrance 0.36020
(vx') caused by (landed_legs) with assurrance 0.87006
(vx') caused by (v_angle) with assurrance 0.08313
(vx') caused by (vx) with assurrance 0.99999
(vx') caused by (vy) with assurrance 0.47437
(vx') caused by (y) with assurrance 0.05503
(vx') caused by (x) with assurrance 0.88342
(vy') caused by (angle) with assurrance 0.85471
(vy') caused by (engine) with assurrance 0.99994
(vy') caused by (landed_legs) with assurrance 0.99996
(vy') caused by (v_angle) with assurrance 0.04239
(vy') caused by (vx) with assurrance 0.26754
(vy') caused by (vy) with assurrance 1.00000
(vy') caused by (x) with assurrance 0.21860
(vy') caused by (y) with assurrance 0.92468
(x') caused by (angle) with assurrance 0.68152
(x') caused by (engine) with assurrance 0.00171
(x') caused by (v_angle) with assurrance 0.36346
(x') caused by (landed_legs) with assurrance 0.72334
(x') caused by (vx) with assurrance 1.00000
(x') caused by (vy) with assurrance 0.06414
(x') caused by (y) with assurrance 0.06903
(y') caused by (angle) with assurrance 0.21156
(x') caused by (x) with assurrance 1.00000
(y') caused by (engine) with assurrance 0.19189
(y') caused by (v_angle) with assurrance 0.92250
(y') caused by (vx) with assurrance 0.36926
(y') caused by (landed_legs) with assurrance 0.97075
(y') caused by (vy) with assurrance 0.99999
(y') caused by (x) with assurrance 0.15692
(y') caused by (y) with assurrance 1.00000
-------------------discovered-causal-graph---------------------
(angle, y, x, v_angle) --> angle'
(angle, x, vy, landed_legs) --> crash
(engine) --> fuel_cost
(vy, vx, engine, landed_legs) --> landed_legs'
() --> rest
(v_angle, vy, engine, landed_legs) --> v_angle'
(angle, vx, x, landed_legs) --> vx'
(vy, angle, y, engine, landed_legs) --> vy'
(vx, x) --> x'
(v_angle, vy, y, landed_legs) --> y'
---------------------------------------------------------------
2.2 Step 99
About 6 h had elapsed up to this point.
starting causal discovery
(angle') caused by (v_angle) with assurrance 1.00000
(angle') caused by (engine) with assurrance 0.91401
(angle') caused by (landed_legs) with assurrance 0.91566
(angle') caused by (angle) with assurrance 1.00000
(angle') caused by (vx) with assurrance 0.84761
(angle') caused by (vy) with assurrance 0.79875
(angle') caused by (x) with assurrance 0.85814
(angle') caused by (y) with assurrance 0.81510
(crash) caused by (angle) with assurrance 0.09747
(crash) caused by (engine) with assurrance 0.63105
(crash) caused by (landed_legs) with assurrance 0.99990
(crash) caused by (v_angle) with assurrance 0.82667
(crash) caused by (vx) with assurrance 0.40407
(crash) caused by (vy) with assurrance 1.00000
(crash) caused by (x) with assurrance 0.99757
(crash) caused by (y) with assurrance 0.12045
(fuel_cost) caused by (angle) with assurrance 0.00000
(fuel_cost) caused by (engine) with assurrance 1.00000
(fuel_cost) caused by (v_angle) with assurrance 0.00000
(fuel_cost) caused by (landed_legs) with assurrance 0.00000
(fuel_cost) caused by (vx) with assurrance 0.00000
(fuel_cost) caused by (vy) with assurrance 0.00000
(fuel_cost) caused by (x) with assurrance 0.00000
(fuel_cost) caused by (y) with assurrance 0.00000
(landed_legs') caused by (angle) with assurrance 0.87897
(landed_legs') caused by (v_angle) with assurrance 0.25252
(landed_legs') caused by (vx) with assurrance 0.15359
(landed_legs') caused by (engine) with assurrance 0.77116
(landed_legs') caused by (landed_legs) with assurrance 1.00000
(rest) caused by (angle) with assurrance 0.00000
(rest) caused by (engine) with assurrance 0.00000
(landed_legs') caused by (x) with assurrance 0.30346
(landed_legs') caused by (vy) with assurrance 0.07573
(rest) caused by (landed_legs) with assurrance 0.96466
(rest) caused by (v_angle) with assurrance 0.00000
(rest) caused by (vx) with assurrance 0.90287
(rest) caused by (vy) with assurrance 0.00000
(rest) caused by (x) with assurrance 0.00000
(rest) caused by (y) with assurrance 0.00000
(landed_legs') caused by (y) with assurrance 0.99688
(v_angle') caused by (angle) with assurrance 0.86793
(v_angle') caused by (landed_legs) with assurrance 0.99553
(v_angle') caused by (engine) with assurrance 0.95110
(v_angle') caused by (v_angle) with assurrance 1.00000
(v_angle') caused by (vx) with assurrance 0.68554
(v_angle') caused by (vy) with assurrance 0.02769
(v_angle') caused by (x) with assurrance 0.79948
(v_angle') caused by (y) with assurrance 0.03491
(vx') caused by (angle) with assurrance 0.99994
(vx') caused by (engine) with assurrance 0.99999
(vx') caused by (v_angle) with assurrance 0.05110
(vx') caused by (landed_legs) with assurrance 0.26498
(vx') caused by (vx) with assurrance 0.99999
(vx') caused by (vy) with assurrance 0.32017
(vx') caused by (x) with assurrance 0.78903
(vx') caused by (y) with assurrance 0.02321
(vy') caused by (angle) with assurrance 0.72637
(vy') caused by (engine) with assurrance 0.99999
(vy') caused by (v_angle) with assurrance 0.29209
(vy') caused by (landed_legs) with assurrance 0.99994
(vy') caused by (vx) with assurrance 0.83871
(vy') caused by (x) with assurrance 0.05038
(vy') caused by (y) with assurrance 0.58507
(vy') caused by (vy) with assurrance 1.00000
(x') caused by (angle) with assurrance 0.00393
(x') caused by (engine) with assurrance 0.23336
(x') caused by (v_angle) with assurrance 0.00037
(x') caused by (landed_legs) with assurrance 0.50181
(x') caused by (vx) with assurrance 1.00000
(x') caused by (vy) with assurrance 0.00177
(x') caused by (y) with assurrance 0.00005
(x') caused by (x) with assurrance 1.00000
(y') caused by (angle) with assurrance 0.23687
(y') caused by (engine) with assurrance 0.98094
(y') caused by (v_angle) with assurrance 0.11466
(y') caused by (landed_legs) with assurrance 0.96841
(y') caused by (vx) with assurrance 0.02891
(y') caused by (vy) with assurrance 1.00000
(y') caused by (x) with assurrance 0.13967
(y') caused by (y) with assurrance 1.00000
-------------------discovered-causal-graph---------------------
(x, landed_legs, angle, y, engine, vx, v_angle) --> angle'
(v_angle, vy, x, landed_legs) --> crash
(engine) --> fuel_cost
(angle, y, landed_legs) --> landed_legs'
(vx, landed_legs) --> rest
(v_angle, angle, engine, landed_legs) --> v_angle'
(angle, vx, engine) --> vx'
(vy, vx, engine, landed_legs) --> vy'
(vx, x) --> x'
(vy, y, engine, landed_legs) --> y'
---------------------------------------------------------------
2.3 Step 198
starting causal discovery
(angle') caused by (v_angle) with assurrance 1.00000
(angle') caused by (landed_legs) with assurrance 0.60837
(angle') caused by (engine) with assurrance 0.08607
(angle') caused by (angle) with assurrance 1.00000
(angle') caused by (vx) with assurrance 0.43705
(angle') caused by (vy) with assurrance 0.13294
(angle') caused by (x) with assurrance 0.43582
(angle') caused by (y) with assurrance 0.68825
(crash) caused by (angle) with assurrance 0.31873
(crash) caused by (engine) with assurrance 0.27178
(crash) caused by (v_angle) with assurrance 0.01975
(crash) caused by (landed_legs) with assurrance 0.81415
(crash) caused by (vx) with assurrance 0.21756
(crash) caused by (vy) with assurrance 0.99726
(crash) caused by (x) with assurrance 0.79158
(crash) caused by (y) with assurrance 0.86942
(fuel_cost) caused by (angle) with assurrance 0.00000
(fuel_cost) caused by (engine) with assurrance 1.00000
(fuel_cost) caused by (v_angle) with assurrance 0.00000
(fuel_cost) caused by (landed_legs) with assurrance 0.00000
(fuel_cost) caused by (vx) with assurrance 0.00000
(fuel_cost) caused by (vy) with assurrance 0.00000
(fuel_cost) caused by (x) with assurrance 0.00000
(fuel_cost) caused by (y) with assurrance 0.00000
(landed_legs') caused by (angle) with assurrance 0.17666
(landed_legs') caused by (landed_legs) with assurrance 1.00000
(landed_legs') caused by (engine) with assurrance 0.95938
(landed_legs') caused by (v_angle) with assurrance 0.31031
(landed_legs') caused by (vx) with assurrance 0.66512
(rest) caused by (angle) with assurrance 0.00000
(rest) caused by (engine) with assurrance 0.43517
(rest) caused by (landed_legs) with assurrance 0.00000
(rest) caused by (v_angle) with assurrance 0.00000
(landed_legs') caused by (x) with assurrance 0.69929
(rest) caused by (vx) with assurrance 0.00000
(rest) caused by (vy) with assurrance 0.04040
(landed_legs') caused by (vy) with assurrance 0.43862
(rest) caused by (x) with assurrance 0.76712
(rest) caused by (y) with assurrance 0.00000
(landed_legs') caused by (y) with assurrance 0.99029
(v_angle') caused by (angle) with assurrance 0.28326
(v_angle') caused by (landed_legs) with assurrance 0.82534
(v_angle') caused by (engine) with assurrance 0.87614
(v_angle') caused by (v_angle) with assurrance 0.99998
(v_angle') caused by (vx) with assurrance 0.20477
(v_angle') caused by (vy) with assurrance 0.95739
(v_angle') caused by (x) with assurrance 0.85125
(v_angle') caused by (y) with assurrance 0.02198
(vx') caused by (angle) with assurrance 1.00000
(vx') caused by (v_angle) with assurrance 0.12842
(vx') caused by (engine) with assurrance 1.00000
(vx') caused by (landed_legs) with assurrance 0.85017
(vx') caused by (vx) with assurrance 1.00000
(vx') caused by (vy) with assurrance 0.00734
(vx') caused by (x) with assurrance 0.02833
(vx') caused by (y) with assurrance 0.07515
(vy') caused by (angle) with assurrance 0.02162
(vy') caused by (v_angle) with assurrance 0.85994
(vy') caused by (engine) with assurrance 1.00000
(vy') caused by (landed_legs) with assurrance 0.99979
(vy') caused by (vx) with assurrance 0.24905
(vy') caused by (x) with assurrance 0.54831
(vy') caused by (vy) with assurrance 1.00000
(vy') caused by (y) with assurrance 0.00610
(x') caused by (angle) with assurrance 0.03291
(x') caused by (landed_legs) with assurrance 0.55412
(x') caused by (engine) with assurrance 0.17837
(x') caused by (v_angle) with assurrance 0.03749
(x') caused by (vx) with assurrance 1.00000
(x') caused by (vy) with assurrance 0.00592
(x') caused by (y) with assurrance 0.26801
(x') caused by (x) with assurrance 1.00000
(y') caused by (angle) with assurrance 0.07336
(y') caused by (engine) with assurrance 0.99998
(y') caused by (v_angle) with assurrance 0.00088
(y') caused by (landed_legs) with assurrance 0.98922
(y') caused by (vx) with assurrance 0.00064
(y') caused by (x) with assurrance 0.06248
(y') caused by (vy) with assurrance 1.00000
(y') caused by (y) with assurrance 1.00000
-------------------discovered-causal-graph---------------------
(angle, v_angle) --> angle'
(vy, y, landed_legs) --> crash
(engine) --> fuel_cost
(y, engine, landed_legs) --> landed_legs'
() --> rest
(vy, x, landed_legs, engine, v_angle) --> v_angle'
(angle, vx, engine, landed_legs) --> vx'
(v_angle, vy, engine, landed_legs) --> vy'
(vx, x) --> x'
(vy, y, engine, landed_legs) --> y'
---------------------------------------------------------------
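To see which parents are discovered consistently in the lunarlander run, the three graphs above (steps 9, 99, and 198) can be intersected per target variable. This is a sketch for reading the results, not part of the reproduced code; the dictionaries are copied by hand from the "discovered-causal-graph" blocks, and `stable_parents` is a hypothetical helper name.

```python
# Parent sets copied from the discovered graphs at steps 9, 99, and 198.
STEP_9 = {
    "angle'": {"angle", "y", "x", "v_angle"},
    "crash": {"angle", "x", "vy", "landed_legs"},
    "fuel_cost": {"engine"},
    "landed_legs'": {"vy", "vx", "engine", "landed_legs"},
    "rest": set(),
    "v_angle'": {"v_angle", "vy", "engine", "landed_legs"},
    "vx'": {"angle", "vx", "x", "landed_legs"},
    "vy'": {"vy", "angle", "y", "engine", "landed_legs"},
    "x'": {"vx", "x"},
    "y'": {"v_angle", "vy", "y", "landed_legs"},
}
STEP_99 = {
    "angle'": {"x", "landed_legs", "angle", "y", "engine", "vx", "v_angle"},
    "crash": {"v_angle", "vy", "x", "landed_legs"},
    "fuel_cost": {"engine"},
    "landed_legs'": {"angle", "y", "landed_legs"},
    "rest": {"vx", "landed_legs"},
    "v_angle'": {"v_angle", "angle", "engine", "landed_legs"},
    "vx'": {"angle", "vx", "engine"},
    "vy'": {"vy", "vx", "engine", "landed_legs"},
    "x'": {"vx", "x"},
    "y'": {"vy", "y", "engine", "landed_legs"},
}
STEP_198 = {
    "angle'": {"angle", "v_angle"},
    "crash": {"vy", "y", "landed_legs"},
    "fuel_cost": {"engine"},
    "landed_legs'": {"y", "engine", "landed_legs"},
    "rest": set(),
    "v_angle'": {"vy", "x", "landed_legs", "engine", "v_angle"},
    "vx'": {"angle", "vx", "engine", "landed_legs"},
    "vy'": {"v_angle", "vy", "engine", "landed_legs"},
    "x'": {"vx", "x"},
    "y'": {"vy", "y", "engine", "landed_legs"},
}


def stable_parents(*graphs):
    """For each target present in every graph, keep parents found in all of them."""
    targets = set(graphs[0]).intersection(*graphs[1:])
    return {t: set.intersection(*(g[t] for g in graphs)) for t in sorted(targets)}


stable = stable_parents(STEP_9, STEP_99, STEP_198)
for target, parents in stable.items():
    print(f"{sorted(parents)} --> {target}")
```

Only `fuel_cost` and `x'` keep exactly the same parent set at all three steps; the other targets share a core of parents while the remaining edges come and go between checkpoints.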
3. cartpole reproduction results
3.1 Step 0
Started at 20240525-1505; the run output is as follows:
---------------step 0 / 200----------------
episodic return: 9.621621621621621
mean reward: 0.9075 (truth)
perform causal disocery with threshold 0.2
starting causal discovery
(angle') caused by (PUSH) with assurrance 0.00030
(angle') caused by (angle_velocity) with assurrance 0.99963
(angle') caused by (angle) with assurrance 1.00000
(angle') caused by (position) with assurrance 0.01584
(angle') caused by (velocity) with assurrance 0.08129
(angle_velocity') caused by (angle) with assurrance 0.83305
(angle_velocity') caused by (PUSH) with assurrance 0.99976
(angle_velocity') caused by (angle_velocity) with assurrance 0.99949
(angle_velocity') caused by (position) with assurrance 0.42480
(angle_velocity') caused by (velocity) with assurrance 0.00627
(position') caused by (PUSH) with assurrance 0.29983
(position') caused by (angle) with assurrance 0.24988
(position') caused by (angle_velocity) with assurrance 0.07966
(position') caused by (position) with assurrance 0.99998
(position') caused by (velocity) with assurrance 0.98985
(velocity') caused by (PUSH) with assurrance 0.99482
(velocity') caused by (angle) with assurrance 0.44962
(velocity') caused by (angle_velocity) with assurrance 0.00733
(velocity') caused by (position) with assurrance 0.72191
(velocity') caused by (velocity) with assurrance 0.99105
-------------------discovered-causal-graph---------------------
(angle, angle_velocity) --> angle'
(angle, PUSH, angle_velocity) --> angle_velocity'
(position, velocity) --> position'
(PUSH, velocity) --> velocity'
---------------------------------------------------------------
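The log above reports "threshold 0.2", and in every graph recorded in these notes the selected parents are exactly those whose score exceeds 0.8, that is, 1 minus the threshold. For example, in the step-0 block `position` scores 0.72191 for `velocity'` and is left out, while `PUSH` scores 0.99482 and is kept. The sketch below rebuilds the graph from the log lines under that assumption. The cutoff rule is inferred from the logged numbers, not read from the reproduced code, and `discovered_graph` is a hypothetical helper name. The regular expression keeps the log's own spelling "assurrance" so that it matches the output verbatim.

```python
import re
from collections import defaultdict

# Matches lines such as: (velocity') caused by (PUSH) with assurrance 0.99482
LINE_RE = re.compile(
    r"\((?P<effect>[^)]+)\) caused by \((?P<cause>[^)]+)\) "
    r"with assurrance (?P<score>\d+\.\d+)"
)


def discovered_graph(log_text, threshold=0.2):
    """Keep a cause when its score exceeds 1 - threshold (rule inferred from the logs)."""
    cutoff = 1.0 - threshold
    parents = defaultdict(set)
    for m in LINE_RE.finditer(log_text):
        effect = m["effect"]
        parents[effect]  # register the target even if no cause passes the cutoff
        if float(m["score"]) > cutoff:
            parents[effect].add(m["cause"])
    return {effect: sorted(causes) for effect, causes in parents.items()}


# The five velocity' lines from the step-0 log above.
SAMPLE = """\
(velocity') caused by (PUSH) with assurrance 0.99482
(velocity') caused by (angle) with assurrance 0.44962
(velocity') caused by (angle_velocity) with assurrance 0.00733
(velocity') caused by (position) with assurrance 0.72191
(velocity') caused by (velocity) with assurrance 0.99105
"""

print(discovered_graph(SAMPLE))  # {"velocity'": ['PUSH', 'velocity']}
```

The result matches the logged edge `(PUSH, velocity) --> velocity'`.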
3.2 Step 99
At 20240525-1635, the run output is as follows:
Elapsed time: 1.5 h
---------------step 99 / 200----------------
episodic return: 200.0
mean reward: 1.0 (truth)
perform causal disocery with threshold 0.2
starting causal discovery
(angle') caused by (position) with assurrance 0.00154
(angle') caused by (angle_velocity) with assurrance 1.00000
(angle') caused by (PUSH) with assurrance 0.21346
(angle') caused by (angle) with assurrance 1.00000
(angle') caused by (velocity) with assurrance 0.00000
(angle_velocity') caused by (angle) with assurrance 0.99996
(angle_velocity') caused by (angle_velocity) with assurrance 1.00000
(angle_velocity') caused by (PUSH) with assurrance 1.00000
(angle_velocity') caused by (position) with assurrance 0.00013
(angle_velocity') caused by (velocity) with assurrance 0.20396
(position') caused by (PUSH) with assurrance 0.00293
(position') caused by (angle) with assurrance 0.00000
(position') caused by (angle_velocity) with assurrance 0.00002
(position') caused by (position) with assurrance 1.00000
(position') caused by (velocity) with assurrance 1.00000
(velocity') caused by (PUSH) with assurrance 0.99923
(velocity') caused by (angle) with assurrance 0.92331
(velocity') caused by (angle_velocity) with assurrance 0.30441
(velocity') caused by (position) with assurrance 0.31953
(velocity') caused by (velocity) with assurrance 0.99826
-------------------discovered-causal-graph---------------------
(angle, angle_velocity) --> angle'
(angle, PUSH, angle_velocity) --> angle_velocity'
(position, velocity) --> position'
(angle, PUSH, velocity) --> velocity'
---------------------------------------------------------------
3.3 Step 198
The run had finished by 20240525-1845; the output is as follows:
---------------step 198 / 200----------------
episodic return: 200.0
mean reward: 1.0 (truth)
perform causal disocery with threshold 0.2
starting causal discovery
(angle') caused by (position) with assurrance 0.00001
(angle') caused by (angle_velocity) with assurrance 1.00000
(angle') caused by (angle) with assurrance 1.00000
(angle') caused by (PUSH) with assurrance 0.16289
(angle') caused by (velocity) with assurrance 0.00004
(angle_velocity') caused by (angle) with assurrance 1.00000
(angle_velocity') caused by (angle_velocity) with assurrance 1.00000
(angle_velocity') caused by (PUSH) with assurrance 1.00000
(angle_velocity') caused by (position) with assurrance 0.03369
(angle_velocity') caused by (velocity) with assurrance 0.68328
(position') caused by (PUSH) with assurrance 0.00045
(position') caused by (angle) with assurrance 0.00000
(position') caused by (angle_velocity) with assurrance 0.00000
(position') caused by (position) with assurrance 1.00000
(position') caused by (velocity) with assurrance 1.00000
(velocity') caused by (PUSH) with assurrance 0.99903
(velocity') caused by (angle) with assurrance 0.99428
(velocity') caused by (angle_velocity) with assurrance 0.85500
(velocity') caused by (position) with assurrance 0.53975
(velocity') caused by (velocity) with assurrance 0.99811
-------------------discovered-causal-graph---------------------
(angle, angle_velocity) --> angle'
(angle, PUSH, angle_velocity) --> angle_velocity'
(position, velocity) --> position'
(angle, PUSH, angle_velocity, velocity) --> velocity'
---------------------------------------------------------------
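The three cartpole graphs differ only in the parents of `velocity'`. A small sketch that diffs consecutive checkpoints makes this explicit; the dictionaries are copied by hand from the step 0, 99, and 198 graphs in this section, and `edge_set` and `edge_changes` are hypothetical helper names, not part of the reproduced code.

```python
# Parent sets that are identical at steps 0, 99, and 198.
COMMON = {
    "angle'": {"angle", "angle_velocity"},
    "angle_velocity'": {"angle", "PUSH", "angle_velocity"},
    "position'": {"position", "velocity"},
}

# Per-step graphs; only the velocity' entry changes between checkpoints.
GRAPHS = {
    0: {**COMMON, "velocity'": {"PUSH", "velocity"}},
    99: {**COMMON, "velocity'": {"angle", "PUSH", "velocity"}},
    198: {**COMMON, "velocity'": {"angle", "PUSH", "angle_velocity", "velocity"}},
}


def edge_set(graph):
    """Flatten {effect: causes} into a set of (cause, effect) pairs."""
    return {(cause, effect) for effect, causes in graph.items() for cause in causes}


def edge_changes(before, after):
    """Return (added, removed) edges between two graphs, each sorted."""
    old, new = edge_set(before), edge_set(after)
    return sorted(new - old), sorted(old - new)


steps = sorted(GRAPHS)
for prev, curr in zip(steps, steps[1:]):
    added, removed = edge_changes(GRAPHS[prev], GRAPHS[curr])
    print(f"step {prev} -> {curr}: added={added} removed={removed}")
```

No edge is removed between checkpoints: `angle` is added as a parent of `velocity'` by step 99, and `angle_velocity` by step 198.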