Reference:
Set up the conda environment:
python==3.7
gym==0.19.0
tensorboardX
torch
stable-baselines3[extra,tests,docs]==1.1.0
pybullet==2.7.8
optuna
pyyaml>=5.1
sb3-contrib==1.0.0
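Because several of the packages above are pinned to exact versions, it can help to confirm what is actually installed before running anything. The following is a minimal sketch, not part of the repository: the PINNED dictionary and the check_pins helper are hypothetical names introduced here, and only the exact pins from the list above are checked.

```python
# Sketch: compare installed package versions against the exact pins listed above.
# PINNED and check_pins are illustrative names, not part of the repository.
try:
    from importlib.metadata import version, PackageNotFoundError  # Python 3.8+
except ImportError:
    # Python 3.7 (the version pinned above) has no importlib.metadata;
    # fall back to pkg_resources from setuptools.
    from pkg_resources import get_distribution
    from pkg_resources import DistributionNotFound as PackageNotFoundError

    def version(name):
        return get_distribution(name).version

PINNED = {
    "gym": "0.19.0",
    "stable-baselines3": "1.1.0",
    "pybullet": "2.7.8",
    "sb3-contrib": "1.0.0",
}


def check_pins(pins, get_version=version):
    """Return {package: (expected, found)} for each mismatch; found is None if missing."""
    mismatches = {}
    for name, expected in pins.items():
        try:
            found = get_version(name)
        except PackageNotFoundError:
            found = None
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in check_pins(PINNED).items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

Running the script inside the activated environment prints one line per mismatched or missing package, and prints nothing when every pin matches.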
Test the teacher model:
python policy_distillation.py --model teacher --algo td3 --env AntBulletEnv-v0
Knowledge distillation:
First create the output folder:
mkdir distilled-agents
Train:
python policy_distillation.py --algo td3 --env AntBulletEnv-v0
Test the student:
python playground.py --mode student -p /home/blamlight/Documents/Github/policy-distillation-baselines/distilled-agents/AntBulletEnv-v0_td3_1710318896.1154015/student_10000_3260.12.pkl
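The checkpoint path above is specific to one machine and one training run. As a convenience, the sketch below looks up the most recently written student checkpoint instead of hardcoding it. It assumes the layout seen in that path (distilled-agents/<run folder>/student_*.pkl); the latest_student helper is a hypothetical name and not part of the repository.

```python
# Sketch: find the newest student checkpoint under distilled-agents.
# Assumes the layout distilled-agents/<run folder>/student_*.pkl, as in the path above.
from pathlib import Path


def latest_student(root="distilled-agents"):
    """Return the most recently modified student_*.pkl under root, or None if there is none."""
    candidates = list(Path(root).glob("*/student_*.pkl"))
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.stat().st_mtime)


if __name__ == "__main__":
    path = latest_student()
    if path is None:
        print("no student checkpoint found")
    else:
        # Print the command to run rather than running it.
        print(f"python playground.py --mode student -p {path}")
```

Selecting by modification time avoids interpreting the numbers in the file name, whose meaning is not stated here.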