[framework] debugging

hanjialeOK

Last modified: 2022-01-22 17:36:51

Category: Reinforcement Learning
Tags: rl

First published: 2021-11-16 16:12:04

Article link: https://blog.csdn.net/weixin_43742643/article/details/121323818

Debugging tools

I originally planned to step through the code with pdb:

python -m pdb actor.py --config examples/ppo/cartpole_actor.yaml

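However, pdb does not handle multithreaded programs well and is very slow to work with, so I switched to VS Code, which supports multithreaded debugging and is far more efficient.

All it takes is a launch.json file. The settings that matter here:

"stopOnEntry": true pauses the program at its first statement when it starts
"args": ["--config", "examples/ppo/pong_actor.yaml"] sets the command-line arguments
"env": {"CUDA_VISIBLE_DEVICES":"0,1"} sets environment variables, here selecting which GPUs are visible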

{
    // Use IntelliSense to learn about possible attributes.
    // Hover to view descriptions of existing attributes.
    // For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
    "version": "0.2.0",
    "configurations": [
        {
            "name": "Python: Current File",
            "type": "python",
            "request": "launch",
            "program": "${file}",
            "console": "integratedTerminal",
            "stopOnEntry": true,
            "env": {"CUDA_VISIBLE_DEVICES":"0,1"},
            "args": ["--config", "examples/ppo/pong_actor.yaml"]
        }
    ]
}

Related libraries

argparse

https://docs.python.org/3.7/library/argparse.html

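type specifies the type that the value following --config is converted to.
action='store_true' means --use_gpu takes no value: it defaults to False and becomes True when the flag is given.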

from argparse import ArgumentParser

parser = ArgumentParser()
parser.add_argument('--config', type=str, default=None, help='The YAML configuration file')
parser.add_argument('--use_gpu', action='store_true', help='Use GPU to sample every action')

def main():
    args, unknown_args = parser.parse_known_args()
    print(args.config)
    print(args.use_gpu)

if __name__ == '__main__':
    main()

Run it with:

python help.py --config "test" --use_gpu
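
As a quick check of the two options described above, the sketch below passes argument lists directly to parse_known_args instead of reading the real command line. The --seed flag is a hypothetical unrecognized option, added only to show what ends up in the second return value.

```python
from argparse import ArgumentParser

parser = ArgumentParser()
parser.add_argument('--config', type=str, default=None, help='The YAML configuration file')
parser.add_argument('--use_gpu', action='store_true', help='Use GPU to sample every action')

# Flag present: use_gpu becomes True and the --config value is kept as a str.
args, unknown = parser.parse_known_args(['--config', 'test', '--use_gpu'])
print(args.config, args.use_gpu, unknown)     # test True []

# Flag absent plus an unrecognized option (--seed is hypothetical):
# parse_known_args returns the leftovers instead of exiting with an error.
args2, unknown2 = parser.parse_known_args(['--seed', '1'])
print(args2.config, args2.use_gpu, unknown2)  # None False ['--seed', '1']
```

With parse_args instead of parse_known_args, the second call would exit with an "unrecognized arguments" error.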

pickle

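Implements binary protocols for serializing and deserializing Python object structures. It can also serialize instances of custom classes.
pickle.dump(obj, file, [protocol]) serializes an object to a file
pickle.load(file) deserializes an object from a file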
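
A minimal round-trip sketch of the two calls above. The Transition class is a hypothetical custom object made up for this example, and an in-memory io.BytesIO buffer stands in for a file opened in binary mode.

```python
import io
import pickle


class Transition:
    """Hypothetical custom object, used only for illustration."""

    def __init__(self, state, action, reward):
        self.state = state
        self.action = action
        self.reward = reward


original = Transition(state=[0.1, 0.2], action=1, reward=1.0)

# Serialize into an in-memory buffer (stand-in for open(path, 'wb')).
buffer = io.BytesIO()
pickle.dump(original, buffer, protocol=pickle.HIGHEST_PROTOCOL)

# Rewind and deserialize; the result is a new, equivalent object.
buffer.seek(0)
restored = pickle.load(buffer)

print(restored.state, restored.action, restored.reward)  # [0.1, 0.2] 1 1.0
print(restored is original)                              # False
```

Note that pickle stores a reference to the class, not its code, so the class definition must be importable wherever the data is loaded.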

typing

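Type hints help catch mismatched argument and return types before they cause problems at runtime.
They also serve as extra documentation, telling callers which argument types to pass and which return types to expect.
Adding them does not change how the program runs: Python itself does not enforce the hints or raise errors, and type checkers and IDEs only issue warnings.

Note: the typing module is only available in Python 3.5 and later. PyCharm currently supports typing checks.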

from typing import List, Tuple, Dict

def add(a: int, string: str, f: float, b: bool) -> Tuple[List, Tuple, Dict, bool]:
    list1 = list(range(a))
    tup = (string, string, string)
    d = {"a": f}
    bl = b
    return list1, tup, d, bl

print(add(5, "hhhh", 2.3, False))
# Output: ([0, 1, 2, 3, 4], ('hhhh', 'hhhh', 'hhhh'), {'a': 2.3}, False)
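
To illustrate the point that hints are not enforced at runtime, here is a small sketch. The scale function is a hypothetical example, not part of the framework. Calling it with the "wrong" types still runs; only a static checker or an IDE would flag those calls.

```python
from typing import List, get_type_hints


def scale(values: List[float], factor: float) -> List[float]:
    """Hypothetical helper: multiply every element by factor."""
    return [v * factor for v in values]


# The hints are stored as metadata and can be inspected.
hints = get_type_hints(scale)
print(hints['factor'])        # <class 'float'>

# ints instead of floats: no runtime error.
print(scale([1, 2], 3))       # [3, 6]

# Even strings are accepted at runtime, because str * int is valid Python.
print(scale(["ab"], 2))       # ['abab']
```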

zmq

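Create and destroy sockets: zmq_socket(), zmq_close()
Set and read socket options: zmq_setsockopt(), zmq_getsockopt()
Establish connections for a socket: zmq_bind(), zmq_connect()
Send and receive messages: zmq_send(), zmq_recv()

For a non-blocking receive, use socket.recv(flags=zmq.NOBLOCK). When no message is ready, the call raises zmq.Again, a subclass of zmq.ZMQError, so zmq.ZMQError in the except clauses can be replaced with zmq.Again.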

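import zmq
import time

context = zmq.Context()
# PULL socket: receives messages pushed on port 5557
receiver = context.socket(zmq.PULL)
receiver.connect("tcp://localhost:5557")
# SUB socket: only receives messages that start with b"10001"
subscriber = context.socket(zmq.SUB)
subscriber.connect("tcp://localhost:5556")
subscriber.setsockopt(zmq.SUBSCRIBE, b"10001")

while True:
    # Drain everything currently queued on the PULL socket
    while True:
        try:
            msg = receiver.recv(zmq.NOBLOCK)
        except zmq.ZMQError:
            break  # nothing left to read right now
    # Drain everything currently queued on the SUB socket
    while True:
        try:
            msg = subscriber.recv(zmq.NOBLOCK)
        except zmq.ZMQError:
            break  # nothing left to read right now
    time.sleep(1)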

PUB-SUB

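A SUB socket must set a subscription filter with zmq_setsockopt(); without one it receives no messages. For a detailed explanation of setsockopt, see http://api.zeromq.org/3-2:zmq-setsockopt.
There is no strict rule about which of PUB and SUB binds and which connects (in essence it makes no difference), but it is still recommended that PUB use bind and SUB use connect.
With PUSH/PULL, if a PUSH socket's send cannot get the message out, it blocks indefinitely, and a PULL socket's recv likewise waits until data arrives, so the code after the call cannot run until then. See https://blog.csdn.net/weixin_42066185/article/details/103015332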

REQ(client)-REP(server)

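The client must send a message first and then receive; the server must first receive the client's message and then send the reply back.
It makes no difference whether the server or the client starts first.
Until a message arrives, the server blocks, waiting for a client to connect.
The client uses connect and the server uses bind.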
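
The send/receive ordering above can be sketched with pyzmq in a single process. This is a minimal sketch under some assumptions: the endpoint name and the helper functions serve_once, ask, and demo are made up for illustration, and the inproc transport with a thread stands in for a separate server process. With inproc the sketch binds before connecting to be safe; over tcp the start order does not matter, as noted above.

```python
import threading

import zmq

ENDPOINT = "inproc://req-rep-demo"  # hypothetical endpoint name


def serve_once(context, ready):
    """Server side: bind, receive one request, then send the reply."""
    rep = context.socket(zmq.REP)
    rep.bind(ENDPOINT)
    ready.set()                        # tell the client the endpoint exists
    request = rep.recv()               # blocks until a client sends something
    rep.send(b"reply to " + request)   # REP must reply before it can recv again
    rep.close()


def ask(context, payload):
    """Client side: connect, send first, then receive the reply."""
    req = context.socket(zmq.REQ)
    req.connect(ENDPOINT)
    req.send(payload)                  # REQ must send before it can recv
    reply = req.recv()
    req.close()
    return reply


def demo():
    context = zmq.Context()
    ready = threading.Event()
    server = threading.Thread(target=serve_once, args=(context, ready))
    server.start()
    ready.wait()                       # inproc precaution: bind before connect
    reply = ask(context, b"ping")
    server.join()
    context.term()
    return reply


if __name__ == "__main__":
    print(demo())  # b'reply to ping'
```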

multiprocessing

https://docs.python.org/3.7/library/multiprocessing.html

Process

Used to create a child process.

Array & Value

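These create a region of memory shared between different Python processes. Value is a shared single value and Array is a shared array.
m = Array('i',3) allocates 3 slots, all of integer type 'i', initialized to zero; in use it behaves much like a fixed-length list
m = Array('i',[1,2,3,4,5]) allocates 5 slots and stores the elements of the list in them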
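
A small sketch of how the two might be shared across processes. The worker and main functions are hypothetical names for this example. Each child increments the shared counter under its lock and writes into its own slot of the shared array.

```python
from multiprocessing import Array, Process, Value


def worker(counter, slots, index):
    """Hypothetical worker: bump the shared counter and fill one array slot."""
    with counter.get_lock():      # += on a Value is not atomic, so take the lock
        counter.value += 1
    slots[index] = index * 10


def main():
    counter = Value('i', 0)       # one shared int, starting at 0
    slots = Array('i', 3)         # three shared ints, zero-initialized
    procs = [Process(target=worker, args=(counter, slots, i)) for i in range(3)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    print(counter.value, slots[:])  # 3 [0, 10, 20]


if __name__ == '__main__':
    main()
```

The `if __name__ == '__main__'` guard matters on platforms that start child processes by re-importing the main module.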

Condition

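Synchronization between processes: a process waits on the condition until another process notifies it.

import multiprocessing
import time

def A(cond):
    # Notifier: wakes every process currently waiting on the condition
    name = multiprocessing.current_process().name
    print(f"starting, {name}")
    with cond:
        print("%s is done and next is ready" % name)
        cond.notify_all()

def B(cond):
    # Waiter: blocks in cond.wait() until A calls notify_all()
    name = multiprocessing.current_process().name
    print(f"starting, {name}")
    with cond:
        cond.wait()
        print("%s running..." % name)

if __name__ == "__main__":
    cond = multiprocessing.Condition()
    m = multiprocessing.Process(target=A, args=(cond,))
    n = [multiprocessing.Process(target=B, name="Process2[%d]" % i, args=(cond,)) for i in range(1, 3)]
    for i in n:
        i.start()
        time.sleep(2)  # give each waiter time to reach cond.wait() before A notifies

    m.start()
    m.join()
    for i in n:
        i.join()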

from itertools import count

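Counts upward from 10 without end:

for i in count(10):
    print(i)  # 10 11 12 13 ...
    if i >= 13:
        break  # stop the demo here; without this the loop never ends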

from pyarrow import serialize

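Test records

I think the core metric of the framework is

$\frac{\mathrm{sizeof}(\mathrm{data})}{\mathrm{learn\_time}}$

that is, the amount of data trained on per unit of time. The larger this value, the faster the framework.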

actor	pool	batch	send_fps	wait_time	sample_time	learn_time	all_time	learn_rate
60	60*	60*	560~580	0.1~0.3	1.8	1.5	3.5	3.3e4
60	90*	60*	400~430	1e-5	3.2	1.5	4.7	2.6e4
40	40*	40*	590~610	1.26	0.87	1.37	3.5	2.4e4
20	20*	20*	620	2	0.26	1	3.2	1.2e4
20	30*	20*	620	1.8	0.48	1	3.2	1.2e4
20	40*	20*	620	1.6	0.56	1	3.2	1.2e4
20	50*	20*	620	1.6	0.64	1	3.2	1.2e4
20	60*	20*	620	1.25	0.75	1	3.2	1.2e4
20	70*	20*	620	1.25	0.82	1	3.2	1.2e4
20	80*	20*	620	1.25	0.93	1	3.2	1.2e4
20	90*	20*	620	1.25	1	1	3.2	1.2e4
40	40*	40*	620	1.25	0.87	1.37	3.2	1.2e4
40	50*	40*	620	1.25	1.32	1.37	3.2	1.2e4
40	60*	40*	620	1.25	1.51	1.37	3.2	1.2e4
40	70*	40*	620	1.25	1.73	1.37	3.2	1.2e4
40	80*	40*	620	1.25	1.90	1.37	3.2	1.2e4
40	90*	40*	620	1.25	2.12	1.37	3.2	1.2e4
60	60*	60*	620	1.25	1.80	1.37	3.2	1.2e4
60	70*	60*	620	1.25	2.56	1.37	3.2	1.2e4
60	80*	60*	620	1.25	2.80	1.37	3.2	1.2e4
60	90*	60*	620	1.25	3.3	1.37	3.2	1.2e4
