Introduction A distributed system for Deep RL tasks. Workers: - actor - policy - trainer Workers are managed by controllers and communicate via a custom-built _**RPC framework**_, which operates client-server work flow: > - establish connection through **host_name** and **port**. > - (server-worker connection) register command-function pairs > - Server contains **handle_request**: listen to requests from clients > - Clients send request when needed, server respond --- ## Code Structure ###worker_base.py Abstract base class for constructing a worker. Child classes include `actor`, `policy`, `trainer` workers 1. A **server** is a handler which responds to request from client/controller 1. `WorkerServer` class implements 2 methods: 1. `register_handler`: link request command to function in **worker** 2. `handle_requests`: handles requests 2. `WorkerControlPanel` which connects and send requests to multiple Workers (identified by worker_name) 1. `connect(self, worker_names,...)` and `autoconnect` 2. `request(self, worker_name: str, command, **kwargs)`, `async_request` and `group_request` 3. `worker` connects to a server, record config info, 1. `__init__(self, server)`: bundle with dedicated server 2. `run(self)`: 1. constantly calls `self._poll()` to carry out tasks 2. calls `self._server.handle_requests()`: server constantly request from client ###worker_control.py A ZMQ implementation of `worker_base` Example: ```peudo-code block # Server side. server = RpcServer(port) server.register_handler('foo', foo) while True: server.handle_requests() # Client side. client = RpcClient(host, port) client.request('foo', x=42, y='str') # foo(x=42, y='str') will be called on the server side. ``` #### ZmqServer() - `__handlers`: for transforming request commands to functions for `workers`, - `__context`, `__socket`, `__port`: for connection with controller. When initiated, the server stores this info in `base.name_resolve` - base.name_resolve stores **key** (experiment_name, trial_name, worker_name) and **address** ({host_name}:{port}) - it's socket has `zmq.REP` type, binds to a host and port and can be `close`d ```python # storing name, address pairs in base.name_resolve key = names.worker(experiment_name, trial_name, worker_name) address = f"{socket.gethostname()}:{self.__port}" base.name_resolve.add(key, address) # handle requests data = self.__socket.recv(zmq.NOBLOCK) command, kwargs = pickle.loads(data) response = self.__handlers[command](**kwargs) self.__socket.send(pickle.dumps(response) ``` #### ZmqWorkerControl() - for a single `__experiment_name` and a single `__trial_name` - zmq structure: has a single `__context` but a dict `__sockets` of multiple {_worker_name_ : _zmq.socket_} - connection to a worker: creates a socket; find server address from base name_resolve through worker_name, and connect; register socket to `__sockets[worker_name]` - ```python # to connect to a worker server_address = base.name_resolve.wait(names.worker(self.__experiment_name, self.__trial_name, name)) sock = self.__context.socket(zmq.REQ) sock.connect(f"tcp://{server_address}") self.__sockets[name] = sock ```
Handler
Handler In general handlers are functions that 'handle' certain events that they are registered for.
handler(request, response)
It is expected that the handler will use information from the request (e.g. the path) either to populate the response object with the data to send, or to directly write to the output stream via the ResponseWriter instance associated with the request."
Pooling is a technique where the program is checking the state of something frequently to react when the state has changed.
https://stackoverflow.com/questions/58628653/what-are-handlers-in-python-in-plain-english
Socket
A network socket is a software structure within a network node of a computer network that serves as an endpoint for sending and receiving data across the network. The structure and properties of a socket are defined by an application programming interface (API) for the networking architecture. Sockets are created only during the lifetime of a process of an application running in the node.
Because of the standardization of the TCP/IP protocols in the development of the Internet, the term network socket is most commonly used in the context of the Internet protocol suite, and is therefore often also referred to as Internet socket. In this context, a socket is externally identified to other hosts by its socket address, which is the triad of transport protocol, IP address, and port number.
Source: wikipedia
IPC
interprocess communication (IPC) refers specifically to the mechanisms an operating system provides to allow the processes to manage shared data.