Ray: Evolution Strategies, Cython, and Streaming MapReduce

快乐地笑

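This post covers the Evolution Strategies (ES) reinforcement learning algorithm, the use of Cython, and Streaming MapReduce in Ray. The ES section shows how to run the algorithm and introduces its core class, the worker. The Cython section covers compiling Python code to C and some caveats. The Streaming MapReduce section shows how to use Ray's actor feature to build a simple streaming application, using a streaming MapReduce as the example.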

1. Evolution Strategies (ES)

This section presents the ES example from the Ray framework (code link).

To run the application, first install some dependencies.

pip install tensorflow
pip install gym

The script can be run as follows. Note that the configuration is tuned to work on the Humanoid-v1 gym environment.

rllib train --env=Humanoid-v1 --run=ES

To train a policy on a cluster (for example, using 900 workers), run the following command.

rllib train \
    --env=Humanoid-v1 \
    --run=ES \
    --redis-address=<redis-address> \
    --config='{"num_workers": 900, "episodes_per_batch": 10000, "train_batch_size": 100000}'

At the heart of this example, we define a Worker class. The worker has a do_rollouts method, which is used to simulate randomly perturbed policies in a given environment.

import ray

@ray.remote
class Worker(object):
    def __init__(self, config, policy_params, env_name, noise):
        self.env = ...     # Initialize the environment (omitted).
        self.policy = ...  # Construct the policy (omitted).
        # Details omitted.

    def do_rollouts(self, params):
        perturbation = ...  # Generate a random perturbation to the policy (omitted).

        self.policy.set_weights(params + perturbation)
        # Do a rollout with the positively perturbed policy.

        self.policy.set_weights(params - perturbation)
        # Do a rollout with the negatively perturbed policy.

        # Return the rewards.
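
The two set_weights calls evaluate a mirrored pair of perturbations, params + perturbation and params - perturbation. The following is a minimal sketch of that idea in plain Python, without Ray or a gym environment. The evaluate function, the sigma parameter, and the toy reward are assumptions made for illustration; they are not part of the Ray example.

```python
import random

def evaluate(weights):
    """Hypothetical stand-in for a rollout: reward peaks at weights == [1, 1, 1]."""
    return -sum((w - 1.0) ** 2 for w in weights)

def do_rollouts(params, sigma=0.1, rng=random):
    """Evaluate one mirrored pair of perturbed parameter vectors."""
    # Draw one Gaussian perturbation with the same length as params.
    perturbation = [sigma * rng.gauss(0.0, 1.0) for _ in params]
    plus = [p + e for p, e in zip(params, perturbation)]
    minus = [p - e for p, e in zip(params, perturbation)]
    # Return the perturbation together with both rewards.
    return perturbation, evaluate(plus), evaluate(minus)

perturbation, reward_pos, reward_neg = do_rollouts(
    [0.0, 0.0, 0.0], rng=random.Random(0))
```

Evaluating both signs of the same perturbation means the difference reward_pos - reward_neg depends only on how the reward changes along that direction, which is what the policy update needs.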

In the main loop, we create a number of actors from this class.

workers = [Worker.remote(config, policy_params, env_name, noise_id)
           for _ in range(num_workers)]

We then enter an infinite loop in which we use the actors to perform rollouts and use the rewards from those rollouts to update the policy.

while True:
    # Get the current policy weights.
    theta = policy.get_weights()
    # Put the current policy weights in the object store.
    theta_id = ray.put(theta)
    # Use the actors to do rollouts, note that we pass in the ID of the policy
    # weights.
    rollout_ids = [worker.do_rollouts.remote(theta_id) for worker in workers]
    # Get the results of the rollouts.
    results = ray.get(rollout_ids)
    # Update the policy.
    optimizer.update(...)
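
The loop leaves the update step as optimizer.update(...). As a rough illustration of what such an update can compute, here is a self-contained sketch of an ES step on a toy objective. Everything in it is an assumption for illustration: the reward function, the hyperparameters, and the use of plain gradient ascent in place of the optimizer used by the Ray example.

```python
import random

def reward(theta):
    """Hypothetical objective standing in for an episode return; maximum at [3, -2]."""
    target = [3.0, -2.0]
    return -sum((t - g) ** 2 for t, g in zip(theta, target))

def es_step(theta, rng, num_pairs=50, sigma=0.1, lr=0.05):
    """One ES update estimated from mirrored perturbations."""
    grad = [0.0] * len(theta)
    for _ in range(num_pairs):
        eps = [rng.gauss(0.0, 1.0) for _ in theta]
        r_pos = reward([t + sigma * e for t, e in zip(theta, eps)])
        r_neg = reward([t - sigma * e for t, e in zip(theta, eps)])
        # Each pair contributes (r_pos - r_neg) * eps to the gradient estimate.
        for i, e in enumerate(eps):
            grad[i] += (r_pos - r_neg) * e
    # Normalize by 2 * num_pairs * sigma and take a gradient ascent step.
    scale = lr / (2.0 * num_pairs * sigma)
    return [t + scale * g for t, g in zip(theta, grad)]

rng = random.Random(0)
theta = [0.0, 0.0]
for _ in range(200):
    theta = es_step(theta, rng)
```

In the distributed version, the inner loop over perturbation pairs is what the worker actors run in parallel; only the rewards need to travel back to the driver.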

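In addition, note that we create a large object representing a shared block of random noise. We then put the block in the object store so that each worker actor can use it without creating its own copy.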

@ray.remote
def create_shared_noise():
    noise = np.random.randn(250000000)
    return noise

noise_id = create_shared_noise.remote()

Recall that the noise_id argument is passed into the actor constructor.
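
The source does not show how a worker reads from the shared block. One plausible approach, sketched here under that assumption, is to treat the block as a table and identify each perturbation by its start offset, so that a worker only needs to report an integer index. The NoiseTable class and its methods are hypothetical names, and the table is far smaller than the 250,000,000 entries used above.

```python
import random

class NoiseTable:
    """Hypothetical helper: a read-only block of Gaussian noise indexed by offset."""

    def __init__(self, size, seed=0):
        rng = random.Random(seed)
        self.noise = [rng.gauss(0.0, 1.0) for _ in range(size)]

    def sample_index(self, dim, rng):
        # Pick a start offset that leaves room for a slice of length dim.
        return rng.randrange(0, len(self.noise) - dim + 1)

    def get(self, index, dim):
        # Return the perturbation stored at [index, index + dim).
        return self.noise[index:index + dim]

table = NoiseTable(size=10_000)
index = table.sample_index(dim=4, rng=random.Random(1))
perturbation = table.get(index, dim=4)
```

Because the same index always maps to the same slice, the driver can reconstruct any perturbation from the index alone.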

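2. Cython

This section walks through an example of using Cython in Ray to generate C code. Cython is a superset of Python: it first translates the source into C code, which a C compiler then compiles into a native extension module that Python can import.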

2.1 Getting started

First, cd into the directory $RAY_HOME/examples/cython and run the following:

pip install scipy # For BLAS example
pip install -e .
python cython_main.py --help

You can then import the cython_examples module from a Python script or interpreter.

2.2 Notes

The following two lines must be included at the top of any *.pyx file:

#!python
# cython: embedsignature=True, binding=True

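You cannot decorate Cython functions inside a *.pyx file (there are ways around this, but they create a leaky abstraction between Cython and Python that would be very challenging to support). Instead, prefer the following in your Python code: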

some_cython_func = ray.remote(some_cython_module.some_cython_func)

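You cannot transfer memory buffers to a remote function (see example8, which currently fails); a remote function must return a value.
For examples of how to call, define, and build Cython code, see cython_main.py, cython_simple.pyx, and setup.py, respectively. The Cython documentation is also very helpful.
Some limitations come from features that Cython itself does not support.
We currently do not support compiling Cython code and distributing it to a Ray cluster. In other words, Cython developers are responsible for compiling and distributing any Cython code to their cluster (the same is true for users who need Python packages such as scipy).
For most simple use cases, developers do not need to worry about Python 2 versus Python 3, but users who do need to care can look at the language_level Cython compiler directive (see the Cython documentation).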

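3. Streaming MapReduce

This section shows how to use Ray's actor feature to implement a simple streaming application. It implements a streaming MapReduce that computes word counts over Wikipedia articles.
You can view the code for this example.

To run the example, install the dependency: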

pip install wikipedia

Then run the following command:

python ray/examples/streaming/streaming.py

For each round of articles read, the script prints the 10 most frequent words in those articles along with their counts:

article index = 0
   the 2866
   of 1688
   and 1448
   in 1101
   to 593
   a 553
   is 509
   as 325
   are 284
   by 261
article index = 1
   the 3597
   of 1971
   and 1735
   in 1429
   to 670
   a 623
   is 578
   as 401
   by 293
   for 285
article index = 2
   the 3910
   of 2123
   and 1890
   in 1468
   to 658
   a 653
   is 488
   as 364
   by 362
   for 297
article index = 3
   the 2962
   of 1667
   and 1472
   in 1220
   a 546
   to 538
   is 516
   as 307
   by 253
   for 243
article index = 4
   the 3523
   of 1866
   and 1690
   in 1475
   to 645
   a 583
   is 572
   as 352
   by 318
   for 306
...
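
A top-N listing like the one above can be produced with collections.Counter from the standard library. This is a minimal sketch on a made-up sentence; the top_words helper is a hypothetical name, and splitting on whitespace is a simplification of whatever tokenization the real script uses.

```python
from collections import Counter

def top_words(text, n=10):
    """Return the n most common lowercase whitespace-separated words."""
    return Counter(text.lower().split()).most_common(n)

sample = "the cat and the dog and the bird"
for word, count in top_words(sample, n=2):
    print("  ", word, count)
```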

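Note that this example uses distributed actor handles, which are still considered experimental.

There is a Mapper actor with a get_range method, which retrieves the counts for words in a given range: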

import ray

@ray.remote
class Mapper(object):

    def __init__(self, title_stream):
        # Constructor, the title stream parameter is a stream of wikipedia
        # article titles that will be read by this mapper
        pass  # Body omitted.

    def get_range(self, article_index, keys):
        # Return counts of all the words with first
        # letter between keys[0] and keys[1] in the
        # articles that haven't been read yet with index
        # up to article_index
        pass  # Body omitted.
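
To make the get_range contract concrete, here is a plain-Python stand-in with no Ray and no Wikipedia access. It is a sketch under stated assumptions: the LocalMapper name is hypothetical, the stream yields article texts directly rather than titles, and the method returns the counts for the single article at article_index after lazily reading up to it.

```python
from collections import Counter

class LocalMapper:
    """Plain-Python stand-in for the Mapper actor (hypothetical, for illustration)."""

    def __init__(self, text_stream):
        # Assumption: the stream yields article texts directly instead of titles.
        self.text_stream = iter(text_stream)
        self.word_counts = []  # One Counter per article read so far.

    def get_range(self, article_index, keys):
        # Read articles lazily until the one at article_index is available.
        while len(self.word_counts) <= article_index:
            text = next(self.text_stream)
            self.word_counts.append(Counter(text.lower().split()))
        # Keep only words whose first letter falls in [keys[0], keys[1]].
        return [(word, count)
                for word, count in self.word_counts[article_index].items()
                if keys[0] <= word[0] <= keys[1]]

mapper = LocalMapper(["a bee and a cat", "zebra and yak"])
counts = mapper.get_range(0, ["a", "b"])
```

Filtering by first letter inside the mapper means each reducer only receives the words in its own key range.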

The Reducer actor holds a list of mappers, calls get_range on each of them, and accumulates the results.

@ray.remote
class Reducer(object):

    def __init__(self, keys, *mappers):
        # Constructor for a reducer that gets input from the list of mappers
        # in the argument and accumulates word counts for words with first
        # letter between keys[0] and keys[1]
        pass  # Body omitted.

    def next_reduce_result(self, article_index):
        # Get articles up to article_index that haven't been read yet,
        # accumulate the word counts and return them
        pass  # Body omitted.

On the driver, we create a number of mappers and reducers and run the streaming MapReduce:

streams = ...  # Create a list of num_mappers streams (omitted).
keys = ...     # Partition the keys among the reducers (omitted).

# Create a number of mappers.
mappers = [Mapper.remote(stream) for stream in streams]

# Create a number of reducers, each responsible for a different range of keys.
# This gives each Reducer actor a handle to each Mapper actor.
reducers = [Reducer.remote(key, *mappers) for key in keys]

article_index = 0
while True:
    counts = ray.get([reducer.next_reduce_result.remote(article_index)
                      for reducer in reducers])
    article_index += 1
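
The data flow of the driver loop can be traced without Ray. The following sketch replaces actors with plain functions and uses made-up article texts; map_counts, reduce_counts, the sample streams, and the two key ranges are all assumptions chosen for illustration.

```python
from collections import Counter

def map_counts(text, keys):
    """Count words in one article whose first letter lies in [keys[0], keys[1]]."""
    words = text.lower().split()
    return Counter(w for w in words if keys[0] <= w[0] <= keys[1])

def reduce_counts(texts, keys):
    """Merge the per-mapper counts for one key range."""
    total = Counter()
    for text in texts:
        total.update(map_counts(text, keys))
    return total

# Two "mappers", each holding one article per round (hypothetical data).
streams = [["the cat sat", "a dog ran"], ["the mat", "the end"]]
# Two "reducers", splitting the alphabet into two key ranges.
keys = [["a", "m"], ["n", "z"]]

for article_index in range(2):
    round_texts = [stream[article_index] for stream in streams]
    counts = Counter()
    for key in keys:
        counts.update(reduce_counts(round_texts, key))
    print("article index =", article_index, counts.most_common(2))
```

Since the key ranges do not overlap, the per-reducer results can be combined without any word being counted twice.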

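The actual example reads a list of articles and creates a stream object that produces an infinite stream of articles from that list. This is a toy example meant to illustrate the idea. In practice, we would produce a stream of non-repeating items for each mapper.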
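
The difference between the two kinds of stream can be sketched with itertools. The function names and the round-robin partitioning scheme are assumptions; the source only says that each mapper would receive non-repeating items.

```python
import itertools

def repeating_stream(titles):
    """Infinite stream that cycles through the same list, as in the toy example."""
    return itertools.cycle(titles)

def partitioned_streams(titles, num_mappers):
    """Non-repeating streams: mapper i gets titles i, i + num_mappers, and so on."""
    return [iter(titles[i::num_mappers]) for i in range(num_mappers)]

titles = ["A", "B", "C", "D", "E"]  # Hypothetical article titles.
first_six = list(itertools.islice(repeating_stream(titles), 6))
parts = [list(s) for s in partitioned_streams(titles, 2)]
```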
