Parallel Computing Course Self-Study Notes 2

Parameter Server Architecture

The parameter server architecture was proposed by Mu Li and Alex Smola. The key difference from MapReduce is that this approach is asynchronous, whereas MapReduce is synchronous.

The Parameter Server

The parameter server was proposed by [1] for scalable machine learning. Its characteristics: client-server architecture, message-passing communication, and asynchronous execution.

(Note that MapReduce is bulk synchronous.)

Ray [2], an open-source software system, supports the parameter server architecture.
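To make this concrete, here is a minimal sketch of a parameter server built from Ray actors. Only ray.init, @ray.remote, .remote(), and ray.get are Ray API; the ParameterServer class, the worker task, and the least-squares gradient are illustrative assumptions, not built-in Ray features.

```python
# Minimal parameter-server sketch on Ray (the ParameterServer/worker names and
# the least-squares gradient are illustrative assumptions, not Ray built-ins).
import numpy as np
import ray

ray.init()

@ray.remote
class ParameterServer:
    def __init__(self, dim):
        self.w = np.zeros(dim)            # model parameters kept on the server

    def get_params(self):
        return self.w                     # workers pull the latest parameters

    def apply_gradient(self, grad, lr=0.1):
        self.w -= lr * grad               # update as soon as a gradient arrives
        return self.w

@ray.remote
def worker(server, X, y, steps=100):
    for _ in range(steps):
        w = ray.get(server.get_params.remote())        # 1. pull w
        g = X.T @ (X @ w - y) / len(y)                 # 2. local least-squares gradient
        server.apply_gradient.remote(g)                # 3. push g; do not wait for others

# Usage: one server actor, four asynchronous worker tasks.
dim, n = 5, 100
server = ParameterServer.remote(dim)
tasks = [worker.remote(server, np.random.randn(n, dim), np.random.randn(n))
         for _ in range(4)]
ray.get(tasks)                            # wait for the workers to finish
print(ray.get(server.get_params.remote()))
```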

Reference

1. Li et al.: Scaling Distributed Machine Learning with the Parameter Server. In OSDI, 2014.

2. Moritz et al.: Ray: A Distributed Framework for Emerging AI Applications. In OSDI, 2018.

Synchronous vs. Asynchronous Algorithms

Synchronous communication is inefficient: in each round, every worker must wait for the slowest one.

Asynchronous communication is efficient: a worker does not need to wait for the other workers.

Asynchronous Gradient Descent

The i-th worker repeats:

1. Pull the up-to-date model parameters w from the server.

2. Compute the gradient \tilde{g}_i using its local data and w.

3. Push \tilde{g}_i to the server.

The server performs:

1. Receive a gradient \tilde{g}_i from a worker.

2. Update the parameters by:

w \gets w - \alpha \cdot \tilde{g}_i
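Below is a minimal in-process sketch of this loop, using Python threads to play the roles of the workers and a shared object to play the role of the server. The names AsyncServer and run_worker, and the least-squares gradient, are assumptions for illustration; a real parameter server runs the server and workers as separate processes that communicate by message passing.

```python
# Thread-based sketch of asynchronous gradient descent: each worker pulls w,
# computes a gradient on its own data, and pushes it without waiting for others.
# A lock-free variant of this idea is HOGWILD! [1] below.
import threading
import numpy as np

class AsyncServer:                        # hypothetical name, for illustration
    def __init__(self, dim, lr=0.1):
        self.w = np.zeros(dim)
        self.lr = lr
        self.lock = threading.Lock()

    def pull(self):
        return self.w.copy()              # worker step 1: get up-to-date w

    def push(self, g):
        with self.lock:                   # server: receive one gradient and update
            self.w -= self.lr * g         # w <- w - alpha * g_i

def run_worker(server, X, y, steps=200):
    for _ in range(steps):
        w = server.pull()                          # 1. pull w
        g = X.T @ (X @ w - y) / len(y)             # 2. gradient on local data
        server.push(g)                             # 3. push g_i

# Usage: four workers update the same server concurrently.
dim, n = 3, 50
server = AsyncServer(dim)
threads = [threading.Thread(target=run_worker,
                            args=(server, np.random.randn(n, dim), np.random.randn(n)))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(server.w)
```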

Reference

1. Niu et al.: HOGWILD!: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent. In NIPS, 2011.

Pros and Cons of Asynchronous Algorithms

In practice, asynchronous algorithms are faster than synchronous ones.

In theory, asynchronous algorithms have a slower convergence rate.

Asynchronous algorithms have restrictions, e.g., a worker cannot be much slower than the others. (Why? A much slower worker pushes gradients computed from stale parameters, and such stale updates can hurt or even break convergence.)
