Large scale machine learning - Mini-batch gradient descent

In the previous class, we talked about Stochastic gradient descent and how it can be faster than Batch gradient descent. In this class, let's talk about Mini-batch gradient descent, which can sometimes work even faster than Stochastic gradient descent.

To summarize:

  • Batch gradient descent: use all m examples in each iteration
  • Stochastic gradient descent: use 1 example in each iteration

Mini-batch gradient descent is somewhere in between. Rather than using 1 example or m examples, we'll use b examples in each iteration, where b is called the "mini-batch size". A typical value for b is 10, and a typical range is 2 to 100.

figure-1

Figure-1 above shows the Mini-batch gradient descent algorithm. Here we have a mini-batch size of 10 and 1000 training examples, so we perform each gradient descent update using 10 examples at a time, and we need 100 steps of size 10 to get through all 1000 training examples.
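
As a concrete illustration, here is a minimal Python sketch of that loop, assuming the linear regression hypothesis h_θ(x) = θᵀx used elsewhere in this course; the function and variable names are my own, not from the lecture:

```python
import numpy as np

def minibatch_gradient_descent(X, y, alpha=0.01, b=10, num_epochs=10):
    """Sketch of Mini-batch gradient descent for linear regression:
    theta_j := theta_j - (alpha / b) * sum over the mini-batch of
    (h_theta(x) - y) * x_j."""
    m, n = X.shape          # m = 1000 training examples in the lecture's setup
    theta = np.zeros(n)
    for _ in range(num_epochs):
        for i in range(0, m, b):                # 100 steps of size b = 10 per pass
            X_batch = X[i:i + b]                # the next b examples
            y_batch = y[i:i + b]
            errors = X_batch @ theta - y_batch  # h_theta(x) - y over the batch
            gradient = X_batch.T @ errors / b   # average gradient over b examples
            theta -= alpha * gradient           # update after only b examples
    return theta
```

(For the intercept term, X would include a column of ones, as usual for linear regression.)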

Compared to Batch gradient descent, this allows us to make progress much faster. Again, let's say we have 300,000,000 examples:

  • With Batch gradient descent, we need to scan through the entire training set of 300 million examples before we can make any progress
  • With Mini-batch gradient descent, after looking at just the first 10 examples, we can start to make progress in improving the parameters θ. Then we can look at the second 10 examples and modify the parameters a little bit again, and so on.

How about Mini-batch gradient descent versus Stochastic gradient descent? Why do we want to look at b examples at a time instead of just 1 example at a time, as in Stochastic gradient descent?

The answer is vectorization. Mini-batch gradient descent is likely to outperform Stochastic gradient descent only if you have a good vectorized implementation. In that case, the sum over the b examples can be performed in a more vectorized way using good numerical linear algebra libraries, which allows you to partially parallelize your computation over the b examples.
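
To see what that buys you, here is a rough sketch comparing a loop over the b examples against one vectorized call; NumPy is my choice of library here, not something named in the lecture:

```python
import numpy as np

b, n = 10, 400
X_batch = np.random.randn(b, n)   # one mini-batch of b examples
y_batch = np.random.randn(b)
theta = np.zeros(n)

# Unvectorized: one example at a time, as Stochastic gradient descent would
grad_loop = np.zeros(n)
for k in range(b):
    grad_loop += (X_batch[k] @ theta - y_batch[k]) * X_batch[k]
grad_loop /= b

# Vectorized: the linear algebra library processes all b examples at once,
# which is where the partial parallelism over the mini-batch comes from
grad_vec = X_batch.T @ (X_batch @ theta - y_batch) / b

assert np.allclose(grad_loop, grad_vec)   # same gradient, computed two ways
```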

<end>
