Neural Networks for Machine Learning: Lecture 6 Quiz


Warning: The hard deadline has passed. You can attempt it, but you will not get credit for it. You are welcome to try it as a learning exercise.


Question 1

Suppose w is the weight on some connection in a neural network. The network is trained using gradient descent until the learning converges. However, the dataset consists of two mini-batches, which differ from each other somewhat. As usual, we alternate between the mini-batches for our gradient calculations, and that has implications for what happens after convergence. We plot the change in w as training progresses. Which of the following scenarios shows that convergence has occurred? Notice that we're plotting the change in w, as opposed to w itself.
Note that in the plots below, each iteration refers to a single step of steepest descent on a single mini-batch.
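The following is a minimal sketch, not part of the original quiz, of the scenario this question describes: two mini-batches whose losses pull w toward slightly different values. The quadratic losses, targets, and learning rate are illustrative assumptions; the sketch records the change in w at every step, which is the quantity the plots show.

```python
# Two mini-batches modelled as two quadratic losses with slightly different minima.
# We alternate steepest-descent steps between them and record the change in w.
# All numbers here are made-up illustrations, not values from the course.

# Loss for mini-batch A: 0.5 * (w - 1.0)**2  ->  gradient (w - 1.0)
# Loss for mini-batch B: 0.5 * (w - 1.2)**2  ->  gradient (w - 1.2)
targets = [1.0, 1.2]          # each mini-batch "pulls" w toward its own optimum
learning_rate = 0.4
w = 0.0

changes = []
for it in range(40):
    target = targets[it % 2]              # alternate between the two mini-batches
    grad = w - target                     # gradient of this mini-batch's loss
    step = -learning_rate * grad          # steepest-descent step
    w += step
    changes.append(step)                  # this is the "change in w" being plotted

# After convergence the change in w does not shrink to zero: it alternates between
# a fixed positive value and a fixed negative value of the same magnitude.
print(["%+.4f" % c for c in changes[-6:]])
```

The last few printed changes alternate in sign with constant magnitude, which is what convergence looks like when you plot the change in w rather than w itself.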

Question 2

Suppose you are using mini-batch gradient descent to train a neural net on a large dataset. You have to decide on the learning rate, the weight initialization, how to preprocess the inputs, and so on. You try some values for these and find that the value of the objective function on the training set decreases smoothly but very slowly. What could be causing this? Check all that apply.
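As a purely illustrative sketch (not part of the quiz), one candidate cause, a learning rate that is far too small, can be reproduced on a toy quadratic loss. The loss function, rates, and step counts below are assumptions chosen only to make the contrast visible.

```python
def run(learning_rate, steps=200, w0=5.0):
    """Gradient descent on loss(w) = 0.5 * w**2, returning the final loss."""
    w = w0
    for _ in range(steps):
        grad = w                      # d/dw of 0.5 * w**2
        w -= learning_rate * grad
    return 0.5 * w ** 2

# With a tiny learning rate the loss decreases smoothly but is still far from
# the minimum after 200 steps; a moderate rate gets (near) zero comfortably.
print("lr = 0.001 ->", run(0.001))    # smooth but very slow progress
print("lr = 0.1   ->", run(0.1))      # converges quickly
```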

Question 3

Four datasets are shown below. Each dataset has two input values (plotted below) and a target value (not shown). Each point in the plots denotes one training case. Assume that we are solving a classification problem. Which of the following datasets would most likely be the easiest to train a neural net on?

Question 4

Claire is training a neural net using mini-batch gradient descent. She chose a particular learning rate and found that the training error decreased as more iterations of training were performed, as shown here in blue: 
[Figure: training error (blue) decreasing over training iterations]
She was not sure if this was the best she could do. So she tried a bigger learning rate. Which of the following error curves (shown in red) might she observe now? Select the two most likely plots.
Note that in the plots below, each iteration refers to a single step of steepest descent on a single mini-batch.
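The sketch below, again not part of the quiz, shows on a toy quadratic loss the two qualitatively different outcomes a bigger learning rate can produce: a faster decrease in error, or an error that overshoots and grows. The specific loss and learning rates are illustrative assumptions.

```python
def loss_curve(learning_rate, steps=30, w0=4.0):
    """Record loss(w) = 0.5 * w**2 over `steps` iterations of gradient descent."""
    w, curve = w0, []
    for _ in range(steps):
        w -= learning_rate * w        # gradient of 0.5 * w**2 is w
        curve.append(0.5 * w ** 2)
    return curve

baseline = loss_curve(0.1)            # the original (blue) curve
faster   = loss_curve(0.5)            # bigger rate: error decreases faster
diverged = loss_curve(2.1)            # too big: error oscillates and grows

print("baseline final loss:     ", baseline[-1])
print("bigger-rate final loss:  ", faster[-1])
print("too-big-rate final loss: ", diverged[-1])
```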

Question 5

In the lectures, we discussed two kinds of gradient descent algorithms: mini-batch and full-batch. For which of the following problems is mini-batch gradient descent likely to be a lot better than full-batch gradient descent?
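A minimal sketch, not part of the quiz, of the intuition behind this question: when a large dataset is highly redundant, a gradient computed from a small mini-batch is almost as informative as the full-batch gradient, so the same computational budget buys many more weight updates. The dataset, model, and hyperparameters below are all assumptions for illustration.

```python
import random
random.seed(0)

# Highly redundant data: thousands of noisy copies of the same few cases,
# fit with a one-parameter model y = w * x under squared error (true w is 2.0).
base = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
data = [(x, y + random.gauss(0, 0.01)) for x, y in base for _ in range(2000)]
random.shuffle(data)

def grad(w, batch):
    """Gradient of mean squared error 0.5*(w*x - y)**2 over a batch."""
    return sum((w * x - y) * x for x, y in batch) / len(batch)

budget = 2 * len(data)     # total number of per-example gradient evaluations

# Full-batch: each update costs len(data) evaluations -> only 2 updates here.
w_full = 0.0
for _ in range(budget // len(data)):
    w_full -= 0.1 * grad(w_full, data)

# Mini-batch (size 100): the same budget buys many more updates.
w_mini, batch_size = 0.0, 100
for i in range(budget // batch_size):
    start = (i * batch_size) % len(data)
    w_mini -= 0.1 * grad(w_mini, data[start:start + batch_size])

print("full-batch estimate of w:", round(w_full, 3))
print("mini-batch estimate of w:", round(w_mini, 3))
```

With the same number of per-example gradient evaluations, the mini-batch run ends up essentially at the correct w, while the full-batch run has only taken two steps toward it.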