CS224d Assignment 1 Solutions, Part (2/4)

This post is part of my solutions to CS224d Assignment 1 and covers the neural network basics section: the derivation of the sigmoid gradient, the gradient with respect to the softmax inputs under the cross-entropy loss, the gradient with respect to the inputs of a one-hidden-layer neural network, and the count of the network's parameters.

I have split the solutions to Assignment 1 into four parts, covering problems 1, 2, 3, and 4 respectively. This part contains the solution to problem 2.

2. Neural Network Basics (30 points)

(a). (3 points) Derive the gradients of the sigmoid function and show that it can be rewritten as a function of the function value (i.e. in some expression where only σ(x), but not x, is present). Assume that the input x is a scalar for this question. Recall, the sigmoid function is

$$\sigma(x) = \frac{1}{1 + e^{-x}} \tag{2}$$

Solution:

$$\sigma'(x) = -\left(1 + e^{-x}\right)^{-2} \cdot \left(-e^{-x}\right) = \frac{e^{-x}}{\left(1 + e^{-x}\right)^2} = \frac{1}{1 + e^{-x}}\left(1 - \frac{1}{1 + e^{-x}}\right) = \sigma(x)\left[1 - \sigma(x)\right]$$
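To sanity-check this identity numerically, here is a small NumPy sketch (my own addition, not part of the original write-up; the names `sigmoid` and `sigmoid_grad` are illustrative) that compares σ(x)[1 − σ(x)] against a central finite difference:

```python
import numpy as np

def sigmoid(x):
    # Sigmoid of a scalar or numpy array.
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(s):
    # Gradient written only in terms of the function value s = sigmoid(x).
    return s * (1.0 - s)

x = 0.7                                   # arbitrary test point
eps = 1e-6
numeric = (sigmoid(x + eps) - sigmoid(x - eps)) / (2 * eps)
analytic = sigmoid_grad(sigmoid(x))
print(numeric, analytic)                  # the two values should agree very closely
```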


(b). (3 points) Derive the gradient with regard to the inputs of a softmax function when cross entropy loss is used for evaluation, i.e. find the gradients with respect to the softmax input vector θ, when the prediction is made by ŷ = softmax(θ). Remember the cross entropy function is

$$CE(y, \hat{y}) = -\sum_i y_i \log(\hat{y}_i) \tag{3}$$

where y is the one-hot label vector, and ŷ is the predicted probability vector for all classes. (Hint: you might want to consider the fact that many elements of y are zeros, and assume that only the k-th dimension of y is one.)

Solution: Following the hint, assume that the k-th element of y is 1 and all other elements are 0, i.e. $y_k = 1$. Then:

$$CE(y, \hat{y}) = -y_k \log(\hat{y}_k) = -\log(\hat{y}_k)$$

For the i-th element $\theta_i$ of $\theta$, since $\hat{y}_j = \frac{e^{\theta_j}}{\sum_l e^{\theta_l}}$, there are two cases.

When $i = k$:

$$\frac{\partial CE}{\partial \theta_k} = -\frac{1}{\hat{y}_k} \cdot \frac{\partial \hat{y}_k}{\partial \theta_k} = -\frac{1}{\hat{y}_k} \cdot \hat{y}_k (1 - \hat{y}_k) = \hat{y}_k - 1$$

When $i \neq k$:

$$\frac{\partial CE}{\partial \theta_i} = -\frac{1}{\hat{y}_k} \cdot \frac{\partial \hat{y}_k}{\partial \theta_i} = -\frac{1}{\hat{y}_k} \cdot \left(-\hat{y}_k \hat{y}_i\right) = \hat{y}_i$$

Writing the two cases together in vector form, the gradient with respect to the softmax inputs is

$$\frac{\partial CE(y, \hat{y})}{\partial \theta} = \hat{y} - y$$
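As with part (a), this result can be checked numerically. Below is a small NumPy sketch (again my own addition; `softmax`, `ce_loss`, and the test values are illustrative) comparing ŷ − y against a central finite difference of the cross-entropy loss:

```python
import numpy as np

def softmax(theta):
    # Softmax of a 1-D vector; shift by the max for numerical stability.
    e = np.exp(theta - np.max(theta))
    return e / np.sum(e)

def ce_loss(theta, k):
    # Cross-entropy loss when the true class is k (y is one-hot at index k).
    return -np.log(softmax(theta)[k])

theta = np.array([0.2, -1.3, 0.5, 0.1])   # arbitrary test input
k = 2                                     # index of the true class
y = np.zeros_like(theta)
y[k] = 1.0

analytic = softmax(theta) - y             # claimed gradient: y_hat - y

numeric = np.zeros_like(theta)
eps = 1e-6
for i in range(theta.size):
    d = np.zeros_like(theta)
    d[i] = eps
    numeric[i] = (ce_loss(theta + d, k) - ce_loss(theta - d, k)) / (2 * eps)

print(np.max(np.abs(analytic - numeric)))  # should be very small (roughly 1e-9 or less)
```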