CS224d Assignment 1 Solutions, Part (2/4)

I have split the Assignment 1 solutions into four parts, covering problems 1 through 4 respectively. This part contains the solution to problem 2.

2. Neural Network Basics (30 points)

(a). (3 points) Derive the gradients of the sigmoid function and show that it can be rewritten as a function of the function value (i.e. in some expression where only σ(x) , but not x , is present). Assume that the input x is a scalar for this question. Recall, the sigmoid function is

$$\sigma(x) = \frac{1}{1 + e^{-x}} \tag{2}$$

Solution:

$$\sigma'(x) = -\left(1 + e^{-x}\right)^{-2} \cdot \left(-e^{-x}\right) = \frac{e^{-x}}{\left(1 + e^{-x}\right)^2} = \frac{1}{1 + e^{-x}} \left(1 - \frac{1}{1 + e^{-x}}\right) = \sigma(x)\left[1 - \sigma(x)\right]$$
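As a quick sanity check, here is a minimal NumPy sketch (my own illustration, not part of the assignment starter code) comparing the analytic gradient $\sigma(x)[1 - \sigma(x)]$ against a centered finite-difference estimate:

```python
import numpy as np

def sigmoid(x):
    """sigma(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(s):
    """Gradient written purely in terms of the sigmoid value s = sigma(x)."""
    return s * (1.0 - s)

x = 0.7
eps = 1e-6
numeric = (sigmoid(x + eps) - sigmoid(x - eps)) / (2 * eps)  # centered difference
analytic = sigmoid_grad(sigmoid(x))
print(numeric, analytic)  # the two values agree to roughly 1e-10
```

Note that `sigmoid_grad` takes the sigmoid value itself rather than $x$, which is exactly the point of part (a): in backpropagation the forward-pass activation can be reused, so $e^{-x}$ never has to be recomputed.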


(b). (3 points) Derive the gradient with regard to the inputs of a softmax function when cross entropy loss is used for evaluation, i.e. find the gradients with respect to the softmax input vector $\theta$, when the prediction is made by $\hat{y} = \mathrm{softmax}(\theta)$. Remember the cross entropy function is

$$CE(y, \hat{y}) = -\sum_i y_i \log(\hat{y}_i) \tag{3}$$

where $y$ is the one-hot label vector, and $\hat{y}$ is the predicted probability vector for all classes. (Hint: you might want to consider the fact that many elements of $y$ are zeros, and assume that only the k-th dimension of $y$ is one.)

Solution: Following the hint, assume that the k-th entry of $y$ is 1 and all other entries are 0, i.e. $y_k = 1$. Then:

$$CE(y, \hat{y}) = -y_k \log(\hat{y}_k) = -\log(\hat{y}_k)$$

For the $i$-th element $\theta_i$ of $\theta$, write the softmax output as $\hat{y}_k = \frac{e^{\theta_k}}{\sum_j e^{\theta_j}}$ and consider two cases.

When $i = k$, using $\frac{\partial \hat{y}_k}{\partial \theta_k} = \hat{y}_k (1 - \hat{y}_k)$:

$$\frac{\partial CE}{\partial \theta_k} = -\frac{1}{\hat{y}_k} \cdot \hat{y}_k (1 - \hat{y}_k) = \hat{y}_k - 1$$

When $i \neq k$, using $\frac{\partial \hat{y}_k}{\partial \theta_i} = -\hat{y}_k \hat{y}_i$:

$$\frac{\partial CE}{\partial \theta_i} = -\frac{1}{\hat{y}_k} \cdot \left(-\hat{y}_k \hat{y}_i\right) = \hat{y}_i$$

Combining the two cases in vector form gives the compact result:

$$\frac{\partial CE(y, \hat{y})}{\partial \theta} = \hat{y} - y$$
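This result is easy to verify numerically. The sketch below is my own illustration (the helper names `softmax` and `cross_entropy` are hypothetical, not from the assignment starter code), assuming $\theta$ is a 1-D NumPy array:

```python
import numpy as np

def softmax(theta):
    """Softmax with the usual max-subtraction trick for numerical stability."""
    e = np.exp(theta - np.max(theta))
    return e / e.sum()

def cross_entropy(y, theta):
    """CE(y, softmax(theta)) for a one-hot label vector y."""
    return -np.sum(y * np.log(softmax(theta)))

theta = np.array([0.5, -1.2, 2.0])
y = np.array([0.0, 1.0, 0.0])          # one-hot label with k = 1

analytic = softmax(theta) - y          # the derived gradient: y_hat - y

# Centered finite differences, one coordinate at a time.
numeric = np.zeros_like(theta)
eps = 1e-6
for i in range(theta.size):
    d = np.zeros_like(theta)
    d[i] = eps
    numeric[i] = (cross_entropy(y, theta + d) - cross_entropy(y, theta - d)) / (2 * eps)

print(np.max(np.abs(analytic - numeric)))  # ~1e-10, confirming dCE/dtheta = y_hat - y
```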