I have split the answers to Assignment 1 into four parts, covering questions 1, 2, 3, and 4 respectively. This part contains the answer to question 2.
2. Neural Network Basics (30 points)
(a). (3 points) Derive the gradients of the sigmoid function and show that it can be rewritten as a function of the function value (i.e. in some expression where only σ(x), but not x, is present). Assume that the input x is a scalar for this question. Recall that the sigmoid function is σ(x) = 1 / (1 + e^(−x)).
Solution: Starting from σ(x) = 1 / (1 + e^(−x)) and differentiating,

σ′(x) = e^(−x) / (1 + e^(−x))² = [1 / (1 + e^(−x))] · [e^(−x) / (1 + e^(−x))] = σ(x) · (1 − σ(x)),

so the gradient can be written entirely in terms of the function value σ(x), with no explicit dependence on x.
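As a quick sanity check of the identity σ′(x) = σ(x)(1 − σ(x)), the sketch below (not part of the original assignment code) compares the analytic gradient against a central finite difference:

```python
import numpy as np

def sigmoid(x):
    # sigma(x) = 1 / (1 + exp(-x))
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(s):
    # Gradient expressed only through the function value s = sigmoid(x)
    return s * (1.0 - s)

x = 0.5
analytic = sigmoid_grad(sigmoid(x))

# Central finite difference as an independent check
h = 1e-6
numeric = (sigmoid(x + h) - sigmoid(x - h)) / (2 * h)
assert abs(analytic - numeric) < 1e-8
```

Note that `sigmoid_grad` takes the already-computed function value as input; this is the form typically used in backpropagation, where σ(x) is cached from the forward pass.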
(b). (3 points) Derive the gradient with regard to the inputs of a softmax function when cross entropy loss is used for evaluation, i.e. find the gradients with respect to the softmax input vector θ, when the prediction is made by ŷ = softmax(θ). Remember the cross entropy function is

CE(y, ŷ) = −Σᵢ yᵢ log(ŷᵢ),

where y is the one-hot label vector, and ŷ is the predicted probability vector for all classes.
Solution: Following the hint, assume the k-th entry of y is 1 and all other entries are 0, i.e. y_k = 1 and y_i = 0 for i ≠ k. The loss then reduces to CE(y, ŷ) = −log ŷ_k = −log softmax(θ)_k.
For the i-th element of θ, write ŷ_k = e^(θ_k) / Σ_j e^(θ_j), so that

CE = −θ_k + log Σ_j e^(θ_j).

Differentiating with respect to θ_i gives

∂CE/∂θ_i = −1{i = k} + e^(θ_i) / Σ_j e^(θ_j) = ŷ_i − y_i,

i.e. ∂CE/∂θ_i = ŷ_k − 1 when i = k and ŷ_i otherwise. In vector form, the gradient with respect to the whole softmax input is ∂CE/∂θ = ŷ − y.
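The gradient ŷ − y derived above can likewise be verified numerically. This is a minimal sketch (the vector θ and label index k are arbitrary example values, not from the assignment):

```python
import numpy as np

def softmax(theta):
    # Subtract the max for numerical stability before exponentiating
    e = np.exp(theta - np.max(theta))
    return e / e.sum()

def ce_loss(theta, k):
    # Cross entropy with a one-hot label at index k: -log softmax(theta)_k
    return -np.log(softmax(theta)[k])

theta = np.array([1.0, -0.5, 2.0])  # example softmax input
k = 2                               # index of the correct class
y = np.zeros_like(theta)
y[k] = 1.0

analytic = softmax(theta) - y  # the derived gradient: y_hat - y

# Central finite differences as an independent check, one coordinate at a time
h = 1e-6
numeric = np.array([
    (ce_loss(theta + h * np.eye(3)[i], k) - ce_loss(theta - h * np.eye(3)[i], k)) / (2 * h)
    for i in range(3)
])
assert np.allclose(analytic, numeric, atol=1e-7)
```

One useful consequence visible in the code: since softmax(θ) sums to 1 and y is one-hot, the gradient ŷ − y always sums to zero.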