Writing Formulas in LaTeX / Markdown

Yongqiang Cheng

Last modified 2024-04-17 22:24:15


First published 2019-11-18 18:41:57


Article link: https://blog.csdn.net/chengyq116/article/details/103127635



Here’s the regularized cross-entropy:

$$
\begin{aligned} 
C = -\frac{1}{n} \sum_{xj} \left[ y_j \ln a^L_j+(1-y_j) \ln (1-a^L_j)\right] + \frac{\lambda}{2n} \sum_w w^2. \tag{85}
\end{aligned}
$$


It’s possible to regularize other cost functions, such as the quadratic cost. This can be done in a similar way:

$$
\begin{aligned} 
C = \frac{1}{2n} \sum_x \|y-a^L\|^2 + \frac{\lambda}{2n} \sum_w w^2. \tag{86}
\end{aligned}
$$


In both cases we can write the regularized cost function as

$$
\begin{aligned}
C = C_0 + \frac{\lambda}{2n} \sum_w w^2. \tag{87}
\end{aligned}
$$


where $C_0$ is the original, unregularized cost function.
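As a minimal sketch of Equation (87), the snippet below adds the L2 penalty to an unregularized cost that has already been computed elsewhere. The function name `regularized_cost` and its parameter names are illustrative assumptions, not part of the original text; `lam` stands for $\lambda$ and `n` for the number of training examples.

```python
def regularized_cost(c0, weights, lam, n):
    """Return C = C0 + (lam / (2 * n)) * sum(w^2), as in Equation (87).

    c0      -- unregularized cost C_0 (cross-entropy, quadratic, ...)
    weights -- flat iterable of all weights w; biases are not included
    lam     -- regularization parameter lambda (hypothetical name)
    n       -- number of training examples
    """
    penalty = (lam / (2 * n)) * sum(w * w for w in weights)
    return c0 + penalty


if __name__ == "__main__":
    # Toy numbers chosen only for illustration.
    print(regularized_cost(c0=0.5, weights=[1.0, -2.0, 3.0], lam=0.1, n=10))
```

Because the penalty depends only on the weights, the same helper applies whether $C_0$ is the cross-entropy of Equation (85) or the quadratic cost of Equation (86).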

Taking the partial derivatives of Equation (87) gives

$$
\begin{aligned} 
\frac{\partial C}{\partial w} & = \frac{\partial C_0}{\partial w} + \frac{\lambda}{n} w \tag{88}
\end{aligned}
$$

$$
\begin{aligned} 
\frac{\partial C}{\partial b} & = \frac{\partial C_0}{\partial b}. \tag{89}
\end{aligned}
$$


The $\partial C_0 / \partial w$ and $\partial C_0 / \partial b$ terms can be computed using backpropagation.
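Equations (88) and (89) can be checked numerically. The sketch below is only an illustration: `toy_c0` is a hypothetical one-weight, one-bias cost standing in for $C_0$, and its gradient is derived by hand rather than by backpropagation. The regularized gradients are then compared against central finite differences of the full cost.

```python
def regularized_gradients(dc0_dw, dc0_db, w, lam, n):
    """Turn gradients of C_0 into gradients of the regularized cost C."""
    dc_dw = dc0_dw + (lam / n) * w  # Equation (88)
    dc_db = dc0_db                  # Equation (89): bias gradient is unchanged
    return dc_dw, dc_db


def toy_c0(w, b):
    """Hypothetical stand-in for C_0: squared error of one linear unit."""
    return 0.5 * (2.0 * w + b - 1.0) ** 2


def toy_c0_grads(w, b):
    """Hand-derived gradients of toy_c0 with respect to w and b."""
    err = 2.0 * w + b - 1.0
    return 2.0 * err, err


def toy_c(w, b, lam, n):
    """Regularized cost for the toy model, following Equation (87)."""
    return toy_c0(w, b) + (lam / (2 * n)) * w * w


def central_difference(f, x, eps=1e-6):
    """Numerical derivative of a one-argument function f at x."""
    return (f(x + eps) - f(x - eps)) / (2 * eps)


if __name__ == "__main__":
    w, b, lam, n = 0.3, -0.2, 0.5, 4
    dc_dw, dc_db = regularized_gradients(*toy_c0_grads(w, b), w, lam, n)
    num_dw = central_difference(lambda v: toy_c(v, b, lam, n), w)
    num_db = central_difference(lambda v: toy_c(w, v, lam, n), b)
    print(abs(dc_dw - num_dw) < 1e-6, abs(dc_db - num_db) < 1e-6)
```

The only extra term comes from differentiating $\frac{\lambda}{2n} w^2$ with respect to $w$, which gives $\frac{\lambda}{n} w$; the penalty does not depend on $b$, so the bias gradient is left as it is.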

DenseNet.

$$
\begin{aligned}
x_{1} &= w_{1} * x_{0} \\
x_{2} &= w_{2} * [x_{0}, x_{1}] \\
&\vdots \\
x_{k} &= w_{k} * [x_{0}, x_{1}, \dots, x_{k-1}] \tag{1}
\end{aligned}
$$

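The connectivity pattern in Equation (1) can be sketched in a few lines. The text does not define the symbols, so this sketch assumes that $[x_0, x_1, \dots]$ means concatenation and that $*$ is the layer's operation, replaced here by a toy "scale the sum of all inputs" step. The function name `dense_forward` and the scalar weights are hypothetical.

```python
def dense_forward(x0, weights):
    """Compute x_1..x_k where each x_i sees every earlier output.

    x0      -- input feature list x_0
    weights -- one scalar per layer (toy stand-in for w_1..w_k)
    Returns the list [x_0, x_1, ..., x_k].
    """
    outputs = [x0]
    for w in weights:
        # [x_0, x_1, ..., x_{i-1}]: concatenate all earlier outputs.
        concatenated = [value for out in outputs for value in out]
        # Toy replacement for "w_i * [...]": scale the sum of the inputs.
        outputs.append([w * sum(concatenated)])
    return outputs


if __name__ == "__main__":
    for i, x in enumerate(dense_forward([1.0, 2.0], [0.5, 2.0])):
        print(f"x_{i} = {x}")
```

The point of the sketch is the growing input: layer $i$ receives the concatenation of $i$ earlier outputs, so the input to each layer widens with depth.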

$$
\begin{aligned}
w_{1}' &= f(w_{1}, g_{0}) \\
w_{2}' &= f(w_{2}, g_{0}, g_{1}) \\
w_{3}' &= f(w_{3}, g_{0}, g_{1}, g_{2}) \\
&\vdots \\
w_{k}' &= f(w_{k}, g_{0}, g_{1}, g_{2}, \dots, g_{k-1}) \tag{2}
\end{aligned}
$$


$$
\begin{aligned}
x_{k} &= w_{k} * [x_{0}'', x_{1}, \dots, x_{k-1}] \\
x_{T} &= w_{T} * [x_{0}'', x_{1}, \dots, x_{k}] \\
x_{U} &= w_{U} * [x_{0}', x_{T}] \tag{3}
\end{aligned}
$$


$$
\begin{aligned}
w_{k}' &= f(w_{k}, g_{0}'', g_{1}, g_{2}, \dots, g_{k-1}) \\
w_{T}' &= f(w_{T}, g_{0}'', g_{1}, g_{2}, \dots, g_{k}) \\
w_{U}' &= f(w_{U}, g_{0}', g_{T}) \tag{4}
\end{aligned}
$$
