Writing Formulas in LaTeX / Markdown

Here’s the regularized cross-entropy:

$$
\begin{aligned} 
C = -\frac{1}{n} \sum_{xj} \left[ y_j \ln a^L_j+(1-y_j) \ln (1-a^L_j)\right] + \frac{\lambda}{2n} \sum_w w^2. \tag{85}
\end{aligned}
$$


It’s possible to regularize other cost functions, such as the quadratic cost. This can be done in a similar way:

$$
\begin{aligned} 
C = \frac{1}{2n} \sum_x \|y-a^L\|^2 + \frac{\lambda}{2n} \sum_w w^2. \tag{86}
\end{aligned}
$$


In both cases we can write the regularized cost function as

$$
\begin{aligned}
C = C_0 + \frac{\lambda}{2n} \sum_w w^2. \tag{87}
\end{aligned}
$$


where $C_0$ is the original, unregularized cost function.
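As a concrete check, Equation (87) can be evaluated directly. The sketch below is illustrative, not from the original text: it assumes the unregularized cost `c0` has already been computed, and it sums the squared entries of every weight matrix (biases are deliberately excluded from the penalty):

```python
import numpy as np

def regularized_cost(c0, weights, lam, n):
    """Equation (87): add the L2 penalty (lambda / 2n) * sum_w w^2
    to an unregularized cost c0. `weights` is a list of weight
    arrays, one per layer; biases are not penalized."""
    penalty = sum(np.sum(w ** 2) for w in weights)
    return c0 + (lam / (2 * n)) * penalty

# Toy example: two small weight matrices, lambda = 0.1, n = 50 examples.
weights = [np.array([[1.0, -2.0]]), np.array([[0.5], [0.5]])]
print(regularized_cost(1.0, weights, lam=0.1, n=50))
```

The same helper works for either Equation (85) or (86), since only the choice of $C_0$ differs.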

Taking the partial derivatives of Equation (87) gives

$$
\begin{aligned} 
\frac{\partial C}{\partial w} & = \frac{\partial C_0}{\partial w} + \frac{\lambda}{n} w \tag{88}\\ 
\end{aligned}
$$

$$
\begin{aligned} 
\frac{\partial C}{\partial b} & = \frac{\partial C_0}{\partial b}. \tag{89}
\end{aligned}
$$



The $\partial C_0 / \partial w$ and $\partial C_0 / \partial b$ terms can be computed using backpropagation.
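Substituting Equations (88) and (89) into the usual gradient-descent update gives the familiar "weight decay" rule: the weight is first rescaled by $(1 - \eta\lambda/n)$ and then moved along $-\eta\,\partial C_0/\partial w$, while the bias update is unchanged. A minimal sketch, with illustrative values for the learning rate `eta`, `lam`, and `n`:

```python
import numpy as np

def regularized_step(w, b, grad_w, grad_b, eta, lam, n):
    """One gradient-descent step using Equations (88) and (89):
    weights shrink by the factor (1 - eta * lam / n) before the
    ordinary gradient term is subtracted; biases are untouched
    by the regularizer."""
    w_new = (1 - eta * lam / n) * w - eta * grad_w
    b_new = b - eta * grad_b
    return w_new, b_new

w, b = np.array([2.0, -1.0]), np.array([0.5])
grad_w, grad_b = np.array([0.1, 0.1]), np.array([0.2])
w_new, b_new = regularized_step(w, b, grad_w, grad_b, eta=0.5, lam=1.0, n=10)
```

Note that the decay factor depends on $n$: with more training data, the same $\lambda$ shrinks the weights less per step.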

DenseNet: in a dense block, each layer takes the concatenated feature maps of all preceding layers as input:

$$
\begin{aligned}
x_{1} &= w_{1} * x_{0} \\
x_{2} &= w_{2} * [x_{0}, x_{1}] \\ 
\vdots \\
x_{k} &= w_{k} * [x_{0}, x_{1}, \ldots, x_{k-1}] \\ 
\tag{1}
\end{aligned}
$$


$$
\begin{aligned}
w_{1}' &= f(w_{1}, g_{0}) \\
w_{2}' &= f(w_{2}, g_{0}, g_{1}) \\ 
w_{3}' &= f(w_{3}, g_{0}, g_{1}, g_{2}) \\ 
\vdots \\
w_{k}' &= f(w_{k}, g_{0}, g_{1}, g_{2}, \ldots, g_{k-1}) \\ 
\tag{2}
\end{aligned}
$$


$$
\begin{aligned}
x_{k} &= w_{k} * [x_{0}'', x_{1}, \ldots, x_{k-1}] \\
x_{T} &= w_{T} * [x_{0}'', x_{1}, \ldots, x_{k}] \\ 
x_{U} &= w_{U} * [x_{0}', x_{T}] \\ 
\tag{3}
\end{aligned}
$$


$$
\begin{aligned}
w_{k}' &= f(w_{k}, g_{0}'', g_{1}, g_{2}, \ldots, g_{k-1}) \\
w_{T}' &= f(w_{T}, g_{0}'', g_{1}, g_{2}, \ldots, g_{k}) \\ 
w_{U}' &= f(w_{U}, g_{0}', g_{T}) \\ 
\tag{4}
\end{aligned}
$$


