MDNet: A Semantically and Visually Interpretable Medical Image Diagnosis Network
ResNet
$$y_{l}=F_{l}(x_{l})+h(x_{l}) \tag{1}$$
$$x_{l+1}=f(y_{l}) \tag{2}$$
where $f$ and $h$ are identity mappings, i.e.
$$x_{l+1}=y_{l}=x_{l}+F_{l}(x_{l}) \tag{3}$$
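Eq. (3) says that with identity mappings for both $f$ and $h$, a block simply adds its residual branch to its input. A minimal NumPy sketch (the ReLU-then-linear $F_l$ is an arbitrary stand-in, not the paper's actual conv block):

```python
import numpy as np

rng = np.random.default_rng(0)

def residual_block(x, W):
    # Identity-mapping residual block, Eq. (3): x_{l+1} = x_l + F_l(x_l).
    # The residual branch F_l here is a toy ReLU-then-linear map (an
    # assumption for illustration, not the paper's conv block).
    return x + np.maximum(x, 0.0) @ W

x = rng.standard_normal(4)
W = 0.1 * rng.standard_normal((4, 4))
out = residual_block(x, W)

# The identity path passes x through unchanged; only F_l(x) is added on top.
assert np.allclose(out - x, np.maximum(x, 0.0) @ W)
```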
where $x_{l}=x_{l-1}+F_{l-1}(x_{l-1})$, i.e.,
$$x_{l+1}=x_{l-1}+F_{l}(x_{l})+F_{l-1}(x_{l-1}) \tag{4}$$
It follows that
$$x_{L}=x_{l}+\sum_{i=l}^{L-1}F_{i}(x_{i}) \tag{5}$$
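Unrolling the recursion makes Eq. (5) a plain sum of residual-branch outputs, which is easy to check numerically (the toy $F_i$ below are assumed stand-ins):

```python
import numpy as np

rng = np.random.default_rng(1)
L = 5
Ws = [0.1 * rng.standard_normal((4, 4)) for _ in range(L)]

def F(i, x):
    # Toy residual branch F_i (stand-in for the real conv block).
    return np.maximum(x, 0.0) @ Ws[i]

x = rng.standard_normal(4)
xs = [x]                                  # xs[i] holds x_i
residuals = []
for i in range(L):
    residuals.append(F(i, xs[-1]))
    xs.append(xs[-1] + residuals[-1])     # x_{i+1} = x_i + F_i(x_i)

# Eq. (5): x_L = x_l + sum_{i=l}^{L-1} F_i(x_i), checked here for l = 0
assert np.allclose(xs[L], xs[0] + sum(residuals))
```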
Backpropagation:
$$\frac{\partial\mathcal{L}}{\partial x_{l}}=\frac{\partial\mathcal{L}}{\partial x_{L}}\frac{\partial x_{L}}{\partial x_{l}}=\frac{\partial\mathcal{L}}{\partial x_{L}}\left(1+\frac{\partial}{\partial x_{l}}\sum_{i=l}^{L-1}F_{i}(x_{i})\right) \tag{6}$$
Decouple ensemble network outputs
$$p^{c}=\sum_{k}w_{k}^{c}\cdot \sum_{i,j}y_{L}^{(k)}(i,j) \tag{7}$$
where $p^{c}$ is the output score for class $c$ and $(i,j)$ indexes spatial locations. $\boldsymbol{w}^{c}=[w_{1}^{c},\cdots,w_{k}^{c},\cdots]^{T}$ is the $c$-th column of the fully connected layer's weight matrix, and $y_{L}^{(k)}$ is the $k$-th feature map of the last residual block.
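Eq. (7) is just global sum pooling over each feature map followed by a fully connected layer; a sketch with random tensors (the shapes are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)
K, H, W, C = 8, 7, 7, 3                 # feature maps, spatial size, classes (assumed)
y_L = rng.standard_normal((K, H, W))    # y_L: feature maps of the last residual block
w = rng.standard_normal((K, C))         # FC weights; column c is w^c

# Eq. (7): p^c = sum_k w_k^c * sum_{i,j} y_L^{(k)}(i, j)
p = w.T @ y_L.sum(axis=(1, 2))

# The same computation viewed as global sum pooling then a linear layer
assert np.allclose(p, y_L.sum(axis=(1, 2)) @ w)
```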
Substituting Eq. (5) into Eq. (7) gives:
$$p^{c}=\sum_{i,j}\boldsymbol{w}^{c}\cdot y_{L}=\sum_{i,j}\boldsymbol{w}^{c}\left(y_{1}+\sum_{m=1}^{L-1}F_{m}(y_{m})\right) \tag{8}$$
The paper points out a drawback of ResNet: "Using a single weighting function in the classification module is suboptimal in this situation. This is because the outputs of all ensembles share classifiers such that the importance of their individual features are undermined."
To address this issue, they propose to decouple the ensemble outputs and apply classifiers to them individually by using:
$$p^{c}=\sum_{i,j}\left(\boldsymbol{w}_{1}^{c}\cdot y_{1}+\sum_{m=1}^{L-1}\boldsymbol{w}_{m+1}^{c}\cdot F_{m}(y_{m})\right) \tag{9}$$
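The decoupling in Eq. (9) gives each ensemble output its own classifier weights instead of one shared $\boldsymbol{w}^{c}$. A sketch, with the ensemble outputs $F_m(y_m)$ replaced by random tensors (assumed shapes):

```python
import numpy as np

rng = np.random.default_rng(3)
L, K, H, W, C = 4, 8, 7, 7, 3
y1 = rng.standard_normal((K, H, W))                          # first block's output
Fs = [rng.standard_normal((K, H, W)) for _ in range(L - 1)]  # stand-ins for F_m(y_m)
ws = rng.standard_normal((L, K, C))                          # one classifier per ensemble member

# Eq. (9): each ensemble output is pooled and classified by its own weights
p = y1.sum(axis=(1, 2)) @ ws[0]
for m, Fm in enumerate(Fs, start=1):
    p = p + Fm.sum(axis=(1, 2)) @ ws[m]

# Equivalent vectorized form over all L pooled ensemble outputs
pooled = np.stack([y1.sum(axis=(1, 2))] + [F.sum(axis=(1, 2)) for F in Fs])
assert np.allclose(p, np.einsum('lk,lkc->c', pooled, ws))
```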
They also propose a new skip connection, defined as:
$$y_{l+1}=F_{l}(y_{l})\otimes y_{l} \tag{10}$$
where $\otimes$ is the concatenation operation. They call this skip-connection scheme ensemble-connection.
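Because Eq. (10) concatenates along the channel axis instead of adding, every earlier output is preserved and the channel count grows with depth. A sketch with a toy residual branch (the fixed channel mix is my assumption, not the paper's block):

```python
import numpy as np

rng = np.random.default_rng(4)

def F(y, out_ch=4):
    # Toy residual branch: ReLU then a fixed channel mix (an assumption;
    # the real F_l is a conv block). Always emits out_ch channels.
    mix = np.ones((y.shape[0], out_ch)) / y.shape[0]
    return np.einsum('chw,co->ohw', np.maximum(y, 0.0), mix)

y = rng.standard_normal((4, 7, 7))
for _ in range(3):
    # Eq. (10): y_{l+1} = F_l(y_l) ⊗ y_l, with ⊗ = channel concatenation
    y = np.concatenate([F(y), y], axis=0)

# Unlike addition, concatenation keeps every earlier output: 4 -> 8 -> 12 -> 16 channels
assert y.shape == (16, 7, 7)
```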
Language model
$$\log p(\boldsymbol{x}_{0:T}\mid I;\theta_{L})=\sum_{t=0}^{T}\log p(\boldsymbol{x}_{t}\mid I,\boldsymbol{x}_{0:t-1}; \theta_{L}) \tag{11}$$
where $\{\boldsymbol{x}_{0},\cdots,\boldsymbol{x}_{T}\}$ are the words of a sentence and $\theta_{L}$ are the parameters of the LSTM.
$$\boldsymbol{h}_{t}=\mathrm{LSTM}(E\boldsymbol{x}_{t-1},\boldsymbol{h}_{t-1},\boldsymbol{z}_{t}) \tag{12}$$
where $E$ is the word embedding matrix and $\boldsymbol{z}_{t}$ is a context vector.
$$\boldsymbol{z}_{t}=\boldsymbol{a}_{t}\,\mathcal{C}(I)^{T} \tag{13}$$
where $\mathcal{C}(I)$ denotes the convolutional feature maps produced by the image model (i.e., $y_{L}$).
$$\boldsymbol{a}_{t}=\mathrm{softmax}\left(W_{att}\tanh(W_{h}\boldsymbol{h}_{t-1}+\boldsymbol{c})\right),\qquad \boldsymbol{c}=(\boldsymbol{w}^{c})^{T}\mathcal{C}(I) \tag{14}$$
where $W_{att}$ and $W_{h}$ are learned embedding matrices, and $\boldsymbol{c}$ is the convolutional feature embedding obtained through $\boldsymbol{w}^{c}$.
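Eqs. (13)–(14) form a standard soft-attention step: score each spatial location, normalize with softmax, then take the weighted sum of features. A sketch with assumed shapes (flattening the $(i,j)$ grid into $N$ locations is my convention):

```python
import numpy as np

rng = np.random.default_rng(5)
K, N, D = 8, 49, 16                  # feature maps, spatial locations, hidden size (assumed)
C_I = rng.standard_normal((K, N))    # C(I): conv feature maps, (i, j) flattened into N
w_c = rng.standard_normal(K)         # classifier column w^c
W_h = rng.standard_normal((N, D))    # learned embedding matrices
W_att = rng.standard_normal((N, N))
h_prev = rng.standard_normal(D)      # previous LSTM hidden state h_{t-1}

def softmax(v):
    e = np.exp(v - v.max())
    return e / e.sum()

# Eq. (14): c = (w^c)^T C(I), then attention scores over the N locations
c = w_c @ C_I
a_t = softmax(W_att @ np.tanh(W_h @ h_prev + c))

# Eq. (13): context vector z_t = a_t C(I)^T, a weighted sum over locations
z_t = C_I @ a_t
assert np.isclose(a_t.sum(), 1.0) and z_t.shape == (K,)
```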