Transformer series
gMLP is a Google architecture that uses only MLPs to approach Transformer performance: in their experiments it matches Transformer accuracy across a range of tasks while using fewer parameters.
gMLP paper: 【链接】
Code covered in this walkthrough: 【链接】
The gMLP architecture
class gMLP(nn.Module):
    def __init__(
        self,
        *,
        ...
    ):
        super().__init__()
        dim_ff = dim * ff_mult
        self.seq_len = seq_len
        self.prob_survival = prob_survival

        # token embedding; skipped (Identity) when num_tokens is None,
        # e.g. when the input is already a feature sequence
        self.to_embed = nn.Embedding(num_tokens, dim) if exists(num_tokens) else nn.Identity()

        # each layer computes gmlp(norm(x)) + x
        self.layers = nn.ModuleList([
            Residual(PreNorm(dim, gMLPBlock(
                dim = dim,
                dim_ff = dim_ff,
                seq_len = seq_len,
                attn_dim = attn_dim,
                causal = causal,
                act = act
            ))) for i in range(depth)
        ])

        # final LayerNorm + projection to vocabulary logits
        self.to_logits = nn.Sequential(
            nn.LayerNorm(dim),
            nn.Linear(dim, num_tokens)
        ) if exists(num_tokens) else nn.Identity()

    def forward(self, x):
        x = self.to_embed(x)
        # during training, whole layers are randomly dropped with survival
        # probability prob_survival (stochastic depth); at inference all layers run
        layers = self.layers if not self.training else dropout_layers(self.layers, self.prob_survival)
        out = nn.Sequential(*layers)(x)
        return self.to_logits(out)
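The `gMLPBlock` used above is defined elsewhere in the repo. Its core idea in the paper is a spatial gating unit (SGU): split the hidden channels in half, normalize one half, mix it across the sequence dimension with a learned linear map, and use the result to gate the other half. The sketch below is a minimal standalone version under that reading; the names `SpatialGatingUnit` and `TinyGMLPBlock` are mine, not from the repo.

```python
import torch
from torch import nn

class SpatialGatingUnit(nn.Module):
    # minimal sketch of the SGU from the gMLP paper: split channels,
    # mix one half along the sequence axis, gate the other half with it
    def __init__(self, dim_ff, seq_len):
        super().__init__()
        self.norm = nn.LayerNorm(dim_ff // 2)
        self.spatial_proj = nn.Linear(seq_len, seq_len)
        # the paper initializes the spatial weights near 0 and the bias to 1,
        # so the SGU starts out close to an identity mapping
        nn.init.zeros_(self.spatial_proj.weight)
        nn.init.ones_(self.spatial_proj.bias)

    def forward(self, x):                 # x: (batch, seq_len, dim_ff)
        res, gate = x.chunk(2, dim=-1)    # split channels in half
        gate = self.norm(gate)
        # apply the linear map along the sequence dimension
        gate = self.spatial_proj(gate.transpose(1, 2)).transpose(1, 2)
        return res * gate                 # elementwise gating

class TinyGMLPBlock(nn.Module):
    # channel projection in -> SGU -> channel projection out,
    # matching the overall block shape described in the paper
    def __init__(self, dim, dim_ff, seq_len):
        super().__init__()
        self.proj_in = nn.Sequential(nn.Linear(dim, dim_ff), nn.GELU())
        self.sgu = SpatialGatingUnit(dim_ff, seq_len)
        self.proj_out = nn.Linear(dim_ff // 2, dim)

    def forward(self, x):
        return self.proj_out(self.sgu(self.proj_in(x)))

x = torch.randn(2, 16, 32)                # (batch, seq_len, dim)
block = TinyGMLPBlock(dim=32, dim_ff=128, seq_len=16)
out = block(x)
print(out.shape)                          # torch.Size([2, 16, 32])
```

Note that the block is shape-preserving, which is what lets `Residual(PreNorm(...))` wrap it above: the output can be added directly back to the input.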