Defining Models in PyTorch by Subclassing nn.Module

Fang Suk

Article link: https://blog.csdn.net/MrR1ght/article/details/105408374

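In PyTorch, the most common way to define a model is to subclass nn.Module and override __init__() and forward(). This post records a few questions that came up along the way:

1. Why can a model be invoked directly as net(x) once forward is overridden?

2. Which network modules must be created in __init__ beforehand?

3. When a network uses several identical layers, which of them can be defined once in __init__ and which must be defined several times?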

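I. The forward function

1. Python's __new__, __init__, and __call__ methods

For background on __new__ and __init__, see this blog post.

(1) __new__: the constructor

1) A class-level method whose first argument must be cls.

2) It must return the newly created instance. If it returns something that is not an instance of cls, __init__ is not called.

(2) __init__: the initializer

1) An instance-level method whose first argument must be self, which is the instance created by __new__.

2) It must not return anything other than None; otherwise Python raises a TypeError.

Python first builds an instance with the class's __new__ method and then initializes that instance with __init__. The following test shows the order.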

class A:
    def __init__(self):
        """Initialize the object. It already exists here (created by __new__)."""
        super().__init__()
        print("__init__")
        print(self)
        # return self  # would raise TypeError: __init__ must return None

    def __new__(cls, *args, **kwargs):
        """Create the new object and return it."""
        print("__new__")
        # print(cls)
        self = super().__new__(cls)
        print(self)
        return self

    def __call__(self):
        """Make instances callable: an object whose class defines __call__ can be invoked with ()."""
        print("run call")
        return "test"


a = A()

Output:


__new__
<__main__.A object at 0x7f5df0f8e550>
__init__
<__main__.A object at 0x7f5df0f8e550>
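The two rules about return values can be checked with a small sketch. The class names ReturnsValue and SkipsInit are hypothetical and exist only for this illustration: the first returns a value from __init__, the second returns a non-instance from __new__.

```python
class ReturnsValue:
    def __init__(self):
        # Returning anything other than None from __init__ is an error.
        return 1


try:
    ReturnsValue()
    raised = False
except TypeError:
    raised = True

print(raised)  # True


class SkipsInit:
    init_ran = False

    def __new__(cls):
        # 42 is not an instance of cls, so Python will not call __init__.
        return 42

    def __init__(self):
        SkipsInit.init_ran = True


obj = SkipsInit()
print(obj, SkipsInit.init_ran)  # 42 False
```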

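(3) __call__

An instance whose class implements __call__ is a callable object and can be invoked like a function.

In the example above, calling the instance directly produces the following output.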

t = a()
print(t)

Output:
run call
test

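__call__ can also accept arguments and return a value. So initialization belongs in __init__, and __call__ lets the object be invoked like a function. The following class imitates the rough flow of a network layer.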

import numpy as np

class MyLinear:
    """A simplified (non-standard) linear layer: y = xW + b."""

    def __init__(self, input_dim, output_dim):
        self.weights = np.random.normal(size=(input_dim, output_dim))
        self.bias = np.zeros(shape=(output_dim,))

    def forward(self, x):
        # x has shape [batch, input_dim]
        print("run forward_fn")
        y = np.dot(x, self.weights) + self.bias
        return y

    def __call__(self, x):
        # Calling the instance delegates to forward
        print("run call_fn")
        y = self.forward(x=x)
        return y

#%%
x = np.random.randn(32,100)
print(x.shape)

net = MyLinear(100,10)
y = net(x)
print(y.shape)

Output:

(32, 100)
run call_fn
run forward_fn
(32, 10)

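The source of Module, the base class of all PyTorch models, shows the same pattern: Module calls forward inside its __call__ method and returns forward's result. The listing below comes from an older PyTorch release. Newer releases restructure this code, but calling the module still runs the registered hooks and forward.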

class Module(object):
    def __call__(self, *input, **kwargs):
        for hook in self._forward_pre_hooks.values():
            result = hook(self, input)
            if result is not None:
                if not isinstance(result, tuple):
                    result = (result,)
                input = result
        if torch._C._get_tracing_state():
            result = self._slow_forward(*input, **kwargs)
        else:
            result = self.forward(*input, **kwargs)  # call forward
        for hook in self._forward_hooks.values():
            hook_result = hook(self, input, result)
            if hook_result is not None:
                result = hook_result
        if len(self._backward_hooks) > 0:
            var = result
            while not isinstance(var, torch.Tensor):
                if isinstance(var, dict):
                    var = next((v for v in var.values() if isinstance(v, torch.Tensor)))
                else:
                    var = var[0]
            grad_fn = var.grad_fn
            if grad_fn is not None:
                for hook in self._backward_hooks.values():
                    wrapper = functools.partial(hook, self)
                    functools.update_wrapper(wrapper, hook)
                    grad_fn.register_hook(wrapper)
        return result
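One practical consequence of the listing above is that net(x) runs the registered hooks while net.forward(x) bypasses them. The sketch below checks this with a forward hook. TinyNet and record_hook are hypothetical names used only for this illustration.

```python
import torch
from torch import nn


class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)

    def forward(self, x):
        return self.fc(x)


net = TinyNet()
calls = []


def record_hook(module, inputs, output):
    # Forward hooks receive the module, its positional inputs, and its output.
    calls.append(tuple(output.shape))


handle = net.register_forward_hook(record_hook)

x = torch.randn(3, 4)
y_call = net(x)             # goes through __call__, so the hook fires
y_forward = net.forward(x)  # bypasses __call__, so the hook does not fire
handle.remove()

print(calls)                             # [(3, 2)]
print(torch.equal(y_call, y_forward))    # True
```

Both calls compute the same tensor, which is why calling forward directly appears to work. Calling the module itself is the safer habit because hooks are only honored on that path.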

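II. Which modules must be created in __init__ when subclassing Module

First, consider the difference between torch.nn.functional.softmax (F.softmax) and torch.nn.Softmax.

F.softmax: a function. It only defines the computation.

nn.Softmax: a network layer, that is, a Module subclass. Besides the computation, a layer class can fix the concrete layer structure (for example, the input and output dimensions of a linear layer) and create and initialize the parameters the layer needs (for example, weight and bias). nn.Softmax itself has no learnable parameters and only stores its dim argument. Fixing the structure and initializing the parameters both happen in __init__, as the implementation of nn.Linear below shows.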

class Linear(Module):
    r"""Applies a linear transformation to the incoming data: :math:`y = xA^T + b`

    Args:
        in_features: size of each input sample
        out_features: size of each output sample
        bias: If set to ``False``, the layer will not learn an additive bias.
            Default: ``True``

    Shape:
        - Input: :math:`(N, *, H_{in})` where :math:`*` means any number of
          additional dimensions and :math:`H_{in} = \text{in\_features}`
        - Output: :math:`(N, *, H_{out})` where all but the last dimension
          are the same shape as the input and :math:`H_{out} = \text{out\_features}`.

    Attributes:
        weight: the learnable weights of the module of shape
            :math:`(\text{out\_features}, \text{in\_features})`. The values are
            initialized from :math:`\mathcal{U}(-\sqrt{k}, \sqrt{k})`, where
            :math:`k = \frac{1}{\text{in\_features}}`
        bias:   the learnable bias of the module of shape :math:`(\text{out\_features})`.
                If :attr:`bias` is ``True``, the values are initialized from
                :math:`\mathcal{U}(-\sqrt{k}, \sqrt{k})` where
                :math:`k = \frac{1}{\text{in\_features}}`

    Examples::

        >>> m = nn.Linear(20, 30)
        >>> input = torch.randn(128, 20)
        >>> output = m(input)
        >>> print(output.size())
        torch.Size([128, 30])
    """
    __constants__ = ['bias', 'in_features', 'out_features']

    def __init__(self, in_features, out_features, bias=True):
        super(Linear, self).__init__()
        self.in_features = in_features
        self.out_features = out_features
        self.weight = Parameter(torch.Tensor(out_features, in_features))
        if bias:
            self.bias = Parameter(torch.Tensor(out_features))
        else:
            self.register_parameter('bias', None)
        self.reset_parameters()

    def reset_parameters(self):
        init.kaiming_uniform_(self.weight, a=math.sqrt(5))
        if self.bias is not None:
            fan_in, _ = init._calculate_fan_in_and_fan_out(self.weight)
            bound = 1 / math.sqrt(fan_in)
            init.uniform_(self.bias, -bound, bound)

    def forward(self, input):
        return F.linear(input, self.weight, self.bias)

    def extra_repr(self):
        return 'in_features={}, out_features={}, bias={}'.format(
            self.in_features, self.out_features, self.bias is not None
        )

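Look at what __init__ does here: (1) it stores the arguments in_features and out_features, which fix the concrete layer structure; (2) it calls reset_parameters to initialize the model parameters. These are two different kinds of "parameters": the first specifies the model structure and the second is what the model learns. To tell them apart, the rest of this post calls the first kind arguments and the second kind parameters.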
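The distinction can be inspected on a real nn.Linear instance. This is a minimal sketch, assuming a standard PyTorch installation, and the variable names are chosen for illustration.

```python
from torch import nn

layer = nn.Linear(20, 30)

# Structural arguments are stored as plain Python attributes.
print(layer.in_features, layer.out_features)  # 20 30

# Learnable parameters are nn.Parameter tensors registered on the module.
param_shapes = {name: tuple(p.shape) for name, p in layer.named_parameters()}
print(param_shapes)  # {'weight': (30, 20), 'bias': (30,)}

# With bias=False the bias slot is registered as None and is not a parameter.
no_bias = nn.Linear(20, 30, bias=False)
no_bias_names = [name for name, _ in no_bias.named_parameters()]
print(no_bias.bias is None, no_bias_names)  # True ['weight']
```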

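Once it is clear what __init__ does, it is also clear when a module has to go into __init__. Any module that needs arguments to fix its structure (for example, a linear layer needs its input and output dimensions, while softmax has nothing to learn and can be called as F.softmax inside forward), or that contains parameters to be learned, should be created in __init__. Assigning the module to an attribute in __init__ is also what registers its parameters with the parent model, so that model.parameters() and the optimizer can see them.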
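A minimal sketch of this rule, assuming a small classifier (the class name Classifier and the sizes are hypothetical): the linear layer is created in __init__ because it has structure and parameters, while softmax is applied functionally in forward.

```python
import torch
from torch import nn
import torch.nn.functional as F


class Classifier(nn.Module):
    def __init__(self, in_dim, num_classes):
        super().__init__()
        # Has structural arguments and learnable parameters: create it here.
        self.fc = nn.Linear(in_dim, num_classes)

    def forward(self, x):
        # Softmax has nothing to learn, so the functional form is enough.
        return F.softmax(self.fc(x), dim=-1)


model = Classifier(100, 10)
x = torch.randn(32, 100)
probs = model(x)

names = [name for name, _ in model.named_parameters()]
print(probs.shape)  # torch.Size([32, 10])
print(names)        # ['fc.weight', 'fc.bias']

# The module form nn.Softmax computes the same result as F.softmax.
same = torch.allclose(nn.Softmax(dim=-1)(model.fc(x)), probs)
print(same)         # True
```

Writing self.softmax = nn.Softmax(dim=-1) in __init__ would work equally well; the choice between the two forms is a matter of style because no parameters are involved.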

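III. Which reused layers need multiple instances and which do not

First, layer reuse. To reuse a module, define it once in __init__ and call it several times in forward. Reuse here means sharing that layer's parameters.

If the layers should not share parameters, define a separate layer in __init__ for each use. Whether to reuse is a question about model parameters only, so it does not arise for layers without parameters. MaxPool1d is an example: it has no parameters and only defines how the pooling is computed.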
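The difference shows up directly in the parameter count. In the sketch below, the class names Shared and Separate and the helper count_params are hypothetical names for this illustration.

```python
from torch import nn


class Shared(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(8, 8)  # defined once

    def forward(self, x):
        # The same weights are applied twice.
        return self.fc(self.fc(x))


class Separate(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(8, 8)  # two independent sets of weights
        self.fc2 = nn.Linear(8, 8)

    def forward(self, x):
        return self.fc2(self.fc1(x))


def count_params(module):
    return sum(p.numel() for p in module.parameters())


shared_count = count_params(Shared())      # 8*8 weights + 8 biases
separate_count = count_params(Separate())  # twice as many
pool_count = count_params(nn.MaxPool1d(2))

print(shared_count, separate_count, pool_count)  # 72 144 0
```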

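Summary

Network layers can be grouped by whether they need arguments to fix their structure (such as the kernel size) and whether they hold trainable parameters (such as weights):

Type	How to define
Needs arguments, has parameters	Create in __init__; define it several times if the layers should not share parameters
Needs arguments, no parameters	Create in __init__; one definition is enough
No arguments, no parameters	May be left out of __init__; if defined, one definition is enough

Anything that needs arguments or holds parameters should be initialized in __init__.

Reuse concerns parameters. Layers without parameters have no reuse problem: define them once and use them as often as needed. Layers with parameters that should not be shared must be defined once per use.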
