PyTorch Parameter Initialization Methods

torch.nn.init
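
All of these functions modify the given tensor in place. A common pattern is to drive them from Module.apply to initialize a whole network; the sketch below is illustrative and not part of the official docs (note that PyTorch 0.4+ renames these functions with a trailing underscore, e.g. nn.init.xavier_uniform_):

>>> import torch
>>> import torch.nn as nn
>>> def init_weights(m):
...     if isinstance(m, nn.Linear):
...         nn.init.xavier_uniform(m.weight)
...         nn.init.constant(m.bias, 0)
>>> net = nn.Sequential(nn.Linear(10, 20), nn.ReLU(), nn.Linear(20, 1))
>>> net.apply(init_weights)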

torch.nn.init.calculate_gain(nonlinearity, param=None) [source]

Return the recommended gain value for the given nonlinearity function. The values are as follows:

nonlinearity   gain
linear         1
conv{1,2,3}d   1
sigmoid        1
tanh           5/3
relu           √2
leaky_relu     √(2 / (1 + negative_slope²))
Parameters:
  • nonlinearity – the nonlinear function (nn.functional name)
  • param – optional parameter for the nonlinear function

Examples

>>> gain = nn.init.calculate_gain('leaky_relu')
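
Passing param supplies the negative slope for leaky_relu; the result can be checked against the formula in the table above (a quick verification sketch; the slope value 0.2 is an illustrative assumption):

>>> gain = nn.init.calculate_gain('leaky_relu', 0.2)
>>> gain  # √(2 / (1 + 0.2²)) ≈ 1.3867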
torch.nn.init.uniform(tensor, a=0, b=1) [source]

Fills the input Tensor or Variable with values drawn from the uniform distribution U(a, b).

Parameters:
  • tensor – an n-dimensional torch.Tensor or autograd.Variable
  • a – the lower bound of the uniform distribution
  • b – the upper bound of the uniform distribution

Examples

>>> w = torch.Tensor(3, 5)
>>> nn.init.uniform(w)
torch.nn.init.normal(tensor, mean=0, std=1) [source]

Fills the input Tensor or Variable with values drawn from the normal distribution N(mean, std).

Parameters:
  • tensor – an n-dimensional torch.Tensor or autograd.Variable
  • mean – the mean of the normal distribution
  • std – the standard deviation of the normal distribution

Examples

>>> w = torch.Tensor(3, 5)
>>> nn.init.normal(w)
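
The defaults can be overridden; drawing from N(0, 0.02) as below is an arbitrary illustrative choice (though a common convention in the GAN literature), not something prescribed by the docs:

>>> w = torch.Tensor(3, 5)
>>> nn.init.normal(w, mean=0, std=0.02)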
torch.nn.init.constant(tensor, val) [source]

Fills the input Tensor or Variable with the value val.

Parameters:
  • tensor – an n-dimensional torch.Tensor or autograd.Variable
  • val – the value to fill the tensor with

Examples

>>> w = torch.Tensor(3, 5)
>>> nn.init.constant(w, 0.3)
torch.nn.init.eye(tensor) [source]

Fills the 2-dimensional input Tensor or Variable with the identity matrix. Preserves the identity of the inputs in Linear layers, where as many inputs are preserved as possible.

Parameters: tensor – a 2-dimensional torch.Tensor or autograd.Variable

Examples

>>> w = torch.Tensor(3, 5)
>>> nn.init.eye(w)
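
The identity-preserving effect can be observed on a Linear layer (a minimal sketch; the 5-in/3-out shape is an illustrative assumption):

>>> m = nn.Linear(5, 3, bias=False)
>>> nn.init.eye(m.weight)
>>> x = torch.autograd.Variable(torch.randn(1, 5))
>>> m(x)  # equals x[:, :3]: the first three inputs pass through unchanged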
torch.nn.init.dirac(tensor) [source]

Fills the {3, 4, 5}-dimensional input Tensor or Variable with the Dirac delta function. Preserves the identity of the inputs in Convolutional layers, where as many input channels are preserved as possible.

Parameters: tensor – a {3, 4, 5}-dimensional torch.Tensor or autograd.Variable

Examples

>>> w = torch.Tensor(3, 16, 5, 5)
>>> nn.init.dirac(w)
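
With matching channel counts and 'same' padding, a dirac-initialized kernel makes the convolution act as the identity on its input (a minimal sketch with illustrative sizes):

>>> conv = nn.Conv2d(16, 16, 3, padding=1, bias=False)
>>> nn.init.dirac(conv.weight)
>>> x = torch.autograd.Variable(torch.randn(1, 16, 8, 8))
>>> (conv(x) - x).abs().max()  # ≈ 0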
torch.nn.init.xavier_uniform(tensor, gain=1) [source]

Fills the input Tensor or Variable with values according to the method described in “Understanding the difficulty of training deep feedforward neural networks” - Glorot, X. & Bengio, Y. (2010), using a uniform distribution. The resulting tensor will have values sampled from U(−a, a) where a = gain × √(2 / (fan_in + fan_out)) × √3. Also known as Glorot initialisation.

Parameters:
  • tensor – an n-dimensional torch.Tensor or autograd.Variable
  • gain – an optional scaling factor

Examples

>>> w = torch.Tensor(3, 5)
>>> nn.init.xavier_uniform(w, gain=nn.init.calculate_gain('relu'))
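
The bound can be verified by hand: for a 3×5 weight, fan_in = 5 and fan_out = 3, so with gain = 1 the samples lie within ±√(2/8) × √3 ≈ ±0.866 (a verification sketch, not from the official docs):

>>> import math
>>> w = torch.Tensor(3, 5)
>>> nn.init.xavier_uniform(w)
>>> w.abs().max() <= math.sqrt(2.0 / (5 + 3)) * math.sqrt(3)  # bound ≈ 0.866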
torch.nn.init.xavier_normal(tensor, gain=1) [source]

Fills the input Tensor or Variable with values according to the method described in “Understanding the difficulty of training deep feedforward neural networks” - Glorot, X. & Bengio, Y. (2010), using a normal distribution. The resulting tensor will have values sampled from N(0, std) where std = gain × √(2 / (fan_in + fan_out)). Also known as Glorot initialisation.

Parameters:
  • tensor – an n-dimensional torch.Tensor or autograd.Variable
  • gain – an optional scaling factor

Examples

>>> w = torch.Tensor(3, 5)
>>> nn.init.xavier_normal(w)
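
The standard deviation is easiest to confirm on a larger tensor, where the sample estimate is stable (the 200×100 shape is an illustrative assumption):

>>> w = torch.Tensor(200, 100)
>>> nn.init.xavier_normal(w)
>>> w.std()  # ≈ √(2 / (100 + 200)) ≈ 0.0816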
torch.nn.init.kaiming_uniform(tensor, a=0, mode='fan_in') [source]

Fills the input Tensor or Variable with values according to the method described in “Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification” - He, K. et al. (2015), using a uniform distribution. The resulting tensor will have values sampled from U(−bound, bound) where bound = √(2 / ((1 + a²) × fan_in)) × √3. Also known as He initialisation.

Parameters:
  • tensor – an n-dimensional torch.Tensor or autograd.Variable
  • a – the negative slope of the rectifier used after this layer (0 for ReLU by default)
  • mode – either ‘fan_in’ (default) or ‘fan_out’. Choosing fan_in preserves the magnitude of the variance of the weights in the forward pass; choosing fan_out preserves the magnitudes in the backward pass.

Examples

>>> w = torch.Tensor(3, 5)
>>> nn.init.kaiming_uniform(w, mode='fan_in')
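
As with the Xavier variant, the bound follows directly from the formula: for a 3×5 weight with a = 0 and mode='fan_in', bound = √(2/5) × √3 ≈ 1.095 (a verification sketch):

>>> import math
>>> w = torch.Tensor(3, 5)
>>> nn.init.kaiming_uniform(w, mode='fan_in')
>>> w.abs().max() <= math.sqrt(2.0 / 5) * math.sqrt(3)  # bound ≈ 1.095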
torch.nn.init.kaiming_normal(tensor, a=0, mode='fan_in') [source]

Fills the input Tensor or Variable with values according to the method described in “Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification” - He, K. et al. (2015), using a normal distribution. The resulting tensor will have values sampled from N(0, std) where std = √(2 / ((1 + a²) × fan_in)). Also known as He initialisation.

Parameters:
  • tensor – an n-dimensional torch.Tensor or autograd.Variable
  • a – the negative slope of the rectifier used after this layer (0 for ReLU by default)
  • mode – either ‘fan_in’ (default) or ‘fan_out’. Choosing fan_in preserves the magnitude of the variance of the weights in the forward pass; choosing fan_out preserves the magnitudes in the backward pass.

Examples

>>> w = torch.Tensor(3, 5)
>>> nn.init.kaiming_normal(w, mode='fan_out')
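
Choosing mode='fan_out' swaps which dimension enters the formula: for a 200×100 weight, fan_out = 200, so std = √(2/200) ≈ 0.1 (a verification sketch with an illustrative shape):

>>> w = torch.Tensor(200, 100)
>>> nn.init.kaiming_normal(w, mode='fan_out')
>>> w.std()  # ≈ √(2 / 200) = 0.1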
torch.nn.init.orthogonal(tensor, gain=1) [source]

Fills the input Tensor or Variable with a (semi) orthogonal matrix, as described in “Exact solutions to the nonlinear dynamics of learning in deep linear neural networks” - Saxe, A. et al. (2013). The input tensor must have at least 2 dimensions, and for tensors with more than 2 dimensions the trailing dimensions are flattened.

Parameters:
  • tensor – an n-dimensional torch.Tensor or autograd.Variable, where n >= 2
  • gain – optional scaling factor

Examples

>>> w = torch.Tensor(3, 5)
>>> nn.init.orthogonal(w)
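
Semi-orthogonality is easy to check: for a 3×5 result the rows are orthonormal, so w · wᵀ recovers the 3×3 identity (a verification sketch):

>>> w = torch.Tensor(3, 5)
>>> nn.init.orthogonal(w)
>>> w.mm(w.t())  # ≈ 3×3 identity, up to floating-point error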
torch.nn.init.sparse(tensor, sparsity, std=0.01) [source]

Fills the 2D input Tensor or Variable as a sparse matrix, where the non-zero elements will be drawn from the normal distribution N(0, 0.01), as described in “Deep learning via Hessian-free optimization” - Martens, J. (2010).

Parameters:
  • tensor – a 2-dimensional torch.Tensor or autograd.Variable
  • sparsity – the fraction of elements in each column to be set to zero
  • std – the standard deviation of the normal distribution used to generate the non-zero values

Examples

>>> w = torch.Tensor(3, 5)
>>> nn.init.sparse(w, sparsity=0.1)
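
The sparsity applies per column: with 10 rows and sparsity=0.1, each column receives ceil(0.1 × 10) = 1 zero entry (a verification sketch; the 10×4 shape is an illustrative assumption):

>>> w = torch.Tensor(10, 4)
>>> nn.init.sparse(w, sparsity=0.1)
>>> (w == 0).sum(0)  # one zero per column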