torch.nn.init
-
Return the recommended gain value for the given nonlinearity function. The values are as follows:
nonlinearity     gain
linear           1
conv{1,2,3}d     1
sigmoid          1
tanh             5/3
relu             $\sqrt{2}$
leaky_relu       $\sqrt{2 / (1 + \text{negative\_slope}^2)}$
Parameters: - nonlinearity – the nonlinear function (nn.functional name)
- param – optional parameter for the nonlinear function
Examples
>>> gain = nn.init.calculate_gain('leaky_relu')
torch.nn.init.calculate_gain(nonlinearity, param=None)
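The gain table can be read as a simple lookup. The following is a minimal sketch in plain Python, not the library's implementation: the helper name `recommended_gain` is hypothetical, and the fallback slope of 0.01 used when `param` is omitted for `leaky_relu` is an assumption that the text above does not state.

```python
import math

def recommended_gain(nonlinearity, param=None):
    """Hypothetical helper mirroring the gain table above."""
    unit_gain = {"linear", "conv1d", "conv2d", "conv3d", "sigmoid"}
    if nonlinearity in unit_gain:
        return 1.0
    if nonlinearity == "tanh":
        return 5.0 / 3.0
    if nonlinearity == "relu":
        return math.sqrt(2.0)
    if nonlinearity == "leaky_relu":
        # Assumption: fall back to a slope of 0.01 when param is not given.
        negative_slope = 0.01 if param is None else param
        return math.sqrt(2.0 / (1.0 + negative_slope ** 2))
    raise ValueError("unsupported nonlinearity: {}".format(nonlinearity))

relu_gain = recommended_gain("relu")
leaky_gain = recommended_gain("leaky_relu", 0.2)
```

With a slope of 0 the `leaky_relu` entry reduces to the `relu` entry, which is a quick sanity check on the formula.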
-
Fills the input Tensor or Variable with values drawn from the uniform distribution $U(a, b)$.
Parameters: - tensor – an n-dimensional torch.Tensor or autograd.Variable
- a – the lower bound of the uniform distribution
- b – the upper bound of the uniform distribution
Examples
>>> w = torch.Tensor(3, 5)
>>> nn.init.uniform(w)
torch.nn.init.uniform(tensor, a=0, b=1)
-
Fills the input Tensor or Variable with values drawn from the normal distribution $N(\text{mean}, \text{std})$.
Parameters: - tensor – an n-dimensional torch.Tensor or autograd.Variable
- mean – the mean of the normal distribution
- std – the standard deviation of the normal distribution
Examples
>>> w = torch.Tensor(3, 5)
>>> nn.init.normal(w)
torch.nn.init.normal(tensor, mean=0, std=1)
-
Fills the input Tensor or Variable with the value val.
Parameters: - tensor – an n-dimensional torch.Tensor or autograd.Variable
- val – the value to fill the tensor with
Examples
>>> w = torch.Tensor(3, 5)
>>> nn.init.constant(w, 0.3)
torch.nn.init.constant(tensor, val)
-
Fills the 2-dimensional input Tensor or Variable with the identity matrix. Preserves the identity of the inputs in Linear layers, where as many inputs are preserved as possible.
Parameters: tensor – a 2-dimensional torch.Tensor or autograd.Variable
Examples
>>> w = torch.Tensor(3, 5)
>>> nn.init.eye(w)
torch.nn.init.eye(tensor)
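To make "as many inputs are preserved as possible" concrete, here is a minimal pure-Python sketch for a non-square 3 x 5 weight. The helper names `identity_fill` and `apply_linear` are hypothetical, and nested lists stand in for a tensor; this illustrates the idea and is not the library's implementation.

```python
def identity_fill(rows, cols):
    """Return a rows x cols nested list with ones on the main diagonal."""
    return [[1.0 if i == j else 0.0 for j in range(cols)] for i in range(rows)]

def apply_linear(weight, x):
    """Compute weight @ x for a nested-list weight and a flat input list."""
    return [sum(w * v for w, v in zip(row, x)) for row in weight]

w = identity_fill(3, 5)
y = apply_linear(w, [10.0, 20.0, 30.0, 40.0, 50.0])
# y holds the first three inputs unchanged; the last two inputs are dropped.
```

With three outputs and five inputs, only three inputs can pass through unchanged, which is the sense in which the identity is preserved "as far as possible".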
-
Fills the {3, 4, 5}-dimensional input Tensor or Variable with the Dirac delta function. Preserves the identity of the inputs in Convolutional layers, where as many input channels are preserved as possible.
Parameters: tensor – a {3, 4, 5}-dimensional torch.Tensor or autograd.Variable
Examples
>>> w = torch.Tensor(3, 16, 5, 5)
>>> nn.init.dirac(w)
torch.nn.init.dirac(tensor)
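The Dirac fill can be pictured as placing a single 1 in the kernel that connects output channel d to input channel d. The sketch below is a pure-Python illustration for the 4-dimensional case. It assumes a weight layout of (out_channels, in_channels, height, width) and that the 1 sits at the spatial centre of the kernel; the helper name `dirac_fill` is hypothetical.

```python
def dirac_fill(out_channels, in_channels, height, width):
    """Return a nested-list 4-D weight holding one centred 1 per matched channel."""
    w = [[[[0.0] * width for _ in range(height)]
          for _ in range(in_channels)]
         for _ in range(out_channels)]
    # Only min(out_channels, in_channels) channels can be passed through.
    for d in range(min(out_channels, in_channels)):
        w[d][d][height // 2][width // 2] = 1.0  # assumed centre position
    return w

w = dirac_fill(3, 16, 5, 5)
nonzero = sum(v for oc in w for ic in oc for row in ic for v in row)
```

For the (3, 16, 5, 5) shape used in the example above, three of the sixteen input channels are copied to the three output channels and the rest are ignored.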
-
Fills the input Tensor or Variable with values according to the method described in “Understanding the difficulty of training deep feedforward neural networks” - Glorot, X. & Bengio, Y. (2010), using a uniform distribution. The resulting tensor will have values sampled from $U(-a, a)$ where $a = \text{gain} \times \sqrt{2 / (\text{fan\_in} + \text{fan\_out})} \times \sqrt{3}$. Also known as Glorot initialisation.
Parameters: - tensor – an n-dimensional torch.Tensor or autograd.Variable
- gain – an optional scaling factor
Examples
>>> w = torch.Tensor(3, 5)
>>> nn.init.xavier_uniform(w, gain=nn.init.calculate_gain('relu'))
torch.nn.init.xavier_uniform(tensor, gain=1)
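As a worked instance of the bound formula, the sketch below computes a for a 3 x 5 weight and draws samples with Python's standard `random` module. It assumes that for a 2-dimensional weight fan_in is the number of columns and fan_out the number of rows; the helper name `xavier_uniform_bound` is hypothetical and no library code is called.

```python
import math
import random

def xavier_uniform_bound(fan_in, fan_out, gain=1.0):
    """Bound a of U(-a, a): gain * sqrt(2 / (fan_in + fan_out)) * sqrt(3)."""
    return gain * math.sqrt(2.0 / (fan_in + fan_out)) * math.sqrt(3.0)

rows, cols = 3, 5  # assumed convention: fan_out = rows, fan_in = cols
a = xavier_uniform_bound(fan_in=cols, fan_out=rows)

rng = random.Random(0)
w = [[rng.uniform(-a, a) for _ in range(cols)] for _ in range(rows)]
```

For this shape and a gain of 1 the bound works out to sqrt(3)/2, since 2 / (5 + 3) = 1/4.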
-
Fills the input Tensor or Variable with values according to the method described in “Understanding the difficulty of training deep feedforward neural networks” - Glorot, X. & Bengio, Y. (2010), using a normal distribution. The resulting tensor will have values sampled from $N(0, \text{std})$ where $\text{std} = \text{gain} \times \sqrt{2 / (\text{fan\_in} + \text{fan\_out})}$. Also known as Glorot initialisation.
Parameters: - tensor – an n-dimensional torch.Tensor or autograd.Variable
- gain – an optional scaling factor
Examples
>>> w = torch.Tensor(3, 5)
>>> nn.init.xavier_normal(w)
torch.nn.init.xavier_normal(tensor, gain=1)
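The uniform and normal Glorot variants target the same standard deviation. A short derivation, using only the standard variance of a uniform distribution, shows where the extra factor of $\sqrt{3}$ in the uniform bound comes from:

```latex
\operatorname{Var}\bigl[U(-a, a)\bigr] = \frac{(2a)^2}{12} = \frac{a^2}{3}
\quad\Longrightarrow\quad
a = \sqrt{3} \times \text{std},
\qquad
\text{std} = \text{gain} \times \sqrt{\frac{2}{\text{fan\_in} + \text{fan\_out}}}
```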
-
Fills the input Tensor or Variable with values according to the method described in “Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification” - He, K. et al. (2015), using a uniform distribution. The resulting tensor will have values sampled from $U(-\text{bound}, \text{bound})$ where $\text{bound} = \sqrt{2 / ((1 + a^2) \times \text{fan\_in})} \times \sqrt{3}$. Also known as He initialisation.
Parameters: - tensor – an n-dimensional torch.Tensor or autograd.Variable
- a – the negative slope of the rectifier used after this layer (0 for ReLU by default)
- mode – either ‘fan_in’ (default) or ‘fan_out’. Choosing fan_in preserves the magnitude of the variance of the weights in the forward pass. Choosing fan_out preserves the magnitudes in the backward pass.
Examples
>>> w = torch.Tensor(3, 5)
>>> nn.init.kaiming_uniform(w, mode='fan_in')
torch.nn.init.kaiming_uniform(tensor, a=0, mode='fan_in')
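The effect of `mode` is easiest to see on a convolutional weight, where fan_in and fan_out differ. The sketch below evaluates the bound formula for both modes. It assumes a weight layout of (out_channels, in_channels, *kernel_size), with each fan equal to the channel count times the kernel area; the helper name `kaiming_uniform_bound` is hypothetical.

```python
import math

def kaiming_uniform_bound(shape, a=0.0, mode="fan_in"):
    """Bound of U(-bound, bound): sqrt(2 / ((1 + a^2) * fan)) * sqrt(3)."""
    if mode not in ("fan_in", "fan_out"):
        raise ValueError("mode must be 'fan_in' or 'fan_out'")
    receptive_field = 1
    for k in shape[2:]:
        receptive_field *= k
    # Assumed layout: shape[0] = out_channels, shape[1] = in_channels.
    fan_in = shape[1] * receptive_field
    fan_out = shape[0] * receptive_field
    fan = fan_in if mode == "fan_in" else fan_out
    return math.sqrt(2.0 / ((1.0 + a ** 2) * fan)) * math.sqrt(3.0)

conv_shape = (64, 32, 3, 3)
bound_in = kaiming_uniform_bound(conv_shape, mode="fan_in")    # fan = 32 * 9
bound_out = kaiming_uniform_bound(conv_shape, mode="fan_out")  # fan = 64 * 9
```

Because this layer has twice as many output channels as input channels, the fan_out bound is smaller than the fan_in bound by a factor of sqrt(2).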
-
Fills the input Tensor or Variable with values according to the method described in “Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification” - He, K. et al. (2015), using a normal distribution. The resulting tensor will have values sampled from $N(0, \text{std})$ where $\text{std} = \sqrt{2 / ((1 + a^2) \times \text{fan\_in})}$. Also known as He initialisation.
Parameters: - tensor – an n-dimensional torch.Tensor or autograd.Variable
- a – the negative slope of the rectifier used after this layer (0 for ReLU by default)
- mode – either ‘fan_in’ (default) or ‘fan_out’. Choosing fan_in preserves the magnitude of the variance of the weights in the forward pass. Choosing fan_out preserves the magnitudes in the backward pass.
Examples
>>> w = torch.Tensor(3, 5)
>>> nn.init.kaiming_normal(w, mode='fan_out')
torch.nn.init.kaiming_normal(tensor, a=0, mode='fan_in')
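The claim that fan_in mode preserves magnitudes in the forward pass can be checked empirically. The sketch below is a small experiment in plain Python, assuming a = 0 (plain ReLU), a single fully connected layer, and arbitrary layer sizes and seed. It compares the mean squared activation after the ReLU with the mean squared input.

```python
import math
import random

rng = random.Random(0)
fan_in, num_outputs, a = 256, 1024, 0.0

# std from the formula above, using fan_in mode.
std = math.sqrt(2.0 / ((1.0 + a ** 2) * fan_in))

x = [rng.gauss(0.0, 1.0) for _ in range(fan_in)]
input_power = sum(v * v for v in x) / fan_in

output_power = 0.0
for _ in range(num_outputs):
    row = [rng.gauss(0.0, std) for _ in range(fan_in)]  # one output unit's weights
    pre_activation = sum(w * v for w, v in zip(row, x))
    output_power += max(pre_activation, 0.0) ** 2       # ReLU, then square
output_power /= num_outputs

ratio = output_power / input_power  # expected to be close to 1
```

The factor of 2 in the formula compensates for the ReLU zeroing roughly half of the pre-activations, so the ratio should stay near 1 rather than near 0.5. The exact value depends on the random draw.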
-
Fills the input Tensor or Variable with a (semi) orthogonal matrix, as described in “Exact solutions to the nonlinear dynamics of learning in deep linear neural networks” - Saxe, A. et al. (2013). The input tensor must have at least 2 dimensions, and for tensors with more than 2 dimensions the trailing dimensions are flattened.
Parameters: - tensor – an n-dimensional torch.Tensor or autograd.Variable, where n >= 2
- gain – optional scaling factor
Examples
>>> w = torch.Tensor(3, 5)
>>> nn.init.orthogonal(w)
torch.nn.init.orthogonal(tensor, gain=1)
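A (semi) orthogonal matrix with fewer rows than columns has orthonormal rows, so that W Wᵀ is the identity. The sketch below builds one in plain Python with Gram-Schmidt orthogonalisation of random Gaussian rows. This is a stand-in chosen for readability, and a library implementation would more typically rely on a QR decomposition. The helper name `semi_orthogonal` is hypothetical, and it assumes rows <= cols. A tensor with more than 2 dimensions would first be flattened to (size of first dimension, product of the rest).

```python
import random

def semi_orthogonal(rows, cols, gain=1.0, seed=0):
    """Return a rows x cols nested list with orthonormal rows, scaled by gain."""
    rng = random.Random(seed)
    basis = []
    for _ in range(rows):
        v = [rng.gauss(0.0, 1.0) for _ in range(cols)]
        # Remove the components along the rows already chosen.
        for u in basis:
            dot = sum(p * q for p, q in zip(v, u))
            v = [p - dot * q for p, q in zip(v, u)]
        norm = sum(p * p for p in v) ** 0.5
        basis.append([p / norm for p in v])
    return [[gain * p for p in row] for row in basis]

w = semi_orthogonal(3, 5)
gram = [[sum(p * q for p, q in zip(r1, r2)) for r2 in w] for r1 in w]
# gram approximates the 3 x 3 identity matrix.
```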
-
Fills the 2D input Tensor or Variable as a sparse matrix, where the non-zero elements will be drawn from the normal distribution $N(0, 0.01)$, as described in “Deep learning via Hessian-free optimization” - Martens, J. (2010).
Parameters: - tensor – a 2-dimensional torch.Tensor or autograd.Variable
- sparsity – the fraction of elements in each column to be set to zero
- std – the standard deviation of the normal distribution used to generate the non-zero values
Examples
>>> w = torch.Tensor(3, 5)
>>> nn.init.sparse(w, sparsity=0.1)
torch.nn.init.sparse(tensor, sparsity, std=0.01)
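The column-wise sparsity can be sketched as: draw every entry from the normal distribution, then zero a fixed number of randomly chosen rows in each column. The pure-Python sketch below follows that outline. The helper name `sparse_fill` is hypothetical, and rounding the per-column zero count up with `ceil` is an assumption, since the text above does not say how fractional counts are handled.

```python
import math
import random

def sparse_fill(rows, cols, sparsity, std=0.01, seed=0):
    """Return a rows x cols nested list with zeros placed column by column."""
    rng = random.Random(seed)
    # Assumption: a fractional zero count is rounded up.
    num_zeros = int(math.ceil(rows * sparsity))
    w = [[rng.gauss(0.0, std) for _ in range(cols)] for _ in range(rows)]
    for col in range(cols):
        for row in rng.sample(range(rows), num_zeros):
            w[row][col] = 0.0
    return w

w = sparse_fill(3, 5, sparsity=0.1)
zeros_per_column = [sum(1 for r in range(3) if w[r][c] == 0.0) for c in range(5)]
```

Under the rounding assumption, a sparsity of 0.1 on three rows zeroes one entry per column, so small matrices end up noticeably sparser than the nominal fraction suggests.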