Torch matrix operations: nn.View(-1):setNumInputDims(3)

Article link: https://blog.csdn.net/yiqingyang2012/article/details/54627263

1. SpatialConvolution

require('nn')
require('sys')

local model = nn.Sequential()

x = torch.Tensor(1, 5, 8):fill(1)

model:add(nn.Identity())
model:add(nn.SpatialConvolution(1, 2, 2, 3))
y = model:forward(x)
print(y)

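The returned tensor has size 2 x (5-3+1) x (8-2+1), i.e. 2x3x7: two output planes, and each spatial axis shrinks by the kernel size minus one (kernel width 2, kernel height 3).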
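The size arithmetic above can be checked with a small helper. This is a minimal sketch in plain Python, not a call into Torch; the function name conv_output_size is made up for illustration, and the formula assumes floor division with optional stride and padding.

```python
def conv_output_size(in_size, kernel, stride=1, pad=0):
    """Length of one output axis of a convolution (hypothetical helper)."""
    return (in_size + 2 * pad - kernel) // stride + 1

# Input: 1 plane x 5 rows x 8 columns; kernel is 2 wide (kW) and 3 high (kH).
n_output_planes = 2
out_h = conv_output_size(5, kernel=3)
out_w = conv_output_size(8, kernel=2)
print((n_output_planes, out_h, out_w))  # (2, 3, 7)
```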

2. When querying the size of each dimension of a tensor, the index starts at 1, not 0

x = torch.Tensor(2,3):fill(1)
x:size(1)  -- returns the length of the 1st dimension: 2
x:size(2)  -- returns the length of the 2nd dimension: 3

3. linspace(a,b,N)

linspace(a,b,N)

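Returns an array of N evenly spaced values from a to b, where a is the start point and b is the end point. The spacing between neighbouring values is (b-a)/(N-1).

The one-dimensional array [1,2,3,4,5] has shape (5,).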
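To make the spacing rule concrete, here is a minimal pure-Python sketch. It is a stand-in written for these notes, not the library implementation, and it assumes both endpoints are included.

```python
def linspace(a, b, n):
    """Return n evenly spaced values from a to b, endpoints included."""
    if n == 1:
        return [float(a)]
    step = (b - a) / (n - 1)  # spacing is (b - a) / (N - 1)
    return [a + i * step for i in range(n)]

values = linspace(1, 5, 5)
print(values)       # [1.0, 2.0, 3.0, 4.0, 5.0]
print(len(values))  # 5, matching the shape (5,)
```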

4.arr.transpose

arr.transpose((1,0,2))

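Transposes (permutes the axes of) a three-dimensional array. Axes are numbered from 0 here. In (1,0,2), the leading 1 means the first axis of the result takes its size from the second axis of the original; the 0 means the second axis of the result takes its size from the first axis of the original; the 2 leaves the third axis where it was.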
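The axis swap can be illustrated on nested lists. This is only a sketch in plain Python; the helper names transpose_102 and shape are invented for this example and are not part of NumPy or Torch.

```python
def transpose_102(arr):
    """Swap the first two axes of a nested list with shape (d0, d1, d2)."""
    d0, d1 = len(arr), len(arr[0])
    return [[arr[i][j] for i in range(d0)] for j in range(d1)]

def shape(arr):
    """Shape of a regular nested list."""
    dims = []
    while isinstance(arr, list):
        dims.append(len(arr))
        arr = arr[0]
    return tuple(dims)

x = [[[0, 1, 2], [3, 4, 5]]]  # shape (1, 2, 3)
y = transpose_102(x)
print(shape(x), shape(y))     # (1, 2, 3) (2, 1, 3)
```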

5. torch.Storage

torch.Storage

A one-dimensional contiguous array. Every Tensor is a view onto a Storage.

6. torch.ones(*sizes, out=None) → Tensor

>>> torch.ones(2, 3)

 1  1  1
 1  1  1
[torch.FloatTensor of size 2x3]

>>> torch.ones(5)

 1
 1
 1
 1
 1
[torch.FloatTensor of size 5]

The following creates a 1x2x3 array of ones and then swaps its first and second axes.
>>> x = np.ones((1, 2, 3))
>>> np.transpose(x, (1, 0, 2)).shape
(2, 1, 3)

7. torch.randperm(n, out=None) → LongTensor

>>> torch.randperm(4)
 2
 1
 3
 0
[torch.LongTensor of size 4]

Returns a random permutation of the integers from 0 to n-1.
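A minimal sketch of the same idea using only Python's standard library. The function name mirrors torch.randperm for readability, and the seed parameter is an addition for reproducibility, not part of the Torch signature.

```python
import random

def randperm(n, seed=None):
    """Random permutation of 0..n-1 (n itself is not included)."""
    rng = random.Random(seed)
    perm = list(range(n))
    rng.shuffle(perm)
    return perm

p = randperm(4, seed=0)
print(sorted(p))  # [0, 1, 2, 3]
```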

8. torch.cat(inputs, dimension=0) → Tensor

>>> x = torch.randn(2, 3)
>>> x

 0.5983 -0.0341  2.4918
 1.5981 -0.5265 -0.8735
[torch.FloatTensor of size 2x3]

>>> torch.cat((x, x, x), 0)

 0.5983 -0.0341  2.4918
 1.5981 -0.5265 -0.8735
 0.5983 -0.0341  2.4918
 1.5981 -0.5265 -0.8735
 0.5983 -0.0341  2.4918
 1.5981 -0.5265 -0.8735
[torch.FloatTensor of size 6x3]

>>> torch.cat((x, x, x), 1)

 0.5983 -0.0341  2.4918  0.5983 -0.0341  2.4918  0.5983 -0.0341  2.4918
 1.5981 -0.5265 -0.8735  1.5981 -0.5265 -0.8735  1.5981 -0.5265 -0.8735
[torch.FloatTensor of size 2x9]

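Concatenates the sequence inputs along the given dimension. The result has the same number of dimensions as each element of inputs; only the length along dimension grows.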
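The two cases above (dimension 0 and dimension 1) can be sketched on lists of rows. This is an illustration in plain Python, restricted to 2-D inputs; cat2d is a hypothetical helper, not a Torch function.

```python
def cat2d(matrices, dim=0):
    """Concatenate 2-D lists along dim 0 (rows) or dim 1 (columns)."""
    if dim == 0:
        return [list(row) for m in matrices for row in m]
    if dim == 1:
        n_rows = len(matrices[0])
        return [[v for m in matrices for v in m[r]] for r in range(n_rows)]
    raise ValueError("dim must be 0 or 1 for 2-D inputs")

x = [[1, 2, 3], [4, 5, 6]]      # 2x3
rows = cat2d((x, x, x), 0)      # 6x3
cols = cat2d((x, x, x), 1)      # 2x9
print(len(rows), len(rows[0]))  # 6 3
print(len(cols), len(cols[0]))  # 2 9
```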

9. torch.index_select(input, dim, index, out=None) → Tensor

>>> x = torch.randn(3, 4)
>>> x

 1.2045  2.4084  0.4001  1.1372
 0.5596  1.5677  0.6219 -0.7954
 1.3635 -1.2313 -0.5414 -1.8478
[torch.FloatTensor of size 3x4]

>>> indices = torch.LongTensor([0, 2])
>>> torch.index_select(x, 0, indices)

 1.2045  2.4084  0.4001  1.1372
 1.3635 -1.2313 -0.5414 -1.8478
[torch.FloatTensor of size 2x4]

>>> torch.index_select(x, 1, indices)

 1.2045  0.4001
 0.5596  0.6219
 1.3635 -0.5414
[torch.FloatTensor of size 3x2]

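Selects from input, along dimension dim, the entries at the positions listed in index. The output has the same lengths as the original tensor in every other dimension; along dimension dim its length equals the length of index.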
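The selection rule can be reproduced on a list of rows. The sketch below is plain Python for the 2-D case only; index_select2d is a made-up name, and the data is the 3x4 example shown above.

```python
def index_select2d(matrix, dim, indices):
    """Pick rows (dim 0) or columns (dim 1) of a 2-D list by position."""
    if dim == 0:
        return [list(matrix[i]) for i in indices]
    if dim == 1:
        return [[row[j] for j in indices] for row in matrix]
    raise ValueError("dim must be 0 or 1 for 2-D inputs")

x = [[1.2045, 2.4084, 0.4001, 1.1372],
     [0.5596, 1.5677, 0.6219, -0.7954],
     [1.3635, -1.2313, -0.5414, -1.8478]]
picked_rows = index_select2d(x, 0, [0, 2])  # 2x4
picked_cols = index_select2d(x, 1, [0, 2])  # 3x2
print(picked_rows)
print(picked_cols)
```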

10. torch.nonzero(input, out=None) → LongTensor
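Finds the non-zero elements of input. Each row of the output holds the index of one non-zero element within input.
The result has size z x n, where z is the number of non-zero elements in input (each one needs a row of values to index it) and n is the number of dimensions of input.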

>>> torch.nonzero(torch.Tensor([1, 1, 1, 0, 1]))

 0
 1
 2
 4
[torch.LongTensor of size 4x1]

>>> torch.nonzero(torch.Tensor([[0.6, 0.0, 0.0, 0.0],
...                             [0.0, 0.4, 0.0, 0.0],
...                             [0.0, 0.0, 1.2, 0.0],
...                             [0.0, 0.0, 0.0,-0.4]]))

 0  0
 1  1
 2  2
 3  3
[torch.LongTensor of size 4x2]
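The z x n layout can be reproduced with a short recursive walk over nested lists. This is a sketch in plain Python, not the Torch implementation; the prefix parameter is an internal detail of this example.

```python
def nonzero(arr, prefix=()):
    """Return one index row per non-zero element of a nested list."""
    if not isinstance(arr, list):
        return [list(prefix)] if arr != 0 else []
    rows = []
    for i, item in enumerate(arr):
        rows.extend(nonzero(item, prefix + (i,)))
    return rows

print(nonzero([1, 1, 1, 0, 1]))  # [[0], [1], [2], [4]]
diag = [[0.6, 0.0, 0.0, 0.0],
        [0.0, 0.4, 0.0, 0.0],
        [0.0, 0.0, 1.2, 0.0],
        [0.0, 0.0, 0.0, -0.4]]
print(nonzero(diag))             # [[0, 0], [1, 1], [2, 2], [3, 3]]
```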

11. torch.max(input, dim, max=None, max_indices=None) -> (Tensor, LongTensor)
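Computes the maximum of the input tensor along dimension dim. The result has the same sizes as input, except that dimension dim has length 1.
It also returns the index, along dimension dim, of each maximum within input.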

>>> a = torch.randn(4, 4)
>>> a

 0.0692  0.3142  1.2513 -0.5428
 0.9288  0.8552 -0.2073  0.6409
 1.0695 -0.0101 -2.4507 -1.2230
 0.7426 -0.7666  0.4862 -0.6628
[torch.FloatTensor of size 4x4]

>>> torch.max(a, 1)
(
 1.2513
 0.9288
 1.0695
 0.7426
[torch.FloatTensor of size 4x1]
,
 2
 0
 0
 0
[torch.LongTensor of size 4x1]
)
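The values and indices above can be reproduced row by row. The sketch below is plain Python for a 2-D input and dim = 1 only; max_dim1 is a hypothetical helper, and it returns flat lists rather than 4x1 tensors.

```python
def max_dim1(matrix):
    """Per-row maximum and its column index for a 2-D list."""
    values, indices = [], []
    for row in matrix:
        best = max(range(len(row)), key=row.__getitem__)
        values.append(row[best])
        indices.append(best)
    return values, indices

a = [[0.0692, 0.3142, 1.2513, -0.5428],
     [0.9288, 0.8552, -0.2073, 0.6409],
     [1.0695, -0.0101, -2.4507, -1.2230],
     [0.7426, -0.7666, 0.4862, -0.6628]]
values, indices = max_dim1(a)
print(values)   # [1.2513, 0.9288, 1.0695, 0.7426]
print(indices)  # [2, 0, 0, 0]
```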

 model:add(nn.View(1, -1, nhid):setNumInputDims(2))
 model:add(cudnn.SpatialConvolution(1, nhid, nhid, kwidth, 1, 1, 0))
 model:add(cudnn.SpatialMaxPooling(1, 2, 1, 2))
 model:add(nn.Threshold())
 model:add(nn.Transpose({2,4}))

b = torch.Tensor(2, 2)
b[1][1]=1
b[1][2]=2

b[2][1]=3
b[2][2]=4
print(b[{{}, 1}]:contiguous())  -- yields the first column

c = torch.range(1, 3):view(1, 3)
      :expand(2, 3):contiguous()
print(c)
print(c+2)

The printed output is:
yi@yi:~$ luajit test.lua
 1  2  3
 1  2  3
[torch.DoubleTensor of size 2x3]

 3  4  5
 3  4  5
[torch.DoubleTensor of size 2x3]

x = torch.Tensor(3,4,4):fill(1)
net = nn.Sequential()
-- the View below yields a 3x1x4x4 tensor (batch of 3, each 4x4 sample reshaped to 1x4x4)
net:add(nn.View(1, -1, 4):setNumInputDims(2))
net:add(nn.SpatialConvolution(1, 2, 2, 2))
net:add(nn.Tanh())
print(net:forward(x))
-- yields a 3x2x3x3 tensor; so for the convolution, when the input is 4-D, the first dimension is the batch size
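The shapes in this example can be traced by hand. The sketch below is plain Python shape arithmetic under the assumption that dimensions beyond setNumInputDims are treated as batch dimensions; view_shape and conv_shape are invented helpers, not Torch calls.

```python
from math import prod

def view_shape(input_shape, view_sizes, num_input_dims):
    """Shape after a View that keeps leading batch dims (hypothetical helper)."""
    split = len(input_shape) - num_input_dims
    batch, sample = input_shape[:split], input_shape[split:]
    known = prod(s for s in view_sizes if s != -1)
    resolved = tuple(prod(sample) // known if s == -1 else s for s in view_sizes)
    return tuple(batch) + resolved

def conv_shape(input_shape, n_out, kw, kh):
    """Shape after a stride-1, unpadded convolution on a 4-D batch."""
    batch, _, h, w = input_shape
    return (batch, n_out, h - kh + 1, w - kw + 1)

after_view = view_shape((3, 4, 4), (1, -1, 4), num_input_dims=2)
after_conv = conv_shape(after_view, n_out=2, kw=2, kh=2)
print(after_view)  # (3, 1, 4, 4)
print(after_conv)  # (3, 2, 3, 3)
```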

model = nn.Sequential()

x = torch.Tensor(1,6,3):fill(1)
m=nn.SpatialSubSampling(1,1,2)
model:add(m)
print(model:forward(x))
print(m.weight)
print(m.bias)
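-- This is effectively a pooling operation, similar to max pooling: it reduces the values in each window to a single value. The weight of SpatialSubSampling has one entry per input plane, so with a single input plane there is a single weight. The weight is initialized randomly.
-- Each output value is the sum of the values in the window multiplied by the weight, plus a bias.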
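The computation described in the comments can be written out for a single plane. This is a sketch in plain Python, with weight and bias passed in explicitly instead of being randomly initialized; spatial_subsample is a made-up name, and the stride defaults of 1 are an assumption matching the call above.

```python
def spatial_subsample(plane, weight, bias, kw, kh, dw=1, dh=1):
    """Per window: weight * (sum of window values) + bias, for one input plane."""
    h, w = len(plane), len(plane[0])
    out_h = (h - kh) // dh + 1
    out_w = (w - kw) // dw + 1
    return [[weight * sum(plane[r * dh + i][c * dw + j]
                          for i in range(kh) for j in range(kw)) + bias
             for c in range(out_w)]
            for r in range(out_h)]

plane = [[1.0] * 3 for _ in range(6)]  # 6 rows x 3 columns of ones
out = spatial_subsample(plane, weight=0.5, bias=0.25, kw=1, kh=2)
print(len(out), len(out[0]))  # 5 3
print(out[0][0])              # 1.25
```

Each window here covers two ones, so every output is 0.5 * 2 + 0.25.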

updateGradInput(input, gradOutput)

Computing the gradient of the module with respect to its own input. This is returned in gradInput. Also, the gradInput state variable is updated accordingly.

accGradParameters(input, gradOutput, scale)
Computing the gradient of the module with respect to its own parameters. Many modules do not perform this step as they do not have any parameters. The state variable name for the parameters is module dependent. The module is expected to accumulate the gradients with respect to the parameters in some variable (that is, it stores the gradients of its parameters).

scale is a scale factor that is multiplied with the gradParameters before being accumulated.

Zeroing this accumulation is achieved with zeroGradParameters() and updating the parameters according to this accumulation is done with updateParameters().

zeroGradParameters()
If the module has parameters, this will zero the accumulation of the gradients with respect to these parameters, accumulated through accGradParameters(input, gradOutput, scale) calls. Otherwise, it does nothing.

updateParameters(learningRate)

Updates the parameters using the previously computed gradients. If the module has parameters, this will update these parameters, according to the accumulation of the gradients with respect to these parameters, accumulated through backward() calls.

The update is basically:

parameters = parameters - learningRate * gradients_wrt_parameters
If the module does not have parameters, it does nothing.
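
The zero, accumulate, update cycle described above can be sketched with a toy module. The class below is hypothetical and written in plain Python for illustration; it models y = w * x with a single parameter and only borrows the method names from the nn interface.

```python
class ScalarLinear:
    """Toy one-parameter module y = w * x (not a Torch class)."""

    def __init__(self, w):
        self.w = w
        self.grad_w = 0.0

    def update_output(self, x):
        return self.w * x

    def update_grad_input(self, x, grad_output):
        return self.w * grad_output  # dy/dx = w

    def acc_grad_parameters(self, x, grad_output, scale=1.0):
        self.grad_w += scale * grad_output * x  # dy/dw = x, accumulated

    def zero_grad_parameters(self):
        self.grad_w = 0.0

    def update_parameters(self, learning_rate):
        self.w -= learning_rate * self.grad_w

m = ScalarLinear(2.0)
m.zero_grad_parameters()
m.acc_grad_parameters(3.0, 1.0)  # grad_w = 3.0
m.acc_grad_parameters(3.0, 1.0)  # gradients accumulate: grad_w = 6.0
m.update_parameters(0.25)        # w = 2.0 - 0.25 * 6.0
print(m.grad_w, m.w)             # 6.0 0.5
```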

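StochasticGradient
It has a maxIteration field giving the maximum number of iterations, and the training set reports its own size. During training, samples are drawn from the training set in random order, one at a time, and each one is used to compute the gradient and update the parameters. Training ends once the iteration count reaches maxIteration.
https://zhuanlan.zhihu.com/p/21550685 introduces the following methods:
updateOutput(input)
updateGradInput(input, gradOutput)
accGradParameters(input, gradOutput)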

∀x∈M, p(x): "for every x in M, p(x) holds."
∃x∈M, p(x): "there exists an x in M such that p(x) holds."