Torch Notes (2): Quick Start
The central data structure in Torch is the Tensor. It is simple yet powerful, well suited to matrix-style numerical computation, and by far the most important class in Torch. A Tensor is essentially a multi-dimensional matrix that supports the usual matrix operations. One point deserves special emphasis: Lua arrays (actually tables) are indexed from 1, so Tensor indices also start at 1.
From a programmer's point of view, Tensors are typed. The Tensor family includes ByteTensor, CharTensor, ShortTensor, IntTensor, LongTensor, FloatTensor, and DoubleTensor; the names speak for themselves. The default is DoubleTensor, presumably for numerical convenience.
Next come the Tensor constructors. How do you create a Tensor? The common ways are shown below. On a machine with Torch installed, run the "th" command to enter the Torch REPL.
th> a = torch.Tensor(2,4);print(a)
0 0 0 0
0 0 0 0
[torch.DoubleTensor of size 2x4]
th> b = torch.Tensor(2,4,2)
th> print(b)
(1,.,.) =
6.9414e-310 6.9414e-310
5.0872e-317 2.3253e+251
5.0450e+223 1.6304e-322
6.9414e-310 5.0873e-317
(2,.,.) =
1.0277e-321 2.3715e-322
5.0873e-317 5.9416e-313
5.0873e-317 8.5010e-96
6.9677e+252 1.6304e-322
[torch.DoubleTensor of size 2x4x2]
th> c = torch.IntTensor(2,3);print(c) -- the element type can also be specified
1.0302e+07 0.0000e+00 1.7000e+09
1.1000e+02 0.0000e+00 0.0000e+00
[torch.IntTensor of size 2x3]
torch.Tensor(sz1 [,sz2 [,sz3 [,sz4]]])
The constructor above creates an N-dimensional Tensor of size sz1 x sz2 x sz3 x sz4 x ...; for example, Tensor b above is a 2 x 4 x 2 three-dimensional Tensor. Note that this only allocates the Tensor without initializing it (hence the garbage values printed above), so it still needs to be assigned values.
th> torch.Tensor({{1,2,3,4}, {5,6,7,8}})
1 2 3 4
5 6 7 8
[torch.DoubleTensor of dimension 2x4]
The example above initializes a Tensor from a Lua table, i.e. the form
torch.Tensor(table)
A Tensor can also be initialized from another Tensor:
torch.Tensor(tensor)
Converting between the Tensor types (note how typeAs truncates 3.3, 4.9, 3.2 down to 3, 4, 3 below):
th> a = torch.IntTensor({3,4,5,3,3});b = torch.DoubleTensor({3.3,4.9,3.2});c = b:typeAs(a);print(c);print(b)
3
4
3
[torch.IntTensor of size 3]
3.3000
4.9000
3.2000
[torch.DoubleTensor of size 3]
Element-type conversion for a Tensor; note that these return a Tensor, not a plain number:
[Tensor] byte(), char(), short(), int(), long(), float(), double()
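As a minimal sketch of these conversion methods (each returns a new Tensor of the target type; note that converting to an integer type truncates the fractional part):

```lua
a = torch.Tensor({1.7, 2.3, 3.9})  -- DoubleTensor by default
b = a:int()                        -- new IntTensor: 1.7 -> 1, 2.3 -> 2, 3.9 -> 3
c = b:double()                     -- back to a DoubleTensor, now holding 1, 2, 3
print(b); print(c)
```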
Torch also has a Storage class, which corresponds to a one-dimensional C array. It is typically used for file operations; it saves space, but its set of functions is much smaller. It comes in the same flavors: ByteStorage, CharStorage, ShortStorage, IntStorage, LongStorage, FloatStorage, and DoubleStorage, used much like Tensor. Since a Storage is effectively a one-dimensional array, it can also be used to create a multi-dimensional Tensor (again uninitialized):
th> x =torch.Tensor( torch.LongStorage({2,3,4}));print(x)
(1,.,.) =
1.1132e+171 6.1707e-114 8.8211e+199 1.0167e-152
5.7781e-114 7.3587e+223 2.9095e-14 6.9117e-72
8.8211e+199 1.0167e-152 3.9232e-85 6.9183e-72
(2,.,.) =
1.3923e-259 2.2831e-109 1.6779e+243 7.3651e+228
2.2082e-259 1.1132e+171 6.1707e-114 2.3253e+251
5.0450e+223 2.8811e+159 1.1995e-22 2.1723e-153
[torch.DoubleTensor of size 2x3x4]
Because operating on a Tensor behaves like operating on a C++ reference, operations modify the Tensor itself, so sometimes a backup copy is needed. Note the meaning of ":" in Lua: it is method-call syntax on an object, implicitly passing the object as the first argument. Use "." only when calling plain functions from a package such as nn, torch, or optim.
th> a = torch.randn(3);b = a:clone();print(a,b)
-0.7112
0.1953
-2.0389
[torch.DoubleTensor of size 3]
-0.7112
0.1953
-2.0389
[torch.DoubleTensor of size 3]
dim() returns the number of dimensions of a Tensor, and size() gives the extent of each dimension; below, Tensor a has size(1) = 2 in its first dimension and size(2) = 4 in its second. nElement() returns the total number of elements.
th> a = torch.Tensor(2,4):zero();print(a);print(a:dim())
0 0 0 0
0 0 0 0
[torch.DoubleTensor of size 2x4]
2
th> print(a:size())
2
4
[torch.LongStorage of size 2]
th> print(a:nElement())
8
Index access:
x = torch.Tensor(3,3)
i = 0; x:apply(function() i = i + 1; return i end)
> x
1 2 3
4 5 6
7 8 9
[torch.DoubleTensor of dimension 3x3]
> x[2] -- returns row 2
4
5
6
[torch.DoubleTensor of dimension 3]
> x[2][3] -- returns row 2, column 3
6
> x[{2,3}] -- another way to return row 2, column 3
6
> x[torch.LongStorage{2,3}] -- yet another way to return row 2, column 3
6
Element-wise copy: this works as long as the two Tensors have the same total number of elements; the shapes may differ.
x = torch.Tensor(4):fill(1)
y = torch.Tensor(2,2):copy(x)
> x
1
1
1
1
[torch.DoubleTensor of dimension 4]
> y
1 1
1 1
[torch.DoubleTensor of dimension 2x2]
resize(sz1 [,sz2 [,sz3 [,sz4]]]) is very useful when a Tensor needs to grow dynamically (newly added elements are uninitialized):
th> x = torch.randn(2,4):zero();print(x);print(x:resize(3,4))
0 0 0 0
0 0 0 0
[torch.DoubleTensor of size 2x4]
0.0000e+00 0.0000e+00 0.0000e+00 0.0000e+00
0.0000e+00 0.0000e+00 0.0000e+00 0.0000e+00
8.8211e+199 7.4861e-114 2.9085e-33 1.0251e+170
[torch.DoubleTensor of size 3x4]
The functions introduced next, narrow(dim, index, size), sub(dim1s, dim1e ... [, dim4s [, dim4e]]), and select(dim, index), return a portion of a Tensor's data, but the returned object is really a reference to that data: any operation on it directly affects the original Tensor.
In narrow, dim is the dimension to slice, index is the starting position, and the ending position is index + size - 1.
th> x = torch.Tensor(5, 6):zero();y = x:narrow(1, 2, 3);y:fill(1);print(y);print(x)
1 1 1 1 1 1
1 1 1 1 1 1
1 1 1 1 1 1
[torch.DoubleTensor of size 3x6]
0 0 0 0 0 0
1 1 1 1 1 1
1 1 1 1 1 1
1 1 1 1 1 1
0 0 0 0 0 0
[torch.DoubleTensor of size 5x6]
th> x = torch.Tensor(5, 6):zero();y = x:narrow(2, 2, 3);y:fill(1);print(y);print(x)
1 1 1
1 1 1
1 1 1
1 1 1
1 1 1
[torch.DoubleTensor of size 5x3]
0 1 1 1 0 0
0 1 1 1 0 0
0 1 1 1 0 0
0 1 1 1 0 0
0 1 1 1 0 0
[torch.DoubleTensor of size 5x6]
In sub, dim1s is the start position in the first dimension and dim1e the end position in the first dimension. Both may be negative: -1 means the last position and -2 the second-to-last.
th> x = torch.Tensor(5, 6):zero();z = x:sub(2,4,3,4):fill(2);print(z);print(x)
2 2
2 2
2 2
[torch.DoubleTensor of size 3x2]
0 0 0 0 0 0
0 0 2 2 0 0
0 0 2 2 0 0
0 0 2 2 0 0
0 0 0 0 0 0
[torch.DoubleTensor of size 5x6]
select(dim, index) returns a Tensor with one dimension fewer than the original; dim again indicates which dimension.
th> x = torch.Tensor(5,6):zero();y = x:select(1, 2):fill(2);print(y);print(x)
2
2
2
2
2
2
[torch.DoubleTensor of size 6]
0 0 0 0 0 0
2 2 2 2 2 2
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
[torch.DoubleTensor of size 5x6]
th> x = torch.Tensor(2,3,4):zero();y = x:select(1, 2):fill(2);print(y);print(x)
2 2 2 2
2 2 2 2
2 2 2 2
[torch.DoubleTensor of size 3x4]
(1,.,.) =
0 0 0 0
0 0 0 0
0 0 0 0
(2,.,.) =
2 2 2 2
2 2 2 2
2 2 2 2
[torch.DoubleTensor of size 2x3x4]
There is also the indexing operator [{ dim1,dim2,... }] or [{ {dim1s,dim1e}, {dim2s,dim2e} }], which likewise returns a Tensor; this form is more concise and more commonly used.
th> x = torch.Tensor(3, 4):zero();x[{ 1,3 }] = 1;print(x) -- equivalent to x[1][3] = 1
0 0 1 0
0 0 0 0
0 0 0 0
[torch.DoubleTensor of size 3x4]
th> x = torch.Tensor(3, 4):zero();x[{ 2,{2,4} }] = 1;print(x) -- 2 means index = 2 in the first dimension; {2,4} means indices 2 through 4 in the second
0 0 0 0
0 1 1 1
0 0 0 0
[torch.DoubleTensor of size 3x4]
th> x = torch.Tensor(3, 4):zero();x[{ {},2 }] = torch.range(1,3);print(x) -- {} means all indices in the first dimension; 2 means index = 2 in the second; torch.range(1,3) produces a one-dimensional Tensor
0 1 0 0
0 2 0 0
0 3 0 0
[torch.DoubleTensor of size 3x4]
th> x = torch.Tensor(3, 4):fill(5);x[{ {},2 }] = torch.range(1,3);x[torch.lt(x,3)] = -2;print(x) ;
5 -2 5 5
5 -2 5 5
5 3 5 5
[torch.DoubleTensor of size 3x4]
index(dim, index) is different: it returns a brand-new Tensor that shares nothing with the original. The index argument is special here: it must be a LongTensor.
th> x = torch.randn(3,5);print(x);y = x:index(2,torch.range(2,4):long());y:mul(100);print(y);print(x)
0.6146 -0.3204 -1.2182 1.5573 -0.7232
-1.1692 -0.0071 3.1590 0.6008 0.4566
0.1957 -0.4057 2.0835 -0.3365 -1.3541
[torch.DoubleTensor of size 3x5]
-32.0396 -121.8182 155.7330
-0.7097 315.9025 60.0773
-40.5727 208.3459 -33.6517
[torch.DoubleTensor of size 3x3]
0.6146 -0.3204 -1.2182 1.5573 -0.7232
-1.1692 -0.0071 3.1590 0.6008 0.4566
0.1957 -0.4057 2.0835 -0.3365 -1.3541
[torch.DoubleTensor of size 3x5]
th> x = torch.randn(3,5);print(x);y = x:index(2,torch.LongTensor({1,4}));y:mul(100);print(y);print(x) -- take index = 1 and index = 4 along the second dimension, i.e. a new Tensor formed from columns 1 and 4 of the original
-0.4880 -1.6397 -0.3257 -0.5051 -0.1214
-0.4002 1.3845 0.4411 0.1753 2.0174
0.5882 0.9351 -0.7685 0.6377 -1.7308
[torch.DoubleTensor of size 3x5]
-48.8024 -50.5097
-40.0198 17.5341
58.8207 63.7730
[torch.DoubleTensor of size 3x2]
-0.4880 -1.6397 -0.3257 -0.5051 -0.1214
-0.4002 1.3845 0.4411 0.1753 2.0174
0.5882 0.9351 -0.7685 0.6377 -1.7308
[torch.DoubleTensor of size 3x5]
indexCopy(dim, index, tensor) copies tensor in at the given indices; index is again a LongTensor. Related functions are indexAdd(dim, index, tensor) and indexFill(dim, index, val).
th> x = torch.randn(3,5);print(x);y = torch.Tensor(2,5);y:select(1,1):fill(-1);y:select(1,2):fill(-2);print(y);x:indexCopy(1,torch.LongTensor{3,1},y);print(x)
0.8086 1.7714 -1.6337 0.2549 0.2131
1.4018 -0.9938 0.3035 1.6247 -0.1368
0.3516 -1.3728 -0.5203 0.2754 -1.6965
[torch.DoubleTensor of size 3x5]
-1 -1 -1 -1 -1
-2 -2 -2 -2 -2
[torch.DoubleTensor of size 2x5]
-2.0000 -2.0000 -2.0000 -2.0000 -2.0000
1.4018 -0.9938 0.3035 1.6247 -0.1368
-1.0000 -1.0000 -1.0000 -1.0000 -1.0000
[torch.DoubleTensor of size 3x5]
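A quick sketch of the related indexAdd and indexFill, assuming they follow the same index semantics as indexCopy above:

```lua
x = torch.ones(3, 5)
x:indexFill(1, torch.LongTensor{2}, -1)              -- fill row 2 with -1
x:indexAdd(1, torch.LongTensor{1}, torch.ones(1, 5)) -- add 1 to each element of row 1
print(x)  -- row 1 is all 2, row 2 all -1, row 3 unchanged
```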
One thing worth stressing: a two-dimensional Tensor whose second dimension is 1 is still two-dimensional.
-- this is a one-dimensional Tensor
th> a = torch.Tensor({1,2,3,4,5});print("dim=" .. a:dim());print(a);print(a:size())
dim=1
1
2
3
4
5
[torch.DoubleTensor of size 5]
5
[torch.LongStorage of size 1]
-- this one is two-dimensional: a 5x1 Tensor
th> x = torch.Tensor({1,2,3,4,5});y = x:reshape(x:size(1),1);print(y);print(y:dim())
1
2
3
4
5
[torch.DoubleTensor of size 5x1]
2
-- it can also be written like this
th> y = torch.Tensor({{1},{2},{3},{4},{5}});print(y);print(y:dim())
1
2
3
4
5
[torch.DoubleTensor of size 5x1]
-- and this way it becomes a 1x5 two-dimensional Tensor
th> y = torch.Tensor({{1,2,3,4,5}});print(y);print(y:dim())
1 2 3 4 5
[torch.DoubleTensor of size 1x5]
2
-- or like this; all of these are two-dimensional Tensors. size() gives the extent of each dimension; since there are two dimensions, it holds two values
th> x = torch.Tensor({1,2,3,4,5});y = x:reshape(1,x:size(1));print(y:dim());print(y);print(y:size())
2
1 2 3 4 5
[torch.DoubleTensor of size 1x5]
1
5
[torch.LongStorage of size 2]
gather(dim, index) extracts elements from specified positions. index must have the same shape as the output Tensor (if the output is m x n, index must be m x n too), and its type must be LongTensor:
-- dim = 1
result[i][j][k]... = src[index[i][j][k]...][j][k]...
-- dim = 2
result[i][j][k]... = src[i][index[i][j][k]...][k]...
-- etc.
-- src is the original Tensor
-- e.g. to take all the diagonal elements, with output a 1 x n two-dimensional Tensor (Tensor:dim() = 2), index must be a 1 x n two-dimensional LongTensor
th> x = torch.rand(5, 5);print(x);y = x:gather(1, torch.LongTensor{{1, 2, 3, 4, 5}, {2, 3, 4, 5, 1}});print(y)
0.2188 0.3625 0.7812 0.2781 0.9327
0.5342 0.3879 0.7225 0.6031 0.7325
0.1464 0.4534 0.5134 0.9993 0.6617
0.0594 0.6398 0.1741 0.7357 0.6613
0.2926 0.7286 0.7255 0.7108 0.1820
[torch.DoubleTensor of size 5x5]
0.2188 0.3879 0.5134 0.7357 0.1820
0.5342 0.4534 0.1741 0.7108 0.9327
[torch.DoubleTensor of size 2x5]
-- to extract the diagonal elements (1,1),(2,2),(3,3),(4,4),(5,5) as a 1x5 (two-dimensional) Tensor: index = torch.LongTensor({{1, 2, 3, 4, 5}}); y = x:gather(1, index), where index is a 1x5 two-dimensional Tensor
-- to extract the off-diagonal elements (1,2),(2,3),(3,4),(4,5),(5,1) as a 1x5 Tensor: index = torch.LongTensor({{5, 1, 2, 3, 4}}); y = x:gather(1, index), or index = torch.LongTensor({{2, 3, 4, 5, 1}}); y = x:gather(2, index)
-- an example:
th> x = torch.rand(5, 5);print(x);y = x:gather(2, torch.LongTensor({ {1,2},{2,3},{3,4},{4,5},{5,1} }) );print(y)
0.8563 0.2664 0.6895 0.8124 0.0788
0.0503 0.6646 0.7659 0.4013 0.0670
0.4760 0.0517 0.9621 0.7437 0.1162
0.4069 0.9932 0.6118 0.6200 0.3585
0.9795 0.9601 0.9098 0.4714 0.5577
[torch.DoubleTensor of size 5x5]
0.8563 0.2664
0.6646 0.7659
0.9621 0.7437
0.6200 0.3585
0.5577 0.9795
[torch.DoubleTensor of size 5x2]
scatter(dim, index, src|val) is the inverse: it writes another Tensor src, or a scalar val, into self. The parameters have the same meaning as in gather.
x = torch.rand(2, 5)
> x
0.3227 0.4294 0.8476 0.9414 0.1159
0.7338 0.5185 0.2947 0.0578 0.1273
[torch.DoubleTensor of size 2x5]
y = torch.zeros(3, 5):scatter(1, torch.LongTensor{{1, 2, 3, 1, 1}, {3, 1, 1, 2, 3}}, x)
> y
0.3227 0.5185 0.2947 0.9414 0.1159
0.0000 0.4294 0.0000 0.0578 0.0000
0.7338 0.0000 0.8476 0.0000 0.1273
[torch.DoubleTensor of size 3x5]
z = torch.zeros(2, 4):scatter(2, torch.LongTensor{{3}, {4}}, 1.23)
> z
0.0000 0.0000 1.2300 0.0000
0.0000 0.0000 0.0000 1.2300
[torch.DoubleTensor of size 2x4]
nonzero(tensor) returns an n x dim LongTensor containing the indices of all non-zero elements of the original Tensor (here the input is two-dimensional, so each row of the result is a row/column index pair):
th> x = torch.rand(4, 4):mul(3):floor():int();y = torch.nonzero(x);print(x);print(y)
2 1 0 2
2 1 2 1
1 0 1 0
2 0 2 2
[torch.IntTensor of size 4x4]
1 1
1 2
1 4
2 1
2 2
2 3
2 4
3 1
3 3
4 1
4 3
4 4
[torch.LongTensor of size 12x2]
Transposition: transpose(dim1, dim2), or simply t(), which works only on two-dimensional Tensors.
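As a small sketch, note that t() (like narrow and select) returns a view sharing storage with the original:

```lua
x = torch.range(1, 6):reshape(2, 3)  -- 1 2 3 / 4 5 6
y = x:t()                            -- 3x2 view; t() equals transpose(1, 2) for 2-D Tensors
y[1][1] = 100                        -- writing through the view also changes x
print(x[1][1])                       -- 100
```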
For multi-dimensional Tensors there is permute(dim1, dim2, ..., dimn):
x = torch.Tensor(3,4,2,5)
> x:size()
3
4
2
5
[torch.LongStorage of size 4]
y = x:permute(2,3,1,4) -- equivalent to y = x:transpose(1,3):transpose(1,2)
> y:size()
4
2
3
5
[torch.LongStorage of size 4]
apply(function) applies a function to every element of a Tensor:
i = 0
z = torch.Tensor(3,3)
z:apply(function(x)
i = i + 1
return i
end)
> z
1 2 3
4 5 6
7 8 9
[torch.DoubleTensor of dimension 3x3]
map(tensor, function(xs, xt)) processes the corresponding elements of two Tensors, writing the result into self:
x = torch.Tensor(3,3)
y = torch.Tensor(9)
i = 0
x:apply(function() i = i + 1; return i end) -- fill-up x
i = 0
y:apply(function() i = i + 1; return i end) -- fill-up y
> x
1 2 3
4 5 6
7 8 9
[torch.DoubleTensor of dimension 3x3]
> y
1
2
3
4
5
6
7
8
9
[torch.DoubleTensor of dimension 9]
th> z =x:map(y, function(xx, yy) return xx*xx + yy end);print(z)
2 6 12
20 30 42
56 72 90
[torch.DoubleTensor of size 3x3]
split([result,] tensor, size, [dim]) splits a Tensor into chunks; size is the chunk size and dim is the dimension along which to split:
th> x = torch.randn(3,10,15)
th> x:split(2,1)
{
1 : DoubleTensor - size: 2x10x15
2 : DoubleTensor - size: 1x10x15
}
th> x:split(4,2)
{
1 : DoubleTensor - size: 3x4x15
2 : DoubleTensor - size: 3x4x15
3 : DoubleTensor - size: 3x2x15
}
th> x:split(5,3)
{
1 : DoubleTensor - size: 3x10x5
2 : DoubleTensor - size: 3x10x5
3 : DoubleTensor - size: 3x10x5
}
Then there are the math functions on Tensors, e.g. sum, add, sub, mul, div, mode, max, min, std, mean, pow, rand, randn, log, range, exp, abs, floor, sqrt, and many more; see the Math section of the official Torch documentation.
Because they are so common, these functions live in the torch package itself, so you can write either mean = torch.mean(x) or mean = x:mean().
th> x = torch.range(1,15,3);print(x);print(x:mean());print(torch.mean(x))
1
4
7
10
13
[torch.DoubleTensor of size 5]
7
7
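A brief sketch of a few more of these functions; note that in-place methods such as mul() modify the Tensor itself, so clone() first when the original must be preserved:

```lua
x = torch.Tensor({1, 2, 3, 4})
print(torch.sum(x))   -- 10
print(x:max())        -- 4
y = x:clone():mul(2)  -- y is 2 4 6 8; x itself is untouched
print(y)
```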
Another very common one is torch.sort([resval, resind,] x [,dim] [,flag]). It sorts in ascending order by default (pass true for flag to sort in descending order) and returns two tensors: the sorted tensor, and a tensor giving the original index of each sorted element.
th> x = torch.Tensor({8.3,3.4,5.7,1.7,9.3});y,index = x:sort(true);print(y,index)
-- this can also be written as y,index = torch.sort(x,true);print(y,index)
9.3000
8.3000
5.7000
3.4000
1.7000
[torch.DoubleTensor of size 5]
5
1
3
2
4
[torch.LongTensor of size 5]
Next up: implementing linear regression in Torch.