Torch Notes (2): Quick Start
The central data structure in Torch is the Tensor. It is simple yet powerful, well suited to matrix-style numerical computation, and by far the most important class in Torch. A Tensor is essentially a multi-dimensional matrix that supports the usual matrix operations. One point deserves special emphasis: Lua arrays (actually tables) are indexed from 1, so Tensor indices also start at 1.
From a programmer's point of view, Tensors are typed. The Tensor family includes ByteTensor, CharTensor, ShortTensor, IntTensor, LongTensor, FloatTensor, and DoubleTensor; the names speak for themselves. The default is DoubleTensor, presumably for numerical convenience.
Next come the Tensor constructors. How do you create a Tensor? The common ways are shown below. On a machine with Torch installed, run the "th" command to enter the Torch REPL.
th> a = torch.Tensor(2,4);print(a)
0 0 0 0
0 0 0 0
[torch.DoubleTensor of size 2x4]
th> b = torch.Tensor(2,4,2)
th> print(b)
(1,.,.) =
6.9414e-310 6.9414e-310
5.0872e-317 2.3253e+251
5.0450e+223 1.6304e-322
6.9414e-310 5.0873e-317
(2,.,.) =
1.0277e-321 2.3715e-322
5.0873e-317 5.9416e-313
5.0873e-317 8.5010e-96
6.9677e+252 1.6304e-322
[torch.DoubleTensor of size 2x4x2]
th> c = torch.IntTensor(2,3);print(c) -- the element type can also be specified
1.0302e+07 0.0000e+00 1.7000e+09
1.1000e+02 0.0000e+00 0.0000e+00
[torch.IntTensor of size 2x3]
torch.Tensor(sz1 [,sz2 [,sz3 [,sz4]]])
The constructor above creates an N-dimensional Tensor of size sz1 x sz2 x sz3 x sz4 x ...; for example, Tensor b above is a 2 x 4 x 2 three-dimensional Tensor. Note that this only allocates the Tensor without initializing it (hence the garbage values printed above), so it still needs to be assigned values.
th> torch.Tensor({{1,2,3,4}, {5,6,7,8}})
1 2 3 4
5 6 7 8
[torch.DoubleTensor of dimension 2x4]
The example above initializes a Tensor from a Lua table, i.e. the form
torch.Tensor(table)
A Tensor can also be initialized from another Tensor:
torch.Tensor(tensor)
Converting between the Tensor types (note how typeAs truncates 3.3, 4.9, 3.2 down to 3, 4, 3 below):
th> a = torch.IntTensor({3,4,5,3,3});b = torch.DoubleTensor({3.3,4.9,3.2});c = b:typeAs(a);print(c);print(b)
3
4
3
[torch.IntTensor of size 3]
3.3000
4.9000
3.2000
[torch.DoubleTensor of size 3]
Element-type conversion for a Tensor; note that these return a Tensor, not a plain number:
[Tensor] byte(), char(), short(), int(), long(), float(), double()
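As a minimal sketch of these conversion methods (each returns a new Tensor of the target type; note that converting to an integer type truncates the fractional part):

```lua
a = torch.Tensor({1.7, 2.3, 3.9})  -- DoubleTensor by default
b = a:int()                        -- new IntTensor: 1.7 -> 1, 2.3 -> 2, 3.9 -> 3
c = b:double()                     -- back to a DoubleTensor, now holding 1, 2, 3
print(b); print(c)
```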
Torch also has a Storage class, which corresponds to a one-dimensional C array. It is typically used for file operations; it saves space, but its set of functions is much smaller. It comes in the same flavors: ByteStorage, CharStorage, ShortStorage, IntStorage, LongStorage, FloatStorage, and DoubleStorage, used much like Tensor. Since a Storage is effectively a one-dimensional array, it can also be used to create a multi-dimensional Tensor (again uninitialized):
th> x =torch.Tensor( torch.LongStorage({2,3,4}));print(x)
(1,.,.) =
1.1132e+171 6.1707e-114 8.8211e+199 1.0167e-152
5.7781e-114 7.3587e+223 2.9095e-14 6.9117e-72
8.8211e+199 1.0167e-152 3.9232e-85 6.9183e-72
(2,.,.) =
1.3923e-259 2.2831e-109 1.6779e+243 7.3651e+228
2.2082e-259 1.1132e+171 6.1707e-114 2.3253e+251
5.0450e+223 2.8811e+159 1.1995e-22 2.1723e-153
[torch.DoubleTensor of size 2x3x4]
Because operating on a Tensor behaves like operating on a C++ reference, operations modify the Tensor itself, so sometimes a backup copy is needed. Note the meaning of ":" in Lua: it is method-call syntax on an object, implicitly passing the object as the first argument. Use "." only when calling plain functions from a package such as nn, torch, or optim.
th> a = torch.randn(3);b = a:clone();print(a,b)
-0.7112
0.1953
-2.0389
[torch.DoubleTensor of size 3]
-0.7112
0.1953
-2.0389
[torch.DoubleTensor of size 3]
dim() returns the number of dimensions of a Tensor, and size() gives the extent of each dimension; below, Tensor a has size(1) = 2 in its first dimension and size(2) = 4 in its second. nElement() returns the total number of elements.
th> a = torch.Tensor(2,4):zero();print(a);print(a:dim())
0 0 0 0
0 0 0 0
[torch.DoubleTensor of size 2x4]
2
th> print(a:size())
2
4
[torch.LongStorage of size 2]
th> print(a:nElement())
8
Index access:
x = torch.Tensor(3,3)
i = 0; x:apply(function() i = i + 1; return i end)
> x
1 2 3
4 5 6
7 8 9
[torch.DoubleTensor of dimension 3x3]
> x[2] -- returns row 2
4
5
6
[torch.DoubleTensor of dimension 3]
> x[2][3] -- returns row 2, column 3
6
> x[{2,3}] -- another way to return row 2, column 3
6
> x[torch.LongStorage{2,3}] -- yet another way to return row 2, column 3
6
Element-wise copy: this works as long as the two Tensors have the same total number of elements; the shapes may differ.
x = torch.Tensor(4):fill(1)
y = torch.Tensor(2,2):copy(x)
> x
1
1
1
1
[torch.DoubleTensor of dimension 4]
> y
1 1
1 1
[torch.DoubleTensor of dimension 2x2]
resize(sz1 [,sz2 [,sz3 [,sz4]]]) is very useful when a Tensor needs to grow dynamically (newly added elements are uninitialized):
th> x = torch.randn(2,4):zero();print(x);print(x:resize(3,4))
0 0 0 0
0 0 0 0
[torch.DoubleTensor of size 2x4]
0.0000e+00 0.0000e+00 0.0000e+00 0.0000e+00
0.0000e+00 0.0000e+00 0.0000e+00 0.0000e+00
8.8211e+199 7.4861e-114 2.9085e-33 1.0251e+170
[torch.DoubleTensor of size 3x4]
The functions introduced next, narrow(dim, index, size), sub(dim1s, dim1e ... [, dim4s [, dim4e]]), and select(dim, index), return a portion of a Tensor's data, but the returned object is really a reference to that data: any operation on it directly affects the original Tensor.
In narrow, dim is the dimension to slice, index is the starting position, and the ending position is index + size - 1.
th> x = torch.Tensor(5, 6):zero();y = x:narrow(1, 2, 3);y:fill(1);print(y);print(x)
1 1 1 1 1 1
1 1 1 1 1 1
1 1 1 1 1 1
[torch.DoubleTensor of size 3x6]
0 0 0 0 0 0
1 1 1 1 1 1
1 1 1 1 1 1
1 1 1 1 1 1
0 0 0 0 0 0
[torch.DoubleTensor of size 5x6]
th> x = torch.Tensor(5, 6):zero();y = x:narrow(2, 2, 3);y:fill(1);print(y);print(x)
1 1 1
1 1 1
1 1 1
1 1 1
1 1 1
[torch.DoubleTensor of size 5x3]
0 1 1 1 0 0
0 1 1 1 0 0
0 1 1 1 0 0
0 1 1 1 0 0
0 1 1 1 0 0
[torch.DoubleTensor of size 5x6]
In sub, dim1s is the start position in the first dimension and dim1e the end position in the first dimension. Both may be negative: -1 means the last position and -2 the second-to-last.
th> x = torch.Tensor(5, 6):zero();z = x:sub(2,4,3,4):fill(2);print(z);print(x)
2 2
2 2
2 2
[torch.DoubleTensor of size 3x2]
0 0 0 0 0 0
0 0 2 2 0 0
0 0 2 2 0 0
0 0 2 2 0 0
0 0 0 0 0 0
[torch.DoubleTensor of size 5x6]
select(dim, index) returns a Tensor with one dimension fewer than the original; dim again indicates which dimension.
th> x = torch.Tensor(5,6):zero();y = x:select(1, 2):fill(2);print(y);print(x)
2
2
2
2
2
2
[torch.DoubleTensor of size 6]
0 0 0 0 0 0
2 2 2 2 2 2
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
[torch.DoubleTensor of size 5x6]
th> x = torch.Tensor(2,3,4):zero();y = x:select(1, 2):fill(2);print(y);print(x)
2 2 2 2
2 2 2 2
2 2 2 2
[torch.DoubleTensor of size 3x4]
(1,.,.) =
0 0 0 0
0 0 0 0
0 0 0 0
(2,.,.) =
2 2 2 2
2 2 2 2
2 2 2 2
[torch.DoubleTensor of size 2x3x4]
There is also the indexing operator [{ dim1,dim2,... }] or [{ {dim1s,dim1e}, {dim2s,dim2e} }], which likewise returns a Tensor; this form is more concise and more commonly used.
th> x = torch.Tensor(3, 4):zero();x[{ 1,3 }] = 1;print(x) -- equivalent to x[1][3] = 1
0 0 1 0
0 0 0 0
0 0 0 0
[torch.DoubleTensor of size 3x4]
th> x = torch.Tensor(3, 4):zero();x[{ 2,{2,4} }] = 1;print(x) -- 2 means index = 2 in the first dimension; {2,4} means indices 2 through 4 in the second
0 0 0 0
0 1 1 1
0 0 0 0
[torch.DoubleTensor of size 3x4]
th> x = torch.Tensor(3, 4):zero();x[{ {},2 }] = torch.range(1,3);print(x) -- {} means all indices in the first dimension; 2 means index = 2 in the second; torch.range(1,3) produces a one-dimensional Tensor
0 1 0 0
0 2 0 0
0 3 0 0
[torch.DoubleTensor of size 3x4]
th> x = torch.Tensor(3, 4):fill(5);x[{ {},2 }] = torch.range(1,3);x[torch.lt(x,3)] = -2;print(x) ;
5 -2 5 5
5 -2 5 5
5 3 5 5
[torch.DoubleTensor of size 3x4]
index(dim, index) is different: it returns a brand-new Tensor that shares nothing with the original. The index argument is special here: it must be a LongTensor.
th> x = torch.randn(3,5);print(x);y = x:index(2,torch.range(2,4):long());y:mul(100);print(y);print(x)
0.6146 -0.3204 -1.2182 1.5573 -0.7232
-1.1692 -0.0071 3.1590 0.6008 0.4566
0.1957 -0.4057 2.0835 -0.3365 -1.3541
[torch.DoubleTensor of size 3x5]
-32.0396 -121.8182 155.7330
-0.7097 315.9025 60.0773
-40.5727 208.3459 -33.6517
[torch.DoubleTensor of size 3x3]
0.6146 -0.3204 -1.2182 1.5573 -0.7232
-1.1692 -0.0071 3.1590 0.6008 0.4566
0.1957 -0.4057 2.0835 -0.3365 -1.3541
[torch.DoubleTensor of size 3x5]
th> x = torch.randn(3,5);print(x);y = x:index(2,torch.LongTensor({1,4}));y:mul(100);print(y);print(x) -- take index = 1 and index = 4 along the second dimension, i.e. a new Tensor formed from columns 1 and 4 of the original
-0.4880 -1.6397 -0.3257 -0.5051 -0.1214
-0.4002 1.3845 0.4411 0.1753 2.0174
0.5882 0.9351 -0.7685 0.6377 -1.7308
[torch.DoubleTensor of size 3x5]
-48.8024 -50.5097
-40.0198 17.5341
58.8207 63.7730
[torch.DoubleTensor of size 3x2]
-0.4880 -1.6397 -0.3257 -0.5051 -0.1214
-0.4002 1.3845 0.4411 0.1753 2.0174
0.5882 0.9351 -0.7685 0.6377 -1.7308
[torch.DoubleTensor of size 3x5]
indexCopy(dim, index, tensor) copies tensor in at the given indices; index is again a LongTensor. Related functions are indexAdd(dim, index, tensor) and indexFill(dim, index, val).
th> x = torch.randn(3,5);print(x);y = torch.Tensor(2,5);y:select(1,1):fill(-1);y:select(1,2):fill(-2);print(y);x:indexCopy(1,torch.LongTensor{3,1},y);print(x)
0.8086 1.7714 -1.6337 0.2549 0.2131
1.4018 -0.9938 0.3035 1.6247 -0.1368
0.3516 -1.3728 -0.5203 0.2754 -1.6965
[torch.DoubleTensor of size 3x5]
-1 -1 -1 -1 -1
-2 -2 -2 -2 -2
[torch.DoubleTensor of size 2x5]
-2.0000 -2.0000 -2.0000 -2.0000 -2.0000
1.4018 -0.9938 0.3035 1.6247 -0.1368
-1.0000 -1.0000 -1.0000 -1.0000 -1.0000
[torch.DoubleTensor of size 3x5]
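A quick sketch of the related indexAdd and indexFill, assuming they follow the same index semantics as indexCopy above:

```lua
x = torch.ones(3, 5)
x:indexFill(1, torch.LongTensor{2}, -1)              -- fill row 2 with -1
x:indexAdd(1, torch.LongTensor{1}, torch.ones(1, 5)) -- add 1 to each element of row 1
print(x)  -- row 1 is all 2, row 2 all -1, row 3 unchanged
```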
One thing worth stressing: a two-dimensional Tensor whose second dimension is 1 is still two-dimensional.
-- this is a one-dimensional Tensor
th> a = torch.Tensor({1,2,3,4,5});print("dim=" .. a:dim());print(a);print(a:size())
dim=1
1
2
3
4
5
[torch.DoubleTensor of size 5]
5
[torch.LongStorage of size 1]
-- this one is two-dimensional: a 5x1 Tensor
th> x = torch.Tensor({1,2,3,4,5});y = x:reshape(x:size(1),1);print(y);print(y:dim())
1
2
3
4
5
[torch.DoubleTensor of size 5x1]
2
-- it can also be written like this
th> y = torch.Tensor({{1},{2},{3},{4},{5}});print(y);print(y:dim())
1
2
3
4
5
[torch.DoubleTensor of size 5x1]
-- and this way it becomes a 1x5 two-dimensional Tensor
th> y = torch.Tensor({{1,2,3,4,5}});print(y);print(y:dim())
1 2 3 4 5
[torch.DoubleTensor of size 1x5]
2
-- or like this; all of these are two-dimensional Tensors. size() gives the extent of each dimension; since there are two dimensions, it holds two values
th> x = torch.Tensor({1,2,3,4,5});y = x:reshape(1,x:size(1));print(y:dim());print(y);print(y:size())
2
1 2 3 4 5
[torch.DoubleTensor of size 1x5]
1
5
[torch.LongStorage of size 2]
gather(dim, index) extracts elements from specified positions. index must have the same shape as the output Tensor (if the output is m x n, index must be m x n too), and its type must be LongTensor:
-- dim = 1
result[i][j][k]... = src[index[i][j][k]...][j][k]...
-- dim = 2
result[i][j][k]... = src[i][index[i][j][k]...][k]...
-- etc.
-- src is the original Tensor
-- e.g. to take all the diagonal elements, with output a 1 x n two-dimensional Tensor (Tensor:dim() = 2), index must be a 1 x n two-dimensional LongTensor
th> x = torch.rand(5, 5);print(x);y = x:gather(1, torch.LongTensor{{1, 2, 3, 4, 5}, {2, 3, 4, 5, 1}});print(y)
0.2188 0.3625 0.7812 0.2781 0.9327
0.5342 0.3879 0.7225 0.6031 0.7325
0.1464 0.4534 0.5134 0.9993 0.6617
0.0594 0.6398 0.1741 0.7357 0.6613
0.2926 0.7286 0.7255 0.7108 0.1820
[torch.DoubleTensor of size 5x5]
0.2188 0.3879 0.5134 0.7357 0.1820
0.5342 0.4534 0.1741 0.7108 0.9327
[torch.DoubleTensor of size 2x5]
-- to extract the diagonal elements (1,1),(2,2),(3,3),(4,4),(5,5) as a 1x5 (two-dimensional) Tensor: index = torch.LongTensor({{1, 2, 3, 4, 5}}); y = x:gather(1, index), where index is a 1x5 two-dimensional Tensor
-- to extract the off-diagonal elements (1,2),(2,3),(3,4),(4,5),(5,1) as a 1x5 Tensor: index = torch.LongTensor({{5, 1, 2, 3, 4}}); y = x:gather(1, index), or index = torch.LongTensor({{2, 3, 4, 5, 1}}); y = x:gather(2, index)
-- an example:
th> x = torch.rand(5, 5);print(x);y = x:gather(2, torch.LongTensor({ {1,2},{2,3},{3,4},{4,5},{5,1} }) );print(y)
0.8563 0.2664 0.6895 0.8124 0.0788
0.0503 0.6646 0.7659 0.4013 0.0670
0.4760 0.0517 0.9621 0.7437 0.1162
0.4069 0.9932 0.6118 0.6200 0.3585
0.9795 0.9601 0.9098 0.4714 0.5577
[torch.DoubleTensor of size 5x5]
0.8563 0.2664
0.6646 0.7659
0.9621 0.7437
0.6200 0.3585
0.5577 0.9795
[torch.DoubleTensor of size 5x2]
scatter(dim, index, src|val) is the inverse: it writes another Tensor src, or a scalar val, into self. The parameters have the same meaning as in gather.
x = torch.rand(2, 5)
> x
0.3227 0.4294 0.8476 0.9414 0.1159
0.7338 0.5185 0.2947 0.0578 0.1273
[torch.DoubleTensor of size 2x5]
y = torch.zeros(3, 5):scatter(1, torch.LongTensor{{1, 2, 3, 1, 1}, {3, 1, 1, 2, 3}}, x)
> y
0.3227 0.5185 0.2947 0.9414 0.1159
0.0000 0.4294 0.0000 0.0578 0.0000
0.7338 0.0000 0.8476 0.0000 0.1273
[torch.DoubleTensor of size 3x5]
z = torch.zeros(2, 4):scatter(2, torch.LongTensor{{3}, {4}}, 1.23)
> z
0.0000 0.0000 1.2300 0.0000
0.0000 0.0000 0.0000 1.2300
[torch.DoubleTensor of size 2x4]
nonzero(tensor) returns an n x dim LongTensor containing the indices of all non-zero elements of the original Tensor (here the input is two-dimensional, so each row of the result is a row/column index pair):
th> x = torch.rand(4, 4):mul(3):floor():int();y = torch.nonzero(x);print(x);print(y)
2 1 0 2
2 1 2 1
1 0 1 0
2 0 2 2
[torch.IntTensor of size 4x4]
1 1
1 2
1 4
2 1
2 2
2 3
2 4
3 1
3 3
4 1
4 3
4 4
[torch.LongTensor of size 12x2]
Transposition: transpose(dim1, dim2), or simply t(), which works only on two-dimensional Tensors.
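As a small sketch, note that t() (like narrow and select) returns a view sharing storage with the original:

```lua
x = torch.range(1, 6):reshape(2, 3)  -- 1 2 3 / 4 5 6
y = x:t()                            -- 3x2 view; t() equals transpose(1, 2) for 2-D Tensors
y[1][1] = 100                        -- writing through the view also changes x
print(x[1][1])                       -- 100
```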
For multi-dimensional Tensors there is permute(dim1, dim2, ..., dimn):
x = torch.Tensor(3,4,2,5)
> x:size()
3
4
2
5
[torch.LongStorage of size 4]
y = x:permute(2,3,1,4) -- equivalent to y = x:transpose(1,3):transpose(1,2)
> y:size()
4
2
3
5
[torch.LongStorage of size 4]
apply(function) applies a function to every element of a Tensor:
i = 0
z = torch.Tensor(3,3)
z:apply(function(x)
i = i + 1
return i
end)
> z
1 2 3
4 5 6
7 8 9
[torch.DoubleTensor of dimension 3x3]
map(tensor, function(xs, xt)) processes the corresponding elements of two Tensors, writing the result into self:
x = torch.Tensor(3,3)
y = torch.Tensor(9)
i = 0
x:apply(function() i = i + 1; return i end) -- fill-up x
i = 0
y:apply(function() i = i + 1; return i end) -- fill-up y
> x
1 2 3
4 5 6
7 8 9
[torch.DoubleTensor of dimension 3x3]
> y
1
2
3
4
5
6
7
8
9
[torch.DoubleTensor of dimension 9]
th> z =x:map(y, function(xx, yy) return xx*xx + yy end);print(z)
2 6 12
20 30 42
56 72 90
[torch.DoubleTensor of size 3x3]
split([result,] tensor, size, [dim]) splits a Tensor into chunks; size is the chunk size and dim is the dimension along which to split:
th> x = torch.randn(3,10,15)
th> x:split(2,1)
{
1 : DoubleTensor - size: 2x10x15
2 : DoubleTensor - size: 1x10x15
}
th> x:split(4,2)
{
1 : DoubleTensor - size: 3x4x15
2 : DoubleTensor - size: 3x4x15
3 : DoubleTensor - size: 3x2x15
}
th> x:split(5,3)
{
1 : DoubleTensor - size: 3x10x5
2 : DoubleTensor - size: 3x10x5
3 : DoubleTensor - size: 3x10x5
}
Then there are the math functions on Tensors, e.g. sum, add, sub, mul, div, mode, max, min, std, mean, pow, rand, randn, log, range, exp, abs, floor, sqrt, and many more; see the Math section of the official Torch documentation.
Because they are so common, these functions live in the torch package itself, so you can write either mean = torch.mean(x) or mean = x:mean().
th> x = torch.range(1,15,3);print(x);print(x:mean());print(torch.mean(x))
1
4
7
10
13
[torch.DoubleTensor of size 5]
7
7
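A brief sketch of a few more of these functions; note that in-place methods such as mul() modify the Tensor itself, so clone() first when the original must be preserved:

```lua
x = torch.Tensor({1, 2, 3, 4})
print(torch.sum(x))   -- 10
print(x:max())        -- 4
y = x:clone():mul(2)  -- y is 2 4 6 8; x itself is untouched
print(y)
```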
Another very common one is torch.sort([resval, resind,] x [,dim] [,flag]). It sorts in ascending order by default (pass true for flag to sort in descending order) and returns two tensors: the sorted tensor, and a tensor giving the original index of each sorted element.
th> x = torch.Tensor({8.3,3.4,5.7,1.7,9.3});y,index = x:sort(true);print(y,index)
-- this can also be written as y,index = torch.sort(x,true);print(y,index)
9.3000
8.3000
5.7000
3.4000
1.7000
[torch.DoubleTensor of size 5]
5
1
3
2
4
[torch.LongTensor of size 5]
Next up: implementing linear regression in Torch.