PyTorch Tensors Explained - Neural Network Programming
Instances of the torch.Tensor class
PyTorch tensors are instances of the torch.Tensor Python class. We can create a torch.Tensor object using the class constructor like so:
> import torch
> t = torch.Tensor()
> type(t)
torch.Tensor
Tensor attributes
Every torch.Tensor has these attributes:
- torch.dtype
- torch.device
- torch.layout
> print(t.dtype)
> print(t.device)
> print(t.layout)
torch.float32
cpu
torch.strided
Tensors have a torch.dtype
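The dtype, which is torch.float32 in our case, specifies the type of the data contained within the tensor. Tensors contain uniform (same-type) numerical data, with types such as torch.float16, torch.float32, torch.float64, torch.uint8, torch.int8, torch.int16, torch.int32, and torch.int64.
Historically, operations between tensors required both operands to have the same dtype. PyTorch 1.3 and later promote mixed dtypes in many arithmetic operations, but some operations still require matching dtypes, so it pays to know what each tensor holds.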
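The following is a minimal sketch of how dtype interacts with arithmetic, assuming PyTorch 1.3 or later, where mixed-dtype arithmetic is promoted rather than rejected. The variable names are illustrative only.

```python
import torch

t_int = torch.tensor([1, 2, 3])          # a list of Python ints is inferred as torch.int64
t_float = torch.tensor([1.0, 2.0, 3.0])  # a list of Python floats is inferred as torch.float32

# Explicit cast: convert the integer tensor before combining.
explicit = t_int.to(torch.float32) + t_float

# Implicit type promotion (assumption: PyTorch >= 1.3).
promoted = t_int + t_float

print(explicit.dtype, promoted.dtype)
```

Casting explicitly makes the intended result type visible in the code, which can help when an operation does not support promotion.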
Tensors have a torch.device
The device, cpu in our case, specifies the device (CPU or GPU) where the tensor’s data is allocated.
This determines where tensor computations for the given tensor will be performed.
PyTorch supports the use of multiple devices, and they are specified using an index like so:
> device = torch.device('cuda:0')
> device
device(type='cuda', index=0)
Tensor operations between tensors must happen between tensors that exist on the same device.
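One common way to satisfy the same-device requirement is to pick a device once and move every operand to it. The sketch below assumes a machine that may or may not have a CUDA GPU, and falls back to the CPU when none is available.

```python
import torch

# Fall back to the CPU when no CUDA device is available.
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')

t1 = torch.tensor([1, 2, 3])  # tensors are created on the CPU by default
t2 = torch.tensor([4, 5, 6])

# Move both operands to the same device before combining them.
t1 = t1.to(device)
t2 = t2.to(device)
result = t1 + t2

print(result.device)
```

The result lives on the same device as its operands, so it has to be moved back with .cpu() before it can be converted to a NumPy array.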
Tensors have a torch.layout
The layout, strided in our case, specifies how the tensor is stored in memory.
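In a strided layout, each dimension has a stride: the number of elements to skip in the underlying storage to move one step along that dimension. The short sketch below illustrates this with a small tensor and its transpose. The names are illustrative.

```python
import torch

t = torch.zeros(2, 3)
print(t.layout)    # torch.strided
print(t.stride())  # (3, 1): one row down skips 3 elements, one column over skips 1

# Transposing returns a view with swapped strides; no data is moved.
tt = t.t()
print(tt.stride())         # (1, 3)
print(tt.is_contiguous())  # False
```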
Take away from the tensor attributes
As neural network programmers, we need to be aware of the following:
- Tensors contain data of a uniform type (dtype).
- Tensor computations between tensors depend on the dtype and the device.
Creating tensors using data
These are the primary ways of creating tensor objects (instances of the torch.Tensor class), with data (array-like) in PyTorch:
1. torch.Tensor(data)
2. torch.tensor(data)
3. torch.as_tensor(data)
4. torch.from_numpy(data)
Here is an example:
> import numpy as np
> data = np.array([1,2,3])
> type(data)
numpy.ndarray
> o1 = torch.Tensor(data)
> o2 = torch.tensor(data)
> o3 = torch.as_tensor(data)
> o4 = torch.from_numpy(data)
> print(o1)
> print(o2)
> print(o3)
> print(o4)
tensor([1., 2., 3.])
tensor([1, 2, 3], dtype=torch.int32)
tensor([1, 2, 3], dtype=torch.int32)
tensor([1, 2, 3], dtype=torch.int32)
Tensor creation operations: What’s the difference?
Uppercase/lowercase: torch.Tensor() vs torch.tensor()
The first option, with the uppercase T, is the constructor of the torch.Tensor class.
The second option is what we call a factory function: it constructs torch.Tensor objects and returns them to the caller. You can think of the torch.tensor() function as a factory that builds tensors given some parameter inputs. Factory functions are a software design pattern for creating objects.
Default dtype vs inferred dtype
> print(o1.dtype)
> print(o2.dtype)
> print(o3.dtype)
> print(o4.dtype)
torch.float32
torch.int32
torch.int32
torch.int32
The difference arises because the torch.Tensor() constructor uses the default dtype when building the tensor.
> torch.get_default_dtype()
torch.float32
> o1.dtype == torch.get_default_dtype()
True
The other calls choose a dtype based on the incoming data. This is called type inference.
Note that the dtype can also be explicitly set for these calls by specifying the dtype as an argument:
> torch.tensor(data, dtype=torch.float32)
> torch.as_tensor(data, dtype=torch.float32)
With torch.Tensor(), we are unable to pass a dtype to the constructor. This is an example of the torch.Tensor() constructor lacking configuration options.
This is one of the reasons to go with the torch.tensor() factory function for creating our tensors.
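The sketch below puts the three behaviors side by side: default dtype, inferred dtype, and explicit dtype. It fixes the NumPy dtype to int64 as an assumption, so the result does not depend on the platform. The int32 shown in the output above likely reflects NumPy's default integer type on the machine where the example was run.

```python
import numpy as np
import torch

# Assumption: pin the array dtype so inference gives the same answer everywhere.
data = np.array([1, 2, 3], dtype=np.int64)

o1 = torch.Tensor(data)                        # uses the global default dtype
o2 = torch.tensor(data)                        # infers the dtype from the array
o3 = torch.tensor(data, dtype=torch.float64)   # explicit override

print(o1.dtype, o2.dtype, o3.dtype)
```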
Sharing memory for performance: copy vs share
Here is an example:
> data = np.array([1,2,3])
> type(data)
numpy.ndarray
> o1 = torch.Tensor(data)
> o2 = torch.tensor(data)
> o3 = torch.as_tensor(data)
> o4 = torch.from_numpy(data)
> print('old:', data)
old: [1 2 3]
# Change only the original array; o1, o2, o3, and o4 are not modified directly
> data[0] = 0
> print('new:', data)
new: [0 2 3]
> print(o1)
> print(o2)
> print(o3)
> print(o4)
tensor([1., 2., 3.])
tensor([1, 2, 3], dtype=torch.int32)
tensor([0, 2, 3], dtype=torch.int32)
tensor([0, 2, 3], dtype=torch.int32)
The first two, o1 and o2, still have the original value of 1 at index 0, while the last two, o3 and o4, have the new value of 0 at index 0.
This happens because torch.Tensor() and torch.tensor() copy their input data while torch.as_tensor() and torch.from_numpy() share their input data in memory with the original input object.
This sharing just means that the actual data in memory exists in a single place. As a result, any changes that occur in the underlying data will be reflected in both objects, the torch.Tensor and the numpy.ndarray.
Sharing data is more efficient and uses less memory than copying data because the data is not written to two locations in memory.
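Sharing works in both directions. As a small sketch (variable names are illustrative), writing through the shared tensor changes the NumPy array as well, while a copied tensor is unaffected.

```python
import numpy as np
import torch

data = np.array([1, 2, 3])

copied = torch.tensor(data)      # independent copy of the data
shared = torch.from_numpy(data)  # tensor backed by the array's own buffer

# Write through the tensor this time, rather than through the array.
shared[0] = 100

print(data)    # the array sees the change
print(copied)  # the copy still holds the original values
```

np.shares_memory(data, shared.numpy()) can be used to check whether two objects are backed by the same buffer.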
If we have a torch.Tensor and we want to convert it to a numpy.ndarray, we do it like so:
> print(o3.numpy())
> print(o4.numpy())
[0 2 3]
[0 2 3]
> print(type(o3.numpy()))
> print(type(o4.numpy()))
<class 'numpy.ndarray'>
<class 'numpy.ndarray'>
This establishes that torch.as_tensor() and torch.from_numpy() both share memory with their input data. However, which one should we use, and how are they different?
The torch.from_numpy() function only accepts numpy.ndarrays, while the torch.as_tensor() function accepts a wide variety of array-like objects including other PyTorch tensors. For this reason, torch.as_tensor() is the winning choice in the memory sharing game.
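The difference in accepted inputs can be sketched as follows. The example compares data pointers to show that torch.as_tensor() does not copy an existing tensor with a matching dtype and device, and catches the TypeError that torch.from_numpy() raises for a plain list.

```python
import numpy as np
import torch

# as_tensor() on an existing tensor: same dtype and device, so no copy is made.
t = torch.tensor([1, 2, 3])
same = torch.as_tensor(t)
print(same.data_ptr() == t.data_ptr())

# as_tensor() on an ndarray shares the array's memory.
arr = np.array([1, 2, 3])
from_arr = torch.as_tensor(arr)

# from_numpy() rejects anything that is not an ndarray.
try:
    torch.from_numpy([1, 2, 3])
    rejected = False
except TypeError:
    rejected = True
print(rejected)
```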
Best options for creating tensors in PyTorch
These two are the best options:
- torch.tensor()
- torch.as_tensor()
The torch.tensor() call is the go-to choice for everyday use, while torch.as_tensor() is worth employing when tuning code for performance.
Some things to keep in mind about memory sharing (it works where it can):
- Since numpy.ndarray objects are allocated on the CPU, the as_tensor() function must copy the data from the CPU to the GPU when a GPU is being used.
- The memory sharing of as_tensor() doesn’t work with built-in Python data structures like lists.
- The as_tensor() call requires developer knowledge of the sharing feature, so that we don’t inadvertently change the underlying data without realizing the change affects multiple objects.
- The as_tensor() performance improvement is greater when there are many back-and-forth operations between numpy.ndarray objects and tensor objects. With just a single load operation, there shouldn’t be much impact from a performance perspective.
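Two of the cases where sharing silently becomes copying can be sketched as follows: a Python list as input, and a requested dtype that differs from the array's dtype. The variable names are illustrative.

```python
import numpy as np
import torch

# A Python list cannot be shared, so as_tensor() copies it.
values = [1, 2, 3]
from_list = torch.as_tensor(values)
values[0] = 0
print(from_list)  # unaffected by the change to the list

# Requesting a different dtype also forces a copy.
arr = np.array([1, 2, 3])
as_float = torch.as_tensor(arr, dtype=torch.float32)
arr[0] = 0
print(as_float)   # unaffected by the change to the array
```

Because the fallback to copying is silent, it is worth checking for shared memory explicitly when code relies on it.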
Creation options without data
We have the torch.eye() function, which returns a 2-D tensor with ones on the diagonal and zeros elsewhere. The name eye() is connected to the idea of an identity matrix, which is a square matrix with ones on the main diagonal and zeros everywhere else.
> print(torch.eye(2))
tensor([
[1., 0.],
[0., 1.]
])
We have the torch.zeros() function, which creates a tensor of zeros with the shape given by the shape argument.
> print(torch.zeros([2,2]))
tensor([
[0., 0.],
[0., 0.]
])
Similarly, we have the torch.ones() function that creates a tensor of ones.
> print(torch.ones([2,2]))
tensor([
[1., 1.],
[1., 1.]
])
We also have the torch.rand() function, which creates a tensor of the specified shape whose values are drawn uniformly at random from the interval [0, 1).
> print(torch.rand([2,2]))
tensor([
[0.0465, 0.4557],
[0.6596, 0.0941]
])
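These creation functions also accept optional arguments such as dtype. The sketch below shows a few variations, with torch.manual_seed() used so the random values are repeatable between runs. The shapes and seed are arbitrary choices for illustration.

```python
import torch

torch.manual_seed(0)  # make the random values repeatable

identity = torch.eye(3)                       # 3x3 identity matrix
zeros = torch.zeros(2, 3, dtype=torch.int64)  # override the default float32
ones = torch.ones(2, 3)
rand = torch.rand(2, 3)                       # uniform samples from [0, 1)

print(identity.shape, zeros.dtype, ones.sum().item())
print(rand)
```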
For more details, see: