torch.nn.functional.interpolate()
This function performs up-sampling or down-sampling; the output size can be given either as size or as scale_factor. It accepts 3D, 4D, and 5D tensor inputs.
The interpolation algorithm is selectable, e.g. nearest-neighbor, linear, or bilinear.
Function parameters:
input (Tensor) – the input tensor
size (int or Tuple[int] or Tuple[int, int] or Tuple[int, int, int]) – output spatial size.
scale_factor (float or Tuple[float]) – multiplier for spatial size. Has to match input size if it is a tuple.
mode (str) – algorithm used for upsampling: 'nearest' | 'linear' | 'bilinear' | 'bicubic' |'trilinear' | 'area'. Default: 'nearest'
align_corners (bool, optional) – Geometrically, we consider the pixels of the input and output as squares rather than points. If set to True, the input and output tensors are aligned by the center points of their corner pixels, preserving the values at the corner pixels. If set to False, the input and output tensors are aligned by the corner points of their corner pixels, and the interpolation uses edge value padding for out-of-boundary values, making this operation independent of input size when scale_factor is kept the same. This only has an effect when mode is 'linear', 'bilinear', 'bicubic' or 'trilinear'. Default: False
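A minimal usage sketch (the input shapes and values below are illustrative assumptions, not from the original notes):
import torch
import torch.nn.functional as F
x = torch.randn(1, 3, 32, 32)                                                  # 4D input: (N, C, H, W)
up = F.interpolate(x, scale_factor=2, mode='nearest')                          # -> (1, 3, 64, 64)
down = F.interpolate(x, size=(16, 16), mode='bilinear', align_corners=False)   # -> (1, 3, 16, 16)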
Expanding dimensions to BHWC
https://www.cnblogs.com/skyfsm/p/8276501.html
import numpy as np
import torch
img = np.expand_dims(img, axis=0)  # (H, W, C) -> (1, H, W, C): add a batch dimension
img = torch.Tensor(img)            # convert the NumPy array to a torch.FloatTensor
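The same step can be done in pure PyTorch (a sketch, assuming img is an H x W x C NumPy image); since torch.nn layers expect NCHW, the batch-expanded tensor usually also needs a permute:
img_t = torch.from_numpy(img).unsqueeze(0)   # (H, W, C) -> (1, H, W, C)
img_t = img_t.permute(0, 3, 1, 2).float()    # BHWC -> BCHW, as expected by torch.nn layers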
torch.einsum
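A minimal sketch of torch.einsum with two common patterns (tensor names and shapes are illustrative):
import torch
a = torch.randn(3, 4)
b = torch.randn(4, 5)
mm = torch.einsum('ij,jk->ik', a, b)     # matrix multiplication, same result as a @ b
x = torch.randn(8, 16)
y = torch.randn(8, 16)
dots = torch.einsum('bd,bd->b', x, y)    # row-wise dot products, shape (8,)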
softmax
https://blog.csdn.net/bitcarmanlee/article/details/82320853
https://www.jianshu.com/p/7e200a487916
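A minimal sketch of the softmax definition softmax(x)_i = exp(x_i) / sum_j exp(x_j), checked against the library call:
import torch
x = torch.tensor([1.0, 2.0, 3.0])
p = torch.softmax(x, dim=0)           # library implementation
p_manual = x.exp() / x.exp().sum()    # direct definition
print(p, p.sum())                     # probabilities that sum to 1
print(torch.allclose(p, p_manual))    # True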
attention (worth exploring in depth)
Learn in this order:
Introductory version
binary classification → multi-class classification
single-head attention → multi-head attention
Papers
Hierarchical Attention Networks for Document Classification
Neural machine translation by jointly learning to align and translate (the first paper to propose the Soft Attention Model)
Attention Mechanism explained in detail: principles, categories, and applications
Attention can be implemented directly with a Dense layer whose activation is softmax: multiplying the Dense layer's output by the Dense layer's input completes the assignment of the attention weights. The implementation here looks more complicated, but it is essentially still those two steps; the dimensions are simply expanded to make the problem more general.
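A minimal sketch of those two steps in PyTorch (a Linear layer standing in for the Dense layer; the shapes and variable names are illustrative assumptions):
import torch
import torch.nn as nn
import torch.nn.functional as F
seq = torch.randn(2, 10, 64)                         # (batch, time_steps, features)
dense = nn.Linear(64, 1)                             # "Dense" layer: one score per time step
scores = dense(seq).squeeze(-1)                      # (batch, time_steps)
weights = F.softmax(scores, dim=1)                   # softmax activation -> attention weights
context = (weights.unsqueeze(-1) * seq).sum(dim=1)   # weights * input, summed over time: (batch, features)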