Table of Contents
- 1. Mask
- 2. Image transformation
- 3. Quartile Data
- 4. `__iter__` & `__next__`
- 5. defaultdict
- 6. `@staticmethod`
- 7. log_softmax
- 8. save dataloader
- 9. torch nn.functional vs nn
- 10. `preds = F.linear(feats, output_weight, output_bias)`
- 11. `output_weight.grad.fill_(0)`
- 12. `init_weight.detach().requires_grad_()`
- 13. torch vs torchvision
- 14. `np.lib.stride_tricks.as_strided`
- 15. `np.einsum`
- 16. `x.flatten()`
- 17. `nn.Embedding`
- 18. pytorch_lightning
1. Mask
`mask = (targets[:, None] == class_set[None, :]).any(dim=-1)`
targets: tensor([19, 29, 0, …, 51, 42, 70]): 60000 values in total
class_set: tensor([19, 29, 0, …, 51, 42, 70]): 80 values in total
targets[:].shape = 60000
targets[:, None].shape = torch.Size([60000, 1])
class_set[:].shape = 80
class_set[None, :].shape = torch.Size([1, 80])
The final shape of mask is torch.Size([60000]).
!! Indexing with None (inserting a singleton dimension) lets two tensors of unequal length be matched against each other via broadcasting, without raising an error. For example, if the code above is changed to:
`b = (targets[:] == class_set[:]).any(dim=-1)`
Error: The size of tensor a (60000) must match the size of tensor b (80) at non-singleton dimension 0
`b = (targets[None, :] == class_set[None, :]).any(dim=-1)`
Error: The size of tensor a (60000) must match the size of tensor b (80) at non-singleton dimension 1
`b = (targets[None, :] == class_set[:, None]).any(dim=-1)`
b's final shape is torch.Size([80]): one entry per element of class_set.
`b = (targets[None, :] == class_set[:, None]).any(dim=0)`
b's final shape is torch.Size([60000]): one entry per element of targets, same as the original code.
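A minimal runnable sketch of the same broadcasting trick; the small tensors here are made-up stand-ins for the 60000/80-element tensors above:

```python
import torch

targets = torch.tensor([19, 29, 0, 51, 42, 70])   # hypothetical labels
class_set = torch.tensor([0, 42, 7])               # hypothetical subset of classes

# [6, 1] == [1, 3] broadcasts to [6, 3]; any(dim=-1) reduces over class_set
mask = (targets[:, None] == class_set[None, :]).any(dim=-1)
print(mask)            # tensor([False, False,  True, False,  True, False])
print(targets[mask])   # tensor([ 0, 42])
```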
2. Image transformation
Method 1: divide by 255
The resulting data is not centered at 0. If the data will later be analyzed with an algorithm such as PCA, data preprocessed this way is not suitable.
Method 2: normalize with mean & std
Data processed this way is centered at 0. (Recommended!!)
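A sketch of both methods with torchvision transforms; the MNIST mean/std values (0.1307, 0.3081) are the commonly quoted ones and are an assumption here, not taken from the text above:

```python
from torchvision import transforms

# Method 1: ToTensor alone already divides pixel values by 255 -> range [0, 1]
to_unit_range = transforms.ToTensor()

# Method 2: additionally standardize, so the data is centered at 0
normalize = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=(0.1307,), std=(0.3081,)),  # assumed MNIST stats
])
```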
3. Quartile Data
Example
10, 30, 5, 12, 20, 40, 25, 15, 18
- Sorted: 5, 10, 12, 15, 18, 20, 25, 30, 40
- $\text{First Quartile} = \left(\frac{n + 1}{4}\right)\text{th term}$, i.e. the 1/4 point
- $\text{Second Quartile} = \left(\frac{n + 1}{2}\right)\text{th term}$, i.e. the 2/4 point
- $\text{Third Quartile} = \left(\frac{3(n + 1)}{4}\right)\text{th term}$, i.e. the 3/4 point
First Quartile = ((9 + 1)/4)th term
= (10/4)th term
= 2.5th term
2.5th term = 2nd term + (0.5) (3rd term - 2nd term)
= (10) + (0.5) (12 - 10)
= 10+1
= 11
The First Quartile value is 11.
Second Quartile = $\frac{(9 + 1)}{2}$th term
= (10/2)th term
= 5th term
5th term is 18
So the second Quartile value is 18.
Third Quartile = $\frac{3(9 + 1)}{4}$th term
= $\frac{3(10)}{4}$th term
= 7.5th term
The 7.5th term is the average of the 7th and 8th terms = (25 + 30)/2 = 27.5
The Third Quartile value is 27.5.
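A quick check of these values with NumPy; `method='weibull'` selects the $(n+1)p$ quantile convention used above and requires NumPy ≥ 1.22 (an assumption worth verifying against your version):

```python
import numpy as np

data = np.array([10, 30, 5, 12, 20, 40, 25, 15, 18])
q1, q2, q3 = np.percentile(data, [25, 50, 75], method='weibull')
print(q1, q2, q3)   # 11.0 18.0 27.5
```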
4. `__iter__` & `__next__`
When building a class, implementing the built-in `__iter__` method makes the object iterable, and the next value can then be fetched with the built-in `__next__` method.
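A minimal sketch of a class implementing both methods (the `Countdown` name is made up for illustration):

```python
class Countdown:
    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # returning self makes the object its own iterator
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration  # signals the end of iteration
        self.current -= 1
        return self.current + 1

for n in Countdown(3):
    print(n)   # 3, 2, 1
```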
5. defaultdict
Features:
- Similar to initializing a list with a default value in Java
- No KeyError
- Because of this design (no KeyError), it is more convenient than a plain dictionary: there is no need to check whether a key exists
- Example:

```python
from collections import defaultdict

d = defaultdict(list)   # missing keys default to an empty list
for i in range(5):
    d[i].append(i)      # no KeyError even though d[i] did not exist yet
```
6. `@staticmethod`
Think of it as a small utility method attached to a class. Because of the decorator, you must not add `self` to the method's parameters, and the method therefore cannot access the class's instance attributes.
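A small sketch (class and method names are made up for illustration):

```python
class Stats:
    def __init__(self, values):
        self.values = values

    @staticmethod
    def mean(values):
        # no `self` parameter: the method cannot see self.values
        return sum(values) / len(values)

print(Stats.mean([1, 2, 3]))   # callable on the class itself: 2.0
```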
7. log_softmax
Better than a plain softmax: working in log space is numerically more stable and makes the differences between class scores more pronounced.
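A sketch of the numerical-stability point: composing `log` with `softmax` underflows for extreme logits, while `F.log_softmax` stays finite:

```python
import torch
import torch.nn.functional as F

x = torch.tensor([1000.0, 0.0])           # extreme logits
print(torch.log(F.softmax(x, dim=0)))     # tensor([0., -inf]) -- underflow
print(F.log_softmax(x, dim=0))            # tensor([0., -1000.]) -- stable
```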
8. save dataloader
Use pkl:
```python
torch.save(dls, 'fname.pkl')
```
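Loading is symmetric; a sketch assuming the same session conventions (`dls` and `fname.pkl` come from the line above):

```python
import torch

dls = torch.load('fname.pkl')   # unpickles the saved DataLoaders object
```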
9. torch nn.functional vs nn
nn.functional stores no state. So for, e.g., `nn.functional.linear`, you must pass in the weight and the bias yourself.
`nn.Linear` is a Module class: instantiating it returns an object that owns its weight and bias.
`nn.functional.linear` returns a tensor of shape (N, ∗, out_features).
nn.functional is arguably more user friendly when you need unusual, fine-grained manipulation of the weights and biases.
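A side-by-side sketch of the two styles:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.rand(8, 5)

# nn: the module owns its parameters
layer = nn.Linear(5, 3)
out1 = layer(x)

# nn.functional: stateless, you supply weight and bias explicitly
weight = torch.rand(3, 5, requires_grad=True)
bias = torch.rand(3, requires_grad=True)
out2 = F.linear(x, weight, bias)

print(out1.shape, out2.shape)   # torch.Size([8, 3]) torch.Size([8, 3])
```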
10. `preds = F.linear(feats, output_weight, output_bias)`
Here `F` is the `nn.functional` discussed in section 9.
11. `output_weight.grad.fill_(0)`
autograd by default frees the intermediate gradients that are not needed anymore, so that the memory usage is minimal.
retain_variables will only prevent autograd from freeing some buffers needed for backward (e.g. when you want to backprop multiple times through a graph). Use hooks to access intermediate gradients.
When using nn.functional with manually managed parameters, the usual pattern for resetting gradients is:
```python
# Reset gradients
local_optim.zero_grad()
output_weight.grad.fill_(0)
output_bias.grad.fill_(0)
```
12. `init_weight.detach().requires_grad_()`
- detach() detaches the output from the computational graph, so no gradient will be backpropagated along this variable.
- torch.no_grad says that no operation should build the graph.
The difference is that one refers only to the given variable it is called on, while the other affects all operations taking place within the with statement.
torch.no_grad can generally be used in the eval phase.
- detach(), on the other hand, should not be needed for classic CNN-like architectures; it is usually used for trickier operations.
- detach() is useful when you want to compute something that you can't / don't want to differentiate.
- For example, if you compute some indices from the output of the network and then want to use them to index a tensor: the indexing operation is not differentiable w.r.t. the indices, so you should detach() the indices before providing them.
- The version with torch.no_grad will use less memory.
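A sketch contrasting the two (toy tensors, just to show what records gradients):

```python
import torch

x = torch.rand(3, requires_grad=True)

y = (x * 2).detach()      # y is cut out of the graph
print(y.requires_grad)    # False

with torch.no_grad():
    z = x * 2             # no graph is built inside the block
print(z.requires_grad)    # False

w = x * 2                 # outside the block, operations are recorded
print(w.requires_grad)    # True
```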
13. torch vs torchvision
torchvision is essentially a toolbox: a collection of packages that support PyTorch projects and save you from reinventing the wheel. For example, the DenseNet model (not to be confused with a dense layer) can be called directly from torchvision.
A dense layer, also called a fully-connected layer, is a layer whose neurons connect to every neuron in the preceding layer.
torchvision.models.DenseNet
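A sketch of pulling a ready-made DenseNet; `densenet121` is one of the concrete variants torchvision ships (whether you pass `weights=...` or the older `pretrained=True` depends on your torchvision version):

```python
import torch
from torchvision import models

model = models.densenet121(weights=None)   # randomly initialized DenseNet-121
out = model(torch.rand(1, 3, 224, 224))    # standard ImageNet-sized input
print(out.shape)                           # torch.Size([1, 1000])
```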
14. `np.lib.stride_tricks.as_strided`
Simulates a sliding window to produce the output of filtering an image. The `shape` argument of the function changes the (viewed) size of the original image.
Example:
```python
import numpy as np

def conv2d(image, ftr):
    # output shape: (fh, fw, H - fh + 1, W - fw + 1)
    s = ftr.shape + tuple(np.subtract(image.shape, ftr.shape) + 1)
    # 4-D view where sub_image[:, :, k, l] is the window of image at offset (k, l);
    # the strides tuple is duplicated, no data is copied
    sub_image = np.lib.stride_tricks.as_strided(image, shape=s, strides=image.strides * 2)
    # multiply each window by the filter and sum over the filter axes
    return np.einsum('ij,ijkl->kl', ftr, sub_image)
```
Filter = 3×3, Image = 100×100
- `ftr.shape` is already a tuple; `tuple(np.subtract(image.shape, ftr.shape) + 1)` converts the numpy-array result back into a tuple so the two can be concatenated.
s = ftr.shape + tuple(np.subtract(image.shape, ftr.shape) + 1)
=> [3, 3] + (np.array([(100, 100) − (3, 3)]) + 1)
=> [3, 3] + [(97, 97) + 1]
=> [3, 3] + [98, 98]
=> [3, 3, 98, 98]
The last step here is just like concatenating two lists. This trick can be borrowed for constructing new shapes.
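A quick usage sketch of the `conv2d` above (random image, with a 3×3 mean filter as an assumed example):

```python
image = np.random.rand(100, 100)
ftr = np.ones((3, 3)) / 9.0   # assumed: a 3x3 mean filter
out = conv2d(image, ftr)
print(out.shape)              # (98, 98)
# sanity-check one output position against a direct computation
assert np.isclose(out[0, 0], (image[:3, :3] * ftr).sum())
```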
15. `np.einsum`
- Both torch and numpy support this operator.
- Multiplying two matrices by hand takes for loops to implement the summation rule; with einsum the code is concise, and you don't have to remember that the second matrix must be given in transposed form.
Basic usage:
`np.einsum('ij,ijkl->kl', ftr, sub_image)`
`ij,ijkl->kl`: `ij` is the shape of ftr; `ijkl` is the shape of sub_image; `ij,ijkl` means the `ij`-shaped matrix is multiplied with the `ijkl`-shaped matrix; `->kl` sets the rule for that product and specifies `kl` as the output.
Here `kl` is the outer loop and `ij` is the inner loop (the indices that get summed away).
Other uses:
Sum all values of x:
```python
x = np.ones(3)
sum_x = np.einsum('i->', x)
```
Permute x from axis order 'ijk' to 'kji' (a reversal of the axes):
```python
x = np.ones((5, 4, 3))
np.einsum('ijk->kji', x)
```
Element-wise multiplication:
`'ij,ij->ij'`
Batch multiplication:
```python
A = torch.rand((3, 2, 5))
B = torch.rand((3, 5, 3))
torch.einsum('ijk,ikl->ijl', A, B)
```
`i` is the batch dimension; effectively `jk` is multiplied with `kl`.
Matrix diagonal:
```python
x = torch.rand((3, 3))
torch.einsum('ii->i', x)
```
The output is just the values on the diagonal.
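A quick sketch verifying the batch-multiplication spelling against `torch.bmm`:

```python
import torch

A = torch.rand(3, 2, 5)
B = torch.rand(3, 5, 3)
out = torch.einsum('ijk,ikl->ijl', A, B)
print(out.shape)                              # torch.Size([3, 2, 3])
print(torch.allclose(out, torch.bmm(A, B)))   # True
```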
16. `x.flatten()`
Pass the range of dimensions to merge and flatten as the arguments. Original: x.shape = [4, 2, 2, 3, 16, 16]
```python
import torch

x = torch.rand(4, 2, 2, 3, 16, 16)   # original: x.shape = [4, 2, 2, 3, 16, 16]
x.flatten(1, 2).shape   # [4, 4, 3, 16, 16]   flattens dims 1 through 2
x.flatten(0).shape      # [12288]             flattens dims 0 through the end
x.flatten(3).shape      # [4, 2, 2, 768]      flattens dims 3 through the end
```
Only two arguments are allowed: the start index and the end index of the flattening.
```python
x.flatten(0, 1, 2)      # error!
x.flatten(0, 2).shape   # [16, 3, 16, 16]     flattens dims 0 through 2, as intended above
```
You cannot skip dimensions. If the dims to flatten have a gap (original: x.shape = [4, 2, 2, 3, 16, 16]) and you want to combine indices (1, 3, 5) => x.shape = [4, (2×3×16), 2, 16] => [4, 96, 2, 16]:
```python
x.flatten(1, 3, 5)      # error!
# fix: first permute the gapped dims so they are adjacent
x = x.permute(0, 1, 3, 5, 2, 4)   # torch.Size([4, 2, 3, 16, 2, 16])
x.flatten(1, 3).shape             # torch.Size([4, 96, 2, 16])
```
17. `nn.Embedding`
Used to define a lookup structure, essentially a trainable tensor held in reserve. For instance, in NLP or in a Transformer, you initialize a tensor from the input for subsequent training.
For example, the input queries of a Transformer decoder: initially just a tensor of random numbers defined to the required size; as training proceeds, this tensor keeps being updated.
```python
import torch
import torch.nn as nn

embedding = nn.Embedding(10, 3)   # define what kind of tensor we want
input = torch.LongTensor([[1, 2, 4, 5], [4, 3, 2, 9]])   # say, Transformer encoder input
embedding(input)   # loosely like numpy's np.zeros_like(input) with a custom shape, but random and learnable
```
18. pytorch_lightning
pl.Trainer
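The note above is just a pointer to `pl.Trainer`; a minimal sketch of how it is typically wired up (model, data, and hyperparameters here are all made-up placeholders):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import pytorch_lightning as pl
from torch.utils.data import DataLoader, TensorDataset

class LitModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = nn.Linear(10, 2)

    def training_step(self, batch, batch_idx):
        x, y = batch
        return F.cross_entropy(self.layer(x), y)

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=1e-3)

train_dl = DataLoader(
    TensorDataset(torch.rand(64, 10), torch.randint(0, 2, (64,))),
    batch_size=8,
)
trainer = pl.Trainer(max_epochs=1)   # the Trainer owns the training loop
trainer.fit(LitModel(), train_dl)
```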