Introduction to Deep Learning with PyTorch
Neural Networks
A simple neural network
Its mathematical expression is:
$$y = f(w_1 x_1 + w_2 x_2 + b)$$
$$y = f\left(\sum_i w_i x_i + b\right)$$
$$h = \begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix} \cdot \begin{bmatrix} w_1 \\ w_2 \\ \vdots \\ w_n \end{bmatrix}$$
Tensors
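PyTorch uses tensors as its core data structure. They are very similar to NumPy arrays, and they have these advantages:
- They can be moved to the GPU quickly and easily for computation
- There is a dedicated module for computing gradients
- There is a dedicated module for building neural networks
A tensor can be one-dimensional, two-dimensional, or higher-dimensional. A common case is a three-dimensional tensor, such as an RGB image.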
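To make the idea of tensor dimensionality concrete, here is a minimal sketch that builds a 1-D, a 2-D, and a 3-D tensor and inspects their shapes. The variable names and the 3x4x4 "image" size are assumptions made up for this example; PyTorch conventionally stores images channel-first.

```python
import torch

# Illustrative tensors; names and sizes are assumptions for this example
vector = torch.tensor([1.0, 2.0, 3.0])   # 1-D tensor
matrix = torch.zeros(2, 3)               # 2-D tensor
image = torch.rand(3, 4, 4)              # 3-D tensor: 3 color channels, 4x4 pixels

for name, t in [("vector", vector), ("matrix", matrix), ("image", image)]:
    # ndim is the number of dimensions, shape is the size along each one
    print(name, t.ndim, tuple(t.shape))
```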
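Code
The code below implements the expression $y = f(w_1 x_1 + w_2 x_2 + b)$ using the input features (features), the weights (weights), and the bias (bias), and then applies the activation function (activation). The summation can be computed with torch.sum() or with the tensor method .sum().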
# First, import PyTorch
import torch
def activation(x):
    """ Sigmoid activation function

        Arguments
        ---------
        x: torch.Tensor
    """
    return 1 / (1 + torch.exp(-x))
### Generate some data
torch.manual_seed(7) # Set the random seed so things are predictable
# Features are 5 random normal variables
features = torch.randn((1, 5))
# True weights for our data, random normal variables again
weights = torch.randn_like(features)
# and a true bias term
bias = torch.randn((1, 1))
## Calculate the output of this network using the weights and bias tensors
output = activation(torch.sum(features * weights) + bias)  # option 1
output = activation((features * weights).sum() + bias)  # option 2, equivalent to option 1
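As a quick sanity check, the sketch below confirms that the two ways of summing give the same output, and compares the hand-written sigmoid against PyTorch's built-in torch.sigmoid. The names out_a, out_b, and out_ref are introduced here only for illustration.

```python
import torch

def activation(x):
    """Sigmoid activation, as defined in the code above."""
    return 1 / (1 + torch.exp(-x))

torch.manual_seed(7)
features = torch.randn((1, 5))
weights = torch.randn_like(features)
bias = torch.randn((1, 1))

out_a = activation(torch.sum(features * weights) + bias)   # function form
out_b = activation((features * weights).sum() + bias)      # method form
out_ref = torch.sigmoid((features * weights).sum() + bias) # built-in sigmoid

# The scalar sum broadcasts against the (1, 1) bias, so the output is (1, 1)
print(tuple(out_a.shape))
print(torch.allclose(out_a, out_b), torch.allclose(out_a, out_ref))
```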
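Changing tensor shape
Notice that the code above multiplies with features*weights, where * means element-wise multiplication. Suppose we want to use true matrix multiplication instead, with torch.mm() or torch.matmul() (torch.matmul() supports broadcasting; torch.mm() does not). Calling either one directly on features and weights raises an error, because both tensors have shape (1, 5), and matrix multiplication requires the number of columns of the first matrix to equal the number of rows of the second.
torch.mm(a, b) multiplies matrix a by matrix b. For example, if a has shape (1, 2) and b has shape (2, 3), the result is a matrix of shape (1, 3).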
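The shape rule can be seen directly in a small sketch. It first multiplies a (1, 2) matrix by a (2, 3) matrix, then tries torch.mm on two (1, 5) tensors to show the failure. The flag mm_failed is a name introduced only for this example.

```python
import torch

# Compatible shapes: (1, 2) x (2, 3) -> (1, 3)
a = torch.randn(1, 2)
b = torch.randn(2, 3)
print(tuple(torch.mm(a, b).shape))

# Incompatible shapes: (1, 5) x (1, 5) cannot be matrix-multiplied
features = torch.randn(1, 5)
weights = torch.randn(1, 5)
try:
    torch.mm(features, weights)
    mm_failed = False
except RuntimeError as err:
    mm_failed = True
    print("torch.mm failed:", err)
```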
So we need to transpose weights, that is, change its shape. There are several ways to do this:
- weights.reshape(a, b) will return a new tensor with the same data as weights and size (a, b). Sometimes this is a view of the same data, and sometimes it is a clone, meaning the data is copied to another part of memory.
- weights.resize_(a, b) returns the same tensor with a different shape. However, if the new shape results in fewer elements than the original tensor, some elements will be removed from the tensor (but not from memory). If the new shape results in more elements than the original tensor, the new elements will be uninitialized in memory. The underscore at the end of the method name denotes that the operation is performed in-place.
- weights.view(a, b) will return a new tensor with the same data as weights and size (a, b).
- torch.transpose(weights, 0, 1) will return the transposed weights tensor, swapping dim 0 and dim 1 of the input tensor. This is convenient because we do not have to spell out the actual dimensions of weights.
- One more approach is to use .t() to transpose the weights vector, in our case from shape (1, 5) to shape (5, 1).
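The options in the list above can be compared side by side. This is a minimal sketch, assuming weights has shape (1, 5) as in the earlier code; the dictionary candidates and the flag shares_memory are names made up for the example. Because weights is a single row, reshaping and transposing happen to produce identical values here, which would not be true for a general matrix.

```python
import torch

torch.manual_seed(7)
weights = torch.randn(1, 5)

# Four ways of turning the (1, 5) row into a (5, 1) column
candidates = {
    "reshape": weights.reshape(5, 1),
    "view": weights.view(5, 1),
    "transpose": torch.transpose(weights, 0, 1),
    "t": weights.t(),
}
for name, w in candidates.items():
    # Print the resulting shape and whether the values match the view result
    print(name, tuple(w.shape), torch.equal(w, candidates["view"]))

# view() does not copy: the result starts at the same memory address
shares_memory = candidates["view"].data_ptr() == weights.data_ptr()
print(shares_memory)
```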
## Calculate the output of this network using matrix multiplication
output = activation(torch.matmul(features, torch.transpose(weights, 0, 1)) + bias)  # either line works
output = activation(torch.mm(features, weights.view(5, 1)) + bias)  # either line works
Stack them up!
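The code above corresponds to only a single perceptron-like unit, that is, the part of the network that produces $h_1$.
To get the final output, we need a larger matrix computation.
First, compute the hidden layer: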
$$\vec{h} = [h_1 \, h_2] = \begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix} \cdot \begin{bmatrix} w_{11} & w_{12} \\ w_{21} & w_{22} \\ \vdots & \vdots \\ w_{n1} & w_{n2} \end{bmatrix}$$
The expression above gives $\vec{x}\,\mathbf{W_1}$ (bias terms are left out of the formulas for brevity). We then compute the final output y:
$$y = f_2 \! \left(\, f_1 \! \left(\vec{x} \, \mathbf{W_1}\right) \mathbf{W_2} \right)$$
Code:
### Generate some data
torch.manual_seed(7) # Set the random seed so things are predictable
# Features are 3 random normal variables
features = torch.randn((1, 3))
# Define the size of each layer in our network
n_input = features.shape[1] # Number of input units, must match number of input features
n_hidden = 2 # Number of hidden units
n_output = 1 # Number of output units
# Weights for inputs to hidden layer
W1 = torch.randn(n_input, n_hidden)
# Weights for hidden layer to output layer
W2 = torch.randn(n_hidden, n_output)
# and bias terms for hidden and output layers
B1 = torch.randn((1, n_hidden))
B2 = torch.randn((1, n_output))
## Your solution here
# add_() adds the bias in place to the temporary result of matmul
h = activation(torch.matmul(features, W1).add_(B1))
output = activation(torch.matmul(h, W2).add_(B2))
# Or, equivalently, use the following two lines
h = activation(torch.mm(features, W1) + B1)
output = activation(torch.mm(h, W2) + B2)
print(output)
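The same two-layer computation can be written as a loop over layers, which makes the stacking pattern explicit. This is only a sketch: the helper forward and its layers argument are hypothetical names introduced here, not part of the course code, and the check against the explicit two-line version is included to show that both agree.

```python
import torch

def activation(x):
    """Sigmoid activation, as defined earlier."""
    return 1 / (1 + torch.exp(-x))

def forward(x, layers):
    """Hypothetical helper: apply x <- activation(x W + B) for each (W, B) pair."""
    for W, B in layers:
        x = activation(torch.mm(x, W) + B)
    return x

torch.manual_seed(7)
features = torch.randn((1, 3))
W1, W2 = torch.randn(3, 2), torch.randn(2, 1)
B1, B2 = torch.randn((1, 2)), torch.randn((1, 1))

output = forward(features, [(W1, B1), (W2, B2)])

# Explicit two-step version for comparison
h = activation(torch.mm(features, W1) + B1)
expected = activation(torch.mm(h, W2) + B2)
print(tuple(output.shape), torch.allclose(output, expected))
```

Adding another layer then only means appending one more (W, B) pair with matching shapes to the list.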
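Converting to and from NumPy
torch.from_numpy(a)
converts the NumPy array a into a torch tensor
b.numpy()
converts the torch tensor b into a NumPy array
Note: the two share the same memory, so changing one also changes the other.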
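The memory sharing is easy to observe. In the sketch below (variable names a, b, and c are chosen for illustration), an in-place change to the tensor shows up in the original array, and a change to the array obtained from .numpy() shows up in the tensor.

```python
import numpy as np
import torch

a = np.ones((2, 3))          # NumPy array of ones
b = torch.from_numpy(a)      # tensor backed by the same memory as a

b.mul_(2)                    # in-place multiply on the tensor
print(a)                     # the array now holds 2.0 everywhere

c = b.numpy()                # NumPy view of the tensor's memory
c[0, 0] = -1.0               # write through the array
print(b[0, 0].item())        # the tensor sees the change
```

If an independent copy is needed, calling .clone() on the tensor or .copy() on the array before modifying breaks the link.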
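This series of notes is based on the Udacity course "Intro to Deep Learning with PyTorch".
The full set of notes is available on the WeChat official account 阿肉爱学习: open the menu, tap "利其器", and select "pytorch".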