C2_W1_Lab01_Neurons_and_Layers (Andrew Ng's course, with a PyTorch version)

This article explores how neurons and layers work through hands-on experiments in the Tensorflow/Keras framework. It compares a linear regression model (no activation) with a logistic regression model (sigmoid activation), shows how each can be implemented as a single-neuron layer, and demonstrates the models' predictions on example data.

Optional Lab - Neurons and Layers

In this lab we will explore the inner workings of neurons/units and layers. In particular, the lab will draw parallels to the models you have mastered in Course 1, the regression/linear model and the logistic model. The lab will introduce Tensorflow and demonstrate how these models are implemented in that framework.

Packages

Tensorflow and Keras
Tensorflow is a machine learning package developed by Google. In 2019, Google integrated Keras into Tensorflow and released Tensorflow 2.0. Keras is a framework developed independently by François Chollet that creates a simple, layer-centric interface to Tensorflow. This course will be using the Keras interface.

import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras import Sequential
from tensorflow.keras.losses import MeanSquaredError, BinaryCrossentropy
from tensorflow.keras.activations import sigmoid
from lab_utils_common import dlc
from lab_neurons_utils import plt_prob_1d, sigmoidnp, plt_linear, plt_logistic
plt.style.use('./deeplearning.mplstyle')
import logging
logging.getLogger("tensorflow").setLevel(logging.ERROR)
tf.autograph.set_verbosity(0)

Neuron without activation - Regression/Linear Model

DataSet

We’ll use an example from Course 1, linear regression on house prices.

X_train = np.array([[1.0], [2.0]], dtype=np.float32)           #(size in 1000 square feet)
Y_train = np.array([[300.0], [500.0]], dtype=np.float32)       #(price in 1000s of dollars)

fig, ax = plt.subplots(1,1)
ax.scatter(X_train, Y_train, marker='x', c='r', label="Data Points")
ax.legend( fontsize='xx-large')
ax.set_ylabel('Price (in 1000s of dollars)', fontsize='xx-large')
ax.set_xlabel('Size (1000 sqft)', fontsize='xx-large')
plt.show()

[Figure: scatter plot of the training data, price (1000s of dollars) vs. size (1000 sqft)]

Regression/Linear Model

The function implemented by a neuron with no activation is the same as in Course 1, linear regression:
$$f_{\mathbf{w},b}(x^{(i)}) = \mathbf{w}\cdot x^{(i)} + b \tag{1}$$
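For reference, equation (1) can be written as a one-line NumPy function (a minimal sketch; the lab computes the same expression inline with np.dot further below):

def f_wb(x, w, b):
    # linear model of equation (1); w and b are scalars or 1x1 arrays
    return np.dot(w, x) + b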

We can define a layer with one neuron or unit and compare it to the familiar linear regression function.

linear_layer = tf.keras.layers.Dense(units=1, activation='linear')

# Defining the same single-neuron layer in PyTorch (full PyTorch version below):
# linear_layer = nn.Sequential(
#     nn.Linear(1,1)
# )

Let’s examine the weights.

linear_layer.get_weights()

[]

There are no weights as the weights are not yet instantiated. Let’s try the model on one example in X_train. This will trigger the instantiation of the weights. Note, the input to the layer must be 2-D, so we’ll reshape it.

a1 = linear_layer(X_train[0].reshape(1,1))
print(a1)
tf.Tensor([[0.22]], shape=(1, 1), dtype=float32)
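If you prefer to instantiate the weights without pushing an example through the layer, Keras layers also expose a build() method that takes the input shape. A sketch (linear_layer2 is just an illustrative name):

linear_layer2 = tf.keras.layers.Dense(units=1, activation='linear')
linear_layer2.build(input_shape=(None, 1))   # creates w with shape (1,1) and b with shape (1,)
print(linear_layer2.get_weights())           # now returns the initialized arrays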

The result is a tensor (another name for an array) with a shape of (1,1), i.e. one entry.
Now let's look at the weights and bias. The weights are randomly initialized to small numbers, and the bias defaults to zero.

w, b= linear_layer.get_weights()
print(f"w = {w}, b={b}")
w = [[0.22]], b=[0.]
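The initialization itself can be controlled through standard Keras arguments if you want deterministic starting values, for example (a sketch):

layer0 = tf.keras.layers.Dense(units=1, activation='linear',
                               kernel_initializer='zeros',
                               bias_initializer='zeros')   # both w and b start at zero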

A linear regression model (1) with a single input feature has a single weight and bias, which matches the dimensions of our linear_layer above.
The weights are initialized to random values, so let's set them to some known values.

set_w = np.array([[200]])
set_b = np.array([100])

# set_weights takes a list of numpy arrays
linear_layer.set_weights([set_w, set_b])
print(linear_layer.get_weights())

[array([[200.]], dtype=float32), array([100.], dtype=float32)]

Let’s compare equation (1) to the layer output.

a1 = linear_layer(X_train[0].reshape(1,1))
print(a1)
alin = np.dot(set_w,X_train[0].reshape(1,1)) + set_b
print(alin)
tf.Tensor([[300.]], shape=(1, 1), dtype=float32)
[[300.]]

They produce the same values!
Now, we can use our linear layer to make predictions on our training data.

prediction_tf = linear_layer(X_train)
prediction_np = np.dot( X_train, set_w) + set_b
plt_linear(X_train, Y_train, prediction_tf, prediction_np)

[Figure: Tensorflow layer predictions and NumPy predictions overlap on the training data]
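A quick numerical confirmation that the two prediction paths agree (a sketch reusing prediction_tf and prediction_np from above):

print(np.allclose(prediction_tf.numpy(), prediction_np))   # True: the layer matches the NumPy model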

Neuron with Sigmoid activation

The function implemented by a neuron/unit with a sigmoid activation is the same as in Course 1, logistic regression:
$$f_{\mathbf{w},b}(x^{(i)}) = g(\mathbf{w} x^{(i)} + b) \tag{2}$$
where $g(x) = \mathrm{sigmoid}(x)$.
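For reference, the sigmoidnp helper imported at the top presumably computes the standard form of g; a minimal sketch of such a function:

def sigmoid_np(z):
    # g(z) = 1 / (1 + e^(-z))
    return 1.0 / (1.0 + np.exp(-z))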

Let’s set $w$ and $b$ to some known values and check the model.

DataSet

We’ll use an example from Course 1, logistic regression.

X_train = np.array([0., 1, 2, 3, 4, 5], dtype=np.float32).reshape(-1,1)  # 2-D Matrix
Y_train = np.array([0,  0, 0, 1, 1, 1], dtype=np.float32).reshape(-1,1)  # 2-D Matrix
pos = Y_train == 1
neg = Y_train == 0

X_train[pos]
array([3., 4., 5.], dtype=float32)

fig,ax = plt.subplots(1,1,figsize=(4,3))
ax.scatter(X_train[pos], Y_train[pos], marker='x', s=80, c = 'red', label="y=1")
ax.scatter(X_train[neg], Y_train[neg], marker='o', s=100, label="y=0", facecolors='none', 
              edgecolors=dlc["dlblue"],lw=3)

ax.set_ylim(-0.08,1.1)
ax.set_ylabel('y', fontsize=12)
ax.set_xlabel('x', fontsize=12)
ax.set_title('one variable plot')
ax.legend(fontsize=12)
plt.show()

[Figure: one-variable plot of the logistic dataset, y=1 points marked with x, y=0 points marked with o]

Logistic Neuron

We can implement a 'logistic neuron' by adding a sigmoid activation. The function of the neuron is then described by (2) above.
This section creates a Tensorflow Model containing our logistic layer to demonstrate an alternate method of building models. Tensorflow is most often used to create multi-layer models; the Sequential model is a convenient means of constructing them.

model = Sequential(
    [
        tf.keras.layers.Dense(1, input_dim=1,  activation = 'sigmoid', name='L1')
    ]
)
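Equivalently, layers can be appended one at a time with model.add, a common Keras pattern for building the same single-layer model (a sketch; model_alt is an illustrative name):

model_alt = Sequential()
model_alt.add(tf.keras.layers.Dense(1, input_dim=1, activation='sigmoid', name='L1'))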

model.summary() shows the layers and number of parameters in the model. There is only one layer in this model, and that layer has a single unit. The unit has two parameters, $w$ and $b$.

model.summary()
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
L1 (Dense)                   (None, 1)                 2         
=================================================================
Total params: 2
Trainable params: 2
Non-trainable params: 0
_________________________________________________________________

logistic_layer = model.get_layer('L1')
w,b = logistic_layer.get_weights()
print(w,b)
print(w.shape,b.shape)
[[-0.93]] [0.]
(1, 1) (1,)

Let’s set the weight and bias to some known values.

set_w = np.array([[2]])
set_b = np.array([-4.5])
# set_weights takes a list of numpy arrays
logistic_layer.set_weights([set_w, set_b])
print(logistic_layer.get_weights())
[array([[2.]], dtype=float32), array([-4.5], dtype=float32)]

Let’s compare equation (2) to the layer output.

a1 = model.predict(X_train[0].reshape(1,1))
print(a1)
alog = sigmoidnp(np.dot(set_w,X_train[0].reshape(1,1)) + set_b)
print(alog)
[[0.01]]
[[0.01]]

They produce the same values!
Now, we can use our logistic layer and NumPy model to make predictions on our training data.

plt_logistic(X_train, Y_train, model, set_w, set_b, pos, neg)

[Figure: logistic predictions from the Tensorflow model and the NumPy model, with the sigmoid output shown as shading]

The shading above reflects the output of the sigmoid, which varies from 0 to 1.
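To confirm the agreement holds across the whole dataset and not just the first example, a sketch reusing model, set_w, set_b, and sigmoidnp from above:

p_tf = model.predict(X_train)                      # Keras predictions, shape (6,1)
p_np = sigmoidnp(np.dot(X_train, set_w) + set_b)   # equation (2) in NumPy
print(np.allclose(p_tf, p_np, atol=1e-6))          # True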

Congratulations!

You built a very simple neural network and explored the similarities of a neuron to the linear and logistic regression models from Course 1.

PyTorch Implementation

First, create a network with a single linear neuron (mirroring the Tensorflow implementation above).

import torch
from torch import nn
import numpy as np
from torchsummary import summary
from torch.nn.parameter import Parameter

# A network model consisting of a single linear neuron
class One_Neuron(nn.Module):
    def __init__(self):
        super(One_Neuron, self).__init__()
        self.model = nn.Sequential(
            nn.Linear(1, 1)
        )

    def forward(self, x):
        x = self.model(x)
        return x

We will use the same dataset as above. Note that PyTorch expects the data as torch.Tensor.


X_train = np.array([[1.0],[2.0]], dtype=np.float32)
Y_train = np.array([[300.0],[500.0]], dtype=np.float32)

# Convert the numpy arrays to torch tensors
X_train = torch.tensor(X_train)
Y_train = torch.tensor(Y_train)

# Use the GPU if available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
X_train = X_train.to(device)
Y_train = Y_train.to(device)
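As an aside, torch.from_numpy is a common alternative to torch.tensor for this conversion; it shares memory with the source array instead of copying (a minimal sketch):

X_shared = torch.from_numpy(np.array([[1.0], [2.0]], dtype=np.float32))  # zero-copy view of the numpy buffer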

Using a summary function in PyTorch requires the torchsummary library; its summary() function displays the model structure.

# Instantiate the model and move it to the GPU
model = One_Neuron().to(device)
# Inspect the model structure
summary(model, (1, 1))

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Linear-1                 [-1, 1, 1]               2
================================================================
Total params: 2
Trainable params: 2
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.00
Forward/backward pass size (MB): 0.00
Params size (MB): 0.00
Estimated Total Size (MB): 0.00
----------------------------------------------------------------

As noted earlier, the weights of a network are randomly initialized to small numbers and the bias defaults to zero. Next, we set the weights and bias to known values to compare against the earlier experiment.


w = torch.tensor([200], dtype=torch.float32)
b = torch.tensor([100], dtype=torch.float32)
# Run on the GPU
w = w.to(device)
b = b.to(device)
# Setting weights in PyTorch:
# Option 1: assign directly to the .data of weight and bias
# model.model[0].weight.data = w.reshape(-1,1)
# model.model[0].bias.data = b
# Option 2: wrap the tensors in Parameter
model.model[0].weight = Parameter(w.reshape(-1,1))  # weight shape: (out_features, in_features)
model.model[0].bias = Parameter(b)                  # bias shape: (out_features,)

# Inspect the weight
print(model.model[0].weight)

a = model(X_train)
print(a)
Parameter containing:
tensor([[200.]], device='cuda:0', requires_grad=True)
tensor([[300.],
        [500.]], device='cuda:0', grad_fn=<AddmmBackward0>)

The results show that setting the weights and bias in PyTorch reproduces the earlier Tensorflow results: the linear neuron computes exactly
$$f_{\mathbf{w},b}(x^{(i)}) = \mathbf{w}\cdot x^{(i)} + b \tag{1}$$
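As a quick sanity check (a sketch reusing the tensors above), the same values follow from equation (1) in plain NumPy:

a_np = 200.0 * X_train.cpu().numpy() + 100.0
print(a_np)   # [[300.] [500.]], matching the torch output above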

Logistic Neuron

We will use the same dataset as in the first experiment. The difference between the linear neuron and the logistic neuron is the sigmoid activation:
$$f_{\mathbf{w},b}(x^{(i)}) = g(\mathbf{w} x^{(i)} + b) \tag{2}$$
where $g(x) = \mathrm{sigmoid}(x)$.
Next, let's implement the logistic neuron in PyTorch.

class One_Neuron(nn.Module):
    def __init__(self):
        super(One_Neuron,self).__init__()
        self.model = nn.Sequential(
            nn.Linear(1,1),
            nn.Sigmoid()
        )

    def forward(self,x):
        x = self.model(x)
        return x

Now let's verify the logistic neuron model with the dataset used earlier.

# Create the dataset
X_train_1 = torch.tensor([0, 1, 2, 3, 4, 5], dtype=torch.float32).reshape(-1, 1)
Y_train_1 = torch.tensor([0, 0, 0, 1, 1, 1], dtype=torch.float32).reshape(-1, 1)
# Set the weight and bias to known values
w_1 = torch.tensor([2.]).reshape(-1, 1)
b_1 = torch.tensor([-4.5])
# Instantiate the model
model = One_Neuron()
# Set the weight and bias inside the network
model.model[0].weight = Parameter(w_1)
model.model[0].bias = Parameter(b_1)

output = model(X_train_1)
print(output)
tensor([[0.0110],
        [0.0759],
        [0.3775],
        [0.8176],
        [0.9707],
        [0.9959]], grad_fn=<SigmoidBackward0>)

The outputs match the results from the earlier Tensorflow experiment.
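As a final check (a sketch using X_train_1, w_1, b_1, and output from above), applying equation (2) by hand with torch.sigmoid reproduces the same values:

manual = torch.sigmoid(X_train_1 @ w_1.T + b_1)   # equation (2) applied directly
print(torch.allclose(manual, output))             # True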

Congratulations!
