Neural Network
1、Neural Network Intuition
1.1、Neurons and brain
1.2、Demand prediction
1.3、Face recognition
The input $x$ is a 1000 pixel $\times$ 1000 pixel image, i.e. 1 million pixel values, represented as a column vector with one million rows.
Each layer detects different features; the deeper the layer, the larger its effective field of view.
2、Neural Network Model
Every layer takes in a vector of numbers, applies a set of logistic regression units to it, and computes another vector of numbers. That vector is passed from layer to layer until the final output layer's computation produces the prediction.
$a^{[2]}$ is associated with the second layer.
The number of neurons in a layer equals the number of outputs of that layer. For example, layer 3 has three neurons, so its output $\vec{a}^{[3]}$ is a vector containing three values.
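In this notation, the activation of unit $j$ in layer $l$ is computed as (the standard per-unit formula these notes build on, with $g$ the activation function):

$$a_j^{[l]} = g\left(\vec{w}_j^{[l]} \cdot \vec{a}^{[l-1]} + b_j^{[l]}\right)$$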
2.1、The forward propagation algorithm
Forward propagation computes the network's outputs from its inputs, propagating the activations layer by layer.
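For example, in a two-layer network a single forward pass chains this computation (a sketch using the convention from section 4.1 that each unit's weights occupy a column of $W$):

$$\vec{a}^{[1]} = g\left(W^{[1]\,T}\vec{x} + \vec{b}^{[1]}\right), \qquad \vec{a}^{[2]} = g\left(W^{[2]\,T}\vec{a}^{[1]} + \vec{b}^{[2]}\right)$$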
2.2、Lab
Introduces TensorFlow and demonstrates how these models are implemented in that framework.
2.3、Quiz
3、TensorFlow implementation
3.1、Reasoning in code
Sigmoid was used as the activation function.
The role of an activation function: it introduces non-linearity, overcoming the limited expressive power of a purely linear model.
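For reference, a minimal NumPy sketch of sigmoid (the implementation in section 4.1 calls sigmoid without defining it, so a definition like this is assumed there):

import numpy as np

def sigmoid(z):
    # squashes any real input into (0, 1) -- the source of the non-linearity
    return 1 / (1 + np.exp(-z))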
3.2、The data in TensorFlow
- The difference between a TensorFlow tensor and a NumPy array (the input and layer definition here are illustrative, added to make the snippet runnable):
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Dense

x = np.array([[200.0, 17.0]])               # one example
layer_1 = Dense(units=3, activation='sigmoid')
a1 = layer_1(x)   # a tf.Tensor, e.g. tf.Tensor([[0.2 0.7 0.3]])
# Consider tensors as TensorFlow's way of representing matrices
a1.numpy()        # converts back to a NumPy array([[...]])
3.3、Building a neural network
Each call to the Dense function defines one layer of the neural network; with three Dense calls the network has 3 layers.
We can simplify the code to the form below.
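A sketch of the simplified Sequential form (the 25/15/1 layer sizes are illustrative, not taken from the code above):

# assumes: from tensorflow.keras import Sequential
#          from tensorflow.keras.layers import Dense
model = Sequential([
    Dense(units=25, activation='sigmoid'),
    Dense(units=15, activation='sigmoid'),
    Dense(units=1, activation='sigmoid')])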
3.4、The lab
- Normalize Data
norm_l = tf.keras.layers.Normalization(axis=-1)
norm_l.adapt(X) # learns mean, variance
Xn = norm_l(X)
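To verify the effect, a quick check of the feature ranges before and after normalization (assuming X holds the raw training features):

print(f"pre-norm:  max {np.max(X, axis=0)}, min {np.min(X, axis=0)}")
print(f"post-norm: max {np.max(Xn, axis=0)}, min {np.min(Xn, axis=0)}")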
- Create the model
tf.random.set_seed(1234) # applied to achieve consistent results
model = Sequential(
    [
        tf.keras.Input(shape=(2,)),
        Dense(3, activation='sigmoid', name='layer1'),
        Dense(1, activation='sigmoid', name='layer2')
    ]
)
model.summary() # Provides a description of the network
Get the parameters
W1, b1 = model.get_layer("layer1").get_weights()
W2, b2 = model.get_layer("layer2").get_weights()
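The shapes follow from the architecture: 2 input features feeding 3 units makes W1 a (2, 3) matrix with bias b1 of shape (3,); layer2 has 1 unit, so W2 is (3, 1) and b2 is (1,). A quick check:

print(f"W1{W1.shape}, b1{b1.shape}")  # W1(2, 3), b1(3,)
print(f"W2{W2.shape}, b2{b2.shape}")  # W2(3, 1), b2(1,)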
Compile and fit
model.compile(
    loss=tf.keras.losses.BinaryCrossentropy(),
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.01),
)
model.fit(
    Xt, Yt,
    epochs=10,
)
Change the parameters
model.get_layer("layer1").set_weights([W1,b1])
model.get_layer("layer2").set_weights([W2,b2])
Prediction
X_test = np.array([
    [200, 13.9],  # positive example
    [200, 17]])   # negative example
X_testn = norm_l(X_test)
predictions = model.predict(X_testn)
4、Implementation
4.1、Forward prop
Notice: Place the w parameters for each neuron in the columns of W.
def my_dense(a_in, W, b, g):
    """
    Computes dense layer
    Args:
      a_in (ndarray (n, )) : Data, 1 example
      W    (ndarray (n,j)) : Weight matrix, n features per unit, j units
      b    (ndarray (j, )) : bias vector, j units
      g                    : activation function (e.g. sigmoid, relu..)
    Returns
      a_out (ndarray (j,)) : j units
    """
    units = W.shape[1]
    a_out = np.zeros(units)
    for j in range(units):
        w = W[:, j]                    # all elements of column j (unit j's weights)
        z = np.dot(w, a_in) + b[j]
        a_out[j] = g(z)
    return a_out
def my_sequential(x, W1, b1, W2, b2):
    a1 = my_dense(x, W1, b1, sigmoid)
    a2 = my_dense(a1, W2, b2, sigmoid)
    return a2
def my_predict(X, W1, b1, W2, b2):
    m = X.shape[0]
    p = np.zeros((m, 1))
    for i in range(m):
        p[i, 0] = my_sequential(X[i], W1, b1, W2, b2)
    return p
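The predictions thresholded below can be produced with the weights extracted in section 3.4 (a sketch; the test examples and normalization mirror the earlier Keras code):

X_tst = np.array([
    [200, 13.9],  # positive example
    [200, 17]])   # negative example
X_tstn = norm_l(X_tst)  # apply the same normalization as in training
predictions = my_predict(X_tstn, W1, b1, W2, b2)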
yhat = (predictions >= 0.5).astype(int)
print(f"decisions = \n{yhat}")
5、Vectorization
Given matrices $A$ and $W$, the product used here is $A^T W$, computed in NumPy as:

AT = A.T
Z = np.matmul(AT, W)
# or equivalently: Z = AT @ W
$XW$ is a matrix-matrix operation with dimensions $(m, j_1)(j_1, j_2)$, which results in a matrix with dimension $(m, j_2)$. To that, we add a vector $b$ with dimension $(1, j_2)$. $b$ must be expanded to an $(m, j_2)$ matrix for this element-wise operation to make sense. This expansion is accomplished for you by NumPy broadcasting.
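A minimal sketch illustrating these shapes and the broadcast of b (the concrete dimensions are illustrative):

import numpy as np

m, j1, j2 = 4, 3, 2
X = np.random.rand(m, j1)   # (m, j1)
W = np.random.rand(j1, j2)  # (j1, j2)
b = np.random.rand(1, j2)   # (1, j2)

Z = np.matmul(X, W) + b     # b broadcast from (1, j2) to (m, j2)
print(Z.shape)              # (4, 2)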