Neural Network
1、Neural Network Intuition
1.1、Neurons and brain
1.2、Demand prediction
1.3、Face recognition
The input $x$ is a 1000 pixel $\times$ 1000 pixel image, i.e. 1 million pixel values, represented as a column vector with one million rows.
Each layer detects different features; the deeper the layer, the larger its effective field of view.
2、Neural Network Model
Every layer takes in a vector of numbers, applies a set of logistic regression units to it, and computes another vector of numbers. That vector is passed from layer to layer until the final output layer's computation produces the prediction.
$a^{[2]}$ is associated with the second layer.
The number of neurons in a layer equals the number of outputs of that layer. For example, layer 3 has three neurons, so its output $\vec{a}^{[3]}$ is a vector containing three values.
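In this notation, the activation of unit $j$ in layer $l$ is computed as (the standard per-unit formula these notes build on, with $g$ the activation function):

$$a_j^{[l]} = g\left(\vec{w}_j^{[l]} \cdot \vec{a}^{[l-1]} + b_j^{[l]}\right)$$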
2.1、The forward propagation algorithm
Forward propagation computes the network's outputs from its inputs, propagating the activations layer by layer.
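For example, in a two-layer network a single forward pass chains this computation (a sketch using the convention from section 4.1 that each unit's weights occupy a column of $W$):

$$\vec{a}^{[1]} = g\left(W^{[1]\,T}\vec{x} + \vec{b}^{[1]}\right), \qquad \vec{a}^{[2]} = g\left(W^{[2]\,T}\vec{a}^{[1]} + \vec{b}^{[2]}\right)$$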
2.2、Lab
Introduces TensorFlow and demonstrates how these models are implemented in that framework.
2.3、Quiz
3、TensorFlow implementation
3.1、Reasoning in code
Sigmoid was used as the activation function.
The role of an activation function: it introduces non-linearity, overcoming the limited expressive power of a purely linear model.
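For reference, a minimal NumPy sketch of sigmoid (the implementation in section 4.1 calls sigmoid without defining it, so a definition like this is assumed there):

import numpy as np

def sigmoid(z):
    # squashes any real input into (0, 1) -- the source of the non-linearity
    return 1 / (1 + np.exp(-z))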
3.2、The data in TensorFlow
- The difference between a TensorFlow tensor and a NumPy array (the input and layer definition here are illustrative, added to make the snippet runnable):
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Dense

x = np.array([[200.0, 17.0]])               # one example
layer_1 = Dense(units=3, activation='sigmoid')
a1 = layer_1(x)   # a tf.Tensor, e.g. tf.Tensor([[0.2 0.7 0.3]])
# Consider tensors as TensorFlow's way of representing matrices
a1.numpy()        # converts back to a NumPy array([[...]])
3.3、Building a neural network
Each call to the Dense function defines one layer of the neural network; with three Dense calls the network has 3 layers.
We can simplify the code to the form below.
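A sketch of the simplified Sequential form (the 25/15/1 layer sizes are illustrative, not taken from the code above):

# assumes: from tensorflow.keras import Sequential
#          from tensorflow.keras.layers import Dense
model = Sequential([
    Dense(units=25, activation='sigmoid'),
    Dense(units=15, activation='sigmoid'),
    Dense(units=1, activation='sigmoid')])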
3.4、The lab
- Normalize Data
norm_l = tf.keras.layers.Normalization(axis=-1)
norm_l.adapt(X) # learns mean, variance
Xn = norm_l(X)
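To verify the effect, a quick check of the feature ranges before and after normalization (assuming X holds the raw training features):

print(f"pre-norm:  max {np.max(X, axis=0)}, min {np.min(X, axis=0)}")
print(f"post-norm: max {np.max(Xn, axis=0)}, min {np.min(Xn, axis=0)}")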
- Create the model
tf.random.set_seed(1234) # applied to achieve consistent results
model = Sequential(
    [
        tf.keras.Input(shape=(2,)),
        Dense(3, activation='sigmoid', name='layer1'),
        Dense(1, activation='sigmoid', name='layer2')
    ]
)
model.summary() # Provides a description of the network
Get the parameters
W1, b1 = model.get_layer("layer1").get_weights()
W2, b2 = model.get_layer("layer2").get_weights()
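The shapes follow from the architecture: 2 input features feeding 3 units makes W1 a (2, 3) matrix with bias b1 of shape (3,); layer2 has 1 unit, so W2 is (3, 1) and b2 is (1,). A quick check:

print(f"W1{W1.shape}, b1{b1.shape}")  # W1(2, 3), b1(3,)
print(f"W2{W2.shape}, b2{b2.shape}")  # W2(3, 1), b2(1,)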
Compile and fit
model.compile(
    loss=tf.keras.losses.BinaryCrossentropy(),
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.01),
)
model.fit(
    Xt, Yt,
    epochs=10,
)
Change the parameters
model.get_layer("layer1").set_weights([W1,b1])
model.get_layer("layer2").set_weights([W2,b2])
Prediction
X_test = np.array([
    [200, 13.9],  # positive example
    [200, 17]])   # negative example
X_testn = norm_l(X_test)
predictions = model.predict(X_testn)
4、Implementation
4.1、Forward prop
Notice: Place the w parameters for each neuron in the columns of W.
def my_dense(a_in, W, b, g):
    """
    Computes dense layer
    Args:
      a_in (ndarray (n, )) : Data, 1 example
      W    (ndarray (n,j)) : Weight matrix, n features per unit, j units
      b    (ndarray (j, )) : bias vector, j units
      g                    : activation function (e.g. sigmoid, relu..)
    Returns
      a_out (ndarray (j,)) : j units
    """
    units = W.shape[1]
    a_out = np.zeros(units)
    for j in range(units):
        w = W[:, j]                    # all elements of column j (unit j's weights)
        z = np.dot(w, a_in) + b[j]
        a_out[j] = g(z)
    return a_out
def my_sequential(x, W1, b1, W2, b2):
    a1 = my_dense(x, W1, b1, sigmoid)
    a2 = my_dense(a1, W2, b2, sigmoid)
    return a2
def my_predict(X, W1, b1, W2, b2):
    m = X.shape[0]
    p = np.zeros((m, 1))
    for i in range(m):
        p[i, 0] = my_sequential(X[i], W1, b1, W2, b2)
    return p
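The predictions thresholded below can be produced with the weights extracted in section 3.4 (a sketch; the test examples and normalization mirror the earlier Keras code):

X_tst = np.array([
    [200, 13.9],  # positive example
    [200, 17]])   # negative example
X_tstn = norm_l(X_tst)  # apply the same normalization as in training
predictions = my_predict(X_tstn, W1, b1, W2, b2)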
yhat = (predictions >= 0.5).astype(int)
print(f"decisions = \n{yhat}")
5、Vectorization
Given matrices $A$ and $W$, the product used here is $A^T W$, computed in NumPy as:

AT = A.T
Z = np.matmul(AT, W)
# or equivalently: Z = AT @ W
$XW$ is a matrix-matrix operation with dimensions $(m, j_1)(j_1, j_2)$, which results in a matrix with dimension $(m, j_2)$. To that, we add a vector $b$ with dimension $(1, j_2)$. $b$ must be expanded to an $(m, j_2)$ matrix for this element-wise operation to make sense. This expansion is accomplished for you by NumPy broadcasting.
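A minimal sketch illustrating these shapes and the broadcast of b (the concrete dimensions are illustrative):

import numpy as np

m, j1, j2 = 4, 3, 2
X = np.random.rand(m, j1)   # (m, j1)
W = np.random.rand(j1, j2)  # (j1, j2)
b = np.random.rand(1, j2)   # (1, j2)

Z = np.matmul(X, W) + b     # b broadcast from (1, j2) to (m, j2)
print(Z.shape)              # (4, 2)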