Why do you need Non-Linear Activation Functions?
Why does a neural network need a non-linear activation function? It turns out that for your neural network to compute interesting functions, you do need to pick a non-linear activation function; let's see why. So, here are the forward prop equations for the neural network.
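For reference, the forward prop equations being discussed here are presumably the standard two-layer ones in the course's bracket notation (the slide itself is not in the transcript, so this is a reconstruction):

```latex
\begin{aligned}
z^{[1]} &= W^{[1]} x + b^{[1]}, & a^{[1]} &= g^{[1]}\!\left(z^{[1]}\right),\\
z^{[2]} &= W^{[2]} a^{[1]} + b^{[2]}, & a^{[2]} &= g^{[2]}\!\left(z^{[2]}\right).
\end{aligned}
```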
Why don't we just get rid of this? Get rid of the function g, and set a^[1] equal to z^[1]? Or alternatively, you could say that g(z) is equal to z. Sometimes this is called the linear activation function. Maybe a better name for it would be the identity activation function, because it just outputs whatever was input. And for the second layer, what if a^[2] were just equal to z^[2]? It turns out that if you do this, then this model is just computing y, or y-hat, as a linear function of your input features x. To see this, take the first two equations: if a^[1] = z^[1] = W^[1]x + b^[1], and then a^[2] = z^[2] = W^[2]a^[1] + b^[2], then substituting the first into the second gives a^[2] = W^[2](W^[1]x + b^[1]) + b^[2] = (W^[2]W^[1])x + (W^[2]b^[1] + b^[2]) = W'x + b'. So with linear activations, the whole network is just computing a linear function of the input.
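Here's a quick numerical check of that collapse, as a minimal NumPy sketch (the layer sizes and the random seed are arbitrary choices for illustration, not from the lecture):

```python
import numpy as np

# Two layers with identity (linear) activations: a[l] = z[l] = W[l] a[l-1] + b[l].
# Layer sizes 3 -> 4 -> 2 are arbitrary illustrative choices.
rng = np.random.default_rng(0)
W1, b1 = rng.standard_normal((4, 3)), rng.standard_normal((4, 1))
W2, b2 = rng.standard_normal((2, 4)), rng.standard_normal((2, 1))

x = rng.standard_normal((3, 1))  # one input example

# Forward pass with g(z) = z at every layer
a1 = W1 @ x + b1
a2 = W2 @ a1 + b2

# The equivalent single linear layer: W' = W2 W1, b' = W2 b1 + b2
W_prime = W2 @ W1
b_prime = W2 @ b1 + b2

# Both computations produce the same output, so the two-layer
# "network" is no more expressive than one linear layer.
assert np.allclose(a2, W_prime @ x + b_prime)
print("Two linear layers collapse into a single linear function W'x + b'.")
```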