Udacity: A Brief Look at Neural Networks

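This article gives an accessible introduction to the perceptron, the foundation of neural networks. It covers the role of weights (how they determine the importance of each input), the function of the activation function (how it determines a node's output), and how the bias shifts the decision boundary. By learning and adjusting its weights, a neural network can teach itself to classify data better.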


Neural network inference

Perceptron

Data, like test scores and grades, are fed into a network of interconnected nodes. These individual nodes are called perceptrons, or artificial neurons, and they are the basic unit of a neural network. Each one looks at input data and decides how to categorize that data. In the example above, the input either passes a threshold for grades and test scores or doesn’t, and so the two categories are: yes (passed the threshold) and no (didn’t pass the threshold). These categories then combine to form a decision – for example, if both nodes produce a “yes” output, then this student gains admission into the university.
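The yes/no decisions described above can be sketched in a few lines of Python. This is only an illustration: the cutoff values `GRADE_THRESHOLD` and `TEST_THRESHOLD` and the function names are hypothetical, since the text does not state the actual thresholds.

```python
# Hypothetical cutoffs, chosen only for illustration; the text gives no values.
GRADE_THRESHOLD = 3.0
TEST_THRESHOLD = 70.0


def passes(value, threshold):
    """One node's yes/no decision: does the input reach the threshold?"""
    return value >= threshold


def admit(grades, test_score):
    """Combine the two yes/no outputs: admit only if both nodes say yes."""
    return passes(grades, GRADE_THRESHOLD) and passes(test_score, TEST_THRESHOLD)


print(admit(3.5, 82.0))  # True: both nodes say yes
print(admit(3.5, 60.0))  # False: the test-score node says no
```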

The perceptron above is one of the two perceptrons from the video that help determine whether or not a student is accepted to a university. It decides whether a student’s grades are high enough to be accepted to the university. You might be wondering: “How does it know whether grades or test scores are more important in making this acceptance decision?” Well, when we initialize a neural network, we don’t know what information will be most important in making a decision. It’s up to the neural network to learn for itself which data is most important and adjust how it considers that data.

Weights

When input comes into a perceptron, it gets multiplied by a weight value that is assigned to this particular input. For example, the perceptron above has two inputs, test scores and grades, so it has two associated weights that can be adjusted individually. These weights start out as random values, and as the neural network learns more about what kind of input data leads to a student being accepted into a university, the network adjusts the weights based on any errors in categorization that result from the previous weights. This is called training the neural network.

A higher weight means the neural network considers that input more important than other inputs, and a lower weight means that the data is considered less important. An extreme example would be if test scores had no effect at all on university acceptance; then the weight of the test score input would be zero and it would have no effect on the output of the perceptron.
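A minimal sketch of how weights scale each input, assuming the weighted inputs are summed into a single linear combination. The input values and weights below are made up for illustration, and `linear_combination` is a hypothetical helper name.

```python
def linear_combination(inputs, weights):
    """Multiply each input by its weight and add up the results."""
    return sum(w * x for w, x in zip(weights, inputs))


# Inputs are [test_score, grades]; all numbers here are illustrative only.
print(linear_combination([80.0, 3.5], [0.5, 2.0]))  # 47.0

# With a zero weight on the test score, changing the score changes nothing.
print(linear_combination([80.0, 3.5], [0.0, 2.0]))  # 7.0
print(linear_combination([20.0, 3.5], [0.0, 2.0]))  # 7.0
```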

When writing equations related to neural networks, the weights will always be represented by some form of the letter w. It will usually appear as W when it represents a matrix of weights or as w when it represents an individual weight, and it may carry a subscript to specify which weight is meant (you'll see more on that next). But remember, when you see the letter w, think weights.

Activation function

Finally, the result of the perceptron’s summation is turned into an output signal! This is done by feeding the linear combination into an activation function.

Activation functions decide, given the inputs into the node, what the node's output should be. Because it's the activation function that decides the actual output, we often refer to the outputs of a layer as its "activations".

One of the simplest activation functions is the Heaviside step function. This function returns 0 if the linear combination is less than 0 and returns 1 if the linear combination is greater than or equal to zero. The Heaviside step function is shown below, where h is the calculated linear combination:

$$f(h) = \begin{cases} 0 & \text{if } h < 0, \\ 1 & \text{if } h \geq 0. \end{cases}$$
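In code, the step function is a one-line comparison. The name `heaviside_step` is just a label chosen for this sketch.

```python
def heaviside_step(h):
    """Return 1 when the linear combination h is zero or positive, else 0."""
    return 1 if h >= 0 else 0


print(heaviside_step(-0.1))  # 0
print(heaviside_step(0))     # 1
print(heaviside_step(2.5))   # 1
```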

In the university acceptance example above, we used the weights $w_1 = -1$, $w_2 = -0.2$. Since $w_1$ and $w_2$ are negative, the activation function will only return 1 if grades and test scores are both 0! This is because, for non-negative inputs, the range of values from the linear combination using these weights is $(-\infty, 0]$ (i.e. negative infinity to 0, including 0 itself).
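A quick numerical check of this claim, as a sketch. It assumes the inputs are non-negative and scans a small integer grid; the grid size and the names `x1`, `x2` and `node_output` are arbitrary choices made for illustration.

```python
def heaviside_step(h):
    """Return 1 when h is zero or positive, else 0."""
    return 1 if h >= 0 else 0


W1, W2 = -1.0, -0.2  # the weights quoted in the text


def node_output(x1, x2):
    """Step function applied to the linear combination, with no bias."""
    return heaviside_step(W1 * x1 + W2 * x2)


# Scan a small grid of non-negative inputs and keep those that activate.
activating = [(x1, x2) for x1 in range(4) for x2 in range(4)
              if node_output(x1, x2) == 1]
print(activating)  # [(0, 0)]
```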

It’s easiest to see this with an example in two dimensions. In the following graph, imagine any points along the line or in the shaded area represent all the possible inputs to our node. Also imagine that the value along the y-axis is the result of performing the linear combination on these inputs and the appropriate weights. It’s this result that gets passed to the activation function.

Now remember that the step activation function returns 1 for any inputs greater than or equal to zero. As you can see in the image, only one point has a y-value greater than or equal to zero – the point right at the origin, (0,0):

[Figure: the linear combination plotted over the possible inputs; only the origin (0,0) reaches zero.]
Now, we certainly want more than one possible grade/test combination to result in acceptance, so we need to adjust the results passed to our activation function so it activates – that is, returns 1 – for more inputs. Specifically, we need to find a way so all the scores we’d like to consider acceptable for admissions produce values greater than or equal to zero when linearly combined with the weights into our node.

One way to get our function to return 1 for more inputs is to add a value to the results of our linear combination, called a bias.

Bias

A bias, represented in equations as b, lets us move values in one direction or another.

For example, the following diagram shows the previous hypothetical function with an added bias of +3. The blue shaded area shows all the values that now activate the function. But notice that these are produced with the same inputs as the values shown shaded in grey – just adjusted higher by adding the bias term:
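[Figure: the same values shifted up by a bias of +3; the blue shaded area marks the values that now activate the function.]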
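The effect of the bias can also be checked numerically. The sketch below assumes the weights from the activation-function discussion (-1 and -0.2), a bias of +3, and a small grid of non-negative integer inputs; the grid and the helper name `activating_inputs` are illustrative choices, not part of the original example.

```python
def step(h):
    """Heaviside step: 1 when h is zero or positive, else 0."""
    return 1 if h >= 0 else 0


W1, W2 = -1.0, -0.2  # weights quoted earlier in the text


def activating_inputs(bias):
    """List the grid points whose biased linear combination activates the node."""
    grid = [(x1, x2) for x1 in range(4) for x2 in range(4)]
    return [(x1, x2) for x1, x2 in grid if step(W1 * x1 + W2 * x2 + bias) == 1]


print(len(activating_inputs(0.0)))  # 1: only the origin activates
print(len(activating_inputs(3.0)))  # 13: the bias lets many more inputs activate
```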
Of course, with neural networks we won't know in advance what values to pick for biases. That's OK, because just like the weights, the bias can also be updated and changed by the neural network during training. So after adding a bias, we now have a complete perceptron formula:

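$$f(x_1, x_2, \ldots, x_m) = \begin{cases} 0 & \text{if } b + \sum_{i=1}^{m} w_i x_i < 0, \\ 1 & \text{if } b + \sum_{i=1}^{m} w_i x_i \geq 0. \end{cases}$$

This formula returns 1 if the input belongs to the accepted-to-university category and 0 if it doesn't. The input is made up of one or more real numbers, each represented by an $x_i$, where $m$ is the number of inputs.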
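Putting weights, bias and step function together, a perceptron can be sketched as a single function. The weights reuse the values quoted earlier and the bias of 3 comes from the bias example; the input values are made up for illustration.

```python
def perceptron(inputs, weights, bias):
    """Return 1 if bias + sum(w_i * x_i) is zero or positive, else 0."""
    h = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if h >= 0 else 0


weights = [-1.0, -0.2]
bias = 3.0

print(perceptron([1.0, 5.0], weights, bias))  # 1, since 3 - 1 - 1 = 1
print(perceptron([3.0, 5.0], weights, bias))  # 0, since 3 - 3 - 1 = -1
```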

Then the neural network starts to learn! Initially, the weights ($w_i$) and bias ($b$) are assigned random values, and then they are updated using a learning algorithm like gradient descent. The weights and biases change so that the next training example is more accurately categorized, and patterns in data are "learned" by the neural network.
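To make the training loop concrete, here is a minimal sketch. Note the substitution: because the step function has no useful gradient, it uses the classic perceptron learning rule rather than gradient descent. The toy dataset (a logical AND of two binary inputs), the learning rate, the seed and all function names are assumptions made for illustration.

```python
import random


def predict(inputs, weights, bias):
    """Perceptron output: step function applied to bias + weighted sum."""
    h = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if h >= 0 else 0


def train(examples, learning_rate=0.1, max_epochs=1000, seed=0):
    """Perceptron learning rule: nudge weights and bias after each mistake."""
    rng = random.Random(seed)
    n_inputs = len(examples[0][0])
    # Start from small random values, as the text describes.
    weights = [rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
    bias = rng.uniform(-0.5, 0.5)
    for _ in range(max_epochs):
        mistakes = 0
        for inputs, target in examples:
            error = target - predict(inputs, weights, bias)  # -1, 0 or +1
            if error != 0:
                mistakes += 1
                weights = [w + learning_rate * error * x
                           for w, x in zip(weights, inputs)]
                bias += learning_rate * error
        if mistakes == 0:  # every example is classified correctly
            break
    return weights, bias


# Toy, linearly separable data: output 1 only when both inputs are 1.
examples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train(examples)
print([predict(x, weights, bias) for x, _ in examples])
```

The loop stops as soon as a full pass over the data produces no mistakes, which the perceptron rule is guaranteed to reach eventually when the data is linearly separable.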
