Table of Contents
- 1b has an issue
- Differences and concepts of Supervised, Reinforcement, and Unsupervised Learning
- What is meant by Overfitting in neural networks
- How Dropout is used for neural networks (training & testing)
- Squared Error, Cross Entropy, Softmax, Weight Decay
- Difference between Maximum Likelihood estimation and Bayesian Inference (supervised)
- Concept of Momentum (as an enhancement for Gradient Descent)
1b has an issue
Differences and concepts of Supervised, Reinforcement, and Unsupervised Learning
Supervised Learning: The system is presented with training items consisting of an input and a target output. The aim is to predict the output, given the input (for the training set as well as an unseen test set).
Reinforcement Learning: The system chooses actions in a simulated environment, observing its state and receiving rewards along the way. The aim is to maximize the cumulative reward.
Unsupervised Learning: The system is presented with training items consisting of only an input (no target value). The aim is to extract hidden features or other structure from these data.
What is meant by Overfitting in neural networks
Overfitting occurs when the training set error continues to decrease, but the test set error stalls or increases.
How to avoid Overfitting
- limiting the number of neurons or connections in the network
- early stopping, with a validation set (see the sketch after this list)
- dropout
- weight decay (this can avoid overfitting by limiting the size of the weights)
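A minimal sketch of early stopping with a validation set, on a synthetic regression task; the data, the degree-9 polynomial model, the learning rate and the patience rule are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic regression task (toy data): degree-9 polynomial fit to noisy points.
x = rng.uniform(-1, 1, size=30)
y = np.sin(3 * x) + 0.1 * rng.normal(size=30)
X = np.vander(x, 10)                                 # degree-9 polynomial features
X_tr, y_tr, X_va, y_va = X[:20], y[:20], X[20:], y[20:]

w = np.zeros(10)
best_w, best_val, bad, patience = w.copy(), np.inf, 0, 20

for epoch in range(5000):
    grad = X_tr.T @ (X_tr @ w - y_tr) / len(y_tr)    # gradient of squared error
    w -= 0.1 * grad                                  # plain gradient descent step
    val = np.mean((X_va @ w - y_va) ** 2)            # validation set error
    if val < best_val:
        best_val, best_w, bad = val, w.copy(), 0     # remember the best weights so far
    else:
        bad += 1
        if bad >= patience:                          # validation error stopped improving
            break                                    # stop before overfitting worsens
w = best_w
print("stopped at epoch", epoch, "validation error", best_val)
```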
How Dropout is used for neural networks (training & testing)
During each minibatch of training, a fixed percentage (usually one half) of nodes is chosen to be inactive. In the testing phase, all nodes are active, but the activation of each node is multiplied by the same percentage that was used in training.
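A minimal NumPy sketch of this train/test scheme; the keep fraction of 0.5 and the toy activations are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout_train(activations, keep=0.5):
    """Training: a random fraction (1 - keep) of nodes is made inactive."""
    mask = rng.random(activations.shape) < keep   # 1 with probability `keep`
    return activations * mask                     # dropped nodes output zero

def dropout_test(activations, keep=0.5):
    """Testing: all nodes are active, scaled by the same keep fraction."""
    return activations * keep

h = np.array([0.2, 1.5, -0.7, 0.9])   # toy hidden activations (assumed)
print(dropout_train(h))               # some entries zeroed for this minibatch
print(dropout_test(h))                # every entry multiplied by 0.5
```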
Squared Error, Cross Entropy, Softmax, Weight Decay
Assume $z_i$ is the actual output, $t_i$ is the target output, and $w_j$ are the weights.
Squared Error: $E = \frac{1}{2}\sum_i (z_i - t_i)^2$
Cross Entropy: $E = \sum_i \left(-t_i\log z_i - (1-t_i)\log(1-z_i)\right)$
Softmax: $E = -\left(z_i - \log\sum_j \exp(z_j)\right)$, where $i$ is the correct class.
Weight Decay: $E = \frac{1}{2}\sum_j w_j^2$
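These error functions can be written directly in NumPy as a sanity check; the toy outputs, targets, weights and class index below are made-up values, and the softmax line treats the $z_j$ as pre-softmax outputs:

```python
import numpy as np

z = np.array([0.8, 0.1, 0.6])    # actual outputs z_i (toy values)
t = np.array([1.0, 0.0, 1.0])    # target outputs t_i
w = np.array([0.5, -1.2, 0.3])   # weights w_j

sq_error  = 0.5 * np.sum((z - t) ** 2)                       # squared error
cross_ent = np.sum(-t * np.log(z) - (1 - t) * np.log(1 - z)) # cross entropy

logits = np.array([2.0, 0.5, -1.0])   # pre-softmax outputs z_j (toy values)
i = 0                                 # index of the correct class
softmax_e = -(logits[i] - np.log(np.sum(np.exp(logits))))    # softmax (log-loss)

weight_decay = 0.5 * np.sum(w ** 2)                          # weight decay term
print(sq_error, cross_ent, softmax_e, weight_decay)
```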
Difference between Maximum Likelihood estimation and Bayesian Inference (supervised)
In Maximum Likelihood estimation, the hypothesis $h\in H$ is chosen which maximizes the conditional probability $P(D\mid h)$ of the observed data $D$, conditioned on $h$.
In Bayesian Inference, the hypothesis $h\in H$ is chosen which maximizes $P(D\mid h)\,P(h)$, where $P(h)$ is the prior probability of $h$.
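A small NumPy sketch of the difference, using a toy coin-flip setting; the hypothesis grid, the observed counts and the prior are all illustrative assumptions:

```python
import numpy as np

# Hypotheses: candidate head-probabilities for a coin (toy hypothesis space H).
theta = np.linspace(0.05, 0.95, 19)

# Observed data D: 7 heads out of 10 tosses (assumed counts).
heads, tails = 7, 3
likelihood = theta**heads * (1 - theta)**tails        # P(D | h)

# A prior P(h) favouring a fair coin (illustrative choice).
prior = np.exp(-20 * (theta - 0.5) ** 2)
prior /= prior.sum()

h_ml  = theta[np.argmax(likelihood)]           # maximizes P(D | h)
h_map = theta[np.argmax(likelihood * prior)]   # maximizes P(D | h) P(h)
print(h_ml, h_map)   # the prior pulls the second estimate towards 0.5
```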
Concept of Momentum (as an enhancement for Gradient Descent)
A running average of the differentials for each weight is maintained and used to update the weights as follows:
$\delta w = \alpha\,\delta w - \eta\frac{dE}{dw}$
$w = w + \delta w$
The constant $\alpha$ with $0\leq\alpha < 1$ is called the momentum.
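A minimal sketch of this update rule on a toy error function $E(w) = \frac{1}{2}w^2$; the error function, learning rate and momentum value are illustrative assumptions:

```python
# Toy error function E(w) = 0.5 * w^2, so dE/dw = w (an illustrative choice).
def grad_E(w):
    return w

w, delta_w = 5.0, 0.0
alpha, eta = 0.9, 0.1   # momentum and learning rate (assumed values)

for step in range(200):
    delta_w = alpha * delta_w - eta * grad_E(w)   # delta_w = alpha*delta_w - eta*dE/dw
    w = w + delta_w                               # w = w + delta_w
print(w)   # w oscillates but approaches the minimum at w = 0
```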