Define a function
Goodness of function
Pick the best function
Deeper usually does not imply better
Vanishing gradient problem
With sigmoid activations, changing the parameters near the input layer has little effect on the output, so the earlier layers get small gradients and learn very slowly
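A toy NumPy sketch (not from the lecture; the depth and random weights are my own arbitrary choices) that makes this concrete: each sigmoid layer multiplies the gradient by w · σ′(z), and σ′(z) ≤ 0.25, so the derivative at the input shrinks roughly geometrically with depth.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Chain `depth` scalar sigmoid layers: a_k = sigmoid(w_k * a_{k-1}).
# d(output)/d(input) is the product of w_k * sigma'(z_k) over all layers.
np.random.seed(0)
depth = 10
weights = np.random.randn(depth)

a = 1.0      # input
grad = 1.0   # running d(output)/d(input)
for w in weights:
    a = sigmoid(w * a)
    grad *= w * a * (1.0 - a)   # sigma'(z) = sigma(z) * (1 - sigma(z))

print(f"gradient at the input after {depth} layers: {grad:.1e}")
```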
Improvement: use the Rectified Linear Unit (ReLU)
Reasons:
- Fast to compute
- Biological reason
- Infinite sigmoid with different biases
- Addresses the vanishing gradient problem
For a given input, the active ReLU units form a thinner linear network
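A small NumPy sketch of this "thinner linear network" view (illustrative; the layer sizes are arbitrary): for a fixed input, the inactive ReLU units output 0, and the remaining network is exactly a linear map.

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

np.random.seed(0)
W1 = np.random.randn(4, 3)   # layer sizes are arbitrary
W2 = np.random.randn(2, 4)

x = np.random.randn(3)
h = relu(W1 @ x)             # some units output 0 for this particular x
y = W2 @ h

# Mask of active units: restricted to them, the network is purely linear.
mask = (W1 @ x > 0).astype(float)
y_linear = (W2 * mask) @ (W1 @ x)   # equivalent thinner linear network
assert np.allclose(y, y_linear)
print("active units:", mask)
```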
Maxout: ReLU is a special case of Maxout
Learnable activation function - the activation function in a maxout network can be any piecewise linear convex function
- The number of pieces depends on how many elements are in a group
Maxout training - Given a training example x, we know which element in each group would be the max
- Train this thin and linear network
- Different inputs select different max elements, so across training examples every parameter still gets trained (see the sketch below)
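A minimal maxout sketch (the helper `maxout` and all sizes are my own, not from the lecture) showing the layer and how ReLU falls out as a special case when each group pairs a learned linear piece with a constant 0.

```python
import numpy as np

def maxout(x, W, b, group_size):
    """Maxout: compute every linear piece, then take the max within each group."""
    z = W @ x + b                   # all pieces, shape (num_units * group_size,)
    return z.reshape(-1, group_size).max(axis=1)

np.random.seed(0)
x = np.random.randn(3)

# ReLU as a special case: pair each learned piece w.x + b with a constant 0.
w = np.random.randn(3)
W = np.vstack([w, np.zeros(3)])     # the group's second piece is always 0
b = np.array([0.5, 0.0])
out = maxout(x, W, b, group_size=2)
assert np.isclose(out[0], max(w @ x + 0.5, 0.0))   # same as ReLU(w.x + 0.5)
print(out)
```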
RMSProp differs from Adagrad: instead of summing all past squared gradients, it keeps an exponentially decaying average of them, so old gradients gradually fade away
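A side-by-side sketch of the two update rules (the learning rate, the decay factor α, and the toy objective f(w) = w² are arbitrary choices of mine, only the statistic of squared gradients differs):

```python
import numpy as np

# Toy objective f(w) = w^2, gradient 2w; hyperparameters are arbitrary.
def grad(w):
    return 2.0 * w

lr, alpha, eps = 0.1, 0.9, 1e-8
w_ada, s_ada = 5.0, 0.0    # Adagrad weight and accumulated sum
w_rms, s_rms = 5.0, 0.0    # RMSProp weight and decaying average

for _ in range(100):
    g = grad(w_ada)
    s_ada += g ** 2                                # sum of ALL past squared grads
    w_ada -= lr * g / (np.sqrt(s_ada) + eps)

    g = grad(w_rms)
    s_rms = alpha * s_rms + (1 - alpha) * g ** 2   # decaying average: old grads fade
    w_rms -= lr * g / (np.sqrt(s_rms) + eps)

print(f"Adagrad w: {w_ada:.4f}, RMSProp w: {w_rms:.4f}")
```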
Momentum
Movement: λ times the movement of the last step, minus η times the gradient at the present position; the parameters then move by this amount
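A minimal momentum sketch on the same toy objective f(w) = w² (the values of η and λ are arbitrary): the movement v remembers the previous step, damped by λ, and subtracts the current gradient scaled by η.

```python
# Gradient descent with momentum on f(w) = w^2.
def grad(w):
    return 2.0 * w

w, v = 5.0, 0.0
lr, lam = 0.1, 0.9   # eta (learning rate) and lambda (momentum)

for step in range(50):
    v = lam * v - lr * grad(w)   # movement = lambda * last movement - eta * gradient
    w = w + v                    # move by the accumulated movement

print(f"w after 50 steps: {w:.4f}")
```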