Notes on Week 3 of Andrew Ng's Machine Learning course on Coursera

Evaluating a Learning Algorithm

Evaluating a Hypothesis

When our predictions have unacceptably large errors, we can troubleshoot with the following options:

  • Getting more training examples: fixes high variance
  • Trying smaller sets of features: fixes high variance
  • Trying additional features: fixes high bias
  • Trying polynomial features: fixes high bias
  • Increasing λ: fixes high variance
  • Decreasing λ: fixes high bias

A hypothesis may already achieve very low error on the training set yet still be inaccurate, because it is overfitting. To evaluate a hypothesis, we therefore split the data into two sets: a training set (70%) and a test set (30%).
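The 70/30 split above can be sketched as follows. This is a minimal NumPy sketch; the function name `train_test_split` and the toy data are illustrative, not from the course:

```python
import numpy as np

def train_test_split(X, y, test_ratio=0.3, seed=0):
    """Randomly split the data into a training set (70%) and a test set (30%)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))          # shuffle so the split is random
    n_test = int(len(X) * test_ratio)
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    return X[train_idx], y[train_idx], X[test_idx], y[test_idx]

# toy data: 10 examples with 2 features each
X = np.arange(20).reshape(10, 2)
y = np.arange(10)
X_train, y_train, X_test, y_test = train_test_split(X, y)
print(len(X_train), len(X_test))  # 7 3
```

Shuffling before splitting matters: if the data is ordered (e.g. by class), a plain slice would give unrepresentative sets.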


Model Selection and Train/Validation/Test Sets

One way to break down our dataset into the three sets is:

  • Training set: 60%
  • Cross validation set: 20%
  • Test set: 20%

We can now calculate three separate error values for the three different sets using the following method:

  1. Optimize the parameters in Θ using the training set for each polynomial degree.
  2. Find the polynomial degree d with the least error using the cross validation set.
  3. Estimate the generalization error using the test set with J_test(Θ^(d)), where d is the degree of the polynomial with the lowest cross validation error.

This way, the degree of the polynomial d has not been trained using the test set.
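The three steps above can be sketched with NumPy's `polyfit` on synthetic quadratic data (all variable names and the data here are illustrative assumptions, not from the course):

```python
import numpy as np

# synthetic data: y = 1 + 2x + 3x^2 plus a little noise
rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, 100)
y = 1 + 2 * x + 3 * x**2 + rng.normal(0, 0.1, 100)

# 60/20/20 split into training / cross validation / test sets
x_tr, y_tr = x[:60], y[:60]
x_cv, y_cv = x[60:80], y[60:80]
x_te, y_te = x[80:], y[80:]

def mse(coef, x, y):
    """Mean squared error of a polynomial fit."""
    return np.mean((np.polyval(coef, x) - y) ** 2)

# 1. optimize the parameters on the training set for each degree d
fits = {d: np.polyfit(x_tr, y_tr, d) for d in range(1, 9)}
# 2. pick the degree d with the lowest cross validation error
best_d = min(fits, key=lambda d: mse(fits[d], x_cv, y_cv))
# 3. estimate the generalization error on the test set
print(best_d, mse(fits[best_d], x_te, y_te))
```

Because the test set was never used to choose d, the final test error is an unbiased estimate of generalization error.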

Bias vs Variance

Diagnosing Bias vs. Variance

  • High bias (underfitting): J_train(Θ) is high and J_cv(Θ) ≈ J_train(Θ).
  • High variance (overfitting): J_train(Θ) is low and J_cv(Θ) ≫ J_train(Θ).

Regularization and Bias/Variance

A large λ causes high bias (underfitting); a small λ causes high variance (overfitting). Choose λ by training with a range of values and picking the one with the lowest cross validation error.
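Selecting λ by cross validation can be sketched with a regularized (ridge) linear model; a minimal NumPy sketch, where the synthetic data and the closed-form regularized normal equation (XᵀX + λI)⁻¹Xᵀy are my assumptions:

```python
import numpy as np

# synthetic linear data: only the first 3 of 10 features matter
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 10))
theta_true = np.zeros(10)
theta_true[:3] = [2.0, -1.0, 0.5]
y = X @ theta_true + rng.normal(0, 0.5, 40)
X_tr, y_tr, X_cv, y_cv = X[:25], y[:25], X[25:], y[25:]

def ridge_fit(X, y, lam):
    """Closed-form regularized normal equation: (X'X + lam*I)^-1 X'y."""
    n = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)

# train with each lambda, then keep the one with the lowest CV error
lambdas = [0.01, 0.03, 0.1, 0.3, 1, 3, 10]
errs = {lam: np.mean((X_cv @ ridge_fit(X_tr, y_tr, lam) - y_cv) ** 2)
        for lam in lambdas}
best_lam = min(errs, key=errs.get)
print(best_lam, errs[best_lam])
```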

Learning Curve

  • High bias: as the training set size m grows, J_train(Θ) and J_cv(Θ) both converge to a high error; getting more training data will not help much by itself.
  • High variance: J_train(Θ) stays low while J_cv(Θ) remains much higher, leaving a gap between the two curves; getting more training data is likely to help.
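A learning curve can be computed by training on increasing subsets of the training data and evaluating both errors at each size; a minimal NumPy sketch with synthetic linear data (all names and data here are illustrative):

```python
import numpy as np

# synthetic data: y = 2x plus noise; the linear model matches the true model
rng = np.random.default_rng(2)
x = rng.uniform(-1, 1, 120)
y = 2 * x + rng.normal(0, 0.2, 120)
x_tr, y_tr, x_cv, y_cv = x[:100], y[:100], x[100:], y[100:]

def mse(coef, x, y):
    return np.mean((np.polyval(coef, x) - y) ** 2)

# train on the first m examples, evaluate on those m and on the full CV set
results = {}
for m in (5, 20, 50, 100):
    coef = np.polyfit(x_tr[:m], y_tr[:m], 1)
    results[m] = (mse(coef, x_tr[:m], y_tr[:m]), mse(coef, x_cv, y_cv))
    print(m, results[m])
```

Plotting J_train and J_cv against m (e.g. with matplotlib) gives the learning curves discussed above.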

Diagnosing Neural Networks

  • A neural network with fewer parameters is prone to underfitting. It is also computationally cheaper.
  • A large neural network with more parameters is prone to overfitting. It is also computationally expensive. In this case you can use regularization (increase λ) to address the overfitting.

Building a Spam Classifier

Prioritizing What to Work On

  • Collect lots of data (for example the “honeypot” project, though more data does not always help)
  • Develop sophisticated features (for example: using email header data in spam emails)
  • Develop algorithms to process your input in different ways (recognizing misspellings in spam).


Error Analysis

  • Start with a simple algorithm, implement it quickly, and test it early on your cross validation data.
  • Plot learning curves to decide if more data, more features, etc. are likely to help.
  • Manually examine the errors on examples in the cross validation set and try to spot a trend where most of the errors were made.

Handling Skewed Data


F1 Score: F1 = 2PR / (P + R), where precision P = true positives / (true positives + false positives) and recall R = true positives / (true positives + false negatives). On skewed classes, precision and recall are more informative than accuracy alone.
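The F1 score can be computed directly from the confusion-matrix counts; a minimal plain-Python sketch (the helper name `f1_score` and the toy labels are illustrative):

```python
def f1_score(y_true, y_pred):
    """Precision, recall, and F1 for a binary classifier (1 = positive class)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# skewed data: a classifier that predicts all zeros would score high
# accuracy here, but its F1 would be 0
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
print(f1_score(y_true, y_pred))  # (0.5, 0.5, 0.5)
```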
