ML - Coursera Andrew Ng - Week6 & Ex5 - System Design - Notes and Code

Week 6 covers the design and evaluation of machine learning systems.

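To judge how good a learning algorithm is, we evaluate its hypothesis. First learn θ on the training set by minimizing $J_{train}(θ)$, then compute the test set error $J_{test}(θ)$. If the test set is also used to choose the model, its error is no longer a fair estimate of the generalization error, so we add a validation (cross validation) set and do the model comparison there, keeping the test set for the final estimate. Without a validation set, a typical split is 70% training set and 30% test set. With one, a typical split is 60% training set, 20% cross validation set, and 20% test set. The data must be in random order before it is split; if it is not, shuffle it first.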
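The shuffle-then-split step can be sketched in Matlab/Octave as follows. This is only an illustration under stated assumptions: the toy data (`X`, `y`) and the variable names are made up here and are not part of the course exercise.

```matlab
% Minimal sketch: shuffle a dataset, then split it 60/20/20.
% X and y are hypothetical toy data, not the exercise data.
m = 100;                       % number of examples
X = (1 : m)';                  % one made-up feature
y = 2 * X + 1;                 % made-up targets

idx = randperm(m);             % random permutation of the row indices
X = X(idx, :);                 % shuffle inputs and targets with the same order
y = y(idx);

n_train = round(0.6 * m);      % 60% training set
n_val   = round(0.2 * m);      % 20% cross validation set, the rest is test

X_train = X(1 : n_train, :);
y_train = y(1 : n_train);
X_val   = X(n_train + 1 : n_train + n_val, :);
y_val   = y(n_train + 1 : n_train + n_val);
X_test  = X(n_train + n_val + 1 : end, :);
y_test  = y(n_train + n_val + 1 : end);
```

Shuffling inputs and targets with the same index vector keeps each example paired with its label.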

With the evaluation results in hand, we run diagnostics to see whether the learning algorithm is working and to decide where to look for improvement. Diagnostics can take a lot of time, but it is time well spent. We usually diagnose the algorithm's bias and variance. High bias means underfitting; high variance means overfitting. Both mean the algorithm performs worse than expected, with large $J_{cv}(θ)$ and $J_{test}(θ)$. With high bias, $J_{train}(θ)$ is close to $J_{cv}(θ)$ and $J_{test}(θ)$, and all of them are above the desired error. With high variance, $J_{train}(θ)$ is far below $J_{cv}(θ)$ and $J_{test}(θ)$, and the desired error lies between the two. Learning curves help make sense of both cases: they show that adding training examples helps with high variance but does not help with high bias.

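To fix high variance, try: 1) getting more training examples; 2) using a smaller set of features; 3) increasing λ. To fix high bias, try: 1) getting additional features; 2) adding polynomial features; 3) decreasing λ.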

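When designing a machine learning system, the recommended approach is: 1) start with a simple algorithm that can be implemented quickly, and test it on the validation set; 2) plot learning curves to see whether more data, more features, and so on are likely to improve the model; 3) do error analysis: manually examine the validation set examples the model got wrong and look for patterns in the types of errors.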

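Error analysis needs a numerical value to judge by. With skewed data, precision alone is not enough to tell whether the model is accurate or whether a change actually improves it, so recall is introduced as well. The F score ($F_1$ score) combines precision and recall: $F_1 = 2\frac{PR}{P+R}$.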
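As a small worked example, precision P, recall R, and the F1 score 2PR/(P+R) can be computed from predicted and true labels as below. The label vectors are invented for illustration; 1 marks the rare positive class.

```matlab
% Minimal sketch: precision, recall and F1 on made-up binary labels.
y_true = [1 0 0 1 0 0 0 1 0 0]';   % hypothetical ground truth
y_pred = [1 0 0 0 0 1 0 1 0 0]';   % hypothetical predictions

tp = sum((y_pred == 1) & (y_true == 1));   % true positives
fp = sum((y_pred == 1) & (y_true == 0));   % false positives
fn = sum((y_pred == 0) & (y_true == 1));   % false negatives

precision = tp / (tp + fp);   % of the predicted positives, how many are right
recall    = tp / (tp + fn);   % of the actual positives, how many were found
f1        = 2 * precision * recall / (precision + recall);
```

A classifier that always predicts 0 on this data would reach 70% accuracy yet have zero recall, which is why accuracy-style numbers are misleading on skewed classes.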

A high-performance learning algorithm needs both enough parameters to predict y accurately and a very large training set.

1. Evaluating a Learning Algorithm
1.1 Evaluating a Hypothesis
1.2 Model Selection and Train/Validation/Test Sets

Given many models with different polynomial degrees, we can use a systematic approach to identify the ‘best’ function. In order to choose the model of your hypothesis, you can test each degree of polynomial and look at the error result.
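One way to sketch this selection procedure: fit each candidate degree on the training set, pick the degree with the lowest cross validation error, and only then report the error on the test set. The toy data, the interleaved split, and the use of `polyfit`/`polyval` (instead of the course's cost function and optimizer) are assumptions made for this illustration.

```matlab
% Minimal sketch: choose a polynomial degree using a validation set.
% Toy data with a deterministic wiggle standing in for noise.
x = linspace(-1, 1, 60)';
y = 1 + 2 * x - 3 * x .^ 2 + 0.1 * sin(25 * x);

% Interleaved split: out of every 5 points, 3 train, 1 validation, 1 test.
k  = mod((1 : 60)', 5);
tr = k < 3;
cv = k == 3;
te = k == 4;

degrees = 1 : 6;
J_cv = zeros(numel(degrees), 1);
for d = degrees
    p = polyfit(x(tr), y(tr), d);                          % fit on training data only
    J_cv(d) = sum((polyval(p, x(cv)) - y(cv)) .^ 2) / (2 * sum(cv));
end

[~, best_d] = min(J_cv);                                   % degree with lowest validation error
p = polyfit(x(tr), y(tr), best_d);
J_test = sum((polyval(p, x(te)) - y(te)) .^ 2) / (2 * sum(te));   % reported once, at the end
```

Because the degree is chosen on the validation set, `J_test` has not been used for any decision and remains a fair estimate of how the chosen model generalizes.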

2. Bias vs. Variance
2.1 Diagnosing Bias vs. Variance
2.2 Regularization and Bias/Variance
2.3 Learning Curves
2.4 Deciding What to Do Next
3. Building a Spam Classifier
4. Handling Skewed Data
5. Using Large Data Sets
6. Exercise 5: Exploring Bias and Variance in Regularized Linear Regression - Matlab
6.1 Regularized Linear Regression

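For regularized linear regression, we fill in the computation of J and the gradient in linearRegCostFunction.m, following the cost function and gradient descent formulas. X already includes the $x_0$ column when it is passed to the function. $θ_0$ is excluded from the regularization term.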
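For reference, the cost and gradient being implemented can be written out as follows, assuming the course's usual notation: $h_θ(x) = θ^T x$, m training examples, and n features (so θ has n + 1 entries, with $θ_0$ unregularized).

```latex
J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2
          + \frac{\lambda}{2m} \sum_{j=1}^{n} \theta_j^2

\frac{\partial J(\theta)}{\partial \theta_0}
  = \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_0^{(i)}

\frac{\partial J(\theta)}{\partial \theta_j}
  = \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}
  + \frac{\lambda}{m} \theta_j
  \qquad (j \ge 1)
```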

function [J, grad] = linearRegCostFunction(X, y, theta, lambda)
%LINEARREGCOSTFUNCTION Compute cost and gradient for regularized linear 
%regression with multiple variables
%   [J, grad] = LINEARREGCOSTFUNCTION(X, y, theta, lambda) computes the 
%   cost of using theta as the parameter for linear regression to fit the 
%   data points in X and y. Returns the cost in J and the gradient in grad

% Initialize some useful values
m = length(y); % number of training examples

% You need to return the following variables correctly 
J = 0;
grad = zeros(size(theta));

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost and gradient of regularized linear 
%               regression for a particular choice of theta.
%
%               You should set J to the cost and grad to the gradient.
%

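% Regularization term; theta(1) (the bias term) is not regularized
reg = (lambda / 2 / m) * (theta(2 : end))' * theta(2 : end);
J = sum((X * theta - y) .^ 2) / (2 * m) + reg;

% Unregularized gradient, then add (lambda / m) * theta_j for j >= 1
grad_temp = X' * (X * theta - y) / m;
grad = [grad_temp(1) ; grad_temp(2 : end) + (lambda / m) * theta(2 : end)];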

% =========================================================================

grad = grad(:);

end
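A quick way to gain confidence in a cost/gradient pair like this is a numerical gradient check. The sketch below does not call linearRegCostFunction.m; it restates the same cost and gradient formulas inline on made-up data so that it runs on its own, and compares the analytic gradient with central differences.

```matlab
% Minimal sketch: numerical gradient check for regularized linear regression.
% X, y, theta and lambda are hypothetical values chosen for illustration.
X = [ones(5, 1), (1 : 5)'];        % design matrix, bias column already added
y = [2 4 5 4 6]';
theta = [0.5; -0.3];
lambda = 1;
m = length(y);

% Cost with theta(1) excluded from the regularization term
cost = @(t) sum((X * t - y) .^ 2) / (2 * m) + (lambda / (2 * m)) * sum(t(2 : end) .^ 2);

% Analytic gradient
grad = X' * (X * theta - y) / m;
grad(2 : end) = grad(2 : end) + (lambda / m) * theta(2 : end);

% Central-difference approximation, one coordinate at a time
epsilon = 1e-4;
num_grad = zeros(size(theta));
for j = 1 : numel(theta)
    step = zeros(size(theta));
    step(j) = epsilon;
    num_grad(j) = (cost(theta + step) - cost(theta - step)) / (2 * epsilon);
end

max_diff = max(abs(num_grad - grad));   % should be tiny if the gradient is right
```

If `max_diff` is not close to zero, the usual suspects are regularizing the bias term or a missing factor of 1/m.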

With the cost and gradient available, trainLinearReg.m uses fmincg() to learn the parameters.

function [theta] = trainLinearReg(X, y, lambda)
%TRAINLINEARREG Trains linear regression given a dataset (X, y) and a
%regularization parameter lambda
%   [theta] = TRAINLINEARREG (X, y, lambda) trains linear regression using
%   the dataset (X, y) and regularization parameter lambda. Returns the
%   trained parameters theta.
%

% Initialize Theta
initial_theta = zeros(size(X, 2), 1); 

% Create "short hand" for the cost function to be minimized
costFunction = @(t) linearRegCostFunction(X, y, t, lambda);

% Now, costFunction is a function that takes in only one argument
options = optimset('MaxIter', 200, 'GradObj', 'on');

% Minimize using fmincg
theta = fmincg(costFunction, initial_theta, options);

end
6.2 Bias-Variance
6.2.1 Learning Curves

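We plot learning curves to debug the learning algorithm. The x-axis of a learning curve is the training set size and the y-axis is the error. For a training set of size i, we take the first i examples of the training set, use trainLinearReg() to find the parameters θ that minimize the cost function, and then use this θ to compute the error on both the training set and the validation set. Note that the error does not include the regularization term, so lambda is passed as 0 when calling linearRegCostFunction(). The computation of both errors goes in learningCurve.m.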

function [error_train, error_val] = ...
    learningCurve(X, y, Xval, yval, lambda)
%LEARNINGCURVE Generates the train and cross validation set errors needed 
%to plot a learning curve
%   [error_train, error_val] = ...
%       LEARNINGCURVE(X, y, Xval, yval, lambda) returns the train and
%       cross validation set errors for a learning curve. In particular, 
%       it returns two vectors of the same length - error_train and 
%       error_val. Then, error_train(i) contains the training error for
%       i examples (and similarly for error_val(i)).
%
%   In this function, you will compute the train and test errors for
%   dataset sizes from 1 up to m. In practice, when working with larger
%   datasets, you might want to do this in larger intervals.
%

% Number of training examples
m = size(X, 1);

% You need to return these values correctly
error_train = zeros(m, 1);
error_val   = zeros(m, 1);

% ====================== YOUR CODE HERE ======================
% Instructions: Fill in this function to return training errors in 
%               error_train and the cross validation errors in error_val. 
%               i.e., error_train(i) and 
%               error_val(i) should give you the errors
%               obtained after training on i examples.
%
% Note: You should evaluate the training error on the first i training
%       examples (i.e., X(1:i, :) and y(1:i)).
%
%       For the cross-validation error, you should instead evaluate on
%       the _entire_ cross validation set (Xval and yval).
%
% Note: If you are using your cost function (linearRegCostFunction)
%       to compute the training and cross validation error, you should 
%       call the function with the lambda argument set to 0. 
%       Do note that you will still need to use lambda when running
%       the training to obtain the theta parameters.
%
% Hint: You can loop over the examples with the following:
%
%       for i = 1:m
%           % Compute train/cross validation errors using training examples 
%           % X(1:i, :) and y(1:i), storing the result in 
%           % error_train(i) and error_val(i)
%           ....
%           
%       end
%

% ---------------------- Sample Solution ----------------------

for i = 1 : m
    theta = trainLinearReg(X(1 : i, :), y(1 : i), lambda);
    error_train(i) = linearRegCostFunction(X(1 : i, :), y(1 : i), theta, 0);
    error_val(i) = linearRegCostFunction(Xval, yval, theta, 0);
end

% -------------------------------------------------------------

% =========================================================================

end

The resulting learning curves are shown in the figure. As the training set grows, both the training error and the validation error stay high, which indicates a high bias problem.

6.3 Polynomial Regression
6.3.1 Learning Polynomial Regression

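We add polynomial terms to increase the number of features. The polynomial feature mapping goes in polyFeatures.m. Because the polynomial features differ widely in range, feature normalization is needed here.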

function [X_poly] = polyFeatures(X, p)
%POLYFEATURES Maps X (1D vector) into the p-th power
%   [X_poly] = POLYFEATURES(X, p) takes a data matrix X (size m x 1) and
%   maps each example into its polynomial features where
%   X_poly(i, :) = [X(i) X(i).^2 X(i).^3 ...  X(i).^p];
%


% You need to return the following variables correctly.
X_poly = zeros(numel(X), p);

% ====================== YOUR CODE HERE ======================
% Instructions: Given a vector X, return a matrix X_poly where the p-th 
%               column of X contains the values of X to the p-th power.
%
% 

for i = 1 : p
    X_poly(:, i) = X .^ i;
end

% =========================================================================

end
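The normalization step mentioned above can be sketched as follows: compute the mean and standard deviation of each polynomial feature on the training set, then reuse those same statistics for any other set. This is an illustrative sketch, not the exercise's own helper; the sample values and variable names are assumptions.

```matlab
% Minimal sketch: polynomial features followed by feature normalization.
x = [-2; -1; 0; 1; 2; 3];          % hypothetical raw training feature
p = 4;                             % highest polynomial degree

X_poly = zeros(numel(x), p);
for i = 1 : p
    X_poly(:, i) = x .^ i;         % column i holds x to the i-th power
end

mu    = mean(X_poly);              % per-column mean (1 x p)
sigma = std(X_poly);               % per-column standard deviation (1 x p)
X_norm = bsxfun(@rdivide, bsxfun(@minus, X_poly, mu), sigma);

% Other sets are scaled with the training mu and sigma, not their own.
x_val = [-1.5; 0.5; 2.5];          % hypothetical validation feature
X_val_poly = bsxfun(@power, x_val, 1 : p);
X_val_norm = bsxfun(@rdivide, bsxfun(@minus, X_val_poly, mu), sigma);
```

Reusing the training statistics keeps the validation and test features on the same scale the model was trained on.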

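The resulting learning curve is shown in the figure. The training error is very small and hugs the x-axis; the validation error is also small, but there is still a gap between it and the training error, which indicates a high variance problem.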

6.3.2 Selecting Lambda Using a Cross Validation Set

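The value of λ significantly affects the result of regularized polynomial regression. To see the effect of different values, the vector lambda_vec holds a series of λ values; we train one model per λ and check each model's validation error and training error. The corresponding code goes in validationCurve.m.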

function [lambda_vec, error_train, error_val] = ...
    validationCurve(X, y, Xval, yval)
%VALIDATIONCURVE Generate the train and validation errors needed to
%plot a validation curve that we can use to select lambda
%   [lambda_vec, error_train, error_val] = ...
%       VALIDATIONCURVE(X, y, Xval, yval) returns the train
%       and validation errors (in error_train, error_val)
%       for different values of lambda. You are given the training set (X,
%       y) and validation set (Xval, yval).
%

% Selected values of lambda (you should not change this)
lambda_vec = [0 0.001 0.003 0.01 0.03 0.1 0.3 1 3 10]';

% You need to return these variables correctly.
error_train = zeros(length(lambda_vec), 1);
error_val = zeros(length(lambda_vec), 1);

% ====================== YOUR CODE HERE ======================
% Instructions: Fill in this function to return training errors in 
%               error_train and the validation errors in error_val. The 
%               vector lambda_vec contains the different lambda parameters 
%               to use for each calculation of the errors, i.e, 
%               error_train(i), and error_val(i) should give 
%               you the errors obtained after training with 
%               lambda = lambda_vec(i)
%
% Note: You can loop over lambda_vec with the following:
%
%       for i = 1:length(lambda_vec)
%           lambda = lambda_vec(i);
%           % Compute train / val errors when training linear 
%           % regression with regularization parameter lambda
%           % You should store the result in error_train(i)
%           % and error_val(i)
%           ....
%           
%       end
%
%

for i = 1 : length(lambda_vec)
    theta = trainLinearReg(X, y, lambda_vec(i));
    error_train(i) = linearRegCostFunction(X, y, theta, 0);
    error_val(i) = linearRegCostFunction(Xval, yval, theta, 0);
end

% =========================================================================

end

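The resulting plot is shown in the figure. It shows that the best value of λ is around 3. Because the data is split at random, the validation error can sometimes be lower than the training error.

Reference for the exercise code: https://www.cnblogs.com/hapjin/p/6114466.html

All Ex4 code has been uploaded to GitHub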
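Reading the best λ off the plot can also be done programmatically by taking the λ with the smallest validation error. The sketch below is self-contained and therefore makes several assumptions: it uses made-up toy data, degree-8 polynomial features, and the regularized normal equation in place of trainLinearReg() and fmincg().

```matlab
% Minimal sketch: pick lambda by minimum validation error on toy data.
x = linspace(-1, 1, 40)';
y = sin(3 * x) + 0.2 * cos(40 * x);          % made-up target with a deterministic wiggle
p = 8;
X = [ones(40, 1), bsxfun(@power, x, 1 : p)]; % bias column plus polynomial features

tr = mod((1 : 40)', 2) == 1;                 % odd rows: training set
cv = ~tr;                                    % even rows: validation set

lambda_vec = [0 0.001 0.003 0.01 0.03 0.1 0.3 1 3 10]';
error_val = zeros(size(lambda_vec));

L = eye(p + 1);
L(1, 1) = 0;                                 % do not regularize theta_0

for i = 1 : numel(lambda_vec)
    % Regularized normal equation (swapped in for fmincg in this sketch)
    theta = (X(tr, :)' * X(tr, :) + lambda_vec(i) * L) \ (X(tr, :)' * y(tr));
    % Validation error is computed without the regularization term
    error_val(i) = sum((X(cv, :) * theta - y(cv)) .^ 2) / (2 * sum(cv));
end

[~, best] = min(error_val);
best_lambda = lambda_vec(best);
```

On the exercise data the same last two lines can be applied directly to the `error_val` returned by validationCurve().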
