# Stanford Machine Learning --- Lecture 3: Logistic Regression and Solving the Problem of Overfitting (Logistic Regression & Regularization)

Logistic Regression

=========================

(1) Classification

(2) Hypothesis Representation

(3) Decision Boundary

(4) Cost Function

(5) Simplified Cost Function and Gradient Descent

(6) Parameter Optimization in Matlab

(7) Multiclass Classification: One-vs-all

The problem of overfitting and how to solve it

=========================

(8) The Problem of Overfitting

(9) Cost Function

(10) Regularized Linear Regression

(11) Regularized Logistic Regression

/************* (1)~(2) Classification / Hypothesis Representation ***********/

Predict y = 1 if hθ(x) >= 0.5 (equivalently, θᵀx >= 0)

Predict y = 0 if hθ(x) < 0.5 (equivalently, θᵀx < 0)
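The threshold rule above can be sketched in Python (a sketch for illustration; the names `sigmoid` and `predict` are chosen here, not from the lecture):

```python
import numpy as np

def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def predict(theta, x):
    """Predict y = 1 when h(x) = g(theta . x) >= 0.5.
    Since g(z) >= 0.5 exactly when z >= 0, this is the same as
    checking theta . x >= 0."""
    return 1 if sigmoid(np.dot(theta, x)) >= 0.5 else 0
```

For example, `predict(np.array([1.0]), np.array([2.0]))` gives 1, while `predict(np.array([-1.0]), np.array([2.0]))` gives 0.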

/***************************** (3) Decision Boundary **************************/

Predict y = 1 if -3 + x1 + x2 >= 0

Predict y = 0 if -3 + x1 + x2 < 0
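With the parameters of this example (θ0 = -3, θ1 = θ2 = 1), the prediction reduces to a linear test, since g(z) >= 0.5 exactly when z >= 0. A minimal sketch:

```python
def predict(x1, x2):
    """Decision boundary from the example above: h(x) = g(-3 + x1 + x2),
    so predict y = 1 exactly when -3 + x1 + x2 >= 0 (the line x1 + x2 = 3)."""
    return 1 if -3 + x1 + x2 >= 0 else 0

print(predict(2, 2))  # 1: the point (2, 2) lies on the y = 1 side of the line
print(predict(1, 1))  # 0: the point (1, 1) lies on the y = 0 side
```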


/******************** (4)~(5) Simplified Cost Function and Gradient Descent <very important> *******************/

Q: Suppose you are running gradient descent to fit a logistic regression model with parameter θ ∈ R^(n+1). Which of the following is a reasonable way to make sure the learning rate α is set properly and that gradient descent is running correctly?

A: Plot J(θ) as a function of the number of iterations and make sure J(θ) is decreasing (or at least non-increasing) on every iteration; if it increases, decrease α.

/************* (6) Parameter Optimization in Matlab ***********/

jVal is the expression of the cost function. For example, suppose we fit the two data points (1, 0, 5) and (0, 1, 5) with the hypothesis hθ(x) = θ1*x1 + θ2*x2. The first point forces θ1 = 5 and the second forces θ2 = 5, so the cost below, jVal = (θ1 - 5)^2 + (θ2 - 5)^2, is minimized exactly at that solution.

function [ jVal,gradient ] = costFunction( theta )
%COSTFUNCTION Cost and analytic gradient for fminunc.
%   jVal is minimized at theta = (5, 5).

jVal = (theta(1)-5)^2 + (theta(2)-5)^2;

% compute the derivative of jVal with respect to theta
gradient = zeros(2,1);
gradient(1) = 2*(theta(1)-5);
gradient(2) = 2*(theta(2)-5);

end

function [optTheta,functionVal,exitFlag] = Gradient_descent( )
%GRADIENT_DESCENT Minimize costFunction with fminunc.

options = optimset('GradObj','on','MaxIter',100);  % use our analytic gradient
initialTheta = zeros(2,1)
[optTheta,functionVal,exitFlag] = fminunc(@costFunction,initialTheta,options);

end

Calling it from the Matlab command window yields the optimized parameters (θ1, θ2) = (5, 5), i.e. hθ(x) = θ1*x1 + θ2*x2 = 5*x1 + 5*x2:

 [optTheta,functionVal,exitFlag] = Gradient_descent()

initialTheta =

0
0

Local minimum found.

Optimization completed because the size of the gradient is less than
the default value of the function tolerance.

<stopping criteria details>

optTheta =

5
5

functionVal =

0

exitFlag =

1
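For readers without Matlab, the same minimization can be sketched with plain gradient descent in Python (a sketch, not part of the original notes; the cost and gradient mirror costFunction.m above):

```python
def cost(theta):
    # jVal = (theta1 - 5)^2 + (theta2 - 5)^2, as in costFunction.m
    return (theta[0] - 5.0) ** 2 + (theta[1] - 5.0) ** 2

def grad(theta):
    # analytic gradient of the cost above
    return [2.0 * (theta[0] - 5.0), 2.0 * (theta[1] - 5.0)]

def gradient_descent(theta, alpha=0.1, iters=500):
    """Repeatedly step opposite the gradient with fixed learning rate alpha."""
    for _ in range(iters):
        g = grad(theta)
        theta = [t - alpha * gi for t, gi in zip(theta, g)]
    return theta

opt_theta = gradient_descent([0.0, 0.0])  # converges to approximately [5, 5]
```

Each step scales the error (θ - 5) by a factor 1 - 2α = 0.8, so 500 iterations drive it to numerical zero, matching the fminunc result above.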


/***************************** (7) Multiclass Classification: One-vs-all **************************/
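One-vs-all reduces a K-class problem to K binary classifiers: classifier k is trained with class k as the positive class and everything else as negative, and a new example is assigned to the class whose classifier outputs the highest hθ(x). A sketch in Python/NumPy on toy data (the training loop and all names here are illustrative, not from the lecture):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_binary(X, y, alpha=0.5, iters=3000):
    """Fit one logistic regression by batch gradient descent.
    X: (m, n) design matrix (first column all ones); y: (m,) 0/1 labels."""
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(iters):
        h = sigmoid(X @ theta)
        theta -= alpha / m * (X.T @ (h - y))
    return theta

def one_vs_all(X, y, num_classes):
    """One classifier per class, treating that class as positive."""
    return np.array([train_binary(X, (y == k).astype(float))
                     for k in range(num_classes)])

def predict(Theta, X):
    """Assign each row of X to the most confident classifier."""
    return np.argmax(sigmoid(X @ Theta.T), axis=1)

# Toy data: three well-separated clusters in 2-D
pts = np.array([[0.0, 0.0], [0.5, 0.5],
                [5.0, 0.0], [5.5, 0.5],
                [0.0, 5.0], [0.5, 5.5]])
labels = np.array([0, 0, 1, 1, 2, 2])
X = np.column_stack([np.ones(len(pts)), pts])   # add intercept column
Theta = one_vs_all(X, labels, 3)
```

On this toy set `predict(Theta, X)` recovers `labels`, since each cluster is linearly separable from the other two.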

The problem of overfitting and how to solve it

/************ (8) The Problem of Overfitting ***********/

The Problem of overfitting:

Overfitting is what the rightmost figure below illustrates. Both of the model families discussed above (logistic regression and linear regression) can overfit; the two figures below show one example of each:

<Linear Regression>:

<logistic regression>:

There are two main ways to address overfitting:

1. Reduce the number of features (manually decide which features to keep, or let a model-selection algorithm choose them).

2. Regularization (keep all the features, but force the parameters of some features to stay very small).

$MSE(f)=\frac{1}{n}\sum_{i=1}^{n}\left(y_{i}-f(x_{i})\right)^2$

For the problem $Y = a^{T}X$, with $X = [\vec{x}_{1},\vec{x}_{2},\dots,\vec{x}_{n}]$ and $Y = [y_{1},y_{2},\dots,y_{n}]$:

$J(a)=\sum_{i=1}^{n}\left(a^{T}\vec{x}_{i}-y_{i}\right)^2$

i.e. the loss function can be written as

$J(a)=(Y-X^Ta)^T(Y-X^Ta)$

Setting the gradient of $J(a)$ with respect to $a$ to zero, we get:

$a=(XX^T)^{-1}XY$

After adding a regularization term $\lambda\, a^{T}a$ to $J(a)$, however, we have:

$a=(XX^T+\lambda I)^{-1}XY$
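The regularized closed form can be checked numerically. A sketch in Python/NumPy with synthetic data (here X stores one example per column, matching X = [x1, ..., xn] above):

```python
import numpy as np

rng = np.random.default_rng(0)
n_features, n_samples = 3, 50
X = rng.normal(size=(n_features, n_samples))  # one example per column
true_a = np.array([1.0, -2.0, 0.5])
Y = true_a @ X                                # noiseless targets y_i = true_a . x_i

lam = 0.1
# Regularized normal equation: a = (X X^T + lambda I)^{-1} X Y
a = np.linalg.solve(X @ X.T + lam * np.eye(n_features), X @ Y)

# a should satisfy the zero-gradient condition of
# J(a) = (Y - X^T a)^T (Y - X^T a) + lambda * a^T a
grad = 2.0 * (X @ X.T @ a - X @ Y) + 2.0 * lam * a
```

`grad` comes out numerically zero, and `a` is a slightly shrunken version of `true_a`; with `lam = 0` the unregularized formula recovers `true_a` up to floating point.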

/************ (9) Cost Function ***********/

Q:

A: A very large λ drives all the parameters θ toward 0, so the hypothesis underfits.

/************ (10) Regularized Linear Regression ***********/

<Linear regression>:

/************ (11) Regularized Logistic Regression ***********/

<Logistic regression>:

When using regularized logistic regression, which of these is the best way to monitor whether gradient descent is working correctly?
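The standard check, as in the earlier quiz, is to compute the regularized cost J(θ) after every iteration and confirm it decreases. A sketch in Python/NumPy with synthetic 1-D data (all names and the toy data are illustrative, not from the lecture):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def reg_cost(theta, X, y, lam):
    """Regularized logistic cost; the bias theta[0] is not penalized."""
    m = len(y)
    h = sigmoid(X @ theta)
    eps = 1e-12  # guard against log(0)
    data = -np.mean(y * np.log(h + eps) + (1 - y) * np.log(1 - h + eps))
    return data + lam / (2 * m) * np.sum(theta[1:] ** 2)

def gradient_descent(X, y, lam=1.0, alpha=0.1, iters=200):
    """Batch gradient descent, recording the cost after every update."""
    m, n = X.shape
    theta = np.zeros(n)
    history = []
    for _ in range(iters):
        h = sigmoid(X @ theta)
        grad = X.T @ (h - y) / m
        grad[1:] += lam / m * theta[1:]  # leave the bias unregularized
        theta -= alpha * grad
        history.append(reg_cost(theta, X, y, lam))
    return theta, history

# Toy 1-D data, separable at x = 0
x = np.array([-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0])
y = np.array([0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0])
X = np.column_stack([np.ones_like(x), x])
theta, history = gradient_descent(X, y)
```

Plotting `history` against the iteration number should show a monotonically decreasing curve; if the curve rises at any point, α is too large.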
