Machine Learning Series: Coursera Week 3, Logistic Regression


爱战术的码农新人


Article link: https://blog.csdn.net/zyh3826/article/details/81632001


1. Classification and Representation

1.1 Classification

Classification examples:

Email: spam / not spam

Online transaction: fraudulent (yes / no)

Tumor: malignant / benign

0: negative class

1: positive class

e.g.

Figure 1

(from Coursera Week 3, Classification)

threshold classifier output h(x) at 0.5

If h(x) >= 0.5, predict y = 1

If h(x) < 0.5, predict y = 0

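But after one more sample point is added, the fitted line becomes the blue line in Figure 2:

Figure 2

(from Coursera Week 3, Classification)

With the same 0.5 threshold, this line misclassifies some of the positive examples, so linear regression is generally not used for classification.

With linear regression, h(x) can be > 1 or < 0, even though y is always 0 or 1.

Logistic regression: 0 <= h(x) <= 1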
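The effect of one extra sample on a thresholded linear fit can be checked numerically. The sketch below is only an illustration: the data values and the helper names `fit_line` and `predict` are made up for this example and do not come from the course.

```python
def fit_line(xs, ys):
    """Least-squares fit of h(x) = a + b * x for a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def predict(a, b, x, threshold=0.5):
    """Threshold the linear output at 0.5, as in the notes."""
    return 1 if a + b * x >= threshold else 0


# Hypothetical tumor sizes and labels (1 = malignant).
sizes = [1, 2, 3, 4, 5, 6, 7, 8]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

a, b = fit_line(sizes, labels)
before = [predict(a, b, x) for x in sizes]

# Add one far-away positive example and refit.
a2, b2 = fit_line(sizes + [30], labels + [1])
after = [predict(a2, b2, x) for x in sizes]

print(before)  # all eight original points classified correctly
print(after)   # the point at x = 5 is now predicted as 0
```

The extra point is itself labeled correctly, yet it flattens the line enough to move the 0.5 crossing past x = 5.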

1.2 Hypothesis representation of Logistic Regression

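Logistic Regression model:

want 0 <= h(x) <= 1

h(x) = g(θ^T x), where g(z) = 1 / (1 + e^(-z)) is the sigmoid (logistic) function

Figure 3

(from Coursera Week 3, Hypothesis Representation)

Interpretation of hypothesis output:

h(x) = estimated probability that y = 1 on input x, i.e. h(x) = P(y = 1 | x; θ)

e.g. if x = [x0; x1] = [1; tumor size]

h(x) = 0.7 tells the patient that there is a 70% chance of the tumor being malignant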
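As a minimal sketch, the hypothesis h(x) = g(θ^T x) with the sigmoid g(z) = 1 / (1 + e^(-z)) can be written as below. The parameter values in `theta` and the tumor size are invented for illustration; the branch on the sign of z is just one common way to avoid overflow in `math.exp`.

```python
import math


def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^(-z)), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def hypothesis(theta, x):
    """h(x) = g(theta^T x); x must already include x0 = 1."""
    z = sum(t * xj for t, xj in zip(theta, x))
    return sigmoid(z)


theta = [-4.0, 1.0]   # hypothetical parameters, not fitted to real data
x = [1.0, 5.0]        # x0 = 1, tumor size = 5 (arbitrary units)
p = hypothesis(theta, x)
print(round(p, 3))    # estimated P(y = 1 | x; theta)
```

Whatever θ and x are, the output stays between 0 and 1, which is what allows it to be read as a probability.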

1.3 Decision boundary

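Logistic Regression:

suppose: predict y = 1 if h(x) >= 0.5, which holds exactly when θ^T x >= 0,

predict y = 0 if h(x) < 0.5, which holds exactly when θ^T x < 0

Decision boundary:

suppose h(x) = g(θ0 + θ1 x1 + θ2 x2) with θ0 = -3, θ1 = 1, θ2 = 1

predict y = 1 if -3 + x1 + x2 >= 0, i.e. x1 + x2 >= 3; the decision boundary is the line x1 + x2 = 3, shown in Figure 4:

Figure 4

(from Coursera Week 3, Decision Boundary)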
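With θ = [-3, 1, 1] from the example above, the prediction rule reduces to checking the sign of θ^T x. A small sketch (the function name `predict` and the test points are chosen here for illustration):

```python
def predict(theta, x1, x2):
    """Predict y = 1 when theta0 + theta1 * x1 + theta2 * x2 >= 0."""
    z = theta[0] + theta[1] * x1 + theta[2] * x2
    return 1 if z >= 0 else 0


theta = [-3.0, 1.0, 1.0]

print(predict(theta, 1.0, 1.0))  # x1 + x2 = 2 < 3, so y = 0
print(predict(theta, 2.0, 2.0))  # x1 + x2 = 4 >= 3, so y = 1
print(predict(theta, 1.0, 2.0))  # exactly on the boundary, counted as y = 1
```

Note that the sigmoid never has to be evaluated for the prediction itself, because g(z) >= 0.5 exactly when z >= 0.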

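Non-linear decision boundary:

Borrowing the idea of polynomial regression, higher-order terms can be added to h(x):

suppose: h(x) = g(θ0 + θ1 x1 + θ2 x2 + θ3 x1^2 + θ4 x2^2)

θ0 = -1, θ1 = 0, θ2 = 0, θ3 = 1, θ4 = 1

predict y = 1 if -1 + x1^2 + x2^2 >= 0, i.e. x1^2 + x2^2 >= 1; the decision boundary is the unit circle, shown in Figure 5:

Figure 5

(from Coursera Week 3, Decision Boundary)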
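The circular boundary comes entirely from the extra features x1^2 and x2^2; the classifier is still linear in θ. A sketch under that reading, where `map_features` is a hypothetical helper name introduced only for this example:

```python
def map_features(x1, x2):
    """Build the feature vector [1, x1, x2, x1^2, x2^2]."""
    return [1.0, x1, x2, x1 ** 2, x2 ** 2]


def predict(theta, x1, x2):
    """Predict y = 1 when theta^T (mapped features) >= 0."""
    z = sum(t * f for t, f in zip(theta, map_features(x1, x2)))
    return 1 if z >= 0 else 0


theta = [-1.0, 0.0, 0.0, 1.0, 1.0]

inside = predict(theta, 0.5, 0.5)    # 0.25 + 0.25 < 1, so y = 0
outside = predict(theta, 2.0, 0.0)   # 4 + 0 >= 1, so y = 1
on_circle = predict(theta, 1.0, 0.0) # exactly on the circle, so y = 1
print(inside, outside, on_circle)
```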

2. Logistic Regression Model

2.1 Cost function

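Training set: {(x^(1), y^(1)), (x^(2), y^(2)), ..., (x^(m), y^(m))}, m examples

x = [x0; x1; ...; xn], x0 = 1, y ∈ {0, 1}

h(x) = 1 / (1 + e^(-θ^T x))

How to choose θ?

Cost function:

Because h(x) is the sigmoid function, if we reuse the squared-error cost from linear regression,

J(θ) = (1/m) Σ_{i=1..m} (1/2) (h(x^(i)) - y^(i))^2

then J(θ) is non-convex, and gradient descent is not guaranteed to find the global minimum.

Figure 6

(from Coursera Week 3, Cost Function)

Logistic regression cost function:

Cost(h(x), y) = -log(h(x)) if y = 1

Cost(h(x), y) = -log(1 - h(x)) if y = 0

Figure 7, Figure 8

(from Coursera Week 3, Cost Function)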
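The per-example logistic regression cost is -log(h(x)) when y = 1 and -log(1 - h(x)) when y = 0. The sketch below evaluates it for a few made-up values of h(x) to show how a confident wrong prediction is penalized far more than a confident right one; it assumes 0 < h(x) < 1 so that the logarithm is defined.

```python
import math


def cost(h, y):
    """Per-example cost: -log(h) if y = 1, -log(1 - h) if y = 0."""
    if y == 1:
        return -math.log(h)
    return -math.log(1.0 - h)


confident_right = cost(0.99, 1)   # small cost, close to 0
confident_wrong = cost(0.01, 1)   # large cost
undecided = cost(0.5, 1)          # log(2), identical for y = 0

print(confident_right, confident_wrong, undecided)
```

As h(x) approaches 0 while y = 1, the cost grows without bound, which is the behavior sketched in the two cost curves.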

2.2 Simplified cost function and gradient descent

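LR cost function: J(θ) = (1/m) Σ_{i=1..m} Cost(h(x^(i)), y^(i))

simplified: Cost(h(x), y) = -y log(h(x)) - (1 - y) log(1 - h(x)), so

J(θ) = -(1/m) Σ_{i=1..m} [ y^(i) log(h(x^(i))) + (1 - y^(i)) log(1 - h(x^(i))) ]

To fit parameters θ: min_θ J(θ)

Gradient descent: repeat { θj := θj - α ∂J(θ)/∂θj }

(simultaneously update all θj)

i.e.: θj := θj - α (1/m) Σ_{i=1..m} (h(x^(i)) - y^(i)) xj^(i)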
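The update θj := θj - α (1/m) Σ (h(x^(i)) - y^(i)) xj^(i) can be sketched in plain Python as follows. The tiny data set, the learning rate 0.1, and the iteration count are assumptions made for this illustration, not values from the course.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def h(theta, x):
    """Hypothesis h(x) = g(theta^T x)."""
    return sigmoid(sum(t * xj for t, xj in zip(theta, x)))


def cost(theta, X, y):
    """J(theta) = -(1/m) * sum(y log h + (1 - y) log(1 - h))."""
    m = len(X)
    total = 0.0
    for xi, yi in zip(X, y):
        hi = h(theta, xi)
        total += -yi * math.log(hi) - (1 - yi) * math.log(1.0 - hi)
    return total / m


def gradient_step(theta, X, y, alpha):
    """One simultaneous update of every theta_j."""
    m = len(X)
    # Errors are computed once from the old theta, so all theta_j
    # are updated simultaneously.
    errors = [h(theta, xi) - yi for xi, yi in zip(X, y)]
    new_theta = []
    for j in range(len(theta)):
        grad_j = sum(e * xi[j] for e, xi in zip(errors, X)) / m
        new_theta.append(theta[j] - alpha * grad_j)
    return new_theta


# Hypothetical one-feature data; each row is [x0, x1] with x0 = 1.
X = [[1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0], [1.0, 5.0], [1.0, 6.0]]
y = [0, 0, 1, 0, 1, 1]

theta = [0.0, 0.0]
cost_start = cost(theta, X, y)
for _ in range(1000):
    theta = gradient_step(theta, X, y, alpha=0.1)
cost_end = cost(theta, X, y)

print(cost_start, cost_end, theta)
```

The update has the same form as in linear regression; only the definition of h(x) differs.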

2.3 Advanced optimization

Cost function J(θ); we want min_θ J(θ).

Given θ, we have code that can compute:

- J(θ)

- the partial derivatives ∂J(θ)/∂θj (for j = 0, 1, ..., n)

Advanced optimization algorithms:

- Conjugate gradient

- BFGS

- L-BFGS

Advantages:

- No need to manually pick α

- Often faster than gradient descent

Disadvantages:

- More complex

Code:

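% costFunction.m: returns the cost jVal and its gradient grad at theta
function [jVal, grad] = costFunction(theta)
	jVal = (theta(1) - 5)^2 + (theta(2) - 5)^2;  % J(theta), minimized at theta = [5; 5]
	grad = zeros(2, 1);
	grad(1) = 2 * (theta(1) - 5);                % partial derivative w.r.t. theta(1)
	grad(2) = 2 * (theta(2) - 5);                % partial derivative w.r.t. theta(2)
end

% 'GradObj', 'on': costFunction supplies the gradient; run at most 100 iterations
options = optimset('GradObj', 'on', 'MaxIter', 100);
initialTheta = zeros(2, 1);
[optTheta, functionVal, exitFlag] = fminunc(@costFunction, initialTheta, options);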
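For readers without Octave, the same cost function can be ported to Python. The sketch below is not an equivalent of fminunc: it swaps in plain gradient descent with a hand-picked α = 0.1 as a stand-in, only to confirm that a routine returning both the cost and the gradient is enough to reach the minimum at θ = [5, 5].

```python
def cost_function(theta):
    """Return (j_val, grad) for J(theta) = (theta1 - 5)^2 + (theta2 - 5)^2."""
    j_val = (theta[0] - 5) ** 2 + (theta[1] - 5) ** 2
    grad = [2 * (theta[0] - 5), 2 * (theta[1] - 5)]
    return j_val, grad


theta = [0.0, 0.0]   # same starting point as initialTheta
alpha = 0.1          # hand-picked step size (an assumption of this sketch)

for _ in range(100):  # mirrors 'MaxIter', 100
    _, grad = cost_function(theta)
    theta = [t - alpha * g for t, g in zip(theta, grad)]

j_val, _ = cost_function(theta)
print(theta, j_val)
```

Here α had to be chosen by hand, which is exactly the step that the advanced optimizers listed above make unnecessary.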

3. Multiclass classification: one-vs-all

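One-vs-all (one-vs-rest): see Figure 9

Figure 9

(from Coursera Week 3, Multiclass Classification)

This yields the three classifiers in the figure above, one per class: h^(i)(x) = P(y = i | x; θ) for i = 1, 2, 3.

On a new input x, to make a prediction, pick the class i that maximizes h^(i)(x).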
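A minimal sketch of the one-vs-all prediction step, assuming the three classifiers have already been trained. The parameter vectors in `thetas` are invented for illustration, and classes are indexed from 0 here rather than from 1.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_one_vs_all(thetas, x):
    """Return (best_class, scores), where scores[i] = h_i(x) = g(theta_i^T x)."""
    scores = [sigmoid(sum(t * xj for t, xj in zip(theta, x))) for theta in thetas]
    best_class = max(range(len(scores)), key=lambda i: scores[i])
    return best_class, scores


# Hypothetical parameters of three binary classifiers (one per class).
thetas = [
    [-1.0, 1.0, -1.0],
    [0.0, -1.0, 1.0],
    [-2.0, 0.5, 0.5],
]
x = [1.0, 2.0, 0.5]   # x0 = 1, then two features

best_class, scores = predict_one_vs_all(thetas, x)
print(best_class, scores)
```

Each score is computed independently, so the three values need not sum to 1; only their ranking matters for the prediction.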

4. Solving the problem of overfitting

4.1 The problem of overfitting

E.g. linear regression (housing prices)

Figure 10

(from Coursera Week 3, The Problem of Overfitting)

In Figure 10, the first plot shows underfitting (high bias) and the third shows overfitting (high variance).

Addressing overfitting:

Options:

(1) Reduce the number of features

- Manually select which features to keep

- Model selection algorithm

(2) Regularization

- Keep all the features, but reduce the magnitude/values of the parameters θ. This works well when we have a lot of features, each of which contributes a bit to predicting y.

4.2 Cost function

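Intuition

Figure 11

(from Coursera Week 3, Cost Function)

Suppose we penalize θ3 and θ4 to make them really small:

min over θ of (1/2m) Σ_{i=1..m} (h(x^(i)) - y^(i))^2 + 1000 θ3^2 + 1000 θ4^2

Minimizing this pushes θ3 and θ4 as small as possible, so θ3 ≈ 0 and θ4 ≈ 0.

Regularization

Small values for parameters θ0, θ1, ..., θn

- "Simpler" hypothesis

- Less prone to overfitting

For linear regression:

Housing:

- Features: x1, x2, ..., x100

- Parameters: θ0, θ1, ..., θ100

J(θ) = (1/2m) [ Σ_{i=1..m} (h(x^(i)) - y^(i))^2 + λ Σ_{j=1..n} θj^2 ]

Note: by convention θ0 is not penalized (the sum starts at j = 1).

λ Σ_{j=1..n} θj^2 is called the regularization term, and λ is the regularization parameter.

Role of λ: it controls the trade-off between two goals: (1) fitting the training data, i.e. minimizing the loss; (2) keeping the parameters small.

Q: Why does an extremely large λ lead to underfitting?

Penalizing the parameters too heavily drives θj ≈ 0 for j = 1, ..., n, so h(x) ≈ θ0, a flat line that underfits the data.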
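The effect of a very large λ can be seen by evaluating the regularized linear regression cost J(θ) = (1/2m) [ Σ (h(x^(i)) - y^(i))^2 + λ Σ_{j>=1} θj^2 ] for two candidate parameter vectors. The data and the two candidates below are made up for illustration.

```python
def regularized_cost(theta, xs, ys, lam):
    """Regularized squared-error cost for h(x) = theta0 + theta1 * x."""
    m = len(xs)
    squared_error = sum((theta[0] + theta[1] * x - y) ** 2 for x, y in zip(xs, ys))
    penalty = lam * sum(t ** 2 for t in theta[1:])  # theta0 is not penalized
    return (squared_error + penalty) / (2 * m)


xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

good_fit = [0.0, 2.0]   # fits the data exactly
flat_line = [5.0, 0.0]  # ignores x and predicts the mean of ys

for lam in (0.0, 1000.0):
    print(lam,
          regularized_cost(good_fit, xs, ys, lam),
          regularized_cost(flat_line, xs, ys, lam))
```

With λ = 0 the exact fit has the lower cost, but with λ = 1000 the flat line wins, so minimizing J(θ) would pick the underfitting hypothesis.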

4.3 Regularized Linear Regression

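Gradient descent: repeat {

θ0 := θ0 - α (1/m) Σ_{i=1..m} (h(x^(i)) - y^(i)) x0^(i)

θj := θj - α [ (1/m) Σ_{i=1..m} (h(x^(i)) - y^(i)) xj^(i) + (λ/m) θj ]   (j = 1, 2, ..., n)

}

Equivalently, θj := θj (1 - α λ/m) - α (1/m) Σ_{i=1..m} (h(x^(i)) - y^(i)) xj^(i), so each step first shrinks θj slightly.

Normal equation: θ = (X^T X + λ M)^(-1) X^T y, where M is the (n+1) x (n+1) identity matrix with its top-left entry set to 0.

When m < n, X^T X is not invertible, but for λ > 0 the matrix X^T X + λ M above is invertible.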
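The invertibility claim can be checked on the smallest possible case. The sketch below assumes the regularized normal equation θ = (X^T X + λ M)^(-1) X^T y, with M the identity matrix whose top-left entry is 0, and uses a single made-up training example with one feature, so X has fewer rows than columns. The 2 x 2 system is solved by Cramer's rule to stay within the standard library.

```python
def det2(a):
    """Determinant of a 2 x 2 matrix given as nested lists."""
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]


X = [[1.0, 2.0]]   # one example: x0 = 1, x1 = 2
y = [3.0]
lam = 1.0

# X^T X and X^T y, written out with list comprehensions.
XtX = [[sum(row[i] * row[j] for row in X) for j in range(2)] for i in range(2)]
Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(2)]

M = [[0.0, 0.0], [0.0, 1.0]]  # identity with the theta0 entry zeroed
A = [[XtX[i][j] + lam * M[i][j] for j in range(2)] for i in range(2)]

print(det2(XtX))  # 0.0: X^T X is singular
print(det2(A))    # non-zero: the regularized matrix is invertible

# Solve A * theta = X^T y with Cramer's rule.
d = det2(A)
theta = [
    (Xty[0] * A[1][1] - A[0][1] * Xty[1]) / d,
    (A[0][0] * Xty[1] - Xty[0] * A[1][0]) / d,
]
print(theta)
```

The solution puts everything into the unpenalized θ0 and leaves θ1 at 0, which is the smallest-penalty way to fit the single example.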

4.4 Regularized Logistic Regression

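Cost function:

J(θ) = -(1/m) Σ_{i=1..m} [ y^(i) log(h(x^(i))) + (1 - y^(i)) log(1 - h(x^(i))) ] + (λ/2m) Σ_{j=1..n} θj^2

Gradient descent: repeat {

θ0 := θ0 - α (1/m) Σ_{i=1..m} (h(x^(i)) - y^(i)) x0^(i)

θj := θj - α [ (1/m) Σ_{i=1..m} (h(x^(i)) - y^(i)) xj^(i) + (λ/m) θj ]   (j = 1, 2, ..., n)

}

The update has the same form as for regularized linear regression, but here h(x) = 1 / (1 + e^(-θ^T x)).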
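A sketch of the regularized logistic regression cost and gradient, assuming the usual form in which (λ/2m) Σ_{j>=1} θj^2 is added to the cost and (λ/m) θj is added to every gradient component except j = 0. The data, θ, and λ values are invented for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def reg_cost_and_grad(theta, X, y, lam):
    """Return (cost, grad) for regularized logistic regression."""
    m = len(X)
    hs = [sigmoid(sum(t * xj for t, xj in zip(theta, xi))) for xi in X]

    cost = -sum(yi * math.log(hi) + (1 - yi) * math.log(1.0 - hi)
                for hi, yi in zip(hs, y)) / m
    cost += lam / (2 * m) * sum(t ** 2 for t in theta[1:])  # skip theta0

    grad = [sum((hi - yi) * xi[j] for hi, yi, xi in zip(hs, y, X)) / m
            for j in range(len(theta))]
    for j in range(1, len(theta)):                          # skip theta0
        grad[j] += lam / m * theta[j]
    return cost, grad


X = [[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [0, 1, 1]
theta = [0.5, -1.0]

cost_plain, grad_plain = reg_cost_and_grad(theta, X, y, lam=0.0)
cost_reg, grad_reg = reg_cost_and_grad(theta, X, y, lam=3.0)
print(cost_plain, grad_plain)
print(cost_reg, grad_reg)
```

Comparing the two calls shows that regularization leaves the θ0 component of the gradient untouched and shifts only the penalized components.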
