Linear Regression with multiple variables

Multiple features

Hypothesis:

$$h_\theta(x) = \theta^T x, \qquad \vec{\theta} = [\theta_0, \theta_1, \theta_2, \dots, \theta_n], \qquad \vec{x} = [1, x_1, x_2, \dots, x_n]$$
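For illustration, a minimal Octave sketch of evaluating this hypothesis on a single example; the parameter and feature values below are made up:

```octave
% Hypothetical parameter values and one feature vector (n = 3 features).
theta = [80; 0.1; 50; -20];   % [theta0; theta1; theta2; theta3]
x = [1; 2104; 3; 2];          % x0 = 1 (intercept), then the feature values
h = theta' * x                % h_theta(x) = theta^T x, a scalar prediction
```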


Gradient descent for multiple variables

Hypothesis:

$$h_\theta(x) = \theta^T x$$

Parameters: $\vec{\theta} = [\theta_0, \theta_1, \theta_2, \dots, \theta_n]$

Cost function:

$$J(\theta_0, \theta_1, \dots, \theta_n) = J(\vec{\theta}) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$$

Gradient descent update:

$$\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\vec{\theta}) = \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)} \qquad \text{(simultaneously update } \theta_j \text{ for } j = 0, 1, 2, \dots, n\text{)}$$
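A minimal Octave sketch of this update, assuming `X` is the m × (n+1) design matrix with a leading column of ones and `y` is the m × 1 target vector (the function name `gradientDescentMulti` is my own):

```octave
function [theta, J_history] = gradientDescentMulti(X, y, theta, alpha, num_iters)
  % Batch gradient descent; all theta_j are updated simultaneously
  % via the vectorized gradient (1/m) * X' * (X*theta - y).
  m = length(y);
  J_history = zeros(num_iters, 1);
  for iter = 1:num_iters
    theta = theta - (alpha / m) * (X' * (X * theta - y));
    % Record the cost so convergence can be inspected afterwards.
    J_history(iter) = (1 / (2 * m)) * sum((X * theta - y) .^ 2);
  end
end
```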


Practice I : Feature Scaling

Feature Scaling

E.g. $x_1 = \text{size} \ (0\text{–}2000\ \text{feet}^2) \implies x_1 = \dfrac{\text{size}(\text{feet}^2)}{2000}$

$x_2 = \text{number of bedrooms} \ (1\text{–}5) \implies x_2 = \dfrac{\text{number of bedrooms}}{5}$

Mean normalization

E.g. $x_1 = \dfrac{\text{size}(\text{feet}^2) - 1000}{2000}$
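A minimal Octave sketch of mean normalization, assuming `X` holds one feature per column (`featureNormalize` is my own helper name; the range max − min would also work as the denominator):

```octave
function [X_norm, mu, sigma] = featureNormalize(X)
  % Subtract the per-feature mean and divide by the per-feature
  % standard deviation.
  mu = mean(X);
  sigma = std(X);
  X_norm = (X - mu) ./ sigma;   % relies on Octave's automatic broadcasting
end
```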


Practice II : Learning rate

J(θ) should decrease after every iteration.

[Figure: J(θ) plotted against the number of iterations for different learning rates]

If gradient descent is not working (J(θ) increases or oscillates), use a smaller α.

Summary

- If α is too small: slow convergence.

- If α is too large: J(θ) may not decrease on every iteration; it may not converge.
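One practical way to pick α, sketched below: run gradient descent for a range of candidate learning rates and plot J(θ) against the iteration number, reusing the `gradientDescentMulti` sketch above (the specific α values are just examples):

```octave
alphas = [0.001, 0.01, 0.1, 1];   % candidate learning rates, roughly 10x apart
num_iters = 400;
figure; hold on;
for k = 1:length(alphas)
  theta0 = zeros(columns(X), 1);
  [~, J_history] = gradientDescentMulti(X, y, theta0, alphas(k), num_iters);
  plot(1:num_iters, J_history);
end
hold off;
xlabel('Number of iterations');
ylabel('J(\theta)');
legend('\alpha = 0.001', '\alpha = 0.01', '\alpha = 0.1', '\alpha = 1');
```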


Features and polynomial regression

[Figure: polynomial regression]

Feature scaling matters here! If size ranges up to 2000, then size² ranges up to 4·10⁶ and size³ up to 8·10⁹.

[Figure: choice of features]
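A minimal Octave sketch of building cubic features from the house size and scaling them, reusing `featureNormalize` from above (the column layout of `data` is an assumption):

```octave
size_ft = data(:, 1);                            % assumed: first column is the house size
X_poly = [size_ft, size_ft .^ 2, size_ft .^ 3];  % features x, x^2, x^3
[X_poly, mu, sigma] = featureNormalize(X_poly);  % scaling is essential here
X_poly = [ones(rows(X_poly), 1), X_poly];        % prepend the intercept column x0 = 1
```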


Normal equation

Normal equation: Method to solve for θ analytically.

Intuition:

$\theta \in \mathbb{R}^{n+1}$, $\qquad J(\theta_0, \theta_1, \dots, \theta_n) = \dfrac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$

Set $\dfrac{\partial}{\partial \theta_j} J(\theta) = 0$ (for every $j$).

Solve for $\theta_0, \theta_1, \theta_2, \dots, \theta_n$.

Note: feature scaling is not needed with the normal equation!

$$\vec{\theta} = (\theta_0, \theta_1, \dots, \theta_n)^T, \qquad X = \begin{bmatrix} x^{(1)T} \\ x^{(2)T} \\ \vdots \\ x^{(m)T} \end{bmatrix}, \qquad x^{(i)T} = \left( x_0^{(i)}, x_1^{(i)}, \dots, x_n^{(i)} \right), \qquad y^T = \left( y^{(1)}, y^{(2)}, \dots, y^{(m)} \right)$$

$$\theta = (X^T X)^{-1} X^T y$$

Octave: `pinv(x'*x)*x'*y`
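An end-to-end sketch, assuming a hypothetical file `housing.txt` whose rows are [x1, …, xn, y]:

```octave
data = load('housing.txt');                   % hypothetical file name
X = [ones(rows(data), 1), data(:, 1:end-1)];  % design matrix with intercept column
y = data(:, end);
theta = pinv(X' * X) * X' * y                 % normal equation; no scaling needed
```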

m training examples, n features

| Gradient Descent | Normal Equation |
| --- | --- |
| Need to choose α | No need to choose α |
| Needs many iterations | Doesn't need to iterate |
| Works well even when n is large | Needs to compute $(X^T X)^{-1}$; slow if n is very large ($O(n^3)$) |

Tips

We have the following definitions:

For a function $f: \mathbb{R}^{m \times n} \to \mathbb{R}$ that maps an $m \times n$ matrix $A$ to a real number, define the derivative of $f$ with respect to $A$ as:

$$\nabla_A f(A) = \begin{bmatrix} \frac{\partial f}{\partial A_{11}} & \cdots & \frac{\partial f}{\partial A_{1n}} \\ \vdots & \ddots & \vdots \\ \frac{\partial f}{\partial A_{m1}} & \cdots & \frac{\partial f}{\partial A_{mn}} \end{bmatrix}$$

Introduce the trace of a matrix, $\operatorname{tr}$: for an $n \times n$ square matrix $A$, $\operatorname{tr}(A) = \sum_{i=1}^{n} A_{ii}$.

The following properties are easy to verify:

$$\begin{aligned}
\operatorname{tr}(AB) &= \operatorname{tr}(BA) \\
\operatorname{tr}(A) &= \operatorname{tr}(A^T) \\
\operatorname{tr}(A + B) &= \operatorname{tr}(A) + \operatorname{tr}(B) \\
\operatorname{tr}(\alpha A) &= \alpha \operatorname{tr}(A) \\
\nabla_A \operatorname{tr}(AB) &= B^T \\
\nabla_{A^T} f(A) &= \left( \nabla_A f(A) \right)^T \\
\nabla_A \operatorname{tr}(A B A^T C) &= CAB + C^T A B^T \\
\nabla_A |A| &= |A| (A^{-1})^T
\end{aligned}$$
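A quick numeric sanity check of the first two properties in Octave (random matrices; the tolerance is chosen arbitrarily):

```octave
A = rand(3, 4);
B = rand(4, 3);
assert(abs(trace(A * B) - trace(B * A)) < 1e-10)   % tr(AB) = tr(BA)
C = rand(4, 4);
assert(abs(trace(C) - trace(C')) < 1e-10)          % tr(A) = tr(A^T)
```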

For a vector $z$, we have $z^T z = \sum_{i=1}^{n} z_i^2$, so:

$$\frac{1}{2} (X\theta - \vec{y})^T (X\theta - \vec{y}) = \frac{1}{2} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2 = J(\theta)$$

To minimize $J(\theta)$, take its derivative with respect to $\theta$ and set it to zero:

$$\begin{aligned}
\nabla_\theta J(\theta) &= \nabla_\theta \frac{1}{2} (X\theta - \vec{y})^T (X\theta - \vec{y}) \\
&= \frac{1}{2} \nabla_\theta \left( \theta^T X^T X \theta - \theta^T X^T \vec{y} - \vec{y}^T X \theta + \vec{y}^T \vec{y} \right) \\
&= \frac{1}{2} \nabla_\theta \operatorname{tr} \left( \theta^T X^T X \theta - \theta^T X^T \vec{y} - \vec{y}^T X \theta + \vec{y}^T \vec{y} \right) && (\operatorname{tr}(\alpha) = \alpha \text{ for scalars}) \\
&= \frac{1}{2} \nabla_\theta \left( \operatorname{tr}(\theta^T X^T X \theta) - 2 \operatorname{tr}(\vec{y}^T X \theta) \right) && (\operatorname{tr}(A) = \operatorname{tr}(A^T)) \\
&= \frac{1}{2} \left( X^T X \theta + X^T X \theta - 2 X^T \vec{y} \right) && (\nabla_A \operatorname{tr}(AB) = B^T) \\
&= X^T X \theta - X^T \vec{y} = 0
\end{aligned}$$

This yields the normal equation:

$$X^T X \theta = X^T \vec{y} \quad \Longrightarrow \quad \theta = (X^T X)^{-1} X^T \vec{y}$$

Octave's `pinv` still yields a solution for $\theta$ even when $X^T X$ is non-invertible.
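A small demonstration: when one feature is a duplicate of another, $X^T X$ is singular, yet `pinv` still returns a (minimum-norm) solution:

```octave
X = [ones(5, 1), (1:5)', 2 * (1:5)'];   % third column = 2 * second, so X'X is singular
y = [2; 4; 6; 8; 10];
theta = pinv(X' * X) * X' * y           % pinv handles the singular X'X gracefully
```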
