The cost function $J(\theta)$ used for logistic regression, the cross-entropy (log) cost, is guaranteed to be convex, so gradient descent converges to the global minimum.
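For reference, $J(\theta)$ here is the standard cross-entropy cost:

$$J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log h_\theta(x^{(i)}) + \left(1 - y^{(i)}\right)\log\left(1 - h_\theta(x^{(i)})\right)\right]$$

Each summand is convex in $\theta$, so the whole cost is convex; the squared-error cost, by contrast, would not be convex with the sigmoid hypothesis.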
Adding polynomial features, e.g., using $h_\theta(x) = g(\theta_0 + \theta_1 x_1 + \theta_2 x_2 + \theta_3 x_1^2 + \theta_4 x_1 x_2 + \theta_5 x_2^2)$ instead, could increase how well we can fit the training data.
Adding new features can only improve the fit on the training set: since setting $\theta_3 = \theta_4 = \theta_5 = 0$ makes the hypothesis the same as the original one, gradient descent will use those features (by making the corresponding $\theta_j$ non-zero) only if doing so improves the training set fit.
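As a sketch of what such a feature mapping looks like in NumPy (the function name `map_quadratic_features` is hypothetical, introduced here only for illustration):

```python
import numpy as np

def map_quadratic_features(X):
    """Map two raw features [x1, x2] to the quadratic terms
    [1, x1, x2, x1^2, x1*x2, x2^2] used by the hypothesis above.
    X is an (m, 2) array; a column of ones for theta_0 is prepended."""
    x1, x2 = X[:, 0], X[:, 1]
    ones = np.ones(X.shape[0])
    return np.column_stack([ones, x1, x2, x1**2, x1 * x2, x2**2])

# Example: three training examples with two raw features each.
X = np.array([[1.0, 2.0],
              [0.5, -1.0],
              [3.0, 0.0]])
X_poly = map_quadratic_features(X)  # shape (3, 6), matching theta_0..theta_5
```

If gradient descent leaves the parameters on the three new columns at zero, the fit is unchanged; any non-zero values are chosen only because they lower the training cost.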
For logistic regression, the gradient is given by $\frac{\partial}{\partial \theta_j} J(\theta) = \frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_j^{(i)}$. The corresponding vectorized gradient descent update, with learning rate $\alpha$, is

$$\theta := \theta - \alpha\,\frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x^{(i)}.$$

Note that regularized logistic regression and regularized linear regression both have convex cost functions, and thus gradient descent will still converge to the global minimum.
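As a minimal NumPy sketch of this vectorized update (the array shapes and function names here are illustrative assumptions, not part of the quiz):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gradient_descent_step(theta, X, y, alpha):
    """One vectorized update: theta := theta - (alpha/m) * X^T (h - y).

    X: (m, n+1) design matrix (first column all ones),
    y: (m,) labels in {0, 1}, theta: (n+1,) parameters.
    """
    m = X.shape[0]
    h = sigmoid(X @ theta)        # h_theta(x^(i)) for all m examples at once
    grad = (X.T @ (h - y)) / m    # full gradient vector, shape (n+1,)
    return theta - alpha * grad
```

The single matrix product `X.T @ (h - y)` computes every sum $\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_j^{(i)}$ at once, which is why the vectorized form is preferred over a loop over $j$.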
2. If a neural network is overfitting the data, one solution would be to increase the regularization parameter $\lambda$. A larger value of $\lambda$ will shrink the magnitude of the parameters $\Theta$, thereby reducing the chance of overfitting the data.
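Concretely, in the regularized neural network cost the term that $\lambda$ scales is the sum of squares of all non-bias weights; for a network with $L$ layers and $s_l$ units in layer $l$:

$$\frac{\lambda}{2m}\sum_{l=1}^{L-1}\sum_{i=1}^{s_l}\sum_{j=1}^{s_{l+1}}\left(\Theta_{j,i}^{(l)}\right)^2$$

Increasing $\lambda$ makes large weights more expensive, so minimizing $J(\Theta)$ pushes them toward zero and the resulting hypothesis becomes smoother.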