Lecture 4: Scalable logistic regression

1. Probabilistic classifier

A logistic regression model is an example of a probabilistic classifier: it predicts the probability of each class given the input x, rather than only a class label.

2. Logistic regression

We model the relationship between the label y ∈ {0, 1} and the input x using a Bernoulli distribution with parameter u:
P(Y = y) = Ber(y | u) = u^y * (1-u)^(1-y)

In logistic regression, the probability u(x) is given by the logistic sigmoid function applied to a linear function of x, where w is the weight vector:

u(x) = 1/(1+exp(-w^T x))
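
As a small illustration of the two formulas above, the following sketch evaluates u(x) and the Bernoulli probability in plain Python. The weights and the feature vector are made-up values chosen only for the example, and the helper names (sigmoid, predict_proba, bernoulli_pmf) are hypothetical.

```python
import math

def sigmoid(z):
    """Logistic sigmoid, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict_proba(w, x):
    """u(x) = sigmoid(w . x) for weight and feature lists of equal length."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)))

def bernoulli_pmf(y, u):
    """Ber(y | u) = u^y * (1 - u)^(1 - y) for y in {0, 1}."""
    return (u ** y) * ((1.0 - u) ** (1 - y))

w = [0.5, -1.0]   # hypothetical weights
x = [2.0, 1.0]    # hypothetical feature vector, chosen so that w . x = 0
u = predict_proba(w, x)
print(u, bernoulli_pmf(1, u), bernoulli_pmf(0, u))
```

When w . x = 0 the model is undecided, so both classes receive probability 0.5.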

3. Optimization routines

  • Gradient (steepest) descent: the main issue in gradient descent is how to set the step size.
  • Conjugate gradient
  • Newton's method: a faster optimisation algorithm that takes the curvature of the space into account through the Hessian. The Newton direction is a descent direction when the Hessian is positive definite, which is the setting this method assumes (a strictly convex function).
  • Levenberg-Marquardt algorithm: used when the function is not strictly convex; it compromises between the Newton direction and the steepest-descent direction.
  • Quasi-Newton methods: address the problem that the Newton direction needs the Hessian, whose computation is a cumbersome, error-prone, and expensive process, by approximating it instead.
  • Limited-memory BFGS (L-BFGS): ignores older information and preserves only the most recent pairs of data (changes in the weights and in the gradient).
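
To make the difference between the first-order and second-order routines above concrete, here is a minimal sketch that fits a one-weight logistic regression with plain gradient descent and with Newton's method. The data set, the step size, and the iteration counts are assumptions made for the example; the data are deliberately not separable so that the optimum is finite.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical 1-D data set (not linearly separable).
X = [-2.0, -1.0, 1.0, 2.0, 1.0, -1.0]
Y = [0, 0, 1, 1, 0, 1]

def gradient(w):
    """Derivative of the negative log-likelihood with respect to w."""
    return sum((sigmoid(w * x) - y) * x for x, y in zip(X, Y))

def hessian(w):
    """Second derivative; positive here, so the objective is strictly convex."""
    return sum(sigmoid(w * x) * (1.0 - sigmoid(w * x)) * x * x for x in X)

def gradient_descent(w, step, iters):
    # The step size has to be chosen by hand.
    for _ in range(iters):
        w -= step * gradient(w)
    return w

def newton(w, iters):
    # The curvature rescales the gradient, so no step size is needed.
    for _ in range(iters):
        w -= gradient(w) / hessian(w)
    return w

w_gd = gradient_descent(0.0, step=0.1, iters=200)
w_newton = newton(0.0, iters=10)
print(w_gd, w_newton)
```

Both routines reach the same weight, but Newton's method gets there in far fewer iterations, at the price of evaluating the second derivative at every step.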

4. How to compute the gradient and the Hessian

Batch gradient descent: compute the gradient over the full data set, distributing the calculations over multiple workers (cores).
Stochastic gradient descent (SGD), here in its mini-batch form: use a subset of the data to compute the gradient.
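
The two options can be contrasted in a few lines. This is only a sketch: the data, the batch size, and the function names are hypothetical, and the gradient is averaged over the samples so that both versions are on the same scale.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sample_gradient(w, x, y):
    """Gradient of the negative log-likelihood for one sample (x, y)."""
    err = sigmoid(sum(wi * xi for wi, xi in zip(w, x))) - y
    return [err * xj for xj in x]

def mean_gradient(w, samples):
    grad = [0.0] * len(w)
    for x, y in samples:
        for j, g in enumerate(sample_gradient(w, x, y)):
            grad[j] += g
    return [g / len(samples) for g in grad]

def batch_gradient(w, data):
    """Uses every sample in the data set."""
    return mean_gradient(w, data)

def minibatch_gradient(w, data, batch_size, rng):
    """Uses a random subset of the data, drawn without replacement."""
    return mean_gradient(w, rng.sample(data, batch_size))

# Hypothetical data: (features, label) pairs, first feature is a bias term.
data = [([1.0, 2.0], 1), ([1.0, -1.0], 0), ([1.0, 0.5], 1), ([1.0, -2.0], 0)]
w = [0.1, -0.2]
rng = random.Random(0)
print(batch_gradient(w, data))
print(minibatch_gradient(w, data, 2, rng))
```

The mini-batch gradient is a noisy estimate of the batch gradient; it is cheaper per step, which is why the choice of step size matters more for SGD.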

The step size in SGD should follow the Robbins-Monro conditions: the step sizes must sum to infinity while their squares sum to a finite value.
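
One schedule that satisfies the Robbins-Monro conditions is eta_t = eta0 / (t + 1). The sketch below uses it inside a single-sample SGD loop; the schedule, the initial step eta0, the seed, and the data are all choices made for illustration, not something prescribed by the lecture.

```python
import math
import random

def step_size(t, eta0=1.0):
    """eta_t = eta0 / (t + 1): the sum diverges, the sum of squares converges."""
    return eta0 / (t + 1)

def sgd(data, iters, eta0=0.5, seed=0):
    """Single-sample SGD for a one-weight logistic regression."""
    rng = random.Random(seed)
    w = 0.0
    for t in range(iters):
        x, y = rng.choice(data)                       # draw one sample
        grad = (1.0 / (1.0 + math.exp(-w * x)) - y) * x
        w -= step_size(t, eta0) * grad                # decaying step
    return w

# Hypothetical 1-D data set (not linearly separable).
data = [(-2.0, 0), (-1.0, 0), (1.0, 1), (2.0, 1), (1.0, 0), (-1.0, 1)]
print(sgd(data, iters=1000))
```

A constant step size would violate the second condition, because the sum of its squares grows without bound.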

5. Regularisation

regParam: the regularisation parameter; if it is 0, no regularisation is applied.
elasticNetParam: the mixing weight between l1 and l2 regularisation (0 gives pure l2, 1 gives pure l1).
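
How the two parameters interact can be seen by computing the penalty by hand. The sketch below assumes the elastic-net form given in the Spark MLlib documentation, regParam * (elasticNetParam * ||w||_1 + (1 - elasticNetParam) / 2 * ||w||_2^2); the function name and the weight values are hypothetical.

```python
def elastic_net_penalty(w, reg_param, elastic_net_param):
    """Penalty added to the loss for the weight list w.

    reg_param plays the role of regParam and elastic_net_param the role of
    elasticNetParam (assumed to lie in [0, 1]).
    """
    l1 = sum(abs(wi) for wi in w)
    l2_squared = sum(wi * wi for wi in w)
    return reg_param * (elastic_net_param * l1
                        + (1.0 - elastic_net_param) / 2.0 * l2_squared)

w = [1.0, -2.0]   # hypothetical weights
print(elastic_net_penalty(w, 0.1, 0.0))   # pure l2
print(elastic_net_penalty(w, 0.1, 1.0))   # pure l1
print(elastic_net_penalty(w, 0.1, 0.5))   # equal mix
print(elastic_net_penalty(w, 0.0, 0.5))   # no regularisation
```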

6. Logistic regression in PySpark

Driver/controller: initialise the weights and broadcast them to the executors.
Workers/executors: compute the loss and gradient for each sample and sum them locally.
Driver/controller: reduce over the executors to get the total sum of losses and gradients, handle regularisation, and use L-BFGS/OWLQN to update the weights.
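
The division of work described above can be imitated without a cluster. The following sketch is not Spark code: plain Python lists stand in for the data partitions, a function call stands in for the broadcast, and functools.reduce stands in for the reduce step. All names and data values are hypothetical.

```python
import math
from functools import reduce

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def executor_step(w, partition):
    """Runs on one worker: sum the loss and gradient over its local samples."""
    loss_sum = 0.0
    grad_sum = [0.0] * len(w)
    for x, y in partition:
        z = sum(wi * xi for wi, xi in zip(w, x))
        loss_sum += math.log1p(math.exp(z)) - y * z   # negative log-likelihood
        err = sigmoid(z) - y
        for j, xj in enumerate(x):
            grad_sum[j] += err * xj
    return loss_sum, grad_sum

def combine(a, b):
    """Add two (loss, gradient) partial sums."""
    return a[0] + b[0], [ga + gb for ga, gb in zip(a[1], b[1])]

def driver_step(w, partitions):
    """Runs on the driver: send w to every executor, then reduce the results."""
    partials = [executor_step(w, p) for p in partitions]
    return reduce(combine, partials)

# Hypothetical data split into two partitions of (features, label) pairs.
partitions = [
    [([1.0, 2.0], 1), ([1.0, -1.0], 0)],
    [([1.0, 0.5], 1), ([1.0, -2.0], 0)],
]
w0 = [0.0, 0.0]   # initial weights
total_loss, total_grad = driver_step(w0, partitions)
print(total_loss, total_grad)
```

Because the loss and the gradient are sums over samples, the result does not depend on how the data are partitioned; only the small (loss, gradient) pairs travel back to the driver, which then applies regularisation and the weight update.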

L-BFGS is used as the solver for LogisticRegression() with l2 regularisation.
OWLQN is used as the solver when l1 or elastic-net regularisation is used.
