Andrew Ng Machine Learning Course Notes -- Bayesian Statistics and Regularization

Bayesian statistics: it turns out that if you add this prior term, the optimization objective you end up optimizing has an extra term that penalizes your parameter theta for being large. This algorithm tends to keep your parameters small, and shrinking the parameters has the effect of keeping the functions you fit smoother and less likely to overfit. For example, plain logistic regression would be very prone to overfitting, but with this sort of Bayesian regularization, with a Gaussian prior, logistic regression becomes a very effective text classification algorithm.
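A minimal sketch of that idea for logistic regression: a Gaussian prior on theta turns the maximum-likelihood objective into a MAP objective with an L2 penalty, roughly sum_i log p(y_i | x_i; theta) - lam * ||theta||^2. The function names, the plain gradient-ascent loop, and the scaling of lam below are illustrative assumptions, not taken from the lecture.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def map_logistic_regression(X, y, lam=1.0, lr=0.1, iters=1000):
    # MAP estimate of theta under a Gaussian prior: maximize
    #   sum_i log p(y_i | x_i; theta) - lam * ||theta||^2
    # The extra -lam * ||theta||^2 term penalizes large theta, which keeps
    # the fitted function smoother and less likely to overfit.
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(iters):
        h = sigmoid(X @ theta)                         # predicted probabilities
        grad = X.T @ (y - h) / m - 2.0 * lam * theta   # gradient of the penalized objective
        theta += lr * grad                             # plain gradient ascent
    return theta
```

With lam = 0 this reduces to ordinary maximum-likelihood logistic regression; increasing lam shrinks theta toward zero.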

Regularization:

Online learning: the setting in which you have to make predictions even while you are still in the process of learning.
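As a concrete illustration (not from the notes), here is a minimal online-learning loop using a perceptron-style update: each incoming example is predicted on before its label is used, and the parameters are updated immediately afterward. The streaming interface and the function name are assumptions.

```python
import numpy as np

def online_perceptron(stream, lr=1.0):
    # `stream` is assumed to yield (x, y) pairs one at a time,
    # with x a numpy feature vector and y a label in {-1, +1}.
    theta = None
    mistakes = 0
    for x, y in stream:
        if theta is None:
            theta = np.zeros_like(x, dtype=float)
        pred = 1 if x @ theta >= 0 else -1   # predict before seeing the label
        if pred != y:
            mistakes += 1                    # count errors made while still learning
            theta += lr * y * x              # then update on this example
    return theta, mistakes
```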

Advice for applying machine learning:

Diagnostics for debugging learning algorithms:

Error analysis and ablative analysis (briefly):

Advice for how to get started on a machine learning problem:

In case 1, let's say that J of theta_SVM is indeed greater than J of theta_BLR. But we know that Bayesian logistic regression was trying to maximize J of theta; that's the definition of Bayesian logistic regression. So this means that the value of theta output by Bayesian logistic regression actually fails to maximize J, because the support vector machine returned a value of theta that does a better job of maximizing J. This tells me that Bayesian logistic regression did not actually maximize J correctly, and so the problem is with the optimization algorithm: the optimization algorithm has not converged.

The other case is as follows: J of theta_BLR is greater than or equal to J of theta_SVM. This means that Bayesian logistic regression actually attains a higher value of the optimization objective J than the SVM does. The SVM, which does worse on your optimization objective, actually does better on the weighted accuracy measure. So something that does worse on your optimization objective J can actually do better on the weighted accuracy objective, and this really means that maximizing J of theta does not correspond that well to maximizing your weighted accuracy criterion. That tells you that J of theta is maybe the wrong optimization objective to be maximizing.
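A sketch of that diagnostic as code. The callable J and the variable names are assumptions, and it presumes you have already found that the SVM beats Bayesian logistic regression on the weighted accuracy measure.

```python
def optimization_vs_objective_diagnostic(J, theta_blr, theta_svm):
    # J(theta) is the objective that Bayesian logistic regression tries to maximize.
    # Precondition: the SVM already does better on the weighted-accuracy measure.
    if J(theta_svm) > J(theta_blr):
        # Case 1: BLR was supposed to maximize J, yet the SVM's theta scores higher,
        # so BLR's optimizer has not converged -- the problem is the optimization algorithm.
        return "fix the optimization algorithm"
    else:
        # Case 2: BLR attains the higher J but still loses on weighted accuracy,
        # so maximizing J does not track the accuracy criterion -- the problem is J itself.
        return "J(theta) is the wrong objective to be maximizing"
```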
