Andrew Ng Machine Learning Course Notes -- Empirical Risk Minimization

Bias-variance trade-off: the trade-off between a model that is too simple and one that is too flexible.

Underfit: high bias; the hypothesis is too simple to capture the structure in the data, so even the training error is large.

Overfit: high variance; the hypothesis fits the noise in the training set, so training error is small but generalization error is large.

Linear classification: in particular, logistic regression fits the parameters θ of the model by maximizing the log likelihood. But in order to understand learning algorithms more deeply, assume a simplified model of machine learning. Define the training error of a hypothesis h_θ, written ε̂(h_θ) (or ε̂_S(h_θ) when the dependence on the training set S should be explicit), as

ε̂(h_θ) = (1/m) Σ_{i=1..m} 1{h_θ(x^(i)) ≠ y^(i)}.

This is a sum of indicator functions for whether the hypothesis misclassifies the i-th example, so after dividing by m it is just the fraction of training examples the hypothesis misclassifies. Training error is also called the empirical risk. The simplified model of machine learning discussed here is called empirical risk minimization (ERM): the learning algorithm chooses the parameters θ that minimize the training error. Solving this exactly is a non-convex optimization problem. It turns out to be useful to think of the learning algorithm not as choosing a set of parameters but as choosing a function. Define the hypothesis class, script H, as the class of all hypotheses the learning algorithm can output; for a linear classifier, each h_θ in H is a function mapping from the input domain X to the labels {0, 1}. ERM can then be restated as choosing the function in the hypothesis class H that minimizes the training error.
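A minimal sketch of these definitions, assuming a toy 2D data set and a small finite set of candidate linear classifiers; the data, the candidate parameter vectors, and the helper names are illustrative assumptions, not part of the original notes:

```python
# Sketch: training error and empirical risk minimization over a finite
# set of candidate linear classifiers (illustrative, not from the lecture).
import numpy as np

rng = np.random.default_rng(0)

# Toy training set S = {(x^(i), y^(i))}, i = 1..m, labels in {0, 1}.
m = 200
X = rng.normal(size=(m, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

def h_theta(theta, X):
    """Linear classifier h_theta(x) = 1{theta^T x >= 0}."""
    return (X @ theta >= 0).astype(int)

def training_error(theta, X, y):
    """epsilon_hat(h_theta): fraction of training examples misclassified."""
    return np.mean(h_theta(theta, X) != y)

# A finite hypothesis class H = {h_theta_1, ..., h_theta_k}: here k random
# parameter vectors stand in for the classifiers the learner may choose from.
k = 50
candidate_thetas = rng.normal(size=(k, 2))

# Empirical risk minimization: pick the hypothesis with the lowest training error.
errors = np.array([training_error(theta, X, y) for theta in candidate_thetas])
h_hat = candidate_thetas[np.argmin(errors)]

print("lowest training error:", errors.min())
print("ERM hypothesis parameters:", h_hat)
```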

Generalization error: the generalization error ε(h) of a hypothesis h is the probability that h misclassifies a new example (x, y) drawn from the same distribution D that generated the training set, ε(h) = P_{(x,y)~D}[h(x) ≠ y].

Empirical risk minimization: let us say script H is a class of k hypotheses. Empirical risk minimization takes the hypothesis with the lowest training error, and what I would like to do is prove a bound on the generalization error of ĥ.
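In symbols, with h* (a standard definition added here for later reference) denoting the best hypothesis in the class by generalization error:

```latex
\hat{h} \;=\; \arg\min_{h \in \mathcal{H}} \hat{\varepsilon}(h),
\qquad
h^{*} \;=\; \arg\min_{h \in \mathcal{H}} \varepsilon(h)
```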

Show that training error approximates generalization error: the training set is drawn i.i.d. from some distribution, script D. For a fixed hypothesis h_j, define Z_i = 1{h_j(x^(i)) ≠ y^(i)}, so depending on which training examples were drawn, each Z_i is either zero or one. Let us figure out the probability distribution of Z_i. Z_i takes on the value zero or one, so what is the probability that Z_i equals one? In other words, for the fixed hypothesis h_j, when the training set is sampled i.i.d. from D, what is the chance that the hypothesis misclassifies the example? By definition, that is just the generalization error of the hypothesis h_j. So Z_i is a Bernoulli random variable with mean given by the generalization error of this hypothesis.
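The training error ε̂(h_j) is the average of these Bernoulli variables, so the Hoeffding inequality bounds how far it can stray from its mean ε(h_j), and a union bound over the k hypotheses gives uniform convergence. The standard steps from the lecture, reconstructed here:

```latex
\hat{\varepsilon}(h_j) \;=\; \frac{1}{m}\sum_{i=1}^{m} Z_i,
\qquad
P\left(\bigl|\varepsilon(h_j) - \hat{\varepsilon}(h_j)\bigr| > \gamma\right)
\;\le\; 2\exp(-2\gamma^{2} m)
\quad \text{(Hoeffding, one fixed } h_j\text{)}

P\left(\exists\, h \in \mathcal{H}:\ \bigl|\varepsilon(h) - \hat{\varepsilon}(h)\bigr| > \gamma\right)
\;\le\; 2k\exp(-2\gamma^{2} m)
\quad \text{(union bound over all } k \text{ hypotheses)}
```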

Give a bound on the generalization error of the hypothesis output by empirical risk minimization: assuming uniform convergence holds (every hypothesis in the class has training error within γ of its generalization error), the ERM hypothesis ĥ can do at most 2γ worse than the best hypothesis h* in the class.
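The chain of inequalities, reconstructed here from the standard argument:

```latex
\varepsilon(\hat{h})
\;\le\; \hat{\varepsilon}(\hat{h}) + \gamma   % uniform convergence applied to \hat{h}
\;\le\; \hat{\varepsilon}(h^{*}) + \gamma     % \hat{h} minimizes training error
\;\le\; \varepsilon(h^{*}) + 2\gamma          % uniform convergence applied to h^{*}
```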

Another form of the result: given γ and the probability δ of making a large error, how large a training set do you need in order to guarantee a uniform convergence bound with parameters γ and δ?
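Setting the failure probability 2k exp(-2γ²m) to at most δ and solving for m (a standard manipulation, reconstructed here) gives the required training set size:

```latex
2k\exp(-2\gamma^{2} m) \;\le\; \delta
\quad\Longleftrightarrow\quad
m \;\ge\; \frac{1}{2\gamma^{2}}\,\log\frac{2k}{\delta}
```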

Sample complexity bound: how large a training set you need in order to achieve a certain bound on the error.
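As a small illustration (my own sketch, not from the notes), the inequality above can be turned into a calculator for the required training set size; the values of γ, δ, and k below are arbitrary:

```python
# Sample complexity from the finite-hypothesis-class bound:
#   m >= (1 / (2 * gamma**2)) * log(2k / delta)
# Illustrative calculator only; gamma, delta, and k are made-up values.
import math

def sample_complexity(gamma: float, delta: float, k: int) -> int:
    """Training set size guaranteeing |eps(h) - eps_hat(h)| <= gamma
    simultaneously for all k hypotheses, with probability at least 1 - delta."""
    return math.ceil(math.log(2 * k / delta) / (2 * gamma ** 2))

print(sample_complexity(gamma=0.05, delta=0.05, k=1000))  # about 2120 examples
```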

Error bound: the result is essentially that uniform convergence holds with high probability, so holding m and δ fixed and solving for γ gives a bound on the generalization error of the ERM hypothesis.
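Combining the two previous results (fix m and δ, solve for γ), the error bound takes the following standard form, reconstructed here:

```latex
\text{With probability at least } 1-\delta:\qquad
\varepsilon(\hat{h})
\;\le\;
\Bigl(\min_{h \in \mathcal{H}} \varepsilon(h)\Bigr)
\;+\; 2\sqrt{\frac{1}{2m}\,\log\frac{2k}{\delta}}
```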

 
