- Blog (8)
Original: reinforcement learning
1. Model Free. 1.1 Monte Carlo. 1.1.1 Value Iteration. SARSA: 1. current Q -> ε-greedy policy; 2. sample trajectories (s1, a1, r1, s2, a2, r2, …), first-visit MC; 3. update $Q(s,a) = \frac{1}{N(s,a)}\sum_{i} G_i^t(s,a)$
2020-12-22 12:51:22
169
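The three SARSA steps in the preview can be sketched as a first-visit Monte Carlo update with an incremental mean (a minimal sketch; the helper name and the `(state, action, reward)` trajectory format are assumptions, not from the post):

```python
from collections import defaultdict

def first_visit_mc_update(Q, N, trajectory, gamma=0.9):
    """First-visit MC: average the return G over the first occurrence
    of each (s, a) pair, i.e. Q(s,a) = (1/N(s,a)) * sum_i G_i(s,a)."""
    # trajectory: list of (state, action, reward) tuples
    # Compute returns backward: G_t = r_t + gamma * G_{t+1}
    G = 0.0
    returns = []
    for (s, a, r) in reversed(trajectory):
        G = r + gamma * G
        returns.append((s, a, G))
    returns.reverse()
    seen = set()
    for (s, a, G) in returns:
        if (s, a) in seen:
            continue  # first-visit: ignore later occurrences of (s, a)
        seen.add((s, a))
        N[(s, a)] += 1
        # Incremental mean: equivalent to averaging all stored returns
        Q[(s, a)] += (G - Q[(s, a)]) / N[(s, a)]
    return Q, N
```

The incremental form `Q += (G - Q)/N` reproduces the average $\frac{1}{N}\sum_i G_i$ without storing past returns.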
Original: Hypothesis testing
1. P-value: the p-value measures the probability, under the null hypothesis, of outcomes more extreme than the one observed. 2. Power analysis: plot the probability of meeting the rejection criterion as the parameter varies over its domain. Null hypothesis domain
2020-10-28 05:16:35
183
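Both ideas can be sketched with an exact binomial test (a minimal sketch; the function names and the one-sided setup are assumptions):

```python
from math import comb

def p_value_binomial(k, n, p0=0.5):
    """One-sided p-value: probability under H0 of k or more successes
    in n trials, i.e. of outcomes at least as extreme as observed."""
    return sum(comb(n, i) * p0**i * (1 - p0)**(n - i) for i in range(k, n + 1))

def power(n, p_alt, alpha=0.05, p0=0.5):
    """Power: probability of rejecting H0 when the true rate is p_alt."""
    # Rejection region: smallest k* whose p-value under H0 is <= alpha
    k_star = next(k for k in range(n + 1) if p_value_binomial(k, n, p0) <= alpha)
    return sum(comb(n, i) * p_alt**i * (1 - p_alt)**(n - i)
               for i in range(k_star, n + 1))
```

Evaluating `power` over a grid of `p_alt` values traces the power curve the preview describes.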
Original: EM
1. Classical EM algorithm. The idea: we would like to learn a model that may contain latent/unobserved variables. So instead of maximizing the log-likelihood with the latent variables marginalized out, we maximize the expected log-likelihood
2020-07-08 11:37:32
133
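The E-step/M-step alternation can be sketched on a mixture of two biased coins (a minimal sketch; the two-coin model, data, and starting biases are illustrative assumptions, not the post's example):

```python
from math import comb

def em_two_coins(tosses, theta_a=0.6, theta_b=0.5, iters=20):
    """EM for a two-coin mixture: each (heads, total) row came from one
    of two coins with unknown biases, and the coin identity is latent."""
    for _ in range(iters):
        # E-step: posterior responsibility of coin A for each row
        ha = ta = hb = tb = 0.0
        for h, n in tosses:
            la = comb(n, h) * theta_a**h * (1 - theta_a) ** (n - h)
            lb = comb(n, h) * theta_b**h * (1 - theta_b) ** (n - h)
            ra = la / (la + lb)
            ha += ra * h; ta += ra * n            # weighted counts, coin A
            hb += (1 - ra) * h; tb += (1 - ra) * n  # weighted counts, coin B
        # M-step: re-estimate each bias from responsibility-weighted counts
        theta_a, theta_b = ha / ta, hb / tb
    return theta_a, theta_b
```

Each iteration maximizes the expected complete-data log-likelihood, which never decreases the marginal likelihood.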
Original: PGM
PGM. 1. Directed Graph (Bayesian Net). 1.1 Representation: $L(\mathbf{x}) = \prod_c P(\mathbf{x}_c \mid \mathrm{parent}(\mathbf{x}_c))$. 1.2 Inference: the simplest method is Variable Elimination. Sum-product message passing scheme: It
2020-07-08 01:15:06
180
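Variable elimination on the smallest possible Bayesian net, A → B, looks like this (a minimal sketch; the probability tables are made up):

```python
def eliminate_chain(p_a, p_b_given_a):
    """Variable elimination on the chain A -> B:
    marginalize A out of the factor product, P(b) = sum_a P(a) P(b|a)."""
    states_b = range(len(p_b_given_a[0]))
    return [sum(p_a[a] * p_b_given_a[a][b] for a in range(len(p_a)))
            for b in states_b]
```

On larger graphs the same idea repeats: multiply the factors mentioning the variable being eliminated, then sum it out.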
Original: Boosting
Boost. Gradient boosting. Idea: we can think of the prediction function itself as a parameter $f(x)$, so its update follows the gradient direction. For each iteration $t$: $f(x)_{t+1} = f(x)_t - \gamma \frac{\partial L(y, f(x))}{\partial f(x)}$
2020-06-10 07:12:41
118
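For squared loss the negative gradient of $L$ with respect to $f(x_i)$ is just the residual $y_i - f(x_i)$, so each round fits a weak learner to the current residuals. A minimal sketch with a one-split regression stump as the weak learner (the stump, learning rate, and round count are assumptions):

```python
def fit_stump(x, residuals):
    """Fit a one-split regression stump to the residuals (hypothetical weak learner)."""
    best = None
    for t in sorted(set(x))[:-1]:  # splitting at the max would empty the right side
        left = [r for xi, r in zip(x, residuals) if xi <= t]
        right = [r for xi, r in zip(x, residuals) if xi > t]
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - ml) ** 2 for r in left) + sum((r - mr) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, ml, mr)
    _, t, ml, mr = best
    return lambda xi: ml if xi <= t else mr

def predict(xi, f0, learners, lr=0.3):
    """Base prediction plus the shrunken corrections from each round."""
    return f0 + lr * sum(h(xi) for h in learners)

def gradient_boost(x, y, rounds=50, lr=0.3):
    """Each round takes a gradient step in function space by fitting
    a weak learner to the residuals (negative gradient of squared loss)."""
    f0 = sum(y) / len(y)
    learners = []
    for _ in range(rounds):
        residuals = [yi - predict(xi, f0, learners, lr)
                     for xi, yi in zip(x, y)]
        learners.append(fit_stump(x, residuals))
    return f0, learners
```

The learning rate $\gamma$ (here `lr`) shrinks each step, trading more rounds for better generalization.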
Original: Bagging
Bagging. Idea: we ensemble weak learners as our predictor; the predicted value is a weighted sum of the predictions from the weak learners. Introducing randomness in the dataset mitigates the effect of extreme outliers. Method: we sample
2020-06-10 02:17:01
108
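The bootstrap-and-combine loop can be sketched as follows (a minimal sketch; using the sample median as the weak learner is an illustrative assumption that makes the outlier-mitigation effect visible):

```python
import random

def median(xs):
    """Sample median (the hypothetical weak learner here)."""
    s = sorted(xs)
    n = len(s)
    return s[n // 2] if n % 2 else (s[n // 2 - 1] + s[n // 2]) / 2

def bagged_predict(data, base_learner=median, n_learners=200, seed=0):
    """Bagging: train each weak learner on a bootstrap resample of the
    dataset, then combine the predictions (equal weights here)."""
    rng = random.Random(seed)
    preds = []
    for _ in range(n_learners):
        sample = [rng.choice(data) for _ in data]  # sample with replacement
        preds.append(base_learner(sample))
    return sum(preds) / len(preds)
```

Because each resample rarely contains many copies of an extreme outlier, the bagged estimate stays far closer to the bulk of the data than the raw mean does.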
Original: SVM
SVM. Idea: if we can separate the dataset with two parallel hyperplanes, the best solution is the one that maximizes the margin between them. The objective: maximize the margin $\frac{2}{\|\mathbf{w}\|_2}$ subject to, for every $i$, $y_i(\mathbf{w}\cdot\mathbf{x}_i - b) \ge 1$
2020-06-09 22:02:37
67
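The constrained max-margin problem is normally solved as a QP; a minimal sketch instead minimizes the equivalent regularized hinge loss by subgradient descent (a standard substitute, not the post's method; the hyperparameters and data are assumptions):

```python
def train_linear_svm(X, y, lr=0.01, lam=0.01, epochs=200):
    """Soft-margin linear SVM via subgradient descent on
    lam * ||w||^2 + mean(max(0, 1 - y_i (w.x_i - b)))."""
    d = len(X[0])
    w = [0.0] * d
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) - b)
            if margin < 1:
                # hinge term active: step toward satisfying y(w.x - b) >= 1
                w = [wj - lr * (2 * lam * wj - yi * xj)
                     for wj, xj in zip(w, xi)]
                b = b - lr * yi
            else:
                # only the margin-maximizing regularizer shrinks w
                w = [wj - lr * 2 * lam * wj for wj in w]
    return w, b
```

Shrinking $\|\mathbf{w}\|_2$ via the regularizer is what widens the margin $\frac{2}{\|\mathbf{w}\|_2}$.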
Original: Linear Regression
Linear Regression. Idea: we assume $f(x) = \mathbf{w}\cdot\mathbf{x}$ and find the $\mathbf{w}$ that minimizes the MSE, $\sum_{i=1}^{n}\|y_i - \mathbf{w}\cdot\mathbf{x}_i\|^2$. Performance: we can compute the coefficient of determination to
2020-06-09 10:33:04
96
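The least-squares fit and the coefficient of determination can be sketched in one dimension (a minimal sketch; the intercept term and function names are assumptions, since the preview only shows $f(x)=\mathbf{w}\cdot\mathbf{x}$):

```python
def fit_least_squares(x, y):
    """Ordinary least squares for f(x) = w*x + c in one dimension,
    minimizing sum_i (y_i - f(x_i))^2 in closed form."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    w = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
    c = my - w * mx  # intercept from the centroid condition
    return w, c

def r_squared(x, y, w, c):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    my = sum(y) / len(y)
    ss_res = sum((yi - (w * xi + c)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1 - ss_res / ss_tot
```

$R^2 = 1$ means the fit explains all variance in $y$; $R^2 = 0$ means it does no better than predicting the mean.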