- Blog (8)
Original: reinforcement learning
1. Model Free. 1.1 Monte Carlo. 1.1.1 Value Iteration. SARSA: 1. current Q -> ε-greedy policy; 2. sample trajectories (s1, a1, r1, s2, a2, r2, …), first-visit MC; 3. update $Q(s,a) = \frac{1}{N(s,a)}\sum_{i} G_i^t(s,a)$
2020-12-22 12:51:22
169
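The three SARSA steps in the preview can be sketched as a first-visit Monte Carlo update with an incremental mean (a minimal sketch; the helper name and the `(state, action, reward)` trajectory format are assumptions, not from the post):

```python
from collections import defaultdict

def first_visit_mc_update(Q, N, trajectory, gamma=0.9):
    """First-visit MC: average the return G over the first occurrence
    of each (s, a) pair, i.e. Q(s,a) = (1/N(s,a)) * sum_i G_i(s,a)."""
    # trajectory: list of (state, action, reward) tuples
    # Compute returns backward: G_t = r_t + gamma * G_{t+1}
    G = 0.0
    returns = []
    for (s, a, r) in reversed(trajectory):
        G = r + gamma * G
        returns.append((s, a, G))
    returns.reverse()
    seen = set()
    for (s, a, G) in returns:
        if (s, a) in seen:
            continue  # first-visit: ignore later occurrences of (s, a)
        seen.add((s, a))
        N[(s, a)] += 1
        # Incremental mean: equivalent to averaging all stored returns
        Q[(s, a)] += (G - Q[(s, a)]) / N[(s, a)]
    return Q, N
```

The incremental form `Q += (G - Q)/N` reproduces the average $\frac{1}{N}\sum_i G_i$ without storing past returns.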
Original: Hypothesis testing
1. P-value: the p-value measures the probability, under the null hypothesis, of outcomes more extreme than the one observed. 2. Power analysis: plot the probability of meeting the rejection criterion as the parameter varies over its domain. Null hypothesis domain
2020-10-28 05:16:35
183
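Both ideas can be sketched with an exact binomial test (a minimal sketch; the function names and the one-sided setup are assumptions):

```python
from math import comb

def p_value_binomial(k, n, p0=0.5):
    """One-sided p-value: probability under H0 of k or more successes
    in n trials, i.e. of outcomes at least as extreme as observed."""
    return sum(comb(n, i) * p0**i * (1 - p0)**(n - i) for i in range(k, n + 1))

def power(n, p_alt, alpha=0.05, p0=0.5):
    """Power: probability of rejecting H0 when the true rate is p_alt."""
    # Rejection region: smallest k* whose p-value under H0 is <= alpha
    k_star = next(k for k in range(n + 1) if p_value_binomial(k, n, p0) <= alpha)
    return sum(comb(n, i) * p_alt**i * (1 - p_alt)**(n - i)
               for i in range(k_star, n + 1))
```

Evaluating `power` over a grid of `p_alt` values traces the power curve the preview describes.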
Original: EM
1. Classical EM algorithm. The idea: we would like to learn a model that may contain latent/unobserved variables. So instead of maximizing the log-likelihood with the latent variables marginalized out, we maximize the expected log-likelihood
2020-07-08 11:37:32
133
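The E-step/M-step alternation can be sketched on a mixture of two biased coins (a minimal sketch; the two-coin model, data, and starting biases are illustrative assumptions, not the post's example):

```python
from math import comb

def em_two_coins(tosses, theta_a=0.6, theta_b=0.5, iters=20):
    """EM for a two-coin mixture: each (heads, total) row came from one
    of two coins with unknown biases, and the coin identity is latent."""
    for _ in range(iters):
        # E-step: posterior responsibility of coin A for each row
        ha = ta = hb = tb = 0.0
        for h, n in tosses:
            la = comb(n, h) * theta_a**h * (1 - theta_a) ** (n - h)
            lb = comb(n, h) * theta_b**h * (1 - theta_b) ** (n - h)
            ra = la / (la + lb)
            ha += ra * h; ta += ra * n            # weighted counts, coin A
            hb += (1 - ra) * h; tb += (1 - ra) * n  # weighted counts, coin B
        # M-step: re-estimate each bias from responsibility-weighted counts
        theta_a, theta_b = ha / ta, hb / tb
    return theta_a, theta_b
```

Each iteration maximizes the expected complete-data log-likelihood, which never decreases the marginal likelihood.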
Original: PGM
PGM. 1. Directed Graph (Bayesian Net). 1.1 Representation: $L(\mathbf{x}) = \prod_c P(\mathbf{x}_c \mid \mathrm{parent}(\mathbf{x}_c))$. 1.2 Inference: the simplest method is Variable Elimination. Sum-product message passing scheme: It
2020-07-08 01:15:06
180
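Variable elimination on the smallest possible Bayesian net, A → B, looks like this (a minimal sketch; the probability tables are made up):

```python
def eliminate_chain(p_a, p_b_given_a):
    """Variable elimination on the chain A -> B:
    marginalize A out of the factor product, P(b) = sum_a P(a) P(b|a)."""
    states_b = range(len(p_b_given_a[0]))
    return [sum(p_a[a] * p_b_given_a[a][b] for a in range(len(p_a)))
            for b in states_b]
```

On larger graphs the same idea repeats: multiply the factors mentioning the variable being eliminated, then sum it out.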
Original: Boosting
Boost. Gradient boosting. Idea: we can think of the prediction function itself as a parameter $f(x)$, so its update follows the gradient direction. For each iteration $t$: $f(x)_{t+1} = f(x)_t - \gamma \frac{\partial L(y, f(x))}{\partial f(x)}$
2020-06-10 07:12:41
118
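For squared loss the negative gradient of $L$ with respect to $f(x_i)$ is just the residual $y_i - f(x_i)$, so each round fits a weak learner to the current residuals. A minimal sketch with a one-split regression stump as the weak learner (the stump, learning rate, and round count are assumptions):

```python
def fit_stump(x, residuals):
    """Fit a one-split regression stump to the residuals (hypothetical weak learner)."""
    best = None
    for t in sorted(set(x))[:-1]:  # splitting at the max would empty the right side
        left = [r for xi, r in zip(x, residuals) if xi <= t]
        right = [r for xi, r in zip(x, residuals) if xi > t]
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - ml) ** 2 for r in left) + sum((r - mr) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, ml, mr)
    _, t, ml, mr = best
    return lambda xi: ml if xi <= t else mr

def predict(xi, f0, learners, lr=0.3):
    """Base prediction plus the shrunken corrections from each round."""
    return f0 + lr * sum(h(xi) for h in learners)

def gradient_boost(x, y, rounds=50, lr=0.3):
    """Each round takes a gradient step in function space by fitting
    a weak learner to the residuals (negative gradient of squared loss)."""
    f0 = sum(y) / len(y)
    learners = []
    for _ in range(rounds):
        residuals = [yi - predict(xi, f0, learners, lr)
                     for xi, yi in zip(x, y)]
        learners.append(fit_stump(x, residuals))
    return f0, learners
```

The learning rate $\gamma$ (here `lr`) shrinks each step, trading more rounds for better generalization.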
Original: Bagging
Bagging. Idea: we ensemble weak learners as our predictor; the predicted value is a weighted sum of the predictions from the weak learners. Introducing randomness in the dataset mitigates the effect of extreme outliers. Method: we sample
2020-06-10 02:17:01
108
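The bootstrap-and-combine loop can be sketched as follows (a minimal sketch; using the sample median as the weak learner is an illustrative assumption that makes the outlier-mitigation effect visible):

```python
import random

def median(xs):
    """Sample median (the hypothetical weak learner here)."""
    s = sorted(xs)
    n = len(s)
    return s[n // 2] if n % 2 else (s[n // 2 - 1] + s[n // 2]) / 2

def bagged_predict(data, base_learner=median, n_learners=200, seed=0):
    """Bagging: train each weak learner on a bootstrap resample of the
    dataset, then combine the predictions (equal weights here)."""
    rng = random.Random(seed)
    preds = []
    for _ in range(n_learners):
        sample = [rng.choice(data) for _ in data]  # sample with replacement
        preds.append(base_learner(sample))
    return sum(preds) / len(preds)
```

Because each resample rarely contains many copies of an extreme outlier, the bagged estimate stays far closer to the bulk of the data than the raw mean does.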
Original: SVM
SVM. Idea: if we can separate the dataset with two parallel hyperplanes, the best solution is the one that maximizes the margin between them. The objective: maximize the margin $\frac{2}{\|\mathbf{w}\|_2}$ subject to, for every $i$, $y_i(\mathbf{w}\cdot\mathbf{x}_i - b) \ge 1$
2020-06-09 22:02:37
67
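The constrained max-margin problem is normally solved as a QP; a minimal sketch instead minimizes the equivalent regularized hinge loss by subgradient descent (a standard substitute, not the post's method; the hyperparameters and data are assumptions):

```python
def train_linear_svm(X, y, lr=0.01, lam=0.01, epochs=200):
    """Soft-margin linear SVM via subgradient descent on
    lam * ||w||^2 + mean(max(0, 1 - y_i (w.x_i - b)))."""
    d = len(X[0])
    w = [0.0] * d
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) - b)
            if margin < 1:
                # hinge term active: step toward satisfying y(w.x - b) >= 1
                w = [wj - lr * (2 * lam * wj - yi * xj)
                     for wj, xj in zip(w, xi)]
                b = b - lr * yi
            else:
                # only the margin-maximizing regularizer shrinks w
                w = [wj - lr * 2 * lam * wj for wj in w]
    return w, b
```

Shrinking $\|\mathbf{w}\|_2$ via the regularizer is what widens the margin $\frac{2}{\|\mathbf{w}\|_2}$.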
Original: Linear Regression
Linear Regression. Idea: we assume $f(x) = \mathbf{w}\cdot\mathbf{x}$ and find the $\mathbf{w}$ that minimizes the MSE, $\sum_{i=1}^{n}\|y_i - \mathbf{w}\cdot\mathbf{x}_i\|^2$. Performance: we can compute the coefficient of determination to
2020-06-09 10:33:04
96
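The least-squares fit and the coefficient of determination can be sketched in one dimension (a minimal sketch; the intercept term and function names are assumptions, since the preview only shows $f(x)=\mathbf{w}\cdot\mathbf{x}$):

```python
def fit_least_squares(x, y):
    """Ordinary least squares for f(x) = w*x + c in one dimension,
    minimizing sum_i (y_i - f(x_i))^2 in closed form."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    w = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
    c = my - w * mx  # intercept from the centroid condition
    return w, c

def r_squared(x, y, w, c):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    my = sum(y) / len(y)
    ss_res = sum((yi - (w * xi + c)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1 - ss_res / ss_tot
```

$R^2 = 1$ means the fit explains all variance in $y$; $R^2 = 0$ means it does no better than predicting the mean.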