Online Learning 1: Introduction

xiwang_chn

Published 2021-05-17 03:42:01

Category: Online Learning

Article link: https://blog.csdn.net/weixin_42017454/article/details/116553583

1 Assumption


2 Probability

Probability triplet


Expectation and variance


Independence


Conditioning, conditional expectation


3 Concentration

Review


Markov’s inequality (non-negative random variable)

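For reference, the standard statement of Markov's inequality can be written as follows. The symbols $X$ and $\varepsilon$ are notation assumed here for illustration.

```latex
\[
\mathbb{P}(X \ge \varepsilon) \le \frac{\mathbb{E}[X]}{\varepsilon}
\qquad \text{for any random variable } X \ge 0 \text{ and any } \varepsilon > 0 .
\]
```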

Chebyshev’s inequality (arbitrary random variable)

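In its standard form, Chebyshev's inequality says that P(|X - E[X]| >= eps) <= Var[X] / eps^2 for any random variable X with finite variance and any eps > 0. The following is a minimal sketch that compares the bound with an empirical tail frequency. The Uniform(0, 1) distribution, the sample size, and the threshold `eps = 0.4` are assumptions chosen only for illustration.

```python
import random


def empirical_tail(samples, mean, eps):
    """Fraction of samples at least eps away from the given mean."""
    return sum(1 for x in samples if abs(x - mean) >= eps) / len(samples)


rng = random.Random(0)
samples = [rng.random() for _ in range(100_000)]  # draws from Uniform(0, 1)

mean, variance = 0.5, 1.0 / 12.0  # exact mean and variance of Uniform(0, 1)
eps = 0.4                         # illustrative threshold (assumption)

tail = empirical_tail(samples, mean, eps)  # estimates P(|X - 0.5| >= 0.4)
bound = variance / eps ** 2                # Chebyshev upper bound
```

The exact tail probability here is 0.2 while the bound is roughly 0.52, which illustrates that Chebyshev's inequality holds for arbitrary distributions but can be loose.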

Chernoff bound

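The usual derivation of the Chernoff bound is sketched below: exponentiate both sides of the event, bound the probability of the resulting non-negative variable by its expectation divided by the threshold, then optimise over the free parameter. The symbols $X$, $\varepsilon$ and $\lambda$ are notation assumed here.

```latex
\[
\mathbb{P}(X \ge \varepsilon)
  = \mathbb{P}\bigl(e^{\lambda X} \ge e^{\lambda \varepsilon}\bigr)
  \le e^{-\lambda \varepsilon}\, \mathbb{E}\bigl[e^{\lambda X}\bigr]
  \qquad \text{for every } \lambda > 0,
\]
\[
\text{hence}\qquad
\mathbb{P}(X \ge \varepsilon)
  \le \inf_{\lambda > 0} e^{-\lambda \varepsilon}\, \mathbb{E}\bigl[e^{\lambda X}\bigr].
\]
```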

Gaussian and sub-gaussian

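A common definition of sub-gaussianity, and the tail bound it gives, is sketched below. The parameter name $\sigma$ and the exact form of the definition are assumptions; other texts use equivalent variants.

```latex
\[
X \text{ is } \sigma\text{-subgaussian if }\quad
\mathbb{E}\bigl[e^{\lambda X}\bigr] \le \exp\!\Bigl(\frac{\lambda^{2}\sigma^{2}}{2}\Bigr)
\quad \text{for all } \lambda \in \mathbb{R}.
\]
\[
\text{Then, for any } \varepsilon \ge 0:\qquad
\mathbb{P}(X \ge \varepsilon) \le \exp\!\Bigl(-\frac{\varepsilon^{2}}{2\sigma^{2}}\Bigr).
\]
```

A Gaussian $X \sim \mathcal{N}(0, \sigma^{2})$ satisfies the first condition with equality, which is where the name comes from.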

4 Bandit Framework and Regret


Notation


Stochastic bandit


Unstructured, structured environment


Unstructured environment: Each arm must be played a reasonable number of times to estimate how good it is.
Structured environment: The action set may be infinite, but different actions (arms) leak information about one another. It is enough to play on the order of d times to collect the samples needed to work out the unknown parameter theta. In that sense the structured problem is easier in terms of the number of samples needed.
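The structured case can be sketched with a linear model, which is an assumption made here for illustration: the mean reward of an action `a` is the inner product of `a` with an unknown d-dimensional `theta`. The function names, the value of `theta`, and the Gaussian noise model are all hypothetical. Playing only the d basis directions is enough to predict the mean reward of an action that was never played.

```python
import random


def estimate_theta(theta, plays_per_axis, noise_sd, rng):
    """Estimate theta by playing each basis action e_i and averaging rewards."""
    theta_hat = []
    for i in range(len(theta)):
        # Playing e_i yields reward <e_i, theta> + noise = theta[i] + noise.
        rewards = [theta[i] + rng.gauss(0.0, noise_sd)
                   for _ in range(plays_per_axis)]
        theta_hat.append(sum(rewards) / plays_per_axis)
    return theta_hat


def mean_reward(action, parameter):
    """Inner product <action, parameter>, the mean reward in the linear model."""
    return sum(a * p for a, p in zip(action, parameter))


rng = random.Random(0)
theta = [0.5, -0.2, 0.8]  # hypothetical unknown parameter, d = 3
theta_hat = estimate_theta(theta, plays_per_axis=2000, noise_sd=1.0, rng=rng)

# This action was never played, yet its mean reward can be predicted.
unseen_action = [0.3, 0.3, 0.4]
true_mean = mean_reward(unseen_action, theta)
estimate = mean_reward(unseen_action, theta_hat)
```

In an unstructured environment no such shortcut exists: an estimate for an arm is only available after that arm itself has been played.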

Regret

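A standard way to write the regret of a policy over $n$ rounds is sketched below. The notation is an assumption: $\pi$ is the policy, $\nu$ the bandit environment, $X_t$ the reward received in round $t$, and $\mu^{*}(\nu)$ the largest mean reward among the arms of $\nu$.

```latex
\[
R_n(\pi, \nu) = n\,\mu^{*}(\nu) - \mathbb{E}\Bigl[\sum_{t=1}^{n} X_t\Bigr].
\]
```

The first term is the expected total reward of always playing the best arm, and the second is the expected total reward actually collected by $\pi$.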

Suboptimality

The suboptimality gap quantifies how far, in expectation, any particular arm is from the best arm.
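In symbols, this quantity is usually written as follows, where $\mu_a$ denotes the mean reward of arm $a$ and $\mathcal{A}$ the set of arms (notation assumed here).

```latex
\[
\Delta_a = \mu^{*} - \mu_a,
\qquad \mu^{*} = \max_{a' \in \mathcal{A}} \mu_{a'} .
\]
```

The gap is zero for an optimal arm and positive for every other arm.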

Regret decomposition

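The regret decomposition is commonly stated as: regret equals the sum over arms of the arm's suboptimality gap times the expected number of times that arm is played. The sketch below checks the identity behind it on a single simulated run: summing the gap of the played arm over rounds gives the same number as summing gap times play count over arms. Taking expectations of both sides gives the decomposition. The arm means, the horizon, and the uniformly random policy are hypothetical choices made only for illustration.

```python
import random


def run_uniform_policy(num_arms, horizon, rng):
    """Choose an arm uniformly at random in every round; return the choices."""
    return [rng.randrange(num_arms) for _ in range(horizon)]


means = [0.9, 0.8, 0.5]            # hypothetical arm means
best = max(means)
gaps = [best - m for m in means]   # suboptimality gap of each arm

rng = random.Random(1)
actions = run_uniform_policy(len(means), horizon=1000, rng=rng)

# Round view: add up the gap of whichever arm was played in each round.
per_round = sum(gaps[a] for a in actions)

# Arm view: add up gap times number of plays for each arm.
counts = [actions.count(a) for a in range(len(means))]
per_arm = sum(g * c for g, c in zip(gaps, counts))
```

The two totals agree up to floating-point rounding, because both count the same rounds, grouped once by time and once by arm.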

Bayesian regret

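The Bayesian regret is usually defined as the regret averaged over a prior on environments. A sketch in assumed notation, where $Q$ is a prior distribution over the environment class $\mathcal{E}$ and $R_n(\pi, \nu)$ is the regret of policy $\pi$ in environment $\nu$ over $n$ rounds:

```latex
\[
\mathrm{BR}_n(\pi, Q) = \int_{\mathcal{E}} R_n(\pi, \nu)\, \mathrm{d}Q(\nu).
\]
```

Because it is an average, the Bayesian regret is never larger than the worst-case regret over $\mathcal{E}$.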
