Machine Learning Basics (20): Mathematical Language and Python Code


五道口纳什


Category: Machine Learning

Article link: https://blog.csdn.net/lanchunhui/article/details/50981246


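(1) A weighted sum is an inner product.
(2) The weights in a weighted quantity measure how important each term is.
(3) Assignment in a programming language is an update, especially in an iterative process:
$\mathbf{w}\leftarrow \mathbf{w}+\lambda \mathbf{v}$
(4) The natural logarithm is preferred because it is convenient to differentiate.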
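
Point (3) can be sketched in NumPy as follows. The names `w`, `v`, `lam` and their values are assumptions chosen for illustration; they do not come from the original post.

```python
import numpy as np

# Illustrative values only (assumed for this sketch).
w = np.array([0.0, 0.0])      # current weight vector
v = np.array([1.0, -2.0])     # update direction
lam = 0.5                     # step size (lambda)

# The mathematical update  w <- w + lambda * v  becomes an in-place assignment.
for _ in range(3):            # apply the same update three times
    w += lam * v

print(w)
```

Each pass through the loop overwrites `w` with its updated value, which is exactly what the arrow notation expresses.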

1. weighted base algorithm

$E_{in}^{\mathbf u}(h)=\frac1N\sum_{n=1}^N u_n\,\mathbf 1_{y_n\neq h(\mathbf x_n)}$

A weighted sum is just an inner product:

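import numpy as np

u = np.ones(N) / N                          # initialize uniform weights (the 1/N factor lives in u here)
errLabels = np.ones(N)
predLabels = clf(X, y)                      # predicted labels; clf is not defined in this post
errLabels[predLabels == y] = 0              # implements the indicator function
u.dot(errLabels)                            # weighted error as an inner product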
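
To make the inner-product view concrete, here is a minimal self-contained sketch. The helper `weighted_error` and the toy labels are assumptions for illustration; no classifier is trained, the predictions are simply written down.

```python
import numpy as np

def weighted_error(u, y, pred):
    """Weighted 0/1 error: inner product of weights and error indicators."""
    err = (pred != y).astype(float)   # indicator 1[y_n != pred_n]
    return u.dot(err)

# Toy data (assumed): 4 samples, only the last one is misclassified.
y = np.array([1, -1, 1, -1])
pred = np.array([1, -1, 1, 1])
u = np.ones(len(y)) / len(y)          # uniform weights, each 1/N

print(weighted_error(u, y, pred))     # 0.25
```

With uniform weights the result is the ordinary error rate; non-uniform weights shift the emphasis toward the samples that matter more.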

2. AdaBoost weight update

If a sample is classified correctly, its weight is updated to:

$D_i^{t+1}=\frac{D_i^t}{\sum_jD_j^t}/\sqrt{\frac{1-\epsilon}{\epsilon}}$

If a sample is misclassified, its weight is updated to:

$D_i^{t+1}=\frac{D_i^t}{\sum_jD_j^t}\cdot\sqrt{\frac{1-\epsilon}{\epsilon}}$

where $D$ is the weight vector and $\epsilon$ is the weighted error rate of the current classifier.

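This looks like a two-branch conditional. In Python (NumPy), however, because the class labels in binary classification are 1/-1, the product `pred*y` encodes which branch applies, and vectorized arithmetic reduces the update to a single statement: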

ratio = np.sqrt((1 - epsilon) / max(epsilon, 1e-16))
D = D * ratio**(-pred * y)
                                # misclassified: pred*y == -1, so D is multiplied by ratio
                                # correctly classified: pred*y == 1, so D is divided by ratio
D /= D.sum()
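
As a sanity check, the sketch below compares an explicit two-branch update against the exponent trick. The function names and toy data are assumptions for illustration, and both versions normalize after scaling, as the snippet above does.

```python
import numpy as np

def update_branch(D, pred, y, epsilon):
    """Two-branch form: scale misclassified up, correct down, then normalize."""
    ratio = np.sqrt((1 - epsilon) / max(epsilon, 1e-16))
    D_new = D.copy()
    wrong = pred != y
    D_new[wrong] *= ratio
    D_new[~wrong] /= ratio
    return D_new / D_new.sum()

def update_one_line(D, pred, y, epsilon):
    """Same update, using the sign of pred*y as the exponent."""
    ratio = np.sqrt((1 - epsilon) / max(epsilon, 1e-16))
    D_new = D * ratio ** (-pred * y)
    return D_new / D_new.sum()

# Toy data (assumed): labels in {1, -1}, last sample misclassified.
y = np.array([1, -1, 1, -1])
pred = np.array([1, -1, 1, 1])
D = np.ones(4) / 4
epsilon = D[pred != y].sum()          # weighted error rate, 0.25 here

D_a = update_branch(D, pred, y, epsilon)
D_b = update_one_line(D, pred, y, epsilon)
print(np.allclose(D_a, D_b))          # True
```

After the update, the misclassified samples carry half of the total weight, which is the usual effect of this AdaBoost reweighting.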

3. feature projection

Related: Python Machine Learning: How to Shuffle a Dataset (ndarray)

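Feature projection here amounts to feature selection. It implements a map $\mathbb R^d\rightarrow \mathbb R^{d'}$ that keeps $d'$ of the $d$ coordinates:

$\phi(\mathbf x)=\mathbf P\mathbf x=(x_{i_1},x_{i_2},\cdots,x_{i_{d'}})$

The matrix $\mathbf P_{d'\times d}$ is constructed as follows: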

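import numpy as np

def randomProjMat(d, d_prime):
    # [I | 0]: identity block selects the first d_prime features
    P = np.hstack((np.eye(d_prime), np.zeros((d_prime, d - d_prime))))
    r = np.random.permutation(d)    # shuffle columns to select random features
    return P[:, r]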

For example,

$\mathbf P=\begin{bmatrix} 1 & 0& 0\\ 0 & 1& 0 \end{bmatrix}$

implements $\mathbb R^3\rightarrow \mathbb R^2$, that is, it maps $(x,y,z)\rightarrow (x,y)$.
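
A short usage sketch shows that multiplying by such a matrix picks out a subset of the coordinates. The function `random_proj_mat` is a hypothetical variant written for this example: it takes an explicit random generator (an assumption, for reproducibility) and the sample vector `x` is made up.

```python
import numpy as np

def random_proj_mat(d, d_prime, rng):
    """Build a d_prime x d selection matrix with randomly permuted columns."""
    P = np.hstack((np.eye(d_prime), np.zeros((d_prime, d - d_prime))))
    return P[:, rng.permutation(d)]

rng = np.random.default_rng(0)        # seeded generator (assumed choice)
P = random_proj_mat(5, 2, rng)

x = np.arange(10.0, 15.0)             # toy feature vector [10, 11, 12, 13, 14]
z = P @ x                             # projected vector in R^2

print(P.shape)                        # (2, 5)
print(z)                              # two of the original coordinates
```

Every row of `P` contains exactly one 1, so each entry of `z` is a copy of one coordinate of `x`; no coordinate is selected twice.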
