Dimensionality Reduction by Learning an Invariant Mapping

最新推荐文章于 2024-04-28 08:19:30 发布

cool whidpers


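This post is a translation of, and commentary on, the paper.

The introduction and everything before Section 2.2 are fairly easy to follow, so I start from Section 2.2 and go through the text passage by passage, adding my own notes along the way.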

2.2 Spring Model Analogy

An analogy to a particular mechanical spring system is given to provide an intuition of what is happening when the loss function is minimized. The outputs of $G_{W}$ can be thought of as masses attracting and repelling each other with springs. The spring equation is:
$F=-kX$

where F is the force, K is the spring constant (the $k$ in the formula), and X is the displacement of the spring from its rest length.
The analogy between the spring system and the loss function runs as follows.
1. A spring is attract-only if its rest length is equal to zero. Thus any positive displacement $X$ will result in an attractive force between its ends (this corresponds to stretching the spring, so the displacement is positive).
2. A spring is said to be m-repulse-only if its rest length is equal to $m$. Thus two points that are connected with an m-repulse-only spring will be pushed apart if $X$ is less than $m$.
3. However, this spring has a special property: if the spring is stretched by a length $X>m$, then no attractive force brings it back to rest length.
4. Each point is connected to other points using these two kinds of springs.
5. Seen in the light of the loss function, each point is connected by attract-only springs to similar points, and is connected by m-repulse-only springs to dissimilar points.
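To make the two spring types concrete, here is a minimal Python sketch of their force laws. The function names and the sign convention (negative means the ends are pulled together, positive means they are pushed apart) are my own assumptions for illustration; the paper only describes the springs in words.

```python
def attract_only_force(x, k=1.0):
    """Force of an attract-only spring (rest length 0) at displacement x >= 0.

    Hooke's law F = -k * x: a negative value pulls the two ends together.
    """
    return -k * x


def m_repulse_only_force(x, m, k=1.0):
    """Force of an m-repulse-only spring whose ends are a distance x >= 0 apart.

    A positive value pushes the ends apart. Once x >= m the spring exerts
    no force at all, so it never pulls the ends back together.
    """
    return k * (m - x) if x < m else 0.0


for x in (0.0, 0.5, 1.0, 1.5):
    print(x, attract_only_force(x), m_repulse_only_force(x, m=1.0))
```

The attract-only spring pulls harder the further it is stretched, while the m-repulse-only spring pushes hardest at zero distance and goes silent beyond the margin $m$.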

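First, a quick recap of why we are analyzing this loss function at all. Each input pair of images is passed through the network to obtain the two outputs of $G_{W}$. The Euclidean distance between those two outputs is compared, and optimizing the loss function adjusts the weights $W$. Once training converges, the network with the learned $W$ maps its inputs to a lower-dimensional space. Now let us continue.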

1. Consider the loss function $L_{s}(W,\overrightarrow{X_{1}},\overrightarrow{X_{2}})$ associated with similar pairs:

$L_{s}(W,\overrightarrow{X_{1}},\overrightarrow{X_{2}})=\frac{1}{2}(D_{W})^{2}$
The loss function $L$ is minimized using the stochastic gradient descent algorithm. The gradient of $L_{s}$ is:

$\frac{\partial L_{s}}{\partial W}=D_{W}\frac{\partial D_{W}}{\partial W}$
Compare this formula with $F=-kX$.
It is clear that the gradient $\frac{\partial L_{s}}{\partial W}$ of $L_{s}$ gives the attractive force between the two points, $\frac{\partial D_{W}}{\partial W}$ defines the spring constant K of the spring, and $D_{W}$, which is the distance between the two points, gives the perturbation X of the spring from its rest length.
Clearly, even a small value of $D_{W}$ will generate a gradient (force) to decrease $D_{W}$.
(My understanding is that the gradient step decreases $L_{s}$, and since $L_{s}=\frac{1}{2}(D_{W})^{2}$, this is equivalent to decreasing $D_{W}$.)
Thus the similar loss function corresponds to the attract-only spring.
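The chain-rule gradient of $L_{s}$ can be checked numerically. The sketch below assumes a toy one-parameter map $G_{W}(x)=w\cdot x$ on scalar inputs, which is my own simplification and not the network used in the paper. It compares the analytic gradient $D_{W}\frac{\partial D_{W}}{\partial W}$ with a finite-difference estimate and then takes one gradient-descent step.

```python
def distance(w, x1, x2):
    # Toy map G_W(x) = w * x (an assumption for illustration only).
    return abs(w * x1 - w * x2)


def similar_loss(w, x1, x2):
    # L_s = 0.5 * D_W^2
    return 0.5 * distance(w, x1, x2) ** 2


def similar_grad(w, x1, x2):
    # Chain rule: dL_s/dw = D_W * dD_W/dw, where dD_W/dw = sign(w) * |x1 - x2|.
    sign = 1.0 if w >= 0 else -1.0
    return distance(w, x1, x2) * sign * abs(x1 - x2)


w, x1, x2 = 0.8, 1.0, 3.0
eps = 1e-6
numeric = (similar_loss(w + eps, x1, x2) - similar_loss(w - eps, x1, x2)) / (2 * eps)
analytic = similar_grad(w, x1, x2)
print(numeric, analytic)

# One gradient-descent step: the "attractive force" shrinks the distance.
lr = 0.1
w_new = w - lr * analytic
print(distance(w, x1, x2), distance(w_new, x1, x2))
```

The update pulls the two outputs closer, which is the attract-only spring behaviour described above.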
2. Now consider the partial loss function $L_{D}$:

$L_{D}(W,\overrightarrow{X_{1}},\overrightarrow{X_{2}})=\frac{1}{2}(\max\left \{ 0,(m-D_{W}) \right \})^{2}$
When $D_{W}>m$, $\frac{\partial L_{D}}{\partial W}=0$, so there is no gradient (force) on two points that are dissimilar and are at a distance $D_{W}>m$.
If $D_{W}<m$, then

$\frac{\partial L_{D}}{\partial W}=-(m-D_{W})\frac{\partial D_{W}}{\partial W}$
Compare this with the formula $F=-kX$.
Again, comparing the two formulas above, it is clear that the dissimilar loss function $L_{D}$ corresponds to the m-repulse-only spring; its gradient gives the force of the spring, $\frac{\partial D_{W}}{\partial W}$ gives the spring constant K, and $(m-D_{W})$ gives the perturbation X. The negative sign denotes the fact that the force is repulsive only.
Clearly, the force is maximum when $D_{W}=0$ (if this is unclear, see panel (d) of Figure 2 in the paper) and absent when $D_{W}=m$.
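The behaviour of $L_{D}$ is easy to tabulate. The following sketch treats the loss as a function of the distance $D_{W}$ alone and evaluates its derivative with respect to that distance; the full gradient is this value multiplied by $\frac{\partial D_{W}}{\partial W}$. The function names are hypothetical, chosen only for this illustration.

```python
def dissimilar_loss(d, m):
    """L_D = 0.5 * max(0, m - d)^2, as a function of the distance d = D_W."""
    return 0.5 * max(0.0, m - d) ** 2


def dissimilar_grad_wrt_distance(d, m):
    """dL_D/dD_W: equals -(m - d) inside the margin and 0 at or beyond it."""
    return -(m - d) if d < m else 0.0


m = 1.0
for d in (0.0, 0.5, 1.0, 1.5):
    print(d, dissimilar_loss(d, m), dissimilar_grad_wrt_distance(d, m))
```

The magnitude of the derivative is largest at $D_{W}=0$ and drops to zero at the margin, matching the m-repulse-only spring.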
3. Here, especially in the case of $L_{s}$, one might think that simply making $D_{W}=0$ for all attract-only springs would put the system in equilibrium. Consider, however, Figure 2.
4. Suppose $b_{1}$ is connected to $b_{2}$ and $b_{3}$ with attract-only springs. Then decreasing $D_{W}$ between $b_{1}$ and $b_{2}$ will increase $D_{W}$ between $b_{1}$ and $b_{3}$. Thus by minimizing the global loss function over all springs, one would ultimately drive the system to its equilibrium state.
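A tiny one-dimensional example illustrates the equilibrium argument. It is my own simplified setup, not the configuration drawn in the paper's figure: $b_{2}$ and $b_{3}$ are held fixed on a line, $b_{1}$ is free to move, and both springs attached to $b_{1}$ are attract-only. Gradient descent is applied directly to the position of $b_{1}$ rather than to network weights.

```python
def total_loss(b1, b2, b3):
    # Sum of the two attract-only losses 0.5 * D^2 acting on b1.
    return 0.5 * (b1 - b2) ** 2 + 0.5 * (b1 - b3) ** 2


def total_grad(b1, b2, b3):
    # Derivative of total_loss with respect to the position of b1.
    return (b1 - b2) + (b1 - b3)


b2, b3 = -1.0, 1.0        # fixed neighbours (hypothetical positions)
b1 = -1.0                 # start on top of b2, so D(b1, b2) = 0
loss_at_start = total_loss(b1, b2, b3)

lr = 0.1
for _ in range(200):
    b1 -= lr * total_grad(b1, b2, b3)

print(b1, loss_at_start, total_loss(b1, b2, b3))
```

Sitting on top of $b_{2}$ zeroes one spring but stretches the other, so the global loss is lower at the midpoint, where the two attractive forces balance.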

2.3 The Algorithm

The algorithm first generates the training set, then trains the machine.
Step 1: For each input sample $\overrightarrow{X_{i}}$, do the following:

a. Using prior knowledge, find the set of samples $S_{\overrightarrow{X_{i}}}=\left \{ \overrightarrow{X_{j}} \right \}_{j=1}^{p}$ such that $\overrightarrow{X_{j}}$ is deemed similar to $\overrightarrow{X_{i}}$.

b. Pair the sample $\overrightarrow{X_{i}}$ with all the other training samples and label the pairs so that:

$Y_{ij}=0$ if $\overrightarrow{X_{j}} \in S_{\overrightarrow{X_{i}}}$, and $Y_{ij}=1$ otherwise.

Combine all the pairs to form the labeled training set.
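Step 1 can be sketched in a few lines of Python. Here `similar_sets` is a hypothetical dictionary standing in for the prior knowledge: it maps each sample index to the set of indices deemed similar to it. The labels follow the convention used in Step 2, where $Y_{ij}=0$ marks a similar pair and $Y_{ij}=1$ a dissimilar one.

```python
def build_pairs(num_samples, similar_sets):
    """Return a list of (i, j, y) triples, y = 0 for similar and 1 otherwise.

    similar_sets[i] is the set of sample indices deemed similar to sample i
    (a stand-in for the prior knowledge mentioned in the text).
    """
    pairs = []
    for i in range(num_samples):
        for j in range(num_samples):
            if i == j:
                continue  # a sample is not paired with itself
            y = 0 if j in similar_sets[i] else 1
            pairs.append((i, j, y))
    return pairs


# Three samples: 0 and 1 are neighbours, 2 is similar to nothing.
similar_sets = {0: {1}, 1: {0}, 2: set()}
pairs = build_pairs(3, similar_sets)
print(pairs)
```

Pairing every sample with every other one grows quadratically with the dataset size, so a real implementation would likely subsample the dissimilar pairs; that choice is outside what the text specifies.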

Step 2: Repeat until convergence:

For each pair $( \overrightarrow{X_{i}},\overrightarrow{X_{j}})$ in the training set, do:

a. If $Y_{ij}=0$, then update $W$ to decrease
$D_{W}=\left \| G_{W}\left ( \overrightarrow{X_{i}} \right )-G_{W}\left ( \overrightarrow{X_{j}} \right ) \right \|_{2}$

b. If $Y_{ij}=1$, then update $W$ to increase
$D_{W}=\left \| G_{W}\left ( \overrightarrow{X_{i}} \right )-G_{W}\left ( \overrightarrow{X_{j}} \right ) \right \|_{2}$
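The training loop in Step 2 can be sketched with plain stochastic gradient descent. Everything concrete below is an assumption made for illustration: $G_{W}$ is a toy linear map from 2-D inputs to a 1-D output, the four samples and their pair labels are made up, and a fixed number of epochs stands in for "repeat until convergence". The paper itself trains a convolutional network.

```python
import random


def embed(w, x):
    # Toy linear map G_W(x) = w . x from 2-D inputs to a 1-D output.
    return w[0] * x[0] + w[1] * x[1]


def distance(w, xi, xj):
    return abs(embed(w, xi) - embed(w, xj))


def sgd_step(w, xi, xj, y, m, lr):
    """One SGD update on the loss of a single labeled pair."""
    diff = embed(w, xi) - embed(w, xj)
    d = abs(diff)
    sign = (diff > 0) - (diff < 0)
    # dD_W/dw = sign(diff) * (xi - xj)
    dd_dw = [sign * (a - b) for a, b in zip(xi, xj)]
    # Similar pair: gradient D_W * dD_W/dw (attract-only spring).
    # Dissimilar pair: gradient -(m - D_W) * dD_W/dw inside the margin, else 0.
    coeff = d if y == 0 else -max(0.0, m - d)
    return [wk - lr * coeff * g for wk, g in zip(w, dd_dw)]


# Made-up data: the second coordinate is a nuisance variable to be ignored.
samples = [(0.0, 0.0), (0.1, 1.0), (1.0, 0.0), (1.1, 1.0)]
pairs = [(0, 1, 0), (2, 3, 0), (0, 2, 1), (0, 3, 1), (1, 2, 1), (1, 3, 1)]

w = [0.6, -0.3]
m, lr = 1.0, 0.1
rng = random.Random(0)

sim_before = distance(w, samples[0], samples[1])
dis_before = distance(w, samples[0], samples[2])

for _ in range(300):          # fixed epoch count instead of a convergence test
    rng.shuffle(pairs)
    for i, j, y in pairs:
        w = sgd_step(w, samples[i], samples[j], y, m, lr)

sim_after = distance(w, samples[0], samples[1])
dis_after = distance(w, samples[0], samples[2])
print(sim_before, sim_after, dis_before, dis_after)
```

After training, the similar pair ends up closer together and the dissimilar pair further apart than at initialization, which is what the two update rules in Step 2 ask for.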
