Theano Tutorial

xiangnanla

Loss Functions

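Training a model means minimizing a loss function. For multi-class logistic regression, the negative log-likelihood is commonly used as the loss. This is equivalent to maximizing the likelihood of the training data under a model parameterized by θ. We define the log-likelihood L and the loss ℓ as follows: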

$$\mathcal{L}(\theta = \{W, b\}, D) = \sum_{i=0}^{|D|} \log P(Y = y^{(i)} \mid x^{(i)}, W, b)$$

$$\ell(\theta = \{W, b\}, D) = -\mathcal{L}(\theta = \{W, b\}, D)$$
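To make the two definitions concrete, here is a minimal plain-Python sketch (no Theano) that evaluates them on a toy data set. It assumes the usual softmax form of multi-class logistic regression, which this section does not spell out, and it stores W with one row per class; the helper names and the toy data are hypothetical and chosen only for illustration.

```python
import math

def softmax(scores):
    """Turn a list of real-valued scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def class_probabilities(x, W, b):
    """P(Y = k | x, W, b) for every class k (assumed softmax model)."""
    scores = [sum(w_kj * x_j for w_kj, x_j in zip(w_k, x)) + b_k
              for w_k, b_k in zip(W, b)]
    return softmax(scores)

def log_likelihood(X, y, W, b):
    """L(theta, D): sum of the log-probabilities of the correct labels."""
    return sum(math.log(class_probabilities(x_i, W, b)[y_i])
               for x_i, y_i in zip(X, y))

def loss(X, y, W, b):
    """l(theta, D) = -L(theta, D)."""
    return -log_likelihood(X, y, W, b)

# Hypothetical toy data: 2 features, 3 classes, all parameters set to zero.
X = [[1.0, 2.0], [0.5, -1.0]]
y = [0, 2]
W = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
b = [0.0, 0.0, 0.0]

# With zero parameters every class gets probability 1/3,
# so the loss equals 2 * log(3).
print(loss(X, y, W, b))
```

With all parameters at zero the model is maximally uncertain, which gives a convenient reference value for the loss before any training happens.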

Learning a Classifier

Zero-One Loss

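The methods presented in this guide are also widely used for general classification problems. The goal of training a classifier is to minimize the number of errors the prediction function makes on test examples. The simplest way to express this error is the zero-one loss. If the prediction function is $f: R^D \rightarrow \{0, \ldots, L\}$, the loss can be written as: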

$$\ell_{0,1} = \sum_{i=0}^{|D|} I_{f(x^{(i)}) \neq y^{(i)}}$$

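Here D is either the training set or a set disjoint from the training set, which avoids biasing the estimate of validation or test error. The indicator function I is defined as: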

$$I_x = \begin{cases} 1 & \text{if } x \text{ is True} \\ 0 & \text{otherwise} \end{cases}$$

In this guide, the prediction function is defined as:

$$f(x) = \operatorname{argmax}_k P(Y = k \mid x, \theta)$$

In Python, using Theano, this can be implemented as follows:

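import theano.tensor as T

# p_y_given_x (matrix of class probabilities, one row per example) and
# y (vector of integer labels) are assumed to be defined earlier.
# zero_one_loss is a Theano variable representing a symbolic
# expression of the zero-one loss; to get the actual value this
# symbolic expression has to be compiled into a Theano function (see
# the Theano tutorial for more details).
# axis=1 takes the argmax over classes separately for each example.
zero_one_loss = T.sum(T.neq(T.argmax(p_y_given_x, axis=1), y))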
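For a concrete feel for what this loss counts, here is a small plain-Python sketch (no Theano) that computes the same quantity on hypothetical numbers. It assumes `p_y_given_x` holds one row of class probabilities per example, and the helper names are made up for illustration.

```python
def argmax(values):
    """Index of the largest entry (the first one on ties)."""
    return max(range(len(values)), key=lambda k: values[k])

def zero_one_loss(p_y_given_x, y):
    """Count the examples whose most probable class differs from the label."""
    predictions = [argmax(row) for row in p_y_given_x]
    return sum(1 for pred, label in zip(predictions, y) if pred != label)

# Hypothetical class probabilities for three examples and three classes.
p_y_given_x = [
    [0.7, 0.2, 0.1],  # predicted class 0
    [0.1, 0.3, 0.6],  # predicted class 2
    [0.2, 0.5, 0.3],  # predicted class 1
]
y = [0, 1, 1]

print(zero_one_loss(p_y_given_x, y))  # 1: only the second example is wrong
```

Note that the loss only looks at which class wins in each row, so changing the probabilities without changing the winner leaves it unchanged.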

Negative Log-Likelihood Loss

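Because the zero-one loss is not differentiable, optimizing it directly becomes very difficult in a complex problem with thousands or even tens of thousands of parameters. We therefore maximize the log-likelihood of the classifier instead: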

$$\mathcal{L}(\theta, D) = \sum_{i=0}^{|D|} \log P(Y = y^{(i)} \mid x^{(i)}, \theta)$$

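The likelihood of the correct class is not the same thing as the number of correct predictions, but from the point of view of a randomly initialized classifier the two are quite similar. Keep in mind that the likelihood and the zero-one loss are different objectives: you should expect them to be correlated on the validation set, but at times one will improve while the other gets worse.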
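The difference between the two objectives is easier to see with numbers. The sketch below is plain Python with made-up probabilities for two hypothetical binary classifiers: one is right on every example but only barely, the other is very confident on two examples and wrong on the third. The negative log-likelihood used here is simply the negated sum of log-probabilities of the correct labels.

```python
import math

def nll(probs, y):
    """Negated sum of the log-probabilities assigned to the correct labels."""
    return -sum(math.log(row[label]) for row, label in zip(probs, y))

def zero_one(probs, y):
    """Number of examples whose most probable class is not the label."""
    return sum(1 for row, label in zip(probs, y)
               if max(range(len(row)), key=lambda k: row[k]) != label)

y = [0, 1, 0]

# Hypothetical classifier A: always right, but never sure.
cautious = [[0.51, 0.49], [0.49, 0.51], [0.51, 0.49]]
# Hypothetical classifier B: very sure twice, wrong on the last example.
confident = [[0.99, 0.01], [0.01, 0.99], [0.40, 0.60]]

for name, probs in [("cautious", cautious), ("confident", confident)]:
    print(name, "zero-one:", zero_one(probs, y), "NLL:", round(nll(probs, y), 3))
```

Classifier A wins on zero-one loss (0 errors against 1) while classifier B wins on likelihood, so ranking models by one measure does not always agree with ranking them by the other.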

Since we usually speak of minimizing a loss function, learning amounts to minimizing the negative log-likelihood (NLL), defined as:

$$NLL(\theta, D) = -\sum_{i=0}^{|D|} \log P(Y = y^{(i)} \mid x^{(i)}, \theta)$$

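The NLL is a differentiable surrogate for the zero-one loss, so we can use its gradient over the training set to train the classifier. The corresponding code is: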

import theano.tensor as T

# p_y_given_x (matrix of class probabilities, one row per example) and
# y (vector of integer labels) are assumed to be defined earlier.
# NLL is a symbolic variable; to get the actual value of NLL, this symbolic
# expression has to be compiled into a Theano function (see the Theano
# tutorial for more details).
NLL = -T.sum(T.log(p_y_given_x)[T.arange(y.shape[0]), y])
# Note on syntax: T.arange(y.shape[0]) is a vector of integers
# [0, 1, 2, ..., len(y)-1].
# Indexing a matrix M by the two vectors [0, 1, ..., K], [a, b, ..., k] returns
# the elements M[0,a], M[1,b], ..., M[K,k] as a vector. Here, we use this
# syntax to retrieve the log-probability of the correct labels, y.
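The paired-index lookup described in the comments can be mimicked in plain Python, which may help when checking the Theano expression by hand. This is only an illustrative sketch: `pick_per_row` is a hypothetical helper, and the probabilities are made up.

```python
import math

def pick_per_row(M, cols):
    """Return [M[0][cols[0]], M[1][cols[1]], ...]: one element per row."""
    return [M[i][j] for i, j in zip(range(len(cols)), cols)]

# Hypothetical class probabilities for two examples and three classes.
p_y_given_x = [
    [0.5, 0.25, 0.25],
    [0.1, 0.8, 0.1],
]
y = [0, 1]

# Element-wise log, then pick the entry of the correct class in each row.
log_p = [[math.log(p) for p in row] for row in p_y_given_x]
correct_log_p = pick_per_row(log_p, y)  # [log 0.5, log 0.8]

NLL = -sum(correct_log_p)
print(NLL)
```

The row indices play the role of `T.arange(y.shape[0])` and the labels `y` select the column in each row, so exactly one log-probability per example enters the sum.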