Python's scipy.optimize module: minimizing a multivariate differentiable function with scipy.optimize

I'm trying to minimize the following function with scipy.optimize:

f(\theta) = \sum_{i,j} c_{ij} \ln\left(1 + e^{\theta_j - \theta_i}\right)

whose gradient is this:

\frac{\partial f}{\partial \theta_k} = \sum_j \left( \frac{c_{jk}}{1 + e^{\theta_j - \theta_k}} + \frac{c_{kj}}{1 + e^{\theta_k - \theta_j}} \right)

(for those who are interested, this is the likelihood function of a Bradley-Terry-Luce model for pairwise comparisons. Very closely linked to logistic regression.)

It is fairly clear that adding a constant to all the parameters does not change the value of the function. Hence, I set \theta_1 = 0. Here are the implementations of the objective function and the gradient in Python (theta becomes x here):

def objective(x):
    # cijs (the matrix of pairwise comparison counts) and zeros (a zero matrix
    # of the same shape) must be defined outside this function.
    x = np.insert(x, 0, 0.0)
    tiles = np.tile(x, (len(x), 1))
    combs = tiles.T - tiles
    exps = np.dstack((zeros, combs))
    return np.sum(cijs * scipy.misc.logsumexp(exps, axis=2))
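As an aside, the np.dstack trick pairs every entry of combs with a zero, so the logsumexp over the last axis evaluates log(1 + e^d) for each pairwise difference d without overflowing. A small sanity check of that identity, using scipy.special.logsumexp (the current home of the function that older SciPy releases exposed as scipy.misc.logsumexp):

import numpy as np
from scipy.special import logsumexp

d = np.array([-3.0, 0.0, 2.5, 800.0])

# logsumexp([0, d]) == log(1 + exp(d)), but it stays finite even where
# np.exp(d) overflows (e.g. d = 800).
stacked = np.dstack((np.zeros_like(d), d))
print(logsumexp(stacked, axis=2).ravel())   # stable
print(np.log1p(np.exp(d)))                  # naive version overflows to inf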

def gradient(x):
    zeros = np.zeros(cijs.shape)
    x = np.insert(x, 0, 0.0)
    tiles = np.tile(x, (len(x), 1))
    combs = tiles - tiles.T
    one = 1.0 / (np.exp(combs) + 1)
    two = 1.0 / (np.exp(combs.T) + 1)
    mat = (cijs * one) + (cijs.T * two)
    grad = np.sum(mat, axis=0)
    return grad[1:]  # Don't return the first element.

Here's an example of what cijs might look like:

[[ 0  5  1  4  6]
 [ 4  0  2  2  0]
 [ 6  4  0  9  3]
 [ 6  8  3  0  5]
 [10  7 11  4  0]]
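For completeness, here is one way to set up the module-level names the snippets above rely on. cijs and nb_items appear in the post, while zeros is my guess at how the missing global used by objective() was defined; note that scipy.misc.logsumexp only exists in older SciPy releases (newer ones provide it as scipy.special.logsumexp):

import numpy as np
import scipy.misc       # provides logsumexp in the SciPy version used here
import scipy.optimize

# Pairwise comparison counts; this is the example matrix shown above.
cijs = np.array([[ 0,  5,  1,  4,  6],
                 [ 4,  0,  2,  2,  0],
                 [ 6,  4,  0,  9,  3],
                 [ 6,  8,  3,  0,  5],
                 [10,  7, 11,  4,  0]])

nb_items = cijs.shape[0]        # number of items being compared
zeros = np.zeros(cijs.shape)    # zero matrix used by objective() in np.dstack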

This is the code I run to perform the minimization:

x0 = np.random.random(nb_items - 1)

# Let's try one algorithm...
xopt1 = scipy.optimize.fmin_bfgs(objective, x0, fprime=gradient, disp=True)

# And another one...
xopt2 = scipy.optimize.fmin_cg(objective, x0, fprime=gradient, disp=True)

However, it always fails in the first iteration:

Warning: Desired error not necessarily achieved due to precision loss.
         Current function value: 73.290610
         Iterations: 0
         Function evaluations: 38
         Gradient evaluations: 27

I can't figure out why it fails. The error gets displayed because of this line:

https://github.com/scipy/scipy/blob/master/scipy/optimize/optimize.py#L853

So this "Wolfe line search" does not seem to succeed, but I have no idea how to proceed from here... Any help is appreciated!

Solution

As @pv. pointed out in a comment, I made a mistake in computing the gradient. First of all, the correct (mathematical) expression for the gradient of my objective function is:

\frac{\partial f}{\partial \theta_k} = \sum_j \left( \frac{c_{jk}}{1 + e^{\theta_j - \theta_k}} - \frac{c_{kj}}{1 + e^{\theta_k - \theta_j}} \right)

(notice the minus sign.) Furthermore, my Python implementation was completely wrong, beyond the sign mistake. Here's my updated gradient:

def gradient(x):
    nb_comparisons = cijs + cijs.T
    x = np.insert(x, 0, 0.0)
    tiles = np.tile(x, (len(x), 1))
    combs = tiles - tiles.T
    probs = 1.0 / (np.exp(combs) + 1)
    mat = (nb_comparisons * probs) - cijs
    grad = np.sum(mat, axis=1)
    return grad[1:]  # Don't return the first element.

To debug it, I used:

• scipy.optimize.check_grad: showed that my gradient function was producing results very far away from an approximated (finite-difference) gradient (a minimal illustration follows this list).

• scipy.optimize.approx_fprime: to get an idea of what the values should look like.

• a few hand-picked simple examples that could be analyzed by hand if needed, and a few Wolfram Alpha queries for sanity checking.
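To illustrate that debugging workflow, here is a minimal, self-contained sketch of how check_grad and approx_fprime compare an analytical gradient against a finite-difference one; the quadratic toy function below is just a stand-in, not the BTL objective from this post:

import numpy as np
from scipy.optimize import check_grad, approx_fprime

# Toy stand-in: f(x) = sum(x_i^2), whose gradient is 2 * x.
def f(x):
    return np.sum(x ** 2)

def grad_f(x):
    return 2.0 * x

x0 = np.random.random(4)

# check_grad returns the 2-norm of the difference between the analytical
# gradient and a finite-difference approximation; a large value means the
# hand-coded gradient is wrong.
print("gradient error:", check_grad(f, grad_f, x0))

# approx_fprime returns the finite-difference gradient itself, which helps
# to see which individual components disagree.
eps = np.sqrt(np.finfo(float).eps)
print("numerical: ", approx_fprime(x0, f, eps))
print("analytical:", grad_f(x0))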
