Conjugate Function


Definition of the conjugate function:

Let f: R^n → R. Its conjugate is the function f*: R^n → R defined by:

f*(y) = sup_{x ∈ dom f} ( ⟨y, x⟩ - f(x) )


This function f* is called the conjugate of f: for each y, it is the supremum of the gap between the linear function ⟨y, x⟩ = y^T x and f(x), taken over the domain of f.
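To make the definition concrete, here is a minimal Python sketch (not part of the original post) that approximates f*(y) by a brute-force search over a grid of x values; the helper name and the example f(x) = x^2, whose conjugate is y^2/4, are assumptions chosen purely for illustration:

```python
import numpy as np

def conjugate_numeric(f, y, xs):
    """Approximate f*(y) = sup_x (y*x - f(x)) by searching over the
    sample points xs, a crude numerical stand-in for the supremum."""
    return np.max(y * xs - f(xs))

# Assumed example for illustration: f(x) = x**2, whose conjugate is y**2 / 4.
f = lambda x: x ** 2
xs = np.linspace(-10.0, 10.0, 100001)
for y in (0.0, 1.0, 2.0, 4.0):
    print(y, conjugate_numeric(f, y, xs), y ** 2 / 4)
```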

As illustrated in the figure below:

[Figure: the curve f(x) and the dashed line y^T x; the conjugate value f*(y) is the largest vertical gap between the line and the curve]

Suppose y = 2. The graph of y^T x is the dashed line in the figure, and the right-hand side of the definition asks for which x the difference y^T x - f(x) is largest. From the figure we can see at a glance that the gap is largest at the point where the tangent line to f(x) is parallel to that dashed line. If the gap there is, say, 10, then we have found one point of the conjugate function, namely f*(2) = 10. Sweeping y over the whole domain in this way yields the entire conjugate function.
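This tangency argument can be checked numerically. The sketch below is an illustration only; the post's figure uses a generic f, so f(x) = x^2 is assumed here. It finds the x that maximizes y*x - f(x) on a grid and confirms that the slope of f at that point equals y:

```python
import numpy as np

# Assumed example f(x) = x**2 with derivative f'(x) = 2x; for y = 2 the gap
# y*x - f(x) should peak where f'(x) = y, i.e. at x = 1.
f = lambda x: x ** 2
fprime = lambda x: 2 * x

y = 2.0
xs = np.linspace(-10.0, 10.0, 100001)
gaps = y * xs - f(xs)
x_star = xs[np.argmax(gaps)]

print(x_star, fprime(x_star))  # x* = 1.0 and f'(x*) = 2.0, matching y
print(np.max(gaps))            # the maximum gap, i.e. f*(2) = 1.0 for this f
```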

Example: let f(x) = x^T Q x / 2, where Q is an invertible symmetric matrix (in fact Q should be positive definite for the supremum to be finite and attained). By definition, computing its conjugate means finding the supremum over x of g(x, y) = y^T x - x^T Q x / 2.

Take the partial derivative of g(x, y) with respect to x:

    ∂g/∂x = ∂(y^T x)/∂x - ∂(x^T Q x / 2)/∂x

Since the gradient of x^T Q x with respect to x is 2Qx (for symmetric Q), this simplifies to:

    ∂g/∂x = y - Qx

Setting the partial derivative to zero gives:

    x = Q^{-1} y

Because g(x, y) is a concave quadratic in x when Q is positive definite, this stationary point is the global maximizer, so it attains the supremum. Substituting it back into g(x, y) gives the conjugate of f(x):

    f*(y) = y^T Q^{-1} y / 2
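As a sanity check (not in the original post), the closed form can be compared against a direct numerical maximization. The sketch below assumes a randomly generated symmetric positive definite Q and uses scipy.optimize.minimize on -g(x, y); the specific dimensions and random seed are arbitrary choices:

```python
import numpy as np
from scipy.optimize import minimize

# Assumed setup: a random symmetric positive definite Q (positive definiteness
# makes the supremum finite and attained at x = Q^{-1} y).
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
Q = A @ A.T + 3.0 * np.eye(3)
y = rng.standard_normal(3)

g = lambda x: y @ x - 0.5 * x @ Q @ x          # the objective to maximize
res = minimize(lambda x: -g(x), np.zeros(3))   # maximize g by minimizing -g

closed_form = 0.5 * y @ np.linalg.inv(Q) @ y   # f*(y) = y^T Q^{-1} y / 2
print(-res.fun, closed_form)                   # the two values agree
```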

Negative entropy: f(x) = x log(x) for x ∈ R_+, with f(0) = 0. Its conjugate is

    f*(y) = sup_x ( yx - x log(x) ).

The supremum is attained where y = log(x) + 1, i.e. x = e^{y-1}, and therefore f*(y) = e^{y-1}.
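A similar numerical check for the negative-entropy example (again just an illustrative sketch; the grid bounds and sample values of y are arbitrary assumptions) compares a brute-force supremum over x > 0 with the closed form e^{y-1}:

```python
import numpy as np

# Check that sup_{x>0} (y*x - x*log(x)) equals exp(y - 1) for a few values of y.
xs = np.linspace(1e-9, 50.0, 2_000_000)   # grid over x > 0 (bounds are arbitrary)
for y in (-1.0, 0.0, 1.0, 2.0):
    numeric = np.max(y * xs - xs * np.log(xs))
    print(y, numeric, np.exp(y - 1.0))
```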

http://blog.csdn.net/raby_gyl/article/details/53178467

Reposted from: https://www.cnblogs.com/xiaoxuesheng993/p/7953612.html

