Online Convex Programming and Generalized Infinitesimal Gradient Ascent

Jan. 23, 2021


$\underline{\text{Aim}}$

This paper introduces the concept of online convex programming. To solve the online convex programming problem, it also presents an algorithm for general convex functions based on gradient descent.

$\underline{\text{Background}}$

Online convex programming appears widely in fields such as factory production, farm production, and many other industrial optimization problems where one does not learn the value of the items produced until they have already been constructed. One of the core rules of online convex programming is that, although the convex set from which you make your choice is known in advance, the cost function for a step can be seen only after you make your choice in that step.

A widely used approach to such problems is the experts algorithm. Compared to an experts algorithm, the gradient-descent-based algorithm in this paper has two advantages: first, it can handle an arbitrary sequence of convex functions, a problem that had previously been open; second, in some circumstances it can perform better than an experts algorithm on online linear programs.

$\underline{\text{Brief Project Description}}$

Definition 4 gives the definition of an online convex programming problem: an online convex programming problem consists of a feasible set $\mathcal{F} \subseteq \mathbb{R}^n$ and an infinite sequence $\{c^1, c^2, \cdots\}$ where each $c^t: \mathcal{F} \rightarrow \mathbb{R}$ is a convex function. At each time step $t$, an online convex programming algorithm selects a vector $x^t \in \mathcal{F}$. After the vector is selected, it receives the cost function $c^t$.
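For later reference, the (static) regret discussed below is, as in the paper, the algorithm's cumulative cost minus that of the best fixed point in hindsight:

$$R(T) \;=\; \sum_{t=1}^{T} c^t(x^t) \;-\; \min_{x \in \mathcal{F}} \sum_{t=1}^{T} c^t(x).$$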

Two algorithms, Greedy Projection and Lazy Projection, are proposed for this problem. For the former, upper bounds on both the regret (Theorem 1) and the dynamic regret (Theorem 2) are derived. Since the regret bound grows only as $O(\sqrt{T})$, the average regret per time step approaches 0 as $T \rightarrow \infty$. For Lazy Projection, only an upper bound on the regret is derived.
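As a concrete illustration, here is a minimal sketch of the Greedy Projection update $x^{t+1} = P_{\mathcal{F}}(x^t - \eta_t \nabla c^t(x^t))$ with the paper's step size $\eta_t = 1/\sqrt{t}$. It assumes a Euclidean-ball feasible set so that the projection $P_{\mathcal{F}}$ has a closed form (the paper only requires a projection oracle onto a nonempty closed bounded convex set), and the quadratic costs in the usage example are our own illustrative choice, not from the paper.

```python
import numpy as np

# Minimal sketch of Greedy Projection. Assumption (ours): the feasible set
# is the Euclidean ball F = {x : ||x||_2 <= R}, so projection is closed-form.

def project_ball(x, R=1.0):
    """Euclidean projection onto the ball of radius R."""
    norm = np.linalg.norm(x)
    return x if norm <= R else (R / norm) * x

def greedy_projection(grad_fns, x1, T, R=1.0):
    """Iterate x^{t+1} = P_F(x^t - eta_t * grad c^t(x^t)) with eta_t = 1/sqrt(t)."""
    x = np.asarray(x1, dtype=float)
    iterates = [x.copy()]
    for t in range(1, T + 1):
        eta_t = 1.0 / np.sqrt(t)            # step size used in Theorem 1
        g = grad_fns[t - 1](x)              # gradient of the revealed cost c^t
        x = project_ball(x - eta_t * g, R)  # gradient step, then project back
        iterates.append(x.copy())
    return iterates

# Illustrative usage (our own example): quadratic costs c^t(x) = ||x - z^t||^2
# with shifting targets z^t; the iterates track the moving minimizers.
rng = np.random.default_rng(0)
targets = [rng.uniform(-0.5, 0.5, size=2) for _ in range(100)]
grads = [lambda x, z=z: 2.0 * (x - z) for z in targets]
xs = greedy_projection(grads, x1=np.zeros(2), T=100)
```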

It is established in this paper that repeated games are online linear programming problems. A repeated game is thus formulated as an online linear program, and an algorithm for it, GIGA (Generalized Infinitesimal Gradient Ascent), is proposed. An upper bound on the expected regret of GIGA is derived in Theorem 4. Observe that GIGA is essentially a method of constructing a strategy, though a behavior can be generated from it by proper simulations. Since the strategy of GIGA at the current time step can be computed given $x^1$ (a constant) and the past actions of the environment, GIGA is also self-oblivious. Moreover, GIGA is universally consistent.
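The sketch below shows one GIGA round in a repeated matrix game, under our own illustrative assumptions: a payoff matrix `A` where `A[i, j]` is our payoff for action $i$ against opponent action $j$, a mixed strategy kept on the probability simplex, and a standard sort-based simplex projection standing in for the Euclidean projection onto $\mathcal{F}$ (the paper only requires such a projection oracle). The rock-paper-scissors usage is purely illustrative.

```python
import numpy as np

# Minimal sketch of a GIGA round: ascend the payoff gradient, then project
# the mixed strategy back onto the probability simplex.

def project_simplex(v):
    """Euclidean projection of v onto {x : x >= 0, sum(x) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u + (1.0 - css) / np.arange(1, len(v) + 1) > 0)[0][-1]
    theta = (1.0 - css[rho]) / (rho + 1)
    return np.maximum(v + theta, 0.0)

def giga_step(x, opponent_action, A, t):
    """One GIGA update of the mixed strategy x at time step t."""
    grad = A[:, opponent_action]   # gradient of expected payoff x . A e_j
    eta_t = 1.0 / np.sqrt(t)       # same 1/sqrt(t) schedule as Greedy Projection
    return project_simplex(x + eta_t * grad)

# Illustrative usage: rock-paper-scissors against a fixed action sequence.
A = np.array([[0, -1, 1], [1, 0, -1], [-1, 1, 0]], dtype=float)
x = np.ones(3) / 3
for t, j in enumerate([0, 0, 1, 2, 0] * 20, start=1):
    x = giga_step(x, j, A, t)
```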

This paper naively translates 1) algorithms for mixing experts into algorithms for online linear programs, and 2) online linear programming algorithms into algorithms for online convex programs. Algorithm 5 shows how to achieve the former translation. For the latter, two issues must be handled: first, the feasible region of an online convex program may be impossible to describe as the convex hull of a finite number of points; second, a convex function may not achieve its optimal value at the boundary of the feasible region. For the former issue, the paper simply assumes that the OLPA (online linear programming algorithm) can handle the feasible region of the online convex programming problem. The latter is handled by converting the cost function into a linear one, which is sound for the reason sketched below. Algorithm 6 and Algorithm 7 are the Exact and Approx versions of the conversion from an OLPA to an online convex programming algorithm. The worst-case difference between Approx and Exact is bounded in the paper.
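The soundness of the conversion to linear costs is a one-line consequence of convexity: for every comparison point $x \in \mathcal{F}$,

$$c^t(x^t) - c^t(x) \;\le\; \nabla c^t(x^t) \cdot (x^t - x),$$

so the regret measured against the linearized costs $y \mapsto \nabla c^t(x^t) \cdot y$ upper-bounds the regret against the original convex costs.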

$\underline{\text{Significance of Paper}}$

This paper defines the online convex programming problem and establishes that gradient descent is a very effective technique for it. The work was motivated by trying to better understand the infinitesimal gradient ascent algorithm, and the techniques developed were applied to that problem to establish an extension of infinitesimal gradient ascent that is universally consistent.

The definition of the online convex programming problem is given here for the first time, which makes the paper a pioneer in this field. The algorithms proposed are all based on gradient descent, one of the simplest and most natural algorithms in wide use. And the analytical upper bounds on different kinds of regret render the algorithms usable, rather than merely intuitively appealing.

$\text{\Large Reference}$

[1] Zinkevich, Martin. "Online Convex Programming and Generalized Infinitesimal Gradient Ascent." Proceedings of the 20th International Conference on Machine Learning (ICML-03), 2003.
