[Operations Research] Utility Theory

decision criteria

decision making under uncertainty

  • action, $a_i \in A$
  • state, $s_j \in S$
  • reward, $r_{ij}$

a discrete newsvendor example

  • $c=20,\ p=25,\ S=\{600,700,800,900,1000\}$

  • $r_{ij}=$ reward of purchasing $i$ with demand $j$
    $$r_{ij}=\begin{cases}(p-c)i, & i\le j \\ pj-ci, & i>j\end{cases}$$

  • dominated actions
    $$a_i \text{ is dominated by } a_{i'} \text{ if } r_{ij} \le r_{i'j}\ \ \forall s_j\in S, \text{ and } r_{ij'}< r_{i'j'} \text{ for some } s_{j'}\in S$$
    That is, in every state action $a_i$ yields a reward less than or equal to that of action $a_{i'}$, and in at least one state its reward is strictly lower.

decision criteria

  • maximin: maximize the minimum reward
    $$a_i=\arg\max_{a_i\in A}\{\min_{s_j\in S}r_{ij}\}$$

  • maximax: maximize the maximum reward
    $$a_i=\arg\max_{a_i\in A}\{\max_{s_j\in S}r_{ij}\}$$

  • expected value: maximize the expected reward, where $p_j$ is the probability of state $s_j$
    $$a_i=\arg\max_{a_i\in A}\{\sum_{s_j\in S}p_jr_{ij}\}$$

  • minimax regret: minimize the maximum regret (the gap between the reward of the best action and the reward of the action actually taken)
    $$a_i=\arg\min_{a_i\in A}\{\max_{s_j\in S}R_{ij}\}$$
    $$\text{Regret: } R_{ij} = r_{i^*(j),j}-r_{ij}, \quad \text{where } i^*(j)=\arg\max_{a_i\in A}r_{ij},\ \forall s_j\in S$$
    Here $r_{i^*(j),j}$ is the reward of the best action under state $s_j$, and $r_{ij}$ is the reward of the action actually taken.
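The four criteria can be checked numerically on the newsvendor example. The following is a minimal sketch, assuming the order quantities are the same five levels as the demand states; the state probabilities `PROBS` are hypothetical (the source does not give any), and ties are broken by taking the first action.

```python
# Newsvendor parameters from the example above.
COST, PRICE = 20, 25
# Assumption: order quantities take the same values as the demand states.
LEVELS = [600, 700, 800, 900, 1000]


def reward(i, j):
    """Reward r_ij of purchasing i units when demand is j."""
    return (PRICE - COST) * i if i <= j else PRICE * j - COST * i


# R[a][s] is the reward of action LEVELS[a] under state LEVELS[s].
R = [[reward(i, j) for j in LEVELS] for i in LEVELS]


def maximin(R):
    """Index of the action with the largest worst-case reward."""
    return max(range(len(R)), key=lambda a: min(R[a]))


def maximax(R):
    """Index of the action with the largest best-case reward."""
    return max(range(len(R)), key=lambda a: max(R[a]))


def expected_value(R, probs):
    """Index of the action with the largest expected reward."""
    return max(range(len(R)),
               key=lambda a: sum(p * r for p, r in zip(probs, R[a])))


def regret_matrix(R):
    """Regret R_ij = (best reward in state j) - r_ij."""
    best = [max(row[s] for row in R) for s in range(len(R[0]))]
    return [[best[s] - row[s] for s in range(len(row))] for row in R]


def minimax_regret(R):
    """Index of the action with the smallest worst-case regret."""
    G = regret_matrix(R)
    return min(range(len(G)), key=lambda a: max(G[a]))


# Hypothetical state probabilities, not given in the source.
PROBS = [0.1, 0.2, 0.4, 0.2, 0.1]

print("maximin:       ", LEVELS[maximin(R)])
print("maximax:       ", LEVELS[maximax(R)])
print("expected value:", LEVELS[expected_value(R, PROBS)])
print("minimax regret:", LEVELS[minimax_regret(R)])
```

Under minimax regret, ordering 600 and ordering 700 both have a maximum regret of 2000, so the result there depends on the tie-breaking rule.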

Utility Theory

Lottery (L)

  • $(p_1,r_1;p_2,r_2;\dots;p_n,r_n)$

  • r: reward

  • p: probability

  • tree representation

  • notation

    • $L_1\,p\,L_2$: the decision maker prefers $L_1$
    • $L_1\,i\,L_2$: equivalent lotteries; the decision maker is indifferent between $L_1$ and $L_2$
    • $L_2\,p\,L_1$: the decision maker prefers $L_2$

Von Neumann-Morgenstern Utility Theory

  • The utility of the reward $r_i$, written $u(r_i)$, is the number $q_i$ such that $L\,i\,L'$, where

    • $L=(1,r_i)$

      $L'=(q_i,\text{most favorable outcome};\ 1-q_i,\text{least favorable outcome})$

    • u(least favorable outcome)=0

      u(most favorable outcome)=1

  • Utility function, $u(r_i),\ \forall r_i$

  • expected utility of the lottery L
    $$E(U\ for\ L)=\sum_{i=1}^n p_iu(r_i)$$

Axiom

  1. Complete ordering axiom: define the most/least favorable outcome
    $$\begin{aligned} & \forall r_1,r_2:\ r_1 \succ r_2,\ r_1 \sim r_2,\ \text{or } r_1 \prec r_2 \\ & r_1 \succ r_2,\ r_2\succ r_3 \Rightarrow r_1\succ r_3 \quad (\text{transitivity of preferences}) \end{aligned}$$

  2. Continuity axiom: evaluate the intermediate outcome values between the most/least favorable outcome
    $$\begin{aligned} & L_1=(1,r_2),\ L_2=(c,r_1;1-c,r_3) \\ & \text{If } r_1\succ r_2,\ r_2\succ r_3, \\ & \text{then } \exists c\ (0<c<1) \text{ such that } L_1\,i\,L_2 \end{aligned}$$

  3. Independence axiom: "plug in" the equivalent lottery of the intermediate outcomes
    $$\begin{aligned} & L_1=(c,r_1;1-c,r_3),\ L_2=(c,r_2;1-c,r_3) \\ & \text{If } r_1\sim r_2, \text{ then for any } r_3 \text{ and } \forall c\ (0<c<1):\ L_1\,i\,L_2 \end{aligned}$$

  4. Unequal probability axiom: rank the lotteries
    $$\begin{aligned} & L_1=(p_1,r_1;1-p_1,r_2),\ L_2=(p_2,r_1;1-p_2,r_2) \\ & r_1\succ r_2,\ p_1>p_2 \rightarrow L_1\,p\,L_2 \end{aligned}$$

  5. Compound lottery axiom: convert a compound lottery into a simple lottery

    If, taking all possible rewards into account, a compound lottery $L$ yields reward $r_i$ with probability $p_i$, then $L\,i\,L'$, where $L'=(p_1,r_1;p_2,r_2;\dots;p_n,r_n)$

  6. Lemma 1: linearity

    Given a utility function $u(x)$, define the positive linear function $v(x)=au(x)+b$ for any $a>0$ and any $b$.

    Then for any two lotteries $L_1$ and $L_2$:
    $$\begin{aligned} & L_1\,p\,L_2 \text{ using } u(x) \leftrightarrow L_1\,p\,L_2 \text{ using } v(x) \\ & L_1\,i\,L_2 \text{ using } u(x) \leftrightarrow L_1\,i\,L_2 \text{ using } v(x) \end{aligned}$$
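    A short derivation of why Lemma 1 holds, using only the expected-utility formula and the fact that the probabilities of a lottery sum to 1. The notation $E(V\ for\ L)$ for the expected value of $v$ under $L$ is introduced here for illustration and does not appear in the source:
    $$\begin{aligned} E(V\ for\ L) & =\sum_{i=1}^n p_i v(r_i)=\sum_{i=1}^n p_i\big(au(r_i)+b\big) \\ & =a\sum_{i=1}^n p_iu(r_i)+b\sum_{i=1}^n p_i \\ & =a\,E(U\ for\ L)+b \end{aligned}$$
    Because $a>0$, $E(V\ for\ L_1)>E(V\ for\ L_2)$ exactly when $E(U\ for\ L_1)>E(U\ for\ L_2)$, and the two are equal under $v$ exactly when they are equal under $u$, so both utility functions rank lotteries identically.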

definitions

  • expected value of L's outcomes, EV(L)
    $$EV(L)=\sum_{i=1}^n p_ir_i$$

  • expected utility of L, E(U for L)
    $$E(U\ for\ L)=\sum_{i=1}^n p_iu(r_i)$$

  • certainty equivalent of L, CE(L)

    is the number such that a decision maker is indifferent between L and receiving a certain payoff of CE(L)
    $$E(U\ for\ L)=u(CE(L))$$

  • risk premium of L, RP(L)
    $$RP(L)=EV(L)-CE(L)$$

  • risk attitudes

    risk-averse: $RP(L)>0$, $u(x)$ is strictly concave
    risk-neutral: $RP(L)=0$, $u(x)$ is linear
    risk-seeking: $RP(L)<0$, $u(x)$ is strictly convex

  • graphic illustration

    $L=(p,x_1;1-p,x_2),\ x_1<x_2$

    [Figure: graphic illustration for $L$; image not available]
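The four definitions translate directly into code. The following is a minimal sketch: the lottery and the three utility functions are hypothetical and chosen only to show one concave, one linear, and one convex case.

```python
import math


def expected_value(lottery):
    """EV(L) for a lottery given as a list of (probability, reward) pairs."""
    return sum(p * r for p, r in lottery)


def expected_utility(lottery, u):
    """E(U for L) under utility function u."""
    return sum(p * u(r) for p, r in lottery)


def certainty_equivalent(lottery, u, u_inv):
    """CE(L): the certain payoff whose utility equals E(U for L)."""
    return u_inv(expected_utility(lottery, u))


def risk_premium(lottery, u, u_inv):
    """RP(L) = EV(L) - CE(L)."""
    return expected_value(lottery) - certainty_equivalent(lottery, u, u_inv)


# Hypothetical lottery: 0 or 100 with equal probability.
LOTTERY = [(0.5, 0.0), (0.5, 100.0)]

# Hypothetical utility functions, each paired with its inverse.
UTILITIES = {
    "concave (sqrt)": (math.sqrt, lambda y: y * y),
    "linear": (lambda x: x, lambda y: y),
    "convex (square)": (lambda x: x * x, math.sqrt),
}

for name, (u, u_inv) in UTILITIES.items():
    print(name, "RP =", risk_premium(LOTTERY, u, u_inv))
```

With these choices the concave utility gives a positive risk premium, the linear one gives zero, and the convex one gives a negative risk premium, matching the three risk attitudes listed above.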

Exercise

  • value of insurance (Winston 2004, pp. 85-86)

    • cash: $10,000
    • home: $90,000
    • accident probability: 0.1%
  • Question: how much should one be willing to pay for insurance against the home being destroyed?

  • Assumption: $u(x)=x^{1/2}$, where $x$ is total wealth (cash plus home)

  • solution
    $$L_1=(1,\ \$100{,}000-y),\quad L_2=\dots$$
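Since the solution in the source is cut off, the following is only a sketch of how the exercise could be worked, not a reconstruction of the original. It assumes that $y$ is the insurance premium, that the uninsured lottery is (0.999, $100,000; 0.001, $10,000), and that the largest acceptable premium is the one that makes the insured and uninsured positions equally attractive.

```python
import math

CASH = 10_000
HOME = 90_000
P_LOSS = 0.001  # accident probability of 0.1%


def u(x):
    """Utility of total wealth, u(x) = x^(1/2)."""
    return math.sqrt(x)


def u_inv(y):
    """Inverse of u."""
    return y ** 2


total = CASH + HOME

# Assumed uninsured lottery: keep everything with probability 0.999,
# keep only the cash with probability 0.001.
eu_uninsured = (1 - P_LOSS) * u(total) + P_LOSS * u(CASH)
ce_uninsured = u_inv(eu_uninsured)

# Indifference between insured and uninsured: u(total - y) = eu_uninsured.
max_premium = total - ce_uninsured
expected_loss = P_LOSS * HOME

print(f"certainty equivalent without insurance: {ce_uninsured:.2f}")
print(f"largest acceptable premium y:           {max_premium:.2f}")
print(f"expected loss:                          {expected_loss:.2f}")
```

Under these assumptions the largest acceptable premium exceeds the expected loss, which is what a risk-averse (concave) utility predicts.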




example

1.

[Figure: problem statement for example 1; image not available]


  • initial wealth: $5,000

  • $u(x)=\sqrt{x}/1{,}000$

| investment 1 | investment 2 |
| --- | --- |
| +$295,000, 80%; +$95,000, 20% | +$595,000, 50%; +$5,000, 50% |
| $EV(L_1)=80\%\times\$300{,}000+20\%\times\$100{,}000=\$260{,}000$ | $EV(L_2)=\$305{,}000$ |
| $E(U\ for\ L_1)=80\%\times0.55+20\%\times0.32=0.504$ | $E(U\ for\ L_2)=0.437$ |
| $CE(L_1)=u^{-1}(E(U\ for\ L_1))=(0.504\times1{,}000)^2=\$254{,}016$ | $CE(L_2)=\$190{,}969$ |
| $RP(L_1)=EV(L_1)-CE(L_1)=\$5{,}984>0$ | $RP(L_2)=\$114{,}031$ |
| risk-averse | risk-averse |

We are risk-averse.

  • $CE(L_1)>CE(L_2)$

    We prefer investment 1.
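The comparison can be reproduced with a few lines of code. This is a minimal sketch that assumes the $5,000 is initial wealth added to each payoff. Because it does not round the utilities to two decimals as the table does, its certainty equivalents differ slightly from the table's, but the ranking of the two investments is the same.

```python
import math

WEALTH = 5_000  # assumed to be initial wealth


def u(x):
    """Utility of final wealth, u(x) = sqrt(x) / 1,000."""
    return math.sqrt(x) / 1000


def u_inv(y):
    """Inverse of u."""
    return (1000 * y) ** 2


# Lotteries over final wealth as (probability, wealth) pairs.
INVESTMENT_1 = [(0.8, WEALTH + 295_000), (0.2, WEALTH + 95_000)]
INVESTMENT_2 = [(0.5, WEALTH + 595_000), (0.5, WEALTH + 5_000)]


def summarize(lottery):
    """Return (EV, E(U), CE, RP) for a lottery."""
    ev = sum(p * x for p, x in lottery)
    eu = sum(p * u(x) for p, x in lottery)
    ce = u_inv(eu)
    return ev, eu, ce, ev - ce


for name, lottery in [("investment 1", INVESTMENT_1),
                      ("investment 2", INVESTMENT_2)]:
    ev, eu, ce, rp = summarize(lottery)
    print(f"{name}: EV={ev:,.0f}  E(U)={eu:.4f}  CE={ce:,.0f}  RP={rp:,.0f}")
```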

2.

[Figure: problem statement for example 2; image not available]

  • risk-neutral, so $RP(L)=0$, $EV(L)=CE(L)$, and $u(x)$ is linear.

    $$P(\text{the first head is obtained on the } n\text{th toss of the coin})=\frac{1}{2^n}$$


    $$\begin{aligned} p_i & =\frac{1}{2^i},\ r_i=2^i \\ EV(L) & =\sum_{i=1}^n p_ir_i=\frac{1}{2}\cdot 2+\frac{1}{2^2}\cdot 2^2+\dots+\frac{1}{2^n}\cdot 2^n=n \\ CE(L) & =EV(L)=n \\ n & \rightarrow\infty,\ CE(L)\rightarrow\infty \\ \because\ & u(x) \text{ is linear} \\ \therefore\ & u(x)\rightarrow\infty,\ E(U\ for\ L)\rightarrow\infty \end{aligned}$$

    This is unreasonable: a risk-neutral decision maker would be willing to pay any finite amount to play the game.

  • $u(x)=\log_2 (x)$
    $$\begin{aligned} u(r_i) & =\log_2(r_i)=\log_2(2^i)=i \\ E(U\ for\ L) & =\sum_{i=1}^n p_iu(r_i)=\sum_{i=1}^n\frac{1}{2^i}i=2-\frac{1}{2^{n-1}}-\frac{n}{2^n} \\ CE(L) & =u^{-1}(E(U\ for\ L))=2^{2-\frac{1}{2^{n-1}}-\frac{n}{2^n}} \end{aligned}$$
    As $n\rightarrow\infty$, $E(U\ for\ L)\rightarrow 2$ and $CE(L)\rightarrow 2^2=4$, so with this utility function the game is worth only a finite amount.
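Both cases of this example can be checked numerically. The following is a minimal sketch using the same truncation at $n$ tosses as the sums above; the function names are chosen here for illustration.

```python
import math


def truncated_game(n):
    """Lottery truncated at n tosses: reward 2^i with probability 1/2^i."""
    return [(0.5 ** i, 2.0 ** i) for i in range(1, n + 1)]


def expected_value(n):
    """EV(L) of the truncated game, which equals n."""
    return sum(p * r for p, r in truncated_game(n))


def expected_log_utility(n):
    """E(U for L) with u(x) = log2(x)."""
    return sum(p * math.log2(r) for p, r in truncated_game(n))


def closed_form(n):
    """Closed form 2 - 1/2^(n-1) - n/2^n of the sum above."""
    return 2 - 1 / 2 ** (n - 1) - n / 2 ** n


def certainty_equivalent_log(n):
    """CE(L) = u^{-1}(E(U for L)) = 2^(E(U for L))."""
    return 2 ** expected_log_utility(n)


for n in (5, 10, 20, 40):
    print(n, expected_value(n), round(certainty_equivalent_log(n), 6))
```

The expected value grows linearly with $n$, while the certainty equivalent under the logarithmic utility levels off near 4.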
