decision criteria
decision making under uncertainty
- action, $a_i \in A$
- state, $s_j \in S$
- reward, $r_{ij}$
a discrete newsvendor example
- $c=20,\ p=25,\ S=\{600,700,800,900,1000\}$
- $r_{ij}=$ reward of purchasing $i$ with demand $j$

  $$
  r_{ij}= \begin{cases} (p-c)\,i, & i\le j \\ pj-ci, & i > j \end{cases}
  $$
- dominated actions

  $a_i$ is dominated by $a_{i'}$ if

  $$
  r_{ij} \le r_{i'j}\ \ \forall s_j\in S, \quad\text{and}\quad r_{ij'}< r_{i'j'}\ \text{ for some } s_{j'}\in S
  $$

  That is, action $a_i$ never yields a higher reward than action $a_{i'}$ in any state, and yields a strictly lower reward in at least one state.
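The reward formula and the dominance test can be checked numerically. The sketch below builds the reward table for the newsvendor example and searches for dominated actions. It assumes the order quantities coincide with the demand states ($A=S$), which the notes do not state explicitly; the names `reward`, `R`, and `is_dominated` are illustrative.

```python
# Parameters from the example above.
c, p = 20, 25
states = [600, 700, 800, 900, 1000]
actions = list(states)  # assumption: order quantities equal the demand states


def reward(i, j):
    """r_ij: profit from purchasing i units when demand is j."""
    return (p - c) * i if i <= j else p * j - c * i


# Reward table: one row per action, one column per state.
R = {i: [reward(i, j) for j in states] for i in actions}


def is_dominated(i, k):
    """True if action i is dominated by action k."""
    never_better = all(a <= b for a, b in zip(R[i], R[k]))
    sometimes_worse = any(a < b for a, b in zip(R[i], R[k]))
    return never_better and sometimes_worse


dominated = [
    i for i in actions
    if any(is_dominated(i, k) for k in actions if k != i)
]

for i in actions:
    print(i, R[i])
print("dominated actions:", dominated)
```

Under these assumptions no action is dominated: a smaller order always does better when demand is lowest, and a larger order does better when demand is highest.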
decision criteria
- maximin: maximize the minimum reward

  $$
  a_i=\arg\max_{a_i\in A}\{\min_{s_j\in S}r_{ij}\}
  $$
- maximax: maximize the maximum reward

  $$
  a_i=\arg\max_{a_i\in A}\{\max_{s_j\in S}r_{ij}\}
  $$
- expected value: maximize the expected reward, where $p_j$ is the probability of state $s_j$

  $$
  a_i=\arg\max_{a_i\in A}\{\sum_{s_j\in S}p_jr_{ij}\}
  $$
- minimax regret: minimize the maximum regret (the gap between the reward of the best action and the reward of the action actually taken)

  $$
  a_i=\arg\min_{a_i\in A}\{\max_{s_j\in S}R_{ij}\}
  $$

  $$
  \text{Regret: } R_{ij} = r_{i^*(j),j}-r_{ij}, \quad\text{where } i^*(j)=\arg\max_{a_i\in A}r_{ij},\ \forall s_j\in S
  $$

  Here $r_{i^*(j),j}$ is the reward of the best action for state $s_j$, and $r_{ij}$ is the reward of the action actually taken.
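The four criteria can be compared on the newsvendor example with a short sketch. Two things here are assumptions rather than part of the notes: the order quantities are taken equal to the demand states, and the states are given uniform probabilities for the expected-value criterion. The helper `arg_best` is illustrative and reports ties instead of breaking them.

```python
from fractions import Fraction

c, p = 20, 25
states = [600, 700, 800, 900, 1000]
actions = list(states)                       # assumption: A = S
probs = [Fraction(1, 5)] * len(states)       # assumption: uniform p_j


def reward(i, j):
    """r_ij: profit from purchasing i units when demand is j."""
    return (p - c) * i if i <= j else p * j - c * i


R = {i: [reward(i, j) for j in states] for i in actions}

# Score of each action under each criterion.
worst_case = {i: min(R[i]) for i in actions}
best_case = {i: max(R[i]) for i in actions}
expected = {i: sum(q * r for q, r in zip(probs, R[i])) for i in actions}

# Regret R_ij = (best reward in state j) - r_ij, then the worst regret per action.
best_per_state = [max(R[i][k] for i in actions) for k in range(len(states))]
max_regret = {
    i: max(b - r for b, r in zip(best_per_state, R[i])) for i in actions
}


def arg_best(scores, pick):
    """All actions attaining the best score under pick (max or min)."""
    best = pick(scores.values())
    return [a for a, s in scores.items() if s == best]


print("maximin:       ", arg_best(worst_case, max))
print("maximax:       ", arg_best(best_case, max))
print("expected value:", arg_best(expected, max))
print("minimax regret:", arg_best(max_regret, min))
```

With these assumed inputs, maximin selects the smallest order and maximax the largest, while expected value and minimax regret each produce a tie between ordering 600 and 700.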
Utility Theory
Lottery (L)
- $(p_1,r_1;p_2,r_2;\dots;p_n,r_n)$
- $r$: reward
- $p$: probability
- tree representation
- notation
  - $L_1\,p\,L_2$: prefers $L_1$
  - $L_1\,i\,L_2$: equivalent lotteries, indifferent between $L_1$ and $L_2$
  - $L_2\,p\,L_1$: prefers $L_2$
Von Neumann-Morgenstern Utility Theory
- Utility of the reward $r_i$, $u(r_i)$, is the number $q_i$ such that $L\,i\,L'$, where
  - $L=(1,r_i)$

    $L'=(q_i,\ \text{most favorable outcome};\ 1-q_i,\ \text{least favorable outcome})$
  - $u(\text{least favorable outcome})=0$

    $u(\text{most favorable outcome})=1$
- Utility function, $u(r_i),\ \forall r_i$
- expected utility of the lottery $L$

  $$
  E(U\ for\ L)=\sum_{i=1}^n p_iu(r_i)
  $$
Axiom
- Complete ordering axiom: define the most/least favorable outcome

  $$
  \forall r_1,r_2:\ r_1 \succ r_2,\ r_1 \sim r_2,\ \text{or}\ r_1 \prec r_2
  $$

  $$
  r_1 \succ r_2,\ r_2\succ r_3 \Rightarrow r_1\succ r_3 \quad (\text{transitivity of preferences})
  $$
- Continuity axiom: evaluate the intermediate outcome values between the most/least favorable outcome

  $$
  L_1=(1,r_2),\quad L_2=(c,r_1;1-c,r_3)
  $$

  If $r_1\succ r_2$ and $r_2\succ r_3$, then $\exists c\ (0<c<1)$ such that $L_1\,i\,L_2$.
- Independence axiom: "plug in" the equivalent lottery of the intermediate outcomes

  $$
  L_1=(c,r_1;1-c,r_3),\quad L_2=(c,r_2;1-c,r_3)
  $$

  If $r_1\sim r_2$, then for any $r_3$ and $\forall c\ (0<c<1)$, $L_1\,i\,L_2$.
- Unequal probability axiom: rank the lotteries

  $$
  L_1=(p_1,r_1;1-p_1,r_2),\quad L_2=(p_2,r_1;1-p_2,r_2)
  $$

  $$
  r_1\succ r_2,\ p_1>p_2 \rightarrow L_1\,p\,L_2
  $$
- Compound lottery axiom: convert a compound lottery into a simple lottery

  If, considering all possible rewards, a compound lottery $L$ yields reward $r_i$ with probability $p_i$, then $L\,i\,L'$, where $L'=(p_1,r_1;p_2,r_2;\dots;p_n,r_n)$.
- lemma 1: linearity

  Given a utility function $u(x)$, define the positive linear function $v(x)=au(x)+b$ for any $a>0$ and any $b$. Then, for two lotteries $L_1$ and $L_2$:

  $$
  \begin{aligned}
  L_1\,p\,L_2 \text{ using } u(x) &\leftrightarrow L_1\,p\,L_2 \text{ using } v(x) \\
  L_1\,i\,L_2 \text{ using } u(x) &\leftrightarrow L_1\,i\,L_2 \text{ using } v(x)
  \end{aligned}
  $$
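A brief sketch of why lemma 1 holds, using only the expected-utility definition and assuming the probabilities of each lottery sum to 1. The symbol $E(V\ for\ L)$, the expected utility of $L$ computed with $v$, is introduced here for illustration only.

$$
\begin{aligned}
E(V\ for\ L) &= \sum_{i=1}^n p_i\,v(r_i) = \sum_{i=1}^n p_i\,\bigl(a\,u(r_i)+b\bigr) = a\,E(U\ for\ L) + b \\
E(V\ for\ L_1) - E(V\ for\ L_2) &= a\,\bigl(E(U\ for\ L_1) - E(U\ for\ L_2)\bigr)
\end{aligned}
$$

Since $a>0$, both differences have the same sign (or are both zero), so $u$ and $v$ rank $L_1$ and $L_2$ identically.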
definitions
- expected value of $L$'s outcomes, $EV(L)$

  $$
  EV(L)=\sum_{i=1}^np_ir_i
  $$
- expected utility of $L$, $E(U\ for\ L)$

  $$
  E(U\ for\ L)=\sum_{i=1}^np_iu(r_i)
  $$
- certainty equivalent of $L$, $CE(L)$: the number such that a decision maker is indifferent between $L$ and receiving a certain payoff of $CE(L)$

  $$
  E(U\ for\ L)=u(CE(L))
  $$
- risk premium of $L$, $RP(L)$

  $$
  RP(L)=EV(L)-CE(L)
  $$
- risk attitudes

  | risk attitude | risk premium | utility function |
  | --- | --- | --- |
  | risk-averse | $RP(L)>0$ | $u(x)$ is strictly concave |
  | risk-neutral | $RP(L)=0$ | $u(x)$ is linear |
  | risk-seeking | $RP(L)<0$ | $u(x)$ is strictly convex |
- graphic illustration

  $L=(p,x_1;1-p,x_2),\ x_1<x_2$
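The link between the shape of $u(x)$ and the sign of $RP(L)$ can be illustrated with a small sketch. The lottery and the three utility functions below are hypothetical choices made for illustration, not taken from the notes; each utility is paired with its inverse so that $CE(L)=u^{-1}(E(U\ for\ L))$ can be evaluated directly.

```python
import math


def expected_value(lottery):
    """EV(L): sum of p_i * r_i over (p_i, r_i) pairs."""
    return sum(p * r for p, r in lottery)


def expected_utility(lottery, u):
    """E(U for L): sum of p_i * u(r_i)."""
    return sum(p * u(r) for p, r in lottery)


def certainty_equivalent(lottery, u, u_inv):
    """CE(L): the certain payoff whose utility equals E(U for L)."""
    return u_inv(expected_utility(lottery, u))


def risk_premium(lottery, u, u_inv):
    """RP(L) = EV(L) - CE(L)."""
    return expected_value(lottery) - certainty_equivalent(lottery, u, u_inv)


# Hypothetical lottery L = (0.5, 100; 0.5, 400).
L = [(0.5, 100.0), (0.5, 400.0)]

# Illustrative utilities with their inverses (valid for x >= 0).
utilities = {
    "concave (sqrt)": (math.sqrt, lambda y: y ** 2),
    "linear": (lambda x: x, lambda y: y),
    "convex (square)": (lambda x: x ** 2, math.sqrt),
}

rp = {name: risk_premium(L, u, u_inv) for name, (u, u_inv) in utilities.items()}
for name, value in rp.items():
    print(f"{name}: RP(L) = {value:.2f}")
```

For this lottery the concave utility gives a positive risk premium, the linear one gives zero, and the convex one gives a negative value, matching the three risk attitudes.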
Exercise
- value of insurance (Winston 2004, pp. 85-86)
  - cash: \$10,000
  - home: \$90,000
  - accident probability: 0.1%
- Question: how much should be paid for insurance against the house being destroyed?
- Assumption: $u(x)=x^{1/2}$, where $x$ is total wealth (cash plus home)
- solution

  $$
  L_1 =(1,\ \$100{,}000-y),\quad L_2=\dots
  $$
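One way to work the insurance exercise numerically is sketched below. It rests on assumptions that the truncated solution does not confirm: the premium $y$ is paid up front, the policy fully covers the loss of the home, and the uninsured position is the lottery that leaves only the cash with probability 0.1% and full wealth otherwise. The largest acceptable premium is then the $y$ that makes the insured wealth equal to the certainty equivalent of the uninsured lottery.

```python
import math

cash, home, p_loss = 10_000, 90_000, 0.001   # figures from the exercise
wealth = cash + home


def u(x):
    """Utility of total wealth, u(x) = x^(1/2)."""
    return math.sqrt(x)


def u_inv(y):
    """Inverse utility, valid for y >= 0."""
    return y ** 2


# Assumed uninsured lottery: lose the home with probability p_loss.
uninsured = [(p_loss, wealth - home), (1 - p_loss, wealth)]

expected_utility = sum(p * u(x) for p, x in uninsured)
certainty_equivalent = u_inv(expected_utility)

# Indifference: u(wealth - y) = E(U for uninsured)  =>  y = wealth - CE.
max_premium = wealth - certainty_equivalent
expected_loss = p_loss * home

print(f"maximum premium: {max_premium:.2f}")
print(f"expected loss:   {expected_loss:.2f}")
```

Under these assumptions the maximum premium comes out near \$137, above the expected loss of \$90, which is the behavior expected of a risk-averse decision maker with a concave utility.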
example
1.
- initial wealth: \$5,000 (added to each investment outcome below)
- $u(x)=\sqrt{x}/1{,}000$
| investment 1 | investment 2 |
| --- | --- |
| +\$295,000, 80%; +\$95,000, 20% | +\$595,000, 50%; +\$5,000, 50% |
| $EV(L_1)=80\%\times\$300{,}000+20\%\times\$100{,}000=\$260{,}000$ | $EV(L_2)=\$305{,}000$ |
| $E(U\ for\ L_1)=80\%\times0.55+20\%\times0.32=0.504$ | $E(U\ for\ L_2)=0.437$ |
| $CE(L_1)=u^{-1}(E(U\ for\ L_1))=(0.504\times1{,}000)^2=\$254{,}016$ | $CE(L_2)=\$190{,}969$ |
| $RP(L_1)=EV(L_1)-CE(L_1)=\$5{,}984>0$ | $RP(L_2)=\$114{,}031$ |
| risk-averse | risk-averse |
We are risk-averse.
- $CE(L_1)>CE(L_2)$

  We prefer investment 1.
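The figures in this example can be reproduced with a short sketch. It reads the \$5,000 item as initial wealth added to every investment outcome, which is an interpretation consistent with the \$300,000 and \$100,000 used in $EV(L_1)$. The example rounds the utilities to two decimals before combining them, so the unrounded values computed here differ slightly from the worked figures while leading to the same ranking.

```python
import math

initial_wealth = 5_000  # assumption: the $5,000 item is initial wealth


def u(x):
    """u(x) = sqrt(x) / 1,000."""
    return math.sqrt(x) / 1000


def u_inv(y):
    """Inverse of u, valid for y >= 0."""
    return (1000 * y) ** 2


# (probability, gain) pairs for each investment, as listed in the example.
investments = {
    "investment 1": [(0.8, 295_000), (0.2, 95_000)],
    "investment 2": [(0.5, 595_000), (0.5, 5_000)],
}

summary = {}
for name, gains in investments.items():
    lottery = [(p, initial_wealth + g) for p, g in gains]
    ev = sum(p * x for p, x in lottery)
    eu = sum(p * u(x) for p, x in lottery)
    ce = u_inv(eu)
    summary[name] = {"EV": ev, "EU": eu, "CE": ce, "RP": ev - ce}

for name, row in summary.items():
    print(name, {k: round(v, 4) for k, v in row.items()})
```

Both risk premiums are positive and investment 1 has the larger certainty equivalent, in line with the conclusion of the example.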
2.
- risk-neutral, so $RP(L)=0$, $EV(L)=CE(L)$, and $u(x)$ is linear.

  $$
  P(\text{the first head is obtained on the } n\text{th toss of the coin})=\frac{1}{2^n}
  $$

  $$
  \begin{aligned}
  p_i & =\frac{1}{2^i},\ r_i=2^i \\
  EV(L) & =\sum_{i=1}^np_ir_i=\frac{1}{2}\cdot 2+\frac{1}{2^2}\cdot 2^2+\dots+\frac{1}{2^n}\cdot 2^n=n \\
  CE(L) & =EV(L)=n \\
  n & \rightarrow\infty,\ CE(L)\rightarrow\infty \\
  \because\ & u(x)\ \text{is linear} \\
  \therefore\ & u(x)\rightarrow\infty,\ E(U\ for\ L)\rightarrow\infty
  \end{aligned}
  $$

  This is unreasonable.
- $u(x)=\log_2 (x)$

  $$
  \begin{aligned}
  u(r_i) & =\log_2(r_i)=\log_2(2^i)=i \\
  E(U\ for\ L) & =\sum_{i=1}^np_iu(r_i)=\sum_{i=1}^n\frac{1}{2^i}\,i=2-\frac{1}{2^{n-1}}-\frac{n}{2^n} \\
  CE(L) & =u^{-1}(E(U\ for\ L))=2^{2-\frac{1}{2^{n-1}}-\frac{n}{2^n}}
  \end{aligned}
  $$

  As $n\rightarrow\infty$, $E(U\ for\ L)\rightarrow 2$ and $CE(L)\rightarrow 2^2=4$, which is finite.
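The contrast between the linear and logarithmic utilities in this coin-toss example can be checked numerically. The sketch below follows the notes in truncating the game at $n$ tosses and summing over $i=1,\dots,n$; the function names are illustrative.

```python
import math


def truncated_lottery(n):
    """Rewards 2**i with probability 2**-i for i = 1..n, as in the sums above."""
    return [(2.0 ** -i, 2.0 ** i) for i in range(1, n + 1)]


def expected_value(lottery):
    """EV(L), which equals CE(L) for a risk-neutral decision maker."""
    return sum(p * r for p, r in lottery)


def ce_log2(lottery):
    """CE(L) under u(x) = log2(x): 2 raised to the expected utility."""
    eu = sum(p * math.log2(r) for p, r in lottery)
    return 2.0 ** eu


def closed_form_eu(n):
    """Closed form of the expected utility derived above."""
    return 2 - 1 / 2 ** (n - 1) - n / 2 ** n


for n in (5, 10, 20, 40):
    L = truncated_lottery(n)
    print(n, expected_value(L), round(ce_log2(L), 6), round(2.0 ** closed_form_eu(n), 6))
```

The expected value grows linearly with $n$, while the certainty equivalent under the logarithmic utility stays below 4 and approaches it as $n$ increases.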