EXAMPLE1
[
[0, 2, -1, 1],
[-2, 0, 3, 1],
[1, -3, 0, 1],
]
双人零和对称博弈(
M
=
−
M
T
M = -M^T
M=−MT)的所有纳什均衡都是对称纳什均衡.
max
p
min
q
p
M
q
T
=
min
q
max
p
p
M
q
T
=
p
M
p
T
=
−
p
M
p
T
=
0
\max\limits_p \min\limits_q pMq^T = \min\limits_q \max\limits_p pMq^T = pMp^T = -pMp^T = 0
pmaxqminpMqT=qminpmaxpMqT=pMpT=−pMpT=0
-
max min 问题如下:
max p min q p M q T ⟺ max p min { p M } ⟺ { max p , v v p M ≽ v 1 p ≽ 0 p 1 T = 1 \begin{aligned} & \max\limits_p \min\limits_q pMq^T \\ \iff& \max\limits_p \min\{pM\} \\ \iff& \begin{cases} \max\limits_{p,v} v \\ pM \succcurlyeq v1 \\ p \succcurlyeq 0 \\ p1^T = 1 \\ \end{cases} \\ \end{aligned} ⟺⟺pmaxqminpMqTpmaxmin{pM}⎩ ⎨ ⎧p,vmaxvpM≽v1p≽0p1T=1
⟺ { min p , v − v 2 p 1 − p 2 + v ⩽ 0 − 2 p 0 + 3 p 2 + v ⩽ 0 p 0 − 3 p 1 + v ⩽ 0 − p 0 ⩽ 0 − p 1 ⩽ 0 − p 2 ⩽ 0 p 0 + p 1 + p 2 = 1 \iff \begin{cases} & \min\limits_{p,v} -v \\ & \begin{matrix} & 2p_1 & -p_2 & +v &\leqslant &0 \\ -2p_0 & & +3p_2 & +v &\leqslant &0 \\ p_0 & -3p_1 & & +v &\leqslant &0 \\ -p_0 & & & &\leqslant &0 \\ & -p_1 & & &\leqslant &0 \\ & & -p_2 & &\leqslant &0 \\ p_0 & +p_1 & +p_2 & &= &1 \\ \end{matrix} \\ \end{cases} ⟺⎩ ⎨ ⎧p,vmin−v−2p0p0−p0p02p1−3p1−p1+p1−p2+3p2−p2+p2+v+v+v⩽⩽⩽⩽⩽⩽=0000001
-
max min 问题解得 p ∗ = [ 1 2 , 1 6 , 1 3 ] p^*=[\frac{1}{2}, \frac{1}{6}, \frac{1}{3}] p∗=[21,61,31], v ∗ = 0 v^*=0 v∗=0.
-
min max 问题如下:
min q max p q M T p T ⟺ min q max { q M T } ⟺ { min q , u u q M T ≼ u 1 q ≽ 0 q 1 T = 1 \begin{aligned} & \min\limits_q \max\limits_p qM^Tp^T \\ \iff& \min\limits_q \max\{qM^T\} \\ \iff& \begin{cases} \min\limits_{q, u} u \\ qM^T \preccurlyeq u1 \\ q \succcurlyeq 0 \\ q1^T = 1 \\ \end{cases} \\ \end{aligned} ⟺⟺qminpmaxqMTpTqminmax{qMT}⎩ ⎨ ⎧q,uminuqMT≼u1q≽0q1T=1
⟺ { min q , u u 2 q 1 − q 2 − u ⩽ 0 − 2 q 0 + 3 q 2 − u ⩽ 0 q 0 − 3 q 1 − u ⩽ 0 − q 0 ⩽ 0 − q 1 ⩽ 0 − q 2 ⩽ 0 q 0 + q 1 + q 2 = 1 \iff \begin{cases} & \min\limits_{q,u} u \\ & \begin{matrix} & 2q_1 & -q_2 & -u &\leqslant &0 \\ -2q_0 & & +3q_2 & -u &\leqslant &0 \\ q_0 & -3q_1 & & -u &\leqslant &0 \\ -q_0 & & & &\leqslant &0 \\ & -q_1 & & &\leqslant &0 \\ & & -q_2 & &\leqslant &0 \\ q_0 & +q_1 & +q_2 & &= &1 \\ \end{matrix} \\ \end{cases} ⟺⎩ ⎨ ⎧q,uminu−2q0q0−q0q02q1−3q1−q1+q1−q2+3q2−q2+q2−u−u−u⩽⩽⩽⩽⩽⩽=0000001
-
min max 问题解得 q ∗ = [ 1 2 , 1 6 , 1 3 ] q^*=[\frac{1}{2}, \frac{1}{6}, \frac{1}{3}] q∗=[21,61,31], u ∗ = 0 u^*=0 u∗=0.
-
结论:
计算机求解结果满足 v ∗ = u ∗ v^*=u^* v∗=u∗, 因此 ( p ∗ , q ∗ ) (p^*,q^*) (p∗,q∗) 是一个混合策略纳什均衡.
计算机求解结果满足 p ∗ = q ∗ p^*=q^* p∗=q∗, v ∗ = u ∗ = 0 v^*=u^*=0 v∗=u∗=0, 与理论分析一致.
具体代码如下:
# https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.linprog.html
import numpy
import scipy
problem_p_v = {
'c': numpy.array([[0, 0, 0, -1]]),
'A_ub': numpy.array([
[0, 2, -1, 1],
[-2, 0, 3, 1],
[1, -3, 0, 1],
[-1, 0, 0, 0],
[0, -1, 0, 0],
[0, 0, -1, 0],
]),
'b_ub': numpy.array([
[0],
[0],
[0],
[0],
[0],
[0],
]),
'A_eq': numpy.array([[1, 1, 1, 0]]),
'b_eq': numpy.array([[1]]),
}
result_p_v = scipy.optimize.linprog(**problem_p_v, bounds=(None, None))
print(result_p_v)
problem_q_u = {
'c': numpy.array([[0, 0, 0, 1]]),
'A_ub': numpy.array([
[0, 2, -1, -1],
[-2, 0, 3, -1],
[1, -3, 0, -1],
[-1, 0, 0, 0],
[0, -1, 0, 0],
[0, 0, -1, 0],
]),
'b_ub': numpy.array([
[0],
[0],
[0],
[0],
[0],
[0],
]),
'A_eq': numpy.array([[1, 1, 1, 0]]),
'b_eq': numpy.array([[1]]),
}
result_q_u = scipy.optimize.linprog(**problem_q_u, bounds=(None, None))
print(result_q_u)
EXAMPLE2
[
[1, -2, 6, -4],
[2, -7, 2, 4],
[-3, 4, -4, -3],
[-8, 3, -2, 3],
]
-
max min 问题如下:
max p min q p M q T ⟺ max p min { p M } ⟺ { max p , v v p M ≽ v 1 p ≽ 0 p 1 T = 1 ⟺ { min p , v − v [ 1 − M T ⋮ 1 − 1 0 ⋱ ⋮ − 1 0 ] [ p T v ] ≼ 0 [ 1 ⋯ 1 0 ] [ p T v ] = [ 1 ] \begin{aligned} & \max\limits_p \min\limits_q pMq^T \\ \iff& \max\limits_p \min\{pM\} \\ \iff& \begin{cases} \max\limits_{p,v} v \\ pM \succcurlyeq v1 \\ p \succcurlyeq 0 \\ p1^T = 1 \\ \end{cases} \\ \iff& \begin{cases} \min\limits_{p,v} -v \\ \begin{bmatrix} & & &1 \\ &-M^T & &\vdots \\ & & &1 \\ -1 & & &0 \\ &\ddots & &\vdots \\ & &-1 &0 \\ \end{bmatrix} \begin{bmatrix} \\ p^T \\ \\ v \end{bmatrix} \preccurlyeq 0 \\ \begin{bmatrix} 1 &\cdots &1 &0 \end{bmatrix} \begin{bmatrix} \\ p^T \\ \\ v \end{bmatrix} = [1] \\ \end{cases} \\ \end{aligned} ⟺⟺⟺pmaxqminpMqTpmaxmin{pM}⎩ ⎨ ⎧p,vmaxvpM≽v1p≽0p1T=1⎩ ⎨ ⎧p,vmin−v −1−MT⋱−11⋮10⋮0 pTv ≼0[1⋯10] pTv =[1]
-
max min 问题解得 p ∗ = [ 0.14 , 0.35 , 0.51 , 0 ] p^*=[0.14,0.35,0.51,0] p∗=[0.14,0.35,0.51,0], v ∗ = − 0.69 v^*=-0.69 v∗=−0.69.
-
min max 问题如下:
min q max p q M T p T ⟺ min q max { q M T } ⟺ { min q , u u q M T ≼ u 1 q ≽ 0 q 1 T = 1 ⟺ { min q , u u [ − 1 M ⋮ − 1 − 1 0 ⋱ ⋮ − 1 0 ] [ q T u ] ≼ 0 [ 1 ⋯ 1 0 ] [ q T u ] = [ 1 ] \begin{aligned} & \min\limits_q \max\limits_p qM^Tp^T \\ \iff& \min\limits_q \max\{qM^T\} \\ \iff& \begin{cases} \min\limits_{q, u} u \\ qM^T \preccurlyeq u1 \\ q \succcurlyeq 0 \\ q1^T = 1 \\ \end{cases} \\ \iff& \begin{cases} \min\limits_{q,u} u \\ \begin{bmatrix} & & &-1 \\ &M & &\vdots \\ & & &-1 \\ -1 & & &0 \\ &\ddots & &\vdots \\ & &-1 &0 \\ \end{bmatrix} \begin{bmatrix} \\ q^T \\ \\ u \end{bmatrix} \preccurlyeq 0 \\ \begin{bmatrix} 1 &\cdots &1 &0 \end{bmatrix} \begin{bmatrix} \\ q^T \\ \\ u \end{bmatrix} = [1] \\ \end{cases} \\ \end{aligned} ⟺⟺⟺qminpmaxqMTpTqminmax{qMT}⎩ ⎨ ⎧q,uminuqMT≼u1q≽0q1T=1⎩ ⎨ ⎧q,uminu −1M⋱−1−1⋮−10⋮0 qTu ≼0[1⋯10] qTu =[1]
-
min max 问题解得 q ∗ = [ 0.53 , 0.33 , 0 , 0.14 ] q^*=[0.53,0.33,0,0.14] q∗=[0.53,0.33,0,0.14], u ∗ = − 0.69 u^*=-0.69 u∗=−0.69.
-
结论:
计算机求解结果满足 v ∗ = u ∗ v^*=u^* v∗=u∗, 因此 ( p ∗ , q ∗ ) (p^*,q^*) (p∗,q∗) 是一个混合策略纳什均衡.
具体代码如下:
# https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.linprog.html
import numpy
import scipy
M = numpy.array([
[1, -2, 6, -4],
[2, -7, 2, 4],
[-3, 4, -4, -3],
[-8, 3, -2, 3],
])
problem_p_v = {
'c': numpy.concatenate((numpy.zeros((1, 4)), numpy.array([[-1]])), axis=1),
'A_ub': numpy.concatenate(
(
numpy.concatenate((-M.T, numpy.ones((4, 1))), axis=1),
numpy.concatenate((-numpy.identity(4), numpy.zeros((4, 1))), axis=1),
),
axis=0,
),
'b_ub': numpy.zeros((8, 1)),
'A_eq': numpy.concatenate((numpy.ones((1, 4)), numpy.array([[0]])), axis=1),
'b_eq': numpy.array([[1]]),
}
result_p_v = scipy.optimize.linprog(**problem_p_v, bounds=(None, None))
print(result_p_v)
problem_q_u = {
'c': numpy.concatenate((numpy.zeros((1, 4)), numpy.array([[1]])), axis=1),
'A_ub': numpy.concatenate(
(
numpy.concatenate((M, -numpy.ones((4, 1))), axis=1),
numpy.concatenate((-numpy.identity(4), numpy.zeros((4, 1))), axis=1),
),
axis=0,
),
'b_ub': numpy.zeros((8, 1)),
'A_eq': numpy.concatenate((numpy.ones((1, 4)), numpy.array([[0]])), axis=1),
'b_eq': numpy.array([[1]]),
}
result_q_u = scipy.optimize.linprog(**problem_q_u, bounds=(None, None))
print(result_q_u)