Understanding the Method of Lagrange Multipliers
Motivating problem: shortest distance to the origin
Suppose we are given the equation
$$x^2 y = 3$$
Its graph is shown below:
We now want to find the shortest distance from a point on this curve to the origin:
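Here is one way to approach the problem. First, every point at distance $\alpha$ from the origin lies on the circle of radius $\alpha$:
Now let the radius of the circle grow gradually:
Clearly, the point where the circle first meets $x^2 y = 3$ is the point on the curve closest to the origin:
At that moment the circle and the curve are tangent, that is, they share the same tangent line at that point:
So far, the analysis shows that at the extremum the circle and the curve are tangent!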
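The expanding-circle picture can be imitated numerically. The sketch below is only an illustration, not part of the original argument: it samples the right-hand branch of the curve (the sampling range and step size are arbitrary assumptions) and picks the sample with the smallest distance to the origin, which is the radius at which a growing circle would first touch the curve.

```python
import math

def distance_to_origin(x: float) -> float:
    """Distance from the origin to the point (x, 3 / x**2) on the curve x^2 * y = 3."""
    y = 3.0 / x**2
    return math.hypot(x, y)

# Sample the branch with x > 0; the branch with x < 0 is its mirror image.
# The range [0.5, 3.0] and the step 0.0001 are assumptions made for this sketch.
xs = [0.5 + i * 0.0001 for i in range(25001)]

best_x = min(xs, key=distance_to_origin)   # sample closest to the origin
best_y = 3.0 / best_x**2
best_r = distance_to_origin(best_x)        # radius of the first circle to touch the curve

print(f"nearest point ~ ({best_x:.3f}, {best_y:.3f}), radius ~ {best_r:.3f}")
```

A brute-force scan like this only approximates the answer to within the grid spacing; its purpose is to give a reference value against which the tangency condition can later be compared.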
Extending the problem: contour lines
To go further, we need the notion of contour lines. These concentric circles:
The figure above can be read as the contour lines of $f(x,y) = x^2 + y^2$.
By a basic property of the gradient, the gradient vector
$$\nabla f = \begin{pmatrix} \frac{\partial f}{\partial x} \\ \frac{\partial f}{\partial y} \end{pmatrix} = \begin{pmatrix} 2x \\ 2y \end{pmatrix}$$
is normal to the contour lines:
Another function, $g(x,y) = x^2 y$, has the following contour lines:
Among them, $x^2 y = 3$ is the contour line at level 3:
Therefore the gradient vector
$$\nabla g = \begin{pmatrix} \frac{\partial g}{\partial x} \\ \frac{\partial g}{\partial y} \end{pmatrix} = \begin{pmatrix} 2xy \\ x^2 \end{pmatrix}$$
is perpendicular to the curve $x^2 y = 3$:
Saying that the gradient is normal to a contour line means, more precisely, that the gradient is perpendicular to the tangent of the contour line!
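The perpendicularity claim can be spot-checked numerically. In the sketch below, the sample radius, angle, and parameter value are arbitrary assumptions. The circle is parametrized as $(r\cos\theta, r\sin\theta)$ and the curve as $(t, 3/t^2)$, so their tangent directions are $(-\sin\theta, \cos\theta)$ and $(1, -6/t^3)$; each dot product with the corresponding gradient should vanish up to rounding.

```python
import math

def grad_f(x, y):
    """Gradient of f(x, y) = x^2 + y^2."""
    return (2 * x, 2 * y)

def grad_g(x, y):
    """Gradient of g(x, y) = x^2 * y."""
    return (2 * x * y, x**2)

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

# Contour line of f: circle of radius r, parametrized by the angle theta.
r, theta = 2.0, 0.7                      # arbitrary sample point (assumption)
px, py = r * math.cos(theta), r * math.sin(theta)
circle_tangent = (-math.sin(theta), math.cos(theta))
circle_dot = dot(grad_f(px, py), circle_tangent)

# Contour line of g at level 3: the curve (t, 3 / t^2), with tangent (1, -6 / t^3).
t = 1.3                                  # arbitrary sample point (assumption)
cx, cy = t, 3.0 / t**2
curve_tangent = (1.0, -6.0 / t**3)
curve_dot = dot(grad_g(cx, cy), curve_tangent)

print(circle_dot, curve_dot)  # both should be zero up to floating-point rounding
```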
The method of Lagrange multipliers
Solving
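From the two observations above:
$$\left\{ \begin{aligned} &\text{at the extremum, the circle and the curve are tangent} \\ &\text{the gradient is perpendicular to the tangent of the contour line} \end{aligned} \right.$$
Putting them together: at the point of tangency the circle and the curve share one tangent line, and both $\nabla f$ (normal to the circle) and $\nabla g$ (normal to the curve) are perpendicular to it, so the two gradient vectors are parallel:
In symbols, parallel gradients are written as:
$$\nabla f = \lambda \nabla g$$ where $\lambda$ is a scalar chosen so that the two sides are equal.
The condition $x^2 y = 3$ must also be included; otherwise, with so many contour lines, there is no telling which one is meant: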
So we solve the combined system:
$$\left\{ \begin{aligned} &\nabla f = \lambda \nabla g \\ &x^2 y = 3 \end{aligned} \right.$$
Expanding:
$$\left\{ \begin{aligned} &\begin{pmatrix} 2x \\ 2y \end{pmatrix} = \lambda \begin{pmatrix} 2xy \\ x^2 \end{pmatrix} \\ &x^2 y = 3 \end{aligned} \right.$$
$$\left\{ \begin{aligned} &2x = \lambda \, 2xy \\ &2y = \lambda x^2 \\ &x^2 y = 3 \end{aligned} \right.$$
This is a system of three equations in the three unknowns $x$, $y$, $\lambda$; solving it gives:
$$\left\{ \begin{aligned} &x \approx \pm 1.62 \\ &y \approx 1.14 \\ &\lambda \approx 0.87 \end{aligned} \right.$$
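The text states the solution without showing the steps, so here is one way to reach it by hand. It relies on $x \neq 0$, which the constraint forces because $0 \cdot y = 3$ is impossible. The last line, the resulting minimum distance, is an addition not stated in the original.

```latex
\begin{aligned}
2x = 2\lambda x y,\ x \neq 0 \ &\Rightarrow\ \lambda y = 1 \\
2y = \lambda x^2 = \frac{x^2}{y} \ &\Rightarrow\ x^2 = 2y^2 \\
x^2 y = 3 \ &\Rightarrow\ 2y^3 = 3 \ \Rightarrow\ y = \sqrt[3]{3/2} \approx 1.14 \\
x = \pm\sqrt{2}\,y \approx \pm 1.62, &\quad \lambda = \frac{1}{y} \approx 0.87 \\
\sqrt{x^2 + y^2} = \sqrt{3}\,y &\approx 1.98
\end{aligned}
```

The two signs of $x$ correspond to the two mirror-image branches of the curve, which are equally close to the origin.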
Definition
The problem of finding an extremum of a function $f$ subject to a constraint on $g$ can be stated as:
$$\left\{ \begin{aligned} &\min f \\ &\text{s.t.} \ \ g = 0 \end{aligned} \right.$$
Here s.t. stands for "subject to", meaning "constrained by".
It can be solved by setting up the system:
$$\left\{ \begin{aligned} &\nabla f = \lambda \nabla g \\ &g = 0 \end{aligned} \right.$$
Translating the earlier example into this form, we are given:
$$\left\{ \begin{aligned} &f(x,y) = x^2 + y^2 \\ &g(x,y) = x^2 y - 3 \end{aligned} \right.$$
and want to find
$$\left\{ \begin{aligned} &\min f(x,y) \\ &\text{s.t.} \ \ g(x,y) = 0 \end{aligned} \right.$$
which we solve through the combined system:
$$\left\{ \begin{aligned} &\nabla f(x,y) = \lambda \nabla g(x,y) \\ &g(x,y) = 0 \end{aligned} \right.$$
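When a system like this is awkward to solve by hand, it can be handed to a numerical root finder. The following is a minimal sketch using Newton's method in pure Python; the article does not prescribe any particular solver, and the starting guess `(1.5, 1.0, 1.0)`, the helper names, and the tolerance are all assumptions made for illustration.

```python
def residual(x, y, lam):
    """Components of (grad f - lam * grad g, g) for f = x^2 + y^2, g = x^2*y - 3."""
    return [
        2 * x - lam * 2 * x * y,   # df/dx - lam * dg/dx
        2 * y - lam * x**2,        # df/dy - lam * dg/dy
        x**2 * y - 3,              # constraint g(x, y) = 0
    ]

def jacobian(x, y, lam):
    """Partial derivatives of each residual component w.r.t. (x, y, lam)."""
    return [
        [2 - 2 * lam * y, -2 * lam * x, -2 * x * y],
        [-2 * lam * x, 2.0, -x**2],
        [2 * x * y, x**2, 0.0],
    ]

def solve3(a, b):
    """Solve the 3x3 linear system a * d = b by Gaussian elimination with partial pivoting."""
    n = 3
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    d = [0.0] * n
    for r in range(n - 1, -1, -1):                   # back substitution
        s = sum(m[r][c] * d[c] for c in range(r + 1, n))
        d[r] = (m[r][n] - s) / m[r][r]
    return d

def newton(x, y, lam, steps=50, tol=1e-12):
    """Newton iteration on the residual; stops early once every component is below tol."""
    for _ in range(steps):
        res = residual(x, y, lam)
        if max(abs(v) for v in res) < tol:
            break
        dx, dy, dlam = solve3(jacobian(x, y, lam), [-v for v in res])
        x, y, lam = x + dx, y + dy, lam + dlam
    return x, y, lam

x, y, lam = newton(1.5, 1.0, 1.0)   # starting guess is an assumption
print(x, y, lam)
```

Newton's method only finds a root near its starting point, so a guess with negative `x` would be needed to land on the mirror-image solution.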
Variant
A variant of this definition is also common. Given the problem:
$$\left\{ \begin{aligned} &\min f \\ &\text{s.t.} \ \ g = 0 \end{aligned} \right.$$
define:
$$F = f + \lambda g$$
Solving the following system then yields the answer:
$$\begin{pmatrix} \frac{\partial F}{\partial x} \\ \frac{\partial F}{\partial y} \\ \frac{\partial F}{\partial \lambda} \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}$$
Working out the partial derivatives on the left gives the same system as in the definition above, up to the sign of $\lambda$:
$$\begin{pmatrix} \frac{\partial F}{\partial x} \\ \frac{\partial F}{\partial y} \\ \frac{\partial F}{\partial \lambda} \end{pmatrix} = \begin{pmatrix} 2x + \lambda \, 2xy \\ 2y + \lambda x^2 \\ x^2 y - 3 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}$$
The solution is:
$$\left\{ \begin{aligned} &x \approx \pm 1.62 \\ &y \approx 1.14 \\ &\lambda \approx -0.87 \end{aligned} \right.$$
This differs slightly from the earlier result: setting the partial derivatives of $F = f + \lambda g$ to zero gives $\nabla f = -\lambda \nabla g$, so $\lambda$ changes sign. Since $\lambda$ is only an auxiliary variable, its sign does not matter; to recover the earlier sign, simply redefine $g(x,y)$ as follows:
$$\left\{ \begin{aligned} &f(x,y) = x^2 + y^2 \\ &g(x,y) = 3 - x^2 y \end{aligned} \right.$$
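The sign convention can be checked numerically. The sketch below assumes the closed-form values $y = \sqrt[3]{3/2}$, $x = \sqrt{2}\,y$, $|\lambda| = 1/y$, which follow from the system but are not spelled out in the text. It approximates the partial derivatives of $F = f + \lambda g$, with $g = x^2 y - 3$, by central differences; the step size `h` is an arbitrary choice.

```python
def F(x, y, lam):
    """Lagrangian F = f + lam * g with f = x^2 + y^2 and g = x^2*y - 3."""
    return (x**2 + y**2) + lam * (x**2 * y - 3)

def partials(x, y, lam, h=1e-6):
    """Central-difference approximations of dF/dx, dF/dy, dF/dlam."""
    return (
        (F(x + h, y, lam) - F(x - h, y, lam)) / (2 * h),
        (F(x, y + h, lam) - F(x, y - h, lam)) / (2 * h),
        (F(x, y, lam + h) - F(x, y, lam - h)) / (2 * h),
    )

# Closed-form candidate point (assumed, derived from the system of equations).
y0 = 1.5 ** (1 / 3)
x0 = 2**0.5 * y0

negative = partials(x0, y0, -1 / y0)   # lam ~ -0.87: all three partials should vanish
positive = partials(x0, y0, 1 / y0)    # lam ~ +0.87: dF/dx and dF/dy do not vanish

print(negative)
print(positive)
```

With $g$ redefined as $3 - x^2 y$, the roles swap and the positive value of $\lambda$ becomes the stationary one; the point $(x, y)$ itself is unchanged.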