[Updated 2021-07-02] One more example. GDAS contains the following formula:
$$
\begin{array}{r}
\min _{\alpha} \mathbb{E}_{\left(x^{\prime}, y^{\prime}\right) \sim \mathbb{D}_{V}}-\log \operatorname{Pr}\left(y^{\prime} \mid x^{\prime} ; \alpha, \omega_{\alpha}^{*}\right) \\
\text { s.t. } \omega_{\alpha}^{*}=\arg \min _{\omega} \mathbb{E}_{(x, y) \sim \mathbb{D}_{T}}-\log \operatorname{Pr}\left(y \mid x ; \alpha, \omega_{\alpha}\right)
\end{array}
$$
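
The formula is nested. For each candidate α, the inner `arg min` first finds the weights ω that minimize the training loss. The outer `min` then picks the α whose resulting weights give the lowest validation loss. The toy sketch below only illustrates this reading. It is not the GDAS algorithm: the two quadratic losses, the scalar `alpha` and `omega`, and the function names `train_loss`, `val_loss`, `solve_inner`, and `solve_outer` are all hypothetical stand-ins for the negative log-likelihoods on D_T and D_V.

```python
def train_loss(alpha, omega):
    # Hypothetical stand-in for the training loss on D_T.
    return (omega - 2.0 * alpha) ** 2


def val_loss(alpha, omega):
    # Hypothetical stand-in for the validation loss on D_V.
    return (omega - 1.0) ** 2 + 0.1 * alpha ** 2


def solve_inner(alpha, steps=200, lr=0.1):
    # Inner problem: omega* = arg min over omega of train_loss(alpha, omega),
    # approximated with plain gradient descent.
    omega = 0.0
    for _ in range(steps):
        grad = 2.0 * (omega - 2.0 * alpha)  # d train_loss / d omega
        omega -= lr * grad
    return omega


def solve_outer(candidates):
    # Outer problem: choose the alpha whose inner solution omega*
    # gives the smallest validation loss.
    best_alpha, best_omega, best_val = None, None, float("inf")
    for alpha in candidates:
        omega_star = solve_inner(alpha)
        value = val_loss(alpha, omega_star)
        if value < best_val:
            best_alpha, best_omega, best_val = alpha, omega_star, value
    return best_alpha, best_omega, best_val


# Assumed search grid for alpha: 0.00, 0.01, ..., 1.00.
candidates = [i / 100 for i in range(101)]
best_alpha, best_omega, best_val = solve_outer(candidates)
print(best_alpha, best_omega, best_val)
```

Re-solving the inner problem for every α, as this sketch does, is only feasible on a toy problem. For a real network it is far too expensive, so practical methods approximate it in some way.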