A Simple Example of an Additive Model

Abstract: This article introduces the additive model as a tool to avoid the curse of dimensionality and to improve interpretability, since each component function can then be estimated at the optimal one-dimensional rate. The two main estimation approaches are backfitting and marginal integration, and identification requires each component function to have mean zero. Within a Hilbert space framework, the projection theorem yields the best additive approximation. In practice, estimation uses the backfitting algorithm, a simplified Gauss-Seidel procedure. A simulated example illustrates smoother performance in an additive model.

Additive Models

to avoid the curse of dimensionality and for better interpretability we assume
$$m(\boldsymbol{x}) = E(Y \mid \boldsymbol{X} = \boldsymbol{x}) = c + \sum_{j=1}^{d} g_j(x_j)$$
$\Longrightarrow$ the additive component functions $g_j$ can be estimated at the optimal one-dimensional rate

two possible methods for estimating an additive model:

  • backfitting estimator
  • marginal integration estimator

identification condition for both methods:

$$E_{X_j}\{ g_j(X_j) \} = 0 \quad \forall j = 1, \dots, d \quad \Longrightarrow \quad E(Y) = c$$
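As a quick illustration of this identification condition, here is a minimal sketch (my own addition, with illustrative variable names; it anticipates the simulation design used later in this post): each component is centered so its empirical mean is zero, and the removed means are absorbed into the constant $c$.

# minimal sketch: center each additive component so its (empirical)
# mean is zero; the removed means are absorbed into the constant c
set.seed(1)
x1 <- runif(200, -2.5, 2.5)
x2 <- runif(200, -2.5, 2.5)
raw1 <- -sin(2 * x1)            # uncentered components
raw2 <- x2^2
g1c <- raw1 - mean(raw1)        # centered: mean(g1c) is zero
g2c <- raw2 - mean(raw2)
c0 <- mean(raw1) + mean(raw2)   # constant absorbs the means
all.equal(raw1 + raw2, c0 + g1c + g2c)   # regression function unchanged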

formulation in a Hilbert space framework:

  • let $\mathcal{H}_{Y\boldsymbol{X}}$ be the Hilbert space of random variables which are functions of $Y, \boldsymbol{X}$
  • let $\langle U, V \rangle = E(UV)$ be the scalar product
  • define $\mathcal{H}_{\boldsymbol{X}}$ and $\mathcal{H}_{X_j}, j = 1, \dots, d$ as the corresponding subspaces

$\Longrightarrow$ we aim to find the element of $\mathcal{H}_{X_1} \oplus \cdots \oplus \mathcal{H}_{X_d}$ closest to $Y \in \mathcal{H}_{Y\boldsymbol{X}}$, or equivalently to $m \in \mathcal{H}_{\boldsymbol{X}}$
by the projection theorem, there exists a unique solution (taking $Y$ centered here, so that the constant $c$ can be dropped) with
$$E[\{ Y - m(\boldsymbol{X}) \} \mid X_{\alpha}] = 0 \iff g_{\alpha}(X_{\alpha}) = E\Big[\Big\{ Y - \sum_{j \neq \alpha} g_j(X_j) \Big\} \,\Big|\, X_{\alpha}\Big], \quad \alpha = 1, \dots, d$$
denote the projection $P_{\alpha}(\bullet) = E(\bullet \mid X_{\alpha})$
$$\Longrightarrow \begin{pmatrix} I & P_{1} & \cdots & P_{1} \\ P_{2} & I & \cdots & P_{2} \\ \vdots & & \ddots & \vdots \\ P_{d} & \cdots & P_{d} & I \end{pmatrix} \begin{pmatrix} g_{1}(X_{1}) \\ g_{2}(X_{2}) \\ \vdots \\ g_{d}(X_{d}) \end{pmatrix} = \begin{pmatrix} P_{1} Y \\ P_{2} Y \\ \vdots \\ P_{d} Y \end{pmatrix}$$
denote by $\mathbf{S}_{\alpha}$ the $(n \times n)$ smoother matrix such that $\mathbf{S}_{\alpha} \boldsymbol{Y}$ is an estimate of the vector $\{ E(Y_1 \mid X_{\alpha 1}), \dots, E(Y_n \mid X_{\alpha n}) \}^{\top}$
$$\Longrightarrow \underbrace{\begin{pmatrix} \mathbf{I} & \mathbf{S}_{1} & \cdots & \mathbf{S}_{1} \\ \mathbf{S}_{2} & \mathbf{I} & \cdots & \mathbf{S}_{2} \\ \vdots & & \ddots & \vdots \\ \mathbf{S}_{d} & \cdots & \mathbf{S}_{d} & \mathbf{I} \end{pmatrix}}_{nd \times nd} \begin{pmatrix} \boldsymbol{g}_{1} \\ \boldsymbol{g}_{2} \\ \vdots \\ \boldsymbol{g}_{d} \end{pmatrix} = \begin{pmatrix} \mathbf{S}_{1} \boldsymbol{Y} \\ \mathbf{S}_{2} \boldsymbol{Y} \\ \vdots \\ \mathbf{S}_{d} \boldsymbol{Y} \end{pmatrix}$$
note: in finite samples the matrix on the left-hand side can be singular
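To make the smoother matrix concrete, here is a minimal sketch (not from the original post; the Gaussian kernel, bandwidth, and variable names are illustrative choices). It builds a Nadaraya-Watson smoother matrix whose rows sum to one, so that $\mathbf{S} \boldsymbol{Y}$ estimates the conditional means at the design points:

nw_smoother = function(x, h = 0.5) {
  # kernel weights between every pair of design points
  K = outer(x, x, function(a, b) dnorm((a - b) / h))
  sweep(K, 1, rowSums(K), "/")   # normalize rows to sum to one
}

set.seed(1)
x_demo = runif(75, -2.5, 2.5)
Y_demo = -sin(2 * x_demo) + rnorm(75)
S = nw_smoother(x_demo)
fitted_demo = as.vector(S %*% Y_demo)   # estimates of E(Y_i | X_i)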

Backfitting algorithm

in practice, the following backfitting algorithm (a simplification of the Gauss-Seidel procedure) is used:


  • initialize $\hat{\boldsymbol{g}}_{\alpha}^{(0)} \equiv 0 \ \forall \alpha$, $\hat{c} = \bar{Y}$
  • repeat for $\ell = 0, 1, 2, \dots$ and $\alpha = 1, \dots, d$:
$$\begin{aligned} \boldsymbol{r}_{\alpha} &= \boldsymbol{Y} - \hat{c} - \sum_{j=1}^{\alpha-1} \hat{\boldsymbol{g}}_j^{(\ell+1)} - \sum_{j=\alpha+1}^{d} \hat{\boldsymbol{g}}_j^{(\ell)} \\ \hat{\boldsymbol{g}}_{\alpha}^{(\ell+1)} &= \mathbf{S}_{\alpha} \boldsymbol{r}_{\alpha} \end{aligned}$$
  • proceed until convergence is reached (a minimal R sketch of this loop follows below)
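The following is a self-contained sketch of the backfitting loop, not the code used for the figure in this post; `smooth.spline` stands in for a generic smoother $\mathbf{S}_{\alpha}$, and the tolerance, iteration cap, and recentering step are illustrative choices:

backfit = function(X, Y, tol = 1e-6, max_iter = 50) {
  n = nrow(X); d = ncol(X)
  g = matrix(0, n, d)   # g[, alpha] holds g_alpha evaluated at X[, alpha]
  c_hat = mean(Y)
  for (iter in 1:max_iter) {
    g_old = g
    for (alpha in 1:d) {
      # partial residual: remove the constant and all other components
      r = Y - c_hat - rowSums(g[, -alpha, drop = FALSE])
      fit = smooth.spline(X[, alpha], r)
      g[, alpha] = predict(fit, X[, alpha])$y
      g[, alpha] = g[, alpha] - mean(g[, alpha])  # enforce mean-zero components
    }
    if (max(abs(g - g_old)) < tol) break   # stop once the fits stabilize
  }
  list(c = c_hat, g = g)
}

Applied to data generated from the full model $Y = \sum_j g_j(X_j) + \varepsilon$ below, `backfit(X, Y)$g[, 1]` should track $g_1$ up to sampling noise.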

Example: smoother performance in additive models

simulated sample of $n = 75$ regression observations with regressors $X_j$ i.i.d.
uniform on $[-2.5, 2.5]$, generated from
$$Y = \sum_{j=1}^{4} g_j(X_j) + \varepsilon, \quad \varepsilon \sim N(0, 1)$$
where
$$\begin{array}{ll} g_1(X_1) = -\sin(2 X_1) & g_2(X_2) = X_2^2 - E(X_2^2) \\ g_3(X_3) = X_3 & g_4(X_4) = \exp(-X_4) - E\{\exp(-X_4)\} \end{array}$$
Plotting results in this example:
estimated (solid lines) versus true additive component functions (circles at the input values), local linear estimator with quartic kernel, bandwidth $h = 1.0$

Code:

library(ggplot2)   # plotting (cowplot is used below via ::)
set.seed(42)       # reproducibility; the seed value is an illustrative addition

n = 75
X = matrix(NA, n, 4)
for (i in 1:4) {
  X[, i] = runif(n, min = -2.5, max = 2.5)
}

### true component functions; empirical means stand in for E(X2^2) and E{exp(-X4)}
g1 = function(x) {
  return(-sin(2 * x))
}

g2 = function(x) {
  return(x ^ 2 - mean(x ^ 2))
}

g3 = function(x) {
  return(x)
}

g4 = function(x) {
  return(exp(-x) - mean(exp(-x)))
}
eps = rnorm(n)   # noise term

### indicator function: switches a component on (index = 1) or off (index = 0),
### so each response below contains exactly one active component
I = function(x, index) {
  if (index == 1) {
    return(x)
  }
  if (index == 0) {
    return(0)
  }
}

x <- seq(-2.5, 2.5, length.out = 100)   # evaluation grid

### response with only g1 active, then a local fit for g1
Y = I(g1(X[, 1]), 1) + I(g2(X[, 2]), 0) + I(g3(X[, 3]), 0) + I(g4(X[, 4]), 0) + eps
fit_g1 <- loess(
  Y ~ x,
  family = 'symmetric',
  degree = 2,
  span = 0.7,
  data = data.frame(x = X[, 1], Y = Y),
  surface = "direct"
)
out_g1 <- predict(fit_g1,
                  newdata = data.frame(x = x),   # column name must match the formula
                  se = TRUE)
low_g1 <- out_g1$fit - qnorm(0.975) * out_g1$se.fit    # 95% pointwise band
high_g1 <- out_g1$fit + qnorm(0.975) * out_g1$se.fit
df.low_g1 <- data.frame(x = x, y = low_g1)
df.high_g1 <- data.frame(x = x, y = high_g1)
P1 = ggplot(data = data.frame(X1 = X[, 1], g1 = Y),
            aes(X1, g1)) +
  geom_point() +
  geom_smooth(method = "loess", show.legend = TRUE) +
  geom_line(data = df.low_g1, aes(x, y), color = "red") +
  geom_line(data = df.high_g1, aes(x, y), color = "red")

### response with only g2 active; subtract the fitted g1 as in a backfitting step
Y = I(g1(X[, 1]), 0) + I(g2(X[, 2]), 1) + I(g3(X[, 3]), 0) + I(g4(X[, 4]), 0) + eps
fit_g2 <- loess(
  Y ~ x,
  family = 'symmetric',
  degree = 2,
  span = 0.9,
  data = data.frame(x = X[, 2], Y = (Y - fit_g1$fitted)),
  surface = "direct"
)
out_g2 <- predict(fit_g2,
                  newdata = data.frame(x = x),
                  se = TRUE)
low_g2 <- out_g2$fit - qnorm(0.975) * out_g2$se.fit
high_g2 <- out_g2$fit + qnorm(0.975) * out_g2$se.fit
df.low_g2 <- data.frame(x = x, y = low_g2)
df.high_g2 <- data.frame(x = x, y = high_g2)
P2 = ggplot(data = data.frame(X2 = X[, 2], g2 = (Y - fit_g1$fitted)),
            aes(X2, g2)) +
  geom_point() +
  geom_smooth(method = "loess", show.legend = TRUE) +
  geom_line(data = df.low_g2, aes(x, y), color = "red") +
  geom_line(data = df.high_g2, aes(x, y), color = "red")

### response with only g3 active; subtract the fitted g1 and g2
Y = I(g1(X[, 1]), 0) + I(g2(X[, 2]), 0) + I(g3(X[, 3]), 1) + I(g4(X[, 4]), 0) + eps
fit_g3 <- loess(
  Y ~ x,
  family = 'symmetric',
  degree = 2,
  span = 0.9,
  data = data.frame(x = X[, 3], Y = (Y - fit_g1$fitted - fit_g2$fitted)),
  surface = "direct"
)
out_g3 <- predict(fit_g3,
                  newdata = data.frame(x = x),
                  se = TRUE)
low_g3 <- out_g3$fit - qnorm(0.975) * out_g3$se.fit
high_g3 <- out_g3$fit + qnorm(0.975) * out_g3$se.fit
df.low_g3 <- data.frame(x = x, y = low_g3)
df.high_g3 <- data.frame(x = x, y = high_g3)
P3 = ggplot(data = data.frame(X3 = X[, 3], g3 = (Y - fit_g1$fitted - fit_g2$fitted)),
            aes(X3, g3)) +
  geom_point() +
  geom_smooth(method = "loess", show.legend = TRUE) +
  geom_line(data = df.low_g3, aes(x, y), color = "red") +
  geom_line(data = df.high_g3, aes(x, y), color = "red")

### response with only g4 active; subtract the fitted g1, g2 and g3
Y = I(g1(X[, 1]), 0) + I(g2(X[, 2]), 0) + I(g3(X[, 3]), 0) + I(g4(X[, 4]), 1) + eps
fit_g4 <- loess(
  Y ~ x,
  family = 'symmetric',
  degree = 2,
  span = 0.9,
  data = data.frame(x = X[, 4], Y = (Y - fit_g1$fitted - fit_g2$fitted - fit_g3$fitted)),
  surface = "direct"
)
out_g4 <- predict(fit_g4,
                  newdata = data.frame(x = x),
                  se = TRUE)
low_g4 <- out_g4$fit - qnorm(0.975) * out_g4$se.fit
high_g4 <- out_g4$fit + qnorm(0.975) * out_g4$se.fit
df.low_g4 <- data.frame(x = x, y = low_g4)
df.high_g4 <- data.frame(x = x, y = high_g4)
P4 = ggplot(data = data.frame(
  X4 = X[, 4],
  g4 = (Y - fit_g1$fitted - fit_g2$fitted - fit_g3$fitted)
),
aes(X4, g4)) +
  geom_point() +
  geom_smooth(method = "loess", show.legend = TRUE) +
  geom_line(data = df.low_g4, aes(x, y), color = "red") +
  geom_line(data = df.high_g4, aes(x, y), color = "red")

cowplot::plot_grid(P1, P2, P3, P4, align = "vh")
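For comparison, the whole additive model can also be fitted in a single call with the mgcv package, whose gam() uses penalized regression splines; this alternative is not part of the original example, and the default thin-plate smooths and the variable names Y_all, dat, fit_gam are illustrative choices:

library(mgcv)

### refit with all four components active at once
Y_all = g1(X[, 1]) + g2(X[, 2]) + g3(X[, 3]) + g4(X[, 4]) + eps
dat = data.frame(Y = Y_all, x1 = X[, 1], x2 = X[, 2], x3 = X[, 3], x4 = X[, 4])
fit_gam = gam(Y ~ s(x1) + s(x2) + s(x3) + s(x4), data = dat)
plot(fit_gam, pages = 1, se = TRUE)   # one panel per estimated component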

result:

[Figure: the four estimated component functions with pointwise confidence bands, as produced by the code above]

References

https://academic.uprm.edu/wrolke/esma6836/smooth.html

Hastie, T. J. and Tibshirani, R. J. (1990). Generalized Additive Models, Vol. 43 of Monographs on Statistics and Applied Probability, Chapman and Hall, London.

Opsomer, J. and Ruppert, D. (1997). Fitting a bivariate additive model by local polynomial regression, Annals of Statistics 25: 186-211.

Mammen, E., Linton, O. and Nielsen, J. P. (1999). The existence and asymptotic properties of a backfitting projection algorithm under weak conditions, Annals of Statistics 27: 1443-1490.
