A Simple Example of an Additive Model

Abstract: This article introduces the additive model as a tool to avoid the curse of dimensionality and to improve interpretability, since each component function can then be estimated at the optimal one-dimensional rate. The two main estimation approaches are backfitting and marginal integration, and identification requires each component function to have mean zero. Within a Hilbert space framework, the projection theorem yields the best additive approximation. In practice, estimation uses the backfitting algorithm, a simplified Gauss-Seidel procedure. A simulated example illustrates smoother performance in an additive model.

Additive Models

to avoid the curse of dimensionality and for better interpretability we assume
$$m(\boldsymbol{x}) = E(Y \mid \boldsymbol{X} = \boldsymbol{x}) = c + \sum_{j=1}^{d} g_j(x_j)$$
$\Longrightarrow$ the additive component functions $g_j$ can be estimated at the optimal one-dimensional rate

two possible methods for estimating an additive model:

  • backfitting estimator
  • marginal integration estimator

identification condition for both methods:

$$E_{X_j}\{ g_j(X_j) \} = 0 \quad \forall j = 1, \dots, d \quad \Longrightarrow \quad E(Y) = c$$
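As a quick illustration of this identification condition, here is a minimal sketch (my own addition, with illustrative variable names; it anticipates the simulation design used later in this post): each component is centered so its empirical mean is zero, and the removed means are absorbed into the constant $c$.

# minimal sketch: center each additive component so its (empirical)
# mean is zero; the removed means are absorbed into the constant c
set.seed(1)
x1 <- runif(200, -2.5, 2.5)
x2 <- runif(200, -2.5, 2.5)
raw1 <- -sin(2 * x1)            # uncentered components
raw2 <- x2^2
g1c <- raw1 - mean(raw1)        # centered: mean(g1c) is zero
g2c <- raw2 - mean(raw2)
c0 <- mean(raw1) + mean(raw2)   # constant absorbs the means
all.equal(raw1 + raw2, c0 + g1c + g2c)   # regression function unchanged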

formulation in a Hilbert space framework:

  • let $\mathcal{H}_{Y\boldsymbol{X}}$ be the Hilbert space of random variables which are functions of $Y, \boldsymbol{X}$
  • let $\langle U, V \rangle = E(UV)$ be the scalar product
  • define $\mathcal{H}_{\boldsymbol{X}}$ and $\mathcal{H}_{X_j}, j = 1, \dots, d$ as the corresponding subspaces

$\Longrightarrow$ we aim to find the element of $\mathcal{H}_{X_1} \oplus \cdots \oplus \mathcal{H}_{X_d}$ closest to $Y \in \mathcal{H}_{Y\boldsymbol{X}}$, or equivalently to $m \in \mathcal{H}_{\boldsymbol{X}}$
by the projection theorem, there exists a unique solution (taking $Y$ centered here, so that the constant $c$ can be dropped) with
$$E[\{ Y - m(\boldsymbol{X}) \} \mid X_{\alpha}] = 0 \iff g_{\alpha}(X_{\alpha}) = E\Big[\Big\{ Y - \sum_{j \neq \alpha} g_j(X_j) \Big\} \,\Big|\, X_{\alpha}\Big], \quad \alpha = 1, \dots, d$$
denote the projection $P_{\alpha}(\bullet) = E(\bullet \mid X_{\alpha})$
$$\Longrightarrow \begin{pmatrix} I & P_{1} & \cdots & P_{1} \\ P_{2} & I & \cdots & P_{2} \\ \vdots & & \ddots & \vdots \\ P_{d} & \cdots & P_{d} & I \end{pmatrix} \begin{pmatrix} g_{1}(X_{1}) \\ g_{2}(X_{2}) \\ \vdots \\ g_{d}(X_{d}) \end{pmatrix} = \begin{pmatrix} P_{1} Y \\ P_{2} Y \\ \vdots \\ P_{d} Y \end{pmatrix}$$
denote by $\mathbf{S}_{\alpha}$ the $(n \times n)$ smoother matrix such that $\mathbf{S}_{\alpha} \boldsymbol{Y}$ is an estimate of the vector $\{ E(Y_1 \mid X_{\alpha 1}), \dots, E(Y_n \mid X_{\alpha n}) \}^{\top}$
$$\Longrightarrow \underbrace{\begin{pmatrix} \mathbf{I} & \mathbf{S}_{1} & \cdots & \mathbf{S}_{1} \\ \mathbf{S}_{2} & \mathbf{I} & \cdots & \mathbf{S}_{2} \\ \vdots & & \ddots & \vdots \\ \mathbf{S}_{d} & \cdots & \mathbf{S}_{d} & \mathbf{I} \end{pmatrix}}_{nd \times nd} \begin{pmatrix} \boldsymbol{g}_{1} \\ \boldsymbol{g}_{2} \\ \vdots \\ \boldsymbol{g}_{d} \end{pmatrix} = \begin{pmatrix} \mathbf{S}_{1} \boldsymbol{Y} \\ \mathbf{S}_{2} \boldsymbol{Y} \\ \vdots \\ \mathbf{S}_{d} \boldsymbol{Y} \end{pmatrix}$$
note: in finite samples the matrix on the left-hand side can be singular
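To make the smoother matrix concrete, here is a minimal sketch (not from the original post; the Gaussian kernel, bandwidth, and variable names are illustrative choices). It builds a Nadaraya-Watson smoother matrix whose rows sum to one, so that $\mathbf{S} \boldsymbol{Y}$ estimates the conditional means at the design points:

nw_smoother = function(x, h = 0.5) {
  # kernel weights between every pair of design points
  K = outer(x, x, function(a, b) dnorm((a - b) / h))
  sweep(K, 1, rowSums(K), "/")   # normalize rows to sum to one
}

set.seed(1)
x_demo = runif(75, -2.5, 2.5)
Y_demo = -sin(2 * x_demo) + rnorm(75)
S = nw_smoother(x_demo)
fitted_demo = as.vector(S %*% Y_demo)   # estimates of E(Y_i | X_i)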

Backfitting algorithm

in practice, the following backfitting algorithm (a simplification of the Gauss-Seidel procedure) is used:


  • initialize $\hat{\boldsymbol{g}}_{\alpha}^{(0)} \equiv 0 \ \forall \alpha$, $\hat{c} = \bar{Y}$
  • repeat for $\ell = 0, 1, 2, \dots$ and $\alpha = 1, \dots, d$:
$$\begin{aligned} \boldsymbol{r}_{\alpha} &= \boldsymbol{Y} - \hat{c} - \sum_{j=1}^{\alpha-1} \hat{\boldsymbol{g}}_j^{(\ell+1)} - \sum_{j=\alpha+1}^{d} \hat{\boldsymbol{g}}_j^{(\ell)} \\ \hat{\boldsymbol{g}}_{\alpha}^{(\ell+1)} &= \mathbf{S}_{\alpha} \boldsymbol{r}_{\alpha} \end{aligned}$$
  • proceed until convergence is reached (a minimal R sketch of this loop follows below)
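The following is a self-contained sketch of the backfitting loop, not the code used for the figure in this post; `smooth.spline` stands in for a generic smoother $\mathbf{S}_{\alpha}$, and the tolerance, iteration cap, and recentering step are illustrative choices:

backfit = function(X, Y, tol = 1e-6, max_iter = 50) {
  n = nrow(X); d = ncol(X)
  g = matrix(0, n, d)   # g[, alpha] holds g_alpha evaluated at X[, alpha]
  c_hat = mean(Y)
  for (iter in 1:max_iter) {
    g_old = g
    for (alpha in 1:d) {
      # partial residual: remove the constant and all other components
      r = Y - c_hat - rowSums(g[, -alpha, drop = FALSE])
      fit = smooth.spline(X[, alpha], r)
      g[, alpha] = predict(fit, X[, alpha])$y
      g[, alpha] = g[, alpha] - mean(g[, alpha])  # enforce mean-zero components
    }
    if (max(abs(g - g_old)) < tol) break   # stop once the fits stabilize
  }
  list(c = c_hat, g = g)
}

Applied to data generated from the full model $Y = \sum_j g_j(X_j) + \varepsilon$ below, `backfit(X, Y)$g[, 1]` should track $g_1$ up to sampling noise.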

Example: smoother performance in additive models

simulated sample of $n = 75$ regression observations with regressors $X_j$ i.i.d.
uniform on $[-2.5, 2.5]$, generated from
$$Y = \sum_{j=1}^{4} g_j(X_j) + \varepsilon, \quad \varepsilon \sim N(0, 1)$$
where
$$\begin{array}{ll} g_1(X_1) = -\sin(2 X_1) & g_2(X_2) = X_2^2 - E(X_2^2) \\ g_3(X_3) = X_3 & g_4(X_4) = \exp(-X_4) - E\{\exp(-X_4)\} \end{array}$$
Plotting results in this example:
estimated (solid lines) versus true additive component functions (circles at the input values), local linear estimator with quartic kernel, bandwidth $h = 1.0$

Code:

library(ggplot2)   # plotting (cowplot is used below via ::)
set.seed(42)       # reproducibility; the seed value is an illustrative addition

n = 75
X = matrix(NA, n, 4)
for (i in 1:4) {
  X[, i] = runif(n, min = -2.5, max = 2.5)
}

### true component functions; empirical means stand in for E(X2^2) and E{exp(-X4)}
g1 = function(x) {
  return(-sin(2 * x))
}

g2 = function(x) {
  return(x ^ 2 - mean(x ^ 2))
}

g3 = function(x) {
  return(x)
}

g4 = function(x) {
  return(exp(-x) - mean(exp(-x)))
}
eps = rnorm(n)   # noise term

### indicator function: switches a component on (index = 1) or off (index = 0),
### so each response below contains exactly one active component
I = function(x, index) {
  if (index == 1) {
    return(x)
  }
  if (index == 0) {
    return(0)
  }
}

x <- seq(-2.5, 2.5, length.out = 100)   # evaluation grid

### response with only g1 active, then a local fit for g1
Y = I(g1(X[, 1]), 1) + I(g2(X[, 2]), 0) + I(g3(X[, 3]), 0) + I(g4(X[, 4]), 0) + eps
fit_g1 <- loess(
  Y ~ x,
  family = 'symmetric',
  degree = 2,
  span = 0.7,
  data = data.frame(x = X[, 1], Y = Y),
  surface = "direct"
)
out_g1 <- predict(fit_g1,
                  newdata = data.frame(x = x),   # column name must match the formula
                  se = TRUE)
low_g1 <- out_g1$fit - qnorm(0.975) * out_g1$se.fit    # 95% pointwise band
high_g1 <- out_g1$fit + qnorm(0.975) * out_g1$se.fit
df.low_g1 <- data.frame(x = x, y = low_g1)
df.high_g1 <- data.frame(x = x, y = high_g1)
P1 = ggplot(data = data.frame(X1 = X[, 1], g1 = Y),
            aes(X1, g1)) +
  geom_point() +
  geom_smooth(method = "loess", show.legend = TRUE) +
  geom_line(data = df.low_g1, aes(x, y), color = "red") +
  geom_line(data = df.high_g1, aes(x, y), color = "red")

### response with only g2 active; subtract the fitted g1 as in a backfitting step
Y = I(g1(X[, 1]), 0) + I(g2(X[, 2]), 1) + I(g3(X[, 3]), 0) + I(g4(X[, 4]), 0) + eps
fit_g2 <- loess(
  Y ~ x,
  family = 'symmetric',
  degree = 2,
  span = 0.9,
  data = data.frame(x = X[, 2], Y = (Y - fit_g1$fitted)),
  surface = "direct"
)
out_g2 <- predict(fit_g2,
                  newdata = data.frame(x = x),
                  se = TRUE)
low_g2 <- out_g2$fit - qnorm(0.975) * out_g2$se.fit
high_g2 <- out_g2$fit + qnorm(0.975) * out_g2$se.fit
df.low_g2 <- data.frame(x = x, y = low_g2)
df.high_g2 <- data.frame(x = x, y = high_g2)
P2 = ggplot(data = data.frame(X2 = X[, 2], g2 = (Y - fit_g1$fitted)),
            aes(X2, g2)) +
  geom_point() +
  geom_smooth(method = "loess", show.legend = TRUE) +
  geom_line(data = df.low_g2, aes(x, y), color = "red") +
  geom_line(data = df.high_g2, aes(x, y), color = "red")

### response with only g3 active; subtract the fitted g1 and g2
Y = I(g1(X[, 1]), 0) + I(g2(X[, 2]), 0) + I(g3(X[, 3]), 1) + I(g4(X[, 4]), 0) + eps
fit_g3 <- loess(
  Y ~ x,
  family = 'symmetric',
  degree = 2,
  span = 0.9,
  data = data.frame(x = X[, 3], Y = (Y - fit_g1$fitted - fit_g2$fitted)),
  surface = "direct"
)
out_g3 <- predict(fit_g3,
                  newdata = data.frame(x = x),
                  se = TRUE)
low_g3 <- out_g3$fit - qnorm(0.975) * out_g3$se.fit
high_g3 <- out_g3$fit + qnorm(0.975) * out_g3$se.fit
df.low_g3 <- data.frame(x = x, y = low_g3)
df.high_g3 <- data.frame(x = x, y = high_g3)
P3 = ggplot(data = data.frame(X3 = X[, 3], g3 = (Y - fit_g1$fitted - fit_g2$fitted)),
            aes(X3, g3)) +
  geom_point() +
  geom_smooth(method = "loess", show.legend = TRUE) +
  geom_line(data = df.low_g3, aes(x, y), color = "red") +
  geom_line(data = df.high_g3, aes(x, y), color = "red")

### response with only g4 active; subtract the fitted g1, g2 and g3
Y = I(g1(X[, 1]), 0) + I(g2(X[, 2]), 0) + I(g3(X[, 3]), 0) + I(g4(X[, 4]), 1) + eps
fit_g4 <- loess(
  Y ~ x,
  family = 'symmetric',
  degree = 2,
  span = 0.9,
  data = data.frame(x = X[, 4], Y = (Y - fit_g1$fitted - fit_g2$fitted - fit_g3$fitted)),
  surface = "direct"
)
out_g4 <- predict(fit_g4,
                  newdata = data.frame(x = x),
                  se = TRUE)
low_g4 <- out_g4$fit - qnorm(0.975) * out_g4$se.fit
high_g4 <- out_g4$fit + qnorm(0.975) * out_g4$se.fit
df.low_g4 <- data.frame(x = x, y = low_g4)
df.high_g4 <- data.frame(x = x, y = high_g4)
P4 = ggplot(data = data.frame(
  X4 = X[, 4],
  g4 = (Y - fit_g1$fitted - fit_g2$fitted - fit_g3$fitted)
),
aes(X4, g4)) +
  geom_point() +
  geom_smooth(method = "loess", show.legend = TRUE) +
  geom_line(data = df.low_g4, aes(x, y), color = "red") +
  geom_line(data = df.high_g4, aes(x, y), color = "red")

cowplot::plot_grid(P1, P2, P3, P4, align = "vh")
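For comparison, the whole additive model can also be fitted in a single call with the mgcv package, whose gam() uses penalized regression splines; this alternative is not part of the original example, and the default thin-plate smooths and the variable names Y_all, dat, fit_gam are illustrative choices:

library(mgcv)

### refit with all four components active at once
Y_all = g1(X[, 1]) + g2(X[, 2]) + g3(X[, 3]) + g4(X[, 4]) + eps
dat = data.frame(Y = Y_all, x1 = X[, 1], x2 = X[, 2], x3 = X[, 3], x4 = X[, 4])
fit_gam = gam(Y ~ s(x1) + s(x2) + s(x3) + s(x4), data = dat)
plot(fit_gam, pages = 1, se = TRUE)   # one panel per estimated component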

result:

[Figure: the four estimated component functions with pointwise confidence bands, as produced by the code above]

References

https://academic.uprm.edu/wrolke/esma6836/smooth.html

Hastie, T. J. and Tibshirani, R. J. (1990). Generalized Additive Models, Vol. 43 of Monographs on Statistics and Applied Probability, Chapman and Hall, London.

Opsomer, J. and Ruppert, D. (1997). Fitting a bivariate additive model by local polynomial regression, Annals of Statistics 25: 186-211.

Mammen, E., Linton, O. and Nielsen, J. P. (1999). The existence and asymptotic properties of a backfitting projection algorithm under weak conditions, Annals of Statistics 27: 1443-1490.
