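The DKW (Dvoretzky–Kiefer–Wolfowitz) inequality states that for any cumulative distribution function $F$ and every natural number $n$, the empirical distribution function $\hat{F}_n$ of an i.i.d. sample of size $n$ from $F$ satisfies: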
$$P\left(\sup_x \left| \hat{F}_n(x) - F(x) \right| > \varepsilon_n\right) \leq 2e^{-2n\varepsilon_n^2}$$
To make the right-hand side of the inequality at most $\alpha$, $\varepsilon_n$ is chosen as:
$$\varepsilon_n = \sqrt{\frac{1}{2n} \ln\left(\frac{2}{\alpha}\right)}$$
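As a short worked step, this choice of $\varepsilon_n$ comes from setting the DKW bound equal to $\alpha$ and solving for $\varepsilon_n$. The numeric line is only an illustration: it assumes $\alpha = 0.05$ and $n = 272$, the number of observations in R's faithful dataset.

$$
\begin{aligned}
2e^{-2n\varepsilon_n^2} = \alpha
&\;\Longrightarrow\; -2n\varepsilon_n^2 = \ln\left(\frac{\alpha}{2}\right)
\;\Longrightarrow\; \varepsilon_n^2 = \frac{1}{2n}\ln\left(\frac{2}{\alpha}\right), \\
\varepsilon_n &= \sqrt{\frac{\ln(2/0.05)}{2 \cdot 272}} = \sqrt{\frac{\ln 40}{544}} \approx 0.082 .
\end{aligned}
$$

With this $\varepsilon_n$, the inequality gives $\hat{F}_n(x) - \varepsilon_n \leq F(x) \leq \hat{F}_n(x) + \varepsilon_n$ simultaneously for all $x$ with probability at least $1 - \alpha$. Since $F$ takes values in $[0, 1]$, the lower and upper limits can be clipped to $\max(0, \cdot)$ and $\min(1, \cdot)$ without losing coverage.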
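# Use R's built-in faithful dataset
data("faithful")
# Extract the waiting times between eruptions
waiting_times <- faithful$waiting
# Compute the empirical distribution function (ECDF)
ecdf_data <- ecdf(waiting_times)
# Half-width of the uniform confidence band from the DKW inequality
alpha <- 0.05
n <- length(waiting_times)
epsilon <- sqrt(log(2 / alpha) / (2 * n))
# Plot the ECDF with its confidence band, clipped to [0, 1]
# The ECDF only jumps at observed values, so evaluate the band there
x_vals <- sort(unique(waiting_times))
plot(ecdf_data, main = "Uniform Confidence Band", verticals = TRUE, do.points = FALSE)
# type = "s" draws the limits as step functions, matching the ECDF
lines(x_vals, pmax(0, ecdf_data(x_vals) - epsilon), type = "s", col = "blue", lty = "dashed")
lines(x_vals, pmin(1, ecdf_data(x_vals) + epsilon), type = "s", col = "blue", lty = "dashed")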