Empirical Bayes: Robbins' Formula and Its R Implementation

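The data cover 9461 auto insurance policyholders. Over one year the number of claims per policyholder ranged from 0 to 7, and the number of policyholders yx who made x claims drops sharply as x increases. An insurer cares about how many claims each customer will make in a year, because this directly affects its profit and loss. Given last year's claims history, the company's data analyst needs to estimate each customer's expected number of claims in the coming year from the number of claims that customer made this year, so that policies can be adjusted accordingly.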

Robbins' formula (an empirical estimate of the Bayes rule for a single individual)

Derivation
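A sketch of the standard argument, assuming that each policyholder's claim count x is Poisson with an individual rate θ, and that θ is drawn from an unknown prior density g(θ). The symbols θ, g, f and N are notation introduced here for illustration.

```latex
\[
f(x) = \int_0^\infty \frac{e^{-\theta}\theta^{x}}{x!}\, g(\theta)\, d\theta
\]
\[
E[\theta \mid x]
= \frac{\int_0^\infty \theta\, \frac{e^{-\theta}\theta^{x}}{x!}\, g(\theta)\, d\theta}{f(x)}
= \frac{(x+1)\int_0^\infty \frac{e^{-\theta}\theta^{x+1}}{(x+1)!}\, g(\theta)\, d\theta}{f(x)}
= \frac{(x+1)\, f(x+1)}{f(x)}
\]
\[
\hat f(x) = \frac{y_x}{N}, \quad N = \sum_x y_x
\quad\Longrightarrow\quad
\hat E[\theta \mid x] = \frac{(x+1)\, y_{x+1}}{y_x}
\]
```

The prior g never has to be specified: the marginal f(x) is estimated directly from the observed proportions. For example, x = 0 gives 1 × 1317 / 7840 ≈ 0.168.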

Code implementation

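# Robbins' empirical formula
x = c(0, 1, 2, 3, 4, 5, 6, 7)            # number of claims
yx = c(7840, 1317, 239, 42, 14, 4, 4, 1) # number of policyholders with x claims
exp_count = function(x, yx){
  # Estimate E[theta | x] by (x + 1) * f(x + 1) / f(x), with f(x) = yx / sum(yx)
  est = numeric(length(x))
  for (i in x){
    # yx is 1-indexed, so yx[i + 1] is the count for x = i
    est[i + 1] = ((i + 1) * (yx[i + 2] / sum(yx))) / (yx[i + 1] / sum(yx))
  }
  return(est) # the last entry is NA because no count exists for x = 8
}
robbins = exp_count(x, yx) # separate name, so the function is not overwritten
dat = data.frame(num = x, count = yx, exp_count = robbins)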
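The same estimate can be computed without an explicit loop. This is a minimal sketch, assuming the same counts as above; the names `y_next` and `robbins_vec` are hypothetical and do not appear in the original code.

```r
# Counts of policyholders with x = 0, ..., 7 claims (same data as above)
x = 0:7
yx = c(7840, 1317, 239, 42, 14, 4, 4, 1)

# Shift the counts by one position so that position x holds y_{x+1};
# the last entry has no successor and is set to NA.
y_next = c(yx[-1], NA)

# Robbins' estimate (x + 1) * y_{x+1} / y_x; sum(yx) cancels in the ratio
robbins_vec = (x + 1) * y_next / yx
print(round(robbins_vec, 3))
```

Rounded to three decimals, the values are 0.168, 0.363, 0.527, 1.333, 1.429, 6.000, 1.750 and NA, matching the Robbins row of the summary table.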

Maximum likelihood estimation with a Gamma prior

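When the counts yx are small, the empirical formula is unstable: with only 4 policyholders at both x = 5 and x = 6, the estimate at x = 5 jumps to 6.000. A parametric approach gives more reliable results.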

Assumptions
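The model consistent with the code in this section can be sketched as follows: the individual rate θ follows a Gamma prior with shape ν and scale σ, and the claim count x is Poisson given θ. The symbols ν, σ and γ are notation assumed here; in the code they correspond to `shape` and `rate` (the latter holds the scale).

```latex
\[
\theta \sim \mathrm{Gamma}(\nu, \sigma):\quad
g(\theta) = \frac{\theta^{\nu-1} e^{-\theta/\sigma}}{\sigma^{\nu}\,\Gamma(\nu)},\ \theta \ge 0,
\qquad
x \mid \theta \sim \mathrm{Poisson}(\theta)
\]
\[
f_{\nu,\sigma}(x) = \int_0^\infty \frac{e^{-\theta}\theta^{x}}{x!}\, g(\theta)\, d\theta
= \frac{\gamma^{\nu+x}\,\Gamma(\nu+x)}{\sigma^{\nu}\,\Gamma(\nu)\, x!},
\qquad \gamma = \frac{\sigma}{1+\sigma}
\]
```

The marginal is a negative binomial distribution, which is the expression the `fx` function evaluates.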

For background, see "Newton's method for maximum likelihood estimation in R and Python": https://blog.csdn.net/zns972630879/article/details/120399944?spm=1001.2014.3001.5501

Parameter estimation
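A sketch of the estimation target, assuming the negative binomial marginal f(x) = γ^(ν+x) Γ(ν+x) / (σ^ν Γ(ν) x!) with γ = σ/(1+σ), which is the expression the `fx` function evaluates. The counts y_x enter as weights; the symbols ℓ and x̄ are notation introduced here.

```latex
\[
\ell(\nu,\sigma) = \sum_{x} y_x \log f_{\nu,\sigma}(x)
= \sum_{x} y_x \left[ x\log\sigma - (\nu+x)\log(1+\sigma)
  + \log\Gamma(\nu+x) - \log\Gamma(\nu) - \log x! \right]
\]
\[
\frac{\partial \ell}{\partial \sigma}
= \sum_{x} y_x \left[ \frac{x}{\sigma} - \frac{\nu+x}{1+\sigma} \right] = 0
\quad\Longrightarrow\quad
\hat\sigma(\nu) = \frac{\bar x}{\nu},
\qquad \bar x = \frac{\sum_x x\, y_x}{\sum_x y_x}
\]
```

Only ν therefore needs a numerical search. The relation between σ and ν is the last step of `Est_Gamma_shape`, where `rate = ytilde / (shape * ybar)`.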

Newton's method
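A sketch of the iteration that the code below implements, assuming the scale has been eliminated by substituting σ = m/ν, where m = Σ x y_x / Σ y_x is the sample mean of the claim counts. Here ψ is the digamma function and ψ' the trigamma function; g and h denote the first and second derivatives with respect to ν of the resulting log-likelihood.

```latex
\[
g(\nu) = \sum_{x} y_x \left[ \psi(\nu+x) - \psi(\nu)
  - \log\!\left(1+\frac{m}{\nu}\right) \right]
\]
\[
h(\nu) = g'(\nu) = \sum_{x} y_x \left[ \psi'(\nu+x) - \psi'(\nu)
  + \frac{1}{\nu} - \frac{1}{\nu+m} \right]
\]
\[
\nu_{k+1} = \nu_k - \frac{g(\nu_k)}{h(\nu_k)}
\]
```

The code computes `g_v` and `h_v` with means instead of sums; the common factor cancels in the ratio g/h, so the update is the same.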

Code implementation

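# MLE for the Gamma prior
Est_Gamma_shape = function(y, shape = 0.5, eps = 1e-6, maxIter = 100){
  # Estimate the Gamma prior parameters by Newton's method on the shape.
  # y holds the counts for x = 0, 1, ..., length(y) - 1.
  # Returns the shape and the scale; the scale is stored under the name `rate`.
  tol = 1e-14
  x = seq(0, length(y) - 1)
  ybar = mean(y)
  ytilde = mean(x * y)
  tmp = ytilde / ybar # sample mean of the claim counts
  for (k in 1 : maxIter){
    # First derivative of the log-likelihood with respect to shape
    g_v = -ybar * log(1 + tmp / shape) + mean(y * digamma(shape + x)) -
      ybar * digamma(shape)
    # Second derivative of the log-likelihood with respect to shape
    h_v = ybar / shape - ybar / (shape + tmp) +
      mean(y * trigamma(shape + x)) - ybar * trigamma(shape)
    if (abs(h_v) < tol || abs(g_v / h_v) < eps){
      # Stop if the denominator is too small or the update step is negligible
      break
    }else if (shape < g_v / h_v){
      # Stop before the update would make the shape negative
      break
    }else{
      # Newton update
      shape = shape - g_v / h_v
    }
  }
  rate = ytilde / (shape * ybar) # scale = sample mean / shape
  return(list(shape = shape, rate = rate))
}

fx = function(x, shape, rate){
  # Marginal probability mass function of the claim count x
  # (negative binomial); `rate` is the Gamma scale parameter
  f = (rate / (1 + rate))^(x + shape) * gamma(x + shape) / (rate ^ shape * gamma(shape) * gamma(x + 1))
  return(f)
}

gamma_claims = function(){
  # Fit the model to the data and compute each customer's conditional
  # expected number of claims in the coming year
  yx = c(7840, 1317, 239, 42, 14, 4, 4, 1)
  fit = Est_Gamma_shape(yx)
  v = fit$shape
  sigma = fit$rate

  x = seq(0, length(yx) - 1)
  pre = (x + 1) * fx(x + 1, v, sigma) / fx(x, v, sigma)

  print(pre)
  return(invisible(pre))
}
result = gamma_claims()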
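Under the Gamma model the ratio (x + 1) f(x + 1) / f(x) simplifies to σ/(1 + σ) · (ν + x), a straight line in x, which is why the Gamma MLE estimates are evenly spaced. The sketch below checks this identity numerically. The parameter values `nu = 0.7` and `sigma = 0.3` are illustrative assumptions, not the fitted estimates, and `nb_pmf` is a hypothetical helper that mirrors `fx` on the log scale.

```r
# Illustrative parameter values (assumptions, not the fitted estimates)
nu = 0.7     # Gamma shape
sigma = 0.3  # Gamma scale

nb_pmf = function(x, nu, sigma){
  # Same marginal pmf as fx, computed on the log scale for numerical stability
  log_f = (x + nu) * log(sigma / (1 + sigma)) + lgamma(x + nu) -
    nu * log(sigma) - lgamma(nu) - lgamma(x + 1)
  exp(log_f)
}

x = 0:7
# Posterior mean via the ratio used in gamma_claims
ratio_form = (x + 1) * nb_pmf(x + 1, nu, sigma) / nb_pmf(x, nu, sigma)
# Posterior mean via the simplified linear expression
closed_form = sigma / (1 + sigma) * (nu + x)

# The two forms agree up to floating-point error
print(all.equal(ratio_form, closed_form))
```

Working with `lgamma` instead of `gamma` avoids overflow when x is large, which does not matter for x up to 7 but would for longer count tables.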

Summary

Claims x	0	1	2	3	4	5	6	7
Policyholders yx	7840	1317	239	42	14	4	4	1
Robbins estimate	0.168	0.363	0.527	1.333	1.429	6.000	1.750	NA
Gamma MLE	0.164	0.398	0.632	0.866	1.101	1.335	1.569	1.803

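The Robbins estimates and the Gamma MLE estimates are close where the counts are large (x = 0 and x = 1, with thousands of policyholders each). Where the counts are small, the Robbins estimates fluctuate widely, for example 6.000 at x = 5 followed by 1.750 at x = 6, while the Gamma MLE estimates increase steadily.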

References

[1] Robbins H.E. (1992) An Empirical Bayes Approach to Statistics. In: Kotz S., Johnson N.L. (eds) Breakthroughs in Statistics. Springer Series in Statistics (Perspectives in Statistics). Springer, New York, NY.

[2] Wolf J, Levi. Bradley Efron and Trevor Hastie, Computer age statistical inference: Algorithms, evidence, and data science, Cambridge University Press: New York, 201[J]. Environment & Planning B Urban Analytics & City Science, 2017, 44(5):986-987.