An introduction to the statistics behind A/B tests and ANOVA tests, with hands-on code

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import statsmodels.api as sm
import statsmodels.stats.api as sms
import scipy.stats as stats
import warnings
warnings.filterwarnings('ignore')
from math import ceil
# set style and figure size in one call; a later sns.set() would reset the style
sns.set_theme(style='dark', rc={'figure.figsize':(20,15)})
data_ab = pd.read_csv('data/ab_test.csv')
data_ab.head()
       id     time  con_treat      page  converted
0  851104  11:48.6    control  old_page          0
1  804228  01:45.2    control  old_page          0
2  661590  55:06.2  treatment  new_page          0
3  853541  28:03.1  treatment  new_page          0
4  864975  52:26.2    control  old_page          1

EDA

Check whether users in the treatment group only received the new page, and whether duplicate records exist.

data_ab.columns = ["user_id", "timestamp", "group", "landing_page", "converted"]
# num of rows and unique user
print('row number:{0}'.format(data_ab.shape[0]))
print('unique user:{0}'.format(data_ab['user_id'].nunique()))
row number:294478
unique user:290584
data_ab.head()
   user_id  timestamp      group  landing_page  converted
0   851104    11:48.6    control      old_page          0
1   804228    01:45.2    control      old_page          0
2   661590    55:06.2  treatment      new_page          0
3   853541    28:03.1  treatment      new_page          0
4   864975    52:26.2    control      old_page          1
# check whether mismatch exists
data_ab.groupby(by=['group'],as_index=False).agg({'landing_page':pd.Series.nunique})
       group  landing_page
0    control             2
1  treatment             2
pd.crosstab(data_ab['group'],data_ab['landing_page'],margins=True)
landing_page  new_page  old_page     All
group
control           1928    145274  147202
treatment       145311      1965  147276
All             147239    147239  294478
# check for mismatch 
n_treat = data_ab[data_ab['group']=='treatment'].shape[0]
n_newpage = data_ab[data_ab['landing_page']=='new_page'].shape[0]
diff = n_treat - n_newpage
pd.DataFrame({'n treatment':[n_treat],
             'n newPage':[n_newpage],
             'diff':diff})
   n treatment  n newPage  diff
0       147276     147239    37
page_diff = pd.crosstab(index=data_ab['group'],columns=data_ab['landing_page'])
page_diff.plot.bar()

[Figure: bar chart of row counts by group and landing_page]
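Note that the difference of the two totals (37) only shows the net imbalance. The crosstab above has 1928 control rows on new_page and 1965 treatment rows on old_page, so 3893 rows are actually mismatched. A minimal sketch of counting mismatched rows directly is shown below; the small `toy` frame is hypothetical and only stands in for data_ab.

```python
import pandas as pd

# Hypothetical toy data standing in for data_ab (not the real file).
toy = pd.DataFrame({
    "group": ["control", "control", "treatment", "treatment", "control"],
    "landing_page": ["old_page", "new_page", "new_page", "old_page", "old_page"],
})

# A row is consistent when "treatment" pairs with "new_page" and
# "control" pairs with "old_page"; anything else is a mismatch.
is_treatment = toy["group"] == "treatment"
is_new_page = toy["landing_page"] == "new_page"
mismatch = is_treatment != is_new_page

n_mismatch = int(mismatch.sum())
print("mismatched rows:", n_mismatch)
```

Applied to data_ab, the same boolean mask could also be used to drop the mismatched rows.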

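# clean up the dataset so that treatment only sees new_page and control only sees old_page
# .copy() avoids SettingWithCopyWarning when columns are added to df later
df = data_ab[((data_ab['group']=='treatment')&(data_ab['landing_page']=='new_page'))|((data_ab['group']=='control')&(data_ab['landing_page']=='old_page'))].copy()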
df.shape[0]
290585
# rows whose user_id has already appeared earlier
df[df.duplicated(subset=['user_id'])]
      user_id  timestamp      group  landing_page  converted
2893   773192    55:59.6  treatment      new_page          0
df = df.drop_duplicates('user_id',keep='first')

AB TEST

  • post campaign review
  • calculate minimum sample size before campaign
df['user_id'] = df['user_id'].astype(str)
control_cvt = df[df['group']=='control']['converted'].mean()
print('control_cvt:%.4f'%control_cvt)
control_cvt:0.1204
treat_cvt = df[df['group']=='treatment']['converted'].mean()
print('treat_cvt:%.4f'%treat_cvt)
treat_cvt:0.1188
n_treat = df[df['group']=='treatment']['user_id'].count()
n_control = df[df['group']=='control']['user_id'].count()
n_treat/n_control
1.0002478075911725
n_control
145274

test on minimum sample size

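$\text{effect\_size} = \frac{\delta}{\text{std}}$
After a hypothesis test, if p < 0.05 you should also compute the effect size; if p > 0.05 you should compute the power.
Effect size: a quantitative measure of the difference or association between samples. The p-value tells you whether a result is statistically significant, while the effect size tells you how large the difference is, that is, its practical meaning. A result can be statistically significant and still have a very small effect size. By Cohen's conventions, an effect size of about 0.2 is small, about 0.5 is medium, and about 0.8 is large.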

# proportion_effectsize computes Cohen's h
effect_size = sms.proportion_effectsize(control_cvt,treat_cvt)
print('effect_size:%.4f'%effect_size)
effect_size:0.0049
# given the sample size, power and significance level, solve for the effect size
effect_size2 = sms.TTestIndPower().solve_power(effect_size=None,power=0.8,alpha=0.05,nobs1=n_treat)

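effect_size and effect_size2 mean different things. effect_size is the effect size implied by the given baseline and improvement; no sample size enters the calculation. effect_size2 is the minimum detectable effect size for the given sample size (at power 0.8 and alpha 0.05).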
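For reference, Cohen's h can be reproduced by hand: it is the difference of the arcsine-transformed proportions. The sketch below uses only the standard library and the rounded rates printed above (0.1204 and 0.1188), so the last digits may differ slightly from the statsmodels result; `cohen_h` is a helper name introduced here for illustration.

```python
from math import asin, sqrt

def cohen_h(p1, p2):
    """Cohen's h: difference of arcsine-transformed proportions."""
    return 2 * asin(sqrt(p1)) - 2 * asin(sqrt(p2))

# Rounded conversion rates taken from the printed output above.
h = cohen_h(0.1204, 0.1188)
print("h: %.4f" % h)
```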

required_n = sms.NormalIndPower().solve_power(effect_size,power=0.8,alpha=0.05,ratio=n_treat/n_control)
required_n = ceil(required_n)
print('required_n:%d'%required_n)
required_n:663492
import statsmodels.stats.power as smp
power_analysis = smp.TTestIndPower()
n_required = power_analysis.solve_power(effect_size = effect_size,power=0.8,alpha=0.05,ratio=n_treat/n_control)
print('n_required:%.4f'%n_required)
n_required:663492.3208

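So with control_cvt = 0.1204 and treat_cvt = 0.1188, detecting the difference requires at least 663492 samples per group, but the experiment has only 145274 in the control group, so it is underpowered to detect an effect this small.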
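Another way to see this is to ask how much power the experiment has at its actual sample size. The following is a minimal sketch using the normal approximation for a two-sided test, assuming equal group sizes and the rounded effect size 0.0049; `achieved_power` is a hypothetical helper, not a statsmodels function.

```python
from math import sqrt
from statistics import NormalDist

def achieved_power(h, n_per_group, alpha=0.05):
    """Two-sided power of a two-sample z-test (normal approximation, equal n)."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic under the alternative
    shift = abs(h) * sqrt(n_per_group / 2)
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)

# h and n are the rounded effect size and control group size printed above.
power_now = achieved_power(0.0049, 145274)
print("achieved power: %.3f" % power_now)
```

A value well below the 0.8 target means that a true effect of this size would be missed in most repetitions of the experiment.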

Test whether the experiment achieved its goal: is the difference significant?

Run the hypothesis test with a z-test and a t-test.

convert_old = df[(df["converted"] == 1) & (df["landing_page"] == "old_page")]['user_id'].nunique()
convert_new = df[(df["converted"] == 1) & (df["landing_page"] == "new_page")]['user_id'].nunique()
n_old = df[df["landing_page"] == "old_page"]['user_id'].nunique()
n_new = df[df["landing_page"] == "new_page"]['user_id'].nunique()
n_old
145274
convert_old
17489

Method 1: hypothesis test with a z-test

z_score,p_value = sm.stats.proportions_ztest(np.array([convert_new,convert_old]),np.array([n_new,n_old]), alternative = 'larger')
p_value
0.9050583127590245
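The z statistic behind this p-value can be sketched by hand with a pooled proportion. convert_old and n_old are the values printed above; convert_new and n_new are not printed in this post, so the numbers used here are assumptions chosen to be consistent with the printed conversion rates and group ratio.

```python
from math import sqrt
from statistics import NormalDist

# convert_old and n_old are printed above; convert_new and n_new are
# ASSUMED values consistent with the printed rates and group ratio.
convert_old, n_old = 17489, 145274
convert_new, n_new = 17264, 145310

p_old = convert_old / n_old
p_new = convert_new / n_new

# Pooled proportion under H0: p_new == p_old
p_pool = (convert_new + convert_old) / (n_new + n_old)
se = sqrt(p_pool * (1 - p_pool) * (1 / n_new + 1 / n_old))

z_manual = (p_new - p_old) / se
# alternative='larger' tests H1: p_new > p_old, so take the upper tail
p_larger = 1 - NormalDist().cdf(z_manual)
p_two_sided = 2 * NormalDist().cdf(-abs(z_manual))
print("z: %.4f  p(larger): %.4f  p(two-sided): %.4f"
      % (z_manual, p_larger, p_two_sided))
```

The one-sided p-value is large because the new page converts slightly worse than the old one, which is the opposite direction from the alternative being tested.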

Method 2: hypothesis test with a t-test

When the sample size is > 30, a z-test works as well.

from scipy import stats as st
new_gr = df.loc[df['landing_page']=="new_page",'converted'].to_numpy()
old_gr = df.loc[df['landing_page']=="old_page",'converted'].to_numpy()
st.ttest_ind(new_gr,old_gr)
Ttest_indResult(statistic=-1.3109235634981506, pvalue=0.18988462498742617)

Method 3: t-test by hand

var_old = control_cvt*(1-control_cvt)
var_new = treat_cvt*(1-treat_cvt)
p_delta = -control_cvt +  treat_cvt
print(p_delta)
-0.0015782389853555567
# unpooled standard error of the difference in proportions (Welch)
se_diff = np.sqrt(var_new/n_new + var_old/n_old)
t_sts = p_delta/se_diff
# Welch-Satterthwaite degrees of freedom
dof = (var_new/n_new + var_old/n_old)**2/((var_new/n_new)**2/(n_new-1) + (var_old/n_old)**2/(n_old-1))
dof
290571.7142957336
pvalue = 2*st.t.cdf(-abs(t_sts),dof)
print(pvalue)
0.18988341360095998

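The hypothesis tests also show that the null hypothesis (no difference between the control and treatment groups) cannot be rejected.

Compute confidence intervals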

(low_treat,up_treat) = st.norm.interval(0.95,loc=treat_cvt,scale=np.sqrt(var_new/n_new))
(low_control,up_control) = st.norm.interval(0.95,loc=control_cvt,scale=np.sqrt(var_old/n_old))
# the input order is [new, old], so each returned array is [treatment, control]
(lower_treat, lower_con), (upper_treat, upper_con) = sm.stats.proportion_confint(np.array([convert_new,convert_old]),np.array([n_new,n_old]),alpha=0.05)

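The confidence intervals computed with both scipy and statsmodels overlap between the control and treatment groups, which is consistent with failing to reject the null hypothesis.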
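Overlap of two separate intervals is only an informal check. A more direct one is a confidence interval for the difference in conversion rates: if it contains 0, the difference is not significant at that level. The sketch below uses a normal (Wald) interval; as before, convert_new and n_new are assumed values consistent with the printed rates, not figures taken from the post.

```python
from math import sqrt
from statistics import NormalDist

convert_old, n_old = 17489, 145274   # printed earlier in the post
convert_new, n_new = 17264, 145310   # ASSUMED, consistent with printed rates

p_old = convert_old / n_old
p_new = convert_new / n_new
diff = p_new - p_old

# Unpooled standard error of the difference in proportions
se_diff = sqrt(p_new * (1 - p_new) / n_new + p_old * (1 - p_old) / n_old)
z_crit = NormalDist().inv_cdf(0.975)  # two-sided 95% interval

ci_low = diff - z_crit * se_diff
ci_high = diff + z_crit * se_diff
print("diff: %.5f  95%% CI: (%.5f, %.5f)" % (diff, ci_low, ci_high))
```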

ANOVA TEST

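Test whether the time range affects conversion. The leading field of timestamp (stored in the column hour) is split into three ranges: 0-20, 20-40 and 40-60.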

# take the leading field of timestamp and bucket it into three ranges
df['hour'] = [int(i.split(':')[0]) for i in df['timestamp']]
df['time_range'] = pd.cut(df['hour'],bins=[0,20,40,60],include_lowest=True,labels=['0-20','20-40','40-60'])
df.head()
   user_id  timestamp      group  landing_page  converted  hour  time_range
0   851104    11:48.6    control      old_page          0    11        0-20
1   804228    01:45.2    control      old_page          0     1        0-20
2   661590    55:06.2  treatment      new_page          0    55       40-60
3   853541    28:03.1  treatment      new_page          0    28       20-40
4   864975    52:26.2    control      old_page          1    52       40-60

Method 1

fvalue,pvl = stats.f_oneway(df[df['time_range']=='0-20']['converted'],df[df['time_range']=='20-40']['converted'],df[df['time_range']=='40-60']['converted'])
print('fvalue:{0} pvalue:{1}'.format(fvalue,pvl))
fvalue:0.6913612944665806 pvalue:0.5008945648250425
from statsmodels.formula.api import ols
df_melt = pd.melt(df,id_vars=['time_range'],value_vars=['converted'])
df_melt.head()
   time_range   variable  value
0        0-20  converted      0
1        0-20  converted      0
2       40-60  converted      0
3       20-40  converted      0
4       40-60  converted      1

Method 2

model = ols('converted~C(time_range)',data=df[['time_range','converted']]).fit()
anova_table = sm.stats.anova_lm(model,typ=2)
anova_table
                     sum_sq        df         F    PR(>F)
C(time_range)      0.145593       2.0  0.691361  0.500895
Residual       30596.496834  290581.0       NaN       NaN

The p-value is > 0.05, so with the current bucketing there is no evidence that the time range affects conversion.
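The F value in both methods is the ratio of the between-group mean square to the within-group mean square. The sketch below computes it by hand on a small hypothetical data set, then recomputes F from the sum_sq and df values in the ANOVA table above; `one_way_f` and `toy_groups` are names introduced here for illustration.

```python
def one_way_f(groups):
    """One-way ANOVA F statistic: between-group MS over within-group MS."""
    n_total = sum(len(g) for g in groups)
    k = len(groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))

# Hypothetical 0/1 conversion outcomes for three time ranges.
toy_groups = [[0, 0, 1, 0], [0, 1, 1, 0], [1, 1, 1, 0]]
f_toy = one_way_f(toy_groups)

# Same ratio using sum_sq and df copied from the ANOVA table above.
f_from_table = (0.145593 / 2) / (30596.496834 / 290581)
print("toy F: %.4f  F from table: %.4f" % (f_toy, f_from_table))
```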

Demo: two-way ANOVA test

model_twoway = ols('converted~C(time_range) + C(landing_page) + C(time_range):C(landing_page)',data=df[['time_range','converted','landing_page']]).fit()
anova_table_twoway = sm.stats.anova_lm(model_twoway,typ=2)
anova_table_twoway
                                     sum_sq        df         F    PR(>F)
C(time_range)                      0.145309       2.0  0.690017  0.501568
C(landing_page)                    0.180666       1.0  1.715825  0.190232
C(time_range):C(landing_page)      0.184478       2.0  0.876015  0.416440
Residual                       30596.131690  290578.0       NaN       NaN

The results show that neither the time range, the landing page, nor their interaction has a significant effect on conversion.
