Introduction to the ESMM Model
CTR and CVR
Source: https://zhuanlan.zhihu.com/p/57481330
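Click-through rate (CTR): suppose a user searches for an item and the platform displays 2000 results, of which the user clicks 300. The click-through rate of the ads is then 300/2000.
Conversion rate (CVR): in the same scenario, suppose that 50 of the 300 clicks end with the user activating or becoming a paying user, so the conversion rate observed on clicked ads is 50/300. The ads the user did not click also have to be considered. An ad may go unclicked only because its image or headline is poor, even though it matches the user's interests and needs very well; had the user clicked it, they would most likely have become a paying user. CVR estimation therefore has to take such ads into account, and cannot simply label every unclicked ad as an ad that fails to convert.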
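As a quick check of the numbers above, the three rates can be computed directly from the counts. This is only an illustrative sketch; the variable names are chosen here, and CTCVR (click followed by conversion, measured over all impressions) is the quantity the ESMM model works with later on.

```python
# Counts from the example above: 2000 impressions, 300 clicks, 50 conversions.
impressions = 2000
clicks = 300
conversions = 50

ctr = clicks / impressions         # click-through rate
cvr = conversions / clicks         # conversion rate, measured on clicked ads only
ctcvr = conversions / impressions  # click-and-convert rate over all impressions

print(f"CTR={ctr:.4f}  CVR={cvr:.4f}  CTCVR={ctcvr:.4f}")
```

Note that CTCVR equals CTR times CVR, which is the relation ESMM builds on.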
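Consider a model whose input feature vector is denoted $\mathbf{x}$. The click event $y$ takes only the values 0 or 1: $y=0$ means the user did not click the ad, and $y=1$ means the user clicked it. The conversion event $z$ also takes only the values 0 or 1: $z=0$ means the ad did not lead to a conversion, and $z=1$ means it did.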
$$P(z=1,y=1|\mathbf{x}) = P(z = 1|y = 1,\mathbf{x})\,P(y = 1|\mathbf{x})$$
The conversion rate (CVR) is $P(z = 1|y = 1,\mathbf{x})$ and the click-through rate (CTR) is $P(y = 1|\mathbf{x})$. Their product $P(z=1,y=1|\mathbf{x})$ is the click-through-and-conversion rate (CTCVR).
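The factorization can be verified on a small impression log. The sketch below uses made-up (y, z) pairs purely for illustration and estimates each probability by counting.

```python
# Toy impression log of (y, z) = (clicked, converted) pairs; the values are invented.
log = [(1, 1), (1, 0), (1, 0), (0, 0), (0, 0),
       (0, 0), (0, 0), (0, 0), (1, 1), (0, 0)]

n = len(log)
n_click = sum(y for y, _ in log)
n_both = sum(1 for y, z in log if y == 1 and z == 1)

p_ctr = n_click / n        # estimate of P(y=1 | x)
p_cvr = n_both / n_click   # estimate of P(z=1 | y=1, x), clicked impressions only
p_ctcvr = n_both / n       # estimate of P(z=1, y=1 | x), all impressions

print(p_ctr, p_cvr, p_ctcvr)
```

CVR is estimated on clicked impressions only, whereas CTR and CTCVR are both defined over every impression; this is why ESMM supervises those two quantities and obtains CVR indirectly.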
The objective function to be optimized is shown below, where $\theta_{cvr}$ and $\theta_{ctr}$ denote the network parameters of the CVR part and the CTR part, respectively.
$$L\left(\theta_{cvr}, \theta_{ctr}\right)=\sum_{i=1}^{N} l\left(y_{i}, f\left(\mathbf{x}_{i} ; \theta_{ctr}\right)\right)+\sum_{i=1}^{N} l\left(y_{i} \& z_{i}, f\left(\mathbf{x}_{i} ; \theta_{ctr}\right) \times f\left(\mathbf{x}_{i} ; \theta_{cvr}\right)\right)$$
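To make the two terms of the objective concrete, here is a minimal pure-Python sketch that takes $l$ to be binary cross-entropy, which is an assumption consistent with the nn.BCELoss used in the code later in this post. The helper names `bce` and `esmm_loss` are hypothetical, and the sketch sums over samples as in the formula, whereas nn.BCELoss averages by default.

```python
import math

def bce(label, prob, eps=1e-12):
    """Binary cross-entropy for one sample; prob is clamped to avoid log(0)."""
    prob = min(max(prob, eps), 1.0 - eps)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))

def esmm_loss(y, z, p_ctr, p_cvr):
    """Sum of the CTR loss and the CTCVR loss over all samples."""
    total = 0.0
    for yi, zi, pc, pv in zip(y, z, p_ctr, p_cvr):
        total += bce(yi, pc)            # CTR term: label y_i, prediction pCTR
        total += bce(yi & zi, pc * pv)  # CTCVR term: label y_i & z_i, prediction pCTR * pCVR
    return total

# Two samples: one clicked and converted, one not clicked.
print(esmm_loss(y=[1, 0], z=[1, 0], p_ctr=[0.5, 0.5], p_cvr=[0.5, 0.5]))
```

There is no loss term on the CVR output alone: the CVR network is trained only through the product that forms the CTCVR prediction.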
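The adult dataset is used here again. As mentioned earlier, it has two labels: income level and marital status. To test the ESMM model, the data are preprocessed one step further: whenever the income label is 0 (income at most 50K), the marital-status label is also set to 0 (never married), on the assumption that someone earning less than 50K lacks the material means to marry. For ESMM, a sample whose income label is 1 is then treated as a click, and a sample whose marital-status label is 1 is treated as a click followed by a conversion.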
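The relabelling described above amounts to taking the conversion label as the logical AND of the two original labels. A minimal sketch on plain lists follows; the function name `make_esmm_labels` is hypothetical and is not part of the code in this post.

```python
def make_esmm_labels(income_gt_50k, married):
    """Map the two adult labels onto ESMM's click / conversion labels.

    income_gt_50k: 0/1 list, 1 means income above 50K (treated as a click).
    married:       0/1 list, 1 means not 'Never-married'.
    A conversion is only kept when the sample is also a click.
    """
    click = list(income_gt_50k)
    conversion = [y & z for y, z in zip(income_gt_50k, married)]
    return click, conversion

click, conversion = make_esmm_labels([1, 0, 1, 0], [1, 1, 0, 0])
print(click, conversion)
```

This guarantees that a conversion never occurs without a click, which the CTCVR label $y \& z$ requires.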
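Following the definition of ESMM, we build two separate networks, one for CTR and one for CVR; the model returns the predicted CTR together with the predicted CTCVR, which is the product of the two network outputs. Each network ends in a single output node whose value must lie between 0 and 1 so that it can be read as a probability. The loss function is PyTorch's built-in nn.BCELoss().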
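The ESMM implementations found online add batchnorm and dropout on top of the DNN when building the network. Below, starting from that ESMM model, I give the variants that add an FM (factorization machine) structure and that remove the dropout, together with their numerical results.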
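For readers unfamiliar with the FM structure mentioned above: a second-order factorization machine adds pairwise feature interactions $\sum_{i<j}\langle v_i, v_j\rangle x_i x_j$, and these are usually computed with the rearrangement $\frac{1}{2}\sum_f\big[(\sum_i v_{if}x_i)^2-\sum_i v_{if}^2x_i^2\big]$. The sketch below checks that identity on random numbers in pure Python; the function names are hypothetical, and it is meant as an illustration rather than as the implementation used in this post.

```python
import random

def fm_pairwise_bruteforce(x, v):
    """Sum over all pairs i<j of <v_i, v_j> * x_i * x_j."""
    n, k = len(x), len(v[0])
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            dot = sum(v[i][f] * v[j][f] for f in range(k))
            total += dot * x[i] * x[j]
    return total

def fm_pairwise_fast(x, v):
    """Same quantity via 0.5 * sum_f [(sum_i v_if x_i)^2 - sum_i (v_if x_i)^2]."""
    n, k = len(x), len(v[0])
    total = 0.0
    for f in range(k):
        s = sum(v[i][f] * x[i] for i in range(n))
        s_sq = sum((v[i][f] * x[i]) ** 2 for i in range(n))
        total += s * s - s_sq
    return 0.5 * total

random.seed(0)
x = [random.random() for _ in range(5)]
v = [[random.random() for _ in range(3)] for _ in range(5)]
print(fm_pairwise_bruteforce(x, v), fm_pairwise_fast(x, v))
```

The FM method in ESMM+FM.py below applies the same rearrangement in matrix form, with `x @ v` and `x**2 @ v**2`.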
The dataset comes from https://github.com/busesese/MultiTaskModel
adultutils.py
import pandas as pd
from sklearn.preprocessing import LabelEncoder, MinMaxScaler
from sklearn.model_selection import train_test_split
from torch.utils.data import Dataset, DataLoader
# data process
def data_preparation():
    # The column names are from
    column_names = ['age', 'workclass', 'fnlwgt', 'education', 'education_num', 'marital_status', 'occupation',
                    'relationship', 'race', 'sex', 'capital_gain', 'capital_loss', 'hours_per_week', 'native_country',
                    'income_50k']
    # Load the dataset in Pandas
    train_df = pd.read_csv(
        'C:\\Users\\2001213226\\Desktop\\icbc\\MultiTaskModel-main\\data\\adult.data',
        delimiter=',',
        header=None,
        index_col=None,
        names=column_names
    )
    other_df = pd.read_csv(
        'C:\\Users\\2001213226\\Desktop\\icbc\\MultiTaskModel-main\\data\\adult.test',
        delimiter=',',
        header=None,
        index_col=None,
        names=column_names
    )
    train_df['tag'] = 1  # extra column: 1 marks training rows
    other_df['tag'] = 0  # 0 marks test rows
    # Drop rows with missing values (row-wise by default); only one row is removed here
    other_df.dropna(inplace=True)
    # Income labels in adult.test end with a period (e.g. ' <=50K.'); strip it to match adult.data
    other_df['income_50k'] = other_df['income_50k'].apply(lambda x: x[:-1])
    data = pd.concat([train_df, other_df])  # stack the two frames row-wise
    data.dropna(inplace=True)
    # First group of tasks according to the paper
    label_columns = ['income_50k', 'marital_status']  # income and marital status are the labels
    # categorical columns
    categorical_columns = ['workclass', 'education', 'occupation', 'relationship', 'race', 'sex', 'native_country']
    for col in label_columns:
        if col == 'income_50k':
            data[col] = data[col].apply(lambda x: 0 if x == ' <=50K' else 1)
        else:
            data[col] = data[col].apply(lambda x: 0 if x == ' Never-married' else 1)
    # feature engineering
    for col in column_names:
        if col not in label_columns + ['tag']:
            if col in categorical_columns:
                # Map each category to an integer id, e.g. rabbit, dog, cat -> 0, 1, 2
                le = LabelEncoder()
                data[col] = le.fit_transform(data[col])
            else:
                # Scale numeric columns to [0, 1]
                mm = MinMaxScaler()
                data[col] = mm.fit_transform(data[[col]]).reshape(-1)
    data = data[['age', 'workclass', 'fnlwgt', 'education', 'education_num', 'occupation',
                 'relationship', 'race', 'sex', 'capital_gain', 'capital_loss', 'hours_per_week', 'native_country',
                 'income_50k', 'marital_status', 'tag']]
    # user feature, item feature: each entry is (vocabulary size, column index)
    user_feature_dict, item_feature_dict = dict(), dict()
    for idx, col in enumerate(data.columns):
        if col not in label_columns + ['tag']:
            if idx < 7:
                if col in categorical_columns:
                    # number of distinct values in the column, plus 1
                    user_feature_dict[col] = (len(data[col].unique()) + 1, idx)
                else:
                    user_feature_dict[col] = (1, idx)
            else:
                if col in categorical_columns:
                    item_feature_dict[col] = (len(data[col].unique()) + 1, idx)
                else:
                    item_feature_dict[col] = (1, idx)
    # Split back into train and test rows; drop() returns new frames, so the slices are not modified in place
    train_data = data[data['tag'] == 1].drop('tag', axis=1)
    test_data = data[data['tag'] == 0].drop('tag', axis=1)
    # val data
    # train_data, val_data = train_test_split(train_data, test_size=0.5, random_state=2021)
    return train_data, test_data, user_feature_dict, item_feature_dict
ESMM.py
import torch
import torch.nn as nn
import numpy as np
import time
import pandas as pd
from adultutils import data_preparation
class Model(torch.nn.Module):
    def __init__(self, user_feature_dict, item_feature_dict, ctr_hid_layers, cvr_hid_layers, dtype, p, embed):
        # The input dimension depends on embed; only the hidden and output layers are configured here
        super(Model, self).__init__()
        self.user_feature_dict = user_feature_dict
        self.item_feature_dict = item_feature_dict
        # nn.ModuleDict registers the embeddings as sub-modules, so they appear in parameters() and are trained
        self.user_embed_layers = nn.ModuleDict()
        self.item_embed_layers = nn.ModuleDict()
        user_cate_feature_nums, item_cate_feature_nums = 0, 0
        for cate, num in self.user_feature_dict.items():
            if num[0] > 1:  # more than one distinct value: use an embedding
                user_cate_feature_nums += 1
                self.user_embed_layers['user_embed_' + str(num[1])] = torch.nn.Embedding(num[0], embed)
        for cate, num in self.item_feature_dict.items():
            if num[0] > 1:
                item_cate_feature_nums += 1
                self.item_embed_layers['item_embed_' + str(num[1])] = torch.nn.Embedding(num[0], embed)
        # Some features are embedded, so the input width depends on how many are embedded;
        # user_cate_feature_nums and item_cate_feature_nums count them for that purpose
        input_dim = embed * (user_cate_feature_nums + item_cate_feature_nums) + \
            (len(user_feature_dict) - user_cate_feature_nums) + (len(item_feature_dict) - item_cate_feature_nums)
        # ------- CTR DNN
        self.ctr_layers = [input_dim] + ctr_hid_layers
        self.ctr_layers_hid_num = len(self.ctr_layers) - 2
        ctr_fc = []
        for i in range(self.ctr_layers_hid_num + 1):
            ctr_fc.append(torch.nn.Linear(self.ctr_layers[i], self.ctr_layers[i + 1]))
        self.ctr_fc = torch.nn.Sequential(*ctr_fc)
        for i in range(self.ctr_layers_hid_num + 1):
            self.ctr_fc[i].weight.data = self.ctr_fc[i].weight.data.type(dtype)
            self.ctr_fc[i].bias.data = self.ctr_fc[i].bias.data.type(dtype)
        # BatchNorm layers are built once here so that their parameters and running statistics persist
        self.ctr_bn = nn.ModuleList([nn.BatchNorm1d(n) for n in self.ctr_layers[1:-1]])
        # ------- CVR DNN
        self.cvr_layers = [input_dim] + cvr_hid_layers
        self.cvr_layers_hid_num = len(self.cvr_layers) - 2
        cvr_fc = []
        for i in range(self.cvr_layers_hid_num + 1):
            cvr_fc.append(torch.nn.Linear(self.cvr_layers[i], self.cvr_layers[i + 1]))
        self.cvr_fc = torch.nn.Sequential(*cvr_fc)
        for i in range(self.cvr_layers_hid_num + 1):
            self.cvr_fc[i].weight.data = self.cvr_fc[i].weight.data.type(dtype)
            self.cvr_fc[i].bias.data = self.cvr_fc[i].bias.data.type(dtype)
        self.cvr_bn = nn.ModuleList([nn.BatchNorm1d(n) for n in self.cvr_layers[1:-1]])
        # ------- dropout
        self.p = p
        self.dropout = nn.Dropout(p)

    def CTR(self, x):
        for i in range(self.ctr_layers_hid_num):
            x = torch.relu(self.ctr_fc[i](x))
            x = self.ctr_bn[i](x)
            x = self.dropout(x)
        return torch.sigmoid(self.ctr_fc[-1](x))

    def CVR(self, x):
        for i in range(self.cvr_layers_hid_num):
            x = torch.relu(self.cvr_fc[i](x))
            x = self.cvr_bn[i](x)
            x = self.dropout(x)
        return torch.sigmoid(self.cvr_fc[-1](x))

    def forward(self, x):
        user_list = list()
        for cate, num in self.user_feature_dict.items():
            if num[0] > 1:
                user_list.append(self.user_embed_layers['user_embed_{}'.format(num[1])](x[:, num[1]].long()))
            else:
                user_list.append(x[:, num[1]].unsqueeze(1))
        item_list = list()
        for cate, num in self.item_feature_dict.items():
            if num[0] > 1:
                item_list.append(self.item_embed_layers['item_embed_{}'.format(num[1])](x[:, num[1]].long()))
            else:
                item_list.append(x[:, num[1]].unsqueeze(1))
        user_embed = torch.cat(user_list, dim=1)
        item_embed = torch.cat(item_list, dim=1)
        x = torch.cat([user_embed, item_embed], dim=1)
        p_ctr = self.CTR(x)
        p_cvr = self.CVR(x)
        p_ctcvr = p_ctr * p_cvr  # pCTCVR = pCTR * pCVR
        return p_ctr, p_ctcvr

    def total_para(self):  # total number of parameters
        return sum([x.numel() for x in self.parameters()])
def Loss(model, x, click, conversion):
    p_ctr, p_ctcvr = model.forward(x)
    loss_fn = nn.BCELoss()
    ctr_loss = loss_fn(p_ctr, click)
    ctcvr_loss = loss_fn(p_ctcvr, conversion)
    total_loss = ctr_loss + ctcvr_loss
    return total_loss

def loadtype(X_train, X_test, y_train, y_test, z_train, z_test, dtype):
    X_train = torch.tensor(X_train).type(dtype)
    X_test = torch.tensor(X_test).type(dtype)
    y_train = torch.tensor(y_train).type(dtype).reshape(-1, 1)
    y_test = torch.tensor(y_test).type(dtype).reshape(-1, 1)
    z_train = torch.tensor(z_train).type(dtype).reshape(-1, 1)
    z_test = torch.tensor(z_test).type(dtype).reshape(-1, 1)
    # No conversion without a click: set z to 0 wherever y is 0
    z_train[y_train == 0] = 0
    z_test[y_test == 0] = 0
    return (X_train, X_test), (y_train, y_test), (z_train, z_test)
def Train(model, X_train, click_train, conversion_train, optim, optimtype, epoch, record, batch):
    iteration = (X_train.shape[0] + batch - 1) // batch  # number of mini-batches per epoch
    loss = Loss(model, X_train, click_train, conversion_train)
    print('start train:the loss:%.4e' % (loss.item()))
    for i in range(epoch):
        for j in range(iteration):
            inds = j * batch
            inde = min((j + 1) * batch, X_train.shape[0])
            x = X_train[inds:inde]
            clk = click_train[inds:inde]
            cov = conversion_train[inds:inde]
            if optimtype in ('LBFGS', 'BFGS'):
                # LBFGS re-evaluates the loss several times per step, so it needs a closure
                def closure():
                    optim.zero_grad()
                    loss = Loss(model, x, clk, cov)
                    loss.backward()
                    return loss
                optim.step(closure)
            else:
                for _ in range(record):  # `record` gradient steps on each mini-batch
                    optim.zero_grad()
                    loss = Loss(model, x, clk, cov)
                    loss.backward()
                    optim.step()
            loss = Loss(model, x, clk, cov)
            print('the %d epoch,the %d iter,the loss:%.4e' % (i + 1, j + 1, loss.item()))
        # Loss and accuracy on the full training set after each epoch
        with torch.no_grad():
            loss = Loss(model, X_train, click_train, conversion_train)
            p_ctr, p_ctcvr = model.forward(X_train)
        ctr_pre = (p_ctr > 0.5).type(click_train.dtype)
        ctcvr_pre = (p_ctcvr > 0.5).type(conversion_train.dtype)
        ctr_acc = ctr_pre.eq(click_train).sum().item() / click_train.shape[0]
        ctcvr_acc = ctcvr_pre.eq(conversion_train).sum().item() / conversion_train.shape[0]
        print('the %d epoch:the loss:%.4e,the ctr acc:%.2e,ctcvr acc:%.2e' % (i + 1, loss.item(), ctr_acc, ctcvr_acc))
embed = 8
p = 0.5
ctr_hid_layers = [256,64,1]
cvr_hid_layers = [256,64,1]
dtype = torch.float32
train_data, test_data, user_feature_dict, item_feature_dict = data_preparation()
#print(user_feature_dict, item_feature_dict)  # inspect which user and item features exist
#print(train_data.shape)  # [32561,15]
train_dataset = (train_data.iloc[:, :-2].values, train_data.iloc[:, -2].values, train_data.iloc[:, -1].values)
test_dataset = (test_data.iloc[:, :-2].values, test_data.iloc[:, -2].values, test_data.iloc[:, -1].values)
# The last two columns of train_data are the labels: income above 50K, and marital status
X_train,y_train,z_train = train_dataset[0],train_dataset[1],train_dataset[2]
X_test,y_test,z_test = test_dataset[0],test_dataset[1],test_dataset[2]
#print(X_train[:2,:5])
model = Model(user_feature_dict, item_feature_dict,ctr_hid_layers,cvr_hid_layers,dtype,p,embed)
(X_train, X_test),(y_train, y_test),(z_train,z_test) = loadtype(X_train,X_test,y_train,y_test,z_train,z_test,dtype)
epoch = 5
batch = 1000
record = 2
lr = 1e-2
optimtype = 'SGD'
#optimtype = 'Adam'
#optimtype = 'LBFGS'
if optimtype == 'SGD':
    optim = torch.optim.SGD(model.parameters(), lr=lr)
elif optimtype == 'Adam':
    optim = torch.optim.Adam(model.parameters(), lr=lr)
elif optimtype == 'LBFGS':
    optim = torch.optim.LBFGS(model.parameters(), lr=lr, max_iter=record,
                              tolerance_grad=1e-16, tolerance_change=1e-16,
                              line_search_fn='strong_wolfe')
Train(model, X_train, y_train, z_train, optim, optimtype, epoch, record, batch)
# Evaluate on the test set
model.eval()  # affects only layers registered on the model (e.g. BatchNorm, Dropout)
with torch.no_grad():
    p_ctr, p_ctcvr = model.forward(X_test)
print(p_ctr[:3], p_ctcvr[:3])
ctr_pre = (p_ctr > 0.5).type(dtype)
ctcvr_pre = (p_ctcvr > 0.5).type(dtype)
ctr_acc = ctr_pre.eq(y_test).sum().item() / y_test.shape[0]
ctcvr_acc = ctcvr_pre.eq(z_test).sum().item() / z_test.shape[0]
print('the pre ctr acc:%.2e,pre ctcvr acc:%.2e' % (ctr_acc, ctcvr_acc))
ESMM+FM.py
import torch
import torch.nn as nn
import numpy as np
import time
import pandas as pd
from adultutils import data_preparation
class Model(torch.nn.Module):
    def __init__(self, user_feature_dict, item_feature_dict, ctr_hid_layers, cvr_hid_layers, dtype, k, embed):
        # The input dimension depends on embed; only the hidden and output layers are configured here
        super(Model, self).__init__()
        self.user_feature_dict = user_feature_dict
        self.item_feature_dict = item_feature_dict
        # nn.ModuleDict registers the embeddings as sub-modules, so they appear in parameters() and are trained
        self.user_embed_layers = nn.ModuleDict()
        self.item_embed_layers = nn.ModuleDict()
        user_cate_feature_nums, item_cate_feature_nums = 0, 0
        for cate, num in self.user_feature_dict.items():
            if num[0] > 1:  # more than one distinct value: use an embedding
                user_cate_feature_nums += 1
                self.user_embed_layers['user_embed_' + str(num[1])] = torch.nn.Embedding(num[0], embed)
        for cate, num in self.item_feature_dict.items():
            if num[0] > 1:
                item_cate_feature_nums += 1
                self.item_embed_layers['item_embed_' + str(num[1])] = torch.nn.Embedding(num[0], embed)
        # Some features are embedded, so the input width depends on how many are embedded;
        # user_cate_feature_nums and item_cate_feature_nums count them for that purpose
        input_dim = embed * (user_cate_feature_nums + item_cate_feature_nums) + \
            (len(user_feature_dict) - user_cate_feature_nums) + (len(item_feature_dict) - item_cate_feature_nums)
        # ------- CTR DNN
        self.ctr_layers = [input_dim] + ctr_hid_layers
        self.ctr_layers_hid_num = len(self.ctr_layers) - 2
        ctr_fc = []
        for i in range(self.ctr_layers_hid_num + 1):
            ctr_fc.append(torch.nn.Linear(self.ctr_layers[i], self.ctr_layers[i + 1]))
        self.ctr_fc = torch.nn.Sequential(*ctr_fc)
        for i in range(self.ctr_layers_hid_num + 1):
            self.ctr_fc[i].weight.data = self.ctr_fc[i].weight.data.type(dtype)
            self.ctr_fc[i].bias.data = self.ctr_fc[i].bias.data.type(dtype)
        # BatchNorm layers are built once here so that their parameters and running statistics persist
        self.ctr_bn = nn.ModuleList([nn.BatchNorm1d(n) for n in self.ctr_layers[1:-1]])
        # ------- CVR DNN
        self.cvr_layers = [input_dim] + cvr_hid_layers
        self.cvr_layers_hid_num = len(self.cvr_layers) - 2
        cvr_fc = []
        for i in range(self.cvr_layers_hid_num + 1):
            cvr_fc.append(torch.nn.Linear(self.cvr_layers[i], self.cvr_layers[i + 1]))
        self.cvr_fc = torch.nn.Sequential(*cvr_fc)
        for i in range(self.cvr_layers_hid_num + 1):
            self.cvr_fc[i].weight.data = self.cvr_fc[i].weight.data.type(dtype)
            self.cvr_fc[i].bias.data = self.cvr_fc[i].bias.data.type(dtype)
        self.cvr_bn = nn.ModuleList([nn.BatchNorm1d(n) for n in self.cvr_layers[1:-1]])
        # ------- FM: linear part w and factor matrix v of shape [input_dim, k]
        self.w = torch.nn.Linear(input_dim, 1)
        self.fm = torch.nn.Sequential(self.w)
        self.fm[0].weight.data = self.fm[0].weight.data.type(dtype)
        self.fm[0].bias.data = self.fm[0].bias.data.type(dtype)
        self.v = torch.nn.Parameter(torch.rand(input_dim, k, dtype=dtype), requires_grad=True)

    def FM(self, x):
        linear_part = self.fm[0](x)
        # Pairwise interactions: 0.5 * sum_f [ (x v)_f^2 - (x^2 v^2)_f ]
        inner_part1 = torch.pow(x, 2) @ torch.pow(self.v, 2)
        inner_part2 = torch.pow(x @ self.v, 2)
        inner_part = 0.5 * (inner_part2 - inner_part1).sum(dim=1, keepdim=True)
        return linear_part + inner_part

    def CTR(self, x):
        linear = self.FM(x)
        for i in range(self.ctr_layers_hid_num):
            x = torch.relu(self.ctr_fc[i](x))
            x = self.ctr_bn[i](x)
        return torch.sigmoid(0.5 * (self.ctr_fc[-1](x) + linear))

    def CVR(self, x):
        linear = self.FM(x)
        for i in range(self.cvr_layers_hid_num):
            x = torch.relu(self.cvr_fc[i](x))
            x = self.cvr_bn[i](x)
        return torch.sigmoid(0.5 * (self.cvr_fc[-1](x) + linear))

    def forward(self, x):
        user_list = list()
        for cate, num in self.user_feature_dict.items():
            if num[0] > 1:
                user_list.append(self.user_embed_layers['user_embed_{}'.format(num[1])](x[:, num[1]].long()))
            else:
                user_list.append(x[:, num[1]].unsqueeze(1))
        item_list = list()
        for cate, num in self.item_feature_dict.items():
            if num[0] > 1:
                item_list.append(self.item_embed_layers['item_embed_{}'.format(num[1])](x[:, num[1]].long()))
            else:
                item_list.append(x[:, num[1]].unsqueeze(1))
        user_embed = torch.cat(user_list, dim=1)
        item_embed = torch.cat(item_list, dim=1)
        x = torch.cat([user_embed, item_embed], dim=1)
        p_ctr = self.CTR(x)
        p_cvr = self.CVR(x)
        p_ctcvr = p_ctr * p_cvr  # pCTCVR = pCTR * pCVR
        return p_ctr, p_ctcvr

    def total_para(self):  # total number of parameters
        return sum([x.numel() for x in self.parameters()])
def Loss(model, x, click, conversion):
    p_ctr, p_ctcvr = model.forward(x)
    loss_fn = nn.BCELoss()
    ctr_loss = loss_fn(p_ctr, click)
    ctcvr_loss = loss_fn(p_ctcvr, conversion)
    total_loss = ctr_loss + ctcvr_loss
    return total_loss

def loadtype(X_train, X_test, y_train, y_test, z_train, z_test, dtype):
    X_train = torch.tensor(X_train).type(dtype)
    X_test = torch.tensor(X_test).type(dtype)
    y_train = torch.tensor(y_train).type(dtype).reshape(-1, 1)
    y_test = torch.tensor(y_test).type(dtype).reshape(-1, 1)
    z_train = torch.tensor(z_train).type(dtype).reshape(-1, 1)
    z_test = torch.tensor(z_test).type(dtype).reshape(-1, 1)
    # No conversion without a click: set z to 0 wherever y is 0
    z_train[y_train == 0] = 0
    z_test[y_test == 0] = 0
    return (X_train, X_test), (y_train, y_test), (z_train, z_test)
def Train(model, X_train, click_train, conversion_train, optim, optimtype, epoch, record, batch):
    iteration = (X_train.shape[0] + batch - 1) // batch  # number of mini-batches per epoch
    loss = Loss(model, X_train, click_train, conversion_train)
    print('start train:the loss:%.4e' % (loss.item()))
    for i in range(epoch):
        for j in range(iteration):
            inds = j * batch
            inde = min((j + 1) * batch, X_train.shape[0])
            x = X_train[inds:inde]
            clk = click_train[inds:inde]
            cov = conversion_train[inds:inde]
            if optimtype in ('LBFGS', 'BFGS'):
                # LBFGS re-evaluates the loss several times per step, so it needs a closure
                def closure():
                    optim.zero_grad()
                    loss = Loss(model, x, clk, cov)
                    loss.backward()
                    return loss
                optim.step(closure)
            else:
                for _ in range(record):  # `record` gradient steps on each mini-batch
                    optim.zero_grad()
                    loss = Loss(model, x, clk, cov)
                    loss.backward()
                    optim.step()
            loss = Loss(model, x, clk, cov)
            print('the %d epoch,the %d iter,the loss:%.4e' % (i + 1, j + 1, loss.item()))
        # Loss and accuracy on the full training set after each epoch
        with torch.no_grad():
            loss = Loss(model, X_train, click_train, conversion_train)
            p_ctr, p_ctcvr = model.forward(X_train)
        ctr_pre = (p_ctr > 0.5).type(click_train.dtype)
        ctcvr_pre = (p_ctcvr > 0.5).type(conversion_train.dtype)
        ctr_acc = ctr_pre.eq(click_train).sum().item() / click_train.shape[0]
        ctcvr_acc = ctcvr_pre.eq(conversion_train).sum().item() / conversion_train.shape[0]
        print('the %d epoch:the loss:%.4e,the ctr acc:%.2e,ctcvr acc:%.2e' % (i + 1, loss.item(), ctr_acc, ctcvr_acc))
embed = 8
k = 10
ctr_hid_layers = [256,64,1]
cvr_hid_layers = [256,64,1]
dtype = torch.float32
train_data, test_data, user_feature_dict, item_feature_dict = data_preparation()
#print(user_feature_dict, item_feature_dict)  # inspect which user and item features exist
#print(train_data.shape)  # [32561,15]
train_dataset = (train_data.iloc[:, :-2].values, train_data.iloc[:, -2].values, train_data.iloc[:, -1].values)
test_dataset = (test_data.iloc[:, :-2].values, test_data.iloc[:, -2].values, test_data.iloc[:, -1].values)
# The last two columns of train_data are the labels: income above 50K, and marital status
X_train,y_train,z_train = train_dataset[0],train_dataset[1],train_dataset[2]
X_test,y_test,z_test = test_dataset[0],test_dataset[1],test_dataset[2]
#print(X_train[:2,:5])
model = Model(user_feature_dict, item_feature_dict,ctr_hid_layers,cvr_hid_layers,dtype,k,embed)
(X_train, X_test),(y_train, y_test),(z_train,z_test) = loadtype(X_train,X_test,y_train,y_test,z_train,z_test,dtype)
epoch = 5
batch = 1000
record = 2
lr = 1e-2
optimtype = 'SGD'
#optimtype = 'Adam'
#optimtype = 'LBFGS'
if optimtype == 'SGD':
    optim = torch.optim.SGD(model.parameters(), lr=lr)
elif optimtype == 'Adam':
    optim = torch.optim.Adam(model.parameters(), lr=lr)
elif optimtype == 'LBFGS':
    optim = torch.optim.LBFGS(model.parameters(), lr=lr, max_iter=record,
                              tolerance_grad=1e-16, tolerance_change=1e-16,
                              line_search_fn='strong_wolfe')
Train(model, X_train, y_train, z_train, optim, optimtype, epoch, record, batch)
# Evaluate on the test set
model.eval()  # affects only layers registered on the model (e.g. BatchNorm)
with torch.no_grad():
    p_ctr, p_ctcvr = model.forward(X_test)
print(p_ctr[:3], p_ctcvr[:3])
ctr_pre = (p_ctr > 0.5).type(dtype)
ctcvr_pre = (p_ctcvr > 0.5).type(dtype)
ctr_acc = ctr_pre.eq(y_test).sum().item() / y_test.shape[0]
ctcvr_acc = ctcvr_pre.eq(z_test).sum().item() / z_test.shape[0]
print('the pre ctr acc:%.2e,pre ctcvr acc:%.2e' % (ctr_acc, ctcvr_acc))
ESMM-dropout.py
import torch
import torch.nn as nn
import numpy as np
import time
import pandas as pd
from adultutils import data_preparation
class Model(torch.nn.Module):
    def __init__(self, user_feature_dict, item_feature_dict, ctr_hid_layers, cvr_hid_layers, dtype, embed):
        # The input dimension depends on embed; only the hidden and output layers are configured here
        super(Model, self).__init__()
        self.user_feature_dict = user_feature_dict
        self.item_feature_dict = item_feature_dict
        # nn.ModuleDict registers the embeddings as sub-modules, so they appear in parameters() and are trained
        self.user_embed_layers = nn.ModuleDict()
        self.item_embed_layers = nn.ModuleDict()
        user_cate_feature_nums, item_cate_feature_nums = 0, 0
        for cate, num in self.user_feature_dict.items():
            if num[0] > 1:  # more than one distinct value: use an embedding
                user_cate_feature_nums += 1
                self.user_embed_layers['user_embed_' + str(num[1])] = torch.nn.Embedding(num[0], embed)
        for cate, num in self.item_feature_dict.items():
            if num[0] > 1:
                item_cate_feature_nums += 1
                self.item_embed_layers['item_embed_' + str(num[1])] = torch.nn.Embedding(num[0], embed)
        # Some features are embedded, so the input width depends on how many are embedded;
        # user_cate_feature_nums and item_cate_feature_nums count them for that purpose
        input_dim = embed * (user_cate_feature_nums + item_cate_feature_nums) + \
            (len(user_feature_dict) - user_cate_feature_nums) + (len(item_feature_dict) - item_cate_feature_nums)
        # ------- CTR DNN
        self.ctr_layers = [input_dim] + ctr_hid_layers
        self.ctr_layers_hid_num = len(self.ctr_layers) - 2
        ctr_fc = []
        for i in range(self.ctr_layers_hid_num + 1):
            ctr_fc.append(torch.nn.Linear(self.ctr_layers[i], self.ctr_layers[i + 1]))
        self.ctr_fc = torch.nn.Sequential(*ctr_fc)
        for i in range(self.ctr_layers_hid_num + 1):
            self.ctr_fc[i].weight.data = self.ctr_fc[i].weight.data.type(dtype)
            self.ctr_fc[i].bias.data = self.ctr_fc[i].bias.data.type(dtype)
        # BatchNorm layers are built once here so that their parameters and running statistics persist
        self.ctr_bn = nn.ModuleList([nn.BatchNorm1d(n) for n in self.ctr_layers[1:-1]])
        # ------- CVR DNN
        self.cvr_layers = [input_dim] + cvr_hid_layers
        self.cvr_layers_hid_num = len(self.cvr_layers) - 2
        cvr_fc = []
        for i in range(self.cvr_layers_hid_num + 1):
            cvr_fc.append(torch.nn.Linear(self.cvr_layers[i], self.cvr_layers[i + 1]))
        self.cvr_fc = torch.nn.Sequential(*cvr_fc)
        for i in range(self.cvr_layers_hid_num + 1):
            self.cvr_fc[i].weight.data = self.cvr_fc[i].weight.data.type(dtype)
            self.cvr_fc[i].bias.data = self.cvr_fc[i].bias.data.type(dtype)
        self.cvr_bn = nn.ModuleList([nn.BatchNorm1d(n) for n in self.cvr_layers[1:-1]])

    def CTR(self, x):
        for i in range(self.ctr_layers_hid_num):
            x = torch.relu(self.ctr_fc[i](x))
            x = self.ctr_bn[i](x)
        return torch.sigmoid(self.ctr_fc[-1](x))

    def CVR(self, x):
        for i in range(self.cvr_layers_hid_num):
            x = torch.relu(self.cvr_fc[i](x))
            x = self.cvr_bn[i](x)
        return torch.sigmoid(self.cvr_fc[-1](x))

    def forward(self, x):
        user_list = list()
        for cate, num in self.user_feature_dict.items():
            if num[0] > 1:
                user_list.append(self.user_embed_layers['user_embed_{}'.format(num[1])](x[:, num[1]].long()))
            else:
                user_list.append(x[:, num[1]].unsqueeze(1))
        item_list = list()
        for cate, num in self.item_feature_dict.items():
            if num[0] > 1:
                item_list.append(self.item_embed_layers['item_embed_{}'.format(num[1])](x[:, num[1]].long()))
            else:
                item_list.append(x[:, num[1]].unsqueeze(1))
        user_embed = torch.cat(user_list, dim=1)
        item_embed = torch.cat(item_list, dim=1)
        x = torch.cat([user_embed, item_embed], dim=1)
        p_ctr = self.CTR(x)
        p_cvr = self.CVR(x)
        p_ctcvr = p_ctr * p_cvr  # pCTCVR = pCTR * pCVR
        return p_ctr, p_ctcvr

    def total_para(self):  # total number of parameters
        return sum([x.numel() for x in self.parameters()])
def Loss(model, x, click, conversion):
    p_ctr, p_ctcvr = model.forward(x)
    loss_fn = nn.BCELoss()
    ctr_loss = loss_fn(p_ctr, click)
    ctcvr_loss = loss_fn(p_ctcvr, conversion)
    total_loss = ctr_loss + ctcvr_loss
    return total_loss

def loadtype(X_train, X_test, y_train, y_test, z_train, z_test, dtype):
    X_train = torch.tensor(X_train).type(dtype)
    X_test = torch.tensor(X_test).type(dtype)
    y_train = torch.tensor(y_train).type(dtype).reshape(-1, 1)
    y_test = torch.tensor(y_test).type(dtype).reshape(-1, 1)
    z_train = torch.tensor(z_train).type(dtype).reshape(-1, 1)
    z_test = torch.tensor(z_test).type(dtype).reshape(-1, 1)
    # No conversion without a click: set z to 0 wherever y is 0
    z_train[y_train == 0] = 0
    z_test[y_test == 0] = 0
    return (X_train, X_test), (y_train, y_test), (z_train, z_test)
def Train(model, X_train, click_train, conversion_train, optim, optimtype, epoch, record, batch):
    iteration = (X_train.shape[0] + batch - 1) // batch  # number of mini-batches per epoch
    loss = Loss(model, X_train, click_train, conversion_train)
    print('start train:the loss:%.4e' % (loss.item()))
    for i in range(epoch):
        for j in range(iteration):
            inds = j * batch
            inde = min((j + 1) * batch, X_train.shape[0])
            x = X_train[inds:inde]
            clk = click_train[inds:inde]
            cov = conversion_train[inds:inde]
            if optimtype in ('LBFGS', 'BFGS'):
                # LBFGS re-evaluates the loss several times per step, so it needs a closure
                def closure():
                    optim.zero_grad()
                    loss = Loss(model, x, clk, cov)
                    loss.backward()
                    return loss
                optim.step(closure)
            else:
                for _ in range(record):  # `record` gradient steps on each mini-batch
                    optim.zero_grad()
                    loss = Loss(model, x, clk, cov)
                    loss.backward()
                    optim.step()
            loss = Loss(model, x, clk, cov)
            print('the %d epoch,the %d iter,the loss:%.4e' % (i + 1, j + 1, loss.item()))
        # Loss and accuracy on the full training set after each epoch
        with torch.no_grad():
            loss = Loss(model, X_train, click_train, conversion_train)
            p_ctr, p_ctcvr = model.forward(X_train)
        ctr_pre = (p_ctr > 0.5).type(click_train.dtype)
        ctcvr_pre = (p_ctcvr > 0.5).type(conversion_train.dtype)
        ctr_acc = ctr_pre.eq(click_train).sum().item() / click_train.shape[0]
        ctcvr_acc = ctcvr_pre.eq(conversion_train).sum().item() / conversion_train.shape[0]
        print('the %d epoch:the loss:%.4e,the ctr acc:%.2e,ctcvr acc:%.2e' % (i + 1, loss.item(), ctr_acc, ctcvr_acc))
embed = 8
k = 10
ctr_hid_layers = [256,64,1]
cvr_hid_layers = [256,64,1]
dtype = torch.float32
train_data, test_data, user_feature_dict, item_feature_dict = data_preparation()
#print(user_feature_dict, item_feature_dict)  # inspect which user and item features exist
#print(train_data.shape)  # [32561,15]
train_dataset = (train_data.iloc[:, :-2].values, train_data.iloc[:, -2].values, train_data.iloc[:, -1].values)
test_dataset = (test_data.iloc[:, :-2].values, test_data.iloc[:, -2].values, test_data.iloc[:, -1].values)
# The last two columns of train_data are the labels: income above 50K, and marital status
X_train,y_train,z_train = train_dataset[0],train_dataset[1],train_dataset[2]
X_test,y_test,z_test = test_dataset[0],test_dataset[1],test_dataset[2]
#print(X_train[:2,:5])
model = Model(user_feature_dict, item_feature_dict,ctr_hid_layers,cvr_hid_layers,dtype,embed)
(X_train, X_test),(y_train, y_test),(z_train,z_test) = loadtype(X_train,X_test,y_train,y_test,z_train,z_test,dtype)
epoch = 5
batch = 1000
record = 2
lr = 1e-2
optimtype = 'SGD'
#optimtype = 'Adam'
#optimtype = 'LBFGS'
if optimtype == 'SGD':
    optim = torch.optim.SGD(model.parameters(), lr=lr)
elif optimtype == 'Adam':
    optim = torch.optim.Adam(model.parameters(), lr=lr)
elif optimtype == 'LBFGS':
    optim = torch.optim.LBFGS(model.parameters(), lr=lr, max_iter=record,
                              tolerance_grad=1e-16, tolerance_change=1e-16,
                              line_search_fn='strong_wolfe')
Train(model, X_train, y_train, z_train, optim, optimtype, epoch, record, batch)
# Evaluate on the test set
model.eval()  # affects only layers registered on the model (e.g. BatchNorm)
with torch.no_grad():
    p_ctr, p_ctcvr = model.forward(X_test)
ctr_pre = (p_ctr > 0.5).type(dtype)
ctcvr_pre = (p_ctcvr > 0.5).type(dtype)
ctr_acc = ctr_pre.eq(y_test).sum().item() / y_test.shape[0]
ctcvr_acc = ctcvr_pre.eq(z_test).sum().item() / z_test.shape[0]
print('the pre ctr acc:%.2e,pre ctcvr acc:%.2e' % (ctr_acc, ctcvr_acc))
Cross-entropy ESMM+FM
better
import torch
import torch.nn as nn
import numpy as np
import time
import pandas as pd
from adultutils import data_preparation
np.random.seed(1234)
torch.manual_seed(1234)
class Model(torch.nn.Module):
def __init__(self, user_feature_dict, item_feature_dict,ctr_hid_layers,cvr_hid_layers,dtype,k,embed):
super(Model, self).__init__()#输入维度与embed有关,这里只设置hidden层和输出层
self.user_feature_dict = user_feature_dict
self.item_feature_dict = item_feature_dict
self.user_embed_layers = dict()
self.item_embed_layers = dict()
user_cate_feature_nums, item_cate_feature_nums = 0, 0
for cate,num in self.user_feature_dict.items():
if num[0] > 1:#如果该特征不同取值大于1,那么使用embed
user_cate_feature_nums += 1
self.user_embed_layers['user_embed_' + str(num[1])] = torch.nn.Embedding(num[0], embed)
for cate,num in self.item_feature_dict.items():
if num[0] > 1:
item_cate_feature_nums += 1
self.item_embed_layers['item_embed_' + str(num[1])] = torch.nn.Embedding(num[0], embed)
#注意,有部分特征做了embedding,因此网络接受的输入节点数目跟embedding的特征数目有关
#上面引入user_cate_feature_nums就是为了统计用户特征中做了embedding的特征数目,方便后面计算输入维度
input_dim = embed * (user_cate_feature_nums + item_cate_feature_nums) + \
(len(user_feature_dict) - user_cate_feature_nums) + (len(item_feature_dict) - item_cate_feature_nums)
#-------ctr DNN
self.ctr_layers = [input_dim] + ctr_hid_layers
self.ctr_layers_hid_num = len(self.ctr_layers)-2
ctr_fc = []
for i in range(self.ctr_layers_hid_num+1):
ctr_fc.append(torch.nn.Linear(self.ctr_layers[i],self.ctr_layers[i+1]))
self.ctr_fc = torch.nn.Sequential(*ctr_fc)
for i in range(self.ctr_layers_hid_num+1):
self.ctr_fc[i].weight.data = self.ctr_fc[i].weight.data.type(dtype)
self.ctr_fc[i].bias.data = self.ctr_fc[i].bias.data.type(dtype)
#--------cvr DNN
self.cvr_layers = [input_dim] + cvr_hid_layers
self.cvr_layers_hid_num = len(self.cvr_layers)-2
cvr_fc = []
for i in range(self.cvr_layers_hid_num+1):
cvr_fc.append(torch.nn.Linear(self.cvr_layers[i],self.cvr_layers[i+1]))
self.cvr_fc = torch.nn.Sequential(*cvr_fc)
for i in range(self.cvr_layers_hid_num+1):
self.cvr_fc[i].weight.data = self.cvr_fc[i].weight.data.type(dtype)
self.cvr_fc[i].bias.data = self.cvr_fc[i].bias.data.type(dtype)
#-------------FM
self.w = torch.nn.Linear(input_dim,2)
fm = [];fm.append(self.w)
self.fm = torch.nn.Sequential(*fm)
self.fm[0].weight.data = self.fm[0].weight.data.type(dtype)
self.fm[0].bias.data = self.fm[0].bias.data.type(dtype)
self.v = torch.nn.Parameter(torch.FloatTensor(torch.rand(input_dim,k)), requires_grad=True)
self.v = self.v.type(dtype)
#----------
def reg(self,x):
tmp1 = 1/(1 + torch.exp(x[:,1:2] - x[:,0:1]))
tmp2 = 1/(1 + torch.exp(x[:,0:1] - x[:,1:2]))
return torch.cat([tmp1,tmp2],dim = 1)
def FM(self,x):
linear_part = self.fm[0](x)
inner_part1 = torch.pow(x,2)@torch.pow(self.v,2)
inner_part2 = torch.pow((x@self.v),2)
inner = 0.5*(inner_part2 - inner_part1).sum(axis = 1,keepdims = True)
inner_part = torch.cat([inner,inner],axis = 1)
return linear_part + inner_part
def CTR(self,x):
#x = nn.BatchNorm1d(x.shape[0])(x)
linear = self.FM(x)
for i in range(self.ctr_layers_hid_num):
x = torch.relu(self.ctr_fc[i](x))#.to(device)
temp = torch.eye(x.shape[-1],self.ctr_layers[i+1])
x = x + x@temp
x = nn.BatchNorm1d(self.ctr_layers[i + 1])(x)
#x = nn.Dropout(self.p)(x)
return (self.ctr_fc[-1](x) + linear)
def CVR(self,x):
linear = self.FM(x)
for i in range(self.cvr_layers_hid_num):
x = torch.relu(self.cvr_fc[i](x))#.to(device)
temp = torch.eye(x.shape[-1],self.cvr_layers[i+1])
x = x + x@temp
x = nn.BatchNorm1d(self.cvr_layers[i + 1])(x)
#x = nn.Dropout(self.p)(x)
return torch.sigmoid(self.cvr_fc[-1](x) + linear)
def forward(self,x):
user_list = list()
for cate,num in self.user_feature_dict.items():
if num[0] > 1:
user_list.append(self.user_embed_layers['user_embed_{}'.format(num[1])](x[:,num[1]].long()))
else:
user_list.append(x[:,num[1]].unsqueeze(1))
item_list = list()
for cate,num in self.item_feature_dict.items():
if num[0] > 1:
item_list.append(self.item_embed_layers['item_embed_{}'.format(num[1])](x[:,num[1]].long()))
else:
item_list.append(x[:,num[1]].unsqueeze(1))
user_embed = torch.cat(user_list, axis=1)
item_embed = torch.cat(item_list, axis=1)
x = torch.cat([user_embed, item_embed], axis=1)
p_ctr = self.CTR(x)
p_cvr = self.CVR(x)
ind = p_ctr.argmax(dim = 1)
p_ctcvr = p_cvr*(torch.eye(2)[ind])
return p_ctr,p_ctcvr
def total_para(self):  # count the total number of parameters
return sum([x.numel() for x in self.parameters()])
def pred(model,x):
p_ctr,p_ctcvr = model.forward(x)
ctr_pre = (p_ctr.argmax(dim = 1)).reshape(-1,1)
ctcvr_pre = (p_ctcvr.argmax(dim = 1)).reshape(-1,1)
return ctr_pre,ctcvr_pre
def Loss(model,x,click,conversion):
criteon = nn.CrossEntropyLoss()
p_ctr,p_ctcvr = model.forward(x)
ctr_loss = criteon(p_ctr,click.squeeze().type(torch.long))
loss_fn = nn.CrossEntropyLoss()
ctcvr_loss = loss_fn(p_ctcvr,conversion.squeeze().type(torch.long))
total_loss = ctr_loss + ctcvr_loss
return total_loss
def loadtype(X_train,X_test,y_train,y_test,z_train,z_test,dtype):
    # convert the NumPy arrays to tensors; labels become column vectors of shape (N, 1)
    X_train = torch.tensor(X_train).type(dtype)
    X_test = torch.tensor(X_test).type(dtype)
    y_train = torch.tensor(y_train).type(dtype).reshape(-1,1)
    y_test = torch.tensor(y_test).type(dtype).reshape(-1,1)
    z_train = torch.tensor(z_train).type(dtype).reshape(-1,1)
    z_test = torch.tensor(z_test).type(dtype).reshape(-1,1)
    # a sample without a click (y == 0) cannot count as converted, so force z to 0 there
    z_train[y_train == 0] = 0
    z_test[y_test == 0] = 0
    return (X_train, X_test),(y_train, y_test),(z_train, z_test)
def Train(model,X_train,click_train,conversion_train,optim,optimtype,epoch,record,batch):
    iteration = (X_train.shape[0] + batch - 1)//(batch)
    loss = Loss(model,X_train,click_train,conversion_train)
    print('start train:the loss:%.4e'%(loss.item()))
    for i in range(epoch):
        for j in range(iteration):
            inds = j*batch
            inde = min((j + 1)*batch, X_train.shape[0])
            x = X_train[inds:inde]
            clk = click_train[inds:inde]
            cov = conversion_train[inds:inde]
            # `optimtype == 'LBFGS' or 'BFGS'` is always true; test membership instead
            if optimtype in ('LBFGS', 'BFGS'):
                # L-BFGS re-evaluates the objective, so the closure must rebuild the loss on every call
                def closure():
                    optim.zero_grad()
                    loss = Loss(model,x,clk,cov)
                    loss.backward()
                    return loss
                optim.step(closure)
            else:
                # take `record` gradient steps on the current mini-batch
                for _ in range(record):
                    optim.zero_grad()
                    loss = Loss(model,x,clk,cov)
                    loss.backward()
                    optim.step()
            loss = Loss(model,x,clk,cov)
            print('the %d epoch,the %d iter,the loss:%.4e'%(i + 1,j + 1,loss.item()))
        # report loss and accuracy on the full training set after each epoch
        loss = Loss(model,X_train,click_train,conversion_train)
        ctr_pre,ctcvr_pre = pred(model,X_train)
        ctr_acc = ctr_pre.eq(click_train).sum()/click_train.shape[0]
        ctcvr_acc = ctcvr_pre.eq(conversion_train).sum()/conversion_train.shape[0]
        print('the %d epoch:the loss:%.4e,the ctr acc:%.2e,ctcvr acc:%.2e'%(i + 1,loss.item(),ctr_acc,ctcvr_acc))
embed = 8
p = 10  # passed to Model as k, the latent dimension of the FM factor matrix v
ctr_hid_layers = [128,64,2]
cvr_hid_layers = [128,64,2]
dtype = torch.float32
train_data, test_data, user_feature_dict, item_feature_dict = data_preparation()
#print(user_feature_dict, item_feature_dict)  # inspect which user and item features are present
#print(train_data.shape)#[32561,15]
train_dataset = (train_data.iloc[:, :-2].values, train_data.iloc[:, -2].values, train_data.iloc[:, -1].values)
test_dataset = (test_data.iloc[:, :-2].values, test_data.iloc[:, -2].values, test_data.iloc[:, -1].values)
#the last two columns of train_data are the labels: whether income exceeds 50K, and marital status
X_train,y_train,z_train = train_dataset[0],train_dataset[1],train_dataset[2]
X_test,y_test,z_test = test_dataset[0],test_dataset[1],test_dataset[2]
#print(X_train[:2,:5])
model = Model(user_feature_dict, item_feature_dict,ctr_hid_layers,cvr_hid_layers,dtype,p,embed)
(X_train, X_test),(y_train, y_test),(z_train,z_test) = loadtype(X_train,X_test,y_train,y_test,z_train,z_test,dtype)
epoch = 10
batch = 1000
record = 2
lr = 1e-2
optimtype = 'SGD'
#optimtype = 'Adam'
#optimtype = 'LBFGS'
if optimtype == 'SGD':
optim = torch.optim.SGD(model.parameters(),lr = lr)
elif optimtype == 'Adam':
optim = torch.optim.Adam(model.parameters(),lr = lr)
elif optimtype == 'LBFGS':
optim = torch.optim.LBFGS(model.parameters(),lr = lr,max_iter = record,
tolerance_grad=1e-16, tolerance_change=1e-16,
line_search_fn='strong_wolfe')
Train(model,X_train,y_train,z_train,optim,optimtype,epoch,record,batch)
ctr_pre,ctcvr_pre = pred(model,X_test)
ctr_acc = ctr_pre.eq(y_test).sum()/y_test.shape[0]
ctcvr_acc = ctcvr_pre.eq(z_test).sum()/z_test.shape[0]
print('the pre ctr acc:%.2e,pre ctcvr acc:%.2e'%(ctr_acc,ctcvr_acc))
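The `FM` method in the listing above computes the second-order term as `0.5*((x@v)^2 - (x^2)@(v^2))` summed over the latent dimension. The following is a minimal sketch that checks this shortcut against the explicit double sum over feature pairs. The function names `fm_pairwise_naive` and `fm_pairwise_fast` and the toy shapes are assumptions made for illustration; they are not part of the original code.

```python
import torch

def fm_pairwise_naive(x, v):
    """Sum over i<j of <v_i, v_j> * x_i * x_j, written as explicit loops."""
    n, d = x.shape
    out = torch.zeros(n, 1, dtype=x.dtype)
    for i in range(d):
        for j in range(i + 1, d):
            out[:, 0] += (v[i] @ v[j]) * x[:, i] * x[:, j]
    return out

def fm_pairwise_fast(x, v):
    """Same quantity via 0.5 * sum_f [ (x @ v)_f^2 - (x^2 @ v^2)_f ]."""
    square_of_sum = torch.pow(x @ v, 2)
    sum_of_square = torch.pow(x, 2) @ torch.pow(v, 2)
    return 0.5 * (square_of_sum - sum_of_square).sum(dim=1, keepdim=True)

torch.manual_seed(0)
x = torch.rand(4, 6, dtype=torch.float64)   # 4 samples, 6 features (toy sizes)
v = torch.rand(6, 3, dtype=torch.float64)   # latent dimension k = 3
fm_gap = (fm_pairwise_naive(x, v) - fm_pairwise_fast(x, v)).abs().max().item()
print(fm_gap)  # difference between the two formulations
```

The fast form costs O(d*k) per sample instead of O(d^2*k), which is presumably why the listing uses it.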
best
import torch
import torch.nn as nn
import numpy as np
import time
import pandas as pd
from adultutils import data_preparation
np.random.seed(1234)
torch.manual_seed(1234)
class Model(torch.nn.Module):
def __init__(self, user_feature_dict, item_feature_dict,ctr_hid_layers,cvr_hid_layers,dtype,k,embed):
super(Model, self).__init__()  # the input dimension depends on embed; only hidden and output layers are configured here
self.user_feature_dict = user_feature_dict
self.item_feature_dict = item_feature_dict
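# a plain dict hides the embeddings from model.parameters(); ModuleDict registers them so the optimizer updates them
self.user_embed_layers = torch.nn.ModuleDict()
self.item_embed_layers = torch.nn.ModuleDict()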
user_cate_feature_nums, item_cate_feature_nums = 0, 0
for cate,num in self.user_feature_dict.items():
if num[0] > 1:  # embed the feature if it has more than one distinct value
user_cate_feature_nums += 1
self.user_embed_layers['user_embed_' + str(num[1])] = torch.nn.Embedding(num[0], embed)
for cate,num in self.item_feature_dict.items():
if num[0] > 1:
item_cate_feature_nums += 1
self.item_embed_layers['item_embed_' + str(num[1])] = torch.nn.Embedding(num[0], embed)
#note: some features are embedded, so the number of input nodes depends on how many features are embedded
#user_cate_feature_nums counts the embedded user features so that the input dimension can be computed below
input_dim = embed * (user_cate_feature_nums + item_cate_feature_nums) + \
(len(user_feature_dict) - user_cate_feature_nums) + (len(item_feature_dict) - item_cate_feature_nums)
#-------ctr DNN
self.ctr_layers = [input_dim] + ctr_hid_layers
self.ctr_layers_hid_num = len(self.ctr_layers)-2
ctr_fc = []
for i in range(self.ctr_layers_hid_num+1):
ctr_fc.append(torch.nn.Linear(self.ctr_layers[i],self.ctr_layers[i+1]))
self.ctr_fc = torch.nn.Sequential(*ctr_fc)
for i in range(self.ctr_layers_hid_num+1):
self.ctr_fc[i].weight.data = self.ctr_fc[i].weight.data.type(dtype)
self.ctr_fc[i].bias.data = self.ctr_fc[i].bias.data.type(dtype)
#--------cvr DNN
self.cvr_layers = [input_dim] + cvr_hid_layers
self.cvr_layers_hid_num = len(self.cvr_layers)-2
cvr_fc = []
for i in range(self.cvr_layers_hid_num+1):
cvr_fc.append(torch.nn.Linear(self.cvr_layers[i],self.cvr_layers[i+1]))
self.cvr_fc = torch.nn.Sequential(*cvr_fc)
for i in range(self.cvr_layers_hid_num+1):
self.cvr_fc[i].weight.data = self.cvr_fc[i].weight.data.type(dtype)
self.cvr_fc[i].bias.data = self.cvr_fc[i].bias.data.type(dtype)
#-------------FM
self.w = torch.nn.Linear(input_dim,2)
fm = [];fm.append(self.w)
self.fm = torch.nn.Sequential(*fm)
self.fm[0].weight.data = self.fm[0].weight.data.type(dtype)
self.fm[0].bias.data = self.fm[0].bias.data.type(dtype)
self.v = torch.nn.Parameter(torch.FloatTensor(torch.rand(input_dim,k)), requires_grad=True)
self.v = self.v.type(dtype)
#----------
def FM(self,x):
linear_part = self.fm[0](x)
inner_part1 = torch.pow(x,2)@torch.pow(self.v,2)
inner_part2 = torch.pow((x@self.v),2)
inner = 0.5*(inner_part2 - inner_part1).sum(axis = 1,keepdims = True)
inner_part = torch.cat([inner,inner],axis = 1)
return linear_part + inner_part
def CTR(self,x):
#x = nn.BatchNorm1d(x.shape[0])(x)
linear = self.FM(x)
for i in range(self.ctr_layers_hid_num):
x = torch.relu(self.ctr_fc[i](x))#.to(device)
temp = torch.eye(x.shape[-1],self.ctr_layers[i+1])
x = x + x@temp
x = nn.BatchNorm1d(self.ctr_layers[i + 1])(x)
#x = nn.Dropout(self.p)(x)
return (self.ctr_fc[-1](x) + linear)
def CVR(self,x):
#x = nn.BatchNorm1d(x.shape[0])(x)
for i in range(self.cvr_layers_hid_num):
x = torch.relu(self.cvr_fc[i](x))#.to(device)
temp = torch.eye(x.shape[-1],self.cvr_layers[i+1])
x = x + x@temp
x = nn.BatchNorm1d(self.cvr_layers[i + 1])(x)
#x = nn.Dropout(self.p)(x)
return torch.sigmoid(self.cvr_fc[-1](x))
def forward(self,x):
user_list = list()
for cate,num in self.user_feature_dict.items():
if num[0] > 1:
user_list.append(self.user_embed_layers['user_embed_{}'.format(num[1])](x[:,num[1]].long()))
else:
user_list.append(x[:,num[1]].unsqueeze(1))
item_list = list()
for cate,num in self.item_feature_dict.items():
if num[0] > 1:
item_list.append(self.item_embed_layers['item_embed_{}'.format(num[1])](x[:,num[1]].long()))
else:
item_list.append(x[:,num[1]].unsqueeze(1))
user_embed = torch.cat(user_list, axis=1)
item_embed = torch.cat(item_list, axis=1)
x = torch.cat([user_embed, item_embed], axis=1)
p_ctr = self.CTR(x)
p_cvr = self.CVR(x)
p_ctcvr = (p_ctr.argmax(dim = 1).reshape(-1,1))*p_cvr
return p_ctr,p_ctcvr
def total_para(self):  # count the total number of parameters
return sum([x.numel() for x in self.parameters()])
def pred(model,x):
    p_ctr,p_ctcvr = model.forward(x)
    ctr_pre = (p_ctr.argmax(dim = 1)).reshape(-1,1)
    # threshold the CTCVR probability at 0.5 in one vectorized step
    ctcvr_pre = (p_ctcvr > 0.5).long().reshape(-1,1)
    return ctr_pre,ctcvr_pre
def Loss(model,x,click,conversion):
criteon = nn.CrossEntropyLoss()
p_ctr,p_ctcvr = model.forward(x)
ctr_loss = criteon(p_ctr,click.squeeze().type(torch.long))
loss_fn = nn.BCELoss()
ctcvr_loss = loss_fn(p_ctcvr,conversion)
total_loss = ctr_loss + ctcvr_loss
return total_loss
def loadtype(X_train,X_test,y_train,y_test,z_train,z_test,dtype):
    # convert the NumPy arrays to tensors; labels become column vectors of shape (N, 1)
    X_train = torch.tensor(X_train).type(dtype)
    X_test = torch.tensor(X_test).type(dtype)
    y_train = torch.tensor(y_train).type(dtype).reshape(-1,1)
    y_test = torch.tensor(y_test).type(dtype).reshape(-1,1)
    z_train = torch.tensor(z_train).type(dtype).reshape(-1,1)
    z_test = torch.tensor(z_test).type(dtype).reshape(-1,1)
    # a sample without a click (y == 0) cannot count as converted, so force z to 0 there
    z_train[y_train == 0] = 0
    z_test[y_test == 0] = 0
    return (X_train, X_test),(y_train, y_test),(z_train, z_test)
def Train(model,X_train,click_train,conversion_train,optim,optimtype,epoch,record,batch):
    iteration = (X_train.shape[0] + batch - 1)//(batch)
    loss = Loss(model,X_train,click_train,conversion_train)
    print('start train:the loss:%.4e'%(loss.item()))
    for i in range(epoch):
        for j in range(iteration):
            inds = j*batch
            inde = min((j + 1)*batch, X_train.shape[0])
            x = X_train[inds:inde]
            clk = click_train[inds:inde]
            cov = conversion_train[inds:inde]
            # `optimtype == 'LBFGS' or 'BFGS'` is always true; test membership instead
            if optimtype in ('LBFGS', 'BFGS'):
                # L-BFGS re-evaluates the objective, so the closure must rebuild the loss on every call
                def closure():
                    optim.zero_grad()
                    loss = Loss(model,x,clk,cov)
                    loss.backward()
                    return loss
                optim.step(closure)
            else:
                # take `record` gradient steps on the current mini-batch
                for _ in range(record):
                    optim.zero_grad()
                    loss = Loss(model,x,clk,cov)
                    loss.backward()
                    optim.step()
            loss = Loss(model,x,clk,cov)
            print('the %d epoch,the %d iter,the loss:%.4e'%(i + 1,j + 1,loss.item()))
        # report loss and accuracy on the full training set after each epoch
        loss = Loss(model,X_train,click_train,conversion_train)
        ctr_pre,ctcvr_pre = pred(model,X_train)
        ctr_acc = ctr_pre.eq(click_train).sum()/click_train.shape[0]
        ctcvr_acc = ctcvr_pre.eq(conversion_train).sum()/conversion_train.shape[0]
        print('the %d epoch:the loss:%.4e,the ctr acc:%.2e,ctcvr acc:%.2e'%(i + 1,loss.item(),ctr_acc,ctcvr_acc))
embed = 8
p = 10
ctr_hid_layers = [128,64,2]
cvr_hid_layers = [128,64,1]
dtype = torch.float32
train_data, test_data, user_feature_dict, item_feature_dict = data_preparation()
#print(user_feature_dict, item_feature_dict)  # inspect which user and item features are present
#print(train_data.shape)#[32561,15]
train_dataset = (train_data.iloc[:, :-2].values, train_data.iloc[:, -2].values, train_data.iloc[:, -1].values)
test_dataset = (test_data.iloc[:, :-2].values, test_data.iloc[:, -2].values, test_data.iloc[:, -1].values)
#the last two columns of train_data are the labels: whether income exceeds 50K, and marital status
X_train,y_train,z_train = train_dataset[0],train_dataset[1],train_dataset[2]
X_test,y_test,z_test = test_dataset[0],test_dataset[1],test_dataset[2]
#print(X_train[:2,:5])
model = Model(user_feature_dict, item_feature_dict,ctr_hid_layers,cvr_hid_layers,dtype,p,embed)
(X_train, X_test),(y_train, y_test),(z_train,z_test) = loadtype(X_train,X_test,y_train,y_test,z_train,z_test,dtype)
epoch = 10
batch = 1000
record = 2
lr = 1e-2
optimtype = 'SGD'
#optimtype = 'Adam'
#optimtype = 'LBFGS'
if optimtype == 'SGD':
optim = torch.optim.SGD(model.parameters(),lr = lr)
elif optimtype == 'Adam':
optim = torch.optim.Adam(model.parameters(),lr = lr)
elif optimtype == 'LBFGS':
optim = torch.optim.LBFGS(model.parameters(),lr = lr,max_iter = record,
tolerance_grad=1e-16, tolerance_change=1e-16,
line_search_fn='strong_wolfe')
Train(model,X_train,y_train,z_train,optim,optimtype,epoch,record,batch)
ctr_pre,ctcvr_pre = pred(model,X_test)
ctr_acc = ctr_pre.eq(y_test).sum()/y_test.shape[0]
ctcvr_acc = ctcvr_pre.eq(z_test).sum()/z_test.shape[0]
print('the pre ctr acc:%.2e,pre ctcvr acc:%.2e'%(ctr_acc,ctcvr_acc))
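The listing above forms `p_ctcvr` by multiplying the CVR output with the hard `argmax` click decision, which passes no gradient back to the CTR tower through the CTCVR loss. The objective stated at the top of this post instead multiplies the two predicted probabilities. Below is a minimal sketch of that soft product. The class `TinyESMMHead`, the function `esmm_loss`, the tower sizes, and the random toy data are all assumptions made for illustration, not the author's model.

```python
import torch
import torch.nn as nn

class TinyESMMHead(nn.Module):
    """Two small towers; pCTCVR is the product of the two probabilities."""
    def __init__(self, input_dim, hidden=16):
        super().__init__()
        self.ctr_tower = nn.Sequential(
            nn.Linear(input_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1))
        self.cvr_tower = nn.Sequential(
            nn.Linear(input_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, x):
        p_ctr = torch.sigmoid(self.ctr_tower(x))   # P(y=1 | x)
        p_cvr = torch.sigmoid(self.cvr_tower(x))   # P(z=1 | y=1, x)
        return p_ctr, p_ctr * p_cvr                # P(y=1, z=1 | x)

def esmm_loss(p_ctr, p_ctcvr, click, conversion):
    """CTR loss on y plus CTCVR loss on y & z, both over all impressions."""
    bce = nn.BCELoss()
    return bce(p_ctr, click) + bce(p_ctcvr, click * conversion)

torch.manual_seed(0)
x = torch.rand(32, 10)                              # toy features
click = torch.randint(0, 2, (32, 1)).float()        # toy y labels
conversion = torch.randint(0, 2, (32, 1)).float()   # toy z labels
head = TinyESMMHead(input_dim=10)
p_ctr, p_ctcvr = head(x)
loss = esmm_loss(p_ctr, p_ctcvr, click, conversion)
loss.backward()   # both towers receive gradients from the CTCVR term
print(loss.item())
```

With this form the CVR tower is never trained on clicked samples alone, which is the point of modeling over the entire impression space.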
Data augmentation and using a validation set
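Before the full listing, here is a minimal sketch of holding out a validation set once, before training starts, so that the same rows are used for validation in every epoch. The helper name `split_train_val`, its `seed` argument, and the toy tensors are assumptions for illustration only.

```python
import torch

def split_train_val(x, y, z, train_frac=0.8, seed=1234):
    """Shuffle row indices once and cut them into a train part and a validation part."""
    gen = torch.Generator().manual_seed(seed)
    perm = torch.randperm(x.shape[0], generator=gen)
    cut = int(train_frac * x.shape[0])
    tr, va = perm[:cut], perm[cut:]
    return (x[tr], y[tr], z[tr]), (x[va], y[va], z[va])

# toy data: row i of x is [2i, 2i+1] and its label is i, so alignment is easy to check
x = torch.arange(20, dtype=torch.float32).reshape(10, 2)
y = torch.arange(10, dtype=torch.float32).reshape(-1, 1)
z = y.clone()
(train_x, train_y, train_z), (val_x, val_y, val_z) = split_train_val(x, y, z)
print(train_x.shape, val_x.shape)
```

Indexing all three tensors with the same permutation keeps features and both labels aligned row by row.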
import torch
import torch.nn as nn
import numpy as np
import time
import pandas as pd
import random
from adultutils import data_preparation
np.random.seed(1234)
torch.manual_seed(1234)
class Model(torch.nn.Module):
def __init__(self, user_feature_dict, item_feature_dict,ctr_hid_layers,cvr_hid_layers,dtype,k,embed):
super(Model, self).__init__()  # the input dimension depends on embed; only hidden and output layers are configured here
self.user_feature_dict = user_feature_dict
self.item_feature_dict = item_feature_dict
self.user_embed_layers = dict()
self.item_embed_layers = dict()
user_cate_feature_nums, item_cate_feature_nums = 0, 0
for cate,num in self.user_feature_dict.items():
if num[0] > 1:  # a feature with more than one distinct value is embedded
user_cate_feature_nums += 1
for cate,num in self.item_feature_dict.items():
if num[0] > 1:
item_cate_feature_nums += 1
#note: some features are embedded, so the number of input nodes depends on how many features are embedded
#user_cate_feature_nums counts the embedded user features so that the input dimension can be computed below
input_dim = embed * (user_cate_feature_nums + item_cate_feature_nums) + \
(len(user_feature_dict) - user_cate_feature_nums) + (len(item_feature_dict) - item_cate_feature_nums)
#-------ctr DNN
self.ctr_layers = [input_dim] + ctr_hid_layers
self.ctr_layers_hid_num = len(self.ctr_layers)-2
ctr_fc = []
for i in range(self.ctr_layers_hid_num+1):
ctr_fc.append(torch.nn.Linear(self.ctr_layers[i],self.ctr_layers[i+1]))
self.ctr_fc = torch.nn.Sequential(*ctr_fc)
for i in range(self.ctr_layers_hid_num+1):
self.ctr_fc[i].weight.data = self.ctr_fc[i].weight.data.type(dtype)
self.ctr_fc[i].bias.data = self.ctr_fc[i].bias.data.type(dtype)
#--------cvr DNN
self.cvr_layers = [input_dim] + cvr_hid_layers
self.cvr_layers_hid_num = len(self.cvr_layers)-2
cvr_fc = []
for i in range(self.cvr_layers_hid_num+1):
cvr_fc.append(torch.nn.Linear(self.cvr_layers[i],self.cvr_layers[i+1]))
self.cvr_fc = torch.nn.Sequential(*cvr_fc)
for i in range(self.cvr_layers_hid_num+1):
self.cvr_fc[i].weight.data = self.cvr_fc[i].weight.data.type(dtype)
self.cvr_fc[i].bias.data = self.cvr_fc[i].bias.data.type(dtype)
#-------------FM
self.w = torch.nn.Linear(input_dim,2)
fm = [];fm.append(self.w)
self.fm = torch.nn.Sequential(*fm)
self.fm[0].weight.data = self.fm[0].weight.data.type(dtype)
self.fm[0].bias.data = self.fm[0].bias.data.type(dtype)
self.v = torch.nn.Parameter(torch.FloatTensor(torch.rand(input_dim,k)), requires_grad=True)
self.v = self.v.type(dtype)
#----------
def FM(self,x):
linear_part = self.fm[0](x)
inner_part1 = torch.pow(x,2)@torch.pow(self.v,2)
inner_part2 = torch.pow((x@self.v),2)
inner = 0.5*(inner_part2 - inner_part1).sum(axis = 1,keepdims = True)
inner_part = torch.cat([inner,inner],axis = 1)
return linear_part + inner_part
def reg(self,x):
tmp1 = 1/(1 + torch.exp(x[:,1:2] - x[:,0:1]))
tmp2 = 1/(1 + torch.exp(x[:,0:1] - x[:,1:2]))
return torch.cat([tmp1,tmp2],dim = 1)
def CTR(self,x):
#x = nn.BatchNorm1d(x.shape[0])(x)
linear = self.FM(x)
for i in range(self.ctr_layers_hid_num):
x = torch.relu(self.ctr_fc[i](x))#.to(device)
temp = torch.eye(x.shape[-1],self.ctr_layers[i+1])
x = x + x@temp
x = nn.BatchNorm1d(self.ctr_layers[i + 1])(x)
#x = nn.Dropout(self.p)(x)
return (self.ctr_fc[-1](x) + linear)
def CVR(self,x):
#x = nn.BatchNorm1d(x.shape[0])(x)
for i in range(self.cvr_layers_hid_num):
x = torch.relu(self.cvr_fc[i](x))#.to(device)
temp = torch.eye(x.shape[-1],self.cvr_layers[i+1])
x = x + x@temp
x = nn.BatchNorm1d(self.cvr_layers[i + 1])(x)
#x = nn.Dropout(self.p)(x)
return (torch.sigmoid(self.cvr_fc[-1](x)) + 0)
def forward(self,x):
p_ctr = self.CTR(x)
p_cvr = self.CVR(x)
p_ctcvr = (p_ctr.argmax(dim = 1).reshape(-1,1))*p_cvr
return p_ctr,p_ctcvr
def total_para(self):  # count the total number of parameters
return sum([x.numel() for x in self.parameters()])
class Data(torch.nn.Module):
def __init__(self,user_feature_dict, item_feature_dict,embed):
super(Data,self).__init__()
self.user_feature_dict = user_feature_dict
self.item_feature_dict = item_feature_dict
self.user_embed_layers = dict()
self.item_embed_layers = dict()
user_cate_feature_nums, item_cate_feature_nums = 0, 0
for cate,num in self.user_feature_dict.items():
if num[0] > 1:  # embed the feature if it has more than one distinct value
user_cate_feature_nums += 1
self.user_embed_layers['user_embed_' + str(num[1])] = torch.nn.Embedding(num[0], embed)
for cate,num in self.item_feature_dict.items():
if num[0] > 1:
item_cate_feature_nums += 1
self.item_embed_layers['item_embed_' + str(num[1])] = torch.nn.Embedding(num[0], embed)
def deal(self,x):
user_list = list()
for cate,num in self.user_feature_dict.items():
if num[0] > 1:
user_list.append(self.user_embed_layers['user_embed_{}'.format(num[1])](x[:,num[1]].long()))
else:
user_list.append(x[:,num[1]].unsqueeze(1))
item_list = list()
for cate,num in self.item_feature_dict.items():
if num[0] > 1:
item_list.append(self.item_embed_layers['item_embed_{}'.format(num[1])](x[:,num[1]].long()))
else:
item_list.append(x[:,num[1]].unsqueeze(1))
user_embed = torch.cat(user_list, axis=1)
item_embed = torch.cat(item_list, axis=1)
x = torch.cat([user_embed, item_embed], axis=1)
return x
def shullfe(x,y,z):
    # draw one random permutation and apply it to features and both labels
    index = random.sample(range(0,x.shape[0]),x.shape[0])
    x = x[index,:]
    y = y[index,:]
    z = z[index,:]
    return x,y,z
def val(x,y,z,p = 0.8):#p>0.5
    # shuffle, then keep the first fraction p for training and the rest for validation
    x,y,z = shullfe(x,y,z)
    ind = int(p*x.shape[0])
    x_train,x_val = x[:ind,:],x[ind:,:]
    y_train,y_val = y[:ind,:],y[ind:,:]
    z_train,z_val = z[:ind,:],z[ind:,:]
    return (x_train,y_train,z_train),(x_val,y_val,z_val)
def pred(model,x):
    p_ctr,p_ctcvr = model.forward(x)
    ctr_pre = (p_ctr.argmax(dim = 1)).reshape(-1,1)
    # threshold the CTCVR probability at 0.5 in one vectorized step
    ctcvr_pre = (p_ctcvr > 0.5).long().reshape(-1,1)
    return ctr_pre,ctcvr_pre
def Loss(model,x,click,conversion):
criteon = nn.CrossEntropyLoss()
p_ctr,p_ctcvr = model.forward(x)
ctr_loss = criteon(p_ctr,click.squeeze().type(torch.long))
loss_fn = nn.BCELoss()
ctcvr_loss = loss_fn(p_ctcvr,conversion)
total_loss = ctr_loss + ctcvr_loss
return total_loss
def loadtype(X_train,X_test,y_train,y_test,z_train,z_test,dtype):
    # convert the NumPy arrays to tensors; labels become column vectors of shape (N, 1)
    X_train = torch.tensor(X_train).type(dtype)
    X_test = torch.tensor(X_test).type(dtype)
    y_train = torch.tensor(y_train).type(dtype).reshape(-1,1)
    y_test = torch.tensor(y_test).type(dtype).reshape(-1,1)
    z_train = torch.tensor(z_train).type(dtype).reshape(-1,1)
    z_test = torch.tensor(z_test).type(dtype).reshape(-1,1)
    # a sample without a click (y == 0) cannot count as converted, so force z to 0 there
    z_train[y_train == 0] = 0
    z_test[y_test == 0] = 0
    return (X_train, X_test),(y_train, y_test),(z_train, z_test)
def Train(model,X_train,click_train,conversion_train,optim,optimtype,epoch,record,batch):
    loss = Loss(model,X_train,click_train,conversion_train)
    print('start train:the loss:%.4e'%(loss.item()))
    # split once before training; splitting inside the epoch loop would shrink the training set by 20% per epoch
    (X_train,click_train,conversion_train),(X_val,y_val,z_val) = val(X_train,click_train,conversion_train,p = 0.8)
    iteration = (X_train.shape[0] + batch - 1)//(batch)
    for i in range(epoch):
        # reshuffle only the training part at the start of each epoch
        X_train,click_train,conversion_train = shullfe(X_train,click_train,conversion_train)
        for j in range(iteration):
            inds = j*batch
            inde = min((j + 1)*batch, X_train.shape[0])
            x = X_train[inds:inde]
            clk = click_train[inds:inde]
            cov = conversion_train[inds:inde]
            # `optimtype == 'LBFGS' or 'BFGS'` is always true; test membership instead
            if optimtype in ('LBFGS', 'BFGS'):
                # L-BFGS re-evaluates the objective, so the closure must rebuild the loss on every call
                def closure():
                    optim.zero_grad()
                    loss = Loss(model,x,clk,cov)
                    loss.backward()
                    return loss
                optim.step(closure)
            else:
                # take `record` gradient steps on the current mini-batch
                for _ in range(record):
                    optim.zero_grad()
                    loss = Loss(model,x,clk,cov)
                    loss.backward()
                    optim.step()
            loss = Loss(model,x,clk,cov)
            print('the %d epoch,the %d iter,the loss:%.4e'%(i + 1,j + 1,loss.item()))
        loss = Loss(model,X_train,click_train,conversion_train)
        #-----------------train
        ctr_pre,ctcvr_pre = pred(model,X_train)
        ctr_acc = ctr_pre.eq(click_train).sum()/click_train.shape[0]
        ctcvr_acc = ctcvr_pre.eq(conversion_train).sum()/conversion_train.shape[0]
        #-------------------------val
        ctr_val,ctcvr_val = pred(model,X_val)
        ctr_val_acc = ctr_val.eq(y_val).sum()/X_val.shape[0]
        ctcvr_val_acc = ctcvr_val.eq(z_val).sum()/X_val.shape[0]
        print('the %d epoch:the loss:%.4e,the ctr acc:%.2e,ctcvr acc:%.2e,ctr val:%.2e,ctcvr val:%.2e'
              %(i + 1,loss.item(),ctr_acc,ctcvr_acc,ctr_val_acc,ctcvr_val_acc))
st = time.time()
embed = 8
p = 10
ctr_hid_layers = [64,64,2]
cvr_hid_layers = [128,64,1]
dtype = torch.float32
train_data, test_data, user_feature_dict, item_feature_dict = data_preparation()
#print(user_feature_dict, item_feature_dict)  # inspect which user and item features are present
#print(train_data.shape)#[32561,15]
train_dataset = (train_data.iloc[:, :-2].values, train_data.iloc[:, -2].values, train_data.iloc[:, -1].values)
test_dataset = (test_data.iloc[:, :-2].values, test_data.iloc[:, -2].values, test_data.iloc[:, -1].values)
#the last two columns of train_data are the labels: whether income exceeds 50K, and marital status
X_train,y_train,z_train = train_dataset[0],train_dataset[1],train_dataset[2]
X_test,y_test,z_test = test_dataset[0],test_dataset[1],test_dataset[2]
#print(X_train[:2,:5])
(X_train, X_test),(y_train, y_test),(z_train,z_test) = loadtype(X_train,X_test,y_train,y_test,z_train,z_test,dtype)
data = Data(user_feature_dict, item_feature_dict,embed)
X_train = data.deal(X_train).data
X_test = data.deal(X_test).data
tmp = nn.BatchNorm1d(X_train.shape[1])(X_train)
#X_train = torch.cat([X_train,tmp]).data
#y_train = torch.cat([y_train,y_train]).data
#z_train = torch.cat([z_train,z_train]).data
#(X_train,y_train,z_train),(X_val,y_val,z_val) = val(X_train,y_train,z_train,p = 0.8)
model = Model(user_feature_dict, item_feature_dict,ctr_hid_layers,cvr_hid_layers,dtype,p,embed)
epoch = 10
batch = 1000
record = 2
lr = 1e-2
optimtype = 'SGD'
#optimtype = 'Adam'
#optimtype = 'LBFGS'
if optimtype == 'SGD':
optim = torch.optim.SGD(model.parameters(),lr = lr)
elif optimtype == 'Adam':
optim = torch.optim.Adam(model.parameters(),lr = lr)
elif optimtype == 'LBFGS':
optim = torch.optim.LBFGS(model.parameters(),lr = lr,max_iter = record,
tolerance_grad=1e-16, tolerance_change=1e-16,
line_search_fn='strong_wolfe')
Train(model,X_train,y_train,z_train,optim,optimtype,epoch,record,batch)
ctr_pre,ctcvr_pre = pred(model,X_test)
ctr_acc = ctr_pre.eq(y_test).sum()/y_test.shape[0]
ctcvr_acc = ctcvr_pre.eq(z_test).sum()/z_test.shape[0]
ela = time.time() - st
num = model.total_para()
print('time:%.2f,num:%d,the pre ctr acc:%.2e,pre ctcvr acc:%.2e'%(ela,num,ctr_acc,ctcvr_acc))
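In the listings above, `nn.BatchNorm1d(...)` is constructed inside `CTR` and `CVR` on every forward pass, so its scale and shift are never trained and no running statistics exist for evaluation. One possible alternative, sketched below, is to create the normalization layers once in `__init__` and switch between `train()` and `eval()`. The class `Tower` and the layer sizes are assumptions for illustration and do not reproduce the author's architecture (the FM term and the doubling step are left out).

```python
import torch
import torch.nn as nn

class Tower(nn.Module):
    """MLP tower whose BatchNorm layers are registered submodules."""
    def __init__(self, layers):
        super().__init__()
        # one Linear per consecutive pair of sizes, e.g. [10, 8, 4, 2] -> 3 layers
        self.fc = nn.ModuleList(
            [nn.Linear(a, b) for a, b in zip(layers[:-1], layers[1:])])
        # one BatchNorm per hidden layer (not for the output layer)
        self.bn = nn.ModuleList([nn.BatchNorm1d(b) for b in layers[1:-1]])

    def forward(self, x):
        for fc, bn in zip(self.fc[:-1], self.bn):
            x = bn(torch.relu(fc(x)))
        return self.fc[-1](x)

torch.manual_seed(0)
tower = Tower([10, 8, 4, 2])
# scale and shift of each BatchNorm now show up in parameters(): 2 * (8 + 4)
bn_param_count = sum(p.numel() for p in tower.bn.parameters())

tower.train()
_ = tower(torch.rand(16, 10))        # updates the running mean and variance

tower.eval()                          # use the stored statistics from here on
with torch.no_grad():
    single = tower(torch.rand(1, 10)) # a batch of one works in eval mode
print(bn_param_count, single.shape)
```

Because the layers are registered, `model.parameters()` passes them to the optimizer, and predictions at test time no longer depend on the composition of the test batch.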