PyTorch implementation of the ESMM model for CTR and CVR in recommendation algorithms


Introduction to the ESMM model

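CTR and CVR
Source: https://zhuanlan.zhihu.com/p/57481330
Click-through rate (CTR): suppose a user searches for an item and the platform shows 2000 results, of which the user clicks 300. The click-through rate of the ads for that item is then 300/2000.
Conversion rate (CVR): suppose a user searches for an item, the platform shows 2000 results, the user clicks 300 of them, and 50 of those clicks end with the user activating or becoming a paying user, so the conversion rate over the clicked results is 50/300. Ads that the user did not click also have to be considered. An ad may go unclicked because its image or theme is poor, even though it matches the user's interests and needs very well; had the user clicked it, the user would very likely have become a paying user. Such ads must be taken into account when computing CVR, so unclicked ads cannot simply all be labeled as unconverted.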
Let $\mathbf{x}$ denote the input features of the model. The event $y$ takes only the values 0 or 1: $y=0$ means the user did not click the ad and $y=1$ means the user clicked it. The event $z$ also takes only the values 0 or 1: $z=0$ means the ad is not converted and $z=1$ means the ad is converted.
$$P(z=1, y=1 \mid \mathbf{x}) = P(z=1 \mid y=1, \mathbf{x})\, P(y=1 \mid \mathbf{x})$$
The conversion rate is $P(z=1 \mid y=1, \mathbf{x})$ and the click-through rate is $P(y=1 \mid \mathbf{x})$.
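As a quick numeric check of these definitions, the sketch below reuses the counts from the example (2000 impressions, 300 clicks, 50 conversions) and treats them as naive empirical estimates. The function name `rates` is only illustrative and is not part of the original code.

```python
from fractions import Fraction

def rates(impressions, clicks, conversions):
    """Return (CTR, CVR, CTCVR) as exact fractions."""
    ctr = Fraction(clicks, impressions)         # estimate of P(y=1 | x)
    cvr = Fraction(conversions, clicks)         # estimate of P(z=1 | y=1, x)
    ctcvr = Fraction(conversions, impressions)  # estimate of P(z=1, y=1 | x)
    return ctr, cvr, ctcvr

ctr, cvr, ctcvr = rates(2000, 300, 50)
print(float(ctr), float(cvr), float(ctcvr))
# The factorization pCTCVR = pCVR * pCTR holds exactly for these counts.
assert ctcvr == cvr * ctr
```

Exact fractions are used so that the product identity can be checked without floating-point tolerance.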

The objective function to optimize is shown below, where $\theta_{cvr}$ and $\theta_{ctr}$ denote the network parameters of the CVR part and the CTR part, respectively.
$$L(\theta_{cvr}, \theta_{ctr}) = \sum_{i=1}^{N} l\left(y_i, f(\mathbf{x}_i; \theta_{ctr})\right) + \sum_{i=1}^{N} l\left(y_i \,\&\, z_i, f(\mathbf{x}_i; \theta_{ctr}) \times f(\mathbf{x}_i; \theta_{cvr})\right)$$
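To make the two sums in the objective concrete, here is a framework-free sketch that takes $l$ to be binary cross-entropy. The helper names `bce` and `esmm_loss`, the clamping constant `eps`, and the sample numbers are assumptions made for illustration only.

```python
import math

def bce(label, prob, eps=1e-12):
    """Binary cross-entropy for one sample; eps guards against log(0)."""
    prob = min(max(prob, eps), 1.0 - eps)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))

def esmm_loss(clicks, conversions, p_ctr, p_cvr):
    """Sum of the CTR loss and the CTCVR loss over all samples."""
    total = 0.0
    for y, z, pc, pv in zip(clicks, conversions, p_ctr, p_cvr):
        total += bce(y, pc)           # l(y_i, f(x_i; theta_ctr))
        total += bce(y & z, pc * pv)  # l(y_i & z_i, f_ctr * f_cvr)
    return total

# Made-up labels and predictions for three samples.
print(esmm_loss([1, 1, 0], [1, 0, 0], [0.9, 0.8, 0.1], [0.7, 0.2, 0.5]))
```

Note that the CVR output never receives a label of its own: it is trained only through the product with the CTR output. Also, `nn.BCELoss()` averages over the batch by default, so it differs from this sum by a factor of 1/N per term.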
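The adult dataset is used again here. As mentioned previously, it has two labels, income level and marital status. To test the ESMM model, the dataset is preprocessed one step further: if the income level is below 50k, the marital status in the original data is relabeled as unmarried (on the assumption that someone earning less than 50k lacks the material means to marry). For ESMM prediction, a sample whose income label is 1 is then treated as a click, and a sample whose marital-status label is 1 is treated as a click followed by a conversion.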
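The relabeling described above amounts to a logical AND of the two labels. A minimal sketch, assuming both labels are already encoded as 0/1 integers; the function name `make_esmm_labels` is hypothetical:

```python
def make_esmm_labels(income_gt_50k, married):
    """Map the two adult labels to (click, conversion) label lists.

    click = income label; conversion = 1 only when both labels are 1.
    """
    clicks = list(income_gt_50k)
    conversions = [y & z for y, z in zip(income_gt_50k, married)]
    return clicks, conversions

clicks, conversions = make_esmm_labels([1, 1, 0, 0], [1, 0, 1, 0])
print(clicks, conversions)  # [1, 1, 0, 0] [1, 0, 0, 0]
```

The third sample shows the effect of the rule: a married person with income below 50k is treated as neither clicked nor converted.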

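Following the definition of the ESMM model, we build two separate networks, one for CTR and one for CVR, and the model returns the CTR and CTCVR probabilities (CTCVR being the product of the two network outputs). Each network has a single output node whose value must lie between 0 and 1 so that it can be read as a probability. The loss function is PyTorch's built-in nn.BCELoss().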

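ESMM implementations found online add batchnorm and dropout to the DNN when building the network. Below, I give numerical experiments for ESMM with an added FM structure and for ESMM with Dropout removed.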
The dataset comes from https://github.com/busesese/MultiTaskModel
adultutils.py




import pandas as pd
from sklearn.preprocessing import LabelEncoder, MinMaxScaler
from sklearn.model_selection import train_test_split
from torch.utils.data import Dataset, DataLoader


# data process
def data_preparation():
    # The column names are from
    column_names = ['age', 'workclass', 'fnlwgt', 'education', 'education_num', 'marital_status', 'occupation',
                    'relationship', 'race', 'sex', 'capital_gain', 'capital_loss', 'hours_per_week', 'native_country',
                    'income_50k']
    
    # Load the dataset in Pandas
    train_df = pd.read_csv(
        'C:\\Users\\2001213226\\Desktop\\icbc\\MultiTaskModel-main\\data\\adult.data',
        delimiter=',',
        header=None,
        index_col=None,
        names=column_names
    )
    other_df = pd.read_csv(
        'C:\\Users\\2001213226\\Desktop\\icbc\\MultiTaskModel-main\\data\\adult.test',
        delimiter=',',
        header=None,
        index_col=None,
        names=column_names
    )
    #print(train_df)#[M,15]
    
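    train_df['tag'] = 1  # extra 'tag' column: 1 marks rows from the training file
    #print(train_df)
    other_df['tag'] = 0  # 0 marks rows from the test file
    #print(other_df)
    other_df.dropna(inplace=True)  # drops rows with missing values (row-wise by default); only one row is removed here
    #print(other_df)
    #print(other_df['income_50k'])
    # labels in adult.test end with a '.', e.g. ' <=50K.'; strip it so they match adult.data
    other_df['income_50k'] = other_df['income_50k'].apply(lambda x: x[:-1])
    #print(other_df['income_50k'])
    data = pd.concat([train_df, other_df])  # concatenates row-wise by default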
    #print(train_df.shape,other_df.shape,data.shape)
    data.dropna(inplace=True)
    #print(data.shape)
    # First group of tasks according to the paper
    label_columns = ['income_50k', 'marital_status']  # income and marital status serve as the labels
    
    # categorical columns
    categorical_columns = ['workclass', 'education', 'occupation', 'relationship', 'race', 'sex', 'native_country']
    for col in label_columns:
        if col == 'income_50k':
            data[col] = data[col].apply(lambda x: 0 if x == ' <=50K' else 1)
        else:
            data[col] = data[col].apply(lambda x: 0 if x == ' Never-married' else 1)
            
    # feature engine
    for col in column_names:
        if col not in label_columns + ['tag']:
            if col in categorical_columns:
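                le = LabelEncoder()  # maps each category to an integer id, e.g. cat, dog, rabbit become 0, 1, 2
                data[col] = le.fit_transform(data[col])  # encode categories as consecutive integers
            else:
                mm = MinMaxScaler()  # rescale numeric values to [0, 1]; results may be fractional
                data[col] = mm.fit_transform(data[[col]]).reshape(-1)  # min-max normalization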
    data = data[['age', 'workclass', 'fnlwgt', 'education', 'education_num', 'occupation',
                 'relationship', 'race', 'sex', 'capital_gain', 'capital_loss', 'hours_per_week', 'native_country',
                 'income_50k', 'marital_status', 'tag']]
    
    # user feature, item feature
    user_feature_dict, item_feature_dict = dict(), dict()
    for idx, col in enumerate(data.columns):
        if col not in label_columns + ['tag']:
            if idx < 7:
                if col in categorical_columns:
                    user_feature_dict[col] = (len(data[col].unique())+1, idx)  # (number of distinct values + 1, column index)
                    #if idx == 5:
                    #    print(user_feature_dict[col])
                else:
                    user_feature_dict[col] = (1, idx)
            else:
                if col in categorical_columns:
                    item_feature_dict[col] = (len(data[col].unique())+1, idx)
                else:
                    item_feature_dict[col] = (1, idx)
    
    # Split the other dataset into 1:1 validation to test according to the paper
    train_data, test_data = data[data['tag'] == 1], data[data['tag'] == 0]
    # drop() returns a new frame, which avoids in-place edits on a slice of `data`
    train_data = train_data.drop('tag', axis=1)
    test_data = test_data.drop('tag', axis=1)
    
    # val data
    # train_data, val_data = train_test_split(train_data, test_size=0.5, random_state=2021)
    return train_data, test_data, user_feature_dict, item_feature_dict
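The feature dictionaries returned by `data_preparation` map each column name to a pair `(size, column_index)`, where `size` is 1 for numeric columns and the vocabulary size for categorical ones. The sketch below shows how a network input width can be derived from them. The helper name `input_dim` and the small dictionaries are made up for illustration.

```python
def input_dim(user_feature_dict, item_feature_dict, embed):
    """Embedded features contribute `embed` inputs each, numeric ones contribute 1."""
    features = list(user_feature_dict.values()) + list(item_feature_dict.values())
    n_categorical = sum(1 for size, _ in features if size > 1)
    n_numeric = len(features) - n_categorical
    return embed * n_categorical + n_numeric

# Hypothetical dictionaries: two categorical and two numeric features.
user = {'age': (1, 0), 'workclass': (10, 1)}
item = {'sex': (3, 8), 'capital_gain': (1, 9)}
print(input_dim(user, item, embed=8))  # 18
```

With the 13 feature columns kept by `data_preparation` (7 categorical, 6 numeric) and an embedding size of 8, the same formula gives 8 * 7 + 6 = 62 inputs.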



ESMM.py

import torch
import torch.nn as nn
import numpy as np
import time
import pandas as pd
from adultutils import data_preparation

class Model(torch.nn.Module):
    def __init__(self, user_feature_dict, item_feature_dict,ctr_hid_layers,cvr_hid_layers,dtype,p,embed):
        super(Model, self).__init__()  # the input dimension depends on embed; only hidden and output layers are passed in
        self.user_feature_dict = user_feature_dict
        self.item_feature_dict = item_feature_dict
        # ModuleDict (rather than a plain dict) registers the embeddings, so that
        # model.parameters() includes them and the optimizer updates them
        self.user_embed_layers = torch.nn.ModuleDict()
        self.item_embed_layers = torch.nn.ModuleDict()
        user_cate_feature_nums, item_cate_feature_nums = 0, 0
        for cate,num in self.user_feature_dict.items():
            if num[0] > 1:  # more than one distinct value: categorical feature, use an embedding
                user_cate_feature_nums += 1
                self.user_embed_layers['user_embed_' + str(num[1])] = torch.nn.Embedding(num[0], embed)
        for cate,num in self.item_feature_dict.items():
            if num[0] > 1:
                item_cate_feature_nums += 1
                self.item_embed_layers['item_embed_' + str(num[1])] = torch.nn.Embedding(num[0], embed)
        # Some features are embedded, so the number of network inputs depends on how many features are embedded.
        # user_cate_feature_nums counts the embedded user features so that the input dimension can be computed below.
        input_dim = embed * (user_cate_feature_nums + item_cate_feature_nums) + \
                      (len(user_feature_dict) - user_cate_feature_nums) + (len(item_feature_dict) - item_cate_feature_nums)
        #-------ctr DNN
        self.ctr_layers = [input_dim] + ctr_hid_layers
        self.ctr_layers_hid_num = len(self.ctr_layers)-2
        ctr_fc = []
        for i in range(self.ctr_layers_hid_num+1):
            ctr_fc.append(torch.nn.Linear(self.ctr_layers[i],self.ctr_layers[i+1]))
        self.ctr_fc = torch.nn.Sequential(*ctr_fc)
        for i in range(self.ctr_layers_hid_num+1):
            self.ctr_fc[i].weight.data = self.ctr_fc[i].weight.data.type(dtype)
            self.ctr_fc[i].bias.data = self.ctr_fc[i].bias.data.type(dtype)
        #--------cvr DNN
        self.cvr_layers = [input_dim] + cvr_hid_layers
        self.cvr_layers_hid_num = len(self.cvr_layers)-2
        cvr_fc = []
        for i in range(self.cvr_layers_hid_num+1):
            cvr_fc.append(torch.nn.Linear(self.cvr_layers[i],self.cvr_layers[i+1]))
        self.cvr_fc = torch.nn.Sequential(*cvr_fc)
        for i in range(self.cvr_layers_hid_num+1):
            self.cvr_fc[i].weight.data = self.cvr_fc[i].weight.data.type(dtype)
            self.cvr_fc[i].bias.data = self.cvr_fc[i].bias.data.type(dtype)
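        #-------------BatchNorm / Dropout
        # Register the layers once so that the BatchNorm parameters are trained and
        # model.train()/model.eval() control BatchNorm and Dropout behavior.
        self.p = p
        self.ctr_bn = torch.nn.ModuleList([nn.BatchNorm1d(n) for n in self.ctr_layers[1:-1]])
        self.cvr_bn = torch.nn.ModuleList([nn.BatchNorm1d(n) for n in self.cvr_layers[1:-1]])
        self.dropout = nn.Dropout(self.p)
        #----------
    
    def CTR(self,x):
        # CTR tower: Linear -> ReLU -> BatchNorm -> Dropout per hidden layer, sigmoid output
        for i in range(self.ctr_layers_hid_num):
            x = torch.relu(self.ctr_fc[i](x))#.to(device)
            x = self.ctr_bn[i](x)
            x = self.dropout(x)
        return torch.sigmoid((self.ctr_fc[-1](x) ))
    def CVR(self,x):
        # CVR tower: same structure as the CTR tower
        for i in range(self.cvr_layers_hid_num):
            x = torch.relu(self.cvr_fc[i](x))#.to(device)
            x = self.cvr_bn[i](x)
            x = self.dropout(x)
        return torch.sigmoid((self.cvr_fc[-1](x) ))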
    def forward(self,x):
        user_list = list()
        for cate,num in self.user_feature_dict.items():
            if num[0] > 1:
                user_list.append(self.user_embed_layers['user_embed_{}'.format(num[1])](x[:,num[1]].long()))
            else:
                user_list.append(x[:,num[1]].unsqueeze(1))
        item_list = list()
        for cate,num in self.item_feature_dict.items():
            if num[0] > 1:
                item_list.append(self.item_embed_layers['item_embed_{}'.format(num[1])](x[:,num[1]].long()))
            else:
                item_list.append(x[:,num[1]].unsqueeze(1))
        user_embed = torch.cat(user_list, axis=1)
        item_embed = torch.cat(item_list, axis=1)
        x = torch.cat([user_embed, item_embed], axis=1)

        p_ctr = self.CTR(x)
        p_cvr = self.CVR(x)
        p_ctcvr = p_ctr*p_cvr
        return p_ctr,p_ctcvr
    
    def total_para(self):  # count the number of parameters
        return sum([x.numel() for x in self.parameters()])  

def Loss(model,x,click,conversion):
    p_ctr,p_ctcvr = model.forward(x)
    loss_fn = nn.BCELoss()
    
    #print(click.shape,p_ctr.shape)
    ctr_loss = loss_fn(p_ctr, click)
    ctcvr_loss = loss_fn(p_ctcvr, conversion)
    total_loss = ctr_loss + ctcvr_loss
    return total_loss
def loadtype(X_train,X_test,y_train,y_test,z_train,z_test,dtype):
    X_train = torch.tensor(X_train).type(dtype)
    X_test = torch.tensor(X_test).type(dtype)
    y_train = torch.tensor(y_train).type(dtype).reshape(-1,1)
    y_test = torch.tensor(y_test).type(dtype).reshape(-1,1)
    z_train = torch.tensor(z_train).type(dtype).reshape(-1,1)
    z_test = torch.tensor(z_test).type(dtype).reshape(-1,1)
    for i in range(y_train.shape[0]):
        if y_train[i,0] == 0:
            z_train[i,0] = 0
    for i in range(y_test.shape[0]):
        if y_test[i,0] == 0:
            z_test[i,0] = 0
    return (X_train, X_test),(y_train, y_test),(z_train, z_test)
def Train(model,X_train,click_train,conversion_train,optim,optimtype,epoch,record,batch):
    iteration = (X_train.shape[0] + batch - 1)//(batch)
    loss = Loss(model,X_train,click_train,conversion_train)
    print('start train:the loss:%.4e'%(loss.item()))
    for i in range(epoch):
        
        for j in range(iteration):
            inds = j*batch
            if (j + 1)*batch <= X_train.shape[0]:
                inde = (j + 1)*batch
            else:
                inde = X_train.shape[0]
                
            x = X_train[inds:inde]
            clk = click_train[inds:inde]
            cov = conversion_train[inds:inde]
            loss = Loss(model,x,clk,cov)
            #print(loss.item())
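            if optimtype in ('LBFGS', 'BFGS'):
                # LBFGS may evaluate the objective several times, so the closure rebuilds the loss on each call
                def closure():
                    optim.zero_grad()
                    closure_loss = Loss(model,x,clk,cov)
                    closure_loss.backward()
                    return closure_loss
                optim.step(closure)
            else:
                for _ in range(record):  # take `record` gradient steps on this mini-batch
                    optim.zero_grad()
                    loss = Loss(model,x,clk,cov)
                    #print(loss)
                    loss.backward()
                    optim.step()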
            loss = Loss(model,x,clk,cov)
            print('the %d epoch,the %d iter,the loss:%.4e'%(i + 1,j + 1,loss.item()))
        
        loss = Loss(model,X_train,click_train,conversion_train)
        p_ctr,p_ctcvr = model.forward(X_train)
        ctr_pre = [1 if x > 0.5 else 0 for x in p_ctr]
        ctr_pre = torch.tensor(ctr_pre).reshape(-1,1)
        ctcvr_pre = [1 if x > 0.5 else 0 for x in p_ctcvr]
        ctcvr_pre = torch.tensor(ctcvr_pre).reshape(-1,1)
        ctr_acc = ctr_pre.eq(click_train).sum()/click_train.shape[0]
        ctcvr_acc = ctcvr_pre.eq(conversion_train).sum()/conversion_train.shape[0]
        print('the %d epoch:the loss:%.4e,the ctr acc:%.2e,ctcvr acc:%.2e'%(i + 1,loss.item(),ctr_acc,ctcvr_acc))
embed = 8
p = 0.5
ctr_hid_layers = [256,64,1]
cvr_hid_layers = [256,64,1]
dtype = torch.float32

train_data, test_data, user_feature_dict, item_feature_dict = data_preparation()
#print(user_feature_dict, item_feature_dict)  # inspect which user and item features are present
#print(train_data.shape)#[32561,15]
train_dataset = (train_data.iloc[:, :-2].values, train_data.iloc[:, -2].values, train_data.iloc[:, -1].values)
test_dataset = (test_data.iloc[:, :-2].values, test_data.iloc[:, -2].values, test_data.iloc[:, -1].values)
# the last two columns of train_data are the labels: income above 50K, and marital status
X_train,y_train,z_train = train_dataset[0],train_dataset[1],train_dataset[2]
X_test,y_test,z_test = test_dataset[0],test_dataset[1],test_dataset[2]
#print(X_train[:2,:5])
model = Model(user_feature_dict, item_feature_dict,ctr_hid_layers,cvr_hid_layers,dtype,p,embed)
(X_train, X_test),(y_train, y_test),(z_train,z_test) = loadtype(X_train,X_test,y_train,y_test,z_train,z_test,dtype)


epoch = 5
batch = 1000
record = 2
lr = 1e-2
optimtype = 'SGD'
#optimtype = 'Adam'
#optimtype = 'LBFGS'

if optimtype == 'SGD':
    optim = torch.optim.SGD(model.parameters(),lr = lr)
elif optimtype == 'Adam':
    optim = torch.optim.Adam(model.parameters(),lr = lr)
elif optimtype == 'LBFGS':
    optim = torch.optim.LBFGS(model.parameters(),lr = lr,max_iter = record,
                              tolerance_grad=1e-16, tolerance_change=1e-16,
                              line_search_fn='strong_wolfe')
         
Train(model,X_train,y_train,z_train,optim,optimtype,epoch,record,batch)                 

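model.eval()  # inference mode for any registered BatchNorm/Dropout layers
with torch.no_grad():  # no gradients are needed for evaluation
    p_ctr,p_ctcvr = model.forward(X_test)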
print(p_ctr[:3],p_ctcvr[:3])
ctr_pre = [1 if x > 0.5 else 0 for x in p_ctr]
ctr_pre = torch.tensor(ctr_pre).reshape(-1,1)
ctcvr_pre = [1 if x > 0.5 else 0 for x in p_ctcvr]
ctcvr_pre = torch.tensor(ctcvr_pre).reshape(-1,1)
ctr_acc = ctr_pre.eq(y_test).sum()/y_test.shape[0]
ctcvr_acc = ctcvr_pre.eq(z_test).sum()/z_test.shape[0]
print('the pre ctr acc:%.2e,pre ctcvr acc:%.2e'%(ctr_acc,ctcvr_acc))
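The `Train` function walks through the training set in consecutive slices of size `batch`, with a shorter final slice. The sketch below isolates that index arithmetic. The generator name `batch_bounds` is an illustrative assumption; 32561 is the training-set size noted in the script's comments.

```python
def batch_bounds(n_samples, batch):
    """Yield (start, end) index pairs that cover range(n_samples)."""
    iterations = (n_samples + batch - 1) // batch  # ceiling division
    for j in range(iterations):
        start = j * batch
        end = min((j + 1) * batch, n_samples)  # the last batch may be shorter
        yield start, end

bounds = list(batch_bounds(32561, 1000))
print(len(bounds), bounds[-1])  # 33 (32000, 32561)
```

Since the slices always follow the stored row order, shuffling the rows before each epoch would be a natural extension, although the script above does not do so.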



ESMM+FM.py

import torch
import torch.nn as nn
import numpy as np
import time
import pandas as pd
from adultutils import data_preparation

class Model(torch.nn.Module):
    def __init__(self, user_feature_dict, item_feature_dict,ctr_hid_layers,cvr_hid_layers,dtype,k,embed):
        super(Model, self).__init__()  # the input dimension depends on embed; only hidden and output layers are passed in
        self.user_feature_dict = user_feature_dict
        self.item_feature_dict = item_feature_dict
        # ModuleDict (rather than a plain dict) registers the embeddings, so that
        # model.parameters() includes them and the optimizer updates them
        self.user_embed_layers = torch.nn.ModuleDict()
        self.item_embed_layers = torch.nn.ModuleDict()
        user_cate_feature_nums, item_cate_feature_nums = 0, 0
        for cate,num in self.user_feature_dict.items():
            if num[0] > 1:  # more than one distinct value: categorical feature, use an embedding
                user_cate_feature_nums += 1
                self.user_embed_layers['user_embed_' + str(num[1])] = torch.nn.Embedding(num[0], embed)
        for cate,num in self.item_feature_dict.items():
            if num[0] > 1:
                item_cate_feature_nums += 1
                self.item_embed_layers['item_embed_' + str(num[1])] = torch.nn.Embedding(num[0], embed)
        # Some features are embedded, so the number of network inputs depends on how many features are embedded.
        # user_cate_feature_nums counts the embedded user features so that the input dimension can be computed below.
        input_dim = embed * (user_cate_feature_nums + item_cate_feature_nums) + \
                      (len(user_feature_dict) - user_cate_feature_nums) + (len(item_feature_dict) - item_cate_feature_nums)
        #-------ctr DNN
        self.ctr_layers = [input_dim] + ctr_hid_layers
        self.ctr_layers_hid_num = len(self.ctr_layers)-2
        ctr_fc = []
        for i in range(self.ctr_layers_hid_num+1):
            ctr_fc.append(torch.nn.Linear(self.ctr_layers[i],self.ctr_layers[i+1]))
        self.ctr_fc = torch.nn.Sequential(*ctr_fc)
        for i in range(self.ctr_layers_hid_num+1):
            self.ctr_fc[i].weight.data = self.ctr_fc[i].weight.data.type(dtype)
            self.ctr_fc[i].bias.data = self.ctr_fc[i].bias.data.type(dtype)
        #--------cvr DNN
        self.cvr_layers = [input_dim] + cvr_hid_layers
        self.cvr_layers_hid_num = len(self.cvr_layers)-2
        cvr_fc = []
        for i in range(self.cvr_layers_hid_num+1):
            cvr_fc.append(torch.nn.Linear(self.cvr_layers[i],self.cvr_layers[i+1]))
        self.cvr_fc = torch.nn.Sequential(*cvr_fc)
        for i in range(self.cvr_layers_hid_num+1):
            self.cvr_fc[i].weight.data = self.cvr_fc[i].weight.data.type(dtype)
            self.cvr_fc[i].bias.data = self.cvr_fc[i].bias.data.type(dtype)
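        #-------------FM
        self.w = torch.nn.Linear(input_dim,1)
        fm = [];fm.append(self.w)
        self.fm = torch.nn.Sequential(*fm)
        self.fm[0].weight.data = self.fm[0].weight.data.type(dtype)
        self.fm[0].bias.data = self.fm[0].bias.data.type(dtype)
        # latent factor matrix V of shape [input_dim, k], created directly in the target dtype
        self.v = torch.nn.Parameter(torch.rand(input_dim, k, dtype=dtype), requires_grad=True)
        #-------------BatchNorm
        # registered once so that the BatchNorm parameters are trained and model.eval() takes effect
        self.ctr_bn = torch.nn.ModuleList([nn.BatchNorm1d(n) for n in self.ctr_layers[1:-1]])
        self.cvr_bn = torch.nn.ModuleList([nn.BatchNorm1d(n) for n in self.cvr_layers[1:-1]])
        #----------
    def FM(self,x):
        # first-order term plus pairwise interactions, using the identity
        # sum_{i<j} <v_i, v_j> x_i x_j = 0.5 * sum_f [(x V)_f^2 - (x^2 V^2)_f]
        linear_part = self.fm[0](x)
        inner_part1 = torch.pow(x,2)@torch.pow(self.v,2)
        inner_part2 = torch.pow((x@self.v),2)
        inner_part = 0.5*(inner_part2 - inner_part1).sum(dim = 1,keepdim = True)
        
        return linear_part + inner_part

    def CTR(self,x):
        # CTR tower: average of the DNN logit and the FM logit, passed through a sigmoid
        linear = self.FM(x)
        for i in range(self.ctr_layers_hid_num):
            x = torch.relu(self.ctr_fc[i](x))#.to(device)
            x = self.ctr_bn[i](x)
        
        return torch.sigmoid(0.5*(self.ctr_fc[-1](x) + linear))
    def CVR(self,x):
        # CVR tower: same structure as the CTR tower
        linear = self.FM(x)
        for i in range(self.cvr_layers_hid_num):
            x = torch.relu(self.cvr_fc[i](x))#.to(device)
            x = self.cvr_bn[i](x)
        return torch.sigmoid(0.5*(self.cvr_fc[-1](x) + linear))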
    def forward(self,x):
        user_list = list()
        for cate,num in self.user_feature_dict.items():
            if num[0] > 1:
                user_list.append(self.user_embed_layers['user_embed_{}'.format(num[1])](x[:,num[1]].long()))
            else:
                user_list.append(x[:,num[1]].unsqueeze(1))
        item_list = list()
        for cate,num in self.item_feature_dict.items():
            if num[0] > 1:
                item_list.append(self.item_embed_layers['item_embed_{}'.format(num[1])](x[:,num[1]].long()))
            else:
                item_list.append(x[:,num[1]].unsqueeze(1))
        user_embed = torch.cat(user_list, axis=1)
        item_embed = torch.cat(item_list, axis=1)
        x = torch.cat([user_embed, item_embed], axis=1)

        p_ctr = self.CTR(x)
        p_cvr = self.CVR(x)
        p_ctcvr = p_ctr*p_cvr
        return p_ctr,p_ctcvr
    
    def total_para(self):  # count the number of parameters
        return sum([x.numel() for x in self.parameters()])  

def Loss(model,x,click,conversion):
    p_ctr,p_ctcvr = model.forward(x)
    loss_fn = nn.BCELoss()
    
    #print(click.shape,p_ctr.shape)
    ctr_loss = loss_fn(p_ctr, click)
    ctcvr_loss = loss_fn(p_ctcvr, conversion)
    total_loss = ctr_loss + ctcvr_loss
    return total_loss
def loadtype(X_train,X_test,y_train,y_test,z_train,z_test,dtype):
    X_train = torch.tensor(X_train).type(dtype)
    X_test = torch.tensor(X_test).type(dtype)
    y_train = torch.tensor(y_train).type(dtype).reshape(-1,1)
    y_test = torch.tensor(y_test).type(dtype).reshape(-1,1)
    z_train = torch.tensor(z_train).type(dtype).reshape(-1,1)
    z_test = torch.tensor(z_test).type(dtype).reshape(-1,1)
    for i in range(y_train.shape[0]):
        if y_train[i,0] == 0:
            z_train[i,0] = 0
    for i in range(y_test.shape[0]):
        if y_test[i,0] == 0:
            z_test[i,0] = 0
    return (X_train, X_test),(y_train, y_test),(z_train, z_test)
def Train(model,X_train,click_train,conversion_train,optim,optimtype,epoch,record,batch):
    iteration = (X_train.shape[0] + batch - 1)//(batch)
    loss = Loss(model,X_train,click_train,conversion_train)
    print('start train:the loss:%.4e'%(loss.item()))
    for i in range(epoch):
        
        for j in range(iteration):
            inds = j*batch
            if (j + 1)*batch <= X_train.shape[0]:
                inde = (j + 1)*batch
            else:
                inde = X_train.shape[0]
                
            x = X_train[inds:inde]
            clk = click_train[inds:inde]
            cov = conversion_train[inds:inde]
            loss = Loss(model,x,clk,cov)
            #print(loss.item())
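            if optimtype in ('LBFGS', 'BFGS'):
                # LBFGS may evaluate the objective several times, so the closure rebuilds the loss on each call
                def closure():
                    optim.zero_grad()
                    closure_loss = Loss(model,x,clk,cov)
                    closure_loss.backward()
                    return closure_loss
                optim.step(closure)
            else:
                for _ in range(record):  # take `record` gradient steps on this mini-batch
                    optim.zero_grad()
                    loss = Loss(model,x,clk,cov)
                    #print(loss)
                    loss.backward()
                    optim.step()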
            loss = Loss(model,x,clk,cov)
            print('the %d epoch,the %d iter,the loss:%.4e'%(i + 1,j + 1,loss.item()))
        
        loss = Loss(model,X_train,click_train,conversion_train)
        p_ctr,p_ctcvr = model.forward(X_train)
        ctr_pre = [1 if x > 0.5 else 0 for x in p_ctr]
        ctr_pre = torch.tensor(ctr_pre).reshape(-1,1)
        ctcvr_pre = [1 if x > 0.5 else 0 for x in p_ctcvr]
        ctcvr_pre = torch.tensor(ctcvr_pre).reshape(-1,1)
        ctr_acc = ctr_pre.eq(click_train).sum()/click_train.shape[0]
        ctcvr_acc = ctcvr_pre.eq(conversion_train).sum()/conversion_train.shape[0]
        print('the %d epoch:the loss:%.4e,the ctr acc:%.2e,ctcvr acc:%.2e'%(i + 1,loss.item(),ctr_acc,ctcvr_acc))
embed = 8
k = 10
ctr_hid_layers = [256,64,1]
cvr_hid_layers = [256,64,1]
dtype = torch.float32

train_data, test_data, user_feature_dict, item_feature_dict = data_preparation()
#print(user_feature_dict, item_feature_dict)  # inspect which user and item features are present
#print(train_data.shape)#[32561,15]
train_dataset = (train_data.iloc[:, :-2].values, train_data.iloc[:, -2].values, train_data.iloc[:, -1].values)
test_dataset = (test_data.iloc[:, :-2].values, test_data.iloc[:, -2].values, test_data.iloc[:, -1].values)
# the last two columns of train_data are the labels: income above 50K, and marital status
X_train,y_train,z_train = train_dataset[0],train_dataset[1],train_dataset[2]
X_test,y_test,z_test = test_dataset[0],test_dataset[1],test_dataset[2]
#print(X_train[:2,:5])
model = Model(user_feature_dict, item_feature_dict,ctr_hid_layers,cvr_hid_layers,dtype,k,embed)
(X_train, X_test),(y_train, y_test),(z_train,z_test) = loadtype(X_train,X_test,y_train,y_test,z_train,z_test,dtype)


epoch = 5
batch = 1000
record = 2
lr = 1e-2
optimtype = 'SGD'
#optimtype = 'Adam'
#optimtype = 'LBFGS'

if optimtype == 'SGD':
    optim = torch.optim.SGD(model.parameters(),lr = lr)
elif optimtype == 'Adam':
    optim = torch.optim.Adam(model.parameters(),lr = lr)
elif optimtype == 'LBFGS':
    optim = torch.optim.LBFGS(model.parameters(),lr = lr,max_iter = record,
                              tolerance_grad=1e-16, tolerance_change=1e-16,
                              line_search_fn='strong_wolfe')
         
Train(model,X_train,y_train,z_train,optim,optimtype,epoch,record,batch)                 

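model.eval()  # inference mode for any registered BatchNorm/Dropout layers
with torch.no_grad():  # no gradients are needed for evaluation
    p_ctr,p_ctcvr = model.forward(X_test)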
print(p_ctr[:3],p_ctcvr[:3])
ctr_pre = [1 if x > 0.5 else 0 for x in p_ctr]
ctr_pre = torch.tensor(ctr_pre).reshape(-1,1)
ctcvr_pre = [1 if x > 0.5 else 0 for x in p_ctcvr]
ctcvr_pre = torch.tensor(ctcvr_pre).reshape(-1,1)
ctr_acc = ctr_pre.eq(y_test).sum()/y_test.shape[0]
ctcvr_acc = ctcvr_pre.eq(z_test).sum()/z_test.shape[0]
print('the pre ctr acc:%.2e,pre ctcvr acc:%.2e'%(ctr_acc,ctcvr_acc))
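The `FM` method computes the second-order term with the usual "square of sum minus sum of squares" rewrite rather than a double loop over feature pairs. The framework-free sketch below checks that identity on a tiny example. The function names and the numbers in `x` and `v` are made up for illustration.

```python
def fm_pairwise_naive(x, v):
    """Sum over i<j of <v_i, v_j> * x_i * x_j, with a double loop."""
    n = len(x)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            dot = sum(a * b for a, b in zip(v[i], v[j]))
            total += dot * x[i] * x[j]
    return total

def fm_pairwise_fast(x, v):
    """0.5 * sum_f [(sum_i v_if x_i)^2 - sum_i (v_if x_i)^2]."""
    k = len(v[0])
    total = 0.0
    for f in range(k):
        s = sum(v[i][f] * x[i] for i in range(len(x)))
        s2 = sum((v[i][f] * x[i]) ** 2 for i in range(len(x)))
        total += 0.5 * (s * s - s2)
    return total

x = [1.0, 2.0, 3.0]                        # one sample with three inputs
v = [[1.0, 0.5], [0.2, 0.3], [0.4, 0.1]]   # latent factors, k = 2
print(fm_pairwise_naive(x, v), fm_pairwise_fast(x, v))
```

Both functions return the same value up to floating-point rounding; the fast form costs O(n k) instead of O(n^2 k), which is why the matrix version in `FM` needs only two matrix products.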



ESMM-dropout.py

import torch
import torch.nn as nn
import numpy as np
import time
import pandas as pd
from adultutils import data_preparation

class Model(torch.nn.Module):
    def __init__(self, user_feature_dict, item_feature_dict,ctr_hid_layers,cvr_hid_layers,dtype,embed):
        super(Model, self).__init__()  # the input dimension depends on embed; only hidden and output layers are passed in
        self.user_feature_dict = user_feature_dict
        self.item_feature_dict = item_feature_dict
        # ModuleDict (rather than a plain dict) registers the embeddings, so that
        # model.parameters() includes them and the optimizer updates them
        self.user_embed_layers = torch.nn.ModuleDict()
        self.item_embed_layers = torch.nn.ModuleDict()
        user_cate_feature_nums, item_cate_feature_nums = 0, 0
        for cate,num in self.user_feature_dict.items():
            if num[0] > 1:  # more than one distinct value: categorical feature, use an embedding
                user_cate_feature_nums += 1
                self.user_embed_layers['user_embed_' + str(num[1])] = torch.nn.Embedding(num[0], embed)
        for cate,num in self.item_feature_dict.items():
            if num[0] > 1:
                item_cate_feature_nums += 1
                self.item_embed_layers['item_embed_' + str(num[1])] = torch.nn.Embedding(num[0], embed)
        # Some features are embedded, so the number of network inputs depends on how many features are embedded.
        # user_cate_feature_nums counts the embedded user features so that the input dimension can be computed below.
        input_dim = embed * (user_cate_feature_nums + item_cate_feature_nums) + \
                      (len(user_feature_dict) - user_cate_feature_nums) + (len(item_feature_dict) - item_cate_feature_nums)
        #-------ctr DNN
        self.ctr_layers = [input_dim] + ctr_hid_layers
        self.ctr_layers_hid_num = len(self.ctr_layers)-2
        ctr_fc = []
        for i in range(self.ctr_layers_hid_num+1):
            ctr_fc.append(torch.nn.Linear(self.ctr_layers[i],self.ctr_layers[i+1]))
        self.ctr_fc = torch.nn.Sequential(*ctr_fc)
        for i in range(self.ctr_layers_hid_num+1):
            self.ctr_fc[i].weight.data = self.ctr_fc[i].weight.data.type(dtype)
            self.ctr_fc[i].bias.data = self.ctr_fc[i].bias.data.type(dtype)
        #--------cvr DNN
        self.cvr_layers = [input_dim] + cvr_hid_layers
        self.cvr_layers_hid_num = len(self.cvr_layers)-2
        cvr_fc = []
        for i in range(self.cvr_layers_hid_num+1):
            cvr_fc.append(torch.nn.Linear(self.cvr_layers[i],self.cvr_layers[i+1]))
        self.cvr_fc = torch.nn.Sequential(*cvr_fc)
        for i in range(self.cvr_layers_hid_num+1):
            self.cvr_fc[i].weight.data = self.cvr_fc[i].weight.data.type(dtype)
            self.cvr_fc[i].bias.data = self.cvr_fc[i].bias.data.type(dtype)
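        #-------------BatchNorm
        # registered once so that the BatchNorm parameters are trained and model.eval() takes effect
        self.ctr_bn = torch.nn.ModuleList([nn.BatchNorm1d(n) for n in self.ctr_layers[1:-1]])
        self.cvr_bn = torch.nn.ModuleList([nn.BatchNorm1d(n) for n in self.cvr_layers[1:-1]])
        #----------
    def CTR(self,x):
        # CTR tower without Dropout: Linear -> ReLU -> BatchNorm per hidden layer, sigmoid output
        for i in range(self.ctr_layers_hid_num):
            x = torch.relu(self.ctr_fc[i](x))#.to(device)
            x = self.ctr_bn[i](x)
        return torch.sigmoid(self.ctr_fc[-1](x))
    def CVR(self,x):
        # CVR tower: same structure as the CTR tower
        for i in range(self.cvr_layers_hid_num):
            x = torch.relu(self.cvr_fc[i](x))#.to(device)
            x = self.cvr_bn[i](x)
        return torch.sigmoid(self.cvr_fc[-1](x))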
    def forward(self,x):
        user_list = list()
        for cate,num in self.user_feature_dict.items():
            if num[0] > 1:
                user_list.append(self.user_embed_layers['user_embed_{}'.format(num[1])](x[:,num[1]].long()))
            else:
                user_list.append(x[:,num[1]].unsqueeze(1))
        item_list = list()
        for cate,num in self.item_feature_dict.items():
            if num[0] > 1:
                item_list.append(self.item_embed_layers['item_embed_{}'.format(num[1])](x[:,num[1]].long()))
            else:
                item_list.append(x[:,num[1]].unsqueeze(1))
        user_embed = torch.cat(user_list, axis=1)
        item_embed = torch.cat(item_list, axis=1)
        x = torch.cat([user_embed, item_embed], axis=1)

        p_ctr = self.CTR(x)
        p_cvr = self.CVR(x)
        p_ctcvr = p_ctr*p_cvr
        return p_ctr,p_ctcvr
    
    def total_para(self):  # count the number of parameters
        return sum([x.numel() for x in self.parameters()])  

def Loss(model,x,click,conversion):
    p_ctr,p_ctcvr = model.forward(x)
    loss_fn = nn.BCELoss()
    
    #print(click.shape,p_ctr.shape)
    ctr_loss = loss_fn(p_ctr, click)
    ctcvr_loss = loss_fn(p_ctcvr, conversion)
    total_loss = ctr_loss + ctcvr_loss
    return total_loss
def loadtype(X_train,X_test,y_train,y_test,z_train,z_test,dtype):
    X_train = torch.tensor(X_train).type(dtype)
    X_test = torch.tensor(X_test).type(dtype)
    y_train = torch.tensor(y_train).type(dtype).reshape(-1,1)
    y_test = torch.tensor(y_test).type(dtype).reshape(-1,1)
    z_train = torch.tensor(z_train).type(dtype).reshape(-1,1)
    z_test = torch.tensor(z_test).type(dtype).reshape(-1,1)
    for i in range(y_train.shape[0]):
        if y_train[i,0] == 0:
            z_train[i,0] = 0
    for i in range(y_test.shape[0]):
        if y_test[i,0] == 0:
            z_test[i,0] = 0
    return (X_train, X_test),(y_train, y_test),(z_train, z_test)
def Train(model,X_train,click_train,conversion_train,optim,optimtype,epoch,record,batch):
    iteration = (X_train.shape[0] + batch - 1)//(batch)
    loss = Loss(model,X_train,click_train,conversion_train)
    print('start train:the loss:%.4e'%(loss.item()))
    for i in range(epoch):
        
        for j in range(iteration):
            inds = j*batch
            if (j + 1)*batch <= X_train.shape[0]:
                inde = (j + 1)*batch
            else:
                inde = X_train.shape[0]
                
            x = X_train[inds:inde]
            clk = click_train[inds:inde]
            cov = conversion_train[inds:inde]
            loss = Loss(model,x,clk,cov)
            #print(loss.item())
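            if optimtype in ('LBFGS', 'BFGS'):
                # LBFGS may evaluate the objective several times, so the closure rebuilds the loss on each call
                def closure():
                    optim.zero_grad()
                    closure_loss = Loss(model,x,clk,cov)
                    closure_loss.backward()
                    return closure_loss
                optim.step(closure)
            else:
                for _ in range(record):  # take `record` gradient steps on this mini-batch
                    optim.zero_grad()
                    loss = Loss(model,x,clk,cov)
                    #print(loss)
                    loss.backward()
                    optim.step()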
            loss = Loss(model,x,clk,cov)
            print('the %d epoch,the %d iter,the loss:%.4e'%(i + 1,j + 1,loss.item()))
        
        loss = Loss(model,X_train,click_train,conversion_train)
        p_ctr,p_ctcvr = model.forward(X_train)
        ctr_pre = [1 if x > 0.5 else 0 for x in p_ctr]
        ctr_pre = torch.tensor(ctr_pre).reshape(-1,1)
        ctcvr_pre = [1 if x > 0.5 else 0 for x in p_ctcvr]
        ctcvr_pre = torch.tensor(ctcvr_pre).reshape(-1,1)
        ctr_acc = ctr_pre.eq(click_train).sum()/click_train.shape[0]
        ctcvr_acc = ctcvr_pre.eq(conversion_train).sum()/conversion_train.shape[0]
        print('the %d epoch:the loss:%.4e,the ctr acc:%.2e,ctcvr acc:%.2e'%(i + 1,loss.item(),ctr_acc,ctcvr_acc))
embed = 8
k = 10
ctr_hid_layers = [256,64,1]
cvr_hid_layers = [256,64,1]
dtype = torch.float32

train_data, test_data, user_feature_dict, item_feature_dict = data_preparation()
#print(user_feature_dict, item_feature_dict)  # inspect which user and item features are present
#print(train_data.shape)#[32561,15]
train_dataset = (train_data.iloc[:, :-2].values, train_data.iloc[:, -2].values, train_data.iloc[:, -1].values)
test_dataset = (test_data.iloc[:, :-2].values, test_data.iloc[:, -2].values, test_data.iloc[:, -1].values)
# The last two columns of train_data are the labels: whether income exceeds 50K, and marital status
X_train,y_train,z_train = train_dataset[0],train_dataset[1],train_dataset[2]
X_test,y_test,z_test = test_dataset[0],test_dataset[1],test_dataset[2]
#print(X_train[:2,:5])
model = Model(user_feature_dict, item_feature_dict,ctr_hid_layers,cvr_hid_layers,dtype,embed)
(X_train, X_test),(y_train, y_test),(z_train,z_test) = loadtype(X_train,X_test,y_train,y_test,z_train,z_test,dtype)


epoch = 5
batch = 1000
record = 2
lr = 1e-2
optimtype = 'SGD'
#optimtype = 'Adam'
#optimtype = 'LBFGS'

if optimtype == 'SGD':
    optim = torch.optim.SGD(model.parameters(),lr = lr)
elif optimtype == 'Adam':
    optim = torch.optim.Adam(model.parameters(),lr = lr)
elif optimtype == 'LBFGS':
    optim = torch.optim.LBFGS(model.parameters(),lr = lr,max_iter = record,
                              tolerance_grad=1e-16, tolerance_change=1e-16,
                              line_search_fn='strong_wolfe')
         
Train(model,X_train,y_train,z_train,optim,optimtype,epoch,record,batch)                 

p_ctr,p_ctcvr = model.forward(X_test)
ctr_pre = [1 if x > 0.5 else 0 for x in p_ctr]
ctr_pre = torch.tensor(ctr_pre).reshape(-1,1)
ctcvr_pre = [1 if x > 0.5 else 0 for x in p_ctcvr]
ctcvr_pre = torch.tensor(ctcvr_pre).reshape(-1,1)
ctr_acc = ctr_pre.eq(y_test).sum()/y_test.shape[0]
ctcvr_acc = ctcvr_pre.eq(z_test).sum()/z_test.shape[0]
print('the pre ctr acc:%.2e,pre ctcvr acc:%.2e'%(ctr_acc,ctcvr_acc))
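
The listing above fits two outputs, pCTR and pCTCVR, against the click label and the click-and-convert label. As a framework-free illustration of that objective, the sketch below evaluates the ESMM loss from the introduction, l(y, pCTR) + l(y & z, pCTR * pCVR), using binary cross-entropy on plain Python lists. The helper names `bce` and `esmm_loss` and the toy numbers are assumptions made for this example only; they are not part of the listing.

```python
import math

def bce(target, prob, eps=1e-12):
    """Binary cross-entropy for one sample; eps guards against log(0)."""
    prob = min(max(prob, eps), 1.0 - eps)
    return -(target * math.log(prob) + (1 - target) * math.log(1.0 - prob))

def esmm_loss(clicks, conversions, p_ctr, p_cvr):
    """Mean over samples of BCE(y, pCTR) + BCE(y & z, pCTR * pCVR)."""
    total = 0.0
    for y, z, ctr, cvr in zip(clicks, conversions, p_ctr, p_cvr):
        p_ctcvr = ctr * cvr  # P(click and conversion | x)
        total += bce(y, ctr) + bce(y & z, p_ctcvr)
    return total / len(clicks)

# Toy labels: sample 0 clicked and converted, sample 1 only clicked, sample 2 did neither.
clicks = [1, 1, 0]
conversions = [1, 0, 0]
good = esmm_loss(clicks, conversions, [0.9, 0.8, 0.1], [0.9, 0.1, 0.5])
bad = esmm_loss(clicks, conversions, [0.1, 0.2, 0.9], [0.1, 0.9, 0.5])
print(good < bad)
```

Predictions that agree with both labels give the smaller loss, which is the behaviour the two-term objective is meant to reward.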



Cross-entropy ESMM + FM

better

import torch
import torch.nn as nn
import numpy as np
import time
import pandas as pd
from adultutils import data_preparation
np.random.seed(1234)
torch.manual_seed(1234)

class Model(torch.nn.Module):
    def __init__(self, user_feature_dict, item_feature_dict,ctr_hid_layers,cvr_hid_layers,dtype,k,embed):
        super(Model, self).__init__()# the input dimension depends on embed; only the hidden and output layers are configured here
        self.user_feature_dict = user_feature_dict
        self.item_feature_dict = item_feature_dict
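        # nn.ModuleDict (rather than a plain dict) registers the embeddings so model.parameters() includes them
        self.user_embed_layers = torch.nn.ModuleDict()
        self.item_embed_layers = torch.nn.ModuleDict()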
        user_cate_feature_nums, item_cate_feature_nums = 0, 0
        for cate,num in self.user_feature_dict.items():
            if num[0] > 1:# embed the feature if it has more than one distinct value
                user_cate_feature_nums += 1
                self.user_embed_layers['user_embed_' + str(num[1])] = torch.nn.Embedding(num[0], embed)
        for cate,num in self.item_feature_dict.items():
            if num[0] > 1:
                item_cate_feature_nums += 1
                self.item_embed_layers['item_embed_' + str(num[1])] = torch.nn.Embedding(num[0], embed)
        # Note: some features are embedded, so the network's input width depends on how many features are embedded
        # user_cate_feature_nums above counts the embedded user features so the input dimension can be computed below
        input_dim = embed * (user_cate_feature_nums + item_cate_feature_nums) + \
                      (len(user_feature_dict) - user_cate_feature_nums) + (len(item_feature_dict) - item_cate_feature_nums)
        #-------ctr DNN
        self.ctr_layers = [input_dim] + ctr_hid_layers
        self.ctr_layers_hid_num = len(self.ctr_layers)-2
        ctr_fc = []
        for i in range(self.ctr_layers_hid_num+1):
            ctr_fc.append(torch.nn.Linear(self.ctr_layers[i],self.ctr_layers[i+1]))
        self.ctr_fc = torch.nn.Sequential(*ctr_fc)
        for i in range(self.ctr_layers_hid_num+1):
            self.ctr_fc[i].weight.data = self.ctr_fc[i].weight.data.type(dtype)
            self.ctr_fc[i].bias.data = self.ctr_fc[i].bias.data.type(dtype)
        #--------cvr DNN
        self.cvr_layers = [input_dim] + cvr_hid_layers
        self.cvr_layers_hid_num = len(self.cvr_layers)-2
        cvr_fc = []
        for i in range(self.cvr_layers_hid_num+1):
            cvr_fc.append(torch.nn.Linear(self.cvr_layers[i],self.cvr_layers[i+1]))
        self.cvr_fc = torch.nn.Sequential(*cvr_fc)
        for i in range(self.cvr_layers_hid_num+1):
            self.cvr_fc[i].weight.data = self.cvr_fc[i].weight.data.type(dtype)
            self.cvr_fc[i].bias.data = self.cvr_fc[i].bias.data.type(dtype)
        #-------------FM
        self.w = torch.nn.Linear(input_dim,2)
        fm = [];fm.append(self.w)
        self.fm = torch.nn.Sequential(*fm)
        self.fm[0].weight.data = self.fm[0].weight.data.type(dtype)
        self.fm[0].bias.data = self.fm[0].bias.data.type(dtype)
        # Create v directly in the target dtype so it stays a registered nn.Parameter
        self.v = torch.nn.Parameter(torch.rand(input_dim,k,dtype=dtype), requires_grad=True)
        
        
        #----------
    def reg(self,x):
        tmp1 = 1/(1 + torch.exp(x[:,1:2] - x[:,0:1]))
        tmp2 = 1/(1 + torch.exp(x[:,0:1] - x[:,1:2]))
        return torch.cat([tmp1,tmp2],dim = 1)
    def FM(self,x):
        linear_part = self.fm[0](x)
        inner_part1 = torch.pow(x,2)@torch.pow(self.v,2)
        inner_part2 = torch.pow((x@self.v),2)
        inner = 0.5*(inner_part2 - inner_part1).sum(axis = 1,keepdims = True)
        inner_part = torch.cat([inner,inner],axis = 1)
        
        return linear_part + inner_part
        
    
    def CTR(self,x):
        #x = nn.BatchNorm1d(x.shape[0])(x)
        linear = self.FM(x)
        for i in range(self.ctr_layers_hid_num):
            x = torch.relu(self.ctr_fc[i](x))#.to(device)
            temp = torch.eye(x.shape[-1],self.ctr_layers[i+1])
            x = x + x@temp
            x = nn.BatchNorm1d(self.ctr_layers[i + 1])(x)
            #x = nn.Dropout(self.p)(x)
        return (self.ctr_fc[-1](x) + linear)
    def CVR(self,x):
        linear = self.FM(x)
        for i in range(self.cvr_layers_hid_num):
            x = torch.relu(self.cvr_fc[i](x))#.to(device)
            temp = torch.eye(x.shape[-1],self.cvr_layers[i+1])
            x = x + x@temp
            x = nn.BatchNorm1d(self.cvr_layers[i + 1])(x)
            #x = nn.Dropout(self.p)(x)
        return torch.sigmoid(self.cvr_fc[-1](x) + linear)
    def forward(self,x):
        user_list = list()
        for cate,num in self.user_feature_dict.items():
            if num[0] > 1:
                user_list.append(self.user_embed_layers['user_embed_{}'.format(num[1])](x[:,num[1]].long()))
            else:
                user_list.append(x[:,num[1]].unsqueeze(1))
        item_list = list()
        for cate,num in self.item_feature_dict.items():
            if num[0] > 1:
                item_list.append(self.item_embed_layers['item_embed_{}'.format(num[1])](x[:,num[1]].long()))
            else:
                item_list.append(x[:,num[1]].unsqueeze(1))
        user_embed = torch.cat(user_list, axis=1)
        item_embed = torch.cat(item_list, axis=1)
        x = torch.cat([user_embed, item_embed], axis=1)

        p_ctr = self.CTR(x)
        p_cvr = self.CVR(x)
        ind = p_ctr.argmax(dim = 1)

        p_ctcvr = p_cvr*(torch.eye(2)[ind])
        return p_ctr,p_ctcvr
    
    def total_para(self):# count the total number of parameters
        return sum([x.numel() for x in self.parameters()])  
def pred(model,x):
    p_ctr,p_ctcvr = model.forward(x)
    ctr_pre = (p_ctr.argmax(dim = 1)).reshape(-1,1)
    
    ctcvr_pre = (p_ctcvr.argmax(dim = 1)).reshape(-1,1)
    return ctr_pre,ctcvr_pre 
def Loss(model,x,click,conversion):
    criteon = nn.CrossEntropyLoss()
    p_ctr,p_ctcvr = model.forward(x)
    
    ctr_loss = criteon(p_ctr,click.squeeze().type(torch.long))
    loss_fn = nn.CrossEntropyLoss()
    ctcvr_loss = loss_fn(p_ctcvr,conversion.squeeze().type(torch.long))
    
    total_loss = ctr_loss + ctcvr_loss
    return total_loss

def loadtype(X_train,X_test,y_train,y_test,z_train,z_test,dtype):
    X_train = torch.tensor(X_train).type(dtype)
    X_test = torch.tensor(X_test).type(dtype)
    y_train = torch.tensor(y_train).type(dtype).reshape(-1,1)
    y_test = torch.tensor(y_test).type(dtype).reshape(-1,1)
    z_train = torch.tensor(z_train).type(dtype).reshape(-1,1)
    z_test = torch.tensor(z_test).type(dtype).reshape(-1,1)
    for i in range(y_train.shape[0]):
        if y_train[i,0] == 0:
            z_train[i,0] = 0
    for i in range(y_test.shape[0]):
        if y_test[i,0] == 0:
            z_test[i,0] = 0
    return (X_train, X_test),(y_train, y_test),(z_train, z_test)
def Train(model,X_train,click_train,conversion_train,optim,optimtype,epoch,record,batch):
    iteration = (X_train.shape[0] + batch - 1)//(batch)
    loss = Loss(model,X_train,click_train,conversion_train)
    print('start train:the loss:%.4e'%(loss.item()))
    for i in range(epoch):
        
        for j in range(iteration):
            inds = j*batch
            if (j + 1)*batch <= X_train.shape[0]:
                inde = (j + 1)*batch
            else:
                inde = X_train.shape[0]
                
            x = X_train[inds:inde]
            clk = click_train[inds:inde]
            cov = conversion_train[inds:inde]
            loss = Loss(model,x,clk,cov)
            #print(loss.item())
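            if optimtype in ('LBFGS','BFGS'):
                def closure():
                    # L-BFGS may call the closure several times per step, so it must
                    # recompute the loss and backpropagate that fresh value
                    optim.zero_grad()
                    lloss = Loss(model,x,clk,cov)
                    lloss.backward()
                    return lloss
                optim.step(closure)
            else:
                for _ in range(record):# "_" avoids overwriting the batch index j
                    optim.zero_grad()
                    loss = Loss(model,x,clk,cov)
                    #print(loss)
                    loss.backward()
                    optim.step()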
            loss = Loss(model,x,clk,cov)
            print('the %d epoch,the %d iter,the loss:%.4e'%(i + 1,j + 1,loss.item()))
        
        loss = Loss(model,X_train,click_train,conversion_train)
        ctr_pre,ctcvr_pre = pred(model,X_train)
        #print(ctr_pre.shape,click_train.shape)
        ctr_acc = ctr_pre.eq(click_train).sum()/click_train.shape[0]
        ctcvr_acc = ctcvr_pre.eq(conversion_train).sum()/conversion_train.shape[0]
        print('the %d epoch:the loss:%.4e,the ctr acc:%.2e,ctcvr acc:%.2e'%(i + 1,loss.item(),ctr_acc,ctcvr_acc))
embed = 8
p = 10
ctr_hid_layers = [128,64,2]
cvr_hid_layers = [128,64,2]
dtype = torch.float32

train_data, test_data, user_feature_dict, item_feature_dict = data_preparation()
#print(user_feature_dict, item_feature_dict)# inspect which user and item features are present
#print(train_data.shape)#[32561,15]
train_dataset = (train_data.iloc[:, :-2].values, train_data.iloc[:, -2].values, train_data.iloc[:, -1].values)
test_dataset = (test_data.iloc[:, :-2].values, test_data.iloc[:, -2].values, test_data.iloc[:, -1].values)
# The last two columns of train_data are the labels: whether income exceeds 50K, and marital status
X_train,y_train,z_train = train_dataset[0],train_dataset[1],train_dataset[2]
X_test,y_test,z_test = test_dataset[0],test_dataset[1],test_dataset[2]
#print(X_train[:2,:5])
model = Model(user_feature_dict, item_feature_dict,ctr_hid_layers,cvr_hid_layers,dtype,p,embed)
(X_train, X_test),(y_train, y_test),(z_train,z_test) = loadtype(X_train,X_test,y_train,y_test,z_train,z_test,dtype)


epoch = 10
batch = 1000
record = 2
lr = 1e-2
optimtype = 'SGD'
#optimtype = 'Adam'
#optimtype = 'LBFGS'

if optimtype == 'SGD':
    optim = torch.optim.SGD(model.parameters(),lr = lr)
elif optimtype == 'Adam':
    optim = torch.optim.Adam(model.parameters(),lr = lr)
elif optimtype == 'LBFGS':
    optim = torch.optim.LBFGS(model.parameters(),lr = lr,max_iter = record,
                              tolerance_grad=1e-16, tolerance_change=1e-16,
                              line_search_fn='strong_wolfe')
         
Train(model,X_train,y_train,z_train,optim,optimtype,epoch,record,batch)                 

ctr_pre,ctcvr_pre = pred(model,X_test)

ctr_acc = ctr_pre.eq(y_test).sum()/y_test.shape[0]
ctcvr_acc = ctcvr_pre.eq(z_test).sum()/z_test.shape[0]
print('the pre ctr acc:%.2e,pre ctcvr acc:%.2e'%(ctr_acc,ctcvr_acc))
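
The `FM` method above computes the second-order interaction term as `0.5*((x@v)**2 - (x**2)@(v**2))` summed over the factor dimension. The sketch below checks, in plain Python, that this shortcut matches the explicit sum over feature pairs i < j of <v_i, v_j> x_i x_j. The function names and the random test vectors are assumptions for illustration and are not taken from the listing.

```python
import random

def fm_pairwise_naive(x, v):
    """Explicit sum over i < j of <v[i], v[j]> * x[i] * x[j]."""
    n, k = len(x), len(v[0])
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            dot = sum(v[i][f] * v[j][f] for f in range(k))
            total += dot * x[i] * x[j]
    return total

def fm_pairwise_fast(x, v):
    """0.5 * sum_f ((sum_i v[i][f] x[i])^2 - sum_i (v[i][f] x[i])^2)."""
    n, k = len(x), len(v[0])
    total = 0.0
    for f in range(k):
        s = sum(v[i][f] * x[i] for i in range(n))
        s2 = sum((v[i][f] * x[i]) ** 2 for i in range(n))
        total += s * s - s2
    return 0.5 * total

rng = random.Random(0)
x = [rng.uniform(-1, 1) for _ in range(6)]
v = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(6)]
print(abs(fm_pairwise_naive(x, v) - fm_pairwise_fast(x, v)) < 1e-9)
```

The fast form costs O(n*k) per sample instead of O(n^2*k), which is why the listing uses matrix products rather than a double loop.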




best

import torch
import torch.nn as nn
import numpy as np
import time
import pandas as pd
from adultutils import data_preparation
np.random.seed(1234)
torch.manual_seed(1234)
class Model(torch.nn.Module):
    def __init__(self, user_feature_dict, item_feature_dict,ctr_hid_layers,cvr_hid_layers,dtype,k,embed):
        super(Model, self).__init__()# the input dimension depends on embed; only the hidden and output layers are configured here
        self.user_feature_dict = user_feature_dict
        self.item_feature_dict = item_feature_dict
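        # nn.ModuleDict (rather than a plain dict) registers the embeddings so model.parameters() includes them
        self.user_embed_layers = torch.nn.ModuleDict()
        self.item_embed_layers = torch.nn.ModuleDict()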
        user_cate_feature_nums, item_cate_feature_nums = 0, 0
        for cate,num in self.user_feature_dict.items():
            if num[0] > 1:# embed the feature if it has more than one distinct value
                user_cate_feature_nums += 1
                self.user_embed_layers['user_embed_' + str(num[1])] = torch.nn.Embedding(num[0], embed)
        for cate,num in self.item_feature_dict.items():
            if num[0] > 1:
                item_cate_feature_nums += 1
                self.item_embed_layers['item_embed_' + str(num[1])] = torch.nn.Embedding(num[0], embed)
        # Note: some features are embedded, so the network's input width depends on how many features are embedded
        # user_cate_feature_nums above counts the embedded user features so the input dimension can be computed below
        input_dim = embed * (user_cate_feature_nums + item_cate_feature_nums) + \
                      (len(user_feature_dict) - user_cate_feature_nums) + (len(item_feature_dict) - item_cate_feature_nums)
        #-------ctr DNN
        self.ctr_layers = [input_dim] + ctr_hid_layers
        self.ctr_layers_hid_num = len(self.ctr_layers)-2
        ctr_fc = []
        for i in range(self.ctr_layers_hid_num+1):
            ctr_fc.append(torch.nn.Linear(self.ctr_layers[i],self.ctr_layers[i+1]))
        self.ctr_fc = torch.nn.Sequential(*ctr_fc)
        for i in range(self.ctr_layers_hid_num+1):
            self.ctr_fc[i].weight.data = self.ctr_fc[i].weight.data.type(dtype)
            self.ctr_fc[i].bias.data = self.ctr_fc[i].bias.data.type(dtype)
        #--------cvr DNN
        self.cvr_layers = [input_dim] + cvr_hid_layers
        self.cvr_layers_hid_num = len(self.cvr_layers)-2
        cvr_fc = []
        for i in range(self.cvr_layers_hid_num+1):
            cvr_fc.append(torch.nn.Linear(self.cvr_layers[i],self.cvr_layers[i+1]))
        self.cvr_fc = torch.nn.Sequential(*cvr_fc)
        for i in range(self.cvr_layers_hid_num+1):
            self.cvr_fc[i].weight.data = self.cvr_fc[i].weight.data.type(dtype)
            self.cvr_fc[i].bias.data = self.cvr_fc[i].bias.data.type(dtype)
        #-------------FM
        self.w = torch.nn.Linear(input_dim,2)
        fm = [];fm.append(self.w)
        self.fm = torch.nn.Sequential(*fm)
        self.fm[0].weight.data = self.fm[0].weight.data.type(dtype)
        self.fm[0].bias.data = self.fm[0].bias.data.type(dtype)
        # Create v directly in the target dtype so it stays a registered nn.Parameter
        self.v = torch.nn.Parameter(torch.rand(input_dim,k,dtype=dtype), requires_grad=True)
        
        
        #----------
    def FM(self,x):
        linear_part = self.fm[0](x)
        inner_part1 = torch.pow(x,2)@torch.pow(self.v,2)
        inner_part2 = torch.pow((x@self.v),2)
        inner = 0.5*(inner_part2 - inner_part1).sum(axis = 1,keepdims = True)
        inner_part = torch.cat([inner,inner],axis = 1)
        
        return linear_part + inner_part
        
    
    def CTR(self,x):
        #x = nn.BatchNorm1d(x.shape[0])(x)
        linear = self.FM(x)
        for i in range(self.ctr_layers_hid_num):
            x = torch.relu(self.ctr_fc[i](x))#.to(device)
            temp = torch.eye(x.shape[-1],self.ctr_layers[i+1])
            x = x + x@temp
            x = nn.BatchNorm1d(self.ctr_layers[i + 1])(x)
            #x = nn.Dropout(self.p)(x)
        return (self.ctr_fc[-1](x) + linear)
    def CVR(self,x):
        #x = nn.BatchNorm1d(x.shape[0])(x)
        for i in range(self.cvr_layers_hid_num):
            x = torch.relu(self.cvr_fc[i](x))#.to(device)
            temp = torch.eye(x.shape[-1],self.cvr_layers[i+1])
            x = x + x@temp
            x = nn.BatchNorm1d(self.cvr_layers[i + 1])(x)
            #x = nn.Dropout(self.p)(x)
        return torch.sigmoid(self.cvr_fc[-1](x))
    def forward(self,x):
        user_list = list()
        for cate,num in self.user_feature_dict.items():
            if num[0] > 1:
                user_list.append(self.user_embed_layers['user_embed_{}'.format(num[1])](x[:,num[1]].long()))
            else:
                user_list.append(x[:,num[1]].unsqueeze(1))
        item_list = list()
        for cate,num in self.item_feature_dict.items():
            if num[0] > 1:
                item_list.append(self.item_embed_layers['item_embed_{}'.format(num[1])](x[:,num[1]].long()))
            else:
                item_list.append(x[:,num[1]].unsqueeze(1))
        user_embed = torch.cat(user_list, axis=1)
        item_embed = torch.cat(item_list, axis=1)
        x = torch.cat([user_embed, item_embed], axis=1)

        p_ctr = self.CTR(x)
        p_cvr = self.CVR(x)
        p_ctcvr = (p_ctr.argmax(dim = 1).reshape(-1,1))*p_cvr
        return p_ctr,p_ctcvr
    
    def total_para(self):# count the total number of parameters
        return sum([x.numel() for x in self.parameters()])  
def pred(model,x):
    p_ctr,p_ctcvr = model.forward(x)
    ctr_pre = (p_ctr.argmax(dim = 1)).reshape(-1,1)
    ctcvr_pre = [1 if x > 0.5 else 0 for x in p_ctcvr]
    ctcvr_pre = torch.tensor(ctcvr_pre).reshape(-1,1)
    return ctr_pre,ctcvr_pre 
def Loss(model,x,click,conversion):
    criteon = nn.CrossEntropyLoss()
    p_ctr,p_ctcvr = model.forward(x)
    
    ctr_loss = criteon(p_ctr,click.squeeze().type(torch.long))
    loss_fn = nn.BCELoss()
    ctcvr_loss = loss_fn(p_ctcvr,conversion)
    
    total_loss = ctr_loss + ctcvr_loss
    return total_loss

def loadtype(X_train,X_test,y_train,y_test,z_train,z_test,dtype):
    X_train = torch.tensor(X_train).type(dtype)
    X_test = torch.tensor(X_test).type(dtype)
    y_train = torch.tensor(y_train).type(dtype).reshape(-1,1)
    y_test = torch.tensor(y_test).type(dtype).reshape(-1,1)
    z_train = torch.tensor(z_train).type(dtype).reshape(-1,1)
    z_test = torch.tensor(z_test).type(dtype).reshape(-1,1)
    for i in range(y_train.shape[0]):
        if y_train[i,0] == 0:
            z_train[i,0] = 0
    for i in range(y_test.shape[0]):
        if y_test[i,0] == 0:
            z_test[i,0] = 0
    return (X_train, X_test),(y_train, y_test),(z_train, z_test)
def Train(model,X_train,click_train,conversion_train,optim,optimtype,epoch,record,batch):
    iteration = (X_train.shape[0] + batch - 1)//(batch)
    loss = Loss(model,X_train,click_train,conversion_train)
    print('start train:the loss:%.4e'%(loss.item()))
    for i in range(epoch):
        
        for j in range(iteration):
            inds = j*batch
            if (j + 1)*batch <= X_train.shape[0]:
                inde = (j + 1)*batch
            else:
                inde = X_train.shape[0]
                
            x = X_train[inds:inde]
            clk = click_train[inds:inde]
            cov = conversion_train[inds:inde]
            loss = Loss(model,x,clk,cov)
            #print(loss.item())
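            if optimtype in ('LBFGS','BFGS'):
                def closure():
                    # L-BFGS may call the closure several times per step, so it must
                    # recompute the loss and backpropagate that fresh value
                    optim.zero_grad()
                    lloss = Loss(model,x,clk,cov)
                    lloss.backward()
                    return lloss
                optim.step(closure)
            else:
                for _ in range(record):# "_" avoids overwriting the batch index j
                    optim.zero_grad()
                    loss = Loss(model,x,clk,cov)
                    #print(loss)
                    loss.backward()
                    optim.step()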
            loss = Loss(model,x,clk,cov)
            print('the %d epoch,the %d iter,the loss:%.4e'%(i + 1,j + 1,loss.item()))
        
        loss = Loss(model,X_train,click_train,conversion_train)
        ctr_pre,ctcvr_pre = pred(model,X_train)
        #print(ctr_pre.shape,click_train.shape)
        ctr_acc = ctr_pre.eq(click_train).sum()/click_train.shape[0]
        ctcvr_acc = ctcvr_pre.eq(conversion_train).sum()/conversion_train.shape[0]
        print('the %d epoch:the loss:%.4e,the ctr acc:%.2e,ctcvr acc:%.2e'%(i + 1,loss.item(),ctr_acc,ctcvr_acc))
embed = 8
p = 10
ctr_hid_layers = [128,64,2]
cvr_hid_layers = [128,64,1]
dtype = torch.float32

train_data, test_data, user_feature_dict, item_feature_dict = data_preparation()
#print(user_feature_dict, item_feature_dict)# inspect which user and item features are present
#print(train_data.shape)#[32561,15]
train_dataset = (train_data.iloc[:, :-2].values, train_data.iloc[:, -2].values, train_data.iloc[:, -1].values)
test_dataset = (test_data.iloc[:, :-2].values, test_data.iloc[:, -2].values, test_data.iloc[:, -1].values)
# The last two columns of train_data are the labels: whether income exceeds 50K, and marital status
X_train,y_train,z_train = train_dataset[0],train_dataset[1],train_dataset[2]
X_test,y_test,z_test = test_dataset[0],test_dataset[1],test_dataset[2]
#print(X_train[:2,:5])
model = Model(user_feature_dict, item_feature_dict,ctr_hid_layers,cvr_hid_layers,dtype,p,embed)
(X_train, X_test),(y_train, y_test),(z_train,z_test) = loadtype(X_train,X_test,y_train,y_test,z_train,z_test,dtype)


epoch = 10
batch = 1000
record = 2
lr = 1e-2
optimtype = 'SGD'
#optimtype = 'Adam'
#optimtype = 'LBFGS'

if optimtype == 'SGD':
    optim = torch.optim.SGD(model.parameters(),lr = lr)
elif optimtype == 'Adam':
    optim = torch.optim.Adam(model.parameters(),lr = lr)
elif optimtype == 'LBFGS':
    optim = torch.optim.LBFGS(model.parameters(),lr = lr,max_iter = record,
                              tolerance_grad=1e-16, tolerance_change=1e-16,
                              line_search_fn='strong_wolfe')
         
Train(model,X_train,y_train,z_train,optim,optimtype,epoch,record,batch)                 

ctr_pre,ctcvr_pre = pred(model,X_test)

ctr_acc = ctr_pre.eq(y_test).sum()/y_test.shape[0]
ctcvr_acc = ctcvr_pre.eq(z_test).sum()/z_test.shape[0]
print('the pre ctr acc:%.2e,pre ctcvr acc:%.2e'%(ctr_acc,ctcvr_acc))
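
In this version `forward` multiplies the CVR sigmoid by `p_ctr.argmax(dim = 1)`, so the CTCVR output is either 0 or `p_cvr`, and the argmax passes no gradient back to the CTR tower. The sketch below contrasts that hard gate with the soft product pCTR * pCVR from the ESMM definition, using plain Python numbers. The function names and the example logits are assumptions for illustration; the soft variant is shown only for comparison and is not what the listing does.

```python
import math

def softmax2(logit0, logit1):
    """Two-class softmax; returns (P(class 0), P(class 1))."""
    m = max(logit0, logit1)  # subtract the max for numerical stability
    e0, e1 = math.exp(logit0 - m), math.exp(logit1 - m)
    return e0 / (e0 + e1), e1 / (e0 + e1)

def hard_gate_ctcvr(ctr_logits, p_cvr):
    """Hard gate as in the listing: 0 unless class 1 has the larger logit."""
    predicted_click = 1 if ctr_logits[1] > ctr_logits[0] else 0
    return predicted_click * p_cvr

def soft_ctcvr(ctr_logits, p_cvr):
    """Soft product P(click) * P(conversion | click)."""
    return softmax2(*ctr_logits)[1] * p_cvr

for logits in [(0.2, 1.2), (1.0, 0.0)]:
    print(logits, hard_gate_ctcvr(logits, 0.6), round(soft_ctcvr(logits, 0.6), 4))
```

With the hard gate, a sample whose predicted click class is 0 always gets a CTCVR score of exactly 0, whatever the CVR tower outputs.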



Data augmentation and using a validation set

import torch
import torch.nn as nn
import numpy as np
import time
import pandas as pd
import random
from adultutils import data_preparation
np.random.seed(1234)
torch.manual_seed(1234)
class Model(torch.nn.Module):
    def __init__(self, user_feature_dict, item_feature_dict,ctr_hid_layers,cvr_hid_layers,dtype,k,embed):
        super(Model, self).__init__()# the input dimension depends on embed; only the hidden and output layers are configured here
        self.user_feature_dict = user_feature_dict
        self.item_feature_dict = item_feature_dict
        self.user_embed_layers = dict()
        self.item_embed_layers = dict()
        user_cate_feature_nums, item_cate_feature_nums = 0, 0
        for cate,num in self.user_feature_dict.items():
            if num[0] > 1:# count the feature as embedded if it has more than one distinct value
                user_cate_feature_nums += 1
                
        for cate,num in self.item_feature_dict.items():
            if num[0] > 1:
                item_cate_feature_nums += 1
                
        # Note: some features are embedded, so the network's input width depends on how many features are embedded
        # user_cate_feature_nums above counts the embedded user features so the input dimension can be computed below
        input_dim = embed * (user_cate_feature_nums + item_cate_feature_nums) + \
                      (len(user_feature_dict) - user_cate_feature_nums) + (len(item_feature_dict) - item_cate_feature_nums)
        #-------ctr DNN
        self.ctr_layers = [input_dim] + ctr_hid_layers
        self.ctr_layers_hid_num = len(self.ctr_layers)-2
        ctr_fc = []
        for i in range(self.ctr_layers_hid_num+1):
            ctr_fc.append(torch.nn.Linear(self.ctr_layers[i],self.ctr_layers[i+1]))
        self.ctr_fc = torch.nn.Sequential(*ctr_fc)
        for i in range(self.ctr_layers_hid_num+1):
            self.ctr_fc[i].weight.data = self.ctr_fc[i].weight.data.type(dtype)
            self.ctr_fc[i].bias.data = self.ctr_fc[i].bias.data.type(dtype)
        #--------cvr DNN
        self.cvr_layers = [input_dim] + cvr_hid_layers
        self.cvr_layers_hid_num = len(self.cvr_layers)-2
        cvr_fc = []
        for i in range(self.cvr_layers_hid_num+1):
            cvr_fc.append(torch.nn.Linear(self.cvr_layers[i],self.cvr_layers[i+1]))
        self.cvr_fc = torch.nn.Sequential(*cvr_fc)
        for i in range(self.cvr_layers_hid_num+1):
            self.cvr_fc[i].weight.data = self.cvr_fc[i].weight.data.type(dtype)
            self.cvr_fc[i].bias.data = self.cvr_fc[i].bias.data.type(dtype)
        #-------------FM
        self.w = torch.nn.Linear(input_dim,2)
        fm = [];fm.append(self.w)
        self.fm = torch.nn.Sequential(*fm)
        self.fm[0].weight.data = self.fm[0].weight.data.type(dtype)
        self.fm[0].bias.data = self.fm[0].bias.data.type(dtype)
        # Create v directly in the target dtype so it stays a registered nn.Parameter
        self.v = torch.nn.Parameter(torch.rand(input_dim,k,dtype=dtype), requires_grad=True)
        
        
        #----------
    def FM(self,x):
        linear_part = self.fm[0](x)
        inner_part1 = torch.pow(x,2)@torch.pow(self.v,2)
        inner_part2 = torch.pow((x@self.v),2)
        inner = 0.5*(inner_part2 - inner_part1).sum(axis = 1,keepdims = True)
        inner_part = torch.cat([inner,inner],axis = 1)
        
        return linear_part + inner_part
        
    def reg(self,x):
        tmp1 = 1/(1 + torch.exp(x[:,1:2] - x[:,0:1]))
        tmp2 = 1/(1 + torch.exp(x[:,0:1] - x[:,1:2]))
        return torch.cat([tmp1,tmp2],dim = 1)
    def CTR(self,x):
        #x = nn.BatchNorm1d(x.shape[0])(x)
        linear = self.FM(x)
        for i in range(self.ctr_layers_hid_num):
            x = torch.relu(self.ctr_fc[i](x))#.to(device)
            temp = torch.eye(x.shape[-1],self.ctr_layers[i+1])
            x = x + x@temp
            x = nn.BatchNorm1d(self.ctr_layers[i + 1])(x)
            #x = nn.Dropout(self.p)(x)
        return (self.ctr_fc[-1](x) + linear)
    def CVR(self,x):
        #x = nn.BatchNorm1d(x.shape[0])(x)
        for i in range(self.cvr_layers_hid_num):
            x = torch.relu(self.cvr_fc[i](x))#.to(device)
            temp = torch.eye(x.shape[-1],self.cvr_layers[i+1])
            x = x + x@temp
            x = nn.BatchNorm1d(self.cvr_layers[i + 1])(x)
            #x = nn.Dropout(self.p)(x)
        return (torch.sigmoid(self.cvr_fc[-1](x)) + 0)
    def forward(self,x):
        

        p_ctr = self.CTR(x)
        p_cvr = self.CVR(x)
        p_ctcvr = (p_ctr.argmax(dim = 1).reshape(-1,1))*p_cvr
        return p_ctr,p_ctcvr
    
    def total_para(self):# count the total number of parameters
        return sum([x.numel() for x in self.parameters()])  

class Data(torch.nn.Module):
    def __init__(self,user_feature_dict, item_feature_dict,embed):
        super(Data,self).__init__()
        self.user_feature_dict = user_feature_dict
        self.item_feature_dict = item_feature_dict
        self.user_embed_layers = dict()
        self.item_embed_layers = dict()
        user_cate_feature_nums, item_cate_feature_nums = 0, 0
        for cate,num in self.user_feature_dict.items():
            if num[0] > 1:# embed the feature if it has more than one distinct value
                user_cate_feature_nums += 1
                self.user_embed_layers['user_embed_' + str(num[1])] = torch.nn.Embedding(num[0], embed)
        for cate,num in self.item_feature_dict.items():
            if num[0] > 1:
                item_cate_feature_nums += 1
                self.item_embed_layers['item_embed_' + str(num[1])] = torch.nn.Embedding(num[0], embed)
    def deal(self,x):
        user_list = list()
        for cate,num in self.user_feature_dict.items():
            if num[0] > 1:
                user_list.append(self.user_embed_layers['user_embed_{}'.format(num[1])](x[:,num[1]].long()))
            else:
                user_list.append(x[:,num[1]].unsqueeze(1))
        item_list = list()
        for cate,num in self.item_feature_dict.items():
            if num[0] > 1:
                item_list.append(self.item_embed_layers['item_embed_{}'.format(num[1])](x[:,num[1]].long()))
            else:
                item_list.append(x[:,num[1]].unsqueeze(1))
        user_embed = torch.cat(user_list, axis=1)
        item_embed = torch.cat(item_list, axis=1)
        x = torch.cat([user_embed, item_embed], axis=1)
        return x
def shullfe(x,y,z):
    index = random.sample(range(0,x.shape[0]),x.shape[0])
    x = x[index,:]
    y = y[index,:]
    z = z[index,:]
    return x,y,z
def val(x,y,z,p = 0.8):#p>0.5
    x,y,z = shullfe(x,y,z)
    ind = (int)(p*x.shape[0])
    x_train,x_val = x[:ind,:],x[ind:,:]
    y_train,y_val = y[:ind,:],y[ind:,:]
    z_train,z_val = z[:ind,:],z[ind:,:]
    return (x_train,y_train,z_train),(x_val,y_val,z_val)


def pred(model,x):
    p_ctr,p_ctcvr = model.forward(x)
    ctr_pre = (p_ctr.argmax(dim = 1)).reshape(-1,1)
    ctcvr_pre = [1 if x > 0.5 else 0 for x in p_ctcvr]
    ctcvr_pre = torch.tensor(ctcvr_pre).reshape(-1,1)
    return ctr_pre,ctcvr_pre 
def Loss(model,x,click,conversion):
    criteon = nn.CrossEntropyLoss()
    p_ctr,p_ctcvr = model.forward(x)
    
    ctr_loss = criteon(p_ctr,click.squeeze().type(torch.long))
    loss_fn = nn.BCELoss()
    ctcvr_loss = loss_fn(p_ctcvr,conversion)
    
    total_loss = ctr_loss + ctcvr_loss
    return total_loss

def loadtype(X_train,X_test,y_train,y_test,z_train,z_test,dtype):
    X_train = torch.tensor(X_train).type(dtype)
    X_test = torch.tensor(X_test).type(dtype)
    y_train = torch.tensor(y_train).type(dtype).reshape(-1,1)
    y_test = torch.tensor(y_test).type(dtype).reshape(-1,1)
    z_train = torch.tensor(z_train).type(dtype).reshape(-1,1)
    z_test = torch.tensor(z_test).type(dtype).reshape(-1,1)
    for i in range(y_train.shape[0]):
        if y_train[i,0] == 0:
            z_train[i,0] = 0
    for i in range(y_test.shape[0]):
        if y_test[i,0] == 0:
            z_test[i,0] = 0
    return (X_train, X_test),(y_train, y_test),(z_train, z_test)
def Train(model,X_train,click_train,conversion_train,optim,optimtype,epoch,record,batch):
    
    # Split once, before the epoch loop: re-splitting the already-split training set
    # every epoch would shrink it by a factor of p each time
    (X_train,click_train,conversion_train),(X_val,y_val,z_val) = val(X_train,click_train,conversion_train,p = 0.8)
    iteration = (X_train.shape[0] + batch - 1)//(batch)
    loss = Loss(model,X_train,click_train,conversion_train)
    print('start train:the loss:%.4e'%(loss.item()))
    for i in range(epoch):

        #X_train,click_train,conversion_train = shullfe(X_train,click_train,conversion_train)
        for j in range(iteration):
            inds = j*batch
            if (j + 1)*batch <= X_train.shape[0]:
                inde = (j + 1)*batch
            else:
                inde = X_train.shape[0]
                
            x = X_train[inds:inde]
            clk = click_train[inds:inde]
            cov = conversion_train[inds:inde]
            #x,clk,cov = shullfe(x,clk,cov)
            loss = Loss(model,x,clk,cov)
            #print(loss.item())
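            if optimtype in ('LBFGS','BFGS'):
                def closure():
                    # L-BFGS may call the closure several times per step, so it must
                    # recompute the loss and backpropagate that fresh value
                    optim.zero_grad()
                    lloss = Loss(model,x,clk,cov)
                    lloss.backward()
                    return lloss
                optim.step(closure)
            else:
                for _ in range(record):# "_" avoids overwriting the batch index j
                    optim.zero_grad()
                    loss = Loss(model,x,clk,cov)
                    #print(loss)
                    loss.backward()
                    optim.step()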
            loss = Loss(model,x,clk,cov)
            print('the %d epoch,the %d iter,the loss:%.4e'%(i + 1,j + 1,loss.item()))
        
        loss = Loss(model,X_train,click_train,conversion_train)
        #-----------------train
        ctr_pre,ctcvr_pre = pred(model,X_train)
        #print(ctr_pre.shape,click_train.shape)
        ctr_acc = ctr_pre.eq(click_train).sum()/click_train.shape[0]
        ctcvr_acc = ctcvr_pre.eq(conversion_train).sum()/conversion_train.shape[0]
        #-------------------------val
        ctr_val,ctcvr_val = pred(model,X_val)
        #print(ctr_pre.shape,click_train.shape)
        ctr_val_acc = ctr_val.eq(y_val).sum()/X_val.shape[0]
        ctcvr_val_acc = ctcvr_val.eq(z_val).sum()/X_val.shape[0]
        print('the %d epoch:the loss:%.4e,the ctr acc:%.2e,ctcvr acc:%.2e,ctr val:%.2e,ctcvr val:%.2e'
        %(i + 1,loss.item(),ctr_acc,ctcvr_acc,ctr_val_acc,ctcvr_val_acc))
st = time.time()
embed = 8
p = 10
ctr_hid_layers = [64,64,2]
cvr_hid_layers = [128,64,1]
dtype = torch.float32

train_data, test_data, user_feature_dict, item_feature_dict = data_preparation()
#print(user_feature_dict, item_feature_dict)# inspect which user and item features are present
#print(train_data.shape)#[32561,15]
train_dataset = (train_data.iloc[:, :-2].values, train_data.iloc[:, -2].values, train_data.iloc[:, -1].values)
test_dataset = (test_data.iloc[:, :-2].values, test_data.iloc[:, -2].values, test_data.iloc[:, -1].values)
# The last two columns of train_data are the labels: whether income exceeds 50K, and marital status
X_train,y_train,z_train = train_dataset[0],train_dataset[1],train_dataset[2]
X_test,y_test,z_test = test_dataset[0],test_dataset[1],test_dataset[2]
#print(X_train[:2,:5])

(X_train, X_test),(y_train, y_test),(z_train,z_test) = loadtype(X_train,X_test,y_train,y_test,z_train,z_test,dtype)
data = Data(user_feature_dict, item_feature_dict,embed)
X_train = data.deal(X_train).data
X_test = data.deal(X_test).data
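# Optional augmentation experiment (disabled): append a batch-normalized copy of the training rows
#tmp = nn.BatchNorm1d(X_train.shape[1])(X_train)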
#X_train = torch.cat([X_train,tmp]).data
#y_train = torch.cat([y_train,y_train]).data
#z_train = torch.cat([z_train,z_train]).data
# Hold out a validation split; Train reads X_val, y_val and z_val as globals
(X_train,y_train,z_train),(X_val,y_val,z_val) = val(X_train,y_train,z_train,p = 0.8)
model = Model(user_feature_dict, item_feature_dict,ctr_hid_layers,cvr_hid_layers,dtype,p,embed)


epoch = 10
batch = 1000
record = 2
lr = 1e-2
optimtype = 'SGD'
#optimtype = 'Adam'
#optimtype = 'LBFGS'

if optimtype == 'SGD':
    optim = torch.optim.SGD(model.parameters(),lr = lr)
elif optimtype == 'Adam':
    optim = torch.optim.Adam(model.parameters(),lr = lr)
elif optimtype == 'LBFGS':
    optim = torch.optim.LBFGS(model.parameters(),lr = lr,max_iter = record,
                              tolerance_grad=1e-16, tolerance_change=1e-16,
                              line_search_fn='strong_wolfe')
         
Train(model,X_train,y_train,z_train,optim,optimtype,epoch,record,batch)                 

ctr_pre,ctcvr_pre = pred(model,X_test)

ctr_acc = ctr_pre.eq(y_test).sum()/y_test.shape[0]
ctcvr_acc = ctcvr_pre.eq(z_test).sum()/z_test.shape[0]
ela = time.time() - st
num = model.total_para()
print('time:%.2f,num:%d,the pre ctr acc:%.2e,pre ctcvr acc:%.2e'%(ela,num,ctr_acc,ctcvr_acc))
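
The training loop above slices each mini-batch as `X_train[inds:inde]` and clips the last batch to the end of the data. Here is a minimal sketch of that index arithmetic in plain Python. The helper name `batch_bounds` is hypothetical and not part of the original script; the sizes 32561 and 1000 are taken from the commented shape and the `batch` setting above.

```python
def batch_bounds(n_samples, batch):
    """Yield (start, end) index pairs that cover range(n_samples)."""
    for start in range(0, n_samples, batch):
        # Clip the final batch so it never runs past the data.
        yield start, min(start + batch, n_samples)


bounds = list(batch_bounds(32561, 1000))
print(len(bounds), bounds[0], bounds[-1])
```

With these sizes the last batch is shorter than the others (561 rows), which is why the loss printed for it can look noisier.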

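
The training loop branches on the optimizer name, so one Python detail is worth keeping in mind. An expression of the form `name == 'LBFGS' or 'BFGS'` is parsed as `(name == 'LBFGS') or 'BFGS'`, and a non-empty string is always truthy, so the test passes for every name. A membership test expresses the intended check. The helper `uses_closure` below is a hypothetical name used only for illustration.

```python
def uses_closure(optimtype):
    """Return True only for optimizers that need a closure in step()."""
    return optimtype in ('LBFGS', 'BFGS')


for name in ('SGD', 'Adam', 'LBFGS'):
    always_truthy = bool(name == 'LBFGS' or 'BFGS')  # the pitfall
    print(name, always_truthy, uses_closure(name))
```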

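
When `torch.optim.LBFGS` is selected, `optim.step` expects a closure that clears the gradients, recomputes the loss, back-propagates it and returns it. Below is a minimal sketch of that pattern on toy data. The data, the one-layer model and the step count are assumptions for illustration only, not the ESMM setup of this post.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy binary-classification data (hypothetical): 64 samples, 3 features.
x = torch.randn(64, 3)
y = (x[:, :1] + 0.5 * torch.randn(64, 1) > 0).float()

model = nn.Sequential(nn.Linear(3, 1), nn.Sigmoid())
loss_fn = nn.BCELoss()
optim = torch.optim.LBFGS(model.parameters(), lr=1e-1, max_iter=5,
                          line_search_fn='strong_wolfe')


def closure():
    # LBFGS may call this several times per step, so the loss has to be
    # recomputed and back-propagated on every call.
    optim.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    return loss


loss_before = loss_fn(model(x), y).item()
for _ in range(5):
    optim.step(closure)
loss_after = loss_fn(model(x), y).item()
print('loss before: %.4f, after: %.4f' % (loss_before, loss_after))
```

Calling `backward()` on a loss computed outside the closure would reuse a stale graph, so the loss is rebuilt inside it.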
