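For the theory behind DeepFM (notes from reading the paper), see this blog post: https://blog.csdn.net/weixin_45459911/article/details/105359982
This post walks through the code side of DeepFM.
(The code is adapted from https://github.com/princewen/tensorflow_practice/tree/master/recommendation/Basic-DeepFM-model)
I. Building the DeepFM model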
1. Model parameters
"""Model parameters"""
import numpy as np
import tensorflow as tf  # TensorFlow 1.x API (tf.placeholder, tf.random_normal)

dfm_params = {
    "use_fm": True,
    "use_deep": True,
    "embedding_size": 8,
    "dropout_fm": [1.0, 1.0],
    "deep_layers": [32, 32],
    "dropout_deep": [0.5, 0.5, 0.5],
    "deep_layer_activation": tf.nn.relu,
    "epoch": 30,
    "batch_size": 1024,
    "learning_rate": 0.001,
    "optimizer": "adam",
    "batch_norm": 1,
    "batch_norm_decay": 0.995,
    "l2_reg": 0.01,
    "verbose": True,
    "eval_metric": 'gini_norm',
    "random_seed": 3
}
# total_feature and train_feature_index are produced by the data-loading step
dfm_params['feature_size'] = total_feature
dfm_params['field_size'] = len(train_feature_index.columns)
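The values of feature_size and field_size are read from the data: feature_size is the total number of feature indices, and field_size is the number of columns in the feature-index table.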
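To make the two sizes concrete, here is a minimal sketch of how they could be derived from raw data. The toy rows, the column names, and the helper `build_feature_dict` are all hypothetical; the original data-loading code is not shown in this post, so this is only one plausible way to number the features.

```python
# Toy data: two categorical columns and one numeric column (all hypothetical).
rows = [
    {"gender": "m", "city": "sh", "age": 23.0},
    {"gender": "f", "city": "bj", "age": 31.0},
    {"gender": "f", "city": "sh", "age": 45.0},
]
numeric_cols = {"age"}


def build_feature_dict(rows, numeric_cols):
    """Assign one global index per (column, category) pair and one per numeric column."""
    feature_dict = {}
    next_index = 0
    for col in rows[0]:  # dicts keep insertion order, so the column order is stable
        if col in numeric_cols:
            feature_dict[col] = next_index
            next_index += 1
        else:
            for value in sorted({r[col] for r in rows}):
                feature_dict[(col, value)] = next_index
                next_index += 1
    return feature_dict, next_index


feature_dict, total_feature = build_feature_dict(rows, numeric_cols)
feature_size = total_feature   # 2 genders + 2 cities + 1 numeric column
field_size = len(rows[0])      # one field per original column
```

Under this numbering, feature_size grows with the number of distinct categories, while field_size stays equal to the number of columns.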
2. Model inputs
"""Start building the model"""
feat_index = tf.placeholder(tf.int32,shape=[None,None],name='feat_index')
feat_value = tf.placeholder(tf.float32,shape=[None,None],name='feat_value')
label = tf.placeholder(tf.float32,shape=[None,1],name='label')
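The model takes three inputs: the feature indices, the feature values, and the label.
feat_index holds the index of each feature and is used by embedding_lookup to select the matching embedding. feat_value holds the corresponding feature value: 1 for a categorical (discrete) feature, and the original value for a numeric feature. label is the ground-truth target.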
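The encoding rule described above can be sketched for a single sample as follows. The feature dictionary, the column names, and `encode_row` are hypothetical and only illustrate the rule "1 for categorical, raw value for numeric".

```python
# Hypothetical feature dictionary: one index per category, one per numeric column.
feature_dict = {
    ("gender", "f"): 0,
    ("gender", "m"): 1,
    ("city", "bj"): 2,
    ("city", "sh"): 3,
    "age": 4,
}
numeric_cols = {"age"}


def encode_row(row):
    """Return (feat_index, feat_value) lists with one entry per field."""
    index, value = [], []
    for col, raw in row.items():
        if col in numeric_cols:
            index.append(feature_dict[col])
            value.append(float(raw))   # numeric feature: keep the original value
        else:
            index.append(feature_dict[(col, raw)])
            value.append(1.0)          # categorical feature: the value is always 1
    return index, value


feat_index_row, feat_value_row = encode_row({"gender": "m", "city": "sh", "age": 23.0})
```

Each sample therefore yields exactly field_size indices and field_size values, which is why both placeholders have shape [None, None] with the second dimension equal to field_size at run time.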
3. Model weights
"""Build the weights"""
weights = dict()

# embeddings
weights['feature_embeddings'] = tf.Variable(
    tf.random_normal([dfm_params['feature_size'], dfm_params['embedding_size']], 0.0, 0.01),
    name='feature_embeddings')
weights['feature_bias'] = tf.Variable(
    tf.random_normal([dfm_params['feature_size'], 1], 0.0, 1.0), name='feature_bias')

# deep layers
num_layer = len(dfm_params['deep_layers'])
input_size = dfm_params['field_size'] * dfm_params['embedding_size']
glorot = np.sqrt(2.0 / (input_size + dfm_params['deep_layers'][0]))
weights['layer_0'] = tf.Variable(
    np.random.normal(loc=0, scale=glorot, size=(input_size, dfm_params['deep_layers'][0])),
    dtype=np.float32)
weights['bias_0'] = tf.Variable(
    np.random.normal(loc=0, scale=glorot, size=(1, dfm_params['deep_layers'][0])),
    dtype=np.float32)

for i in range(1, num_layer):
    glorot = np.sqrt(2.0 / (dfm_params['deep_layers'][i - 1] + dfm_params['deep_layers'][i]))
    weights["layer_%d" % i] = tf.Variable(
        np.random.normal(loc=0, scale=glorot,
                         size=(dfm_params['deep_layers'][i - 1], dfm_params['deep_layers'][i])),
        dtype=np.float32)  # layers[i-1] * layers[i]
    weights["bias_%d" % i] = tf.Variable(
        np.random.normal(loc=0, scale=glorot, size=(1, dfm_params['deep_layers'][i])),
        dtype=np.float32)  # 1 * layers[i]

# final concat projection layer
# FM contributes field_size (first order) + embedding_size (second order) columns
if dfm_params['use_fm'] and dfm_params['use_deep']:
    input_size = dfm_params['field_size'] + dfm_params['embedding_size'] + dfm_params['deep_layers'][-1]
elif dfm_params['use_fm']:
    input_size = dfm_params['field_size'] + dfm_params['embedding_size']
elif dfm_params['use_deep']:
    input_size = dfm_params['deep_layers'][-1]

glorot = np.sqrt(2.0 / (input_size + 1))
weights['concat_projection'] = tf.Variable(
    np.random.normal(loc=0, scale=glorot, size=(input_size, 1)), dtype=np.float32)
weights['concat_bias'] = tf.Variable(tf.constant(0.01), dtype=np.float32)
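These weights fall into two main groups: the embedding weights and the deep-network weights.
weights['feature_embeddings'] holds the embedding of every feature; its shape is feature_size * embedding_size.
weights['feature_bias'] holds the first-order weights used in the FM part. It can be thought of as an embedding table whose embedding_size is 1; its shape is feature_size * 1.
weights['feature_embeddings'] is a lookup table: row i holds the embedding_size-dimensional vector of the feature with index i.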
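As a quick way to check the bookkeeping, the shapes of all weights can be computed from the parameters alone. The helper `weight_shapes` below is hypothetical (it is not part of the referenced repository), and the sizes passed to it are made-up values for illustration.

```python
def weight_shapes(feature_size, field_size, embedding_size, deep_layers,
                  use_fm=True, use_deep=True):
    """Return a dict mapping each weight name to the shape used in the code above."""
    shapes = {
        "feature_embeddings": (feature_size, embedding_size),
        "feature_bias": (feature_size, 1),
    }
    in_dim = field_size * embedding_size      # flattened embeddings feed the first layer
    for i, out_dim in enumerate(deep_layers):
        shapes["layer_%d" % i] = (in_dim, out_dim)
        shapes["bias_%d" % i] = (1, out_dim)
        in_dim = out_dim
    concat_dim = 0
    if use_fm:
        concat_dim += field_size + embedding_size   # first-order + second-order outputs
    if use_deep:
        concat_dim += deep_layers[-1]
    shapes["concat_projection"] = (concat_dim, 1)
    return shapes


# Hypothetical sizes: 100 feature indices spread over 10 fields.
shapes = weight_shapes(feature_size=100, field_size=10, embedding_size=8, deep_layers=[32, 32])
```

With these numbers the first deep layer takes 10 * 8 = 80 inputs, and the final projection takes 10 + 8 + 32 = 50.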
4. Embedding layer
The embedding layer looks up the embedding of each feature from its index:
"""embedding"""
embeddings = tf.nn.embedding_lookup(weights['feature_embeddings'],feat_index)
reshaped_feat_value = tf.reshape(feat_value,shape=[-1,dfm_params['field_size'],1])
embeddings = tf.multiply(embeddings,reshaped_feat_value)
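Note that after the lookup, each embedding is also multiplied by its feature value. This follows from the FM formula, in which the latent vector v_i of a feature always appears together with its value x_i.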
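The lookup-then-scale step can be reproduced in NumPy to see the shapes involved. This is only a sketch: the sizes and input values are toy assumptions, and plain array indexing stands in for tf.nn.embedding_lookup.

```python
import numpy as np

rng = np.random.default_rng(0)
feature_size, field_size, embedding_size = 5, 3, 4   # toy sizes, chosen for illustration

table = rng.normal(0.0, 0.01, size=(feature_size, embedding_size))
feat_index = np.array([[1, 3, 4], [0, 2, 4]])                 # (batch, field)
feat_value = np.array([[1.0, 1.0, 23.0], [1.0, 1.0, 31.0]])   # (batch, field)

# Indexing with an integer array gathers rows of the table, one per (sample, field).
looked_up = table[feat_index]                   # (batch, field, embedding)
# Add a trailing axis so each value broadcasts over its whole embedding vector.
scaled = looked_up * feat_value[:, :, None]     # (batch, field, embedding)
```

For categorical fields the value is 1, so the embedding passes through unchanged; for the numeric field the whole vector is scaled by the raw value.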
5. FM part
The first-order term is computed as follows. As mentioned above, weights['feature_bias'] supplies the first-order weight of each feature:
fm_first_order = tf.nn.embedding_lookup(weights['feature_bias'],feat_index)
fm_first_order = tf.reduce_sum(tf.multiply(fm_first_order,reshaped_feat_value),2)
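A small NumPy sketch of the same first-order computation, with made-up weights and a single toy sample, shows that the result keeps one entry per field rather than a single scalar:

```python
import numpy as np

# Toy first-order weights, shape (feature_size, 1); the values are made up.
feature_bias = np.array([[0.1], [0.2], [0.3], [0.4], [0.5]])
feat_index = np.array([[1, 3, 4]])          # (batch, field)
feat_value = np.array([[1.0, 1.0, 2.0]])    # (batch, field)

w = feature_bias[feat_index]                                   # (batch, field, 1)
# Summing over the last axis only removes the size-1 dimension.
fm_first_order = np.sum(w * feat_value[:, :, None], axis=2)    # (batch, field)
```

Here the result is w_i * x_i per field, i.e. 0.2, 0.4 and 0.5 * 2.0. The sum over fields is not taken at this point, which matches the field_size term in the input size of the concat projection layer.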
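For the second-order term, the simplified formula has two parts, the square of the sum and the sum of the squares (ignoring the outermost sum for now):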
summed_features_emb = tf.reduce_sum(embeddings,1)
summed_features_emb_square = tf.square(summed_features_emb)
squared_features_emb = tf.square(embeddings)
squared_sum_features_emb = tf.reduce_sum(squared_features_emb,1)
fm_second_order = 0.5 * tf.subtract(summed_features_emb_square,squared_sum_features_emb)
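Note that fm_second_order here is a 2-D tensor of shape batch_size * embedding_size. In other words, the outermost sum in the formula has not been applied yet, and this is where the code departs from the FM formula. We will come back to this later.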
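The simplification can be checked numerically. The sketch below uses random toy embeddings (an assumption, standing in for the value-scaled embeddings) and compares the "square of sum minus sum of squares" form against a brute-force sum of pairwise inner products.

```python
import numpy as np

rng = np.random.default_rng(1)
batch, field_size, embedding_size = 2, 4, 3                    # toy sizes
emb = rng.normal(size=(batch, field_size, embedding_size))     # stands in for v_i * x_i

# Same steps as the TensorFlow code above, with the field axis being axis 1.
sum_then_square = np.square(emb.sum(axis=1))      # (batch, embedding)
square_then_sum = np.square(emb).sum(axis=1)      # (batch, embedding)
fm_second_order = 0.5 * (sum_then_square - square_then_sum)

# Brute force: sum of <v_i x_i, v_j x_j> over all field pairs i < j.
pairwise = np.zeros(batch)
for i in range(field_size):
    for j in range(i + 1, field_size):
        pairwise += np.sum(emb[:, i, :] * emb[:, j, :], axis=1)
```

Summing fm_second_order over the embedding axis recovers the pairwise total, so the tensor kept by the code is the FM second-order term before its outermost sum.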
6. Deep part
This is simply a stack of fully connected layers:
"""deep part"""
y_deep = tf.reshape(embeddings, shape=[-1, dfm_params['field_size'] * dfm_params['embedding_size']])
for i in range(0, len(dfm_params['deep_layers'])):
    y_deep = tf.add(tf.matmul(y_deep, weights["layer_%d" % i]), weights["bias_%d" % i])
    y_deep = tf.nn.relu(y_deep)
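The forward pass of the deep part can be sketched in NumPy as below. The sizes are toy assumptions, the weights are drawn with the same Glorot-style scale as in the weights section, and dropout and batch normalization are left out because the snippet above does not apply them either.

```python
import numpy as np

rng = np.random.default_rng(2)
batch, field_size, embedding_size = 2, 3, 4    # toy sizes
deep_layers = [8, 8]                           # toy layer widths

emb = rng.normal(size=(batch, field_size, embedding_size))
# Flatten the per-field embeddings into one input vector per sample.
y_deep = emb.reshape(batch, field_size * embedding_size)

in_dim = field_size * embedding_size
for out_dim in deep_layers:
    scale = np.sqrt(2.0 / (in_dim + out_dim))             # Glorot-style standard deviation
    W = rng.normal(0.0, scale, size=(in_dim, out_dim))
    b = rng.normal(0.0, scale, size=(1, out_dim))
    y_deep = np.maximum(y_deep @ W + b, 0.0)              # affine transform followed by ReLU
    in_dim = out_dim
```

The output has one row per sample and deep_layers[-1] columns, which is the width that enters the final concat projection.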
Finally, add the output layer and the optimization step to complete the model.
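The output step is not shown in this post, so the following NumPy sketch is an assumption based on the shape of weights['concat_projection']: the three branches are concatenated, projected to one logit, and passed through a sigmoid. The sigmoid and the log loss are assumed here for a binary label; the random inputs and labels are toy values.

```python
import numpy as np

rng = np.random.default_rng(3)
batch, field_size, embedding_size, last_deep = 2, 3, 4, 8   # toy sizes

# Stand-ins for the three branch outputs computed earlier.
fm_first_order = rng.normal(size=(batch, field_size))
fm_second_order = rng.normal(size=(batch, embedding_size))
y_deep = rng.normal(size=(batch, last_deep))

concat_input = np.concatenate([fm_first_order, fm_second_order, y_deep], axis=1)
input_size = field_size + embedding_size + last_deep
concat_projection = rng.normal(0.0, np.sqrt(2.0 / (input_size + 1)), size=(input_size, 1))
concat_bias = 0.01

logits = concat_input @ concat_projection + concat_bias   # (batch, 1)
prob = 1.0 / (1.0 + np.exp(-logits))                      # sigmoid, assumed for binary labels

label = np.array([[1.0], [0.0]])                          # toy labels
eps = 1e-7                                                # guards against log(0)
log_loss = -np.mean(label * np.log(prob + eps) + (1 - label) * np.log(1 - prob + eps))
```

Under this reading, the projection weights take over the role of the sums that were left out of the FM terms: instead of adding the first-order and second-order entries with weight 1, the model learns a weight for each of them.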