Contents
1: FFM Background
1.1 Why FFM Was Proposed
In the FM model, each feature produced by one-hot encoding a raw feature gets a single latent vector. FFM instead groups features into multiple fields, and each feature gets one latent vector per field; here the number of fields equals the number of raw (pre-encoding) features. For example, suppose each sample has 3 raw fields: publisher, advertiser and gender, representing the media outlet, the advertiser (or a specific product), and the user's gender. publisher takes 5 distinct values, advertiser takes 10, and gender takes 2 (male/female). After one-hot encoding, each sample has 17 features, of which only 3 are non-zero. With FM, each of the 17 features has one latent vector. With FFM, each of the 17 features has 3 latent vectors, one per field, i.e. one each for the publisher, advertiser and gender fields.
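To make the parameter counts concrete, here is a tiny sketch using the field cardinalities assumed in the example above (5, 10 and 2 are illustrative numbers from this post, not from any real dataset):

```python
# Field cardinalities from the example above (illustrative numbers)
field_cardinalities = {"publisher": 5, "advertiser": 10, "gender": 2}

n_features = sum(field_cardinalities.values())   # one-hot features in total
n_fields = len(field_cardinalities)              # number of fields

fm_latent_vectors = n_features               # FM: one latent vector per feature
ffm_latent_vectors = n_features * n_fields   # FFM: one per (feature, field) pair

print(n_features, fm_latent_vectors, ffm_latent_vectors)  # 17 17 51
```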
1.2: How FFM Works
To use FFM in xlearn, every feature must be converted into the format "field_id:feat_id:value", where field_id is the index of the field the feature belongs to, feat_id is the feature's index after one-hot encoding, and value is the feature's value. Numerical features are straightforward: each simply gets a field of its own, e.g. a user's review score or an item's historical CTR/CVR. Categorical features must first be one-hot encoded into numerical ones; all features produced from one categorical variable share a single field, and their values can only be 0 or 1, e.g. user gender, age bucket, or item category id. There is also a third kind of feature, such as the categories a user has browsed or purchased: it has multiple category ids, each paired with a number measuring how many items of that category the user browsed or bought. These are handled like categorical features, except that the value is not 0/1 but the browse/purchase count. Once field_id is assigned as above, the converted features are numbered sequentially to obtain feat_id, and the value is taken as described.
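As a sketch of that conversion (the field and feature indices below are made up for illustration; this is not xlearn's own tooling):

```python
def to_ffm_line(label, fields):
    """Build one libffm-format line.
    fields: list of (field_id, feat_id, value) triples."""
    parts = ["{}:{}:{}".format(f, j, v) for f, j, v in fields]
    return str(label) + " " + " ".join(parts)

# One hypothetical sample: publisher one-hot index 2, advertiser index 7,
# gender index 15, plus a numeric "historical CTR" feature in its own field.
line = to_ffm_line(1, [(0, 2, 1), (1, 7, 1), (2, 15, 1), (3, 17, 0.032)])
print(line)  # 1 0:2:1 1:7:1 2:15:1 3:17:0.032
```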
example: compared with FM, the number of second-order feature combinations stays the same; only the latent vectors used in each cross change.
Original data:
Feature encoding:
Feature combinations:
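The point of the example: for a sample whose active features are one publisher, one advertiser and one gender value, FFM forms the same three pairs as FM, but each pair reads field-specific latent vectors. A sketch of the enumeration (the feature names are hypothetical):

```python
from itertools import combinations

# (feature name, field it belongs to) for the three active features
active = [("publisher_2", "publisher"),
          ("advertiser_7", "advertiser"),
          ("gender_male", "gender")]

pairs = []
for (feat_i, field_i), (feat_j, field_j) in combinations(active, 2):
    # FFM pairs feature i's vector for field(j) with feature j's vector for field(i)
    pairs.append("<v[{},{}], v[{},{}]>".format(feat_i, field_j, feat_j, field_i))

print("\n".join(pairs))
```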
1.3: The FFM Formula
The FFM model is

    y(x) = w_0 + \sum_{i=1}^{n} w_i x_i + \sum_{i=1}^{n} \sum_{j=i+1}^{n} \langle v_{i,f_j}, v_{j,f_i} \rangle x_i x_j

where f_j is the field of feature j, so the pair (i, j) interacts through feature i's latent vector for field f_j and feature j's latent vector for field f_i. Implementations almost always use stochastic gradient descent, updating the parameters from each sample in turn according to its gradient.
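A minimal NumPy sketch of the formula above (shapes and names are assumptions for illustration; this is separate from the TensorFlow code in section 2):

```python
import numpy as np

def ffm_predict(x, field, w0, w, V):
    """x: (n,) feature values; field: (n,) field index of each feature;
    w0: bias; w: (n,) first-order weights; V: (n, f, k) latent vectors."""
    n = len(x)
    y = w0 + np.dot(w, x)
    for i in range(n):
        for j in range(i + 1, n):
            # <v_{i, f_j}, v_{j, f_i}> * x_i * x_j
            y += np.dot(V[i, field[j]], V[j, field[i]]) * x[i] * x[j]
    return y

rng = np.random.default_rng(0)
n, f, k = 4, 2, 3
x = rng.integers(0, 2, size=n).astype(float)
field = [0, 0, 1, 1]
y = ffm_predict(x, field, 0.1, rng.normal(size=n), rng.normal(size=(n, f, k)))
print(y)
```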
1.4: Loss Function
This is essentially still the cross-entropy loss, just with labels in {+1, -1} instead of {0, 1}: L(y, y_hat) = log(1 + exp(-y * y_hat)). For the detailed derivation see: https://blog.csdn.net/xxiaobaib/article/details/97692966
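The equivalence is easy to sanity-check numerically: with y in {-1, +1}, log(1 + e^(-y*s)) equals the binary cross-entropy of sigmoid(s) against the corresponding 0/1 label (a small NumPy check, not part of the original derivation):

```python
import numpy as np

def logistic_loss(y_pm1, s):
    """Loss used with +1/-1 labels."""
    return np.log1p(np.exp(-y_pm1 * s))

def bce_loss(y01, s):
    """Cross-entropy of sigmoid(s) against a 0/1 label."""
    p = 1.0 / (1.0 + np.exp(-s))
    return -(y01 * np.log(p) + (1 - y01) * np.log(1 - p))

s = 1.7  # an arbitrary model score
assert np.isclose(logistic_loss(+1, s), bce_loss(1, s))
assert np.isclose(logistic_loss(-1, s), bce_loss(0, s))
```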
2: Implementing FFM in Code
The difference from FM is that the second-order coefficients V are no longer a 2-D matrix but a 3-D tensor of shape [n, f, k]. During training we update the parameters with stochastic gradient descent, slicing vectors out of the 3-D tensor into 2-D matrices to perform the updates.
import tensorflow as tf
import pandas as pd
import numpy as np
import os
input_x_size = 20
field_size = 2
vector_dimension = 3
total_plan_train_steps = 1000
# Use SGD: one gradient-descent update per sample
batch_size = 1
all_data_size = 1000
lr = 0.01
MODEL_SAVE_PATH = "TFModel"
MODEL_NAME = "FFM"
def createTwoDimensionWeight(input_x_size, field_size, vector_dimension):
    """
    Create the 3-D weight tensor [n, f, k] for the pairwise terms
    (despite the name, this is a 3-D variable).
    :param input_x_size: number of features n
    :param field_size: number of fields f
    :param vector_dimension: latent vector length k
    :return:
    """
    weights = tf.truncated_normal([input_x_size, field_size, vector_dimension])
    tf_weights = tf.Variable(weights)
    return tf_weights
def createOneDimensionWeight(input_x_size):
    """
    Create the first-order feature weights.
    :param input_x_size: number of features n
    :return:
    """
    weights = tf.truncated_normal([input_x_size])
    tf_weights = tf.Variable(weights)
    return tf_weights
def createZeroDimensionWeight():
    """
    Create the bias, i.e. w0.
    :return:
    """
    weights = tf.truncated_normal([1])
    tf_weights = tf.Variable(weights)
    return tf_weights
def inference(input_x, input_x_field, zeroWeights, oneDimWeights, thirdWeight):
    """Compute the model's predicted score for one sample."""
    # First-order term: w^T x
    secondValue = tf.reduce_sum(tf.multiply(oneDimWeights, input_x, name='secondValue'))
    # Bias plus first-order term
    firstTwoValue = tf.add(zeroWeights, secondValue, name="firstTwoValue")
    # Accumulate the pairwise term as a graph tensor. (The original code used
    # tf.assign on a Variable, but those assign ops were never run, so the
    # accumulation silently stayed at zero.)
    thirdValue = tf.constant(0.0, dtype=tf.float32)
    input_shape = input_x_size
    for i in range(input_shape):
        featureIndex1 = i
        fieldIndex1 = int(input_x_field[i])
        for j in range(i + 1, input_shape):
            featureIndex2 = j
            fieldIndex2 = int(input_x_field[j])
            # v_{i, f_j}: latent vector of feature i for the field of feature j
            vectorLeft = tf.convert_to_tensor(
                [[featureIndex1, fieldIndex2, k] for k in range(vector_dimension)])
            weightLeft = tf.gather_nd(thirdWeight, vectorLeft)
            weightLeftAfterCut = tf.squeeze(weightLeft)
            # v_{j, f_i}: latent vector of feature j for the field of feature i
            vectorRight = tf.convert_to_tensor(
                [[featureIndex2, fieldIndex1, k] for k in range(vector_dimension)])
            weightRight = tf.gather_nd(thirdWeight, vectorRight)
            weightRightAfterCut = tf.squeeze(weightRight)
            # Inner product <v_{i,f_j}, v_{j,f_i}>
            tempValue = tf.reduce_sum(tf.multiply(weightLeftAfterCut, weightRightAfterCut))
            xi = tf.squeeze(tf.gather_nd(input_x, [i]))
            xj = tf.squeeze(tf.gather_nd(input_x, [j]))
            product = tf.multiply(xi, xj)
            secondItemVal = tf.multiply(tempValue, product)
            thirdValue = tf.add(thirdValue, secondItemVal)
    return tf.add(firstTwoValue, thirdValue)
def gen_data():
    """
    Generate random training data: features, +1/-1 labels and field ids.
    :return:
    """
    labels = [-1, 1]
    y = [np.random.choice(labels, 1)[0] for _ in range(all_data_size)]
    # Features 0-9 belong to field 0, features 10-19 to field 1
    x_field = [i // 10 for i in range(input_x_size)]
    x = np.random.randint(0, 2, size=(all_data_size, input_x_size))
    return x, y, x_field
if __name__ == '__main__':
    global_step = tf.Variable(0, trainable=False)
    trainx, trainy, trainx_field = gen_data()

    input_x = tf.placeholder(tf.float32, [input_x_size])
    input_y = tf.placeholder(tf.float32)

    lambda_w = tf.constant(0.001, name='lambda_w')
    lambda_v = tf.constant(0.001, name='lambda_v')

    zeroWeights = createZeroDimensionWeight()
    oneDimWeights = createOneDimensionWeight(input_x_size)
    thirdWeight = createTwoDimensionWeight(input_x_size,  # create the pairwise weights
                                           field_size,
                                           vector_dimension)  # n * f * k

    y_ = inference(input_x, trainx_field, zeroWeights, oneDimWeights, thirdWeight)

    # L2 regularization on the first-order weights and the latent vectors
    l2_norm = tf.reduce_sum(
        tf.add(
            tf.multiply(lambda_w, tf.pow(oneDimWeights, 2)),
            tf.reduce_sum(tf.multiply(lambda_v, tf.pow(thirdWeight, 2)), axis=[1, 2])
        )
    )
    # Logistic loss for +1/-1 labels: log(1 + exp(-y * y_hat)).
    # Note the minus sign; without it the optimizer would push y * y_hat down.
    loss = tf.log(1 + tf.exp(-input_y * y_)) + l2_norm

    # Pass global_step so the step counter actually increments
    train_step = tf.train.GradientDescentOptimizer(learning_rate=lr).minimize(
        loss, global_step=global_step)

    saver = tf.train.Saver()
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        for i in range(total_plan_train_steps):
            for t in range(all_data_size):
                input_x_batch = trainx[t]
                input_y_batch = trainy[t]
                predict_loss, _, steps = sess.run(
                    [loss, train_step, global_step],
                    feed_dict={input_x: input_x_batch, input_y: input_y_batch})
                print("After {step} training step(s), loss on training batch is {predict_loss}"
                      .format(step=steps, predict_loss=predict_loss))
                saver.save(sess, os.path.join(MODEL_SAVE_PATH, MODEL_NAME),
                           global_step=steps)
        writer = tf.summary.FileWriter(os.path.join(MODEL_SAVE_PATH, MODEL_NAME),
                                       tf.get_default_graph())
        writer.close()