DeepFM Explained and Implemented

FM (Factorization Machines)

Traditional LR is a linear model. To capture nonlinear relationships, you either use GBDT to extract nonlinear features or construct nonlinear features by hand.

FM instead constructs the cross features explicitly, directly modeling the second-order interactions.

The formula is:
$$y(\mathbf{x}) = w_0 + \sum_{i=1}^n w_i x_i + \sum_{i=1}^n \sum_{j=i+1}^n w_{ij}\, x_i x_j$$
where $w_{ij}$ are the second-order interaction parameters; there are $\frac{n(n-1)}{2}$ of them, so the complexity is $O(n^2)$.
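To make that cost concrete, here is a minimal numpy sketch of the naive second-order term (the sizes and values are chosen arbitrarily for illustration); it enumerates all $\frac{n(n-1)}{2}$ feature pairs, which is exactly where the $O(n^2)$ parameter count and runtime come from:

import numpy as np

n = 5
rng = np.random.default_rng(0)
x = rng.random(n)          # feature vector x_1..x_n
W = rng.random((n, n))     # pairwise weights w_ij (only the i < j entries are used)

second_order = sum(W[i, j] * x[i] * x[j]
                   for i in range(n) for j in range(i + 1, n))
print(second_order)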

To reduce this complexity, matrix factorization offers a way out: replace each $w_{ij}$ with $\langle \mathbf{v}_i, \mathbf{v}_j \rangle$, which gives:
$$y(\mathbf{x}) = w_0 + \sum_{i=1}^n w_i x_i + \sum_{i=1}^n \sum_{j=i+1}^n \langle \mathbf{v}_i, \mathbf{v}_j \rangle\, x_i x_j$$
Here $\mathbf{v}_i$ is the latent vector of the $i$-th feature and $\langle\cdot,\cdot\rangle$ denotes the dot product. Each latent vector has length $k$ (with $k \ll n$), i.e. $k$ factors describing the feature.
At first glance the complexity of FM is $O(kn^2)$, but with some algebraic manipulation it can be reduced to $O(kn)$. The derivation is as follows:
$$
\begin{aligned}
& \sum_{i=1}^n \sum_{j=i+1}^n \langle \mathbf{v}_i, \mathbf{v}_j \rangle x_i x_j \\
=\ & \frac{1}{2} \sum_{i=1}^n \sum_{j=1}^n \langle \mathbf{v}_i, \mathbf{v}_j \rangle x_i x_j - \frac{1}{2} \sum_{i=1}^n \langle \mathbf{v}_i, \mathbf{v}_i \rangle x_i x_i \\
=\ & \frac{1}{2} \left( \sum_{i=1}^n \sum_{j=1}^n \sum_{f=1}^k v_{i,f}\, v_{j,f}\, x_i x_j - \sum_{i=1}^n \sum_{f=1}^k v_{i,f}\, v_{i,f}\, x_i x_i \right) \\
=\ & \frac{1}{2} \sum_{f=1}^k \left( \left(\sum_{i=1}^n v_{i,f}\, x_i \right) \left( \sum_{j=1}^n v_{j,f}\, x_j \right) - \sum_{i=1}^n v_{i,f}^2 x_i^2 \right) \\
=\ & \frac{1}{2} \sum_{f=1}^k \left( \left(\sum_{i=1}^n v_{i,f}\, x_i \right)^2 - \sum_{i=1}^n v_{i,f}^2 x_i^2 \right)
\end{aligned}
$$
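The rewrite is easy to check numerically. The small numpy sketch below (dimensions chosen arbitrarily) verifies that the $O(kn)$ form on the last line matches the naive double sum over $i < j$:

import numpy as np

n, k = 6, 4
rng = np.random.default_rng(1)
x = rng.random(n)
V = rng.random((n, k))             # latent vectors, one row v_i per feature

naive = sum((V[i] @ V[j]) * x[i] * x[j]
            for i in range(n) for j in range(i + 1, n))

vx = V * x[:, None]                # elementwise v_{i,f} * x_i
fast = 0.5 * np.sum(np.sum(vx, axis=0) ** 2 - np.sum(vx ** 2, axis=0))

print(np.allclose(naive, fast))    # True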

DeepFM

Paper [IJCAI 2017]:
DeepFM: A Factorization-Machine based Neural Network for CTR Prediction

Network architecture

Overall network architecture:

FM component:

Deep component:

Code implementation (version 1)

Data processing

The model takes two inputs: input_idxs and input_values.

  • input_idxs is the sparse encoding: every distinct value of a categorical field gets its own id, while each continuous field is mapped to a single fixed id.
  • input_values holds the feature values: for a categorical field the value is set to 1, for a continuous field the original value is kept.

For example:
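Below is a minimal sketch of the encoding, assuming two continuous fields (age, price) and one categorical field city with three possible values; the field names and id assignments are invented for illustration, but the scheme matches the demo data at the bottom of the code:

# ids 0 and 1 are reserved for the continuous fields "age" and "price";
# the categorical field "city" maps {beijing: 2, shanghai: 3, shenzhen: 4}.
# A sample with age=0.3, price=0.8, city=shanghai is then encoded as:
input_idxs = [0, 1, 3]          # fixed id per continuous field, the value's id for the categorical field
input_values = [0.3, 0.8, 1.0]  # continuous values kept as-is; the categorical slot becomes 1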

Points to note

  • In second_order_part, categorical fields and continuous fields are crossed with each other.
  • In deep_part, the embeddings are multiplied by the feature values and then fed into the Dense layers.

Code

import tensorflow as tf

def dnn(params):
    dnn_model = tf.keras.Sequential()
    for size in params['dnn_hidden_units']:
        dnn_model.add(tf.keras.layers.Dense(size, activation='relu', use_bias=False))
    dnn_model.add(tf.keras.layers.Dense(1, activation=None, use_bias=False))
    return dnn_model

class DeepFM(tf.keras.Model):

    def __init__(self, params):
        '''
        :param params:
            feature_size: total number of distinct encoded feature ids
            factor_size: embedding dimension, the k in the FM formulas
            field_size: number of input fields, the f used in the shape comments
        '''
        super(DeepFM, self).__init__()
        self.params = params
        self.embeddings_1 = tf.keras.layers.Embedding(params['feature_size'], 1)
        self.embeddings_2 = tf.keras.layers.Embedding(params['feature_size'], params['factor_size'],
                                                      embeddings_regularizer=tf.keras.regularizers.l2(0.00001),
                                                      embeddings_initializer=tf.initializers.RandomNormal(
                                                           mean=0.0, stddev=0.0001, seed=1024)
                                                      )
        self.deep_dnn = dnn(params)
        self.dense_output = tf.keras.layers.Dense(params['class_num'], activation=params['last_activation'])

    def first_order_part(self, idxs, values):
        '''Linear (first-order) term: sum over fields of w_i * x_i.
        :return: (n, 1)
        '''
        x = self.embeddings_1(idxs) # (n, f, 1)
        x = tf.multiply(x, tf.expand_dims(values, axis=-1)) # (n, f, 1)
        x = tf.reduce_sum(x, axis=1) # (n, 1)
        return x

    def second_order_part(self, idxs, values):
        '''Pairwise (second-order) term, using 2ab = (a+b)^2 - (a^2+b^2).
        :return: (n, 1)
        '''
        x = self.embeddings_2(idxs) # (n, f, k)
        x = tf.multiply(x, tf.expand_dims(values, axis=-1)) # (n, f, k)
        sum_square = tf.square(tf.reduce_sum(x, axis=1)) # (n, k)
        square_sum = tf.reduce_sum(tf.square(x), axis=1) # (n, k)
        output = 0.5*(tf.subtract(sum_square, square_sum))
        return tf.reduce_sum(output, axis=1, keepdims=True) # (n, 1)

    def deep_part(self, idxs, values):
        '''Deep component: embeddings scaled by the feature values, flattened, then MLP.
        :return: (n, 1)
        '''
        x = self.embeddings_2(idxs)
        x = tf.multiply(x, tf.expand_dims(values, axis=-1))  # (n, f, k)
        x = tf.reshape(x, (-1, self.params['field_size']*self.params['factor_size']))
        x = self.deep_dnn(x)
        return x


    def call(self, idxs, values):
        '''
        :param idxs: (n, f)
        :param values: (n, f)
        :return:
        '''
        first_order_output = self.first_order_part(idxs, values)
        second_order_output = self.second_order_part(idxs, values)
        deep_output = self.deep_part(idxs, values)
        combined_output = tf.concat([first_order_output, second_order_output, deep_output], axis=1)
        output = self.dense_output(combined_output)
        return output


if __name__=='__main__':
    import numpy as np
    params = {
        'field_size':12,
        'feature_size':5+3,
        'factor_size':4,
        'class_num': 1,
        'last_activation': 'sigmoid',
        'dnn_hidden_units': (128, 128)
    }
    print('Generate fake data...')
    x_dense = np.random.random((1000, 5))
    x_sparse = np.random.randint(0, 3, (1000, 7))

    # x_idxs is kept simple here: each continuous field gets its own fixed id (0..4),
    # and the sparse values are offset by 5 so they do not collide with those ids
    dense_idxs = np.zeros(x_dense.shape)
    for i in range(dense_idxs.shape[1]):
        dense_idxs[:, i] = i
    x_idxs = np.concatenate([dense_idxs, x_sparse + 5], axis=1)
    x_values = np.concatenate([x_dense, np.ones(x_sparse.shape)], axis=1)

    x_idxs = tf.convert_to_tensor(x_idxs, dtype=tf.int64)
    x_values = tf.convert_to_tensor(x_values, dtype=tf.float32)

    y = np.random.randint(0, 2, (1000, 1))

    model = DeepFM(params)
    pred = model(x_idxs, x_values)
    print(pred.shape)
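
    # A minimal training sketch, assuming binary cross-entropy on the fake labels
    # above; a custom loop is used because call() takes two separate tensors.
    y = tf.convert_to_tensor(y, dtype=tf.float32)
    loss_fn = tf.keras.losses.BinaryCrossentropy()
    optimizer = tf.keras.optimizers.Adam(learning_rate=1e-3)

    for step in range(5):
        with tf.GradientTape() as tape:
            pred = model(x_idxs, x_values)
            # include the embedding L2 regularization collected by the model
            loss = loss_fn(y, pred) + tf.add_n(model.losses)
        grads = tape.gradient(loss, model.trainable_variables)
        optimizer.apply_gradients(zip(grads, model.trainable_variables))
        print('step', step, 'loss', float(loss))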
