FiBiNET: Combining Feature Importance and Bilinear Feature Interaction for CTR Prediction

Welcome to follow the WeChat official account python科技园 and learn algorithms together.

Part 1: Theory

0. The FiBiNET model architecture

FiBiNET, short for "FiBiNET: Combining Feature Importance and Bilinear feature Interaction for Click-Through Rate Prediction", is a deep-learning-based ad recommendation / click-through-rate prediction model proposed by Sina Weibo. It can be seen as an innovative upgrade of the wide part of the Wide & Deep model, or directly as a variant of FNN. Its main innovations are:

  • A SENET layer is added after the traditional embedding stage, upgrading the embeddings into a new set of embeddings that carry feature-importance information;
  • Instead of the traditional inner product or Hadamard product, a new bilinear interaction method that combines the two is used to capture the relationships between features.

The overall architecture of the model is shown in Figure (1):

As Figure (1) shows, compared with the familiar deep-learning CTR models, the two main changes are the newly added SENET Layer and the redesigned Bilinear-Interaction Layer. A data-flow diagram of the model is reproduced here:

(figure: model data-flow diagram)

The SENET Layer and the Bilinear-Interaction Layer are introduced below.

 

1. SENET Layer

SENET, short for Squeeze-and-Excitation Network, is widely used in computer vision to capture dependencies between features. It consists of three steps: Squeeze, Excitation, and Re-Weight. Applied in order, they transform the original embedding matrix E into the re-weighted V, as shown in Figure (2).

(1.1) Squeeze

This step summarizes each field's embedding vector into a single statistic. The paper uses mean pooling (max pooling is also possible, but the paper reports that mean pooling works better) to compress the embeddings E=[e_1, e_2, ..., e_f] into Z=[z_1, z_2, ..., z_f], where the scalar z_i carries the global information of the i-th field. Concretely: z_i = F_{sq}(e_i) = \frac{1}{k} \sum^{k}_{t=1} e^{(t)}_{i}

For example:

E = [e_1, e_2, e_3, e_4] = \begin{bmatrix} [0.1, 0.2, 0.4] \\ [0.3, 0.2, 0.7] \\ [1.3, 0.1, 2.6] \\ [1.0, 1.3, 5.1] \end{bmatrix} ^ {T}; after mean pooling this becomes: Z = [z_1, z_2, z_3, z_4] = \begin{bmatrix} sum(0.1, 0.2, 0.4) / 3 \\ sum(0.3, 0.2, 0.7) / 3 \\ sum(1.3, 0.1, 2.6) / 3 \\ sum(1.0, 1.3, 5.1) / 3 \end{bmatrix} ^ {T} = [0.23, 0.40, 1.33, 2.47]

In this example f = 4, and each field embedding has dimension 3, i.e. k = 3.
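A minimal numpy sketch (mine, not from the paper) reproduces this pooling; E and Z mirror the notation above:

import numpy as np

# E: f = 4 field embeddings, each of dimension k = 3
E = np.array([[0.1, 0.2, 0.4],
              [0.3, 0.2, 0.7],
              [1.3, 0.1, 2.6],
              [1.0, 1.3, 5.1]])

# Squeeze: mean-pool each field embedding to a scalar
Z = E.mean(axis=1)
print(np.round(Z, 2))  # [0.23 0.4  1.33 2.47]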

(1.2) Excitation

This step learns per-field importance weights from the compressed statistics Z=[z_1, z_2, ..., z_f], using a two-layer neural network: the first layer reduces the dimension, and the second restores it.

Expressed as a formula: A = F_{ex}(Z) = \sigma_2 (W_2 \sigma_1(W_1 Z))

where A \in R^{f} is a vector of the same form as Z; \sigma_1, \sigma_2 are activation functions; and W_1 \in R^{f \times \frac{f}{r}}, W_2 \in R^{\frac{f}{r} \times f}, with r the reduction ratio.

Continuing the example:

A = F_{ex}(Z) = Z \cdot W_1 \cdot W_2 \\ = [0.23, 0.40, 1.33, 2.47] * \begin{bmatrix} [1, 2] \\ [1, 1] \\ [3, 7] \\ [5, 10] \end{bmatrix} * \begin{bmatrix} [1, 1, 2, 2] \\ [1, 1, 1, 1] \end{bmatrix} \\ = [0.23*1 + 0.40*1 + 1.33*3 + 2.47*5, 0.23*2 + 0.40*1 + 1.33*7 + 2.47*10] * \begin{bmatrix} [1, 1, 2, 2] \\ [1, 1, 1, 1] \end{bmatrix} \\ = [16.97, 34.87] * \begin{bmatrix} [1, 1, 2, 2] \\ [1, 1, 1, 1] \end{bmatrix} \\ = [16.97*1 + 34.87*1, 16.97*1 + 34.87*1, 16.97*2 + 34.87*1, 16.97*2 + 34.87*1] \\ = [51.84, 51.84, 68.81, 68.81]

In this example the activation functions are omitted for now, and r = 2 (so W_1 is 4×2 and W_2 is 2×4).
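The same arithmetic can be checked with a short numpy sketch (activations omitted, and the rounded Z from the Squeeze step reused so the numbers match the example):

import numpy as np

Z = np.array([0.23, 0.40, 1.33, 2.47])                    # rounded Squeeze output
W1 = np.array([[1., 2.], [1., 1.], [3., 7.], [5., 10.]])  # (f, f/r) = (4, 2)
W2 = np.array([[1., 1., 2., 2.], [1., 1., 1., 1.]])       # (f/r, f) = (2, 4)

# Excitation without activations: A = Z W1 W2
A = Z @ W1 @ W2
print(np.round(A, 2))  # [51.84 51.84 68.81 68.81]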

(1.3) Re-Weight

The final step multiplies A and E element-wise over the f fields, in the style of a Hadamard product, to obtain the SENET output V. Viewing A as a weight vector, this step can also be called re-weighting or rescaling. The new embedding matrix V is computed as:

V = f_{ReWeight}(A, E) = [a_1 \cdot e_1, a_2 \cdot e_2, ..., a_f \cdot e_f] = [v_1, v_2, ..., v_f]

Continuing the example:

V = f_{ReWeight}(A, E) = [a_1 \cdot e_1, a_2 \cdot e_2, ..., a_f \cdot e_f] \\ = [51.84, 51.84, 68.81, 68.81] * \begin{bmatrix} [0.1, 0.2, 0.4] \\ [0.3, 0.2, 0.7] \\ [1.3, 0.1, 2.6] \\ [1.0, 1.3, 5.1] \end{bmatrix} \\ = \begin{bmatrix} [0.1*51.84, 0.2*51.84, 0.4*51.84] \\ [0.3*51.84, 0.2*51.84, 0.7*51.84] \\ [1.3*68.81, 0.1*68.81, 2.6*68.81] \\ [1.0*68.81, 1.3*68.81, 5.1*68.81] \end{bmatrix} ^ {T} \\ = \begin{bmatrix} [5.184, 10.368, 20.736] \\ [15.552, 10.368, 36.288] \\ [89.453, 6.881, 178.906] \\ [68.810, 89.453, 350.931] \end{bmatrix} ^ {T}
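The Re-Weight step, again as a small numpy sketch continuing the same numbers:

import numpy as np

A = np.array([51.84, 51.84, 68.81, 68.81])
E = np.array([[0.1, 0.2, 0.4],
              [0.3, 0.2, 0.7],
              [1.3, 0.1, 2.6],
              [1.0, 1.3, 5.1]])

# Re-Weight: scale each field embedding e_i by its weight a_i
V = A[:, None] * E
print(np.round(V, 3))
# [[  5.184  10.368  20.736]
#  [ 15.552  10.368  36.288]
#  [ 89.453   6.881 178.906]
#  [ 68.81   89.453 350.931]]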

 

2. Bilinear-Interaction Layer

Traditional feature-interaction layers rely on the inner product (FM, FFM, etc.) or the Hadamard product (AFM, NFM, etc.). Neither models feature interactions well on sparse data. The paper therefore combines the two and introduces an extra parameter matrix W to learn the interactions.

Inner product: [a_1, a_2, ...,a_n]\cdot [b_1, b_2, ..., b_n] = \sum^{n}_{i=1}a_ib_i

Hadamard product: [a_1, a_2, ...,a_n] \odot [b_1, b_2, ..., b_n] = [a_1b_1,a_2b_2, ..., a_nb_n]

Combined: p_{i,j} = v_i \cdot W \odot v_j, where p_{i,j} \in R^{k}.

The interaction vector p_{i,j} can be computed in any of the following three ways:

(2.1) Field-All Type: p_{i,j}=v_i \cdot W \odot v_j. All field pairs share a single parameter matrix W; the extra parameter count is k \times k.

(2.2) Field-Each Type: p_{i,j}=v_i \cdot W_i \odot v_j. Each field i maintains its own parameter matrix W_i; the extra parameter count is f \times k \times k.

(2.3) Field-Interaction Type: p_{i,j}=v_i \cdot W_{ij} \odot v_j. Each interacting pair p_{i,j} has its own parameter matrix W_{ij}; the extra parameter count is \frac{f \times(f-1)}{2} \times k \times k.
The Bilinear-Interaction Layer produces:

(1) E passes through the bilinear function to give a vector p = [p_1, p_2, ..., p_n] containing the feature interactions, where n=f \times (f-1) /2; each p_i \in R^k has the same form as e_i and v_i, so p \in R^{n \times k};

(2) V passes through the bilinear function to give a vector q = [q_1, q_2, ..., q_n] containing the feature interactions, where n=f \times (f-1) /2; each q_i \in R^k has the same form as e_i and v_i, so q \in R^{n \times k}.
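To make the shapes concrete, here is a numpy sketch (made-up weights, Field-All type); q is obtained from V in exactly the same way p is obtained from E:

import itertools
import numpy as np

f, k = 4, 3
rng = np.random.default_rng(0)
E = rng.random((f, k))   # f field embeddings of dimension k
W = rng.random((k, k))   # one shared parameter matrix (Field-All type)

# p_{i,j} = (v_i . W) * v_j (element-wise), for every pair i < j
p = np.array([(E[i] @ W) * E[j]
              for i, j in itertools.combinations(range(f), 2)])
print(p.shape)  # (6, 3): n = f*(f-1)/2 = 6 interaction vectors, each in R^k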

 

3. Combination Layer

In the Combination Layer, the outputs p and q from (1) and (2) are simply concatenated: c = [p, q] = [c_1, c_2, ..., c_{2n}], where c \in R^{2n \times k}.

 

4. Deep Network

Finally, c is fed into a multi-layer fully-connected network, the usual DNN, to obtain the final output.

where a^{(0)} = [c_1, c_2, ..., c_{2n}] is the input layer and a^{(l)} = \sigma (W^{(l)}a^{(l-1)} + b^{(l)}) is the l-th hidden layer.
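A rough numpy sketch of the Combination Layer plus the DNN forward pass (the weights here are random placeholders; the final sigmoid matches the binary CTR task):

import numpy as np

rng = np.random.default_rng(0)
n, k = 6, 3
p = rng.random((n, k))               # bilinear output of E
q = rng.random((n, k))               # bilinear output of V

a = np.concatenate([p, q]).ravel()   # c = [p, q], flattened to length 2nk
for units in (128, 128):             # two hidden layers
    W = rng.standard_normal((a.size, units)) * 0.01
    a = np.maximum(a @ W, 0.)        # ReLU; bias omitted for brevity
w_out = rng.standard_normal(a.size) * 0.01
y_hat = 1. / (1. + np.exp(-(a @ w_out)))   # predicted click probability
print(y_hat)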

 

Part 2: Code Walkthrough

1. The senet function

import numpy as np
import tensorflow as tf  # TensorFlow 1.x style API (InteractiveSession is used below)

def senet(inputs):
    
    # Squeeze: mean-pool each field embedding down to a scalar
    Z = tf.reduce_mean(inputs, axis=-1)
    
    # Excitation: two matrix multiplications with fixed demo weights
    # (activations and the Re-Weight step are omitted in this demo)
    w1 = np.array([[1., 1.], [2., 2.], [1., 1.]])  # (f, f/r) = (3, 2)
    dot1 = tf.tensordot(Z, w1, axes=(-1, 0))
    
    w2 = np.array([[1., 1., 2], [2., 2., 6]])      # (f/r, f) = (2, 3)
    dot2 = tf.tensordot(dot1, w2, axes=(-1, 0))
    
    return dot2

# Two samples, each with f = 3 field embeddings of dimension k = 4
inputs = np.array([[[1., 1., 1., 1.], [2., 2., 2., 2.], [1., 1., 1., 2.]],
                   [[1., 1., 1., 1.], [2., 2., 2., 2.], [1., 2., 1., 2.]]])

print("inputs_shape: {} \n".format(inputs.shape))

senet_result = senet(inputs)
sess = tf.InteractiveSession()
senet_result_sess = sess.run(senet_result)
print("senet_result_sess: \n", senet_result_sess)


"""
inputs_shape: (2, 3, 4) 

senet_result_sess: 
 [[18.75 18.75 50.  ]
 [19.5  19.5  52.  ]]
"""

2. The bilinear_interaction function

import itertools
import numpy as np

def bilinear_interaction(inputs, bilinear_type):
    print("bilinear_type =", bilinear_type)
    
    if bilinear_type == "all":  # one W shared by every field pair
        W = np.array([[1., 1., 1., 1.], [2., 2., 1., 1.], [1., 1., 2., 2.], [3., 3., 1., 1.]])
        print("W:\n", W)
        p = [tf.multiply(tf.tensordot(v_i, W, axes=(-1, 0)), v_j) for v_i, v_j in itertools.combinations(inputs, 2)]
        
    elif bilinear_type == "each":  # 3 field vectors in this demo, one W per field
        W_list = np.array([[[1., 1., 1., 1.], [2., 2., 1., 1.], [1., 1., 2., 2.], [3., 3., 1., 1.]],
                           [[1., 1., 1., 1.], [2., 2., 2., 2.], [1., 1., 2., 2.], [3., 3., 1., 1.]],
                           [[1., 1., 1., 1.], [2., 2., 1., 1.], [2., 2., 2., 2.], [3., 3., 1., 1.]]])
        print("W_list:\n", W_list)
        p = [tf.multiply(tf.tensordot(inputs[i], W_list[i], axes=(-1, 0)), inputs[j]) for i, j in itertools.combinations(range(len(inputs)), 2)]
        
    elif bilinear_type == "interaction":  # 3 fields give 3 pairwise combinations, one W per pair
        W_list = np.array([[[1., 1., 1., 1.], [2., 2., 1., 1.], [1., 1., 2., 2.], [3., 3., 1., 1.]],
                           [[1., 1., 1., 1.], [2., 2., 1., 1.], [1., 1., 2., 2.], [3., 3., 1., 1.]],
                           [[1., 1., 1., 1.], [2., 2., 1., 1.], [1., 1., 2., 2.], [3., 3., 1., 1.]]])
        print("W_list:\n", W_list)
        p = [tf.multiply(tf.tensordot(v[0], w, axes=(-1, 0)), v[1]) for v, w in zip(itertools.combinations(inputs, 2), W_list)]
        
    return p

(1) "all" mode

inputs = np.array([[1., 1., 1., 1.], [2., 2., 2., 2.], [1., 1., 1., 2.]])  # 3 field vectors: pairwise interaction gives 3 combinations
print("input: ", inputs)
print()
p = bilinear_interaction(inputs, bilinear_type="all")
print()

sess = tf.InteractiveSession()
p_sess = sess.run(p)
print(p_sess)

"""
input:  [[1. 1. 1. 1.]
 [2. 2. 2. 2.]
 [1. 1. 1. 2.]]

bilinear_type = all
W:
 [[1. 1. 1. 1.]
 [2. 2. 1. 1.]
 [1. 1. 2. 2.]
 [3. 3. 1. 1.]]

[array([14., 14., 10., 10.]), array([ 7.,  7.,  5., 10.]), array([14., 14., 10., 20.])]
"""

(2) "each" mode

inputs = np.array([[1., 1., 1., 1.], [2., 2., 2., 2.], [1., 1., 1., 2.]])  # 3 field vectors: pairwise interaction gives 3 combinations
print("input: ", inputs)
print()
p = bilinear_interaction(inputs, bilinear_type="each")
print()

sess = tf.InteractiveSession()
p_sess = sess.run(p)
print(p_sess)

"""
input:  [[1. 1. 1. 1.]
 [2. 2. 2. 2.]
 [1. 1. 1. 2.]]

bilinear_type = each
W_list:
 [[[1. 1. 1. 1.]
  [2. 2. 1. 1.]
  [1. 1. 2. 2.]
  [3. 3. 1. 1.]]

 [[1. 1. 1. 1.]
  [2. 2. 2. 2.]
  [1. 1. 2. 2.]
  [3. 3. 1. 1.]]

 [[1. 1. 1. 1.]
  [2. 2. 1. 1.]
  [2. 2. 2. 2.]
  [3. 3. 1. 1.]]]

[array([14., 14., 10., 10.]), array([ 7.,  7.,  5., 10.]), array([14., 14., 12., 24.])]
"""

(3) "interaction" mode

inputs = np.array([[1., 1., 1., 1.], [2., 2., 2., 2.], [1., 1., 1., 2.]])  # 3 field vectors: pairwise interaction gives 3 combinations
print("input: ", inputs)
print()
p = bilinear_interaction(inputs, bilinear_type="interaction")
print()

sess = tf.InteractiveSession()
p_sess = sess.run(p)
print(p_sess)

"""
input:  [[1. 1. 1. 1.]
 [2. 2. 2. 2.]
 [1. 1. 1. 2.]]

bilinear_type = interaction
W_list:
 [[[1. 1. 1. 1.]
  [2. 2. 1. 1.]
  [1. 1. 2. 2.]
  [3. 3. 1. 1.]]

 [[1. 1. 1. 1.]
  [2. 2. 1. 1.]
  [1. 1. 2. 2.]
  [3. 3. 1. 1.]]

 [[1. 1. 1. 1.]
  [2. 2. 1. 1.]
  [1. 1. 2. 2.]
  [3. 3. 1. 1.]]]

[array([14., 14., 10., 10.]), array([ 7.,  7.,  5., 10.]), array([14., 14., 10., 20.])]
"""

 

Part 3: A Worked Example

1. Preparing the data

The Titanic dataset task is to predict whether a passenger survived the sinking of the Titanic after it struck an iceberg, based on passenger attributes. Structured data like this is usually preprocessed with a Pandas DataFrame.

import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
 
df_data = pd.read_csv('data/train.csv')

Titanic dataset download: https://www.kaggle.com/c/titanic/data

Preprocessing: re-encode the categorical variables, and fill missing values of the numeric variables with 0.

# Re-encode categorical variables as consecutive integer ids
# Fill missing values of numeric variables with 0
 
sparse_feature_list = ["Pclass", "Sex", "Cabin", "Embarked"]
dense_feature_list = ["Age", "SibSp", "Parch", "Fare"]

sparse_feature_reindex_dict = {}
for i in sparse_feature_list:
    cur_sparse_feature_list = df_data[i].unique()
    
    sparse_feature_reindex_dict[i] = dict(zip(cur_sparse_feature_list, \
        range(1, len(cur_sparse_feature_list)+1)
                                     )
                                 )
    
    df_data[i] = df_data[i].map(sparse_feature_reindex_dict[i])

for j in dense_feature_list:
    df_data[j] = df_data[j].fillna(0)

 

# Split the dataset

data = df_data[sparse_feature_list + dense_feature_list]
label = df_data["Survived"].values

xtrain, xtest, ytrain, ytest = train_test_split(data, label, test_size=0.2, random_state=2020)

 

xtrain_data = {"Pclass": np.array(xtrain["Pclass"]), \
              "Sex": np.array(xtrain["Sex"]), \
              "Cabin": np.array(xtrain["Cabin"]), \
              "Embarked": np.array(xtrain["Embarked"]), \
              "Age": np.array(xtrain["Age"]), \
              "SibSp": np.array(xtrain["SibSp"]), \
              "Parch": np.array(xtrain["Parch"]), \
              "Fare": np.array(xtrain["Fare"])}
 
xtest_data = {"Pclass": np.array(xtest["Pclass"]), \
              "Sex": np.array(xtest["Sex"]), \
              "Cabin": np.array(xtest["Cabin"]), \
              "Embarked": np.array(xtest["Embarked"]), \
              "Age": np.array(xtest["Age"]), \
              "SibSp": np.array(xtest["SibSp"]), \
              "Parch": np.array(xtest["Parch"]), \
              "Fare": np.array(xtest["Fare"])}

2. Building the model

(2.1) Importing Python modules

import itertools  # needed by the BilinearInteraction layer below
import tensorflow as tf
from tensorflow.python.keras import backend as K
from tensorflow.python.keras.layers import Input, Embedding, \
    Dot, Flatten, Concatenate, Dense
 
from tensorflow.keras.models import Model
from tensorflow.python.keras.layers import Layer
from tensorflow.python.keras.initializers import Zeros, glorot_normal
from tensorflow.python.keras.optimizers import Adam
from tensorflow.python.keras.regularizers import l2

from deepctr.layers.core import PredictionLayer, DNN
from deepctr.layers.utils import Linear
 
from tensorflow.keras.utils import plot_model

(2.2) Defining the input and Embedding layers for categorical variables

def input_embedding_layer(
    shape=1, \
    name=None, \
    vocabulary_size=1, \
    embedding_dim=1):
    
    input_layer = Input(shape=[shape, ], name=name)
    embedding_layer = Embedding(vocabulary_size, embedding_dim)(input_layer)
    
    return input_layer, embedding_layer
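A hypothetical call for the Sex field, just to show the returned shapes (the vocabulary size and embedding dimension here are illustrative only):

# Illustrative usage: an input plus an 8-dimensional embedding for "Sex"
sex_input, sex_embedding = input_embedding_layer(
    shape=1, name="Sex", vocabulary_size=3, embedding_dim=8)
# sex_embedding has shape (None, 1, 8)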

(2.3) Defining the SENETLayer and BilinearInteraction layers

class SENETLayer(Layer):
    """SENETLayer used in FiBiNET.

      Input shape
        - A list of 3D tensors with shape: ``(batch_size, 1, embedding_size)``.

      Output shape
        - A list of 3D tensors with shape: ``(batch_size, 1, embedding_size)``.

      Arguments
        - **reduction_ratio** : Positive integer, the reduction ratio r used to
         shrink the field dimension in the bottleneck layer.

        - **seed** : A Python integer to use as random seed.

      References
        - [FiBiNET: Combining Feature Importance and Bilinear feature Interaction for Click-Through Rate Prediction](https://arxiv.org/pdf/1905.09433.pdf)
    """

    def __init__(self, reduction_ratio=3, seed=1024, **kwargs):
        self.reduction_ratio = reduction_ratio

        self.seed = seed
        super(SENETLayer, self).__init__(**kwargs)

    def build(self, input_shape):

        if not isinstance(input_shape, list) or len(input_shape) < 2:
            raise ValueError('A `SENETLayer` layer should be called '
                             'on a list of at least 2 inputs')

        self.field_size = len(input_shape)
        self.embedding_size = input_shape[0][-1]
        reduction_size = max(1, self.field_size // self.reduction_ratio)

        self.W_1 = self.add_weight(shape=(
            self.field_size, reduction_size), initializer=glorot_normal(seed=self.seed), name="W_1")
        self.W_2 = self.add_weight(shape=(
            reduction_size, self.field_size), initializer=glorot_normal(seed=self.seed), name="W_2")

        self.tensordot = tf.keras.layers.Lambda(
            lambda x: tf.tensordot(x[0], x[1], axes=(-1, 0)))

        # Be sure to call this somewhere!
        super(SENETLayer, self).build(input_shape)

    def call(self, inputs, training=None, **kwargs):

        if K.ndim(inputs[0]) != 3:
            raise ValueError(
                "Unexpected inputs dimensions %d, expect to be 3 dimensions" % (K.ndim(inputs[0])))

        inputs = Concatenate(axis=1)(inputs)
        Z = tf.reduce_mean(inputs, axis=-1, )

        A_1 = tf.nn.relu(self.tensordot([Z, self.W_1]))
        A_2 = tf.nn.relu(self.tensordot([A_1, self.W_2]))
        V = tf.multiply(inputs, tf.expand_dims(A_2, axis=2))

        return tf.split(V, self.field_size, axis=1)

    def compute_output_shape(self, input_shape):

        return input_shape

    def compute_mask(self, inputs, mask=None):
        return [None] * self.field_size

    def get_config(self, ):
        config = {'reduction_ratio': self.reduction_ratio, 'seed': self.seed}
        base_config = super(SENETLayer, self).get_config()
        return dict(list(base_config.items()) + list(config.items()))

class BilinearInteraction(Layer):
    """BilinearInteraction Layer used in FiBiNET.

      Input shape
        - A list of 3D tensors with shape: ``(batch_size, 1, embedding_size)``.

      Output shape
        - 3D tensor with shape: ``(batch_size, 1, field_size * (field_size - 1) / 2 * embedding_size)``.

      Arguments
        - **bilinear_type** : String, the type of bilinear function used in this layer.

        - **seed** : A Python integer to use as random seed.

      References
        - [FiBiNET: Combining Feature Importance and Bilinear feature Interaction for Click-Through Rate Prediction](https://arxiv.org/pdf/1905.09433.pdf)

    """

    def __init__(self, bilinear_type="interaction", seed=1024, **kwargs):
        self.bilinear_type = bilinear_type
        self.seed = seed

        super(BilinearInteraction, self).__init__(**kwargs)

    def build(self, input_shape):

        if not isinstance(input_shape, list) or len(input_shape) < 2:
            raise ValueError('A `BilinearInteraction` layer should be called '
                             'on a list of at least 2 inputs')
        embedding_size = int(input_shape[0][-1])

        if self.bilinear_type == "all":
            self.W = self.add_weight(shape=(embedding_size, embedding_size), initializer=glorot_normal(
                seed=self.seed), name="bilinear_weight")
        elif self.bilinear_type == "each":
            self.W_list = [self.add_weight(shape=(embedding_size, embedding_size), initializer=glorot_normal(
                seed=self.seed), name="bilinear_weight" + str(i)) for i in range(len(input_shape) - 1)]
        elif self.bilinear_type == "interaction":
            self.W_list = [self.add_weight(shape=(embedding_size, embedding_size), initializer=glorot_normal(
                seed=self.seed), name="bilinear_weight" + str(i) + '_' + str(j)) for i, j in
                           itertools.combinations(range(len(input_shape)), 2)]
        else:
            raise NotImplementedError

        super(BilinearInteraction, self).build(
            input_shape)  # Be sure to call this somewhere!

    def call(self, inputs, **kwargs):

        if K.ndim(inputs[0]) != 3:
            raise ValueError(
                "Unexpected inputs dimensions %d, expect to be 3 dimensions" % (K.ndim(inputs[0])))

        if self.bilinear_type == "all":
            p = [tf.multiply(tf.tensordot(v_i, self.W, axes=(-1, 0)), v_j)
                 for v_i, v_j in itertools.combinations(inputs, 2)]
        elif self.bilinear_type == "each":
            p = [tf.multiply(tf.tensordot(inputs[i], self.W_list[i], axes=(-1, 0)), inputs[j])
                 for i, j in itertools.combinations(range(len(inputs)), 2)]
        elif self.bilinear_type == "interaction":
            p = [tf.multiply(tf.tensordot(v[0], w, axes=(-1, 0)), v[1])
                 for v, w in zip(itertools.combinations(inputs, 2), self.W_list)]
        else:
            raise NotImplementedError
        return Concatenate(axis=-1)(p)

    def compute_output_shape(self, input_shape):
        field_size = len(input_shape)
        embedding_size = input_shape[0][-1]

        return (None, 1, field_size * (field_size - 1) // 2 * embedding_size)

    def get_config(self, ):
        config = {'bilinear_type': self.bilinear_type, 'seed': self.seed}
        base_config = super(BilinearInteraction, self).get_config()
        return dict(list(base_config.items()) + list(config.items()))

(2.4) Defining the FiBiNET model architecture

def fibinet(sparse_feature_list, \
        sparse_feature_reindex_dict, \
        dense_feature_list, \
        bilinear_type='interaction', \
        reduction_ratio=3, \
        dnn_hidden_units=(128, 128), \
        l2_reg_linear=1e-5, \
        l2_reg_embedding=1e-5, \
        l2_reg_dnn=0, \
        init_std=0.0001, \
        seed=1024, \
        dnn_dropout=0.3, \
        dnn_activation='relu', \
        task='binary'):
    
    sparse_input_layer_list = []
    sparse_embedding_layer_list = []
    
    dense_input_layer_list = []
 
    
    # 1. Input & Embedding sparse features
    for i in sparse_feature_list:
        shape = 1
        name = i
        vocabulary_size = len(sparse_feature_reindex_dict[i]) + 1
        embedding_dim = 64
        
        cur_sparse_feature_input_layer, cur_sparse_feature_embedding_layer = \
            input_embedding_layer(
                shape = shape, \
                name = name, \
                vocabulary_size = vocabulary_size, \
                embedding_dim = embedding_dim)
        
        sparse_input_layer_list.append(cur_sparse_feature_input_layer)
        sparse_embedding_layer_list.append(cur_sparse_feature_embedding_layer)
 
    
    # 2. Input dense features
    for j in dense_feature_list:
        dense_input_layer_list.append(Input(shape=(1, ), name=j))
    

    
    # === linear part ===
    sparse_linear_input = Concatenate(axis=-1)(sparse_embedding_layer_list)
    dense_linear_input = Concatenate(axis=-1)(dense_input_layer_list)
    linear_logit = Linear()([sparse_linear_input, dense_linear_input])

    
    # === fibinet part ===
    senet_embedding_list = SENETLayer(reduction_ratio, seed)(sparse_embedding_layer_list)
    senet_bilinear_out = BilinearInteraction(bilinear_type=bilinear_type, seed=seed)(senet_embedding_list)
    bilinear_out = BilinearInteraction(bilinear_type=bilinear_type, seed=seed)(sparse_embedding_layer_list)

    
    dnn_input = Concatenate(axis=-1)(
                    [Flatten()(Concatenate(axis=-1)([senet_bilinear_out, bilinear_out])), \
                    dense_linear_input] \
                )
    dnn_output = DNN(dnn_hidden_units, dnn_activation, l2_reg_dnn, dnn_dropout, False, seed)(dnn_input)
    dnn_logit = tf.keras.layers.Dense(1, use_bias=False, activation=None)(dnn_output)
 
    
    # === output ===
    out = PredictionLayer(task)(tf.keras.layers.add([linear_logit, dnn_logit]))
    fibinet_model = Model(inputs = sparse_input_layer_list + dense_input_layer_list, outputs=out)
    
    return fibinet_model

(2.5) Instantiating the FiBiNET model

fibinet_model = fibinet(sparse_feature_list, \
              sparse_feature_reindex_dict, \
              dense_feature_list)

(2.6) Printing the model summary

fibinet_model.summary()

(2.7) Exporting the model architecture diagram

plot_model(fibinet_model, to_file='fibinet_model.png')

(2.8) Compiling and training the model

fibinet_model.compile(loss='binary_crossentropy', \
        optimizer=Adam(lr=1e-3), \
        metrics=['accuracy'])
 
history = fibinet_model.fit(xtrain_data, ytrain, epochs=5, batch_size=32, validation_data=(xtest_data, ytest))
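After training, a natural extra check (not in the original post) is the held-out AUC, which is more informative than accuracy for CTR-style ranking:

from sklearn.metrics import roc_auc_score

# Score the held-out split and measure ranking quality
ytest_pred = fibinet_model.predict(xtest_data)
print("test AUC: %.4f" % roc_auc_score(ytest, ytest_pred))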

(2.9) Plotting the loss curves

import matplotlib.pyplot as plt
 
loss = history.history['loss']
val_loss = history.history['val_loss']
epochs = range(1, len(loss) + 1)
plt.figure()
plt.plot(epochs, loss, 'bo', label='Training loss')
plt.plot(epochs, val_loss, 'b', label='Validation loss')
plt.title('Training and validation loss')
plt.legend()
plt.show()

 

Final note:

For hands-on tuning experience with this model, see the blog post "FiBiNET: paper reading + 实践调优经验".

 

References:

[1] Huang, Tongwen, Zhiqi Zhang, and Junlin Zhang. "FiBiNET: Combining Feature Importance and Bilinear feature Interaction for Click-Through Rate Prediction." arXiv preprint arXiv:1905.09433 (2019).

[2] Cheng, Heng-Tze, et al. "Wide & deep learning for recommender systems." Proceedings of the 1st Workshop on Deep Learning for Recommender Systems. ACM, 2016.

[3] Zhang, Weinan, Tianming Du, and Jun Wang. "Deep learning over multi-field categorical data." European Conference on Information Retrieval. Springer, Cham, 2016.

[4] Hu, Jie, Li Shen, and Gang Sun. "Squeeze-and-excitation networks." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018.

[5] Blog: FiBiNET: paper reading + 实践调优经验

[6] Blog: FiBiNET:结合特征重要性和双线性特征交互进行CTR预估
