# Paper Reading: "Wide & Deep Learning for Recommender Systems"

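This is the deep neural network recommender system used by Google Play. The main idea is to combine Memorization (Wide Model) with Generalization (Deep Model) so that each compensates for the other's weaknesses. See the paper: Wide & Deep Learning for Recommender Systems.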

# Overview of System

• User features
e.g., country, language, demographics
• Contextual features
e.g., device, hour of the day, day of the week
• Impression features
e.g., app age, historical statistics of an app

Wide & Deep Learning (WDL) is used in the ranking system.

# Wide and Deep Learning

## Wide Model

Memorization can be loosely defined as learning the frequent co-occurrence of items or features and exploiting the correlation available in the historical data.

The linear model is familiar to everyone:

$y={w}^{T}x+b$

$x = [x_1, x_2, \dots, x_d]$ is a vector of d features, $w = [w_1, w_2, \dots, w_d]$ are the model parameters, and b is the bias. The features include the raw input features as well as cross-product transformation features. The cross-product transformation is defined as:

$\phi_k(x) = \prod_{i=1}^{d} x_i^{c_{ki}}, \quad c_{ki} \in \{0, 1\}$

$c_{ki}$ is a boolean variable: it is 1 if the i-th feature is part of the k-th transformation $\phi_k$, and 0 otherwise. Its purpose, in the paper's words:

This captures the interactions between the binary features, and adds nonlinearity to the generalized linear model.
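As a rough illustration of the cross-product transformation, the sketch below evaluates $\phi_k(x)$ for binary features. The feature layout and the two masks are hypothetical, chosen only to show that the product is 1 exactly when every selected feature is 1.

```python
import numpy as np

def cross_product(x, c_k):
    """phi_k(x) = prod_i x_i ** c_ki for binary features x and a 0/1 mask c_k."""
    x = np.asarray(x)
    c_k = np.asarray(c_k)
    # x_i ** 0 == 1, so features outside the transformation do not affect the product.
    return int(np.prod(x ** c_k))

# Hypothetical one-hot layout: [gender=female, gender=male, language=en, language=zh]
x = [1, 0, 1, 0]
and_female_en = [1, 0, 1, 0]  # AND(gender=female, language=en)
and_male_en = [0, 1, 1, 0]    # AND(gender=male, language=en)

print(cross_product(x, and_female_en))
print(cross_product(x, and_male_en))
```

For this input the first transformation fires and the second does not, which is the "memorized" co-occurrence signal the wide model then weights linearly.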

## Deep Model

Generalization is based on transitivity of correlation and explores new feature combinations that have never or rarely occurred in the past.

${a}^{\left(l+1\right)}=f\left({W}^{\left(l\right)}{a}^{\left(l\right)}+{b}^{\left(l\right)}\right)$

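f is the activation function (usually ReLU) and l is the layer index; $a^{(l)}$, $b^{(l)}$, and $W^{(l)}$ are the activations, bias, and weights of the l-th layer.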
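The layer recurrence can be sketched in a few lines of NumPy. This is a minimal forward pass under assumed toy sizes (a 4-dimensional input and hidden widths 3 and 2); the random weights are placeholders, not values from the paper.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def forward(a0, weights, biases):
    """Apply a^(l+1) = f(W^(l) a^(l) + b^(l)) layer by layer, with f = ReLU."""
    a = np.asarray(a0, dtype=float)
    for W, b in zip(weights, biases):
        a = relu(W @ a + b)
    return a

# Assumed toy shapes: input of size 4, then hidden layers of width 3 and 2.
rng = np.random.default_rng(0)
weights = [rng.normal(size=(3, 4)), rng.normal(size=(2, 3))]
biases = [np.zeros(3), np.zeros(2)]
a_final = forward(np.ones(4), weights, biases)
```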

## Joint Training

Joint Training vs Ensemble

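• Joint Training trains the wide and deep models at the same time; the parameters being optimized include each model's own parameters as well as the weights of their sum
• In an Ensemble, the models are trained separately and independently of each other, and are only combined at prediction time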
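To make the difference concrete, here is a minimal sketch of one joint-training step, assuming a single logistic loss over the summed outputs of both parts. Plain SGD is used purely for brevity, and the deep part is reduced to its final activations `a_deep`, which are held fixed here; in real joint training the same error signal would also be backpropagated into the hidden layers. All names and values are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def joint_step(x_wide, a_deep, y, w_wide, w_deep, b, lr=0.1):
    """One plain-SGD step on the shared logistic loss; returns updated params and p."""
    # The weighted sum of both components' outputs forms a single logit.
    p = sigmoid(w_wide @ x_wide + w_deep @ a_deep + b)
    # d(log loss)/d(logit) = p - y; the same error signal reaches both parts.
    err = p - y
    w_wide = w_wide - lr * err * x_wide
    w_deep = w_deep - lr * err * a_deep
    b = b - lr * err
    return w_wide, w_deep, b, p

x_wide = np.array([1.0, 0.0, 1.0])  # raw + cross-product features (toy values)
a_deep = np.array([0.5, 2.0])       # final hidden activations (toy values)
w_wide, w_deep, b = np.zeros(3), np.zeros(2), 0.0
w_wide, w_deep, b, p = joint_step(x_wide, a_deep, 1.0, w_wide, w_deep, b)
```

In an ensemble, by contrast, each model would have its own loss and its own update, and the two predictions would only be merged afterwards.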

# System Implementation

The pipeline is shown in the figure below.

## Data Generation

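Label: the criterion is app acquisition: 1 if the user installed the app, 0 otherwise.
Vocabularies: categorical features are mapped to integer IDs. Continuous real-valued features are first normalized to [0, 1] using the cumulative distribution function (CDF), then discretized into buckets.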

Continuous real-valued features are normalized to [0, 1] by mapping a feature value x to its cumulative distribution function P(X ≤ x), divided into $n_q$ quantiles. The normalized value is $\frac{i-1}{n_q-1}$ for values in the i-th quantiles.
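The quantile normalization can be sketched as follows. This is an illustrative version that estimates the quantile boundaries from the same sample it normalizes; the function name and the sample values are made up, and a production pipeline would presumably compute the boundaries once during data generation.

```python
import numpy as np

def quantile_normalize(values, n_q):
    """Map each value to (i - 1) / (n_q - 1), where i is its 1-based quantile index."""
    values = np.asarray(values, dtype=float)
    # Interior boundaries that split the empirical distribution into n_q quantiles.
    boundaries = np.quantile(values, np.linspace(0, 1, n_q + 1)[1:-1])
    # searchsorted yields a 0-based bucket index, which equals i - 1.
    idx = np.searchsorted(boundaries, values, side="right")
    return idx / (n_q - 1)

normalized = quantile_normalize([3, 10, 25, 40, 70, 90, 120, 300], n_q=4)
print(normalized)
```

With $n_q = 4$ the outputs can only take the values 0, 1/3, 2/3, and 1, so the heavy tail (300 in this sample) no longer dominates the feature's scale.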