Wide&Deep
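Wide&Deep is a model proposed by Google in 2016 for classification and regression tasks. It jointly trains a wide linear component, responsible for memorization, and a deep neural network component, responsible for generalization.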
Memorization:
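A logistic regression (LR) model that takes a large number of raw features and cross-product features as input and "memorizes" feature pairs that co-occurred in the historical data.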
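The memorization idea can be sketched in plain Python. This is a minimal illustration, not the implementation used later in these notes: the helper names `cross_feature` and `wide_features`, the toy vocabulary, and the weights are all assumptions made up for the example. A crossed feature fires only when every one of its constituent values is present, so the LR model can assign a dedicated weight to that exact pair.

```python
import math

def cross_feature(example, keys):
    """Build one token for the combination of values under `keys` (hypothetical helper)."""
    return "_x_".join(f"{k}={example[k]}" for k in keys)

def wide_features(example, vocabulary, cross_keys):
    """One-hot encode the raw and crossed features against a fixed vocabulary."""
    tokens = {f"{k}={v}" for k, v in example.items()}
    tokens.add(cross_feature(example, cross_keys))
    return [1.0 if tok in tokens else 0.0 for tok in vocabulary]

# Toy vocabulary: three raw features plus one cross-product feature.
vocabulary = [
    "gender=Female",
    "gender=Male",
    "education=Bachelors",
    "gender=Female_x_education=Bachelors",
]

example = {"gender": "Female", "education": "Bachelors"}
x_wide = wide_features(example, vocabulary, ["gender", "education"])

# Illustrative LR parameters; in practice these are learned from data.
weights = [0.1, -0.1, 0.3, 0.8]
bias = -0.5

logit = sum(w * x for w, x in zip(weights, x_wide)) + bias
prob = 1.0 / (1.0 + math.exp(-logit))
print(x_wide, logit, prob)
```

The last weight is the one that "remembers" the pair: it contributes only when both `gender=Female` and `education=Bachelors` appear together in the same example.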
Generalization:
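Learns low-dimensional dense embeddings for sparse features to capture correlations between features; the learned embeddings themselves carry some semantic information. The embeddings are fed through the hidden layers of a feed-forward network, each of which computes: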
$$a^{(l+1)} = f\left(W^{(l)} a^{(l)} + b^{(l)}\right)$$
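The formula above can be traced by hand with a small, dependency-free sketch. Everything here is an assumption chosen for illustration: the 2-dimensional embedding table, the weight matrix `W0`, the bias `b0`, and ReLU as the activation `f`. Sparse categorical values are first looked up in the embedding table and concatenated into a^{(0)}, then one hidden layer is applied.

```python
def relu(v):
    """Element-wise ReLU, used here as the activation f."""
    return [max(0.0, x) for x in v]

def dense(W, a, b, f=relu):
    """Compute f(W a + b) for one hidden layer, with W given as a list of rows."""
    z = [sum(w_ij * a_j for w_ij, a_j in zip(row, a)) + b_i
         for row, b_i in zip(W, b)]
    return f(z)

# Toy embedding table: each categorical value maps to a 2-d dense vector.
# In a real model these vectors are learned jointly with the network.
embedding_table = {
    "gender": {"Female": [0.5, -0.2], "Male": [-0.3, 0.4]},
    "education": {"Bachelors": [0.1, 0.9]},
}

example = {"gender": "Female", "education": "Bachelors"}

# a^(0): concatenation of the embeddings of all sparse features.
a0 = []
for column in ("gender", "education"):
    a0.extend(embedding_table[column][example[column]])

# Illustrative layer parameters (2 hidden units, 4 inputs).
W0 = [
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, -1.0],
]
b0 = [0.0, 0.5]

a1 = dense(W0, a0, b0)  # a^(1) = f(W^(0) a^(0) + b^(0))
print(a0, a1)
```

Because the inputs are dense vectors rather than one-hot indicators, two feature values with similar embeddings produce similar activations even if that exact combination never appeared in training, which is where the generalization comes from.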
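Pros: requires less manual feature engineering and generalizes better to feature combinations that never appeared in the historical data.
Cons: for niche items it is hard to learn effective embeddings, which leads to over-generalization.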
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import argparse
import sys
import tempfile
import pandas as pd
from six.moves import urllib
import tensorflow as tf
CSV_COLUMNS = [
    "age", "workclass", "fnlwgt", "education", "education_num",
    "marital_status", "occupation", "relationship", "race", "gender",
    "capital_gain", "capital_loss", "hours_per_week", "native_country",
    "income_bracket"
]
gender = tf.feature_column.categorical_column_with_vocabulary_list(
    "gender", ["Female", "Male"])
education = tf.feature_column.categorical_column_with_vocabulary_list(
    "education", [
        "Bachelors", "HS-gra