Deep Learning Notes, Part 3: Implementing the Wide & Deep Model in TensorFlow

JAVA技术分享官

Article link: https://blog.csdn.net/qq_35946969/article/details/89403807

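The idea behind the wide & deep model is simple: it combines a wide model with a deep model. (This post follows the Google paper but picks different component models.)

This raises three questions:

1. What are wide and deep models?

A wide model usually means a low-complexity linear model, most commonly LR (logistic regression).

A deep model usually means a more complex model with strong fitting capacity, typically a neural network; boosting or bagging models should probably work as well.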
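
To make the distinction concrete, here is a minimal pure-Python sketch, not taken from the original post. It treats the wide model as a single weighted sum (the logit of LR) and the deep model as a tiny network with one hidden ReLU layer. The function names, weights, and inputs are all made up for illustration.

```python
import math


def sigmoid(z):
    """Map a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def wide_logit(x, w, b):
    """Wide model: one weighted sum of the input features."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def deep_logit(x, W1, b1, w2, b2):
    """Deep model: one hidden ReLU layer followed by a linear output unit."""
    hidden = [max(0.0, sum(wij * xi for wij, xi in zip(row, x)) + bj)
              for row, bj in zip(W1, b1)]
    return sum(wj * hj for wj, hj in zip(w2, hidden)) + b2


x = [1.0, 2.0]  # hypothetical feature vector
wide = wide_logit(x, w=[0.5, -0.25], b=0.0)
deep = deep_logit(x, W1=[[1.0, 0.0], [-1.0, 1.0]], b1=[0.0, 0.0],
                  w2=[1.0, 1.0], b2=-1.0)
print(wide, sigmoid(wide))
print(deep, sigmoid(deep))
```

The wide model can only weight each feature separately, while the hidden layer lets the deep model respond to combinations of features.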

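2. Why combine them?

In other words, why does the combination outperform a wide or deep model on its own?

In a recommendation setting, the deep model fits the data well: it can mine the relationship between users and categories in depth and find the categories a user likes. The wide model fits less tightly, but its recommendations are more diverse (something a user did not much like in the historical data may be exactly what they like now). Combining the two suits recommendation better.

3. How are they combined?

According to Google's 2016 paper, the forward passes of the wide part and the deep part are independent. Their outputs are summed just before the loss function, a single loss is computed on the sum, and both parts are then trained jointly: backpropagation and parameter updates run through both at once.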
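
The joint training described here can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not the paper's implementation: the "wide" part is a single weight on `x`, the "deep" part is a single weight on `relu(x)`, and `joint_step` and the learning rate are hypothetical names and values. The point is that the two outputs are summed into one logit, one loss is computed, and the same error signal updates both parts.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def joint_step(x, y, w_wide, w_deep, lr=0.1):
    """One SGD step on the toy model logit = w_wide * x + w_deep * relu(x)."""
    h = max(0.0, x)                    # stand-in for the deep part's hidden feature
    logit = w_wide * x + w_deep * h    # the two outputs are summed before the loss
    p = sigmoid(logit)
    loss = -(y * math.log(p) + (1 - y) * math.log(1 - p))
    dlogit = p - y                     # d(loss)/d(logit) for sigmoid cross-entropy
    # The same error signal flows back into both parts.
    w_wide -= lr * dlogit * x
    w_deep -= lr * dlogit * h
    return w_wide, w_deep, loss


w_wide, w_deep = 0.0, 0.0
losses = []
for _ in range(50):
    w_wide, w_deep, loss = joint_step(x=1.0, y=1.0, w_wide=w_wide, w_deep=w_deep)
    losses.append(loss)
print(losses[0], losses[-1])
```

Because the loss is computed on the sum, neither part is trained in isolation; each one only has to explain what the other does not.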

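[Figure reproduced from the original paper]

The code

The code does not include a main function; interested readers can write their own input and try it out.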
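
The listing initializes its weights with `tf.truncated_normal` and a standard deviation of `0.5 / sqrt(input_len)`. As a rough illustration of what that initializer does, the sketch below redraws any sample that falls more than two standard deviations from the mean. It is a pure-Python approximation written for this note, not TensorFlow's implementation, and `input_len = 16` is an arbitrary example value.

```python
import math
import random


def truncated_normal(mean=0.0, stddev=1.0, rng=random):
    """Draw from N(mean, stddev^2), redrawing samples beyond 2 * stddev."""
    while True:
        value = rng.gauss(mean, stddev)
        if abs(value - mean) <= 2.0 * stddev:
            return value


rng = random.Random(0)                      # fixed seed for repeatability
input_len = 16                              # hypothetical feature dimension
stddev = 0.5 / math.sqrt(float(input_len))  # same formula as the listing
samples = [truncated_normal(0.0, stddev, rng) for _ in range(1000)]
print(stddev, min(samples), max(samples))
```

Scaling the standard deviation by the input dimension keeps the initial pre-activations small, and the truncation rules out the occasional very large weight.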

import math

import numpy as np
import tensorflow as tf  # TensorFlow 1.x API (tf.truncated_normal, tf.train.AdamOptimizer)


def deep_model(input_data, hidden1_units, hidden2_units, hidden3_units):
    """
    Fully connected network with three hidden layers.
    :param input_data: 2-D tensor
    :param hidden1_units: int
    :param hidden2_units: int
    :param hidden3_units: int
    :return: 2-D tensor of shape (?, 1)
    """
    # Feature dimension of each sample
    input_len = int(input_data.shape[1])
    with tf.name_scope("hidden1"):
        # tf.truncated_normal redraws any value more than two standard deviations from the mean
        weights = tf.Variable(tf.truncated_normal(shape=[input_len, hidden1_units],
                                                  stddev=0.5 / math.sqrt(float(input_len))),
                              name="weights1")
        biases = tf.Variable(tf.zeros([hidden1_units]), name="biases1")
        # Add the bias before applying the activation
        hidden1 = tf.nn.relu(tf.matmul(input_data, weights) + biases)

    with tf.name_scope("hidden2"):
        weights = tf.Variable(tf.truncated_normal(shape=[hidden1_units, hidden2_units],
                                                  stddev=0.5 / math.sqrt(float(input_len))),
                              name="weights2")
        biases = tf.Variable(tf.zeros([hidden2_units]), name="biases2")
        hidden2 = tf.nn.relu(tf.matmul(hidden1, weights) + biases)

    with tf.name_scope("hidden3"):
        weights = tf.Variable(tf.truncated_normal(shape=[hidden2_units, hidden3_units],
                                                  stddev=0.5 / math.sqrt(float(input_len))),
                              name="weights3")
        biases = tf.Variable(tf.zeros([hidden3_units]), name="biases3")
        hidden3 = tf.nn.relu(tf.matmul(hidden2, weights) + biases)

    with tf.name_scope("output"):
        weights = tf.Variable(tf.truncated_normal(shape=[hidden3_units, 1],
                                                  stddev=0.5 / math.sqrt(float(input_len))),
                              name="weights4")
        biases = tf.Variable(tf.zeros([1]), name="biases4")
        output = tf.nn.relu(tf.matmul(hidden3, weights) + biases)

    return output

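def wide_model(input_data):
    """
    A single linear layer; followed by a sigmoid it is equivalent to LR.
    :param input_data: 2-D tensor
    :return: 2-D tensor of shape (?, 1)
    """
    input_len = int(input_data.shape[1])
    with tf.name_scope("hidden1"):
        # tf.truncated_normal redraws any value more than two standard deviations from the mean
        weights = tf.Variable(tf.truncated_normal(shape=[input_len, 1],
                                                  stddev=0.5 / math.sqrt(float(input_len))),
                              name="weights1")
        # Linear projection: (?, input_len) x (input_len, 1) -> (?, 1)
        output = tf.matmul(input_data, weights)
        # output has shape (?, 1); summing over axis 1 gives a vector of shape (?)
        output = tf.reduce_sum(output, 1, name="reduce_sum")
        output = tf.reshape(output, [-1, 1])
    return output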

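def build_wdl(deep_input, wide_input, y):
    """
    Build the model and the loss function.
    :param deep_input: 2-D tensor fed to the deep part
    :param wide_input: 2-D tensor fed to the wide part
    :param y: labels, same dtype and shape as the logits, i.e. float with shape (?, 1)
    :return: train_step, loss, prediction
    """
    central_bias = tf.Variable([np.random.randn()], name="central_bias")
    dmodel = deep_model(deep_input, 256, 128, 64)
    wmodel = wide_model(wide_input)

    # Combine the two models with an LR layer
    dmodel_weight = tf.Variable(tf.truncated_normal([1, 1]), name="dmodel_weight")
    wmodel_weight = tf.Variable(tf.truncated_normal([1, 1]), name="wmodel_weight")

    network = tf.add(
        # tf.matmul is matrix multiplication, unlike the element-wise tf.multiply
        tf.matmul(dmodel, dmodel_weight),
        tf.matmul(wmodel, wmodel_weight)
    )

    # prediction is a logit; the sigmoid is applied inside the loss
    prediction = tf.add(network, central_bias)

    loss = tf.reduce_mean(
        tf.nn.sigmoid_cross_entropy_with_logits(labels=y, logits=prediction)
    )
    train_step = tf.train.AdamOptimizer(0.001).minimize(loss)
    return train_step, loss, prediction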
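
The loss in `build_wdl` relies on `tf.nn.sigmoid_cross_entropy_with_logits`, which takes raw logits rather than probabilities. The pure-Python sketch below uses the numerically stable form given in the TensorFlow documentation, `max(x, 0) - x * z + log(1 + exp(-|x|))`, and compares it with the textbook formula. The function names `stable_loss` and `naive_loss` are chosen here for illustration only.

```python
import math


def stable_loss(label, logit):
    """Sigmoid cross-entropy computed directly from the logit (stable form)."""
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


def naive_loss(label, logit):
    """Textbook form: apply the sigmoid first, then take logs."""
    p = 1.0 / (1.0 + math.exp(-logit))
    return -(label * math.log(p) + (1.0 - label) * math.log(1.0 - p))


# Both forms agree for moderate logits.
print(stable_loss(1.0, 2.0), naive_loss(1.0, 2.0))

# For extreme logits the naive form breaks down (exp overflows or log(0) is hit),
# while the stable form still returns a finite value.
print(stable_loss(0.0, 1000.0))
print(stable_loss(1.0, -1000.0))
```

This is why the model returns a logit as `prediction` and leaves the sigmoid to the loss function; applying a sigmoid first and then taking logs by hand would reintroduce the numerical problem.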
