TensorFlow & NLP | How the Param values in model.summary() are computed for a custom model built with tf.keras

朱格羽

Category: TensorFlow NLP. Tags: tensorflow, convolutional neural networks, deep learning

Article link: https://blog.csdn.net/qq_40642546/article/details/106622996

Abstract

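When we build a custom model with the keras.layers API in TensorFlow 2.0, we can call model.summary() to print information about each layer in the model. The printed table has three columns: the first gives each layer's name (and its type, as defined in tf.keras.layers); the second gives the shape of the data after it passes through that layer; the third gives the number of parameters in that layer.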
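To make the three columns concrete, here is a minimal sketch using a toy model. The layer sizes and the name dense_example are illustrative assumptions, not taken from the article's demo, and the sketch uses keras.Sequential rather than a subclassed model purely for brevity.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Illustrative toy model (an assumption, not the article's demo):
# 4 input features feeding a Dense layer with 3 units.
model = keras.Sequential([
    keras.Input(shape=(4,)),
    layers.Dense(3, name="dense_example"),
])

# Prints a table with the columns: Layer (type), Output Shape, Param #.
model.summary()

# A Dense layer holds inputs * units weights plus units biases.
total_params = model.count_params()
```

The value in the Param # column for the Dense layer should match total_params, since this toy model contains no other trainable layer.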

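Several blog posts already explain this clearly, for example:
A detailed explanation of how the Param values in Keras model.summary() output are computed (original title: 详解keras的model.summary()输出参数Param计算过程)
That post mainly covers the Param computation for basic neural networks and for CNNs (2D convolution), so we will not repeat it here. Instead, we focus on how Param in the model.summary() output is computed when a CNN (1D convolution) model is used for an NLP task.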
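As a rough preview of the arithmetic involved, the sketch below computes parameter counts for the two layers that typically appear in a 1D-CNN text model. It assumes standard Keras Embedding and Conv1D layers with the default bias term; the helper names and the example sizes are hypothetical and chosen only for illustration.

```python
def embedding_params(vocab_size: int, embedding_dim: int) -> int:
    """Embedding layer: one embedding_dim-sized vector per vocabulary entry."""
    return vocab_size * embedding_dim


def conv1d_params(kernel_size: int, in_channels: int, filters: int,
                  use_bias: bool = True) -> int:
    """Conv1D layer: each filter spans kernel_size steps across all input
    channels, plus one bias per filter when use_bias is True."""
    weights = kernel_size * in_channels * filters
    biases = filters if use_bias else 0
    return weights + biases


# Hypothetical sizes: 10,000-word vocabulary, 128-dim embeddings,
# then a Conv1D with kernel size 3 and 64 filters.
emb = embedding_params(vocab_size=10000, embedding_dim=128)
conv = conv1d_params(kernel_size=3, in_channels=128, filters=64)
print(emb, conv)
```

Note that for a 1D convolution over text, the input channel count is the embedding dimension, and the sequence length does not affect the number of parameters.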

Code demo

The following demo is implemented as a custom model and is for reference only.

# to show the whole model.summary(), especially the part of output shape
from tensorflow import keras
from tensorflow.keras import layers as klayers

class MLP(keras.Model):
    def __init__(self, input_shape, **kwargs):
        super(MLP, self).__init__(**kwargs)
        # Add input layer
        self.input_layer