Automatically building a deep neural network (DNN) with an arbitrary number of layers in TensorFlow 2.x

Table of Contents

Some reference blogs

Differences between TensorFlow 2.0 and TensorFlow 1.0

Method 1: building the network from custom modules

The modules that need to be imported

The command for building the network

Definition of the DNN_model class

Method 2: sequential definition of a Dense Net (drawback: individual layers are hard to manipulate)

Method 3: explicitly defining a Dense Net with the Keras API (the recommended approach)

Notes





Some reference blogs:

In TensorFlow 2.x, a Keras model can be built in three ways: Sequential, the functional API (Functional), and model subclassing (Subclassed).
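
As a quick illustration of the three styles (a minimal sketch of the same small network, not taken from the blogs listed below):

import tensorflow as tf

# Sequential: stack layers in order
seq_model = tf.keras.Sequential([
    tf.keras.layers.Dense(10, activation='tanh', input_shape=(2,)),
    tf.keras.layers.Dense(1)
])

# Functional API: wire inputs to outputs explicitly
inputs = tf.keras.Input(shape=(2,))
hidden = tf.keras.layers.Dense(10, activation='tanh')(inputs)
outputs = tf.keras.layers.Dense(1)(hidden)
func_model = tf.keras.Model(inputs=inputs, outputs=outputs)

# Subclassed: declare layers in __init__ and the forward pass in call()
class SubModel(tf.keras.Model):
    def __init__(self):
        super(SubModel, self).__init__()
        self.hidden = tf.keras.layers.Dense(10, activation='tanh')
        self.out_layer = tf.keras.layers.Dense(1)

    def call(self, inputs):
        return self.out_layer(self.hidden(inputs))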

1. TensorFlow study notes 2 (building a neural network model with Keras in six steps): https://blog.csdn.net/qq_44711932/article/details/107940180

2. TensorFlow 2.0, part 2 (building a neural-network classification model with Keras): https://blog.csdn.net/qq_42580947/article/details/105296103

3. Training neural networks with TensorFlow 2: https://blog.csdn.net/zhouchen1998/article/details/102572264

4. TensorFlow 2.0, part 6 (custom network layers, explained with a transformer example): https://blog.csdn.net/xm961217/article/details/107787737

5. Commonly used TensorFlow 2.x APIs: https://blog.csdn.net/myarrow/article/details/108800167

6. A summary of three ways to define custom models in TensorFlow 2.0: https://blog.csdn.net/weixin_45147782/article/details/108588178

7. An introduction to the fully connected layer tf.keras.layers.Dense(): https://blog.csdn.net/qq_38251616/article/details/115632249

8. The __init__(), build(), and call() functions of Layer in TensorFlow 2.0: https://blog.csdn.net/qq_32623363/article/details/104128497

9. A look into build() and call() for custom layers in TensorFlow 2: https://blog.csdn.net/weixin_37598106/article/details/106693120

10. Keras study notes 12 (keras.initializers): https://blog.csdn.net/winter_python/article/details/108706123

Most tutorials I have seen, whether they use the Sequential approach or model subclassing, define the network layer by layer and require the size of every layer to be set by hand. This post presents a way to build a fully connected neural network (DNN) by simply providing the input dimension, the output dimension, and a list of hidden-layer sizes.

Differences between TensorFlow 2.0 and TensorFlow 1.0

While training networks, I found TensorFlow 2.x to be considerably slower than TensorFlow 1.x. Some related blog posts:

1. Differences between TensorFlow 2.0 and TensorFlow 1.0: https://blog.csdn.net/qq_38978225/article/details/108942427

2. Performance differences between TensorFlow 2.0 and TensorFlow 1.0: https://blog.csdn.net/luoganttcc/article/details/93497326
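
One common mitigation, mentioned here as a general TensorFlow 2 technique rather than something taken from the posts above, is to wrap the training step in tf.function so that it is traced into a graph instead of running eagerly; this often recovers much of the TF1-style speed. A minimal sketch, assuming a model, an optimizer, and a simple MSE loss:

import tensorflow as tf

@tf.function  # trace the step into a graph; later calls skip the eager-execution overhead
def train_step(model, optimizer, x_batch, y_batch):
    with tf.GradientTape() as tape:
        y_pred = model(x_batch)
        loss = tf.reduce_mean(tf.square(y_pred - y_batch))
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss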

Method 1: building the network from custom modules

The modules that need to be imported:

import os
import sys
import tensorflow as tf
import numpy as np
import matplotlib
import platform
import shutil
import time
import Model_base  # the helper module (Model_base.py) defined later in this post

The command for building the network is:

model = DNN_model(indim=input_dim, outdim=out_dim, hidden_list=hidden_layer,
                      init_opt2model=init_model, name2DNN_Model=name2base_model, opt2regular_WB=regular_wb_model,
                      penalty2WB=penalty2weight_biases, actName=actFun)

Here:

indim is the input dimension;

outdim is the output dimension;

hidden_list gives the numbers of neurons in the hidden layers, as a tuple or a list, e.g. hidden_list=(20,30,30,40,10) or hidden_list=[20,30,30,40,10];

init_opt2model is the scheme used to initialize the network weights and biases;

name2DNN_Model is the network variant to use;

opt2regular_WB is the regularization mode for the weights and biases;

penalty2WB is the regularization coefficient for the weights and biases;

actName is the activation function used between the hidden layers;

scope2w and scope2b assign name scopes to the weights and biases, so that DNN_model can be used to derive several distinct networks. If only one network is derived, they can be omitted.

For example:

DNN1 = DNN_model(...)

DNN2 = DNN_model(...)

Note: each of these DNN networks has a linear output by default. This can be changed by assigning to actName2out a string naming the desired activation function, e.g. relu, leaky_relu, tanh, elu, or sigmoid.
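
For example (an illustrative call, reusing the parameters described above), a third network with a tanh output activation and its own name scopes could be created as:

DNN3 = DNN_model(indim=2, outdim=1, hidden_list=(20, 30, 30, 40, 10), actName='tanh', actName2out='tanh',
                 scope2w='W3', scope2b='B3')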

The DNN_model class is defined as follows:

class DNN_model(tf.keras.Model):
    def __init__(self, indim=1, outdim=1, hidden_list=None, init_opt2model='DNN_Base', name2DNN_Model='DNN',
                 opt2regular_WB='L1', penalty2WB=0.001, actName='tanh', actName2out='linear', scope2w='Weight',
                 scope2b='Bias'):
        super(DNN_model, self).__init__()
        self.indim = indim
        self.outdim = outdim
        self.hidden_list = hidden_list
        self.init_opt2model = init_opt2model
        self.name2DNN_Model = name2DNN_Model
        self.opt2regular_WB = opt2regular_WB
        self.penalty2WB = penalty2WB
        self.actName = actName
        self.actName2out = actName2out

        init_NN = Model_base.Xavier_init_NN(indim=self.indim, outdim=self.outdim, hiddens=self.hidden_list,
                                            flag2Weight=scope2w, flag2Bias=scope2b)
        self.Weights, self.Biases = init_NN.init_WB()

        self.DNN = Model_base.DNN(indim=self.indim, outdim=self.outdim, hiddens=self.hidden_list, Ws=self.Weights,
                                  Bs=self.Biases, actName=self.actName, actName2Out=self.actName2out)

    def call(self, inputs, training=None, mask=None):
        out = self.DNN(inputs)
        return out

The basic operations this module relies on are defined in a separate file, Model_base.py:

# -*- coding: utf-8 -*-
"""
Created on 2021.06.15
@author: LXA
"""
import tensorflow as tf
import numpy as np


# ------------------------------------- my activations ----------------------------------
class my_actFunc(tf.keras.layers.Layer):
    def __init__(self, actName='linear'):
        super(my_actFunc, self).__init__()
        self.actName = actName

    def call(self, x_input):
        if str.lower(self.actName) == 'relu':
            out_x = tf.nn.relu(x_input)
        elif str.lower(self.actName) == 'leaky_relu':
            out_x = tf.nn.leaky_relu(x_input)
        elif str.lower(self.actName) == 'tanh':
            out_x = tf.nn.tanh(x_input)
        elif str.lower(self.actName) == 'elu':
            out_x = tf.nn.elu(x_input)
        elif str.lower(self.actName) == 'sin':
            out_x = tf.sin(x_input)
        elif str.lower(self.actName) == 'sigmoid':
            out_x = tf.nn.sigmoid(x_input)
        else:
            out_x = x_input
        return out_x


# -----------------------------  initialize weights and bias -----------------------------
class Xavier_init_NN(tf.keras.layers.Layer):
    def __init__(self, indim=20, outdim=1, flag2Weight='Weight', flag2Bias='Bias', hiddens=None):
        super(Xavier_init_NN, self).__init__()
        self.outdim = outdim
        self.indim = indim
        self.flag2w = flag2Weight
        self.flag2b = flag2Bias
        self.hiddens = hiddens

    def Xavier_init_weight(self, in_size=1, out_size=1, flag2Weight='Weight'):
        stddev_W = (2.0 / (in_size + out_size)) ** 0.5
        w_init = tf.random_normal_initializer(mean=0.0, stddev=stddev_W)
        Weight = tf.Variable(initial_value=w_init(shape=(in_size, out_size), dtype='float32'), name=str(flag2Weight),
                             trainable=True)
        return Weight

    def Xavier_init_bias(self, in_size=1, out_size=1, flag2Bias='Bias'):
        stddev_B = (2.0 / (in_size + out_size)) ** 0.5
        b_init = tf.random_normal_initializer(mean=0.0, stddev=stddev_B)
        Bias = tf.Variable(initial_value=b_init(shape=(out_size,), dtype='float32'), name=str(flag2Bias),
                           trainable=True)
        return Bias

    def init_WB(self):
        W_list=[]
        B_list=[]
        n_hiddens = len(self.hiddens)
        w_in = self.Xavier_init_weight(in_size=self.indim, out_size=(self.hiddens)[0],
                                       flag2Weight=self.flag2w + str(0))
        b_in = self.Xavier_init_bias(in_size=self.indim, out_size=(self.hiddens)[0],
                                     flag2Bias=self.flag2b + str(0))
        W_list.append(w_in)
        B_list.append(b_in)
        for i_hiddens in range(n_hiddens-1):
            w = self.Xavier_init_weight(in_size=(self.hiddens)[i_hiddens], out_size=(self.hiddens)[i_hiddens+1],
                                        flag2Weight=self.flag2w + str(i_hiddens+1))
            b = self.Xavier_init_bias(in_size=(self.hiddens)[i_hiddens], out_size=(self.hiddens)[i_hiddens+1],
                                      flag2Bias=self.flag2b + str(i_hiddens+1))
            W_list.append(w)
            B_list.append(b)
        w_out = self.Xavier_init_weight(in_size=(self.hiddens)[-1], out_size=self.outdim,
                                        flag2Weight=self.flag2w + str(n_hiddens))
        b_out = self.Xavier_init_bias(in_size=(self.hiddens)[-1], out_size=self.outdim,
                                      flag2Bias=self.flag2b + str(n_hiddens))
        W_list.append(w_out)
        B_list.append(b_out)
        return W_list, B_list



class Singlelayer(tf.keras.layers.Layer):
    def __init__(self, weight2layer=None, bias2layer=None):
        super(Singlelayer, self).__init__()
        self.w = weight2layer
        self.b = bias2layer

    def call(self, inputs):                       # Defines the computation from inputs to outputs
        return tf.matmul(inputs, self.w) + self.b



# ----------------------------------- Regular wights and biases -----------------------------------------------
def regular_weight_biases(regular_model='L1', weights=None, biases=None):
    layers = len(weights)
    if regular_model == 'L1':
        regular_w = 0
        regular_b = 0
        for i_layer1 in range(layers):
            regular_w = regular_w + tf.reduce_sum(tf.abs(weights[i_layer1]), keepdims=False)
            regular_b = regular_b + tf.reduce_sum(tf.abs(biases[i_layer1]), keepdims=False)
    elif regular_model == 'L2':
        regular_w = 0
        regular_b = 0
        for i_layer1 in range(layers):
            regular_w = regular_w + tf.reduce_sum(tf.square(weights[i_layer1]), keepdims=False)
            regular_b = regular_b + tf.reduce_sum(tf.square(biases[i_layer1]), keepdims=False)
    else:
        regular_w = 0.0
        regular_b = 0.0
    return regular_w + regular_b


#  ---deep neural network with resnet(one-step skip connection for two consecutive layers if have equal neurons)---
class DNN(tf.keras.layers.Layer):
    def __init__(self, indim=1, outdim=1, hiddens=None, Ws=None, Bs=None, actName='tanh', actName2Out='linear'):
        super(DNN, self).__init__()
        self.outdim = outdim
        self.indim = indim
        self.hiddens = hiddens
        self.Ws = Ws
        self.Bs = Bs
        self.actFunc = actName
        self.actFunc2out = actName2Out

    def call(self, x_input):
        n_hiddens = len(self.hiddens)
        # Win = (self.Ws)[0]
        hiddenLayer = Singlelayer(weight2layer=(self.Ws)[0], bias2layer=(self.Bs)[0])
        activation_in = my_actFunc(actName=self.actFunc)
        out2hidden = hiddenLayer(x_input)
        out2hidden = activation_in(out2hidden)
        activation_hidden = my_actFunc(actName=self.actFunc)
        hidden_record = self.hiddens[0]
        for iLayer in range(n_hiddens-1):
            # W = (self.Ws)[iLayer+1]
            out_pre = out2hidden
            hiddenLayer = Singlelayer(weight2layer=(self.Ws)[iLayer+1], bias2layer=(self.Bs)[iLayer+1])
            out2hidden = hiddenLayer(out2hidden)
            out2hidden = activation_hidden(out2hidden)
            if self.hiddens[iLayer + 1] == hidden_record:
                out2hidden = out2hidden + out_pre
            hidden_record = self.hiddens[iLayer + 1]

        # Wout = (self.Ws)[-1]
        hiddenLayer = Singlelayer(weight2layer=(self.Ws)[-1], bias2layer=(self.Bs)[-1])
        activation_out = my_actFunc(actName=self.actFunc2out)
        out2hidden = hiddenLayer(out2hidden)
        out_result = activation_out(out2hidden)
        return out_result

The following shows how to use this model.

if __name__ == "__main__":
    input_dim = 2
    out_dim = 1
    hidden_layer = (5, 10, 10, 15, 20)
    init_model = 'DNN'
    name2base_model = 'DNN'
    regular_wb_model = 'L1'
    penalty2weight_biases = 0.01
    actFun = 'tanh'
    model = DNN_model(indim=input_dim, outdim=out_dim, hidden_list=hidden_layer,
                      init_opt2model=init_model, name2DNN_Model=name2base_model, opt2regular_WB=regular_wb_model,
                      penalty2WB=penalty2weight_biases, actName=actFun)

    batch_size = 100
    x = np.random.rand(batch_size, input_dim).astype(np.float32)  # cast to float32 to match the weight dtype
    y = model(x)
    print(y)
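
Since the weights and biases of DNN_model are created eagerly in __init__, model.trainable_variables is already populated at this point, so a custom training step can follow directly. The lines below are a minimal sketch continuing the block above; the target y_true, the Adam optimizer, and the learning rate are illustrative assumptions, not part of the original code:

    # --- illustrative training step (assumes a regression target y_true of shape (batch_size, out_dim)) ---
    y_true = np.random.rand(batch_size, out_dim).astype(np.float32)
    optimizer = tf.keras.optimizers.Adam(learning_rate=0.001)

    with tf.GradientTape() as tape:
        y_pred = model(x)
        mse_loss = tf.reduce_mean(tf.square(y_pred - y_true))
        # regularization term built from the same weight/bias lists the model holds
        reg_sum = Model_base.regular_weight_biases(regular_model=regular_wb_model,
                                                   weights=model.Weights, biases=model.Biases)
        loss = mse_loss + penalty2weight_biases * reg_sum
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))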

Method 2: sequential definition of a Dense Net (drawback: individual layers are hard to manipulate)

Of course, a network can just as well be built with tf.keras.layers. Let us take a look:

# -*- coding: utf-8 -*-
"""
Created on 2021.06.16
@author: LXA
"""
import tensorflow as tf
import numpy as np


class my_actFunc(tf.keras.layers.Layer):
    def __init__(self, actName='linear'):
        super(my_actFunc, self).__init__()
        self.actName = actName

    def call(self, x_input):
        if str.lower(self.actName) == 'relu':
            out_x = tf.nn.relu(x_input)
        elif str.lower(self.actName) == 'leaky_relu':
            out_x = tf.nn.leaky_relu(x_input)
        elif str.lower(self.actName) == 'tanh':
            out_x = tf.nn.tanh(x_input)
        elif str.lower(self.actName) == 'elu':
            out_x = tf.nn.elu(x_input)
        elif str.lower(self.actName) == 'sin':
            out_x = tf.sin(x_input)
        elif str.lower(self.actName) == 'sigmoid':
            out_x = tf.nn.sigmoid(x_input)
        else:
            out_x = x_input
        return out_x


class Dense_seqNet(tf.keras.Model):
    """
    Args:
        indim: the dimension for input data
        outdim: the dimension for output
        hidden_units: the number of  units for hidden layer, a list or a tuple
        name2Model: the name of using DNN type, DNN , ScaleDNN or FourierDNN
        actName2in: the name of activation function for input layer
        actName: the name of activation function for hidden layer
        actName2out: the name of activation function for output layer
        if name2Model is not wavelet NN, actName2in is not same as actName; otherwise, actName2in is same as actName
    """
    def __init__(self, indim=1, outdim=1, hidden_units=None, name2Model='DNN', actName2in='linear', actName='tanh', actName2out='linear'):
        super(Dense_seqNet, self).__init__()
        self.indim = indim
        self.outdim = outdim
        self.hidden_units = hidden_units
        self.num2NN_layers = len(hidden_units)+1
        self.name2Model = name2Model
        self.actFunc_in = my_actFunc(actName=actName2in)
        self.actFunc = my_actFunc(actName=actName)
        self.actFunc_out = my_actFunc(actName=actName2out)
        self.dense_layers = []

        for i_layer in range(len(hidden_units)):
            dense_hidden = tf.keras.layers.Dense(hidden_units[i_layer])
            self.dense_layers.append(dense_hidden)
        # output layer mapping the last hidden layer to outdim
        self.dense_layers.append(tf.keras.layers.Dense(outdim))

    def call(self, inputs, training=None, mask=None):
        dense_in = self.dense_layers[0]
        H = dense_in(inputs)
        H = self.actFunc_in(H)

        for i_layer in range(1, self.num2NN_layers-1):
            dense_layer = self.dense_layers[i_layer]
            H = dense_layer(H)
            H = self.actFunc(H)
        dense_out = self.dense_layers[-1]
        H = dense_out(H)
        H_out = self.actFunc_out(H)
        return H_out

Usage:

if __name__ == "__main__":
    input_dim = 2
    out_dim = 1
    hidden_layer = (5, 10, 10, 15, 20)
    regular_wb_model = 'L1'
    penalty2weight_biases = 0.01
    actFun = 'tanh'
    Model_name = 'DNN'

    model = Dense_seqNet(indim=input_dim, outdim=out_dim, hidden_units=hidden_layer, name2Model=Model_name,
                         actName2in=actFun, actName=actFun)
    var_List0 = model.trainable_variables  # empty: no variables exist before data has been passed through the model
    batch_size = 10
    x = np.random.rand(batch_size, input_dim).astype(np.float32)
    y = model(x)
    var_List1 = model.trainable_variables  # populated: the variables are created during the first forward pass

    print(y)

There is a problem here:

var_List0 = model.trainable_variables

is empty, i.e. at that point the model reports no trainable variables. I have not fully figured this out yet!

For plain fitting problems the network can simply be trained with model.fit, but for problems where an MSE-style loss cannot be used this becomes awkward. The key question is how to obtain trainable_variables!
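
A likely explanation (a general Keras behaviour, added here rather than established in the original post): tf.keras.layers.Dense creates its weights lazily inside build(), which only runs the first time the layer is called. Before any data has flowed through the model, a subclassed model therefore owns no variables, which is why var_List0 is empty while var_List1 is not. One workaround is to build the model explicitly (or run one dummy forward pass) right after constructing it:

model = Dense_seqNet(indim=input_dim, outdim=out_dim, hidden_units=hidden_layer, name2Model=Model_name,
                     actName2in=actFun, actName=actFun)
model.build(input_shape=(None, input_dim))   # forces build() on every Dense layer and creates the weights
print(len(model.trainable_variables))        # now non-empty, even though no data has been passed in yet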

Method 3: explicitly defining a Dense Net with the Keras API (the recommended approach)

Now for the third way: define each layer's weights and biases explicitly, then compute layer by layer in the forward pass.

# -*- coding: utf-8 -*-
"""
Created on 2021.06.16
@author: LXA
"""

import tensorflow as tf
import numpy as np

class my_actFunc(tf.keras.layers.Layer):
    def __init__(self, actName='linear'):
        super(my_actFunc, self).__init__()
        self.actName = actName

    def call(self, x_input):
        if str.lower(self.actName) == 'relu':
            out_x = tf.nn.relu(x_input)
        elif str.lower(self.actName) == 'leaky_relu':
            out_x = tf.nn.leaky_relu(x_input)
        elif str.lower(self.actName) == 'tanh':
            out_x = tf.nn.tanh(x_input)
        elif str.lower(self.actName) == 'elu':
            out_x = tf.nn.elu(x_input)
        elif str.lower(self.actName) == 'sin':
            out_x = tf.sin(x_input)
        elif str.lower(self.actName) == 'sigmoid':
            out_x = tf.nn.sigmoid(x_input)
        else:
            out_x = x_input
        return out_x

class Dense_Net(tf.keras.Model):
 """
    Args:
        indim: the dimension for input data
        outdim: the dimension for output
        hidden_units: the number of  units for hidden layer, a list or a tuple
        name2Model: the name of using DNN type, DNN , ScaleDNN or FourierDNN
        actName: the name of activation function for hidden layer
        actName2out: the name of activation function for output layer
        scope2W: the namespace for weight
        scope2B: the namespace for bias
        repeat_high_freq: repeating the high-frequency component of scale-transformation factor or not
    """
    def __init__(self, indim=1, outdim=1, hidden_units=None, name2Model='DNN', actName='tanh', actName2out='linear',
                 scope2W='Weight', scope2B='Bias', repeat_high_freq=True):
        super(Dense_Net, self).__init__()
        self.indim = indim
        self.outdim = outdim
        self.hidden_units = hidden_units
        self.name2Model = name2Model
        self.actName = actName
        self.actName2out = actName2out
        self.actFunc = my_actFunc(actName=actName)
        self.actFunc_out = my_actFunc(actName=actName2out)
        self.repeat_high_freq = repeat_high_freq
        self.Ws = []
        self.Bs = []

        Win = self.add_weight(shape=(indim, hidden_units[0]), initializer=tf.keras.initializers.GlorotNormal(),
                              trainable=True, name=str(scope2W)+'_in')
        Bin = self.add_weight(shape=(hidden_units[0],), initializer=tf.keras.initializers.GlorotNormal(),
                              trainable=True, name=str(scope2B)+'_in')
        self.Ws.append(Win)
        self.Bs.append(Bin)

        for i_layer in range(len(hidden_units)-1):
            W = self.add_weight(shape=(hidden_units[i_layer], hidden_units[i_layer + 1]),
                                initializer=tf.keras.initializers.GlorotNormal(),
                                trainable=True, name=str(scope2W) + str(i_layer))
            B = self.add_weight(shape=(hidden_units[i_layer + 1],),
                                initializer=tf.keras.initializers.GlorotNormal(),
                                trainable=True, name=str(scope2B) + str(i_layer))
            self.Ws.append(W)
            self.Bs.append(B)

        Wout = self.add_weight(shape=(hidden_units[-1], outdim), initializer=tf.keras.initializers.GlorotNormal(),
                               trainable=True, name=str(scope2W) + '_out')
        Bout = self.add_weight(shape=(outdim,), initializer=tf.keras.initializers.GlorotNormal(), trainable=True,
                               name=str(scope2B) + '_out')
        self.Ws.append(Wout)
        self.Bs.append(Bout)

    def get_regular_sum2WB(self, regular_model):
        layers = len(self.hidden_units)+1
        if regular_model == 'L1':
            regular_w = 0
            regular_b = 0
            for i_layer in range(layers):
                regular_w = regular_w + tf.reduce_sum(tf.abs(self.Ws[i_layer]), keepdims=False)
                regular_b = regular_b + tf.reduce_sum(tf.abs(self.Bs[i_layer]), keepdims=False)
        elif regular_model == 'L2':
            regular_w = 0
            regular_b = 0
            for i_layer in range(layers):
                regular_w = regular_w + tf.reduce_sum(tf.square(self.Ws[i_layer]), keepdims=False)
                regular_b = regular_b + tf.reduce_sum(tf.square(self.Bs[i_layer]), keepdims=False)
        else:
            regular_w = tf.constant(0.0)
            regular_b = tf.constant(0.0)
        return regular_w + regular_b

    def call(self, inputs, training=None, mask=None):
        # ------ dealing with the input data ---------------
        H = tf.add(tf.matmul(inputs, self.Ws[0]), self.Bs[0])
        H = self.actFunc(H)

        #  ---resnet(one-step skip connection for two consecutive layers if have equal neurons)---
        hidden_record = self.hidden_units[0]
        for i_layer in range(len(self.hidden_units)-1):
            H_pre = H
            H = tf.add(tf.matmul(H, self.Ws[i_layer+1]), self.Bs[i_layer+1])
            H = self.actFunc(H)
            if self.hidden_units[i_layer + 1] == hidden_record:
                H = H + H_pre
            hidden_record = self.hidden_units[i_layer + 1]

        H = tf.add(tf.matmul(H, self.Ws[-1]), self.Bs[-1])
        out_result = self.actFunc_out(H)
        return out_result

if __name__ == "__main__":    
    input_dim = 3
    out_dim = 1
    hidden_layer = (5, 10, 10, 15, 20)
    name2base_model = 'DNN'
    actFun = 'tanh'

    model = Dense_Net(indim=input_dim, outdim=out_dim, hidden_units=hidden_layer, name2Model=name2base_model,
                      actName=actFun)
    var_List0 = model.trainable_variables   # already non-empty: the weights are created in __init__ via add_weight
    batch_size = 10
    x = np.random.rand(batch_size, input_dim).astype(np.float32)
    # freq = [1, 2, 3, 4, 5, 6, 7, 8]  # scale factors for the ScaleDNN/FourierDNN variants, not used by this plain Dense_Net
    y = model(x)
    print(y)
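
Because these weights already exist, a custom training loop (for example for a loss that is not a plain MSE) can use trainable_variables immediately. The following is a minimal sketch continuing the block above; the regression target y_true, the Adam optimizer, the penalty coefficient, and the number of steps are illustrative assumptions:

    # --- illustrative custom training loop using the L2 regularization term defined above ---
    y_true = np.random.rand(batch_size, out_dim).astype(np.float32)
    optimizer = tf.keras.optimizers.Adam(learning_rate=0.001)
    penalty2WB = 0.001

    for i_step in range(100):
        with tf.GradientTape() as tape:
            y_pred = model(x)
            data_loss = tf.reduce_mean(tf.square(y_pred - y_true))
            loss = data_loss + penalty2WB * model.get_regular_sum2WB('L2')
        grads = tape.gradient(loss, model.trainable_variables)
        optimizer.apply_gradients(zip(grads, model.trainable_variables))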

Notes

The code above is taken from my own work; only a few simple modules have been extracted for demonstration, so some of the function parameters are redundant and can be removed. Having only recently switched from TensorFlow 1 to TensorFlow 2, my skills are still fairly limited, so the code is bound to contain mistakes and omissions; corrections are welcome.

Note: the environment needed to run the code above: Anaconda, Python, NumPy, TensorFlow 2, and so on.
