The Keras functional API

Setup

!pip3 install tensorflow==2.0.0a0
%matplotlib inline
import tensorflow as tf
from tensorflow import keras
(pip output trimmed: tensorflow==2.0.0a0 and all of its dependencies are already installed)

Introduction

From the overview you are already familiar with using tf.keras.Sequential to create models. But tf.keras also provides a more flexible functional API for building models, one that handles models with multiple inputs and outputs, shared layers, and non-sequential topologies well.
Since a deep learning model is usually a directed acyclic graph (DAG) of layers, tf.keras provides a set of APIs for building this graph of layers directly.
First, specify the model's input:

inputs = keras.Input(shape=(32, 32, 3), name='image')

Here you specify the shape of a single sample, not the batch shape (for an image, for example, only 3 dimensions are needed).
This returns an input object describing the data you will feed to the model (its shape, name, and dtype).
You can then add a layer to the graph by calling the layer object on this input, which returns the layer's output:

dense = keras.layers.Dense(64, activation='relu')
x = dense(inputs)

Calling a layer like this is like drawing a directed edge in the graph from the input to the layer.
Let's add a few more layers:

x = keras.layers.Dense(64, activation='relu')(x)
outputs = keras.layers.Dense(10, activation='softmax')(x)

Then, by tying the inputs and outputs together, we can create a complete model:

model = keras.Model(inputs=inputs, outputs=outputs, name='my_model')

You can call the summary method to inspect the model:

model.summary()
Model: "my_model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
image (InputLayer)           [(None, 32, 32, 3)]       0         
_________________________________________________________________
dense (Dense)                (None, 32, 32, 64)        256       
_________________________________________________________________
dense_1 (Dense)              (None, 32, 32, 64)        4160      
_________________________________________________________________
dense_2 (Dense)              (None, 32, 32, 10)        650       
=================================================================
Total params: 5,066
Trainable params: 5,066
Non-trainable params: 0
_________________________________________________________________
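The Param # column above can be reproduced by hand: a Dense layer acting on the last axis owns one weight matrix of shape (input_dim, units) plus one bias of shape (units,). A NumPy sketch (my own illustration, not from the original) checking the three rows and showing what Dense(64, activation='relu') actually computes:

```python
import numpy as np

def dense_params(input_dim, units):
    # weight matrix (input_dim, units) plus bias vector (units,)
    return input_dim * units + units

# The three Dense layers act on the last axis: 3 -> 64 -> 64 -> 10
counts = [dense_params(3, 64), dense_params(64, 64), dense_params(64, 10)]
print(counts)        # [256, 4160, 650], matching the summary rows
print(sum(counts))   # 5066, matching "Total params: 5,066"

# What Dense(64, activation='relu') computes in one call:
x = np.random.rand(2, 3)                    # a batch of 2 samples, feature dim 3
W = np.random.rand(3, 64)
b = np.zeros(64)
y = np.maximum(x @ W + b, 0.0)              # relu(x @ W + b)
print(y.shape)       # (2, 64)
```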

You can also plot the model's structure as an image (this may require extra dependencies such as pydot and graphviz; install them yourself if needed).

keras.utils.plot_model(model, 'model.png', show_shapes=True)    # show_shapes displays each layer's input/output shapes in the plot

[model.png: plot of the model architecture, with layer shapes shown]

Defining multiple models

With the functional API, a model is created by specifying its input and output layers, which means a single graph of layers can be used in multiple models.
In the example below, we use the same stack of layers to build two models: encoder, which turns an image into a 16-dimensional vector, and an end-to-end autoencoder model for training.

# The encoder part
encoder_input = keras.Input(shape=(28, 28, 1), name='img')
x = keras.layers.Conv2D(16, 3, activation='relu')(encoder_input)
x = keras.layers.Conv2D(32, 3, activation='relu')(x)
x = keras.layers.MaxPooling2D(3)(x)
x = keras.layers.Conv2D(32, 3, activation='relu')(x)
x = keras.layers.Conv2D(16, 3, activation='relu')(x)
encoder_output = keras.layers.GlobalMaxPooling2D()(x)
encoder = keras.Model(inputs=encoder_input, outputs=encoder_output, name='encoder')
encoder.summary()
keras.utils.plot_model(encoder, 'encoder.png', show_shapes=True)

# The autoencoder
x = keras.layers.Reshape((4, 4, 1))(encoder_output)
x = keras.layers.Conv2DTranspose(16, 3, activation='relu')(x)
x = keras.layers.Conv2DTranspose(32, 3, activation='relu')(x)
x = keras.layers.UpSampling2D(3)(x)
x = keras.layers.Conv2DTranspose(16, 3, activation='relu')(x)
decoder_output = keras.layers.Conv2DTranspose(1, 3, activation='relu')(x)

autoencoder = keras.Model(inputs=encoder_input, outputs=decoder_output, name='autoencoder')
autoencoder.summary()
keras.utils.plot_model(autoencoder, 'autoencoder.png', show_shapes=True)
Model: "encoder"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
img (InputLayer)             [(None, 28, 28, 1)]       0         
_________________________________________________________________
conv2d (Conv2D)              (None, 26, 26, 16)        160       
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 24, 24, 32)        4640      
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 8, 8, 32)          0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 6, 6, 32)          9248      
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 4, 4, 16)          4624      
_________________________________________________________________
global_max_pooling2d (Global (None, 16)                0         
=================================================================
Total params: 18,672
Trainable params: 18,672
Non-trainable params: 0
_________________________________________________________________
Model: "autoencoder"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
img (InputLayer)             [(None, 28, 28, 1)]       0         
_________________________________________________________________
conv2d (Conv2D)              (None, 26, 26, 16)        160       
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 24, 24, 32)        4640      
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 8, 8, 32)          0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 6, 6, 32)          9248      
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 4, 4, 16)          4624      
_________________________________________________________________
global_max_pooling2d (Global (None, 16)                0         
_________________________________________________________________
reshape (Reshape)            (None, 4, 4, 1)           0         
_________________________________________________________________
conv2d_transpose (Conv2DTran (None, 6, 6, 16)          160       
_________________________________________________________________
conv2d_transpose_1 (Conv2DTr (None, 8, 8, 32)          4640      
_________________________________________________________________
up_sampling2d (UpSampling2D) (None, 24, 24, 32)        0         
_________________________________________________________________
conv2d_transpose_2 (Conv2DTr (None, 26, 26, 16)        4624      
_________________________________________________________________
conv2d_transpose_3 (Conv2DTr (None, 28, 28, 1)         145       
=================================================================
Total params: 28,241
Trainable params: 28,241
Non-trainable params: 0
_________________________________________________________________

[encoder.png / autoencoder.png: plots of the two models]

As you can see, the encoder and the decoder are exactly symmetric, so the output of autoencoder is also (28, 28, 1). The inverse of Conv2D is Conv2DTranspose, and the inverse of MaxPooling2D is UpSampling2D.
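The symmetry can be checked with simple shape arithmetic (my own sketch, not from the original): a 'valid' Conv2D with kernel size k shrinks a spatial dimension by k-1, Conv2DTranspose grows it by k-1, MaxPooling2D(3) divides it by 3 (floor), and UpSampling2D(3) multiplies it by 3:

```python
# Trace one spatial dimension through the encoder and then the decoder (kernel size 3).
def conv(n, k=3):            return n - (k - 1)   # 'valid' Conv2D
def conv_transpose(n, k=3):  return n + (k - 1)   # Conv2DTranspose
def pool(n, s=3):            return n // s        # MaxPooling2D(3)
def upsample(n, s=3):        return n * s         # UpSampling2D(3)

n = 28
for step in (conv, conv, pool, conv, conv):       # encoder: 28 -> 26 -> 24 -> 8 -> 6 -> 4
    n = step(n)
print(n)   # 4

for step in (conv_transpose, conv_transpose, upsample,
             conv_transpose, conv_transpose):     # decoder: 4 -> 6 -> 8 -> 24 -> 26 -> 28
    n = step(n)
print(n)   # 28, back to the input size
```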

Layers and models are both callable

You can call a model just like a layer, by invoking it on an Input or on the output of another layer. Note that calling a model reuses not only its architecture but also its weights.
Below is another autoencoder example. This time we create two models, encoder and decoder, and then chain them together to form autoencoder.

# The encoder part
encoder_input = keras.Input(shape=(28, 28, 1), name='original_img')
x = keras.layers.Conv2D(16, 3, activation='relu')(encoder_input)
x = keras.layers.Conv2D(32, 3, activation='relu')(x)
x = keras.layers.MaxPooling2D(3)(x)
x = keras.layers.Conv2D(32, 3, activation='relu')(x)
x = keras.layers.Conv2D(16, 3, activation='relu')(x)
encoder_output = keras.layers.GlobalMaxPooling2D()(x)
encoder = keras.Model(inputs=encoder_input, outputs=encoder_output, name='encoder')
encoder.summary()

# The decoder part
decoder_input = keras.Input(shape=(16, ), name='encoder_img')
x = keras.layers.Reshape((4, 4, 1))(decoder_input)
x = keras.layers.Conv2DTranspose(16, 3, activation='relu')(x)
x = keras.layers.Conv2DTranspose(32, 3, activation='relu')(x)
x = keras.layers.UpSampling2D(3)(x)
x = keras.layers.Conv2DTranspose(16, 3, activation='relu')(x)
decoder_output = keras.layers.Conv2DTranspose(1, 3, activation='relu')(x)
decoder = keras.Model(inputs=decoder_input, outputs=decoder_output, name='decoder')
decoder.summary()

# The autoencoder
autoencoder_input = keras.Input(shape=(28, 28, 1), name='img')
encoded_img = encoder(autoencoder_input)
decoded_img = decoder(encoded_img)
autoencoder = keras.Model(inputs=autoencoder_input, outputs=decoded_img, name='autoencoder')
autoencoder.summary()
Model: "encoder"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
original_img (InputLayer)    [(None, 28, 28, 1)]       0         
_________________________________________________________________
conv2d_8 (Conv2D)            (None, 26, 26, 16)        160       
_________________________________________________________________
conv2d_9 (Conv2D)            (None, 24, 24, 32)        4640      
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 8, 8, 32)          0         
_________________________________________________________________
conv2d_10 (Conv2D)           (None, 6, 6, 32)          9248      
_________________________________________________________________
conv2d_11 (Conv2D)           (None, 4, 4, 16)          4624      
_________________________________________________________________
global_max_pooling2d_2 (Glob (None, 16)                0         
=================================================================
Total params: 18,672
Trainable params: 18,672
Non-trainable params: 0
_________________________________________________________________
Model: "decoder"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
encoder_img (InputLayer)     [(None, 16)]              0         
_________________________________________________________________
reshape_2 (Reshape)          (None, 4, 4, 1)           0         
_________________________________________________________________
conv2d_transpose_8 (Conv2DTr (None, 6, 6, 16)          160       
_________________________________________________________________
conv2d_transpose_9 (Conv2DTr (None, 8, 8, 32)          4640      
_________________________________________________________________
up_sampling2d_2 (UpSampling2 (None, 24, 24, 32)        0         
_________________________________________________________________
conv2d_transpose_10 (Conv2DT (None, 26, 26, 16)        4624      
_________________________________________________________________
conv2d_transpose_11 (Conv2DT (None, 28, 28, 1)         145       
=================================================================
Total params: 9,569
Trainable params: 9,569
Non-trainable params: 0
_________________________________________________________________
Model: "autoencoder"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
img (InputLayer)             [(None, 28, 28, 1)]       0         
_________________________________________________________________
encoder (Model)              (None, 16)                18672     
_________________________________________________________________
decoder (Model)              (None, 28, 28, 1)         9569      
=================================================================
Total params: 28,241
Trainable params: 28,241
Non-trainable params: 0
_________________________________________________________________

As you can see, models can be nested: a model can contain sub-models (since a model behaves just like a layer).
A very useful application of model nesting is ensembling: averaging the outputs of several models. The example below does exactly that:

def get_model():
    inputs = keras.Input(shape=(128, ))
    outputs = keras.layers.Dense(1, activation='sigmoid')(inputs)
    return keras.Model(inputs=inputs, outputs=outputs)

model1 = get_model()
model2 = get_model()
model3 = get_model()

inputs = keras.Input(shape=(128, ))
y1 = model1(inputs)
y2 = model2(inputs)
y3 = model3(inputs)

outputs = keras.layers.average([y1, y2, y3])

ensemble_model = keras.Model(inputs=inputs, outputs=outputs)
ensemble_model.summary()
Model: "model_3"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_4 (InputLayer)            [(None, 128)]        0                                            
__________________________________________________________________________________________________
model (Model)                   (None, 1)            129         input_4[0][0]                    
__________________________________________________________________________________________________
model_1 (Model)                 (None, 1)            129         input_4[0][0]                    
__________________________________________________________________________________________________
model_2 (Model)                 (None, 1)            129         input_4[0][0]                    
__________________________________________________________________________________________________
average (Average)               (None, 1)            0           model[1][0]                      
                                                                 model_1[1][0]                    
                                                                 model_2[1][0]                    
==================================================================================================
Total params: 387
Trainable params: 387
Non-trainable params: 0
__________________________________________________________________________________________________
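The Average layer is just an element-wise mean over the three model outputs, and the 129 parameters per sub-model are 128 weights plus 1 bias. A NumPy sketch of the same ensembling idea (my own illustration, not the author's code):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
x = rng.random((5, 128))                      # a batch of 5 inputs

# three independently initialized "models", each equivalent to Dense(1, sigmoid)
preds = []
for _ in range(3):
    W = rng.normal(size=(128, 1))             # 128 weights
    b = np.zeros(1)                           # 1 bias -> 129 params per sub-model
    preds.append(sigmoid(x @ W + b))          # each prediction: (5, 1)

ensemble = np.mean(preds, axis=0)             # what keras.layers.average does
print(ensemble.shape)   # (5, 1)
```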

Manipulating graphs with complex topologies

Models with multiple inputs and outputs

The tf.keras functional API makes it easy to build graphs with multiple inputs and outputs, which is impossible with the Sequential API.
Here is an example.
Suppose you are building a model that ranks customer issue tickets by priority and routes them to the right department. The model will have three inputs:

  • The title of the ticket (text input)
  • The body of the ticket (text input)
  • Any tags added by the user (a fixed set of tags)

And two outputs:

  • A priority score between 0 and 1 (sigmoid output)
  • The department that should handle the ticket (softmax over the departments)

Let's build this model with the functional API:

num_tags = 12           # total number of ticket tags
num_words = 10000       # vocabulary size used when preprocessing the text
num_departments = 4     # number of departments

title_input = keras.Input(shape=(None, ), name='title')
body_input = keras.Input(shape=(None, ), name='body')
tags_input = keras.Input(shape=(num_tags, ), name='tags')

title_features = keras.layers.Embedding(num_words, 64)(title_input)
body_features = keras.layers.Embedding(num_words, 64)(body_input)
title_features = keras.layers.LSTM(128)(title_features)
body_features = keras.layers.LSTM(32)(body_features)

x = keras.layers.concatenate([title_features, body_features, tags_input])

priority_pred = keras.layers.Dense(1, activation='sigmoid', name='priority')(x)
department_pred = keras.layers.Dense(num_departments, activation='softmax', name='department')(x)

model = keras.Model(inputs=[title_input, body_input, tags_input], outputs=[priority_pred, department_pred])

model.summary()
keras.utils.plot_model(model, 'model.png', show_shapes=True)
Model: "model_4"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
title (InputLayer)              [(None, None)]       0                                            
__________________________________________________________________________________________________
body (InputLayer)               [(None, None)]       0                                            
__________________________________________________________________________________________________
embedding (Embedding)           (None, None, 64)     640000      title[0][0]                      
__________________________________________________________________________________________________
embedding_1 (Embedding)         (None, None, 64)     640000      body[0][0]                       
__________________________________________________________________________________________________
unified_lstm (UnifiedLSTM)      (None, 128)          98816       embedding[0][0]                  
__________________________________________________________________________________________________
unified_lstm_1 (UnifiedLSTM)    (None, 32)           12416       embedding_1[0][0]                
__________________________________________________________________________________________________
tags (InputLayer)               [(None, 12)]         0                                            
__________________________________________________________________________________________________
concatenate (Concatenate)       (None, 172)          0           unified_lstm[0][0]               
                                                                 unified_lstm_1[0][0]             
                                                                 tags[0][0]                       
__________________________________________________________________________________________________
priority (Dense)                (None, 1)            173         concatenate[0][0]                
__________________________________________________________________________________________________
department (Dense)              (None, 4)            692         concatenate[0][0]                
==================================================================================================
Total params: 1,392,097
Trainable params: 1,392,097
Non-trainable params: 0
__________________________________________________________________________________________________

[model.png: plot of the multi-input, multi-output model]

Once the model is created, we can assign a different loss to each output, and even give each loss a different weight:

model.compile(optimizer=keras.optimizers.RMSprop(1e-3),
              loss=['binary_crossentropy', 'categorical_crossentropy'],
              loss_weights=[1., 0.2])

Since we gave each output a name, the losses can also be specified as a dict:

model.compile(optimizer=keras.optimizers.RMSprop(1e-3),
              loss={'priority': 'binary_crossentropy',
                    'department': 'categorical_crossentropy'},
              loss_weights=[1., 0.2])
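With loss_weights=[1., 0.2], the scalar loss that Keras reports is priority_loss + 0.2 * department_loss. A quick check of this arithmetic against the first epoch of the training run later in this section (my own verification, not the author's code):

```python
# Epoch-1 values copied from the training log:
#   loss: 1.2485 - priority_loss: 0.6968 - department_loss: 2.7587
priority_loss, department_loss = 0.6968, 2.7587

total = 1.0 * priority_loss + 0.2 * department_loss   # weighted sum of the two losses
print(round(total, 4))   # 1.2485, matching the reported total loss
```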

For training, we can pass NumPy arrays for the inputs and targets:

import numpy as np
title_data = np.random.randint(num_words, size=(1280, 10))
body_data = np.random.randint(num_words, size=(1280, 100))
tags_data = np.random.randint(2, size=(1280, num_tags)).astype('float32')

priority_data = np.random.random(size=(1280, 1))
department_data = np.random.randint(2, size=(1280, num_departments))

model.fit({'title': title_data, 'body': body_data, 'tags': tags_data},
          {'priority': priority_data, 'department': department_data}, epochs=2, batch_size=32)
Epoch 1/2
1280/1280 [==============================] - 14s 11ms/sample - loss: 1.2485 - priority_loss: 0.6968 - department_loss: 2.7587
Epoch 2/2
1280/1280 [==============================] - 12s 9ms/sample - loss: 1.2022 - priority_loss: 0.6566 - department_loss: 2.7280

<tensorflow.python.keras.callbacks.History at 0x12940e828>

Note that if you pass data via a tf.data.Dataset, each dataset element must be either a tuple of lists, ([title_data, body_data, tags_data], [priority_data, department_data]), or a tuple of dicts, ({'title': title_data, 'body': body_data, 'tags': tags_data}, {'priority': priority_data, 'department': department_data}).

A toy ResNet model

Besides models with multiple inputs and outputs, the functional API also makes it easy to build models with skip connections, i.e. layers that are not connected sequentially. Such models are likewise impossible to express with the Sequential API.
Here is a simple ResNet example:

inputs = keras.Input(shape=(32, 32, 3), name='img')
x = keras.layers.Conv2D(32, 3, activation='relu')(inputs)
x = keras.layers.Conv2D(64, 3, activation='relu')(x)
block_1_output = keras.layers.MaxPooling2D(3)(x)

x = keras.layers.Conv2D(64, 3, activation='relu', padding='same')(block_1_output)
x = keras.layers.Conv2D(64, 3, activation='relu', padding='same')(x)
block_2_output = keras.layers.add([x, block_1_output])

x = keras.layers.Conv2D(64, 3, activation='relu', padding='same')(block_2_output)
x = keras.layers.Conv2D(64, 3, activation='relu', padding='same')(x)
block_3_output = keras.layers.add([x, block_2_output])

x = keras.layers.Conv2D(64, 3, activation='relu')(block_3_output)
x = keras.layers.GlobalAveragePooling2D()(x)
x = keras.layers.Dense(256, activation='relu')(x)
x = keras.layers.Dropout(0.5)(x)
outputs = keras.layers.Dense(10, activation='softmax')(x)

model = keras.Model(inputs=inputs, outputs=outputs, name='toy_resnet')
model.summary()
keras.utils.plot_model(model, 'model.png', show_shapes=True)
Model: "toy_resnet"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
img (InputLayer)                [(None, 32, 32, 3)]  0                                            
__________________________________________________________________________________________________
conv2d_14 (Conv2D)              (None, 30, 30, 32)   896         img[0][0]                        
__________________________________________________________________________________________________
conv2d_15 (Conv2D)              (None, 28, 28, 64)   18496       conv2d_14[0][0]                  
__________________________________________________________________________________________________
max_pooling2d_4 (MaxPooling2D)  (None, 9, 9, 64)     0           conv2d_15[0][0]                  
__________________________________________________________________________________________________
conv2d_16 (Conv2D)              (None, 9, 9, 64)     36928       max_pooling2d_4[0][0]            
__________________________________________________________________________________________________
conv2d_17 (Conv2D)              (None, 9, 9, 64)     36928       conv2d_16[0][0]                  
__________________________________________________________________________________________________
add (Add)                       (None, 9, 9, 64)     0           conv2d_17[0][0]                  
                                                                 max_pooling2d_4[0][0]            
__________________________________________________________________________________________________
conv2d_18 (Conv2D)              (None, 9, 9, 64)     36928       add[0][0]                        
__________________________________________________________________________________________________
conv2d_19 (Conv2D)              (None, 9, 9, 64)     36928       conv2d_18[0][0]                  
__________________________________________________________________________________________________
add_1 (Add)                     (None, 9, 9, 64)     0           conv2d_19[0][0]                  
                                                                 add[0][0]                        
__________________________________________________________________________________________________
conv2d_20 (Conv2D)              (None, 7, 7, 64)     36928       add_1[0][0]                      
__________________________________________________________________________________________________
global_average_pooling2d (Globa (None, 64)           0           conv2d_20[0][0]                  
__________________________________________________________________________________________________
dense_6 (Dense)                 (None, 256)          16640       global_average_pooling2d[0][0]   
__________________________________________________________________________________________________
dropout (Dropout)               (None, 256)          0           dense_6[0][0]                    
__________________________________________________________________________________________________
dense_7 (Dense)                 (None, 10)           2570        dropout[0][0]                    
==================================================================================================
Total params: 223,242
Trainable params: 223,242
Non-trainable params: 0
__________________________________________________________________________________________________
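The residual connections only work because padding='same' keeps the 9x9x64 shape fixed inside each block, so the element-wise add is well defined. A NumPy sketch of one skip connection (an illustration with a shape-preserving stand-in for the convolutions, not the model itself):

```python
import numpy as np

block_input = np.random.rand(1, 9, 9, 64)        # output of the pooling layer

def fake_conv_same(x):
    # stand-in for Conv2D(64, 3, padding='same', activation='relu'):
    # the real layer mixes channels, but like it, it preserves the shape
    return np.maximum(x + 0.1, 0.0)

x = fake_conv_same(fake_conv_same(block_input))
block_output = x + block_input                   # what keras.layers.add([x, block_1_output]) does
print(block_output.shape)   # (1, 9, 9, 64), same shape, so blocks can be chained
```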

[model.png: plot of the toy ResNet]

Now let's train the model:

(x_train, y_train), (x_test, y_test) = keras.datasets.cifar10.load_data()
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.
y_train = keras.utils.to_categorical(y_train, 10)
y_test= keras.utils.to_categorical(y_test, 10)

model.compile(optimizer=keras.optimizers.RMSprop(1e-3),
              loss='categorical_crossentropy', metrics=['acc'])
model.fit(x_train, y_train, epochs=1, batch_size=64, validation_split=0.2)    # hold out 20% of the data for validation
Train on 40000 samples, validate on 10000 samples
40000/40000 [==============================] - 151s 4ms/sample - loss: 1.8996 - acc: 0.2804 - val_loss: 1.5229 - val_acc: 0.4388

<tensorflow.python.keras.callbacks.History at 0x1291aa278>

Shared layers

Another good use of the functional API is for models with shared layers. A shared layer is a layer instance that is reused several times in the same model; it learns features corresponding to multiple paths in the graph of layers.
Shared layers are often used to encode inputs from similar spaces (say, two pieces of text mapped into the same vocabulary). Because they share information across the different inputs, they make it possible to train on less data: if a given word appears in one input, the other inputs also benefit through the shared layer.
To share a layer in the functional API, simply call the same layer instance multiple times. Here is an example where an Embedding layer is shared across two text inputs:

share_embedding = keras.layers.Embedding(1000, 128)

text_input_a = keras.Input(shape=(None, ), dtype='int32')
text_input_b = keras.Input(shape=(None, ), dtype='int32')

encode_input_a = share_embedding(text_input_a)
encode_input_b = share_embedding(text_input_b)
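The effect of sharing is that both inputs index the same weight table. A NumPy sketch of a shared embedding lookup (my own illustration, using the vocabulary size 1000 and dimension 128 from the code above):

```python
import numpy as np

table = np.random.rand(1000, 128)       # one embedding table, shared by both inputs

ids_a = np.array([3, 7, 7])             # token ids from text_input_a
ids_b = np.array([7, 42])               # token ids from text_input_b

vec_a = table[ids_a]                    # (3, 128)
vec_b = table[ids_b]                    # (2, 128)

# the same token id maps to the same vector in both inputs
print(np.array_equal(vec_a[1], vec_b[0]))   # True (both are table[7])
```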

Extracting and reusing layers

Because the graph of layers built with the functional API is a static data structure, it can be accessed and inspected; this is why we can plot the model as an image.
It also means you can access the activations of intermediate layers in the graph and reuse them elsewhere (e.g. for transfer learning), which is very useful for feature extraction.
Here is a VGG19 model pretrained on ImageNet:

vgg19 = keras.applications.VGG19()

These are the model's intermediate activations, obtained by querying the graph's data structure.
Using them, we can create a new feature extraction model in just 3 lines of code; it returns the activation values of every layer of vgg19:

feature_list = [layer.output for layer in vgg19.layers]
feat_extraction_model = keras.Model(inputs=vgg19.input, outputs=feature_list)
img = np.random.random(size=(1, 224, 224, 3))
extracted_feature = feat_extraction_model(img)

Extending the API by writing custom layers

tf.keras ships with many built-in layers, for example:

  • Convolutional layers: Conv1D, Conv2D, Conv3D, and Conv2DTranspose
  • Pooling layers: MaxPooling1D, MaxPooling2D, MaxPooling3D, and AveragePooling1D
  • RNN layers: GRU, LSTM, ConvLSTM2D
  • BatchNormalization, Dropout, Embedding, etc.

If you don't find the layer you need, you can easily extend the API by writing your own.
All layers subclass the Layer class and implement: a call method, which defines the layer's computation, and a build method, which creates the layer's weights (you can also create them in __init__).
Here is a simple reimplementation of Dense:

class CustomDense(keras.layers.Layer):
    def __init__(self, units=32):
        super(CustomDense, self).__init__()
        self.units = units

    def build(self, input_shape):
        self.w = self.add_weight(shape=(input_shape[-1], self.units), initializer='random_normal', trainable=True)
        self.b = self.add_weight(shape=(self.units, ), initializer='random_normal', trainable=True)

    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b

    def get_config(self):
        parent_config = super(CustomDense, self).get_config()
        parent_config['units'] = self.units
        return parent_config

    @classmethod
    def from_config(cls, config):
        return cls(**config)

inputs = keras.Input(shape=(4, ))
outputs = CustomDense(10)(inputs)

model = keras.Model(inputs=inputs, outputs=outputs)

To make your layer serializable, implement the get_config method, which returns a dict of the layer's constructor arguments.
Likewise, to support re-creating the layer from its config, implement the from_config classmethod.
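The get_config/from_config pair gives a simple round trip: the config dict is enough to rebuild an identically configured layer. Stripped of TensorFlow, the pattern looks like this (a minimal sketch of the idiom, not Keras internals):

```python
class TinyLayer:
    def __init__(self, units=32):
        self.units = units

    def get_config(self):
        # everything needed to rebuild this layer
        return {'units': self.units}

    @classmethod
    def from_config(cls, config):
        return cls(**config)

layer = TinyLayer(units=10)
clone = TinyLayer.from_config(layer.get_config())   # serialize, then rebuild
print(clone.units)   # 10
```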

When to use the functional API

How do you decide whether to use the functional API or to subclass the Model class directly?
In general, the functional API is higher-level, easier and safer to use, and offers a number of features that subclassed models do not support.
Subclassing Model, however, gives you greater flexibility and lets you build models the functional API cannot easily express (for example, you cannot implement a Tree-RNN with the functional API).
Here are some strengths of the functional API:

  • Less code: much of the plumbing is already implemented, so you don't have to define it yourself
  • Your model's arguments are validated automatically as you build it
  • The model can be plotted and visualized
  • The model can be serialized and copied

And its weaknesses:

  • It does not support dynamic architectures
  • Sometimes you just need to build everything from scratch…