Getting started with the Keras functional API (translated and adapted from the English Keras documentation)

嗷海胆

Article link: https://blog.csdn.net/qq_32743513/article/details/103605131

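The Keras functional API is the way to define complex models, such as multi-output models, directed acyclic graphs, or models with shared layers.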

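First example: a densely-connected network
The Sequential model is probably a better choice for implementing a network like this, but it helps to start with something really simple.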

from keras.layers import Input, Dense
from keras.models import Model

# This returns a tensor
inputs = Input(shape=(784,))

# A layer instance is callable on a tensor, and returns a tensor
output_1 = Dense(64, activation='relu')(inputs)
output_2 = Dense(64, activation='relu')(output_1)
predictions = Dense(10, activation='softmax')(output_2)

# This creates a model that includes
# the Input layer and three Dense layers
model = Model(inputs=inputs, outputs=predictions)
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.fit(data, labels)  # starts training (data: shape (n, 784); labels: one-hot, shape (n, 10))
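
As a quick sanity check on the network above, the trainable parameter count can be worked out by hand: a Dense layer with n inputs and m units holds n * m weights plus m biases (assuming the default use_bias=True). The sketch below does this in plain Python; the helper name dense_params is made up for illustration, and model.summary() should report the same total.

```python
def dense_params(n_inputs, n_units):
    """Weights plus biases of a fully connected layer."""
    return n_inputs * n_units + n_units

# (inputs, units) for the three Dense layers of the first example
layer_dims = [(784, 64), (64, 64), (64, 10)]
per_layer = [dense_params(n, m) for n, m in layer_dims]
total_params = sum(per_layer)

print(per_layer)     # [50240, 4160, 650]
print(total_params)  # 55050
```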

All models are callable, just like layers
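With the functional API, it is easy to reuse trained models: you can treat any model as if it were a layer, by calling it on a tensor. Note that by calling a model you are reusing not just its architecture but also its weights.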

x = Input(shape=(784,))
# This works, and returns the 10-way softmax we defined above.
y = model(x)  # `model` is the three-Dense-layer network from the first example

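This makes it possible, for instance, to quickly create models that can process sequences of inputs. You could turn an image classification model into a video classification model in just one line: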

from keras.layers import TimeDistributed

# Input tensor for sequences of 20 timesteps,
# each containing a 784-dimensional vector
input_sequences = Input(shape=(20, 784))

# This applies our previous model to every timestep in the input sequences.
# the output of the previous model was a 10-way softmax,
# so the output of the layer below will be a sequence of 20 vectors of size 10.
processed_sequences = TimeDistributed(model)(input_sequences)
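
To make the behavior of TimeDistributed more concrete, here is a minimal plain-Python sketch of the idea: the same function, with the same weights, is applied independently to every timestep of a sequence. The names toy_model and time_distributed are hypothetical stand-ins, not Keras APIs, and the toy classifier is just a fixed projection followed by a softmax.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def toy_model(vector):
    """Hypothetical stand-in for the trained classifier: 784 values -> 10 probabilities."""
    logits = [sum(vector[k::10]) for k in range(10)]
    return softmax(logits)

def time_distributed(fn, sequence):
    """Apply the same function (same weights) independently to every timestep."""
    return [fn(step) for step in sequence]

# One sample: 20 timesteps, each a 784-dimensional vector
sequence = [[0.001 * (t + i) for i in range(784)] for t in range(20)]
outputs = time_distributed(toy_model, sequence)

print(len(outputs), len(outputs[0]))  # 20 10
```

The result has one 10-way probability vector per timestep, which matches the "sequence of 20 vectors of size 10" described in the comments above.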

Multi-input and multi-output models
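Let's consider the following model. We seek to predict how many retweets and likes a news headline will receive on Twitter. The main input to the model is the headline itself, as a sequence of words, but to spice things up, the model also has an auxiliary input that receives extra data such as the time of day when the headline was posted. The model is also supervised via two loss functions. Using the main loss function earlier in a model is a good regularization mechanism for deep models.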
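The main input receives the headline as a sequence of integers (each integer encodes a word). The integers are between 1 and 10,000 (a vocabulary of 10,000 words) and the sequences are 100 words long.
(Translator's note: this part touches on NLP. If you are unfamiliar with it, it is fine not to follow every detail; you may want to read up on NLP, LSTMs and related topics first. In brief: a computer cannot take in raw text and understand it directly, since it essentially represents everything with numbers. A sentence is therefore represented numerically as far as possible: typically the headline is split into words, each word is given a unique numeric identifier (and later a vector), and that sequence is what gets fed into the model.)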
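
The integer encoding described above can be sketched in a few lines of plain Python. Everything here is illustrative: the function encode_headline, the toy vocabulary, and the choice to reserve index 0 for padding and unknown words are assumptions of this sketch, not something the original guide specifies.

```python
def encode_headline(headline, vocabulary, max_len=100):
    """Map words to integer ids, then pad or truncate to max_len.

    Index 0 is reserved here for padding and for unknown words (an assumption
    of this sketch); known words get ids starting at 1.
    """
    ids = [vocabulary.get(word, 0) for word in headline.lower().split()]
    ids = ids[:max_len]
    return ids + [0] * (max_len - len(ids))

# Hypothetical toy vocabulary; a real one would hold up to 10,000 words
vocabulary = {"keras": 1, "makes": 2, "deep": 3, "learning": 4, "easy": 5}

encoded = encode_headline("Keras makes deep learning easy", vocabulary)
print(encoded[:6])   # [1, 2, 3, 4, 5, 0]
print(len(encoded))  # 100
```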

from keras.layers import Input, Embedding, LSTM, Dense
from keras.models import Model
import numpy as np
np.random.seed(0)  # Set a random seed for reproducibility

# Headline input: meant to receive sequences of 100 integers, between 1 and 10000.
# Note that we can name any layer by passing it a "name" argument.
main_input = Input(shape=(100,), dtype='int32', name='main_input')

# This embedding layer will encode the input sequence
# into a sequence of dense 512-dimensional vectors.
x = Embedding(output_dim=512, input_dim=10000, input_length=100)(main_input)

# A LSTM will transform the vector sequence into a single vector,
# containing information about the entire sequence
lstm_out = LSTM(32)(x)

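Here we insert the auxiliary loss, which allows the LSTM and Embedding layers to be trained smoothly even though the main loss is attached much further up in the model.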

auxiliary_output = Dense(1, activation='sigmoid', name='aux_output')(lstm_out)

At this point, we feed the auxiliary input data into the model by concatenating it with the LSTM output:

import keras  # needed for keras.layers.concatenate below

auxiliary_input = Input(shape=(5,), name='aux_input')
x = keras.layers.concatenate([lstm_out, auxiliary_input])  # merge the LSTM output with the auxiliary input

# We stack a deep densely-connected network on top
x = Dense(64, activation='relu')(x)
x = Dense(64, activation='relu')(x)
x = Dense(64, activation='relu')(x)

# And finally we add the main logistic regression layer
main_output = Dense(1, activation='sigmoid', name='main_output')(x)
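
To see what the concatenation step does to the shapes, here is a small plain-Python sketch. The helper concatenate_last_axis is hypothetical and only mimics joining features sample by sample: the 32 LSTM features and the 5 auxiliary features become one 37-dimensional vector per sample, which is what the first Dense(64) layer receives.

```python
def concatenate_last_axis(batch_a, batch_b):
    """Join two batches of vectors sample by sample along the feature axis."""
    if len(batch_a) != len(batch_b):
        raise ValueError("both inputs must have the same batch size")
    return [row_a + row_b for row_a, row_b in zip(batch_a, batch_b)]

batch_size = 12
lstm_features = [[0.0] * 32 for _ in range(batch_size)]  # stands in for lstm_out
aux_features = [[1.0] * 5 for _ in range(batch_size)]    # stands in for auxiliary_input

merged = concatenate_last_axis(lstm_features, aux_features)
print(len(merged), len(merged[0]))  # 12 37
```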

This defines a model with two inputs and two outputs:

model = Model(inputs=[main_input, auxiliary_input], outputs=[main_output, auxiliary_output])

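We compile the model and assign a weight of 0.2 to the auxiliary loss. To specify a different loss_weights or loss value for each output, you can use a list or a dictionary. Here we pass a single loss as the loss argument, so the same loss is used on all outputs.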

model.compile(optimizer='rmsprop', loss='binary_crossentropy',
              loss_weights=[1., 0.2])
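
The quantity that gets minimized is the weighted sum of the per-output losses. The following plain-Python sketch works through that arithmetic with made-up targets and predictions; binary_crossentropy here is a hand-written helper for illustration, not the Keras function.

```python
import math

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy; predictions are clipped away from 0 and 1."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1 - eps)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)

# Made-up targets and predictions for the two outputs
main_loss = binary_crossentropy([1, 0, 1], [0.9, 0.2, 0.6])
aux_loss = binary_crossentropy([1, 0, 1], [0.7, 0.4, 0.5])

loss_weights = [1.0, 0.2]  # same order as the model outputs
total_loss = loss_weights[0] * main_loss + loss_weights[1] * aux_loss

print(main_loss, aux_loss, total_loss)
```

With a weight of 0.2, the auxiliary output still contributes gradients to the LSTM and Embedding layers, but the main output dominates the objective.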

We can train the model by passing it lists of input arrays and target arrays:

headline_data = np.round(np.abs(np.random.rand(12, 100) * 100))
additional_data = np.random.randn(12, 5)
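# Binary targets in {0, 1}, matching the sigmoid outputs and binary_crossentropy
headline_labels = np.random.randint(0, 2, size=(12, 1)).astype('float32')
additional_labels = np.random.randint(0, 2, size=(12, 1)).astype('float32')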
model.fit([headline_data, additional_data], [headline_labels, additional_labels],
          epochs=50, batch_size=32)

Since our inputs and outputs are named (we passed them a "name" argument), we could also have compiled the model via:

model.compile(optimizer='rmsprop',
              loss={'main_output': 'binary_crossentropy', 'aux_output': 'binary_crossentropy'},
              loss_weights={'main_output': 1., 'aux_output': 0.2})

# And trained it via:
model.fit({'main_input': headline_data, 'aux_input': additional_data},
          {'main_output': headline_labels, 'aux_output': additional_labels},
          epochs=50, batch_size=32)

To use the model for prediction:

model.predict({'main_input': headline_data, 'aux_input': additional_data})

Or alternatively:

pred = model.predict([headline_data, additional_data])

Shared layers
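Another good use for the functional API is models that use shared layers. Let's take a look at shared layers.
Consider a dataset of tweets. We want to build a model that can tell whether two tweets are from the same person or not (this would allow us, for instance, to compare users by the similarity of their tweets).
One way to achieve this is to build a model that encodes two tweets into two vectors, concatenates the vectors and then adds a logistic regression; this outputs the probability that the two tweets share the same author. The model would then be trained on positive tweet pairs (same author) and negative tweet pairs (different authors).
Let's build this with the functional API. As input for a tweet we take a binary matrix of shape (280, 256), i.e. a sequence of 280 vectors of size 256, where each dimension of the 256-dimensional vector encodes the presence or absence of a character (out of an alphabet of 256 frequent characters).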

import keras
from keras.layers import Input, LSTM, Dense
from keras.models import Model

tweet_a = Input(shape=(280, 256))
tweet_b = Input(shape=(280, 256))

To share a layer across different inputs, simply instantiate the layer once, then call it on as many inputs as you want:

# This layer can take as input a matrix
# and will return a vector of size 64
shared_lstm = LSTM(64)

# When we reuse the same layer instance
# multiple times, the weights of the layer
# are also being reused
# (it is effectively *the same* layer)
encoded_a = shared_lstm(tweet_a)
encoded_b = shared_lstm(tweet_b)

# We can then concatenate the two vectors:
merged_vector = keras.layers.concatenate([encoded_a, encoded_b], axis=-1)

# And add a logistic regression on top
predictions = Dense(1, activation='sigmoid')(merged_vector)

# We define a trainable model linking the
# tweet inputs to the predictions
model = Model(inputs=[tweet_a, tweet_b], outputs=predictions)

model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy'])
model.fit([data_a, data_b], labels, epochs=10)
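
The key point of the example above is that both branches go through one layer object, so there is only one set of weights. The toy class below (plain Python, not a Keras layer; ToyDense is a hypothetical name) makes that visible: changing the weights once changes the result of both calls.

```python
class ToyDense:
    """Toy layer (not Keras): y = w * x + b on a list of floats."""

    def __init__(self, w, b):
        self.w = w
        self.b = b

    def __call__(self, inputs):
        return [self.w * x + self.b for x in inputs]

shared = ToyDense(w=2.0, b=1.0)   # instantiated once
out_a = shared([1.0, 2.0])        # first call
out_b = shared([10.0, 20.0])      # second call reuses w and b

print(out_a, out_b)  # [3.0, 5.0] [21.0, 41.0]

# Changing the single set of weights changes both branches
shared.w = 3.0
print(shared([1.0, 2.0]), shared([10.0, 20.0]))  # [4.0, 7.0] [31.0, 61.0]
```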

Let's pause to take a look at how to read a shared layer's output or output shape.

The concept of layer "node"
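Whenever you call a layer on some input, you create a new tensor (the output of the layer), and you add a "node" to the layer that links the input tensor to the output tensor. When you call the same layer multiple times, that layer owns multiple nodes, indexed 0, 1, 2, and so on.
In earlier versions of Keras, you could obtain the output tensor of a layer instance via layer.get_output(), or its output shape via layer.output_shape. You still can (except that get_output() has been replaced by the property output). But what if a layer is connected to multiple inputs?
As long as a layer is connected to only one input, there is no confusion, and .output returns the one output of the layer: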

a = Input(shape=(280, 256))

lstm = LSTM(32)
encoded_a = lstm(a)

assert lstm.output == encoded_a

Not so if the layer has multiple inputs:

a = Input(shape=(280, 256))
b = Input(shape=(280, 256))

lstm = LSTM(32)
encoded_a = lstm(a)
encoded_b = lstm(b)

lstm.output

>> AttributeError: Layer lstm_1 has multiple inbound nodes,
hence the notion of "layer output" is ill-defined.
Use `get_output_at(node_index)` instead.

The following works instead:

assert lstm.get_output_at(0) == encoded_a
assert lstm.get_output_at(1) == encoded_b
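
The bookkeeping behind this behavior can be imitated with a toy class. This is only a mental model written in plain Python, not the Keras implementation: ToyLayer and its string "tensors" are hypothetical, while the method name get_output_at mirrors the one used above.

```python
class ToyLayer:
    """Toy model of node bookkeeping (not the Keras implementation)."""

    def __init__(self, name):
        self.name = name
        self._nodes = []  # one (input, output) pair per call

    def __call__(self, tensor):
        output = f"{self.name}({tensor})"  # symbolic stand-in for an output tensor
        self._nodes.append((tensor, output))
        return output

    @property
    def output(self):
        # Only well-defined when the layer has been called exactly once
        if len(self._nodes) != 1:
            raise AttributeError(
                f"Layer {self.name} has {len(self._nodes)} inbound nodes; "
                "use get_output_at(node_index) instead."
            )
        return self._nodes[0][1]

    def get_output_at(self, node_index):
        return self._nodes[node_index][1]

lstm = ToyLayer("lstm")
encoded_a = lstm("a")
assert lstm.output == encoded_a          # one node: unambiguous

encoded_b = lstm("b")                    # second call adds node 1
assert lstm.get_output_at(0) == encoded_a
assert lstm.get_output_at(1) == encoded_b

try:
    lstm.output
except AttributeError as err:
    print(err)
```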

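The same is true for the properties input_shape and output_shape: as long as the layer has only one node, or as long as all of its nodes have the same input/output shape, the notion of "layer input/output shape" is well defined, and that one shape is returned by layer.input_shape / layer.output_shape. But if, for instance, you apply the same Conv2D layer to an input of shape (32, 32, 3) and then to an input of shape (64, 64, 3), the layer will have multiple input/output shapes, and you have to fetch them by specifying the index of the node they belong to: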

from keras.layers import Input, Conv2D

a = Input(shape=(32, 32, 3))
b = Input(shape=(64, 64, 3))

conv = Conv2D(16, (3, 3), padding='same')
conved_a = conv(a)

# Only one input so far, the following will work:
assert conv.input_shape == (None, 32, 32, 3)

conved_b = conv(b)
# now the `.input_shape` property wouldn't work, but this does:
assert conv.get_input_shape_at(0) == (None, 32, 32, 3)
assert conv.get_input_shape_at(1) == (None, 64, 64, 3)
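
The matching output shapes can be reasoned about by hand. With padding='same', each spatial dimension of the output is ceil(input / stride), and the number of weights does not depend on the spatial size at all, which is why one Conv2D instance can be called on both inputs. The sketch below is plain Python under the assumption of the default channels-last data format; both helper names are hypothetical.

```python
import math

def conv2d_output_shape(input_shape, filters, strides=(1, 1)):
    """Output shape of a channels-last 2D convolution with padding='same'."""
    batch, height, width, _channels = input_shape
    return (batch,
            math.ceil(height / strides[0]),
            math.ceil(width / strides[1]),
            filters)

def conv2d_params(kernel_size, in_channels, filters):
    """Kernel weights plus one bias per filter."""
    return kernel_size[0] * kernel_size[1] * in_channels * filters + filters

print(conv2d_output_shape((None, 32, 32, 3), filters=16))  # (None, 32, 32, 16)
print(conv2d_output_shape((None, 64, 64, 3), filters=16))  # (None, 64, 64, 16)
print(conv2d_params((3, 3), in_channels=3, filters=16))    # 448
```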
