A Summary of the keras_tuner Library (Based on the Official Examples)

MarToony|名角

Category: Machine Learning. Tags: deep learning, keras, tensorflow, hyperparameter tuning, keras_tuner

Article link: https://blog.csdn.net/m0_38052500/article/details/109647800

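Using the keras_tuner library, based on the Examples from the official site.

In kerastuner, an "entry" is a hyperparameter object registered in the search space (the term appears in the description of tune_new_entries).
A note on terminology: some of the terms used below are my own coinage. The content and concepts in this post are best understood by reading them alongside the original example code and experimenting.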

① example1

from tensorflow import keras
from tensorflow.keras import layers
from kerastuner.tuners import RandomSearch
from kerastuner.engine.hypermodel import HyperModel
from kerastuner.engine.hyperparameters import HyperParameters

(x, y), (val_x, val_y) = keras.datasets.mnist.load_data()
x = x.astype('float32') / 255.
val_x = val_x.astype('float32') / 255.
x = x[:10000]
y = y[:10000]

"""Basic case:
- We define a `build_model` function
- It returns a compiled model
- It uses hyperparameters defined on the fly
"""

def build_model(hp):
    model = keras.Sequential()
    model.add(layers.Flatten(input_shape=(28, 28)))
    for i in range(hp.Int('num_layers', 2, 20)):
        model.add(layers.Dense(units=hp.Int('units_' + str(i), 32, 512, 32),
                               activation='relu'))
    model.add(layers.Dense(10, activation='softmax'))
    model.compile(
        optimizer=keras.optimizers.Adam(
            hp.Choice('learning_rate', [1e-2, 1e-3, 1e-4])),
        loss='sparse_categorical_crossentropy',
        metrics=['accuracy'])
    return model

# Configure the tuner with the required arguments
tuner = RandomSearch(
    build_model,
    objective='val_accuracy',
    max_trials=5,  
    executions_per_trial=5, 
    directory='test_dir')

# tuner.search_space_summary()

tuner.search(x=x,
             y=y,
             epochs=4,
             validation_data=(val_x, val_y),
             batch_size=20)

tuner.results_summary()

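Notes:

The arguments of tuner.search are the same as those of model.fit in Keras, including batch_size (for mini-batch gradient descent).
In kerastuner.tuners.RandomSearch, max_trials is the total number of hyperparameter combinations tried during the search, and executions_per_trial is the number of times each combination is trained (default 1). Every execution runs the number of epochs passed to tuner.search, so one trial costs executions_per_trial * epochs epochs and the whole search costs max_trials * executions_per_trial * epochs. The point of repeating is to reduce the variance of the results so that each combination is evaluated more reliably: every execution trains a freshly initialized model, and the trial's score is averaged over the executions.
Inside build_model, hp.Int and hp.Choice serve the same purpose: both pick one value from a specified set of candidates. The difference is that hp.Int takes an interval (min_value to max_value, optionally with a step), whereas hp.Choice takes an explicit list of discrete values.
As an aside: by default the tuner saves the model weights of every trial as checkpoints and can resume an interrupted search. When the program is run again and fewer than max_trials trials have completed, it continues with the remaining trials. Otherwise, unless results_summary() is called, it does nothing and prints nothing.
How to read the output printed by tuner.results_summary():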

[Results summary]  
[Trial summary]
 |-Trial ID: ce065e2a0531cf2cb888a4be6bd5f69f
 |-Score: 0.922279953956604
 |-Best step: 0
 > Hyperparameters:
...  ...
[Trial summary]
 |-Trial ID: ce065e2a0531cf2cb888a4be6bd5f69f
 |-Score: 0.922279953956604
 |-Best step: 0
 > Hyperparameters:
...  ...
[Trial summary]
 |-Trial ID: ce065e2a0531cf2cb888a4be6bd5f69f
 |-Score: 0.922279953956604
 |-Best step: 0
 > Hyperparameters:
...  ...

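The number of [Trial summary] blocks under [Results summary] matches the max_trials value passed to RandomSearch. The content of each [Trial summary] block has in fact already been printed to the console during training (more precisely, after each trial finishes; this is built-in behavior and is not triggered by results_summary or search_space_summary). Calling tuner.results_summary() simply prints it again.
The key point is that the Results summary is sorted by each trial's Score in descending order, so the first hyperparameter combination listed is the best one.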
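
The bookkeeping described in the notes above can be sketched in plain Python. The settings come from Example 1; the per-execution val_accuracy values and the trial names are made up for illustration, and the sketch assumes that a trial's score is the mean over its executions. It is not KerasTuner code.

```python
from statistics import mean

# Settings taken from Example 1.
max_trials = 5
executions_per_trial = 5
epochs = 4

# Training cost of one trial and of the whole search, counted in epochs.
epochs_per_trial = executions_per_trial * epochs
total_epochs = max_trials * epochs_per_trial
print(epochs_per_trial, total_epochs)

# Hypothetical val_accuracy of each execution for two trials.
execution_scores = {
    "trial_a": [0.90, 0.92, 0.91, 0.93, 0.89],
    "trial_b": [0.95, 0.80, 0.85, 0.90, 0.75],
}

# Assumed scoring rule: average over executions, then rank in descending order.
trial_score = {name: mean(scores) for name, scores in execution_scores.items()}
ranking = sorted(trial_score, key=trial_score.get, reverse=True)
print(ranking)
```

Although trial_b contains the single best run, trial_a ranks first once the scores are averaged, which is the sense in which repeated executions make the comparison less sensitive to one lucky initialization.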

tuner.search_space_summary() prints, before training starts, all the HyperParameters objects (to use that name for now) that the tunable model involves:

[Search space summary]
 |-Default search space size: 4
 > num_layers (Int)
 |-default: None
 |-max_value: 20
 |-min_value: 2
 |-sampling: None
 |-step: 1
 > units_0 (Int)
 |-default: None
 |-max_value: 512
 |-min_value: 32
 |-sampling: None
 |-step: 32
 > units_1 (Int)
 |-default: None
 |-max_value: 512
 |-min_value: 32
 |-sampling: None
 |-step: 32
 > learning_rate (Choice)
 |-default: 0.01
 |-ordered: True
 |-values: [0.01, 0.001, 0.0001]

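Comparing this output with build_model shows that the names match the hyperparameters defined there. units_ appears twice because hp.Int('num_layers', 2, 20) has a minimum of 2, which is also the default used when the search space is first built. Whatever value this hyperparameter takes lies between 2 and 20, so the line model.add(layers.Dense(units=hp.Int('units_' + str(i), 32, 512, 32),activation='relu')) runs at least twice and produces at least two fully connected layers, whose hyperparameters are units_0 and units_1.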
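
Why exactly four entries show up can be illustrated with a toy recorder. ToyHyperParameters and describe_model below are hypothetical stand-ins written for this sketch, not KerasTuner classes; the sketch assumes that an Int defaults to its min_value and a Choice to its first value, and that the search space is discovered by running the model-building function once with those defaults.

```python
class ToyHyperParameters:
    """Toy stand-in: records each requested hyperparameter and returns its default."""

    def __init__(self):
        self.space = {}  # name -> description, in the order the names were requested

    def Int(self, name, min_value, max_value, step=1):
        self.space[name] = ("Int", min_value, max_value, step)
        return min_value  # assumed default for an Int

    def Choice(self, name, values):
        self.space[name] = ("Choice", list(values))
        return values[0]  # assumed default for a Choice


def describe_model(hp):
    """Mirrors the hyperparameter requests made by build_model, without building a network."""
    num_layers = hp.Int("num_layers", 2, 20)
    units = [hp.Int("units_" + str(i), 32, 512, 32) for i in range(num_layers)]
    learning_rate = hp.Choice("learning_rate", [1e-2, 1e-3, 1e-4])
    return num_layers, units, learning_rate


hp = ToyHyperParameters()
describe_model(hp)
names = list(hp.space)
print(names)  # ['num_layers', 'units_0', 'units_1', 'learning_rate']
```

With num_layers at its default of 2, the loop requests only units_0 and units_1, which matches the "Default search space size: 4" line in the summary.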

② override the loss and metrics

tuner = RandomSearch(
    build_model,
    objective='val_accuracy',
    loss=keras.losses.SparseCategoricalCrossentropy(name='my_loss'),
    metrics=['accuracy', 'mse'],
    max_trials=5,
    directory='test_dir')
# Sample training log:
10000/10000 [==============================] - 2s 159us/sample - loss: 2.3019 - accuracy: 0.1093 - mse: 27.1429 - val_loss: 2.3027 - val_accuracy: 0.1135 - val_mse: 27.2504

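The output above shows that every metric evaluated on the validation set is reported with the prefix val_ (val_loss, val_accuracy, val_mse). Also note that the training metrics are updated after every batch, whereas the validation metrics are computed once per epoch.
Metrics and the loss are computed differently, but both can be used to judge how well the model fits. The loss has one more important role: it is the quantity that backpropagation differentiates.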
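
The difference between a loss and a metric can be seen on a tiny hand-made batch. The two helper functions and the probability values below are illustrative assumptions written in plain Python; they follow the usual definitions of sparse categorical cross-entropy and accuracy rather than calling Keras.

```python
import math


def sparse_categorical_crossentropy(probs, labels):
    """Mean negative log-probability assigned to the true class."""
    return sum(-math.log(p[y]) for p, y in zip(probs, labels)) / len(labels)


def accuracy(probs, labels):
    """Fraction of samples whose highest-probability class is the true class."""
    hits = sum(1 for p, y in zip(probs, labels) if p.index(max(p)) == y)
    return hits / len(labels)


labels = [0, 1]
confident = [[0.90, 0.10], [0.20, 0.80]]  # hypothetical predictions of model A
hesitant = [[0.60, 0.40], [0.45, 0.55]]   # hypothetical predictions of model B

acc_confident = accuracy(confident, labels)
acc_hesitant = accuracy(hesitant, labels)
loss_confident = sparse_categorical_crossentropy(confident, labels)
loss_hesitant = sparse_categorical_crossentropy(hesitant, labels)
print(acc_confident, acc_hesitant)
print(loss_confident < loss_hesitant)
```

Both sets of predictions classify every sample correctly, so the accuracy is identical, yet the loss still separates them. That smooth sensitivity to the predicted probabilities is what makes the loss usable for gradient-based training.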

③ define a HyperModel subclass

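This is really just a further encapsulation of the network structure and adds no new technique, but it may make later optimization more convenient.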

class MyHyperModel(HyperModel):
    def __init__(self, img_size, num_classes):
        super().__init__()  # let HyperModel set up its own attributes
        self.img_size = img_size
        self.num_classes = num_classes

    def build(self, hp):
        model = keras.Sequential()
        model.add(layers.Flatten(input_shape=self.img_size))
        for i in range(hp.Int('num_layers', 2, 20)):
            model.add(layers.Dense(units=hp.Int('units_' + str(i), 32, 512, 32),
                                   activation='relu'))
        model.add(layers.Dense(self.num_classes, activation='softmax'))
        model.compile(
            optimizer=keras.optimizers.Adam(
                hp.Choice('learning_rate', [1e-2, 1e-3, 1e-4])),
            loss='sparse_categorical_crossentropy',
            metrics=['accuracy'])
        return model

tuner = RandomSearch(
    MyHyperModel(img_size=(28, 28), num_classes=10),
    objective='val_accuracy',
    max_trials=5,
    directory='test_dir')

④ restrict the search space

This means that default values are being used for params that are left out.

# Compare the RandomSearch call from Example 1 with the one used in this example.
# Example 1:
tuner = RandomSearch(
    build_model,
    objective='val_accuracy',
    max_trials=5,
    executions_per_trial=1,
    directory='test_dir')
# This example:
hp = HyperParameters()
hp.Choice('learning_rate', [1e-1, 1e-3])

tuner = RandomSearch(
    build_model,
    max_trials=5,
    hyperparameters=hp,
    tune_new_entries=False,
    objective='val_accuracy')

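As the code above shows, this approach lets you change, from outside the model, the value range of a hyperparameter that is already defined inside it. The reason for not editing the model itself is probably that this caters to spur-of-the-moment ideas: it is a temporary way to run quick experiments.
HyperParameters() creates a "container for both a hyperparameter space, and current values". It has two attributes: a list of the hyperparameter objects (space) and a dict that maps each hyperparameter's name to its current value (values).
tune_new_entries defaults to True. To set it to False, you must define hp = HyperParameters() in the code and pass hyperparameters=hp to RandomSearch.
The documentation describes tune_new_entries as: "Whether hyperparameter entries that are requested by the hypermodel but that were not specified in hyperparameters should be added to the search space, or not. If not, then the default value for these parameters will be used." Here, "hyperparameter entries that are requested by the hypermodel but that were not specified in hyperparameters" are hyperparameter objects that are declared in the model but not defined in the hp = HyperParameters() object.
(So an "entry" in kerastuner is a hyperparameter object in the search space, not one of its candidate values.)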
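
The rule quoted above can be mimicked with a small pure-Python sketch. The function resolve_search_space and its dictionaries are hypothetical illustrations of the described behavior, not KerasTuner code, and the sketch treats the first candidate of each hyperparameter as its default.

```python
def resolve_search_space(requested, specified, tune_new_entries):
    """Split hyperparameters into those that are searched and those held at a default.

    requested: name -> candidate values declared inside build_model
               (the first candidate is treated as the default in this sketch).
    specified: name -> candidate values given through the external hp object.
    """
    searched, held_at_default = {}, {}
    for name, candidates in requested.items():
        if name in specified:
            # An externally specified entry overrides the range declared in the model.
            searched[name] = specified[name]
        elif tune_new_entries:
            # Entries that are new to hp are added to the search space.
            searched[name] = candidates
        else:
            # Entries that are new to hp are not tuned; their default is used.
            held_at_default[name] = candidates[0]
    return searched, held_at_default


requested = {
    "num_layers": list(range(2, 21)),
    "learning_rate": [1e-2, 1e-3, 1e-4],
}
specified = {"learning_rate": [1e-1, 1e-3]}

searched_off, fixed_off = resolve_search_space(requested, specified, tune_new_entries=False)
searched_on, fixed_on = resolve_search_space(requested, specified, tune_new_entries=True)
print(searched_off, fixed_off)
print(sorted(searched_on), fixed_on)
```

With tune_new_entries=False only learning_rate is searched and num_layers stays at its default. With True, both are searched, and learning_rate still uses the externally supplied candidates.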

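A scenario:
Suppose we have defined in the model a set of hyperparameter objects to be tuned. We could happily go ahead and train the hypermodel as it is.
But any process involves adjusting details along the way, and tuning a hypermodel is no exception.
What if we want to experiment with combinations of only a few of the hyperparameter objects rather than all of them?

Without adding any extra code, one option is to turn the hyperparameter objects that should temporarily stay out of the experiment into ordinary values, that is, to hard-code a value. For example, change optimizer=keras.optimizers.Adam(hp.Choice('learning_rate', [1e-2, 1e-4], default=1e-4)) to optimizer=keras.optimizers.Adam(1e-4). This achieves the goal but is tedious, especially when there are many hyperparameter objects.
Recommended approach: add a little code, as follows: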

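# 1. Create a new, empty HyperParameters instance.
hp = HyperParameters()
# 2. Redefine the hyperparameters we currently want to study.
hp.Int('num_layers', 2, 6)
# 3. Pass hyperparameters=hp so that these definitions replace the hyperparameters of the
#    same name in the hypermodel, and set tune_new_entries=False so that the hyperparameters
#    we do not want to consider for now are left out of the search and take their default
#    value (the default given in build_model; if no default is specified, the first value
#    in the hyperparameter's range).
tuner = RandomSearch(
    build_model,
    objective='val_accuracy',
    max_trials=5,
    hyperparameters=hp,
    tune_new_entries=False)

# 4. If the default of a hyperparameter we are leaving out is not the value we want,
#    we can pin it with hp.Fixed.
hp.Fixed('learning_rate', 1e-1)
# 5. It is also possible to override the range of one hyperparameter while the other
#    hyperparameters in the hypermodel are still tuned; this needs tune_new_entries=True.
hp.Int('num_layers', 2, 6)
tuner = RandomSearch(
    build_model,
    objective='val_accuracy',
    max_trials=5,
    hyperparameters=hp,
    tune_new_entries=True)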

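Addendum: tune_new_entries can also be understood as deciding whether the hyperparameters defined in the hypermodel use their default value or are searched over their values, that is, over (min_value, max_value) or the list of choices. True means the latter and False the former.
At this point, the fifth (on Fixed), sixth (overriding a range with tune_new_entries=True), and seventh examples from the original need no separate discussion; see the code above for the details.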

# The complete code for this part is as follows:
def build_model(hp):
    model = keras.Sequential()
    model.add(layers.Flatten(input_shape=(28, 28)))
    for i in range(hp.Int('num_layers', 10, 20)):
        model.add(layers.Dense(units=hp.Int('units_' + str(i), 32, 512, 32),
                               activation='relu'))
    model.add(layers.Dense(10, activation='softmax'))
    model.compile(
        optimizer=keras.optimizers.Adam(
            hp.Choice('learning_rate', [1e-2, 1e-4], default=1e-4)),
        loss='sparse_categorical_crossentropy',
        metrics=['accuracy'])
    return model
    
hp = HyperParameters()
hp.Fixed('learning_rate', 1e-1)
hp.Int('num_layers', 2, 6)

tuner = RandomSearch(
    build_model,
    max_trials=5,
    hyperparameters=hp,
    tune_new_entries=False,
    objective='val_accuracy')

tuner.search_space_summary()
# Output:
[Search space summary]
 |-Default search space size: 2
 > learning_rate (Fixed)
 |-value: 0.1
 > num_layers (Int)
 |-default: None
 |-max_value: 6
 |-min_value: 2
 |-sampling: None
 |-step: 1