TensorFlow 2.0 Study Notes: Hyperparameter Search

Hyperparameters: parameters that stay fixed during neural network training.

  1. Network structure parameters: number of layers, width of each layer (number of neurons), activation function of each layer, etc.
  2. Training parameters: batch_size, learning rate (alpha), learning-rate schedule, etc.
    Tuning these by hand is expensive --> hyperparameter search

Common hyperparameter search methods:

(1) Grid search: discretize each hyperparameter --> form all combinations --> evaluate them one by one (parallelizable)
(2) Random search: randomly sample parameter combinations
(3) Genetic algorithm search:

      (a) Initialize a population of parameter sets -> train -> use the model's metric as its survival probability
      (b) Selection -> crossover -> mutation -> produce the next generation
      (c) Return to (a)

(4) Heuristic search: AutoML (an active research area)
      A recurrent neural network generates the parameters -> reinforcement learning provides feedback, and that feedback trains the model that generates the parameters
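The grid and random strategies above can be sketched without any framework. A minimal example, where `evaluate` is a hypothetical stand-in for training a model and returning a validation score:

```python
import itertools
import random

# Toy search space over two hyperparameters.
space = {
    "learning_rate": [1e-4, 1e-3, 1e-2],
    "layer_size": [10, 30, 100],
}

def evaluate(params):
    # Hypothetical score: pretend learning rates near 1e-3 and
    # small layers do best (a real score would come from training).
    return -abs(params["learning_rate"] - 1e-3) - 0.001 * params["layer_size"]

# Grid search: enumerate every combination (each can be evaluated in parallel).
grid = [dict(zip(space, values)) for values in itertools.product(*space.values())]
best_grid = max(grid, key=evaluate)

# Random search: sample a fixed number of combinations instead.
random.seed(0)
samples = [{k: random.choice(v) for k, v in space.items()} for _ in range(5)]
best_random = max(samples, key=evaluate)

print(len(grid), best_grid["learning_rate"])  # 9 combinations; grid finds 1e-3
```

Grid search costs grow multiplicatively with each added hyperparameter, which is why random search is often preferred when the budget is small.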

Two simple hyperparameter search examples:

  1. Grid search over the learning rate (implemented by hand)
  2. Hyperparameter search with sklearn
import matplotlib as mpl
import matplotlib.pyplot as plt
%matplotlib inline
import numpy as np
import sklearn
import pandas as pd
import os
import sys
import time
import tensorflow as tf

from tensorflow import keras
from sklearn.datasets import fetch_california_housing
housing = fetch_california_housing()
# print(housing.DESCR)
# print(housing.data.shape)
# print(housing.target.shape)
from sklearn.model_selection import train_test_split

x_train_all, x_test, y_train_all, y_test = train_test_split(
    housing.data, housing.target, random_state = 7)
x_train, x_valid, y_train, y_valid = train_test_split(
    x_train_all, y_train_all, random_state = 11)
print(x_train.shape, y_train.shape)
print(x_valid.shape, y_valid.shape)
print(x_test.shape, y_test.shape)
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
x_train_scaled = scaler.fit_transform(x_train)
x_valid_scaled = scaler.transform(x_valid)
x_test_scaled = scaler.transform(x_test)
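Note that the scaler is fit on the training split only; the validation and test splits reuse the training mean and standard deviation. A NumPy-only sketch of the same idea, with made-up data:

```python
import numpy as np

rng = np.random.default_rng(0)
x_tr = rng.normal(loc=5.0, scale=2.0, size=(100, 3))  # stand-in training data
x_va = rng.normal(loc=5.0, scale=2.0, size=(20, 3))   # stand-in validation data

# Statistics come from the training split only.
mean, std = x_tr.mean(axis=0), x_tr.std(axis=0)
x_tr_scaled = (x_tr - mean) / std
x_va_scaled = (x_va - mean) / std  # same mean/std, no refitting

print(x_tr_scaled.mean(axis=0).round(6))  # ~0 per feature after scaling
```

Refitting the scaler on validation or test data would leak their statistics into the evaluation.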

1. Grid search over the learning rate (implemented by hand)

# learning rates to try: [1e-4, 3e-4, 1e-3, 3e-3, 1e-2, 3e-2]
# SGD update rule: W = W - learning_rate * grad
# The learning rate is set via keras.optimizers.SGD(lr).
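The update moves against the gradient, W = W - learning_rate * grad, scaled by the learning rate. A one-parameter sketch of that rule on a toy quadratic:

```python
# Minimize f(w) = (w - 3)^2 with plain gradient descent.
def grad(w):
    return 2.0 * (w - 3.0)  # df/dw

w = 0.0
learning_rate = 0.1
for _ in range(100):
    w = w - learning_rate * grad(w)  # step against the gradient

print(round(w, 4))  # converges near the minimum at 3.0
```

Too small a learning rate converges slowly; too large a one overshoots and diverges, which is exactly what the search below probes.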

learning_rates = [1e-4,3e-4,1e-3,3e-3,1e-2,3e-2]
histories = [] # list to store the result of each model.fit() call
for lr in learning_rates:
    model = keras.models.Sequential([
        keras.layers.Dense(30, activation='relu',
                          input_shape=x_train.shape[1: ]),
        keras.layers.Dense(1),
    ])
    
    optimizer = keras.optimizers.SGD(lr) 
    
    model.compile(loss="mean_squared_error", optimizer=optimizer)
    callbacks = [keras.callbacks.EarlyStopping(
        patience=5, min_delta=1e-2)]
    history = model.fit(x_train_scaled, y_train,
                       validation_data = (x_valid_scaled, y_valid),
                       epochs = 10,
                       callbacks = callbacks)
    histories.append(history)
def plot_learning_curves(history):
    pd.DataFrame(history.history).plot(figsize=(8, 5))
    plt.grid(True)
    plt.gca().set_ylim(0, 3)
    plt.show()
for lr,history in zip(learning_rates,histories):
    print("learning_rate:",lr)
    plot_learning_curves(history)

2. Hyperparameter search with sklearn

# learning rate search with sklearn: RandomizedSearchCV
# 1. Wrap the tf.keras model as a sklearn model
# 2. Define the parameter distributions
# 3. Run the search

def build_model(hidden_layers = 1, layer_size = 30, learning_rate = 3e-3):
    model = keras.models.Sequential()
    model.add(keras.layers.Dense(layer_size, activation='relu',
                                 input_shape=x_train.shape[1:]))
    for _ in range(hidden_layers - 1):
        model.add(keras.layers.Dense(layer_size, activation='relu'))
    model.add(keras.layers.Dense(1))
    optimizer = keras.optimizers.SGD(learning_rate)  # set the learning rate
    model.compile(loss="mse", optimizer=optimizer)
    return model

sklearn_model = keras.wrappers.scikit_learn.KerasRegressor(build_model) # wrap the tf.keras model as a sklearn estimator

callbacks = [keras.callbacks.EarlyStopping(patience=5, min_delta=1e-2)]
history = sklearn_model.fit(x_train_scaled,y_train,epochs = 10,
                   validation_data = (x_valid_scaled, y_valid),callbacks = callbacks)
# Define the parameter distributions
from scipy.stats import reciprocal
# reciprocal distribution: f(x) = 1/(x*log(b/a)), a <= x <= b

param_distribution = {
    "hidden_layers": [1,2,3,4],
    "layer_size": np.arange(1,100),
    "learning_rate": reciprocal(1e-4,1e-2)
}
#reciprocal.rvs(1e-4,1e-2,size=10)
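The reciprocal (log-uniform) distribution can also be sampled by hand via its inverse CDF, x = a * (b/a)**u with u uniform on [0, 1) — a sketch to show why it spreads samples evenly across orders of magnitude, which is what you want for a learning rate:

```python
import random

random.seed(0)
a, b = 1e-4, 1e-2

# Inverse-CDF sampling of reciprocal(a, b): log(x) is uniform on [log a, log b].
samples = [a * (b / a) ** random.random() for _ in range(1000)]

print(all(a <= x <= b for x in samples))  # all samples fall in [a, b]
```

Roughly half the samples land below the geometric midpoint sqrt(a*b) = 1e-3, whereas a plain uniform over [1e-4, 1e-2] would put ~90% of its mass above 1e-3.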

from sklearn.model_selection import RandomizedSearchCV

random_search_cv = RandomizedSearchCV(sklearn_model,param_distribution,
                                     n_iter=3,cv=3,n_jobs=1)
# Randomly samples the initial parameters.
# n_iter: number of parameter combinations sampled from the distributions; n_jobs: number of parallel workers.

random_search_cv.fit(x_train_scaled,y_train,epochs=10,
                    validation_data = (x_valid_scaled,y_valid),
                    callbacks = callbacks)

# cross-validation: split the training set into n folds; train on n-1 of them, validate on the last.
# The hyperparameter search uses cross-validation; here cv=3.
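The folding scheme can be sketched with NumPy alone: with cv=3, each fold serves once as the validation set while the other two are used for training.

```python
import numpy as np

n_samples, cv = 12, 3
indices = np.arange(n_samples)
folds = np.array_split(indices, cv)  # 3 folds of 4 samples each

splits = []
for i in range(cv):
    valid_idx = folds[i]
    # Train on every fold except the i-th.
    train_idx = np.concatenate([folds[j] for j in range(cv) if j != i])
    splits.append((train_idx, valid_idx))

print(len(splits))  # 3 train/validation pairs
```

Each candidate parameter combination is scored by the mean of its cv fold scores, which is what `best_score_` below reports.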
print(random_search_cv.best_params_)
print(random_search_cv.best_score_)
print(random_search_cv.best_estimator_)
{'hidden_layers': 4, 'layer_size': 93, 'learning_rate': 0.0024942883277236454}
-0.4219960812142345
<tensorflow.python.keras.wrappers.scikit_learn.KerasRegressor object at 0x0000023FC7D44C08>
model = random_search_cv.best_estimator_.model
model.evaluate(x_test_scaled,y_test)