Hands-On Exercise: Hyperparameter Search

Manual hyperparameter search

The core code is in step 6.

1. Imports
import matplotlib as mpl
import matplotlib.pyplot as plt
%matplotlib inline
import numpy as np
import sklearn
import pandas as pd
import os
import sys
import time
import tensorflow as tf

from tensorflow import keras

print(tf.__version__)
print(sys.version_info)
for module in mpl, np, pd, sklearn, tf, keras:
    print(module.__name__, module.__version__)
2. Load the data
from sklearn.datasets import fetch_california_housing

housing = fetch_california_housing()
print(housing.DESCR)
print(housing.data.shape)
print(housing.target.shape)
3. Print a few samples
import pprint

pprint.pprint(housing.data[0:5])
pprint.pprint(housing.target[0:5])
4. Split into train / validation / test sets
from sklearn.model_selection import train_test_split

x_train_all, x_test, y_train_all, y_test = train_test_split(
    housing.data, housing.target, random_state = 7)
x_train, x_valid, y_train, y_valid = train_test_split(
    x_train_all, y_train_all, random_state = 11)
print(x_train.shape, y_train.shape)
print(x_valid.shape, y_valid.shape)
print(x_test.shape, y_test.shape)
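The default `test_size` of `train_test_split` is 0.25, so the two-stage split above leaves roughly 56% of the samples for training, 19% for validation, and 25% for testing. A minimal sketch on synthetic data (the 100-row array is illustrative, not the housing data):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# 100 synthetic samples; each split holds out 25% by default.
data = np.arange(100).reshape(100, 1)
part_all, part_test = train_test_split(data, random_state=7)    # 75 / 25
part_train, part_valid = train_test_split(part_all, random_state=11)  # 56 / 19
print(len(part_train), len(part_valid), len(part_test))  # 56 19 25
```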
5. Standardization (feature scaling)
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
x_train_scaled = scaler.fit_transform(x_train)
x_valid_scaled = scaler.transform(x_valid)
x_test_scaled = scaler.transform(x_test)
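`StandardScaler` fits the per-feature mean and standard deviation on the training set only, and the same statistics are reused to transform the validation and test sets. A minimal sketch on a tiny synthetic matrix (the values are illustrative, not the housing features):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Tiny stand-in for the training and validation feature matrices.
x_tr = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
x_va = np.array([[2.0, 25.0]])

scaler = StandardScaler()
x_tr_s = scaler.fit_transform(x_tr)  # learns mean/std from training rows only
x_va_s = scaler.transform(x_va)      # reuses the training statistics

print(x_tr_s.mean(axis=0))  # per-column mean is ~0 after scaling
print(x_tr_s.std(axis=0))   # per-column std is ~1 after scaling
```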
6. Manual hyperparameter search: build and train one model per learning rate
# Candidate learning rates: [1e-4, 3e-4, 1e-3, 3e-3, 1e-2, 3e-2]
# Gradient-descent update: w = w - learning_rate * grad

learning_rates = [1e-4, 3e-4, 1e-3, 3e-3, 1e-2, 3e-2]
histories = []
for lr in learning_rates:
    model = keras.models.Sequential([
        keras.layers.Dense(30, activation="relu",
                                input_shape=x_train.shape[1:]),
        keras.layers.Dense(1),
    ])
    optimizer = keras.optimizers.SGD(lr)
    model.compile(loss="mean_squared_error", optimizer=optimizer)
    callbacks = [keras.callbacks.EarlyStopping(
        patience=5, min_delta=1e-2)]
    history = model.fit(x_train_scaled, y_train,
                        validation_data=(x_valid_scaled, y_valid),
                        epochs=100,
                        callbacks=callbacks)
    histories.append(history)
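After the loop, the learning rates can be compared by final validation loss. A minimal sketch; in a real run each value would come from `history.history["val_loss"][-1]`, but the numbers below are illustrative stand-ins, not actual results:

```python
# Illustrative final validation losses keyed by learning rate
# (stand-ins for history.history["val_loss"][-1] of each run).
final_val_losses = {1e-4: 0.62, 3e-4: 0.48, 1e-3: 0.41,
                    3e-3: 0.39, 1e-2: 0.45, 3e-2: 1.20}
best_lr = min(final_val_losses, key=final_val_losses.get)
print(best_lr)  # 0.003
```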
7. Plot the learning curves
def plot_learning_curves(history):
    pd.DataFrame(history.history).plot(figsize=(8, 5))
    plt.grid(True)
    plt.gca().set_ylim(0, 1)
    plt.show()
for lr, history in zip(learning_rates, histories):
    print("learning_rate:", lr)
    plot_learning_curves(history)

Wrapping the Keras model with sklearn for hyperparameter search

I. Core code for wrapping the Keras model with sklearn (the first 5 steps are shared with the above)

# RandomizedSearchCV performs randomized hyperparameter search
# tf.keras provides KerasRegressor to wrap a Keras model as an sklearn model

# 1. Wrap the model as an sklearn model
# 2. Define the parameter space
# 3. Search the parameters

def build_model(hidden_layers = 1, layer_size = 30,
               learning_rate = 3e-3):
    model = keras.models.Sequential()
    model.add(keras.layers.Dense(layer_size, activation="relu",
                                input_shape = x_train.shape[1:]))
    for _ in range(hidden_layers - 1):
        model.add(keras.layers.Dense(layer_size,
                                    activation = "relu"))
    model.add(keras.layers.Dense(1))
    optimizer = keras.optimizers.SGD(learning_rate)
    model.compile(loss="mse", optimizer = optimizer)
    return model

# Wrap as an sklearn regressor
sklearn_model = keras.wrappers.scikit_learn.KerasRegressor(
    build_fn = build_model)
callbacks = [keras.callbacks.EarlyStopping(patience=5, min_delta=1e-2)]

history = sklearn_model.fit(x_train_scaled,y_train,
                           epochs = 10,
                           validation_data = (x_valid_scaled,y_valid),
                           callbacks = callbacks)

II. Hyperparameter search

from scipy.stats import reciprocal  # log-uniform distribution for random sampling
# pdf: f(x) = 1 / (x * log(b / a)),  a <= x <= b

# Define the distributions the random parameter combinations are drawn from
param_distribution = {
    "hidden_layers":[1, 2, 3, 4],
    "layer_size": np.arange(1, 100),
    "learning_rate": reciprocal(1e-4, 1e-2),
}
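`reciprocal(1e-4, 1e-2)` is log-uniform: values are sampled uniformly on a log scale, so the decades [1e-4, 1e-3] and [1e-3, 1e-2] each receive about half of the draws. A quick sketch to see this (the sample count and seed are arbitrary):

```python
from scipy.stats import reciprocal

# Draw 1000 samples from the log-uniform distribution on [1e-4, 1e-2].
samples = reciprocal(1e-4, 1e-2).rvs(size=1000, random_state=42)
print(samples.min(), samples.max())   # all samples stay inside [1e-4, 1e-2]
print((samples < 1e-3).mean())        # roughly half fall in the lower decade
```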
# RandomizedSearchCV samples parameter settings from the distributions above
from sklearn.model_selection import RandomizedSearchCV

random_search_cv = RandomizedSearchCV(sklearn_model,
                                      param_distribution,
                                      n_iter = 10,  # number of parameter settings sampled
                                      cv = 3,
                                      n_jobs = 1)   # number of parallel jobs
random_search_cv.fit(x_train_scaled, y_train, epochs = 100,
                     validation_data = (x_valid_scaled, y_valid),
                     callbacks = callbacks)

# cross-validation: the training set is split into n folds; n-1 folds are used for training and the remaining one for validation. The number of folds is set with the cv parameter (cv=3 above).
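The cv=3 splitting can be visualized with sklearn's `KFold` (for regressors, an integer `cv` maps to `KFold` internally). A minimal sketch on six toy samples, showing that each fold serves as the validation set exactly once:

```python
import numpy as np
from sklearn.model_selection import KFold

# Six toy samples with two features each; cv=3 cuts them into 3 folds.
X = np.arange(12).reshape(6, 2)
for fold, (train_idx, valid_idx) in enumerate(KFold(n_splits=3).split(X)):
    print(fold, train_idx, valid_idx)  # 4 training rows, 2 validation rows per fold
```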

III. Print the best parameters

print(random_search_cv.best_params_)
print(random_search_cv.best_score_)
print(random_search_cv.best_estimator_)


IV. Fetch the best model and evaluate it on the test set

model = random_search_cv.best_estimator_.model
model.evaluate(x_test_scaled, y_test)