Model Ensembles (ensemble)

1. Overview

Resources

Neural network models tend to have high variance; ensembling multiple models reduces that variance. For an ensemble to be effective, each member must be a good model on its own, but the members need to make different errors. The combined result is then more robust.

Main reference: https://machinelearningmastery.com/stacking-ensemble-for-deep-learning-neural-networks/ , which includes many code implementations of ensembles.

A Chinese-language resource: https://www.cnblogs.com/szxspark/p/10144913.html

Some simple Keras ensemble examples: https://machinelearningmastery.com/?s=ensemble&post_type=post&submit=Search

 

2. Approaches

Train several models of the same configuration from different initializations, collect all of their outputs, and average them. In practice the number of ensemble members is kept fairly small, for two reasons: the computational cost grows with every member, and the gains from adding more models diminish rather than growing indefinitely.

1) Train on different data

  Split the data k-fold: each subset of the dataset trains one model, yielding multiple models to combine.

  Resample the dataset with replacement to draw a fresh training set for each member (called bootstrap aggregation, or bagging); see the sketch after this list.

  Sampling can also be done without replacement (each example is drawn at most once).
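A minimal sketch of the resampling step, assuming plain numpy arrays; the member count, sample sizes, and function name are illustrative:

# minimal sketch of drawing per-member training sets: with replacement
# -> bootstrap (bagging); without replacement -> each row at most once
from numpy.random import choice

def make_member_datasets(X, y, n_members=5, with_replacement=True):
    datasets = []
    for _ in range(n_members):
        # without replacement the sample must be smaller than the dataset;
        # half the rows is an arbitrary illustrative choice
        size = len(X) if with_replacement else len(X) // 2
        idx = choice(len(X), size=size, replace=with_replacement)
        datasets.append((X[idx], y[idx]))
    return datasets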

2) Vary the model to reduce variance

  Same model and data, trained from different random initializations and then ensembled (this reduces variance, but generalization error does not improve much, since all members learn the same type of mapping).

  Different models: different hidden layers, different learning rates, different regularization schemes, etc.

  Save snapshots of a single model at different points during training (possibly injecting oscillation via a cyclical learning rate, e.g. stochastic gradient descent with warm restarts (SGDR)); a sketch follows this list.
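A minimal sketch of the snapshot idea, assuming a TF-backed Keras where optimizer.learning_rate supports .assign(); the cycle length, base learning rate, and save path are all illustrative:

# minimal sketch of SGDR-style snapshots: cosine-anneal the learning rate
# within each cycle, then save one snapshot at every restart point
import math
from keras.callbacks import Callback

class SnapshotSaver(Callback):
    def __init__(self, base_lr=0.01, cycle_len=50):
        super().__init__()
        self.base_lr = base_lr
        self.cycle_len = cycle_len

    def on_epoch_begin(self, epoch, logs=None):
        # cosine annealing from base_lr down toward 0 within the current cycle
        t = (epoch % self.cycle_len) / self.cycle_len
        lr = 0.5 * self.base_lr * (1 + math.cos(math.pi * t))
        self.model.optimizer.learning_rate.assign(lr)

    def on_epoch_end(self, epoch, logs=None):
        # the end of a cycle is the LR minimum: save a snapshot there
        if (epoch + 1) % self.cycle_len == 0:
            self.model.save('models/snapshot_%d.h5' % ((epoch + 1) // self.cycle_len))

Passing callbacks=[SnapshotSaver()] to model.fit(...) then yields one saved ensemble member per cycle.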

3) Combination methods

  Direct averaging; see the sketch after this list.

  Weighted averaging using a dev set, where each model's weight is determined by its validation performance on the dev set.

  Feed the previous models' outputs into a new model that learns the combination (stacking).
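A minimal sketch of direct averaging, assuming `members` is a list of fitted Keras classifiers that output class probabilities:

# minimal sketch of direct averaging: mean of the members' class
# probabilities, then argmax over classes
from numpy import array, argmax

def average_predictions(members, testX):
    yhats = array([model.predict(testX) for model in members])  # [members, rows, classes]
    return argmax(yhats.mean(axis=0), axis=1)                   # predicted class per row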

 

Some practical techniques

1. For a weighted-average classification ensemble, there are two ways to find the weights. One is exhaustive search: Python's itertools.product(A, repeat=num_of_models) enumerates every combination of candidate values from A across the models.

This is slow: the number of combinations grows exponentially with the number of models and the number of candidate values.

The other is to use a search algorithm such as differential evolution (used in the script below) or particle swarm optimization (PSO).
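A minimal sketch of the exhaustive approach, reusing evaluate_ensemble() and normalize() as defined in the script below; the candidate grid w_grid is illustrative:

# minimal sketch of exhaustive weight search with itertools.product;
# evaluate_ensemble() and normalize() come from the script below
from itertools import product
from numpy import array

def grid_search_weights(members, testX, testy, w_grid=(0.0, 0.5, 1.0)):
    best_score, best_weights = 0.0, None
    for weights in product(w_grid, repeat=len(members)):
        if sum(weights) == 0.0:
            continue  # all-zero weights cannot be normalized
        score = evaluate_ensemble(members, normalize(array(weights)), testX, testy)
        if score > best_score:
            best_score, best_weights = score, weights
    return best_weights, best_score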

# global optimization to find coefficients for a weighted ensemble on the blobs problem
from sklearn.datasets import make_blobs  # samples_generator was removed from recent scikit-learn
from sklearn.metrics import accuracy_score
from keras.utils import to_categorical
from keras.models import Sequential
from keras.layers import Dense
from numpy import array
from numpy import argmax
from numpy import tensordot
from numpy.linalg import norm
from scipy.optimize import differential_evolution


# fit a model on the dataset
def fit_model(trainX, trainy):
    trainy_enc = to_categorical(trainy)
    # define model
    model = Sequential()
    model.add(Dense(25, input_dim=2, activation='relu'))
    model.add(Dense(3, activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    # fit model
    model.fit(trainX, trainy_enc, epochs=500, verbose=0)
    return model


# make an ensemble prediction for multi-class classification
def ensemble_predictions(members, weights, testX):
    # collect each member's class probabilities: [members, rows, classes]
    yhats = [model.predict(testX) for model in members]
    yhats = array(yhats)
    # weighted sum across ensemble members
    summed = tensordot(yhats, weights, axes=((0),(0)))
    # argmax across classes
    result = argmax(summed, axis=1)
    return result

# evaluate an ensemble with a given weight vector
def evaluate_ensemble(members, weights, testX, testy):
    # make prediction
    yhat = ensemble_predictions(members, weights, testX)
    # calculate accuracy
    return accuracy_score(testy, yhat)

# normalize a weight vector to unit L1 norm (weights sum to 1)
def normalize(weights):
    # calculate l1 vector norm
    result = norm(weights, 1)
    # avoid division by zero for an all-zero vector
    if result == 0.0:
        return weights
    # return normalized vector (unit norm)
    return weights / result

# loss function for the optimization process, designed to be minimized
def loss_function(weights, members, testX, testy):
    # normalize weights
    normalized = normalize(weights)
    # error rate = 1 - accuracy
    return 1.0 - evaluate_ensemble(members, normalized, testX, testy)

# generate a 2d classification dataset
X, y = make_blobs(n_samples=1100, centers=3, n_features=2, cluster_std=2, random_state=2)
# split into train and test
n_train = 100
trainX, testX = X[:n_train, :], X[n_train:, :]
trainy, testy = y[:n_train], y[n_train:]
print(trainX.shape, testX.shape)
# fit all models
n_members = 5
members = [fit_model(trainX, trainy) for _ in range(n_members)]
# evaluate each single model on the test set
testy_enc = to_categorical(testy)
for i in range(n_members):
    _, test_acc = members[i].evaluate(testX, testy_enc, verbose=0)
    print('Model %d: %.3f' % (i+1, test_acc))
# evaluate the equal-weight averaging ensemble
weights = [1.0/n_members for _ in range(n_members)]
score = evaluate_ensemble(members, weights, testX, testy)
print('Equal Weights Score: %.3f' % score)
# define bounds on each weight
bound_w = [(0.0, 1.0) for _ in range(n_members)]
# extra arguments passed through to the loss function
search_arg = (members, testX, testy)
# global optimization of the ensemble weights
result = differential_evolution(loss_function, bound_w, args=search_arg, maxiter=80, tol=1e-7)
# get the chosen weights
weights = normalize(result['x'])
print('Optimized Weights: %s' % weights)
# evaluate the chosen weights
score = evaluate_ensemble(members, weights, testX, testy)
print('Optimized Weights Score: %.3f' % score)

2. For a stacking ensemble, split the dataset k-fold and train multiple models. For each model, discard its training folds and keep only its validation fold; pass all the validation folds through their trained models to build a new dataset, then train a meta-model (logistic regression, etc.) on this new data and the corresponding labels. A minimal sketch of this out-of-fold construction follows; the full script after it fits the meta-model on a holdout split instead.
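The sketch below assumes the fit_model() trainer defined in the first script; each member predicts only its own held-out fold, and the concatenated out-of-fold predictions train the meta-model. The function name and fold count are illustrative:

# minimal sketch of k-fold stacking: train each member on k-1 folds,
# predict its held-out fold, and fit the meta-model on the collected
# out-of-fold predictions
from numpy import concatenate
from sklearn.model_selection import KFold
from sklearn.linear_model import LogisticRegression

def kfold_stacking(X, y, n_splits=5):
    members, oof_preds, oof_labels = [], [], []
    for train_idx, val_idx in KFold(n_splits=n_splits).split(X):
        model = fit_model(X[train_idx], y[train_idx])  # training folds only
        members.append(model)
        oof_preds.append(model.predict(X[val_idx]))    # held-out fold only
        oof_labels.append(y[val_idx])
    meta_model = LogisticRegression()
    meta_model.fit(concatenate(oof_preds), concatenate(oof_labels))
    return members, meta_model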

# stacked generalization with a linear meta-model on the blobs dataset
from sklearn.datasets import make_blobs  # samples_generator was removed from recent scikit-learn
from sklearn.metrics import accuracy_score
from sklearn.linear_model import LogisticRegression
from keras.models import load_model
from keras.utils import to_categorical
from numpy import dstack

# load models from file (assumes the members were already trained and
# saved as models/model_1.h5 ... models/model_n.h5 by an earlier script)
def load_all_models(n_models):
    all_models = list()
    for i in range(n_models):
        # define filename for this ensemble member
        filename = 'models/model_' + str(i + 1) + '.h5'
        # load model from file
        model = load_model(filename)
        # add to list of members
        all_models.append(model)
        print('>loaded %s' % filename)
    return all_models

# create the stacked-model input dataset from the members' outputs
def stacked_dataset(members, inputX):
    stackX = None
    for model in members:
        # make prediction
        yhat = model.predict(inputX, verbose=0)
        # stack predictions into [rows, probabilities, members]
        if stackX is None:
            stackX = yhat
        else:
            stackX = dstack((stackX, yhat))
    # flatten predictions to [rows, probabilities x members]
    stackX = stackX.reshape((stackX.shape[0], stackX.shape[1]*stackX.shape[2]))
    return stackX

# fit a meta-model on the outputs of the ensemble members
def fit_stacked_model(members, inputX, inputy):
    # create a dataset using the ensemble
    stackedX = stacked_dataset(members, inputX)
    # fit the standalone meta-model
    model = LogisticRegression()
    model.fit(stackedX, inputy)
    return model

# make a prediction with the stacked model
def stacked_prediction(members, model, inputX):
    # create a dataset using the ensemble
    stackedX = stacked_dataset(members, inputX)
    # make a prediction
    yhat = model.predict(stackedX)
    return yhat

# generate a 2d classification dataset
X, y = make_blobs(n_samples=1100, centers=3, n_features=2, cluster_std=2, random_state=2)
# split into train and test
n_train = 100
trainX, testX = X[:n_train, :], X[n_train:, :]
trainy, testy = y[:n_train], y[n_train:]
print(trainX.shape, testX.shape)
# load all models
n_members = 5
members = load_all_models(n_members)
print('Loaded %d models' % len(members))
# evaluate the standalone models on the test dataset
testy_enc = to_categorical(testy)
for model in members:
    _, acc = model.evaluate(testX, testy_enc, verbose=0)
    print('Model Accuracy: %.3f' % acc)
# fit the stacked model using the ensemble (note: fitting the meta-model on
# the test split leaks test labels; a separate validation split is cleaner)
model = fit_stacked_model(members, testX, testy)
# evaluate the stacked model on the test set
yhat = stacked_prediction(members, model, testX)
acc = accuracy_score(testy, yhat)
print('Stacked Test Accuracy: %.3f' % acc)

Typically the base models are neural networks; the meta-model in the stacking stage can also be a neural network.
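A minimal sketch of swapping in a small neural network as the meta-model, reusing stacked_dataset() from the script above; the layer sizes, epoch count, and function name are illustrative:

# minimal sketch of a neural-network meta-model in place of LogisticRegression
from keras.models import Sequential
from keras.layers import Dense
from keras.utils import to_categorical

def fit_stacked_nn(members, inputX, inputy):
    stackedX = stacked_dataset(members, inputX)  # [rows, probabilities x members]
    meta = Sequential()
    meta.add(Dense(10, input_dim=stackedX.shape[1], activation='relu'))
    meta.add(Dense(3, activation='softmax'))
    meta.compile(loss='categorical_crossentropy', optimizer='adam')
    meta.fit(stackedX, to_categorical(inputy), epochs=300, verbose=0)
    return meta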

 

 

--------------------

 

Reposted from: https://www.cnblogs.com/wb-learn/p/11436069.html
