1. Overview
Resources
Neural network models tend to have high variance. Ensembling multiple models reduces that variance; for the ensemble to be effective, each member must be a good model in its own right but make different errors from the others, which makes the combined result more robust.
Main reference: https://machinelearningmastery.com/stacking-ensemble-for-deep-learning-neural-networks/ , which includes many code implementations of ensembling.
Chinese-language reference: https://www.cnblogs.com/szxspark/p/10144913.html
Some simple Keras ensemble examples: https://machinelearningmastery.com/?s=ensemble&post_type=post&submit=Search
2. Approaches
Train several models of the same configuration from different random initializations, collect all their outputs, and average them. The number of ensemble members is usually kept small, for two reasons: the computational cost, and the fact that the gains from adding more models do not keep growing.
1) Train on different data
Split the data with k-fold; each subset of the data trains one model, yielding multiple models to ensemble.
Resample the dataset with replacement and train each member on its own resampled copy (called bootstrap aggregation, or bagging).
Sampling without replacement (drawn examples are not returned to the pool) can also be used.
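The resampling idea above (bootstrap aggregation) can be sketched in a few lines of NumPy. This is a minimal illustration, not code from the referenced tutorial; `bootstrap_sample` and the toy data are my own names:

```python
import numpy as np

def bootstrap_sample(X, y, rng):
    """Draw one bootstrap sample: same size as the data, drawn with replacement."""
    idx = rng.integers(0, len(X), size=len(X))
    return X[idx], y[idx]

# toy data: 10 points, 2 features, 3 classes
rng = np.random.default_rng(0)
X = rng.normal(size=(10, 2))
y = rng.integers(0, 3, size=10)

# each ensemble member would be trained on its own resampled copy of the data
samples = [bootstrap_sample(X, y, rng) for _ in range(5)]
```

Because sampling is with replacement, each resampled copy leaves out roughly a third of the original examples on average, which is what pushes the members toward making different errors.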
2) Vary the model to reduce variance
Same model and data, but trained from different random initializations, then ensembled (this reduces variance, but generalization error does not improve much, since every member learns the same type of mapping).
Different models: different hidden layers, different learning rates, different regularization schemes, etc.
Save snapshots of one model at different points during training (oscillation can be injected in between, e.g. a cyclic learning rate such as stochastic gradient descent with warm restarts (SGDR)).
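The cyclic learning rate behind SGDR-style snapshot ensembles can be sketched as below. The function name and parameterization are my own; the shape (cosine decay from `lr_max` toward zero, then an abrupt restart) follows the warm-restart idea, and saving the model at the end of each cycle yields one ensemble member per cycle:

```python
import math

def cosine_annealing_lr(epoch, n_epochs, n_cycles, lr_max):
    """Cosine-annealed learning rate with warm restarts: within each cycle the
    rate decays from lr_max to near zero, then jumps back up at the restart."""
    epochs_per_cycle = n_epochs // n_cycles
    pos = epoch % epochs_per_cycle       # position inside the current cycle
    frac = pos / epochs_per_cycle        # 0 at a restart, approaching 1 at cycle end
    return lr_max / 2 * (math.cos(math.pi * frac) + 1)

# e.g. 100 epochs, 5 cycles: the rate restarts at lr_max every 20 epochs
lr_start = cosine_annealing_lr(0, 100, 5, 0.01)    # start of first cycle
lr_mid = cosine_annealing_lr(10, 100, 5, 0.01)     # middle of first cycle
lr_restart = cosine_annealing_lr(20, 100, 5, 0.01) # warm restart
```

In Keras this schedule would typically be wired in through a `LearningRateScheduler` callback, with a second callback saving the model at each cycle boundary.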
3) Combination methods
Direct averaging
Weighted averaging on a dev set, with each weight determined by that member's performance on the dev set
Train a new model on the members' outputs (stacking)
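The dev-set-weighted averaging option above can be sketched with toy numbers. The probability arrays and accuracies here are made up for illustration; the `tensordot` weighted sum is the same trick used in the full code example later in this section:

```python
import numpy as np

# toy predicted class probabilities: 3 members, 2 dev examples, 3 classes
yhats = np.array([
    [[0.8, 0.1, 0.1], [0.2, 0.6, 0.2]],
    [[0.6, 0.3, 0.1], [0.1, 0.8, 0.1]],
    [[0.1, 0.2, 0.7], [0.3, 0.4, 0.3]],
])
dev_acc = np.array([0.6, 0.7, 0.9])  # each member's accuracy on the dev set
weights = dev_acc / dev_acc.sum()    # normalize so the weights sum to 1
# weighted sum over the member axis, then argmax over classes
summed = np.tensordot(yhats, weights, axes=((0,), (0,)))
pred = summed.argmax(axis=1)
```

Better members contribute more to the vote, but every member still contributes something, which is the middle ground between plain averaging and picking the single best model.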
Some practical approaches
1. For weighted-average ensembling of classifiers, there are two ways to find the weights. One is exhaustive search, e.g. Python's itertools.product(A, repeat=num_models), which enumerates every combination of the candidate values in A.
This is slow and hard to parallelize: the number of combinations grows exponentially with the number of models and candidate values.
The other is to use a search algorithm, such as differential evolution or PSO.
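The exhaustive search can be sketched as follows. `toy_score` here is a made-up stand-in for evaluating the weighted ensemble on a hold-out set; a real run would plug in something like the `evaluate_ensemble` function from the code below:

```python
from itertools import product
import numpy as np

# candidate weight values for each of 3 members
grid = [0.0, 0.5, 1.0]
n_members = 3

def toy_score(weights):
    # stand-in for hold-out evaluation of the weighted ensemble;
    # here we simply prefer weights that normalize close to (0.2, 0.3, 0.5)
    w = np.array(weights)
    if w.sum() == 0:
        return float('-inf')  # all-zero weights are invalid
    w = w / w.sum()
    return -np.abs(w - np.array([0.2, 0.3, 0.5])).sum()

# exhaustive search: 3**3 = 27 combinations (exponential in the member count)
best = max(product(grid, repeat=n_members), key=toy_score)
```

Even this tiny grid shows the scaling problem: with 10 candidate values and 5 members there would already be 100,000 combinations to evaluate.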
```python
# global optimization to find coefficients for weighted ensemble on blobs problem
from sklearn.datasets import make_blobs
from sklearn.metrics import accuracy_score
from keras.utils import to_categorical
from keras.models import Sequential
from keras.layers import Dense
from numpy import array
from numpy import argmax
from numpy import tensordot
from numpy.linalg import norm
from scipy.optimize import differential_evolution

# fit model on dataset
def fit_model(trainX, trainy):
    trainy_enc = to_categorical(trainy)
    # define model
    model = Sequential()
    model.add(Dense(25, input_dim=2, activation='relu'))
    model.add(Dense(3, activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    # fit model
    model.fit(trainX, trainy_enc, epochs=500, verbose=0)
    return model

# make an ensemble prediction for multi-class classification
def ensemble_predictions(members, weights, testX):
    # make predictions
    yhats = [model.predict(testX) for model in members]
    yhats = array(yhats)
    # weighted sum across ensemble members
    summed = tensordot(yhats, weights, axes=((0), (0)))
    # argmax across classes
    result = argmax(summed, axis=1)
    return result

# evaluate a specific set of weighted members in an ensemble
def evaluate_ensemble(members, weights, testX, testy):
    # make prediction
    yhat = ensemble_predictions(members, weights, testX)
    # calculate accuracy
    return accuracy_score(testy, yhat)

# normalize a vector to have unit norm
def normalize(weights):
    # calculate l1 vector norm
    result = norm(weights, 1)
    # check for a vector of all zeros
    if result == 0.0:
        return weights
    # return normalized vector (unit norm)
    return weights / result

# loss function for optimization process, designed to be minimized
def loss_function(weights, members, testX, testy):
    # normalize weights
    normalized = normalize(weights)
    # calculate error rate
    return 1.0 - evaluate_ensemble(members, normalized, testX, testy)

# generate 2d classification dataset
X, y = make_blobs(n_samples=1100, centers=3, n_features=2, cluster_std=2, random_state=2)
# split into train and test
n_train = 100
trainX, testX = X[:n_train, :], X[n_train:, :]
trainy, testy = y[:n_train], y[n_train:]
print(trainX.shape, testX.shape)
# fit all models
n_members = 5
members = [fit_model(trainX, trainy) for _ in range(n_members)]
# evaluate each single model on the test set
testy_enc = to_categorical(testy)
for i in range(n_members):
    _, test_acc = members[i].evaluate(testX, testy_enc, verbose=0)
    print('Model %d: %.3f' % (i + 1, test_acc))
# evaluate averaging ensemble (equal weights)
weights = [1.0 / n_members for _ in range(n_members)]
score = evaluate_ensemble(members, weights, testX, testy)
print('Equal Weights Score: %.3f' % score)
# define bounds on each weight
bound_w = [(0.0, 1.0) for _ in range(n_members)]
# arguments to the loss function
search_arg = (members, testX, testy)
# global optimization of ensemble weights
result = differential_evolution(loss_function, bound_w, search_arg, maxiter=80, tol=1e-7)
# get the chosen weights
weights = normalize(result['x'])
print('Optimized Weights: %s' % weights)
# evaluate chosen weights
score = evaluate_ensemble(members, weights, testX, testy)
print('Optimized Weights Score: %.3f' % score)
```
2. For a stacking ensemble, split the dataset with k-fold and train multiple models. For each model, discard its training folds and keep its validation fold; run all the validation folds through the trained models to build a new dataset, then train a meta-model (logistic regression, etc.) on this new data and the corresponding labels.
```python
# stacked generalization with linear meta model on blobs dataset
# (assumes the member models were already trained and saved to models/model_*.h5)
from sklearn.datasets import make_blobs
from sklearn.metrics import accuracy_score
from sklearn.linear_model import LogisticRegression
from keras.models import load_model
from keras.utils import to_categorical
from numpy import dstack

# load models from file
def load_all_models(n_models):
    all_models = list()
    for i in range(n_models):
        # define filename for this ensemble member
        filename = 'models/model_' + str(i + 1) + '.h5'
        # load model from file
        model = load_model(filename)
        # add to list of members
        all_models.append(model)
        print('>loaded %s' % filename)
    return all_models

# create stacked model input dataset as outputs from the ensemble
def stacked_dataset(members, inputX):
    stackX = None
    for model in members:
        # make prediction
        yhat = model.predict(inputX, verbose=0)
        # stack predictions into [rows, probabilities, members]
        if stackX is None:
            stackX = yhat
        else:
            stackX = dstack((stackX, yhat))
    # flatten predictions to [rows, members x probabilities]
    stackX = stackX.reshape((stackX.shape[0], stackX.shape[1] * stackX.shape[2]))
    return stackX

# fit a model based on the outputs from the ensemble members
def fit_stacked_model(members, inputX, inputy):
    # create dataset using ensemble
    stackedX = stacked_dataset(members, inputX)
    # fit standalone model
    model = LogisticRegression()
    model.fit(stackedX, inputy)
    return model

# make a prediction with the stacked model
def stacked_prediction(members, model, inputX):
    # create dataset using ensemble
    stackedX = stacked_dataset(members, inputX)
    # make a prediction
    yhat = model.predict(stackedX)
    return yhat

# generate 2d classification dataset
X, y = make_blobs(n_samples=1100, centers=3, n_features=2, cluster_std=2, random_state=2)
# split into train and test
n_train = 100
trainX, testX = X[:n_train, :], X[n_train:, :]
trainy, testy = y[:n_train], y[n_train:]
print(trainX.shape, testX.shape)
# load all models
n_members = 5
members = load_all_models(n_members)
print('Loaded %d models' % len(members))
# evaluate standalone models on test dataset
for model in members:
    testy_enc = to_categorical(testy)
    _, acc = model.evaluate(testX, testy_enc, verbose=0)
    print('Model Accuracy: %.3f' % acc)
# fit stacked model using the ensemble
model = fit_stacked_model(members, testX, testy)
# evaluate model on test set
yhat = stacked_prediction(members, model, testX)
acc = accuracy_score(testy, yhat)
print('Stacked Test Accuracy: %.3f' % acc)
```
Generally, when the base models are neural networks, the new meta-model in the stacking stage is also a neural network.