scikit-learn estimator problem in cross-validation
Problem description: Cannot clone object '<keras.engine.training.Model object at 0x0000025D9FEE75C0>' (type <class 'keras.engine.training.Model'>): it does not seem to be a scikit-learn estimator as it does not implement a 'get_params' methods.
Source code
# 5-fold cross-validation
from sklearn.model_selection import GridSearchCV
from sklearn.metrics import *
from sklearn.model_selection import cross_val_score
model = Dmodel(embedding_matrix)
# =============================================================================
# alpha_can = np.logspace(-3, 2, 10)  # geometric sequence from 10^-3 to 10^2
# np.set_printoptions(suppress=True)  # print 0.001 as a decimal, not 1.000000e-03
# # print('alpha_can = ', alpha_can)
# lasso_model = GridSearchCV(model, param_grid={'alpha': alpha_can},
#                            scoring='neg_mean_squared_error', cv=5)  # 5-fold CV to find the best alpha among the candidates
# =============================================================================
print('T1')
cross = cross_val_score(model, np.array(Train_X), np.array(T_l1), cv=5, scoring='neg_mean_squared_error')
model1 = cross.fit(np.array(Train_X), np.array(T_l1), batch_size=512, epochs=20, verbose=0)
print('T2')
model2 = lasso_model.fit(np.array(Train_X), np.array(T_l2), batch_size=512, epochs=20, verbose=0)
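The root cause is visible in the error message itself: cross_val_score has to clone the estimator once per fold, and sklearn.base.clone rejects any object that lacks a get_params method, which a raw Keras Model does. A minimal reproduction (a plain class stands in for the Keras model here, since the behaviour depends only on the missing method):

```python
import numpy as np
from sklearn.base import clone

class NotAnEstimator:
    """Stand-in for a raw keras Model: it can fit and predict but has no get_params."""
    def fit(self, X, y):
        return self
    def predict(self, X):
        return np.zeros(len(X))

try:
    clone(NotAnEstimator())
except TypeError as err:
    # sklearn raises the same "does not seem to be a scikit-learn estimator" TypeError
    print(type(err).__name__)
```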
The error is raised during fitting. Things I tried:
1. "The class name scikits.learn.linear_model.logistic.LogisticRegression refers to a very old version of scikit-learn. The top level package name is now sklearn since at least 2 or 3 releases. It's very likely that you have old versions of scikit-learn installed concurrently in your python path. Uninstall them all, then reinstall 0.14 or later and try again." (link)
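Before uninstalling anything per attempt 1, it is worth confirming which scikit-learn actually gets imported; a quick check:

```python
import sklearn

# the version and the install path reveal stale, concurrently installed copies
print(sklearn.__version__)
print(sklearn.__file__)
```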
2. The answer again lies in sklearn's documentation. While digging through references I found that the main problem is that cross_val_score should be used after fit; this blogger explains the specific reason clearly: "python - How to write a custom estimator in sklearn and use cross-validation on it?".
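The ordering point matters because cross_val_score never fits the estimator you pass in: it fits internal clones and returns only a NumPy array of per-fold scores, so that return value has no fit method to chain. A sketch with a plain sklearn regressor (LinearRegression is just a stand-in here):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

X = np.arange(20, dtype=float).reshape(-1, 1)
y = 2 * X.ravel() + 1

model = LinearRegression()
scores = cross_val_score(model, X, y, cv=5, scoring='neg_mean_squared_error')

print(scores.shape)            # one score per fold
print(hasattr(scores, 'fit'))  # False: calling scores.fit(...) is exactly the bug above

# evaluate with cross_val_score, then fit the model itself on the full data as a separate step
model.fit(X, y)
```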
I came up with a workaround: I swapped the order of the two lines, and the program ran, but when the results came out it still raised an error.
3. See "Evaluating deep learning models: using scikit-learn in Keras (Keras-based Python study notes, part 3)" and keep improving from there.
I like this example:
import numpy as np
from sklearn.model_selection import cross_val_score  # sklearn.cross_validation has been removed

class RegularizedRegressor:
    def __init__(self, l=0.01):
        self.l = l

    def combine(self, inputs):
        # weighted sum with a leading bias term of 1
        return sum(i * w for i, w in zip([1] + list(inputs), self.weights))

    def predict(self, X):
        return [self.combine(x) for x in np.asarray(X)]

    def classify(self, inputs):
        return np.sign(self.predict(inputs))

    def fit(self, X, y, **kwargs):
        self.l = kwargs.get('l', self.l)
        X = np.hstack([np.ones((len(X), 1)), np.asarray(X)])  # prepend the bias column
        y = np.asarray(y).reshape(-1, 1)
        # ordinary least squares via the normal equation (pinv tolerates singular folds)
        W = np.linalg.pinv(X.T @ X) @ X.T @ y
        self.weights = W.ravel().tolist()
        return self

    # this is the method cross_val_score's cloning machinery requires
    def get_params(self, deep=False):
        return {'l': self.l}

X = np.array([[0, 0], [1, 0], [0, 1], [1, 1]])
y = np.array([0, 1, 1, 0])
print(cross_val_score(RegularizedRegressor(),
                      X, y,
                      cv=2,  # only 4 samples, so the default cv=5 would fail
                      fit_params={'l': 0.1},
                      scoring='neg_mean_squared_error'))  # 'mean_squared_error' was renamed
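As a follow-up to the example: with current scikit-learn you rarely write get_params by hand. Inheriting from BaseEstimator derives get_params/set_params from the __init__ signature, and RegressorMixin marks the class as a regressor for the CV splitters. A minimal sketch (the class name RidgeNormalEquation is mine, not from the original example):

```python
import numpy as np
from sklearn.base import BaseEstimator, RegressorMixin
from sklearn.model_selection import cross_val_score

class RidgeNormalEquation(BaseEstimator, RegressorMixin):
    """L2-regularized least squares solved by the normal equation."""
    def __init__(self, l=0.01):
        self.l = l  # BaseEstimator reads __init__ args to build get_params/set_params

    def fit(self, X, y):
        X = np.hstack([np.ones((len(X), 1)), np.asarray(X)])  # bias column
        n = X.shape[1]
        # solve (X^T X + l*I) w = X^T y; the penalty keeps the system invertible
        self.weights_ = np.linalg.solve(X.T @ X + self.l * np.eye(n),
                                        X.T @ np.asarray(y, dtype=float))
        return self

    def predict(self, X):
        X = np.hstack([np.ones((len(X), 1)), np.asarray(X)])
        return X @ self.weights_

X = np.array([[0, 0], [1, 0], [0, 1], [1, 1]], dtype=float)
y = np.array([0.0, 1.0, 1.0, 0.0])
scores = cross_val_score(RidgeNormalEquation(l=0.1), X, y,
                         cv=2, scoring='neg_mean_squared_error')
print(scores)  # one negative MSE per fold
```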