import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets
%matplotlib
%matplotlib inline
%config InlineBackend.figure_format = 'retina'
Using matplotlib backend: MacOSX
# Generate a noisy two-class "moons" dataset; the fixed seed keeps it reproducible
X, y = datasets.make_moons(n_samples=1000, noise=0.25, random_state=666)
print('X Shape: ', X.shape)
print('Y Shape: ', y.shape)
X Shape: (1000, 2)
Y Shape: (1000,)
# Plot each class separately so the two classes get different colors
plt.scatter(X[y==0,0], X[y==0,1])
plt.scatter(X[y==1,0], X[y==1,1])
plt.show()
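Voting
Combine several models and aggregate their predictions into a single result.
- hard: majority vote; the class predicted by most models wins
- soft: average the predicted class probabilities (only works for models that can output probabilities)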
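The difference between the two rules is easiest to see on a few hand-made numbers. The sketch below uses a hypothetical array, `proba_class1`, holding the class-1 probabilities that three imaginary models assign to four samples. The values are made up for illustration and do not come from any model in this notebook.

```python
import numpy as np

# Hypothetical class-1 probabilities: one row per sample, one column per model.
# These numbers are illustrative assumptions, not real model output.
proba_class1 = np.array([
    [0.90, 0.40, 0.45],  # one confident model, two mildly against
    [0.20, 0.30, 0.10],  # all models agree on class 0
    [0.55, 0.60, 0.05],  # two weak votes for class 1, one strong vote against
    [0.45, 0.48, 0.95],  # two weak votes for class 0, one strong vote for class 1
])

# Hard voting: each model casts one vote (its own predicted label); majority wins.
votes = (proba_class1 >= 0.5).astype(int)
hard_pred = (votes.sum(axis=1) > votes.shape[1] / 2).astype(int)

# Soft voting: average the probabilities across models, then threshold the mean.
soft_pred = (proba_class1.mean(axis=1) >= 0.5).astype(int)

print('hard:', hard_pred)
print('soft:', soft_pred)
```

The two rules disagree on samples 0, 2, and 3, because soft voting lets one confident model outweigh two hesitant ones, while hard voting counts every vote equally.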
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=666)
print('X_train shape: ', X_train.shape)
print('y_train shape: ', y_train.shape)
print('X_test shape: '