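For an introduction to the KNN algorithm, see: https://blog.csdn.net/Nicht_sehen/article/details/80495884
For the underlying theory, see Wikipedia.
As an aside: Wikipedia really is a great resource (:D)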
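To make the idea concrete before touching any real data, here is a minimal sketch of KNN classification in plain NumPy: find the k training points closest to a new sample (Euclidean distance is assumed here) and take a majority vote over their labels. The function name `knn_predict` and the toy arrays `X_toy` and `y_toy` are made up for illustration; this is not the implementation from the linked article or from scikit-learn.

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    """Predict the label of one sample x by majority vote among its k nearest training points."""
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train)
    # Euclidean distance from x to every training point
    distances = np.linalg.norm(X_train - np.asarray(x, dtype=float), axis=1)
    # Indices of the k smallest distances
    nearest = np.argsort(distances)[:k]
    # Majority vote over the neighbours' labels (ties go to the smallest label)
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]

# Toy data (hypothetical, for illustration only): two small clusters
X_toy = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
                  [5.0, 5.0], [5.2, 4.9], [4.8, 5.1]])
y_toy = np.array([0, 0, 0, 1, 1, 1])

pred_a = knn_predict(X_toy, y_toy, [1.1, 1.0], k=3)
pred_b = knn_predict(X_toy, y_toy, [5.1, 5.0], k=3)
print(pred_a, pred_b)
```

A point near the first cluster gets label 0 and a point near the second gets label 1. In practice you would probably reach for a library classifier rather than this sketch, but the logic is the same.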
Inspecting the data
First, let's take a look at the dataset:
import pandas as pd
import mglearn
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
iris_dataset = load_iris()
# Randomly split the data into training and test sets (random_state fixes the shuffle)
X_train, X_test, y_train, y_test = train_test_split(
    iris_dataset['data'], iris_dataset['target'], random_state=0)
# Convert the training set to a DataFrame so that pandas can plot it
iris_dataframe = pd.DataFrame(X_train, columns=iris_dataset.feature_names)
g = pd.plotting.scatter_matrix(iris_dataframe, c = y_train,figsize = (15,15),marker = 'o',hist_kwds = {
'bins':20},s=