California Housing Dataset and KNN
The Python libraries we need
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.neighbors import KNeighborsRegressor
# sklearn.cross_validation was removed in scikit-learn 0.20; use model_selection instead
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
import numpy as np
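Dataset contents:
The returned object has four fields: data is the feature matrix, target holds the target values (the house prices), DESCR is a short description of the dataset, and feature_names gives the column names.
Use the train_test_split function to split the data into training and test sets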
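To make the four fields concrete, here is a minimal sketch that prints the shapes and column names of a Bunch-like dataset. The helper name summarize and the toy values are assumptions made for illustration so the snippet runs without downloading anything; they are not part of the original post.

```python
import numpy as np
from sklearn.utils import Bunch


def summarize(bunch):
    """Return the shapes and column names of a Bunch-like dataset."""
    return {
        "data_shape": bunch["data"].shape,
        "target_shape": bunch["target"].shape,
        "feature_names": list(bunch["feature_names"]),
    }


# Small synthetic stand-in (hypothetical values) so the sketch runs offline.
toy = Bunch(
    data=np.zeros((5, 2)),
    target=np.zeros(5),
    feature_names=["f0", "f1"],
    DESCR="toy dataset",
)
info = summarize(toy)
print(info)
```

Passing the object returned by datasets.fetch_california_housing() to the same helper should report a data shape of (20640, 8), one row per block group and one column per feature.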
cali=datasets.fetch_california_housing()
x=cali['data']
y=cali['target']
#x=pd.DataFrame(x)
#x.columns=cali['feature_names']
x_train,x_test,y_train,y_test=train_test_split(x,y,train_size=0.8)
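The effect of train_size=0.8 can be checked on a small synthetic array. The sketch below assumes a fixed random_state (not used in the original call) so that the shuffle is reproducible; the array contents are made up for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic data: 100 samples with 2 features each (illustrative values only).
x_demo = np.arange(200).reshape(100, 2)
y_demo = np.arange(100)

# train_size=0.8 keeps 80% of the rows for training; the rest become the test set.
# random_state is an added assumption that fixes the shuffle for reproducibility.
xd_train, xd_test, yd_train, yd_test = train_test_split(
    x_demo, y_demo, train_size=0.8, random_state=0
)
print(xd_train.shape, xd_test.shape)

# Calling again with the same random_state yields the same partition.
xd_train2, _, _, _ = train_test_split(
    x_demo, y_demo, train_size=0.8, random_state=0
)
same_split = np.array_equal(xd_train, xd_train2)
print(same_split)
```

Without random_state, each call shuffles differently, so error figures computed on the test set will vary a little from run to run.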
For an introduction to how sklearn's train_test_split performs its random split, see:
Link: https://blo