Learning machine learning by following Avik-Jain on GitHub:
https://github.com/Avik-Jain/100-Days-Of-ML-Code
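Day 3: Multiple linear regression prediction model
Learning tasks:
Load the data, preprocess it (encoding), and split the dataset
Create a linear regression model, fit it, and predict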
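The tasks above can be sketched end to end as follows. This is only a minimal sketch under stated assumptions: it uses a small synthetic table instead of 50_Startups.csv, and the column names (spend_a, spend_b, state, profit) and all values are made up for illustration. It also uses pandas get_dummies for the encoding step, which is a swapped-in alternative to the encoders used in these notes.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Hypothetical stand-in data: two numeric features, one categorical feature.
rng = np.random.default_rng(0)
n = 42
df = pd.DataFrame({
    "spend_a": rng.uniform(0, 100, n),
    "spend_b": rng.uniform(0, 50, n),
    "state": np.tile(["A", "B", "C"], n // 3),
})

# The target is an exact linear function of the features (no noise),
# so a linear model should recover it almost perfectly.
state_effect = df["state"].map({"A": 0.0, "B": 5.0, "C": -3.0})
df["profit"] = 2.0 * df["spend_a"] + 0.5 * df["spend_b"] + state_effect + 10.0

# Preprocess: one-hot encode the categorical column and drop one level
# so the dummy columns are not perfectly collinear with the intercept.
X = pd.get_dummies(df.drop(columns="profit"), columns=["state"], drop_first=True)
X = X.astype(float)
y = df["profit"]

# Split into training and test sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Create the model, fit it on the training set, and predict on the test set.
model = LinearRegression()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
r2 = model.score(X_test, y_test)  # coefficient of determination on the test set
```

Because the synthetic target has no noise, r2 should be very close to 1 here. On real data such as the startup dataset, the score will be lower and depends on the split.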
1. Data processing
# Import libraries and read the data
import pandas as pd
import numpy as np
# Read the data and split it into samples (features) and labels
data = pd.read_csv('G:/MachineLearningDailyStudy-100/100-Days-Of-ML-Code-master/datasets/50_Startups.csv')
x = data.iloc[:, :-1].values
y = data.iloc[:, 4].values
# Encode the categorical attribute in the samples
#data preprocessing
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
labelencoder = LabelEncoder()
x[:, 3] = labelencoder.fit_transform(x[:, 3])
on