Linear Regression / k-Nearest Neighbors Regression ML Models in Python

小白笑苍

Categories: Python, ML. Tags: ML, python, linearRegression

Article link: https://blog.csdn.net/toyijiu/article/details/79086625

These are my notes from chapter 1 of the book Hands-On Machine Learning with Scikit-Learn and TensorFlow. The code is a demo of a linear regression ML model.
The related datasets are here:
handson-ml/datasets/lifesat/
The code is below:

import matplotlib
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sklearn import linear_model


def prepare_country_stats(oecd_bli, gdp_per_capita):
    #keep only the rows whose 'INEQUALITY' is 'TOT'
    oecd_bli = oecd_bli[oecd_bli["INEQUALITY"]=="TOT"]
    #reshape the dataset: row='Country', col='Indicator', values='Value'
    oecd_bli = oecd_bli.pivot(index="Country", columns="Indicator", values="Value")
    #rename the col '2015' to 'GDP per capita'
    gdp_per_capita.rename(columns={"2015": "GDP per capita"}, inplace=True)
    gdp_per_capita.set_index("Country", inplace=True)
    #merge the two datasets on their shared index 'Country'
    full_country_stats = pd.merge(left=oecd_bli, right=gdp_per_capita, left_index=True, right_index=True)
    full_country_stats.sort_values(by="GDP per capita", inplace=True)
    #separate the dataset into two parts: the rows at remove_indices are held out
    remove_indices = [0, 1, 6, 8, 33, 34, 35]
    keep_indices = list(set(range(36)) - set(remove_indices))
    #return only the kept rows and the two columns the model needs
    return full_country_stats[["GDP per capita", 'Life satisfaction']].iloc[keep_indices]

#load the data
oecd_bli = pd.read_csv('oecd_bli_2015.csv',thousands=',')
gdp_per_capita = pd.read_csv('gdp_per_capita.csv',thousands=',',delimiter='\t',encoding='latin1',na_values='n/a')

#prepare the data
country_stats = prepare_country_stats(oecd_bli,gdp_per_capita)
x = np.c_[country_stats['GDP per capita']]
y = np.c_[country_stats['Life satisfaction']]

#visualize the data
country_stats.plot(kind='scatter',x='GDP per capita',y='Life satisfaction')
plt.show()

#select a linear model
lin_reg_model = linear_model.LinearRegression()
#train the model
lin_reg_model.fit(x,y)

#predict the life satisfaction of a country whose GDP per capita is 9000
print(lin_reg_model.predict([[9000]]))

Running the code shows the chart and prints the prediction result:
[image: the chart and the prediction result]
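
To make it clearer what fitting the linear model does, here is a minimal sketch of one-feature least squares written by hand. It is not scikit-learn's implementation, only the same idea for a single input column. The helper names fit_line and predict are made up for this example, and the numbers are toy values, not rows from the lifesat datasets.

```python
# Hand-rolled one-feature least squares (illustration only).
# The data below is invented and does NOT come from the lifesat datasets.

def fit_line(xs, ys):
    """Return (intercept, slope) that minimize the sum of squared errors."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # the fitted line always passes through the point (mean_x, mean_y)
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(intercept, slope, x):
    """Evaluate the fitted line at x."""
    return intercept + slope * x


# toy points that lie exactly on the line y = 5.0 + 0.00005 * x
gdp = [10000, 20000, 30000, 40000]
satisfaction = [5.5, 6.0, 6.5, 7.0]

intercept, slope = fit_line(gdp, satisfaction)
print(intercept, slope)
print(predict(intercept, slope, 9000))
```

With a fitted scikit-learn model, the corresponding values are available as lin_reg_model.intercept_ and lin_reg_model.coef_.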

You can also use k-Nearest Neighbors regression to predict the satisfaction value. Just replace:

lin_reg_model = linear_model.LinearRegression()

with:

from sklearn import neighbors
#the number of neighbors here is 3
lin_reg_model = neighbors.KNeighborsRegressor(n_neighbors=3)

The model picks the k countries (here k = 3) whose GDP per capita is closest to the value you want a prediction for (here 9000). It then averages their life satisfaction values, and that average is the prediction result.
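
The same idea can be sketched by hand in a few lines. This is only an illustration of the averaging step, assuming a single feature and plain unweighted averaging (which matches the default uniform weights of KNeighborsRegressor). The function name knn_predict and the toy numbers are invented for this example and are not taken from the datasets.

```python
# Hand-rolled k-nearest-neighbors regression for one feature (illustration only).
# The data below is invented and does NOT come from the lifesat datasets.

def knn_predict(xs, ys, query, k=3):
    """Average the targets of the k training points closest to query."""
    # sort the (x, y) pairs by the absolute distance between x and the query
    by_distance = sorted(zip(xs, ys), key=lambda pair: abs(pair[0] - query))
    nearest = by_distance[:k]
    # unweighted mean of the neighbors' target values
    return sum(y for _, y in nearest) / k


gdp = [8000, 9500, 12000, 20000, 40000]
satisfaction = [5.0, 5.6, 5.9, 6.5, 7.3]

# the three closest GDP values to 9000 are 9500, 8000 and 12000
prediction = knn_predict(gdp, satisfaction, 9000, k=3)
print(prediction)
```

Unlike the linear model, nothing is fitted in advance here: every prediction looks the neighbors up again in the training data.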
