Scatter plot with linear regression line of best fit
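1. Import data
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
df = pd.read_csv(r"D\mpg_ggplot2.csv")
# Keep only the cars with 4 or 8 cylinders
data_select = df.loc[df.cyl.isin([4, 8]), :]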
2. Plot
sns.set_style("white")
# lmplot draws a scatter plot plus one fitted regression line per hue level
gridobj = sns.lmplot(x="displ", y="hwy", hue="cyl", data=data_select,
                     height=7, aspect=1.6,
                     palette="tab10"
                     )
3. Decorations
gridobj.set(xlim=(0.5, 7.5), ylim=(0, 50))
plt.show()
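For intuition about what the fitted lines represent: with `hue="cyl"`, each cylinder group gets its own straight-line fit. The sketch below reproduces that idea by hand with `np.polyfit` on made-up numbers. The toy values and the helper name `fit_lines` are assumptions for illustration and are not taken from `mpg_ggplot2.csv`.

```python
import numpy as np

# Hypothetical toy data: engine displacement vs. highway mpg for two cylinder groups
displ = np.array([1.8, 2.0, 2.4, 4.6, 5.4, 6.0])
hwy = np.array([31.0, 30.0, 28.0, 18.5, 16.5, 15.0])
cyl = np.array([4, 4, 4, 8, 8, 8])


def fit_lines(x, y, groups):
    """Return {group: (slope, intercept)} from a degree-1 least-squares fit per group."""
    lines = {}
    for g in np.unique(groups):
        mask = groups == g
        slope, intercept = np.polyfit(x[mask], y[mask], deg=1)
        lines[int(g)] = (float(slope), float(intercept))
    return lines


lines = fit_lines(displ, hwy, cyl)
for g, (slope, intercept) in lines.items():
    print(f"cyl={g}: hwy ~= {slope:.2f} * displ + {intercept:.2f}")
```

Fitting each group separately is what lets the two cylinder classes show different slopes instead of sharing one line across all points.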
Draw a linear regression line in each column:
2. Plot
gridobj = sns.lmplot(x='displ', y='hwy', hue='cyl',
                     col='cyl',  # one column (facet) per cylinder count
                     data=data_select,
                     height=7,
                     palette='Set1'
                     )
Jittered strip plot
#import data
df = pd.read_csv(r"D:\opensource\dataSource\pandas+matplotlib_习题数据集\datasets\mpg_ggplot2.csv")
# df.head()
# Draw Stripplot
fig, ax= plt.subplots(figsize=(16,10), dpi=80)
sns.stripplot(x=df.cty, y=df.hwy, jitter=0.25, size=8, ax=ax, linewidth=.5)
# fig is the drawing window (Figure); ax is the coordinate system (Axes) on that window. A later ax.xxx call performs operation xxx on those axes.
#Decoration
plt.title('Use jittered plots to avoid overlapping of points', fontsize=22)
plt.show()
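What `jitter=0.25` does can be sketched in a few lines: each point's position on the categorical axis is shifted by a small random offset so that identical values no longer sit on top of one another. The helper `add_jitter` and the sample positions below are hypothetical and only illustrate the idea; they are not seaborn's internal code.

```python
import numpy as np


def add_jitter(positions, jitter=0.25, seed=0):
    """Shift each position by uniform noise drawn from [-jitter, jitter]."""
    rng = np.random.default_rng(seed)
    positions = np.asarray(positions, dtype=float)
    return positions + rng.uniform(-jitter, jitter, size=positions.shape)


# Five points that share only two distinct category positions
cty_positions = [0, 0, 0, 1, 1]
jittered = add_jitter(cty_positions)
print(jittered)
```

Only the categorical coordinate is perturbed, so the values on the other axis (here `hwy`) stay exact.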
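Counts plot
Another way to avoid overlapping points is to scale the size of each point by how many observations fall at that position. The larger the point, the more observations are concentrated there.
# Prepare data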
df_counts = df.groupby(['hwy','cty']).size().reset_index(name='counts')
# Draw Stripplot
fig, ax = plt.subplots(figsize=(16,10), dpi= 80)
# Point size grows with the number of observations at each (cty, hwy) position
sns.stripplot(x=df_counts.cty, y=df_counts.hwy, size=df_counts.counts*2, ax=ax)
# Decorations
plt.title('Counts Plot - Size of circle is bigger as more points overlap', fontsize=22)
plt.show()
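Passing a per-point array as `size` to `stripplot` may not be accepted by every seaborn version. As a hedged alternative, the same counts idea can be sketched with plain matplotlib, where `ax.scatter` takes one marker area per point through `s`. The toy frame `toy` and the scaling factor 40 are assumptions chosen for illustration, not values from the dataset.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical toy data with repeated (cty, hwy) pairs
toy = pd.DataFrame({
    "cty": [18, 18, 18, 21, 21, 24],
    "hwy": [26, 26, 26, 29, 29, 32],
})

# Count how many rows share each (hwy, cty) combination
toy_counts = toy.groupby(["hwy", "cty"]).size().reset_index(name="counts")
print(toy_counts)

# One marker per distinct position; marker area is proportional to the count
fig, ax = plt.subplots(figsize=(8, 5), dpi=80)
ax.scatter(toy_counts["cty"], toy_counts["hwy"], s=toy_counts["counts"] * 40)
ax.set(xlabel="cty", ylabel="hwy", title="Counts plot (toy data)")
plt.close(fig)  # in an interactive session, call plt.show() instead
```

Unlike `stripplot`, `ax.scatter` treats `cty` as a numeric axis, so the horizontal spacing reflects the actual values rather than category order.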