Which Python plotting library is best? Drawing some nice-looking data charts with Python


weixin_39742471


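This time I'll walk you through plotting some data charts.

Sure, Excel can already draw some very nice charts, like the one above, so why not try plotting with Python as well?

This time we'll use the following dataset.

Judging by its columns, it describes college majors: the major name, the major category, and employment figures for each major.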

import pandas as pd

from matplotlib import pyplot as plt

# Import this new module

%matplotlib inline

# With this command, plots are displayed right below the cell (in my Jupyter notebook they seem to show up even without it, haha; you can try it yourself)

recent_grads=pd.read_csv('recent-grads.csv',encoding='utf-8')

# Read the file using UTF-8 encoding

recent_grads

173 rows × 21 columns

recent_grads.iloc[0]

# Look at the first row; its data is shown below

Rank 1

Major_code 2419

Major PETROLEUM ENGINEERING

Total 2339

Men 2057

Women 282

Major_category Engineering

ShareWomen 0.120564

Sample_size 36

Employed 1976

Full_time 1849

Part_time 270

Full_time_year_round 1207

Unemployed 37

Unemployment_rate 0.0183805

Median 110000

P25th 95000

P75th 125000

College_jobs 1534

Non_college_jobs 364

Low_wage_jobs 193

Name: 0, dtype: object

recent_grads.head()

# Look at the first five rows

5 rows × 21 columns

recent_grads.tail()

# Look at the last five rows

5 rows × 21 columns

Now we remove the rows that contain missing values, because missing values would interfere with our plots.

recent_grads.shape

(173, 21)

recent_grads.dropna(inplace=True)

# inplace=True means the current DataFrame is modified directly instead of a new one being returned

recent_grads.shape

(172, 21)
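Before dropping rows, it can help to check which columns actually contain missing values. The following is a minimal sketch on a small made-up DataFrame (the name `toy` and all values are placeholders, not rows from recent-grads.csv) showing `isna().sum()` and the effect of `dropna()`.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for recent_grads; all values are made up.
toy = pd.DataFrame({
    'Major': ['A', 'B', 'C', 'D'],
    'Men': [100.0, np.nan, 300.0, 400.0],
    'Women': [50.0, 60.0, 70.0, 80.0],
})

# Count the missing values in each column.
missing_per_column = toy.isna().sum()
print(missing_per_column)

# dropna() returns a new DataFrame without the rows that contain NaN;
# reset_index(drop=True) renumbers the rows so the index has no gaps.
cleaned = toy.dropna().reset_index(drop=True)
print(toy.shape, '->', cleaned.shape)
```

Without `reset_index`, the surviving rows keep their original labels. That is why the outputs later in this article still show index 172 even though only 172 rows remain.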

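OK, now let's start churning out plots.

We start with scatter plots. A scatter plot needs two columns of data: one for the x-axis and one for the y-axis.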

recent_grads.plot(x='Sample_size',y='Median',kind='scatter')

# kind='scatter' says we want a scatter plot; other kinds of plots are available too, see the official DataFrame.plot() documentation

recent_grads.plot(x='Sample_size',y='Unemployment_rate',kind='scatter')

recent_grads.plot(x='Full_time',y='Median',kind='scatter')

recent_grads.plot(x='ShareWomen',y='Unemployment_rate',kind='scatter')

recent_grads.plot(x='Men',y='Median',kind='scatter')

recent_grads.plot(x='Women',y='Median',kind='scatter')
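Separate scatter plots like these can also be placed side by side in one figure, which makes them easier to compare. The sketch below assumes a made-up DataFrame `toy` with random values that merely reuses some of the dataset's column names; the `ax=` argument of `DataFrame.plot` picks the subplot to draw into.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend for running as a script; drop this line in Jupyter
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical random data; only the column names match the article's dataset.
rng = np.random.default_rng(0)
toy = pd.DataFrame({
    'Sample_size': rng.integers(10, 500, size=50),
    'Full_time': rng.integers(100, 5000, size=50),
    'ShareWomen': rng.random(50),
    'Median': rng.integers(20000, 110000, size=50),
})

x_columns = ['Sample_size', 'Full_time', 'ShareWomen']
fig, axes = plt.subplots(1, len(x_columns), figsize=(15, 4))

# Draw one scatter plot per subplot, always against Median.
for ax, col in zip(axes, x_columns):
    toy.plot(x=col, y='Median', kind='scatter', ax=ax, title=f'{col} vs Median')

fig.tight_layout()
```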

Looking at these scatter plots, none of the pairs shows a strong correlation between the x-axis and the y-axis.
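Judging correlation by eye can be backed up with a number. As a rough sketch, `DataFrame.corr()` computes pairwise Pearson correlation coefficients. The data below is synthetic and the column names `x`, `related` and `unrelated` are invented for illustration; on the real data, the same call would be `recent_grads[['Sample_size', 'Median']].corr()`.

```python
import numpy as np
import pandas as pd

# Synthetic data: 'related' depends on x, 'unrelated' does not.
rng = np.random.default_rng(42)
n = 200
x = rng.normal(size=n)
toy = pd.DataFrame({
    'x': x,
    'related': 2.0 * x + rng.normal(scale=0.1, size=n),
    'unrelated': rng.normal(size=n),
})

# Pairwise Pearson correlation; values near 0 mean little linear relationship,
# values near +1 or -1 mean a strong one.
corr = toy.corr()
print(corr.round(2))
```

A coefficient near zero only rules out a linear relationship, so it is still worth looking at the scatter plot itself.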

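Next we draw histograms, which are very useful for showing how data is distributed. A histogram takes a single column of data. The range of that column becomes the x-axis and is divided into consecutive intervals (bins); then, just like the charts we drew in math class as kids, each value falls into its matching bin and the bar height shows how many values landed there. This gives us an overall picture of the column.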

recent_grads['Sample_size'].hist()

recent_grads['Median'].hist(bins=20)

# bins=20 lets us set the number of intervals on the x-axis ourselves, which is very convenient
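To make the binning concrete, here is a small sketch that counts values per bin with `numpy.histogram`. The ten values are made up for illustration. The `bins` argument can be an integer (that many equal-width bins between the minimum and maximum) or an explicit list of edges, and `Series.hist` accepts both forms as well.

```python
import numpy as np
import pandas as pd

# Made-up values, loosely shaped like salaries; not taken from the dataset.
values = pd.Series([22000, 25000, 31000, 33000, 36000,
                    40000, 45000, 52000, 60000, 110000])

# An integer asks for that many equal-width bins between min and max.
counts_auto, edges_auto = np.histogram(values, bins=4)
print(edges_auto, counts_auto)

# A list of edges defines the bins explicitly; every bin includes its left
# edge and excludes its right edge, except the last bin, which includes both.
edges = [20000, 40000, 60000, 80000, 120000]
counts_custom, _ = np.histogram(values, bins=edges)
print(counts_custom)
```

Explicit edges are handy when a single outlier (here 110000) would otherwise stretch the equal-width bins and squeeze most of the data into the first bar.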

recent_grads['Employed'].hist()

recent_grads['Full_time'].hist()

recent_grads['ShareWomen'].hist()

recent_grads['Unemployment_rate'].hist()

recent_grads['Men'].hist()

recent_grads['Women']

0 282.0

1 77.0

2 131.0

3 135.0

4 11021.0

...

168 5359.0

169 2332.0

170 2270.0

171 3695.0

172 964.0

Name: Women, Length: 172, dtype: float64

recent_grads['Women'].hist(figsize=(10,10))

# The figsize=(10,10) parameter lets us adjust the size of the figure (width and height in inches), which is also very convenient

Next, we combine these two kinds of plots, which is even more convenient.

from pandas.plotting import scatter_matrix

# We need the scatter_matrix function from the pandas.plotting module

scatter_matrix(recent_grads[['Sample_size','Median']],figsize=(10,10))

# In the code above there is just one main argument, a DataFrame; scatter_matrix then generates the scatter plot matrix automatically

(Output: a 2×2 array of matplotlib Axes objects, followed by the scatter matrix figure.)

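The scatter matrix is easy to understand as well: for a DataFrame with n columns, it is an n×n grid with one scatter plot for each pair of columns. The diagonal is the exception: there a column would be paired with itself, and plotting identical data against itself just produces a straight line, so by default scatter_matrix draws each column's histogram on the diagonal instead.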

recent_grads.plot(x='Median',y="Median",kind='scatter')

See, it is just a straight line.
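The layout of the scatter matrix can be checked in code. The sketch below uses random data with invented column names `a`, `b` and `c`. It shows that `scatter_matrix` returns an n×n array of Axes, with histogram bars on the diagonal and scatter points everywhere else.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend for running as a script; drop this line in Jupyter
import numpy as np
import pandas as pd
from pandas.plotting import scatter_matrix

# Random data with three hypothetical columns.
rng = np.random.default_rng(1)
toy = pd.DataFrame(rng.normal(size=(100, 3)), columns=['a', 'b', 'c'])

# diagonal='hist' is the default; it is spelled out here for clarity.
axes = scatter_matrix(toy, figsize=(8, 8), diagonal='hist')
print(axes.shape)

# Diagonal cells contain histogram bars (patches);
# off-diagonal cells contain scatter points (a collection).
n_bars_on_diagonal = len(axes[0, 0].patches)
n_scatter_off_diagonal = len(axes[0, 1].collections)
print(n_bars_on_diagonal, n_scatter_off_diagonal)
```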

scatter_matrix(recent_grads[['Sample_size','Median','Unemployment_rate']],figsize=(15,15))

(Output: a 3×3 array of matplotlib Axes objects, followed by the scatter matrix figure.)

Next we need to answer a question: what proportion of these majors have more women than men?

recent_grads.columns

Index(['Rank', 'Major_code', 'Major', 'Total', 'Men', 'Women',

'Major_category', 'ShareWomen', 'Sample_size', 'Employed', 'Full_time',

'Part_time', 'Full_time_year_round', 'Unemployed', 'Unemployment_rate',

'Median', 'P25th', 'P75th', 'College_jobs', 'Non_college_jobs',

'Low_wage_jobs'],

dtype='object')

recent_grads['Major']

0 PETROLEUM ENGINEERING

1 MINING AND MINERAL ENGINEERING

2 METALLURGICAL ENGINEERING

3 NAVAL ARCHITECTURE AND MARINE ENGINEERING

4 CHEMICAL ENGINEERING

...

168 ZOOLOGY

169 EDUCATIONAL PSYCHOLOGY

170 CLINICAL PSYCHOLOGY

171 COUNSELING PSYCHOLOGY

172 LIBRARY SCIENCE

Name: Major, Length: 172, dtype: object

recent_grads['ShareWomen']

0 0.120564

1 0.101852

2 0.153037

3 0.107313

4 0.341631

...

168 0.637293

169 0.817099

170 0.799859

171 0.798746

172 0.877960

Name: ShareWomen, Length: 172, dtype: float64

recent_grads['ShareWomen'].hist()

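Looking at this histogram, the majors where men are the majority and the majors where women are the majority seem to be split about half and half,

so next let's work it out precisely.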

print('{:.0%} of majors are predominantly female.'.format((recent_grads['ShareWomen'] > 0.5).mean()))

56% of majors are predominantly female.
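The calculation above boils down to a boolean mask: comparing a column with 0.5 gives True or False per row, `sum()` counts the True values, and `mean()` gives their share. Here is a minimal sketch on made-up data (the five rows below are placeholders, not real majors):

```python
import pandas as pd

# Hypothetical toy data; the values are invented for illustration.
toy = pd.DataFrame({
    'Major': ['A', 'B', 'C', 'D', 'E'],
    'ShareWomen': [0.12, 0.45, 0.50, 0.64, 0.88],
})

# True where women make up more than half of the major.
is_mostly_women = toy['ShareWomen'] > 0.5

count = int(is_mostly_women.sum())  # number of True values
share = is_mostly_women.mean()      # fraction of True values
print(f'{count} of {len(toy)} majors ({share:.0%}) are predominantly female.')
```

Note that the comparison is strict, so a major with exactly 50% women is not counted as predominantly female.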

Finally, let's draw a few more plots just for fun; see what you notice about how these data are distributed.

recent_grads['Median'].hist()

# Most values fall in the 20000-50000 range

recent_grads[:10]['ShareWomen'].plot(kind='bar')

recent_grads[-10:]['ShareWomen'].plot(kind='bar')

recent_grads['Unemployment_rate'][:10].hist()

recent_grads['Unemployment_rate'][-10:].hist()
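Bar charts like the ones above label each bar with its row index, which is hard to read. One possible variation, sketched here on made-up rows, passes the name column as `x` so each bar carries its major's name; a horizontal bar chart (`kind='barh'`) leaves room for long labels.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend for running as a script; drop this line in Jupyter
import pandas as pd

# Hypothetical toy data; names and values are placeholders.
toy = pd.DataFrame({
    'Major': ['MAJOR A', 'MAJOR B', 'MAJOR C'],
    'ShareWomen': [0.12, 0.34, 0.80],
})

# x= supplies the bar labels, y= supplies the bar lengths.
ax = toy.plot(x='Major', y='ShareWomen', kind='barh', legend=False)
ax.set_xlabel('ShareWomen')

labels = [tick.get_text() for tick in ax.get_yticklabels()]
print(labels)
```

With the article's data, the equivalent call would be `recent_grads[:10].plot(x='Major', y='ShareWomen', kind='barh', legend=False)`.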

That's it for this little hands-on plotting project!
