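Data Analysis Lecture 3: Common Statistical Charts in matplotlib
1. Drawing a scatter plot
Method used: scatter(x, y)
Suppose you have scraped the daily daytime high temperatures in Changsha for October and November 2018 (stored in lists a and b).
How would you find a pattern in how the temperature changes over time?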
a = [11,17,16,11,12,11,12,6,6,7,8,9,12,15,14,17,18,21,16,17,20,14,15,15,15,19,21,22,22,22,23]
b = [26,26,28,19,21,17,16,19,18,20,20,19,22,23,17,20,21,20,22,15,11,15,5,13,17,10,11,13,12,13]
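# Suppose you have scraped the daily daytime high temperatures in Changsha for October and November 2018
# (stored in lists a and b). How would you find a pattern in how the temperature changes over time?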
# a = [11,17,16,11,12,11,12,6,6,7,8,9,12,15,14,17,18,21,16,17,20,14,15,15,15,19,21,22,22,22,23]
# b = [26,26,28,19,21,17,16,19,18,20,20,19,22,23,17,20,21,20,22,15,11,15,5,13,17,10,11,13,12,13]
from matplotlib import pyplot as plt
from matplotlib import font_manager
font = font_manager.FontProperties(fname=r"c:\windows\fonts\simsun.ttc", size=14)
y_10 = [11, 17, 16, 11, 12, 11, 12, 6, 6, 7, 8, 9, 12, 15, 14, 17, 18, 21, 16, 17, 20, 14, 15, 15, 15, 19, 21, 22, 22,
22, 23]
y_11 = [26, 26, 28, 19, 21, 17, 16, 19, 18, 20, 20, 19, 22, 23, 17, 20, 21, 20, 22, 15, 11, 15, 5, 13, 17, 10, 11, 13,
12, 13]
# Set the figure size
plt.figure(figsize=(15, 8), dpi=80)
# October: 31 days
x_10 = range(1, 32)
# November: 30 days, shifted to 51-80 to leave a gap after October
x_11 = range(51, 81)
# Draw the scatter plots
plt.scatter(x_10, y_10, label="十月份")
plt.scatter(x_11, y_11, label="十一月份")
# Set the x-axis ticks
_x = list(x_10) + list(x_11)
_xticks_label = ["十月{}日".format(i) for i in x_10]
# November positions start at 51, so subtract 50 to recover the day of the month
_xticks_label += ["十一月{}日".format(i - 50) for i in x_11]
plt.xticks(_x[::3], _xticks_label[::3], fontproperties=font, rotation=45)
# Add the axis labels
plt.xlabel("时间", fontproperties=font)
plt.ylabel("温度", fontproperties=font)
# Add the title
plt.title("2018年长沙市10,11月份气温变化", fontproperties=font)
# Add the legend
plt.legend(prop=font)
plt.show()
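The scatter plot shows the pattern visually. As a rough numeric check, one option is to fit a straight line to each month and look at the sign of its slope. The sketch below assumes numpy is available; the helper name linear_trend is made up for illustration, and a straight line is only a crude summary of data like this.

```python
import numpy as np

# Daily highs from the article (October, then November 2018)
y_10 = [11, 17, 16, 11, 12, 11, 12, 6, 6, 7, 8, 9, 12, 15, 14, 17, 18, 21, 16,
        17, 20, 14, 15, 15, 15, 19, 21, 22, 22, 22, 23]
y_11 = [26, 26, 28, 19, 21, 17, 16, 19, 18, 20, 20, 19, 22, 23, 17, 20, 21, 20,
        22, 15, 11, 15, 5, 13, 17, 10, 11, 13, 12, 13]


def linear_trend(values):
    """Return (slope, intercept) of a least-squares line through the values.

    The x coordinates are assumed to be the day numbers 1..len(values).
    """
    days = np.arange(1, len(values) + 1)
    slope, intercept = np.polyfit(days, values, 1)
    return float(slope), float(intercept)


slope_10, _ = linear_trend(y_10)
slope_11, _ = linear_trend(y_11)
print(f"October trend:  {slope_10:+.2f} degrees per day")
print(f"November trend: {slope_11:+.2f} degrees per day")
```

A positive slope means the month warmed overall and a negative slope means it cooled.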
2. Drawing a bar chart
- 2.1 Method used: bar(x, y)
Suppose you have the top 20 films at the 2019 mainland China box office (list a) and their box-office figures (list b).
How can you present this data more intuitively?
a = ["流浪地球","复仇者联盟4:终局之战","哪吒之魔童降世","疯狂的外星人","飞驰人生","蜘蛛侠:英雄远征","扫毒2天地对决","烈火英雄","大黄蜂","惊奇队长","比悲伤更悲伤的故事","哥斯拉2:怪兽之王","阿丽塔:战斗天使","银河补习班","狮子王","反贪风暴4","熊出没","大侦探皮卡丘","新喜剧之王","使徒行者2:谍影行动"]
b = [56.01,26.94,17.53,16.49,15.45,12.96,11.8,11.61,11.28,11.12,10.49,10.3,8.75,7.55,7.32,6.99,6.88,
6.86,6.58,6.23]
- Example 1
'''
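Suppose you have the top 20 films at the 2019 mainland China box office (list a) and their box-office figures (list b).
How can you present this data more intuitively?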
a = ["流浪地球","复仇者联盟4:终局之战","哪吒之魔童降世","疯狂的外星人","飞驰人生","蜘蛛侠:英
雄远征","扫毒2天地对决","烈火英雄","大黄蜂","惊奇队长","比悲伤更悲伤的故事","哥斯拉2:怪兽之
王","阿丽塔:战斗天使","银河补习班","狮子王","反贪风暴4","熊出没","大侦探皮卡丘","新喜剧之王
","使徒行者2:谍影行动"]
b = [56.01,26.94,17.53,16.49,15.45,12.96,11.8,11.61,11.28,11.12,10.49,10.3,8.75,7.55,7.32,6.99,6.88,
6.86,6.58,6.23]
'''
from matplotlib import pyplot as plt
from matplotlib import font_manager
font = font_manager.FontProperties(fname=r"c:\windows\fonts\simsun.ttc", size=14)
x = ["流浪地球","复仇者联盟4:\n终局之战","哪吒之\n魔童降世","疯狂的外星人","飞驰人生","蜘蛛侠:\n英雄远征","扫毒2\n天地对决","烈火英雄","大黄蜂","惊奇队长","比悲伤更悲\n伤的故事","哥斯拉2:\n怪兽之王","阿丽塔:\n战斗天使","银河补习班","狮子王","反贪风暴4","熊出没","大侦探皮卡丘","新喜剧之王","使徒行者2:\n谍影行动"]
y = [56.01,26.94,17.53,16.49,15.45,12.96,11.8,11.61,11.28,11.12,10.49,10.3,8.75,7.55,7.32,6.99,6.88,
6.86,6.58,6.23]
# print(len(x),len(y))
# Set the figure size
plt.figure(figsize=(15,8),dpi=80)
# Draw the bar chart
plt.bar(x,y)
# Set the x-axis ticks
# positions 0 - 19
plt.xticks(range(len(x)),x,fontproperties=font,rotation=45)
plt.show()
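- 2.2 Drawing a horizontal bar chart
Method used: barh(x, y)
To turn the bars sideways, look at the source. The first argument gives the bar positions along the y-axis and the second gives the bar lengths:
def barh(self, y, width, height=0.8, left=None, *, align="center", **kwargs):
    r"""
    Make a horizontal bar plot.
Change plt.bar(x,y) to plt.barh(x,y), and remove rotation=45 and the \n characters from the x data.
Change plt.xticks(range(len(x)),x,fontproperties=font) to plt.yticks(range(len(x)),x,fontproperties=font).
The code is as follows: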
from matplotlib import pyplot as plt
from matplotlib import font_manager
font = font_manager.FontProperties(fname=r"c:\windows\fonts\simsun.ttc", size=10)
x = ["流浪地球","复仇者联盟4:终局之战","哪吒之魔童降世","疯狂的外星人","飞驰人生","蜘蛛侠:英雄远征","扫毒2天地对决","烈火英雄","大黄蜂","惊奇队长","比悲伤更悲伤的故事","哥斯拉2:怪兽之王","阿丽塔:战斗天使","银河补习班","狮子王","反贪风暴4","熊出没","大侦探皮卡丘","新喜剧之王","使徒行者2:谍影行动"]
y = [56.01,26.94,17.53,16.49,15.45,12.96,11.8,11.61,11.28,11.12,10.49,10.3,8.75,7.55,7.32,6.99,6.88,
6.86,6.58,6.23]
# print(len(x),len(y))
# Set the figure size
plt.figure(figsize=(15,8),dpi=80)
# Draw the horizontal bar chart
plt.barh(x,y) # was plt.bar(x,y)
# Set the y-axis ticks
# positions 0 - 19
plt.yticks(range(len(x)),x,fontproperties=font) # was plt.xticks(range(len(x)),x,fontproperties=font)
plt.show()
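To put the highest-ranked films at the top (barh draws the first item at the bottom):
Change the call to plt.barh(x[::-1],y[::-1]), which reverses both x and y.
Change plt.yticks(range(len(x)),x,fontproperties=font) to
plt.yticks(range(len(x)),x[::-1],fontproperties=font)
The code is as follows: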
from matplotlib import pyplot as plt
from matplotlib import font_manager
font = font_manager.FontProperties(fname=r"c:\windows\fonts\simsun.ttc", size=10)
x = ["流浪地球","复仇者联盟4:终局之战","哪吒之魔童降世","疯狂的外星人","飞驰人生","蜘蛛侠:英雄远征","扫毒2天地对决","烈火英雄","大黄蜂","惊奇队长","比悲伤更悲伤的故事","哥斯拉2:怪兽之王","阿丽塔:战斗天使","银河补习班","狮子王","反贪风暴4","熊出没","大侦探皮卡丘","新喜剧之王","使徒行者2:谍影行动"]
y = [56.01,26.94,17.53,16.49,15.45,12.96,11.8,11.61,11.28,11.12,10.49,10.3,8.75,7.55,7.32,6.99,6.88,
6.86,6.58,6.23]
# print(len(x),len(y))
# Set the figure size
plt.figure(figsize=(15,8),dpi=80)
# Draw the horizontal bar chart with the data reversed
plt.barh(x[::-1],y[::-1]) # was plt.barh(x,y)
# Set the y-axis ticks
# positions 0 - 19
plt.yticks(range(len(x)),x[::-1],fontproperties=font) # was plt.yticks(range(len(x)),x,fontproperties=font)
plt.show()
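Reversing with [::-1] works here because the article's data is already sorted from highest to lowest. For data that arrives unsorted, one possible approach is to sort the name/value pairs first. The helper sort_for_barh and the three-film subset below are illustrative assumptions, not part of the original code.

```python
from matplotlib import pyplot as plt


def sort_for_barh(names, values):
    """Return (names, values) ordered so the largest value is drawn at the top.

    barh places the first item at the bottom of the axis, so the pairs
    are sorted in ascending order of value.
    """
    pairs = sorted(zip(names, values), key=lambda pair: pair[1])
    sorted_names = [name for name, _ in pairs]
    sorted_values = [value for _, value in pairs]
    return sorted_names, sorted_values


# A small, deliberately unsorted subset of the article's data (illustration only)
names = ["大黄蜂", "流浪地球", "飞驰人生"]
values = [11.28, 56.01, 15.45]
sorted_names, sorted_values = sort_for_barh(names, values)

if __name__ == "__main__":
    positions = range(len(sorted_names))
    plt.barh(positions, sorted_values)
    # Pass fontproperties=font as in the article so the CJK labels render
    plt.yticks(positions, sorted_names)
    plt.show()
```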
- 2.3 Drawing multiple bar series
How do you draw several bar series in one figure?
x1 = [1,3,5,7,9]
data1 = [5,2,7,8,2]
x2 = [1,3,5,7,9]
data2 = [8,6,2,5,6]
- Example
'''
Plotting multiple data series
'''
from matplotlib import pyplot as plt
x1 = [1,3,5,7,9,11,13,15]
data1 = [5,3,4,3,2,3,1,3]
x2 = x1
data2 = [4,2,3,2,1,2,0,2]
plt.bar(x1,data1)
plt.bar(x2,data2)
plt.show() # the two series share the same x positions, so the bars overlap
Fix 1: import numpy as np, then shift the second series with x2 = np.array(x1) + 1
Fix 2: shift the second series with a list comprehension
The code is as follows:
'''
Plotting multiple data series
'''
from matplotlib import pyplot as plt
import numpy as np
x1 = [1,3,5,7,9,11,13,15]
data1 = [5,3,4,3,2,3,1,3]
# x2 = np.array(x1) + 1  # Fix 1
x2 = [i + 1 for i in x1]  # Fix 2
data2 = [4,2,3,2,1,2,0,2]
plt.bar(x1,data1)
plt.bar(x2,data2)
plt.show()
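- Bar chart exercise
Suppose you know the box-office figures of the films in list a on three days: 2017-09-14 (b_14), 2017-09-15 (b_15) and 2017-09-16 (b_16).
How can you present the data more intuitively, so that it shows both each film's own figures and how it compares with the other films?
a = ['流浪地球','复仇者联盟4','哪吒之魔童降世','疯狂的外星人']
b_14 = [2358,399,2358,362]
b_15 = [12357,156,2045,168]
b_16 = [15746,312,4497,319]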
'''
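Bar chart exercise
Suppose you know the box-office figures of the films in list a on three days: 2017-09-14 (b_14), 2017-09-15 (b_15) and 2017-09-16 (b_16).
How can you present the data more intuitively, so that it shows both each film's own figures
and how it compares with the other films?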
a = ['流浪地球','复仇者联盟4','哪吒之魔童降世','疯狂的外星人']
b_14 = [2358,399,2358,362]
b_15 = [12357,156,2045,168]
b_16 = [15746,312,4497,319]
'''
from matplotlib import pyplot as plt
from matplotlib import font_manager
font = font_manager.FontProperties(fname=r"c:\windows\fonts\simsun.ttc", size=10)
a = ['流浪地球','复仇者联盟4','哪吒之魔童降世','疯狂的外星人']
b_14 = [2358,399,2358,362]
b_15 = [12357,156,2045,168]
b_16 = [15746,312,4497,319]
# Set the figure size
plt.figure(figsize=(15,8), dpi=80)
# Bar width, also used as the offset between the three series
bar_width = 0.3
x_14 = list(range(len(a)))
x_15 = [i+bar_width for i in x_14]
x_16 = [i+bar_width*2 for i in x_14]
plt.bar(x_14,b_14,bar_width,label='9月14日')
plt.bar(x_15,b_15,bar_width,label='9月15日')
plt.bar(x_16,b_16,bar_width,label='9月16日')
# Add the legend
plt.legend(prop=font)
# Set the x-axis ticks under the middle bar of each group
plt.xticks(x_15,a,fontproperties=font)
plt.show()
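The offsets x_14, x_15 and x_16 follow one rule: series k is shifted right by k * bar_width. A minimal sketch of that rule for any number of series is given below; the function name grouped_bar_positions is hypothetical.

```python
def grouped_bar_positions(n_groups, n_series, bar_width):
    """Return one list of x positions per series for a grouped bar chart.

    Series k is shifted right by k * bar_width, the same way x_14, x_15
    and x_16 are built in the article. n_series * bar_width should stay
    below 1 so that neighbouring groups do not touch.
    """
    base = range(n_groups)
    return [[i + k * bar_width for i in base] for k in range(n_series)]


# Four films, three days, bar width 0.3 (the article's settings)
positions = grouped_bar_positions(n_groups=4, n_series=3, bar_width=0.3)
for k, xs in enumerate(positions):
    print(f"series {k}: {xs}")
```

Each inner list can be passed as the first argument of plt.bar for the matching series.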
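3. Drawing a histogram
- Looks similar to a bar chart
- Shows a distribution by grouping the data into bins
- Method used: hist(x, bins)
Number of bins: the data is split into groups; with fewer than 100 data points, 5 to 12 bins are typical, depending on how much data there is
Bin width: the distance between the two endpoints of a bin
Number of bins = range / bin width, where range = max(a) - min(a). For example, a bin running from 78 to 81 has width 3.
- Example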
"""
Histogram example
Running times of 250 films
"""
# import random
from matplotlib import pyplot as plt
# a = [random.randint(78,150) for i in range(250)] # generate 250 random running times
# print(a)
a = [132, 114, 149, 127, 114, 83, 101, 98, 147, 78, 95, 129, 98, 93, 102, 131, 83, 123, 100, 93, 83, 88, 138, 119, 145, 121, 136, 87, 94, 86, 85, 112, 80, 143, 107, 116, 123, 124, 139, 104, 102, 104, 113, 121, 107, 80, 141, 86, 103, 119, 142, 131, 93, 101, 92, 81, 137, 138, 144, 79, 121, 79, 116, 143, 89, 82, 133, 87, 96, 140, 139, 104, 121, 84, 107, 93, 141, 119, 90, 132, 85, 139, 142, 126, 106, 138, 141, 116, 105, 143, 91, 116, 106, 134, 137, 144, 96, 112, 133, 150, 95, 96, 137, 79, 94, 116, 118, 105, 101, 87, 96, 82, 97, 90, 79, 91, 93, 93, 98, 114, 138, 80, 139, 108, 97, 143, 142, 97, 117, 92, 135, 118, 97, 137, 146, 144, 146, 107, 85, 145, 128, 78, 132, 142, 83, 111, 88, 106, 142, 128, 150, 116, 97, 113, 100, 130, 79, 149, 84, 104, 96, 94, 103, 105, 123, 113, 96, 95, 103, 126, 134, 123, 149, 140, 114, 116, 99, 97, 119, 93, 91, 120, 104, 98, 78, 124, 139, 106, 119, 141, 91, 136, 97, 107, 102, 95, 118, 93, 134, 142, 90, 84, 90, 149, 105, 112, 135, 93, 100, 149, 148, 136, 126, 88, 148, 113, 81, 86, 117, 145, 109, 120, 89, 103, 82, 112, 78, 125, 105, 132, 108, 105, 102, 95, 148, 111, 79, 88, 87, 97, 88, 83, 110, 145, 101, 138, 102, 92, 83, 83]
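# Bin width
d = 3
# Number of bins
num = (max(a) - min(a)) // d
# print(num) # 24
# Figure size
plt.figure(figsize=(15,8),dpi=80)
# Frequency histogram; the second argument can also be a list of bin edges, e.g. num = [i for i in range(78,160,4)]
plt.hist(a,num)
# num = [i for i in range(78,160,4)] # pass a list of bin edges instead of a bin count
# plt.hist(a,num,density=True) # density=True plots a normalized histogram instead of raw counts
# Set the x-axis ticks
plt.xticks(range(min(a),max(a)+d,d)) # 78 81 84...150
# The y-axis shows how many films fall within each running-time interval
# Show the grid
plt.grid()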
plt.show()
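The bin arithmetic can be checked without drawing anything. The sketch below builds explicit bin edges from a bin width and counts the data with numpy.histogram. The sample list is hypothetical and the helper bin_edges is an assumption for illustration; it rounds the bin count up, so the last edge still covers max(data) when the range is not a multiple of the width.

```python
import numpy as np


def bin_edges(data, width):
    """Return equally spaced bin edges covering integer data, `width` apart.

    The number of bins is range / width rounded up, so the last edge is
    always >= max(data).
    """
    low, high = min(data), max(data)
    n_bins = -(-(high - low) // width)  # ceiling division
    return [low + i * width for i in range(n_bins + 1)]


# Hypothetical running times in minutes, used only for illustration
sample = [78, 80, 83, 95, 101, 116, 150]
edges = bin_edges(sample, width=3)
counts, _ = np.histogram(sample, bins=edges)
print(len(edges) - 1, "bins from", edges[0], "to", edges[-1])
print("films counted:", int(counts.sum()))
```

Passing the edge list as the second argument of plt.hist fixes the bin boundaries exactly, whereas an integer bin count makes matplotlib divide the data range into that many equal parts.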
The code for the normalized (density) histogram is as follows:
"""
Histogram example
Running times of 250 films
"""
from matplotlib import pyplot as plt
a = [132, 114, 149, 127, 114, 83, 101, 98, 147, 78, 95, 129, 98, 93, 102, 131, 83, 123, 100, 93, 83, 88, 138, 119, 145, 121, 136, 87, 94, 86, 85, 112, 80, 143, 107, 116, 123, 124, 139, 104, 102, 104, 113, 121, 107, 80, 141, 86, 103, 119, 142, 131, 93, 101, 92, 81, 137, 138, 144, 79, 121, 79, 116, 143, 89, 82, 133, 87, 96, 140, 139, 104, 121, 84, 107, 93, 141, 119, 90, 132, 85, 139, 142, 126, 106, 138, 141, 116, 105, 143, 91, 116, 106, 134, 137, 144, 96, 112, 133, 150, 95, 96, 137, 79, 94, 116, 118, 105, 101, 87, 96, 82, 97, 90, 79, 91, 93, 93, 98, 114, 138, 80, 139, 108, 97, 143, 142, 97, 117, 92, 135, 118, 97, 137, 146, 144, 146, 107, 85, 145, 128, 78, 132, 142, 83, 111, 88, 106, 142, 128, 150, 116, 97, 113, 100, 130, 79, 149, 84, 104, 96, 94, 103, 105, 123, 113, 96, 95, 103, 126, 134, 123, 149, 140, 114, 116, 99, 97, 119, 93, 91, 120, 104, 98, 78, 124, 139, 106, 119, 141, 91, 136, 97, 107, 102, 95, 118, 93, 134, 142, 90, 84, 90, 149, 105, 112, 135, 93, 100, 149, 148, 136, 126, 88, 148, 113, 81, 86, 117, 145, 109, 120, 89, 103, 82, 112, 78, 125, 105, 132, 108, 105, 102, 95, 148, 111, 79, 88, 87, 97, 88, 83, 110, 145, 101, 138, 102, 92, 83, 83]
# Bin edges: 78, 82, ..., 158 (bin width 4)
num = [i for i in range(78,160,4)]
# Figure size
plt.figure(figsize=(15,8),dpi=80)
# density=True normalizes the histogram so that the bar areas sum to 1
plt.hist(a,num,density=True)
# Set the x-axis ticks on the bin edges so they line up with the bars
plt.xticks(num)
# The y-axis now shows density (the fraction of films in a bin divided by the bin width), not a raw count
# Show the grid
plt.grid()
plt.show()
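With density=True the bar heights are densities, so it is the bar areas (height times bin width) that add up to 1, not the heights. A small numeric check with numpy.histogram, which takes the same bins and density arguments, is sketched below; the sample data is hypothetical.

```python
import numpy as np

# Hypothetical running times in minutes, used only for illustration
sample = [78, 80, 83, 95, 101, 116, 150]
edges = list(range(78, 160, 4))  # same bin edges as the article: 78, 82, ..., 158

density, _ = np.histogram(sample, bins=edges, density=True)
widths = np.diff(edges)

# Area of each bar = density * bin width = fraction of the data in that bin
fractions = density * widths
total_area = float(fractions.sum())
print("total area under the histogram:", round(total_area, 6))
```

Multiplying a bar's height by its bin width recovers the fraction of films in that bin.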
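4. Drawing a pie chart
plt.pie() draws a pie chart
Exercise: show the population shares of 5 countries using year-by-year population data
# Exercise: show the population shares of 5 countries using year-by-year population data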
from matplotlib import pyplot as plt
import pandas as pd
file_path = './population_data.json'
df = pd.read_json(file_path)
# print(df.info())
'''
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 12407 entries, 0 to 12406
Data columns (total 4 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Country Name 12407 non-null object
1 Country Code 12407 non-null object
2 Year 12407 non-null int64
3 Value 12407 non-null float64
dtypes: float64(1), int64(1), object(2)
memory usage: 387.8+ KB
None'''
# print(df.head())
'''
Country Name Country Code Year Value
0 Arab World ARB 1960 96388069.0
1 Arab World ARB 1961 98882541.4
2 Arab World ARB 1962 101474075.8
3 Arab World ARB 1963 104169209.2
4 Arab World ARB 1964 106978104.6'''
# Pivot table: one row per country, one column per year
population_data = df.pivot_table(index='Country Name',columns='Year',values='Value')
# print(population_data.head())
'''
Year 1960 1961 ... 2009 2010
Country Name ...
Afghanistan 9671046.0 9859928.0 ... 3.343833e+07 34385000.0
Albania 1610565.0 1661158.0 ... 3.192723e+06 3205000.0
Algeria 10799997.0 11006643.0 ... 3.495017e+07 35468000.0
American Samoa 20041.0 20500.0 ... 6.731200e+04 68420.0
Andorra 13377.0 14337.0 ... 8.367700e+04 84864.0
[5 rows x 51 columns]'''
plt.figure(figsize=(15,8),dpi=80)
# countries = df.iloc[:,0].drop_duplicates()
# print(list(countries))
'''['Arab World', 'Caribbean small states', 'East Asia & Pacific (all income levels)', 'East Asia & Pacific (
developing only)', 'Euro area', 'Europe & Central Asia (all income levels)', 'Europe & Central Asia (developing
only)', 'European Union', 'Heavily indebted poor countries (HIPC)', 'High income', 'High income: nonOECD',
'High income: OECD', 'Latin America & Caribbean (all income levels)', 'Latin America & Caribbean (developing only)',
'Least developed countries: UN classification', 'Low & middle income', 'Low income', 'Lower middle income',
'Middle East & North Africa (all income levels)', 'Middle East & North Africa (developing only)', 'Middle income',
'North America', 'OECD members', 'Other small states', 'Pacific island small states', 'Small states', 'South Asia',
'Sub-Saharan Africa (all income levels)', 'Sub-Saharan Africa (developing only)', 'Upper middle income', 'World',
'Afghanistan', 'Albania', 'Algeria', 'American Samoa', 'Andorra', 'Angola', 'Antigua and Barbuda', 'Argentina',
'Armenia', 'Aruba', 'Australia', 'Austria', 'Azerbaijan', 'Bahamas, The', 'Bahrain', 'Bangladesh', 'Barbados',
'Belarus', 'Belgium', 'Belize', 'Benin', 'Bermuda', 'Bhutan', 'Bolivia', 'Bosnia and Herzegovina', 'Botswana',
'Brazil', 'Brunei Darussalam', 'Bulgaria', 'Burkina Faso', 'Burundi', 'Cambodia', 'Cameroon', 'Canada', 'Cape Verde',
'Cayman Islands', 'Central African Republic', 'Chad', 'Channel Islands', 'Chile', 'China', 'Colombia', 'Comoros',
'Congo, Dem. Rep.', 'Congo, Rep.', 'Costa Rica', "Cote d'Ivoire", 'Croatia', 'Cuba', 'Curacao', 'Cyprus',
'Czech Republic', 'Denmark', 'Djibouti', 'Dominica', 'Dominican Republic', 'Ecuador', 'Egypt, Arab Rep.',
'El Salvador', 'Equatorial Guinea', 'Eritrea', 'Estonia', 'Ethiopia', 'Faeroe Islands', 'Fiji', 'Finland', 'France',
'French Polynesia', 'Gabon', 'Gambia, The', 'Georgia', 'Germany', 'Ghana', 'Gibraltar', 'Greece', 'Greenland',
'Grenada', 'Guam', 'Guatemala', 'Guinea', 'Guinea-Bissau', 'Guyana', 'Haiti', 'Honduras', 'Hong Kong SAR, China',
'Hungary', 'Iceland', 'India', 'Indonesia', 'Iran, Islamic Rep.', 'Iraq', 'Ireland', 'Isle of Man', 'Israel',
'Italy', 'Jamaica', 'Japan', 'Jordan', 'Kazakhstan', 'Kenya', 'Kiribati', 'Korea, Dem. Rep.', 'Korea, Rep.',
'Kosovo', 'Kuwait', 'Kyrgyz Republic', 'Lao PDR', 'Latvia', 'Lebanon', 'Lesotho', 'Liberia', 'Libya',
'Liechtenstein', 'Lithuania', 'Luxembourg', 'Macao SAR, China', 'Macedonia, FYR', 'Madagascar', 'Malawi', 'Malaysia',
'Maldives', 'Mali', 'Malta', 'Marshall Islands', 'Mauritania', 'Mauritius', 'Mayotte', 'Mexico', 'Micronesia,
Fed. Sts.', 'Moldova', 'Monaco', 'Mongolia', 'Montenegro', 'Morocco', 'Mozambique', 'Myanmar', 'Namibia', 'Nepal',
'Netherlands', 'New Caledonia', 'New Zealand', 'Nicaragua', 'Niger', 'Nigeria', 'Northern Mariana Islands', 'Norway',
'Oman', 'Pakistan', 'Palau', 'Panama', 'Papua New Guinea', 'Paraguay', 'Peru', 'Philippines', 'Poland', 'Portugal',
'Puerto Rico', 'Qatar', 'Romania', 'Russian Federation', 'Rwanda', 'Samoa', 'San Marino', 'Sao Tome and Principe',
'Saudi Arabia', 'Senegal', 'Serbia', 'Seychelles', 'Sierra Leone', 'Singapore', 'Sint Maarten (Dutch part)',
'Slovak Republic', 'Slovenia', 'Solomon Islands', 'Somalia', 'South Africa', 'Spain', 'Sri Lanka', 'St. Kitts and
Nevis', 'St. Lucia', 'St. Martin (French part)', 'St. Vincent and the Grenadines', 'Sudan', 'Suriname', 'Swaziland',
'Sweden', 'Switzerland', 'Syrian Arab Republic', 'Tajikistan', 'Tanzania', 'Thailand', 'Timor-Leste', 'Togo',
'Tonga', 'Trinidad and Tobago', 'Tunisia', 'Turkey', 'Turkmenistan', 'Turks and Caicos Islands', 'Tuvalu', 'Uganda',
'Ukraine', 'United Arab Emirates', 'United Kingdom', 'United States', 'Uruguay', 'Uzbekistan', 'Vanuatu', 'Venezuela,
RB', 'Vietnam', 'Virgin Islands (U.S.)', 'West Bank and Gaza', 'Yemen, Rep.', 'Zambia', 'Zimbabwe'] '''
country_list = ['China','Australia','Germany','Brazil','Japan']
country_data = population_data.loc[country_list]
# print(country_data)
'''
Year 1960 1961 ... 2009 2010
Country Name ...
China 667070000.0 660330000.0 ... 1.331380e+09 1.338300e+09
Australia 10276477.0 10483000.0 ... 2.195170e+07 2.229900e+07
Germany 72814900.0 73377632.0 ... 8.190231e+07 8.177700e+07
Brazil 72758801.0 74975656.0 ... 1.932466e+08 1.949460e+08
Japan 92500572.0 94943000.0 ... 1.275580e+08 1.274510e+08
[5 rows x 51 columns]
'''
# country_data = population_data.loc[country_list].iloc[:,-1] # this also works
country_data = population_data.loc[country_list][2010]
print(country_data)
'''
Country Name
China 1.338300e+09
Australia 2.229900e+07
Germany 8.177700e+07
Brazil 1.949460e+08
Japan 1.274510e+08
Name: 2010, dtype: float64
'''
# labels: wedge labels; autopct='%1.2f%%': percentage format; shadow=True: draw a shadow; explode: pull selected wedges out from the center
plt.pie(country_data.values,labels=country_data.index,autopct='%1.2f%%',shadow=True,explode=[0,0,0,0,0.2])
plt.show()
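The percentages that autopct prints are each value's share of the total. The sketch below reproduces that arithmetic by hand, using the 2010 figures from the printed output above; the helper name shares is made up for illustration.

```python
# 2010 populations as printed in the article's output above
population_2010 = {
    "China": 1.338300e9,
    "Australia": 2.229900e7,
    "Germany": 8.177700e7,
    "Brazil": 1.949460e8,
    "Japan": 1.274510e8,
}


def shares(values):
    """Return each value as a percentage of the total."""
    values = list(values)
    total = sum(values)
    return [100 * value / total for value in values]


percentages = shares(population_2010.values())
for country, pct in zip(population_2010, percentages):
    print(f"{country}: {pct:.2f}%")
```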
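5. Drawing 3D plots
mplot3d: the matplotlib module for drawing 3D plots
3D plotting is done mainly through Axes3D
Steps for 3D plotting
1. Create a 3D canvas
2. Draw on it through ax
5.1 3D curves
Axes3D.plot(xs,ys,zs,zdir)
xs,ys,zs: the 3D coordinates of the points
zdir: the vertical axis, z by default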
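# Exercise: 3D curve
from matplotlib import pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # registers the 3D projection on older matplotlib versions
import numpy as np
# Prepare the data: 1000 evenly spaced values from 0 to 15
zline = np.linspace(0, 15, 1000)
xline = np.sin(zline)
yline = np.cos(zline)
# Create the 3D axes (in newer matplotlib versions Axes3D(fig) no longer attaches itself to the figure)
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
# Plot on the 3D axes; plt.plot would not treat the third argument as z data
ax.plot(xline, yline, zline)
ax.set_xlabel('x')
ax.set_ylabel('y')
ax.set_zlabel('z')
plt.savefig('fig.png',bbox_inches='tight')
plt.show()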
5.2 3D scatter plots
Axes3D.scatter(xs,ys,zs,zdir,s,c,marker)
s: marker size
c: marker color
marker: marker style
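# Exercise: 3D scatter plot
from matplotlib import pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # registers the 3D projection on older matplotlib versions
import numpy as np
# Prepare the data: 100 random numbers in [0, 1) for each coordinate
x = np.random.rand(100)
y = np.random.rand(100)
z = np.random.rand(100)
x1 = np.random.rand(100)
y1 = np.random.rand(100)
z1 = np.random.rand(100)
# Create the 3D axes
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
# Plot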
ax.scatter(x,y,z,s=10,c='r',marker='o')
ax.scatter(x1,y1,z1,s=20,c='g',marker='^')
# plt.savefig('fig.png',bbox_inches='tight')
plt.show()
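5.3 3D surface plots
# Exercise: 3D surface plot
from matplotlib import pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # registers the 3D projection on older matplotlib versions
import numpy as np
# Prepare the data: a grid of (x, y) points from -4 to 4 in steps of 0.25
x = np.arange(-4, 4, 0.25)
y = np.arange(-4, 4, 0.25)
x, y = np.meshgrid(x, y)
# Create the 3D axes
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
# Plot the plane z = x + y
ax.plot_surface(x, y, (x+y))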
plt.show()
6. Using subplots
- plt.subplots(nrows,ncols,sharex,sharey)
nrows,ncols: the number of rows and columns in the grid
sharex,sharey: whether the subplots share the x-axis or y-axis
- Example
"""
Using subplots
"""
from matplotlib import pyplot as plt
fig,suplot_arr = plt.subplots(2,2,figsize=(8,8))
suplot_arr[0,0].scatter(range(10),range(10))
suplot_arr[0,1].bar(range(10),range(10))
suplot_arr[1,0].hist(range(10),range(5),rwidth=0.8)
y = [15,12,14,17,19,22,24,23,23,21,17,14]
suplot_arr[1,1].plot(range(2,26,2),y)
plt.show()
To share the x-axis, add the parameter: fig,suplot_arr = plt.subplots(2,2,figsize=(8,8),sharex=True) # share the x-axis
To share the y-axis, add the parameter: fig,suplot_arr = plt.subplots(2,2,figsize=(8,8),sharey=True) # share the y-axis
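To see what sharing an axis does, the x limits of the subplots can be compared. The following is a minimal sketch, assuming two stacked subplots whose data cover different x ranges; the variable names are illustrative.

```python
from matplotlib import pyplot as plt

# Two stacked subplots that share one x-axis
fig, axes = plt.subplots(2, 1, figsize=(6, 6), sharex=True)
axes[0].plot(range(10), range(10))   # x data runs from 0 to 9
axes[1].bar(range(20), range(20))    # x data runs from 0 to 19

# With sharex=True both subplots report the same x limits,
# wide enough to hold the data of both
same_limits = axes[0].get_xlim() == axes[1].get_xlim()
print("x limits shared:", same_limits)

if __name__ == "__main__":
    plt.show()
```

Without sharex=True each subplot would scale its x-axis to its own data.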
7. Summary of common matplotlib questions
- Which chart type should be used to present the data
- plt.plot(x,y)
- plt.bar(x,y)
- plt.scatter(x,y)
- plt.hist(data,bins,density)
- Setting xticks and yticks
- Setting label, title and grid
- Setting the figure size and saving the figure
8. Summary of the matplotlib workflow
1. Define the question
2. Choose the chart type
3. Prepare the data
4. Plot and refine the figure