pandas Case Study: Analyzing US State Population Data

First, load the three CSV files and look at a sample of each.

In [54]:
import pandas as pd

abbr = pd.read_csv("./usapop/state-abbrevs.csv")
abbr.head()
Out[54]:
        state abbreviation
0     Alabama           AL
1      Alaska           AK
2     Arizona           AZ
3    Arkansas           AR
4  California           CA
In [55]:
areas = pd.read_csv("./usapop/state-areas.csv")
areas.head()
Out[55]:
        state  area (sq. mi)
0     Alabama          52423
1      Alaska         656425
2     Arizona         114006
3    Arkansas          53182
4  California         163707
In [56]:
pop = pd.read_csv("./usapop/state-population.csv")
pop.head()
Out[56]:
  state/region     ages  year  population
0           AL  under18  2012   1117489.0
1           AL    total  2012   4817528.0
2           AL  under18  2010   1130966.0
3           AL    total  2010   4785570.0
4           AL  under18  2011   1125763.0

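Merge the pop and abbr DataFrames, matching the state/region column of pop against the abbreviation column of abbr.

Use an outer merge so that no rows from either table are lost.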

In [57]:
pop2 = pop.merge(abbr,left_on="state/region",right_on="abbreviation",how="outer")
# Use an outer join (or a left join)
pop2
Out[57]:
  state/region ages year population state abbreviation
0 AL under18 2012 1117489.0 Alabama AL
1 AL total 2012 4817528.0 Alabama AL
2 AL under18 2010 1130966.0 Alabama AL
3 AL total 2010 4785570.0 Alabama AL
4 AL under18 2011 1125763.0 Alabama AL
5 AL total 2011 4801627.0 Alabama AL
6 AL total 2009 4757938.0 Alabama AL
7 AL under18 2009 1134192.0 Alabama AL
8 AL under18 2013 1111481.0 Alabama AL
9 AL total 2013 4833722.0 Alabama AL
10 AL total 2007 4672840.0 Alabama AL
11 AL under18 2007 1132296.0 Alabama AL
12 AL total 2008 4718206.0 Alabama AL
13 AL under18 2008 1134927.0 Alabama AL
14 AL total 2005 4569805.0 Alabama AL
15 AL under18 2005 1117229.0 Alabama AL
16 AL total 2006 4628981.0 Alabama AL
17 AL under18 2006 1126798.0 Alabama AL
18 AL total 2004 4530729.0 Alabama AL
19 AL under18 2004 1113662.0 Alabama AL
20 AL total 2003 4503491.0 Alabama AL
21 AL under18 2003 1113083.0 Alabama AL
22 AL total 2001 4467634.0 Alabama AL
23 AL under18 2001 1120409.0 Alabama AL
24 AL total 2002 4480089.0 Alabama AL
25 AL under18 2002 1116590.0 Alabama AL
26 AL under18 1999 1121287.0 Alabama AL
27 AL total 1999 4430141.0 Alabama AL
28 AL total 2000 4452173.0 Alabama AL
29 AL under18 2000 1122273.0 Alabama AL
... ... ... ... ... ... ...
2514 USA under18 1999 71946051.0 NaN NaN
2515 USA total 2000 282162411.0 NaN NaN
2516 USA under18 2000 72376189.0 NaN NaN
2517 USA total 1999 279040181.0 NaN NaN
2518 USA total 2001 284968955.0 NaN NaN
2519 USA under18 2001 72671175.0 NaN NaN
2520 USA total 2002 287625193.0 NaN NaN
2521 USA under18 2002 72936457.0 NaN NaN
2522 USA total 2003 290107933.0 NaN NaN
2523 USA under18 2003 73100758.0 NaN NaN
2524 USA total 2004 292805298.0 NaN NaN
2525 USA under18 2004 73297735.0 NaN NaN
2526 USA total 2005 295516599.0 NaN NaN
2527 USA under18 2005 73523669.0 NaN NaN
2528 USA total 2006 298379912.0 NaN NaN
2529 USA under18 2006 73757714.0 NaN NaN
2530 USA total 2007 301231207.0 NaN NaN
2531 USA under18 2007 74019405.0 NaN NaN
2532 USA total 2008 304093966.0 NaN NaN
2533 USA under18 2008 74104602.0 NaN NaN
2534 USA under18 2013 73585872.0 NaN NaN
2535 USA total 2013 316128839.0 NaN NaN
2536 USA total 2009 306771529.0 NaN NaN
2537 USA under18 2009 74134167.0 NaN NaN
2538 USA under18 2010 74119556.0 NaN NaN
2539 USA total 2010 309326295.0 NaN NaN
2540 USA under18 2011 73902222.0 NaN NaN
2541 USA total 2011 311582564.0 NaN NaN
2542 USA under18 2012 73708179.0 NaN NaN
2543 USA total 2012 313873685.0 NaN NaN

2544 rows × 6 columns
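
The NaN values in the state and abbreviation columns above belong to rows whose state/region key has no counterpart in abbr. As a minimal sketch of how an outer merge produces such rows, the following uses two small made-up tables (pop_demo and abbr_demo are hypothetical stand-ins, and their values are illustrative rather than taken from the CSV files) and the indicator option of merge to flag which side each row came from.

```python
import pandas as pd

# Hypothetical toy tables; the numbers are illustrative only.
pop_demo = pd.DataFrame({
    "state/region": ["AL", "AK", "USA"],
    "population": [100.0, 50.0, 150.0],
})
abbr_demo = pd.DataFrame({
    "state": ["Alabama", "Alaska", "Arizona"],
    "abbreviation": ["AL", "AK", "AZ"],
})

# Outer merge on differently named key columns.
# indicator=True adds a "_merge" column with the values
# "both", "left_only" or "right_only".
merged = pop_demo.merge(
    abbr_demo,
    left_on="state/region",
    right_on="abbreviation",
    how="outer",
    indicator=True,
)

# Rows that found no partner on the other side carry NaN in the
# columns that come from that side.
unmatched = merged[merged["_merge"] != "both"]
print(unmatched)
```

Here "USA" appears only in pop_demo and "AZ" only in abbr_demo, so both rows survive the outer merge with NaN in the columns from the other table. With how="left", the "AZ" row would be dropped instead.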

Drop the abbreviation column (axis=1).
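
A minimal sketch of this step, assuming DataFrame.drop is the intended call: it is run on a two-row stand-in (merged_demo is a hypothetical name) built from values shown in the merge output above, not on the full pop2 table.

```python
import pandas as pd

# Two rows copied from the merged output shown earlier (AL and USA totals for 2012).
merged_demo = pd.DataFrame({
    "state/region": ["AL", "USA"],
    "population": [4817528.0, 313873685.0],
    "state": ["Alabama", None],
    "abbreviation": ["AL", None],
})

# axis=1 tells drop to look for the label among the columns, not the row index.
# drop returns a new DataFrame and leaves merged_demo unchanged.
trimmed = merged_demo.drop("abbreviation", axis=1)

print(trimmed.columns.tolist())
```

The same result can be written as merged_demo.drop(columns="abbreviation"), which avoids having to remember which axis number means columns.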

In [58]: