Data Analysis: Population Analysis as an Example

Latest recommended article published on 2024-07-20 17:12:48

qq_28368825

Tags: python

Article link: https://blog.csdn.net/qq_28368825/article/details/123271137


import numpy as np
import pandas as pd

Read the files:

abb=pd.read_csv('./data/state-abbrevs.csv')
area=pd.read_csv('./data/state-areas.csv')
population=pd.read_csv('./data/state-population.csv')
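
Before merging, it helps to check the shape, column names, and dtypes of each table. The sketch below reads a two-row excerpt from a string instead of from disk; the excerpt and the name abb_demo are assumptions for illustration, with the column layout inferred from the code that follows.

```python
import io
import pandas as pd

# Assumed two-row excerpt laid out like state-abbrevs.csv (illustrative only)
csv_text = "state,abbreviation\nAlabama,AL\nAlaska,AK\n"
abb_demo = pd.read_csv(io.StringIO(csv_text))

print(abb_demo.shape)    # number of rows and columns
print(abb_demo.dtypes)   # dtype of each column
print(abb_demo.head())   # first few rows
```

The same three calls can be run on abb, area, and population after loading the real files.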


Merge abb and population:

abb_pop=pd.merge(abb,population,left_on='abbreviation',right_on='state/region',how='outer')
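
To see what the outer merge does, here is a minimal sketch on made-up rows (the values and the names abb_demo, pop_demo, and merged_demo are assumptions, not the contents of the CSV files). The indicator column records which side each row came from, which shows why a region code with no match in the abbreviation table ends up with a missing state.

```python
import pandas as pd

# Toy stand-ins for the two tables (values are illustrative only)
abb_demo = pd.DataFrame({
    'state': ['Alabama', 'Alaska'],
    'abbreviation': ['AL', 'AK'],
})
pop_demo = pd.DataFrame({
    'state/region': ['AL', 'AK', 'PR'],
    'ages': ['total', 'total', 'total'],
    'year': [2012, 2012, 2012],
    'population': [100.0, 50.0, 30.0],
})

# how='outer' keeps rows from both sides even when the key has no partner;
# indicator=True adds a _merge column: both, left_only, or right_only
merged_demo = pd.merge(
    abb_demo, pop_demo,
    left_on='abbreviation', right_on='state/region',
    how='outer', indicator=True,
)
print(merged_demo)
```

Here the PR row is marked right_only and its state is NaN, because PR has no entry in the abbreviation table.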

Drop the redundant abbreviation column (it duplicates state/region):

abb_pop.drop(labels='abbreviation',axis=1,inplace=True)

Check which columns contain NaN:

abb_pop.isnull().any(axis=0)
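
As a small illustration of this check, the sketch below runs it on a made-up frame (df_demo and its values are assumptions). any(axis=0) reduces down each column, giving one True/False per column, and sum() gives the number of missing values per column.

```python
import numpy as np
import pandas as pd

# Toy frame with gaps in two of its three columns (illustrative only)
df_demo = pd.DataFrame({
    'state': ['Alabama', np.nan],
    'state/region': ['AL', 'PR'],
    'population': [100.0, np.nan],
})

has_nan = df_demo.isnull().any(axis=0)   # True for columns with at least one NaN
nan_counts = df_demo.isnull().sum()      # how many NaN values each column holds
print(has_nan)
print(nan_counts)
```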

Find the state/region values of the rows where the state column is null:

abb_pop.loc[abb_pop['state'].isnull(),'state/region'].unique()

Fill in the missing state names:

# Use the full names so that later merges on 'state' can match them
indexs=abb_pop.loc[abb_pop['state/region']=='USA'].index
abb_pop.loc[indexs,'state']='United States'
indexs=abb_pop.loc[abb_pop['state/region']=='PR'].index
abb_pop.loc[indexs,'state']='Puerto Rico'
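
The same fill can be written with a boolean mask passed directly to .loc, without building an index first. This is a minimal sketch on made-up rows; demo, fill_map, and the display names in it are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Toy frame where two region codes have no state name (illustrative only)
demo = pd.DataFrame({
    'state': ['Alabama', np.nan, np.nan],
    'state/region': ['AL', 'PR', 'USA'],
})

# Hypothetical mapping from region code to the name to fill in
fill_map = {'PR': 'Puerto Rico', 'USA': 'United States'}
for code, name in fill_map.items():
    # Row selector is a boolean mask, column selector is 'state'
    demo.loc[demo['state/region'] == code, 'state'] = name

print(demo)
```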

Merge in the area data:

abb_pop_area=pd.merge(abb_pop,area,how='outer')


Drop the rows where area (sq. mi) is NaN:

indexs=abb_pop_area.loc[abb_pop_area['area (sq. mi)'].isnull()].index
abb_pop_area.drop(labels=indexs,axis=0,inplace=True)
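
An equivalent and shorter way to drop these rows is dropna with a subset argument, which only looks at the listed columns. The sketch below uses made-up rows; demo and cleaned are assumed names.

```python
import numpy as np
import pandas as pd

# Toy frame where one row has no area (illustrative only)
demo = pd.DataFrame({
    'state': ['Alabama', 'United States'],
    'population': [100.0, 300.0],
    'area (sq. mi)': [52423.0, np.nan],
})

# Drop rows whose area is NaN; NaN in other columns would not trigger a drop
cleaned = demo.dropna(subset=['area (sq. mi)'])
print(cleaned)
```

Unlike the inplace drop, this returns a new frame and leaves demo unchanged.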

Find the rows where ages is total and year is 2010:

# year is a numeric column, so compare it with the number 2010, not the string "2010"
abb_pop_area.query('ages=="total" & year==2010')
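
When writing a query like this, the literal has to match the column's dtype: a year column read from CSV is normally numeric, so it should be compared with a number rather than a quoted string. The sketch below, on made-up rows (demo and its values are assumptions), shows the query form next to the equivalent boolean mask.

```python
import pandas as pd

# Toy frame with a numeric year column (illustrative only)
demo = pd.DataFrame({
    'state': ['Alabama', 'Alabama', 'Alaska'],
    'ages': ['total', 'under18', 'total'],
    'year': [2010, 2010, 2011],
    'population': [100.0, 20.0, 50.0],
})

print(demo['year'].dtype)  # check the dtype before choosing the literal

# Query form: string literal for ages, numeric literal for year
by_query = demo.query('ages == "total" and year == 2010')

# Equivalent boolean-mask form
by_mask = demo[(demo['ages'] == 'total') & (demo['year'] == 2010)]
print(by_query)
```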

Compute the population density:

abb_pop_area['midu']=abb_pop_area['population']/abb_pop_area['area (sq. mi)']

State with the highest population density:

abb_pop_area.sort_values(by='midu',axis=0,ascending=False).iloc[0]['state']
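
Sorting the whole table compares rows from every year and age group together. One possible refinement, sketched here on made-up rows (demo, snapshot, the density column name, and all values are assumptions), is to restrict the data to one year and the total age group first and then use idxmax instead of a full sort.

```python
import pandas as pd

# Toy frame: three regions in one year plus one row from another age group
demo = pd.DataFrame({
    'state': ['A', 'B', 'C', 'B'],
    'ages': ['total', 'total', 'total', 'under18'],
    'year': [2010, 2010, 2010, 2010],
    'population': [1000.0, 500.0, 900.0, 120.0],
    'area (sq. mi)': [100.0, 10.0, 300.0, 10.0],
})

# Keep a single comparable snapshot: total population in 2010
snapshot = demo[(demo['ages'] == 'total') & (demo['year'] == 2010)].copy()
snapshot['density'] = snapshot['population'] / snapshot['area (sq. mi)']

# idxmax returns the index label of the largest density
densest = snapshot.loc[snapshot['density'].idxmax(), 'state']
print(densest)
```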
