Statistics for numeric and non-numeric data in pandas

一匹脱缰的野马

Article link: https://blog.csdn.net/qq_39112101/article/details/100765417

Statistics on a single column

Load the data

import pandas as pd

detail = pd.read_excel('./meal_order_detail.xlsx')

The common methods for numeric statistics are as follows:

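Compute the statistics for the amounts (unit price) column in detail:
print('max', detail.loc[:,'amounts'].max())
print('min', detail.loc[:,'amounts'].min())
print('mean', detail.loc[:,'amounts'].mean())
print('median', detail.loc[:,'amounts'].median())
print('variance', detail.loc[:,'amounts'].var())
# Series.ptp() is no longer available in recent pandas versions, so the range is computed as max - min
print('range', detail.loc[:,'amounts'].max() - detail.loc[:,'amounts'].min())
print('standard deviation', detail.loc[:,'amounts'].std())
print('mode', detail.loc[:,'amounts'].mode())
print('number of non-null values', detail.loc[:,'amounts'].count())
print('index of the max value', detail.loc[:,'amounts'].idxmax())
print('index of the min value', detail.loc[:,'amounts'].idxmin())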
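Since meal_order_detail.xlsx is not included here, the same calls can be tried on a small hand-made Series. This is only a sketch: the values below are hypothetical and simply stand in for the amounts column.

```python
import pandas as pd

# Hypothetical stand-in for the 'amounts' column of meal_order_detail.xlsx
amounts = pd.Series([10, 20, 20, 35, 50], name='amounts')

summary = {
    'max': amounts.max(),
    'min': amounts.min(),
    'mean': amounts.mean(),
    'median': amounts.median(),
    'range': amounts.max() - amounts.min(),  # used instead of Series.ptp()
    'mode': amounts.mode().tolist(),         # mode() returns a Series; ties give several values
    'count': amounts.count(),                # number of non-null values
    'idxmax': amounts.idxmax(),              # index label of the largest value
    'idxmin': amounts.idxmin(),              # index label of the smallest value
}

for name, value in summary.items():
    print(name, value)
```

Note that mode() returns a Series rather than a single number, because several values can share the highest frequency.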

For numeric data, describe returns 8 statistics: count, mean, std, min, the 25%, 50% and 75% percentiles, and max.

print('describe',detail.loc[:,'amounts'].describe())
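To see which eight entries describe produces for a numeric column, the index of the result can be inspected. A minimal sketch on hypothetical values (not the real order data):

```python
import pandas as pd

# Hypothetical numeric column standing in for detail['amounts']
amounts = pd.Series([10, 20, 20, 35, 50], name='amounts')

stats = amounts.describe()

# The result is itself a Series; its index holds the names of the statistics
print(list(stats.index))

# The '50%' entry is the median
print(stats['50%'], amounts.median())
```

Because the result is an ordinary Series, a single statistic can be picked out by label, for example stats['mean'].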

Statistics on multiple columns

The format is as follows:
print('describe',detail.loc[:,['amounts','counts']].describe())
Simply put, pass a list of column names in the column position of loc.
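The multi-column form can be sketched on a small hypothetical DataFrame (the column names follow the article, the values are made up). The result has one row per statistic and one column per selected column.

```python
import pandas as pd

# Hypothetical miniature of the detail table
detail_demo = pd.DataFrame({
    'amounts': [10, 20, 20, 35, 50],
    'counts': [1, 2, 1, 1, 3],
    'dishes_name': ['A', 'B', 'B', 'C', 'A'],
})

# Select two columns by passing a list of names, then describe them together
multi = detail_demo.loc[:, ['amounts', 'counts']].describe()
print(multi)

# Without a selection, describe() keeps only the numeric columns by default
print(list(detail_demo.describe().columns))
```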

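Statistics on non-numeric data

Converting data types in a DataFrame

When another type is converted to object, describe returns 4 values for the non-numeric data: count, unique, top and freq.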

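# Assign with detail['amounts'] = ...; in recent pandas versions, assigning through .loc[:, col] may keep the original dtype
detail['amounts'] = detail.loc[:,'amounts'].astype('object')
print(detail.loc[:,'amounts'].describe())
print(detail.loc[:,'amounts'].dtypes)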
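The four values returned for an object column can be checked on a small example. This is a sketch on hypothetical values, not on the real order data.

```python
import pandas as pd

# Hypothetical numeric column, converted to object dtype
amounts = pd.Series([10, 20, 20, 35, 50], name='amounts')
as_object = amounts.astype('object')

obj_stats = as_object.describe()
print(as_object.dtype)
print(obj_stats)
# count  : number of non-null values
# unique : number of distinct values
# top    : the most frequent value
# freq   : how often the most frequent value occurs
```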

Converting other types to categorical data

# Assign with detail['amounts'] = ... so that the category dtype is kept
detail['amounts'] = detail.loc[:,'amounts'].astype('category')
print(detail.loc[:,'amounts'].describe())
print(detail.loc[:,'amounts'].dtypes)
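A categorical column stores a fixed set of categories plus a code for each row. The sketch below uses hypothetical dish names to show the dtype, the categories, and the describe output after the conversion.

```python
import pandas as pd

# Hypothetical dish names standing in for detail['dishes_name']
dishes = pd.Series(['rice', 'soup', 'rice', 'tea'], name='dishes_name')
as_category = dishes.astype('category')

print(as_category.dtype)                  # the dtype is now 'category'
print(list(as_category.cat.categories))   # the distinct values, stored once

cat_stats = as_category.describe()
print(cat_stats)
```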

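Which dish in detail is the most popular, and how many times was it sold?

detail['dishes_name'] = detail.loc[:,'dishes_name'].astype('category')
print('Descriptive statistics for dishes_name:',detail.loc[:,'dishes_name'].describe())
The most popular item turns out to be the large bowl of white rice (白饭/大碗), but that is not really a dish, so the statistics are recomputed without it.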

Remove the large bowl of white rice from the data

bool_id = detail.loc[:,'dishes_name'] == '白饭/大碗'
index = detail.loc[bool_id,:].index             
detail.drop(labels=index,axis=0,inplace=True)
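The same rows can also be removed by keeping everything where the boolean mask is False. The sketch below compares both approaches on a small hypothetical table; only the value '白饭/大碗' is taken from the article, the rest is made up.

```python
import pandas as pd

# Hypothetical miniature of the detail table
detail_demo = pd.DataFrame({
    'dishes_name': ['白饭/大碗', 'soup', '白饭/大碗', 'tea', 'soup'],
    'counts': [1, 1, 2, 1, 1],
})

is_rice = detail_demo['dishes_name'] == '白饭/大碗'

# Option 1 (as in the text): look up the matching index labels and drop them
rice_index = detail_demo.loc[is_rice, :].index
dropped = detail_demo.drop(labels=rice_index, axis=0)

# Option 2: keep only the rows where the mask is False
filtered = detail_demo.loc[~is_rice, :]

print(dropped.equals(filtered))
print(filtered['dishes_name'].describe())
```

Neither option here modifies detail_demo in place, which can make it easier to re-run a cell; the article's inplace=True variant changes detail directly.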

Convert the data type again and assign the result back to the data.

detail['dishes_name'] = detail.loc[:,'dishes_name'].astype('category')
# Then compute the descriptive statistics again
print("Descriptive statistics for dishes_name:\n",detail.loc[:,'dishes_name'].describe())

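Which order in detail contains the most dishes, and how many dishes were ordered in it?

Convert order_id to categorical data, then call describe.

detail['order_id'] = detail.loc[:,'order_id'].astype('category')
print('Descriptive statistics for order_id:',detail.loc[:,'order_id'].describe())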

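When describing columns like these, it is recommended to convert the data type to category first and then compute the statistics.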
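describe only reports the single most frequent value. As a cross-check, value_counts gives the full ranking. The sketch below uses hypothetical order ids, not values from the real file.

```python
import pandas as pd

# Hypothetical order ids standing in for detail['order_id']; one row per ordered dish
order_ids = pd.Series([417, 417, 301, 417, 301, 413], name='order_id')

# describe on the categorical column: 'top' is the most frequent id, 'freq' its row count
desc = order_ids.astype('category').describe()
print(desc['top'], desc['freq'])

# value_counts lists every id with its number of rows, most frequent first
ranking = order_ids.value_counts()
print(ranking.head(3))
```

If the top and freq entries from describe agree with the first row of value_counts, the conversion and the statistics are consistent.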
