Python Data Analysis 003: the numpy and pandas packages (Part 2). Case study: pharmacy sales data analysis

Latest recommended article published 2022-08-03 20:30:00

SK溯鲲


Tags: python, data analysis, numpy

Article link: https://blog.csdn.net/qq_41680326/article/details/104315550

Data file for this analysis: Baidu Cloud link
Extraction code: q2zv

# Import the data analysis package
import pandas as pd

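1. Ask questions

【1】Average number of purchases per month
【2】Average amount spent per month
【3】Average transaction value (amount spent per purchase)
【4】Spending trend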
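The four questions above can be sketched in a few lines of pandas. This is only an illustration on a made-up, already-cleaned sample, not the computation used later in the article. The definitions are assumptions: rows sharing the same date and card number count as one purchase, and the number of months is the number of distinct calendar months in the data. The column name 销售时间 (sale time) anticipates the rename done in the cleaning step.

```python
import pandas as pd

# Hypothetical cleaned sample (the real file has 6578 rows and string columns).
sample = pd.DataFrame({
    "销售时间": pd.to_datetime(
        ["2018-01-01", "2018-01-01", "2018-01-15", "2018-02-10"]),
    "社保卡号": ["001616528", "001616528", "0012602828", "001616528"],
    "实收金额": [69.0, 24.64, 15.0, 30.0],
})

# Assumption: same day + same card number = one purchase.
visits = sample.drop_duplicates(subset=["销售时间", "社保卡号"])
total_visits = len(visits)

# Assumption: count distinct calendar months covered by the data.
months = sample["销售时间"].dt.to_period("M").nunique()

total_amount = sample["实收金额"].sum()

monthly_visits = total_visits / months      # 【1】purchases per month
monthly_amount = total_amount / months      # 【2】amount spent per month
avg_ticket = total_amount / total_visits    # 【3】average transaction value

# 【4】Trend: amount received, summed per calendar month.
monthly_trend = sample.groupby(
    sample["销售时间"].dt.to_period("M"))["实收金额"].sum()

print(monthly_visits, monthly_amount, avg_ticket)
print(monthly_trend)
```

Other definitions of "month" (for example, total days divided by 30) would give different averages, so the choice should be stated alongside any reported figure.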

2. Understand the data

【1】Read the Excel data

# Read the Excel data: read every column as str (object) first, convert types later
# A raw string keeps the backslashes in the Windows path literal
fileNameStr = r'E:\liaoyuanhao\朝阳医院2018年销售数据.xlsx'
# dtype is passed to parse(); pd.ExcelFile itself takes no dtype argument
xls = pd.ExcelFile(fileNameStr)
salesDf = xls.parse('Sheet1',dtype = 'object')
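One reason to read everything as text first is that identifier-like columns such as 社保卡号 (social security card number) start with zeros, and type inference can turn them into integers. The sketch below shows the effect with a small in-memory CSV so that it runs without the Excel file. The CSV content is made up, and how an actual Excel sheet behaves depends on how its cells are stored, so treat this as an illustration of the `dtype` argument rather than of this particular workbook.

```python
import io
import pandas as pd

# Made-up stand-in for two columns of the sales sheet.
csv_text = "社保卡号,销售数量\n001616528,6\n0012602828,2\n"

# Default: pandas infers integer columns and the leading zeros disappear.
inferred = pd.read_csv(io.StringIO(csv_text))

# dtype='object' keeps every value exactly as the text that was read.
as_text = pd.read_csv(io.StringIO(csv_text), dtype="object")

print(inferred["社保卡号"].tolist())
print(as_text["社保卡号"].tolist())
```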

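【2】Print the first few rows: salesDf.head()

# Print the first 3 rows to check that the data loaded correctly
# Columns: 购药时间 (purchase time), 社保卡号 (social security card number),
# 商品编码 (product code), 商品名称 (product name), 销售数量 (quantity sold),
# 应收金额 (amount receivable), 实收金额 (amount received)
salesDf.head(3)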

	购药时间	社保卡号	商品编码	商品名称	销售数量	应收金额	实收金额
0	2018-01-01 星期五	001616528	236701	强力VC银翘片	6	82.8	69
1	2018-01-02 星期六	001616528	236701	清热解毒口服液	1	28	24.64
2	2018-01-06 星期三	0012602828	236701	感康	2	16.8	15

【3】Number of rows and columns

salesDf.shape

(6578, 7)

【4】View the data type of each column

salesDf.dtypes

购药时间    object
社保卡号    object
商品编码    object
商品名称    object
销售数量    object
应收金额    object
实收金额    object
dtype: object
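Every column is `object` because the sheet was read as text, so the numeric columns need converting before any arithmetic. A minimal sketch of that conversion, on a made-up three-row frame rather than the real data, and assuming the three quantity and amount columns should all become floats:

```python
import pandas as pd

# Made-up rows shaped like the numeric columns of salesDf (all strings).
raw = pd.DataFrame({
    "销售数量": ["6", "1", "2"],
    "应收金额": ["82.8", "28", "16.8"],
    "实收金额": ["69", "24.64", "15"],
}, dtype="object")

numeric_cols = ["销售数量", "应收金额", "实收金额"]

# astype with a dict converts only the listed columns and returns a new frame.
converted = raw.astype({col: "float" for col in numeric_cols})

print(raw.dtypes)
print(converted.dtypes)
```

If a column may contain non-numeric text, `pd.to_numeric(column, errors="coerce")` is an alternative that turns unparseable values into NaN instead of raising.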

3. Data cleaning

【1】Select a subset

# There is no need to select a subset in this case
# Example: select the first 5 rows, from the 购药时间 (purchase time) column
# through the 销售数量 (quantity sold) column; .loc slices include both endpoints
subsalesDf = salesDf.loc[0:4,'购药时间':'销售数量']
subsalesDf

	购药时间	社保卡号	商品编码	商品名称	销售数量
0	2018-01-01 星期五	001616528	236701	强力VC银翘片	6
1	2018-01-02 星期六	001616528	236701	清热解毒口服液	1
2	2018-01-06 星期三	0012602828	236701	感康	2
3	2018-01-11 星期一	0010070343428	236701	三九感冒灵	1
4	2018-01-15 星期五	00101554328	236701	三九感冒灵	8
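The slice `0:4` returns five rows here because `.loc` slices by label and includes both endpoints, unlike ordinary Python slicing. The sketch below contrasts it with position-based `.iloc` on a made-up frame; the column names `a`, `b`, `c` are placeholders.

```python
import pandas as pd

# Placeholder frame with the default integer index 0..9.
df = pd.DataFrame({"a": range(10), "b": range(10, 20), "c": range(20, 30)})

# Label-based: rows labelled 0 through 4 and columns a through b, endpoints included.
by_label = df.loc[0:4, "a":"b"]

# Position-based: rows at positions 0..3 and columns at positions 0..1, end excluded.
by_position = df.iloc[0:4, 0:2]

print(by_label.shape)
print(by_position.shape)
```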

【2】Rename columns

# Dictionary mapping old column names to new column names:
# 购药时间 (purchase time) becomes 销售时间 (sale time)
colNameDict = {
   '购药时间':'销售时间'}
# inplace = False (the default): the DataFrame itself is unchanged and a new, modified DataFrame is returned
# inplace = True: the DataFrame itself is modified
salesDf.rename(columns = colNameDict,inplace = True)
salesDf.head(
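The difference between the two `inplace` settings can be checked directly. This is a small sketch on a made-up one-row frame, not the article's data:

```python
import pandas as pd

df = pd.DataFrame({"购药时间": ["2018-01-01 星期五"], "销售数量": ["6"]})
mapping = {"购药时间": "销售时间"}

# Default (inplace=False): a renamed copy is returned, df keeps its old names.
renamed = df.rename(columns=mapping)
columns_after_copy = list(df.columns)

# inplace=True: df itself is modified and the call returns None.
result = df.rename(columns=mapping, inplace=True)
columns_after_inplace = list(df.columns)

print(columns_after_copy)
print(columns_after_inplace)
```

Assigning the returned frame (`df = df.rename(columns=mapping)`) has the same end effect as `inplace=True` and makes the reassignment explicit.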
