pandas (Part 3)

Latest recommended article published on 2023-01-24 13:29:17

qq_23664173

Category: pandas

Article link: https://blog.csdn.net/qq_23664173/article/details/84501258

import pandas as pd
score = pd.read_csv("./file.csv", names=['classes', 'stuid', 'C', 'M', 'E'])
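1 score['classes'].unique()
unique removes duplicates and returns the distinct values
2 score['classes'].value_counts()
Counts how often each value occurs, sorted from most to least frequent
3 score['classes'].isin(['AI11', 'AI12'])
Checks, element by element, whether each value in classes is AI11 or AI12, and returns a boolean Series
4 data.swaplevel().sort_index(level=0)
swaplevel() swaps the index levels; sort_index(level=0) then sorts by the new outer level
5 df1 = data.unstack(level=1)
unstack turns a Series with a multi-level index into a DataFrame; the index level passed as level becomes the DataFrame's column index
6 score.sort_index(axis=1, ascending=True)
Sorts a DataFrame by its index labels: axis=1 sorts by the column labels, axis=0 sorts by the row labels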
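The three Series methods above can be tried on a small table. This is a minimal sketch: the rows below are made-up stand-ins for ./file.csv, which is not shown in the post.

```python
import pandas as pd

# Hypothetical rows standing in for ./file.csv
score = pd.DataFrame({
    "classes": ["AI11", "AI12", "AI11", "AI13", "AI11"],
    "stuid": [1, 2, 3, 4, 5],
    "C": [90, 85, 70, 60, 88],
})

# Distinct values, in order of first appearance
distinct = score["classes"].unique()

# Occurrences per value, most frequent first
counts = score["classes"].value_counts()

# Boolean Series: True where the class is AI11 or AI12
mask = score["classes"].isin(["AI11", "AI12"])

# The mask can be used directly to filter rows
selected = score[mask]
print(distinct, counts, selected, sep="\n")
```

Using the isin result as a row filter, as in the last step, is probably its most common use.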
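The post does not show how data is built. As a sketch, assume it is a Series with a two-level index (the level names classes and subject are invented for illustration):

```python
import pandas as pd

# Hypothetical two-level index: class on the outside, subject on the inside
idx = pd.MultiIndex.from_product(
    [["AI11", "AI12"], ["C", "M"]], names=["classes", "subject"]
)
data = pd.Series([90, 80, 70, 60], index=idx)

# Swap the two levels, then sort by the new outer level (subject)
swapped = data.swaplevel().sort_index(level=0)

# Move index level 1 (subject) into the columns, giving a DataFrame
df1 = data.unstack(level=1)
print(swapped)
print(df1)
```

After unstack, each class is one row and each subject is one column.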
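A small sketch of the two axis settings, using an invented frame whose row and column labels are deliberately out of order:

```python
import pandas as pd

# Hypothetical frame with unsorted labels on both axes
df = pd.DataFrame(
    {"M": [1, 2], "C": [3, 4], "E": [5, 6]},
    index=["s2", "s1"],
)

# axis=1: reorder the columns by their labels
by_cols = df.sort_index(axis=1, ascending=True)

# axis=0: reorder the rows by their labels
by_rows = df.sort_index(axis=0)
print(by_cols)
print(by_rows)
```

Note that sort_index reorders by labels only; sorting by the values in a column is done with sort_values.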
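7 ser1 = pd.Series([89, 89, 54, 100, 33])
ser1.rank(ascending=False, method="first")
rank returns the rank of each value, not a sorted copy and not the position of the maximum. With ascending=False the largest value gets rank 1; method="first" breaks ties by order of appearance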
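To make the tie handling concrete, here is the same Series ranked with method="first" and with the default method="average"; idxmax is included only to contrast it with rank.

```python
import pandas as pd

ser1 = pd.Series([89, 89, 54, 100, 33])

# Ties broken by order of appearance: the first 89 outranks the second
first = ser1.rank(ascending=False, method="first")
print(first.tolist())    # [2.0, 3.0, 4.0, 1.0, 5.0]

# Default: tied values share the average of their ranks
average = ser1.rank(ascending=False)
print(average.tolist())  # [2.5, 2.5, 4.0, 1.0, 5.0]

# For the index label of the largest value, use idxmax instead
print(ser1.idxmax())     # 3
```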
8 g_data = pd.read_csv("./file.csv", names=['id', 'value'], index_col=0)
names=['id', 'value'] sets the column labels
index_col=0 uses the first column as the index; index_col=1 would use the second
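The effect of names and index_col can be checked without a file on disk. This sketch reads made-up CSV text through io.StringIO instead of ./file.csv:

```python
import io
import pandas as pd

# Hypothetical headerless CSV content
csv_text = "a1,10\na2,20\na3,30\n"

# First column becomes the index; "value" is the only remaining column
g_data = pd.read_csv(io.StringIO(csv_text), names=["id", "value"], index_col=0)

# Second column becomes the index instead
by_value = pd.read_csv(io.StringIO(csv_text), names=["id", "value"], index_col=1)
print(g_data)
print(by_value)
```

Because names is given, pandas treats the first line of the file as data rather than as a header row.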
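9 pd.date_range(start, end, periods, freq) generates a sequence of timestamps
start and end are the first and last timestamps; periods is the number of points to generate; freq is the spacing between points: S seconds, T minutes, H hours, D days, M month end, Y year end
g_index = pd.date_range(start="1970/11/10 08:00:00", freq="Y", end="2019/11/10 10:00:00")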
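A short sketch of two common ways to call date_range; the dates are arbitrary. Of start, end, periods and freq, exactly three are supplied in each call. The example uses "D" because recent pandas releases have been renaming several of the other aliases (for example lowercase h, min and s), so the older spellings may emit deprecation warnings depending on the installed version.

```python
import pandas as pd

# start + periods + freq: three consecutive days
daily = pd.date_range(start="2019-11-10", periods=3, freq="D")
print(daily)

# start + end + periods: five evenly spaced points between the two timestamps
evenly = pd.date_range(
    start="2019-11-10 08:00:00", end="2019-11-10 10:00:00", periods=5
)
print(evenly)
```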

Merging tables

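pd.merge(data1, data2, left_on, right_on, on, left_index, right_index, how) (pass the options as keyword arguments)
left_on, right_on: the column to join on in the left and in the right table, when the names differ
on: use this when the join column has the same name in both tables; if omitted, pandas joins on the columns the two tables have in common
left_index, right_index: set to True when the join key of that table is its index
how: the join type; the default is inner (intersection of keys), outer gives the union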
stu_info = pd.read_csv("./file/merge_stu_name.csv")
stu_score = pd.read_csv("./file/merge_stu_score.csv")
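10 data3 = pd.merge(stu_info, stu_score, on='stu_id')
Merges the two tables on their shared column 'stu_id'
11 data3 = pd.merge(stu_info, stu_score, left_index=True, right_index=True, how="inner")
Merges the two tables on their row indexes, keeping only index values present in both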
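The merge options can be compared side by side. The two CSV files are not included in the post, so this sketch substitutes small made-up tables; the name and score columns and the sid rename are assumptions for illustration.

```python
import pandas as pd

# Hypothetical stand-ins for merge_stu_name.csv and merge_stu_score.csv
stu_info = pd.DataFrame({"stu_id": [1, 2, 3], "name": ["Ann", "Bob", "Cid"]})
stu_score = pd.DataFrame({"stu_id": [2, 3, 4], "score": [85, 92, 77]})

# Same column name on both sides: inner join keeps ids 2 and 3
inner = pd.merge(stu_info, stu_score, on="stu_id")

# Outer join keeps every id from either table, filling gaps with NaN
outer = pd.merge(stu_info, stu_score, on="stu_id", how="outer")

# Different column names: name each side explicitly
renamed = stu_score.rename(columns={"stu_id": "sid"})
left_right = pd.merge(stu_info, renamed, left_on="stu_id", right_on="sid")

# Join key stored in the index on both sides
by_index = pd.merge(
    stu_info.set_index("stu_id"),
    stu_score.set_index("stu_id"),
    left_index=True,
    right_index=True,
    how="inner",
)
print(inner, outer, left_right, by_index, sep="\n\n")
```

With left_on and right_on, both key columns (stu_id and sid) appear in the result.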

concat: stacking tables

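12 pd.concat((stu_info, stu_info2), sort=False, join='inner', axis=1)
join: how labels on the other axis are combined; unlike merge, the default here is outer (union), and inner keeps only the intersection
axis=0 (the default) stacks the tables vertically, adding rows; axis=1, as in this call, places them side by side, adding columns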
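The difference between the two axes and the two join modes shows up clearly on tiny tables. The frames below are invented; stu_info2 is not defined anywhere in the post.

```python
import pandas as pd

# Hypothetical tables whose row indexes only partly overlap
stu_info = pd.DataFrame({"name": ["Ann", "Bob"]}, index=[1, 2])
stu_info2 = pd.DataFrame({"age": [20, 21]}, index=[2, 3])

# Side by side, keeping only row labels present in both tables
side_inner = pd.concat((stu_info, stu_info2), axis=1, join="inner")

# Side by side with the default outer join: all row labels, NaN where missing
side_outer = pd.concat((stu_info, stu_info2), axis=1)

# Stacked vertically: rows are appended and the columns are unioned
stacked = pd.concat((stu_info, stu_info2), axis=0, sort=False)
print(side_inner, side_outer, stacked, sep="\n\n")
```

Vertical stacking keeps the original row labels, so label 2 appears twice in stacked; passing ignore_index=True would renumber the rows.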
