Getting Started with pandas (4): Basic Functionality

Annaaphq

Tags: pandas, python, data analysis

Article link: https://blog.csdn.net/Annaaphq/article/details/126124633

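This article introduces basic pandas operations, including reindexing, dropping entries from an axis, and indexing and filtering data. It focuses on the differences between loc and iloc and how to use them, as well as data alignment and filling missing values in arithmetic operations. It also covers operations between DataFrame and Series, and methods for sorting and ranking.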

Reindexing

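The reindex method creates a new object whose data conforms to the new index. If an index value is not currently present, a missing value (NaN) is introduced for it.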

import numpy as np
import pandas as pd

obj = pd.Series([4.5, 7.2, -5.3, 3.6], index=['d', 'b', 'a', 'c'])
obj
obj2 = obj.reindex(['a', 'b', 'c', 'd', 'e'])
obj2
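As a minimal sketch of the behavior described above, the following reuses the same Series and checks what reindex does with the label 'e', which the original index does not contain. The name obj_filled is introduced here only for illustration; fill_value is an optional reindex argument that replaces NaN with a value of your choice.

```python
import pandas as pd

obj = pd.Series([4.5, 7.2, -5.3, 3.6], index=['d', 'b', 'a', 'c'])

# Data is rearranged by label; 'e' is absent from obj, so it becomes NaN.
obj2 = obj.reindex(['a', 'b', 'c', 'd', 'e'])
print(obj2)

# reindex returns a new Series; the original index is untouched.
print(list(obj.index))

# fill_value substitutes a chosen value for missing labels instead of NaN.
obj_filled = obj.reindex(['a', 'b', 'c', 'd', 'e'], fill_value=0.0)
print(obj_filled['e'])
```

Whether NaN or an explicit fill value is more appropriate depends on how the missing entries should be treated in later calculations.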

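The method option of reindex: passing ffill forward-fills values from the preceding label, which is useful for ordered data such as time series.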

obj3 = pd.Series(['blue', 'purple', 'yellow'], index=[0, 2, 4])
obj3
obj3.reindex(range(6), method='ffill')
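To make the effect of forward filling concrete, here is a small sketch that compares the result with and without method='ffill'. The variable names filled and plain are illustrative only.

```python
import pandas as pd

obj3 = pd.Series(['blue', 'purple', 'yellow'], index=[0, 2, 4])

# Forward fill: each new label takes the value of the closest preceding label.
filled = obj3.reindex(range(6), method='ffill')
print(filled.tolist())

# Without method, the new labels 1, 3 and 5 are simply left missing.
plain = obj3.reindex(range(6))
print(plain.isna().tolist())
```

Note that method-based filling expects the original index to be sorted (monotonic), as it is here.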

With a DataFrame, reindex can alter the row index, the columns, or both.

When passed only a sequence, it reindexes the rows of the result.

frame = pd.DataFrame(np.arange(9).reshape((3, 3)),
                     index=['a', 'c', 'd'],
                     columns=['Ohio', 'Texas', 'California'])
frame
frame2 = frame.reindex(['a', 'b', 'c', 'd'])
frame2

Columns are reindexed with the columns keyword.

states = ['Texas', 'Utah', 'California']
frame.reindex(columns=states)
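Rows and columns can also be reindexed in a single call by passing both the index and columns keywords. The sketch below assumes the same frame as above; the name both is introduced only for this example.

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame(np.arange(9).reshape((3, 3)),
                     index=['a', 'c', 'd'],
                     columns=['Ohio', 'Texas', 'California'])
states = ['Texas', 'Utah', 'California']

# Reindex rows and columns together. Row 'b' and column 'Utah' do not
# exist in frame, so every cell they introduce is NaN.
both = frame.reindex(index=['a', 'b', 'c', 'd'], columns=states)
print(both)

# The 'Ohio' column is not in states, so it is absent from the result.
print(list(both.columns))
```

Because NaN is a floating-point value, the integer data is converted to float in the result.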

Dropping Entries from an Axis

The drop method returns a new object with the indicated value or values deleted from an axis.

obj = pd.Series(np.arange(5.), index=['a', 'b', 'c', 'd', 'e'])
obj
new_obj = obj.drop('c')
new_obj
obj.drop(['d', 'c'])
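A short sketch of two details that are easy to miss: drop leaves the original Series unchanged, and dropping a label that does not exist raises a KeyError unless errors='ignore' is passed. The label 'z' and the name same are chosen here purely for illustration.

```python
import numpy as np
import pandas as pd

obj = pd.Series(np.arange(5.), index=['a', 'b', 'c', 'd', 'e'])

# drop returns a new Series; obj still has all five entries.
new_obj = obj.drop('c')
print(list(new_obj.index))
print(len(obj))

# A label that is not in the index raises KeyError by default.
try:
    obj.drop('z')
    raised = False
except KeyError:
    raised = True
print(raised)

# errors='ignore' silently skips labels that are not present.
same = obj.drop('z', errors='ignore')
print(len(same))
```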

Dropping from a DataFrame:

data = pd.DataFrame(np.arange(16).reshape((4, 4)),
                     index=['Ohio', 'Colorado', 'Utah', 'New York'],
                     columns=['one', 'two', 'three', 'four'])
data
# Drop rows (axis 0 is the default)
data.drop(['Colorado', 'Ohio'])
# Drop columns by passing axis=1 or axis='columns'
data.drop('two', axis=1)
data.drop(['two', 'four'], axis='columns')
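As an alternative to the axis argument, drop also accepts the index and columns keywords, which some readers find easier to read. The sketch below also shows inplace=True, which modifies the object directly and returns None; it is applied to a copy (the hypothetical name scratch) so the original data stays intact.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame(np.arange(16).reshape((4, 4)),
                    index=['Ohio', 'Colorado', 'Utah', 'New York'],
                    columns=['one', 'two', 'three', 'four'])

# Keyword form: equivalent to data.drop([...]) and data.drop([...], axis=1).
rows_dropped = data.drop(index=['Colorado', 'Ohio'])
cols_dropped = data.drop(columns=['two', 'four'])
print(rows_dropped)
print(cols_dropped)

# inplace=True changes the object itself and returns None,
# so the dropped data is gone unless a copy was kept.
scratch = data.copy()
result = scratch.drop('two', axis=1, inplace=True)
print(result)
print(list(scratch.columns))
```

Use inplace with care, since it discards the dropped data from the object it is called on.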