Pandas DataFrame loc vs. iloc: Differences and Examples

weixin_52020016

Last modified 2024-01-13 11:02:18

Tags: pandas

First published 2024-01-13 10:52:51

Article link: https://blog.csdn.net/weixin_52020016/article/details/135567024

A pandas DataFrame can be indexed in two main ways: by implicit integer position, or by row and column labels.

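Implicit integer (positional) indexing: use the iloc indexer. Every row and column has a 0-based integer position regardless of its label, much like indexing a Python list. Separately, when no index is supplied, pandas assigns default row labels that are also integers starting at 0, so in that case labels and positions happen to coincide.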

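Row and column label indexing: use the loc indexer to select by row and column labels. Row labels default to integers but can be set to other values such as strings; column labels are the DataFrame's column names.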

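# iloc accepts any in-range integer position, whatever the label type.
# Negative positions count from the end; loc treats -1 as a label, not a position.
print(df.iloc[-1])  # negative position: selects the last row
# print(df.loc[-1])  # raises KeyError unless -1 is actually one of the row labels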
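
The behavior described above can be checked on small frames. This is a minimal sketch, assuming two hypothetical frames, `df_pos` and `df_neg`, that are not part of the original article: a negative number always works as a position with `iloc`, while `loc` accepts `-1` only when `-1` is really one of the row labels.

```python
import pandas as pd

# Hypothetical frame whose row labels are 10, 20, 30 (there is no label -1)
df_pos = pd.DataFrame({"A": [1, 2, 3]}, index=[10, 20, 30])

# iloc counts from the end, like a Python list
last_row = df_pos.iloc[-1]

# loc looks up the label -1, which does not exist in this frame
try:
    df_pos.loc[-1]
    missing_label = False
except KeyError:
    missing_label = True

# Hypothetical frame where -1 is a real row label
df_neg = pd.DataFrame({"A": [1, 2, 3]}, index=[-1, 0, 1])
by_label = df_neg.loc[-1]      # the row labeled -1, which is the first row
by_position = df_neg.iloc[-1]  # the last row

print(last_row["A"], missing_label, by_label["A"], by_position["A"])
```

So a negative number is not forbidden with `loc`; it is simply interpreted as a label like any other value.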

Selecting elements with `loc` and `iloc`

loc: selects data by label.

Select a single element by label:

# Select the element at row label 0, column label ColumnName
df.loc[0, 'ColumnName']

Select an entire row or column by label:

# Select all rows of the column labeled ColumnName
df.loc[:, 'ColumnName']

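Select multiple rows and columns by label:

# Select rows labeled 0 through 5 in columns ColumnName1 and ColumnName2.
# A loc slice includes the end label, so with the default index this is six rows, not five.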
df.loc[0:5, ['ColumnName1', 'ColumnName2']]
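
Slice endpoints are a common source of off-by-one errors here. The sketch below assumes a hypothetical ten-row frame with the default integer labels, and compares a label slice with a positional slice over the same bounds.

```python
import pandas as pd

# Hypothetical frame with the default integer row labels 0..9
df = pd.DataFrame({
    "ColumnName1": range(10),
    "ColumnName2": list("abcdefghij"),
})

# Label slice: both endpoints are included, so labels 0, 1, 2, 3, 4, 5
label_slice = df.loc[0:5, ["ColumnName1", "ColumnName2"]]

# Positional slice: the end is excluded, so positions 0, 1, 2, 3, 4
position_slice = df.iloc[0:5]

# To get the first five rows by label with the default index, stop at label 4
first_five_by_label = df.loc[0:4, ["ColumnName1", "ColumnName2"]]

print(len(label_slice))          # 6
print(len(position_slice))       # 5
print(len(first_five_by_label))  # 5
```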

iloc: selects data by integer position (starting at 0).
- Select a single element by integer position:
```
# Select the element in the first row, second column
df.iloc[0, 1]
```
- Select an entire row or column by integer position:
```
# Select all rows of the second column (position 1; the first column is position 0)
df.iloc[:, 1]
```
- Select multiple rows and columns by integer position:
```
# Select the first five rows of the first and third columns
df.iloc[0:5, [0, 2]]
```
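
The three `iloc` patterns above can be run end to end. This is a small sketch on a hypothetical six-row, three-column frame (the column names `A`, `B`, `C` and row labels `r1` to `r6` are made up for illustration); the string row labels make it clear that `iloc` ignores labels entirely.

```python
import pandas as pd

# Hypothetical frame with string row labels and three columns
df = pd.DataFrame(
    {
        "A": [1, 2, 3, 4, 5, 6],
        "B": list("abcdef"),
        "C": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6],
    },
    index=["r1", "r2", "r3", "r4", "r5", "r6"],
)

single = df.iloc[0, 1]        # first row, second column
second_col = df.iloc[:, 1]    # all rows of the second column
block = df.iloc[0:5, [0, 2]]  # first five rows, first and third columns

print(single)           # a
print(second_col.name)  # B
print(block.shape)      # (5, 2)
```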

The difference is that loc works with labels, while iloc works with integer positions. A more detailed example follows:

import pandas as pd

# Create a DataFrame with string row labels
data = {'A': [1, 2, 3, 4, 5],
        'B': ['a', 'b', 'c', 'd', 'e']}
df = pd.DataFrame(data, index=['one', 'two', 'three', 'four', 'five'])

# The bracketed numbers are the implicit integer positions; pandas does not print them
"""
>>> df
          [0] [1]
           A   B
[0] one    1   a
[1] two    2   b
[2] three  3   c
[3] four   4   d
[4] five   5   e
"""


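# Use loc to select by row and column labels
print("Using loc:")
# Select the element at row label 'one', column label 'A'
print(df.loc['one', 'A'])          # 1
# Select all rows of the column labeled 'B'
print(df.loc[:, 'B'])              # one      a
                                   # two      b
                                   # three    c
                                   # four     d
                                   # five     e
# Select the rows labeled 'one' and 'three', columns labeled 'A' and 'B'
print(df.loc[['one', 'three'], ['A', 'B']])
                                   #        A  B
                                   # one    1  a
                                   # three  3  c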

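# Use iloc to select by implicit integer position
print("\nUsing iloc:")
# Select the element in the first row, first column
print(df.iloc[0, 0])               # 1
# Select all rows of the second column
print(df.iloc[:, 1])               # one      a
                                   # two      b
                                   # three    c
                                   # four     d
                                   # five     e
# Select the first and third rows, first and second columns
print(df.iloc[[0, 2], [0, 1]])
                                   #        A  B
                                   # one    1  a
                                   # three  3  c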
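
The distinction matters most when row labels are integers that no longer match the row order, for example after sorting. The following is a minimal sketch, assuming a hypothetical `scores` frame that is not part of the example above: each row keeps its original label after `sort_values`, so `loc[0]` and `iloc[0]` return different rows.

```python
import pandas as pd

# Hypothetical frame with the default integer row labels 0, 1, 2
scores = pd.DataFrame({"name": ["x", "y", "z"], "score": [70, 95, 80]})

# Sorting reorders the rows, but each row keeps its original label
ranked = scores.sort_values("score", ascending=False)

top_by_position = ranked.iloc[0]  # first row after sorting (highest score)
row_labeled_zero = ranked.loc[0]  # the row whose label is 0, now in last place

print(list(ranked.index))                                 # [1, 2, 0]
print(top_by_position["name"], row_labeled_zero["name"])  # y x
```

Use `iloc` when you mean "the n-th row as currently ordered" and `loc` when you mean "the row with this label".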