Basic Usage of pandas

NJU_AI_NB

Last modified: 2024-02-20 14:02:16

First published: 2024-02-14 11:00:52

Article link: https://blog.csdn.net/aa12367/article/details/136111853

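This article covers the Series and DataFrame data structures in the pandas library: creating a Series, indexing, accessing data, slicing, filtering, and arithmetic, followed by ways to create a DataFrame, select and manipulate its data, and handle missing values.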

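pandas is an important Python library for data processing and analysis. It provides high-performance, easy-to-use data structures and data manipulation tools.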

# Import the pandas library
import pandas as pd

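I. pd.Series is a pandas data structure that resembles a one-dimensional array or list but offers additional functionality and flexibility. A Series can hold any data type (integers, floats, strings, and so on), and each value is associated with an index label, which can be an integer, a string, or another type.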

# Create a pandas Series object
pd.Series([1,2,31,12,34])

0     1
1     2
2    31
3    12
4    34
dtype: int64

# Create a pandas Series object and assign it to the variable t
t = pd.Series([1, 2, 31, 12, 34])

# Display the type of t
type(t)

pandas.core.series.Series

# Display the Series t
t

0     1
1     2
2    31
3    12
4    34
dtype: int64
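
To make the pairing between values and index labels concrete, here is a minimal sketch that inspects a Series holding the same data as `t`. The variable name `s` is chosen here for illustration and does not appear in the original notebook.

```python
import pandas as pd

# Same data as t above; the name s is illustrative
s = pd.Series([1, 2, 31, 12, 34])

# With no index given, pandas assigns a default integer RangeIndex
print(s.index)           # RangeIndex(start=0, stop=5, step=1)

# The underlying data can be exported as a NumPy array
print(s.to_numpy())

# Each value is paired with its index label
for label, value in s.items():
    print(label, value)
```

The left-hand column in the printed Series is this index, and the right-hand column holds the values.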

1. Custom index: a custom index can be specified when creating a Series

# Create a pandas Series object with an explicit index
t2 = pd.Series([1, 23, 2, 2, 1], index=list('abcde'))
# Display the Series t2
t2

a     1
b    23
c     2
d     2
e     1
dtype: int64

# Create a dictionary
temp_dict = {'name': 'xiaohong', 'age': '30', 'tel': '10086'}

# Create a pandas Series object from the dictionary; the keys become the index
t3 = pd.Series(temp_dict)

# Display the Series t3
t3

name    xiaohong
age           30
tel        10086
dtype: object

# Display the dtype of the Series t3
t3.dtype

dtype('O')

# Display the dtype of the Series t2
t2.dtype

dtype('int64')
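
The difference between the two dtypes follows from the data: every value in `temp_dict` is a Python string (note that the age is stored as `'30'`, not `30`), so under the pandas version used in this article `t3` is reported as `object`; newer pandas versions may report a dedicated string dtype instead. The sketch below illustrates this; the names `counts` and `age` are hypothetical and introduced only for the example.

```python
import pandas as pd

temp_dict = {'name': 'xiaohong', 'age': '30', 'tel': '10086'}
t3 = pd.Series(temp_dict)

# Every element is a Python str, which is why the Series is not numeric
print(all(isinstance(v, str) for v in t3))   # True

# A dict whose values are all integers yields an integer dtype instead
counts = pd.Series({'a': 1, 'b': 23})
print(counts.dtype)                          # int64

# Convert a single string element to a number when arithmetic is needed
age = int(t3['age'])
print(age + 1)                               # 31
```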

# Display the Series t2
t2

a     1
b    23
c     2
d     2
e     1
dtype: int64

# Convert the dtype of the Series t2 to float
t2.astype(float)

a     1.0
b    23.0
c     2.0
d     2.0
e     1.0
dtype: float64
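
One detail worth noting is that `astype` returns a new Series and leaves the original untouched. A short sketch, assuming the same `t2` as above (the name `converted` is illustrative):

```python
import pandas as pd

t2 = pd.Series([1, 23, 2, 2, 1], index=list('abcde'))

# astype returns a new Series; t2 itself keeps its original dtype
converted = t2.astype(float)
print(t2.dtype)          # int64
print(converted.dtype)   # float64

# Reassign the result to make the conversion stick
t2 = t2.astype(float)
print(t2.dtype)          # float64
```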

# Display the Series t3
t3

name    xiaohong
age           30
tel        10086
dtype: object

2. Label indexing: much like looking up a key in a Python dictionary, a Series supports accessing data by index label

# Access the element of t3 with the label 'age'
t3['age']

'30'

# Access the element of t3 with the label 'tel'
t3['tel']

'10086'
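
Label lookups can also be written with `.loc`, and missing labels can be handled without an exception. The following is a minimal sketch rather than part of the original notebook; the label `'email'` and the default `'unknown'` are made up for illustration.

```python
import pandas as pd

t3 = pd.Series({'name': 'xiaohong', 'age': '30', 'tel': '10086'})

# .loc makes label-based access explicit
print(t3.loc['age'])                 # 30

# Membership tests check the index labels, not the values
print('tel' in t3)                   # True
print('10086' in t3)                 # False

# .get returns a default instead of raising KeyError for a missing label
print(t3.get('email', 'unknown'))    # unknown
```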

3. Accessing data by position: values in a Series can also be accessed by integer position

# Access an element of t3 by integer position; this form is deprecated and will change in a future version, so t3.iloc[1] is recommended
t3[1]

C:\Users\86132\AppData\Local\Temp\ipykernel_26968\2027420829.py:1: FutureWarning: Series.__getitem__ treating keys as positions is deprecated. In a future version, integer keys will always be treated as labels (consistent with DataFrame behavior). To access a value by position, use `ser.iloc[pos]`
  t3[1]

'30'

# Access an element of t3 by integer position; this form is deprecated and will change in a future version, so t3.iloc[2] is recommended
t3[2]

C:\Users\86132\AppData\Local\Temp\ipykernel_26968\739707692.py:1: FutureWarning: Series.__getitem__ treating keys as positions is deprecated. In a future version, integer keys will always be treated as labels (consistent with DataFrame behavior). To access a value by position, use `ser.iloc[pos]`
  t3[2]

'10086'

# Access an element of t3 by integer position; this form is deprecated and will change in a future version, so t3.iloc[0] is recommended
t3[0]

C:\Users\86132\AppData\Local\Temp\ipykernel_26968\2017440695.py:1: FutureWarning: Series.__getitem__ treating keys as positions is deprecated. In a future version, integer keys will always be treated as labels (consistent with DataFrame behavior). To access a value by position, use `ser.iloc[pos]`
  t3[0]

'xiaohong'
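
As the FutureWarning suggests, positional access is better written with `.iloc`. A minimal sketch of the same lookups, assuming `t3` is built as above (the name `subset` is illustrative):

```python
import pandas as pd

t3 = pd.Series({'name': 'xiaohong', 'age': '30', 'tel': '10086'})

# .iloc always interprets integers as positions
print(t3.iloc[0])        # xiaohong
print(t3.iloc[1])        # 30
print(t3.iloc[-1])       # 10086

# A list of positions returns a Series that keeps the matching labels
subset = t3.iloc[[1, 2]]
print(list(subset.index))    # ['age', 'tel']
```

Using `.iloc` for positions and `[]` or `.loc` for labels keeps the two kinds of lookup unambiguous, which matters once a Series has an integer index.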

II. Series operations: a Series supports operations such as slicing, filtering, and arithmetic

1. Slicing

# Use a slice to access the first two elements of t3
t3[:2]

name    xiaohong
age           30
dtype: object
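
Slices can be made explicit as positional or label-based, and the two differ in how they treat the end point. The sketch below is an illustration under the same `t3` as above; the names `by_position` and `by_label` are hypothetical.

```python
import pandas as pd

t3 = pd.Series({'name': 'xiaohong', 'age': '30', 'tel': '10086'})

# Positional slice: the stop position is excluded, as with Python lists
by_position = t3.iloc[:2]
print(list(by_position.index))   # ['name', 'age']

# Label slice: both end points are included
by_label = t3.loc['name':'age']
print(list(by_label.index))      # ['name', 'age']

# A step works as usual
print(list(t3.iloc[::2].index))  # ['name', 'tel']
```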

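# Access elements of t3 with a list of integer positions; this form is deprecated and will change in a future version, so t3.iloc[[1, 2]] is recommended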
t3[[1,2]]

C:\Users\86132\AppData\Local\Temp\ipykernel_6328\3760560860.py:1: FutureWarning: Series.__getitem__ treating keys as positions is deprecated. In a future version, integer keys will always be treated as labels (consistent with DataFrame behavior). To access a value by position, use `