Introduction to pandas Data Structures: Constructing a Series and Its Attributes

Latest recommended article published on 2024-04-23 21:35:29

峡谷的小鱼

Categories: data analysis, pandas. Tags: pytorch, python, data analysis

Article link: https://blog.csdn.net/weixin_43276033/article/details/123947974

Series

A brief introduction to Series, the one-dimensional array data structure in pandas.

1. Constructing a Series

pandas defines Series, a one-dimensional labeled array object that can hold any data type. The data labels are also called the index:

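class pandas.Series(data=None, index=None, dtype=None, name=None, copy=False, fastpath=False):
	"""
	Parameters:
		data: array-like, iterable, dict, or scalar value
			The data stored in the Series. If data is a dict, the insertion order of its keys is preserved.
		index: array-like or Index (1d)
			Index values must be hashable, and the index must have the same length as data. Duplicate values are allowed. Defaults to 0, 1, 2, ..., n-1, where n is the length of data. If data is a dict and no index is given, the dict keys are used as the index.
		dtype: str, numpy.dtype, or ExtensionDtype, optional
			Data type of the output Series. If not given, it is inferred from data.
		name: str, optional
			Name of the Series.
		copy: bool, default False
			Whether to copy the input data. Only affects Series or 1d ndarray input.
	"""
	pass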
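
As a minimal sketch of how these parameters interact (the variable names and values below are illustrative and do not come from the original article), the following builds Series from a scalar and from lists while setting index, dtype, and name explicitly:

```python
import numpy as np
import pandas as pd

# A scalar is broadcast to every label in the index.
s_scalar = pd.Series(data=5, index=['a', 'b', 'c'])

# dtype overrides the type inferred from the data; name labels the Series.
s_named = pd.Series(data=[1, 2, 3], index=['x', 'y', 'z'],
                    dtype=np.float64, name='score')

# Index labels may repeat; selecting a repeated label returns every match.
s_dup = pd.Series(data=[10, 20, 30], index=['k', 'k', 'm'])

print(s_scalar.tolist())            # [5, 5, 5]
print(s_named.dtype, s_named.name)  # float64 score
print(s_dup['k'].tolist())          # [10, 20]
```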

Examples

import numpy as np
import pandas as pd

# Construct a Series from a dict
>>> d = {'a': 1, 'b': 2, 'c': 3}
>>> s1 = pd.Series(data=d, index=['a', 'b', 'c'])
>>> s1
a    1
b    2
c    3
dtype: int64

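# The index is first built from the dict keys, then the Series is reindexed with the given index values.
# Because none of the dict keys match the given index, every value is NaN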
>>> s2 = pd.Series(data=d, index=['x', 'y', 'z'])
>>> s2
x   NaN
y   NaN
z   NaN
dtype: float64
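
The reindexing step is easier to see when only some labels match. The sketch below is an illustrative variation on the example above (the index `['c', 'b', 'z']` is an assumption chosen for demonstration): matching keys keep their values in the requested order, unmatched labels become NaN, and dict keys missing from the index are dropped.

```python
import pandas as pd

d = {'a': 1, 'b': 2, 'c': 3}

# 'c' and 'b' exist in the dict; 'z' does not, so it becomes NaN.
# Key 'a' is not requested, so it does not appear in the result.
# The NaN forces the result to float64, since int64 cannot hold NaN.
s_partial = pd.Series(data=d, index=['c', 'b', 'z'])

print(s_partial)
```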

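# Construct from a list with copy=False
# copy only affects Series or ndarray input; a list is always copied, so l is not modified
>>> l = [1,2,3,4]
>>> s3 = pd.Series(data=l, copy=False)
>>> s3[2] = 9
>>> s3
0    1
1    2
2    9
3    4
dtype: int64
>>> l
[1, 2, 3, 4]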

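# Construct from an ndarray with copy=False
# In the output shown here the Series shares memory with r, so the assignment also changes r
>>> r = np.array([1, 2, 3, 4])
>>> s3 = pd.Series(r, copy=False)
>>> s3[1] = 999
>>> r
array([  1, 999,   3,   4])
>>> s3
0      1
1    999
2      3
3      4
# The integer dtype is platform-dependent: int32 here, int64 on most Linux and macOS systems
dtype: int32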
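
For contrast, here is a minimal sketch of the same steps with `copy=True` (the name `s_copy` is illustrative). It assumes only that `copy=True` makes pandas store its own copy of the array, which `np.shares_memory` can confirm:

```python
import numpy as np
import pandas as pd

r = np.array([1, 2, 3, 4])

# copy=True asks pandas to store its own copy of the array data.
s_copy = pd.Series(r, copy=True)
s_copy[1] = 999

# The Series does not share a buffer with r, so r keeps its values.
print(np.shares_memory(r, s_copy.to_numpy()))  # False
print(r.tolist())                              # [1, 2, 3, 4]
print(s_copy.tolist())                         # [1, 999, 3, 4]
```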

2. Attributes of a Series Object

>>> s = pd.Series(data=np.arange(7), dtype=np.float32, index=list('abcdefg'))
>>> s
a    0.0
b    1.0
c    2.0
d    3.0
e    4.0
f    5.0
g    6.0
dtype: float32


# Return the index of the Series
>>> s.index
Index(['a', 'b', 'c', 'd', 'e', 'f', 'g'], dtype='object')

# Return the ExtensionArray that wraps the data (displayed as <NumpyExtensionArray> in pandas 2.1 and later)
>>> s.array
<PandasArray>
[0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
Length: 7, dtype: float32

# Return the data as a NumPy ndarray
>>> s.values
array([0., 1., 2., 3., 4., 5., 6.], dtype=float32)

# Return the dtype of the underlying data
>>> s.dtype
dtype('float32')

# Return the shape of the underlying data as a tuple
>>> s.shape
(7,)

# Return the number of bytes in the underlying data (7 float32 elements x 4 bytes each)
>>> s.nbytes
28

# Return the number of dimensions of the underlying data, which is 1 by definition
>>> s.ndim
1

# Return the number of elements in the underlying data
>>> s.size
7

# Return the transpose, which for a Series is the Series itself
>>> s.T
a    0.0
b    1.0
c    2.0
d    3.0
e    4.0
f    5.0
g    6.0
dtype: float32

# Return True if the Series contains any NaN, otherwise False
>>> s.hasnans
False

# Check whether the Series is empty
>>> s.empty
False

# Return the dtype object of the underlying data (same as s.dtype)
>>> s.dtypes
dtype('float32')
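
The Series `s` above never triggers the True branch of `hasnans` or `empty`. The following is a small sketch with two illustrative Series (`s_nan` and `s_empty` are names introduced here, not from the original article) that shows those cases and checks how `nbytes` relates to `size` and the element width:

```python
import numpy as np
import pandas as pd

# A float32 Series containing one missing value.
s_nan = pd.Series([1.0, np.nan, 3.0], dtype=np.float32)

# A Series with no elements; dtype is given explicitly.
s_empty = pd.Series([], dtype=np.float32)

print(s_nan.hasnans)   # True
print(s_empty.empty)   # True
print(s_empty.size)    # 0

# nbytes equals the element count times the bytes per element.
print(s_nan.nbytes == s_nan.size * s_nan.dtype.itemsize)  # True
```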