Pandas 2.2 Documentation (Part 32)

Source: pandas.pydata.org/docs/

`pandas.Series`

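Source: pandas.pydata.org/docs/reference/api/pandas.Series.html

class pandas.Series(data=None, index=None, dtype=None, name=None, copy=None, fastpath=_NoDefault.no_default)

One-dimensional ndarray with axis labels (including time series).

Labels need not be unique but must be a hashable type. The object supports both integer- and label-based indexing and provides a host of methods for performing operations involving the index. Statistical methods from ndarray have been overridden to automatically exclude missing data (currently represented as NaN).

Operations between Series (+, -, /, *, **) align values based on their associated index values; the Series need not be the same length. The result index will be the sorted union of the two indexes.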
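The alignment rule can be seen in a small sketch. The two Series below are made-up illustration data, not taken from the pandas documentation; labels present in only one operand produce NaN unless a `fill_value` is supplied through the method form `add`.

```python
import pandas as pd

# Hypothetical data: the two Series share only the label "z".
a = pd.Series([1, 2, 3], index=["x", "y", "z"])
b = pd.Series([10, 20], index=["z", "w"])

# The operator aligns on labels; the result index is the sorted union.
total = a + b
print(total)

# The method form lets a missing side be treated as 0 instead of NaN.
filled = a.add(b, fill_value=0)
print(filled)
```

Only the label "z" exists in both operands, so it is the only non-missing entry in `total`.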

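Parameters:

data : array-like, Iterable, dict, or scalar value

Contains data stored in Series. If data is a dict, argument order is maintained.

index : array-like or Index (1d)

Values must be hashable and have the same length as data. Non-unique index values are allowed. Will default to RangeIndex (0, 1, 2, …, n) if not provided. If data is dict-like and index is None, then the keys in the data are used as the index. If the index is not None, the resulting Series is reindexed with the index values.

dtype : str, numpy.dtype, or ExtensionDtype, optional

Data type for the output Series. If not specified, this will be inferred from data. See the user guide for more usages.

name : Hashable, default None

The name to give to the Series.

copy : bool, default False

Copy input data. Only affects Series or 1d ndarray input. See examples.

Notes

Please reference the User Guide for more information.

Examples

Constructing Series from a dictionary with an Index specified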

>>> d = {'a': 1, 'b': 2, 'c': 3}
>>> ser = pd.Series(data=d, index=['a', 'b', 'c'])
>>> ser
a   1
b   2
c   3
dtype: int64

The keys of the dictionary match the Index values, hence the Index values have no effect.

>>> d = {'a': 1, 'b': 2, 'c': 3}
>>> ser = pd.Series(data=d, index=['x', 'y', 'z'])
>>> ser
x   NaN
y   NaN
z   NaN
dtype: float64

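Note that the Index is first built with the keys from the dictionary. After this the Series is reindexed with the given Index values, hence we get all NaN as a result.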

Constructing Series from a list with copy=False.

>>> r = [1, 2]
>>> ser = pd.Series(r, copy=False)
>>> ser.iloc[0] = 999
>>> r
[1, 2]
>>> ser
0    999
1      2
dtype: int64

Due to the input data type, the Series holds a copy of the original data even though copy=False, so the original list is unchanged.

Constructing Series from a 1d ndarray with copy=False.

>>> r = np.array([1, 2])
>>> ser = pd.Series(r, copy=False)
>>> ser.iloc[0] = 999
>>> r
array([999,   2])
>>> ser
0    999
1      2
dtype: int64

Due to the input data type, the Series has a view on the original data, so the original array is changed as well.

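Attributes

`T`	Return the transpose, which is by definition self.
`array`	The ExtensionArray of the data backing this Series or Index.
`at`	Access a single value for a row/column label pair.
`attrs`	Dictionary of global attributes of this dataset.
`axes`	Return a list of the row axis labels.
`dtype`	Return the dtype object of the underlying data.
`dtypes`	Return the dtype object of the underlying data.
`empty`	Indicator whether Series/DataFrame is empty.
`flags`	Get the properties associated with this pandas object.
`hasnans`	Return True if there are any NaNs.
`iat`	Access a single value for a row/column pair by integer position.
`iloc`	Purely integer-location based indexing for selection by position. (The deprecation marker in pandas 2.2 covers only returning a tuple from a callable, not `iloc` itself.)
`index`	The index (axis labels) of the Series.
`is_monotonic_decreasing`	Return boolean if values in the object are monotonically decreasing.
`is_monotonic_increasing`	Return boolean if values in the object are monotonically increasing.
`is_unique`	Return boolean if values in the object are unique.
`loc`	Access a group of rows and columns by label(s) or a boolean array.
`name`	Return the name of the Series.
`nbytes`	Return the number of bytes in the underlying data.
`ndim`	Number of dimensions of the underlying data, by definition 1.
`shape`	Return a tuple of the shape of the underlying data.
`size`	Return the number of elements in the underlying data.
`values`	Return Series as ndarray or ndarray-like depending on the dtype.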

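Methods

`abs`()	Return a Series/DataFrame with absolute numeric value of each element.
`add`(other[, level, fill_value, axis])	Return addition of series and other, element-wise (binary operator add).
`add_prefix`(prefix[, axis])	Prefix labels with string prefix.
`add_suffix`(suffix[, axis])	Suffix labels with string suffix.
`agg`([func, axis])	Aggregate using one or more operations over the specified axis.
`aggregate`([func, axis])	Aggregate using one or more operations over the specified axis.
`align`(other[, join, axis, level, copy, …])	Align two objects on their axes with the specified join method.
`all`([axis, bool_only, skipna])	Return whether all elements are True, potentially over an axis.
`any`(*[, axis, bool_only, skipna])	Return whether any element is True, potentially over an axis.
`apply`(func[, convert_dtype, args, by_row])	Invoke function on values of Series.
`argmax`([axis, skipna])	Return int position of the largest value in the Series.
`argmin`([axis, skipna])	Return int position of the smallest value in the Series.
`argsort`([axis, kind, order, stable])	Return the integer indices that would sort the Series values.
`asfreq`(freq[, method, how, normalize, …])	Convert time series to specified frequency.
`asof`(where[, subset])	Return the last row(s) without any NaNs before where.
`astype`(dtype[, copy, errors])	Cast a pandas object to a specified dtype `dtype`.
`at_time`(time[, asof, axis])	Select values at particular time of day (e.g., 9:30AM).
`autocorr`([lag])	Compute the lag-N autocorrelation.
`backfill`(*[, axis, inplace, limit, downcast])	(DEPRECATED) Fill NA/NaN values by using the next valid observation to fill the gap.
`between`(left, right[, inclusive])	Return boolean Series equivalent to left <= series <= right.
`between_time`(start_time, end_time[, …])	Select values between particular times of the day (e.g., 9:00-9:30 AM).
`bfill`(*[, axis, inplace, limit, limit_area, …])	Fill NA/NaN values by using the next valid observation to fill the gap.
`bool`()	(DEPRECATED) Return the bool of a single element Series or DataFrame.
`case_when`(caselist)	Replace values where the conditions are True.
`clip`([lower, upper, axis, inplace])	Trim values at input threshold(s).
`combine`(other, func[, fill_value])	Combine the Series with a Series or scalar according to func.
`combine_first`(other)	Update null elements with value in the same location in 'other'.
`compare`(other[, align_axis, keep_shape, …])	Compare to another Series and show the differences.
`convert_dtypes`([infer_objects, …])	Convert columns to the best possible dtypes using dtypes supporting `pd.NA`.
`copy`([deep])	Make a copy of this object's indices and data.
`corr`(other[, method, min_periods])	Compute correlation with other Series, excluding missing values.
`count`()	Return number of non-NA/null observations in the Series.
`cov`(other[, min_periods, ddof])	Compute covariance with Series, excluding missing values.
`cummax`([axis, skipna])	Return cumulative maximum over a DataFrame or Series axis.
`cummin`([axis, skipna])	Return cumulative minimum over a DataFrame or Series axis.
`cumprod`([axis, skipna])	Return cumulative product over a DataFrame or Series axis.
`cumsum`([axis, skipna])	Return cumulative sum over a DataFrame or Series axis.
`describe`([percentiles, include, exclude])	Generate descriptive statistics.
`diff`([periods])	First discrete difference of element.
`div`(other[, level, fill_value, axis])	Return floating division of series and other, element-wise (binary operator truediv).
`divide`(other[, level, fill_value, axis])	Return floating division of series and other, element-wise (binary operator truediv).
`divmod`(other[, level, fill_value, axis])	Return integer division and modulo of series and other, element-wise (binary operator divmod).
`dot`(other)	Compute the dot product between the Series and the columns of other.
`drop`([labels, axis, index, columns, level, …])	Return Series with specified index labels removed.
`drop_duplicates`(*[, keep, inplace, ignore_index])	Return Series with duplicate values removed.
`droplevel`(level[, axis])	Return Series/DataFrame with requested index / column level(s) removed.
`dropna`(*[, axis, inplace, how, ignore_index])	Return a new Series with missing values removed.
`duplicated`([keep])	Indicate duplicate Series values.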
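`eq`(other[, level, fill_value, axis])	Return equal to of series and other, element-wise (binary operator eq).
`equals`(other)	Test whether two objects contain the same elements.
`ewm`([com, span, halflife, alpha, …])	Provide exponentially weighted (EW) calculations.
`expanding`([min_periods, axis, method])	Provide expanding window calculations.
`explode`([ignore_index])	Transform each element of a list-like to a row.
`factorize`([sort, use_na_sentinel])	Encode the object as an enumerated type or categorical variable.
`ffill`(*[, axis, inplace, limit, limit_area, …])	Fill NA/NaN values by propagating the last valid observation to next valid.
`fillna`([value, method, axis, inplace, …])	Fill NA/NaN values using the specified method.
`filter`([items, like, regex, axis])	Subset the dataframe rows or columns according to the specified index labels.
`first`(offset)	(DEPRECATED) Select initial periods of time series data based on a date offset.
`first_valid_index`()	Return index for first non-NA value or None, if no non-NA value is found.
`floordiv`(other[, level, fill_value, axis])	Return integer division of series and other, element-wise (binary operator floordiv).
`ge`(other[, level, fill_value, axis])	Return greater than or equal to of series and other, element-wise (binary operator ge).
`get`(key[, default])	Get item from object for given key (ex: DataFrame column).
`groupby`([by, axis, level, as_index, sort, …])	Group Series using a mapper or by a Series of columns.
`gt`(other[, level, fill_value, axis])	Return greater than of series and other, element-wise (binary operator gt).
`head`([n])	Return the first n rows.
`hist`([by, ax, grid, xlabelsize, xrot, …])	Draw histogram of the input series using matplotlib.
`idxmax`([axis, skipna])	Return the row label of the maximum value.
`idxmin`([axis, skipna])	Return the row label of the minimum value.
`infer_objects`([copy])	Attempt to infer better dtypes for object columns.
`info`([verbose, buf, max_cols, memory_usage, …])	Print a concise summary of a Series.
`interpolate`([method, axis, limit, inplace, …])	Fill NaN values using an interpolation method.
`isin`(values)	Whether elements in Series are contained in values.
`isna`()	Detect missing values.
`isnull`()	Series.isnull is an alias for Series.isna.
`item`()	Return the first element of the underlying data as a Python scalar.
`items`()	Lazily iterate over (index, value) tuples.
`keys`()	Return alias for index.
`kurt`([axis, skipna, numeric_only])	Return unbiased kurtosis over requested axis.
`kurtosis`([axis, skipna, numeric_only])	Return unbiased kurtosis over requested axis.
`last`(offset)	(DEPRECATED) Select final periods of time series data based on a date offset.
`last_valid_index`()	Return index for last non-NA value or None, if no non-NA value is found.
`le`(other[, level, fill_value, axis])	Return less than or equal to of series and other, element-wise (binary operator le).
`lt`(other[, level, fill_value, axis])	Return less than of series and other, element-wise (binary operator lt).
`map`(arg[, na_action])	Map values of Series according to an input mapping or function.
`mask`(cond[, other, inplace, axis, level])	Replace values where the condition is True.
`max`([axis, skipna, numeric_only])	Return the maximum of the values over the requested axis.
`mean`([axis, skipna, numeric_only])	Return the mean of the values over the requested axis.
`median`([axis, skipna, numeric_only])	Return the median of the values over the requested axis.
`memory_usage`([index, deep])	Return the memory usage of the Series.
`min`([axis, skipna, numeric_only])	Return the minimum of the values over the requested axis.
`mod`(other[, level, fill_value, axis])	Return modulo of series and other, element-wise (binary operator mod).
`mode`([dropna])	Return the mode(s) of the Series.
`mul`(other[, level, fill_value, axis])	Return multiplication of series and other, element-wise (binary operator mul).
`multiply`(other[, level, fill_value, axis])	Return multiplication of series and other, element-wise (binary operator mul).
`ne`(other[, level, fill_value, axis])	Return not equal to of series and other, element-wise (binary operator ne).
`nlargest`([n, keep])	Return the largest n elements.
`notna`()	Detect existing (non-missing) values.
`notnull`()	Series.notnull is an alias for Series.notna.
`nsmallest`([n, keep])	Return the smallest n elements.
`nunique`([dropna])	Return number of unique elements in the object.
`pad`(*[, axis, inplace, limit, downcast])	(DEPRECATED) Fill NA/NaN values by propagating the last valid observation to next valid.
`pct_change`([periods, fill_method, limit, freq])	Fractional change between the current and a prior element.
`pipe`(func, *args, **kwargs)	Apply chainable functions that expect Series or DataFrames.
`pop`(item)	Return item and drop it from the series.
`pow`(other[, level, fill_value, axis])	Return exponential power of series and other, element-wise (binary operator pow).
`prod`([axis, skipna, numeric_only, min_count])	Return the product of the values over the requested axis.
`product`([axis, skipna, numeric_only, min_count])	Return the product of the values over the requested axis.
`quantile`([q, interpolation])	Return value at the given quantile.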
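`radd`(other[, level, fill_value, axis])	Return addition of series and other, element-wise (binary operator radd).
`rank`([axis, method, numeric_only, …])	Compute numerical data ranks (1 through n) along axis.
`ravel`([order])	(DEPRECATED) Return the flattened underlying data as an ndarray or ExtensionArray.
`rdiv`(other[, level, fill_value, axis])	Return floating division of series and other, element-wise (binary operator rtruediv).
`rdivmod`(other[, level, fill_value, axis])	Return integer division and modulo of series and other, element-wise (binary operator rdivmod).
`reindex`([index, axis, method, copy, level, …])	Conform Series to new index with optional filling logic.
`reindex_like`(other[, method, copy, limit, …])	Return an object with matching indices as other object.
`rename`([index, axis, copy, inplace, level, …])	Alter Series index labels or name.
`rename_axis`([mapper, index, axis, copy, inplace])	Set the name of the axis for the index or columns.
`reorder_levels`(order)	Rearrange index levels using input order.
`repeat`(repeats[, axis])	Repeat elements of a Series.
`replace`([to_replace, value, inplace, limit, …])	Replace values given in to_replace with value.
`resample`(rule[, axis, closed, label, …])	Resample time-series data.
`reset_index`([level, drop, name, inplace, …])	Generate a new DataFrame or Series with the index reset.
`rfloordiv`(other[, level, fill_value, axis])	Return integer division of series and other, element-wise (binary operator rfloordiv).
`rmod`(other[, level, fill_value, axis])	Return modulo of series and other, element-wise (binary operator rmod).
`rmul`(other[, level, fill_value, axis])	Return multiplication of series and other, element-wise (binary operator rmul).
`rolling`(window[, min_periods, center, …])	Provide rolling window calculations.
`round`([decimals])	Round each value in a Series to the given number of decimals.
`rpow`(other[, level, fill_value, axis])	Return exponential power of series and other, element-wise (binary operator rpow).
`rsub`(other[, level, fill_value, axis])	Return subtraction of series and other, element-wise (binary operator rsub).
`rtruediv`(other[, level, fill_value, axis])	Return floating division of series and other, element-wise (binary operator rtruediv).
`sample`([n, frac, replace, weights, …])	Return a random sample of items from an axis of object.
`searchsorted`(value[, side, sorter])	Find indices where elements should be inserted to maintain order.
`sem`([axis, skipna, ddof, numeric_only])	Return unbiased standard error of the mean over requested axis.
`set_axis`(labels, *[, axis, copy])	Assign desired index to given axis.
`set_flags`(*[, copy, allows_duplicate_labels])	Return a new object with updated flags.
`shift`([periods, freq, axis, fill_value, suffix])	Shift index by desired number of periods with an optional time freq.
`skew`([axis, skipna, numeric_only])	Return unbiased skew over requested axis.
`sort_index`(*[, axis, level, ascending, …])	Sort Series by index labels.
`sort_values`(*[, axis, ascending, inplace, …])	Sort by the values.
`squeeze`([axis])	Squeeze 1 dimensional axis objects into scalars.
`std`([axis, skipna, ddof, numeric_only])	Return sample standard deviation over requested axis.
`sub`(other[, level, fill_value, axis])	Return subtraction of series and other, element-wise (binary operator sub).
`subtract`(other[, level, fill_value, axis])	Return subtraction of series and other, element-wise (binary operator sub).
`sum`([axis, skipna, numeric_only, min_count])	Return the sum of the values over the requested axis.
`swapaxes`(axis1, axis2[, copy])	(DEPRECATED) Interchange axes and swap values axes appropriately.
`swaplevel`([i, j, copy])	Swap levels i and j in a `MultiIndex`.
`tail`([n])	Return the last n rows.
`take`(indices[, axis])	Return the elements in the given positional indices along an axis.
`to_clipboard`(*[, excel, sep])	Copy object to the system clipboard.
`to_csv`([path_or_buf, sep, na_rep, …])	Write object to a comma-separated values (csv) file.
`to_dict`(*[, into])	Convert Series to {label -> value} dict or dict-like object.
`to_excel`(excel_writer, *[, sheet_name, …])	Write object to an Excel sheet.
`to_frame`([name])	Convert Series to DataFrame.
`to_hdf`(path_or_buf, *, key[, mode, …])	Write the contained data to an HDF5 file using HDFStore.
`to_json`([path_or_buf, orient, date_format, …])	Convert the object to a JSON string.
`to_latex`([buf, columns, header, index, …])	Render object to a LaTeX tabular, longtable, or nested table.
`to_list`()	Return a list of the values.
`to_markdown`([buf, mode, index, storage_options])	Print Series in Markdown-friendly format.
`to_numpy`([dtype, copy, na_value])	A NumPy ndarray representing the values in this Series or Index.
`to_period`([freq, copy])	Convert Series from DatetimeIndex to PeriodIndex.
`to_pickle`(path, *[, compression, protocol, …])	Pickle (serialize) object to file.
`to_sql`(name, con, *[, schema, if_exists, …])	Write records stored in a DataFrame to a SQL database.
`to_string`([buf, na_rep, float_format, …])	Render a string representation of the Series.
`to_timestamp`([freq, how, copy])	Cast to DatetimeIndex of Timestamps, at beginning of period.
`to_xarray`()	Return an xarray object from the pandas object.
`tolist`()	Return a list of the values.
`transform`(func[, axis])	Call `func` on self producing a Series with the same axis shape as self.
`transpose`(*args, **kwargs)	Return the transpose, which is by definition self.
`truediv`(other[, level, fill_value, axis])	Return floating division of series and other, element-wise (binary operator truediv).
`truncate`([before, after, axis, copy])	Truncate a Series or DataFrame before and after some index value.
`tz_convert`(tz[, axis, level, copy])	Convert tz-aware axis to target time zone.
`tz_localize`(tz[, axis, level, copy, …])	Localize tz-naive index of a Series or DataFrame to target time zone.
`unique`()	Return unique values of Series object.
`unstack`([level, fill_value, sort])	Unstack, also known as pivot, Series with MultiIndex to produce DataFrame.
`update`(other)	Modify Series in place using values from passed Series.
`value_counts`([normalize, sort, ascending, …])	Return a Series containing counts of unique values.
`var`([axis, skipna, ddof, numeric_only])	Return unbiased variance over requested axis.
`view`([dtype])	(DEPRECATED) Create a new view of the Series.
`where`(cond[, other, inplace, axis, level])	Replace values where the condition is False.
`xs`(key[, axis, level, drop_level])	Return cross-section from the Series/DataFrame.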

`pandas.Series.index`

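Source: pandas.pydata.org/docs/reference/api/pandas.Series.index.html

Series.index

The index (axis labels) of the Series.

The index of a Series is used to label and identify each element of the underlying data. The index can be thought of as an immutable ordered set (technically a multi-set, as it may contain duplicate labels), and is used to index and align data in pandas.

Returns:

Index

The index labels of the Series.

See also

Series.reindex

Conform Series to new index.

Index

The base pandas index type.

Notes

For more information on pandas indexing, see the indexing user guide.

Examples

To create a Series with a custom index and view the index labels: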

>>> cities = ['Kolkata', 'Chicago', 'Toronto', 'Lisbon']
>>> populations = [14.85, 2.71, 2.93, 0.51]
>>> city_series = pd.Series(populations, index=cities)
>>> city_series.index
Index(['Kolkata', 'Chicago', 'Toronto', 'Lisbon'], dtype='object')

To change the index labels of an existing Series:

>>> city_series.index = ['KOL', 'CHI', 'TOR', 'LIS']
>>> city_series.index
Index(['KOL', 'CHI', 'TOR', 'LIS'], dtype='object')
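
The description of the index as an immutable multi-set can be checked with a short sketch. The data below is hypothetical: one label is deliberately repeated, and an in-place write to the index is attempted only to show that it is rejected.

```python
import pandas as pd

# Hypothetical data: the label "a" appears twice.
s = pd.Series([1, 2, 3], index=["a", "a", "b"])

dupes = s.loc["a"]    # a repeated label returns every match as a Series
single = s.loc["b"]   # a unique label returns a scalar
unique = s.index.is_unique

# Individual labels cannot be overwritten; replace the whole index instead.
try:
    s.index[0] = "z"
    mutated = True
except TypeError:
    mutated = False

print(len(dupes), single, unique, mutated)
```

Assigning a complete new list to `s.index`, as in the example above, remains the supported way to relabel a Series.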

`pandas.Series.array`

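Source: pandas.pydata.org/docs/reference/api/pandas.Series.array.html

property Series.array

The ExtensionArray of the data backing this Series or Index.

Returns:

ExtensionArray

An ExtensionArray of the values stored within. For extension types, this is the actual array. For NumPy native types, this is a thin (no copy) wrapper around numpy.ndarray.

.array differs from .values, which may require converting the data to a different form.

See also

Index.to_numpy

Similar method that always returns a NumPy array.

Series.to_numpy

Similar method that always returns a NumPy array.

Notes

This table lays out the different array types for each extension dtype within pandas.

dtype	array type
category	Categorical
period	PeriodArray
interval	IntervalArray
IntegerNA	IntegerArray
string	StringArray
boolean	BooleanArray
datetime64[ns, tz]	DatetimeArray

For any 3rd-party extension types, the array type will be an ExtensionArray.

For all remaining dtypes, .array will be an arrays.NumpyExtensionArray wrapping the actual ndarray stored within. If you absolutely need a NumPy array (possibly with copying / coercing data), then use Series.to_numpy() instead.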
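
As a rough illustration of one row of the table, the sketch below uses a period-dtype Series (the dates are arbitrary) and contrasts the backing array returned by `.array` with the converted result of `to_numpy()`.

```python
import numpy as np
import pandas as pd

# Arbitrary monthly periods, used only to obtain a period-dtype Series.
ser = pd.Series(pd.period_range("2023-01", periods=3, freq="M"))

backing = ser.array        # the PeriodArray that actually stores the data
as_numpy = ser.to_numpy()  # converted copy: object ndarray of Period scalars

print(type(backing).__name__)
print(as_numpy.dtype)
```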

Examples

For regular NumPy types like int and float, a NumpyExtensionArray is returned.

>>> pd.Series([1, 2, 3]).array
<NumpyExtensionArray>
[1, 2, 3]
Length: 3, dtype: int64

For extension types, like Categorical, the actual ExtensionArray is returned.

>>> ser = pd.Series(pd.Categorical(['a', 'b', 'a']))
>>> ser.array
['a', 'b', 'a']
Categories (2, object): ['a', 'b']

`pandas.Series.values`

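Source: pandas.pydata.org/docs/reference/api/pandas.Series.values.html

property Series.values

Return Series as ndarray or ndarray-like depending on the dtype.

Warning

We recommend using Series.array or Series.to_numpy(), depending on whether you need a reference to the underlying data or a NumPy array.

Returns:

numpy.ndarray or ndarray-like

See also

Series.array

Reference to the underlying data.

Series.to_numpy

A NumPy array representing the underlying data.

Examples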

>>> pd.Series([1, 2, 3]).values
array([1, 2, 3])

>>> pd.Series(list('aabc')).values
array(['a', 'a', 'b', 'c'], dtype=object)

>>> pd.Series(list('aabc')).astype('category').values
['a', 'a', 'b', 'c']
Categories (3, object): ['a', 'b', 'c']

Timezone-aware datetime data is converted to UTC:

>>> pd.Series(pd.date_range('20130101', periods=3,
...                         tz='US/Eastern')).values
array(['2013-01-01T05:00:00.000000000',
       '2013-01-02T05:00:00.000000000',
       '2013-01-03T05:00:00.000000000'], dtype='datetime64[ns]')

`pandas.Series.dtype`

Source: pandas.pydata.org/docs/reference/api/pandas.Series.dtype.html

property Series.dtype

Return the dtype object of the underlying data.

Examples

>>> s = pd.Series([1, 2, 3])
>>> s.dtype
dtype('int64')

`pandas.Series.shape`

Source: pandas.pydata.org/docs/reference/api/pandas.Series.shape.html

property Series.shape

Return a tuple of the shape of the underlying data.

Examples

>>> s = pd.Series([1, 2, 3])
>>> s.shape
(3,)

`pandas.Series.nbytes`

Source: pandas.pydata.org/docs/reference/api/pandas.Series.nbytes.html

property Series.nbytes

Return the number of bytes in the underlying data.

Examples

For Series:

>>> s = pd.Series(['Ant', 'Bear', 'Cow'])
>>> s
0     Ant
1    Bear
2     Cow
dtype: object
>>> s.nbytes
24

For Index:

>>> idx = pd.Index([1, 2, 3])
>>> idx
Index([1, 2, 3], dtype='int64')
>>> idx.nbytes
24

`pandas.Series.ndim`

Source: pandas.pydata.org/docs/reference/api/pandas.Series.ndim.html

property Series.ndim

Number of dimensions of the underlying data, by definition 1.

Examples

>>> s = pd.Series(['Ant', 'Bear', 'Cow'])
>>> s
0     Ant
1    Bear
2     Cow
dtype: object
>>> s.ndim
1

For Index:

>>> idx = pd.Index([1, 2, 3])
>>> idx
Index([1, 2, 3], dtype='int64')
>>> idx.ndim
1

`pandas.Series.size`

Source: pandas.pydata.org/docs/reference/api/pandas.Series.size.html

property Series.size

Return the number of elements in the underlying data.

Examples

For Series:

>>> s = pd.Series(['Ant', 'Bear', 'Cow'])
>>> s
0     Ant
1    Bear
2     Cow
dtype: object
>>> s.size
3

For Index:

>>> idx = pd.Index([1, 2, 3])
>>> idx
Index([1, 2, 3], dtype='int64')
>>> idx.size
3

`pandas.Series.T`

Source: pandas.pydata.org/docs/reference/api/pandas.Series.T.html

property Series.T

Return the transpose, which is by definition self.

Examples

For Series:

>>> s = pd.Series(['Ant', 'Bear', 'Cow'])
>>> s
0     Ant
1    Bear
2     Cow
dtype: object
>>> s.T
0     Ant
1    Bear
2     Cow
dtype: object

For Index:

>>> idx = pd.Index([1, 2, 3])
>>> idx.T
Index([1, 2, 3], dtype='int64')

`pandas.Series.memory_usage`

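Source: pandas.pydata.org/docs/reference/api/pandas.Series.memory_usage.html

Series.memory_usage(index=True, deep=False)

Return the memory usage of the Series.

The memory usage can optionally include the contribution of the index and of elements of object dtype.

Parameters:

index : bool, default True

Specifies whether to include the memory usage of the Series index.

deep : bool, default False

If True, introspect the data deeply by interrogating object dtypes for system-level memory consumption, and include it in the returned value.

Returns:

int

Bytes of memory consumed.

See also

numpy.ndarray.nbytes

Total bytes consumed by the elements of the array.

DataFrame.memory_usage

Bytes consumed by a DataFrame.

Examples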

>>> s = pd.Series(range(3))
>>> s.memory_usage()
152

Not including the index gives the size of the rest of the data, which is necessarily smaller:

>>> s.memory_usage(index=False)
24

The memory footprint of object values is ignored by default:

>>> s = pd.Series(["a", "b"])
>>> s.values
array(['a', 'b'], dtype=object)
>>> s.memory_usage()
144
>>> s.memory_usage(deep=True)
244
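
A small sketch of how the two components add up, assuming a plain int64 Series with the default RangeIndex: the data part equals `Series.nbytes`, and the total is the data part plus the index's own `memory_usage()`. The exact index size varies between pandas versions, so it is computed rather than hard-coded.

```python
import pandas as pd

s = pd.Series(range(3))  # three int64 values, default RangeIndex

data_only = s.memory_usage(index=False)  # bytes used by the values
with_index = s.memory_usage()            # values plus index
index_bytes = s.index.memory_usage()     # bytes attributed to the index

print(data_only, index_bytes, with_index)
```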

`pandas.Series.hasnans`

Source: pandas.pydata.org/docs/reference/api/pandas.Series.hasnans.html

property Series.hasnans

Return True if there are any NaNs.

Enables various performance speedups.

Returns:

bool

Examples

>>> s = pd.Series([1, 2, 3, None])
>>> s
0    1.0
1    2.0
2    3.0
3    NaN
dtype: float64
>>> s.hasnans
True

`pandas.Series.empty`

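Source: pandas.pydata.org/docs/reference/api/pandas.Series.empty.html

property Series.empty

Indicator whether Series/DataFrame is empty.

True if Series/DataFrame is entirely empty (no items), meaning any of the axes are of length 0.

Returns:

bool

If Series/DataFrame is empty, return True; otherwise return False.

See also

Series.dropna

Return Series without null values.

DataFrame.dropna

Return DataFrame with labels on given axis omitted where (all or any) data are missing.

Notes

If Series/DataFrame contains only NaNs, it is still not considered empty. See the example below.

Examples

An example of an actual empty DataFrame. Notice the index is empty: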

>>> df_empty = pd.DataFrame({'A' : []})
>>> df_empty
Empty DataFrame
Columns: [A]
Index: []
>>> df_empty.empty
True

If we only have NaNs in our DataFrame, it is not considered empty! We will need to drop the NaNs to make the DataFrame empty:

>>> df = pd.DataFrame({'A' : [np.nan]})
>>> df
    A
0 NaN
>>> df.empty
False
>>> df.dropna().empty
True

>>> ser_empty = pd.Series({'A' : []})
>>> ser_empty
A    []
dtype: object
>>> ser_empty.empty
False
>>> ser_empty = pd.Series()
>>> ser_empty.empty
True

`pandas.Series.dtypes`

Source: pandas.pydata.org/docs/reference/api/pandas.Series.dtypes.html

property Series.dtypes

Return the dtype object of the underlying data.

Examples

>>> s = pd.Series([1, 2, 3])
>>> s.dtypes
dtype('int64')

`pandas.Series.name`

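Source: pandas.pydata.org/docs/reference/api/pandas.Series.name.html

property Series.name

Return the name of the Series.

The name of a Series becomes its index or column name if it is used to form a DataFrame. It is also used whenever displaying the Series using the interpreter.

Returns:

label (hashable object)

The name of the Series, also the column name if part of a DataFrame.

See also

Series.rename

Sets the Series name when given a scalar input.

Index.name

Corresponding Index property.

Examples

The Series name can be set initially when calling the constructor.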

>>> s = pd.Series([1, 2, 3], dtype=np.int64, name='Numbers')
>>> s
0    1
1    2
2    3
Name: Numbers, dtype: int64
>>> s.name = "Integers"
>>> s
0    1
1    2
2    3
Name: Integers, dtype: int64

The name of a Series within a DataFrame is its column name.

>>> df = pd.DataFrame([[1, 2], [3, 4], [5, 6]],
...                   columns=["Odd Numbers", "Even Numbers"])
>>> df
   Odd Numbers  Even Numbers
0            1             2
1            3             4
2            5             6
>>> df["Even Numbers"].name
'Even Numbers'
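
The reverse direction can be sketched as well: a named Series turned into a DataFrame carries its name over as the column label, and `Series.rename` with a scalar returns a renamed Series without touching the original. The values are illustrative only.

```python
import pandas as pd

s = pd.Series([1, 2, 3], name="Numbers")

frame = s.to_frame()            # the Series name becomes the column label
renamed = s.rename("Integers")  # scalar input sets the name on a new Series

print(list(frame.columns), renamed.name, s.name)
```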

`pandas.Series.flags`

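Source: pandas.pydata.org/docs/reference/api/pandas.Series.flags.html

property Series.flags

Get the properties associated with this pandas object.

The available flags are

Flags.allows_duplicate_labels

See also

Flags

Flags that apply to pandas objects.

DataFrame.attrs

Global metadata applying to this dataset.

Notes

"Flags" differ from "metadata". Flags reflect properties of the pandas object (the Series or DataFrame). Metadata refer to properties of the dataset, and should be stored in DataFrame.attrs.

Examples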

>>> df = pd.DataFrame({"A": [1, 2]})
>>> df.flags
<Flags(allows_duplicate_labels=True)>

Flags can be read or set using attribute access (`.`):

>>> df.flags.allows_duplicate_labels
True
>>> df.flags.allows_duplicate_labels = False

Or by indexing with a key:

>>> df.flags["allows_duplicate_labels"]
False
>>> df.flags["allows_duplicate_labels"] = True
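
What the flag enforces can be shown with a brief sketch. The Series below is hypothetical and deliberately has a repeated label; disallowing duplicates on it is expected to raise `pandas.errors.DuplicateLabelError`.

```python
import pandas as pd

# Hypothetical data: the label "b" is duplicated on purpose.
s = pd.Series([0, 1, 2], index=["a", "b", "b"])

try:
    s.set_flags(allows_duplicate_labels=False)
    raised = False
except pd.errors.DuplicateLabelError:
    # The flag is validated as soon as it is set to False.
    raised = True

print(raised)
```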

`pandas.Series.set_flags`

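Source: pandas.pydata.org/docs/reference/api/pandas.Series.set_flags.html

Series.set_flags(*, copy=False, allows_duplicate_labels=None)

Return a new object with updated flags.

Parameters:

copy : bool, default False

Specify if a copy of the object should be made.

Note

The copy keyword will change behavior in pandas 3.0. Copy-on-Write will be enabled by default, which means that all methods with a copy keyword will use a lazy copy mechanism to defer the copy and ignore the copy keyword. The copy keyword will be removed in a future version of pandas.

You can already get the future behavior and improvements by enabling Copy-on-Write: pd.options.mode.copy_on_write = True

allows_duplicate_labels : bool, optional

Whether the returned object allows duplicate labels.

Returns:

Series or DataFrame

The same type as the caller.

See also

DataFrame.attrs

Global metadata applying to this dataset.

DataFrame.flags

Global flags applying to this object.

Notes

This method returns a new object that is a view on the same data as the input. Mutating the input or the output values will be reflected in the other.

This method is intended to be used in method chains.

"Flags" differ from "metadata". Flags reflect properties of the pandas object (the Series or DataFrame). Metadata refer to properties of the dataset, and should be stored in DataFrame.attrs.

Examples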

>>> df = pd.DataFrame({"A": [1, 2]})
>>> df.flags.allows_duplicate_labels
True
>>> df2 = df.set_flags(allows_duplicate_labels=False)
>>> df2.flags.allows_duplicate_labels
False

`pandas.Series.astype`

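Source: pandas.pydata.org/docs/reference/api/pandas.Series.astype.html

Series.astype(dtype, copy=None, errors='raise')

Cast a pandas object to a specified dtype dtype.

Parameters:

dtype : str, data type, Series or Mapping of column name -> data type

Use a str, numpy.dtype, pandas.ExtensionDtype or Python type to cast the entire pandas object to the same type. Alternatively, use a mapping, e.g. {col: dtype, …}, where col is a column label and dtype is a numpy.dtype or Python type, to cast one or more of the DataFrame's columns to column-specific types.

copy : bool, default True

Return a copy when copy=True (be very careful setting copy=False, as changes to values may then propagate to other pandas objects).

Note

The copy keyword will change behavior in pandas 3.0. Copy-on-Write will be enabled by default, which means that all methods with a copy keyword will use a lazy copy mechanism to defer the copy and ignore the copy keyword. The copy keyword will be removed in a future version of pandas.

You can already get the future behavior and improvements by enabling Copy-on-Write: pd.options.mode.copy_on_write = True

errors : {'raise', 'ignore'}, default 'raise'

Control raising of exceptions on invalid data for the provided dtype.

raise: allow exceptions to be raised
ignore: suppress exceptions. On error return the original object.

Returns:

same type as caller

See also

to_datetime

Convert argument to datetime.

to_timedelta

Convert argument to timedelta.

to_numeric

Convert argument to a numeric type.

numpy.ndarray.astype

Cast a numpy array to a specified type.

Notes

Changed in version 2.0.0: Using astype to convert from a timezone-naive dtype to a timezone-aware dtype will raise an exception. Use Series.dt.tz_localize() instead.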
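
The replacement recommended in the note can be sketched as follows; the dates are arbitrary and UTC is chosen only for illustration.

```python
import pandas as pd

# Timezone-naive datetimes (arbitrary dates).
naive = pd.Series(pd.date_range("2020-01-01", periods=2))

# Attach a timezone with the .dt accessor rather than astype.
aware = naive.dt.tz_localize("UTC")

print(naive.dtype)
print(aware.dtype)
```

`tz_localize` interprets the existing wall-clock values as being in the given zone; it does not shift them.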

Examples

Create a DataFrame:

>>> d = {'col1': [1, 2], 'col2': [3, 4]}
>>> df = pd.DataFrame(data=d)
>>> df.dtypes
col1    int64
col2    int64
dtype: object

Cast all columns to int32:

>>> df.astype('int32').dtypes
col1    int32
col2    int32
dtype: object

Cast col1 to int32 using a dictionary:

>>> df.astype({'col1': 'int32'}).dtypes
col1    int32
col2    int64
dtype: object

Create a Series:

>>> ser = pd.Series([1, 2], dtype='int32')
>>> ser
0    1
1    2
dtype: int32
>>> ser.astype('int64')
0    1
1    2
dtype: int64

Convert to categorical type:

>>> ser.astype('category')
0    1
1    2
dtype: category
Categories (2, int32): [1, 2]

Convert to ordered categorical type with custom ordering:

>>> from pandas.api.types import CategoricalDtype
>>> cat_dtype = CategoricalDtype(
...     categories=[2, 1], ordered=True)
>>> ser.astype(cat_dtype)
0    1
1    2
dtype: category
Categories (2, int64): [2 < 1]

Create a Series of dates:

>>> ser_date = pd.Series(pd.date_range('20200101', periods=3))
>>> ser_date
0   2020-01-01
1   2020-01-02
2   2020-01-03
dtype: datetime64[ns]

`pandas.Series.convert_dtypes`

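Source: pandas.pydata.org/docs/reference/api/pandas.Series.convert_dtypes.html

Series.convert_dtypes(infer_objects=True, convert_string=True, convert_integer=True, convert_boolean=True, convert_floating=True, dtype_backend='numpy_nullable')

Convert columns to the best possible dtypes using dtypes supporting pd.NA.

Parameters:

infer_objects : bool, default True

Whether object dtypes should be converted to the best possible types.

convert_string : bool, default True

Whether object dtypes should be converted to StringDtype().

convert_integer : bool, default True

Whether, if possible, conversion can be done to integer extension types.

convert_boolean : bool, default True

Whether object dtypes should be converted to BooleanDtype().

convert_floating : bool, default True

Whether, if possible, conversion can be done to floating extension types. If convert_integer is also True, preference will be given to integer dtypes if the floats can be faithfully cast to integers.

dtype_backend : {'numpy_nullable', 'pyarrow'}, default 'numpy_nullable'

Back-end data type applied to the resultant DataFrame (still experimental). Behaviour is as follows:

"numpy_nullable": returns a nullable-dtype-backed DataFrame (default).
"pyarrow": returns a pyarrow-backed nullable ArrowDtype DataFrame.

New in version 2.0.

Returns:

Series or DataFrame

Copy of input object with new dtype.

See also

infer_objects

Infer dtypes of objects.

to_datetime

Convert argument to datetime.

to_timedelta

Convert argument to timedelta.

to_numeric

Convert argument to a numeric type.

Notes

By default, convert_dtypes will attempt to convert a Series (or each Series in a DataFrame) to dtypes that support pd.NA. By using the options convert_string, convert_integer, convert_boolean and convert_floating, it is possible to turn off individual conversions to StringDtype, the integer extension types, BooleanDtype or floating extension types, respectively.

For object-dtyped columns, if infer_objects is True, use the same inference rules as during normal Series/DataFrame construction. Then, if possible, convert to StringDtype, BooleanDtype or an appropriate integer or floating extension type; otherwise leave as object.

If the dtype is integer, convert to an appropriate integer extension type.

If the dtype is numeric and consists of all integers, convert to an appropriate integer extension type. Otherwise, convert to an appropriate floating extension type.

In the future, as new dtypes are added that support pd.NA, the results of this method will change to support those new dtypes.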
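
The effect of switching off a single conversion can be illustrated with a minimal sketch. The float Series below is hypothetical; it holds only whole numbers plus one missing value, so the default call prefers an integer extension dtype, while `convert_integer=False` leaves it as a floating extension dtype.

```python
import numpy as np
import pandas as pd

# Hypothetical data: whole-number floats with one missing value.
s = pd.Series([1.0, 2.0, np.nan])

default_result = s.convert_dtypes()
no_integer = s.convert_dtypes(convert_integer=False)

print(default_result.dtype)
print(no_integer.dtype)
```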

Examples

>>> df = pd.DataFrame(
...     {
...         "a": pd.Series([1, 2, 3], dtype=np.dtype("int32")),
...         "b": pd.Series(["x", "y", "z"], dtype=np.dtype("O")),
...         "c": pd.Series([True, False, np.nan], dtype=np.dtype("O")),
...         "d": pd.Series(["h", "i", np.nan], dtype=np.dtype("O")),
...         "e": pd.Series([10, np.nan, 20], dtype=np.dtype("float")),
...         "f": pd.Series([np.nan, 100.5, 200], dtype=np.dtype("float")),
...     }
... )

Start with a DataFrame with default dtypes.

>>> df
   a  b      c    d     e      f
0  1  x   True    h  10.0    NaN
1  2  y  False    i   NaN  100.5
2  3  z    NaN  NaN  20.0  200.0

>>> df.dtypes
a      int32
b     object
c     object
d     object
e    float64
f    float64
dtype: object

Convert the DataFrame to use the best possible dtypes.

>>> dfn = df.convert_dtypes()
>>> dfn
   a  b      c     d     e      f
0  1  x   True     h    10   <NA>
1  2  y  False     i  <NA>  100.5
2  3  z   <NA>  <NA>    20  200.0

>>> dfn.dtypes
a             Int32
b    string[python]
c           boolean
d    string[python]
e             Int64
f           Float64
dtype: object

Start with a Series of strings and missing data represented by np.nan.

>>> s = pd.Series(["a", "b", np.nan])
>>> s
0      a
1      b
2    NaN
dtype: object

Obtain a Series with dtype StringDtype.

>>> s.convert_dtypes()
0       a
1       b
2    <NA>
dtype: string

`pandas.Series.infer_objects`

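Source: pandas.pydata.org/docs/reference/api/pandas.Series.infer_objects.html

Series.infer_objects(copy=None)

Attempt to infer better dtypes for object columns.

Attempts soft conversion of object-dtyped columns, leaving non-object and unconvertible columns unchanged. The inference rules are the same as during normal Series/DataFrame construction.

Parameters:

copy : bool, default True

Whether to make a copy for non-object or non-inferable columns or Series.

Note

The copy keyword will change behavior in pandas 3.0. Copy-on-Write will be enabled by default, which means that all methods with a copy keyword will use a lazy copy mechanism to defer the copy and ignore the copy keyword. The copy keyword will be removed in a future version of pandas.

You can already get the future behavior and improvements by enabling Copy-on-Write: pd.options.mode.copy_on_write = True

Returns:

same type as input object

See also

to_datetime

Convert argument to datetime.

to_timedelta

Convert argument to timedelta.

to_numeric

Convert argument to a numeric type.

convert_dtypes

Convert argument to best possible dtype.

Examples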

>>> df = pd.DataFrame({"A": ["a", 1, 2, 3]})
>>> df = df.iloc[1:]
>>> df
   A
1  1
2  2
3  3

>>> df.dtypes
A    object
dtype: object

>>> df.infer_objects().dtypes
A    int64
dtype: object

`pandas.Series.copy`

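Source: pandas.pydata.org/docs/reference/api/pandas.Series.copy.html

Series.copy(deep=True)

Make a copy of this object's indices and data.

When deep=True (default), a new object will be created with a copy of the calling object's data and indices. Modifications to the data or indices of the copy will not be reflected in the original object (see notes below).

When deep=False, a new object will be created without copying the calling object's data or index (only references to the data and index are copied). Any changes to the data of the original will be reflected in the shallow copy (and vice versa).

Note

The deep=False behaviour described above will change in pandas 3.0. Copy-on-Write will be enabled by default, which means that the "shallow" copy returned with deep=False will still avoid making an eager copy, but changes to the data of the original will no longer be reflected in the shallow copy (or vice versa). Instead, it makes use of a lazy (deferred) copy mechanism that copies the data only when a change is made to the original or to the shallow copy.

You can already get the future behavior and improvements by enabling Copy-on-Write: pd.options.mode.copy_on_write = True

Parameters:

deep : bool, default True

Make a deep copy, including a copy of the data and the indices. With deep=False neither the indices nor the data are copied.

Returns:

Series or DataFrame

Object type matches caller.

Notes

When deep=True, data is copied but actual Python objects will not be copied recursively, only the reference to the object. This is in contrast to copy.deepcopy in the Standard Library, which recursively copies object data (see examples below).

While Index objects are copied when deep=True, the underlying numpy array is not copied for performance reasons. Since Index is immutable, the underlying data can be safely shared and a copy is not needed.

Since pandas is not thread safe, see the gotchas when copying in a threading environment.

When copy_on_write in the pandas config is set to True, the copy_on_write config takes effect even when deep=False. This means that any changes to the copied data would make a new copy of the data upon write (and vice versa). Changes made to either the original or the copied variable would not be reflected in the counterpart. See Copy-on-Write for more information.

Examples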

>>> s = pd.Series([1, 2], index=["a", "b"])
>>> s
a    1
b    2
dtype: int64

>>> s_copy = s.copy()
>>> s_copy
a    1
b    2
dtype: int64

Shallow copy versus default (deep) copy:

>>> s = pd.Series([1, 2], index=["a", "b"])
>>> deep = s.copy()
>>> shallow = s.copy(deep=False)

Shallow copy shares data and index with the original.

>>> s is shallow
False
>>> s.values is shallow.values and s.index is shallow.index
True

Deep copy has its own copy of data and index.

>>> s is deep
False
>>> s.values is deep.values or s.index is deep.index
False

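Updates to the data shared by the shallow copy and the original are reflected in both (NOTE: this will no longer be true for pandas >= 3.0); the deep copy remains unchanged.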

>>> s.iloc[0] = 3
>>> shallow.iloc[1] = 4
>>> s
a    3
b    4
dtype: int64
>>> shallow
a    3
b    4
dtype: int64
>>> deep
a    1
b    2
dtype: int64

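Note that when copying an object containing Python objects, a deep copy will copy the data, but will not do so recursively. Updating a nested data object will be reflected in the deep copy.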

>>> s = pd.Series([[1, 2], [3, 4]])
>>> deep = s.copy()
>>> s[0][0] = 10
>>> s
0    [10, 2]
1     [3, 4]
dtype: object
>>> deep
0    [10, 2]
1     [3, 4]
dtype: object

With Copy-on-Write enabled, the shallow copy is not modified when the original data is changed:

>>> with pd.option_context("mode.copy_on_write", True):
...     s = pd.Series([1, 2], index=["a", "b"])
...     copy = s.copy(deep=False)
...     s.iloc[0] = 100
...     s
a    100
b      2
dtype: int64
>>> copy
a    1
b    2
dtype: int64

`pandas.Series.bool`

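Source: pandas.pydata.org/docs/reference/api/pandas.Series.bool.html

Series.bool()

Return the bool of a single element Series or DataFrame.

Deprecated since version 2.1.0: bool is deprecated and will be removed in a future version of pandas. For Series, use pandas.Series.item.

This must be a boolean scalar value, either True or False. It will raise a ValueError if the Series or DataFrame does not have exactly 1 element, or if that element is not boolean (integer values 0 and 1 will also raise an exception).

Returns:

bool

The value in the Series or DataFrame.

See also

Series.astype

Change the data type of a Series, including to boolean.

DataFrame.astype

Change the data type of a DataFrame, including to boolean.

numpy.bool_

NumPy boolean data type, used by pandas for boolean values.

Examples

The method will only work for single element objects with a boolean value: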

>>> pd.Series([True]).bool()  
True
>>> pd.Series([False]).bool()  
False

>>> pd.DataFrame({'col': [True]}).bool()  
True
>>> pd.DataFrame({'col': [False]}).bool()  
False

This is an alternative method and will only work for single element objects with a boolean value:

>>> pd.Series([True]).item()  
True
>>> pd.Series([False]).item()  
False

`pandas.Series.to_numpy`

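Source: pandas.pydata.org/docs/reference/api/pandas.Series.to_numpy.html

Series.to_numpy(dtype=None, copy=False, na_value=_NoDefault.no_default, **kwargs)

A NumPy ndarray representing the values in this Series or Index.

Parameters:

dtype : str or numpy.dtype, optional

The dtype to pass to numpy.asarray().

copy : bool, default False

Whether to ensure that the returned value is not a view on another array. Note that copy=False does not ensure that to_numpy() is no-copy. Rather, copy=True ensures that a copy is made, even if not strictly necessary.

na_value : Any, optional

The value to use for missing values. The default value depends on dtype and the type of the array.

**kwargs

Additional keywords passed through to the to_numpy method of the underlying array (for extension arrays).

Returns:

numpy.ndarray

See also

Series.array

Get the actual data stored within.

Index.array

Get the actual data stored within.

DataFrame.to_numpy

Similar method for DataFrame.

Notes

The returned array will be the same up to equality (values equal in self will be equal in the returned array; likewise for values that are not equal). When self contains an ExtensionArray, the dtype may be different. For example, for a category-dtype Series, to_numpy() will return a NumPy array and the categorical dtype will be lost.

For NumPy dtypes, this will be a reference to the actual data stored in this Series or Index (assuming copy=False). Modifying the result in place will modify the data stored in the Series or Index (not that we recommend doing that).

For extension types, to_numpy() may require copying data and coercing the result to a NumPy type (possibly object), which may be expensive. When you need a no-copy reference to the underlying data, Series.array should be used instead.

This table lays out the different dtypes and default return types of to_numpy() for various dtypes within pandas.

dtype	array type
category[T]	ndarray[T] (same dtype as input)
period	ndarray[object] (Periods)
interval	ndarray[object] (Intervals)
IntegerNA	ndarray[object]
datetime64[ns]	datetime64[ns]
datetime64[ns, tz]	ndarray[object] (Timestamps)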
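
The IntegerNA row and the `na_value` parameter can be illustrated together. This is a sketch on made-up data: a nullable-integer Series with one missing entry converts to an object array by default, and to a float array once a dtype and a replacement for the missing value are supplied.

```python
import numpy as np
import pandas as pd

# Made-up nullable-integer data with one missing entry.
ser = pd.Series([1, 2, None], dtype="Int64")

as_object = ser.to_numpy()  # object ndarray; the gap stays pd.NA
as_float = ser.to_numpy(dtype="float64", na_value=np.nan)  # gap becomes nan

print(as_object.dtype)
print(as_float)
```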

Examples

>>> ser = pd.Series(pd.Categorical(['a', 'b', 'a']))
>>> ser.to_numpy()
array(['a', 'b', 'a'], dtype=object)

Specify the dtype to control how datetime-aware data is represented. Use dtype=object to return an ndarray of pandas Timestamp objects, each with the correct tz.

>>> ser = pd.Series(pd.date_range('2000', periods=2, tz="CET"))
>>> ser.to_numpy(dtype=object)
array([Timestamp('2000-01-01 00:00:00+0100', tz='CET'),
       Timestamp('2000-01-02 00:00:00+0100', tz='CET')],
      dtype=object)

Or use dtype='datetime64[ns]' to return an ndarray of native datetime64 values. The values are converted to UTC and the timezone info is dropped.

>>> ser.to_numpy(dtype="datetime64[ns]")
array(['1999-12-31T23:00:00.000000000', '2000-01-01T23:00:00...'],
      dtype='datetime64[ns]')

`pandas.Series.to_period`

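Source: pandas.pydata.org/docs/reference/api/pandas.Series.to_period.html

Series.to_period(freq=None, copy=None)

Convert Series from DatetimeIndex to PeriodIndex.

Parameters:

freq : str, default None

Frequency associated with the PeriodIndex.

copy : bool, default True

Whether or not to return a copy.

Note

The copy keyword will change behavior in pandas 3.0. Copy-on-Write will be enabled by default, which means that all methods with a copy keyword will use a lazy copy mechanism to defer the copy and ignore the copy keyword. The copy keyword will be removed in a future version of pandas.

You can already get the future behavior and improvements by enabling Copy-on-Write: pd.options.mode.copy_on_write = True

Returns:

Series

Series with index converted to PeriodIndex.

Examples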

>>> idx = pd.DatetimeIndex(['2023', '2024', '2025'])
>>> s = pd.Series([1, 2, 3], index=idx)
>>> s = s.to_period()
>>> s
2023    1
2024    2
2025    3
Freq: Y-DEC, dtype: int64

Viewing the index

>>> s.index
PeriodIndex(['2023', '2024', '2025'], dtype='period[Y-DEC]')

`pandas.Series.to_timestamp`

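Source: pandas.pydata.org/docs/reference/api/pandas.Series.to_timestamp.html

Series.to_timestamp(freq=None, how='start', copy=None)

Cast to DatetimeIndex of Timestamps, at beginning of period.

Parameters:

freq : str, default frequency of PeriodIndex

Desired frequency.

how : {'s', 'e', 'start', 'end'}

Convention for converting period to timestamp; start of period vs. end.

copy : bool, default True

Whether or not to return a copy.

Note

The copy keyword will change behavior in pandas 3.0. Copy-on-Write will be enabled by default, which means that all methods with a copy keyword will use a lazy copy mechanism to defer the copy and ignore the copy keyword. The copy keyword will be removed in a future version of pandas.

You can already get the future behavior and improvements by enabling Copy-on-Write: pd.options.mode.copy_on_write = True

Returns:

Series with DatetimeIndex

Examples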

>>> idx = pd.PeriodIndex(['2023', '2024', '2025'], freq='Y')
>>> s1 = pd.Series([1, 2, 3], index=idx)
>>> s1
2023    1
2024    2
2025    3
Freq: Y-DEC, dtype: int64

The resulting frequency of the Timestamps is YearBegin.

>>> s1 = s1.to_timestamp()
>>> s1
2023-01-01    1
2024-01-01    2
2025-01-01    3
Freq: YS-JAN, dtype: int64

Using freq, which is the offset that the Timestamps will have.

>>> s2 = pd.Series([1, 2, 3], index=idx)
>>> s2 = s2.to_timestamp(freq='M')
>>> s2
2023-01-31    1
2024-01-31    2
2025-01-31    3
Freq: YE-JAN, dtype: int64
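
The `how` parameter is not exercised by the examples above, so here is a minimal sketch on an assumed yearly PeriodIndex. With `how="end"` each timestamp falls on the last nanosecond of its period.

```python
import pandas as pd

# Assumed yearly periods, mirroring the style of the examples above.
idx = pd.PeriodIndex(["2023", "2024"], freq="Y")
s = pd.Series([1, 2], index=idx)

at_start = s.to_timestamp(how="start")  # first instant of each year
at_end = s.to_timestamp(how="end")      # last nanosecond of each year

print(at_start.index[0])
print(at_end.index[0])
```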

`pandas.Series.to_list`

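Source: pandas.pydata.org/docs/reference/api/pandas.Series.to_list.html

Series.to_list()

Return a list of the values.

These are each a scalar type, which is a Python scalar (for str, int, float) or a pandas scalar (for Timestamp/Timedelta/Interval/Period).

Returns:

list

See also

numpy.ndarray.tolist

Return the array as an a.ndim-levels deep nested list of Python scalars.

Examples

For Series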

>>> s = pd.Series([1, 2, 3])
>>> s.to_list()
[1, 2, 3]

For Index:

>>> idx = pd.Index([1, 2, 3])
>>> idx
Index([1, 2, 3], dtype='int64')

>>> idx.to_list()
[1, 2, 3]
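
The statement about scalar types can be checked directly. In this sketch (illustrative values only) an integer Series yields plain Python `int` objects from `to_list()`, whereas `to_numpy()` yields NumPy integers, and a datetime Series yields pandas `Timestamp` objects.

```python
import numpy as np
import pandas as pd

ints = pd.Series([1, 2, 3])
stamps = pd.Series(pd.to_datetime(["2023-01-01", "2023-01-02"]))

py_values = ints.to_list()    # Python ints
np_values = ints.to_numpy()   # NumPy integer scalars
ts_values = stamps.to_list()  # pandas Timestamp scalars

print(type(py_values[0]), type(np_values[0]), type(ts_values[0]))
```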

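`pandas.Series.__array__`

Source: pandas.pydata.org/docs/reference/api/pandas.Series.__array__.html

Series.__array__(dtype=None, copy=None)

Return the values as a NumPy array.

Users should not call this directly. Rather, it is invoked by numpy.array() and numpy.asarray().

Parameters:

dtype : str or numpy.dtype, optional

The dtype to use for the resulting NumPy array. By default, the dtype is inferred from the data.

copy : bool or None, optional

Unused.

Returns:

numpy.ndarray

The values in the Series converted to a numpy.ndarray with the specified dtype.

See also

array

Create a new array from data.

Series.array

Zero-copy view to the array backing the Series.

Series.to_numpy

Series method for similar behavior.

Examples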

>>> ser = pd.Series([1, 2, 3])
>>> np.asarray(ser)
array([1, 2, 3])

For timezone-aware data, the timezones may be retained with dtype='object'.

>>> tzser = pd.Series(pd.date_range('2000', periods=2, tz="CET"))
>>> np.asarray(tzser, dtype="object")
array([Timestamp('2000-01-01 00:00:00+0100', tz='CET'),
       Timestamp('2000-01-02 00:00:00+0100', tz='CET')],
      dtype=object)

Or the values may be localized to UTC and the tzinfo discarded with dtype='datetime64[ns]'.

>>> np.asarray(tzser, dtype="datetime64[ns]")  
array(['1999-12-31T23:00:00.000000000', ...],
      dtype='datetime64[ns]')

`pandas.Series.get`

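Source: pandas.pydata.org/docs/reference/api/pandas.Series.get.html

Series.get(key, default=None)

Get item from object for given key (ex: DataFrame column).

Returns the default value if not found.

Parameters:

key : object

Returns:

same type as items contained in object

Examples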

>>> df = pd.DataFrame(
...     [
...         [24.3, 75.7, "high"],
...         [31, 87.8, "high"],
...         [22, 71.6, "medium"],
...         [35, 95, "medium"],
...     ],
...     columns=["temp_celsius", "temp_fahrenheit", "windspeed"],
...     index=pd.date_range(start="2014-02-12", end="2014-02-15", freq="D"),
... )

>>> df
            temp_celsius  temp_fahrenheit windspeed
2014-02-12          24.3             75.7      high
2014-02-13          31.0             87.8      high
2014-02-14          22.0             71.6    medium
2014-02-15          35.0             95.0    medium

>>> df.get(["temp_celsius", "windspeed"])
            temp_celsius windspeed
2014-02-12          24.3      high
2014-02-13          31.0      high
2014-02-14          22.0    medium
2014-02-15          35.0    medium

>>> ser = df['windspeed']
>>> ser.get('2014-02-13')
'high'

If the key isn't found, the default value will be used.

>>> df.get(["temp_celsius", "temp_kelvin"], default="default_value")
'default_value'

>>> ser.get('2014-02-10', '[unknown]')
'[unknown]'