[Python Bedside Book] pandas.Series: definition, parameters, attributes, examples, and methods explained

Article link: https://blog.csdn.net/wang2leee/article/details/134046207

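Class definition

class pandas.Series(data=None, index=None, dtype=None, name=None, copy=None, fastpath=False)

Class description

A one-dimensional ndarray with axis labels (including time series).

Labels need not be unique, but they must be of a hashable type. The object supports both integer-based and label-based indexing and provides many methods for operations that involve the index. The statistical methods from ndarray are overridden to automatically exclude missing data (currently represented as NaN).

Operations between Series (+, -, /, *, **) align values on their associated index labels, so the two Series need not be the same length. The result index is the sorted union of the two indexes.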
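The alignment rule described above can be illustrated with a minimal sketch. The Series names and values here are made up for illustration; labels present in only one operand produce NaN, and `add` with `fill_value` (listed in the methods table) is one way to treat a missing side as zero.

```python
import pandas as pd

# Two Series of different lengths whose labels only partly overlap.
s1 = pd.Series([1, 2, 3], index=["a", "b", "c"])
s2 = pd.Series([10, 20], index=["c", "d"])

# "+" aligns on labels, not positions; unmatched labels become NaN.
total = s1 + s2
print(total)

# add() with fill_value substitutes 0 for a label missing on one side.
filled_total = s1.add(s2, fill_value=0)
print(filled_total)
```

Only label "c" exists in both inputs, so it is the only non-missing entry in `total`.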

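Parameters

Parameter	Type	Description
data	array-like, Iterable, dict, or scalar value	Contains the data stored in the Series. If data is a dict, argument order is maintained.
index	array-like or Index (1d)	Values must be hashable and have the same length as data. Non-unique index values are allowed. Defaults to RangeIndex (0, 1, 2, …, n) if not provided. If data is dict-like and index is None, the keys in data are used as the index. If index is not None, the resulting Series is reindexed with the index values.
dtype	str, numpy.dtype, or ExtensionDtype, optional	Data type for the output Series. If not specified, it is inferred from data. See the user guide for more usages.
name	Hashable, default None	The name to give to the Series.
copy	bool, default False	Copy input data. Only affects Series or 1d ndarray input. See the examples.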
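As a small sketch of how these parameters combine (the labels, values, and the name "score" are made up for illustration), the constructor can be called with an explicit index, dtype, and name, or with a scalar that is repeated for every index label:

```python
import pandas as pd

# Explicit index, dtype and name.
ser = pd.Series([1, 2, 3], index=["x", "y", "z"], dtype="float64", name="score")
print(ser.dtype)  # float64
print(ser.name)   # score

# A scalar is repeated for every label of the supplied index.
filled = pd.Series(0, index=["x", "y", "z"])
print(filled)
```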

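Examples

Constructing a Series from a dictionary with an index specified

import pandas as pd

d = {'a': 1, 'b': 2, 'c': 3}
ser = pd.Series(data=d, index=['a', 'b', 'c'])
ser

Output:

a   1
b   2
c   3
dtype: int64

The keys of the dictionary match the index values, so the index has no effect.

d = {'a': 1, 'b': 2, 'c': 3}
ser = pd.Series(data=d, index=['x', 'y', 'z'])
ser

Output:

x   NaN
y   NaN
z   NaN
dtype: float64

Note that the index is first built from the keys of the dictionary. The Series is then reindexed with the given index values; since none of them match a key, every value is NaN.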
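The two cases above are the extremes (all labels match, or none do). A minimal sketch of the in-between case, using a made-up index that overlaps the dictionary keys only partly:

```python
import pandas as pd

d = {"a": 1, "b": 2, "c": 3}

# "b" and "c" are dictionary keys, "z" is not.
ser = pd.Series(data=d, index=["b", "c", "z"])
print(ser)
```

Matching labels keep their values, the unmatched label "z" becomes NaN, and the dtype becomes float64 because NaN cannot be stored in an int64 Series.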

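Constructing a Series from a list with copy=False

r = [1, 2]
ser = pd.Series(r, copy=False)
ser.iloc[0] = 999
r

Output:

[1, 2]

ser

Output:

0    999
1      2
dtype: int64

Because of the input data type (a Python list), the Series holds a copy of the original data even though copy=False, so the original list r is unchanged.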

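Constructing a Series from a 1d ndarray with copy=False

import numpy as np

r = np.array([1, 2])
ser = pd.Series(r, copy=False)
ser.iloc[0] = 999
r

Output:

array([999,   2])

ser

Output:

0    999
1      2
dtype: int64

Because of the input data type (a 1d ndarray), the Series holds a view on the original data, so the original array r is changed as well.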
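For contrast, here is a minimal sketch of the same ndarray input with copy=True. This variant is not part of the original examples; it is added only to show the opposite setting.

```python
import numpy as np
import pandas as pd

r = np.array([1, 2])

# copy=True gives the Series its own buffer, independent of r.
ser = pd.Series(r, copy=True)
ser.iloc[0] = 999

print(r)    # the source array keeps its original values
print(ser)  # only the Series sees the update
```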

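Attributes

Attribute	Description
T	Return the transpose, which is by definition self.
array	The ExtensionArray of the data backing this Series or Index.
at	Access a single value for a row/column label pair.
attrs	Dictionary of global attributes of this dataset.
axes	Return a list of the row axis labels.
dtype	Return the dtype object of the underlying data.
dtypes	Return the dtype object of the underlying data.
flags	Get the properties associated with this pandas object.
hasnans	Return True if there are any NaNs.
iat	Access a single value for a row/column pair by integer position.
iloc	Purely integer-location based indexing for selection by position.
index	The index (axis labels) of the Series.
is_monotonic_decreasing	Return True if the values in the object are monotonically decreasing.
is_monotonic_increasing	Return True if the values in the object are monotonically increasing.
is_unique	Return True if the values in the object are unique.
loc	Access a group of rows and columns by label(s) or a boolean array.
name	Return the name of the Series.
nbytes	Return the number of bytes in the underlying data.
ndim	Number of dimensions of the underlying data, by definition 1.
shape	Return a tuple of the shape of the underlying data.
size	Return the number of elements in the underlying data.
values	Return the Series as an ndarray or ndarray-like object, depending on the dtype.
empty	Return True if the Series is empty.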
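A few of the attributes in the table can be tried on a small Series. This is only a sketch; the values, labels, and the name "price" are made up for illustration.

```python
import numpy as np
import pandas as pd

ser = pd.Series([1.5, np.nan, 3.0], index=["a", "b", "c"], name="price")

print(ser.shape)    # (3,)
print(ser.size)     # 3
print(ser.ndim)     # 1
print(ser.dtype)    # float64
print(ser.hasnans)  # True, because of the NaN at label "b"
print(ser.empty)    # False

# Label-based access (loc / at) versus position-based access (iloc / iat).
print(ser.loc["c"], ser.at["c"])   # 3.0 3.0
print(ser.iloc[0], ser.iat[0])     # 1.5 1.5
```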

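Methods

Method	Description
abs()	Return a Series/DataFrame with the absolute numeric value of each element.
add(other[, level, fill_value, axis])	Return addition of series and other, element-wise (binary operator add).
add_prefix(prefix[, axis])	Prefix labels with the string prefix.
add_suffix(suffix[, axis])	Suffix labels with the string suffix.
agg([func, axis])	Aggregate using one or more operations over the specified axis.
aggregate([func, axis])	Aggregate using one or more operations over the specified axis.
align(other[, join, axis, level, copy, …])	Align two objects on their axes with the specified join method.
all([axis, bool_only, skipna])	Return whether all elements are True, potentially over an axis.
any(*[, axis, bool_only, skipna])	Return whether any element is True, potentially over an axis.
apply(func[, convert_dtype, args, by_row])	Invoke a function on the values of the Series.
argmax([axis, skipna])	Return the integer position of the largest value in the Series.
argmin([axis, skipna])	Return the integer position of the smallest value in the Series.
argsort([axis, kind, order])	Return the integer indices that would sort the Series values.
asfreq(freq[, method, how, normalize, …])	Convert time series to the specified frequency.
asof(where[, subset])	Return the last row(s) without any NaNs before where.
astype(dtype[, copy, errors])	Cast a pandas object to the specified dtype.
at_time(time[, asof, axis])	Select values at a particular time of day (e.g., 9:30 AM).
autocorr([lag])	Compute the lag-N autocorrelation.
backfill(*[, axis, inplace, limit, downcast])	(Deprecated) Fill NA/NaN values by using the next valid observation.
between(left, right[, inclusive])	Return a boolean Series equivalent to left <= series <= right.
between_time(start_time, end_time[, …])	Select values between particular times of the day (e.g., 9:00-9:30 AM).
bfill(*[, axis, inplace, limit, downcast])	Fill NA/NaN values by using the next valid observation.
bool()	(Deprecated) Return the bool of a single-element Series or DataFrame.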
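cat	Alias of CategoricalAccessor.
clip([lower, upper, axis, inplace])	Trim values at the input threshold(s).
combine(other, func[, fill_value])	Combine the Series with a Series or scalar according to func.
combine_first(other)	Update null elements with the value in the same location in 'other'.
compare(other[, align_axis, keep_shape, …])	Compare to another Series and show the differences.
convert_dtypes([infer_objects, …])	Convert columns to the best possible dtypes using dtypes that support pd.NA.
copy([deep])	Make a copy of this object's indices and data.
corr(other[, method, min_periods])	Compute correlation with other Series, excluding missing values.
count()	Return the number of non-NA/null observations in the Series.
cov(other[, min_periods, ddof])	Compute covariance with Series, excluding missing values.
cummax([axis, skipna])	Return the cumulative maximum over a DataFrame or Series axis.
cummin([axis, skipna])	Return the cumulative minimum over a DataFrame or Series axis.
cumprod([axis, skipna])	Return the cumulative product over a DataFrame or Series axis.
cumsum([axis, skipna])	Return the cumulative sum over a DataFrame or Series axis.
describe([percentiles, include, exclude])	Generate descriptive statistics.
diff([periods])	First discrete difference of element.
div(other[, level, fill_value, axis])	Return floating division of series and other, element-wise (binary operator truediv).
divide(other[, level, fill_value, axis])	Return floating division of series and other, element-wise (binary operator truediv).
divmod(other[, level, fill_value, axis])	Return integer division and modulo of series and other, element-wise (binary operator divmod).
dot(other)	Compute the dot product between the Series and other.
drop([labels, axis, index, columns, level, …])	Return the Series with the specified index labels removed.
drop_duplicates(*[, keep, inplace, ignore_index])	Return the Series with duplicate values removed.
droplevel(level[, axis])	Return the Series/DataFrame with the requested index/column level(s) removed.
dropna(*[, axis, inplace, how, ignore_index])	Return a new Series with missing values removed.
duplicated([keep])	Indicate duplicate Series values.
eq(other[, level, fill_value, axis])	Return equal to of series and other, element-wise (binary operator eq).
equals(other)	Test whether two objects contain the same elements.
ewm([com, span, halflife, alpha, …])	Provide exponentially weighted (EW) calculations.
expanding([min_periods, axis, method])	Provide expanding window calculations.
explode([ignore_index])	Transform each element of a list-like to a row.
factorize([sort, use_na_sentinel])	Encode the object as an enumerated type or categorical variable.
ffill(*[, axis, inplace, limit, downcast])	Fill NA/NaN values by propagating the last valid observation forward.
fillna([value, method, axis, inplace, …])	Fill NA/NaN values using the specified method.
filter([items, like, regex, axis])	Subset the DataFrame rows or columns according to the specified index labels.
first(offset)	Select initial periods of time series data based on a date offset.
first_valid_index()	Return the index of the first non-NA value, or None if no non-NA value is found.
floordiv(other[, level, fill_value, axis])	Return integer division of series and other, element-wise (binary operator floordiv).
ge(other[, level, fill_value, axis])	Return greater than or equal to of series and other, element-wise (binary operator ge).
get(key[, default])	Get the item from the object for the given key (e.g., a DataFrame column).
groupby([by, axis, level, as_index, sort, …])	Group the Series using a mapper or by a Series of columns.
gt(other[, level, fill_value, axis])	Return greater than of series and other, element-wise (binary operator gt).
head([n])	Return the first n rows.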
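hist([by, ax, grid, xlabelsize, xrot, …])	Draw a histogram of the input series using matplotlib.
idxmax([axis, skipna])	Return the row label of the maximum value.
idxmin([axis, skipna])	Return the row label of the minimum value.
infer_objects([copy])	Attempt to infer better dtypes for object columns.
info([verbose, buf, max_cols, memory_usage, …])	Print a concise summary of the Series.
interpolate([method, axis, limit, inplace, …])	Fill NaN values using an interpolation method.
isin(values)	Whether elements in the Series are contained in values.
isna()	Detect missing values.
isnull()	Series.isnull is an alias for Series.isna.
item()	Return the first element of the underlying data as a Python scalar.
items()	Lazily iterate over (index, value) tuples.
keys()	Return alias for the index.
kurt([axis, skipna, numeric_only])	Return unbiased kurtosis over the requested axis.
kurtosis([axis, skipna, numeric_only])	Return unbiased kurtosis over the requested axis.
last(offset)	Select final periods of time series data based on a date offset.
last_valid_index()	Return the index of the last non-NA value, or None if no non-NA value is found.
le(other[, level, fill_value, axis])	Return less than or equal to of series and other, element-wise (binary operator le).
lt(other[, level, fill_value, axis])	Return less than of series and other, element-wise (binary operator lt).
map(arg[, na_action])	Map values of the Series according to an input mapping or function.
mask(cond[, other, inplace, axis, level])	Replace values where the condition is True.
max([axis, skipna, numeric_only])	Return the maximum of the values over the requested axis.
mean([axis, skipna, numeric_only])	Return the mean of the values over the requested axis.
median([axis, skipna, numeric_only])	Return the median of the values over the requested axis.
memory_usage([index, deep])	Return the memory usage of the Series.
min([axis, skipna, numeric_only])	Return the minimum of the values over the requested axis.
mod(other[, level, fill_value, axis])	Return modulo of series and other, element-wise (binary operator mod).
mode([dropna])	Return the mode(s) of the Series.
mul(other[, level, fill_value, axis])	Return multiplication of series and other, element-wise (binary operator mul).
multiply(other[, level, fill_value, axis])	Return multiplication of series and other, element-wise (binary operator mul).
ne(other[, level, fill_value, axis])	Return not equal to of series and other, element-wise (binary operator ne).
nlargest([n, keep])	Return the largest n elements.
notna()	Detect existing (non-missing) values.
notnull()	Series.notnull is an alias for Series.notna.
nsmallest([n, keep])	Return the smallest n elements.
nunique([dropna])	Return the number of unique elements in the object.
pad(*[, axis, inplace, limit, downcast])	(Deprecated) Fill NA/NaN values by propagating the last valid observation forward.
pct_change([periods, fill_method, limit, freq])	Percentage change between the current and a prior element.
pipe(func, *args, **kwargs)	Apply chainable functions that expect Series or DataFrames.
plot	Alias of PlotAccessor.
pop(item)	Return item and drop it from the Series.
pow(other[, level, fill_value, axis])	Return exponential power of series and other, element-wise (binary operator pow).
prod([axis, skipna, numeric_only, min_count])	Return the product of the values over the requested axis.
product([axis, skipna, numeric_only, min_count])	Return the product of the values over the requested axis.
quantile([q, interpolation])	Return the value at the given quantile.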
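radd(other[, level, fill_value, axis])	Return addition of series and other, element-wise (binary operator radd).
rank([axis, method, numeric_only, …])	Compute numerical data ranks (1 through n) along the axis.
ravel([order])	Return the flattened underlying data as an ndarray or ExtensionArray.
rdiv(other[, level, fill_value, axis])	Return floating division of series and other, element-wise (binary operator rtruediv).
rdivmod(other[, level, fill_value, axis])	Return integer division and modulo of series and other, element-wise (binary operator rdivmod).
reindex([index, axis, method, copy, level, …])	Conform the Series to a new index with optional filling logic.
reindex_like(other[, method, copy, limit, …])	Return an object with indices matching those of the other object.
rename([index, axis, copy, inplace, level, …])	Alter the Series index labels or name.
rename_axis([mapper, index, axis, copy, inplace])	Set the name of the axis for the index or columns.
reorder_levels(order)	Rearrange index levels using the input order.
repeat(repeats[, axis])	Repeat elements of the Series.
replace([to_replace, value, inplace, limit, …])	Replace values given in to_replace with value.
resample(rule[, axis, closed, label, …])	Resample time-series data.
reset_index([level, drop, name, inplace, …])	Generate a new DataFrame or Series with the index reset.
rfloordiv(other[, level, fill_value, axis])	Return integer division of series and other, element-wise (binary operator rfloordiv).
rmod(other[, level, fill_value, axis])	Return modulo of series and other, element-wise (binary operator rmod).
rmul(other[, level, fill_value, axis])	Return multiplication of series and other, element-wise (binary operator rmul).
rolling(window[, min_periods, center, …])	Provide rolling window calculations.
round([decimals])	Round each value in the Series to the given number of decimals.
rpow(other[, level, fill_value, axis])	Return exponential power of series and other, element-wise (binary operator rpow).
rsub(other[, level, fill_value, axis])	Return subtraction of series and other, element-wise (binary operator rsub).
rtruediv(other[, level, fill_value, axis])	Return floating division of series and other, element-wise (binary operator rtruediv).
sample([n, frac, replace, weights, …])	Return a random sample of items from an axis of the object.
searchsorted(value[, side, sorter])	Find indices where elements should be inserted to maintain order.
sem([axis, skipna, ddof, numeric_only])	Return the unbiased standard error of the mean over the requested axis.
set_axis(labels, *[, axis, copy])	Assign the desired index to the given axis.
set_flags(*[, copy, allows_duplicate_labels])	Return a new object with updated flags.
shift([periods, freq, axis, fill_value, suffix])	Shift the index by the desired number of periods with an optional time frequency.
skew([axis, skipna, numeric_only])	Return unbiased skew over the requested axis.
sort_index(*[, axis, level, ascending, …])	Sort the Series by index labels.
sort_values(*[, axis, ascending, inplace, …])	Sort by the values.
sparse	Alias of SparseAccessor.
squeeze([axis])	Squeeze 1-dimensional axis objects into scalars.
std([axis, skipna, ddof, numeric_only])	Return the sample standard deviation over the requested axis.
str	Alias of StringMethods.
sub(other[, level, fill_value, axis])	Return subtraction of series and other, element-wise (binary operator sub).
subtract(other[, level, fill_value, axis])	Return subtraction of series and other, element-wise (binary operator sub).
sum([axis, skipna, numeric_only, min_count])	Return the sum of the values over the requested axis.
swapaxes(axis1, axis2[, copy])	(Deprecated) Interchange axes and swap values axes appropriately.
swaplevel([i, j, copy])	Swap levels i and j in a MultiIndex.
tail([n])	Return the last n rows.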
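take(indices[, axis])	Return the elements at the given positional indices along an axis.
to_clipboard([excel, sep])	Copy the object to the system clipboard.
to_csv([path_or_buf, sep, na_rep, …])	Write the object to a comma-separated values (csv) file.
to_dict([into])	Convert the Series to a {label -> value} dict or dict-like object.
to_excel(excel_writer[, sheet_name, na_rep, …])	Write the object to an Excel sheet.
to_frame([name])	Convert the Series to a DataFrame.
to_hdf(path_or_buf, key[, mode, complevel, …])	Write the contained data to an HDF5 file using HDFStore.
to_json([path_or_buf, orient, date_format, …])	Convert the object to a JSON string.
to_latex([buf, columns, header, index, …])	Render the object to a LaTeX tabular, longtable, or nested table.
to_list()	Return a list of the values.
to_markdown([buf, mode, index, storage_options])	Print the Series in Markdown-friendly format.
to_numpy([dtype, copy, na_value])	A NumPy ndarray representing the values in this Series or Index.
to_period([freq, copy])	Convert the Series from DatetimeIndex to PeriodIndex.
to_pickle(path[, compression, protocol, …])	Pickle (serialize) the object to a file.
to_sql(name, con, *[, schema, if_exists, …])	Write records stored in a DataFrame to a SQL database.
to_string([buf, na_rep, float_format, …])	Render a string representation of the Series.
to_timestamp([freq, how, copy])	Cast to a DatetimeIndex of Timestamps, at the beginning of the period.
to_xarray()	Return an xarray object from the pandas object.
tolist()	Return a list of the values.
transform(func[, axis])	Call func on self, producing a Series with the same axis shape as self.
transpose(*args, **kwargs)	Return the transpose, which is by definition self.
truediv(other[, level, fill_value, axis])	Return floating division of series and other, element-wise (binary operator truediv).
truncate([before, after, axis, copy])	Truncate a Series or DataFrame before and after some index value.
tz_convert(tz[, axis, level, copy])	Convert a tz-aware axis to the target time zone.
tz_localize(tz[, axis, level, copy, …])	Localize the tz-naive index of a Series or DataFrame to the target time zone.
unique()	Return the unique values of the Series object.
unstack([level, fill_value, sort])	Unstack (also known as pivot) a Series with a MultiIndex to produce a DataFrame.
update(other)	Modify the Series in place using values from the passed Series.
value_counts([normalize, sort, ascending, …])	Return a Series containing counts of unique values.
var([axis, skipna, ddof, numeric_only])	Return unbiased variance over the requested axis.
view([dtype])	Create a new view of the Series.
where(cond[, other, inplace, axis, level])	Replace values where the condition is False.
xs(key[, axis, level, drop_level])	Return a cross-section from the Series/DataFrame.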
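To give a feel for how a handful of these methods behave, here is a minimal sketch on a small Series that contains one missing value. The data and variable names are made up for illustration, and only methods listed in the table are used.

```python
import numpy as np
import pandas as pd

ser = pd.Series([3.0, np.nan, 1.0, 3.0], index=["a", "b", "c", "d"])

# Statistical methods skip missing values by default.
n_valid = ser.count()   # number of non-NA observations
total = ser.sum()       # NaN is excluded from the sum

# Handling missing data.
filled = ser.fillna(0.0)   # replace NaN with 0.0
dropped = ser.dropna()     # remove the NaN entry

# Element-wise transformation and frequency counts.
doubled = ser.map(lambda x: x * 2)   # NaN stays NaN
counts = ser.value_counts()          # NaN is not counted by default

# Label of the first occurrence of the maximum value.
top_label = ser.idxmax()

print(n_valid, total, top_label)
print(filled.tolist())
print(counts)
```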