A Tour of Cool Python Libraries: the Third-Party Library Pandas (060)

Latest recommended article published on 2024-09-07 20:08:36

神奇夜光杯

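Category column: Myelsa的Python酷库之旅 (Myelsa's Tour of Cool Python Libraries). Tags: python, pandas, programming languages, artificial intelligence, excel, third-party libraries, learning and growth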

Article link: https://blog.csdn.net/ygb_1024/article/details/140856382

I. Usage in Detail

231. pandas.Series.reorder_levels method

231-1. Syntax

# 231. pandas.Series.reorder_levels method
pandas.Series.reorder_levels(order)
Rearrange index levels using input order.

May not drop or duplicate levels.

Parameters:
order
list of int representing new level order
Reference level by number or key.

Returns:
type of caller (new object)

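231-2. Parameters

231-2-1. order (required): The new order of the index levels, given as a list or tuple. Each level can be referenced by name or by position. The order must list every level: reorder_levels may not drop or duplicate levels, and an order whose length differs from the number of levels raises an error.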
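
A minimal sketch of how order can be passed, using the same letter/number data as the example in 231-6-2 (the variable names are illustrative). Levels can be referenced by name or by position, and an order that leaves out a level is rejected. The exact exception type is not stated in the documentation quoted above, so the sketch catches a generic Exception:

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [("A", 1), ("A", 2), ("B", 1), ("B", 2)], names=["letter", "number"]
)
s = pd.Series([10, 20, 30, 40], index=index)

# Reference levels by name or by position; both give the same result
by_name = s.reorder_levels(["number", "letter"])
by_position = s.reorder_levels([1, 0])
print(by_name.equals(by_position))

# An order that leaves out a level is rejected
try:
    s.reorder_levels(["number"])
    partial_ok = True
except Exception as exc:  # assumption: exact exception type may vary by version
    partial_ok = False
    print(type(exc).__name__, exc)
```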

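231-3. Function

Rearranges the levels of a Series' MultiIndex into the specified order. This is useful for restructuring data, for example before pivoting or aggregating.

231-4. Return value

Returns a new Series whose index levels are arranged in the specified order. The rows themselves are not re-sorted; only the position of each level within the index changes.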
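
Because reorder_levels only rearranges the levels and keeps the rows in their original order, it is often followed by sort_index when a sorted result is wanted. A small sketch under that assumption, again with the illustrative letter/number data:

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [("A", 1), ("A", 2), ("B", 1), ("B", 2)], names=["letter", "number"]
)
s = pd.Series([10, 20, 30, 40], index=index)

# Levels are rearranged, but the values stay in their original row order
reordered = s.reorder_levels(["number", "letter"])
print(list(reordered))

# Sorting by the new index groups the rows by 'number' first
sorted_rows = reordered.sort_index()
print(sorted_rows)
```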

231-5. Notes

None.

231-6. Usage

231-6-1. Data preparation

None.

231-6-2. Code example

# 231. pandas.Series.reorder_levels method
import pandas as pd
# Create a Series with a MultiIndex
index = pd.MultiIndex.from_tuples([('A', 1), ('A', 2), ('B', 1), ('B', 2)], names=['letter', 'number'])
s = pd.Series([10, 20, 30, 40], index=index)
# Reorder the index levels
s_reordered = s.reorder_levels(['number', 'letter'])
print(s_reordered)

231-6-3. Output

# 231. pandas.Series.reorder_levels method
# number  letter
# 1       A         10
# 2       A         20
# 1       B         30
# 2       B         40
# dtype: int64

232. pandas.Series.sort_values method

232-1. Syntax

# 232. pandas.Series.sort_values method
pandas.Series.sort_values(*, axis=0, ascending=True, inplace=False, kind='quicksort', na_position='last', ignore_index=False, key=None)
Sort by the values.

Sort a Series in ascending or descending order by some criterion.

Parameters:
axis
{0 or ‘index’}
Unused. Parameter needed for compatibility with DataFrame.

ascending
bool or list of bools, default True
If True, sort values in ascending order, otherwise descending.

inplace
bool, default False
If True, perform operation in-place.

kind
{‘quicksort’, ‘mergesort’, ‘heapsort’, ‘stable’}, default ‘quicksort’
Choice of sorting algorithm. See also numpy.sort() for more information. ‘mergesort’ and ‘stable’ are the only stable algorithms.

na_position
{‘first’ or ‘last’}, default ‘last’
Argument ‘first’ puts NaNs at the beginning, ‘last’ puts NaNs at the end.

ignore_index
bool, default False
If True, the resulting axis will be labeled 0, 1, …, n - 1.

key
callable, optional
If not None, apply the key function to the series values before sorting. This is similar to the key argument in the builtin sorted() function, with the notable difference that this key function should be vectorized. It should expect a Series and return an array-like.

Returns:
Series or None
Series ordered by values or None if inplace=True.

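232-2. Parameters

232-2-1. axis (optional, default 0): Unused for a Series. The parameter exists only for compatibility with DataFrame.sort_values.

232-2-2. ascending (optional, default True): Sort direction. True sorts in ascending order, False in descending order. A list of booleans is accepted, but for a Series it must contain exactly one element.

232-2-3. inplace (optional, default False): Whether to sort in place. True modifies the original object and returns None; False (default) returns a new, sorted object.

232-2-4. kind (optional, default 'quicksort'): Sorting algorithm, one of 'quicksort', 'mergesort', 'heapsort', or 'stable'. Only 'mergesort' and 'stable' are stable.

232-2-5. na_position (optional, default 'last'): Where NaN values are placed in the result. 'first' puts them at the beginning; 'last' (default) puts them at the end.

232-2-6. ignore_index (optional, default False): Whether to relabel the result. False (default) keeps the original index labels; True labels the result 0, 1, ..., n - 1.

232-2-7. key (optional, default None): A function applied to the values before sorting. Unlike the key argument of the built-in sorted(), it must be vectorized: it receives the whole Series and returns an array-like, and the Series is then sorted by the transformed values.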
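
The parameters above can be sketched with a few small Series (the data and variable names are made up for illustration). The example shows descending order, moving NaN to the front with na_position, and a vectorized key for a case-insensitive sort:

```python
import numpy as np
import pandas as pd

s = pd.Series([3.0, np.nan, 1.0, 2.0])

# Descending order; NaN still goes last because na_position defaults to 'last'
desc = s.sort_values(ascending=False)
# Ascending order with NaN moved to the front
nan_first = s.sort_values(na_position="first")

words = pd.Series(["banana", "apple", "Cherry"])
# Plain string sort: uppercase letters sort before lowercase ones
plain = words.sort_values()
# key receives the whole Series and returns a transformed Series
by_lower = words.sort_values(key=lambda col: col.str.lower())

print(desc, nan_first, plain, by_lower, sep="\n\n")
```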

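232-3. Function

Sorts a Series by its values. The sort direction, algorithm, NaN placement, and resulting index labels are all configurable.

232-4. Return value

Returns a new Series with the values in the requested order. If inplace=True, the original object is modified and the method returns None.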
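
The difference between the two return modes can be checked directly. A minimal sketch with illustrative data:

```python
import pandas as pd

s = pd.Series([3, 1, 2])

# Default: a new Series is returned and s keeps its original order
result = s.sort_values()
before = list(s)
print(before, list(result))

# inplace=True: s itself is sorted and the call returns None
returned = s.sort_values(inplace=True)
print(returned, list(s))
```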

232-5. Notes

None.

232-6. Usage

232-6-1. Data preparation

None.

232-6-2. Code example

# 232. pandas.Series.sort_values method
import pandas as pd
# Create a Series
s = pd.Series([3, 6, 5, 11, 10, 8, 10, 24])
# Sort the Series in ascending order
s_sorted = s.sort_values()
print(s_sorted, end='\n\n')
# Sort again, relabeling the result 0, 1, ..., n - 1
s_sorted = s.sort_values(ignore_index=True)
print(s_sorted)

232-6-3. Output

# 232. pandas.Series.sort_values method
# 0     3
# 2     5
# 1     6
# 5     8
# 4    10
# 6    10
# 3    11
# 7    24
# dtype: int64
#
# 0     3
# 1     5
# 2     6
# 3     8
# 4    10
# 5    10
# 6    11
# 7    24
# dtype: int64

233. pandas.Series.sort_index method

233-1. Syntax

# 233. pandas.Series.sort_index method
pandas.Series.sort_index(*, axis=0, level=None, ascending=True, inplace=False, kind='quicksort', na_position='last', sort_remaining=True, ignore_index=False, key=None)
Sort Series by index labels.

Returns a new Series sorted by label if inplace argument is False, otherwise updates the original series and returns None.

Parameters:
axis
{0 or ‘index’}
Unused. Parameter needed for compatibility with DataFrame.

level
int, optional
If not None, sort on values in specified index level(s).

ascending
bool or list-like of bools, default True
Sort ascending vs. descending. When the index is a MultiIndex the sort direction can be controlled for each level individually.

inplace
bool, default False
If True, perform operation in-place.

kind
{‘quicksort’, ‘mergesort’, ‘heapsort’, ‘stable’}, default ‘quicksort’
Choice of sorting algorithm. See also numpy.sort() for more information. ‘mergesort’ and ‘stable’ are the only stable algorithms. For DataFrames, this option is only applied when sorting on a single column or label.

na_position
{‘first’, ‘last’}, default ‘last’
If ‘first’ puts NaNs at the beginning, ‘last’ puts NaNs at the end. Not implemented for MultiIndex.

sort_remaining
bool, default True
If True and sorting by level and index is multilevel, sort by other levels too (in order) after sorting by specified level.

ignore_index
bool, default False
If True, the resulting axis will be labeled 0, 1, …, n - 1.

key
callable, optional
If not None, apply the key function to the index values before sorting. This is similar to the key argument in the builtin sorted() function, with the notable difference that this key function should be vectorized. It should expect an Index and return an Index of the same shape.

Returns:
Series or None
The original Series sorted by the labels or None if inplace=True.

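233-2. Parameters

233-2-1. axis (optional, default 0): Unused for a Series. The parameter exists only for compatibility with DataFrame.sort_index.

233-2-2. level (optional, default None): For a MultiIndex, the level or list of levels to sort on.

233-2-3. ascending (optional, default True): Sort direction. True sorts in ascending order, False in descending order. For a MultiIndex, a list of booleans sets the direction of each level individually and must have one entry per level being sorted.

233-2-4. inplace (optional, default False): Whether to sort in place. True modifies the original object and returns None; False (default) returns a new, sorted object.

233-2-5. kind (optional, default 'quicksort'): Sorting algorithm, one of 'quicksort', 'mergesort', 'heapsort', or 'stable'. Only 'mergesort' and 'stable' are stable.

233-2-6. na_position (optional, default 'last'): Where NaN values are placed in the result. 'first' puts them at the beginning; 'last' (default) puts them at the end. Not implemented for a MultiIndex.

233-2-7. sort_remaining (optional, default True): For a MultiIndex sorted by level, whether to also sort by the remaining levels. True (default) sorts by the other levels, in order, after the specified level; False sorts by the specified level only.

233-2-8. ignore_index (optional, default False): Whether to relabel the result. False (default) keeps the original index labels; True labels the result 0, 1, ..., n - 1.

233-2-9. key (optional, default None): A function applied to the index before sorting. Unlike the key argument of the built-in sorted(), it must be vectorized: it receives an Index and returns an Index of the same shape, and the Series is then sorted by the transformed labels.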
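
A short sketch of the MultiIndex-related options and of key (the data and variable names are illustrative, not from the original example). It sorts on one named level, sets a separate direction per level, and uses a vectorized key for a case-insensitive label sort:

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [("B", 2), ("A", 2), ("B", 1), ("A", 1)], names=["letter", "number"]
)
s = pd.Series([1, 2, 3, 4], index=index)

# Sort on the 'number' level first; the remaining level breaks ties
by_number = s.sort_index(level="number")
# One direction per level: 'letter' ascending, 'number' descending
mixed = s.sort_index(ascending=[True, False])

t = pd.Series([1, 2, 3], index=["b", "a", "C"])
# Plain label sort: uppercase labels sort before lowercase ones
plain = t.sort_index()
# key receives the whole Index and returns an Index of the same shape
case_insensitive = t.sort_index(key=lambda idx: idx.str.lower())

print(by_number, mixed, plain, case_insensitive, sep="\n\n")
```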

233-3. Function

Sorts a Series by its index labels rather than by its values.

233-4. Return value

Returns a new Series sorted by index label. If inplace=True, the original object is modified and the method returns None.

233-5. Notes

None.

233-6. Usage

233-6-1. Data preparation

None.

233-6-2. Code example

# 233. pandas.Series.sort_index method
import pandas as pd
# Create a Series
s = pd.Series([3, 1, 2, 5, 4], index=['b', 'a', 'c', 'e', 'd'])
# Sort by index label in ascending order
s_sorted = s.sort_index()
print(s_sorted, end='\n\n')
# Sort again, relabeling the result 0, 1, ..., n - 1
s_sorted = s.sort_index(ignore_index=True)
print(s_sorted)

233-6-3. Output

# 233. pandas.Series.sort_index method
# a    1
# b    3
# c    2
# d    4
# e    5
# dtype: int64
#
# 0    1
# 1    3
# 2    2
# 3    4
# 4    5
# dtype: int64

234. pandas.Series.swaplevel method

234-1. Syntax

# 234. pandas.Series.swaplevel method
pandas.Series.swaplevel(i=-2, j=-1, copy=None)
Swap levels i and j in a MultiIndex.

Default is to swap the two innermost levels of the index.

Parameters:
i, j
int or str
Levels of the indices to be swapped. Can pass level name as string.

copy
bool, default True
Whether to copy underlying data.

Note

The copy keyword will change behavior in pandas 3.0. Copy-on-Write will be enabled by default, which means that all methods with a copy keyword will use a lazy copy mechanism to defer the copy and ignore the copy keyword. The copy keyword will be removed in a future version of pandas.

You can already get the future behavior and improvements through enabling copy on write pd.options.mode.copy_on_write = True

Returns:
Series
Series with levels swapped in MultiIndex.

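234-2. Parameters

234-2-1. i (optional, default -2): int or str. The first level to swap, given by position or by level name.

234-2-2. j (optional, default -1): int or str. The second level to swap, given by position or by level name.

234-2-3. copy (optional, default None): Whether to copy the underlying data. According to the pandas documentation quoted above, the keyword is ignored once Copy-on-Write is enabled (the default from pandas 3.0), and it will be removed in a future version.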

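234-3. Function

Swaps two levels of a Series' MultiIndex. By default, the two innermost levels are swapped.

234-4. Return value

Returns a new Series with the specified levels swapped. The original Series is left unchanged; with copy=False, the result may share its underlying data with the original instead of holding a separate copy.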
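
The defaults i=-2 and j=-1 are easier to see on an index with more than two levels. A minimal sketch with a hypothetical three-level index (the 'tag' level and the variable names are made up for illustration):

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [("A", 1, "x"), ("A", 2, "y"), ("B", 1, "x")],
    names=["letter", "number", "tag"],
)
s = pd.Series([10, 20, 30], index=index)

# Default i=-2, j=-1: swap the two innermost levels
inner = s.swaplevel()
# Swap the first and last levels by position
outer = s.swaplevel(0, -1)
# The same swap, referencing the levels by name
named = s.swaplevel("letter", "tag")

print(list(inner.index.names))
print(list(outer.index.names))
print(list(s.index.names))  # the original Series keeps its level order
```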

234-5. Notes

None.

234-6. Usage

234-6-1. Data preparation

None.

234-6-2. Code example

# 234. pandas.Series.swaplevel method
import pandas as pd
# Create a Series with a MultiIndex
index = pd.MultiIndex.from_tuples([('A', 1), ('A', 2), ('B', 1), ('B', 2)], names=['letter', 'number'])
s = pd.Series([10, 20, 30, 40], index=index)
# Print the original Series
print("Original Series:")
print(s)
# Swap the 'letter' and 'number' levels
s_swapped = s.swaplevel(i='letter', j='number')
print("\nSeries after swapping levels:")
print(s_swapped)

234-6-3. Output

# 234. pandas.Series.swaplevel method
# Original Series:
# letter  number
# A       1         10
#         2         20
# B       1         30
#         2         40
# dtype: int64
#
# Series after swapping levels:
# number  letter
# 1       A         10
# 2       A         20
# 1       B         30
# 2       B         40
# dtype: int64

235. pandas.Series.unstack method

235-1. Syntax

# 235. pandas.Series.unstack method
pandas.Series.unstack(level=-1, fill_value=None, sort=True)
Unstack, also known as pivot, Series with MultiIndex to produce DataFrame.

Parameters:
level
int, str, or list of these, default last level
Level(s) to unstack, can pass level name.

fill_value
scalar value, default None
Value to use when replacing NaN values.

sort
bool, default True
Sort the level(s) in the resulting MultiIndex columns.

Returns:
DataFrame
Unstacked Series.

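235-2. Parameters

235-2-1. level (optional, default -1): The index level or levels to turn into columns. The default, -1, means the last (innermost) level. Accepts an int, a level name, or a list of these.

235-2-2. fill_value (optional, default None): Value used in place of the missing entries that unstacking produces. With the default None, combinations that do not exist in the original index appear as NaN.

235-2-3. sort (optional, default True): Whether to sort the level(s) in the resulting columns. With False, the original order is kept, which is useful when the data is already arranged in a meaningful order.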
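
A small sketch of level and fill_value, using a made-up Series in which one index combination is missing (the level names 'outer' and 'inner' are illustrative):

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [("x", "a"), ("x", "b"), ("y", "a")], names=["outer", "inner"]
)
s = pd.Series([1, 2, 3], index=index)

# Default level=-1: 'inner' becomes the columns; ('y', 'b') is missing -> NaN
wide = s.unstack()
# Fill the missing combination with 0 instead of NaN
filled = s.unstack(fill_value=0)
# Move the outer level into the columns, referencing it by name
by_outer = s.unstack(level="outer")

print(wide, filled, by_outer, sep="\n\n")
```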

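235-3. Function

Turns the specified index level(s) of a Series into the columns of a DataFrame. This converts data from long format (a Series with a MultiIndex) to wide format (an ordinary DataFrame), which is often easier to analyze and manipulate.

235-4. Return value

Returns a DataFrame. The labels of the unstacked level(s) become the column labels, and the remaining index levels form the row index.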

235-5. Notes

None.

235-6. Usage

235-6-1. Data preparation

None.

235-6-2. Code example

# 235. pandas.Series.unstack method
import pandas as pd
# Create a Series with a two-level index (city, date)
data = pd.Series([1, 2, 3, 4], index=[['New York', 'New York', 'Los Angeles', 'Los Angeles'], ['2023-01-01', '2023-01-02', '2023-01-01', '2023-01-02']])
print(data, end='\n\n')
# Turn the last index level (the dates) into columns
df = data.unstack(level=-1)
print(df)

235-6-3. Output

# 235. pandas.Series.unstack method
# New York     2023-01-01    1
#              2023-01-02    2
# Los Angeles  2023-01-01    3
#              2023-01-02    4
# dtype: int64
#
#              2023-01-01  2023-01-02
# Los Angeles           3           4
# New York              1           2