Python Cool Libraries Tour: Third-Party Library Pandas (063)

神奇夜光杯

Published 2024-08-04 07:45:00


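Column: Myelsa's Python Cool Libraries Tour. Tags: python, pandas, programming languages, artificial intelligence, standard and third-party libraries, excel, learning and growth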

Article link: https://blog.csdn.net/ygb_1024/article/details/140878568


I. Usage in Detail

246. pandas.Series.shift method

246-1. Syntax

# 246. pandas.Series.shift method
pandas.Series.shift(periods=1, freq=None, axis=0, fill_value=_NoDefault.no_default, suffix=None)
Shift index by desired number of periods with an optional time freq.

When freq is not passed, shift the index without realigning the data. If freq is passed (in this case, the index must be date or datetime, or it will raise a NotImplementedError), the index will be increased using the periods and the freq. freq can be inferred when specified as “infer” as long as either freq or inferred_freq attribute is set in the index.

Parameters:
periods
int or Sequence
Number of periods to shift. Can be positive or negative. If an iterable of ints, the data will be shifted once by each int. This is equivalent to shifting by one value at a time and concatenating all resulting frames. The resulting columns will have the shift suffixed to their column names. For multiple periods, axis must not be 1.

freq
DateOffset, tseries.offsets, timedelta, or str, optional
Offset to use from the tseries module or time rule (e.g. ‘EOM’). If freq is specified then the index values are shifted but the data is not realigned. That is, use freq if you would like to extend the index when shifting and preserve the original data. If freq is specified as “infer” then it will be inferred from the freq or inferred_freq attributes of the index. If neither of those attributes exist, a ValueError is thrown.

axis
{0 or ‘index’, 1 or ‘columns’, None}, default None
Shift direction. For Series this parameter is unused and defaults to 0.

fill_value
object, optional
The scalar value to use for newly introduced missing values. The default depends on the dtype of self. For numeric data, np.nan is used. For datetime, timedelta, or period data, etc. NaT is used. For extension dtypes, self.dtype.na_value is used.

suffix
str, optional
If str and periods is an iterable, this is added after the column name and before the shift value for each shifted column name.

Returns:
Series/DataFrame
Copy of input object, shifted.

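246-2. Parameters

246-2-1. periods (optional, default 1): an integer, or a sequence of integers, giving the number of periods to shift. A positive value moves each value to a later index position (the data lags); a negative value moves it to an earlier position (the data leads).

246-2-2. freq (optional, default None): a DateOffset, timedelta, or string. When the Series has a datetime-like index, passing freq shifts the index labels by that frequency and leaves the data where it is. For example, 'D' means calendar day and 'ME' means month end (the older alias 'M' is deprecated as of pandas 2.2).

246-2-3. axis (optional, default 0): a string or integer naming the axis to shift along. It is unused for a Series and defaults to 0.

246-2-4. fill_value (optional): a scalar used to fill the missing values introduced by the shift. If omitted, the fill depends on the dtype: NaN for numeric data and NaT for datetime-like data.

246-2-5. suffix (optional, default None): a string that only applies when periods is a sequence. It is added after the column name and before the shift value in the name of each shifted column.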

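246-3. Function

Shifts the data by the specified number of periods, which is useful in time-series analysis and data processing. When freq is not given, the values change position and the index stays the same; when freq is given, the index labels are shifted instead and the values keep their order.

246-4. Return value

Returns a new Series of the same type as the original, with the data shifted by the specified number of periods. Without freq, the index is unchanged and the values move forward or backward; positions vacated by the shift are filled with NaN unless fill_value is given. Integer data is upcast to float64 when NaN is introduced, as the output in 246-6-3 shows.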
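To make the difference between positional and frequency-based shifting concrete, here is a minimal sketch. The Series `s` and its values are made up for illustration only.

```python
import pandas as pd

# Hypothetical three-day series, used only for illustration
s = pd.Series([10, 20, 30], index=pd.date_range("2024-08-01", periods=3, freq="D"))

# Without freq: the index stays put and the values move one position later
by_position = s.shift(1)

# With freq: the index labels move forward one day and the values keep their order
by_freq = s.shift(1, freq="D")

print(by_position)
print(by_freq)
```

With freq no missing values are introduced, so the integer dtype is preserved; the price is that the result no longer shares the original index.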

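246-5. Notes

None

246-6. Usage

246-6-1. Data preparation

None

246-6-2. Code example

# 246. pandas.Series.shift method
import pandas as pd
# Create a daily time series
dates = pd.date_range('2024-08-01', periods=10, freq='D')
data = pd.Series(range(10), index=dates)
# Shift by 1 period: each value moves to the next (later) timestamp
shifted_data = data.shift(periods=1)
print("Original data:")
print(data)
print("\nData shifted by 1 period:")
print(shifted_data)
# Shift by 2 periods and use fill_value for the vacated positions
shifted_data_fill = data.shift(periods=2, fill_value=0)
print("\nData shifted by 2 periods with fill_value=0:")
print(shifted_data_fill)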

246-6-3. Output

# 246. pandas.Series.shift method
# Original data:
# 2024-08-01    0
# 2024-08-02    1
# 2024-08-03    2
# 2024-08-04    3
# 2024-08-05    4
# 2024-08-06    5
# 2024-08-07    6
# 2024-08-08    7
# 2024-08-09    8
# 2024-08-10    9
# Freq: D, dtype: int64
# 
# Data shifted by 1 period:
# 2024-08-01    NaN
# 2024-08-02    0.0
# 2024-08-03    1.0
# 2024-08-04    2.0
# 2024-08-05    3.0
# 2024-08-06    4.0
# 2024-08-07    5.0
# 2024-08-08    6.0
# 2024-08-09    7.0
# 2024-08-10    8.0
# Freq: D, dtype: float64
# 
# Data shifted by 2 periods with fill_value=0:
# 2024-08-01    0
# 2024-08-02    0
# 2024-08-03    0
# 2024-08-04    1
# 2024-08-05    2
# 2024-08-06    3
# 2024-08-07    4
# 2024-08-08    5
# 2024-08-09    6
# 2024-08-10    7
# Freq: D, dtype: int64
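A common use of shift is comparing each observation with the previous one. The sketch below assumes a small, made-up price series; subtracting the shifted series gives the period-over-period change, which is the same result `Series.diff()` produces.

```python
import pandas as pd

# Hypothetical daily prices, for illustration only
prices = pd.Series(
    [100.0, 102.0, 101.0, 105.0],
    index=pd.date_range("2024-08-01", periods=4, freq="D"),
)

# Change versus the previous day: the first entry is NaN because it has no predecessor
change = prices - prices.shift(1)

print(change)
print(change.equals(prices.diff()))
```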

247. pandas.Series.first_valid_index method

247-1. Syntax

# 247. pandas.Series.first_valid_index method
pandas.Series.first_valid_index()
Return index for first non-NA value or None, if no non-NA value is found.

Returns:
type of index

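247-2. Parameters

None

247-3. Function

Returns the index label of the first non-missing (non-NA) value in the Series. This is useful when you need to find where the valid data in a dataset begins.

247-4. Return value

Returns the index label of the first non-missing value. If every value in the Series is missing, it returns None.

247-5. Notes

None

247-6. Usage

247-6-1. Data preparation

None

247-6-2. Code example

# 247. pandas.Series.first_valid_index method
import pandas as pd
import numpy as np
# Create a Series containing missing values
data = pd.Series([np.nan, np.nan, 2, 3, np.nan, 5, 6])
# Get the index of the first non-missing value
print(data, end='\n\n')
first_valid_idx = data.first_valid_index()
print("Index of the first non-missing value:", first_valid_idx)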

247-6-3. Output

# 247. pandas.Series.first_valid_index method
# 0    NaN
# 1    NaN
# 2    2.0
# 3    3.0
# 4    NaN
# 5    5.0
# 6    6.0
# dtype: float64
#
# Index of the first non-missing value: 2
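The example above uses the default integer index, so the result looks like a position. A short sketch with made-up string labels shows that the method returns the index label, and that an all-missing Series yields None:

```python
import numpy as np
import pandas as pd

# Hypothetical Series with string labels
labeled = pd.Series([np.nan, 1.5, np.nan], index=["a", "b", "c"])
# Hypothetical Series with no valid values at all
all_missing = pd.Series([np.nan, np.nan])

print(labeled.first_valid_index())      # the label "b", not the position 1
print(all_missing.first_valid_index())  # None
```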

248. pandas.Series.last_valid_index method

248-1. Syntax

# 248. pandas.Series.last_valid_index method
pandas.Series.last_valid_index()
Return index for last non-NA value or None, if no non-NA value is found.

Returns:
type of index

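248-2. Parameters

None

248-3. Function

Finds the index label of the last valid (non-NA/null) data point in the Series. This is useful in data cleaning because it quickly locates where the valid data in a dataset ends.

248-4. Return value

Returns the index label of the last non-NA/null value. If every value in the Series is NA/null, it returns None.

248-5. Notes

None

248-6. Usage

248-6-1. Data preparation

None

248-6-2. Code example

# 248. pandas.Series.last_valid_index method
import pandas as pd
import numpy as np
# Create an example Series containing some NaN values
data = pd.Series([1, 2, np.nan, 4, np.nan, 6])
# Show the Series
print("Series:")
print(data)
# Get the last valid index
last_valid_idx = data.last_valid_index()
# Show the last valid index
print("\nLast valid index:", last_valid_idx)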

248-6-3. Output

# 248. pandas.Series.last_valid_index method
# Series:
# 0    1.0
# 1    2.0
# 2    NaN
# 3    4.0
# 4    NaN
# 5    6.0
# dtype: float64
#
# Last valid index: 5
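first_valid_index and last_valid_index are often used together to trim missing values from both ends of a series while keeping the gaps in the middle. The helper `trim_missing_edges` below is a hypothetical name introduced for this sketch; it assumes an index on which label slicing with `.loc` is well defined (for example a unique, sorted index).

```python
import numpy as np
import pandas as pd


def trim_missing_edges(s: pd.Series) -> pd.Series:
    """Hypothetical helper: drop leading and trailing missing values, keep interior gaps."""
    first, last = s.first_valid_index(), s.last_valid_index()
    if first is None:
        # No valid values at all: return an empty slice of the same Series
        return s.iloc[0:0]
    # .loc slicing is label-based and includes both endpoints
    return s.loc[first:last]


data = pd.Series([np.nan, 1.0, np.nan, 4.0, np.nan])
trimmed = trim_missing_edges(data)
print(trimmed)
```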

249. pandas.Series.resample method

249-1. Syntax

# 249. pandas.Series.resample method
pandas.Series.resample(rule, axis=_NoDefault.no_default, closed=None, label=None, convention=_NoDefault.no_default, kind=_NoDefault.no_default, on=None, level=None, origin='start_day', offset=None, group_keys=False)
Resample time-series data.

Convenience method for frequency conversion and resampling of time series. The object must have a datetime-like index (DatetimeIndex, PeriodIndex, or TimedeltaIndex), or the caller must pass the label of a datetime-like series/index to the on/level keyword parameter.

Parameters:
rule
DateOffset, Timedelta or str
The offset string or object representing target conversion.

axis
{0 or ‘index’, 1 or ‘columns’}, default 0
Which axis to use for up- or down-sampling. For Series this parameter is unused and defaults to 0. Must be DatetimeIndex, TimedeltaIndex or PeriodIndex.

Deprecated since version 2.0.0: Use frame.T.resample(…) instead.

closed
{‘right’, ‘left’}, default None
Which side of bin interval is closed. The default is ‘left’ for all frequency offsets except for ‘ME’, ‘YE’, ‘QE’, ‘BME’, ‘BA’, ‘BQE’, and ‘W’ which all have a default of ‘right’.

label
{‘right’, ‘left’}, default None
Which bin edge label to label bucket with. The default is ‘left’ for all frequency offsets except for ‘ME’, ‘YE’, ‘QE’, ‘BME’, ‘BA’, ‘BQE’, and ‘W’ which all have a default of ‘right’.

convention
{‘start’, ‘end’, ‘s’, ‘e’}, default ‘start’
For PeriodIndex only, controls whether to use the start or end of rule.

Deprecated since version 2.2.0: Convert PeriodIndex to DatetimeIndex before resampling instead.

kind
{‘timestamp’, ‘period’}, optional, default None
Pass ‘timestamp’ to convert the resulting index to a DateTimeIndex or ‘period’ to convert it to a PeriodIndex. By default the input representation is retained.

Deprecated since version 2.2.0: Convert index to desired type explicitly instead.

on
str, optional
For a DataFrame, column to use instead of index for resampling. Column must be datetime-like.

level
str or int, optional
For a MultiIndex, level (name or number) to use for resampling. level must be datetime-like.

origin
Timestamp or str, default ‘start_day’
The timestamp on which to adjust the grouping. The timezone of origin must match the timezone of the index. If string, must be one of the following:

‘epoch’: origin is 1970-01-01

‘start’: origin is the first value of the timeseries

‘start_day’: origin is the first day at midnight of the timeseries

‘end’: origin is the last value of the timeseries

‘end_day’: origin is the ceiling midnight of the last day

New in version 1.3.0.

Note

Only takes effect for Tick-frequencies (i.e. fixed frequencies like days, hours, and minutes, rather than months or quarters).

offset
Timedelta or str, default is None
An offset timedelta added to the origin.

group_keys
bool, default False
Whether to include the group keys in the result index when using .apply() on the resampled object.

New in version 1.5.0: Not specifying group_keys will retain values-dependent behavior from pandas 1.4 and earlier (see pandas 1.5.0 Release notes for examples).

Changed in version 2.0.0: group_keys now defaults to False.

Returns:
pandas.api.typing.Resampler
Resampler object.

See also

Series.resample
Resample a Series.

DataFrame.resample
Resample a DataFrame.

groupby
Group Series/DataFrame by mapping, function, label, or list of labels.

asfreq
Reindex a Series/DataFrame with the given frequency without grouping.

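249-2. Parameters

249-2-1. rule (required): a string, Timedelta, or DateOffset object giving the target frequency. For example, 'D' resamples by day, 'ME' by month end, and '5min' every 5 minutes (the older aliases 'M' and 'T' are deprecated as of pandas 2.2).

249-2-2. axis (optional): integer, the axis to resample along. It is unused for a Series and has been deprecated since pandas 2.0.

249-2-3. closed (optional, default None): 'right' or 'left', the side of each bin interval that is closed. With None, the default is 'left' for most frequencies and 'right' for 'ME', 'YE', 'QE', 'BME', 'BA', 'BQE', and 'W'.

249-2-4. label (optional, default None): 'right' or 'left', the bin edge used to label each bucket. The defaults follow the same rule as closed.

249-2-5. convention (optional): 'start' or 'end' (or 's', 'e'). For a PeriodIndex only, it controls whether the start or the end of rule is used. Deprecated since pandas 2.2.

249-2-6. kind (optional): 'timestamp' converts the resulting index to a DatetimeIndex and 'period' converts it to a PeriodIndex; by default the input representation is kept. Deprecated since pandas 2.2.

249-2-7. on (optional, default None): string, the name of a datetime-like column to resample on instead of the index. Applies to DataFrame only.

249-2-8. level (optional, default None): integer or string, the datetime-like level of a MultiIndex to resample on.

249-2-9. origin (optional, default 'start_day'): a Timestamp or string giving the point on which the bins are anchored. The string options are 'epoch', 'start', 'start_day', 'end', and 'end_day'. It only takes effect for fixed (tick) frequencies such as days, hours, and minutes.

249-2-10. offset (optional, default None): a Timedelta or string, an offset added to origin.

249-2-11. group_keys (optional, default False): boolean, whether to include the group keys in the result index when calling .apply() on the resampled object.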
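The effect of closed and label is easiest to see on a tiny series. The sketch below uses six made-up minute-level values and sums them into 3-minute bins, once with the defaults and once with both options set to 'right'.

```python
import pandas as pd

# Hypothetical minute-level data: values 0..5 at 00:00..00:05
s = pd.Series(range(6), index=pd.date_range("2024-01-01", periods=6, freq="min"))

# Defaults for '3min': bins are [00:00, 00:03) and [00:03, 00:06), labeled by the left edge
left = s.resample("3min").sum()

# Right-closed, right-labeled: bins are (23:57, 00:00], (00:00, 00:03], (00:03, 00:06]
right = s.resample("3min", closed="right", label="right").sum()

print(left)
print(right)
```

With closed='right' the value at 00:00 falls into a bin of its own, which is why the second result has three rows instead of two.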

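249-3. Function

Groups time-series data into bins of the specified frequency. An aggregation (such as sum or mean) or a fill method is then applied to those bins.

249-4. Return value

Returns a Resampler object. Call an aggregation or fill method on it to obtain the final result.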
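As a rough illustration of working with the returned object, the sketch below keeps the Resampler in a variable and applies several aggregations at once with `agg`. The data and variable names are made up for the example.

```python
import pandas as pd

# Hypothetical two hours of minute-level data
minutes = pd.Series(range(120), index=pd.date_range("2024-01-01", periods=120, freq="min"))

# resample() only sets up the grouping; nothing has been aggregated yet
resampler = minutes.resample("h")

# Passing a list of function names to agg() returns a DataFrame with one column per function
summary = resampler.agg(["sum", "mean", "max"])

print(type(resampler).__name__)
print(summary)
```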

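249-5. Notes

Use cases:

249-5-1. Frequency conversion: when the original sampling frequency does not suit the analysis, resampling changes it. For example, minute-level data can be converted to hourly, daily, or monthly data.

249-5-2. Time-series aggregation: high-frequency data is aggregated into low-frequency data. Common aggregations include sum, mean, maximum, and minimum.

249-5-3. Filling missing values: a time series may be missing data points. Upsampling to a regular frequency creates the missing rows, which can then be forward-filled or interpolated.

249-5-4. Alignment and synchronization: when several time series must share the same frequency, resampling brings them onto it. For example, data from several sensors can be aligned to one common frequency.

249-5-5. Downsampling: high-frequency data can be resampled to a lower frequency to reduce its volume, which makes it easier to store and process.

249-5-6. Time-window analysis: financial data analysis often relies on window statistics such as moving averages and rolling-window measures. Resampling can put the data on a frequency suited to that kind of analysis.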
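For the missing-value use case it helps to separate two situations: timestamps that are absent from the index, and timestamps that are present but hold NaN. The sketch below, on made-up data, suggests that the Resampler fill methods address the first situation, while NaN values already sitting on the target grid need `Series.ffill()` (or a similar Series method).

```python
import numpy as np
import pandas as pd

# Case 1: timestamps are missing (one reading every 2 days)
sparse = pd.Series([0.0, 2.0, 4.0], index=pd.date_range("2024-01-01", periods=3, freq="2D"))
forward_filled = sparse.resample("D").ffill()       # new rows copy the previous reading
interpolated = sparse.resample("D").interpolate()   # new rows are linearly interpolated

# Case 2: the daily grid is complete, but one value is NaN
holes = pd.Series([0.0, np.nan, 2.0], index=pd.date_range("2024-01-01", periods=3, freq="D"))
unchanged = holes.resample("D").ffill()             # the existing NaN is left as it is
filled = holes.ffill()                              # Series.ffill fills it

print(forward_filled.tolist())
print(interpolated.tolist())
print(unchanged.isna().sum(), filled.tolist())
```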

249-6. Usage

249-6-1. Data preparation

None

249-6-2. Code example

# 249. pandas.Series.resample method
# 249-1. Frequency conversion
import pandas as pd
# Create minute-level data
minute_data = pd.Series(range(60), index=pd.date_range('2024-01-01', periods=60, freq='min'))
# Convert the minute data to hourly data by summing each hour
hourly_data = minute_data.resample('h').sum()
print("Original minute data:")
print(minute_data)
print("Converted hourly data:")
print(hourly_data, end='\n\n')

# 249-2. Time-series aggregation
import pandas as pd
# Create daily data
daily_data = pd.Series(range(30), index=pd.date_range('2024-01-01', periods=30, freq='D'))
# Aggregate by month end and sum
monthly_data = daily_data.resample('ME').sum()
print("Original daily data:")
print(daily_data)
print("Data aggregated by month:")
print(monthly_data, end='\n\n')

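# 249-3. Filling missing values
import pandas as pd
# Create daily data, then extend the index by 5 days so that the new rows hold NaN
daily_data = pd.Series(range(30), index=pd.date_range('2024-01-01', periods=30, freq='D'))
data_with_gaps = daily_data.reindex(pd.date_range('2024-01-01', periods=35, freq='D'))
# Resampler.ffill() only fills rows created by upsampling. The index is already daily here,
# so the trailing NaN values stay as they are; data_with_gaps.ffill() would fill them
filled_data = data_with_gaps.resample('D').ffill()
print("Data with missing values:")
print(data_with_gaps)
print("Data after resample('D').ffill():")
print(filled_data, end='\n\n')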

# 249-4. Alignment and synchronization
import pandas as pd
# Create sensor data at different frequencies
sensor1_data = pd.Series(range(10), index=pd.date_range('2024-01-01', periods=10, freq='2D'))
sensor2_data = pd.Series(range(20), index=pd.date_range('2024-01-01', periods=20, freq='D'))
# Bring both series onto the same daily frequency
aligned_sensor1_data = sensor1_data.resample('D').ffill()
aligned_sensor2_data = sensor2_data.resample('D').ffill()
print("Sensor 1 original data:")
print(sensor1_data)
print("Sensor 1 aligned data:")
print(aligned_sensor1_data)
print("Sensor 2 original data:")
print(sensor2_data)
print("Sensor 2 aligned data:")
print(aligned_sensor2_data, end='\n\n')

# 249-5. Downsampling
import pandas as pd
# Create second-level data
second_data = pd.Series(range(60), index=pd.date_range('2024-01-01', periods=60, freq='s'))
# Downsample the second-level data to minutes, taking the mean of each minute
minute_data = second_data.resample('min').mean()
print("Original second data:")
print(second_data)
print("Downsampled minute data:")
print(minute_data, end='\n\n')

# 249-6. Time-window analysis
import pandas as pd
# Create daily stock price data
daily_stock_prices = pd.Series(range(100), index=pd.date_range('2024-01-01', periods=100, freq='D'))
# Compute the mean of each calendar week (non-overlapping bins, not a rolling average)
weekly_mean_prices = daily_stock_prices.resample('W').mean()
print("Original daily stock price data:")
print(daily_stock_prices)
print("Weekly mean:")
print(weekly_mean_prices, end='\n\n')

249-6-3. Output

# 249. pandas.Series.resample method
# 249-1. Frequency conversion
# Original minute data:
# 2024-01-01 00:00:00     0
# 2024-01-01 00:01:00     1
# 2024-01-01 00:02:00     2
# 2024-01-01 00:03:00     3
# 2024-01-01 00:04:00     4
# 2024-01-01 00:05:00     5
# 2024-01-01 00:06:00     6
# 2024-01-01 00:07:00     7
# 2024-01-01 00:08:00     8
# 2024-01-01 00:09:00     9
# 2024-01-01 00:10:00    10
# 2024-01-01 00:11:00    11
# 2024-01-01 00:12:00    12
# 2024-01-01 00:13:00    13
# 2024-01-01 00:14:00    14
# 2024-01-01 00:15:00    15
# 2024-01-01 00:16:00    16
# 2024-01-01 00:17:00    17
# 2024-01-01 00:18:00    18
# 2024-01-01 00:19:00    19
# 2024-01-01 00:20:00    20
# 2024-01-01 00:21:00    21
# 2024-01-01 00:22:00    22
# 2024-01-01 00:23:00    23
# 2024-01-01 00:24:00    24
# 2024-01-01 00:25:00    25
# 2024-01-01 00:26:00    26
# 2024-01-01 00:27:00    27
# 2024-01-01 00:28:00    28
# 2024-01-01 00:29:00    29
# 2024-01-01 00:30:00    30
# 2024-01-01 00:31:00    31
# 2024-01-01 00:32:00    32
# 2024-01-01 00:33:00    33
# 2024-01-01 00:34:00    34
# 2024-01-01 00:35:00    35
# 2024-01-01 00:36:00    36
# 2024-01-01 00:37:00    37
# 2024-01-01 00:38:00    38
# 2024-01-01 00:39:00    39
# 2024-01-01 00:40:00    40
# 2024-01-01 00:41:00    41
# 2024-01-01 00:42:00    42
# 2024-01-01 00:43:00    43
# 2024-01-01 00:44:00    44
# 2024-01-01 00:45:00    45
# 2024-01-01 00:46:00    46
# 2024-01-01 00:47:00    47
# 2024-01-01 00:48:00    48
# 2024-01-01 00:49:00    49
# 2024-01-01 00:50:00    50
# 2024-01-01 00:51:00    51
# 2024-01-01 00:52:00    52
# 2024-01-01 00:53:00    53
# 2024-01-01 00:54:00    54
# 2024-01-01 00:55:00    55
# 2024-01-01 00:56:00    56
# 2024-01-01 00:57:00    57
# 2024-01-01 00:58:00    58
# 2024-01-01 00:59:00    59
# Freq: min, dtype: int64
# Converted hourly data:
# 2024-01-01    1770
# Freq: h, dtype: int64

# 249-2. Time-series aggregation
# Original daily data:
# 2024-01-01     0
# 2024-01-02     1
# 2024-01-03     2
# 2024-01-04     3
# 2024-01-05     4
# 2024-01-06     5
# 2024-01-07     6
# 2024-01-08     7
# 2024-01-09     8
# 2024-01-10     9
# 2024-01-11    10
# 2024-01-12    11
# 2024-01-13    12
# 2024-01-14    13
# 2024-01-15    14
# 2024-01-16    15
# 2024-01-17    16
# 2024-01-18    17
# 2024-01-19    18
# 2024-01-20    19
# 2024-01-21    20
# 2024-01-22    21
# 2024-01-23    22
# 2024-01-24    23
# 2024-01-25    24
# 2024-01-26    25
# 2024-01-27    26
# 2024-01-28    27
# 2024-01-29    28
# 2024-01-30    29
# Freq: D, dtype: int64
# Data aggregated by month:
# 2024-01-31    435
# Freq: ME, dtype: int64

# 249-3. Filling missing values
# Data with missing values:
# 2024-01-01     0.0
# 2024-01-02     1.0
# 2024-01-03     2.0
# 2024-01-04     3.0
# 2024-01-05     4.0
# 2024-01-06     5.0
# 2024-01-07     6.0
# 2024-01-08     7.0
# 2024-01-09     8.0
# 2024-01-10     9.0
# 2024-01-11    10.0
# 2024-01-12    11.0
# 2024-01-13    12.0
# 2024-01-14    13.0
# 2024-01-15    14.0
# 2024-01-16    15.0
# 2024-01-17    16.0
# 2024-01-18    17.0
# 2024-01-19    18.0
# 2024-01-20    19.0
# 2024-01-21    20.0
# 2024-01-22    21.0
# 2024-01-23    22.0
# 2024-01-24    23.0
# 2024-01-25    24.0
# 2024-01-26    25.0
# 2024-01-27    26.0
# 2024-01-28    27.0
# 2024-01-29    28.0
# 2024-01-30    29.0
# 2024-01-31     NaN
# 2024-02-01     NaN
# 2024-02-02     NaN
# 2024-02-03     NaN
# 2024-02-04     NaN
# Freq: D, dtype: float64
# Data after resample('D').ffill():
# 2024-01-01     0.0
# 2024-01-02     1.0
# 2024-01-03     2.0
# 2024-01-04     3.0
# 2024-01-05     4.0
# 2024-01-06     5.0
# 2024-01-07     6.0
# 2024-01-08     7.0
# 2024-01-09     8.0
# 2024-01-10     9.0
# 2024-01-11    10.0
# 2024-01-12    11.0
# 2024-01-13    12.0
# 2024-01-14    13.0
# 2024-01-15    14.0
# 2024-01-16    15.0
# 2024-01-17    16.0
# 2024-01-18    17.0
# 2024-01-19    18.0
# 2024-01-20    19.0
# 2024-01-21    20.0
# 2024-01-22    21.0
# 2024-01-23    22.0
# 2024-01-24    23.0
# 2024-01-25    24.0
# 2024-01-26    25.0
# 2024-01-27    26.0
# 2024-01-28    27.0
# 2024-01-29    28.0
# 2024-01-30    29.0
# 2024-01-31     NaN
# 2024-02-01     NaN
# 2024-02-02     NaN
# 2024-02-03     NaN
# 2024-02-04     NaN
# Freq: D, dtype: float64

# 249-4. Alignment and synchronization
# Sensor 1 original data:
# 2024-01-01    0
# 2024-01-03    1
# 2024-01-05    2
# 2024-01-07    3
# 2024-01-09    4
# 2024-01-11    5
# 2024-01-13    6
# 2024-01-15    7
# 2024-01-17    8
# 2024-01-19    9
# Freq: 2D, dtype: int64
# Sensor 1 aligned data:
# 2024-01-01    0
# 2024-01-02    0
# 2024-01-03    1
# 2024-01-04    1
# 2024-01-05    2
# 2024-01-06    2
# 2024-01-07    3
# 2024-01-08    3
# 2024-01-09    4
# 2024-01-10    4
# 2024-01-11    5
# 2024-01-12    5
# 2024-01-13    6
# 2024-01-14    6
# 2024-01-15    7
# 2024-01-16    7
# 2024-01-17    8
# 2024-01-18    8
# 2024-01-19    9
# Freq: D, dtype: int64
# Sensor 2 original data:
# 2024-01-01     0
# 2024-01-02     1
# 2024-01-03     2
# 2024-01-04     3
# 2024-01-05     4
# 2024-01-06     5
# 2024-01-07     6
# 2024-01-08     7
# 2024-01-09     8
# 2024-01-10     9
# 2024-01-11    10
# 2024-01-12    11
# 2024-01-13    12
# 2024-01-14    13
# 2024-01-15    14
# 2024-01-16    15
# 2024-01-17    16
# 2024-01-18    17
# 2024-01-19    18
# 2024-01-20    19
# Freq: D, dtype: int64
# Sensor 2 aligned data:
# 2024-01-01     0
# 2024-01-02     1
# 2024-01-03     2
# 2024-01-04     3
# 2024-01-05     4
# 2024-01-06     5
# 2024-01-07     6
# 2024-01-08     7
# 2024-01-09     8
# 2024-01-10     9
# 2024-01-11    10
# 2024-01-12    11
# 2024-01-13    12
# 2024-01-14    13
# 2024-01-15    14
# 2024-01-16    15
# 2024-01-17    16
# 2024-01-18    17
# 2024-01-19    18
# 2024-01-20    19
# Freq: D, dtype: int64

# 249-5. Downsampling
# Original second data:
# 2024-01-01 00:00:00     0
# 2024-01-01 00:00:01     1
# 2024-01-01 00:00:02     2
# 2024-01-01 00:00:03     3
# 2024-01-01 00:00:04     4
# 2024-01-01 00:00:05     5
# 2024-01-01 00:00:06     6
# 2024-01-01 00:00:07     7
# 2024-01-01 00:00:08     8
# 2024-01-01 00:00:09     9
# 2024-01-01 00:00:10    10
# 2024-01-01 00:00:11    11
# 2024-01-01 00:00:12    12
# 2024-01-01 00:00:13    13
# 2024-01-01 00:00:14    14
# 2024-01-01 00:00:15    15
# 2024-01-01 00:00:16    16
# 2024-01-01 00:00:17    17
# 2024-01-01 00:00:18    18
# 2024-01-01 00:00:19    19
# 2024-01-01 00:00:20    20
# 2024-01-01 00:00:21    21
# 2024-01-01 00:00:22    22
# 2024-01-01 00:00:23    23
# 2024-01-01 00:00:24    24
# 2024-01-01 00:00:25    25
# 2024-01-01 00:00:26    26
# 2024-01-01 00:00:27    27
# 2024-01-01 00:00:28    28
# 2024-01-01 00:00:29    29
# 2024-01-01 00:00:30    30
# 2024-01-01 00:00:31    31
# 2024-01-01 00:00:32    32
# 2024-01-01 00:00:33    33
# 2024-01-01 00:00:34    34
# 2024-01-01 00:00:35    35
# 2024-01-01 00:00:36    36
# 2024-01-01 00:00:37    37
# 2024-01-01 00:00:38    38
# 2024-01-01 00:00:39    39
# 2024-01-01 00:00:40    40
# 2024-01-01 00:00:41    41
# 2024-01-01 00:00:42    42
# 2024-01-01 00:00:43    43
# 2024-01-01 00:00:44    44
# 2024-01-01 00:00:45    45
# 2024-01-01 00:00:46    46
# 2024-01-01 00:00:47    47
# 2024-01-01 00:00:48    48
# 2024-01-01 00:00:49    49
# 2024-01-01 00:00:50    50
# 2024-01-01 00:00:51    51
# 2024-01-01 00:00:52    52
# 2024-01-01 00:00:53    53
# 2024-01-01 00:00:54    54
# 2024-01-01 00:00:55    55
# 2024-01-01 00:00:56    56
# 2024-01-01 00:00:57    57
# 2024-01-01 00:00:58    58
# 2024-01-01 00:00:59    59
# Freq: s, dtype: int64
# Downsampled minute data:
# 2024-01-01    29.5
# Freq: min, dtype: float64

# 249-6. Time-window analysis
# Original daily stock price data:
# 2024-01-01     0
# 2024-01-02     1
# 2024-01-03     2
# 2024-01-04     3
# 2024-01-05     4
#               ..
# 2024-04-05    95
# 2024-04-06    96
# 2024-04-07    97
# 2024-04-08    98
# 2024-04-09    99
# Freq: D, Length: 100, dtype: int64
# Weekly mean:
# 2024-01-07     3.0
# 2024-01-14    10.0
# 2024-01-21    17.0
# 2024-01-28    24.0
# 2024-02-04    31.0
# 2024-02-11    38.0
# 2024-02-18    45.0
# 2024-02-25    52.0
# 2024-03-03    59.0
# 2024-03-10    66.0
# 2024-03-17    73.0
# 2024-03-24    80.0
# 2024-03-31    87.0
# 2024-04-07    94.0
# 2024-04-14    98.5
# Freq: W-SUN, dtype: float64
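Resampling by week and computing a rolling average are related but different operations. The sketch below, on 14 made-up daily values, contrasts the two: `resample('W').mean()` yields one value per calendar week, while `rolling(window=7).mean()` yields one value per day over the trailing seven observations.

```python
import pandas as pd

# Hypothetical two weeks of daily prices; 2024-01-01 is a Monday
prices = pd.Series(
    [float(v) for v in range(14)],
    index=pd.date_range("2024-01-01", periods=14, freq="D"),
)

# One row per week (weeks end on Sunday by default): non-overlapping bins
weekly = prices.resample("W").mean()

# One row per day: each value averages the current and the 6 preceding observations
rolling = prices.rolling(window=7).mean()

print(weekly)
print(rolling)
```

The rolling result keeps the daily index and starts with six NaN values, because a full window of seven observations is not available until the seventh day.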

250. pandas.Series.tz_convert method

250-1. Syntax

# 250. pandas.Series.tz_convert method
pandas.Series.tz_convert(tz, axis=0, level=None, copy=None)
Convert tz-aware axis to target time zone.

Parameters:
tz
str or tzinfo object or None
Target time zone. Passing None will convert to UTC and remove the timezone information.

axis
{0 or ‘index’, 1 or ‘columns’}, default 0
The axis to convert

level
int, str, default None
If axis is a MultiIndex, convert a specific level. Otherwise must be None.

copy
bool, default True
Also make a copy of the underlying data.

Note

The copy keyword will change behavior in pandas 3.0. Copy-on-Write will be enabled by default, which means that all methods with a copy keyword will use a lazy copy mechanism to defer the copy and ignore the copy keyword. The copy keyword will be removed in a future version of pandas.

You can already get the future behavior and improvements through enabling copy on write pd.options.mode.copy_on_write = True

Returns:
Series/DataFrame
Object with time zone converted axis.

Raises:
TypeError
If the axis is tz-naive.

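250-2. Parameters

250-2-1. tz (required): a string, a tzinfo object (such as pytz.timezone or dateutil.tz.tzfile), or None. It names the target time zone, for example 'UTC' or 'US/Eastern'. Passing None converts to UTC and removes the time zone information.

250-2-2. axis (optional, default 0): integer or string, the axis to convert. A Series has only one axis, so this parameter has no practical effect there.

250-2-3. level (optional, default None): integer or string. When the index is a MultiIndex, it selects the level to convert; otherwise it must be None.

250-2-4. copy (optional, default None): boolean, whether to also copy the underlying data. With False, copying is avoided where possible and the result may share data with the original. Under Copy-on-Write (the default from pandas 3.0) the keyword is ignored, and it will be removed in a future version.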

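250-3. Function

Converts the tz-aware index of a Series from one time zone to another. The values of the Series are not touched, and a tz-naive index raises TypeError.

250-4. Return value

Returns a new Series whose index has been converted to the target time zone. If copy is set to False and no actual copy takes place, the returned object may share data with the original.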
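Because tz_convert requires a tz-aware index, a tz-naive series usually has to be localized first with `tz_localize`. The following sketch assumes made-up data whose naive timestamps are meant as UTC; it also shows that passing None converts back to UTC and drops the time zone.

```python
import pandas as pd

# Hypothetical series with a tz-naive hourly index
naive = pd.Series([1, 2], index=pd.date_range("2024-08-02 21:00", periods=2, freq="h"))

# tz_convert on a tz-naive index raises TypeError
try:
    naive.tz_convert("Asia/Shanghai")
except TypeError as exc:
    print("tz-naive index:", exc)

# Declare the naive timestamps to be UTC, then convert to another zone
utc = naive.tz_localize("UTC")
shanghai = utc.tz_convert("Asia/Shanghai")

# tz=None converts to UTC and removes the time zone information
back_to_naive = shanghai.tz_convert(None)

print(shanghai)
print(back_to_naive)
```

The conversion changes only how each instant is displayed; the values and the underlying points in time stay the same.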

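250-5. Notes

None

250-6. Usage

250-6-1. Data preparation

None

250-6-2. Code example

# 250. pandas.Series.tz_convert method
import pandas as pd
# A one-element Series whose index is tz-aware (UTC+02:00)
s = pd.Series(
    [1],
    index=pd.DatetimeIndex(['2024-08-02 21:14:00+02:00']),
)
# Convert the index to the Asia/Shanghai time zone (UTC+08:00)
data = s.tz_convert('Asia/Shanghai')
print(data)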

250-6-3. Output

# 250. pandas.Series.tz_convert method
# 2024-08-03 03:14:00+08:00    1
# dtype: int64
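Series.tz_convert acts on the index. When the timestamps are stored as the values of a Series instead, the `.dt` accessor offers the equivalent `dt.tz_convert`. A minimal sketch, reusing the timestamp from the example above as a value:

```python
import pandas as pd

# The timestamp is now a value of the Series; the index is the default RangeIndex
stamps = pd.Series(pd.to_datetime(["2024-08-02 21:14:00+02:00"], utc=True))

# Series.tz_convert fails here because the index is not datetime-like
try:
    stamps.tz_convert("Asia/Shanghai")
except TypeError as exc:
    print("index is not datetime-like:", exc)

# Convert the values through the .dt accessor
converted = stamps.dt.tz_convert("Asia/Shanghai")
print(converted)
```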

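II. Recommended Reading

1. Python Foundations Tour (Python筑基之旅)

2. Python Functions Tour (Python函数之旅)

3. Python Algorithms Tour (Python算法之旅)

4. Python Magic Tour (Python魔法之旅)

5. Blog home page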
