Averaging two decades of netCDF data at six-hourly time steps with Python

I have 2 decades of spatially variable wind data recorded at six-hourly intervals. I need to average the 2 decades of data across each six-hourly time interval, so I end up with 365 * 4 time steps. The data is in netcdf format.

Here's what the data looks like:

import xarray as xr

filename = 'V-01011999-01012019.nc'

ds = xr.open_dataset(filename)

print(ds)

Dimensions: (lat: 8, lon: 7, time: 29221)

Coordinates:

* lat (lat) float32 -2.5 -5.0 -7.5 -10.0 -12.5 -15.0 -17.5 -20.0

* lon (lon) float32 130.0 132.5 135.0 137.5 140.0 142.5 145.0

* time (time) datetime64[ns] 1999-01-01 1999-01-01T06:00:00 .. 2019-01-01

Data variables:

vwnd (time, lat, lon) float32 ...

#remove feb 29 from records

ds = ds.sel(time=~((ds.time.dt.month == 2) & (ds.time.dt.day == 29)))

I have been able to group by day of year to get a 2 decadal average of the day of year.

tsavg = ds.groupby('time.dayofyear').mean('time')

print(tsavg)

Dimensions: (dayofyear: 366, lat: 8, lon: 7)

Coordinates:

* lat (lat) float32 -2.5 -5.0 -7.5 -10.0 -12.5 -15.0 -17.5 -20.0

* lon (lon) float32 130.0 132.5 135.0 137.5 140.0 142.5 145.0

* dayofyear (dayofyear) int64 1 2 3 4 5 6 7 8 ... 360 361 362 363 364 365 366

Data variables:

vwnd (dayofyear, lat, lon) float32 -2.61605 -1.49012 ... -0.959997

What I really want is a time coordinate of length 365 * 4 (4 x 6 hr intervals in a day) with each time step being an average over the past 20 years for that time step.

Also, for some reason tsavg.dayofyear length is still 366 even though I deleted Feb 29th.

I couldn't apply or follow the answers from this post.

I have studied the groupby resources extensively and tried so many things but I can't figure it out. I'm looking for some help with the coding.

Solution

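There is indeed no well-documented way of doing this. Note also that dayofyear may not be what you expect: it counts days from January 1 of each calendar year, so in a leap year December 31 is still day 366 even after Feb 29 has been removed, and every date from March 1 onward is numbered one higher than in a common year. That is why tsavg.dayofyear still has length 366, and why grouping by dayofyear mixes different calendar dates.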
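To make the dayofyear point concrete, here is a minimal sketch using only the standard library. It rebuilds six-hourly timestamps over the span in the question's filename (an assumption: the real file's time axis may differ in detail), drops Feb 29, and then checks which day-of-year values remain. The variable names are illustrative only.

```python
from datetime import datetime, timedelta

# Six-hourly timestamps over the span suggested by the question's file name.
start = datetime(1999, 1, 1)
end = datetime(2019, 1, 1)
step = timedelta(hours=6)

times = []
t = start
while t <= end:
    times.append(t)
    t += step

# Drop Feb 29, as the question does.
no_leap_day = [t for t in times if not (t.month == 2 and t.day == 29)]

# Day-of-year is counted from Jan 1 of each calendar year, so removing
# Feb 29 only leaves a gap at day 60 in leap years; Dec 31 is still day 366.
days_of_year = {t.timetuple().tm_yday for t in no_leap_day}
leap_dec_31 = datetime(2000, 12, 31).timetuple().tm_yday
common_dec_31 = datetime(2001, 12, 31).timetuple().tm_yday

# Keying on (month, day, hour) instead gives one key per calendar slot.
slots = {(t.month, t.day, t.hour) for t in no_leap_day}

print(len(times), len(days_of_year), leap_dec_31, common_dec_31, len(slots))
# prints: 29221 366 366 365 1460
```

The 366 distinct day-of-year values survive the Feb 29 removal, while the (month, day, hour) keys collapse to the 365 * 4 slots the question asks for.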

In lieu of being able to use groupby with multiple levels (e.g. see this answer regarding how to do something similar to what you are asking in pandas), which is not available yet in xarray, a reasonably clean way of solving this kind of problem is to define a new coordinate for grouping that represents the "time of year" for each time in your Dataset.

In your case you are looking to group by the "hour of the year" (i.e. matching month, day, and hour). For this you can create an array of strings, which are basically just the string representations of the dates in the time coordinate with the years dropped:

ds['hourofyear'] = xr.DataArray(ds.indexes['time'].strftime('%m-%d %H'), coords=ds.time.coords, dims='time')

result = ds.groupby('hourofyear').mean('time')
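The grouping approach can be tried end to end without the original file. The sketch below builds a small synthetic dataset (random values on a 2 x 3 grid, an assumption standing in for the real vwnd field), removes Feb 29, attaches the month-day-hour label with assign_coords rather than the DataArray constructor used above, and groups on it. Everything other than the vwnd and hourofyear names is invented for illustration.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic stand-in for the question's file: six-hourly steps, tiny grid.
time = pd.date_range("1999-01-01", "2019-01-01", freq=pd.Timedelta(hours=6))
lat = np.array([-2.5, -5.0], dtype="float32")
lon = np.array([130.0, 132.5, 135.0], dtype="float32")
rng = np.random.default_rng(0)
vwnd = rng.normal(size=(time.size, lat.size, lon.size)).astype("float32")

ds = xr.Dataset(
    {"vwnd": (("time", "lat", "lon"), vwnd)},
    coords={"time": time, "lat": lat, "lon": lon},
)

# Drop Feb 29 so every remaining timestamp maps onto a 365-day calendar.
is_leap_day = (ds.time.dt.month == 2) & (ds.time.dt.day == 29)
ds = ds.isel(time=np.flatnonzero(~is_leap_day.values))

# Label each timestamp with month, day and hour only (year dropped).
labels = np.asarray(ds.indexes["time"].strftime("%m-%d %H"), dtype=str)
ds = ds.assign_coords(hourofyear=("time", labels))

# Average all years that share the same month-day-hour label.
result = ds.groupby("hourofyear").mean("time")

print(result.sizes["hourofyear"])
```

The result should have 365 * 4 = 1460 entries along hourofyear. Because the labels are zero-padded strings, they sort in calendar order, starting at '01-01 00'.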
