Simple example
First we define a function that takes a sequence and returns three scalars:
from scipy import stats

def foo(sample):
    # Takes a 1-D sequence and returns the linear-regression slope, R value, and p-value
    res = stats.linregress(x=range(len(sample)), y=sample)
    return res.slope, res.rvalue, res.pvalue
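We now want to generalize this ordinary Python function to xarray arrays. A few details matter here:
- If the function takes a sequence as input, state which dimension the sequence is taken along, for example input_core_dims=[["z"]].
- State the dimensions of each output. The function here returns three values, each a scalar, so we pass output_core_dims=[(),(),()].
- Set vectorize=True so that the function is called once for every 1-D slice along the core dimension. This argument can be omitted only when the function already handles N-dimensional arrays by itself, operating along the last axis.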
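To make the role of vectorize=True more concrete: the xarray documentation states that it wraps the function with numpy.vectorize. The sketch below reproduces that idea with NumPy alone. The helper slope_and_mean and the signature string "(n)->(),()" are illustrative assumptions, not names taken from xarray or from the example in this post.

```python
import numpy as np

def slope_and_mean(sample):
    # Hypothetical stand-in for foo: takes one 1-D array, returns two scalars.
    x = np.arange(len(sample))
    slope = np.polyfit(x, sample, 1)[0]  # least-squares slope of sample vs. index
    return slope, sample.mean()

# "(n)->(),()" means: consume one 1-D core dimension, produce two scalars.
# This plays the role of input_core_dims / output_core_dims in apply_ufunc.
vectorized = np.vectorize(slope_and_mean, signature="(n)->(),()")

data = np.random.rand(5, 5, 3)      # the core dimension is the last axis
slopes, means = vectorized(data)    # slope_and_mean runs once per 1-D slice

print(slopes.shape, means.shape)
```

Each output has the shape of the input with the core dimension removed, which is why the xarray results later in this post only keep the x and y dimensions.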
import numpy as np
import xarray as xr
from scipy import stats

# Define the DataArray; dimension names are given explicitly
da = xr.DataArray(
    np.random.rand(5, 5, 3),
    dims=["x", "y", "z"],
    coords={"x": [1, 2, 3, 4, 5], "y": [1, 2, 3, 4, 5], "z": [1, 2, 3]},
)

# Define the function
def foo(sample):
    # Takes a 1-D sequence and returns the linear-regression slope, R value, and p-value
    res = stats.linregress(x=range(len(sample)), y=sample)
    return res.slope, res.rvalue, res.pvalue

# Apply it with xarray.apply_ufunc
result = xr.apply_ufunc(foo, da, input_core_dims=[["z"]], output_core_dims=[(), (), ()], vectorize=True, dask="allowed")

# Inspect the result
print(result)
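As you can see, three DataArray objects are produced: slope, R value, and p-value, in that order. Every cell in this printout holds the same number, which is what happens when foo regresses a fixed global sample instead of the slice it receives; when foo uses its argument, each (x, y) cell gets its own value.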
<xarray.DataArray (x: 5, y: 5)>
array([[0.01218603, 0.01218603, 0.01218603, 0.01218603, 0.01218603],
       [0.01218603, 0.01218603, 0.01218603, 0.01218603, 0.01218603],
       [0.01218603, 0.01218603, 0.01218603, 0.01218603, 0.01218603],
       [0.01218603, 0.01218603, 0.01218603, 0.01218603, 0.01218603],
       [0.01218603, 0.01218603, 0.01218603, 0.01218603, 0.01218603]])
Coordinates:
  * x        (x) int64 1 2 3 4 5
  * y        (y) int64 1 2 3 4 5,
<xarray.DataArray (x: 5, y: 5)>
array([[0.36905434, 0.36905434, 0.36905434, 0.36905434, 0.36905434],
       [0.36905434, 0.36905434, 0.36905434, 0.36905434, 0.36905434],
       [0.36905434, 0.36905434, 0.36905434, 0.36905434, 0.36905434],
       [0.36905434, 0.36905434, 0.36905434, 0.36905434, 0.36905434],
       [0.36905434, 0.36905434, 0.36905434, 0.36905434, 0.36905434]])
Coordinates:
  * x        (x) int64 1 2 3 4 5
  * y        (y) int64 1 2 3 4 5,
<xarray.DataArray (x: 5, y: 5)>
array([[0.00131352, 0.00131352, 0.00131352, 0.00131352, 0.00131352],
       [0.00131352, 0.00131352, 0.00131352, 0.00131352, 0.00131352],
       [0.00131352, 0.00131352, 0.00131352, 0.00131352, 0.00131352],
       [0.00131352, 0.00131352, 0.00131352, 0.00131352, 0.00131352],
       [0.00131352, 0.00131352, 0.00131352, 0.00131352, 0.00131352]])
Coordinates:
  * x        (x) int64 1 2 3 4 5
  * y        (y) int64 1 2 3 4 5
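Since apply_ufunc returns a plain tuple here, it can help to unpack it and give each piece a name. The following is a minimal sketch under the same setup as above; the Dataset name trend and the variable names slope, rvalue, and pvalue are illustrative choices, and the spot check at the end simply compares one cell against a direct scipy call.

```python
import numpy as np
import xarray as xr
from scipy import stats

def foo(sample):
    # Regress one 1-D slice against its index; return slope, R value, p-value.
    res = stats.linregress(x=np.arange(len(sample)), y=sample)
    return res.slope, res.rvalue, res.pvalue

da = xr.DataArray(
    np.random.rand(5, 5, 3),
    dims=["x", "y", "z"],
    coords={"x": [1, 2, 3, 4, 5], "y": [1, 2, 3, 4, 5], "z": [1, 2, 3]},
)

# Unpack the three outputs in the order foo returns them.
slope, rvalue, pvalue = xr.apply_ufunc(
    foo,
    da,
    input_core_dims=[["z"]],
    output_core_dims=[(), (), ()],
    vectorize=True,
)

# Bundle the outputs into one Dataset with descriptive names (illustrative).
trend = xr.Dataset({"slope": slope, "rvalue": rvalue, "pvalue": pvalue})

# Spot check: one cell should match a direct call on the same slice.
expected = stats.linregress(np.arange(3), da.sel(x=1, y=1).values)
cell_matches = np.isclose(trend["slope"].sel(x=1, y=1).item(), expected.slope)
print(trend)
print(cell_matches)
```

With random input, the cells of each output differ from one another, which is a quick sanity check that the function is really seeing a different slice on each call.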
A more complex example
Now let us apply it to a Dataset:
import numpy as np
import xarray as xr
from scipy import stats

ds = xr.Dataset(
    data_vars=dict(
        var1=(["x", "y", "time"], np.random.rand(5, 5, 3)),
        var2=(["x", "y", "time"], np.random.rand(5, 5, 3)),
    ),
    coords={"x": [1, 2, 3, 4, 5], "y": [1, 2, 3, 4, 5], "time": [1, 2, 3]},
)

def linregress(da, dim):
    # Define the function applied to each 1-D slice
    def foo(sample):
        # Takes a 1-D sequence and returns the slope, R value, and p-value
        res = stats.linregress(x=range(len(sample)), y=sample)
        return res.slope, res.rvalue, res.pvalue
    # Apply it with xarray.apply_ufunc along the requested dimension
    result = xr.apply_ufunc(foo, da, input_core_dims=[[dim]], output_core_dims=[(), (), ()], vectorize=True)
    return result

# Inspect the result
res = linregress(ds, dim="time")
print(res)
The output is a tuple of three Datasets (slope, R value, p-value), each containing both variables:
(<xarray.Dataset>
Dimensions:  (x: 5, y: 5)
Coordinates:
  * x        (x) int64 1 2 3 4 5
  * y        (y) int64 1 2 3 4 5
Data variables:
    var1     (x, y) float64 -0.09859 0.1949 -0.3086 ... 0.2483 0.1627 -0.02292
    var2     (x, y) float64 -0.3107 -0.3781 -0.04227 ... -0.2307 0.05446 -0.273,
<xarray.Dataset>
Dimensions:  (x: 5, y: 5)
Coordinates:
  * x        (x) int64 1 2 3 4 5
  * y        (y) int64 1 2 3 4 5
Data variables:
    var1     (x, y) float64 -0.42 0.7406 -0.7166 ... 0.5737 0.7468 -0.2498
    var2     (x, y) float64 -0.9122 -0.9307 -0.7619 ... -0.9814 0.2078 -0.8954,
<xarray.Dataset>
Dimensions:  (x: 5, y: 5)
Coordinates:
  * x        (x) int64 1 2 3 4 5
  * y        (y) int64 1 2 3 4 5
Data variables:
    var1     (x, y) float64 0.724 0.4691 0.4914 0.3615 ... 0.611 0.4632 0.8393
    var2     (x, y) float64 0.2688 0.2383 0.4485 0.8178 ... 0.1229 0.8667 0.2937)
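The wrapper above should work for a single DataArray as well as for a whole Dataset, because apply_ufunc accepts both. The sketch below, which repeats the setup so that it runs on its own, unpacks the Dataset result and compares the slope of var1 with the result of calling the same wrapper on ds["var1"] directly. The parameter name obj and the names slope_da and same are illustrative.

```python
import numpy as np
import xarray as xr
from scipy import stats

ds = xr.Dataset(
    data_vars=dict(
        var1=(["x", "y", "time"], np.random.rand(5, 5, 3)),
        var2=(["x", "y", "time"], np.random.rand(5, 5, 3)),
    ),
    coords={"x": [1, 2, 3, 4, 5], "y": [1, 2, 3, 4, 5], "time": [1, 2, 3]},
)

def linregress(obj, dim):
    # obj may be a Dataset or a DataArray; dim is the dimension to regress along.
    def foo(sample):
        res = stats.linregress(x=np.arange(len(sample)), y=sample)
        return res.slope, res.rvalue, res.pvalue
    return xr.apply_ufunc(
        foo,
        obj,
        input_core_dims=[[dim]],
        output_core_dims=[(), (), ()],
        vectorize=True,
    )

# Dataset in, three Datasets out: one per returned statistic.
slope, rvalue, pvalue = linregress(ds, dim="time")

# DataArray in, three DataArrays out.
slope_da, rvalue_da, pvalue_da = linregress(ds["var1"], dim="time")

# The per-variable result from the Dataset call should match the DataArray call.
same = np.allclose(slope["var1"].values, slope_da.values)
print(same)
```

Each of the three returned Datasets keeps the original variable names, so slope["var1"] and pvalue["var1"] refer to the same grid cells.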