Here is a solution (including a reproducible example):
library(raster)
library(lubridate)
library(tidyverse)
# creating some fake temperature data which matches your rasterstack
# create template raster
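r <- raster(resolution = 0.25, xmn = 5.75, xmx = 15, ymn = 47.25, ymx = 55)
# add fake temperature values
# (one random layer per day; the value range is chosen to match the summary printed below)
Deu_crop <- stack(lapply(1:5479, function(i) setValues(r, round(runif(ncell(r), min = -10, max = 25)))))
# add layer names
names(Deu_crop) <- paste0('X', gsub('-', '.', ymd('1980-01-01') + days(1:5479)))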
# check rasterstack
Deu_crop
# output
#
# class : RasterStack
# dimensions : 31, 37, 1147, 5479 (nrow, ncol, ncell, nlayers)
# resolution : 0.25, 0.25 (x, y)
# extent : 5.75, 15, 47.25, 55 (xmin, xmax, ymin, ymax)
# coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0
# names : X1980.01.02, X1980.01.03, X1980.01.04, X1980.01.05, X1980.01.06, X1980.01.07, ...
# min values : -10, -10, -10, -10, -10, -10, ...
# max values : 25, 25, 25, 25, 25, 25, ...
So Deu_crop should be structurally comparable to your data, though of course with random temperature values.
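The layer names matter because the per-year grouping in this answer parses the year out of them. A minimal sketch of that parsing, using a hypothetical three-name sample in the same `XYYYY.MM.DD` pattern (the `as.Date` variant is an addition for illustration, in case full dates are needed):

```r
# Hypothetical sample of layer names in the "XYYYY.MM.DD" pattern
layer_names <- c("X1980.01.02", "X1980.12.31", "X1981.01.01")

# Year only, using the same regular expression as the solution
years <- sub("X(\\d{4}).+", "\\1", layer_names)

# Full dates, e.g. for monthly or seasonal grouping
dates <- as.Date(layer_names, format = "X%Y.%m.%d")

print(years)
print(dates)
```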
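Your shapefile was not easily reproducible, so I downloaded it and used that. As I already mentioned, some of the polygons are a bit small for extraction.
The fastest way would be to rasterize the shapefile to match your data raster, but then some polygons would not be converted at all and others might be converted to the wrong cell. So in this case it is best to call raster::extract on the shapefile directly, although it takes a while; have a coffee in the meantime.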
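Why small polygons get lost can be sketched with one made-up polygon that sits inside a single 0.25 degree cell without covering its centre. This is an illustration under assumptions, not part of the original answer: the polygon coordinates are invented, and the sketch relies on `rasterize` assigning polygon values by cell centre and on `raster::extract` keeping its default `small = TRUE`.

```r
library(raster)
library(sp)

# Template raster with the same grid as the example data
r <- raster(resolution = 0.25, xmn = 5.75, xmx = 15, ymn = 47.25, ymx = 55)
r <- setValues(r, seq_len(ncell(r)))

# Hypothetical tiny polygon in the corner of the lower-left cell;
# it does not cover that cell's centre (5.875, 47.375)
coords <- cbind(x = c(5.76, 5.80, 5.80, 5.76, 5.76),
                y = c(47.26, 47.26, 47.30, 47.30, 47.26))
tiny <- SpatialPolygons(list(Polygons(list(Polygon(coords)), ID = "tiny")),
                        proj4string = crs(r))

# Rasterizing: count the cells that received the polygon
rasterized <- rasterize(tiny, r)
n_cells <- sum(!is.na(values(rasterized)))

# Extracting directly: one list item per polygon
extracted <- raster::extract(r, tiny)

print(n_cells)
print(extracted)
```

If the rasterized layer ends up with no cells for the polygon while `extract` still returns a list item for it, that is the behaviour that makes direct extraction the safer choice here.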
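shp <- shapefile('path/to/NUTS_shapefile.shp') # placeholder path: point this at the downloaded shapefile
# coffee time
e <- raster::extract(Deu_crop, shp) # returns a list with one item per polygon
# add NUTS_ID as names to list
names(e) <- shp$NUTS_ID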
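To count the days per bin and year, I wrote a function that uses tidyverse functionality and iterate over the whole extraction list with lapply (one list item corresponds to one polygon):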
# define bins
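bins <- seq(-10, 25, length.out = 5) # assumed breaks (the original values are not shown): five thresholds spanning the data range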
myfun <- function(ix){
  gather(data.frame(e[[ix]], stringsAsFactors = FALSE), 'colname', 'temp') %>%
    group_by(colname) %>% summarise(temp = mean(temp)) %>% ungroup() %>% # spatial mean
    mutate(year = sub('X(\\d{4}).+', '\\1', colname)) %>% # get years
    select(-colname) %>% # drop colname column
    mutate(bin1 = (temp <= bins[1]) * 1) %>% # bin1
    mutate(bin2 = (temp > bins[1] & temp <= bins[2]) * 1) %>% # bin2
    mutate(bin3 = (temp > bins[2] & temp <= bins[3]) * 1) %>% # bin3
    mutate(bin4 = (temp > bins[3] & temp <= bins[4]) * 1) %>% # bin4
    mutate(bin5 = (temp > bins[4] & temp <= bins[5]) * 1) %>% # bin5
    mutate(bin6 = (temp > bins[5]) * 1) %>% select(-temp) %>% # bin6
    group_by(year) %>% summarise_all(sum) %>% mutate(NUTS_ID = names(e)[ix]) # count days per bin and year, add NUTS_ID
}
# create single dataframe
result <- do.call(rbind, lapply(seq_along(e), myfun))
A quick look at the result variable:
result
# output:
#
# # A tibble: 6,864 x 8
# year bin1 bin2 bin3 bin4 bin5 bin6 NUTS_ID
#
# 1 1980 12 85 91 92 85 0 DEA54
# 2 1981 3 64 99 113 86 0 DEA54
# 3 1982 3 80 113 86 83 0 DEA54
# 4 1983 6 84 90 85 100 0 DEA54
# 5 1984 8 90 92 86 90 0 DEA54
# 6 1985 5 86 85 95 94 0 DEA54
# 7 1986 6 74 97 108 80 0 DEA54
# 8 1987 4 82 99 94 86 0 DEA54
# 9 1988 3 89 87 91 96 0 DEA54
#10 1989 8 103 92 73 89 0 DEA54
# # ... with 6,854 more rows
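As a cross-check on the bin logic, the same kind of count can be sketched for a single daily series with base R's `cut()` and `table()`. This is a swapped-in technique for illustration, not the method used in the answer; the dates, temperatures, and breaks are made up, and the breaks are assumed to be five evenly spaced thresholds. `cut()` uses right-closed intervals by default, which matches the `temp > a & temp <= b` conditions.

```r
set.seed(42)

# Hypothetical daily series covering 1980-01-02 to 1981-12-31 (730 days)
dates <- seq(as.Date("1980-01-02"), by = "day", length.out = 730)
temp <- runif(length(dates), min = -10, max = 25)

# Five assumed thresholds; -Inf and Inf add the open-ended first and last bins
breaks <- c(-Inf, seq(-10, 25, length.out = 5), Inf)

# Right-closed intervals (a, b], labelled bin1 ... bin6
bin <- cut(temp, breaks = breaks, labels = paste0("bin", 1:6))

# Days per bin and year
counts <- table(year = format(dates, "%Y"), bin = bin)
print(counts)
```

Each row of `counts` should sum to the number of days available in that year, which is a quick sanity check for any binning scheme of this kind.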
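Update:
To handle the bins, I first compute them from the minimum and maximum of the whole dataset, and then add them to each polygon's extracted values with a new function, createBins. This replaces the myfun part of the original solution.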
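# new function
createBins <- function(df, bins_mat){
  for (i in seq_len(nrow(bins_mat))){
    bin <- paste0('bin', bins_mat[i,1]) # column name for this bin (assumed naming: bin1, bin2, ...)
    # the first bin is closed on both sides so that the overall minimum is counted
    if (i == 1) df <- df %>% mutate(!!bin := (temp >= bins_mat[i,2] & temp <= bins_mat[i,3])*1)
    else df <- df %>% mutate(!!bin := (temp > bins_mat[i,2] & temp <= bins_mat[i,3])*1)
  }
  return(df)
}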
# new version of myfun
myfun2 <- function(ix){
  gather(data.frame(e[[ix]], stringsAsFactors = FALSE), 'colname', 'temp') %>%
    group_by(colname) %>% summarise(temp = mean(temp)) %>% ungroup() %>% # spatial mean
    mutate(year = sub('X(\\d{4}).+', '\\1', colname)) %>% # get years
    select(-colname) %>% # drop colname column
    createBins(., bins_mat) %>% select(-temp) %>% # add one 0/1 column per bin
    group_by(year) %>% summarise_all(sum) %>% mutate(NUTS_ID = names(e)[ix]) # count days per bin and year, add NUTS_ID
}
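# 11 values to create 10 interval bins
# (spanning the overall minimum and maximum of the data, as described above)
bins <- seq(min(minValue(Deu_crop)), max(maxValue(Deu_crop)), length.out = 11)
# create a bin matrix (number, bin_minimum, bin_maximum) for later function
bins_mat <- cbind(1:10, bins[1:10], bins[2:11])
# create new result
result <- do.call(rbind, lapply(seq_along(e), myfun2))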
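The first bin is treated differently (`>=` instead of `>`) because the breaks start exactly at the overall minimum, which would otherwise fall into no bin. A small base R sketch of that edge case with `cut()`, using made-up temperatures; `include.lowest = TRUE` plays the role of the `>=` in createBins:

```r
# Hypothetical temperatures, including the overall minimum and maximum
temp <- c(-10, -3, 0, 7.5, 25)

# 11 break values give 10 interval bins between min and max
bins <- seq(min(temp), max(temp), length.out = 11)

# Without include.lowest the minimum lies outside the first interval (a, b]
open_left <- cut(temp, breaks = bins)

# With include.lowest the first interval becomes [a, b]
bin <- cut(temp, breaks = bins, include.lowest = TRUE,
           labels = paste0("bin", 1:10))

print(is.na(open_left))
print(as.character(bin))
```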