Python STARFM Spatiotemporal Fusion

Revised 2023.07.18.

I have recently been trying to do spatiotemporal fusion of ET (evapotranspiration); after a lot of experimenting and reading, I am writing this post to record the process. STARFM and ESTARFM are the most widely used algorithms. According to the original papers, ESTARFM works better in regions with high spatial heterogeneity, but it requires two fine/coarse image pairs as input. STARFM has lower input requirements: one fine/coarse pair plus one more coarse image on the prediction date is enough to run the fusion, which makes it better suited to cloudy, rainy study areas where clear scenes are hard to obtain (some study areas may yield only a handful of clear scenes per year).
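For orientation, the core STARFM prediction (equation (2) in the original paper, as I understand it) estimates the fine-resolution value at the central pixel of a moving window as a weighted sum over the window's pixels:

L(x_{w/2}, y_{w/2}, t_0) = \sum_{i} \sum_{j} \sum_{k} W_{ijk} \times \left( M(x_i, y_j, t_0) + L(x_i, y_j, t_k) - M(x_i, y_j, t_k) \right)

where L and M are the fine- and coarse-resolution observations, t_k is a base date with a fine/coarse pair, t_0 is the prediction date, and W_{ijk} are the normalized weights. (Note that the code below uses the opposite naming: t0 is the pair date and t1 the prediction date.)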

1. My study area is in Jiangxi Province, where cloudy and rainy weather is frequent, so I considered the STARFM algorithm first;

2. Feng Gao, the author of the original paper, provides a C implementation (Download : USDA ARS), but according to the documentation it was tested on Linux, and more to the point I barely use C: aside from a bit of C++ in college, I cannot really work in C. I also wanted to build some things around the algorithm myself, such as automatic pairing and batch processing for long time series (a possible pairing scheme is sketched below);
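As an aside, this is the kind of automatic pairing I have in mind: a minimal sketch, assuming fine and coarse GeoTIFFs live in two folders with a YYYYMMDD date in each file name (all paths and names here are hypothetical):

import os
import re

# List GeoTIFF scenes in a folder, keyed by the YYYYMMDD date in the file name
def list_scenes(folder):
    scenes = {}
    for name in sorted(os.listdir(folder)):
        match = re.search(r'(\d{8})', name)
        if match and name.endswith('.tif'):
            scenes[match.group(1)] = os.path.join(folder, name)
    return scenes

fine_scenes   = list_scenes(r'path/to/fine')    # hypothetical folders
coarse_scenes = list_scenes(r'path/to/coarse')

# Pair each clear fine scene with the same-date coarse scene (t0),
# then predict at every other coarse date (t1)
for t0_date in sorted(set(fine_scenes) & set(coarse_scenes)):
    for t1_date in sorted(coarse_scenes):
        if t1_date != t0_date:
            print(f'Fuse: pair {t0_date} -> predict {t1_date}')
            # feed fine_scenes[t0_date], coarse_scenes[t0_date] and
            # coarse_scenes[t1_date] into the STARFM script below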

3. Being lazy, I searched for a Python version instead and found starfm4py on GitHub, written by Nikolina Mileva, then at the European Space Agency in Italy, who also formally published a paper about it ("New tool for spatio-temporal image fusion in remote sensing: a case study approach using Sentinel-2 and Sentinel-3 data") on fusing Sentinel-2 with Sentinel-3. The code looked well organized, so I started studying it. To speed up processing the author uses the Dask package, and because STARFM operates inside a moving window, the image edges are padded according to the input search window to handle the boundary. The structure is very clear: the spectral, temporal, and spatial weights, the similar-pixel search and its further filtering, the combined weight, and the inverse-distance weight normalization are each written as separate functions. It also explicitly handles the two special cases given in the paper (coarse reflectance unchanged between the two dates; fine and coarse reflectance identical at t_k). The code quality is high.

After working through all of it, however, the filtering function (which further filters neighboring similar pixels by spectral and temporal distance) confused me. According to the original paper, and to Mileva's own paper, the spectral and temporal distances should be considered together at this filtering step, yet in the code Mileva uses a temp variable (which records whether more than one input pair was supplied) to decide whether the temporal distance is applied at all. After repeatedly re-reading the relevant parts of both papers, I still think this may be a mistake: regardless of how many input pairs there are, spectral and temporal distance should both be applied; the only difference with multiple pairs is that a maximum is taken across them, which is why equations (13)-(16) in the original paper carry a "max" (see the filtering test sketched after this item). Perhaps I am a detail-obsessed perfectionist, but I looked up an email address at Mileva's current institution and wrote to ask about this; several days on, no reply. So I set the question aside and tried running some imagery to see the results.

Then I hit another problem, though it may simply be my own inexperience with the settings. Besides Dask, the code also includes a hand-written chunking function that splits the image into blocks which, again because of the moving window, overlap each other, and it stores them in .zarr format to work with Dask. Here is where the problem arose: according to the original paper, for highly heterogeneous regions the search window is 6000 m with A = 250 m, which at 30 m resolution means a 200-pixel window. Setting the search window parameter to 200 for a test image of roughly 1000 x 1400 pixels, the chunked .zarr store ballooned to over 40 GB, and my desktop raised a memory error. I moved to a 128 GB server and still got memory errors. I had assumed a few-MB image would be no problem for the server, so I had not bothered adjusting the manual chunking; that turned out to be wrong. Only after setting the manual chunk size to 79 could it run at all, and then slowly, about half an hour per block. It may well be my settings, since I have hardly used Dask, but it was unbearable, so I decided to modify her code by hand;
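For the record, my reading of the candidate filtering in equations (13)-(16) of the original paper is that a spectrally similar neighbor (x_i, y_j) is kept only if both of the following hold, so both distances apply even with a single input pair, and the max over k only matters when there are multiple pairs:

S_{ijk} < \max_k \left| L(x_{w/2}, y_{w/2}, t_k) - M(x_{w/2}, y_{w/2}, t_k) \right| + \sigma_s

T_{ijk} < \max_k \left| M(x_{w/2}, y_{w/2}, t_k) - M(x_{w/2}, y_{w/2}, t_0) \right| + \sigma_t

where S_{ijk} = |L(x_i, y_j, t_k) - M(x_i, y_j, t_k)| and T_{ijk} = |M(x_i, y_j, t_k) - M(x_i, y_j, t_0)| are the spectral and temporal differences, and \sigma_s, \sigma_t are the combined spectral and temporal uncertainties (specUncertainty and tempUncertainty in the code below).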

4. The modifications I made on top of Nikolina Mileva's code are as follows (the combined weight is sketched after this list):

(1) I currently only use a single input pair, so I removed the temp variable and do not handle multiple input pairs for now (adding that back does not look too hard, but I had no energy left to fiddle with it); correspondingly, the filtering function now considers spectral and temporal distance together;

(2) In the combined weight (comb_distance), Mileva deviates slightly from the original paper: to avoid division by zero she adds 1 to the spectral, temporal, and spatial distances and then takes reciprocals, but inside comb_distance she adds 1 to the spectral and temporal distances again when applying the log transform. I removed the redundant +1 inside comb_distance;

(3) For the two special cases (coarse reflectance unchanged between the two dates; fine and coarse reflectance identical at t_k), I revised the weighting function according to my understanding of the original paper;

(4) Instead of chunking, I traverse the moving windows with two nested for loops and attached a progress bar to the row loop;

(5) Mileva computes the spectral, temporal, and spatial distances inside each moving window; I compute them up front for the whole scene and then slice them by the search window inside the loop.
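For item (2) and the weighting step, the combined distance and normalized weight implemented below take the following form (this describes my code rather than claiming to reproduce the paper exactly; the +1 offsets come from defining each distance as |difference| + 1):

C_{ijk} = S_{ijk} \cdot T_{ijk} \cdot D_{ij}, \qquad D_{ij} = 1 + d_{ij} / A

W_{ijk} = \frac{1 / C_{ijk}}{\sum_{ijk} 1 / C_{ijk}}

where d_{ij} is the Euclidean distance from pixel (i, j) to the window center, A is the spatial impact factor (spatImp), and, when logWeight is enabled, S and T are replaced by ln(S) and ln(T) before multiplication.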

5. Note: both my fine- and coarse-resolution images were geometrically corrected, resampled to 30 m, grid-aligned, and clipped to the same extent, with the background value set to -99. On the uncertainty parameters: in the original paper, Feng Gao used visible/NIR inputs of 0.002/0.005 and 0.01/0.015 for Landsat and MODIS reflectance in different test regions; Nikolina Mileva, testing with Sentinel-2 and Sentinel-3 data, set the uncertainty to 0.03 (corresponding to the 3% absolute radiometric accuracy of the MSI sensor). Both papers fuse reflectance, where uncertainty is comparatively low; for derived products such as ET the uncertainty should in theory be much higher (I seem to recall a paper putting MODIS ET uncertainty at around 20%), so a suitable value should be grounded in the corresponding literature. Here I kept Mileva's value of 0.03. Testing also showed that the search window size has a large effect on the result, so it should be tuned for your own study area. My modified code is below; if you spot problems, corrections are welcome:

# -*- coding: utf-8 -*-
"""
Created on 2023.7.18
Adapted from starfm4py (Nikolina Mileva; https://github.com/nmileva/starfm4py)
@author: Mao Huihui
"""
import os
import sys
import time
import numpy as np
from tqdm import tqdm
from osgeo import gdal


## Parameters
# Search window size in pixels (use an odd number)
windowSize = 51

# Set to True if you want to decrease the sensitivity to the spectral distance
logWeight = False

# The spatial impact factor is a constant defining the relative importance of spatial distance (in meters)
# Take a smaller value of the spatial impact factor for heterogeneous regions (e.g. A = 150 m)
spatImp = 750

# Increasing the number of classes limits the number of similar pixels
numberClass = 5

# Set the uncertainty value for the fine resolution sensor
# 1) A comprehensive evaluation of two MODIS evapotranspiration products over the conterminous United States:
# Using point and gridded FLUXNET and water balance ET (25% on basin scale; https://doi.org/10.1016/j.rse.2013.07.013)
uncertaintyFineRes = 0.03

# Set the uncertainty value for the coarse resolution sensor
# 1) A comprehensive evaluation of two MODIS evapotranspiration products over the conterminous United States:
# Using point and gridded FLUXNET and water balance ET (25% on basin scale; https://doi.org/10.1016/j.rse.2013.07.013)
uncertaintyCoarseRes = 0.03

# Other global variables
mid_idx = windowSize // 2  # Central Pixel
specUncertainty = np.sqrt(uncertaintyFineRes**2 + uncertaintyCoarseRes**2)
tempUncertainty = np.sqrt(2 * uncertaintyCoarseRes**2)


#  Raster data access
class RasterTiff:
    gdal.AllRegister()  # Register all GDAL drivers

    # Read a raster file
    def read_img(self, filename):
        dataset = gdal.Open(filename)    # Open the file

        im_width = dataset.RasterXSize   # Number of columns
        im_height = dataset.RasterYSize  # Number of rows
        im_bands = dataset.RasterCount   # Number of bands

        im_geotrans = dataset.GetGeoTransform()  # Affine transform: upper-left corner coordinates and pixel size
        im_proj = dataset.GetProjection()  # Map projection as a WKT string
        im_data = dataset.ReadAsArray(0, 0, im_width, im_height)  # Read the raster into a NumPy array

        im_nodata = dataset.GetRasterBand(1).GetNoDataValue()

        del dataset
        return im_height, im_width, im_bands, im_geotrans, im_proj, im_nodata, im_data

    # Write a raster file
    def write_img(self, filename, im_proj, im_geotrans, im_nodata, im_data):
        # GDAL data types include: gdal.GDT_Byte, gdal.GDT_UInt16, gdal.GDT_UInt32,
        # gdal.GDT_Int32, gdal.GDT_Float32, gdal.GDT_Float64

        # Choose the GDAL data type from the array dtype
        if 'int8' in im_data.dtype.name:
            datatype = gdal.GDT_Byte
        elif 'int16' in im_data.dtype.name:
            datatype = gdal.GDT_UInt16
        else:
            datatype = gdal.GDT_Float32

        # Determine the array dimensions
        if len(im_data.shape) == 3:
            im_bands, im_height, im_width = im_data.shape
        else:
            im_bands, (im_height, im_width) = 1, im_data.shape

        # Create the output file (the data type argument is required)
        driver = gdal.GetDriverByName('GTiff')
        dataset = driver.Create(filename, im_width, im_height, im_bands, datatype)

        dataset.SetGeoTransform(im_geotrans)  # Write the affine transform parameters
        dataset.SetProjection(im_proj)        # Write the projection

        if im_bands == 1:
            dataset.GetRasterBand(1).SetNoDataValue(im_nodata)  # NoData value
            dataset.GetRasterBand(1).WriteArray(im_data)        # Write the array
        else:
            for i in range(im_bands):
                dataset.GetRasterBand(i + 1).SetNoDataValue(im_nodata)  # NoData value
                dataset.GetRasterBand(i + 1).WriteArray(im_data[i])

        del dataset


# Calculate the spectral distance
def spectral_distance(fine_img_t0_win, coarse_img_t0_win):
    spec_diff = fine_img_t0_win - coarse_img_t0_win
    spec_dist = np.abs(spec_diff) + 1.0
    # print('Spectral Distance:', spec_dist.shape)
    return spec_diff, spec_dist


# Calculate the temporal distance
def temporal_distance(coarse_img_t0_win, coarse_img_t1_win):
    temp_diff = coarse_img_t1_win - coarse_img_t0_win
    temp_dist = np.abs(temp_diff) + 1.0
    # print('Temporal Distance:', temp_dist.shape)
    return temp_diff, temp_dist


# Calculate the spatial distance to the central pixel
def spatial_distance():
    rows, cols = np.mgrid[0:windowSize, 0:windowSize]
    spat_dist = np.sqrt((rows - mid_idx)**2 + (cols - mid_idx)**2)  # Euclidean distance in pixels
    rel_spat_dist = spat_dist / spatImp + 1.0  # Relative spatial distance (D = 1 + d/A)
    # print('Spatial Distance:', rel_spat_dist.shape)
    return rel_spat_dist


# The standard deviation is computed over the whole scene, not within the moving window
# Define the threshold used in the dynamic classification process
def similarity_threshold(fine_image_t0):
    fine_image_t0 = np.where(fine_image_t0 == -99, np.nan, fine_image_t0)  # Mask the background value
    st_dev = np.nanstd(fine_image_t0)
    sim_threshold = st_dev * 2 / numberClass
    # print('Similarity Threshold:', sim_threshold)
    return sim_threshold


# Define the spectrally similar pixels within a moving window
def similarity_pixels(fine_img_t0_win, sim_thre):
    # possible to implement as sparse matrix
    similar_pixels = np.where(np.abs(fine_img_t0_win - fine_img_t0_win[mid_idx, mid_idx]) <= sim_thre, 1, 0)
    # print('Similarity Pixels:', similar_pixels.shape)
    return similar_pixels


# Apply filtering on similar pixels
def filtering(fine_img_t0_win, spec_dist, temp_dist, spec_diff, temp_diff, sim_thre):
    similar_pixels = similarity_pixels(fine_img_t0_win, sim_thre)
    max_spec_dist  = np.abs(spec_diff)[mid_idx, mid_idx] + specUncertainty + 1
    max_temp_dist  = np.abs(temp_diff)[mid_idx, mid_idx] + tempUncertainty + 1
    spec_filter = np.where(spec_dist < max_spec_dist, 1, 0)
    temp_filter = np.where(temp_dist < max_temp_dist, 1, 0)

    st_filter = spec_filter * temp_filter
    similar_pixels_filtered = similar_pixels * st_filter
    # print('Filtering:', similar_pixels_filtered.shape)
    return similar_pixels_filtered


# Calculate the combined distance
def comb_distance(spec_dist, temp_dist, spat_dist):
    if logWeight:
        # print('logWeight')
        spec_dist = np.log(spec_dist)
        temp_dist = np.log(temp_dist)

    comb_dist = spec_dist * temp_dist * spat_dist
    # print('Comb Distance:', comb_dist.shape)
    return comb_dist


# Calculate weights
def weighting(spec_dist, temp_dist, comb_dist, similar_pixels_filtered):
    # Calculate weights only for the filtered spectrally similar pixels
    weights_filt = comb_dist * similar_pixels_filtered

    # Normalize the weights
    reverse_dist_weights = np.divide(1, weights_filt,
                                     out=np.zeros_like(weights_filt),
                                     where=weights_filt != 0)  # Guard against division by zero
    norm_weights = np.divide(reverse_dist_weights, np.sum(reverse_dist_weights),
                             out=np.zeros_like(reverse_dist_weights),
                             where=(np.sum(reverse_dist_weights) != 0))  # Guard against division by zero

    # Assign the maximum weight (1) to the central pixel when the spectral or temporal
    # difference is zero (distance == 1 because of the +1 offset)
    if spec_dist[mid_idx, mid_idx] == 1 or temp_dist[mid_idx, mid_idx] == 1:
        norm_weights = np.zeros(shape=comb_dist.shape)
        norm_weights[mid_idx, mid_idx] = 1

    # print('Weighting:', norm_weights.shape)
    return norm_weights


# Main Function
if __name__ == '__main__':
    stime = time.time()
    raster = RasterTiff()

    # Path
    coarse_image_path = r'********************************************'
    fine_image_path   = r'********************************************'
    prediction_output_path = r'********************************************'

    # Image Name
    coarse_image_t0_name = '********************************************'
    coarse_image_t1_name = '********************************************'
    fine_image_t0_name   = '********************************************'

    # Base Image
    # im_height, im_width, im_bands, im_geotrans, im_proj, im_nodata, im_data
    base_image = raster.read_img(os.path.join(fine_image_path, fine_image_t0_name))
    im_nodata = base_image[-2]  # -99
    print('im_nodata:', im_nodata)
    image_shape = base_image[-1].shape
    print('image_shape:', image_shape)
    print('\n')

    # Read image
    coarse_image_t0 = raster.read_img(os.path.join(coarse_image_path, coarse_image_t0_name))[-1]
    coarse_image_t1 = raster.read_img(os.path.join(coarse_image_path, coarse_image_t1_name))[-1]
    fine_image_t0   = raster.read_img(os.path.join(fine_image_path, fine_image_t0_name))[-1]

    # Pad the edges with the background value (-99)
    coarse_image_t0_pad = np.pad(coarse_image_t0, windowSize // 2, mode='constant', constant_values=-99)
    coarse_image_t1_pad = np.pad(coarse_image_t1, windowSize // 2, mode='constant', constant_values=-99)
    fine_image_t0_pad   = np.pad(fine_image_t0, windowSize // 2, mode='constant', constant_values=-99)

    # Spectral Distance
    spec = spectral_distance(fine_image_t0_pad, coarse_image_t0_pad)
    spec_diff = spec[0]
    spec_dist = spec[1]

    # Temporal Distance
    temp = temporal_distance(coarse_image_t0_pad, coarse_image_t1_pad)
    temp_diff = temp[0]
    temp_dist = temp[1]

    # Spatial Distance
    spat_dist = spatial_distance()
    print('\n')

    # Similarity Threshold
    sim_threshold = similarity_threshold(fine_image_t0)

    # Perform the prediction with STARFM (One Pair)
    print('Processing...')
    STARFM_prediction = np.zeros(shape=image_shape) + im_nodata
    # Iterate over the rows of the (unpadded) image; tqdm adds a progress bar to the row loop
    for row_idx in tqdm(range(windowSize // 2, image_shape[0] + windowSize // 2)):
        for col_idx in range(windowSize // 2, image_shape[1] + windowSize // 2):
            # Searching window
            coarse_image_t0_window = coarse_image_t0_pad[row_idx - windowSize // 2:row_idx + windowSize // 2 + 1,
                                                         col_idx - windowSize // 2:col_idx + windowSize // 2 + 1]
            coarse_image_t1_window = coarse_image_t1_pad[row_idx - windowSize // 2:row_idx + windowSize // 2 + 1,
                                                         col_idx - windowSize // 2:col_idx + windowSize // 2 + 1]
            fine_image_t0_window = fine_image_t0_pad[row_idx - windowSize // 2:row_idx + windowSize // 2 + 1,
                                                     col_idx - windowSize // 2:col_idx + windowSize // 2 + 1]

            # Spectral Distance Window
            spec_dist_window = spec_dist[row_idx - windowSize // 2:row_idx + windowSize // 2 + 1,
                                         col_idx - windowSize // 2:col_idx + windowSize // 2 + 1]
            spec_diff_window = spec_diff[row_idx - windowSize // 2:row_idx + windowSize // 2 + 1,
                                         col_idx - windowSize // 2:col_idx + windowSize // 2 + 1]

            # Temporal Distance Window
            temp_dist_window = temp_dist[row_idx - windowSize // 2:row_idx + windowSize // 2 + 1,
                                         col_idx - windowSize // 2:col_idx + windowSize // 2 + 1]
            temp_diff_window = temp_diff[row_idx - windowSize // 2:row_idx + windowSize // 2 + 1,
                                         col_idx - windowSize // 2:col_idx + windowSize // 2 + 1]

            # Spectrally Similar Pixels
            # 1. Dynamic Classification Process (threshold)
            # 2. Sample Filtering (spectral and temporal)
            # print('1) Spectrally Similar Pixels:')
            similar_pixels = filtering(fine_image_t0_window, spec_dist_window, temp_dist_window, spec_diff_window, temp_diff_window, sim_threshold)

            # Weight
            # print('2) Weight:')
            comb_dist = comb_distance(spec_dist_window, temp_dist_window, spat_dist)
            weights = weighting(spec_dist_window, temp_dist_window, comb_dist, similar_pixels)

            # Prediction
            # print('3) Prediction:')
            pred_image_window = fine_image_t0_window + temp_diff_window
            pred_image_window[pred_image_window < 0] = 0  # Clip negative predictions to 0
            weighted_pred_window = np.sum(pred_image_window * weights)
            STARFM_prediction[row_idx - windowSize // 2, col_idx - windowSize // 2] = weighted_pred_window
            # print('\n')

    # Write STARFM prediction image
    # filename, im_proj, im_geotrans, im_nodata, im_data
    raster.write_img(os.path.join(prediction_output_path,
                     f'********************************************_{windowSize}_{logWeight}.tif'),
                     base_image[4], base_image[3], -99, STARFM_prediction)

    etime = time.time()
    print('Processing Time: {ptime} min!'.format(ptime=round((etime - stime) / 60.0, 2)))
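
As a quick sanity check of the special-case handling (coarse image unchanged between the two dates, so the prediction should reduce to the fine image at t0), the functions above can be exercised on a synthetic window. This is a minimal sketch, assuming the script above is saved as starfm_simple.py (hypothetical name):

import numpy as np
import starfm_simple as sf  # hypothetical module name for the script above

w = sf.windowSize
np.random.seed(0)
fine_t0   = np.random.rand(w, w)  # synthetic fine-resolution window
coarse_t0 = fine_t0 + 0.01        # coarse image with a small constant bias
coarse_t1 = coarse_t0.copy()      # coarse image unchanged between t0 and t1

spec_diff, spec_dist = sf.spectral_distance(fine_t0, coarse_t0)
temp_diff, temp_dist = sf.temporal_distance(coarse_t0, coarse_t1)
spat_dist = sf.spatial_distance()
sim_thre  = sf.similarity_threshold(fine_t0)

similar = sf.filtering(fine_t0, spec_dist, temp_dist, spec_diff, temp_diff, sim_thre)
comb    = sf.comb_distance(spec_dist, temp_dist, spat_dist)
weights = sf.weighting(spec_dist, temp_dist, comb, similar)

# Special case: all weight should collapse onto the central pixel,
# so the prediction equals the fine image value at t0
pred = np.sum((fine_t0 + temp_diff) * weights)
assert np.isclose(pred, fine_t0[sf.mid_idx, sf.mid_idx])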
