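Description
Merge multiple NetCDF files in chronological order with a loop, changing the attributes of the source data as little as possible.
Approach
Split the variables into two groups, those with a time dimension and those without, and handle each group separately.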
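As a rough illustration of this split, the sketch below builds a tiny demo file and sorts its variables into the two groups with ncinfo. The file and the variable names (xt_ocean, temp) are made up for the example; only the 'Time' dimension name matches the script.

```matlab
% Build a tiny demo file so the sketch runs on its own (hypothetical names).
fname = [tempname, '.nc'];
cid = netcdf.create(fname, 'CLOBBER');
dx = netcdf.defDim(cid, 'xt_ocean', 3);
dt = netcdf.defDim(cid, 'Time', 2);
netcdf.defVar(cid, 'xt_ocean', 'NC_DOUBLE', dx);
netcdf.defVar(cid, 'temp', 'NC_FLOAT', [dx dt]);
netcdf.defVar(cid, 'Time', 'NC_DOUBLE', dt);
netcdf.close(cid);

dimname_extend = 'Time';               % name of the dimension to extend
info = ncinfo(fname);

% Flag every variable that uses the time dimension.
has_time = false(1, numel(info.Variables));
for i = 1:numel(info.Variables)
    dims = info.Variables(i).Dimensions;   % struct array, empty for scalars
    if ~isempty(dims)
        has_time(i) = any(strcmp({dims.Name}, dimname_extend));
    end
end

time_vars  = {info.Variables(has_time).Name};    % to be concatenated
const_vars = {info.Variables(~has_time).Name};   % copied once
delete(fname);
```

Comparing dimension names rather than variable names means the time coordinate variable itself lands in the first group along with the data variables.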
Acknowledgements
https://blog.csdn.net/schumacher2016/article/details/82852700
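The code below is modified from that blogger's program. This is my first post and I am not sure of the proper way to give credit, so: thank you to the author for sharing.
Code 1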
%%
% % description: merge multiple netcdf files for a specific domain
%
% % usage:
% % 1. filenumber is the number of netcdf files to be processed.
% % 2. all the details of the data are kept as in the original files.
% % 3. only the time dimension is extended, so the other dimensions
% % of the merged files have to be the same.
% % 4. variables without a time dimension are assumed to be exactly the
% % same in all the merged files.
%
% % author:
% % huang xue zhi, dalian university of technology
% % Ruth Shaw, Ocean University of China
% % revision history
% % 2018-09-25 first version.
% % 2018-10-05 second version.
%
% %%
%% read data
clc;
clear;
datadir = 'F:\Australia\data\input\';
addpath(datadir)
filelist = dir([datadir,'ocean_temp','*.nc']);
filenumber = size(filelist,1);
%% create the merged netcdf file to store the result.
cid=netcdf.create('ocean_temp_2016_2018.nc','clobber');
The global attributes can be set freely here; add whatever information you actually care about.
%define global attributes
netcdf.putAtt(cid,netcdf.getConstant('NC_GLOBAL'),'title','BRAN_2015_alpha');
netcdf.putAtt(cid,netcdf.getConstant('NC_GLOBAL'),'geospatial_lat_min','-35 degrees');
netcdf.putAtt(cid,netcdf.getConstant('NC_GLOBAL'),'geospatial_lat_max','-29 degrees');
netcdf.putAtt(cid,netcdf.getConstant('NC_GLOBAL'),'geospatial_lon_min','113 degrees');
netcdf.putAtt(cid,netcdf.getConstant('NC_GLOBAL'),'geospatial_lon_max','116 degrees');
netcdf.putAtt(cid,netcdf.getConstant('NC_GLOBAL'),'NCO','4.6.2');
ncid = netcdf.open([datadir,filelist(1).name], 'NC_NOWRITE');
[ndims,nvars,ngatts,unlimdimid] = netcdf.inq(ncid);
datainfo = ncinfo([datadir,filelist(1).name]);
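The merge assumes that dir returns the files in chronological order, which normally holds only when the file names sort that way. One possible safeguard is to sort the list by the first time value in each file. The sketch below assumes that the time variable has the same name as the time dimension ('Time') and that all files use the same time units; the demo file names are made up.

```matlab
% Two demo files whose names sort in the opposite order to their times.
tmpdir = tempname; mkdir(tmpdir);
names   = {'ocean_temp_b.nc', 'ocean_temp_a.nc'};   % hypothetical names
first_t = [10 20];       % 'ocean_temp_a.nc' starts later despite its name
for k = 1:2
    fid = netcdf.create(fullfile(tmpdir, names{k}), 'CLOBBER');
    dt  = netcdf.defDim(fid, 'Time', 2);
    vt  = netcdf.defVar(fid, 'Time', 'NC_DOUBLE', dt);
    netcdf.endDef(fid);
    netcdf.putVar(fid, vt, [first_t(k); first_t(k) + 1]);
    netcdf.close(fid);
end

% Read the first time value of every file and sort the list by it.
filelist = dir(fullfile(tmpdir, 'ocean_temp*.nc'));
t_start = zeros(numel(filelist), 1);
for k = 1:numel(filelist)
    fid = netcdf.open(fullfile(tmpdir, filelist(k).name), 'NC_NOWRITE');
    % start = 0 (zero-based), count = 1: only the first time value is read
    t_start(k) = netcdf.getVar(fid, netcdf.inqVarID(fid, 'Time'), 0, 1);
    netcdf.close(fid);
end
[~, order] = sort(t_start);
filelist = filelist(order);
sorted_names = {filelist.name};
rmdir(tmpdir, 's');
```

In the script itself, the same sort could be applied to filelist right after the dir call, before any data are read.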
The name of the time dimension may differ between datasets, so change dimname_extend here to match your data.
% choose one dimension to extend, such as time
dimname_extend = 'Time';
% get the total length of the time dimension over all files
time_dim = 0;
for i = 1:filenumber
    fid = netcdf.open([datadir,filelist(i).name], 'NC_NOWRITE');
    [~,dimlen] = netcdf.inqDim(fid,netcdf.inqDimID(fid,dimname_extend));
    time_dim = time_dim + dimlen;
    netcdf.close(fid); % close each input file so open handles do not pile up
end
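The time dimension is defined with a fixed length, namely the total number of time steps accumulated over all files in the loop above. Defining it as netcdf.getConstant('NC_UNLIMITED') instead would mean checking the dimension order of every variable, because in the classic NetCDF format the unlimited dimension must be the slowest-varying one, which is the last dimension in MATLAB's ordering (the first in ncdump output). That is more trouble, so it is not done here.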
% define the dimensions of the merged file
for i = 1 : ndims
    dimname = datainfo.Dimensions(1,i).Name;
    if strcmp(dimname_extend, dimname)
        % alternative (unlimited): netcdf.defDim(cid,dimname,netcdf.getConstant('NC_UNLIMITED'))
        dim.(dimname) = netcdf.defDim(cid,dimname,time_dim);
    else
        dim.(dimname) = netcdf.defDim(cid,dimname,datainfo.Dimensions(1,i).Length);
    end
end
% end define the dimension
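For comparison, here is a minimal sketch of the unlimited-dimension alternative mentioned in the note before this step. It is not what the script does. It assumes a classic-format file in which the record dimension is listed last in MATLAB's dimension order; the names and sizes are made up.

```matlab
fname = [tempname, '.nc'];
cid = netcdf.create(fname, 'CLOBBER');
dx  = netcdf.defDim(cid, 'xt_ocean', 3);
dt  = netcdf.defDim(cid, 'Time', netcdf.getConstant('NC_UNLIMITED'));
vid = netcdf.defVar(cid, 'temp', 'NC_DOUBLE', [dx dt]);   % record dim last
netcdf.endDef(cid);

% Append two chunks along Time; start indices are zero-based.
chunk1 = reshape(1:6, 3, 2);      % 2 time steps
chunk2 = reshape(7:9, 3, 1);      % 1 time step
netcdf.putVar(cid, vid, [0 0], [3 2], chunk1);
netcdf.putVar(cid, vid, [0 2], [3 1], chunk2);
[~, nt] = netcdf.inqDim(cid, dt); % current length of the record dimension
netcdf.close(cid);

merged = ncread(fname, 'temp');
delete(fname);
```

With this layout the total number of time steps does not need to be known in advance, at the cost of making sure every time-dependent variable has the time dimension last.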
%% define the variables and copy their attributes
for i = 1:nvars
    % map each dimension name of the variable to its id in the merged file
    var_dim = [];
    for j = 1:size(datainfo.Variables(1,i).Size,2)
        var_dim(j) = dim.(datainfo.Variables(1,i).Dimensions(1,j).Name);
    end
    % ncinfo reports MATLAB class names; netcdf.defVar expects netcdf type names
    datatype = datainfo.Variables(1,i).Datatype;
    switch datatype
        case 'single', datatype = 'NC_FLOAT';
        case 'double', datatype = 'NC_DOUBLE';
        case 'int8',   datatype = 'NC_BYTE';
        case 'int16',  datatype = 'NC_SHORT';
        case 'int32',  datatype = 'NC_INT';
        case 'char',   datatype = 'NC_CHAR';
    end
    varid(i) = netcdf.defVar(cid,datainfo.Variables(1,i).Name,datatype,var_dim);
    % copy the variable attributes unchanged
    attrs = datainfo.Variables(1,i).Attributes;
    for j = 1:numel(attrs)
        netcdf.putAtt(cid,varid(i),attrs(j).Name,attrs(j).Value);
    end
end
netcdf.endDef(cid);
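Variables that do not need extending are written as they are, and variables with a time dimension are extended file by file in a loop. The drawback is that every extended variable requires opening all of the files again: with the 29 files here and 6 variables to extend, the files are opened 29*7 times. This needs improving later.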
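%% write the data
for i = 1:nvars
    varname = datainfo.Variables(1,i).Name;
    dims = datainfo.Variables(1,i).Dimensions;
    % position of the time dimension in this variable, empty if it has none
    time_pos = [];
    if ~isempty(dims)
        time_pos = find(strcmp({dims.Name},dimname_extend));
    end
    if ~isempty(time_pos)
        % concatenate this variable from every file along its time dimension
        var_value = [];
        for j = 1:filenumber
            fid = netcdf.open([datadir,filelist(j).name], 'NC_NOWRITE');
            chunk = netcdf.getVar(fid,netcdf.inqVarID(fid,varname));
            netcdf.close(fid);
            var_value = cat(time_pos,var_value,chunk);
        end
    else
        % no time dimension: copy the values from the first file
        fid = netcdf.open([datadir,filelist(1).name], 'NC_NOWRITE');
        var_value = netcdf.getVar(fid,netcdf.inqVarID(fid,varname));
        netcdf.close(fid);
    end
    netcdf.putVar(cid,varid(i),var_value);
end
netcdf.close(ncid);
netcdf.close(cid);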
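One possible way to address the repeated file opening noted earlier is to make a single pass over the input files and write each file's block straight into its slot along the time dimension, using the start/count form of netcdf.putVar. The sketch below is only an illustration on two small demo files; the names (x, temp, demo_01.nc and so on) and the sizes are made up.

```matlab
% Create two small demo input files (hypothetical names and sizes).
tmpdir = tempname; mkdir(tmpdir);
nt_each = [2 3];
names = {'demo_01.nc', 'demo_02.nc'};
t0 = 0;
for k = 1:2
    fid = netcdf.create(fullfile(tmpdir, names{k}), 'CLOBBER');
    dx = netcdf.defDim(fid, 'x', 4);
    dt = netcdf.defDim(fid, 'Time', nt_each(k));
    vt = netcdf.defVar(fid, 'Time', 'NC_DOUBLE', dt);
    vv = netcdf.defVar(fid, 'temp', 'NC_DOUBLE', [dx dt]);
    netcdf.endDef(fid);
    netcdf.putVar(fid, vt, (t0 + (1:nt_each(k)))');
    netcdf.putVar(fid, vv, repmat(t0 + (1:nt_each(k)), 4, 1));
    netcdf.close(fid);
    t0 = t0 + nt_each(k);
end

% Merged file with a fixed-length Time dimension, as in the script.
outfile = fullfile(tmpdir, 'merged.nc');
cid = netcdf.create(outfile, 'CLOBBER');
dx = netcdf.defDim(cid, 'x', 4);
dt = netcdf.defDim(cid, 'Time', sum(nt_each));
vt = netcdf.defVar(cid, 'Time', 'NC_DOUBLE', dt);
vv = netcdf.defVar(cid, 'temp', 'NC_DOUBLE', [dx dt]);
netcdf.endDef(cid);

% Single pass: each input file is opened once, and every time-dependent
% variable has its block written at the running (zero-based) time offset.
offset = 0;
for k = 1:2
    fid = netcdf.open(fullfile(tmpdir, names{k}), 'NC_NOWRITE');
    [~, nt] = netcdf.inqDim(fid, netcdf.inqDimID(fid, 'Time'));
    tvals = netcdf.getVar(fid, netcdf.inqVarID(fid, 'Time'));
    block = netcdf.getVar(fid, netcdf.inqVarID(fid, 'temp'));
    netcdf.close(fid);
    netcdf.putVar(cid, vt, offset, nt, tvals);
    netcdf.putVar(cid, vv, [0 offset], [4 nt], block);
    offset = offset + nt;
end
netcdf.close(cid);

merged_time = ncread(outfile, 'Time');
merged_temp = ncread(outfile, 'temp');
rmdir(tmpdir, 's');
```

Besides opening each file only once, this keeps a single file's block in memory at a time instead of the whole concatenated variable.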
To copy the code directly without the commentary, see:
Github.