matlab读写hdf5文件,hdf5格式的matlab读写操作

最近要用caffe处理一个multi-label的回归问题,就是输出是一个向量,不是一个具体的数值,这个时候之前的leveldb格式就不凑效了,因为caffe源代码里面默认label是一个数值,网上搜了下,都说hdf5格式可以解决这个问题

在caffe里面,有一个hdf5的datalayer作为数据输入,从源代码来看,对于label的维数没做限制,剩下的问题就是如何生成hdf5的数据,目前只是找到了github上的一个人共享的用matlab写的hdf5数据的读写操作,在这我把代码粘贴出来

testHDF5.m

%% WRITING TO HDF5

filename='trial.h5';

num_total_samples=10000;

% to simulate data being read from disk / generated etc.

data_disk=rand(5,5,1,num_total_samples);

label_disk=rand(10,num_total_samples);

chunksz=100;

created_flag=false;

totalct=0;

for batchno=1:num_total_samples/chunksz

fprintf('batch no. %d\n', batchno);

last_read=(batchno-1)*chunksz;

% to simulate maximum data to be held in memory before dumping to hdf5 file

batchdata=data_disk(:,:,1,last_read+1:last_read+chunksz);

batchlabs=label_disk(:,last_read+1:last_read+chunksz);

% store to hdf5

startloc=struct('dat',[1,1,1,totalct+1], 'lab', [1,totalct+1]);

curr_dat_sz=store2hdf5(filename, batchdata, batchlabs, ~created_flag, startloc, chunksz);

created_flag=true;% flag set so that file is created only once

totalct=curr_dat_sz(end);% updated dataset size (#samples)

end

% display structure of the stored HDF5 file

h5disp(filename);

%% READING FROM HDF5

% Read data and labels for samples #1000 to 1999

data_rd=h5read(filename, '/data', [1 1 1 1000], [5, 5, 1, 1000]);

label_rd=h5read(filename, '/label', [1 1000], [10, 1000]);

fprintf('Testing ...\n');

try

assert(isequal(data_rd, single(data_disk(:,:,:,1000:1999))), 'Data do not match');

assert(isequal(label_rd, single(label_disk(:,1000:1999))), 'Labels do not match');

fprintf('Success!\n');

catch err

fprintf('Test failed ...\n');

getReport(err)

end

%delete(filename);

% CREATE list.txt containing filename, to be used as source for HDF5_DATA_LAYER

FILE=fopen('list.txt', 'w');

fprintf(FILE, '%s', filename);

fclose(FILE);

fprintf('HDF5 filename listed in %s \n', 'list.txt');

% NOTE: In net definition prototxt, use list.txt as input to HDF5_DATA as:

% layers {

% name: "data"

% type: HDF5_DATA

% top: "data"

% top: "labelvec"

% hdf5_data_param {

% source: "/path/to/list.txt"

% batch_size: 64

% }

% }

store2hdf5.m

function [curr_dat_sz, curr_lab_sz] = store2hdf5(filename, data, labels, create, startloc, chunksz)

% *data* is W*H*C*N matrix of images should be normalized (e.g. to lie between 0 and 1) beforehand

% *label* is D*N matrix of labels (D labels per sample)

% *create* [0/1] specifies whether to create file newly or to append to previously created file, useful to store information in batches when a dataset is too big to be held in memory (default: 1)

% *startloc* (point at which to start writing data). By default,

% if create=1 (create mode), startloc.data=[1 1 1 1], and startloc.lab=[1 1];

% if create=0 (append mode), startloc.data=[1 1 1 K+1], and startloc.lab = [1 K+1]; where K is the current number of samples stored in the HDF

% chunksz (used only in create mode), specifies number of samples to be stored per chunk (see HDF5 documentation on chunking) for creating HDF5 files with unbounded maximum size - TLDR; higher chunk sizes allow faster read-write operations

% verify that format is right

dat_dims=size(data);

lab_dims=size(labels);

num_samples=dat_dims(end);

assert(lab_dims(end)==num_samples, 'Number of samples should be matched between data and labels');

if ~exist('create','var')

create=true;

end

if create

%fprintf('Creating dataset with %d samples\n', num_samples);

if ~exist('chunksz', 'var')

chunksz=1000;

end

if exist(filename, 'file')

fprintf('Warning: replacing existing file %s \n', filename);

delete(filename);

end

h5create(filename, '/data', [dat_dims(1:end-1) Inf], 'Datatype', 'single', 'ChunkSize', [dat_dims(1:end-1) chunksz]); % width, height, channels, number

h5create(filename, '/label', [lab_dims(1:end-1) Inf], 'Datatype', 'single', 'ChunkSize', [lab_dims(1:end-1) chunksz]); % width, height, channels, number

if ~exist('startloc','var')

startloc.dat=[ones(1,length(dat_dims)-1), 1];

startloc.lab=[ones(1,length(lab_dims)-1), 1];

end

else % append mode

if ~exist('startloc','var')

info=h5info(filename);

prev_dat_sz=info.Datasets(1).Dataspace.Size;

prev_lab_sz=info.Datasets(2).Dataspace.Size;

assert(prev_dat_sz(1:end-1)==dat_dims(1:end-1), 'Data dimensions must match existing dimensions in dataset');

assert(prev_lab_sz(1:end-1)==lab_dims(1:end-1), 'Label dimensions must match existing dimensions in dataset');

startloc.dat=[ones(1,length(dat_dims)-1), prev_dat_sz(end)+1];

startloc.lab=[ones(1,length(lab_dims)-1), prev_lab_sz(end)+1];

end

end

if ~isempty(data)

h5write(filename, '/data', single(data), startloc.dat, size(data));

h5write(filename, '/label', single(labels), startloc.lab, size(labels));

end

if nargout

info=h5info(filename);

curr_dat_sz=info.Datasets(1).Dataspace.Size;

curr_lab_sz=info.Datasets(2).Dataspace.Size;

end

end

  • 0
    点赞
  • 2
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
Matlab中,可以使用h5read函数来读取HDF5文件的内容。该函数的语法如下: data = h5read(filename, datasetname) 其中,filename是HDF5文件的路径和名称,datasetname是要读取的数据集的名称。该函数将返回一个包含数据的数组。 另外,你还可以使用hdf5info函数来获取HDF5文件的基本信息,例如数据集的名称、大小等。该函数的语法如下: info = hdf5info(filename) 其中,filename是HDF5文件的路径和名称。该函数将返回一个结构体info,包含了HDF5文件的详细信息。 需要注意的是,Matlab读取HDF4文件的方法与读取HDF5文件的方法略有不同。对于HDF4文件,可以使用hdfinfo和hdfread函数来读取文件的信息和内容。 希望以上信息对你有帮助。如果还有其他问题,请随时提问。 #### 引用[.reference_title] - *1* [MATLAB读取HDF5文件——以读取OMI he5数据为例](https://blog.csdn.net/wokaowokaowokao12345/article/details/108783849)[target="_blank" data-report-click={"spm":"1018.2226.3001.9630","extra":{"utm_source":"vip_chatgpt_common_search_pc_result","utm_medium":"distribute.pc_search_result.none-task-cask-2~all~insert_cask~default-1-null.142^v91^control_2,239^v3^insert_chatgpt"}} ] [.reference_item] - *2* [hdf5格式matlab读写操作](https://blog.csdn.net/weixin_42372510/article/details/116125168)[target="_blank" data-report-click={"spm":"1018.2226.3001.9630","extra":{"utm_source":"vip_chatgpt_common_search_pc_result","utm_medium":"distribute.pc_search_result.none-task-cask-2~all~insert_cask~default-1-null.142^v91^control_2,239^v3^insert_chatgpt"}} ] [.reference_item] - *3* [MATLAB教程转载——MATLAB读取HDF文件](https://blog.csdn.net/NJ_YTTP/article/details/53453466)[target="_blank" data-report-click={"spm":"1018.2226.3001.9630","extra":{"utm_source":"vip_chatgpt_common_search_pc_result","utm_medium":"distribute.pc_search_result.none-task-cask-2~all~insert_cask~default-1-null.142^v91^control_2,239^v3^insert_chatgpt"}} ] [.reference_item] [ .reference_list ]

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值