Saving H5 files in Python: generating a pcolormesh image with Python from a very large dataset saved in H5 files

I am collecting a large amount of data that will be saved into individual H5 files using h5py. I would like to patch these images together into one pcolormesh plot to be saved as a single image.

As a quick test, I generate arrays of 2000x2000 random data points and save them in H5 files using h5py. I then load the data from these files and try to plot it in matplotlib as a pcolormesh, but I always run into a MemoryError (which is expected given the amount of data).

    import numpy
    import h5py

    # Write 100 files, each holding one 2000x2000 array of random values.
    for i in range(100):
        arr = numpy.random.random((2000, 2000))
        with h5py.File("TEST_HDF5_SAVE_FILES\\Plot_" + str(i) + ".h5", "w") as f:
            f.create_dataset("Plot_" + str(i), data=arr)

This script generates my files. I picked 100 as an arbitrary number just to have a large enough set of files to pull from.

Then I import them using the following script:

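    import h5py
    import matplotlib.pyplot as plt
    import numpy

    y = numpy.arange(0, 2000, 1)

    for display_plot_num in range(0, 5):
        print(display_plot_num)
        # Offset each tile along x so the tiles sit side by side.
        x = numpy.arange(display_plot_num*2000, display_plot_num*2000 + 2000, 1)
        # The files are only read here, so open them read-only.
        with h5py.File("TEST_HDF5_SAVE_FILES\\Plot_" + str(display_plot_num) + ".h5", "r") as f:
            data = f["Plot_" + str(display_plot_num)][:]  # load the dataset into memory
            plt.pcolormesh(x, y, data)

    plt.show()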

The upper bound of the range in the for loop can be raised as far as 100, but the largest value I can use without a memory error is 5 (i.e. 5 tiles patched together on one pcolormesh plot in matplotlib), and even then it is extremely clunky and slow. I need to be able to patch together many more images than that.

Is there another technique I should use to plot this data? Alternatively, it would be nice if I could convert the data from multiple H5 files directly into an image without going through matplotlib or a similar package (like scipy).

In summary, my problem is this:

I have a large number of HDF5 files with image data (2000x2000)

I need to patch together these files into a single image and save it

Any help is appreciated. Also, I would be glad to answer any further questions about my problem.

Edit (5.6.2013):

I feel a similar question would be how to deal with (import, manipulate, edit, etc.) very high resolution images in Python. This is essentially what I am trying to do: generate a very high resolution image from a collection of smaller images.

Solution

One way to reduce the bloat of images in matplotlib (especially when saving to SVG) is to use the rasterized=True kwarg. This essentially "flattens" your pcolormesh: in vector output the mesh is stored as a single bitmap instead of one vector shape per cell, which makes it much faster to save and uses fewer resources.
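As a minimal sketch of the rasterized=True suggestion, the snippet below draws one small tile and saves it as SVG. The 200x200 array is only a stand-in for a 2000x2000 tile read from an H5 file, and the Agg backend, the in-memory buffer, and the dpi value are assumptions made for illustration, not part of the original answer.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt
import numpy as np

# Small stand-in for one tile; the real tiles would come from the H5 files.
tile = np.random.random((200, 200))

# Cell edges: one more edge than there are cells along each axis.
x = np.arange(tile.shape[1] + 1)
y = np.arange(tile.shape[0] + 1)

fig, ax = plt.subplots()
# rasterized=True asks matplotlib to store the mesh as a bitmap in vector output.
mesh = ax.pcolormesh(x, y, tile, rasterized=True)

# Save to an in-memory buffer; dpi sets the resolution of the rasterized part.
buf = io.BytesIO()
fig.savefig(buf, format="svg", dpi=150)
plt.close(fig)

svg_bytes = buf.getvalue()
```

Rasterizing shrinks the saved file and speeds up saving, but the arrays passed to pcolormesh still have to fit in memory, so it may not by itself remove the MemoryError described in the question.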
