How to save arrays in Python: how to partially load an array saved with numpy save

I have a multi-dimensional array saved with numpy.save and want to load only part of it along some dimensions, because the array is very big.

How can I do this in a simple way?

Edit: The context is simple and basic:

You have a 5 GB array saved with numpy.save, but you only need access to some parts of the array, such as A[:,:], without loading all 5 GB into memory.

ANSWER: Use h5py to save and partially load the data. Here is a code sample:

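    import sys

    import h5py

    def main():
        data = read()
        # The first command-line argument selects which slice to read.
        if sys.argv[1] == 'x':
            x_slice(data)
        elif sys.argv[1] == 'z':
            z_slice(data)

    def read():
        # Open the file read-only; the dataset is read from disk only when sliced.
        f = h5py.File('/tmp/test.hdf5', 'r')
        return f['seismic_volume']

    def z_slice(data):
        # Read only the first plane along the last axis.
        return data[:, :, 0]

    def x_slice(data):
        # Read only the first plane along the first axis.
        return data[0, :, :]

    if __name__ == '__main__':
        main()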

Solution

You'd have to intentionally save the array for partial loading; you can't do it generically.

You could, for example, split the array along one of its dimensions and save the subarrays with savez. Loading such an archive is 'lazy': only the subfiles you ask for are read.
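A minimal sketch of the savez approach, assuming the array is split along its first axis. The small array, the temporary file path, and the key names `part_0`, `part_1`, ... are made up for illustration; they are not from the question.

```python
import os
import tempfile

import numpy as np

# Hypothetical small array standing in for the large one.
A = np.arange(24, dtype=np.float64).reshape(4, 3, 2)

# Hypothetical output path in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "chunks.npz")

# Save each slice along axis 0 under its own key: part_0, part_1, ...
np.savez(path, **{f"part_{i}": A[i] for i in range(A.shape[0])})

# np.load on an .npz file returns an archive object; a member array is
# read from disk only when it is indexed by key.
with np.load(path) as archive:
    part = archive["part_2"]  # only this sub-array is read
```

The trade-off is that the split axis is fixed at save time, so slicing along any other axis still means reading every member.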

h5py is an add-on package that saves and loads data from HDF5 files. That allows for partial reads.

numpy.memmap is another option, treating a file as memory that stores an array.
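As a rough sketch of the numpy.memmap option: the file here is raw binary with no header, so the dtype and shape have to be known and passed in again when reading. The shape, dtype, and file name below are illustrative assumptions.

```python
import os
import tempfile

import numpy as np

# Assumed layout of the raw file; a raw memmap file stores no header,
# so these must be supplied again when the file is reopened.
shape = (4, 3, 2)
dtype = np.float64
raw_path = os.path.join(tempfile.mkdtemp(), "volume.dat")

# Create the file and fill it through a writable memory map.
writer = np.memmap(raw_path, dtype=dtype, mode="w+", shape=shape)
writer[:] = np.arange(24, dtype=dtype).reshape(shape)
writer.flush()
del writer

# Reopen read-only; slicing touches only the parts of the file it needs.
reader = np.memmap(raw_path, dtype=dtype, mode="r", shape=shape)
z0 = np.array(reader[:, :, 0])  # copy the slice into an ordinary in-memory array
```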

Look up the docs for these, as well as previous SO questions.

To elaborate on the hold: there are minor points that aren't clear. What exactly do you mean by 'load some dimension'? The simplest interpretation is that you want A[0,...] or A[3:10,...]. The other is the implication of 'simple way'. Does that mean you already have a complex way and want a simpler one? Or just that you don't want to rewrite the numpy.load function to do the task?

Otherwise I think the question is reasonably clear, and the simple answer is: no, there isn't a simple way.

I'm tempted to reopen the question so other experienced numpy posters can weigh in.

I should have reviewed the load docs (the OP should have as well!). As ali_m commented, there is a memory-map mode. The docs say:

    mmap_mode : {None, 'r+', 'r', 'w+', 'c'}, optional
        If not None, then memory-map the file, using the given mode
        (see `numpy.memmap` for a detailed description of the modes).
        A memory-mapped array is kept on disk. However, it can be accessed
        and sliced like any ndarray. Memory mapping is especially useful for
        accessing small fragments of large files without reading the entire
        file into memory.
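A short sketch of what that option looks like for a file written with numpy.save. The small array and the temporary path are stand-ins for the 5 GB file in the question.

```python
import os
import tempfile

import numpy as np

# Hypothetical small array saved with numpy.save, standing in for the big file.
npy_path = os.path.join(tempfile.mkdtemp(), "big.npy")
np.save(npy_path, np.arange(24).reshape(4, 3, 2))

# With mmap_mode set, np.load returns a memory-mapped array instead of
# reading the whole file; shape and dtype come from the .npy header.
A = np.load(npy_path, mmap_mode="r")

# Slicing reads only the requested part; np.array copies it into memory.
x_slice = np.array(A[0, :, :])
```

This works on an existing .npy file, so the array does not have to be saved again in another format.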

How does numpy handle mmap's over npz files?

(I dug into this months ago, but forgot the option.)
