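In Python 2:
cPickle is the C implementation of pickle. Its usage is almost identical, and it is considerably faster.
In Python 3:
The cPickle module has been removed, leaving only the pickle module, which uses its C accelerator automatically when one is available.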
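Code that has to run on both versions often handles this difference with a guarded import. The following is a minimal sketch of that pattern, not something taken from the original example; the list being pickled is only illustrative.

```python
try:
    import cPickle as pickle  # Python 2: use the C implementation
except ImportError:
    import pickle  # Python 3: cPickle no longer exists

# Round-trip a small object through an in-memory byte string.
payload = pickle.dumps([1, 2, 3])
restored = pickle.loads(payload)
print(restored)
```

After this import the rest of the code can refer to `pickle` regardless of the interpreter version.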
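1. The dump() method
pickle.dump(obj, file[, protocol])
Note: serializes the object obj and writes it to file. The protocol parameter selects the serialization format. In Python 2 the default is 0 (the ASCII protocol, which serializes as text); it can also be 1 or 2, both of which are binary formats (1 is the old binary protocol, 2 is the newer one). Python 3 adds higher protocol numbers and defaults to a binary protocol. file is the file-like object to write to and must provide a write() method. It can be a file opened for writing (in Python 3 it must be opened in binary mode, 'wb'), a StringIO object (io.BytesIO in Python 3), or any other object that implements write().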
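To make the protocol parameter concrete, here is a small sketch for Python 3 that dumps the same object with protocol 0 and with protocol 2 into in-memory buffers. The `record` dictionary and the buffer names are made up for illustration.

```python
import io
import pickle

record = {"name": "batch_0", "labels": [0, 1, 2]}

# Protocol 0: the original text-oriented format.
buf_text = io.BytesIO()
pickle.dump(record, buf_text, protocol=0)

# Protocol 2: a binary format, usually more compact.
buf_bin = io.BytesIO()
pickle.dump(record, buf_bin, protocol=2)

# Compare the sizes of the two serialized forms.
print(len(buf_text.getvalue()), len(buf_bin.getvalue()))
```

Any object with a write() method that accepts bytes works in place of io.BytesIO, which is why an ordinary file opened with 'wb' can be passed as well.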
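2. The load() method
pickle.load(file)
Note: deserializes an object, reconstructing a Python object from the data in file. file must provide read() and readline() methods; in Python 3 it must be opened in binary mode ('rb').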
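A minimal round trip through a real file might look like the sketch below. The temporary directory, the file name example.pkl, and the sample dictionary are assumptions chosen for the illustration.

```python
import os
import pickle
import tempfile

original = {"data": [1, 2, 3], "labels": ["a", "b", "c"]}

# Write the object to a file in a temporary directory.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "example.pkl")
with open(path, "wb") as f:
    pickle.dump(original, f)

# Read it back; load() detects the protocol automatically.
with open(path, "rb") as f:
    restored = pickle.load(f)

print(restored == original)
```

Because unpickling can execute arbitrary code, load() should only be used on files from a trusted source.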
3. Example:
import os
import pickle  # on Python 2, cPickle can be imported instead

BIN_COUNTS = 5  # assumed value; the original does not show how BIN_COUNTS is defined


def pickled(savepath, data, label, fnames, bin_num=BIN_COUNTS, mode="train"):
    '''
    savepath (str): save path
    data (array): image data, an n x 3072 array
    label (list): image labels, a list of length n
    fnames (str list): image names, a list of length n
    bin_num (int): number of files to split the data into
    mode (str): {'train', 'test'}
    '''
    assert os.path.isdir(savepath)
    total_num = len(fnames)
    # Integer division: any leftover total_num % bin_num samples are not saved.
    samples_per_bin = total_num // bin_num
    assert samples_per_bin > 0
    idx = 0
    for i in range(bin_num):
        start = i * samples_per_bin
        end = (i + 1) * samples_per_bin
        if end <= total_num:
            batch = {'data': data[start:end, :],
                     'labels': label[start:end],
                     'filenames': fnames[start:end]}
        else:
            batch = {'data': data[start:, :],
                     'labels': label[start:],
                     'filenames': fnames[start:]}
        if mode == "train":
            batch['batch_label'] = "training batch {} of {}".format(idx, bin_num)
        else:
            batch['batch_label'] = "testing batch {} of {}".format(idx, bin_num)
        # Pickle files must be opened in binary mode.
        with open(os.path.join(savepath, 'data_batch_' + str(idx)), 'wb') as fi:
            pickle.dump(batch, fi)
        idx = idx + 1


def unpickled(filename):
    # Load one batch file written by pickled() and return its dictionary.
    assert os.path.isfile(filename)
    with open(filename, 'rb') as fo:
        batch = pickle.load(fo)
    return batch
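The slice boundaries that pickled() uses can be checked on their own. The sketch below assumes integer division for samples_per_bin, as in the example; the helper name batch_bounds is hypothetical and is not part of the original code.

```python
def batch_bounds(total_num, bin_num):
    """Return the (start, end) index pair used for each batch file."""
    samples_per_bin = total_num // bin_num
    return [(i * samples_per_bin, (i + 1) * samples_per_bin)
            for i in range(bin_num)]


bounds = batch_bounds(10, 3)
print(bounds)  # [(0, 3), (3, 6), (6, 9)]

# Samples beyond the last boundary are not written to any batch.
leftover = 10 - bounds[-1][1]
print(leftover)  # 1
```

When total_num is an exact multiple of bin_num, every sample lands in a batch; otherwise the remainder is left out, so it may be worth choosing bin_num accordingly.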