TFRecord reader
Installation
pip3 install tfrecord
Usage
It's recommended to create an index file for each TFRecord file. Index file must be provided when using multiple workers, otherwise the loader may return duplicate records.
python3 -m tfrecord.tools.tfrecord2idx
Use TFRecordDataset to read TFRecord files in PyTorch.
import torch
from tfrecord.torch.dataset import TFRecordDataset
tfrecord_path = "/path/to/data.tfrecord"
index_path = None
description = {"image": "byte", "label": "float"}
dataset = TFRecordDataset(tfrecord_path, index_path, description)
loader = torch.utils.data.DataLoader(dataset, batch_size=32)
data = next(iter(loader))
print(data)
Use MultiTFRecordDataset to read multiple TFRecord files. This class samples from given