Tensorpack DataFlow
Tensorpack DataFlow is an efficient and flexible data loading pipeline for deep learning, written in pure Python.
Its main features are:
Highly optimized for speed. Parallelization in Python is hard. DataFlow implements highly optimized parallel building blocks that give you an easy interface to parallelize your workload.
Written in pure Python. This allows it to be used together with any other Python-based library.
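Because a DataFlow is just a pure-Python iterable, pipelines compose by wrapping one another. The sketch below illustrates this design with toy classes; it is an illustration of the idea only, not the dataflow library's actual API (the class names `FromList`, `Map`, and `Batch` here are hypothetical):

```python
# Conceptual sketch of the DataFlow idea (illustration only, not the real
# dataflow API): a pipeline is an iterable, and stages wrap other iterables.

class FromList:
    """Produce datapoints from an in-memory sequence."""
    def __init__(self, data):
        self.data = data
    def __iter__(self):
        yield from self.data

class Map:
    """Apply a function to every datapoint from an upstream pipeline."""
    def __init__(self, upstream, func):
        self.upstream = upstream
        self.func = func
    def __iter__(self):
        for dp in self.upstream:
            yield self.func(dp)

class Batch:
    """Group consecutive datapoints into fixed-size batches."""
    def __init__(self, upstream, size):
        self.upstream = upstream
        self.size = size
    def __iter__(self):
        batch = []
        for dp in self.upstream:
            batch.append(dp)
            if len(batch) == self.size:
                yield batch
                batch = []

# Compose stages, then iterate like any Python iterable.
df = Batch(Map(FromList(range(6)), lambda x: x * x), size=2)
print(list(df))  # [[0, 1], [4, 9], [16, 25]]
```

Since the whole pipeline is plain Python iteration, any Python library can consume it; the real library adds optimized parallel versions of such stages.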
DataFlow was originally part of the tensorpack library and has been through 3 years of active development. Given its independence from the rest of the tensorpack library, and high demand from users, it is now a separate library whose source code is synced with tensorpack.
Why would you want to use DataFlow instead of a platform-specific data loading solution? We recommend reading Why DataFlow?.
Install:
pip install --upgrade git+https://gi