1. When MXNet builds a batch of data, it uses batchify classes such as Tuple, Pad, and Stack.
The docstring of Tuple gives detailed usage examples:
Examples
--------
>>> from gluoncv.data import batchify
>>> a = ([1, 2, 3, 4], 0)
>>> b = ([5, 7], 1)
>>> c = ([1, 2, 3, 4, 5, 6, 7], 0)
>>> batchify.Tuple(batchify.Pad(), batchify.Stack())([a, b])
(
[[1 2 3 4]
[5 7 0 0]]
<NDArray 2x4 @cpu(0)>,
[0. 1.]
<NDArray 2 @cpu(0)>)
>>> # Input can also be a list
>>> batchify.Tuple([batchify.Pad(), batchify.Stack()])([a, b])
(
[[1 2 3 4]
[5 7 0 0]]
<NDArray 2x4 @cpu(0)>,
[0. 1.]
<NDArray 2 @cpu(0)>)
>>> # Another example
>>> a = ([1, 2, 3, 4], [5, 6], 1)
>>> b = ([1, 2], [3, 4, 5, 6], 0)
>>> c = ([1], [2, 3, 4, 5, 6], 0)
>>> batchify.Tuple(batchify.Pad(), batchify.Pad(), batchify.Stack())([a, b, c])
(
[[1 2 3 4]
[1 2 0 0]
[1 0 0 0]]
<NDArray 3x4 @cpu(0)>,
[[5 6 0 0 0]
[3 4 5 6 0]
[2 3 4 5 6]]
<NDArray 3x5 @cpu(0)>,
[1. 0. 0.]
<NDArray 3 @cpu(0)>)
"""
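The padding and stacking behaviour shown in the docstring can be mimicked in plain Python. The sketch below is not gluoncv's implementation: the helper names pad, stack, and tuple_batchify are made up for illustration, and it returns nested lists instead of NDArrays.

```python
# Illustrative sketch only: plain-Python stand-ins for the behaviour of
# batchify.Pad, batchify.Stack and batchify.Tuple (hypothetical helpers,
# not the gluoncv API).

def pad(samples, pad_val=0):
    """Right-pad every 1-D sequence to the longest length in the batch."""
    max_len = max(len(s) for s in samples)
    return [list(s) + [pad_val] * (max_len - len(s)) for s in samples]


def stack(samples):
    """Collect same-shaped items (here: scalar labels) into one list."""
    return list(samples)


def tuple_batchify(*fns):
    """Apply the i-th function to the i-th field of every sample."""
    def batchify(batch):
        fields = list(zip(*batch))  # transpose: samples -> per-field tuples
        if len(fields) != len(fns):
            raise ValueError("need one batchify function per sample field")
        return tuple(fn(field) for fn, field in zip(fns, fields))
    return batchify


a = ([1, 2, 3, 4], [5, 6], 1)
b = ([1, 2], [3, 4, 5, 6], 0)
c = ([1], [2, 3, 4, 5, 6], 0)
first, second, labels = tuple_batchify(pad, pad, stack)([a, b, c])
print(first)   # [[1, 2, 3, 4], [1, 2, 0, 0], [1, 0, 0, 0]]
print(second)  # [[5, 6, 0, 0, 0], [3, 4, 5, 6, 0], [2, 3, 4, 5, 6]]
print(labels)  # [1, 0, 0]
```

Each field of a sample gets its own batchify function, which is why variable-length fields use padding while fixed-shape labels are simply stacked.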
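In the docstring output, <NDArray 3x4 @cpu(0)> means the data was loaded onto cpu(0), which is the default context. To place the data elsewhere, such as pinned CPU memory or a specific GPU, the context has to be given explicitly. For example, the data loader's __iter__ copies each batch into pinned memory when pin_memory is set: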
def __iter__(self):
    if self._num_workers == 0:
        def same_process_iter():
            for batch in self._batch_sampler:
                ret = self._batchify_fn([self._dataset[idx] for idx in batch])
                if self._pin_memory:
                    # context is MXNet's context module; this copies the batch into
                    # pinned CPU memory and returns it in that memory context
                    ret = _as_in_context(ret, context.cpu_pinned(self._pin_device_id))
                yield ret
        return same_process_iter()
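The control flow of this single-process branch can be sketched without MXNet. In the version below, iter_batches and move_fn are hypothetical names; move_fn only stands in for the copy into a target context and does not touch real device memory.

```python
# Illustrative sketch only: mirrors the control flow of the single-process
# branch. iter_batches and move_fn are hypothetical names; move_fn stands in
# for the copy into a target context (e.g. pinned CPU memory).

def iter_batches(dataset, batch_sampler, batchify_fn, move_fn=None):
    for batch in batch_sampler:
        # Gather the samples for this batch and merge them into one object.
        ret = batchify_fn([dataset[idx] for idx in batch])
        if move_fn is not None:
            ret = move_fn(ret)
        yield ret


dataset = [10, 20, 30, 40, 50]
batch_sampler = [[0, 1], [2, 3], [4]]  # one list of indices per batch
batches = list(iter_batches(dataset, batch_sampler, batchify_fn=list))
print(batches)  # [[10, 20], [30, 40], [50]]

# Tag each batch to make the optional "move" step visible.
tagged = list(iter_batches(dataset, batch_sampler, list,
                           move_fn=lambda b: ("pinned", b)))
print(tagged[0])  # ('pinned', [10, 20])
```

Because the function is a generator, each batch is built and moved only when the consumer asks for it.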
Other examples of using cpu_pinned:
>>> with mx.cpu_pinned():
...     cpu_array = mx.nd.ones((2, 3))
>>> cpu_array.context
cpu_pinned(0)
>>> cpu_array = mx.nd.ones((2, 3), ctx=mx.cpu_pinned())
>>> cpu_array.context
cpu_pinned(0)