If the `__getitem__()` of a PyTorch Dataset generates random numbers with numpy.random (e.g. numpy.random.rand), every DataLoader worker draws the same random sequence on each iteration: with num_workers > 0 the workers are forked from the main process, each inherits an identical copy of NumPy's global RNG state, and the DataLoader never reseeds it. In contrast, the DataLoader explicitly reseeds torch's RNG in each worker (with base_seed + worker_id), so torch.rand / torch.randint produce a different sequence in every worker.
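The mechanism can be sketched with only the standard library: fork() copies the parent's memory, including the global RNG state, so every worker starts from the same state. Here `random.getstate()`/`setstate()` stand in for the state copy that fork performs (stdlib `random` behaves like `numpy.random` in this respect); this is an illustrative simulation, not actual DataLoader code.

```python
import random

# Parent process seeds (or is implicitly seeded) once at startup.
random.seed(0)
parent_state = random.getstate()   # this is the state that fork() duplicates

# "Worker 0": starts from the copied parent state.
random.setstate(parent_state)
worker0 = [random.randint(0, 99) for _ in range(5)]

# "Worker 1": starts from the very same copied state.
random.setstate(parent_state)
worker1 = [random.randint(0, 99) for _ in range(5)]

# Both workers draw the identical sequence -- the duplication bug.
assert worker0 == worker1
```

torch avoids this only because the DataLoader's worker loop calls torch.manual_seed(base_seed + worker_id) in each worker; NumPy's and Python's global RNGs are left untouched.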
# Generating the random numbers with NumPy
def get_train_sample(self):
    track_idx = np.random.randint(0, len(self._tracks))
    separated = self._tracks[track_idx]
    patch_length = int(self._sample_length * self._sample_rate)
    if self._incoherent_rate <= np.random.rand():  # coherent remix
        start_t = np.random.randint(0, separated.shape[-1] - patch_length)
        separated = separated[:, :, start_t: start_t + patch_length]
    return separated
# Generating the random numbers with torch
def get_train_sample(self):
    track_idx = torch.randint(0, len(self._tracks), (1,)).item()
    separated = self._tracks[track_idx]
    patch_length = int(self._sample_length * self._sample_rate)
    # torch.rand here (not np.random.rand), so this draw also varies per worker
    if self._incoherent_rate <= torch.rand(1).item():  # coherent remix
        start_t = torch.randint(0, separated.shape[-1] - patch_length, (1,)).item()
        separated = separated[:, :, start_t: start_t + patch_length]
    return separated
With the first method, the sampled patches are identical across workers; with the second, each worker still draws genuinely different random patches.
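If you want to keep using NumPy in `__getitem__()`, the usual remedy is to pass a `worker_init_fn` to the DataLoader that reseeds NumPy's global RNG per worker. A minimal sketch, assuming a hypothetical `make_worker_init_fn` helper and simulating the per-worker calls directly (in real use, the DataLoader invokes the returned function once in each worker with its `worker_id`):

```python
import numpy as np

def make_worker_init_fn(base_seed):
    # Returns a worker_init_fn suitable for
    # DataLoader(..., worker_init_fn=make_worker_init_fn(seed)).
    # Each worker reseeds NumPy's global RNG with a worker-specific seed,
    # so forked workers stop sharing one inherited state.
    def worker_init_fn(worker_id):
        np.random.seed(base_seed + worker_id)
    return worker_init_fn

# Simulate two workers that would otherwise share the inherited state:
init = make_worker_init_fn(base_seed=1234)
init(0)
w0 = np.random.randint(0, 1000, 5).tolist()
init(1)
w1 = np.random.randint(0, 1000, 5).tolist()
assert w0 != w1  # workers now draw different sequences
```

A common variant seeds from `torch.initial_seed() % 2**32` inside the worker, which also varies across epochs because the DataLoader picks a fresh base seed each epoch.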