pack_padded_sequence works like a compression operation.
For example:
Given tensor([[1,2,0], [3,0,0], [4,5,6]]), pack_padded_sequence turns it into tensor([4,1,3,5,2,6]); passing that through pad_packed_sequence recovers tensor([[1,2,0], [3,0,0], [4,5,6]]).
Q: Why do we need this operation?
A: To improve computational efficiency. Without packing, the RNN would run over the padding zeros as if they were real time steps, wasting computation; packing lets it process only the valid positions (see the sketch below).
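To see where the savings come from, here is a minimal sketch of the typical use with an nn.LSTM; the embedding size (4) and hidden size (5) are illustrative values, not part of the original example.

import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

emb = nn.Embedding(7, 4, padding_idx=0)  # token ids 0..6, id 0 is the padding token (illustrative sizes)
lstm = nn.LSTM(input_size=4, hidden_size=5, batch_first=True)
seq = torch.tensor([[1,2,0], [3,0,0], [4,5,6]])
lens = [2, 1, 3]
packed = pack_padded_sequence(emb(seq), lens, batch_first=True, enforce_sorted=False)
packed_out, (h, c) = lstm(packed)  # the LSTM runs 2+1+3 = 6 steps instead of 3*3 = 9
out, out_lens = pad_packed_sequence(packed_out, batch_first=True)  # out.shape: [3, 3, 5]

A useful side effect: with packed input, h holds each sequence's hidden state at its true last step rather than at a padded position.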
Line-by-line explanation
import torch
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence  # import the pack/unpack helpers
seq = torch.tensor([[1,2,0], [3,0,0], [4,5,6]]) # shape: [batch_size, seq_len]
lens = [2, 1, 3] # true (unpadded) length of each sequence: [1,2,0] has 2 valid elements, [3,0,0] has 1, [4,5,6] has 3
# batch_first=True means the first dimension of seq is batch_size;
# enforce_sorted=False lets the sequences come in any length order (they are sorted internally)
packed = pack_padded_sequence(seq, lens, batch_first=True, enforce_sorted=False)
print(packed)
# PackedSequence(data=tensor([4, 1, 3, 5, 2, 6]), batch_sizes=tensor([3, 2, 1]), sorted_indices=tensor([2, 0, 1]), unsorted_indices=tensor([1, 2, 0]))
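# Reading the PackedSequence above:
# - sorted_indices=[2, 0, 1]: sequences reordered by descending length -> [4,5,6], [1,2,0], [3,0,0]
# - data=[4, 1, 3, 5, 2, 6]: the sorted batch read time step by time step (t=0: 4,1,3; t=1: 5,2; t=2: 6)
# - batch_sizes=[3, 2, 1]: how many sequences are still active at each time step
# - unsorted_indices=[1, 2, 0]: permutation that restores the original batch order when unpacking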
seq_unpacked, lens_unpacked = pad_packed_sequence(packed, batch_first=True)  # unpack back to a padded tensor plus the original lengths
print(seq_unpacked)
# tensor([[1, 2, 0], [3, 0, 0], [4, 5, 6]])
print(lens_unpacked)
# tensor([2, 1, 3])
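As a quick sanity check (a sketch, not from the original), the packed data holds exactly sum(lens) elements, which is the saving over the padded batch_size * seq_len layout:

assert packed.data.numel() == sum(lens) == int(packed.batch_sizes.sum())  # 6 elements instead of 9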