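1. Problem statement
Graph adjacency is usually represented as a sparse matrix. In PyTorch, a sparse tensor is commonly created with code along these lines: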
import torch
from scipy.sparse import csr_matrix
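# original matrix data
data = [[1,2,1],[1,2,3]]
# convert to a CSR sparse matrix
csr_data = csr_matrix(data)
print(csr_data)
# convert to a COO sparse matrix
coo_data = csr_data.tocoo()
# row indices (built directly as int64, without a detour through float)
row = torch.tensor(coo_data.row, dtype=torch.long)
# column indices
col = torch.tensor(coo_data.col, dtype=torch.long)
# stack the indices into a 2 x nnz index tensor
index = torch.stack([row, col])
# non-zero values
data = torch.tensor(coo_data.data, dtype=torch.float32)
print("row indices\n",coo_data.row,'column indices\n',coo_data.col,'data values\n',coo_data.data)
# torch.sparse_coo_tensor replaces the deprecated torch.sparse.FloatTensor constructor
tensor_coo = torch.sparse_coo_tensor(index, data, torch.Size(coo_data.shape))
print('sparse tensor\n', tensor_coo)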
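To make the COO layout concrete, here is a minimal sketch that writes the same 2x3 matrix out by hand as (row, column, value) triplets, without going through scipy. The variable names are chosen for illustration only; converting back with to_dense() is just a sanity check that the triplets describe the intended matrix.

```python
import torch

# The same 2x3 matrix as above, written as COO triplets by hand.
# Every entry happens to be non-zero, so six values are stored.
rows = torch.tensor([0, 0, 0, 1, 1, 1])
cols = torch.tensor([0, 1, 2, 0, 1, 2])
vals = torch.tensor([1.0, 2.0, 1.0, 1.0, 2.0, 3.0])

# The index tensor has shape (2, nnz): first row = row ids, second row = column ids.
indices = torch.stack([rows, cols])
sp = torch.sparse_coo_tensor(indices, vals, size=(2, 3))

# Convert back to a dense tensor to check the layout.
dense = sp.to_dense()
print(dense)
```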
However, when you want to extract several rows or columns, the usual tensor indexing does not work. For example:
ind = torch.tensor([0,1]).long()
print(tensor_coo[ind])
This raises an error:
Cell In[1], line 16
     14 # row indices
     15 ind = torch.tensor([0,1]).long()
---> 16 print(tensor_coo[ind])

NotImplementedError: Could not run 'aten::index.Tensor' with arguments from the 'SparseCPU' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build). If you are a Facebook employee using PyTorch on mobile, please visit https://fburl.com/ptmfixes for possible resolutions.
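2. Solution - sparseTensor.index_select
The fix follows a Zhihu post titled "torch 的sparse tensor有什么比较好的slice方法" (roughly, "Is there a good way to slice a torch sparse tensor?"). The code is modified as follows; the slicing happens in the last few lines and the rest is the same as the code above. With this change the problem is solved: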
import torch
from scipy.sparse import csr_matrix
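# original matrix data
data = [[1,2,1],[1,2,3]]
# convert to a CSR sparse matrix
csr_data = csr_matrix(data)
print('csr sparse matrix',csr_data)
# convert to a COO sparse matrix
coo_data = csr_data.tocoo()
# row indices (built directly as int64, without a detour through float)
row = torch.tensor(coo_data.row, dtype=torch.long)
# column indices
col = torch.tensor(coo_data.col, dtype=torch.long)
# stack the indices into a 2 x nnz index tensor
index = torch.stack([row, col])
# non-zero values
data = torch.tensor(coo_data.data, dtype=torch.float32)
print("row indices\n",coo_data.row,"\n",'column indices\n',coo_data.col,"\n",'data values\n',coo_data.data)
# torch.sparse_coo_tensor replaces the deprecated torch.sparse.FloatTensor constructor
tensor_coo = torch.sparse_coo_tensor(index, data, torch.Size(coo_data.shape))
print('sparse tensor\n', tensor_coo)
# indices of the rows to select
ind = torch.LongTensor([0,1])
# first argument is dim (dim=0 selects rows, dim=1 selects columns); second argument is the index tensor
print(tensor_coo.index_select(0,ind))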
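The same call also selects columns (dim=1), and two calls can be chained to take a row/column sub-block. Below is a minimal sketch under that assumption. The helper select_rows_cols is a hypothetical name introduced here, not a PyTorch function, and the comparison against the dense copy is only there to check the result on a small matrix.

```python
import torch

def select_rows_cols(sp, row_ind=None, col_ind=None):
    """Hypothetical helper: pick rows and/or columns of a sparse COO tensor."""
    out = sp
    if row_ind is not None:
        out = out.index_select(0, row_ind)  # dim=0 selects rows
    if col_ind is not None:
        out = out.index_select(1, col_ind)  # dim=1 selects columns
    return out

# The same 2x3 matrix as in the post, built directly in COO form.
indices = torch.tensor([[0, 0, 0, 1, 1, 1],
                        [0, 1, 2, 0, 1, 2]])
values = torch.tensor([1.0, 2.0, 1.0, 1.0, 2.0, 3.0])
sp = torch.sparse_coo_tensor(indices, values, size=(2, 3)).coalesce()

row_ind = torch.tensor([1])     # keep row 1
col_ind = torch.tensor([0, 2])  # keep columns 0 and 2
sub = select_rows_cols(sp, row_ind, col_ind)

# Cross-check against ordinary indexing on the dense copy.
dense = sp.to_dense()
expected = dense[row_ind][:, col_ind]
print(sub.to_dense())
print(torch.equal(sub.to_dense(), expected))
```

The result stays sparse, so this avoids materializing the full dense matrix for large graphs; the to_dense() calls here are for verification only.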