I am working on a classification problem in which I have a list of strings as class labels, and I want to convert them into a tensor. So far I have tried converting the list of strings into a NumPy array with np.array and passing the result to torch.from_numpy:
truth = torch.from_numpy(np.array(truths))
but I am getting the following error.
RuntimeError: can't convert a given np.ndarray to a tensor - it has an invalid type. The only supported types are: double, float, int64, int32, and uint8.
Can anybody suggest an alternative approach? Thanks
Solution
Unfortunately, you can't right now: PyTorch tensors do not support a string dtype. I also don't think supporting one would be a good idea, since it would make PyTorch clumsy. A popular workaround is to convert the labels to integers using sklearn.
Here is a short example:
from sklearn import preprocessing
import torch
labels = ['cat', 'dog', 'mouse', 'elephant', 'pandas']
le = preprocessing.LabelEncoder()
targets = le.fit_transform(labels)
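# LabelEncoder assigns indices in sorted label order:
# cat=0, dog=1, elephant=2, mouse=3, pandas=4
# targets: array([0, 1, 3, 2, 4])
targets = torch.as_tensor(targets)
# targets: tensor([0, 1, 3, 2, 4])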
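Since you may need to convert between the original labels and the encoded ones later, it is good to keep the fitted le around: le.inverse_transform maps integer indices back to the original strings, and le.classes_ lists the labels in index order.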
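If you would rather not depend on sklearn, the same mapping can be built by hand. The following is only a minimal sketch, not something from the answer above: the names classes, class_to_idx and decoded are made up for illustration, and the extra 'dog' at the end of the list is added here just to show that repeated labels share one index.

```python
import torch

labels = ['cat', 'dog', 'mouse', 'elephant', 'pandas', 'dog']

# Sorted unique labels, so the indices follow the same sorted order
# that LabelEncoder would produce.
classes = sorted(set(labels))

# Hypothetical helper mapping: label string -> integer class index.
class_to_idx = {name: i for i, name in enumerate(classes)}

# Encode every label and store the result as an int64 tensor.
targets = torch.tensor([class_to_idx[name] for name in labels], dtype=torch.long)

# Decode: map the integer indices back to the original strings.
decoded = [classes[i] for i in targets.tolist()]
```

Here classes plays the role of le.classes_, so it has to be kept around if you want to turn model predictions back into label strings.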