The basic idea is still to reduce the problem to multiple binary classifications.
https://github.com/keras-team/keras/issues/10371
For multi-label classification, you can try tanh + hinge with {-1, 1} values in labels like (1, -1, -1, 1).
Or sigmoid + hamming loss with {0, 1} values in labels like (1, 0, 0, 1).
In my case, sigmoid + focal loss with {0, 1} values in labels like (1, 0, 0, 1) worked well.
You can check this paper https://arxiv.org/abs/1708.02002.
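A minimal sketch of the sigmoid + focal loss option mentioned above, following the formula in the linked paper, FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t). It is plain Python for illustration only and is not taken from the issue thread; the function names `binary_focal_loss` and `multilabel_focal_loss` are hypothetical, and the defaults gamma=2, alpha=0.25 are the values the paper reports as working best.

```python
import math

def binary_focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-7):
    """Focal loss for one sigmoid output p in (0, 1) and one label y in {0, 1}."""
    p = min(max(p, eps), 1.0 - eps)              # clamp to avoid log(0)
    p_t = p if y == 1 else 1.0 - p               # probability given to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha   # class-balancing weight
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

def multilabel_focal_loss(probs, labels, gamma=2.0, alpha=0.25):
    """Mean focal loss over every (sample, label) pair of a batch.

    probs and labels are nested lists of shape [batch][num_labels].
    """
    losses = [
        binary_focal_loss(p, y, gamma, alpha)
        for row_p, row_y in zip(probs, labels)
        for p, y in zip(row_p, row_y)
    ]
    return sum(losses) / len(losses)

# One sample with labels (1, 0, 0, 1) and hypothetical sigmoid outputs.
loss = multilabel_focal_loss([[0.9, 0.2, 0.1, 0.6]], [[1, 0, 0, 1]])
```

With gamma=0 and alpha=0.5 this reduces to half the usual binary cross-entropy; raising gamma shrinks the contribution of pairs that are already classified well, so the hard pairs dominate the loss.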
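For example, a batch of 32 samples with 8 multi-label outputs can be treated as a binary classification problem over 32*8 = 256 samples. The ratio of positives to negatives among these 256 samples easily becomes imbalanced (for instance, when each sample carries only 1 or 2 labels). This is where focal loss can help a great deal.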
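The counting argument can be checked with a short sketch. The batch below is made up for illustration: each of the 32 samples is assumed to have exactly one positive label out of 8, and `flatten_labels` is a hypothetical helper name.

```python
def flatten_labels(batch_labels):
    """Treat a [batch][num_labels] label matrix as one flat list of binary targets."""
    return [y for row in batch_labels for y in row]

batch_size, num_labels = 32, 8

# Hypothetical batch: sample i has its single positive label at index i % 8.
batch = [
    [1 if j == i % num_labels else 0 for j in range(num_labels)]
    for i in range(batch_size)
]

flat = flatten_labels(batch)
positives = sum(flat)
negatives = len(flat) - positives
print(len(flat), positives, negatives)  # 256 32 224
```

Under this assumption the flattened problem has 7 negatives for every positive, which is the kind of imbalance that plain binary cross-entropy tends to handle poorly.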
https://www.kaggle.com/rejpalcz/focalloss-for-keras
class FocalLoss(nn.Module):
def __init__(self, gamma=2