Summary of Common PyTorch Loss Functions for Cross-Modal Retrieval

1. MSE Loss (regression)

Mean squared error: creates a criterion that measures the mean squared error (squared L2 norm) between each element in the input x and the target y.
ℓ(x, y) = L = {l_1, ..., l_N}, with l_n = (x_n − y_n)²; with reduction='mean' the final loss is mean(L).

    import torch
    import torch.nn as nn

    loss = nn.MSELoss()
    input = torch.randn(3, 5, requires_grad=True)
    target = torch.randn(3, 5)
    output = loss(input, target)
    output.backward()

    print(input)
    print(target)
    print(output)

tensor([[ 0.8964,  1.6948,  0.5003,  0.6851, -0.7712],
        [-0.7480, -0.1916,  0.4495, -1.2375, -0.7038],
        [-1.5244,  0.2029, -0.7153, -1.4792,  1.1071]], requires_grad=True)
tensor([[-1.2733,  0.2253,  0.1926, -1.1926,  0.5637],
        [-0.9189,  2.7922,  0.2730,  0.6243, -1.2396],
        [ 1.2193, -0.6027, -0.1948,  0.5456, -0.3350]])
tensor(2.6409, grad_fn=<MseLossBackward>)

2. BCELoss (classification)

torch.nn.BCELoss(weight=None, size_average=None, reduce=None, reduction='mean')
Function: creates a criterion that measures the binary cross entropy between the output and the target.
ℓ_n = −w_n [ y_n · log(x_n) + (1 − y_n) · log(1 − x_n) ]
This is used, for example, for measuring the reconstruction error of an auto-encoder; note that the target y should be a number between 0 and 1.

    m = nn.Sigmoid()
    loss = nn.BCELoss()
    input = torch.randn(3, requires_grad=True)
    target = torch.empty(3).random_(2)
    output = loss(m(input), target)
    output.backward()

    print(m(input))
    print(target)
    print(output)
    
tensor([0.2943, 0.4337, 0.3497], grad_fn=<SigmoidBackward>)
tensor([0., 1., 1.])
tensor(0.7449, grad_fn=<BinaryCrossEntropyBackward>)

2.1 BCEWithLogitsLoss

This criterion combines a Sigmoid layer and BCELoss in one single class, which is more numerically stable than using a plain Sigmoid followed by BCELoss.
ℓ_n = −w_n [ y_n · log σ(x_n) + (1 − y_n) · log(1 − σ(x_n)) ]

    loss = nn.BCEWithLogitsLoss()
    input = torch.randn(3, requires_grad=True)
    target = torch.empty(3).random_(2)
    output = loss(input, target)
    output.backward()

    print(input)
    print(target)
    print(output)
    
tensor([0.1406, 0.4081, 1.5632], requires_grad=True)
tensor([1., 1., 0.])
tensor(0.9628, grad_fn=<BinaryCrossEntropyWithLogitsBackward>)

3. KLDivLoss (distribution matching)

Measures how much one probability distribution differs from another.
torch.nn.KLDivLoss(size_average=None, reduce=None, reduction='mean', log_target=False)
The Kullback-Leibler divergence is a useful distance measure for continuous distributions, and is often useful when performing direct regression over the space of (discretely sampled) continuous output distributions.
Pointwise, l(x, y) = y · (log y − x), where the input x is expected to contain log-probabilities and the target y plain probabilities (with log_target=False); reduction='batchmean' divides the summed loss by the batch size, which matches the mathematical definition of KL divergence.
Reference: https://zhuanlan.zhihu.com/p/339613080
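
A minimal usage sketch (illustration only): the input must contain log-probabilities, the target contains plain probabilities when log_target=False, and reduction='batchmean' matches the mathematical definition of KL divergence.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    # 'batchmean' divides by the batch size and matches the mathematical KL definition.
    kl_loss = nn.KLDivLoss(reduction='batchmean', log_target=False)

    # input: log-probabilities from the model; target: a probability distribution.
    input = F.log_softmax(torch.randn(3, 5, requires_grad=True), dim=1)
    target = F.softmax(torch.randn(3, 5), dim=1)

    output = kl_loss(input, target)
    output.backward()
    print(output)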

4.CrossEntropyLoss

torch.nn.CrossEntropyLoss(weight=None, size_average=None, ignore_index=-100, reduce=None, reduction='mean')
This criterion combines LogSoftmax and NLLLoss in a single class. It is useful when training a classification problem with C classes. If provided, the optional argument weight should be a 1-D tensor assigning a weight to each class; this is particularly useful when the training set is unbalanced. The input is expected to contain raw, unnormalized scores for each class.
loss(x, class) = −log( exp(x[class]) / Σ_j exp(x[j]) ) = −x[class] + log( Σ_j exp(x[j]) )

    loss = nn.CrossEntropyLoss()
    input = torch.randn(3, 5, requires_grad=True)
    target = torch.empty(3, dtype=torch.long).random_(5)
    output = loss(input, target)
    output.backward()

    print(input)
    print(target)
    print(output)

tensor([[-1.3169,  0.6902, -0.3976, -0.1056,  1.6268],
        [-0.1469, -0.8665, -0.7510, -0.5994,  0.0559],
        [-0.5486, -0.6443, -0.1930,  0.7109,  0.0054]], requires_grad=True)
tensor([2, 3, 4])
tensor(1.9986, grad_fn=<NllLossBackward>)

5.TripletMarginLoss

torch.nn.TripletMarginLoss(margin=1.0, p=2.0, eps=1e-06, swap=False, size_average=None, reduce=None, reduction='mean')
Creates a criterion that measures the triplet loss given input tensors x1, x2, x3 and a margin greater than 0. It is used for measuring relative similarity between samples. A triplet is composed of a, p and n (anchor, positive example and negative example, respectively). The shapes of all input tensors should be (N, D).
L(a, p, n) = max{ d(a_i, p_i) − d(a_i, n_i) + margin, 0 }, where d(x_i, y_i) = ‖x_i − y_i‖_p

    triplet_loss = nn.TripletMarginLoss(margin=1.0, p=2)
    anchor = torch.randn(2, 3, requires_grad=True)
    positive = torch.randn(2, 3, requires_grad=True)
    negative = torch.randn(2, 3, requires_grad=True)
    output = triplet_loss(anchor, positive, negative)
    output.backward()

    print(anchor)
    print(positive)
    print(negative)
    print(output)
tensor([[ 0.2258, -0.6545, -1.5043],
        [ 1.0083,  0.6198, -0.1240]], requires_grad=True)
tensor([[ 1.8021,  1.9506, -0.6078],
        [-0.1537, -0.2082, -0.6502]], requires_grad=True)
tensor([[-0.9203,  0.4674, -0.6659],
        [-1.3081, -1.0276,  0.9357]], requires_grad=True)
tensor(1.1822, grad_fn=<MeanBackward0>)

torch.nn.TripletMarginWithDistanceLoss(*, distance_function=None, margin=1.0, swap=False, reduction='mean')
A variant of TripletMarginLoss that accepts a custom distance function for computing the anchor-positive and anchor-negative distances instead of the default pairwise p-norm, as sketched below.
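
A small sketch of plugging in a custom distance function (illustration only; cosine distance is just an arbitrary example choice):

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    # Cosine distance instead of the default pairwise p-norm distance.
    cosine_distance = lambda x, y: 1.0 - F.cosine_similarity(x, y)
    triplet_loss = nn.TripletMarginWithDistanceLoss(distance_function=cosine_distance, margin=0.5)

    anchor = torch.randn(4, 128, requires_grad=True)
    positive = torch.randn(4, 128, requires_grad=True)
    negative = torch.randn(4, 128, requires_grad=True)

    output = triplet_loss(anchor, positive, negative)
    output.backward()
    print(output)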

6.NLLLoss

torch.nn.NLLLoss(weight=None, size_average=None, ignore_index=-100, reduce=None, reduction='mean')
Negative log likelihood loss. It is useful for training a classification problem with C classes. If provided, the optional argument weight should be a 1-D tensor assigning a weight to each class; this is particularly useful when the training set is unbalanced.
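
A minimal sketch (illustration only): NLLLoss expects log-probabilities as input, so it is usually paired with nn.LogSoftmax; LogSoftmax followed by NLLLoss is equivalent to the CrossEntropyLoss shown above.

    import torch
    import torch.nn as nn

    m = nn.LogSoftmax(dim=1)
    loss = nn.NLLLoss()

    # input: raw scores for C=5 classes; target: class indices in [0, C-1].
    input = torch.randn(3, 5, requires_grad=True)
    target = torch.tensor([1, 0, 4])

    output = loss(m(input), target)
    output.backward()
    print(output)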

Adversarial cross-media retrieval (ACMR): Cross-modal retrieval aims to enable flexible retrieval experience across different modalities (e.g., texts vs. images). The core of cross-modal retrieval research is to learn a common subspace where the items of different modalities can be directly compared to each other. In this paper, we present a novel Adversarial Cross-Modal Retrieval (ACMR) method, which seeks an effective common subspace based on adversarial learning. Adversarial learning is implemented as an interplay between two processes. The first process, a feature projector, tries to generate a modality-invariant representation in the common subspace and to confuse the other process, the modality classifier, which tries to discriminate between different modalities based on the generated representation. We further impose triplet constraints on the feature projector in order to minimize the gap among the representations of all items from different modalities with the same semantic labels, while maximizing the distances among semantically different images and texts. Through the joint exploitation of the above, the underlying cross-modal semantic structure of multimedia data is better preserved when this data is projected into the common subspace. Comprehensive experimental results on four widely used benchmark datasets show that the proposed ACMR method is superior in learning effective subspace representation and that it significantly outperforms the state-of-the-art cross-modal retrieval methods.
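
A heavily simplified sketch of how the losses summarized above can be combined in the spirit of ACMR (illustration only, with made-up module names such as feature_projector and modality_classifier; this is not the authors' released code): the modality classifier learns to tell image features from text features with a binary cross-entropy objective, while the feature projector is trained with a triplet term plus an adversarial term that tries to fool the classifier.

    import torch
    import torch.nn as nn

    # Hypothetical toy modules, for illustration only.
    feature_projector = nn.Sequential(nn.Linear(512, 128), nn.ReLU(), nn.Linear(128, 64))
    modality_classifier = nn.Linear(64, 1)  # predicts: image (1) vs. text (0)

    bce_logits = nn.BCEWithLogitsLoss()
    triplet = nn.TripletMarginLoss(margin=1.0, p=2)

    # Fake batch: image anchors, matching texts (positives), non-matching texts (negatives).
    img = torch.randn(8, 512)
    txt_pos = torch.randn(8, 512)
    txt_neg = torch.randn(8, 512)

    img_emb = feature_projector(img)
    pos_emb = feature_projector(txt_pos)
    neg_emb = feature_projector(txt_neg)

    # Modality classifier loss: discriminate images (label 1) from texts (label 0).
    logits = torch.cat([modality_classifier(img_emb), modality_classifier(pos_emb)]).squeeze(1)
    labels = torch.cat([torch.ones(8), torch.zeros(8)])
    d_loss = bce_logits(logits, labels)

    # Feature projector loss: preserve semantics with a triplet term and confuse the
    # classifier with an adversarial term (flipped modality labels).
    adv_loss = bce_logits(logits, 1.0 - labels)
    g_loss = triplet(img_emb, pos_emb, neg_emb) + adv_loss

    print(d_loss, g_loss)

In actual training the two losses would be optimized alternately with separate optimizers (and the classifier input detached when updating it); the sketch only shows how the loss terms are composed.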