Sampling Methods

Candidate Sampling

@(Machine Learning)

In a multiclass or multi-label classification problem, each training example has the form $(x_i, T_i)$, where the target set $T_i$ is a tiny subset of the full label set $L$.

“Exhaustive” training methods such as softmax and logistic regression require us to compute F(x, y) for every class y ∈ L for every training example. When |L| is very large, this can be prohibitively expensive.

"Candidate Sampling": for each example $(x_i, T_i)$, sample a subset $S_i$ of the labels and form the candidate set $C_i = T_i \cup S_i$; the evaluation function $F(x, y)$ is then computed only over the classes in $C_i$, and the loss is built from those values.

[Table: comparison of candidate sampling methods (positive/negative class sets and training loss for each), from the Candidate Sampling Algorithms Reference linked below.]

$Q(y|x)$: the sampling function, i.e., the probability, given $x$, that class $y$ appears among the sampled classes.

$$\text{logistic training loss} = \sum_i \Big( \sum_{y \in POS_i} \log\big(1 + \exp(-G(x_i, y))\big) + \sum_{y \in NEG_i} \log\big(1 + \exp(G(x_i, y))\big) \Big)$$

$$\text{softmax training loss} = \sum_i \Big( -G(x_i, t_i) + \log \sum_{y \in POS_i \cup NEG_i} \exp(G(x_i, y)) \Big)$$

NCE and Negative Sampling generalize to the case where $T_i$ is a multiset; in this case, $P(y|x)$ denotes the expected count of $y$ in $T_i$. Similarly, NCE, Negative Sampling, and Sampled Logistic generalize to the case where $S_i$ is a multiset; in this case $Q(y|x)$ denotes the expected count of $y$ in $S_i$.
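To make the two loss templates concrete, here is a minimal single-example NumPy sketch; the helper name `candidate_losses` and the toy scores are illustrative, not part of the reference:

```python
import numpy as np

def candidate_losses(G_pos, G_neg):
    """Toy single-example version of the two training losses above.

    G_pos: scores G(x_i, y) for y in POS_i (one entry for a single target).
    G_neg: scores G(x_i, y) for y in NEG_i (the sampled negative classes).
    """
    # Logistic loss: push positive scores up and negative scores down.
    logistic = (np.log1p(np.exp(-G_pos)).sum()
                + np.log1p(np.exp(G_neg)).sum())
    # Softmax loss over the candidate set POS_i ∪ NEG_i only.
    scores = np.concatenate([G_pos, G_neg])
    softmax = -G_pos[0] + np.log(np.exp(scores).sum())
    return logistic, softmax

# One target score and three sampled negative scores.
print(candidate_losses(np.array([2.0]), np.array([-1.0, 0.5, -0.3])))
```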

Sampled Softmax

$$F(x, y) \leftarrow \log P(y|x) + K(x)$$

where $K(x)$ is an arbitrary function that does not depend on the class $y$.

In full softmax training, every example $(x_i, t_i)$ requires computing logits for all classes in $L$; when $|L|$ is very large this is prohibitively expensive.
In sampled softmax, for each training example we draw a subset $S_i \subset L$ according to the sampling function $Q(y|x)$, each class $y$ being included independently with probability $Q(y|x_i)$:

$$P(S_i = S \mid x_i) = \prod_{y \in S} Q(y|x_i) \prod_{y \in (L - S)} \big(1 - Q(y|x_i)\big)$$
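In TensorFlow, $Q$ is supplied by a `*_candidate_sampler` op. A minimal sketch of the default sampler, assuming TF 2.x (the toy batch and sizes are arbitrary):

```python
import tensorflow as tf

# Target classes t_i for a batch of 4 examples, shape [batch_size, num_true].
true_classes = tf.constant([[12], [7], [503], [0]], dtype=tf.int64)

# Draw S_i from a log-uniform (Zipfian) distribution over 1000 classes;
# this suits vocabularies where frequent classes have small ids.
sampled, true_expected, sampled_expected = tf.random.log_uniform_candidate_sampler(
    true_classes=true_classes,
    num_true=1,
    num_sampled=5,   # |S_i|
    unique=True,     # sample without replacement
    range_max=1000)  # |L|

# `sampled` holds S_i (shared across the batch); the two expected-count
# outputs play the role of Q(y|x_i) for the true and the sampled classes.
print(sampled.numpy(), true_expected.numpy(), sampled_expected.numpy())
```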

Candidate set:

$$C_i = S_i \cup \{t_i\}$$

Goal: given the set $C_i$, determine which class in $C_i$ is the target class.
For each class $y \in C_i$, we want the posterior probability that $y$ is the target given $x_i$ and $C_i$, i.e. $P(t_i = y \mid x_i, C_i)$.

By Bayes' rule,

$$P(A, B \mid C) = P(A \mid B, C)\, P(B \mid C)$$

so

$$P(t_i = y \mid x_i, C_i) = \frac{P(t_i = y, C_i \mid x_i)}{P(C_i \mid x_i)} = \frac{P(t_i = y \mid x_i)\, P(C_i \mid t_i = y, x_i)}{P(C_i \mid x_i)}$$

Expanding $P(C_i \mid t_i = y, x_i)$ via the independent per-class sampling of $S_i$ (every class in $C_i$ other than $y$ must have been sampled, while whether $y$ itself was sampled does not matter), all factors that do not depend on $y$ cancel against $P(C_i \mid x_i)$, leaving

$$P(t_i = y \mid x_i, C_i) = \frac{P(y|x_i)/Q(y|x_i)}{\sum_{y' \in C_i} P(y'|x_i)/Q(y'|x_i)}$$

So the correct relative logit for class $y \in C_i$ is $\log P(y|x_i) - \log Q(y|x_i) = F(x_i, y) - \log Q(y|x_i) + K(x_i)$: subtract $\log Q(y|x_i)$ from the raw logits, then apply an ordinary softmax cross-entropy over the candidate set.
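A quick NumPy sanity check of this identity (toy numbers of my own; note that the arbitrary offset $K(x)$ cancels):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(0)
logp = np.log(rng.dirichlet(np.ones(5)))  # log P(y|x_i) for y in C_i
logq = np.log(rng.dirichlet(np.ones(5)))  # log Q(y|x_i) for y in C_i

# Posterior from the formula above: P/Q normalized over C_i.
ratio = np.exp(logp - logq)
posterior = ratio / ratio.sum()

# Softmax of the corrected logits, with an arbitrary K(x) = 3.7 offset.
print(np.allclose(posterior, softmax(logp - logq + 3.7)))  # True
```

TensorFlow applies exactly this correction inside `tf.nn.sampled_softmax_loss` (note `subtract_log_q=True` below):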

def sampled_softmax_loss(weights, biases, inputs, labels, num_sampled,
                         num_classes, num_true=1,
                         sampled_values=None,
                         remove_accidental_hits=True,
                         partition_strategy="mod",
                         name="sampled_softmax_loss"):
  """Computes and returns the sampled softmax training loss.

  This is a faster way to train a softmax classifier over a huge number of
  classes.

  This operation is for training only.  It is generally an underestimate of
  the full softmax loss.
  (Use only during training; in the prediction phase, use the full softmax.)
  At inference time, you can compute full softmax probabilities with the
  expression `tf.nn.softmax(tf.matmul(inputs, weights) + biases)`.

  See our [Candidate Sampling Algorithms Reference]
  (../../extras/candidate_sampling.pdf)

  Also see Section 3 of [Jean et al., 2014](http://arxiv.org/abs/1412.2007)
  ([pdf](http://arxiv.org/pdf/1412.2007.pdf)) for the math.

  Args:
    weights: A `Tensor` of shape `[num_classes, dim]`, or a list of `Tensor`
        objects whose concatenation along dimension 0 has shape
        [num_classes, dim].  The (possibly-sharded) class embeddings.
    biases: A `Tensor` of shape `[num_classes]`.  The class biases.
    inputs: A `Tensor` of shape `[batch_size, dim]`.  The forward
        activations of the input network.
    labels: A `Tensor` of type `int64` and shape `[batch_size,
        num_true]`. The target classes.  Note that this format differs from
        the `labels` argument of `nn.softmax_cross_entropy_with_logits`.
    num_sampled: An `int`.  The number of classes to randomly sample per batch.
    num_classes: An `int`. The number of possible classes.
    num_true: An `int`.  The number of target classes per training example.
    sampled_values: a tuple of (`sampled_candidates`, `true_expected_count`,
        `sampled_expected_count`) returned by a `*_candidate_sampler` function.
        (if None, we default to `log_uniform_candidate_sampler`)
    remove_accidental_hits:  A `bool`.  whether to remove "accidental hits"
        where a sampled class equals one of the target classes.  Default is
        True.
    partition_strategy: A string specifying the partitioning strategy, relevant
        if `len(weights) > 1`. Currently `"div"` and `"mod"` are supported.
        Default is `"mod"`. See `tf.nn.embedding_lookup` for more details.
    name: A name for the operation (optional).

  Returns:
    A `batch_size` 1-D tensor of per-example sampled softmax losses.

  """
  logits, labels = _compute_sampled_logits(
      weights, biases, inputs, labels, num_sampled, num_classes,
      num_true=num_true,
      sampled_values=sampled_values,
      subtract_log_q=True,
      remove_accidental_hits=remove_accidental_hits,
      partition_strategy=partition_strategy,
      name=name)
  sampled_losses = nn_ops.softmax_cross_entropy_with_logits(logits, labels)
  # sampled_losses is a [batch_size] tensor.
  return sampled_losses
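A minimal usage sketch, assuming the TF 2.x public API (where `labels` precedes `inputs` and `partition_strategy` is gone; keyword arguments sidestep the ordering, and the dimensions are toy values):

```python
import tensorflow as tf

num_classes, dim, batch_size, num_sampled = 10_000, 64, 32, 100

# Class embeddings and biases of the output layer.
weights = tf.Variable(tf.random.normal([num_classes, dim], stddev=0.05))
biases = tf.Variable(tf.zeros([num_classes]))

inputs = tf.random.normal([batch_size, dim])            # forward activations
labels = tf.random.uniform([batch_size, 1], maxval=num_classes,
                           dtype=tf.int64)              # target classes t_i

# Training: sampled softmax over num_sampled + num_true classes per example.
loss = tf.reduce_mean(tf.nn.sampled_softmax_loss(
    weights=weights, biases=biases,
    labels=labels, inputs=inputs,
    num_sampled=num_sampled, num_classes=num_classes))

# Inference: full softmax over all classes, as the docstring recommends.
probs = tf.nn.softmax(tf.matmul(inputs, weights, transpose_b=True) + biases)
```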