Image retrieval: Unifying Deep Local and Global Features for Image Search (DELG, ECCV 2020)

1. Paper: Unifying Deep Local and Global Features for Image Search

https://arxiv.org/pdf/2001.05027.pdf

https://github.com/tensorflow/models/tree/master/research/delf

Commentary:

Unifying Deep Local and Global Features for Image Search (2020) (Part 14)


Note: as of late October, the official code did not yet include the reconstruction loss described in the paper.

1. Issue: https://github.com/tensorflow/models/issues/9189
The authors say training works better when the local descriptors are not pooled directly (pooling them would yield a lower-dimensional pooled feature that is harder to optimize):

We found that pooling on the local descriptor directly can be difficult, because it will lead to a lower dimensionality pooled feature, which may have a difficult time optimizing the classifier (consequently, the attention layers cannot be learned so well). Also, if we allow the local descriptors to be tuned directly based on this loss, it may produce abstract/high-level representations which may be good for the attention loss optimization, but not necessarily for local descriptor matching (the local descriptors may become less localizable).

Attention classifier should be able to reuse ArcFace as well; it requires an additional hyperparameter to be set though (the ArcFace margin). The goal of the attention layer is not to produce a powerful global feature, but rather to learn well the attention keypoint detection; so the ArcFace loss may not contribute much to this goal.
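For context, DELG trains the global descriptor with a cosine classifier plus an ArcFace margin, while the attention branch keeps a plain classifier. A minimal sketch of an ArcFace-margin logit computation, with illustrative margin and scale values (assumptions, not the repo's settings):

import tensorflow as tf

def arcface_logits(embeddings, weights, labels, margin=0.1, scale=32.0):
  """Cosine logits with an additive angular (ArcFace) margin on the true class.

  embeddings: [batch, dim] global descriptors.
  weights: [dim, num_classes] classifier weights.
  labels: [batch] integer class ids.
  margin, scale: illustrative hyperparameters (assumptions, not repo defaults).
  """
  emb = tf.nn.l2_normalize(embeddings, axis=1)
  w = tf.nn.l2_normalize(weights, axis=0)
  cos_theta = tf.matmul(emb, w)                                  # [batch, num_classes]
  theta = tf.acos(tf.clip_by_value(cos_theta, -1.0 + 1e-7, 1.0 - 1e-7))
  one_hot = tf.one_hot(labels, tf.shape(weights)[1])
  # Add the margin only to the angle of the ground-truth class.
  logits = tf.where(tf.cast(one_hot, tf.bool), tf.cos(theta + margin), cos_theta)
  return scale * logits

The attention classifier, by contrast, can stay a plain softmax classifier, since its purpose is to learn good attention scores rather than a strong global embedding.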

Excerpt from the relevant TensorFlow implementation:
https://github.com/tensorflow/models/blob/4437d7b4b17c5535d516bcb4038ff9397ae9eef9/research/delf/delf/python/training/model/delf_model.py#L83

How keypoint locations are obtained from the local features is explained in this issue:
https://github.com/tensorflow/models/issues/3387

Great to hear you were able to train it!

The step you seem to be missing is to apply some post-processing operations to the extracted features. Essentially, you need to call the ExtractKeypointDescriptor function (from the feature_extractor.py file), which will give you the boxes, features, etc (note that this function requires a model_fn argument, which you should set to the output of BuildModel). A simplified example of how to use ExtractKeypointDescriptor can be seen in the file feature_extractor_test.py.

After extracting those, you can then call DelfFeaturePostProcessing to obtain the final locations and descriptors.

Hope this helps!
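Conceptually, that post-processing turns the attention map and the feature map into keypoint locations and local descriptors. The sketch below only illustrates the idea (threshold the attention probabilities and map selected grid cells back to pixel coordinates); it is not the repo's ExtractKeypointDescriptor / DelfFeaturePostProcessing code, and the threshold and stride values are assumptions:

import tensorflow as tf

def select_local_features(features, attn_prob, score_threshold=0.5, stride=16.0):
  """Keep feature-map positions whose attention probability exceeds a threshold.

  features: [H, W, C] local descriptor map for one image (e.g. block3 output).
  attn_prob: [H, W, 1] attention probabilities from the attention module.
  stride: assumed backbone stride used to map grid cells back to pixel coordinates.
  Returns (locations, descriptors, scores) for the selected positions.
  """
  prob = tf.squeeze(attn_prob, axis=-1)                  # [H, W]
  indices = tf.where(prob > score_threshold)             # [N, 2] (y, x) grid coordinates
  descriptors = tf.gather_nd(features, indices)          # [N, C]
  scores = tf.gather_nd(prob, indices)                   # [N]
  # Approximate keypoint locations as the centers of the selected grid cells.
  locations = (tf.cast(indices, tf.float32) + 0.5) * stride
  return locations, descriptors, scores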
Loss computation: only two terms; the autoencoder reconstruction loss (MSE) from the paper is missing.

        desc_loss = compute_loss(labels, desc_logits)  # global descriptor loss

        # Calculate attention loss by applying the attention block classifier.
        attn_logits = model.attn_classification(attn_prelogits)
        attn_loss = compute_loss(labels, attn_logits)  # attention loss 

        # Cumulate global loss and attention loss.
        total_loss = desc_loss + FLAGS.attention_loss_weight * attn_loss
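For comparison, the missing term is the paper's autoencoder reconstruction loss: block3 is compressed to a low-dimensional local descriptor with a 1x1 convolution, expanded back to the original channel count, and penalized with an MSE against the original block3 output. A rough sketch of such a term, where the 128-d bottleneck, the activation placement and the layer names are assumptions rather than the official code:

import tensorflow as tf
from tensorflow.keras import layers

def reconstruction_loss_sketch(block3, bottleneck_dim=128):
  """MSE between block3 and its reconstruction through a 1x1-conv bottleneck."""
  channels = block3.shape[-1]
  # In real training these layers would be created once and reused across steps.
  encoder = layers.Conv2D(bottleneck_dim, 1, name='autoenc_conv1')  # compress
  decoder = layers.Conv2D(channels, 1, name='autoenc_conv2')        # expand back
  reconstructed = decoder(tf.nn.relu(encoder(block3)))
  return tf.reduce_mean(tf.math.squared_difference(block3, reconstructed))

Such a term would then be added to total_loss with its own weight, analogous to FLAGS.attention_loss_weight above.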
Attention module structure:
The second convolution has a single filter (1 output channel), not 512.
feat = L2-normalized input features weighted by the attention probabilities, then averaged over the spatial dimensions.

import tensorflow as tf

layers = tf.keras.layers
reg = tf.keras.regularizers
_DECAY = 0.0001  # l2 regularization decay for the convolution kernels

class AttentionModel(tf.keras.Model):
  """Instantiates attention model.
  Uses two [kernel_size x kernel_size] convolutions and softplus as activation
  to compute an attention map with the same resolution as the feature map.
  Features are l2-normalized and aggregated using attention probabilities as weights.
  """

  def __init__(self, kernel_size=1, decay=_DECAY, name='attention'):
    """Initialization of attention model.
    Args:
      kernel_size: int, kernel size of convolutions.
      decay: float, decay for l2 regularization of kernel weights.
      name: str, name to identify model.
    """
    super(AttentionModel, self).__init__(name=name)

    # First convolutional layer (called with relu activation).
    self.conv1 = layers.Conv2D(
        512,
        kernel_size,
        kernel_regularizer=reg.l2(decay),
        padding='same',
        name='attn_conv1')
    self.bn_conv1 = layers.BatchNormalization(axis=3, name='bn_conv1')

    # Second convolutional layer, with softplus activation.
    self.conv2 = layers.Conv2D(
        1,
        kernel_size,
        kernel_regularizer=reg.l2(decay),
        padding='same',
        name='attn_conv2')
    self.activation_layer = layers.Activation('softplus')

  def call(self, inputs, training=True):
    x = self.conv1(inputs)
    x = self.bn_conv1(x, training=training)
    x = tf.nn.relu(x)

    score = self.conv2(x)
    prob = self.activation_layer(score)

    # L2-normalize the featuremap before pooling.
    inputs = tf.nn.l2_normalize(inputs, axis=-1)
    feat = tf.reduce_mean(tf.multiply(inputs, prob), [1, 2], keepdims=False)

    # Compared with DELF (2018), DELG additionally returns the pooled feat:
    # - feat is the attention-pooled feature that feeds the attention classifier (attention loss).
    # - prob and the layer3 (block3) output are involved in the reconstruction loss.
    return feat, prob, score
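A quick shape check of the attention module above, using a dummy block3-style feature map (the batch size, spatial size and 1024 channels here are arbitrary choices for illustration):

import tensorflow as tf

dummy_block3 = tf.random.normal([2, 32, 32, 1024])   # stand-in for block3 features

attention = AttentionModel(kernel_size=1)
feat, prob, score = attention(dummy_block3, training=False)

print(feat.shape)   # (2, 1024)       attention-weighted, spatially pooled descriptor
print(prob.shape)   # (2, 32, 32, 1)  attention probabilities (softplus of scores)
print(score.shape)  # (2, 32, 32, 1)  raw attention scores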

The two methods below are from the Delf model class in delf_model.py (not part of AttentionModel); note the tf.stop_gradient on block3, which keeps the attention branch from backpropagating into the backbone.

  def global_and_local_forward_pass(self, images, training=True):
    """Run a forward to calculate global descriptor and attention prelogits.
    Args:
      images: Tensor containing the dataset on which to run the forward pass.
      training: Indicator of whether the forward pass is running in training mode
        or not.
    Returns:
      Global descriptor prelogits, attention prelogits, attention scores,
        backbone weights.
    """
    backbone_blocks = {}
    desc_prelogits = self.backbone.build_call(
        images, intermediates_dict=backbone_blocks, training=training)
    # Prevent gradients from propagating into the backbone. See DELG paper:
    # https://arxiv.org/abs/2001.05027.
    block3 = backbone_blocks['block3']  # pytype: disable=key-error
    block3 = tf.stop_gradient(block3)
    attn_prelogits, attn_scores, _ = self.attention(block3, training=training)
    return desc_prelogits, attn_prelogits, attn_scores, backbone_blocks

  def build_call(self, input_image, training=True):
    (global_feature, _, attn_scores,
     backbone_blocks) = self.global_and_local_forward_pass(input_image,
                                                           training)
    features = backbone_blocks['block3']  # pytype: disable=key-error
    return global_feature, attn_scores, features

  def call(self, input_image, training=True):
    _, probs, features = self.build_call(input_image, training=training)
    return probs, features
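Putting the pieces together, a hypothetical single-image inference flow could look like the following, where model stands for a trained Delf instance built from delf_model.py and select_local_features is the illustrative helper sketched earlier (neither reproduces the repo's actual extraction pipeline):

import tensorflow as tf

# Stand-in for a preprocessed query image (batch of 1; size is arbitrary here).
image = tf.random.uniform([1, 512, 512, 3])

# One forward pass gives the global descriptor, the attention map and block3 features.
global_feature, attn_scores, features = model.build_call(image, training=False)

# Keep local descriptors only at positions with strong attention.
locations, descriptors, scores = select_local_features(
    features[0], attn_scores[0], score_threshold=0.5)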





