Image Retrieval: Unifying Deep Local and Global Features for Image Search (ECCV 2020)

贝猫说python

Article link: https://blog.csdn.net/m0_37192554/article/details/109314265

1. Paper: Unifying Deep Local and Global Features for Image Search

https://arxiv.org/pdf/2001.05027.pdf

https://github.com/tensorflow/models/tree/master/research/delf

Commentary

Unifying Deep Local and Global Features for Image Search (2020) (Part 14)

Note: as of the end of October, the official code does not include the reconstruction loss described in the paper.

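1. Issue: https://github.com/tensorflow/models/issues/9189
The authors say that training works better when the local descriptors are not pooled directly, because pooling them yields a lower-dimensional pooled feature: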

We found that pooling on the local descriptor directly can be difficult, because it will lead to a lower dimensionality pooled feature, which may have a difficult time optimizing the classifier (consequently, the attention layers cannot be learned so well). Also, if we allow the local descriptors to be tuned directly based on this loss, it may produce abstract/high-level representations which may be good for the attention loss optimization, but not necessarily for local descriptor matching (the local descriptors may become less localizable).

Attention classifier should be able to reuse ArcFace as well; it requires an additional hyperparameter to be set though (the ArcFace margin). The goal of the attention layer is not to produce a powerful global feature, but rather to learn well the attention keypoint detection; so the ArcFace loss may not contribute much to this goal.
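
To make the "additional hyperparameter" in the reply concrete, here is a minimal NumPy sketch of ArcFace-style logits. It is not the official DELG implementation: the function name `arcface_logits` and the default `margin` and `scale` values are assumptions chosen only for illustration.

```python
import numpy as np


def arcface_logits(embeddings, class_weights, labels, margin=0.1, scale=30.0):
    """ArcFace-style logits (illustrative sketch, hypothetical defaults).

    embeddings:    (N, D) float array of descriptors.
    class_weights: (D, C) float array, one column per class.
    labels:        (N,) integer class ids.
    """
    # Cosine similarity between unit-norm embeddings and unit-norm class weights.
    emb = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    w = class_weights / np.linalg.norm(class_weights, axis=0, keepdims=True)
    cos = np.clip(emb @ w, -1.0, 1.0)

    # Add the angular margin only to the ground-truth class of each sample.
    rows = np.arange(len(labels))
    adjusted = cos.copy()
    adjusted[rows, labels] = np.cos(np.arccos(cos[rows, labels]) + margin)
    return scale * adjusted


# Usage: two samples, three classes, four-dimensional descriptors.
rng = np.random.default_rng(0)
logits = arcface_logits(rng.normal(size=(2, 4)), rng.normal(size=(4, 3)),
                        labels=np.array([0, 2]))
print(logits.shape)
```

The margin is the extra value that would have to be tuned for the attention classifier, which is the cost the reply weighs against a goal (keypoint detection) that may not benefit from it.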

# Files involved in turning local features into keypoints (post-processing)
https://github.com/tensorflow/models/issues/3387

Great to hear you were able to train it!

The step you seem to be missing is to apply some post-processing operations to the extracted features. Essentially, you need to call the ExtractKeypointDescriptor function (from the feature_extractor.py file), which will give you the boxes, features, etc (note that this function requires a model_fn argument, which you should set to the output of BuildModel). A simplified example of how to use ExtractKeypointDescriptor can be seen in the file feature_extractor_test.py.

After extracting those, you can then call DelfFeaturePostProcessing to obtain the final locations and descriptors.

Hope this helps!
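
The idea behind that post-processing can be sketched in plain NumPy: keep the feature-map cells whose attention score is high enough, then read off their locations and descriptors. This is only an illustration, not the library's ExtractKeypointDescriptor or DelfFeaturePostProcessing; `select_keypoints`, its parameters, and the cell-centre coordinate mapping via a fixed `stride` are all assumptions.

```python
import numpy as np


def select_keypoints(attention, features, score_threshold=0.5,
                     max_feature_num=1000, stride=16):
    """Pick the highest-scoring feature-map cells (illustrative sketch).

    attention: (H, W) attention scores.
    features:  (H, W, C) local descriptors on the same grid.
    Returns (locations, descriptors, scores), sorted by descending score.
    """
    # Cells whose attention score passes the threshold.
    ys, xs = np.nonzero(attention > score_threshold)
    scores = attention[ys, xs]

    # Keep at most max_feature_num cells, strongest first.
    order = np.argsort(-scores, kind="stable")[:max_feature_num]
    ys, xs, scores = ys[order], xs[order], scores[order]

    descriptors = features[ys, xs]
    # Assumption: map each cell to the centre of a stride x stride image patch.
    locations = np.stack([(ys + 0.5) * stride, (xs + 0.5) * stride], axis=1)
    return locations, descriptors, scores


# Usage on a tiny 2x2 feature map with 3-dimensional descriptors.
attention = np.array([[0.1, 0.9], [0.7, 0.2]])
features = np.arange(12, dtype=float).reshape(2, 2, 3)
locations, descriptors, scores = select_keypoints(attention, features)
print(locations, descriptors, scores)
```

The real functions named in the reply do more than this sketch; it only shows why attention scores and the feature map must be combined after the forward pass.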

Loss computation: only two terms; the autoencoder reconstruction loss (MSE loss) from the paper is absent.

        desc_loss = compute_loss(labels, desc_logits)  # global descriptor loss

        # Calculate attention loss by applying the attention block classifier.
        attn_logits = model.attn_classification(attn_prelogits)
        attn_loss = compute_loss(labels, attn_logits)  # attention loss 

        # Cumulate global loss and attention loss.
        total_loss = desc_loss + FLAGS.attention_loss_weight * attn_loss
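
For comparison, the paper's objective also has an autoencoder reconstruction term. The NumPy sketch below shows how a three-term loss of that kind could be combined. It is not taken from the official code: `softmax_cross_entropy`, `reconstruction_loss`, `total_loss_with_reconstruction`, and both weight arguments are hypothetical names, and the default weights are placeholders.

```python
import numpy as np


def softmax_cross_entropy(labels, logits):
    """Mean softmax cross-entropy for integer labels and (N, C) logits."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(len(labels)), labels].mean())


def reconstruction_loss(features, reconstructed):
    """Mean squared error between features and their autoencoder output."""
    return float(np.mean((features - reconstructed) ** 2))


def total_loss_with_reconstruction(desc_loss, attn_loss, recon_loss,
                                   attention_loss_weight=1.0,
                                   reconstruction_loss_weight=1.0):
    """Weighted sum of the global, attention, and reconstruction terms."""
    return (desc_loss
            + attention_loss_weight * attn_loss
            + reconstruction_loss_weight * recon_loss)


# Usage with dummy values: 2 samples, 4 classes, a 3x3x8 feature map.
labels = np.array([1, 3])
desc_loss = softmax_cross_entropy(labels, np.zeros((2, 4)))
attn_loss = softmax_cross_entropy(labels, np.zeros((2, 4)))
feats = np.ones((2, 3, 3, 8))
recon_loss = reconstruction_loss(feats, feats + 1.0)
print(total_loss_with_reconstruction(desc_loss, attn_loss, recon_loss))
```

With the reconstruction weight set to zero, this reduces to the two-term sum used in the official snippet.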

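Attention module
The second convolution has 1 filter, not 512.
feat = L2-normalized input features * attention weights (averaged over spatial positions)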
class AttentionModel(tf.keras.Model):
  """Instantiates attention model.
  Uses two [kernel_size x kernel_size] convolutions and softplus as activation
  to compute an attention map with the same resolution as the featuremap.
  Features are l2-normalized and aggregated using attention probabilities as weights.
  """

  def __init__(self, kernel_size=1, decay=_DECAY, name='attention'):
    """Initialization of attention model.
    Args:
      kernel_size: int, kernel size of convolutions.
      decay: float, decay for l2 regularization of kernel weights.
      name: str, name to identify model.
    """
    super(AttentionModel, self).__init__(name=name)

    # First convolutional layer (called with relu activation).
    self.conv1 = layers.Conv2D(
        512,
        kernel_size,
        kernel_regularizer=reg.l2(decay),
        padding='same',
        name='attn_conv1')
    self.bn_conv1 = layers.BatchNormalization(axis=3, name='bn_conv1')

    # Second convolutional layer, with softplus activation.
    self.conv2 = layers.Conv2D(
        1,
        kernel_size,
        kernel_regularizer=reg.l2(decay),
        padding='same',
        name='attn_conv2')
    self.activation_layer = layers.Activation('softplus')

  def call(self, inputs, training=True):
    x = self.conv1(inputs)
    x = self.bn_conv1(x, training=training)
    x = tf.nn.relu(x)

    score = self.conv2(x)
    prob = self.activation_layer(score)

    # L2-normalize the featuremap before pooling.
    inputs = tf.nn.l2_normalize(inputs, axis=-1)
    feat = tf.reduce_mean(tf.multiply(inputs, prob), [1, 2], keepdims=False)

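    # Compared with DELF (2018), DELG adds the `feat` output.
    # `feat` goes to the pooling/classifier layer and the attention loss.
    # `prob` and the layer3 output are what the reconstruction loss would use.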
    return feat, prob, score
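
The pooling at the end of `call` can be checked outside TensorFlow. The following NumPy sketch approximately mirrors those lines (softplus on the scores, L2-normalization over channels, attention-weighted spatial mean); `attention_pool` and `softplus` are hypothetical helper names, and the small epsilon guarding against zero norms is an assumption.

```python
import numpy as np


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return np.logaddexp(0.0, x)


def attention_pool(features, scores, eps=1e-12):
    """Attention-weighted mean of L2-normalized features (illustrative sketch).

    features: (B, H, W, C) feature map.
    scores:   (B, H, W, 1) raw attention scores (output of the second conv).
    Returns (feat, prob) with shapes (B, C) and (B, H, W, 1).
    """
    prob = softplus(scores)
    # L2-normalize each spatial position's descriptor over the channel axis.
    norms = np.linalg.norm(features, axis=-1, keepdims=True)
    normalized = features / np.maximum(norms, eps)
    # Weight by attention and average over height and width.
    feat = (normalized * prob).mean(axis=(1, 2))
    return feat, prob


# Usage: batch of 2, a 3x4 grid, 5 channels.
rng = np.random.default_rng(0)
feat, prob = attention_pool(rng.normal(size=(2, 3, 4, 5)),
                            rng.normal(size=(2, 3, 4, 1)))
print(feat.shape, prob.shape)
```

Because the second convolution has a single filter, `prob` has one channel and broadcasts across all feature channels in the multiplication.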

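  # Note: the methods below belong to the model class that owns self.backbone
  # and self.attention (the Delf class in delf_model.py), not to AttentionModel.
  def global_and_local_forward_pass(self, images, training=True):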
    """Run a forward to calculate global descriptor and attention prelogits.
    Args:
      images: Tensor containing the dataset on which to run the forward pass.
      training: Indicator of whether the forward pass is running in training mode
        or not.
    Returns:
      Global descriptor prelogits, attention prelogits, attention scores,
        backbone weights.
    """
    backbone_blocks = {}
    desc_prelogits = self.backbone.build_call(
        images, intermediates_dict=backbone_blocks, training=training)
    # Prevent gradients from propagating into the backbone. See DELG paper:
    # https://arxiv.org/abs/2001.05027.
    block3 = backbone_blocks['block3']  # pytype: disable=key-error
    block3 = tf.stop_gradient(block3)
    attn_prelogits, attn_scores, _ = self.attention(block3, training=training)
    return desc_prelogits, attn_prelogits, attn_scores, backbone_blocks

  def build_call(self, input_image, training=True):
    (global_feature, _, attn_scores,
     backbone_blocks) = self.global_and_local_forward_pass(input_image,
                                                           training)
    features = backbone_blocks['block3']  # pytype: disable=key-error
    return global_feature, attn_scores, features

  def call(self, input_image, training=True):
    _, probs, features = self.build_call(input_image, training=training)
    return probs, features
