Meta & Few-Shot Learning

1. Basic Concepts

  1. Meta Learning
    • Few-shot learning is a kind of meta learning.
    • Meta learning: learn to learn.
  2. Supervised Learning vs. Few-Shot Learning
    1. Traditional supervised learning:
      • Test samples are never seen before.
      • Test samples are from known classes.
    2. Few-shot learning:
      • Query samples are never seen before.
      • Query samples are from unknown classes.
  3. Terminology
    • Training Set: a large-scale labeled dataset used to learn a similarity function (or an embedding).
    • Support Set: a small labeled set provided at prediction time.
      • k-way: the support set has k classes.
      • n-shot: every class has n samples.
        • 3-way is easier than 6-way;
        • 2-shot is easier than 1-shot.
    • Query: the sample whose class we want to predict.
  4. Idea: Learn a Similarity Function
    • Basic Idea:
      • Learn a similarity function: $sim(x, x^*)$.
      • Ideally, $sim(x_1, x_2) = 1$, $sim(x_1, x_3) = 0$, and $sim(x_2, x_3) = 0$, where $x_1$ and $x_2$ are from the same class and $x_3$ is from a different class.
    • Step:
      • First, learn a similarity function from a large-scale training dataset.
      • Then, apply the similarity function for prediction (a minimal sketch follows this list):
        • Compare the query with every sample in the support set.
        • Find the sample with the highest similarity score and output its class.
  5. Datasets
    • Commonly used benchmarks include Omniglot and Mini-ImageNet.
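
To make the prediction step concrete, here is a minimal sketch of similarity-based few-shot prediction. `embed` stands for any pretrained feature extractor and is a hypothetical placeholder; cosine similarity plays the role of $sim$.

```python
import numpy as np

def cosine_sim(a, b):
    # Cosine similarity between two feature vectors.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def predict(embed, query, support_samples, support_labels):
    """Compare the query with every sample in the support set and
    return the label of the most similar one (1-nearest neighbor)."""
    q = embed(query)
    scores = [cosine_sim(q, embed(s)) for s in support_samples]
    return support_labels[int(np.argmax(scores))]
```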

2. Siamese Network

2.1 Learning Pairwise Similarity Scores

Ref:

  • Bromley et al. Signature verification using a Siamese time delay neural network. In NIPS, 1994.
  • Koch, Zemel, & Salakhutdinov. Siamese neural networks for one-shot image recognition. In ICML, 2015.
  1. Data for the Training Set
    Each time, select two training samples and label the pair: 1 if they come from the same class (positive pair), 0 if they come from different classes (negative pair).
  2. CNN for Feature Extraction
  3. Training Siamese Network
    • Forward pass: both images go through the same CNN; the element-wise absolute difference of the two feature vectors is fed to dense layers with a sigmoid output, giving a similarity score in [0, 1].
    • Backward pass: compute the loss between the predicted score and the pair label, then backpropagate to update the parameters of the CNN and the dense layers.
  4. One-shot Prediction
    The classes in the support set and the query never appear in the Siamese network's training data. Compute the similarity between the query and each support sample, and predict the class of the most similar one. (A minimal training sketch follows these steps.)
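
A minimal PyTorch sketch of pairwise Siamese training, assuming the $|h_1 - h_2|$ + dense-layer design described above; the architecture and hyperparameters are illustrative, not the exact ones from the papers.

```python
import torch
import torch.nn as nn

class SiameseNet(nn.Module):
    """One shared CNN encoder; similarity head on |h1 - h2|."""
    def __init__(self, feat_dim=64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim),
        )
        self.head = nn.Linear(feat_dim, 1)  # dense layer -> scalar score

    def forward(self, x1, x2):
        h1, h2 = self.encoder(x1), self.encoder(x2)       # shared weights
        return self.head(torch.abs(h1 - h2)).squeeze(-1)  # logit of "same class"

# One training step on a batch of labeled pairs (y = 1 same class, 0 otherwise).
model = SiameseNet()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()

x1, x2 = torch.randn(8, 3, 84, 84), torch.randn(8, 3, 84, 84)  # dummy pairs
y = torch.randint(0, 2, (8,)).float()
loss = loss_fn(model(x1, x2), y)
opt.zero_grad(); loss.backward(); opt.step()
```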

2.2 Triplet Loss

Ref:

  • Schroff, Kalenichenko, & Philbin. Facenet: A unified embedding for face recognition and clustering. In CVPR, 2015.
  1. Data for the Training Set
    Each time, select three training samples: an anchor, a positive sample from the same class as the anchor, and a negative sample from a different class.
  2. CNN for Feature Extraction
    All three samples are embedded by the same CNN $f$.
  3. Triplet Loss
    Let $d^+ = \|f(x^{\text{anchor}}) - f(x^+)\|_2^2$ and $d^- = \|f(x^{\text{anchor}}) - f(x^-)\|_2^2$. The loss encourages the negative to be farther from the anchor than the positive by a margin $\alpha$: $\mathcal{L} = \max\{0,\ d^+ + \alpha - d^-\}$.
  4. One-Shot Prediction
    Embed the query and every support sample with the trained CNN; predict the class of the support sample with the smallest distance to the query. (A loss sketch follows these steps.)
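
A minimal sketch of the triplet loss in PyTorch, matching the formula above (the margin value is illustrative); PyTorch also ships `torch.nn.TripletMarginLoss` for a non-squared variant.

```python
import torch
import torch.nn.functional as F

def triplet_loss(anchor, positive, negative, margin=0.2):
    """max{0, d+ + margin - d-} with squared L2 distances, averaged
    over the batch; inputs are (batch, dim) embeddings."""
    d_pos = (anchor - positive).pow(2).sum(dim=1)  # d+ per triplet
    d_neg = (anchor - negative).pow(2).sum(dim=1)  # d- per triplet
    return F.relu(d_pos + margin - d_neg).mean()
```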

2.3 Basic Idea of Few-Shot Learning

  • Train a Siamese network on a large-scale training set.
  • Given a support set of k-way n-shot:
    • k-way means k classes.
    • n-shot means every class has n samples.
    • The training set does not contain the k classes.
  • Given a query, predict its class:
    • Use the Siamese network to compute the similarity or distance between the query and every support sample.

3. Pretraining and Fine Tuning

  1. Cosine Similarity
    For a feature vector $x$ and a weight vector $w$, $\cos\theta = \dfrac{x^T w}{\|x\|_2\,\|w\|_2}$; if both are unit vectors, this reduces to $x^T w$.

  2. Softmax Function
    The softmax function turns a score vector $z \in \mathbb{R}^k$ into a probability distribution: $\mathrm{softmax}(z)_i = \dfrac{\exp(z_i)}{\sum_{j=1}^{k} \exp(z_j)}$.

  3. Softmax Classifier (a fully-connected layer followed by softmax)
    Here, 𝑘 is number of classes, and 𝑑 is number of features.
    $p = \mathrm{softmax}(W x + b)$, where $W \in \mathbb{R}^{k \times d}$ and $b \in \mathbb{R}^k$. (A NumPy sketch follows this list.)
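
A minimal NumPy sketch of the softmax classifier above; function and variable names are illustrative.

```python
import numpy as np

def softmax(z):
    # Subtract the max for numerical stability; the result sums to 1.
    e = np.exp(z - z.max())
    return e / e.sum()

def softmax_classifier(x, W, b):
    """Fully-connected layer followed by softmax: p = softmax(W x + b).
    Shapes: W is (k, d), b is (k,), x is (d,)."""
    return softmax(W @ x + b)
```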

3.1 Few-Shot Prediction Using Pretrained CNN

Reference:

  • Dhillon, Chaudhari, Ravichandran, & Soatto. A baseline for few-shot image classification. In ICLR, 2020.
  • Chen, Wang, Liu, Xu, & Darrell. A New Meta-Baseline for Few-Shot Learning. arXiv, 2020.
  1. Pretraining
    • Pretrain a CNN for feature extraction (aka embedding).
    • The CNN can be pretrained using standard supervised learning or a Siamese network.
  2. Processing the Support Set
    Embed every support sample with the pretrained CNN, average the (normalized) embeddings within each class, and normalize each class mean, giving one representative vector $\mu_j$ per class.
  3. Making Few-Shot Prediction
    Embed the query $q$ with the same CNN and normalize it. Stack the class means $\mu_1, \dots, \mu_k$ as the rows of a matrix $M$; the prediction is $p = \mathrm{softmax}(M q)$, i.e., a softmax over the cosine similarities between the query and each class mean.
  4. Summary
    Pretrain an embedding on a large-scale training set; represent each support class by its normalized mean embedding; classify the query by a softmax over its similarities to the class means. (A sketch of the whole procedure follows this list.)
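
A minimal NumPy sketch of this procedure; `embed` is a hypothetical pretrained feature extractor, and all names are illustrative.

```python
import numpy as np

def normalize(v):
    return v / np.linalg.norm(v)

def few_shot_predict(embed, support_x, support_y, query_x, k):
    """Represent class j by the normalized mean of its support embeddings,
    then softmax over the query's cosine similarities to the class means."""
    feats = np.stack([normalize(embed(x)) for x in support_x])   # (n*k, d)
    y = np.asarray(support_y)
    mu = np.stack([normalize(feats[y == j].mean(axis=0)) for j in range(k)])
    q = normalize(embed(query_x))                                # (d,)
    logits = mu @ q                                              # cosine sims
    e = np.exp(logits - logits.max())
    return e / e.sum()                                           # (k,) probs
```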

3.2 Benefit of Fine Tuning

Reference:

  • Chen, Liu, Kira, Wang, & Huang. A Closer Look at Few-shot Classification. In ICLR, 2019.
  • Dhillon, Chaudhari, Ravichandran, & Soatto. A baseline for few-shot image classification. In ICLR, 2020.
  • Chen, Wang, Liu, Xu, & Darrell. A New Meta-Baseline for Few-Shot Learning. arXiv, 2020.
  • Fine-tuning is an improvement over few-shot prediction using a pretrained CNN.
  • The baseline classifies a query $x_j$ by $p_j = \mathrm{softmax}(W \cdot f(x_j) + b)$, where $f$ is the pretrained embedding and $W$ and $b$ are built from the support set.
  1. Trick 1: A Good Initialization
    Initialize $W$ with the class-mean matrix $M$ (row $j$ is the normalized mean embedding $\mu_j$) and $b$ with $0$, then train $W$ and $b$ on the support set (fine tuning).
  2. Trick 2: Entropy Regularization
    In addition to the cross-entropy loss on the support set, add the entropy of the predictions on the query samples as a regularizer. Minimizing this entropy pushes the predicted distributions away from uniform, i.e., toward confident predictions, which helps when the support set is tiny.

  3. Trick 3: Cosine Similarity + Softmax Classifier
    Replace the inner product $w_j^T f(x)$ in the softmax classifier with the cosine similarity between $w_j$ and $f(x)$, i.e., normalize both vectors before taking the product.
  • Summary: pretrain an embedding; initialize the classifier from the support-set class means; fine-tune $W$ and $b$ on the support set with entropy regularization; use cosine similarity in the softmax classifier. (A fine-tuning sketch follows this list.)
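
A minimal PyTorch sketch of fine-tuning with Tricks 1 and 2; all names and hyperparameters are illustrative.

```python
import torch
import torch.nn.functional as F

def fine_tune(feats_s, labels_s, feats_q, mu, steps=100, lr=1e-2, lam=0.1):
    """Fine-tune W, b on the support set with entropy regularization on
    the queries. `mu` (k, d) holds the normalized class means used to
    initialize W (Trick 1); `lam` weights the entropy term (Trick 2)."""
    W = mu.clone().requires_grad_(True)           # good initialization
    b = torch.zeros(mu.size(0), requires_grad=True)
    opt = torch.optim.Adam([W, b], lr=lr)
    for _ in range(steps):
        logits_s = feats_s @ W.t() + b            # support-set logits
        ce = F.cross_entropy(logits_s, labels_s)  # supervised loss
        p_q = F.softmax(feats_q @ W.t() + b, dim=1)
        entropy = -(p_q * p_q.clamp_min(1e-8).log()).sum(dim=1).mean()
        loss = ce + lam * entropy                 # entropy regularization
        opt.zero_grad(); loss.backward(); opt.step()
    return W.detach(), b.detach()
```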

References:

  1. Few-Shot Learning and Meta Learning (course in Chinese) - Shusen Wang