Contents
- Fine-Grained Visual Classification (FGVC)
- Learning from Labels of Multi-granularity
- [CVPR 2020] Making better mistakes: Leveraging class hierarchies with deep networks
- [BMVC 2021] Leveraging Class Hierarchies with Metric-Guided Prototype Learning
- [ICLR 2021] No cost likelihood manipulation at test time for making better mistakes in deep networks
- [CVPR 2021] Your "Flamingo" is My "Bird": Fine-Grained, or Not
- [CVPR 2022] Label Relation Graphs Enhanced Hierarchical Residual Network for Hierarchical Multi-Granularity Classification
- [CVPR 2022] Use All The Labels: A Hierarchical Multi-Label Contrastive Learning Framework
- [ECCV 2022] Where to Focus: Investigating Hierarchical Attention Relationship for Fine-Grained Visual Classification
- [ECCV 2022] Learning Hierarchy Aware Features for Reducing Mistake Severity
- [ICLR 2023] Learning Structured Representations by Embedding Class Hierarchy
- Learning Features from Parts
- [CVPR 2017] Look Closer to See Better: Recurrent Attention Convolutional Neural Network for Fine-grained Image Recognition
- [ECCV 2018] Learning to Navigate for Fine-grained Classification
- [BMVC 2021] Feature fusion vision transformer for fine-grained visual categorization
- [ACM MM 2021] Rams-trans: Recurrent attention multi-scale transformer for fine-grained image recognition
- [AAAI 2022] TransFG: A Transformer Architecture for Fine-grained Recognition
- [ACM MM 2022] SIM-Trans: Structure Information Modeling Transformer for fine-grained visual categorization
- [Arxiv 2022] A Novel Plug-in Module for Fine-Grained Visual Classification
- [CVPR 2022] Fine-Grained Object Classification via Self-Supervised Pose Alignment
- [NeurIPS 2022] Relational Proxies: Emergent Relationships as Fine-Grained Discriminators
- [TMM 2023] TransIFC: Invariant Cues-aware Feature Concentration Learning for Efficient Fine-grained Bird Image Classification
- Learning multi-granularity features
- [ECCV 2020] Fine-grained visual classification via progressive multi-granularity training of jigsaw patches
- References
Fine-Grained Visual Classification (FGVC)
Task Description
- FGVC is a fine-grained, single-label image classification task, for example classifying an image as “flamingo” rather than “bird”
- The main difficulties of this task are:
- (1) large variation within the same category
- (2) objects from different subcategories can look very similar
- (3) labeling often requires domain experts, which makes the data more expensive to collect
Datasets
- CUB: It contains 11,788 images covering 200 species of birds. The dataset is divided into 5,994 training images and 5,794 test images. The 200 species are grouped into 122 genera, 37 families, and 13 orders following an ornithological taxonomy. (The class hierarchy is provided by Chen, Tianshui, et al. “Fine-grained representation learning and recognition by exploiting hierarchical semantic embedding.” Proceedings of the 26th ACM international conference on Multimedia. 2018.)
- Butterfly-200: It has a hierarchical structure with 200 species, 116 genera, 23 subfamilies, and 5 families according to the insect taxonomy. The dataset contains 25,279 images, including a training set of 5,135 images, a validation set of 5,135 images and a test set of 15,009 images.
- VegFru: A dataset for fine-grained vegetable and fruit recognition covering 292 subordinate classes and 25 upper-level categories. It has 29,200 images for training, 14,600 for validation, and 116,931 for testing.
- FGVC-Aircraft: It contains 100 fine-grained aircraft models, which are grouped into 70 families and 30 makers by tracing superclasses in Wikipedia pages [4]. The dataset has 10,000 images: 6,667 for training and 3,333 for evaluation. (The class hierarchy is provided by Chang, Dongliang, et al. “Your “Flamingo” is My “Bird”: Fine-Grained, or Not.” Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2021.)
- Stanford Cars: It contains 196 car models, which can be re-organised into 9 makers by tracing superclasses in Wikipedia pages. The dataset contains 16,185 images, including 8,144 images for training and 8,041 images for testing. (The class hierarchy is provided by Chang, Dongliang, et al. “Your “Flamingo” is My “Bird”: Fine-Grained, or Not.” Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2021.)
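The dataset statistics above can be collected into a small table and sanity-checked. The following is a minimal sketch, not part of any dataset's official tooling: the `DatasetStats` class, the `DATASETS` dictionary, and the helper names are hypothetical, and the numbers are copied from the descriptions above. The "evaluation" split of FGVC-Aircraft is stored under the key `test` for uniformity, and VegFru has no stated total, so its `total` is left as `None`.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class DatasetStats:
    """Hypothetical container for the figures quoted in the dataset list."""
    levels: Dict[str, int]        # hierarchy level -> class count, ordered fine to coarse
    splits: Dict[str, int]        # split name -> number of images
    total: Optional[int] = None   # total image count, only if the text states one


DATASETS = {
    "CUB": DatasetStats(
        levels={"species": 200, "genera": 122, "families": 37, "orders": 13},
        splits={"train": 5994, "test": 5794},
        total=11788,
    ),
    "Butterfly-200": DatasetStats(
        levels={"species": 200, "genera": 116, "subfamilies": 23, "families": 5},
        splits={"train": 5135, "val": 5135, "test": 15009},
        total=25279,
    ),
    "VegFru": DatasetStats(
        levels={"subordinate classes": 292, "upper-level categories": 25},
        splits={"train": 29200, "val": 14600, "test": 116931},
    ),
    "FGVC-Aircraft": DatasetStats(
        levels={"models": 100, "families": 70, "makers": 30},
        splits={"train": 6667, "test": 3333},
        total=10000,
    ),
    "Stanford Cars": DatasetStats(
        levels={"models": 196, "makers": 9},
        splits={"train": 8144, "test": 8041},
        total=16185,
    ),
}


def split_sum(stats: DatasetStats) -> int:
    """Number of images summed over all listed splits."""
    return sum(stats.splits.values())


def is_coarsening(stats: DatasetStats) -> bool:
    """True if class counts never increase from the finest to the coarsest level."""
    counts = list(stats.levels.values())
    return all(fine >= coarse for fine, coarse in zip(counts, counts[1:]))


if __name__ == "__main__":
    for name, stats in DATASETS.items():
        stated = "not stated" if stats.total is None else str(stats.total)
        print(f"{name}: splits sum to {split_sum(stats)}, stated total {stated}, "
              f"levels coarsen: {is_coarsening(stats)}")
```

Keeping the levels ordered from fine to coarse makes it easy to see how many label granularities each dataset offers for the hierarchical methods listed below.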
Learning from Labels of Multi-granularity
[CVPR 2020] Making better mistakes: Leveraging class hierarchies with deep networks
- Bertinetto, Luca, et al. “Making better mistakes: Leveraging class hierarchies with deep networks.” Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020.
- code: https://github.com/fiveai/making-better-mistakes
- blog: Making better mistakes – Mistake severity issues
[BMVC 2021] Leveraging Class Hierarchies with Metric-Guided Prototype Learning
- Landrieu, Loic, and Vivien Sainte Fare Garnot. “Leveraging Class Hierarchies with Metric-Guided Prototype Learning.” British Machine Vision Conference (BMVC). 2021.
- code: https://github.com/VSainteuf/metric-guided-prototypes-pytorch
- blog: Making better mistakes – Mistake severity issues
[ICLR 2021] No cost likelihood manipulation at test time for making better mistakes in deep networks
- Karthik, Shyamgopal, et al. “No cost likelihood manipulation at test time for making better mistakes in deep networks.” ICLR (2021).
- code: https://github.com/sgk98/CRM-Better-Mistakes
- blog: Making better mistakes – Mistake severity issues
[CVPR 2021] Your “Flamingo” is My “Bird”: Fine-Grained, or Not
- Chang, Dongliang, et al. “Your “Flamingo” is My “Bird”: Fine-Grained, or Not.” Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2021.
- code: https://github.com/PRIS-CV/Fine-Grained-or-Not
- blog: [CVPR 2021] Your “Flamingo” is My “Bird”: Fine-Grained, or Not
[CVPR 2022] Label Relation Graphs Enhanced Hierarchical Residual Network for Hierarchical Multi-Granularity Classification
- Chen, Jingzhou, et al. “Label Relation Graphs Enhanced Hierarchical Residual Network for Hierarchical Multi-Granularity Classification.” Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022.
- code: https://github.com/MonsterZhZh/HRN
- blog: [CVPR 2022] Label Relation Graphs Enhanced Hierarchical Residual Network
[CVPR 2022] Use All The Labels: A Hierarchical Multi-Label Contrastive Learning Framework
- Zhang, Shu, et al. “Use All The Labels: A Hierarchical Multi-Label Contrastive Learning Framework.” Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022.
- code: https://github.com/salesforce/hierarchicalContrastiveLearning
- blog: [CVPR 2022] Use All The Labels: A Hierarchical Multi-Label Contrastive Learning Framework
[ECCV 2022] Where to Focus: Investigating Hierarchical Attention Relationship for Fine-Grained Visual Classification
- Liu, Yang, et al. “Where to Focus: Investigating Hierarchical Attention Relationship for Fine-Grained Visual Classification.” European Conference on Computer Vision. Springer, Cham, 2022.
- code: https://github.com/visiondom/CHRF
- blog: [ECCV 2022] Where to Focus: Investigating Hierarchical Attention Relationship for FGVC
[ECCV 2022] Learning Hierarchy Aware Features for Reducing Mistake Severity
- Garg, Ashima, Depanshu Sani, and Saket Anand. “Learning Hierarchy Aware Features for Reducing Mistake Severity.” European Conference on Computer Vision. Springer, Cham, 2022.
- code: https://github.com/07Agarg/HAF
- blog: Making better mistakes – Mistake severity issues
[ICLR 2023] Learning Structured Representations by Embedding Class Hierarchy
- Zeng, Siqi, et al. “Learning Structured Representations by Embedding Class Hierarchy.” ICLR 2023.
- blog: [ICLR 2023] Learning Structured Representations by Embedding Class Hierarchy
Learning Features from Parts
[CVPR 2017] Look Closer to See Better: Recurrent Attention Convolutional Neural Network for Fine-grained Image Recognition
- Fu, Jianlong, Heliang Zheng, and Tao Mei. “Look closer to see better: Recurrent attention convolutional neural network for fine-grained image recognition.” Proceedings of the IEEE conference on computer vision and pattern recognition. 2017.
- code: https://github.com/Jianlong-Fu/Recurrent-Attention-CNN
- blog: [CVPR 2017] Look Closer to See Better: Recurrent Attention Convolutional Neural Network for FGVC
[ECCV 2018] Learning to Navigate for Fine-grained Classification
- Yang, Ze, et al. “Learning to navigate for fine-grained classification.” Proceedings of the European Conference on Computer Vision (ECCV). 2018.
- code: https://github.com/yangze0930/NTS-Net
- blog: [ECCV 2018] Learning to Navigate for Fine-grained Classification
[BMVC 2021] Feature fusion vision transformer for fine-grained visual categorization
- Wang, Jun, Xiaohan Yu, and Yongsheng Gao. “Feature fusion vision transformer for fine-grained visual categorization.” (BMVC 2021).
- code: https://github.com/Markin-Wang/FFVT
- blog: [BMVC 2021] Feature fusion vision transformer for fine-grained visual categorization
[ACM MM 2021] Rams-trans: Recurrent attention multi-scale transformer for fine-grained image recognition
- Hu, Yunqing, et al. “Rams-trans: Recurrent attention multi-scale transformer for fine-grained image recognition.” Proceedings of the 29th ACM International Conference on Multimedia. 2021.
- blog: Rams-trans: Recurrent attention multi-scale transformer for fine-grained image recognition
[AAAI 2022] TransFG: A Transformer Architecture for Fine-grained Recognition
- He, Ju, et al. “Transfg: A transformer architecture for fine-grained recognition.” Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 36. No. 1. 2022.
- code: https://github.com/TACJu/TransFG
- Warning: several open issues on GitHub report reproducibility problems, and the authors have not responded actively (About CUB-200-2011’s accuracy, About CUB ACC, About the training details)
- blog: [AAAI 2022] TransFG: A Transformer Architecture for Fine-grained Recognition
[ACM MM 2022] SIM-Trans: Structure Information Modeling Transformer for fine-grained visual categorization
- Sun, Hongbo, Xiangteng He, and Yuxin Peng. “Sim-trans: Structure information modeling transformer for fine-grained visual categorization.” Proceedings of the 30th ACM International Conference on Multimedia. 2022.
- code: https://github.com/PKU-ICST-MIPL/SIM-Trans_ACMMM2022
- blog: [ACM MM 2022] SIM-Trans: Structure Information Modeling Transformer for FGVC
[Arxiv 2022] A Novel Plug-in Module for Fine-Grained Visual Classification
- Chou, Po-Yung, Cheng-Hung Lin, and Wen-Chung Kao. “A Novel Plug-in Module for Fine-Grained Visual Classification.” arXiv preprint arXiv:2202.03822 (2022).
- code: https://github.com/chou141253/FGVC-PIM
- blog: [Arxiv 2022] A Novel Plug-in Module for Fine-Grained Visual Classification
[CVPR 2022] Fine-Grained Object Classification via Self-Supervised Pose Alignment
- Yang, Xuhui, et al. “Fine-Grained Object Classification via Self-Supervised Pose Alignment.” Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022.
- code: https://github.com/yangxh11/P2P-Net
- blog: [CVPR 2022] Fine-Grained Object Classification via Self-Supervised Pose Alignment
[NeurIPS 2022] Relational Proxies: Emergent Relationships as Fine-Grained Discriminators
- Chaudhuri, Abhra, et al. “Relational Proxies: Emergent Relationships as Fine-Grained Discriminators.” (NeurIPS 2022).
- code: https://github.com/abhrac/relational-proxies
- blog: [NeurIPS 2022] Relational Proxies: Emergent Relationships as Fine-Grained Discriminators
[TMM 2023] TransIFC: Invariant Cues-aware Feature Concentration Learning for Efficient Fine-grained Bird Image Classification
- Liu, Hai, et al. “TransIFC: Invariant Cues-aware Feature Concentration Learning for Efficient Fine-grained Bird Image Classification.” IEEE Transactions on Multimedia (2023).
- blog: [TMM 2023] TransIFC: Invariant Cues-aware Feature Concentration Learning for Efficient Fine-grained Bird Image Classification
Learning multi-granularity features
[ECCV 2020] Fine-grained visual classification via progressive multi-granularity training of jigsaw patches
- Du, Ruoyi, et al. “Fine-grained visual classification via progressive multi-granularity training of jigsaw patches.” European Conference on Computer Vision. Springer, Cham, 2020.
- code: https://github.com/RuoyiDu/PMG-Progressive-Multi-Granularity-Training
- blog: [ECCV 2020] Fine-grained visual classification via progressive multi-granularity training of jigsaw patches