Paper Reading 20: CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification 2022
Problem: how to learn multi-scale feature representations in transformer models for image classification. Solution: propose a dual-branch transformer that combines image patches of different sizes to produce stronger image features. Reduce the computation, do .....
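To make "image patches of different sizes" concrete, here is a minimal sketch of how one image can be tokenized at two granularities, one token sequence per branch. It is an illustration under stated assumptions, not the paper's implementation: the helper name `patchify`, the single-channel 8x8 toy image, and the patch sizes 2 and 4 are all hypothetical, and the cross-attention step that fuses the two branches is not shown.

```python
def patchify(image, patch_size):
    """Split a 2D image (list of rows) into flattened, non-overlapping patches.

    Hypothetical helper for illustration; each returned patch plays the role
    of one input token for a transformer branch.
    """
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image size must be divisible by patch_size")
    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            # Flatten the patch_size x patch_size window in row-major order.
            patch = [
                image[row][col]
                for row in range(top, top + patch_size)
                for col in range(left, left + patch_size)
            ]
            patches.append(patch)
    return patches


# Toy single-channel 8x8 image whose pixel value encodes its position.
image = [[row * 8 + col for col in range(8)] for row in range(8)]

# Assumed patch sizes: a fine-grained branch and a coarse-grained branch.
small_tokens = patchify(image, 2)  # many short tokens
large_tokens = patchify(image, 4)  # few long tokens

print(len(small_tokens), len(small_tokens[0]))
print(len(large_tokens), len(large_tokens[0]))
```

The small-patch branch sees more, finer tokens, while the large-patch branch sees fewer, coarser ones; a dual-branch model then needs some way to exchange information between the two sequences.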