使用timm模型的金字塔特征

最新推荐文章于 2024-03-30 07:45:30 发布

there2belief

最新推荐文章于 2024-03-30 07:45:30 发布

阅读量285

点赞数

分类专栏： AI/ML/DL 文章标签： python ai

原文链接：https://huggingface.co/docs/timm/feature_extraction#multi-scale-feature-maps-feature-pyramid

版权

AI/ML/DL 专栏收录该内容

254 篇文章

订阅专栏

文章介绍了timm库如何用于查看各种预训练模型并创建作为特征提取器的模型。通过设置features_only参数，可以得到不同尺度的特征地图，这对于对象检测、分割等任务非常有用。此外，timm提供了.feature_info属性来查询特征信息，包括通道数和分辨率。还可以通过out_indices和output_stride参数选择特定层次的特征或限制输出的步长。

摘要生成于 C知道，由 DeepSeek-R1 满血版支持，前往体验 >

1 查看timm模型

import timm

print(timm.list_models())

Models API and Pretrained weights | timmdocs (fast.ai)

2 使用timm的特征

Multi-scale Feature Maps (Feature Pyramid)

Object detection, segmentation, keypoint, and a variety of dense pixel tasks require access to feature maps from the backbone network at multiple scales. This is often done by modifying the original classification network. Since each network varies quite a bit in structure, it’s not uncommon to see only a few backbones supported in any given obj detection or segmentation library.

timm allows a consistent interface for creating any of the included models as feature backbones that output feature maps for selected levels.

A feature backbone can be created by adding the argument features_only=True to any create_model call. By default 5 strides will be output from most models (not all have that many), with the first starting at 2 (some start at 1 or 4).

Create a feature map extraction model

>>> import torch
>>> import timm
>>> m = timm.create_model('resnest26d', features_only=True, pretrained=True)
>>> o = m(torch.randn(2, 3, 224, 224))
>>> for x in o:
...     print(x.shape)

Output:

torch.Size([2, 64, 112, 112])
torch.Size([2, 256, 56, 56])
torch.Size([2, 512, 28, 28])
torch.Size([2, 1024, 14, 14])
torch.Size([2, 2048, 7, 7])

Query the feature information

After a feature backbone has been created, it can be queried to provide channel or resolution reduction information to the downstream heads without requiring static config or hardcoded constants. The .feature_info attribute is a class encapsulating the information about the feature extraction points.

>>> import torch
>>> import timm
>>> m = timm.create_model('regnety_032', features_only=True, pretrained=True)
>>> print(f'Feature channels: {m.feature_info.channels()}')
>>> o = m(torch.randn(2, 3, 224, 224))
>>> for x in o:
...     print(x.shape)

Output:

Feature channels: [32, 72, 216, 576, 1512]
torch.Size([2, 32, 112, 112])
torch.Size([2, 72, 56, 56])
torch.Size([2, 216, 28, 28])
torch.Size([2, 576, 14, 14])
torch.Size([2, 1512, 7, 7])

Select specific feature levels or limit the stride

There are two additional creation arguments impacting the output features.

out_indices selects which indices to output
output_stride limits the feature output stride of the network (also works in classification mode BTW)

out_indices is supported by all models, but not all models have the same index to feature stride mapping. Look at the code or check feature_info to compare. The out indices generally correspond to the C(i+1)th feature level (a 2^(i+1) reduction). For most models, index 0 is the stride 2 features, and index 4 is stride 32.

output_stride is achieved by converting layers to use dilated convolutions. Doing so is not always straightforward, some networks only support output_stride=32.

>>> import torch
>>> import timm
>>> m = timm.create_model('ecaresnet101d', features_only=True, output_stride=8, out_indices=(2, 4), pretrained=True)
>>> print(f'Feature channels: {m.feature_info.channels()}')
>>> print(f'Feature reduction: {m.feature_info.reduction()}')
>>> o = m(torch.randn(2, 3, 320, 320))
>>> for x in o:
...     print(x.shape)

Output:

Feature channels: [512, 2048]
Feature reduction: [8, 8]
torch.Size([2, 512, 40, 40])
torch.Size([2, 2048, 40, 40])