FASTAI
1. Advanced DataLoaders
1.1 Example
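from fastai.vision.all import *

pets = DataBlock(blocks=(ImageBlock, CategoryBlock), # types used: images and categories
get_items=get_image_files, # how to collect the x items
splitter=RandomSplitter(), # how to split into training and validation sets
get_y=using_attr(RegexLabeller(r'(.+)_\d+.jpg$'), 'name'), # how to get the y label from the file name
item_tfms=Resize(460), # transform applied to each item
batch_tfms=aug_transforms(size=224)) # augmentations applied to each batch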
A DataBlock is built by giving the fastai library several pieces of information:
- the types used, through an argument called blocks: here we have images and categories, so we pass ImageBlock and CategoryBlock.
- how to get the raw items, here our function get_image_files.
- how to label those items, here with a regular expression applied to the file name.
- how to split those items, here with a random splitter.
- the item_tfms and batch_tfms, applied per item and per batch respectively.
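To make the labelling step concrete, here is a minimal sketch in plain Python of what the regular expression extracts from a file name. It does not call fastai; it assumes the labeller searches the file name with the pattern and returns the first capture group. The file names and the helper label_from_name are made up for illustration.

```python
import re
from pathlib import Path

# Same pattern as in the DataBlock above: everything before "_<digits>.jpg".
PATTERN = re.compile(r'(.+)_\d+.jpg$')

def label_from_name(path):
    """Hypothetical helper: return the class name encoded in a file name."""
    name = Path(path).name           # only the file name, like using_attr(..., 'name')
    match = PATTERN.search(name)
    if match is None:
        raise ValueError(f"file name does not match the pattern: {name}")
    return match.group(1)            # the first capture group is the label

# Illustrative file names (assumed, not taken from a real dataset listing).
examples = ["images/great_pyrenees_173.jpg", "images/Abyssinian_1.jpg"]
labels = [label_from_name(p) for p in examples]
print(labels)
```

Because (.+) is greedy, underscores inside a class name stay part of the label; only the final _<digits>.jpg suffix is stripped.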
1.2 Labelling from a CSV file
The example uses a multi-label dataset whose labels are stored in train.csv.
dls = ImageDataLoaders.from_df(df, path, folder='train', valid_col='is_valid', label_delim=' ',
    item_tfms=Resize(460), batch_tfms=aug_transforms(size=224))
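- The from_df function loads the dataset described by the CSV file, here already read into the DataFrame df.
- path is the base directory; folder adds a path component between path and the file name: path/folder/filename.
- label_delim=' ' says that the labels of one item are separated by spaces.
- The file name and the labels are read from the first and second columns by default, so they do not need to be specified here.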
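The points above can be sketched in plain Python, without fastai, to show how one CSV row turns into an image path and a list of labels. This is only an illustration of the idea, not the library's implementation; the CSV contents, the base directory pascal_2007 and the helper resolve_rows are assumptions.

```python
import csv
import io
from pathlib import Path

# Hypothetical CSV content in the same layout: file name, labels, validation flag.
CSV_TEXT = """fname,labels,is_valid
000005.jpg,chair,True
000007.jpg,car,True
000009.jpg,horse person,False
"""

def resolve_rows(csv_text, path, folder, label_delim=' ', valid_col='is_valid'):
    """Hypothetical helper mimicking the arguments passed to from_df."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, body = rows[0], rows[1:]
    valid_idx = header.index(valid_col)
    items = []
    for row in body:
        items.append({
            'file': Path(path) / folder / row[0],   # path/folder/filename
            'labels': row[1].split(label_delim),    # second column, split on the delimiter
            'is_valid': row[valid_idx] == 'True',   # goes to the validation set if True
        })
    return items

items = resolve_rows(CSV_TEXT, path='pascal_2007', folder='train')
for item in items:
    print(item['file'], item['labels'], item['is_valid'])
```

The third row shows why label_delim matters: without it, 'horse person' would be treated as a single class instead of two labels.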
Going further, the same data can be described with the DataBlock API:
pascal = DataBlock(blocks=(ImageBlock, MultiCategoryBlock),
splitter=ColSplitter('is_valid'),
get_x=ColReader('fname', pref=str(path/'train') + os.path.sep),
get_y=ColReader('labels', label_delim=' '),
item_tfms = Resize(460),
batch_tfms=aug_transforms(size=224))
For multi-label classification, the F1 score is used to evaluate how good the model is.
f1_macro = F1ScoreMulti(thresh=0.5, average='macro')
f1_macro.name = 'F1(macro)'
f1_samples = F1ScoreMulti(thresh=0.5, average='samples')
f1_samples.name = 'F1(samples)'
learn = vision_learner(dls, resnet50, metrics=[partial(accuracy_multi, thresh=0.5), f1_macro, f1_samples])
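The two averaging modes used above differ in what they average over: 'macro' computes one F1 score per label and averages across labels, while 'samples' computes one F1 score per item and averages across items. The following is a minimal plain-Python sketch of that difference, not the fastai or scikit-learn implementation. It assumes the inputs are already probabilities, that a label counts as predicted when its probability exceeds the threshold, and that F1 is 0 when there is nothing to score; the numbers are invented.

```python
def f1(tp, fp, fn):
    """F1 from counts; defined as 0.0 when the denominator is zero (assumption)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def counts(true_vec, pred_vec):
    tp = sum(1 for t, p in zip(true_vec, pred_vec) if t and p)
    fp = sum(1 for t, p in zip(true_vec, pred_vec) if not t and p)
    fn = sum(1 for t, p in zip(true_vec, pred_vec) if t and not p)
    return tp, fp, fn

def f1_multi(probs, targets, thresh=0.5, average='macro'):
    # Binarise the probabilities with the threshold.
    preds = [[int(p > thresh) for p in row] for row in probs]
    if average == 'macro':
        # One score per label: iterate over columns.
        pairs = zip(zip(*targets), zip(*preds))
    elif average == 'samples':
        # One score per item: iterate over rows.
        pairs = zip(targets, preds)
    else:
        raise ValueError("average must be 'macro' or 'samples'")
    scores = [f1(*counts(t, p)) for t, p in pairs]
    return sum(scores) / len(scores)

# Invented example: 3 items, 3 labels.
targets = [[1, 0, 1], [0, 1, 0], [1, 1, 0]]
probs = [[0.9, 0.2, 0.4], [0.1, 0.8, 0.6], [0.7, 0.3, 0.2]]

macro = f1_multi(probs, targets, average='macro')
samples = f1_multi(probs, targets, average='samples')
print(f"F1(macro)={macro:.4f}  F1(samples)={samples:.4f}")
```

On this toy data the per-label scores are 1, 2/3 and 0, so the macro average is 5/9, while every item scores 2/3, so the samples average is 2/3. A label the model never gets right pulls the macro score down more than the samples score.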