GLIP Optimization

Optimization Changes

Add logic for loading the tokenizer model and the text-encoder model offline.
  1. Modify the "maskrcnn_benchmark/config/defaults.py" script.

    Add config variables for the offline tokenizer path and the offline text-encoder path.

    _C.MODEL.LANGUAGE_BACKBONE.TOKENIZER_PATH = ""
    _C.MODEL.LANGUAGE_BACKBONE.MODEL_PATH = ""
  2. Modify the "maskrcnn_benchmark/data/build.py" script.

    Use the new variable to change the tokenizer-loading logic.

    Before:

    def make_data_loader(cfg, is_train=True, is_distributed=False, num_replicas=None, rank=None, start_iter=0):
        if cfg.MODEL.LANGUAGE_BACKBONE.TOKENIZER_TYPE == "clip":
            ...
        else:
            extra_args['tokenizer'] = AutoTokenizer.from_pretrained(cfg.MODEL.LANGUAGE_BACKBONE.TOKENIZER_TYPE)

    After:

    def make_data_loader(cfg, is_train=True, is_distributed=False, num_replicas=None, rank=None, start_iter=0):
        if cfg.MODEL.LANGUAGE_BACKBONE.TOKENIZER_TYPE == "clip":
            ...
        else:
            if cfg.MODEL.LANGUAGE_BACKBONE.TOKENIZER_PATH:
                extra_args['tokenizer'] = AutoTokenizer.from_pretrained(cfg.MODEL.LANGUAGE_BACKBONE.TOKENIZER_PATH)
            else:
                extra_args['tokenizer'] = AutoTokenizer.from_pretrained(cfg.MODEL.LANGUAGE_BACKBONE.TOKENIZER_TYPE)
  3. Modify the "maskrcnn_benchmark/engine/inference.py" script.

    Use the new variable to change the tokenizer-loading logic.

    Before:

    def create_queries_and_maps(labels, label_list, additional_labels=None, cfg=None):
        if cfg.MODEL.LANGUAGE_BACKBONE.TOKENIZER_TYPE == "bert-base-uncased":
            tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

    After:

    def create_queries_and_maps(labels, label_list, additional_labels=None, cfg=None):
        if cfg.MODEL.LANGUAGE_BACKBONE.TOKENIZER_TYPE == "bert-base-uncased":
            if cfg.MODEL.LANGUAGE_BACKBONE.TOKENIZER_PATH:
                tokenizer = AutoTokenizer.from_pretrained(cfg.MODEL.LANGUAGE_BACKBONE.TOKENIZER_PATH)
            else:
                tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
  4. Modify the "maskrcnn_benchmark/modeling/detector/generalized_vl_rcnn.py" script.

    Use the new variable to change the tokenizer-loading logic.

    Before:

    class GeneralizedVLRCNN(nn.Module):
        def __init__(self, cfg):
            # language encoder
            if cfg.MODEL.LANGUAGE_BACKBONE.TOKENIZER_TYPE == "clip":
                ...
            else:
                self.tokenizer = AutoTokenizer.from_pretrained(cfg.MODEL.LANGUAGE_BACKBONE.TOKENIZER_TYPE)

    After:

    class GeneralizedVLRCNN(nn.Module):
        def __init__(self, cfg):
            # language encoder
            if cfg.MODEL.LANGUAGE_BACKBONE.TOKENIZER_TYPE == "clip":
                ...
            else:
                if cfg.MODEL.LANGUAGE_BACKBONE.TOKENIZER_PATH:
                    self.tokenizer = AutoTokenizer.from_pretrained(cfg.MODEL.LANGUAGE_BACKBONE.TOKENIZER_PATH)
                else:
                    self.tokenizer = AutoTokenizer.from_pretrained(cfg.MODEL.LANGUAGE_BACKBONE.TOKENIZER_TYPE)
  5. Modify the "maskrcnn_benchmark/modeling/language_backbone/bert_model.py" script.

    Use the new variable to change the text-encoder-loading logic.

    Before:

    class BertEncoder(nn.Module):
        def __init__(self, cfg):
            ...
            if self.bert_name == "bert-base-uncased":
                config = BertConfig.from_pretrained(self.bert_name)
                config.gradient_checkpointing = self.cfg.MODEL.LANGUAGE_BACKBONE.USE_CHECKPOINT
                self.model = BertModel.from_pretrained(self.bert_name, add_pooling_layer=False, config=config)
                self.language_dim = 768

    After:

    class BertEncoder(nn.Module):
        def __init__(self, cfg):
            ...
            if self.bert_name == "bert-base-uncased":
                if cfg.MODEL.LANGUAGE_BACKBONE.MODEL_PATH:
                    config = BertConfig.from_pretrained(cfg.MODEL.LANGUAGE_BACKBONE.MODEL_PATH)
                    config.gradient_checkpointing = self.cfg.MODEL.LANGUAGE_BACKBONE.USE_CHECKPOINT
                    print('config: ', config, flush=True)
                    self.model = BertModel.from_pretrained(cfg.MODEL.LANGUAGE_BACKBONE.MODEL_PATH,
                                                           add_pooling_layer=False, config=config)
                else:
                    config = BertConfig.from_pretrained(self.bert_name)
                    config.gradient_checkpointing = self.cfg.MODEL.LANGUAGE_BACKBONE.USE_CHECKPOINT
                    self.model = BertModel.from_pretrained(self.bert_name, add_pooling_layer=False, config=config)
                self.language_dim = 768
  6. Modify the "maskrcnn_benchmark/modeling/rpn/loss.py" script.

    Use the new variable to change the tokenizer-loading logic.

    Before:

    class ATSSLossComputation(torch.nn.Module):
        def __init__(self, cfg, box_coder):
            ...
            self.lang = cfg.MODEL.LANGUAGE_BACKBONE.MODEL_TYPE
            if cfg.MODEL.LANGUAGE_BACKBONE.TOKENIZER_TYPE == "clip":
                ...
            else:
                self.tokenizer = AutoTokenizer.from_pretrained(self.lang)

    After:

    class ATSSLossComputation(torch.nn.Module):
        def __init__(self, cfg, box_coder):
            ...
            self.lang = cfg.MODEL.LANGUAGE_BACKBONE.MODEL_TYPE
            if cfg.MODEL.LANGUAGE_BACKBONE.TOKENIZER_TYPE == "clip":
                ...
            else:
                if cfg.MODEL.LANGUAGE_BACKBONE.MODEL_PATH:
                    self.tokenizer = AutoTokenizer.from_pretrained(cfg.MODEL.LANGUAGE_BACKBONE.MODEL_PATH)
                else:
                    self.tokenizer = AutoTokenizer.from_pretrained(self.lang)
  7. Modify the "maskrcnn_benchmark/modeling/rpn/vldyhead.py" script.

    Use the new variable to change the text-encoder-loading logic.

    Before:

    class VLDyHead(torch.nn.Module):
        def __init__(self, cfg):
            ...
            if cfg.MODEL.LANGUAGE_BACKBONE.MODEL_TYPE == "bert-base-uncased":
                lang_cfg = BertConfig.from_pretrained(cfg.MODEL.LANGUAGE_BACKBONE.MODEL_TYPE)

    After:

    class VLDyHead(torch.nn.Module):
        def __init__(self, cfg):
            ...
            if cfg.MODEL.LANGUAGE_BACKBONE.MODEL_TYPE == "bert-base-uncased":
                if cfg.MODEL.LANGUAGE_BACKBONE.MODEL_PATH:
                    lang_cfg = BertConfig.from_pretrained(cfg.MODEL.LANGUAGE_BACKBONE.MODEL_PATH)
                else:
                    lang_cfg = BertConfig.from_pretrained(cfg.MODEL.LANGUAGE_BACKBONE.MODEL_TYPE)

Original article: Overview - Model Development - Ascend Extension for PyTorch 6.0.RC2 Developer Documentation - Ascend Community (hiascend.com)
