ModelScope StructBERT Zero-Shot Classification (Chinese, Large): Local Deployment
1. Environment Setup
- Model introduction:
https://www.modelscope.cn/models/damo/nlp_structbert_zero-shot-classification_chinese-large/summary
- conda environment:
This is the same environment I used when deploying the CSANMT translation model; the ModelScope model environments all seem to be pretty similar. The main packages are listed below, and if an error complains about a missing package, just pip install it (a quick version check follows the list):
python 3.8.18
tensorflow 2.13.0
pytorch 2.1.0 py3.8_cuda12.1_cudnn8.9.2_0 pytorch
modelscope 1.9.5 pypi_0 pypi
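A minimal sanity-check sketch to confirm the installed versions match the list above (it only prints versions and CUDA availability, nothing model-specific):
import tensorflow as tf
import torch
import modelscope

# Should print 2.13.0 / 2.1.0 / 1.9.5 for the environment listed above
print('tensorflow:', tf.__version__)
print('pytorch:   ', torch.__version__)
print('modelscope:', modelscope.__version__)
print('CUDA available:', torch.cuda.is_available())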
2. Model Download
ModelScope's download flow is quite convenient: the code is the same for every model, and you only need to change the model id:
from modelscope.hub.snapshot_download import snapshot_download
model_dir = snapshot_download('damo/nlp_structbert_zero-shot-classification_chinese-large', cache_dir='Your_path_to/StructBERT/model', revision='master')
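To confirm the snapshot actually landed, you can print the resolved path and list its contents (just a plain directory listing, with no assumptions about specific file names):
import os

print(model_dir)              # local cache path returned by snapshot_download
print(os.listdir(model_dir))  # weights and config files should appear here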
3. Local Test
Use the official sample code here to check that the downloaded model actually runs:
from modelscope.pipelines import pipeline
classifier = pipeline('zero-shot-classification', 'damo/nlp_structbert_zero-shot-classification_chinese-large')
labels = ['家居', '旅游', '科技', '军事', '游戏', '故事']
sentence = '世界那么大,我想去看看'
classifier(sentence, candidate_labels=labels)
# {'labels': ['旅游', '故事', '游戏', '家居', '军事', '科技'],
# 'scores': [0.2843151092529297,
# 0.20308202505111694,
# 0.14530399441719055,
# 0.12690572440624237,
# 0.12382000684738159,
# 0.11657321453094482]}
# Predicted label: "旅游"
classifier(sentence, candidate_labels=labels, multi_label=True)
# {'labels': ['旅游', '故事', '游戏', '科技', '军事', '家居'],
# 'scores': [0.7894195318222046,
# 0.5234490633010864,
# 0.41255447268486023,
# 0.2873048782348633,
# 0.27711278200149536,
# 0.2695293426513672]}
# With a threshold of 0.5, the predicted labels are "旅游" and "故事"
Parameter notes:
- sentence: the input text to classify
- candidate_labels: the custom candidate labels
- multi_label: bool; True means multi-label classification, False means single-label
The return value is a dict: labels is already sorted by score in descending order, so the first element is the top prediction, and scores holds the corresponding scores.
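Since labels and scores are parallel sorted arrays, extracting the single-label prediction or applying a multi-label threshold is straightforward. A minimal sketch, reusing the classifier from above (the 0.5 threshold is just the example value mentioned earlier, not a pipeline built-in):
result = classifier(sentence, candidate_labels=labels, multi_label=True)

# Single-label reading: the first entry has the highest score
top_label = result['labels'][0]

# Multi-label reading: keep every label whose score clears the threshold
threshold = 0.5
predicted = [label for label, score in zip(result['labels'], result['scores'])
             if score >= threshold]
print(top_label, predicted)  # e.g. 旅游 ['旅游', '故事']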
4. Deploying a Local Service
This part is simple: take the official sample code and tweak it a little.
- Server code:
from modelscope.pipelines import pipeline
from flask import Flask, request, jsonify
from loguru import logger

app = Flask(__name__)

def load_model():
    global classifier
    # Load from the local download directory from step 2 and run on GPU
    classifier = pipeline(
        'zero-shot-classification',
        './model/damo/nlp_structbert_zero-shot-classification_chinese-large',
        device='gpu')

@app.route('/classification', methods=['POST'])
def predict():
    logger.info("Received POST request, starting text classification...")
    data = request.get_json()
    sentence = data['sentence']
    labels = data['labels']
    multi_label = data['multi_label']
    logger.info(sentence)
    logger.info(labels)
    logger.info(multi_label)
    # jsonify handles both the single-sentence dict and the batched list result
    return jsonify(classifier(sentence, candidate_labels=labels, multi_label=multi_label))

if __name__ == '__main__':
    load_model()
    app.run(host="0.0.0.0", port=6014)
- Postman request body:
{
    "sentence": "你真帅!",
    "labels": ["夸奖", "谩骂", "赞美", "贬低"],
    "multi_label": true
}
- Response:
{
    "labels": [
        "赞美",
        "夸奖",
        "谩骂",
        "贬低"
    ],
    "scores": [
        0.978646457195282,
        0.9710671305656433,
        0.3276732265949249,
        0.16048337519168854
    ]
}
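The same request can be sent from Python instead of Postman; a minimal client sketch (assuming the service above is running on localhost:6014):
import requests

payload = {
    'sentence': '你真帅!',
    'labels': ['夸奖', '谩骂', '赞美', '贬低'],
    'multi_label': True,
}
resp = requests.post('http://localhost:6014/classification', json=payload)
print(resp.json())  # same dict as the Postman response above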
The model also supports batched requests:
- Postman request body:
{
    "sentence": ["你真帅", "你是猪吗"],
    "labels": [["夸奖", "谩骂"], ["赞美", "贬低"]],
    "multi_label": true
}
- Response:
[
    {
        "labels": [
            ["赞美", "贬低"],
            ["夸奖", "谩骂"]
        ],
        "scores": [
            0.8209134936332703,
            0.7264650464057922
        ]
    },
    {
        "labels": [
            ["赞美", "贬低"],
            ["夸奖", "谩骂"]
        ],
        "scores": [
            0.37469756603240967,
            0.17754413187503815
        ]
    }
]
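The batched call works the same way from a Python client; this payload mirrors the Postman body above, with one label list per sentence:
import requests

payload = {
    'sentence': ['你真帅', '你是猪吗'],
    'labels': [['夸奖', '谩骂'], ['赞美', '贬低']],
    'multi_label': True,
}
resp = requests.post('http://localhost:6014/classification', json=payload)
print(resp.json())  # a list with one result entry per input sentence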
5. Final Thoughts
This model feels genuinely useful for pulling structured signals out of free text. For example, with hospital diagnosis notes, it could identify which diseases are mentioned in a long passage, although fine-tuning would probably be needed for that.