bge-large-zh-v1.5
Download the model to a local directory:
modelscope download --model BAAI/bge-large-zh-v1.5 --local_dir ./bge-large-zh-v1.5
Define the custom embedding model in custom-bge-large-zh-v1.5.json, with model_uri pointing to the directory the model was downloaded to:
{
  "model_name": "custom-bge-large-zh-v1.5",
  "dimensions": 1024,
  "max_tokens": 512,
  "language": ["zh"],
  "model_id": "BAAI/bge-large-zh-v1.5",
  "model_uri": "/path/to/bge-large-zh-v1.5"
}
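A typo in the JSON file or a wrong model_uri only surfaces later, at register or launch time, so a quick local check can save a round trip. The sketch below is a minimal sanity check, not an official schema: the key sets are simply the keys used in the two JSON files in these notes, and check_custom_model is a hypothetical helper name.

```python
from pathlib import Path

# Keys copied from the JSON files in these notes; NOT an official Xinference schema.
REQUIRED_KEYS = {
    "embedding": {"model_name", "dimensions", "max_tokens",
                  "language", "model_id", "model_uri"},
    "rerank": {"model_name", "type", "language", "model_id", "model_uri"},
}


def check_custom_model(spec, model_type, require_local_path=True):
    """Return a list of problems found in spec; an empty list means it looks sane."""
    problems = []

    # Report any key that the examples in these notes use but the spec lacks.
    missing = REQUIRED_KEYS[model_type] - spec.keys()
    if missing:
        problems.append(f"missing keys: {sorted(missing)}")

    # Embedding specs carry two numeric fields that must be positive integers.
    if model_type == "embedding":
        for key in ("dimensions", "max_tokens"):
            if key in spec:
                value = spec[key]
                if not (isinstance(value, int) and value > 0):
                    problems.append(
                        f"{key} must be a positive integer, got {value!r}"
                    )

    # Optionally confirm that model_uri points at an existing directory.
    uri = spec.get("model_uri", "")
    if require_local_path and uri and not Path(uri).is_dir():
        problems.append(f"model_uri is not an existing directory: {uri}")

    return problems
```

To use it, load the file with json.loads(Path("custom-bge-large-zh-v1.5.json").read_text()) and pass the resulting dict together with "embedding"; anything in the returned list is worth fixing before running xinference register.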
Register the custom model:
xinference register --model-type embedding --file custom-bge-large-zh-v1.5.json --persist
Launch the custom model:
xinference launch --model-name custom-bge-large-zh-v1.5 --model-type embedding
bge-reranker-v2-m3
Download the model to a local directory:
modelscope download --model AI-ModelScope/bge-reranker-v2-m3 --local_dir ./bge-reranker-v2-m3
Define the custom rerank model in custom-bge-reranker-v2-m3.json:
{
  "model_name": "custom-bge-reranker-v2-m3",
  "type": "normal",
  "language": ["en", "zh", "multilingual"],
  "model_id": "BAAI/bge-reranker-v2-m3",
  "model_uri": "/path/to/bge-reranker-v2-m3"
}
Register the custom model:
xinference register --model-type rerank --file ./custom-bge-reranker-v2-m3.json --persist
This fails with the following error:
Traceback (most recent call last):
  File "//env/bin/xinference", line 8, in <module>
    sys.exit(cli())
  File "//env/lib/python3.10/site-packages/click/core.py", line 1161, in __call__
    return self.main(*args, **kwargs)
  File "//env/lib/python3.10/site-packages/click/core.py", line 1082, in main
    rv = self.invoke(ctx)
  File "//env/lib/python3.10/site-packages/click/core.py", line 1697, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "//env/lib/python3.10/site-packages/click/core.py", line 1443, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "//env/lib/python3.10/site-packages/click/core.py", line 788, in invoke
    return __callback(*args, **kwargs)
  File "//env/lib/python3.10/site-packages/xinference/deploy/cmdline.py", line 407, in register_model
    client.register_model(
  File "//env/lib/python3.10/site-packages/xinference/client/restful/restful_client.py", line 1188, in register_model
    raise RuntimeError(
RuntimeError: Failed to register model, detail: Not Found
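Passing --endpoint explicitly makes the command succeed, because this Xinference instance is deployed on port 9999 rather than the default port: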
xinference register --endpoint http://localhost:9999 --model-type rerank --file ./custom-bge-reranker-v2-m3.json --persist
Launch the custom model:
xinference launch --model-type rerank --model-name custom-bge-reranker-v2-m3 --endpoint http://localhost:9999
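Since every command has to reach the same server, it can help to resolve the endpoint once and build all commands from it. The sketch below is one way to do that under stated assumptions: the fallback order (explicit value, then the XINFERENCE_ENDPOINT environment variable, then http://127.0.0.1:9997) reflects my understanding of the CLI's behaviour and should be verified against the installed version, and resolve_endpoint, register_command and launch_command are hypothetical helper names. Only flags that already appear in these notes are used.

```python
import os
import shlex

# Assumed default endpoint of a local Xinference server; verify for your version.
DEFAULT_ENDPOINT = "http://127.0.0.1:9997"


def resolve_endpoint(explicit=None, env=None):
    """Pick the endpoint: explicit value, then XINFERENCE_ENDPOINT, then the default."""
    env = os.environ if env is None else env
    return explicit or env.get("XINFERENCE_ENDPOINT") or DEFAULT_ENDPOINT


def register_command(model_type, spec_file, endpoint=None, env=None):
    """Build the argument list for `xinference register` with an explicit endpoint."""
    return [
        "xinference", "register",
        "--endpoint", resolve_endpoint(endpoint, env),
        "--model-type", model_type,
        "--file", spec_file,
        "--persist",
    ]


def launch_command(model_type, model_name, endpoint=None, env=None):
    """Build the argument list for `xinference launch` with an explicit endpoint."""
    return [
        "xinference", "launch",
        "--model-type", model_type,
        "--model-name", model_name,
        "--endpoint", resolve_endpoint(endpoint, env),
    ]


def as_shell(command):
    """Render an argument list as a copy-pasteable shell line."""
    return shlex.join(command)
```

The argument lists can be passed directly to subprocess.run(cmd, check=True), or printed with as_shell to review them first. Always passing --endpoint avoids a mismatch like the "Not Found" failure above, where the command and the server were not using the same address.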
Verify that the models loaded successfully; the output lists the loaded models:
curl http://localhost:9999/v1/models
{"object":"list","data":[{"id":"custom-bge-large-zh-v1.5","object":"model","created":0,"owned_by":"xinference","model_type":"embedding","address":"0.0.0.0:39987","accelerators":[],"model_name":"custom-bge-large-zh-v1.5","dimensions":1024,"max_tokens":512,"language":["zh"],"model_revision":null,"replica":1},{"id":"custom-bge-reranker-v2-m3","object":"model","created":0,"owned_by":"xinference","model_type":"rerank","address":"0.0.0.0:44611","accelerators":[],"type":"normal","model_name":"custom-bge-reranker-v2-m3","language":["en","zh","multilingual"],"model_revision":null,"replica":1}]}
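Reading that one-line JSON by eye is error-prone, so the check can be scripted. The sketch below parses the body returned by /v1/models and reports which expected models are absent or have the wrong type. It relies only on the "data", "id" and "model_type" fields visible in the output above; loaded_models and missing_models are hypothetical helper names.

```python
import json


def loaded_models(response_text):
    """Map each loaded model id to its model_type, given the /v1/models response body."""
    payload = json.loads(response_text)
    return {entry["id"]: entry["model_type"] for entry in payload["data"]}


def missing_models(response_text, expected):
    """Return the expected model ids that are not loaded with the expected type.

    expected maps model id -> model_type, e.g.
    {"custom-bge-large-zh-v1.5": "embedding"}.
    """
    loaded = loaded_models(response_text)
    return sorted(
        name for name, model_type in expected.items()
        if loaded.get(name) != model_type
    )
```

The response body can be fetched with urllib.request.urlopen("http://localhost:9999/v1/models").read().decode() and passed in as response_text; an empty list from missing_models means both custom models are up.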