Deploying the local models bge-large-zh-v1.5 and bge-reranker-v2-m3 with Xinference on Ubuntu

bge-large-zh-v1.5

Download the model to a chosen local directory:

modelscope download --model BAAI/bge-large-zh-v1.5 --local_dir ./bge-large-zh-v1.5

Describe the custom embedding model in custom-bge-large-zh-v1.5.json (replace model_uri with the actual download directory):

{
    "model_name": "custom-bge-large-zh-v1.5",
    "dimensions": 1024,
    "max_tokens": 512,
    "language": ["zh"],
    "model_id": "BAAI/bge-large-zh-v1.5",
    "model_uri": "/path/to/bge-large-zh-v1.5"
}
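
The spec file can also be generated from a script, which avoids hand-editing model_uri. The following is a minimal sketch, not part of the original workflow: the helper name write_embedding_spec is hypothetical, and it simply writes the same fields shown above with model_uri resolved to an absolute path.

```python
import json
from pathlib import Path


def write_embedding_spec(model_dir, out_file="custom-bge-large-zh-v1.5.json"):
    """Write the custom embedding spec, pointing model_uri at model_dir.

    Hypothetical helper: the field values mirror the JSON shown above.
    """
    spec = {
        "model_name": "custom-bge-large-zh-v1.5",
        "dimensions": 1024,
        "max_tokens": 512,
        "language": ["zh"],
        "model_id": "BAAI/bge-large-zh-v1.5",
        # Absolute path, so the spec does not depend on the working directory.
        "model_uri": str(Path(model_dir).resolve()),
    }
    Path(out_file).write_text(
        json.dumps(spec, indent=4, ensure_ascii=False), encoding="utf-8"
    )
    return spec
```

For example, write_embedding_spec("./bge-large-zh-v1.5") would produce the file expected by the register command in the next step.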

Register the custom model:

xinference register --model-type embedding --file custom-bge-large-zh-v1.5.json --persist

Launch the custom model:

xinference launch --model-name custom-bge-large-zh-v1.5 --model-type embedding
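
Once the model is running, it can be queried over Xinference's OpenAI-compatible embeddings route. The sketch below uses only the standard library and rests on two assumptions: the endpoint argument is your server's address (substitute your own), and the model UID equals the model name used at launch. The function names are illustrative.

```python
import json
import urllib.request


def build_embedding_request(endpoint, model_uid, texts):
    """Build (but do not send) a POST request for /v1/embeddings."""
    payload = {"model": model_uid, "input": texts}
    return urllib.request.Request(
        endpoint.rstrip("/") + "/v1/embeddings",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(endpoint, model_uid, texts, timeout=30.0):
    """Send the request and return one embedding vector per input text."""
    req = build_embedding_request(endpoint, model_uid, texts)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        body = json.load(resp)
    # Assumes the OpenAI-style response layout: {"data": [{"embedding": [...]}, ...]}
    return [item["embedding"] for item in body["data"]]
```

Against a running server, embed("http://localhost:9999", "custom-bge-large-zh-v1.5", ["你好"]) should return a single vector whose length matches the dimensions value in the spec (1024).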

bge-reranker-v2-m3

Download the model to a chosen local directory:

modelscope download --model AI-ModelScope/bge-reranker-v2-m3 --local_dir ./bge-reranker-v2-m3

Describe the custom rerank model in custom-bge-reranker-v2-m3.json:

{
    "model_name": "custom-bge-reranker-v2-m3",
    "type": "normal",
    "language": ["en", "zh", "multilingual"],
    "model_id": "BAAI/bge-reranker-v2-m3",
    "model_uri": "/path/to/bge-reranker-v2-m3"
}

Register the custom model:

xinference register --model-type rerank --file ./custom-bge-reranker-v2-m3.json --persist

This fails with the following error:

Traceback (most recent call last):
  File "//env/bin/xinference", line 8, in <module>
    sys.exit(cli())
  File "//env/lib/python3.10/site-packages/click/core.py", line 1161, in __call__
    return self.main(*args, **kwargs)
  File "//env/lib/python3.10/site-packages/click/core.py", line 1082, in main
    rv = self.invoke(ctx)
  File "//env/lib/python3.10/site-packages/click/core.py", line 1697, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "//env/lib/python3.10/site-packages/click/core.py", line 1443, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "//env/lib/python3.10/site-packages/click/core.py", line 788, in invoke
    return __callback(*args, **kwargs)
  File "//env/lib/python3.10/site-packages/xinference/deploy/cmdline.py", line 407, in register_model
    client.register_model(
  File "//env/lib/python3.10/site-packages/xinference/client/restful/restful_client.py", line 1188, in register_model
    raise RuntimeError(
RuntimeError: Failed to register model, detail: Not Found

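Registration succeeds once --endpoint is passed explicitly. Xinference is deployed on port 9999 here, whereas the CLI targets its default endpoint (port 9997) when the flag is omitted: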

xinference register --endpoint http://localhost:9999 --model-type rerank --file ./custom-bge-reranker-v2-m3.json --persist
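
When it is unclear which port the server is listening on, the candidate endpoints can be probed before running the CLI. This is a rough sketch under the assumption that a healthy Xinference server answers GET /v1/models with HTTP 200; the helper names are hypothetical and the candidate list is only an example.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=3.0):
    """Return the HTTP status of a GET on url, or None if nothing answers."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 404.
        return err.code
    except (urllib.error.URLError, OSError):
        return None


def find_xinference(endpoints, fetch=fetch_status):
    """Return the first endpoint whose /v1/models route answers 200, else None."""
    for endpoint in endpoints:
        if fetch(endpoint.rstrip("/") + "/v1/models") == 200:
            return endpoint
    return None
```

For instance, find_xinference(["http://localhost:9997", "http://localhost:9999"]) returns the address to pass to --endpoint, or None if neither responds as expected.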

Launch the custom model:

xinference launch --model-type rerank --model-name custom-bge-reranker-v2-m3 --endpoint http://localhost:9999
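
The running reranker can then be called over HTTP. The sketch below assumes the server exposes a /v1/rerank route that takes model, query and documents, and that each entry in the response's results list carries an index and a relevance_score; treat that response layout as an assumption to check against your Xinference version. The function names are illustrative.

```python
import json
import urllib.request


def build_rerank_request(endpoint, model_uid, query, documents):
    """Build (but do not send) a POST request for /v1/rerank."""
    payload = {"model": model_uid, "query": query, "documents": documents}
    return urllib.request.Request(
        endpoint.rstrip("/") + "/v1/rerank",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def rank_documents(response, documents):
    """Pair each document with its score, highest score first."""
    ranked = sorted(
        response["results"], key=lambda r: r["relevance_score"], reverse=True
    )
    return [(documents[r["index"]], r["relevance_score"]) for r in ranked]


def rerank(endpoint, model_uid, query, documents, timeout=30.0):
    """Send the rerank request and return (document, score) pairs."""
    req = build_rerank_request(endpoint, model_uid, query, documents)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return rank_documents(json.load(resp), documents)
```

Keeping request construction and response handling in separate functions makes it possible to check both without a live server.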

Verify that both models loaded; the response lists every model currently running.

curl http://localhost:9999/v1/models
{"object":"list","data":[{"id":"custom-bge-large-zh-v1.5","object":"model","created":0,"owned_by":"xinference","model_type":"embedding","address":"0.0.0.0:39987","accelerators":[],"model_name":"custom-bge-large-zh-v1.5","dimensions":1024,"max_tokens":512,"language":["zh"],"model_revision":null,"replica":1},{"id":"custom-bge-reranker-v2-m3","object":"model","created":0,"owned_by":"xinference","model_type":"rerank","address":"0.0.0.0:44611","accelerators":[],"type":"normal","model_name":"custom-bge-reranker-v2-m3","language":["en","zh","multilingual"],"model_revision":null,"replica":1}]}
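
The single-line JSON is hard to scan by eye. As a small sketch, the listing can be reduced to a mapping from model ID to model type; the function name is hypothetical, and the field names (data, id, model_type) are the ones visible in the output above.

```python
import json


def summarize_models(listing_json):
    """Map each running model's id to its model_type."""
    listing = json.loads(listing_json)
    return {model["id"]: model["model_type"] for model in listing["data"]}
```

Applied to the response above, this yields an entry of type embedding for custom-bge-large-zh-v1.5 and one of type rerank for custom-bge-reranker-v2-m3.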
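
### Comparing bge-m3 and bge-large-zh-v1.5: features and performance

#### Parameter count

bge-m3 is the larger model. It is built on XLM-RoBERTa-large and has roughly 568M parameters, a large share of them in its multilingual vocabulary embedding table. bge-large-zh-v1.5 is a BERT-large-sized Chinese model with roughly 326M parameters, so its weights need less memory and disk space.

#### Training data

bge-m3 is pretrained on a multilingual corpus and therefore handles cross-lingual understanding. bge-large-zh-v1.5 is optimized mainly on large-scale Chinese text and may perform better on domain-specific Chinese tasks[^2].

#### Inference speed

Both models use a 24-layer encoder with a hidden size of 1024, so neither is inherently faster at equal input length. bge-m3 accepts inputs of up to 8192 tokens, against 512 for bge-large-zh-v1.5, and longer inputs cost more compute. Latency also depends on hardware and batch size, so it is best measured on the real workload. The function below is only a timing skeleton; the model call is left as a placeholder.

```python
import time


def benchmark(model_name):
    """Time one model call; the call itself is a placeholder to fill in."""
    start_time = time.perf_counter()
    # Placeholder: invoke the model here, e.g. request embeddings for a fixed batch.
    end_time = time.perf_counter()
    return f"{model_name}: {end_time - start_time:.4f} seconds"


print(benchmark('bge-m3'))
print(benchmark('bge-large-zh-v1.5'))
```

#### Suitable use cases

For international products, or tasks that must support several natural languages, bge-m3 is the better choice because it generalizes well across languages. For a project focused on the Chinese domestic market with high accuracy requirements, prefer bge-large-zh-v1.5, which is tuned specifically for Chinese[^3].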