Notes on problems encountered while deploying the Vicuna API

I ran into a few issues while deploying the vicuna-13b API.

Following the official openai_api.md guide, I ran:
python3 -m fastchat.serve.controller
and hit the following error:

2023-11-01 10:21:06 | INFO | controller | args: Namespace(dispatch_method='shortest_queue', host='localhost', port=21001, ssl=False)
2023-11-01 10:21:06 | ERROR | stderr | INFO:     Started server process [1131]
2023-11-01 10:21:06 | ERROR | stderr | INFO:     Waiting for application startup.
2023-11-01 10:21:06 | ERROR | stderr | INFO:     Application startup complete.
2023-11-01 10:21:06 | ERROR | stderr | ERROR:    [Errno 99] error while attempting to bind on address ('::1', 21001, 0, 0): cannot assign requested address
2023-11-01 10:21:06 | ERROR | stderr | INFO:     Waiting for application shutdown.
2023-11-01 10:21:06 | ERROR | stderr | INFO:     Application shutdown complete.
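The bind failure above is likely because uvicorn's default host `localhost` resolved to the IPv6 loopback `::1`, which cannot be assigned in this environment (common in containers without IPv6 enabled). A small sketch, using only the standard library, to inspect what a host string actually resolves to:

```python
import socket

def resolve(host: str, port: int) -> list[tuple[str, str]]:
    """Return (address family, address) pairs a host string resolves to."""
    return [(family.name, sockaddr[0])
            for family, _, _, _, sockaddr in socket.getaddrinfo(host, port)]

# "localhost" may resolve to ::1 (AF_INET6) first on some systems,
# while "0.0.0.0" is a pure IPv4 wildcard and sidesteps the issue.
print(resolve("localhost", 21001))
print(resolve("0.0.0.0", 21001))
```

If `localhost` yields an `AF_INET6` entry first on your machine, binding explicitly to an IPv4 address avoids the error.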

Adding the --host 0.0.0.0 flag fixes it:
python3 -m fastchat.serve.controller --host 0.0.0.0
Output:

2023-11-01 10:22:06 | INFO | controller | args: Namespace(dispatch_method='shortest_queue', host='0.0.0.0', port=21001, ssl=False)
2023-11-01 10:22:06 | ERROR | stderr | INFO:     Started server process [1163]
2023-11-01 10:22:06 | ERROR | stderr | INFO:     Waiting for application startup.
2023-11-01 10:22:06 | ERROR | stderr | INFO:     Application startup complete.
2023-11-01 10:22:06 | ERROR | stderr | INFO:     Uvicorn running on http://0.0.0.0:21001 (Press CTRL+C to quit)
2023-11-01 10:29:23 | INFO | controller | Register a new worker: http://localhost:21002
2023-11-01 10:29:23 | INFO | controller | Register done: http://localhost:21002, {'model_names': ['Vicuna-13b-v1.5'], 'speed': 1, 'queue_length': 0}
2023-11-01 10:29:23 | INFO | stdout | INFO:     127.0.0.1:34186 - "POST /register_worker HTTP/1.1" 200 OK
2023-11-01 10:30:08 | INFO | controller | Receive heart beat. http://localhost:21002
2023-11-01 10:30:08 | INFO | stdout | INFO:     127.0.0.1:34212 - "POST /receive_heart_beat HTTP/1.1" 200 OK

Next, start the model worker (I ran this on an RTX 4090; with 8-bit loading, 24 GB of VRAM is just enough for the 13B model). Note: point --model-path at your local model directory, which per the log below is ../Vicuna-13b-v1.5/:
python3 -m fastchat.serve.model_worker --model-path ../Vicuna-13b-v1.5/ --host 0.0.0.0 --load-8bit
It starts successfully:

2023-11-01 10:28:40 | INFO | model_worker | args: Namespace(awq_ckpt=None, awq_groupsize=-1, awq_wbits=16, controller_address='http://localhost:21001', conv_template=None, cpu_offloading=False, device='cuda', dtype=None, embed_in_truncate=False, gptq_act_order=False, gptq_ckpt=None, gptq_groupsize=-1, gptq_wbits=16, gpus=None, host='0.0.0.0', limit_worker_concurrency=5, load_8bit=True, max_gpu_memory=None, model_names=None, model_path='../Vicuna-13b-v1.5/', no_register=False, num_gpus=1, port=21002, revision='main', seed=None, stream_interval=2, worker_address='http://localhost:21002')
2023-11-01 10:28:40 | INFO | model_worker | Loading the model ['Vicuna-13b-v1.5'] on worker 51f7ff39 ...
Loading the tokenizer from the `special_tokens_map.json` and the `added_tokens.json` will be removed in `transformers 5`,  it is kept for forward compatibility, but it is recommended to update your `tokenizer_config.json` by uploading it again. You will see the new `added_tokens_decoder` attribute that will store the relevant information.
Loading checkpoint shards: 100%|██████████| 3/3 [00:42<00:00, 14.07s/it]
2023-11-01 10:29:23 | ERROR | stderr | 
2023-11-01 10:29:23 | INFO | model_worker | Register to controller
2023-11-01 10:29:23 | ERROR | stderr | INFO:     Started server process [1472]
2023-11-01 10:29:23 | ERROR | stderr | INFO:     Waiting for application startup.
2023-11-01 10:29:23 | ERROR | stderr | INFO:     Application startup complete.
2023-11-01 10:29:23 | ERROR | stderr | INFO:     Uvicorn running on http://0.0.0.0:21002 (Press CTRL+C to quit)
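The 24 GB figure is consistent with a back-of-the-envelope estimate: with --load-8bit each parameter takes one byte, so the weights of a ~13B-parameter model need about 12 GiB, leaving the rest for the KV cache, activations, and CUDA overhead. A rough sketch (the 13e9 parameter count is an approximation):

```python
def est_weight_gib(n_params: float, bytes_per_param: float) -> float:
    """Rough GiB needed just for model weights at a given precision."""
    return n_params * bytes_per_param / 2**30

params = 13e9  # approximate parameter count of a 13B model

print(f"fp16: {est_weight_gib(params, 2):.1f} GiB")  # ~24.2 GiB: weights alone nearly fill 24 GB
print(f"int8: {est_weight_gib(params, 1):.1f} GiB")  # ~12.1 GiB: leaves headroom on a 4090
```

This is why the fp16 model cannot be served on a single 24 GB card, while the 8-bit version fits.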

Finally, start the OpenAI-compatible API server:
python3 -m fastchat.serve.openai_api_server --host 0.0.0.0 --port 8000

INFO:     Started server process [2099]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
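With all three processes up, the server speaks the OpenAI chat-completions protocol on port 8000. A minimal stdlib sketch of a request, assuming the model name matches what the worker registered (here Vicuna-13b-v1.5) and the server runs on localhost; adjust host, port, and model name to your setup:

```python
import json
import urllib.request

def build_payload(model: str, prompt: str) -> dict:
    """Build an OpenAI-style chat-completions request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
    }

def chat(base_url: str, payload: dict) -> dict:
    """POST the payload to the /v1/chat/completions endpoint."""
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

if __name__ == "__main__":
    body = build_payload("Vicuna-13b-v1.5", "Hello! Who are you?")
    reply = chat("http://localhost:8000", body)
    print(reply["choices"][0]["message"]["content"])
```

The official docs also show using the openai Python client pointed at this base URL; the raw-HTTP version above just makes the wire format explicit.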