InferVision NN model deployment webinar notes

Model life span

  1. Build time: load the model into memory (zip, encode, send, read, unzip, decode, certificate…)
  2. Launch time: load the model into the GPU (API, load check, free, status verification, launch config)
  3. Run time: read input, provide output (input API, output API, execute)

Free
Each instance runs in its own process -> kill the process -> memory is freed
gRPC: inter-process communication
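
The "kill the process to free memory" idea can be sketched with the standard library. This is a minimal sketch, not the webinar's implementation: `INSTANCE_CODE`, `launch_instance` and `free_instance` are hypothetical names, and the child process only sleeps where a real instance would hold a model. The point is that once the process exits, the operating system reclaims everything it had allocated.

```python
import subprocess
import sys

# Hypothetical stand-in for a model instance: a child process that just sleeps.
INSTANCE_CODE = "import time; time.sleep(60)"


def launch_instance() -> subprocess.Popen:
    """Start the instance in its own OS process."""
    return subprocess.Popen([sys.executable, "-c", INSTANCE_CODE])


def free_instance(proc: subprocess.Popen, timeout: float = 5.0) -> int:
    """Kill the process and wait for it to exit; the OS then reclaims its memory."""
    proc.kill()
    return proc.wait(timeout=timeout)


if __name__ == "__main__":
    instance = launch_instance()
    print("exit code after kill:", free_instance(instance))
```

Keeping one instance per process means the scheduler never has to trust the instance to clean up after itself.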

Scheduler center

  1. Register (receive Archive, return id)
  2. Inference (receive input, return output)
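
The two scheduler calls (hand over an archive and get an id back, then send input and get output back) can be sketched as a tiny in-memory class. Everything here is an assumption for illustration: the class name `SchedulerCenter`, the method names `register` and `infer`, and the use of a plain Python callable in place of a real model archive. The notes mention gRPC as the transport; this sketch leaves the transport out.

```python
import uuid
from typing import Callable, Dict, List

# Hypothetical stand-in for a model archive: any callable from inputs to outputs.
Model = Callable[[List[float]], List[float]]


class SchedulerCenter:
    def __init__(self) -> None:
        self._models: Dict[str, Model] = {}

    def register(self, archive: Model) -> str:
        """Receive an archive, store it, and return its id."""
        model_id = uuid.uuid4().hex
        self._models[model_id] = archive
        return model_id

    def infer(self, model_id: str, inputs: List[float]) -> List[float]:
        """Receive input for a registered model and return its output."""
        model = self._models[model_id]  # raises KeyError for an unknown id
        return model(inputs)
```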

Instance status:

  • working, but can still receive new tasks
  • will be killed, cannot receive new tasks

Assumption: a model can be loaded into a single GPU

How to find an instance:

  • A matching instance is already working: just send the request to it
  • No matching instance, but a GPU is available: launch a new instance
  • No matching instance and no available GPU: kill a working instance (find it and kill it), then launch a new instance …
  • No matching instance, no available GPU, and nothing to kill: wait… [based on the client's timeout]
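
The four cases above can be sketched as one lookup function. This is a simplified sketch under stated assumptions: one instance per GPU, a hypothetical `busy` flag to decide which instance may be killed, and made-up action names (`send`, `launch`, `kill_and_launch`, `wait`). The notes do not say how the victim is chosen, so the "first idle instance" policy is only a placeholder.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional, Tuple


class Status(Enum):
    WORKING = "working"    # can still receive new tasks
    DRAINING = "draining"  # will be killed, receives no new tasks


@dataclass
class Instance:
    model_id: str
    gpu: int
    status: Status = Status.WORKING
    busy: bool = False  # hypothetical flag: a request is executing right now


def find_instance(model_id: str, instances: List[Instance],
                  num_gpus: int) -> Tuple[str, Optional[Instance]]:
    # Case 1: a matching instance is already working, just send to it.
    for inst in instances:
        if inst.model_id == model_id and inst.status is Status.WORKING:
            return "send", inst

    # Case 2: no match, but some GPU hosts no instance, so launch there.
    used = {inst.gpu for inst in instances}
    free = [g for g in range(num_gpus) if g not in used]
    if free:
        new = Instance(model_id, free[0])
        instances.append(new)
        return "launch", new

    # Case 3: no free GPU; evict an idle working instance and reuse its GPU.
    victim = next((i for i in instances
                   if i.status is Status.WORKING and not i.busy), None)
    if victim is not None:
        victim.status = Status.DRAINING
        instances.remove(victim)  # stands in for killing the process
        new = Instance(model_id, victim.gpu)
        instances.append(new)
        return "kill_and_launch", new

    # Case 4: nothing can be killed; the caller waits until its timeout.
    return "wait", None
```

In a real scheduler this function would need a lock around it, since two concurrent requests could otherwise both claim the same free GPU (the race condition listed among the challenges).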

Scheduler challenges

  1. Better monitoring and analysis
  2. Better instance matching
  3. Better load balance
  4. Avoid race condition
  5. Error handling
  6. Logging
  7. Mimic production environment (load, profile, test, optimization)
  8. Test (unit test, integration test, etc.)
  9. Interface
  10. Distributed

Client

  1. I/O cost -> highly efficient IPC (inter-process communication: shared memory)
  2. Client sends no requests, server sits idle -> keep the scheduler busy with requests

Based on the memory available on the client, determine the number of requests n -> shared memory
Preprocess input on the client CPU -> to shared memory -> to request queue -> to scheduler
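
The client-side flow can be sketched with Python's `multiprocessing.shared_memory` module (Python 3.8+). This is a rough sketch, not the presenter's code: `max_inflight`, `put_input` and `get_input` are hypothetical helpers, the payload is raw bytes rather than a preprocessed tensor, and the request queue is left out. The payload size is passed explicitly because the operating system may round a segment up to a whole page.

```python
from multiprocessing import shared_memory


def max_inflight(memory_budget: int, request_size: int) -> int:
    """Number of requests n that fit into the client's memory budget."""
    return memory_budget // request_size


def put_input(data: bytes) -> shared_memory.SharedMemory:
    """Client side: copy a preprocessed (non-empty) input into a new segment."""
    shm = shared_memory.SharedMemory(create=True, size=len(data))
    shm.buf[:len(data)] = data
    return shm


def get_input(name: str, size: int) -> bytes:
    """Consumer side: attach to the segment by name and read the payload."""
    shm = shared_memory.SharedMemory(name=name)
    try:
        return bytes(shm.buf[:size])
    finally:
        shm.close()
```

Only the segment name and payload size would need to travel through the request queue; the creator calls `close()` and `unlink()` once the request is done.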

Same model on one GPU: 4 models in parallel, batch
Combine operators -> into batches, not multiple instances
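
One reading of this note is that several requests for the same model should be merged into one batched call instead of being spread over several instances. A minimal sketch of that idea, assuming a hypothetical `model_batch_fn` that accepts a list of inputs and returns one output per input; the batch size of 4 only mirrors the number in the note.

```python
from typing import Callable, List, Sequence, TypeVar

In = TypeVar("In")
Out = TypeVar("Out")


def run_batched(inputs: Sequence[In],
                model_batch_fn: Callable[[Sequence[In]], List[Out]],
                max_batch: int = 4) -> List[Out]:
    """Run queued inputs through one instance, up to max_batch per call."""
    outputs: List[Out] = []
    for start in range(0, len(inputs), max_batch):
        batch = inputs[start:start + max_batch]
        outputs.extend(model_batch_fn(batch))  # one call serves the whole batch
    return outputs
```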

Docker container layers vs. model layers [LARGE vs. SMALL]
