Supported Model Frameworks/Formats - KServe

https://kserve.github.io/website/0.10/modelserving/v1beta1/serving_runtime/

Model Serving Runtimes

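KServe provides a simple Kubernetes CRD for deploying single or multiple trained models onto model serving runtimes such as TFServing, TorchServe, and Triton Inference Server. In addition, ModelServer is the Python model serving runtime implemented in KServe itself with the prediction v1 protocol, while MLServer implements the prediction v2 protocol with both REST and gRPC. These model serving runtimes provide out-of-the-box model serving, but you can also build your own model server for more complex use cases. KServe provides basic API primitives that make it easy to build a custom model serving runtime, and you can use other tools such as BentoML to build your custom model serving image.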

After models are deployed with an InferenceService, you get all of the following serverless features provided by KServe.

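  • Scale to and from zero
  • Request-based autoscaling on CPU/GPU
  • Revision management
  • Optimized container
  • Batching
  • Request/response logging
  • Traffic management
  • Security with AuthN/AuthZ
  • Distributed tracing
  • Out-of-the-box metrics
  • Ingress/egress control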

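The table below lists each model serving runtime that KServe supports. The HTTP and gRPC columns indicate the prediction protocol version that the serving runtime supports. The KServe prediction protocol is noted as either "v1" or "v2". Some serving runtimes also support their own prediction protocol; these are marked with an *. The Default Serving Runtime Version column gives the source and version of the serving runtime: MLServer, KServe, or its own. These versions can also be found in the runtime kustomization YAML. All KServe native model serving runtimes use the current KServe release version (v0.10). The Supported Framework (Major) Version(s) column lists the major versions of the models that are supported. These can also be found in the respective runtime YAML under the supportedModelFormats field. For model frameworks that use the KServe serving runtime, the specific default version can be found in kserve/python. In a given serving runtime directory, the setup.py file contains the exact model framework version used. For example, in kserve/python/lgbserver the setup.py file sets the model framework version to 3.3.2 (lightgbm == 3.3.2).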

| Model Serving Runtime | Exported model | HTTP | gRPC | Default Serving Runtime Version | Supported Framework (Major) Version(s) | Examples |
|---|---|---|---|---|---|---|
| Custom ModelServer | | v1, v2 | v2 | | | Custom Model |
| LightGBM MLServer | Saved LightGBM Model | v2 | v2 | v1.0.0 (MLServer) | 3 | LightGBM Iris V2 |
| LightGBM ModelServer | Saved LightGBM Model | v1 | | v0.10 (KServe) | 3 | LightGBM Iris |
| MLFlow ModelServer | Saved MLFlow Model | v2 | v2 | v1.0.0 (MLServer) | 1 | MLFlow wine-classifier |
| PMML ModelServer | PMML | v1 | | v0.10 (KServe) | 3, 4 (PMML4.4.1) | SKLearn PMML |
| SKLearn MLServer | Pickled Model | v2 | v2 | v1.0.0 (MLServer) | 1 | SKLearn Iris V2 |
| SKLearn ModelServer | Pickled Model | v1 | | v0.10 (KServe) | 1 | SKLearn Iris |
| TFServing | TensorFlow SavedModel | v1 | *tensorflow | 2.6.2 (TFServing Versions) | 2 | TensorFlow flower |
| TorchServe | Eager Model/TorchScript | v1, v2, *torchserve | *torchserve | 0.7.0 (TorchServe) | 1 | TorchServe mnist |
| Triton Inference Server | TensorFlow, TorchScript, ONNX | v2 | v2 | 21.09-py3 (Triton) | 8 (TensorRT), 1, 2 (TensorFlow), 1 (PyTorch), 2 (Triton) Compatibility Matrix | Torchscript cifar |
| XGBoost MLServer | Saved Model | v2 | v2 | v1.0.0 (MLServer) | 1 | XGBoost Iris V2 |
| XGBoost ModelServer | Saved Model | v1 | | v0.10 (KServe) | 1 | XGBoost Iris |

*tensorflow - Tensorflow implements its own prediction protocol in addition to KServe's. See: Tensorflow Serving Prediction API documentation

*torchserve - PyTorch implements its own prediction protocol in addition to KServe's. See: Torchserve gRPC API documentation

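Note
The model serving runtime version can be overridden with the runtimeVersion field in the InferenceService yaml, and we highly recommend setting this field for production services.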

apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "torchscript-cifar"
spec:
  predictor:
    triton:
      storageUri: "gs://kfserving-examples/models/torchscript"
      runtimeVersion: 21.08-py3
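
Because the note above recommends pinning runtimeVersion for production services, a small pre-deployment check can catch manifests that omit it. The following is only a sketch: `unpinned_predictors` is a hypothetical helper, not part of KServe, and it assumes the manifest has already been loaded into a Python dict shaped like the YAML above.

```python
def unpinned_predictors(manifest):
    """Return predictor entries that do not set runtimeVersion.

    Hypothetical helper for illustration; it only inspects dict-valued
    entries under spec.predictor (for example "triton" or "model").
    """
    predictor = manifest.get("spec", {}).get("predictor", {})
    return [
        name
        for name, spec in predictor.items()
        if isinstance(spec, dict) and "runtimeVersion" not in spec
    ]


# The manifest from the example above, expressed as a dict.
pinned = {
    "apiVersion": "serving.kserve.io/v1beta1",
    "kind": "InferenceService",
    "metadata": {"name": "torchscript-cifar"},
    "spec": {
        "predictor": {
            "triton": {
                "storageUri": "gs://kfserving-examples/models/torchscript",
                "runtimeVersion": "21.08-py3",
            }
        }
    },
}

print(unpinned_predictors(pinned))  # an empty list means every entry is pinned
```

Scalar fields such as canaryTrafficPercent are skipped because only dict-valued entries can carry a runtimeVersion.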

Deploy a Tensorflow Model with InferenceService

Create the HTTP InferenceService

Create an InferenceService yaml that specifies the framework tensorflow and a storageUri pointing to a saved tensorflow model, and name it tensorflow.yaml.
Old Schema

apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "flower-sample"
spec:
  predictor:
    tensorflow:
      storageUri: "gs://kfserving-examples/models/tensorflow/flowers"

New Schema

apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "flower-sample"
spec:
  predictor:
    model:
      modelFormat:
        name: tensorflow
      storageUri: "gs://kfserving-examples/models/tensorflow/flowers"

kubectl

kubectl apply -f tensorflow.yaml 

Expected Output

$ inferenceservice.serving.kserve.io/flower-sample created
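
The old and new schema shown above differ only in where the framework is named: the old schema uses a framework key under predictor, while the new schema uses model.modelFormat.name. A minimal sketch of that mapping follows. `to_new_schema` is a hypothetical helper, and it assumes the framework key can be reused directly as the format name, which holds for the tensorflow and pytorch examples in this document but should not be taken as a general rule.

```python
import copy


def to_new_schema(manifest):
    """Rewrite spec.predictor.<framework> as spec.predictor.model.

    Assumption: the framework key (e.g. "tensorflow", "pytorch") can be
    reused as modelFormat.name, as in the examples in this document.
    """
    converted = copy.deepcopy(manifest)
    predictor = converted["spec"]["predictor"]
    for framework in ("tensorflow", "pytorch"):
        if framework in predictor:
            framework_spec = predictor.pop(framework)
            # modelFormat goes first; the remaining fields carry over as-is.
            predictor["model"] = {"modelFormat": {"name": framework}, **framework_spec}
            break
    return converted


old = {
    "apiVersion": "serving.kserve.io/v1beta1",
    "kind": "InferenceService",
    "metadata": {"name": "flower-sample"},
    "spec": {
        "predictor": {
            "tensorflow": {
                "storageUri": "gs://kfserving-examples/models/tensorflow/flowers"
            }
        }
    },
}

new = to_new_schema(old)
print(new["spec"]["predictor"])
```

Predictor-level fields such as canaryTrafficPercent stay where they are, and fields nested under the framework key (storageUri, ports, resources, protocolVersion) move under model unchanged.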

Wait for the InferenceService to be in the ready state

kubectl get isvc flower-sample
NAME            URL                                        READY   PREV   LATEST   PREVROLLEDOUTREVISION        LATESTREADYREVISION                     AGE
flower-sample   http://flower-sample.default.example.com   True           100                                   flower-sample-predictor-default-n9zs6   7m15s

Run a prediction

The first step is to determine the ingress IP and ports and set INGRESS_HOST and INGRESS_PORT. The inference request input file can be downloaded here.

MODEL_NAME=flower-sample
INPUT_PATH=@./input.json
SERVICE_HOSTNAME=$(kubectl get inferenceservice ${MODEL_NAME} -o jsonpath='{.status.url}' | cut -d "/" -f 3)
curl -v -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:predict -d $INPUT_PATH

Expected Output

* Connected to localhost (::1) port 8080 (#0)
> POST /v1/models/tensorflow-sample:predict HTTP/1.1
> Host: tensorflow-sample.default.example.com
> User-Agent: curl/7.73.0
> Accept: */*
> Content-Length: 16201
> Content-Type: application/x-www-form-urlencoded
> 
* upload completely sent off: 16201 out of 16201 bytes
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< content-length: 222
< content-type: application/json
< date: Sun, 31 Jan 2021 01:01:50 GMT
< x-envoy-upstream-service-time: 280
< server: istio-envoy
< 
{
    "predictions": [
        {
            "scores": [0.999114931, 9.20987877e-05, 0.000136786213, 0.000337257545, 0.000300532585, 1.84813616e-05],
            "prediction": 0,
            "key": "   1"
        }
    ]
}
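
The same v1 request can also be assembled from Python. The sketch below uses only the standard library, builds the request without sending it, and then reads the top class out of the response shown above. The ingress host and port are placeholder values, the empty "instances" payload stands in for the contents of input.json, and both helper names are hypothetical.

```python
import json
import urllib.request
from urllib.parse import urlparse


def build_v1_predict_request(ingress_host, ingress_port, status_url, model_name, payload):
    """Build (but do not send) a POST to the v1 :predict endpoint."""
    url = f"http://{ingress_host}:{ingress_port}/v1/models/{model_name}:predict"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # Same role as curl's -H "Host: ..."; urlparse(...).netloc mirrors
            # the `cut -d "/" -f 3` step applied to .status.url.
            "Host": urlparse(status_url).netloc,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def top_class(prediction):
    """Return (index, score) of the highest-scoring class."""
    scores = prediction["scores"]
    index = max(range(len(scores)), key=scores.__getitem__)
    return index, scores[index]


# Placeholder ingress values and an empty payload standing in for input.json.
request = build_v1_predict_request(
    "localhost", 8080, "http://flower-sample.default.example.com",
    "flower-sample", {"instances": []},
)
print(request.get_method(), request.full_url, request.get_header("Host"))

# Response body copied from the expected output above.
body = '{"predictions": [{"scores": [0.999114931, 9.20987877e-05, 0.000136786213, 0.000337257545, 0.000300532585, 1.84813616e-05], "prediction": 0, "key": "   1"}]}'
for item in json.loads(body)["predictions"]:
    print(item["key"].strip(), item["prediction"], top_class(item))
```

To actually send the request, pass it to urllib.request.urlopen once INGRESS_HOST and INGRESS_PORT are known.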

Canary Rollout

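A canary rollout is a good way to control the risk of rolling out a new model: first move a small percentage of the traffic to it, then gradually increase that percentage. To run a canary rollout, apply a canary.yaml that specifies the canaryTrafficPercent field.
Old Schema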

apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "flower-sample"
spec:
  predictor:
    canaryTrafficPercent: 20
    tensorflow:
      storageUri: "gs://kfserving-examples/models/tensorflow/flowers-2"

New Schema

apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "flower-sample"
spec:
  predictor:
    canaryTrafficPercent: 20
    model:
      modelFormat:
        name: tensorflow
      storageUri: "gs://kfserving-examples/models/tensorflow/flowers-2"

Apply canary.yaml to create the canary InferenceService.
kubectl

kubectl apply -f canary.yaml 

To verify that the traffic split percentage is applied correctly, run the following command:

kubectl get isvc flower-sample
NAME            URL                                        READY   PREV   LATEST   PREVROLLEDOUTREVISION                   LATESTREADYREVISION                     AGE
flower-sample   http://flower-sample.default.example.com   True    80     20       flower-sample-predictor-default-n9zs6   flower-sample-predictor-default-2kwtr   7m15s

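As you can see, the traffic is split between the last rolled out revision and the current latest ready revision. KServe automatically tracks the last rolled out (stable) revision for you, so you do not need to maintain both default and canary on the InferenceService as in v1alpha2.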
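
The PREV and LATEST columns in the output above always add up to 100, with LATEST equal to canaryTrafficPercent. The following sketch shows only that arithmetic, not KServe's routing implementation; `traffic_split` and the promotion steps are hypothetical.

```python
def traffic_split(canary_percent):
    """Return (previous, latest) traffic percentages for a canary rollout."""
    if not 0 <= canary_percent <= 100:
        raise ValueError("canaryTrafficPercent must be between 0 and 100")
    return 100 - canary_percent, canary_percent


# Hypothetical promotion schedule; choose steps that suit your own risk tolerance.
for step in (20, 50, 100):
    prev, latest = traffic_split(step)
    print(f"canaryTrafficPercent={step}: PREV={prev} LATEST={latest}")
```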

Create the gRPC InferenceService

Create an InferenceService that exposes the gRPC port; by default it listens on port 9000.
Old Schema

apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "flower-grpc"
spec:
  predictor:
    tensorflow:
      storageUri: "gs://kfserving-examples/models/tensorflow/flowers"
      ports:
        - containerPort: 9000
          name: h2c
          protocol: TCP

New Schema

apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "flower-grpc"
spec:
  predictor:
    model:
      modelFormat:
        name: tensorflow
      storageUri: "gs://kfserving-examples/models/tensorflow/flowers"
      ports:
        - containerPort: 9000
          name: h2c
          protocol: TCP

Apply grpc.yaml to create the gRPC InferenceService.
kubectl

kubectl apply -f grpc.yaml 

Expected Output

$ inferenceservice.serving.kserve.io/flower-grpc created

Run a prediction

We use a python gRPC client for the prediction, so you need to create a python virtual environment and install tensorflow-serving-api.

# The prediction script is written in TensorFlow 1.x
# Quote the requirement so the shell does not treat ">" and "<" as redirections
pip install "tensorflow-serving-api>=1.14.0,<2.0.0"

Run the gRPC prediction script

MODEL_NAME=flower-grpc
INPUT_PATH=./input.json
SERVICE_HOSTNAME=$(kubectl get inferenceservice ${MODEL_NAME} -o jsonpath='{.status.url}' | cut -d "/" -f 3)
python grpc_client.py --host $INGRESS_HOST --port $INGRESS_PORT --model $MODEL_NAME --hostname $SERVICE_HOSTNAME --input_path $INPUT_PATH

Expected Output

outputs {
  key: "key"
  value {
    dtype: DT_STRING
    tensor_shape {
      dim {
        size: 1
      }
    }
    string_val: "   1"
  }
}
outputs {
  key: "prediction"
  value {
    dtype: DT_INT64
    tensor_shape {
      dim {
        size: 1
      }
    }
    int64_val: 0
  }
}
outputs {
  key: "scores"
  value {
    dtype: DT_FLOAT
    tensor_shape {
      dim {
        size: 1
      }
      dim {
        size: 6
      }
    }
    float_val: 0.9991149306297302
    float_val: 9.209887502947822e-05
    float_val: 0.00013678647519554943
    float_val: 0.0003372581850271672
    float_val: 0.0003005331673193723
    float_val: 1.848137799242977e-05
  }
}
model_spec {
  name: "flowers-sample"
  version {
    value: 1
  }
  signature_name: "serving_default"
}

Deploy a PyTorch Model with TorchServe InferenceService

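In this example, we deploy a trained PyTorch MNIST model to predict handwritten digits by running an InferenceService with the TorchServe runtime, which is the default installed serving runtime for PyTorch models. Model interpretability is also an important aspect, because it helps to understand which input features were important for a particular classification. Captum is a model interpretability library. In this example, the TorchServe explain endpoint is implemented with Captum's state-of-the-art algorithms, including integrated gradients, to give users an easy way to understand which features contribute to the model output. You can refer to the Captum Tutorial for more examples.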

Create Model Storage with a Model Archive File and Config

The KServe/TorchServe integration expects the following model store layout.

├── config
│   ├── config.properties
├── model-store
│   ├── densenet_161.mar
│   ├── mnist.mar

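TorchServe provides a utility to package all the model artifacts into a single TorchServe Model Archive File (MAR). After the model artifacts are packaged into a MAR file, you upload it to the model-store directory under the model storage path.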
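
Before uploading, it can help to check a local directory against the layout above. The sketch below is one way to do that with the standard library. `check_model_store` is a hypothetical helper, and the rule it applies (config/config.properties plus at least one .mar file in model-store) is read off the layout shown here rather than taken from KServe or TorchServe code.

```python
import tempfile
from pathlib import Path

# Entries taken from the layout above; the .mar names vary per model.
REQUIRED = ("config/config.properties", "model-store")


def check_model_store(root):
    """Return a list of problems found in a local model storage directory."""
    root = Path(root)
    problems = [f"missing {entry}" for entry in REQUIRED if not (root / entry).exists()]
    store = root / "model-store"
    if store.is_dir() and not any(store.glob("*.mar")):
        problems.append("no .mar files in model-store")
    return problems


with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "config").mkdir()
    (base / "config" / "config.properties").write_text("model_store=/mnt/models/model-store\n")
    (base / "model-store").mkdir()
    print(check_model_store(base))  # reports the missing MAR file
    (base / "model-store" / "mnist.mar").write_bytes(b"")  # empty placeholder archive
    print(check_model_store(base))
```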

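You can store your model and dependent files on remote storage or on a local persistent volume. The MNIST model and dependent files can be obtained from here.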

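Note
For remote storage, you can either start the example with the prebuilt MNIST MAR file stored in the KServe example GCS bucket gs://kfserving-examples/models/torchserve/image_classifier, or generate the MAR file with torch-model-archiver and create the model store on remote storage according to the layout above.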

torch-model-archiver --model-name mnist --version 1.0 \
--model-file model-archiver/model-store/mnist/mnist.py \
--serialized-file model-archiver/model-store/mnist/mnist_cnn.pt \
--handler model-archiver/model-store/mnist/mnist_handler.py

For PVC users, please refer to model archive file generation to auto-generate MAR files from the model and dependent files.

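TorchServe uses a config.properties file to store its configuration. Please see here for more details on the properties supported by the configuration file. The following is a sample file for KServe: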

inference_address=http://0.0.0.0:8085
management_address=http://0.0.0.0:8085
metrics_address=http://0.0.0.0:8082
grpc_inference_port=7070
grpc_management_port=7071
enable_metrics_api=true
metrics_format=prometheus
number_of_netty_threads=4
job_queue_size=10
enable_envvars_config=true
install_py_dep_per_model=true
model_store=/mnt/models/model-store
model_snapshot={"name":"startup.cfg","modelCount":1,"models":{"mnist":{"1.0":{"defaultVersion":true,"marName":"mnist.mar","minWorkers":1,"maxWorkers":5,"batchSize":1,"maxBatchDelay":10,"responseTimeout":120}}}}

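The KServe/TorchServe integration supports the KServe v1/v2 REST protocol. In config.properties, we need to turn on the flag enable_envvars_config so that the KServe envelope can be set through an environment variable.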

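Warning
The previous service_envelope property has been deprecated. In the config.properties file, use the flag enable_envvars_config=true to enable setting the service envelope at runtime. Requests are converted from the KServe inference request format to the TorchServe request format and sent to the inference_address configured via local socket.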
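
Since the sample config.properties mixes plain key=value pairs with a JSON-valued model_snapshot, it can be useful to sanity-check the file before deploying. The sketch below parses a shortened copy of the sample with the standard library; `parse_properties` is a hypothetical helper that handles only simple key=value lines, not the full Java properties syntax.

```python
import json

# Shortened copy of the sample file above.
SAMPLE = """\
inference_address=http://0.0.0.0:8085
enable_envvars_config=true
model_store=/mnt/models/model-store
model_snapshot={"name":"startup.cfg","modelCount":1,"models":{"mnist":{"1.0":{"defaultVersion":true,"marName":"mnist.mar","minWorkers":1,"maxWorkers":5,"batchSize":1,"maxBatchDelay":10,"responseTimeout":120}}}}
"""


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            key, _, value = line.partition("=")  # split on the first "=" only
            props[key.strip()] = value.strip()
    return props


props = parse_properties(SAMPLE)
snapshot = json.loads(props["model_snapshot"])
print(props["enable_envvars_config"], list(snapshot["models"]))
```

Checking that enable_envvars_config is "true" and that every marName in the snapshot exists in model-store catches the most common misconfigurations early.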

Deploy a PyTorch Model with the V1 REST Protocol

Create the TorchServe InferenceService

KServe selects the TorchServe runtime by default when you specify the model format pytorch in the new model spec.

Old Schema

apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "torchserve"
spec:
  predictor:
    pytorch:
      storageUri: gs://kfserving-examples/models/torchserve/image_classifier/v1

New Schema

apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "torchserve"
spec:
  predictor:
    model:
      modelFormat:
        name: pytorch
      storageUri: gs://kfserving-examples/models/torchserve/image_classifier/v1

To deploy the model on CPU, apply the torchserve.yaml shown above to create the InferenceService.
kubectl

kubectl apply -f torchserve.yaml

Old Schema

apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "torchserve"
spec:
  predictor:
    pytorch:
      storageUri: gs://kfserving-examples/models/torchserve/image_classifier/v1
      resources:
        limits:
          memory: 4Gi
          nvidia.com/gpu: "1"

New Schema

apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "torchserve"
spec:
  predictor:
    model:
      modelFormat:
        name: pytorch
      storageUri: gs://kfserving-examples/models/torchserve/image_classifier/v1
      resources:
        limits:
          memory: 4Gi
          nvidia.com/gpu: "1"

To deploy the model on GPU, apply the gpu.yaml shown above to create the GPU InferenceService.
kubectl

kubectl apply -f gpu.yaml

Expected Output

$ inferenceservice.serving.kserve.io/torchserve created

Model Inference

The first step is to determine the ingress IP and ports and set INGRESS_HOST and INGRESS_PORT.

MODEL_NAME=mnist
SERVICE_HOSTNAME=$(kubectl get inferenceservice torchserve -o jsonpath='{.status.url}' | cut -d "/" -f 3)

You can use the image converter to convert an image to a base64 byte array. For other models, please refer to input request.

curl -v -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/${MODEL_NAME}:predict -d @./mnist.json

Expected Output

*   Trying 52.89.19.61...
* Connected to a881f5a8c676a41edbccdb0a394a80d6-2069247558.us-west-2.elb.amazonaws.com (52.89.19.61) port 80 (#0)
> PUT /v1/models/mnist HTTP/1.1
> Host: torchserve.kserve-test.example.com
> User-Agent: curl/7.47.0
> Accept: */*
> Content-Length: 167
> Expect: 100-continue
>
< HTTP/1.1 100 Continue
* We are completely uploaded and fine
< HTTP/1.1 200 OK
< cache-control: no-cache; no-store, must-revalidate, private
< content-length: 1
< date: Tue, 27 Oct 2020 08:26:19 GMT
< expires: Thu, 01 Jan 1970 00:00:00 UTC
< pragma: no-cache
< x-request-id: b10cfc9f-cd0f-4cda-9c6c-194c2cdaa517
< x-envoy-upstream-service-time: 6
< server: istio-envoy
<
* Connection #0 to host a881f5a8c676a41edbccdb0a394a80d6-2069247558.us-west-2.elb.amazonaws.com left intact
{"predictions": ["2"]}
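
The image conversion step mentioned above amounts to base64-encoding the image bytes and wrapping them in a v1 request body. The sketch below illustrates the idea under a stated assumption: that the handler expects a body of the form {"instances": [{"data": <base64 string>}]}. That shape is our reading of the mnist.json example, so check it against your own input file. `build_v1_image_request` and the file name digit.png are hypothetical.

```python
import base64
import json


def build_v1_image_request(image_bytes):
    """Wrap raw image bytes in a v1 predict request body.

    Assumption: the handler expects {"instances": [{"data": <base64>}]};
    verify this against the input file you actually send.
    """
    encoded = base64.b64encode(image_bytes).decode("utf-8")
    return {"instances": [{"data": encoded}]}


# Stand-in bytes; in practice read them with open("digit.png", "rb").read(),
# where digit.png is a hypothetical file name.
fake_image = b"\x89PNG\r\n\x1a\n"
print(json.dumps(build_v1_image_request(fake_image)))
```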

Model Explanation

To get the model explanation:

curl -v -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/mnist:explain -d @./mnist.json

Expected Output

{"explanations": [[[[0.0005394675730469475, -0.0022280013123036043, -0.003416480100841055, -0.0051329881112415965, -0.009973864160829985, -0.004112560908882716, -0.009223458030656112, -0.0006676354577291628, -0.005249806664413386, -0.0009790519227372953, -0.0026914653993121195, -0.0069470097151383995, -0.00693530415962956, -0.005973878697847718, -0.00425042437288857, 0.0032867281838150977, -0.004297780258633562, -0.005643196661192014, -0.00653025019738562, -0.0047062916121001185, -0.0018656628277792628, -0.0016757477204072532, -0.0010410417081844845, -0.0019093520822156726, -0.004451403461006374, -0.0008552767257773671, -0.0027638888169885267, -0.0], [0.006971297052106784, 0.007316855222185687, 0.012144494329150574, 0.011477799383288441, 0.006846725347670252, 0.01149386176451476, 0.0045351987881190655, 0.007038361889638708, 0.0035855377023272157, 0.003031419502053957, -0.0008611575226775316, -0.0011085224745969223, -0.0050840743637658534, 0.009855491784340777, 0.007220680811043034, 0.011374285598070253, 0.007147725481709019, 0.0037114580912849457, 0.00030763245479291384, 0.0018305492665953394, 0.010106224395114147, 0.012932881164284687, 0.008862892007714321, 0.0070960526615982435, -0.0015931137903787505, 0.0036495747329455906, 0.0002593849391051298, -0.0], [0.006467265785857396, -0.00041793201228071674, 0.004900316089756856, 0.002308395474823997, 0.007859295399592283, 0.003916404948969494, 0.005630750246437249, 0.0043712538044184375, 0.006128530599133763, -0.009446321309831246, -0.014173645867037036, -0.0062988650915794565, -0.011473838941118539, -0.009049151947644047, -0.0007625645864610934, -0.013721416630061238, -0.0005580156670410108, 0.0033404383756480784, -0.006693278798487951, -0.003705084551144756, 0.005100375089529131, 5.5276874714401074e-05, 0.007221745280359063, -0.00573598303916232, -0.006836169033785967, 0.0025401608627538936, 9.303533912921196e-05, -0.0], [0.005914399808621816, 0.00452643561023696, 0.003968242261515448, 0.010422786058967673, 
0.007728358107899074, 0.01147115923288383, 0.005683869479056691, 0.011150670502307374, 0.008742555292485278, 0.0032882897575743754, 0.014841138421861584, 0.011741228362482451, 0.0004296862879259221, -0.0035118140680654854, -0.006152254410078331, -0.004925121936901983, -2.3611205202801947e-06, 0.029347073037039074, 0.02901626308947743, 0.023379353021343398, 0.004027157620197582, -0.01677662249919171, -0.013497255736128979, 0.006957482854214602, 0.0018321766800746145, 0.008277034396684563, 0.002733405455464871, -0.0], [0.0049579739156640065, -0.002168016158233997, 0.0020644317321723642, 0.0020912464240293825, 0.004719691119907336, 0.007879231202446626, 0.010594445898145937, 0.006533067778982801, 0.002290214592708113, -0.0036651114968251986, 0.010753227423379443, 0.006402706020466243, -0.047075193909339695, -0.08108259303568185, -0.07646875196692542, -0.1681834845371156, -0.1610307396135756, -0.12010309927453829, -0.016148831320070896, -0.009541525999486027, 0.04575604594761406, 0.031470966329886635, 0.02452149438024385, 0.016594078577569567, 0.012213591301610382, -0.002230875840404426, 0.0036704051254298374, -0.0], [0.006410107592414739, 0.005578283890924384, 0.001977103461731095, 0.008935476507124939, 0.0011305055729953436, 0.0004946313900665659, -0.0040266029554395935, -0.004270765544167256, -0.010832150944943138, -0.01653511868336456, -0.011121302103373972, -0.42038514526905024, -0.22874576003118394, -0.16752936178907055, -0.17021699697722079, -0.09998584936787697, -0.09041117495322142, -0.10230248444795721, -0.15260897522094888, 0.07770835838531896, -0.0813761125123066, 0.027556910053932963, 0.036305965104261866, 0.03407793793894619, 0.01212761779302579, 0.006695133380685627, 0.005331392748588556, -0.0], [0.008342680065996267, -0.00029249776150416367, 0.002782130291086583, 0.0027793744856745373, 0.0020525102690845407, 0.003679269934110004, 0.009373846012918791, -0.0031751745946300403, -0.009042846256743316, 0.0074141593032070775, -0.02796812516561052, 
-0.593171583786029, -0.4830164472795136, -0.353860128479443, -0.256482708704862, 0.11515586314578445, 0.12700563162828346, 0.0022342450630152204, -0.24673707669992118, -0.012878340813781437, 0.16866821780196756, 0.009739033161051434, -0.000827843726513152, -0.0002137320694585577, -0.004179480126338929, 0.008454049232317358, -0.002767934266266998, -0.0], [0.007070382982749552, 0.005342127805750565, -0.000983984198542354, 0.007910101170274493, 0.001266267696096404, 0.0038575136843053844, 0.006941130321773131, -0.015195182020687892, -0.016954974010578504, -0.031186444096787943, -0.031754626467747966, 0.038918845112017694, 0.06248943950328597, 0.07703301092601872, 0.0438493628024275, -0.0482404449771698, -0.08718650815999045, -0.0014764704694506415, -0.07426336448916614, -0.10378029666564882, 0.008572087846793842, -0.00017173413848283343, 0.010058893270893113, 0.0028410498666004377, 0.002008290211806285, 0.011905375389931099, 0.006071375802943992, -0.0], [0.0076080165949142685, -0.0017127333725310495, 0.00153128150106188, 0.0033391793764531563, 0.005373442509691564, 0.007207746020295443, 0.007422946703693544, -0.00699779191449194, 0.002395328253696969, -0.011682618874195954, -0.012737004464649057, -0.05379966383523857, -0.07174960461749053, -0.03027341304050314, 0.0019411862216381327, -0.0205575129473766, -0.04617091711614171, -0.017655308106959804, -0.009297162816368814, -0.03358572117988279, -0.1626068444778013, -0.015874364762085157, -0.0013736074085577258, -0.014763439328689378, 0.00631805792697278, 0.0021769414283267273, 0.0023061635006792498, -0.0], [0.005569931813561535, 0.004363218328087518, 0.00025609463218383973, 0.009577483244680675, 0.007257755916229399, 0.00976284778532342, -0.006388840235419147, -0.009017880790555707, -0.015308709334434867, -0.016743935775597355, -0.04372596546189275, -0.03523469356755156, -0.017257810114846107, 0.011960489902313411, 0.01529079831828911, -0.020076559119468443, -0.042792547669901516, -0.0029492027218867116, 
-0.011109560582516062, -0.12985858077848939, -0.2262858575494602, -0.003391725540087574, -0.03063368684328981, -0.01353486587575121, 0.0011140822443932317, 0.006583451102528798, 0.005667533945285076, -0.0], [0.004056272267155598, -0.0006394041203204911, 0.004664893926197093, 0.010593032387298614, 0.014750931538689989, 0.015428721146282149, 0.012167820222401367, 0.017604752451202518, 0.01038886849969188, 0.020544326931163263, -0.0004206566917812794, -0.0037463581359232674, -0.0024656693040735075, 0.0026061897697624353, -0.05186055271869177, -0.09158655048397382, 0.022976389912563913, -0.19851635458461808, -0.11801281807622972, -0.29127727790584423, -0.017138655663803876, -0.04395515676468641, -0.019241432506341576, 0.0011342298743447392, 0.0030625771422964584, -0.0002867924892991192, -0.0017908808807543712, -0.0], [0.0030114260660488892, 0.0020246448273580006, -0.003293361220376816, 0.0036965043883218584, 0.00013185761728146236, -0.004355610866966878, -0.006432601921104354, -0.004148701459814858, 0.005974553907915845, -0.0001399233607281906, 0.010392944122965082, 0.015693249298693028, 0.0459528427528407, -0.013921539948093455, -0.06615556518538708, 0.02921438991320325, -0.16345220625101778, -0.002130491295590408, -0.11449749664916867, -0.030980255589300607, -0.04804122537359171, -0.05144994776295644, 0.005122827412776085, 0.006464862173908011, 0.008624278272940246, 0.0037316228508156427, 0.0036947794337026706, -0.0], [0.0038173843228389405, -0.0017091931226819494, -0.0030871869816778068, 0.002115642501535999, -0.006926441921580917, -0.003023077828426468, -0.014451359520861637, -0.0020793048380231397, -0.010948003939342523, -0.0014460716966395166, -0.01656990336897737, 0.003052317148320358, -0.0026729564809943513, -0.06360067057346147, 0.07780985635080599, -0.1436689936630281, -0.040817177623437874, -0.04373367754296477, -0.18337299150349698, 0.025295182977407064, -0.03874921104331938, -0.002353901742617205, 0.011772560401335033, 0.012480994515707569, 
0.006498422579824301, 0.00632320984076023, 0.003407169765754805, -0.0], [0.00944355257990139, 0.009242583578688485, 0.005069860444386138, 0.012666191449103024, 0.00941789912565746, 0.004720427012836104, 0.007597687789204113, 0.008679266528089945, 0.00889322771021875, -0.0008577904940828809, 0.0022973860384607604, 0.025328230809207493, -0.09908781123080951, -0.07836626399832172, -0.1546141264726177, -0.2582207272050766, -0.2297524599578219, -0.29561835103416967, 0.12048787956671528, -0.06279365699861471, -0.03832012404275233, 0.022910264999199934, 0.005803508497672737, -0.003858461926053348, 0.0039451232171312765, 0.003858476747495933, 0.0013034515558609956, -0.0], [0.009725756015628606, -0.0004001101998876524, 0.006490722835571152, 0.00800808023631959, 0.0065880711806331265, -0.0010264326176194034, -0.0018914305972878344, -0.008822522194658438, -0.016650520788128117, -0.03254382594389507, -0.014795713101569494, -0.05826499837818885, -0.05165369567511702, -0.13384277337594377, -0.22572641373340493, -0.21584739544668635, -0.2366836351939208, 0.14937824076489659, -0.08127414932170171, -0.06720440139736879, -0.0038552732903526744, 0.0107597891707803, -5.67453590118174e-05, 0.0020161340511396244, -0.000783322694907436, -0.0006397207517995289, -0.005291639205010064, -0.0], [0.008627543242777584, 0.007700097300051849, 0.0020430960246806138, 0.012949015733198586, 0.008428709579953574, 0.001358177022953576, 0.00421863939925833, 0.002657580000868709, -0.007339431957237175, 0.02008439775442315, -0.0033717631758033114, -0.05176633249899187, -0.013790328758662772, -0.39102366157050594, -0.167341447585844, -0.04813367828213947, 0.1367781582239039, -0.04672809260566293, -0.03237784669978756, 0.03218068777925178, 0.02415063765016493, -0.017849899351200002, -0.002975675228088795, -0.004819438014786686, 0.005106898651831245, 0.0024278620704227456, 6.784303333368138e-05, -0.0], [0.009644258527009343, -0.001331907219439711, -0.0014639718434477777, 0.008481926798958248, 
0.010278031715467508, 0.003625808326891529, -0.01121188617599796, -0.0010634587872994379, -0.0002603820881968461, -0.017985648016990465, -0.06446652745470374, 0.07726063173046191, -0.24739929795334742, -0.2701855018480216, -0.08888614776216278, 0.1373325760136816, -0.02316068912438066, -0.042164834956711514, 0.0009266091344106458, 0.03141872420427644, 0.011587728430225652, 0.0004755143243520787, 0.005860642609620605, 0.008979633931394438, 0.005061734169974005, 0.003932710387086098, 0.0015489986106803626, -0.0], [0.010998736164377534, 0.009378969800902604, 0.00030577045264713074, 0.0159329353530375, 0.014849508018911006, -0.0026513365659554225, 0.002923303082126996, 0.01917908707828847, -0.02338288107991566, -0.05706674679291175, 0.009526265752669624, -0.19945255386401284, -0.10725519695909647, -0.3222906835083537, -0.03857038318412844, -0.013279804965996065, -0.046626023244262085, -0.029299060237210447, -0.043269580558906555, -0.03768510002290657, -0.02255977771908117, -0.02632588166863199, -0.014417349488098566, -0.003077271951572957, -0.0004973277708010661, 0.0003475839139671271, -0.0014522783025903258, -0.0], [0.012215315671616316, -0.001693194176229889, 0.011365785434529038, 0.0036964574178487792, -0.010126738168635003, -0.025554378647710443, 0.006538003839811914, -0.03181759044467965, -0.016424751042854728, 0.06177539736110035, -0.43801735323216856, -0.29991040815937386, -0.2516019795363623, 0.037789523540809, -0.010948746374759491, -0.0633901687126727, -0.005976006160777705, 0.006035133605976937, -0.04961632526071937, -0.04142116972831476, -0.07558952727782252, -0.04165176179187153, -0.02021603856619006, -0.0027365663096057032, -0.011145473712733575, 0.0003566937349350848, -0.00546472985268321, -0.0], [0.008009386447317503, 0.006831207743885825, 0.0051306149795546365, 0.016239014770865052, 0.020925441734273218, 0.028344800173195076, -0.004805080609285047, -0.01880521614501033, -0.1272329010865855, -0.39835936819190537, -0.09113694760349819, 
-0.04061591094832608, -0.12677021961235907, 0.015567707226741051, -0.005615051546243333, -0.06454044862001587, 0.0195457674752272, -0.04219686517155871, -0.08060569979524296, 0.027234494361702787, -0.009152881336047056, -0.030865118003992217, -0.005770311060090559, 0.002905833371986098, 5.606663556872091e-05, 0.003209538083839772, -0.0018588810743365345, -0.0], [0.007587008852984699, -0.0021213639853557625, 0.0007709558092903736, 0.013883256128746423, 0.017328713012428214, 0.03645357525636198, -0.04043993335238427, 0.05730125171252314, -0.2563293727512057, -0.11438826083879326, 0.02662382809034687, 0.03525271352483709, 0.04745678120172762, 0.0336360484090392, -0.002916635707204059, -0.17950855098650784, -0.44161773297052964, -0.4512180227831197, -0.4940283106297913, -0.1970108671285798, 0.04344323143078066, -0.012005120444897523, 0.00987576109166055, -0.0018336757466252476, 0.0004913959502151706, -0.0005409724034216215, -0.005039223900868212, -0.0], [0.00637876531169957, 0.005189469227685454, 0.0007676355246000376, 0.018378100865097655, 0.015739815031394887, -0.035524983116512455, 0.03781006978038308, 0.28859052096740495, 0.0726464110153121, -0.026768468497420147, 0.06278766200288134, 0.17897045813699355, -0.13780371920803108, -0.14176458123649577, -0.1733103177731656, -0.3106508869296763, 0.04788355140275794, 0.04235327890285105, -0.031266625292514394, -0.016263819217960652, -0.031388328800811355, -0.01791363975905968, -0.012025067979443894, 0.008335083985905805, -0.0014386677797296231, 0.0055376544652972854, 0.002241522815466253, -0.0], [0.007455256326741617, -0.0009475207572210404, 0.0020288385162615286, 0.015399640135796092, 0.021133843188103074, -0.019846405097622234, -0.003162485751163173, -0.14199005055318842, -0.044200898667146035, -0.013395459413208084, 0.11019680479230103, -0.014057216041764874, -0.12553853334447865, -0.05992513534766256, 0.06467942189539834, 0.08866056095907732, -0.1451321508061849, -0.07382491447758655, -0.046961739981080476, 
0.0008943713493160624, 0.03231044103656507, 0.00036034241706501196, -0.011387669277619417, -0.00014602449257226195, -0.0021863729003374116, 0.0018817840156005856, 0.0037909804578166286, -0.0], [0.006511855618626698, 0.006236866054439829, -0.001440571166157676, 0.012795776609942026, 0.011530545030403624, 0.03495489377257363, 0.04792403136095304, 0.049378583599065225, 0.03296101702085617, -0.0005351385876652296, 0.017744115897640366, 0.0011656622496764954, 0.0232845869823761, -0.0561191397060232, -0.02854070511118366, -0.028614174047247348, -0.007763531086362863, 0.01823079560098924, 0.021961392405283622, -0.009666681805706179, 0.009547046884328725, -0.008729943263791338, 0.006408909680578429, 0.009794327096359952, -0.0025825219195515304, 0.007063559189211571, 0.007867244119267047, -0.0], [0.007936663546039311, -0.00010710180170593153, 0.002716512705673228, 0.0038633557307721487, -0.0014877316616940372, -0.0004788143065635909, 0.012508842248031202, 0.0045381104608414645, -0.010650910516128294, -0.013785341529644855, -0.034287643221318206, -0.022152707546335495, -0.047056481347685974, -0.032166744564720455, -0.021551611335278546, -0.002174962503376043, 0.024344287130424306, 0.015579272560525105, 0.010958169741952194, -0.010607232913436921, -0.005548369726118836, -0.0014630046444242706, 0.013144180105016433, 0.0031349366359021916, 0.0010984887428255974, 0.005426941473328394, 0.006566511860044785, -0.0], [0.0005529184874606495, 0.00026139355020588705, -0.002887623443531047, 0.0013988462990850632, 0.00203365139495493, -0.007276926701775218, -0.004010419939595932, 0.017521952161185662, 0.0006996977433557911, 0.02083134683611201, 0.013690533534289498, -0.005466724359976675, -0.008857712321334327, 0.017408578822635818, 0.0076439343049154425, 0.0017861314923539985, 0.007465865707523924, 0.008034420825988495, 0.003976298558337994, 0.00411970637898539, -0.004572592545819698, 0.0029563907011979935, -0.0006382227820088148, 0.0015153753877889707, -0.0052626601797995595, 
0.0025664706985019416, 0.005161751034260073, -0.0], [0.0009424280561998445, -0.0012942360298110595, 0.0011900868416523343, 0.000984424113178899, 0.0020988269382781564, -0.005870080062890889, -0.004950484744457169, 0.003117643454332697, -0.002509563565777083, 0.005831604884101081, 0.009531085216183116, 0.010030206821909806, 0.005858190171099734, 4.9344529936340524e-05, -0.004027895832421331, 0.0025436439920587606, 0.00531153867563076, 0.00495942692369508, 0.009215148318606382, 0.00010011928317543458, 0.0060051362999805355, -0.0008195376963202741, 0.0041728603512658224, -0.0017597169567888774, -0.0010577007775543158, 0.00046033327178068433, -0.0007674196306044449, -0.0], [-0.0, -0.0, 0.0013386963856532302, 0.00035183178922260837, 0.0030610334903526204, 8.951834979315781e-05, 0.0023676793550483524, -0.0002900551076915047, -0.00207019445286608, -7.61697478482574e-05, 0.0012150086715244216, 0.009831239281792168, 0.003479667642621962, 0.0070584324334114525, 0.004161851261339585, 0.0026146296354490665, -9.194746959222099e-05, 0.0013583866966571571, 0.0016821551239318913, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0]]]]}
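
The explanation above is a nested list of per-pixel attributions, which is hard to read directly. As a rough sketch, the following finds the pixel with the largest absolute attribution. It assumes the nesting seen in the output, [batch][channel][row][col], and uses a tiny synthetic array in place of the full 28x28 response; `strongest_attribution` is a hypothetical helper.

```python
def strongest_attribution(explanations):
    """Return (row, col, value) of the largest-magnitude attribution.

    Assumes the nesting seen above: [batch][channel][row][col], and
    inspects only the first image and first channel.
    """
    grid = explanations[0][0]
    best = (0, 0, grid[0][0])
    for r, row in enumerate(grid):
        for c, value in enumerate(row):
            if abs(value) > abs(best[2]):
                best = (r, c, value)
    return best


# Tiny synthetic 1x1x2x3 stand-in for the 1x1x28x28 response above.
sample = {"explanations": [[[[0.01, -0.59, 0.0], [0.12, -0.0, 0.29]]]]}
print(strongest_attribution(sample["explanations"]))
```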

Deploy a PyTorch Model with the V2 REST Protocol

Create the InferenceService

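When you specify the model format pytorch in the new model spec, KServe selects the TorchServe runtime by default and enables the KServe v1 inference protocol. To enable the v2 inference protocol, set the protocolVersion field to v2.
Old Schema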

apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "torchserve-mnist-v2"
spec:
  predictor:
    pytorch:
      protocolVersion: v2
      storageUri: gs://kfserving-examples/models/torchserve/image_classifier/v2

New Schema

apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "torchserve-mnist-v2"
spec:
  predictor:
    model:
      modelFormat:
        name: pytorch
      protocolVersion: v2  
      storageUri: gs://kfserving-examples/models/torchserve/image_classifier/v2

To deploy the model on CPU, apply mnist_v2.yaml to create the InferenceService.
kubectl

kubectl apply -f mnist_v2.yaml

Expected Output

$ inferenceservice.serving.kserve.io/torchserve-mnist-v2 created

Model Inference

The first step is to determine the ingress IP and ports and set INGRESS_HOST and INGRESS_PORT.

MODEL_NAME=mnist
SERVICE_HOSTNAME=$(kubectl get inferenceservice torchserve-mnist-v2 -o jsonpath='{.status.url}' | cut -d "/" -f 3)

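You can send both byte arrays and tensors with the v2 protocol. For a byte array, use the image converter to convert the image to byte array input. Here we use the mnist_v2_bytes.json file to run an example inference.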

curl -v -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v2/models/${MODEL_NAME}/infer -d @./mnist_v2_bytes.json

Expected Output

{"id": "d3b15cad-50a2-4eaf-80ce-8b0a428bd298", "model_name": "mnist", "model_version": "1.0", "outputs": [{"name": "predict", "shape": [], "datatype": "INT64", "data": [1]}]}
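
For reference, a v2 inference request carries a list of named inputs, each with a shape, a datatype, and data. The sketch below builds a byte-array request of that form. Several details are assumptions rather than facts from this page: that the image is sent base64-encoded, that the input name is arbitrary ("input-0" here), and that shape [-1] is accepted by the handler. Compare the result with mnist_v2_bytes.json before relying on it.

```python
import base64
import json
import uuid


def build_v2_bytes_request(image_bytes, input_name="input-0"):
    """Wrap raw image bytes in a v2 inference request with one BYTES input.

    Assumptions: the image is sent base64-encoded, the input name is
    arbitrary, and shape [-1] is accepted by the handler.
    """
    return {
        "id": str(uuid.uuid4()),  # optional request id, echoed in the response
        "inputs": [
            {
                "name": input_name,
                "shape": [-1],
                "datatype": "BYTES",
                "data": [base64.b64encode(image_bytes).decode("utf-8")],
            }
        ],
    }


# Stand-in bytes in place of a real image file.
payload = build_v2_bytes_request(b"\x89PNG\r\n\x1a\n")
print(json.dumps(payload)[:80])
```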

For tensor input, use the tensor image converter to convert the image to tensor input. Here we use the mnist_v2.json file to run an example inference.

curl -v -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v2/models/${MODEL_NAME}/infer -d @./mnist_v2.json

Expected Output

{"id": "2266ec1e-f600-40af-97b5-7429b8195a80", "model_name": "mnist", "model_version": "1.0", "outputs": [{"name": "predict", "shape": [], "datatype": "INT64", "data": [1]}]}
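
A v2 response returns its results as a list of named outputs, so client code usually looks an output up by name instead of by position. A minimal sketch follows, using the response body shown above; `output_data` is a hypothetical helper.

```python
import json

# Response body copied from the expected output above.
body = '{"id": "2266ec1e-f600-40af-97b5-7429b8195a80", "model_name": "mnist", "model_version": "1.0", "outputs": [{"name": "predict", "shape": [], "datatype": "INT64", "data": [1]}]}'


def output_data(response, name):
    """Return the data list of the named output tensor."""
    for output in response["outputs"]:
        if output["name"] == name:
            return output["data"]
    raise KeyError(f"no output named {name!r}")


response = json.loads(body)
print(response["model_name"], output_data(response, "predict"))
```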

Model Explanation

To get the model explanation with the v2 explain endpoint:

MODEL_NAME=mnist
curl -v -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v2/models/mnist/explain -d @./mnist_v2.json

Expected Output

{"id": "d3b15cad-50a2-4eaf-80ce-8b0a428bd298", "model_name": "mnist", "model_version": "1.0", "outputs": [{"name": "explain", "shape": [1, 28, 28], "datatype": "FP64", "data": [-0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, 0.0, -0.0, -0.0, 0.0, -0.0, 0.0, -0.0, -0.0, -0.0, -0.0, -0.0, 0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, 0.0, -0.0, 0.0, -0.0, -0.0, -0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, -0.0, -0.0, 0.0, 0.0, -0.0, 0.0, -0.0, -0.0, -0.0, -0.0, -0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, -0.0, -0.0, -0.0, 0.0, -0.0, 0.0, 0.0, 0.0, -0.0, -0.0, -0.0, 0.0, -0.0, 0.0, -0.0, -0.0, -0.0, -0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, -0.0, -0.0, 0.0, 0.0, -0.0, -0.0, -0.0, -0.0, -0.0, 0.0, 0.0, -0.0, -0.0, -0.0, 0.0, 0.0, 0.0, -0.0, -0.0, -0.0, -0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, -0.0, -0.0040547528781588035, -0.00022612877200043775, -0.0001273413606783097, 0.005648369508785856, 0.008904784451506994, 0.0026385365879584796, 0.0026802458602499875, -0.002657801604900743, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, 0.0, -0.0, -0.0, -0.0, -0.0, -0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.00024465772895309256, 0.0008218449738666515, 0.015285917610467934, 0.007512832227517626, 0.007094984753782517, 0.003405668751094489, -0.0020919252360163056, -0.00078002938659872, 0.02299587777864007, 0.01900432942654754, -0.001252955497754338, -0.0014666116894338772, -0.0, -0.0, -0.0, 0.0, 0.0, 0.0, 0.0, -0.0, -0.0, -0.0, -0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.005298396384926053, -0.0007901605067151054, 0.0039060659788228954, 0.02317408211645009, 0.017237917554858186, 0.010867034286601965, 0.003001563092717309, 0.00622421762838887, 0.006120712336480808, 0.016736329175541464, 0.005674718838256385, 0.004344134814439431, -0.001232842177319105, -0.0, -0.0, -0.0, 0.0, 0.0, -0.0, -0.0, -0.0, 0.0, 0.0, 0.0, 0.0, 0.0, -0.0, 0.0006867353660007012, 0.00977289933298656, -0.003875493166540815, 0.0017986937404117591, 0.0013075440157543057, 
-0.0024510980461748236, -0.0008806773426546923, -0.0, -0.0, -0.00014277890422995419, -0.009322313284511257, 0.020608317953885236, 0.004351394739722548, -0.0007875565409186222, -0.0009075897751127677, -0.0, -0.0, 0.0, 0.0, 0.0, -0.0, -0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.00022247237111456804, -0.0007829031603535926, 0.002666369539125161, 0.000973336852105775, 0.0, -0.0, 0.0, 0.0, 0.0, 0.0, -0.0, 0.000432321003928822, 0.023657172129172684, 0.010694844898905204, -0.002375952975746018, -0.0, -0.0, 0.0, 0.0, -0.0, -0.0, -0.0, 0.0, 0.0, 0.0, 0.0, 0.0, -0.0, -0.0020747972047037, -0.002320101258915877, -0.0012899205783904548, 0.0, 0.0, 0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, 0.007629679655402933, 0.01044862724376463, 0.00025032878924736025, -0.0, -0.0, 0.0, 0.0, 0.0, -0.0, -0.0, 0.0, 0.0, 0.0, 0.0, -0.0, -0.0, -0.00037708370104137974, -0.005156369275302328, 0.0012477582442296628, 0.0, 0.0, 0.0, 0.0, 0.0, -0.0, -0.0, 0.0, -0.0, -4.442516083381132e-05, 0.01024804634283815, 0.0009971135240970147, -0.0, -0.0, 0.0, 0.0, 0.0, -0.0, -0.0, 0.0, -0.0, 0.0, 0.0, -0.0, 0.0004501048968956462, -0.0019630535686311007, -0.0006664793297549408, 0.0020157403539278907, 0.0, 0.0, -0.0, -0.0, -0.0, -0.0, -0.0, 0.0, -0.0, -0.0022144569383238466, 0.008361583574785395, 0.00314019428604999, -0.0, -0.0, 0.0, 0.0, 0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0028943544591141838, -0.0031301383432286406, 0.002113252872926688, 0.0, 0.0, 0.0, 0.0, 0.0, -0.0, -0.0, 0.0, -0.0, -0.0, -0.0010321050605717045, 0.008905753926369048, 0.002846438277738756, -0.0, -0.0, 0.0, 0.0, 0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, 0.0, -0.005305288883499087, -0.00192711009725932, 0.0012090042768467344, 0.0, 0.0, 0.0, -0.0, -0.0, -0.0, 0.0, 0.0, 0.0, -0.0, -0.0011945156500241256, 0.005654442715832439, 0.0020132075345016807, -0.0, -0.0, 0.0, 0.0, 0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, 0.0, -0.0014689356969061985, 0.0010743412638183228, 0.0, 0.0, 0.0, 0.0, -0.0, -0.0, -0.0, -0.0, 0.0, -0.0, -0.0, -0.0017047980586912376, 
0.00290660517425009, -0.0007805869640505143, -0.0, -0.0, 0.0, 0.0, 0.0, -0.0, -0.0, -0.0, 0.0, -0.0, -0.0, 
5.541725422148614e-05, 0.0014516114512869852, 0.0002827701966546988, 0.0, 0.0, 0.0, -0.0, -0.0, -0.0, 0.0, 0.0, 0.0, 0.0, -0.0, -0.0014401407633627265, 0.0023812497776698745, 0.002146825301700187, -0.0, -0.0, 0.0, -0.0, 0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, 0.0011500529125940918, 0.0002865015572973405, 0.0029798151042282686, 0.0, 0.0, 0.0, -0.0, -0.0, -0.0, 0.0, 0.0, 0.0, -0.0, -0.0, -0.0017750295500283872, 0.0008339859126060243, -0.00377073933577687, -0.0, -0.0, 0.0, 0.0, 0.0, -0.0, -0.0, -0.0, -0.0, 0.0, 0.0, -0.0006093176894575109, -0.00046905787892409935, 0.0034053218511795034, 0.0, 0.0, 0.0, 0.0, -0.0, -0.0, -0.0, 0.0, -0.0, -0.0, -0.0007450011768391558, 0.001298767372877851, -0.008499247640112315, -6.145166131400234e-05, -0.0, -0.0, -0.0, 0.0, 0.0, -0.0, -0.0, -0.0, 0.0, -0.0, 0.0, 0.0011809726042792137, -0.001838476328106708, 0.00541110661116898, 0.0, 0.0, 0.0, 0.0, -0.0, -0.0, -0.0, -0.0, 0.0, -0.002139234224224006, 0.0003259163407641124, -0.005276118873855287, -0.001950984007438105, -9.545670742026532e-07, 0.0, -0.0, 0.0, 0.0, 0.0, -0.0, -0.0, -0.0, -0.0, 0.0, 0.0, 0.0007772404228681039, -0.0001517956264720738, 0.0064814848131711815, -0.0, 0.0, 0.0, -0.0, -0.0, -0.0, -0.0, -0.0, 8.098064985902114e-05, -0.00249042660692983, -0.0020718619200672302, -5.341117902942147e-05, -0.00045564724429915073, 0.0, -0.0, -0.0, 0.0, 0.0, 0.0, -0.0, -0.0, -0.0, -0.0, 0.0, 0.0, 0.0, 0.0022750983476959733, 0.0017164060958460778, 0.0003221344707738082, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0015560282678744543, 9.107238495871273e-05, 0.0008772841497928399, 0.0006502978626355868, -0.004128780767525651, 0.0006030386900152659, 0.0, -0.0, 0.0, -0.0, -0.0, 0.0, 0.0, -0.0, -0.0, 0.0, -0.0, -0.0, 0.0, 0.0, 0.001395995791096219, 0.0026791526689584344, 0.0023995008266391488, -0.0004496096312746451, 0.003101832450753724, 0.007494536066960778, 0.0028641187148287965, -0.0030525907182629075, 0.003420222396518567, 0.0014924018363498125, -0.0009357388301326025, 0.0007856228933169799, 
-0.0018433973914981437, 1.6031856831240914e-05, 0.0, 0.0, -0.0, -0.0, 0.0, 0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, 0.0, -0.0006999018502034005, 0.004382250870697946, -0.0035419313267119365, -0.0028896748092595375, -0.00048734542493666705, -0.0060873452419295, 0.000388224990424471, 0.002533641537585585, -0.004352836563597573, -0.0006079418766875505, -0.0038101334053377753, -0.000828441340357984, 0.0, -0.0, 0.0, 0.0, -0.0, 0.0, 0.0, 0.0, -0.0, -0.0, -0.0, 0.0, -0.0, -0.0, -0.0, -0.0, -0.0, 0.0010901530866342661, -0.013135008038845744, 0.0004734518707654666, 0.002050423283568135, -0.006609451922460863, 0.0023647861820124366, 0.0046789204256194, -0.0018122527412311837, 0.002137538353955849, 0.0, -0.0, -0.0, 0.0, 0.0, -0.0, -0.0, -0.0, 0.0, 0.0, 0.0, -0.0, -0.0, 0.0, -0.0, -0.0, -0.0, -0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, -0.0, -0.0, -0.0, 0.0, -0.0, -0.0, 0.0, 0.0, 0.0, 0.0, 0.0, -0.0, -0.0, -0.0, 0.0, 0.0, -0.0, -0.0, 0.0, -0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, -0.0, -0.0, 0.0, -0.0, -0.0, -0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, -0.0, -0.0, -0.0, 0.0, 0.0, -0.0, -0.0, -0.0, 0.0, 0.0, 0.0, 0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, -0.0, 0.0, 0.0, -0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, -0.0, -0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, -0.0, 0.0, -0.0, -0.0, -0.0, -0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]}]}

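Autoscaling

One of the main serverless inference features is automatically scaling the replicas of an InferenceService to match the incoming workload. KServe enables the Knative Pod Autoscaler by default; it watches traffic and scales up or down based on the configured metrics.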

Knative Autoscaler

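KServe supports both the Knative Pod Autoscaler (KPA) and the Kubernetes Horizontal Pod Autoscaler (HPA). The features and limitations of each autoscaler are listed below.

Note
If you want to use the Kubernetes Horizontal Pod Autoscaler (HPA), you must install the HPA extension.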

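Knative Pod Autoscaler (KPA)

  • Part of the Knative Serving core and enabled by default once Knative Serving is installed.
  • Supports scale-to-zero.
  • Does not support CPU-based autoscaling.

Horizontal Pod Autoscaler (HPA)

  • Not part of the Knative Serving core; it must be enabled after Knative Serving is installed.
  • Does not support scale-to-zero.
  • Supports CPU-based autoscaling.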

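Create InferenceService with a concurrency target

Hard/soft autoscaling limit

You can configure the InferenceService with the annotation autoscaling.knative.dev/target for a soft limit. A soft limit is a targeted limit rather than a strictly enforced bound; in particular, a sudden burst of requests can exceed it.

Old Schema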

apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "torchserve"
  annotations:
    autoscaling.knative.dev/target: "10"
spec:
  predictor:
    pytorch:
      storageUri: "gs://kfserving-examples/models/torchserve/image_classifier/v1"

New Schema

apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "torchserve"
  annotations:
    autoscaling.knative.dev/target: "10"
spec:
  predictor:
    model:
      modelFormat:
        name: pytorch
      storageUri: "gs://kfserving-examples/models/torchserve/image_classifier/v1"

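You can also configure the InferenceService with the field containerConcurrency for a hard limit. A hard limit is an enforced upper bound. If concurrency reaches the hard limit, surplus requests are buffered and must wait until enough capacity is free to execute them.

Old Schema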

apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "torchserve"
spec:
  predictor:
    containerConcurrency: 10
    pytorch:
      storageUri: "gs://kfserving-examples/models/torchserve/image_classifier/v1"

New Schema

apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "torchserve"
spec:
  predictor:
    containerConcurrency: 10
    model:
      modelFormat:
        name: pytorch
      storageUri: "gs://kfserving-examples/models/torchserve/image_classifier/v1"
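
To make the idea of a hard limit more concrete, here is a minimal Python sketch that caps in-flight work with a semaphore. It is only an analogy for containerConcurrency, not how Knative actually enforces the limit; HARD_LIMIT, TOTAL_REQUESTS, handle_request, and the sleep duration are all made up for the illustration.

```python
import threading
import time

HARD_LIMIT = 10       # mirrors containerConcurrency: 10 (illustrative)
TOTAL_REQUESTS = 25   # hypothetical burst size

slots = threading.BoundedSemaphore(HARD_LIMIT)
lock = threading.Lock()
in_flight = 0
max_in_flight = 0
completed = 0


def handle_request() -> None:
    """Wait for a free slot, then do some pretend work."""
    global in_flight, max_in_flight, completed
    with slots:  # surplus requests block here until a slot frees up
        with lock:
            in_flight += 1
            max_in_flight = max(max_in_flight, in_flight)
        time.sleep(0.01)  # stand-in for model inference
        with lock:
            in_flight -= 1
            completed += 1


threads = [threading.Thread(target=handle_request) for _ in range(TOTAL_REQUESTS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(f"completed={completed}, max in flight={max_in_flight}")
```

All 25 requests finish, but the number running at the same time never exceeds the limit; the rest simply wait their turn.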

After specifying the soft or hard limit of the scaling target, you can now deploy the InferenceService with autoscaling.yaml.
kubectl

kubectl apply -f autoscaling.yaml

Expected Output

$ inferenceservice.serving.kserve.io/torchserve created

Run inference with concurrent requests

The first step is to install the hey load generator, then send concurrent requests to the InferenceService.

go get -u github.com/rakyll/hey
MODEL_NAME=mnist
SERVICE_HOSTNAME=$(kubectl get inferenceservice torchserve -o jsonpath='{.status.url}' | cut -d "/" -f 3)
hey -m POST -z 30s -D ./mnist.json -host ${SERVICE_HOSTNAME} http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/${MODEL_NAME}:predict
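
The `cut -d "/" -f 3` step above simply keeps the host part of the status URL. If you script this in Python instead of the shell, a rough equivalent could look like the sketch below; service_hostname is a hypothetical helper name and the URL is an example value.

```python
from urllib.parse import urlparse


def service_hostname(status_url: str) -> str:
    """Return the host part of an InferenceService status URL."""
    return urlparse(status_url).netloc


print(service_hostname("http://torchserve.default.example.com"))
```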

Check Pod Autoscaling

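By default hey generates 50 requests concurrently, so you can see the InferenceService scale to 5 pods when the container concurrency target is set to 10.

Expected Output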

NAME                                                             READY   STATUS        RESTARTS   AGE
torchserve-predictor-default-cj2d8-deployment-69444c9c74-67qwb   2/2     Terminating   0          103s
torchserve-predictor-default-cj2d8-deployment-69444c9c74-nnxk8   2/2     Terminating   0          95s
torchserve-predictor-default-cj2d8-deployment-69444c9c74-rq8jq   2/2     Running       0          50m
torchserve-predictor-default-cj2d8-deployment-69444c9c74-tsrwr   2/2     Running       0          113s
torchserve-predictor-default-cj2d8-deployment-69444c9c74-vvpjl   2/2     Running       0          109s
torchserve-predictor-default-cj2d8-deployment-69444c9c74-xvn7t   2/2     Terminating   0          103s
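
The arithmetic behind the 5 pods can be sketched in a few lines of Python. This is a simplified steady-state estimate, assuming the autoscaler aims to keep each pod at the target concurrency; the real autoscaler also averages over time windows, so treat desired_pods as a hypothetical helper rather than its actual algorithm.

```python
import math


def desired_pods(concurrent_requests: int, target_per_pod: int) -> int:
    """Simplified estimate: pods needed to keep each pod at the target."""
    if target_per_pod <= 0:
        raise ValueError("target_per_pod must be positive")
    return math.ceil(concurrent_requests / target_per_pod)


print(desired_pods(50, 10))  # 5
```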

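Canary Rollout

A canary rollout is a deployment strategy in which you release a new version of the model to a small percentage of the production traffic.

Create InferenceService with canary model

After the experiment above, let's now see how to roll out a new model without moving the full traffic to the new model by default.

Old Schema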

apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "torchserve"
  annotations:
    serving.kserve.io/enable-tag-routing: "true"
spec:
  predictor:
    canaryTrafficPercent: 20
    pytorch:
      storageUri: "gs://kfserving-examples/models/torchserve/image_classifier/v2"

New Schema

apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "torchserve"
  annotations:
    serving.kserve.io/enable-tag-routing: "true"
spec:
  predictor:
    canaryTrafficPercent: 20
    model:
      modelFormat:
        name: pytorch
      storageUri: "gs://kfserving-examples/models/torchserve/image_classifier/v2"

In this example, we change the storageUri to the v2 version and set the canaryTrafficPercent field, then apply canary.yaml.
kubectl

kubectl apply -f canary.yaml

Expected Output

kubectl get revisions -l serving.kserve.io/inferenceservice=torchserve
NAME                                 CONFIG NAME                    K8S SERVICE NAME   GENERATION   READY   REASON   ACTUAL REPLICAS   DESIRED REPLICAS
torchserve-predictor-default-00001   torchserve-predictor-default                      1            True             1                 1
torchserve-predictor-default-00002   torchserve-predictor-default                      2            True             1                 1

kubectl get pods -l serving.kserve.io/inferenceservice=torchserve
NAME                                                             READY   STATUS    RESTARTS   AGE
torchserve-predictor-default-00001-deployment-7d99979c99-p49gk   2/2     Running   0          28m
torchserve-predictor-default-00002-deployment-c6fcc65dd-rjknq    2/2     Running   0          3m37s

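Check Traffic Status

After the canary model is rolled out, traffic should be split between the canary model revision and the "stable" revision, which was rolled out with 100% of the traffic. Now check the traffic split from the InferenceService traffic status: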

kubectl get isvc torchserve -ojsonpath='{.status.components}'

Expected Output

{
  "predictor": {
    "address": {
      "url": "http://torchserve-predictor-default.default.svc.cluster.local"
    },
    "latestCreatedRevision": "torchserve-predictor-default-00002",
    "latestReadyRevision": "torchserve-predictor-default-00002",
    "latestRolledoutRevision": "torchserve-predictor-default-00001",
    "traffic": [
      {
        "latestRevision": true,
        "percent": 20,
        "revisionName": "torchserve-predictor-default-00002",
        "tag": "latest",
        "url": "http://latest-torchserve-predictor-default.default.example.com"
      },
      {
        "latestRevision": false,
        "percent": 80,
        "revisionName": "torchserve-predictor-default-00001",
        "tag": "prev",
        "url": "http://prev-torchserve-predictor-default.default.example.com"
      }
    ],
    "url": "http://torchserve-predictor-default.default.example.com"
  }
}
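
If you want to check the split programmatically, the status JSON is easy to summarize. The sketch below works on a trimmed copy of the output above; traffic_split is a hypothetical helper, and in practice you would feed it the output of the kubectl command instead of the inline string.

```python
import json

# Trimmed copy of the status shown above.
status_json = """
{
  "predictor": {
    "latestRolledoutRevision": "torchserve-predictor-default-00001",
    "traffic": [
      {"percent": 20, "revisionName": "torchserve-predictor-default-00002", "tag": "latest"},
      {"percent": 80, "revisionName": "torchserve-predictor-default-00001", "tag": "prev"}
    ]
  }
}
"""


def traffic_split(components: dict) -> dict:
    """Map each revision name to its traffic percentage."""
    return {t["revisionName"]: t["percent"] for t in components["predictor"]["traffic"]}


split = traffic_split(json.loads(status_json))
for revision, percent in split.items():
    print(f"{revision}: {percent}%")
```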

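Traffic Rollout

Run the following curl requests against the InferenceService a few times; you can see that requests are sent to the two revisions with a 20/80 split.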

MODEL_NAME=mnist
SERVICE_HOSTNAME=$(kubectl get inferenceservice torchserve -o jsonpath='{.status.url}' | cut -d "/" -f 3)
for i in {1..10}; do curl -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/${MODEL_NAME}:predict -d @./mnist.json; done

Expected Output

{"predictions": [2]}Handling connection for 8080
{"predictions": [2]}Handling connection for 8080
{"predictions": [2]}Handling connection for 8080
<html><title>500: Internal Server Error</title><body>500: Internal Server Error</body></html>Handling connection for 8080
<html><title>500: Internal Server Error</title><body>500: Internal Server Error</body></html>Handling connection for 8080
{"predictions": [2]}Handling connection for 8080
{"predictions": [2]}Handling connection for 8080
{"predictions": [2]}Handling connection for 8080
{"predictions": [2]}Handling connection for 8080
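
Because the split is random per request, a handful of requests will rarely land exactly 20/80. The following Python sketch simulates weighted routing to show the effect; it is only a model of percentage-based routing, not the ingress implementation, and the revision labels and request count are illustrative.

```python
import random
from collections import Counter

rng = random.Random(0)  # fixed seed so the run is repeatable
revisions = ["canary (00002)", "stable (00001)"]
weights = [20, 80]      # canaryTrafficPercent: 20

# Route 1000 hypothetical requests by weight.
hits = Counter(rng.choices(revisions, weights=weights, k=1000))
for name in revisions:
    print(f"{name}: {hits[name]} requests")
```

Over many requests the share approaches the configured percentage, while any short run (such as the ten curl calls above) can deviate noticeably.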

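Notice that requests fail when they hit the canary revision. This is because the new revision requires the v2 inference input mnist_v2.json, which is a breaking change. In addition, traffic is split randomly between the two revisions according to the specified traffic percentage. In this case, you should roll out the canary model with canaryTrafficPercent set to 0 and use the latest-tagged URL to test the canary model before moving the full traffic to the new model.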
kubectl

kubectl patch isvc torchserve --type='json' -p '[{"op": "replace", "path": "/spec/predictor/canaryTrafficPercent", "value": 0}]'
curl -v -H "Host: latest-torchserve-predictor-default.default.example.com" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/${MODEL_NAME}:predict -d @./mnist.json

Expected Output

{"id": "d3b15cad-50a2-4eaf-80ce-8b0a428bd298", "model_name": "mnist", "model_version": "1.0", "outputs": [{"name": "predict", "shape": [1], "datatype": "INT64", "data": [1]}]}

After the new model is tested and verified, you can bump canaryTrafficPercent to 100 to fully roll out the traffic to the new revision. latestRolledoutRevision then becomes torchserve-predictor-default-00002 and previousRolledoutRevision becomes torchserve-predictor-default-00001.
kubectl

kubectl patch isvc torchserve --type='json' -p '[{"op": "replace", "path": "/spec/predictor/canaryTrafficPercent", "value": 100}]'
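
The patch body passed to kubectl is a JSON Patch document, so it can be generated rather than typed by hand. Below is a small Python sketch; canary_patch is a hypothetical helper and the 0, 20, 100 sequence is just an example rollout plan.

```python
import json


def canary_patch(percent: int) -> str:
    """Build the JSON Patch document that sets canaryTrafficPercent."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return json.dumps([
        {
            "op": "replace",
            "path": "/spec/predictor/canaryTrafficPercent",
            "value": percent,
        }
    ])


# Print the kubectl commands for a hypothetical 0 -> 20 -> 100 rollout.
for step in (0, 20, 100):
    print(f"kubectl patch isvc torchserve --type='json' -p '{canary_patch(step)}'")
```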

Check the traffic status:

kubectl get isvc torchserve -ojsonpath='{.status.components}'

Expected Output

{
  "predictor": {
    "address": {
      "url": "http://torchserve-predictor-default.default.svc.cluster.local"
    },
    "latestCreatedRevision": "torchserve-predictor-default-00002",
    "latestReadyRevision": "torchserve-predictor-default-00002",
    "latestRolledoutRevision": "torchserve-predictor-default-00002",
    "previousRolledoutRevision": "torchserve-predictor-default-00001",
    "traffic": [
      {
        "latestRevision": true,
        "percent": 100,
        "revisionName": "torchserve-predictor-default-00002",
        "tag": "latest",
        "url": "http://latest-torchserve-predictor-default.default.example.com"
      }
    ],
    "url": "http://torchserve-predictor-default.default.example.com"
  }
}

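Rollback the model

If the new model version does not work after traffic has moved to the new revision, you can still patch canaryTrafficPercent to 0 and move traffic back to the previously rolled-out model, torchserve-predictor-default-00001.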

kubectl

kubectl patch isvc torchserve --type='json' -p '[{"op": "replace", "path": "/spec/predictor/canaryTrafficPercent", "value": 0}]'

Check the traffic status:

kubectl get isvc torchserve -ojsonpath='{.status.components}'

Expected Output

{
  "predictor": {
    "address": {
      "url": "http://torchserve-predictor-default.default.svc.cluster.local"
    },
    "latestCreatedRevision": "torchserve-predictor-default-00002",
    "latestReadyRevision": "torchserve-predictor-default-00002",
    "latestRolledoutRevision": "torchserve-predictor-default-00001",
    "previousRolledoutRevision": "torchserve-predictor-default-00001",
    "traffic": [
      {
        "latestRevision": true,
        "percent": 0,
        "revisionName": "torchserve-predictor-default-00002",
        "tag": "latest",
        "url": "http://latest-torchserve-predictor-default.default.example.com"
      },
      {
        "latestRevision": false,
        "percent": 100,
        "revisionName": "torchserve-predictor-default-00001",
        "tag": "prev",
        "url": "http://prev-torchserve-predictor-default.default.example.com"
      }
    ],
    "url": "http://torchserve-predictor-default.default.example.com"
  }
}

Monitoring

Metrics Exposure and Grafana Dashboard Setup

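Deploy Scikit-learn models with InferenceService

This example walks you through deploying a scikit-learn model using the v1beta1 version of the InferenceService CRD. Note that, by default, the v1beta1 version exposes your model through an API compatible with the existing V1 Dataplane. This example, however, shows how to serve the model through an API compatible with the new V2 Dataplane.

Training

The first step is to train a sample scikit-learn model. Note that the model is then saved as model.joblib.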

from sklearn import svm
from sklearn import datasets
from joblib import dump

iris = datasets.load_iris()
X, y = iris.data, iris.target

clf = svm.SVC(gamma='scale')
clf.fit(X, y)

dump(clf, 'model.joblib')

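Testing locally

Once you have the model serialized as model.joblib, you can use MLServer to spin up a local server. For more details on MLServer, feel free to check the SKLearn example docs.

Note
This step is optional and only meant for testing; you can go straight to deploying with InferenceService.

Prerequisites

To use MLServer locally, you first need to install the mlserver package in your local environment, along with the SKLearn runtime.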

pip install mlserver mlserver-sklearn

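Model settings

The next step is to provide some model settings so that MLServer knows:

  • The inference runtime to serve your model (i.e. mlserver_sklearn.SKLearnModel)
  • The model's name and version

These can be specified through environment variables or by creating a local model-settings.json file: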

{
  "name": "sklearn-iris",
  "version": "v1.0.0",
  "implementation": "mlserver_sklearn.SKLearnModel"
}
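
If you manage several models, you may prefer to generate this file from a script. The sketch below writes the same three fields with the standard library; write_model_settings is a hypothetical helper, and a temporary directory is used only so the example has no side effects (point it at your model folder in practice).

```python
import json
import tempfile
from pathlib import Path


def write_model_settings(directory, name, version, implementation):
    """Write model-settings.json into `directory` and return its path."""
    settings = {"name": name, "version": version, "implementation": implementation}
    path = Path(directory) / "model-settings.json"
    path.write_text(json.dumps(settings, indent=2) + "\n", encoding="utf-8")
    return path


# A temporary directory keeps the sketch free of side effects.
with tempfile.TemporaryDirectory() as model_dir:
    settings_path = write_model_settings(
        model_dir, "sklearn-iris", "v1.0.0", "mlserver_sklearn.SKLearnModel"
    )
    loaded = json.loads(settings_path.read_text(encoding="utf-8"))

print(loaded)
```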

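Note that when you deploy your model, KServe already injects some sensible defaults so that it runs out of the box without any further configuration. However, you can still override these defaults by providing a model-settings.json file similar to your local one. You can even provide a set of model-settings.json files to load multiple models.

Serving the model locally

With the mlserver package installed locally and a local model-settings.json file, you should now be ready to start the server: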

mlserver start .

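Deploy with InferenceService

Lastly, you will use KServe to deploy the trained model. For this, you just need to use the v1beta1 version of the InferenceService CRD and set the protocolVersion field to v2.

Old Schema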

apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "sklearn-irisv2"
spec:
  predictor:
    sklearn:
      protocolVersion: "v2"
      storageUri: "gs://seldon-models/sklearn/mms/lr_model"

New Schema

apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "sklearn-irisv2"
spec:
  predictor:
    model:
      modelFormat:
        name: sklearn
      runtime: kserve-mlserver
      storageUri: "gs://seldon-models/sklearn/mms/lr_model"

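Note that this makes the following assumptions:

  • Your model weights (i.e. the model.joblib file) have already been uploaded to a "model repository" (GCS in this example) and can be accessed as gs://seldon-models/sklearn/iris.
  • There is a K8s cluster available, accessible through kubectl.
  • KServe has already been installed in your cluster.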

kubectl

kubectl apply -f ./sklearn.yaml

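Testing the deployed model

You can now test your deployed model by sending a sample request.
Note that this request needs to follow the V2 Dataplane protocol. You can see an example payload below: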

{
  "inputs": [
    {
      "name": "input-0",
      "shape": [2, 4],
      "datatype": "FP32",
      "data": [
        [6.8, 2.8, 4.8, 1.4],
        [6.0, 3.4, 4.5, 1.6]
      ]
    }
  ]
}
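
The payload above can also be built from plain Python lists, which helps keep the shape field consistent with the data. This is a minimal sketch; build_v2_request is a hypothetical helper, and the default tensor name and datatype simply mirror the example payload.

```python
import json


def build_v2_request(rows, name="input-0", datatype="FP32"):
    """Wrap a rectangular list of rows in a V2 inference request body."""
    if not rows or any(len(r) != len(rows[0]) for r in rows):
        raise ValueError("rows must be a non-empty rectangular list")
    return {
        "inputs": [
            {
                "name": name,
                "shape": [len(rows), len(rows[0])],  # [batch size, features]
                "datatype": datatype,
                "data": rows,
            }
        ]
    }


payload = build_v2_request([[6.8, 2.8, 4.8, 1.4], [6.0, 3.4, 4.5, 1.6]])
print(json.dumps(payload, indent=2))
```

Writing the printed JSON to a file gives you an input you can pass to curl with -d.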

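The commands below assume that your ingress is reachable at ${INGRESS_HOST}:${INGRESS_PORT}; if you have not set these yet, follow the instructions for finding your ingress IP and port.

You can then use curl to send the inference request: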

SERVICE_HOSTNAME=$(kubectl get inferenceservice sklearn-irisv2 -o jsonpath='{.status.url}' | cut -d "/" -f 3)

curl -v \
  -H "Host: ${SERVICE_HOSTNAME}" \
  -H "Content-Type: application/json" \
  -d @./iris-input.json \
  http://${INGRESS_HOST}:${INGRESS_PORT}/v2/models/sklearn-irisv2/infer

Expected Output

{
  "id": "823248cc-d770-4a51-9606-16803395569c",
  "model_name": "sklearn-irisv2",
  "model_version": "v1.0.0",
  "outputs": [
    {
      "data": [1, 1],
      "datatype": "INT64",
      "name": "predict",
      "parameters": null,
      "shape": [2]
    }
  ]
}

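Deploy XGBoost models with InferenceService

This example walks you through deploying an xgboost model using the v1beta1 version of the InferenceService CRD. Note that, by default, the v1beta1 version exposes your model through an API compatible with the existing V1 Dataplane. This example, however, shows how to serve the model through an API compatible with the new V2 Dataplane.

Training

The first step is to train a sample xgboost model. We will save this model as model.bst.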

import xgboost as xgb
from sklearn.datasets import load_iris
import os

model_dir = "."
BST_FILE = "model.bst"

iris = load_iris()
y = iris['target']
X = iris['data']
dtrain = xgb.DMatrix(X, label=y)
param = {'max_depth': 6,
            'eta': 0.1,
            'silent': 1,
            'nthread': 4,
            'num_class': 10,
            'objective': 'multi:softmax'
            }
xgb_model = xgb.train(params=param, dtrain=dtrain)
model_file = os.path.join((model_dir), BST_FILE)
xgb_model.save_model(model_file)

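Testing locally

Once we have our model.bst model serialized, we can use MLServer to spin up a local server. For more details on MLServer, feel free to check the XGBoost example in their docs.

Note that this step is optional and only meant for testing. You can go straight to deploying the trained model.

Prerequisites

To use MLServer locally, you first need to install the mlserver package in your local environment, along with the XGBoost runtime.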

pip install mlserver mlserver-xgboost

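Model settings

The next step is to provide some model settings so that MLServer knows:

  • The inference runtime we want our model to use (i.e. mlserver_xgboost.XGBoostModel)
  • Our model's name and version

These can be specified through environment variables or by creating a local model-settings.json file: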

{
  "name": "xgboost-iris",
  "version": "v1.0.0",
  "implementation": "mlserver_xgboost.XGBoostModel"
}

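Note that when we deploy our model, KServe already injects some sensible defaults so that it runs out of the box without any further configuration. However, you can still override these defaults by providing a model-settings.json file similar to your local one. You can even provide a set of model-settings.json files to load multiple models.

Serving our model locally

With the mlserver package installed locally and a local model-settings.json file, we should now be ready to start our server: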

mlserver start .

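Deploy with InferenceService

Lastly, we will use KServe to deploy our trained model. For this, we just need to use the v1beta1 version of the InferenceService CRD and set the protocolVersion field to v2.

Old Schema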

apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "xgboost-iris"
spec:
  predictor:
    xgboost:
      protocolVersion: "v2"
      storageUri: "gs://kfserving-examples/models/xgboost/iris"

New Schema

apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "xgboost-iris"
spec:
  predictor:
    model:
      modelFormat:
        name: xgboost
      runtime: kserve-mlserver
      storageUri: "gs://kfserving-examples/models/xgboost/iris"

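Note that this makes the following assumptions:

  • Your model weights (i.e. your model.bst file) have already been uploaded to a "model repository" (GCS in this example)
    and can be accessed as gs://kfserving-examples/models/xgboost/iris.
  • There is a K8s cluster available, accessible through kubectl.
  • KServe has already been installed in your cluster.

Assuming that KServe is installed and the cluster is accessible through kubectl, we can deploy our model as: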

kubectl apply -f xgboost.yaml

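Testing the deployed model

We can now test our deployed model by sending a sample request.
Note that this request needs to follow the V2 Dataplane protocol. You can see an example payload below: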

{
  "inputs": [
    {
      "name": "input-0",
      "shape": [2, 4],
      "datatype": "FP32",
      "data": [
        [6.8, 2.8, 4.8, 1.4],
        [6.0, 3.4, 4.5, 1.6]
      ]
    }
  ]
}

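You can follow these instructions to find your ingress IP and port.

Assuming that our ingress can be accessed at ${INGRESS_HOST}:${INGRESS_PORT}, we can use curl to send our inference request as: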

SERVICE_HOSTNAME=$(kubectl get inferenceservice xgboost-iris -o jsonpath='{.status.url}' | cut -d "/" -f 3)

curl -v \
  -H "Host: ${SERVICE_HOSTNAME}" \
  -H "Content-Type: application/json" \
  -d @./iris-input.json \
  http://${INGRESS_HOST}:${INGRESS_PORT}/v2/models/xgboost-iris/infer

The output will be similar to:
Expected Output

{
  "id": "4e546709-0887-490a-abd6-00cbc4c26cf4",
  "model_name": "xgboost-iris",
  "model_version": "v1.0.0",
  "outputs": [
    {
      "data": [1.0, 1.0],
      "datatype": "FP32",
      "name": "predict",
      "parameters": null,
      "shape": [2]
    }
  ]
}

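Deploy PMML model with InferenceService

PMML, or Predictive Model Markup Language, is an XML format for describing data mining and statistical models, including the inputs to the models, the transformations used to prepare data for data mining, and the parameters that define the models themselves. In this example we show how you can serve a PMML-format model on InferenceService.

Create the InferenceService

Old Schema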

apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "pmml-demo"
spec:
  predictor:
    pmml:
      storageUri: gs://kfserving-examples/models/pmml

New Schema

apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "pmml-demo"
spec:
  predictor:
    model:
      modelFormat:
        name: pmml
      storageUri: "gs://kfserving-examples/models/pmml"

Create the InferenceService with the above yaml

kubectl apply -f pmml.yaml

Expected Output

$ inferenceservice.serving.kserve.io/pmml-demo created

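Warning
The pmmlserver is based on Py4J and does not support multi-process mode, so we cannot set spec.predictor.containerConcurrency. If you want to scale the PMMLServer to improve prediction performance, set the InferenceService's resources.limits.cpu to 1 and scale out the replica count.

Run a prediction

The first step is to determine the ingress IP and port and set INGRESS_HOST and INGRESS_PORT.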

MODEL_NAME=pmml-demo
INPUT_PATH=@./pmml-input.json
SERVICE_HOSTNAME=$(kubectl get inferenceservice pmml-demo -o jsonpath='{.status.url}' | cut -d "/" -f 3)
curl -v -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:predict -d $INPUT_PATH

Expected Output

* TCP_NODELAY set
* Connected to localhost (::1) port 8081 (#0)
> POST /v1/models/pmml-demo:predict HTTP/1.1
> Host: pmml-demo.default.example.com
> User-Agent: curl/7.64.1
> Accept: */*
> Content-Length: 45
> Content-Type: application/x-www-form-urlencoded
>
* upload completely sent off: 45 out of 45 bytes
< HTTP/1.1 200 OK
< content-length: 39
< content-type: application/json; charset=UTF-8
< date: Sun, 18 Oct 2020 15:50:02 GMT
< server: istio-envoy
< x-envoy-upstream-service-time: 12
<
* Connection #0 to host localhost left intact
{"predictions": [{'Species': 'setosa', 'Probability_setosa': 1.0, 'Probability_versicolor': 0.0, 'Probability_virginica': 0.0, 'Node_Id': '2'}]}* Closing connection 0

Deploy Spark MLlib model with PMML InferenceService

Setup

1. Install pyspark 3.0.x and pyspark2pmml

pip install pyspark~=3.0.0
pip install pyspark2pmml

2. Get the JPMML-SparkML jar

Train a Spark MLlib model and export it to a PMML file

Launch pyspark with --jars to specify the location of the JPMML-SparkML uber-JAR

pyspark --jars ./jpmml-sparkml-executable-1.6.3.jar

Fit a Spark ML pipeline

from pyspark.ml import Pipeline
from pyspark.ml.classification import DecisionTreeClassifier
from pyspark.ml.feature import RFormula

df = spark.read.csv("Iris.csv", header = True, inferSchema = True)

formula = RFormula(formula = "Species ~ .")
classifier = DecisionTreeClassifier()
pipeline = Pipeline(stages = [formula, classifier])
pipelineModel = pipeline.fit(df)

from pyspark2pmml import PMMLBuilder

pmmlBuilder = PMMLBuilder(sc, df, pipelineModel)

pmmlBuilder.buildFile("DecisionTreeIris.pmml")

Upload DecisionTreeIris.pmml to a GCS bucket. Note that the PMMLServer expects the model file name to be model.pmml.

gsutil cp ./DecisionTreeIris.pmml gs://$BUCKET_NAME/sparkpmml/model.pmml

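Create the InferenceService with PMMLServer

Create the InferenceService with the pmml predictor and specify the storageUri with the bucket location you uploaded to.

Old Schema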

apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "spark-pmml"
spec:
  predictor:
    pmml:
      storageUri: gs://kfserving-examples/models/sparkpmml

New Schema

apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "spark-pmml"
spec:
  predictor:
    model:
      modelFormat:
        name: pmml
      storageUri: gs://kfserving-examples/models/sparkpmml

Apply the InferenceService custom resource

kubectl apply -f spark_pmml.yaml

Expected Output

$ inferenceservice.serving.kserve.io/spark-pmml created

Wait for the InferenceService to be ready

kubectl wait --for=condition=Ready inferenceservice spark-pmml
inferenceservice.serving.kserve.io/spark-pmml condition met

Run a prediction

The first step is to determine the ingress IP and port and set INGRESS_HOST and INGRESS_PORT.

MODEL_NAME=spark-pmml
INPUT_PATH=@./pmml-input.json
SERVICE_HOSTNAME=$(kubectl get inferenceservice spark-pmml -o jsonpath='{.status.url}' | cut -d "/" -f 3)
curl -v -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:predict -d $INPUT_PATH

Expected Output

* Connected to spark-pmml.default.35.237.217.209.xip.io (35.237.217.209) port 80 (#0)
> POST /v1/models/spark-pmml:predict HTTP/1.1
> Host: spark-pmml.default.35.237.217.209.xip.io
> User-Agent: curl/7.73.0
> Accept: */*
> Content-Length: 45
> Content-Type: application/x-www-form-urlencoded
>
* upload completely sent off: 45 out of 45 bytes
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< content-length: 39
< content-type: application/json; charset=UTF-8
< date: Sun, 07 Mar 2021 19:32:50 GMT
< server: istio-envoy
< x-envoy-upstream-service-time: 14
<
* Connection #0 to host spark-pmml.default.35.237.217.209.xip.io left intact
{"predictions": [[1.0, 0.0, 1.0, 0.0]]}

Deploy LightGBM model with InferenceService

Train a LightGBM model

To test the LightGBM server, you first need to train a simple LightGBM model with the following Python code.

import lightgbm as lgb
from sklearn.datasets import load_iris
import os

model_dir = "."
BST_FILE = "model.bst"

iris = load_iris()
y = iris['target']
X = iris['data']
dtrain = lgb.Dataset(X, label=y, feature_names=iris['feature_names'])

params = {
    'objective':'multiclass', 
    'metric':'softmax',
    'num_class': 3
}
lgb_model = lgb.train(params=params, train_set=dtrain)
model_file = os.path.join(model_dir, BST_FILE)
lgb_model.save_model(model_file)

Deploy LightGBM model with V1 protocol

Test the model locally

Install and run the LightGBM server locally with the trained model, and test the prediction.

python -m lgbserver --model_dir /path/to/model_dir --model_name lgb

After the LightGBM server is up locally, we can test the model by sending an inference request.

import requests

request = {'sepal_width_(cm)': {0: 3.5}, 'petal_length_(cm)': {0: 1.4}, 'petal_width_(cm)': {0: 0.2},'sepal_length_(cm)': {0: 5.1} }
formData = {
    'inputs': [request]
}
res = requests.post('http://localhost:8080/v1/models/lgb:predict', json=formData)
print(res)
print(res.text)
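
The request above uses a column-oriented layout: each feature name maps to a small dictionary keyed by row index. As a sketch, the same body can be derived from an ordinary row dictionary; to_column_format is a hypothetical helper, and only the single-row case shown in the example is covered here.

```python
import json


def to_column_format(row: dict) -> dict:
    """Turn {feature: value} into {feature: {0: value}} for a single row."""
    return {feature: {0: value} for feature, value in row.items()}


row = {
    "sepal_width_(cm)": 3.5,
    "petal_length_(cm)": 1.4,
    "petal_width_(cm)": 0.2,
    "sepal_length_(cm)": 5.1,
}
body = {"inputs": [to_column_format(row)]}

# json.dumps turns the integer row index 0 into the string key "0".
print(json.dumps(body))
```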

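Deploy with InferenceService

To deploy the model on Kubernetes, you can create the InferenceService by specifying the modelFormat as lightgbm together with the storageUri.

Old Schema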

apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "lightgbm-iris"
spec:
  predictor:
    lightgbm:
      storageUri: "gs://kfserving-examples/models/lightgbm/iris"

New Schema

apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "lightgbm-iris"
spec:
  predictor:
    model:
      modelFormat:
        name: lightgbm
      storageUri: "gs://kfserving-examples/models/lightgbm/iris"

Apply the above yaml to create the InferenceService

kubectl apply -f lightgbm.yaml

Expected Output

$ inferenceservice.serving.kserve.io/lightgbm-iris created

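Test the deployed model

To test the deployed model, the first step is to determine the ingress IP and port and set INGRESS_HOST and INGRESS_PORT; then run the following curl command to send the inference request to the InferenceService.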

MODEL_NAME=lightgbm-iris
INPUT_PATH=@./iris-input.json
SERVICE_HOSTNAME=$(kubectl get inferenceservice lightgbm-iris -o jsonpath='{.status.url}' | cut -d "/" -f 3)
curl -v -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:predict -d $INPUT_PATH

Expected Output

*   Trying 169.63.251.68...
* TCP_NODELAY set
* Connected to 169.63.251.68 (169.63.251.68) port 80 (#0)
> POST /models/lightgbm-iris:predict HTTP/1.1
> Host: lightgbm-iris.default.svc.cluster.local
> User-Agent: curl/7.60.0
> Accept: */*
> Content-Length: 76
> Content-Type: application/x-www-form-urlencoded
>
* upload completely sent off: 76 out of 76 bytes
< HTTP/1.1 200 OK
< content-length: 27
< content-type: application/json; charset=UTF-8
< date: Tue, 21 May 2019 22:40:09 GMT
< server: istio-envoy
< x-envoy-upstream-service-time: 13032
<
* Connection #0 to host 169.63.251.68 left intact
{"predictions": [[0.9, 0.05, 0.05]]}

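Deploy the model with V2 protocol

Test the model locally

Once you have the model serialized as model.bst, you can use MLServer, which implements the KServe V2 inference protocol, to spin up a local server. For more details on MLServer, check the LightGBM example docs.

To run MLServer locally, first install the mlserver package in your local environment, along with the LightGBM runtime.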

pip install mlserver mlserver-lightgbm

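The next step is to provide the model settings so that MLServer knows:

  • The inference runtime to serve your model (i.e. mlserver_lightgbm.LightGBMModel)
  • The model's name and version

These can be specified through environment variables or by creating a local model-settings.json file: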

{
  "name": "lightgbm-iris",
  "version": "v1.0.0",
  "implementation": "mlserver_lightgbm.LightGBMModel"
}

With the mlserver package installed locally and a local model-settings.json file, you should now be ready to start the server:

mlserver start .

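Deploy InferenceService with REST endpoint

When you deploy the model with InferenceService, KServe injects sensible defaults so that it runs out of the box without any further configuration. However, you can still override these defaults by providing a model-settings.json file similar to your local one. You can even provide a set of model-settings.json files to load multiple models.

To deploy the LightGBM model with the V2 inference protocol, you need to set the protocolVersion field to v2.

Old Schema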

apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "lightgbm-v2-iris"
spec:
  predictor:
    lightgbm:
      protocolVersion: v2
      storageUri: "gs://kfserving-examples/models/lightgbm/v2/iris"

New Schema

apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "lightgbm-v2-iris"
spec:
  predictor:
    model:
      modelFormat:
        name: lightgbm
      protocolVersion: v2
      storageUri: "gs://kfserving-examples/models/lightgbm/v2/iris"

Apply the InferenceService yaml to get the REST endpoint
kubectl

kubectl apply -f lightgbm-v2.yaml

Expected Output

$ inferenceservice.serving.kserve.io/lightgbm-v2-iris created

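Test the deployed model with curl

You can now test your deployed model by sending a sample request.

Note that this request needs to follow the V2 Dataplane protocol. You can see an example payload below: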

{
  "inputs": [
    {
      "name": "input-0",
      "shape": [2, 4],
      "datatype": "FP32",
      "data": [
        [6.8, 2.8, 4.8, 1.4],
        [6.0, 3.4, 4.5, 1.6]
      ]
    }
  ]
}

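The commands below assume that your ingress is reachable at ${INGRESS_HOST}:${INGRESS_PORT}; if you have not set these yet, follow the instructions for finding your ingress IP and port.

You can then use curl to send the inference request: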

SERVICE_HOSTNAME=$(kubectl get inferenceservice lightgbm-v2-iris -o jsonpath='{.status.url}' | cut -d "/" -f 3)

curl -v \
  -H "Host: ${SERVICE_HOSTNAME}" \
  -H "Content-Type: application/json" \
  -d @./iris-input-v2.json \
  http://${INGRESS_HOST}:${INGRESS_PORT}/v2/models/lightgbm-v2-iris/infer

Expected Output

{
  "model_name":"lightgbm-v2-iris",
  "model_version":null,
  "id":"96253e27-83cf-4262-b279-1bd4b18d7922",
  "parameters":null,
  "outputs":[
    {
      "name":"predict",
      "shape":[2,3],
      "datatype":"FP64",
      "parameters":null,
      "data":
        [8.796664107010673e-06,0.9992300031041593,0.0007612002317336916,4.974786820804187e-06,0.9999919650711493,3.0601420299625077e-06]
    }
  ]
}
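
The data field in a V2 response is flat, so it has to be read together with shape. The sketch below reshapes the output above into one row of class probabilities per input and picks the most likely class; rows_from_output is a hypothetical helper and assumes a two-dimensional, row-major tensor.

```python
# Trimmed copy of the response shown above.
response = {
    "model_name": "lightgbm-v2-iris",
    "outputs": [
        {
            "name": "predict",
            "shape": [2, 3],
            "datatype": "FP64",
            "data": [
                8.796664107010673e-06, 0.9992300031041593, 0.0007612002317336916,
                4.974786820804187e-06, 0.9999919650711493, 3.0601420299625077e-06,
            ],
        }
    ],
}


def rows_from_output(output):
    """Reshape a flat, row-major 2-D output tensor into a list of rows."""
    n_rows, n_cols = (int(d) for d in output["shape"])
    data = output["data"]
    if len(data) != n_rows * n_cols:
        raise ValueError("data length does not match shape")
    return [data[i * n_cols:(i + 1) * n_cols] for i in range(n_rows)]


rows = rows_from_output(response["outputs"][0])
predicted = [row.index(max(row)) for row in rows]
print(predicted)  # [1, 1]
```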

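Create the InferenceService with gRPC endpoint

Create the InferenceService yaml and expose the gRPC port. Currently only one port can be exposed, either HTTP or gRPC, and the HTTP port is exposed by default.

Old Schema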

apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "lightgbm-v2-iris"
spec:
  predictor:
    lightgbm:
      protocolVersion: v2
      storageUri: "gs://kfserving-examples/models/lightgbm/v2/iris"
      ports:
      - name: h2c
        protocol: TCP
        containerPort: 9000

New Schema

apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "lightgbm-v2-iris"
spec:
  predictor:
    model:
      modelFormat:
        name: lightgbm
      protocolVersion: v2
      storageUri: "gs://kfserving-examples/models/lightgbm/v2/iris"
      ports:
      - name: h2c
        protocol: TCP
        containerPort: 9000

Apply the InferenceService yaml to get the gRPC endpoint
kubectl

kubectl apply -f lightgbm-v2-grpc.yaml

Test the deployed model with grpcurl

After the gRPC InferenceService becomes ready, grpcurl can be used to send gRPC requests to the InferenceService.

# download the proto file
curl -O https://raw.githubusercontent.com/kserve/kserve/master/docs/predict-api/v2/grpc_predict_v2.proto

INPUT_PATH=iris-input-v2-grpc.json
PROTO_FILE=grpc_predict_v2.proto
SERVICE_HOSTNAME=$(kubectl get inferenceservice lightgbm-v2-iris -o jsonpath='{.status.url}' | cut -d "/" -f 3)

The gRPC APIs follow the KServe prediction V2 protocol.

For example, the ServerReady API can be used to check whether the server is ready:

grpcurl \
  -plaintext \
  -proto ${PROTO_FILE} \
  -authority ${SERVICE_HOSTNAME} \
  ${INGRESS_HOST}:${INGRESS_PORT} \
  inference.GRPCInferenceService.ServerReady

Expected Output

{
  "ready": true
}

The ModelInfer API takes input following the ModelInferRequest schema defined in the grpc_predict_v2.proto file. Note that the input file differs from the one used in the previous curl example.

grpcurl \
  -vv \
  -plaintext \
  -proto ${PROTO_FILE} \
  -authority ${SERVICE_HOSTNAME} \
  -d @ \
  ${INGRESS_HOST}:${INGRESS_PORT} \
  inference.GRPCInferenceService.ModelInfer \
  <<< $(cat "$INPUT_PATH")
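The contents of iris-input-v2-grpc.json are not shown here. As a rough sketch, a REST-style V2 payload can be converted into the JSON form of a ModelInferRequest as follows. The helper name to_grpc_request is hypothetical; the field names model_name, contents, and fp32_contents are taken from the ModelInferRequest and InferTensorContents messages in grpc_predict_v2.proto, and this sketch only handles FP32 tensors.

```python
import json


def _flatten(values):
    """Yield the scalar leaves of an arbitrarily nested list in row-major order."""
    for value in values:
        if isinstance(value, list):
            yield from _flatten(value)
        else:
            yield value


def to_grpc_request(model_name, rest_payload):
    """Convert a REST V2 payload into a ModelInferRequest-shaped dict."""
    inputs = []
    for tensor in rest_payload["inputs"]:
        if tensor["datatype"] != "FP32":
            raise ValueError("this sketch only handles FP32 tensors")
        inputs.append(
            {
                "name": tensor["name"],
                "shape": tensor["shape"],
                "datatype": tensor["datatype"],
                # gRPC carries tensor values as a flat, typed list.
                "contents": {"fp32_contents": list(_flatten(tensor["data"]))},
            }
        )
    return {"model_name": model_name, "inputs": inputs}


rest_payload = {
    "inputs": [
        {
            "name": "input-0",
            "shape": [2, 4],
            "datatype": "FP32",
            "data": [[6.8, 2.8, 4.8, 1.4], [6.0, 3.4, 4.5, 1.6]],
        }
    ]
}
grpc_request = to_grpc_request("lightgbm-v2-iris", rest_payload)
print(json.dumps(grpc_request, indent=2))
```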

Expected Output

Resolved method descriptor:
// The ModelInfer API performs inference using the specified model. Errors are
// indicated by the google.rpc.Status returned for the request. The OK code
// indicates success and other codes indicate failure.
rpc ModelInfer ( .inference.ModelInferRequest ) returns ( .inference.ModelInferResponse );

Request metadata to send:
(empty)

Response headers received:
accept-encoding: identity,gzip
content-type: application/grpc
date: Sun, 25 Sep 2022 10:25:05 GMT
grpc-accept-encoding: identity,deflate,gzip
server: istio-envoy
x-envoy-upstream-service-time: 99

Estimated response size: 91 bytes

Response contents:
{
  "modelName": "lightgbm-v2-iris",
  "outputs": [
    {
      "name": "predict",
      "datatype": "FP64",
      "shape": [
        "2",
        "3"
      ],
      "contents": {
        "fp64Contents": [
          8.796664107010673e-06,
          0.9992300031041593,
          0.0007612002317336916,
          4.974786820804187e-06,
          0.9999919650711493,
          3.0601420299625077e-06
        ]
      }
    }
  ]
}
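Note that in the gRPC response the shape entries are strings ("2", "3"), because the protobuf JSON mapping renders 64-bit integers as strings, and the values sit under contents.fp64Contents rather than data. A short Python sketch (hypothetical helper name) that accounts for both differences:

```python
def rows_from_grpc_output(output):
    """Regroup a grpcurl-style JSON output tensor into rows."""
    # int64 shape values arrive as strings in protobuf JSON, so convert them.
    n_rows, n_cols = (int(dim) for dim in output["shape"])
    values = output["contents"]["fp64Contents"]
    return [values[i * n_cols:(i + 1) * n_cols] for i in range(n_rows)]


# The "predict" output copied from the response contents above.
grpc_output = {
    "name": "predict",
    "datatype": "FP64",
    "shape": ["2", "3"],
    "contents": {
        "fp64Contents": [
            8.796664107010673e-06, 0.9992300031041593, 0.0007612002317336916,
            4.974786820804187e-06, 0.9999919650711493, 3.0601420299625077e-06,
        ]
    },
}
grpc_rows = rows_from_grpc_output(grpc_output)
print(grpc_rows)
```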

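Deploy a Paddle model with InferenceService

In this example, we use a trained Paddle ResNet50 model to classify images by running an InferenceService with the Paddle predictor.

Create the InferenceService

Old Schema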

apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "paddle-resnet50"
spec:
  predictor:
    paddle:
      storageUri: "https://zhouti-mcp-edge.cdn.bcebos.com/resnet50.tar.gz"

New Schema

apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "paddle-resnet50"
spec:
  predictor:
    model:
      modelFormat:
        name: paddle
      storageUri: "https://zhouti-mcp-edge.cdn.bcebos.com/resnet50.tar.gz"

Apply the above YAML to create the InferenceService

kubectl apply -f paddle.yaml

Expected Output

$ inferenceservice.serving.kserve.io/paddle-resnet50 created

Run a prediction

The first step is to determine the ingress IP and port and set INGRESS_HOST and INGRESS_PORT.

MODEL_NAME=paddle-resnet50
SERVICE_HOSTNAME=$(kubectl get inferenceservice ${MODEL_NAME} -o jsonpath='{.status.url}' | cut -d "/" -f 3)
curl -v -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/${MODEL_NAME}:predict -d @./jay.json

Expected Output

*   Trying 127.0.0.1:80...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 80 (#0)
> POST /v1/models/paddle-resnet50:predict HTTP/1.1
> Host: paddle-resnet50.default.example.com
> User-Agent: curl/7.68.0
> Accept: */*
> Content-Length: 3010209
> Content-Type: application/x-www-form-urlencoded
> Expect: 100-continue
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 100 Continue
* We are completely uploaded and fine
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< content-length: 23399
< content-type: application/json; charset=UTF-8
< date: Mon, 17 May 2021 03:34:58 GMT
< server: istio-envoy
< x-envoy-upstream-service-time: 511
<
{"predictions": [[6.736678770380422e-09, 1.1535990829258935e-08, 5.142250714129659e-08, 6.647170636142619e-08, 4.094492567219277e-08, 1.3402451770616608e-07, 9.355561303436843e-08, 2.8935891904779965e-08, 6.845367295227334e-08, 7.680615965455218e-08, 2.0334689452283783e-06, 1.1085678579547675e-06, 2.3477592492326949e-07, 6.582037030966603e-07, 0.00012373103527352214, 4.2878804151769145e-07, 6.419959845516132e-06, 0.9993496537208557, 7.372002437477931e-05, 3.101135735050775e-05, 5.6028093240456656e-06, 2.1862508674530545e-06, 1.9544044604913324e-08, 3.728893887000595e-07, 4.2903633357127546e-07, 1.8251179767503345e-07, 7.159925985433802e-08, 9.231618136595898e-09, 6.469241498052725e-07, 7.031690341108288e-09, 4.451231561120039e-08, 1.2455971898361895e-07, 9.44632745358831e-08, 4.347704418705689e-08, 4.658220120745682e-07, 6.797721141538204e-08, 2.1060276367279585e-07, 2.2605123106700376e-08, 1.4311490303953178e-07, 7.951298641728499e-08, 1.2341783417468832e-07, 1.0921713737843675e-06, 1.5243892448779661e-05, 3.1173343018053856e-07, 2.4152058131221565e-07, 6.863762536113427e-08, 8.467682022228473e-08, 9.4246772164297e-08, 1.0219210366813058e-08, 3.3770753304906975e-08, 3.6928835100979995e-08, 1.3694031508748594e-07, 1.0674284567357972e-07, 2.599483650556067e-07, 3.4866405940192635e-07, 3.132053549848024e-08, 3.574873232992104e-07, 6.64843895492595e-08, 3.1638955988455564e-07, 1.2095878219042788e-06, 8.66409024524728e-08, 4.0144172430700564e-08, 1.2544761318622477e-07, 3.3201178695208e-08, 1.9731444922399533e-07, 3.806405572959193e-07, 1.3827865075199952e-07, 2.300225965257141e-08, 7.14422512260171e-08, 2.851114544455413e-08, 2.982567437470607e-08, 8.936032713791064e-08, 6.22388370175031e-07, 6.478838798784636e-08, 1.3663023423760023e-07, 9.973181391842445e-08, 2.5761554667269593e-08, 4.130220077058766e-08, 3.9384463690339544e-08, 1.2158079698565416e-07, 4.302821707824478e-06, 1.8179063090428826e-06, 1.8520155435908237e-06, 1.6246107179540559e-06, 
1.6448313544970006e-05, 1.0544916221988387e-05, 3.993061909568496e-06, 2.646479799750523e-07, 1.9193475964129902e-05, 4.803242745765601e-07, 1.696285067964709e-07, 4.550505764200352e-06, 4.235929372953251e-05, 4.443338639248395e-06, 5.104009687784128e-06, 1.3506396498996764e-05, 4.1758724478313525e-07, 4.494491463447048e-07, 3.156698369366495e-07, 1.0557599807725637e-06, 1.336463917311903e-08, 1.3893659556174498e-08, 6.770379457066156e-08, 1.4129696523923485e-07, 7.170518756538513e-08, 7.934466594861078e-08, 2.639154317307657e-08, 2.6134321373660896e-08, 7.196725881897237e-09, 2.1752363466021052e-08, 6.684639686227456e-08, 3.417795824134373e-08, 1.6228275967478112e-07, 4.107114648377319e-07, 6.472135396506928e-07, 2.951379372007068e-07, 5.653474133282543e-09, 4.830144462175667e-08, 8.887481861563629e-09, 3.7306168820805397e-08, 1.7784264727538357e-08, 4.641905082536368e-09, 3.413118676576232e-08, 1.937393818707278e-07, 1.2980176506971475e-06, 3.5641004814124244e-08, 2.149332445355867e-08, 3.055293689158134e-07, 1.5532516783878236e-07, 1.4520978766086046e-06, 3.488464628276233e-08, 3.825438398052938e-05, 4.5088432898410247e-07, 4.1766969616219285e-07, 6.770622462681786e-07, 1.4142248971893423e-07, 1.4235997696232516e-05, 6.293820433711517e-07, 4.762866865348769e-06, 9.024900577969674e-07, 9.058987870957935e-07, 1.5713684433649178e-06, 1.5720647184025438e-07, 1.818536503606083e-07, 7.193188622522939e-08, 1.1952824934269302e-06, 8.874837362782273e-07, 2.0870831463071227e-07, 9.906239029078279e-08, 7.793621747964607e-09, 1.0058498389753368e-07, 4.2059440374941914e-07, 1.843624630737395e-07, 1.6437947181202617e-07, 7.025352743994517e-08, 2.570448600636155e-07, 7.586877615040066e-08, 7.841313731660193e-07, 2.495309274763713e-07, 5.157681925993529e-08, 4.0674127177453556e-08, 7.531796519799627e-09, 4.797485431140558e-08, 1.7419973019627832e-08, 1.7958679165985814e-07, 1.2566392371127222e-08, 8.975440124459055e-08, 3.26965476915575e-08, 1.1208359751435637e-07, 
3.906746215420753e-08, 4.6769045525252295e-08, 1.8523553535487736e-07, 1.4833052830454108e-07, 1.2279349448363064e-07, 1.0729105497375713e-06, 3.6538490011395197e-09, 1.6198403329781286e-07, 1.6190719875908144e-08, 1.2004933580556099e-07, 1.4800277448046018e-08, 4.02294837442696e-08, 2.15060893538066e-07, 1.1925696696835075e-07, 4.8982514044837444e-08, 7.608920071788816e-08, 2.3137479487900237e-08, 8.521050176568679e-08, 9.586213423062873e-08, 1.3351650807180704e-07, 3.021699157557123e-08, 4.423876376336011e-08, 2.610667060309879e-08, 2.3977091245797055e-07, 1.3192564551900432e-07, 1.6734931662654162e-08, 1.588336999702733e-07, 4.0643516285854275e-07, 8.753454494581092e-08, 8.366999395548191e-07, 3.437598650180007e-08, 7.847892646850596e-08, 8.526394701391382e-09, 9.601382799928615e-08, 5.258924034023948e-07, 1.3557448141909845e-07, 1.0307226716577134e-07, 1.0429813457335513e-08, 5.187714435805901e-08, 2.187001335585137e-08, 1.1791439824548888e-08, 2.98065643278278e-08, 4.338393466696289e-08, 2.9991046091026874e-08, 2.8507610494443725e-08, 3.058665143385042e-08, 6.441099031917474e-08, 1.5364101102477434e-08, 1.5973883549236234e-08, 2.5736850872704053e-08, 1.0903765712555469e-07, 3.2118737891551064e-08, 6.819742992547617e-09, 1.9251311300649832e-07, 5.8258109447706374e-08, 1.8765761922168167e-07, 4.0070790419122204e-07, 1.5791577823165426e-08, 1.950158434738114e-07, 1.0142063189277906e-08, 2.744815041921811e-08, 1.2843531571604672e-08, 3.7297493094001766e-08, 7.407496838141014e-08, 4.20607833007125e-08, 1.6924804668860816e-08, 1.459203531339881e-07, 4.344977000414474e-08, 1.7191403856031684e-07, 3.5817443233554513e-08, 8.440249388286247e-09, 4.194829728021432e-08, 2.514032360068086e-08, 2.8340199520471288e-08, 8.747196034164517e-08, 8.277125651545703e-09, 1.1676293709683705e-08, 1.4548514570833504e-07, 7.200282148289716e-09, 2.623600948936655e-06, 5.675736929333652e-07, 1.9483527466945816e-06, 6.752595282932816e-08, 8.168475318370838e-08, 1.0933046468153407e-07, 
1.670913718498923e-07, 3.1387276777650186e-08, 2.973524537708272e-08, 5.752163900751839e-08, 5.850877471402782e-08, 3.2544622285968217e-07, 3.330221431951941e-08, 4.186786668469722e-07, 1.5085906568401697e-07, 2.3346819943981245e-07, 2.86402780602657e-07, 2.2940319865938363e-07, 1.8537603807544656e-07, 3.151798182443599e-07, 1.1075967449869495e-06, 1.5369782602192572e-07, 1.9237509718550427e-07, 1.64044664074936e-07, 2.900835340824415e-07, 1.246654903752642e-07, 5.802622027317739e-08, 5.186220519703966e-08, 6.0094205167615655e-09, 1.2333241272699524e-07, 1.3798474185477971e-07, 1.7370231830682314e-07, 5.617761189569137e-07, 5.1604470030497396e-08, 4.813277598714194e-08, 8.032698417537176e-08, 2.0645263703045202e-06, 5.638597713186755e-07, 8.794199857220519e-07, 3.4785980460583232e-06, 2.972389268052211e-07, 3.3904532870110415e-07, 9.469074058188198e-08, 3.754845678827223e-08, 1.5679037801419327e-07, 8.203105039683578e-08, 6.847962641387539e-09, 1.8251624211984563e-08, 6.050240841659615e-08, 3.956342808919544e-08, 1.0699947949888156e-07, 3.2566634899922065e-07, 3.5369430406717584e-07, 7.326295303755614e-08, 4.85765610847011e-07, 7.717713401689252e-07, 3.4567779749750116e-08, 3.246204585138912e-07, 3.1608601602783892e-06, 5.33099466792919e-08, 3.645687343123427e-07, 5.48158936908294e-07, 4.62306957160763e-08, 1.3466177506415988e-07, 4.3529482240955986e-08, 1.6404105451783835e-07, 2.463695381038633e-08, 5.958712634424046e-08, 9.493651020875404e-08, 5.523462576206839e-08, 5.7412357534758485e-08, 1.1850350347231142e-05, 5.8263944993086625e-06, 7.4208674050169066e-06, 9.127966222877149e-07, 2.0019581370434025e-06, 1.033498961078294e-06, 3.5146850763112525e-08, 2.058995278275688e-06, 3.5655509122989315e-07, 6.873234070781109e-08, 2.1935298022413008e-09, 5.560363547374436e-08, 3.3266996979364194e-07, 1.307369217329324e-07, 2.718762992515167e-08, 1.0462929189714032e-08, 7.466680358447775e-07, 6.923166040451179e-08, 1.6145664361033596e-08, 8.568521003837759e-09, 
4.76221018175238e-09, 1.233977116044116e-07, 8.340628632197422e-09, 3.2649041248333788e-09, 5.0632489312363305e-09, 4.0704994930251814e-09, 1.2043538610839732e-08, 5.105608380517879e-09, 7.267142887457112e-09, 1.184516307262129e-07, 7.53557927168913e-08, 6.386964201965384e-08, 1.6212936770898523e-08, 2.610429419291904e-07, 6.979425393183192e-07, 6.647513117741255e-08, 7.717492849224072e-07, 6.651206945207377e-07, 3.324495310152997e-07, 3.707282019149716e-07, 3.99564243025452e-07, 6.411632114122767e-08, 7.107352217872176e-08, 1.6380016631956096e-07, 6.876800995314625e-08, 3.462474467141874e-07, 2.0256503319160402e-07, 6.19610148078209e-07, 2.6841073363925716e-08, 6.720335363752383e-07, 1.1348340649419697e-06, 1.8397931853542104e-06, 6.397251581802266e-07, 7.257533241045167e-08, 4.2213909523525217e-07, 3.9657925299252383e-07, 1.4037439655112394e-07, 3.249856774800719e-07, 1.5857655455420172e-07, 1.1122217102865761e-07, 7.391420808744442e-08, 3.42322238111592e-07, 5.39796154441774e-08, 8.517296379295658e-08, 4.061009803990601e-06, 1.4478755474556237e-05, 7.317032757470088e-09, 6.9484960008026064e-09, 4.468917325084476e-08, 9.23141172393116e-08, 5.411982328951126e-08, 2.2242811326123046e-07, 1.7609554703312824e-08, 2.0906279374344194e-08, 3.6797682678724186e-09, 6.177919686933819e-08, 1.7920288541972695e-07, 2.6279179721200308e-08, 2.6988200119149042e-08, 1.6432807115052128e-07, 1.2827612749788386e-07, 4.468908798571647e-08, 6.316552969565237e-08, 1.9461760203398626e-08, 2.087125849925542e-08, 2.2414580413965268e-08, 2.4765244077684656e-08, 6.785398465325443e-09, 2.4248794971981624e-08, 4.554979504689527e-09, 2.8977037658250993e-08, 2.0402325162649504e-08, 1.600950270130852e-07, 2.0199709638291097e-07, 1.611188515937556e-08, 5.964113825029926e-08, 4.098318573397819e-09, 3.9080127578472457e-08, 7.511338218080255e-09, 5.965624154669058e-07, 1.6478223585636442e-07, 1.4106989354445432e-08, 3.2855584919389e-08, 3.3387166364917675e-09, 1.220043444050134e-08, 
4.624639160510924e-08, 6.842309385746148e-09, 1.74262879681919e-08, 4.6611329906909305e-08, 9.331947836699328e-08, 1.2306078644996887e-07, 1.2359445022980253e-08, 1.1173199254699284e-08, 2.7724862405875683e-08, 2.419210147763806e-07, 3.451186785241589e-07, 2.593766978975509e-08, 9.964568192799561e-08, 9.797809674694236e-09, 1.9085564417764544e-07, 3.972706252852731e-08, 2.6639204619982593e-08, 6.874148805735558e-09, 3.146993776681484e-08, 2.4086594407890516e-07, 1.3126927456141857e-07, 2.1254339799270383e-07, 2.050203384840188e-08, 3.694976058454813e-08, 6.563175816154398e-07, 2.560050127442537e-08, 2.6882981174480847e-08, 6.880636078676616e-07, 2.0092733166166e-07, 2.788039665801989e-08, 2.628409134786125e-08, 5.1678345158734373e-08, 1.8935413947929192e-07, 4.61852835087484e-07, 1.1086777718105623e-08, 1.4542604276357451e-07, 2.8737009216683873e-08, 6.105167926762078e-07, 1.2016463379893594e-08, 1.3944705301582871e-07, 2.093712758721722e-08, 4.3801410498645055e-08, 1.966320795077081e-08, 6.654448991838535e-09, 1.1149590584125235e-08, 6.424939158478082e-08, 6.971554888934861e-09, 3.260019587614238e-09, 1.4260189473702667e-08, 2.7895078247297533e-08, 8.11578289017234e-08, 2.5995715802196173e-08, 2.2855578762914774e-08, 1.055962854934478e-07, 8.145542551574181e-08, 3.7793686402665116e-08, 4.881891513264236e-08, 2.342062366267328e-08, 1.059935517133681e-08, 3.604105103249822e-08, 5.062430830093945e-08, 3.6804440384230475e-08, 1.501580193519203e-09, 1.4475033367489232e-06, 1.076210423889279e-06, 1.304991315009829e-07, 3.073601462233455e-08, 1.7184021317007137e-08, 2.0421090596300928e-08, 7.904992216367646e-09, 1.6902052379919041e-07, 1.2416506933732308e-08, 5.4758292122869534e-08, 2.6250422280327257e-08, 1.3261367115546818e-08, 6.29807459517906e-08, 1.270998595259698e-08, 2.0171681569536304e-07, 4.386637186826192e-08, 6.962349630157405e-08, 2.9565120485131047e-07, 7.925131626507209e-07, 2.0868920103112032e-07, 1.7341794489311724e-07, 4.2942417621816276e-08, 
4.213406956665722e-09, 8.824785169281313e-08, 1.7341569957807224e-08, 7.321587247588468e-08, 1.7941774288487977e-08, 1.1245148101579616e-07, 4.242405395871174e-07, 8.259573469615589e-09, 1.1336403105133286e-07, 8.268798978861014e-08, 2.2186977588489754e-08, 1.9539720952366224e-08, 1.0675703876472653e-08, 3.288517547161973e-08, 2.4340963022950746e-08, 6.639137239972115e-08, 5.604687380866835e-09, 1.386604697728444e-08, 6.675873720496384e-08, 1.1355886009312144e-08, 3.132159633878473e-07, 3.12451788886392e-08, 1.502181845580708e-07, 1.3461754377885882e-08, 1.8882955998833495e-07, 4.645742279762999e-08, 4.6453880742092224e-08, 7.714453964524637e-09, 3.5857155467056145e-08, 7.60832108426257e-09, 4.221501370693659e-08, 4.3407251126836854e-09, 1.340157496088068e-08, 8.565600495558101e-08, 1.7045413969185574e-08, 5.4221903411644234e-08, 3.021912675649219e-08, 6.153376119755194e-08, 3.938857240370908e-09, 4.135628017820636e-08, 1.781920389021252e-08, 4.3105885083605244e-08, 3.903354972578654e-09, 7.663085455078544e-08, 1.1890405993142394e-08, 9.304217840622186e-09, 1.0968062014171664e-09, 1.0536767902635802e-08, 1.1516804221400889e-07, 8.134522886393825e-07, 5.952623993721318e-08, 2.806350174466843e-08, 1.2833099027886874e-08, 1.0605690192733164e-07, 7.872949936427176e-07, 2.7501393162765453e-08, 3.936289072470345e-09, 2.0519442145428002e-08, 7.394815870753746e-09, 3.598397313453461e-08, 2.5378517065632877e-08, 4.698972233541099e-08, 7.54952989012736e-09, 6.322805461422831e-07, 5.582006412652163e-09, 1.29640980617296e-07, 1.5874988434916304e-08, 3.3837810775594335e-08, 6.474512037613067e-09, 9.121148281110436e-08, 1.3918511676536127e-08, 8.230025549949005e-09, 2.7061290097663004e-08, 2.6095918315149902e-08, 5.722363471960534e-09, 6.963475698285038e-07, 4.685091781198025e-08, 9.590579885809802e-09, 2.099205858030473e-07, 3.082160660028421e-08, 3.563162565001221e-08, 7.326312925215461e-07, 2.1759731225756695e-06, 2.407518309155421e-07, 2.974515780351794e-07, 
2.529018416908002e-08, 7.667950718825978e-09, 2.663289251358947e-07, 3.4358880185436647e-08, 2.3130198201215535e-08, 3.1239693498719134e-08, 2.8691621878351725e-07, 3.895845068768722e-08, 2.4184130253956937e-08, 1.1582445225144511e-08, 5.1545349322168477e-08, 2.034345492063494e-08, 8.201963197507212e-08, 1.164153573540716e-08, 5.496356720868789e-07, 1.1682151246361627e-08, 4.7576914852243135e-08, 1.6349824605299546e-08, 4.090862759653646e-08, 2.1271189609706198e-07, 1.6697286753242224e-07, 3.989708119433999e-08, 2.852450279533514e-06, 1.2500372292834072e-07, 2.4846613655427063e-07, 1.245429093188477e-08, 2.9700272463628608e-08, 4.250991558762962e-09, 1.61443480806156e-07, 2.6386018703306036e-07, 7.638056409575711e-09, 3.4455793773702226e-09, 7.273289526210647e-08, 1.7631434090503717e-08, 7.58661311550668e-09, 2.1547013062672704e-08, 1.2675349125856883e-07, 2.5637149292379036e-08, 3.500976220038865e-08, 6.472243541111311e-08, 8.387915251262257e-09, 3.069512288789156e-08, 7.520387867998579e-08, 1.5724964441687916e-07, 1.9634005354873807e-07, 1.2290831818972947e-07, 1.112118730439704e-09, 1.546895944670723e-08, 9.91701032404535e-09, 6.882473257974198e-07, 8.267616635748709e-08, 4.469531234008173e-08, 2.075201344098332e-08, 8.649378457903367e-08, 5.202766573120243e-08, 4.5564942041664835e-08, 2.0319955496006514e-08, 8.705182352741758e-09, 6.452066969586667e-08, 2.1777438519166026e-08, 1.030954166481024e-08, 3.211904342492744e-08, 2.3336936294526822e-07, 8.054096056753224e-09, 1.9623354319264763e-07, 1.2888089884199871e-07, 1.5392496166555247e-08, 1.401903038100727e-09, 5.696818305978013e-08, 6.080025372057207e-09, 1.0782793324892737e-08, 2.4260730313585555e-08, 1.9388659566743627e-08, 2.2970310453729326e-07, 1.9971754028347277e-08, 2.8477993296860404e-08, 5.2273552597625894e-08, 2.7392806600801123e-07, 9.857291161097237e-08, 3.12910977129377e-08, 4.151442212219081e-08, 5.251196366629074e-09, 1.580681100676884e-06, 8.547603442821128e-07, 1.068913135782168e-08, 
1.0621830597301596e-06, 7.737313012512459e-08, 6.394216711669287e-08, 1.1698345758759388e-07, 1.0486609625104393e-07, 2.1161000063329993e-07, 1.53396815250062e-08, 5.094453570109181e-08, 1.4005379966874898e-08, 2.6282036102998063e-08, 8.778433624456738e-08, 7.772066545896905e-09, 4.228875383205377e-08, 3.3243779284930497e-07, 7.729244799747903e-08, 7.636901111496286e-10, 5.989500806435899e-08, 1.326090597331131e-07, 1.2853634245857393e-07, 8.844242671557367e-09, 1.0194374766570036e-07, 2.493779334145074e-07, 1.6547971881664125e-07, 1.1762754326127833e-08, 1.1496195639892903e-07, 2.9342709240154363e-07, 1.326124099421122e-08, 8.630262726683213e-08, 5.7394842656322e-08, 1.1094081031615133e-07, 2.2933713239581266e-07, 3.4706170026765903e-07, 1.4751107357824367e-07, 1.502495017291494e-08, 6.454319390059027e-08, 5.164533689594464e-08, 6.23741556182722e-08, 1.293601457064142e-07, 1.4052071506398534e-08, 5.386946000385251e-08, 2.0827554791935654e-08, 1.3040637902861363e-08, 1.0578981601838677e-07, 1.5079727688771527e-08, 8.92632726845477e-07, 4.6374381668101705e-08, 7.481006036869076e-07, 5.883147302654379e-09, 2.8707685117979054e-09, 8.381598490814213e-07, 7.341958596640552e-09, 1.4245998158912698e-08, 1.0926417104428765e-07, 1.1308178216040687e-07, 2.52339901862797e-07, 1.1782835684925885e-07, 4.6678056975224536e-08, 2.7959197179683315e-09, 3.4363861090014325e-08, 1.4674496640054713e-07, 3.5396915620822256e-08, 2.0581127557761647e-07, 7.18387909159901e-08, 2.7693943138729082e-08, 4.5493386835460115e-08, 1.9559182717898693e-08, 1.5359708172013598e-08, 1.2336623278486059e-08, 2.9570605519779747e-08, 2.877552560676122e-07, 9.051845495378075e-07, 2.3732602016934834e-07, 1.6521676471370483e-08, 1.5478875070584763e-08, 3.526786329643983e-08, 3.616410637619083e-08, 1.61590953950963e-08, 7.65007328595857e-08, 1.9661483108279754e-08, 4.917534823789538e-08, 1.1712612746350715e-07, 1.0889253054813253e-08, 1.494120169809321e-06, 1.018585660261806e-08, 3.7575969003000864e-08, 
2.097097784314883e-08, 3.368558054717141e-08, 4.845588819080149e-09, 6.039624622644624e-07, 1.037331109898787e-08, 2.841650257323636e-07, 4.4990630954089283e-07, 3.463186004637464e-08, 7.720684180867465e-08, 1.471122175189521e-07, 1.1601575522490748e-07, 4.007488030310924e-07, 3.025649775167949e-08, 6.706784461130155e-08, 2.0128741340386114e-08, 1.5987744461654074e-09, 4.1919822280078733e-08, 1.3167154477855547e-08, 3.231814815762846e-08, 9.247659704669786e-08, 1.3075300842047e-07, 1.0574301256838226e-07, 3.762165334819656e-08, 1.0942246575496029e-07, 7.001474955359299e-08, 2.742706151082075e-08, 2.0766625752344225e-08, 4.5403403703403455e-08, 3.39040298058535e-08, 1.0469661759771043e-07, 2.8271578855765256e-08, 3.406226767310727e-07, 5.146206945028098e-07, 6.740708613506285e-07, 6.382248063374618e-09, 3.63878704945364e-08, 3.626059807970705e-08, 1.6065602892467723e-07, 3.639055989879125e-07, 6.232691696084203e-09, 4.805490050330263e-08, 3.372633727849461e-08, 6.328880317596486e-07, 6.480631498106959e-08, 2.1165197949812864e-07, 8.38779143919055e-08, 1.7589144363228115e-08, 2.729027670511641e-09, 2.144795097080987e-08, 7.861271456022223e-08, 2.0118186228046397e-08, 2.8407685093156942e-08, 2.4922530883486615e-07, 2.0156670998972004e-08, 2.6551649767725394e-08, 2.7848242822869906e-08, 6.907123761834555e-09, 1.880543720744754e-08, 1.3006903998302732e-08, 3.685918272822164e-07, 3.967941211158177e-07, 2.7592133022835696e-08, 2.5228947819755376e-08, 1.547002881352455e-07, 3.689306637966183e-08, 1.440177199718562e-09, 2.1504929392790473e-08, 5.068111263994979e-08, 5.081711407228795e-08, 1.171875219085905e-08, 5.409278358570191e-08, 7.138276600926474e-07, 2.5237213208129106e-07, 7.072044638789521e-08, 7.199763984999663e-08, 1.2525473103153217e-08, 3.4803417747752974e-07, 1.9591827538079087e-07, 1.2404700555634918e-07, 1.234617457157583e-07, 1.9201337408958352e-08, 1.9895249181445251e-07, 3.7876677794201896e-08, 1.0629785052174157e-08, 1.2437127772102485e-08, 
2.1861892207653e-07, 2.6181456291851646e-07, 1.112900775979142e-07, 1.0776630432474121e-07, 6.380325157095967e-09, 3.895085143312826e-09, 1.5762756788717525e-07, 2.909027019271093e-09, 1.0381050685737137e-08, 2.8135211493918177e-08, 1.0778002490496874e-08, 1.3605974125141529e-08, 2.9236465692861202e-08, 1.9189795352758665e-07, 2.199506354827463e-07, 1.326399790002597e-08, 4.9004846403022384e-08, 2.980837132682268e-09, 8.926045680368588e-09, 1.0996975774446582e-08, 7.71560149104289e-09, 7.454491246505768e-09, 5.086162246925596e-08, 1.5129764108223753e-07, 1.1960075596562092e-08, 1.1323334270230134e-08, 9.391332156383214e-09, 9.585701832293125e-08, 1.905532798218701e-08, 1.8105303922766325e-08, 6.179227796110354e-08, 6.389401363549041e-08, 1.1853179771037503e-08, 9.37277544466042e-09, 1.2332148457971925e-07, 1.6522022860954166e-08, 1.246116454467483e-07, 4.196171854431441e-09, 3.996593278543514e-08, 1.2554556505506298e-08, 1.4302138140465104e-08, 6.631793780798034e-09, 5.964224669696705e-09, 5.556936244488497e-09, 1.4192455921602232e-07, 1.7613080771639034e-08, 3.380189639301534e-07, 7.85651934620546e-08, 2.966783085867064e-08, 2.8992105853831163e-06, 1.3787366697215475e-06, 5.313622430946907e-09, 2.512852859126724e-08, 8.406627216572815e-08, 4.492839167369311e-08, 5.408793057881667e-08, 2.4239175999696272e-08, 4.016805235096399e-07, 4.1083545454512205e-08, 5.4153481698904216e-08, 8.640767212853007e-09, 5.773256717134245e-08, 2.6443152023603034e-07, 8.953217047746875e-07, 2.7994001783326894e-08, 5.889480014786841e-09, 4.1788819515886644e-08, 2.8880645430717777e-08, 2.135752907861388e-08, 2.3024175277441827e-07, 8.786625471657317e-08, 2.0697297209437693e-09, 2.236410523437371e-08, 3.203276310870251e-09, 1.176874686592555e-08, 6.963571053120177e-08, 2.271932153519174e-08, 7.360382525689602e-09, 6.922528772435044e-09, 3.213871480056696e-08, 1.370577820125618e-07, 1.9815049157045905e-08, 1.0578956377571558e-08, 2.7049420481262132e-08, 2.9755937713815683e-09, 
2.1773699288019088e-08, 1.09755387001087e-08, 1.991872444762066e-08, 2.3882098076910552e-08, 2.1357365653784655e-08, 6.109098560358461e-09, 1.1890497475519624e-08, 1.1459891702259029e-08, 3.73173456580389e-08, 1.572620256240498e-08, 3.404023374287135e-08, 3.6921580459647885e-08, 9.281765045443535e-08, 1.2323201303843234e-07, 4.2347593876002065e-08, 1.7423728237986325e-08, 5.8113389656000436e-08, 3.931436154402945e-08, 2.3690461148362374e-08, 1.792850135018398e-08, 1.440664210150544e-08, 7.019830494670032e-09, 6.041522482291839e-08, 4.867479930226182e-08, 1.0685319296044327e-08, 1.0051243393149889e-08, 4.2426261614991745e-08, 2.607815297039906e-08, 5.136670200300841e-09, 
1.69729952315123e-09, 1.9131586981302462e-08, 2.111743526711507e-07, 1.337269672774255e-08, 2.0002481448955223e-08, 1.0454256482717028e-07, 2.8144228281234973e-08, 2.1344791889532644e-07, 2.1046110632028103e-08, 1.9114453664315079e-07, 3.957693550660224e-08, 2.931631826186276e-08, 1.105203111251285e-07, 4.84007678380749e-08, 5.583606110803885e-08, 1.2130111315400427e-07, 1.77621615193857e-08, 2.5610853882085394e-08, 1.203865309662433e-07, 4.674859610531712e-09, 1.5916098661250544e-08, 3.147594185293201e-08, 6.147686093527227e-08, 2.204641802450169e-08, 3.257763410147163e-07, 1.198914532096751e-07, 2.3818989802748547e-07, 1.4909986134625797e-08, 5.10168831624469e-08, 5.5142201915714395e-08, 2.288550327023131e-08, 5.714110073995471e-08, 5.185095801607531e-07, 4.977285783525076e-08, 1.1049896109227575e-08, 1.264099296349741e-07, 8.174881571676451e-08]]}
* Connection #0 to host localhost left intact
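The response holds one probability per class for each input image, so the raw output is hard to read by eye. A minimal Python sketch for pulling out the highest-scoring class indices follows. The helper name top_k is hypothetical, and the four-value response is a toy stand-in for the full output above; mapping indices to label names is left out because the label file is not part of this example.

```python
def top_k(probabilities, k=5):
    """Return the k (class_index, probability) pairs with the highest probability."""
    ranked = sorted(enumerate(probabilities), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Toy stand-in for the V1 response body; the real one has many more entries.
response = {"predictions": [[0.01, 0.02, 0.9, 0.07]]}
best = top_k(response["predictions"][0], k=2)
print(best)
```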

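Deploy an MLflow model with InferenceService

This example walks you through how to deploy an MLflow model using the KServe InferenceService CRD and how to send inference requests using the V2 Dataplane.

Training

The first step is to train a sample sklearn model and save it in the MLflow model format by calling the MLflow log_model API.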

# Original source code and more details can be found in:
# https://www.mlflow.org/docs/latest/tutorials-and-examples/tutorial.html

# The data set used in this example is from
# http://archive.ics.uci.edu/ml/datasets/Wine+Quality
# P. Cortez, A. Cerdeira, F. Almeida, T. Matos and J. Reis.
# Modeling wine preferences by data mining from physicochemical properties.
# In Decision Support Systems, Elsevier, 47(4):547-553, 2009.

import warnings
import sys

import pandas as pd
import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.linear_model import ElasticNet
from urllib.parse import urlparse
import mlflow
import mlflow.sklearn
from mlflow.models.signature import infer_signature

import logging

logging.basicConfig(level=logging.WARN)
logger = logging.getLogger(__name__)


def eval_metrics(actual, pred):
    rmse = np.sqrt(mean_squared_error(actual, pred))
    mae = mean_absolute_error(actual, pred)
    r2 = r2_score(actual, pred)
    return rmse, mae, r2


if __name__ == "__main__":
    warnings.filterwarnings("ignore")
    np.random.seed(40)

    # Read the wine-quality csv file from the URL
    csv_url = (
        "http://archive.ics.uci.edu/ml"
        "/machine-learning-databases/wine-quality/winequality-red.csv"
    )
    try:
        data = pd.read_csv(csv_url, sep=";")
    except Exception as e:
        logger.exception(
            "Unable to download training & test CSV, "
            "check your internet connection. Error: %s",
            e,
        )
        # Without the data the rest of the script cannot run, so re-raise.
        raise

    # Split the data into training and test sets. (0.75, 0.25) split.
    train, test = train_test_split(data)

    # The predicted column is "quality" which is a scalar from [3, 9]
    train_x = train.drop(["quality"], axis=1)
    test_x = test.drop(["quality"], axis=1)
    train_y = train[["quality"]]
    test_y = test[["quality"]]

    alpha = float(sys.argv[1]) if len(sys.argv) > 1 else 0.5
    l1_ratio = float(sys.argv[2]) if len(sys.argv) > 2 else 0.5

    with mlflow.start_run():
        lr = ElasticNet(alpha=alpha, l1_ratio=l1_ratio, random_state=42)
        lr.fit(train_x, train_y)

        predicted_qualities = lr.predict(test_x)

        (rmse, mae, r2) = eval_metrics(test_y, predicted_qualities)

        print("Elasticnet model (alpha=%f, l1_ratio=%f):" % (alpha, l1_ratio))
        print("  RMSE: %s" % rmse)
        print("  MAE: %s" % mae)
        print("  R2: %s" % r2)

        mlflow.log_param("alpha", alpha)
        mlflow.log_param("l1_ratio", l1_ratio)
        mlflow.log_metric("rmse", rmse)
        mlflow.log_metric("r2", r2)
        mlflow.log_metric("mae", mae)

        tracking_url_type_store = urlparse(mlflow.get_tracking_uri()).scheme
        model_signature = infer_signature(train_x, train_y)

        # Model registry does not work with file store
        if tracking_url_type_store != "file":

            # Register the model
            # There are other ways to use the Model Registry,
            # which depends on the use case,
            # please refer to the doc for more information:
            # https://mlflow.org/docs/latest/model-registry.html#api-workflow
            mlflow.sklearn.log_model(
                lr,
                "model",
                registered_model_name="ElasticnetWineModel",
                signature=model_signature,
            )
        else:
            mlflow.sklearn.log_model(lr, "model", signature=model_signature)

The training script also serializes the trained model using the MLflow model format.

model/
├── MLmodel
├── model.pkl
├── conda.yaml
└── requirements.txt
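Before serving or uploading the model, it can help to confirm that the directory matches the layout above. The following Python sketch is only a convenience check, not part of MLflow or KServe; the helper name missing_model_files is hypothetical, and the temporary directory stands in for a real model/ folder.

```python
import tempfile
from pathlib import Path

# File names taken from the directory listing above.
EXPECTED_FILES = ("MLmodel", "model.pkl", "conda.yaml", "requirements.txt")


def missing_model_files(model_dir):
    """Return the expected MLflow model files that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


# Demonstration on a throwaway directory that contains only two of the files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("MLmodel", "model.pkl"):
        (Path(tmp) / name).touch()
    missing = missing_model_files(tmp)

print(missing)
```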

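Testing locally

Once you've got your model serialized as model.pkl, we can use MLServer to spin up a local server. For more details on MLServer, feel free to check the MLflow example doc.

Note
This step is optional and only meant for testing; you can skip straight to deploying with InferenceService.

Prerequisites

To use MLServer locally, you first need to install the mlserver package, along with the MLflow runtime, in your local environment.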

pip install mlserver mlserver-mlflow

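Model settings

The next step is to provide some model settings so that MLServer knows:

  • The inference runtime to serve your model (i.e. mlserver_mlflow.MLflowRuntime)
  • The model's name and version

These can be specified through environment variables or by creating a local model-settings.json file: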

{
  "name": "mlflow-wine-classifier",
  "version": "v1.0.0",
  "implementation": "mlserver_mlflow.MLflowRuntime"
}
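If the settings file is generated as part of a build script, a small Python sketch such as the following could write it. The helper name write_model_settings is hypothetical, and the temporary directory stands in for the folder from which the server is started.

```python
import json
import tempfile
from pathlib import Path


def write_model_settings(directory, name, version, implementation):
    """Write a model-settings.json file with the three fields shown above."""
    settings = {"name": name, "version": version, "implementation": implementation}
    path = Path(directory) / "model-settings.json"
    path.write_text(json.dumps(settings, indent=2))
    return path


with tempfile.TemporaryDirectory() as tmp:
    settings_path = write_model_settings(
        tmp,
        name="mlflow-wine-classifier",
        version="v1.0.0",
        implementation="mlserver_mlflow.MLflowRuntime",
    )
    written = json.loads(settings_path.read_text())

print(written)
```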

Start the model server locally

With the mlserver package installed locally and a local model-settings.json file, you should now be ready to start the server:

mlserver start .

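Deploy with InferenceService

When you deploy the model with InferenceService, KServe injects sensible defaults so that it runs out of the box without any further configuration. However, you can still override these defaults by providing a model-settings.json file similar to your local one. You can even provide a set of model-settings.json files to load multiple models.

To use the v2 protocol for inference with the deployed model, set the protocolVersion field to v2. In this example, your model artifacts have already been uploaded to a GCS model repository and can be accessed at gs://kfserving-examples/models/mlflow/wine.
New Schema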

apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "mlflow-v2-wine-classifier"
spec:
  predictor:
    model:
      modelFormat:
        name: mlflow
      protocolVersion: v2
      storageUri: "gs://kfserving-examples/models/mlflow/wine"

kubectl

kubectl apply -f mlflow-new.yaml

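Test the deployed model

You can now test the deployed model by sending a sample request.

Note that this request must follow the V2 Dataplane protocol. An example payload is shown below: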

{
  "parameters": {
    "content_type": "pd"
  },
  "inputs": [
      {
        "name": "fixed acidity",
        "shape": [1],
        "datatype": "FP32",
        "data": [7.4]
      },
      {
        "name": "volatile acidity",
        "shape": [1],
        "datatype": "FP32",
        "data": [0.7000]
      },
      {
        "name": "citric acid",
        "shape": [1],
        "datatype": "FP32",
        "data": [0]
      },
      {
        "name": "residual sugar",
        "shape": [1],
        "datatype": "FP32",
        "data": [1.9]
      },
      {
        "name": "chlorides",
        "shape": [1],
        "datatype": "FP32",
        "data": [0.076]
      },
      {
        "name": "free sulfur dioxide",
        "shape": [1],
        "datatype": "FP32",
        "data": [11]
      },
      {
        "name": "total sulfur dioxide",
        "shape": [1],
        "datatype": "FP32",
        "data": [34]
      },
      {
        "name": "density",
        "shape": [1],
        "datatype": "FP32",
        "data": [0.9978]
      },
      {
        "name": "pH",
        "shape": [1],
        "datatype": "FP32",
        "data": [3.51]
      },
      {
        "name": "sulphates",
        "shape": [1],
        "datatype": "FP32",
        "data": [0.56]
      },
      {
        "name": "alcohol",
        "shape": [1],
        "datatype": "FP32",
        "data": [9.4]
      }
  ]
}
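
Writing eleven near-identical input entries by hand is error-prone, so the same payload can be generated from a mapping of feature names to values. The sketch below assumes, as in the payload above, one single-element FP32 input per feature and the "pd" content type; the helper name `build_v2_payload` is hypothetical.

```python
import json


def build_v2_payload(features, datatype="FP32"):
    """Build a V2 inference request with one single-element input per feature."""
    inputs = [
        {"name": name, "shape": [1], "datatype": datatype, "data": [value]}
        for name, value in features.items()
    ]
    # content_type "pd" is taken from the example payload above.
    return {"parameters": {"content_type": "pd"}, "inputs": inputs}


# Feature values copied from the example payload above.
wine_sample = {
    "fixed acidity": 7.4,
    "volatile acidity": 0.7,
    "citric acid": 0,
    "residual sugar": 1.9,
    "chlorides": 0.076,
    "free sulfur dioxide": 11,
    "total sulfur dioxide": 34,
    "density": 0.9978,
    "pH": 3.51,
    "sulphates": 0.56,
    "alcohol": 9.4,
}

payload = build_v2_payload(wine_sample)
print(json.dumps(payload, indent=2))
```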

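Now, assume that your ingress is reachable at ${INGRESS_HOST}:${INGRESS_PORT}; otherwise, follow the KServe instructions to find your ingress IP and port.

You can use curl to send the inference request, with the payload above saved as ./mlflow-input.json: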

SERVICE_HOSTNAME=$(kubectl get inferenceservice mlflow-v2-wine-classifier -o jsonpath='{.status.url}' | cut -d "/" -f 3)

curl -v \
  -H "Host: ${SERVICE_HOSTNAME}" \
  -H "Content-Type: application/json" \
  -d @./mlflow-input.json \
  http://${INGRESS_HOST}:${INGRESS_PORT}/v2/models/mlflow-v2-wine-classifier/infer
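
The same request can be issued from Python. The sketch below uses only the standard library and mirrors the curl command: it posts JSON to the /v2/models/<name>/infer path and sets the Host header to the service hostname. The function names `build_infer_request` and `send_infer_request` are hypothetical, and the host, port, and hostname arguments are values you would supply from your own cluster.

```python
import json
import urllib.request


def build_infer_request(ingress_host, ingress_port, service_hostname,
                        model_name, payload):
    """Return a urllib Request for POST /v2/models/<model_name>/infer."""
    url = f"http://{ingress_host}:{ingress_port}/v2/models/{model_name}/infer"
    body = json.dumps(payload).encode("utf-8")
    headers = {
        # The Host header routes the request to the InferenceService,
        # like -H "Host: ${SERVICE_HOSTNAME}" in the curl command.
        "Host": service_hostname,
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def send_infer_request(request, timeout=10.0):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Building the request separately from sending it makes it easy to inspect the URL and headers before contacting the cluster.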

Expected output

{
  "model_name":"mlflow-v2-wine-classifier",
  "model_version":null,
  "id":"699cf11c-e843-444e-9dc3-000d991052cc",
  "parameters":null,
  "outputs":[
    {
      "name":"predict",
      "shape":[1],
      "datatype":"FP64",
      "parameters":null,
      "data":[5.576883936610762]
    }
  ]
}
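
To use the prediction in a program, the response can be reduced to a mapping from output name to data. This is a minimal sketch that assumes the response has the shape shown above; the helper name `extract_outputs` is hypothetical.

```python
import json


def extract_outputs(response):
    """Map each output name to its list of values."""
    return {output["name"]: output["data"] for output in response["outputs"]}


# Response body copied from the expected output above.
sample_response = json.loads("""
{
  "model_name": "mlflow-v2-wine-classifier",
  "model_version": null,
  "id": "699cf11c-e843-444e-9dc3-000d991052cc",
  "parameters": null,
  "outputs": [
    {
      "name": "predict",
      "shape": [1],
      "datatype": "FP64",
      "parameters": null,
      "data": [5.576883936610762]
    }
  ]
}
""")

predictions = extract_outputs(sample_response)
print(predictions["predict"][0])
```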