MLOps in Fine Detail: 7. MLflow REST API Features and Usage

破浪会有时

Article link: https://blog.csdn.net/zyctimes/article/details/123418774

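This post walks through the concepts behind the MLflow REST API together with working code: creating an MLflow run, fetching information about a given run, creating an experiment, listing all experiments, looking up an experiment by ID or by name, setting tags, logging params and metrics, and logging a batch.

The code in this post is adapted from the REST API example in the official MLflow GitHub repository. The official description of the MLflow REST API can be found in the MLflow documentation.

Platform: Win10.
IDE: Visual Studio Code.
Prerequisite: Anaconda3.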

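1 Background

Usually we call MLflow's features directly through its Python library. Sometimes, though, we would rather not depend on the MLflow library, or we are not developing in Python at all. In those cases the MLflow REST API is a good alternative. It lets you create, list, and get experiments and runs, and log parameters, metrics, and artifacts. The API is hosted under the /api route of the MLflow tracking server.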
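As a rough illustration of how a request URL is put together, the sketch below joins a tracking-server address, the /api route, and an endpoint. The helper name `build_url`, the host, and the port are assumptions made for this example (5000 is what a local `mlflow server` listens on by default, and your setup may differ).

```python
def build_url(endpoint, hostname="127.0.0.1", port=5000):
    """Return the full REST URL for an endpoint such as '2.0/mlflow/runs/create'."""
    # The REST API is served under the /api route of the tracking server.
    return "http://{}:{}/api/{}".format(hostname, port, endpoint.lstrip("/"))


print(build_url("2.0/mlflow/runs/create"))
```

Any HTTP client in any language can then send requests to the resulting URL, which is what makes the REST API usable outside Python.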

2 What the official documentation says

The official MLflow REST API documentation lists every operation the API supports in detail. Here we look at one example.

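create run:

To create a new MLflow run, note the following points:

Endpoint: each operation has its own endpoint. For creating a run it is 2.0/mlflow/runs/create. The request only succeeds if this endpoint is appended to the base URL.
HTTP Method: every endpoint used in this post is either POST or GET.
Request Structure: the parameters to send with the request. To create a run we need the experiment ID (a run always belongs to an experiment), the start time (when the run was created, as a Unix timestamp in milliseconds), a user ID (optional: supply any ID you like, or leave it out), and tags (also optional).
Response Structure: a successful request returns a status code together with information about the run, namely the run info and the run data. The official documentation lists in detail which fields each of these two contains.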
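The request structure described above can be sketched as a small function that assembles the JSON body for 2.0/mlflow/runs/create. The function name `build_create_run_payload` and the sample values are made up for illustration. The sketch assumes that tags are sent as a list of key/value objects, which is how the official documentation describes them.

```python
import json
import time


def build_create_run_payload(experiment_id, user_id=None, tags=None):
    """Assemble the request body for the create-run endpoint."""
    payload = {
        "experiment_id": str(experiment_id),
        "start_time": int(time.time() * 1000),  # Unix time in milliseconds
    }
    # Optional fields are included only when the caller supplies them.
    if user_id is not None:
        payload["user_id"] = user_id
    if tags:
        payload["tags"] = [{"key": str(k), "value": str(v)} for k, v in tags.items()]
    return payload


print(json.dumps(build_create_run_payload("0", tags={"stage": "demo"}), indent=2))
```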

3 Code implementation

A written description alone is not enough, so we also wrote some code. We start with the functions that call the MLflow REST API directly:

create new mlflow run
create new mlflow experiment
list all experiments
get specific experiment info by experiment id
get specific experiment info by experiment name
log batch
Set Tag
log param
log metric

The complete code can be downloaded from my Gitee repository: git clone https://gitee.com/yichaoyyds/mlflow-ex-restapi-basic.git

3.1 Creating an MLflow run

def create_run(self):
    """Create a new run for tracking (needs `import time` and `import requests`)."""
    url = self.base_url + "/runs/create"
    # user_id is deprecated and will be removed from the API in a future release
    payload = {
        "experiment_id": self.experiment_id,
        "start_time": int(time.time() * 1000),  # Unix time in milliseconds
        "user_id": _get_user_id(),  # helper that returns the current OS user name
    }
    r = requests.post(url, json=payload)
    run_id = None
    if r.status_code == 200:
        # run_uuid is a deprecated alias; the same value is also returned as run_id
        run_id = r.json()["run"]["info"]["run_uuid"]
        print("Successfully created run with run id: {}".format(run_id))
    else:
        print("Creating run failed!")
    return run_id

As described above, we first append the endpoint to the base URL (url = self.base_url + "/runs/create"). We then pack the inputs the request needs (experiment_id, start_time, user_id) into a JSON body and send it (requests.post(url, json=payload)). The response carries a status code (r.status_code) and data about the run, such as the run ID (r.json()["run"]["info"]["run_uuid"]).

3.2 Getting information about a run

def get_run(self):
    """Get run info provided by run id."""
    url = self.base_url + "/runs/get"
    print("run_id: ", self.run_id)
    r = requests.get(url,params={"run_id":self.run_id})
    run = None
    if r.status_code == 200:
        run = r.json()["run"]
    return run

Endpoint: 2.0/mlflow/runs/get
HTTP Method: GET
Request Structure
- run_id: every MLflow run has a unique run ID, so to fetch information about a run we pass its run ID as input.
Response Structure: run, which contains the run metadata (name, start time, etc.) and data (metrics, params, and tags).
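To make the response structure more concrete, the sketch below flattens a run object into plain dictionaries. The helper name `summarize_run` is hypothetical, and `sample_run` is a hand-written stand-in shaped like the documented response, not output captured from a real server.

```python
def summarize_run(run):
    """Flatten the 'info' and 'data' parts of a run into simple dicts."""
    info = run.get("info", {})
    data = run.get("data", {})
    return {
        "run_id": info.get("run_id"),
        "status": info.get("status"),
        # metrics, params, and tags each arrive as a list of key/value objects
        "metrics": {m["key"]: m["value"] for m in data.get("metrics", [])},
        "params": {p["key"]: p["value"] for p in data.get("params", [])},
        "tags": {t["key"]: t["value"] for t in data.get("tags", [])},
    }


# Hand-written sample for illustration only.
sample_run = {
    "info": {"run_id": "2a14ed5c6a87499199e0106c3501eab8", "status": "RUNNING"},
    "data": {
        "metrics": [{"key": "mae", "value": 2.5, "timestamp": 1552550804, "step": 0}],
        "params": [{"key": "model_class", "value": "LogisticRegression"}],
    },
}
print(summarize_run(sample_run))
```

With the `get_run` method above, the same helper could be applied to its return value whenever that value is not None.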

3.3 Creating an experiment

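def create_experiment(self, experiment_name="0",
    artifact_location: Optional[str] = None,
    tags: Optional[Dict[str, Any]] = None):
    """Create a new experiment (needs `from typing import Any, Dict, Optional`)."""
    url = self.base_url + "/experiments/create"
    payload = {"name": str(experiment_name)}
    # Optional fields are sent only when they are provided.
    if artifact_location:
        payload["artifact_location"] = artifact_location
    if tags:
        # The API expects tags as a list of {"key": ..., "value": ...} objects.
        payload["tags"] = [{"key": str(k), "value": str(v)} for k, v in tags.items()]
    r = requests.post(url, json=payload)
    if r.status_code == 200:
        self.experiment_id = r.json()["experiment_id"]
    else:
        print("Creating experiment failed!")
    return r.status_code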

Endpoint: 2.0/mlflow/experiments/create
HTTP Method: POST
Request Structure
- name: the experiment name. Take care to distinguish the experiment name from the experiment ID; I often mix them up.
- artifact_location: the location where all artifacts for the experiment are stored, that is, where the logged files are saved. Optional.
- tags: tags associated with this experiment. Optional.
Response Structure: experiment_id.

3.4 Listing all experiments

def list_experiments(self):
    """Get all experiments."""
    # Note: MLflow 2.0 and later replaced this endpoint with /experiments/search.
    url = self.base_url + "/experiments/list"
    r = requests.get(url)
    experiments = None
    if r.status_code == 200:
        experiments = r.json()["experiments"]
    return experiments

3.5 Getting experiment information by experiment ID

def get_experiment_by_id(self,experiment_id):
    """Get one experiment info and runs inside by experiment id."""
    url = self.base_url + "/experiments/get"
    r = requests.get(url,params={"experiment_id":experiment_id})
    experiment = None
    if r.status_code == 200:
        experiment = r.json()["experiment"]
    return experiment

3.6 Getting experiment information by experiment name

def get_experiment_by_name(self,experiment_name):
    """Get one experiment info and runs inside by experiment name."""
    url = self.base_url + "/experiments/get-by-name"
    r = requests.get(url,params={"experiment_name":experiment_name})
    experiment = None
    if r.status_code == 200:
        experiment = r.json()["experiment"]
    return experiment
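The lookup functions above return None whenever the status code is not 200, which hides the reason for the failure. MLflow error responses normally carry a JSON body with `error_code` and `message` fields, so a small helper can surface them. The helper name `describe_failure` and the sample message text are made up for illustration.

```python
def describe_failure(status_code, body):
    """Turn a failed response (status code plus parsed JSON body) into a message."""
    if not isinstance(body, dict):
        return "HTTP {}: no JSON error body".format(status_code)
    code = body.get("error_code", "UNKNOWN")
    message = body.get("message", "")
    return "HTTP {}: {} - {}".format(status_code, code, message)


# Hand-written sample body; the message text is illustrative, not real server output.
sample_body = {"error_code": "RESOURCE_DOES_NOT_EXIST", "message": "experiment not found"}
print(describe_failure(404, sample_body))
```

Inside one of the methods this could be called as `describe_failure(r.status_code, r.json())` in the branch where the status code is not 200, assuming the body is valid JSON.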

3.7 Setting a tag

From here on we cover the functions related to logging.

def set_tag(self, tag):
    """
    Set a tag on the given run. Tag is a dict in either form:
    - tag={"key": "precision", "value": 0.769}
    - tag={"precision": 0.769}
    """
    url = self.base_url + "/runs/set-tag"
    if tag:
        # dict.keys() never equals a list in Python 3, so compare as sets
        if set(tag.keys()) == {"key", "value"}:
            payload = {"run_id": self.run_id, "key": tag["key"], "value": str(tag["value"])}
        else:
            payload = {"run_id": self.run_id, "key": list(tag.keys())[0], "value": str(list(tag.values())[0])}
        r = requests.post(url, json=payload)
        return r.status_code
    else:   # dictionary is empty
        return 0

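The tag argument can be written in two ways (the function accepts both, because both styles appear frequently in the official MLflow examples):

tag={"key": "precision", "value": 0.769}
tag={"precision": 0.769}

Note that the body finally passed to requests.post follows the first form, and every value must be a string.

3.8 Logging a param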
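The two accepted input forms can be reduced to the single form the API expects with a small normalization step. The sketch below uses a hypothetical helper name, `normalize_key_value`. It compares key sets rather than comparing `dict.keys()` with a list, since that comparison is always False in Python 3.

```python
def normalize_key_value(item):
    """Convert either accepted form into {"key": ..., "value": ...} with string values."""
    if not item:
        raise ValueError("expected a non-empty dict")
    if set(item.keys()) == {"key", "value"}:
        # Already in the explicit form: {"key": "precision", "value": 0.769}
        key, value = item["key"], item["value"]
    else:
        # Shorthand form: {"precision": 0.769}; only the first entry is used
        key, value = next(iter(item.items()))
    return {"key": str(key), "value": str(value)}


print(normalize_key_value({"key": "precision", "value": 0.769}))
print(normalize_key_value({"precision": 0.769}))
```

Both calls produce the same dictionary, which can then be merged with the run ID to form the request body.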

def log_param(self, param):
    """
    Log a parameter for the given run. Param is a dict in either form:
    - param={"key": "precision", "value": 0.769}
    - param={"precision": 0.769}
    """
    url = self.base_url + "/runs/log-parameter"
    if param:
        # dict.keys() never equals a list in Python 3, so compare as sets
        if set(param.keys()) == {"key", "value"}:
            payload = {"run_id": self.run_id, "key": param["key"], "value": str(param["value"])}
        else:
            payload = {"run_id": self.run_id, "key": list(param.keys())[0], "value": str(list(param.values())[0])}
        r = requests.post(url, json=payload)
        return r.status_code
    else:   # dictionary is empty
        return 0

3.9 Logging a metric

def log_metric(self, metric=None, step: Optional[int] = None):
    """
    Log a metric for the given run. Metric is a dict in either form:
    - metric={"key": "precision", "value": 0.769}
    - metric={"precision": 0.769}
    """
    url = self.base_url + "/runs/log-metric"
    if metric:
        # dict.keys() never equals a list in Python 3, so compare as sets
        if set(metric.keys()) == {"key", "value"}:
            key, value = metric["key"], metric["value"]
        else:
            key, value = list(metric.items())[0]
        payload = {"run_id": self.run_id,
            "key": key,
            "value": float(value),  # the API defines metric values as doubles
            "timestamp": int(time.time() * 1000),  # Unix time in milliseconds
            "step": step or 0}
        r = requests.post(url, json=payload)
        return r.status_code
    else:   # None or empty dictionary
        return 0

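3.10 Logging a batch

After the logging functions above, you may wonder whether logging one metric or one param per call is inefficient. The official documentation therefore also provides a log-batch endpoint, which sends a set of metrics, params, and tags in a single request.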

def log_batch(self, metrics=None, params=None, tags=None):
    """
    Log a batch of metrics, params, and tags for a run.
    Each argument may be None/empty, a list of key/value dicts, or a plain dict.
    For example:
    (1) metrics=[];
    (2) metrics=[{"key": "mse", "value": "0.769"}, {"key": "callback", "value": "0.512"}];
    (3) metrics={"mse": 2500.00, "rmse": 50.00};
    """
    url = self.base_url + "/runs/log-batch"

    def to_pairs(items):
        # Normalize None, a list of key/value dicts, or a plain dict to (key, value) pairs.
        if not items:
            return []
        if isinstance(items, dict):
            return list(items.items())
        return [(item["key"], item["value"]) for item in items]

    now = int(time.time() * 1000)  # metric timestamps are in milliseconds
    metrics_list = [{"key": str(k), "value": float(v), "timestamp": now, "step": 0}
                    for k, v in to_pairs(metrics)]
    # Param and tag values must be strings.
    params_list = [{"key": str(k), "value": str(v)} for k, v in to_pairs(params)]
    tags_list = [{"key": str(k), "value": str(v)} for k, v in to_pairs(tags)]

    payload = {"run_id": self.run_id, "metrics": metrics_list, "params": params_list, "tags": tags_list}
    r = requests.post(url, json=payload)
    return r.status_code

The JSON body that gets sent looks like this:

{
   "run_id": "2a14ed5c6a87499199e0106c3501eab8",
   "metrics": [
     {"key": "mae", "value": 2.5, "timestamp": 1552550804},
     {"key": "rmse", "value": 2.7, "timestamp": 1552550804}
   ],
   "params": [
     {"key": "model_class", "value": "LogisticRegression"}
   ]
}
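The official documentation also lists per-request limits for log-batch (to my knowledge, up to 1000 metrics in one request), so a long metric history may need to be split across several requests. The sketch below shows one way to do that. The helper name `chunk_metrics` and the default limit are assumptions to check against the documentation for your MLflow version.

```python
def chunk_metrics(run_id, metrics, max_per_request=1000):
    """Yield log-batch payloads that each carry at most max_per_request metrics."""
    for start in range(0, len(metrics), max_per_request):
        yield {"run_id": run_id, "metrics": metrics[start:start + max_per_request]}


# 2500 synthetic metric points, one per step, purely for demonstration.
metrics = [
    {"key": "loss", "value": 1.0 / (step + 1), "timestamp": 1552550804, "step": step}
    for step in range(2500)
]
payloads = list(chunk_metrics("2a14ed5c6a87499199e0106c3501eab8", metrics))
print([len(p["metrics"]) for p in payloads])  # → [1000, 1000, 500]
```

Each yielded payload could then be posted to the /runs/log-batch endpoint in turn.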