MLOps in Extreme Detail: 7. MLflow REST API Features and Usage
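In this post we walk through the concepts behind the MLflow REST API together with code: creating an MLflow run, fetching information about a given run, creating an experiment, listing all experiments, looking up an experiment by id or by name, setting tags, params, and metrics, and logging a batch.
The code in this post is adapted from the example in the official MLflow GitHub repository. For the official description of the MLflow REST API, see the REST API page of the MLflow documentation.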
- Platform: Win10.
- IDE: Visual Studio Code.
- Prerequisite: Anaconda3.
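1 Background
In most cases we can call MLflow's functionality directly through its library. Sometimes, however, we would rather not depend on the MLflow library, or we are not developing in Python at all; in those cases the MLflow REST API is a good alternative. The MLflow REST API lets you create, list, and get experiments and runs, and log parameters, metrics, and artifacts. The API is hosted under the /api route on the MLflow tracking server.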
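To make the routing concrete, here is a minimal sketch of how a full request URL can be assembled. The helper name `build_url` and the default address `localhost:5000` (the usual default for a locally started tracking server) are assumptions for illustration; adjust them to your own server.

```python
def build_url(endpoint, host="localhost", port=5000):
    """Join the tracking server address, the /api route, and an endpoint."""
    # Assumption: the server is reachable over plain HTTP at host:port.
    return "http://{}:{}/api/{}".format(host, port, endpoint.lstrip("/"))


# Endpoint for creating a run, as listed in the REST API documentation.
print(build_url("2.0/mlflow/runs/create"))
```

Keeping the address and the endpoint separate means the same helper works for every operation described below; only the endpoint string changes.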
2 The Official Documentation
The official MLflow REST API documentation lists every supported operation in detail. Here we briefly go through one example.
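create run:
When creating a new MLflow run, there are a few things to note:
- Endpoint: every operation has its own endpoint. For creating a run it is 2.0/mlflow/runs/create. The request only succeeds if this endpoint is appended to the base URL.
- HTTP Method: from a quick look, the operations covered in this post use only POST and GET.
- Request Structure: the parameters that are sent with the request. To create a new run we need the experiment id (a run always belongs to an experiment), the start time (when the run was created), a user id (optional; you can supply any id or leave it out), and tags (also optional).
- Response Structure: on success you receive a status code plus information about the run, split into run info and run data. The official documentation lists in detail which fields each of the two contains.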
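The request and response structure described above can be sketched without a running server. The helper names below are hypothetical, and `sample_response` is a hand-written stand-in that only mirrors the nesting (`run` → `info` / `data`), not real server output.

```python
import time


def build_create_run_payload(experiment_id, user_id=None, tags=None):
    """Build the JSON body for 2.0/mlflow/runs/create."""
    payload = {
        "experiment_id": str(experiment_id),
        "start_time": int(time.time() * 1000),  # milliseconds since the epoch
    }
    if user_id is not None:  # optional
        payload["user_id"] = user_id
    if tags:  # optional; sent as a list of key/value objects
        payload["tags"] = [{"key": k, "value": str(v)} for k, v in tags.items()]
    return payload


def extract_run_id(response_json):
    """Pull the run id out of a create-run response body."""
    return response_json["run"]["info"]["run_uuid"]


# Hand-written sample with made-up values, for illustration only.
sample_response = {
    "run": {
        "info": {"run_uuid": "abc123", "experiment_id": "0", "status": "RUNNING"},
        "data": {},
    }
}

print(build_create_run_payload(0, tags={"stage": "dev"}))
print(extract_run_id(sample_response))
```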
3 Code Implementation
A text description alone is not enough, so we also wrote some code. We start with the functions that call the MLflow REST API directly:
- create new mlflow run
- create new mlflow experiment
- list all experiments
- get specific experiment info by experiment id
- get specific experiment info by experiment name
- log batch
- Set Tag
- log param
- log metric
The complete code can be downloaded from my Gitee repository: git clone https://gitee.com/yichaoyyds/mlflow-ex-restapi-basic.git
3.1 Creating an MLflow run
# Imports used by the snippets in this section
import time
from typing import Any, Dict, Optional

import requests

def create_run(self):
    """Create a new run for tracking."""
    url = self.base_url + "/runs/create"
    # user_id is deprecated and will be removed from the API in a future release
    payload = {
        "experiment_id": self.experiment_id,
        "start_time": int(time.time() * 1000),  # milliseconds since the epoch
        "user_id": _get_user_id(),
    }
    r = requests.post(url, json=payload)
    run_id = None
    if r.status_code == 200:
        run_id = r.json()["run"]["info"]["run_uuid"]
        print("Successfully created run with run id: {}".format(run_id))
    else:
        print("Creating run failed!")
    return run_id
As mentioned above, we first append the endpoint to the base URL (url = self.base_url + "/runs/create"). We then pack the inputs the request needs (experiment_id, start_time, user_id) into JSON and send them (requests.post(url, json=payload)). The response carries a status code (r.status_code) as well as the fields that describe the run, such as the run id (r.json()["run"]["info"]["run_uuid"]).
3.2 Getting information about a run
def get_run(self):
    """Get run info for the run id stored on this object."""
    url = self.base_url + "/runs/get"
    print("run_id: ", self.run_id)
    r = requests.get(url, params={"run_id": self.run_id})
    run = None
    if r.status_code == 200:
        run = r.json()["run"]
    return run
- Endpoint: 2.0/mlflow/runs/get
- HTTP Method: GET
- Request Structure:
  - run_id: every MLflow run has a unique run id, so to fetch information about a run we pass its run id as input.
- Response Structure: run, which contains the run metadata (name, start time, etc.) and data (metrics, params, and tags).
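As an illustration of working with that response, the sketch below flattens a run object into plain dictionaries. The function name `summarize_run` is hypothetical, and `sample_run` is a hand-written example with made-up values that only mirrors the metadata/data split described above.

```python
def summarize_run(run):
    """Flatten the 'info' and 'data' parts of a run into simple dicts."""
    info = run.get("info", {})
    data = run.get("data", {})
    return {
        "run_id": info.get("run_id") or info.get("run_uuid"),
        "status": info.get("status"),
        # Each entry in metrics/params/tags is an object with "key" and "value".
        "metrics": {m["key"]: m["value"] for m in data.get("metrics", [])},
        "params": {p["key"]: p["value"] for p in data.get("params", [])},
        "tags": {t["key"]: t["value"] for t in data.get("tags", [])},
    }


# Hand-written sample, not real server output.
sample_run = {
    "info": {"run_uuid": "abc123", "status": "FINISHED"},
    "data": {
        "metrics": [{"key": "mse", "value": 0.5, "timestamp": 1, "step": 0}],
        "params": [{"key": "alpha", "value": "0.1"}],
    },
}

print(summarize_run(sample_run))
```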
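3.3 Creating an experiment
def create_experiment(self, experiment_name="0",
                      artifact_location: Optional[str] = None,
                      tags: Optional[Dict[str, Any]] = None):
    """Create a new experiment."""
    url = self.base_url + "/experiments/create"
    payload = {"name": str(experiment_name)}
    # Only send the optional fields when they are actually set
    if artifact_location is not None:
        payload["artifact_location"] = artifact_location
    if tags:
        # The API expects tags as a list of {"key": ..., "value": ...} objects
        payload["tags"] = [{"key": str(k), "value": str(v)} for k, v in tags.items()]
    r = requests.post(url, json=payload)
    if r.status_code == 200:
        self.experiment_id = r.json()["experiment_id"]
    else:
        print("Creating experiment failed!")
    return r.status_code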
- Endpoint: 2.0/mlflow/experiments/create
- HTTP Method: POST
- Request Structure:
  - name: the experiment name. Take care to distinguish the experiment name from the experiment id; I mix them up all the time.
  - artifact_location: the location where all artifacts for the experiment are stored, that is, where the logged files are saved. Optional.
  - tags: optional; the tags associated with this experiment.
- Response Structure: experiment_id.
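3.4 Listing all experiments
def list_experiments(self):
    """Get all experiments."""
    # Note: as far as we know, MLflow 2.x replaced this endpoint with
    # experiments/search; check the documentation for your server version.
    url = self.base_url + "/experiments/list"
    r = requests.get(url)
    experiments = None
    if r.status_code == 200:
        experiments = r.json()["experiments"]
    return experiments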
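Because experiment names and experiment ids are easy to confuse, it can help to turn the returned list into a name-to-id lookup. The sketch below assumes each entry carries `name` and `experiment_id` fields; the function name and the sample list are made up for illustration.

```python
def index_experiments_by_name(experiments):
    """Map experiment name -> experiment id; tolerates None (failed request)."""
    return {e["name"]: e["experiment_id"] for e in experiments or []}


# Hand-written sample, not real server output.
sample_experiments = [
    {"experiment_id": "0", "name": "Default"},
    {"experiment_id": "1", "name": "wine-quality"},
]

lookup = index_experiments_by_name(sample_experiments)
print(lookup["wine-quality"])
```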
3.5 Getting an experiment by experiment id
def get_experiment_by_id(self, experiment_id):
    """Get the info of one experiment by experiment id."""
    url = self.base_url + "/experiments/get"
    r = requests.get(url, params={"experiment_id": experiment_id})
    experiment = None
    if r.status_code == 200:
        experiment = r.json()["experiment"]
    return experiment
3.6 Getting an experiment by experiment name
def get_experiment_by_name(self, experiment_name):
    """Get the info of one experiment by experiment name."""
    url = self.base_url + "/experiments/get-by-name"
    r = requests.get(url, params={"experiment_name": experiment_name})
    experiment = None
    if r.status_code == 200:
        experiment = r.json()["experiment"]
    return experiment
3.7 Setting a tag
From here on we cover the logging-related functions.
def set_tag(self, tag):
    """
    Set a tag on the given run. The tag is passed as a dict, for example:
    - tag={"key": "precision", "value": 0.769}
    - tag={"precision": 0.769}
    """
    url = self.base_url + "/runs/set-tag"
    if not tag:  # dictionary is empty
        return 0
    # dict.keys() never compares equal to a list in Python 3, so compare as sets
    if set(tag.keys()) == {"key", "value"}:
        key, value = tag["key"], tag["value"]
    else:
        key, value = next(iter(tag.items()))
    payload = {"run_id": self.run_id, "key": key, "value": str(value)}
    r = requests.post(url, json=payload)
    return r.status_code
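The tag input can be written in two ways (this function accepts both, because both appear frequently in the official MLflow examples):
- tag={"key": "precision", "value": 0.769}
- tag={"precision": 0.769}
Note that what is finally passed to requests.post follows the first form, and every value must be a string.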
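The normalization of the two input forms can be sketched as a small standalone function. The name `normalize_key_value` is hypothetical. One detail worth spelling out: in Python 3, `dict.keys()` returns a view that never compares equal to a list, so the check for the first form is done on a set.

```python
def normalize_key_value(entry):
    """Turn either accepted form into {"key": ..., "value": "<string>"}."""
    if not entry:
        raise ValueError("entry must be a non-empty dict")
    if set(entry.keys()) == {"key", "value"}:
        # Form 1: {"key": "precision", "value": 0.769}
        key, value = entry["key"], entry["value"]
    else:
        # Form 2: {"precision": 0.769}; only the first item is used
        key, value = next(iter(entry.items()))
    return {"key": str(key), "value": str(value)}


print(normalize_key_value({"key": "precision", "value": 0.769}))
print(normalize_key_value({"precision": 0.769}))
```

Both calls produce the same dictionary, which is the shape the request body expects.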
3.8 Logging a param
def log_param(self, param):
    """
    Log a parameter for the given run. The param is passed as a dict, for example:
    - param={"key": "precision", "value": 0.769}
    - param={"precision": 0.769}
    """
    url = self.base_url + "/runs/log-parameter"
    if not param:  # dictionary is empty
        return 0
    # dict.keys() never compares equal to a list in Python 3, so compare as sets
    if set(param.keys()) == {"key", "value"}:
        key, value = param["key"], param["value"]
    else:
        key, value = next(iter(param.items()))
    payload = {"run_id": self.run_id, "key": key, "value": str(value)}
    r = requests.post(url, json=payload)
    return r.status_code
3.9 Logging a metric
def log_metric(self, metric=None, step: Optional[int] = None):
    """
    Log a metric for the given run. The metric is passed as a dict, for example:
    - metric={"key": "precision", "value": 0.769}
    - metric={"precision": 0.769}
    """
    url = self.base_url + "/runs/log-metric"
    if not metric:  # None or empty dictionary
        return 0
    # dict.keys() never compares equal to a list in Python 3, so compare as sets
    if set(metric.keys()) == {"key", "value"}:
        key, value = metric["key"], metric["value"]
    else:
        key, value = next(iter(metric.items()))
    payload = {
        "run_id": self.run_id,
        "key": key,
        "value": float(value),  # metric values are numeric
        "timestamp": int(time.time() * 1000),  # milliseconds since the epoch
        "step": step or 0,
    }
    r = requests.post(url, json=payload)
    return r.status_code
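3.10 Logging a batch
After the log functions above, you may wonder whether logging a single metric or a single param per call is a bit inefficient. The official documentation therefore also provides a log-batch operation, which sends a set of metrics, params, and tags in one request.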
def log_batch(self, metrics=None, params=None, tags=None):
    """
    Log a batch of metrics, params, and tags for a run.
    metrics, params, and tags each accept a list, a dict, or nothing.
    For example:
    (1) metrics=[];
    (2) metrics=[{"key": "mse", "value": "0.769"}, {"key": "callback", "value": "0.512"}];
    (3) metrics={"mse": 2500.00, "rmse": 50.00};
    """
    url = self.base_url + "/runs/log-batch"

    def to_pairs(entries):
        """Normalize a list of {"key", "value"} dicts or a plain dict to (key, value) pairs."""
        if not entries:
            return []
        if isinstance(entries, dict):
            return list(entries.items())
        return [(entry["key"], entry["value"]) for entry in entries]

    now_ms = int(time.time() * 1000)
    # Metric values are numeric and carry a timestamp; param and tag values are strings
    metricsList = [{"key": str(k), "value": float(v), "timestamp": now_ms, "step": 0}
                   for k, v in to_pairs(metrics)]
    paramsList = [{"key": str(k), "value": str(v)} for k, v in to_pairs(params)]
    tagsList = [{"key": str(k), "value": str(v)} for k, v in to_pairs(tags)]
    payload = {"run_id": self.run_id, "metrics": metricsList, "params": paramsList, "tags": tagsList}
    r = requests.post(url, json=payload)
    return r.status_code
The JSON payload that is sent looks like this:
{
    "run_id": "2a14ed5c6a87499199e0106c3501eab8",
    "metrics": [
        {"key": "mae", "value": 2.5, "timestamp": 1552550804},
        {"key": "rmse", "value": 2.7, "timestamp": 1552550804}
    ],
    "params": [
        {"key": "model_class", "value": "LogisticRegression"}
    ]
}
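When many metrics accumulate, a single request can grow large. To our knowledge the official documentation puts an upper bound on the number of entries per log-batch request (1000 metrics at the time of writing), so the sketch below splits a long metric list into several payloads. The function names and the default limit are assumptions; check the limits documented for your MLflow version.

```python
def chunk_metrics(metrics, max_per_request=1000):
    """Split a list of metric dicts into request-sized chunks."""
    if max_per_request < 1:
        raise ValueError("max_per_request must be at least 1")
    return [metrics[i:i + max_per_request]
            for i in range(0, len(metrics), max_per_request)]


def build_batches(run_id, metrics, max_per_request=1000):
    """Build one log-batch payload per chunk of metrics."""
    return [{"run_id": run_id, "metrics": chunk}
            for chunk in chunk_metrics(metrics, max_per_request)]


# 2500 made-up loss values, one per step.
metrics = [{"key": "loss", "value": 1.0 / (i + 1), "timestamp": 0, "step": i}
           for i in range(2500)]
batches = build_batches("2a14ed5c6a87499199e0106c3501eab8", metrics)
print([len(b["metrics"]) for b in batches])
```

Each payload in `batches` can then be posted to the log-batch endpoint in turn.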