Machine Learning Lifecycle Management Using MLflow

杨_明

Original article: https://medium.com/engineering-at-ooba/machine-learning-lifecycle-management-using-mlflow-64d3bd75b6bd

Managing machine learning model development is a non-trivial task that involves multiple steps: model selection, framework selection, data processing, metric optimization, and finally model packaging and deployment. An organized workflow makes model management less complicated and makes experiments reproducible.

Introduction to MLflow

MLflow is an open-source machine learning lifecycle management tool that helps organize the workflow for training, tracking, and productionizing machine learning models. It is designed to work with most current machine learning libraries and frameworks.

According to the official website, there are four components that MLflow currently offers:

Tracking: Record and query experiments: code, data, config, and results
Projects: Package data science code in a format to reproduce runs on any platform
Models: Deploy machine learning models in diverse serving environments
Registry: Store, annotate, discover, and manage models in a central repository

In the forthcoming sections, we will go over how all of these components can be leveraged to organize the machine learning workflow.

Installing MLflow

The MLflow Python package can be installed with either pip or conda, whichever you prefer.

shell> pip install mlflow

If you are using Databricks, all the ML runtimes come with MLflow preinstalled, so you can log model runs to DBFS storage directly from a Databricks notebook.

To test the installation, run the mlflow command in the terminal:

shell> mlflow

You should get an output similar to this:

Usage: mlflow [OPTIONS] COMMAND [ARGS]...

Options:
  --version  Show the version and exit.
  --help     Show this message and exit.

Commands:
  azureml      Serve models on Azure ML.
  download     Downloads the artifact at the specified DBFS...
  experiments  Tracking APIs.
  pyfunc       Serve Python models locally.
  run          Run an MLflow project from the given URI.
  sagemaker    Serve models on SageMaker.
  sklearn      Serve SciKit-Learn models.
  ui           Run the MLflow tracking UI.
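
The same check can also be made from Python, which is handy inside a notebook where no terminal is available. The following is only a minimal sketch using the standard library; the helper name installed_version is hypothetical and not part of MLflow.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report whether the mlflow distribution is visible to this interpreter.
version = installed_version("mlflow")
print(f"mlflow {version}" if version else "mlflow is not installed")
```

If the message says MLflow is not installed even though the pip command succeeded, the notebook kernel is probably using a different Python environment from the one pip installed into.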

MLflow Tracking

The Tracking component consists of a UI and APIs for logging parameters, code versions, metrics, and output files. MLflow runs are grouped into experiments so that the logs for different runs of an experiment can be tracked and compared, and the logged parameters and metrics can be visualized side by side. MLflow provides simple API support for Python, REST, R, and Java.
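
As a rough illustration of the Python tracking API, the sketch below logs one run's parameters and a metric to a local directory. The experiment name, parameter names, and the rmse helper are assumptions made up for this example; only the mlflow calls themselves (set_tracking_uri, set_experiment, start_run, log_params, log_metrics) come from the library.

```python
from pathlib import Path


def rmse(y_true, y_pred):
    """Root mean squared error, computed without third-party dependencies."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return (sum(squared) / len(squared)) ** 0.5


def log_demo_run(tracking_dir, params, metrics):
    """Log one run to a local file store and return its run ID."""
    import mlflow  # imported lazily so rmse() works without MLflow installed

    # Point MLflow at a local directory instead of a remote tracking server.
    mlflow.set_tracking_uri(Path(tracking_dir).resolve().as_uri())
    mlflow.set_experiment("demo-experiment")  # hypothetical experiment name

    # Everything logged inside the context manager belongs to a single run.
    with mlflow.start_run() as run:
        mlflow.log_params(params)    # e.g. hyperparameters
        mlflow.log_metrics(metrics)  # e.g. evaluation scores
    return run.info.run_id
```

A call such as log_demo_run("mlruns", {"alpha": 0.5}, {"rmse": rmse([1.0, 2.0], [1.5, 2.5])}) records one run; repeating it with different values produces several runs in the same experiment that can then be compared in the tracking UI.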

Figure: MLflow Tracking Architecture

By default, mlflow uses local storage to run the tracking server. MLflow does provide the option to tr
