Managing your machine learning experiments with MLflow

I still remember a painful period when my teammate and I were working on a machine learning (ML) project.

Tediously and studiously, we were manually transferring the results of our countless experiments to a Google Sheet and organizing our saved models in folders. Of course, we did try to automate the process as much as possible, but managing our ML experiments was still a messy affair.

If this situation sounds familiar, hopefully this article will help you out and reduce your pain.

As one of the best open source solutions for managing ML experiments (see other tools here), MLflow will greatly improve your well-being (as a data scientist, machine learning specialist, etc.) and let you and your team stay sane while keeping track of your models 💪.

How can MLflow help me?

With just a few lines of code integrated into your script, you can auto-log your model parameters and metrics into an organized dashboard as shown below.

MLflow dashboard

Clicking into each of the table rows will show you more details, including the path of the model saved for that run (one run is basically one model training).

MLflow dashboard showing path to saved model
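
The saved-model path on a run's detail page can also be used from code. Below is a minimal sketch, assuming the model was logged with `mlflow.sklearn.log_model` under the artifact path "model" and that the run ID is copied from the dashboard. `build_model_uri` and `load_logged_model` are hypothetical helper names for illustration, not part of MLflow. How logged models are addressed may differ between MLflow versions.

```python
def build_model_uri(run_id: str, artifact_path: str = "model") -> str:
    """Return a 'runs:/<run_id>/<artifact_path>' URI for a model logged in a run."""
    if not run_id:
        raise ValueError("run_id must be a non-empty string")
    return f"runs:/{run_id}/{artifact_path}"


def load_logged_model(run_id: str, artifact_path: str = "model"):
    """Load a scikit-learn model that a run saved with mlflow.sklearn.log_model."""
    # Imported lazily so build_model_uri can be used without MLflow installed.
    import mlflow.sklearn

    return mlflow.sklearn.load_model(build_model_uri(run_id, artifact_path))


# Example (the run ID is a placeholder, replace it with one from your dashboard):
# clf = load_logged_model("<run id from the dashboard>")
# predictions = clf.predict(X_test)
```

The tracking URI has to point at the same store the run was logged to, otherwise the run ID will not be found.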

And as mentioned earlier, the important thing is that all these can be automated with just a few additional lines of code in your script.

In our example code snippet below, we have placed comments above all the lines of code relating to MLflow.

import mlflow
import mlflow.sklearn
from sklearn.ensemble import RandomForestClassifier

# data_processing() is the author's own helper (not shown here);
# it is expected to return the train/test splits
X_train, X_test, y_train, y_test = data_processing()


#################### 1. Setup Experiment ###########################
# set path to log data, e.g., mlruns local folder
# (done first so the experiment is created in this location)
mlflow.set_tracking_uri('./mlruns')

# set experiment name to organize runs
mlflow.set_experiment('New Experiment Name')
experiment = mlflow.get_experiment_by_name('New Experiment Name')

# launch new run under the experiment name
with mlflow.start_run(experiment_id=experiment.experiment_id):

    #################### 2. Normal Model Training ######################
    hyperparams = {'max_depth': 10,
                   'max_samples': 0.8,
                   'max_features': 'sqrt'}
    clf = RandomForestClassifier(**hyperparams,
                                 random_state=0)
    clf.fit(X_train, y_train)
    accuracy = clf.score(X_test, y_test)

    ################ 3. Log params, metrics and model #################
    # log model params
    mlflow.log_params(hyperparams)

    # log model metric
    mlflow.log_metric('accuracy', accuracy)

    # log model
    mlflow.sklearn.log_model(clf, "model")

In general, there are three main sections in our example:

1. Set up experiment: Here we set an experiment name (mlflow.set_experiment()) and the path where runs are logged (mlflow.set_tracking_uri()), before starting our run with mlflow.start_run().

2. Train model: Nothing special here, just normal model training.

3. Logging: Log parameters (mlflow.log_params()), metrics (mlflow.log_metric()) and model (mlflow.sklearn.log_model()).
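
Since one run corresponds to one model training, the same three steps can be repeated in a loop to compare several hyperparameter settings side by side in the dashboard. The following is a rough sketch under stated assumptions: `expand_grid` and `run_sweep` are hypothetical helper names, the candidate values are made up for illustration, and the tracking URI and experiment are assumed to have been set beforehand (otherwise MLflow falls back to its defaults).

```python
from itertools import product


def expand_grid(options: dict) -> list:
    """Expand {'a': [1, 2], 'b': ['x']} into one dict per combination of values."""
    keys = list(options)
    return [dict(zip(keys, values))
            for values in product(*(options[key] for key in keys))]


def run_sweep(X_train, X_test, y_train, y_test, options: dict) -> None:
    """Train one random forest per hyperparameter combination, one MLflow run each."""
    # Imported lazily so expand_grid can be used without MLflow or scikit-learn.
    import mlflow
    import mlflow.sklearn
    from sklearn.ensemble import RandomForestClassifier

    for hyperparams in expand_grid(options):
        # Each start_run() block is recorded as a separate run (one dashboard row).
        with mlflow.start_run():
            clf = RandomForestClassifier(**hyperparams, random_state=0)
            clf.fit(X_train, y_train)
            mlflow.log_params(hyperparams)
            mlflow.log_metric("accuracy", clf.score(X_test, y_test))
            mlflow.sklearn.log_model(clf, "model")


# Example call (candidate values are illustrative only):
# run_sweep(X_train, X_test, y_train, y_test,
#           {"max_depth": [5, 10], "max_features": ["sqrt", "log2"]})
```

Logging the parameters as a dict keeps the dashboard columns consistent across runs, which makes sorting by accuracy meaningful.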
