Python beginner’s programming guide: A beginner’s guide to training and deploying machine learning models using Python

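This article is a Python machine learning guide for beginners that explains how to train and deploy machine learning models with Python. Starting from obtaining a dataset, the author shows how to train a linear regression model, save and load the model with the pickle library, and then deploy it behind a simple web server built with the Flask framework so that it can serve predictions.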

by Ivan Yung

A beginner’s guide to training and deploying machine learning models using Python

When I was first introduced to machine learning, I had no idea what I was reading. All the articles I read consisted of weird jargon and crazy equations. How could I figure all this out?

I opened a new tab in Chrome and looked for easier solutions. I found APIs from Amazon, Microsoft, and Google that did all the machine learning for me. Each hackathon project I made would call their servers and WOW — it looked so smart! I was hooked.

But, after a year, I realized that I wasn’t learning anything. Everything I was doing was described by this Nedroid comic that I modified:

Eventually, I sat down and learned how to use machine learning without megacorporations. And it turns out, anyone can do it. The current libraries we have in Python are amazing. In this article, I will explain how I use these libraries to create a proper machine learning back end.

Getting a dataset

Machine learning projects are reliant on finding good datasets. If the dataset is bad, or too small, we cannot make accurate predictions. You can find some good datasets at Kaggle or the UC Irvine Machine Learning Repository.

In this article, I am using a wine quality dataset with many features and one label. Features are independent variables which affect the dependent variable called the label. In this case, we have one label column — wine quality — that is affected by all the other columns (features like pH, density, acidity, and so on).

In the following Python code, I use a library called pandas to manage my dataset. pandas provides many functions to select and manipulate data.

First, I load the dataset into a pandas DataFrame and split it into the label and its features. I grab the label column by its name (quality), then drop that column to get all the features.
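
A minimal sketch of this step, assuming the label column is named `quality` as described above. To stay self-contained it reads a few made-up rows from an in-memory string; the `load_dataset` helper, the chosen columns, and the sample values are illustrative assumptions rather than the real dataset.

```python
import io

import pandas as pd

# Hypothetical stand-in for the wine quality CSV file.
# These rows are made up for illustration only.
SAMPLE_CSV = """fixed acidity,density,pH,alcohol,quality
7.4,0.9978,3.51,9.4,5
7.8,0.9968,3.20,9.8,5
11.2,0.9980,3.16,9.8,6
7.3,0.9946,3.39,10.0,7
"""


def load_dataset(source, sep=","):
    """Read a CSV and split it into (features, label)."""
    df = pd.read_csv(source, sep=sep)
    label = df["quality"]                    # the dependent variable
    features = df.drop(columns=["quality"])  # every other column
    return features, label


features, label = load_dataset(io.StringIO(SAMPLE_CSV))
print(features.head())
print(label.tolist())
```

With a real file you would pass its path instead of the `StringIO` object. If your copy of the file uses a different delimiter, such as a semicolon, pass `sep=";"`.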

Training a model

Machine learning works by finding a relationship between a label and its features. We do this by showing an object (our model) a bunch of examples from our dataset. Each example helps define how each feature affects the label. We refer to this process as training our model.

I use estimator objects from the scikit-learn library for simple machine learning. Estimators are untrained models that learn relationships through a predefined algorithm.

For this wine dataset, I create a model from a linear regression estimator. (Linear regression attempts to draw a straight line of best fit through our dataset.) The model is trained on the features and label through the fit function. I can then use the model by passing a made-up set of features to the predict function. The example below shows the features for one fake wine. The model will output an answer based on its training.

The code for this model, and fake wine, is below:
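
A minimal sketch of what this could look like, assuming scikit-learn's `LinearRegression` estimator. The two feature columns, the training rows, and the fake wine's values are made up for illustration; a real run would use the full features and label from the dataset.

```python
import pandas as pd
from sklearn import linear_model

# Made-up training rows standing in for the real features and label.
features = pd.DataFrame({
    "pH": [3.51, 3.20, 3.26, 3.16, 3.39],
    "alcohol": [9.4, 9.8, 10.5, 11.2, 12.0],
})
label = pd.Series([5, 5, 6, 6, 7], name="quality")

# Create an untrained estimator, then train it on the examples.
model = linear_model.LinearRegression()
model.fit(features, label)

# One fake wine, with the same columns the model was trained on.
fake_wine = pd.DataFrame([{"pH": 3.30, "alcohol": 10.0}])
prediction = model.predict(fake_wine)
print(prediction)  # a one-element array holding the predicted quality
```

Keeping the fake wine in a DataFrame with the training column names lets scikit-learn match features by name instead of relying on position alone.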

Importing and exporting our Python model

The pickle library makes it easy to serialize the models into files that I create. I am also able to load the model back into my code. This allows me to keep my model training code separated from the code that deploys my model.

I can import or export my Python model for use in other Python scripts with the code below:
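
A minimal sketch of the export and import round trip, assuming a tiny stand-in model and a hypothetical file name `model.pkl` written to a temporary directory so the example cleans up after itself.

```python
import os
import pickle
import tempfile

from sklearn import linear_model

# Tiny stand-in model trained on the exact line y = 2x + 1.
model = linear_model.LinearRegression()
model.fit([[0.0], [1.0], [2.0]], [1.0, 3.0, 5.0])

with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, "model.pkl")  # hypothetical file name

    # Export: serialize the trained model to a file.
    with open(model_path, "wb") as f:
        pickle.dump(model, f)

    # Import: this part would normally live in a separate script.
    with open(model_path, "rb") as f:
        loaded_model = pickle.load(f)

print(loaded_model.predict([[3.0]]))  # close to 7.0, since y = 2x + 1
```

Unpickling runs code embedded in the file, so only load pickle files that you created yourself or otherwise trust.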

Creating a simple web server

To deploy my model, I first have to create a server. Servers listen for web traffic and run functions when they receive a request addressed to them. Which function runs can depend on the request’s route and the other data it carries. Afterwards, the server sends a response back to the requester.

The Flask Python framework allows me to create web servers in record time.

In the code below, I use Flask to run a simple one-route web server. My one route listens for POST requests and sends a hello back. A POST request carries data in its body; here, that body is a JSON object.
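
A minimal sketch of such a server, assuming the route is named `/echo` as it is later in this article. The response shape (`{"message": "hello"}`) is an illustrative assumption. Instead of starting a real server, the sketch exercises the route with Flask's built-in test client so it can run anywhere.

```python
from flask import Flask, jsonify

app = Flask(__name__)


@app.route("/echo", methods=["POST"])
def echo():
    # Ignore the request body for now and just say hello.
    return jsonify({"message": "hello"})


# Exercise the route without opening a network port.
client = app.test_client()
response = client.post("/echo", json={"name": "wine"})
print(response.status_code, response.get_json())

# To serve real traffic during development, you would instead call
# app.run(port=8080); the port number here is an arbitrary choice.
```

Because the route only accepts POST, a GET request to `/echo` is rejected with a 405 Method Not Allowed response.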

Adding the model to my server

With the pickle library, I am able to load our trained model into my web server.

Our server now loads the trained model during its initialization. I can access it by sending a POST request to my “/echo” route. The route grabs an array of features from the request body and passes it to the model. The model’s prediction is then sent back to the requester.
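
A minimal sketch of this flow, under a few stated assumptions: the request body is a JSON object with a hypothetical `features` key, the response uses a hypothetical `prediction` key, and a tiny model pickled in memory stands in for the model file that a separate training script would produce.

```python
import pickle

from flask import Flask, jsonify, request
from sklearn import linear_model

# Stand-in for a pickled model file: trained on y = 1 + 2*a + 3*b.
_trained = linear_model.LinearRegression().fit(
    [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]], [1.0, 3.0, 4.0]
)
MODEL_BYTES = pickle.dumps(_trained)

app = Flask(__name__)
model = pickle.loads(MODEL_BYTES)  # loaded once, when the server starts


@app.route("/echo", methods=["POST"])
def predict():
    payload = request.get_json(silent=True) or {}
    features = payload.get("features")
    if features is None:
        return jsonify({"error": "missing 'features'"}), 400
    # predict expects a 2D array: one row per example.
    prediction = model.predict([features])
    return jsonify({"prediction": prediction.tolist()})


# Exercise the route with Flask's test client instead of a live server.
client = app.test_client()
response = client.post("/echo", json={"features": [1.0, 1.0]})
print(response.get_json())  # prediction close to 6.0 for this stand-in model
```

Loading the model once at startup, rather than inside the route, avoids unpickling it again on every request.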

Conclusion

After reading this article, you should be able to create your own machine learning back end. For more detail, you can find a full example that I made at this repository.

When you have time, I recommend taking a step back from coding and reading about machine learning. This article only teaches the bare necessities to create a model. There are topics like loss reduction and neural nets that you need to know.

Good luck and happy coding!

Translated from: https://www.freecodecamp.org/news/a-beginners-guide-to-training-and-deploying-machine-learning-models-using-python-48a313502e5a/
