Web Database Frameworks
Some motivation for data scientists to learn a web framework.
Introduction
I want to start by noting that this is a bit of an opinion article, so take it with a grain of salt. At the very least, I hope it gets people thinking and acts as food for thought. With that, on with the show!
Full Stack Python has a fantastic article on web frameworks. Probably the most powerful quote is:
Web frameworks encapsulate what developers have learned over the past twenty years while programming sites and applications for the web. Frameworks make it easier to reuse code for common HTTP operations and to structure projects so other developers with knowledge of the framework can quickly build and maintain the application.
Now that we know what a web framework is, why should Data Scientists know more?
The Trouble with Implementing Models
“I’m a data scientist. Why should I learn a web framework?”
While it’s great to build models, they’re no good to anyone if you can’t get them into production. Companies like VentureBeat, Redapt, and others report that ~90% of machine learning projects don’t make it to production. Why is this true? Let’s first take a look at what kind of team is required for a machine learning project.
Machine learning projects need four teams to succeed: Data Scientists, Application/Web Developers, Data Engineers, and MLOps/DevOps. If one of those pieces is missing, then you have people wearing multiple hats (i.e. playing multiple roles).
So let’s think about where the team could potentially use reinforcement.
Data Scientists need data to do their job; they can't do anything if they don't have data to analyze. So the connection between Data Scientists and Data Engineers should be strong (if not, then you're really in trouble). The same goes for Data Engineers and Application/Web Developers: Data Engineers can hardly do their job if they can't get data from front-end systems. So the right side of the Venn diagram is generally pretty strong. If the right side is not strong, it seems logical to assume your company's business intelligence program is in its infancy. So you're either in a development phase, or you're really in trouble.
The middle and left side of the Venn diagram could be common pain-points. In general, Data Engineering and BI have been around for a while. Data Science is an old job (e.g. statistician, statistical modeler, predictive analyst, etc.) with a new title. But MLOps, the merger of DevOps and machine learning, is not a particularly mature field. Below is an updated Venn diagram to convey my point.
So how do we overcome these problems? We need our application/web developers and data scientists to develop a stronger relationship. This is not the whole solution, but it is an important part of it.
Now let’s list out some potential business reasons a machine learning project could fail.
- The company is hesitant to pull the trigger on ML implementations.
- The company struggles to understand the technical requirements for implementing ML models.
- Lack of coordination between different teams programming in different languages.
With all this in mind, it sounds like companies could fall into 3 buckets:
- Mature and flexible MLOps: in which case the issue is more likely business-related.
- Mature MLOps but inflexible: most likely the company built a strong process for one project that doesn't work for anything else.
- Immature MLOps: little to no experience in MLOps.
If you fall into bucket #2 (Mature MLOps but inflexible), it helps to have your data scientists meet your application/web developers halfway. This way you can prototype and develop a strong app foundation, which gives the business more comfort in saying "yes, let's invest more in that". Over time you can build on this momentum to achieve higher MLOps flexibility.
Implementing a Model
“So I get that having a strong relationship with your application/web developers seems important, but I still don’t understand why I should specifically learn web frameworks.”
Okay, let’s list some machine learning implementation options (not all-inclusive):
- A dashboard (dependent on web frameworks)
- A REST API (dependent on web frameworks)
- A batch job
- Recode the model into the production system
The two options that require web frameworks are, next to batch jobs, two of the most powerful implementation options. Let’s take a look at a REST API. Let's say I have a model, and let’s say I wrap that model up into a REST API. This enables other programs to send my REST API some data, which my model can score before returning a result. With one service I can serve multiple applications!
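To make the idea of wrapping a model in a REST API concrete, here is a minimal sketch in Flask. Everything model-related is an assumption for illustration: the `WEIGHTS` and `INTERCEPT` values, the `age` and `income` feature names, and the `/score` route are hypothetical stand-ins for whatever fitted model and endpoint you would actually use.

```python
import math

from flask import Flask, jsonify, request

app = Flask(__name__)

# Hypothetical stand-in for a trained model: fixed logistic-regression
# coefficients. In practice you would load a fitted model from disk instead.
WEIGHTS = {"age": 0.03, "income": 0.00001}
INTERCEPT = -2.0


def score(features):
    """Return a score between 0 and 1 for one record of features."""
    z = INTERCEPT + sum(w * float(features[name]) for name, w in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-z))


@app.route("/score", methods=["POST"])
def score_endpoint():
    # Parse the JSON body; silent=True yields None instead of raising on bad input.
    payload = request.get_json(silent=True)
    if not isinstance(payload, dict):
        return jsonify(error="expected a JSON object"), 400

    # Reject requests that do not carry every feature the model needs.
    missing = [name for name in WEIGHTS if name not in payload]
    if missing:
        return jsonify(error=f"missing features: {missing}"), 400

    try:
        result = score(payload)
    except (TypeError, ValueError):
        return jsonify(error="feature values must be numeric"), 400
    return jsonify(score=result)
```

Saved as a hypothetical `app.py`, this could be started locally with `flask --app app run`. Any program that speaks HTTP, whether a dashboard, a mobile app, or another backend service, could then POST a JSON record to `/score` and get a score back, which is the "one service, multiple applications" point made above.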
So in short, web frameworks are a powerful way to implement a machine learning model.
I would go a step further to say that it’s also important to understand how other companies implement models. Conferences and podcasts are a great way to study up on infrastructures and architectures that other companies are using. (In some companies, the data science team might own the deployment of the model itself, while IT focuses on the full architecture. But one step at a time.)
Popular Web Frameworks for Data Scientists
Here are some popular frameworks that I’ve personally seen in the industry:
- Flask (Python)
- Django (Python)
- RShiny (R)
- plumber (R)
Flask is a very simple web framework that is easy to learn and great for REST APIs. Django is a bit more involved, but comes with a lot of power and is a popular framework among web developers. RShiny and plumber are very popular among R programmers.
If you come from a Python background and you want to learn more, Udemy has some great courses that teach Flask and Django. I personally recommend getting started in Flask.
Resources
Python REST APIs with Flask, Docker, MongoDB, and AWS DevOps
Why 90% of Machine Learning Models Never Make it to Production
Why do 87% of data science projects never make it into production?
Great conferences & podcasts highlighting MLOps:
The End!
Thanks for reading, and I hope you find this helpful! Happy coding and happy modeling!