DeepOps for Business: Building an AI-First Company

Enterprises and large companies like Facebook have had AI-first capability for years, but it’s only recently that small businesses could make the transition. Yuval Greenfield of missinglink.ai has developed a ten-point checklist for companies wanting to make the change, giving them both AI capability and the chance to attract top talent by working on these hot projects. Let’s take a look at his talk “DeepOps: Building an AI-First Company” for ODSC’s Accelerate AI.

What Data Scientists Want

Data scientists want to build state-of-the-art models through an iterative process. Neural network developers move the needle through these experiments, and through massive trial and error, data scientists refine these models, driving innovation.

Unfortunately, that’s not where most data scientists spend their time. Companies want their data scientists to focus on the models, but the reality involves expense tracking and making efficient use of resources.

These are back-end problems. At many companies, data scientists are working with primitive tools instead of cutting-edge AI solutions. As companies transition to AI-driven solutions, a marker of success will be the time to market, making those “expensive” tools more necessary.

So What’s the Problem?

A company has a lot of machines to run code and move data. This creates a processing issue. Some data scientists are running machines at their desks using a cobbled-together on-premise solution. Others are using legacy cloud systems, flipping between on-premise and scaled clouds.

It’s tricky to get things consistent. Getting creative can go well, or it can create a slowdown, blocking the pipeline when efficiency is critical. You’re wasting both talent and time.

Wouldn’t it be great if you could automate those tasks? If it involved just one button to launch?

The Risks of Experimentation

Data scientists might tweak and tweak again, retrying models without committing because no one cares about the garbage changes. Everything is fine until one critical tweak changes the quality of the model, and no one has a record or systematic documentation of it.

A further risk is job change. The average job trajectory for a data scientist is around two years. When your data scientist leaves, you lose a wealth of historical data, not to mention the chance of quality collaboration in the future.

Shared notes can become bloated, even though they are vital to the collaboration process. Only automating the notes can strike the right balance between diligence and tedium.
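One way to picture automated notes is a small logger that records the code version, parameters, and results of every run, so that a critical tweak is never lost. The following is a minimal sketch, not the tooling described in the talk: the function names (`log_run`, `load_runs`, `diff_params`) and the JSON Lines log format are assumptions made for illustration, and the code version is read with the standard `git rev-parse HEAD` command.

```python
import json
import subprocess
import time
from pathlib import Path


def current_commit() -> str:
    """Return the current git commit hash, or 'unknown' outside a repository."""
    try:
        out = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            capture_output=True, text=True, check=True,
        )
        return out.stdout.strip()
    except (OSError, subprocess.CalledProcessError):
        return "unknown"


def log_run(log_path: Path, params: dict, results: dict) -> dict:
    """Append one experiment record to a JSON Lines file (hypothetical format)."""
    record = {
        "timestamp": time.time(),
        "commit": current_commit(),
        "params": params,
        "results": results,
    }
    with log_path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record, sort_keys=True) + "\n")
    return record


def load_runs(log_path: Path) -> list:
    """Read back every recorded run, oldest first."""
    with log_path.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def diff_params(run_a: dict, run_b: dict) -> dict:
    """Return {name: (value_in_a, value_in_b)} for every parameter that differs."""
    keys = set(run_a["params"]) | set(run_b["params"])
    return {
        k: (run_a["params"].get(k), run_b["params"].get(k))
        for k in sorted(keys)
        if run_a["params"].get(k) != run_b["params"].get(k)
    }
```

Calling `log_run` at the end of every training script keeps the record without anyone writing notes by hand, and `diff_params` answers the question of what changed between two runs.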

Building the Folder

In a typical folder, there’s a whole bunch of data with a bit of metadata. To run a new architecture, you drag a few files from your primary folder and run the experiment. However, the folder system breaks down if you delete the folder or if you want to run many experiments.

Using a Database

A database does solve some of that issue, but databases aren’t suitable for all types of information. If you run a query once and then run the same query a month later, you may not get the same results because the database has changed.

And deciding to integrate some of the data into a folder after all means you’re now managing both data sources. So is it worth the hassle to version-control data?
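To make the idea of versioning data concrete, here is a minimal sketch that derives one version ID for a data folder from the contents of its files. It is an illustration under stated assumptions rather than the approach from the talk: the function names `file_digest` and `dataset_version` are hypothetical, and SHA-256 content hashing stands in for whatever a real data versioning tool would do.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of one file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def dataset_version(root: Path) -> str:
    """Hash relative paths and file contents into one version ID for the folder."""
    h = hashlib.sha256()
    # Sort the files so the ID does not depend on directory listing order.
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        h.update(path.relative_to(root).as_posix().encode("utf-8"))
        h.update(b"\0")
        h.update(file_digest(path).encode("ascii"))
        h.update(b"\n")
    return h.hexdigest()
```

Storing this ID next to each experiment's results lets you tell later whether two runs saw the same data, even if the folder has since been edited.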

Complicating the Issue

Most companies only have version control for the code, not the model or data. This makes critical questions hard to answer, leaving your scientists hunting for answers when things fall apart.

The DeepOps Answer

What if we could take everything we’ve learned in DevOps and apply it to questions like these to transform the way we think of version control and data experimentation? If you can’t reproduce your results, the evolution of your product is lost.

As our understanding of deployment changes in the face of continuous intelligence and development, companies must be willing to accept that shipping out changes happens in mere minutes instead of months or years.

Counterintuitively, a 2018 State of DevOps report found that companies that take an hour or less from commit to production have a failure rate of less than 10 percent, while companies that take one to six months see that number jump to over 50 percent.

Faster and more reliable? It seems too good to be true. However, the collaboration between developers and Ops teams has jumpstarted this ideal situation. A better balance between these two teams made continuous development possible, combining the infrastructure of Ops with the innovation of development.

A Culture of Development

The biggest key to AI transformation is creating a culture of responsibility throughout the pipeline. Each person has the keys to innovation, testing, and production. There are four key building blocks:

  • Version control
  • Test
  • Automate
  • Monitor

These core principles made it possible to increase innovation pipelines. Now, deep learning uses these core aspects, providing data scientists a more focused target.

Greenfield believes that companies should hire the right people. Companies must invest in engineers and data scientists to build these solutions.

You’ll also need to invest in the methodologies that work. Blending white-box with black-box solutions can put your data scientists back on growth-producing activities, neither burdening them with mundane tasks nor risking poor, unexplained roll-outs.

Greenfield’s DeepOps for Business Checklist:

Automating Documentation:

  • Code
  • Params
  • Results
  • Compare

Data:

  • Version Data
  • Query Data
  • Stream Data

Actions:

  • One-click launch
  • Job queue
  • Cost/speed knobs
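The three actions in the checklist can be sketched together as a single launch function that drains a queue of experiment jobs. This is a toy illustration with assumed names, not a real scheduler: `launch` and `fake_train` are hypothetical, threads stand in for machines, and the `workers` argument plays the role of a cost/speed knob.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def launch(train: Callable[[Dict], Dict], jobs: List[Dict], workers: int = 1) -> List[Dict]:
    """Run every queued job and return the results in submission order.

    `workers` is the cost/speed knob: more workers finish the queue sooner
    but occupy more machines (threads stand in for machines here).
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(train, jobs))


def fake_train(params: Dict) -> Dict:
    # Hypothetical stand-in for a real training run.
    return {"params": params, "score": 1.0 - params["lr"]}


if __name__ == "__main__":
    # One call launches the whole queue of experiments.
    queue = [{"lr": lr} for lr in (0.1, 0.01, 0.001)]
    for result in launch(fake_train, queue, workers=2):
        print(result)
```

Raising `workers` trades cost for speed without touching the experiment code, which is the point of keeping the knob separate from the jobs themselves.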

Checking off the items on this list should help small companies make the transition to an AI-first culture.

Original post here.

Translated from: https://medium.com/@ODSC/deepops-for-business-building-an-ai-first-company-b7d00f3d358b
