Fugue Computing: Next-Generation Infrastructure Automation Is Here

Latest recommended article published on 2024-09-20 16:10:39

cumei1658

Tags: distributed, big data, programming languages, databases, python

Original article: https://www.pybloggers.com/2016/08/fugue-computing-next-generation-infrastructure-automation-is-here/

As we migrate applications to the cloud or build there natively, cloud computing itself is changing how we compose and operate our systems. We increasingly compose systems of elastic collections of services running on many compute instances. We now commonly employ application statelessness in order to exploit cloud system elasticity and to achieve the performance required of web scale systems. As we make these changes, we discover that systems management, operations, policy enforcement, and security in the cloud cannot be accomplished easily with tools and methods adapted from traditional data center environments.

Our reality is that the elastic compute systems of any given enterprise are now distributed across tens, hundreds, thousands or more nodes running an ever-growing array of cloud services, but there is no central coordinating function to act as a nexus for control and trust. We have exploded the complexity and mutability of systems in the cloud, without simultaneously advancing the framework for controlling even relatively simple, non-distributed environments. Consequently, it’s become difficult to trust our systems.

In the midst of this unwieldy reality is an even more compelling reality—that the cloud is not, in fact, merely a collection of infrastructure. It’s the world’s first global computer. And, just as we abstracted the hardware of individual computers decades ago, we can abstract the distributed hardware of the cloud and radically simplify operations complexity. We can do this to great advantage, so long as we maintain the ability to dive into the guts of the lower-level system directly when needed.

Computing Truth and Trust Live in an Infrastructure-level Operating System

Computing “truth” means always and centrally knowing the state of systems infrastructure. “Trust” means having confidence that systems are functioning as intended, reliably and repeatedly.

Today, most approaches to building and operating distributed cloud computing systems don’t fully exploit the benefits of elastic infrastructure; they are passive and mimic batch processing on a single CPU. One reason for this is that, in the cloud, developers can’t realistically run scripts to build infrastructure and expect that system truth will endure or that the resulting application tiers can be trusted. Infrastructure immediately begins to drift from its initial configuration. The accumulation of well-intended administrative interventions introduces unintended consequences and failure modes. The bag-of-scripts + dashboard approach is ineffective and causes more problems than it solves.

We need the equivalent of an operating system for distributed computing instances and components in cloud infrastructures. It needs to have the capability to be invoked automatically and to operate autonomically, so that we have much better capacity to know truth. But we also need trust, which can only happen when these distributed systems are maintained over time in much the same way that an operating system maintains the distributed resources of an individual computer. Trust is established and maintained through continuously ensuring that individual components have not been repurposed by mistake or ill intention.

Knowing truth about a distributed system is difficult, in large part because we mistakenly treat truth and trust as separable. The only way to achieve either is to achieve both. Computer operating systems exist to provide an integrated but simple solution that controls and reports, establishing and consistently maintaining known state. Distributed computing systems in the cloud need something similar, and this is the motivation for Fugue.

Cloud Services Are Hardware Under APIs

In the mental model of the cloud as a distributed, general purpose computer, each cloud service is akin to a hardware interface in traditional computing. Just as those were abstracted and managed by an operating system and programming language, so cloud services can be abstracted and managed.

For example, networking in the cloud is usually handled by a collection of software-defined network (SDN) services, such as virtual networks, inbound and outbound port and protocol rules, and load balancers. This is a familiar set of virtualized services, but it’s ripe for being abstracted into much simpler use patterns, as we are no longer constrained by the limitations of hardware. When composing an application of services, we should just be able to say in our programming language that one service “talks-to” another service, and have the cloud operating system create and enforce the appropriate connection, rather than having to configure several appliances to line up correctly to allow the connection. On the other hand, it’s important to have the ability to get beneath the abstraction layer when you want more control over the details.
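
To make the “talks-to” idea concrete, here is a minimal Python sketch. It is not Ludwig and not Fugue's API: the names Service, IngressRule, and talks_to are hypothetical, chosen only to show how one high-level declaration could expand into the low-level port and protocol rule it implies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    """A named service listening on one port (hypothetical model)."""
    name: str
    port: int


@dataclass(frozen=True)
class IngressRule:
    """One low-level allow rule, similar in spirit to a firewall entry."""
    source: str
    destination: str
    protocol: str
    port: int


def talks_to(client: Service, server: Service, protocol: str = "tcp") -> list[IngressRule]:
    """Expand a high-level 'client talks to server' declaration into rules."""
    return [IngressRule(client.name, server.name, protocol, server.port)]


web = Service("web", 443)
db = Service("db", 5432)

# One declaration replaces hand-aligned rule entries on several appliances.
rules = talks_to(web, db)
for rule in rules:
    print(rule)
```

The low-level rule objects stay visible to the caller, which mirrors the point above about being able to get beneath the abstraction when more control is needed.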

Just as we used to spend much of our time configuring hardware and writing software directly against it, we now do the same with the cloud’s hardware equivalent. Instead, we should be writing simple, enforceable programs that leave the low-level details aside unless they are important in a particular use case.

Browsing through our website, particularly the Product page, you’ll notice that the three main components of Fugue are the Ludwig language, the CLI, and the Conductor. Your Ludwig program, in the form of a Fugue composition, is run by the Conductor, which is roughly similar to an operating system kernel that runs inside your Amazon Web Services (AWS) account. The Conductor handles all the AWS API interactions. It provisions, instantiates, maintains, and destroys the resources needed for your program.

Fugue as a Kernel-based OS

So, you can think of Fugue’s Conductor as being much like a single machine’s operating system kernel which provides resources and manages those resources for an application. But, the Conductor is doing this at the cloud infrastructure level, creating Fugue processes (analogous to Unix processes), managing them, and destroying them. There’s a clear line between user space, in which Fugue processes run, and kernel space, where the control over those processes is held.

Just as with a traditional kernel operating system, kernel space is considered highly dangerous and hands-off for most direct interactions. Thus, you can limit access privileges for users of the system to a minimal set needed to read information from the cloud account and to send messages to the Conductor. This is a best practice when using Fugue, so that the system is safe and remains in a known-good state. Depending on the configuration of the Conductor, Fugue is designed to correct manual modifications of the system as it notices them. So, if you find yourself, say, in the AWS console, modifying an infrastructure component that is running in Fugue, but it keeps changing back to the declaration of the process you made in a Fugue composition, you’re seeing enforcement—a key Fugue pattern—in action.
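
The enforcement pattern can be pictured as a reconciliation step that compares declared state with observed state and reverts any difference. The sketch below is an illustration under assumptions, not the Conductor's implementation: plan_corrections, enforce, and the dictionary layout are hypothetical, and the in-memory dictionaries stand in for a real cloud account.

```python
def plan_corrections(declared: dict, observed: dict) -> list[tuple]:
    """Return the actions needed to bring observed state back to declared state."""
    actions = []
    for name, want in declared.items():
        have = observed.get(name)
        if have is None:
            # The resource was removed out of band: recreate it.
            actions.append(("create", name, want))
        elif have != want:
            # The resource was modified out of band: revert it.
            actions.append(("update", name, want))
    return actions


def enforce(declared: dict, observed: dict) -> dict:
    """Apply the planned corrections to the observed state (in memory only)."""
    for _op, name, want in plan_corrections(declared, observed):
        observed[name] = dict(want)
    return observed


# Declared in the composition: the web security group allows port 443.
declared = {"web-sg": {"ingress_port": 443}}
# Someone changed the rule by hand in the console.
observed = {"web-sg": {"ingress_port": 22}}

print(plan_corrections(declared, observed))
print(enforce(declared, observed))
```

Running such a step repeatedly is what makes a manual console change appear to "keep changing back".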

Fugue as a Language-based OS

Fugue tackles the complexity of cloud services proliferation and the constant state of change in the cloud by reducing the vast majority of cloud concepts to Ludwig language types that are handled by planners.

Planners handle the semantics of interacting with the cloud service provider, and the Ludwig library and compiler allow higher-order functions to abstract away the complexity. In any given Fugue composition, many Ludwig types will be used, and each of these types maps to a particular planner for interpretation and operation. This allows us to run a composition through the planner pipeline until every symbol is resolved into an API call or datum. Other than shell commands and integrations, all aspects of Fugue are language-based. This means that every aspect of the system is programmable by you and that we can reduce a truly complex environment to simple declarations. Over time, there will be many more planners available, along with Ludwig libraries to use them.
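
As a rough illustration of the type-to-planner mapping, the Python sketch below dispatches each value in a composition to a planner chosen by its type and collects the resulting API calls. Everything here is assumed for illustration: the types Network and Instance, the PLANNERS registry, and the action names create_network and run_instance are hypothetical and do not correspond to Ludwig types, Fugue planners, or real AWS API operations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Network:
    cidr: str


@dataclass(frozen=True)
class Instance:
    image: str
    network: Network


@dataclass
class ApiCall:
    """A fully resolved request, ready to send to a cloud provider."""
    action: str
    params: dict


def plan_network(net: Network) -> list[ApiCall]:
    return [ApiCall("create_network", {"cidr": net.cidr})]


def plan_instance(inst: Instance) -> list[ApiCall]:
    # An instance depends on its network, so plan the network first.
    return plan_network(inst.network) + [ApiCall("run_instance", {"image": inst.image})]


# Each type maps to exactly one planner.
PLANNERS = {Network: plan_network, Instance: plan_instance}


def resolve(composition: list) -> list[ApiCall]:
    """Run every value through its planner until only API calls remain."""
    calls: list[ApiCall] = []
    for value in composition:
        for call in PLANNERS[type(value)](value):
            if call not in calls:  # skip calls already planned for a shared resource
                calls.append(call)
    return calls


net = Network("10.0.0.0/16")
composition = [net, Instance("base-image", net)]
for call in resolve(composition):
    print(call)
```

In this sketch, adding support for a new kind of resource means adding one type and one planner entry, which is the same shape as the extension model described above.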

A true language-based operating system is made of the language it presents to the user and so provides complete access to the operating system itself through the language. This is not true of Fugue in that the planners are written in other languages. We’ve tried to draw the line between that which is expressed in Ludwig and that which is hard-coded into the planners in the right place for maximum ease of use and also user accessibility. Planners have runtime responsibilities, but they also have language interpretation responsibilities.

Any new feature of Fugue, integrations with other products, and new cloud services are first represented as Ludwig language constructs that are user-friendly and complete. These are generally new types in Ludwig, and it’s through them that Fugue determines which planners are needed at runtime. This focus on language clarifies a very murky and potentially complex problem space and allows us to impose some degree of safety and control over the cloud.

There’s More Work To Do. Let’s Do It Together.

Cloud computing can be efficient without an operating system over it, but that efficiency is very hard to achieve and generally must be reinvented by each customer. Because the cloud is so complex, and growing more complex all the time, it’s critical to have a single interface by which systems can be defined, updated, and operated over time, by which your cloud lifecycle can be programmed and automated, and by which your cloud ops can be transparent.

Fugue is a higher fidelity, easy-to-use, and powerful operating system for the cloud that delivers on these promises. This is next generation infrastructure automation that integrates well with your existing workflows.

We’ve explained a bit of our thinking behind Fugue and hope that it’s made you curious. Register for our Webinar and schedule a Demo to learn how Fugue can help your DevOps productivity and help your enterprise reduce complexity and cost.
