The Evolution of Amazon EC2 — How Nitro Changed Everything, and a Deep Dive into Instance Types


Introduction

For those new to EC2: it offers auto-scalable instances with compute, memory, storage and networking, deployable across multiple Availability Zones and Regions, targetable by a load balancer, and manageable with tools such as AWS Systems Manager and AWS License Manager. EC2 instances are available under multiple purchase options such as Spot, On-Demand, Reserved Instances and Savings Plans. It offers a broad choice of processors, including Intel, AMD and Amazon’s own ARM-based Graviton processors. EC2 also makes Application-Specific Integrated Circuits (ASICs) and Field-Programmable Gate Arrays (FPGAs) available.

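To make the basics concrete, here is a minimal sketch of launching an instance with boto3, Python's AWS SDK. The AMI ID, key pair and subnet are hypothetical placeholders, and the commented-out InstanceMarketOptions block shows how the same call can request Spot capacity instead of On-Demand.

```python
# Minimal sketch: launch one EC2 instance with boto3 (illustrative placeholder values).
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",      # hypothetical AMI ID; use one valid in your account/region
    InstanceType="m5.xlarge",             # a general-purpose instance type (see the deep dive below)
    MinCount=1,
    MaxCount=1,
    KeyName="my-key-pair",                # hypothetical key pair name
    SubnetId="subnet-0123456789abcdef0",  # hypothetical subnet, i.e. one Availability Zone
    # Uncomment to request Spot capacity instead of On-Demand:
    # InstanceMarketOptions={"MarketType": "spot"},
)
print(response["Instances"][0]["InstanceId"])
```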

For those new to terms like ASIC and FPGA: they provide an alternative way to compute, distinct from CPUs and GPUs. Unlike CPUs and GPUs, which have a predefined instruction set and are optimized for general-purpose and parallel computing respectively, an ASIC is dedicated hardware, custom designed and optimized for one specific function (for example, bitcoin mining), whereas an FPGA is, as the name suggests, an array of programmable digital logic cells whose hardware can be reprogrammed and repurposed for different workloads. Neither ASICs nor FPGAs have a predefined instruction set.


历史很短 (A very short history)

Amazon launched EC2 with one instance type, m1, in the year 2006. This machine offered 1.7 GHz of CPU, 1.75 GB of RAM, 160 GB of disk and 250 Mbps of network bandwidth. This has evolved to 300+ instance types as of July 2020.


EC2 currently offers instances with up to 4.0 GHz of CPU clock speed (z1d), 24,576 GB / 24 TB of RAM (u-24tb1.metal), 48 TB of disk (d2.8xlarge) and 100 Gbps of network bandwidth (High Memory instances). Nitro turbo-charged this evolution in 2017, at which time “only” 42 instance types were available. It enabled new CPU architectures (ARM, AMD), bare-metal offerings, 100 Gbps networking, EFA and more. So what exactly is Nitro?


Nitro — in plain English

Nitro refers to a whole fleet of changes, in both hardware and software, made to improve two important aspects of the infrastructure: performance and security. The basic idea is to offload functions that are generally performed in the hypervisor stack to separate, dedicated hardware/software components. Before Nitro, networking, storage, security and so on were part of the hypervisor stack and consumed roughly 30% of each host's resources, capacity that could not be used by customer instances.


With the Nitro architecture, these components are moved out of the hypervisor stack, allowing for better resource utilization and performance as well as more tightly controlled security.



Taking the hypervisor out enables bare-metal instances, which was an important step, especially for bringing VMware Cloud on AWS to the platform. A separate, purpose-built hypervisor was also written to replace the existing one.



Components

Architecturally, Nitro includes three main components:


  • Nitro Security Chip: A hardware component that provides the hardware root of trust (it contains the keys used for cryptographic functions and enables a secure boot process). It is a custom microcontroller that traps all I/O to non-volatile storage, making sure no unauthenticated writes reach the flash storage.


  • Nitro Cards: Hardware cards for specific functions such as networking and storage. Key cards include the Nitro Card for VPC, the Nitro Card for EBS, the Nitro Card for Instance Storage, the Nitro Card Controller and the Nitro Security Chip.


  • Nitro Hypervisor: A stripped-down, lightweight, KVM-based hypervisor with very specific functionality and a small user space. It manages CPU and memory allocation and, because of its lightweight nature, provides very high performance.


There is a separate Nitro Card Controller that coordinates all of these components. Together, these pieces enabled the much faster pace of innovation and the new features that have arrived in EC2 over the last few years.

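You can see which virtualization stack a given instance type uses directly from the EC2 API: DescribeInstanceTypes reports a Hypervisor field ('nitro' or 'xen') and a BareMetal flag. A small boto3 sketch (the instance type names are just examples):

```python
# Sketch: check whether instance types run on the Nitro system or the older Xen-based stack.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

resp = ec2.describe_instance_types(InstanceTypes=["m5.large", "m4.large", "m5.metal"])
for it in resp["InstanceTypes"]:
    print(
        it["InstanceType"],
        "| hypervisor:", it.get("Hypervisor", "none (bare metal)"),
        "| bare metal:", it["BareMetal"],
    )
# Expected: m5.large -> nitro, m4.large -> xen, m5.metal -> bare metal (no hypervisor).
```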

EC2 Instances — Deep Dive

Now that we have covered the infrastructure that EC2 instances run on, let's do a deep dive into the EC2 instances available today and how they map to your workloads.


Instance characteristics

Instances have five main characteristics — type, family, generation, size and additional capabilities. For example, M5a.xlarge is an instance type, where M is the instance family, 5 stands for the generation, a denotes an “additional capability” (more on this next) and xlarge stands for the size. A small parsing sketch follows the capability list below.


Additional capabilities


  1. a → uses an AMD processor
  2. g → uses an AWS Graviton processor
  3. n → improved network throughput and packet rate performance
  4. d → directly attached instance storage (NVMe)
  5. e → extra capacity, be it memory or storage
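To make the naming scheme concrete, here is a small, purely illustrative Python sketch (the regular expression and helper name are my own, not part of any AWS API) that splits an instance type string into family, generation, additional-capability letters and size:

```python
# Illustrative parser for EC2 instance type names such as "m5a.xlarge" or "c5n.4xlarge".
import re

CAPABILITIES = {
    "a": "AMD processor",
    "g": "AWS Graviton processor",
    "n": "improved network throughput / packet rate",
    "d": "directly attached NVMe instance storage",
    "e": "extra memory or storage",
}

def parse_instance_type(name: str) -> dict:
    """Split an instance type string into family, generation, capabilities and size."""
    base, size = name.lower().split(".")
    match = re.fullmatch(r"([a-z]+?)(\d+)([a-z]*)", base)
    if not match:
        raise ValueError(f"unrecognized instance type: {name}")
    family, generation, caps = match.groups()
    return {
        "family": family,
        "generation": int(generation),
        "capabilities": [CAPABILITIES.get(c, c) for c in caps],
        "size": size,
    }

print(parse_instance_type("m5a.xlarge"))
# {'family': 'm', 'generation': 5, 'capabilities': ['AMD processor'], 'size': 'xlarge'}
```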

Workload types

  • General-purpose workloads — Scale-out workloads such as web servers, containerized microservices, caching fleets, and distributed data stores, as well as development environments etc.


  • Compute-intensive workloads — High performance computing (HPC), batch processing, ad serving, video encoding, gaming, scientific modelling, distributed analytics, and CPU-based machine learning inference etc.


  • High-performance compute workloads — Machine/Deep learning, high performance computing, computational fluid dynamics, computational finance, seismic analysis, speech recognition, autonomous vehicles, drug discovery etc.


  • Memory-intensive workloads — Open-source databases, in-memory caches, and real time big data analytics, In-memory databases etc.


  • Storage-intensive workloads — NoSQL databases, Elasticsearch, analytics workloads, MapReduce and Hadoop distributed computing, distributed file systems, network file systems, and log or data-processing applications.

Workload to instance type mapping

General-purpose workloads


  • Instance family M (M5, M5a, M5n, M6g) — M stands for “Most Scenarios” (not official, just a mnemonic). All M-family instances provide a balance of compute, memory and network resources, with a 4:1 memory-to-vCPU ratio. M5a uses AMD processors, M5n provides better networking, and M6g uses Graviton processors and only supports EBS for storage (no NVMe instance storage). These machines are great for microservices and for running backend servers for SAP, Microsoft SharePoint and the like.


  • Instance family T (T2, T3, T3a) — T stands for Turbo (again a mnemonic, like all the others). These are used for workloads that have bursts of load but are low-load most of the time: they provide a baseline level of CPU performance with the ability to burst above that baseline. A T instance's baseline performance and ability to burst are governed by CPU credits. Each T instance receives CPU credits continuously, at a rate that depends on the instance size; it accrues credits while idle and consumes them while running above the baseline. A CPU credit provides the performance of a full CPU core for one minute (see the burst-duration sketch after this list). T3 and T3a only support EBS for instance storage and have better network and EBS burst performance than T2. Memory-to-vCPU ratio: variable. These machines are great for microservices, low-latency interactive applications, development environments and so on.


  • Instance family A (A1) — A stands for ARM. These provide significant cost savings and are ideally suited for scale-out workloads. They allow up to 45% cost savings and are powered by AWS Graviton processors featuring 64-bit Arm Neoverse cores and custom silicon designed by AWS. Memory-to-vCPU ratio: 2:1. These machines are great for web servers, containerized microservices, development environments and so on.

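Returning to the T family's CPU credits, here is a rough back-of-the-envelope sketch (my own arithmetic, not an AWS tool) that estimates how long an instance can sustain a burst from its current credit balance. One CPU credit buys one vCPU at 100% utilization for one minute, credits are earned at a per-size rate published by AWS, and bursting drains the balance at the rate by which spending exceeds earning.

```python
# Rough estimate of T-instance burst headroom from CPU credits (illustrative numbers).
def burst_minutes(credit_balance: float,
                  vcpus: int,
                  credits_earned_per_hour: float,
                  burst_utilization: float = 1.0) -> float:
    """Minutes the instance can run all vCPUs at `burst_utilization` before credits run out.

    One CPU credit = one vCPU at 100% utilization for one minute.
    """
    spend_per_minute = vcpus * burst_utilization      # credits consumed per minute of bursting
    earn_per_minute = credits_earned_per_hour / 60.0  # credits earned per minute
    net_drain = spend_per_minute - earn_per_minute
    if net_drain <= 0:
        return float("inf")  # earning at least as fast as spending: no practical limit
    return credit_balance / net_drain

# Example: a t3.micro (2 vCPUs, earning 12 credits per hour per the AWS docs)
# that has accrued 144 credits and now runs both vCPUs at 100%:
print(round(burst_minutes(credit_balance=144, vcpus=2, credits_earned_per_hour=12)))  # ~80 minutes
```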

Compute-intensive workloads


  • Instance family C (C4, C5, C5a, C5n, C6g) — C stands for Compute. These instances provide high performance at a low price per vCPU, with a 2:1 memory-to-vCPU ratio, except for C4 at 1.875:1 and C5n at 2.625:1.


  • Instance family Z (Z1d) — These provide very high single-thread performance and are perfect for workloads with high per-core licensing costs. They run the fastest processor in the cloud at 4.0 GHz and have an 8:1 memory-to-vCPU ratio.


Memory-intensive workloads


  • Instance family R (R4, R5, R5a, R5n, R6g) — R stands for RAM. They provide an 8:1 memory-to-vCPU ratio, except for R4 at 7.625:1. You can get up to 488 GB of RAM with R4 instances, up to 768 GB with R5 instances and up to 512 GB with R6g instances. These machines are great for open-source databases, in-memory caches and real-time big data analytics.


  • Instance family X (X1, X1e) — X stands for eXtra memory. They are used for very large in-memory workloads and provide roughly 15.25:1 (X1) and 30.5:1 (X1e) memory-to-vCPU ratios. You can get close to 2 TB of memory with X1 and close to 4 TB with X1e.


  • Instance family High Memory — This is where you get your extreme needs met: for example, between 6 TB and 24 TB of memory and 100 Gbps of network throughput. Memory-to-vCPU ratio: variable.


Instance family X and High Memory are perfect for large in-memory databases like SAP HANA.

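The memory-to-vCPU ratios quoted above can be derived directly from DescribeInstanceTypes, which reports DefaultVCpus and memory in MiB for every instance type. A small boto3 sketch (the current-generation filter and the 8:1 threshold are just example choices) that shortlists memory-heavy types:

```python
# Sketch: compute memory-to-vCPU ratios and shortlist memory-heavy instance types.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

memory_heavy = []
paginator = ec2.get_paginator("describe_instance_types")
for page in paginator.paginate(Filters=[{"Name": "current-generation", "Values": ["true"]}]):
    for it in page["InstanceTypes"]:
        vcpus = it["VCpuInfo"]["DefaultVCpus"]
        mem_gib = it["MemoryInfo"]["SizeInMiB"] / 1024
        ratio = mem_gib / vcpus
        if ratio >= 8:  # roughly the R, X and High Memory territory
            memory_heavy.append((it["InstanceType"], vcpus, mem_gib, round(ratio, 2)))

# Print the ten highest ratios found in this region.
for name, vcpus, mem_gib, ratio in sorted(memory_heavy, key=lambda t: -t[3])[:10]:
    print(f"{name}: {vcpus} vCPUs, {mem_gib:.0f} GiB, {ratio}:1")
```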

High-Performance compute workloads (Accelerated Computing)


  • Instance family P (P2, P3) — P stands for Performance. These are GPU compute instances featuring high-end NVIDIA GPUs, including the Volta V100. P3, being the newer generation, has better features, including 100 Gbps network throughput. P2 has a 15.25:1 memory-to-vCPU ratio while P3 uses 7.625:1.


  • Instance family G (G3, G4) — G stands for Graphics. These are GPU graphics instances featuring mid-range NVIDIA GPUs, such as the Turing T4, with GRID virtual workstation features and licensing. These machines are great for video encoding, 3D modelling and rendering, AR/VR and so on. G3 has a 7.625:1 memory-to-vCPU ratio while G4 uses 4:1.


  • Instance family Inf (Inf1) — Inf stands for Inference. These offer high-performance, low-cost machine learning inference in the cloud, providing up to 40% lower cost per inference than comparable EC2 GPU instances while delivering up to 2x higher throughput. Memory-to-vCPU ratio: 2:1.


  • Instance family F (F1) — F stands for FPGA. These offer customer-programmable FPGAs that provide dramatic performance improvements over general-purpose compute for specific workloads like genomics and image processing. They are programmable via OpenCL, VHDL and Verilog. Memory-to-vCPU ratio: 15.25:1.


Storage-intensive workloads


  • Instance family I (I3, I3en) — I stands for I/O (high-performance databases, real-time analytics). Here the I/O is optimized for low-latency, high-transaction workloads. I3 provides high IOPS at a low cost, while I3en offers the lowest price per GB of SSD instance storage on Amazon EC2. I3 has a 7.625:1 memory-to-vCPU ratio while I3en uses 8:1. Both use NVMe SSD instance storage (see the storage sketch after this list).


  • Instance family D (D2) — D stands for Dense storage (HDFS, batch data processing). These instances are great for high sequential disk throughput and offer the lowest price per unit of disk throughput on Amazon EC2. D2 has a 7.625:1 memory-to-vCPU ratio, uses HDD storage and features up to 48 TB of disk.


  • Instance family H (H1) — H stands for HDD (local). They are designed for applications that require low-cost, high disk throughput and high sequential disk I/O access to very large datasets. They have a 4:1 memory-to-vCPU ratio, feature up to 16 TB of HDD-based local storage, and provide more memory and vCPUs per TB of disk than D2.

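The instance-store characteristics described above (SSD versus HDD, disk count and total capacity) are also queryable: DescribeInstanceTypes exposes an InstanceStorageInfo block for instance types that have local disks. A brief boto3 sketch (the instance type names are just examples):

```python
# Sketch: inspect local instance storage (size and disk type) for storage-oriented families.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

resp = ec2.describe_instance_types(InstanceTypes=["i3.large", "i3en.24xlarge", "d2.8xlarge"])
for it in resp["InstanceTypes"]:
    storage = it.get("InstanceStorageInfo")  # absent for EBS-only instance types
    if not storage:
        print(it["InstanceType"], "EBS-only")
        continue
    disks = storage["Disks"][0]
    print(
        it["InstanceType"],
        f"{storage['TotalSizeInGB']} GB total,",
        f"{disks['Count']} x {disks['SizeInGB']} GB {disks['Type'].upper()}",
    )
```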

Summary

Nitro turned out to be a game changer for Amazon, while also providing cost benefits to the customer. One important EC2 topic missing from this article is the mapping of the different purchase options to your workload, for example Spot instances for stateless, interruptible or non-mission-critical workloads. Getting a good grasp of this is highly recommended before you begin your EC2 journey. Happy Learning!


Translated from: https://levelup.gitconnected.com/amazon-ec2-evolution-how-nitro-changed-everything-and-instance-type-deep-dive-b5ff5948265d
