The Definitive Guide to Cloud Cost Optimization with Terraform

This article was cross-published on the HashiCorp Blog.

The Problem: An Engineer’s New Role as Cloud “Financial Controller”

If you’re reading this, chances are you work in DevOps (or some other type of Engineering) and you are wondering, “Why on earth should I care about Cloud Cost Optimization? That’s not my job. I’m not in Finance, right?” Wrong!

Engineers are the new cloud Financial Controllers. If you are interested in defining this new role, automating your newfound responsibility, and implementing a process for Cloud Cost Optimization with Terraform Cloud for Business & Enterprise, read on. Cloud Cost Optimization is important, and in this article we will address it fully in the context of an overall model of Cloud Cost Management (also known as FinOps).

Please note that most of the features reviewed in this article are paid Terraform functionality, such as Cost Estimation and Governance & Policy, but the core cost optimization use case can be achieved with open source.

The New Cloud Financial Model

With the continuing shift to consumption-based cost models for infrastructure and operations (i.e., Cloud Service Providers, or CSPs), you pay for what you use, but you also pay for what you provision and don’t use. If you do not have a process for continuous governance and optimization, there is potential for waste.

In a recent survey, respondents stated:

  • 45% of the organizations reporting were over budget on their cloud spending.
  • More than 55% of the respondents either use cumbersome manual processes or simply do not implement actions and changes to optimize their cloud resources.
  • 34.15% of respondents believe they can save up to 25% of their cloud spend, and 14.51% believe they can save up to 50%. Even worse, 27.46% said, “I don’t know.”

First, let’s unpack why there is an opportunity and then get to the execution

In moving to the cloud, most organizations have put thought into basic governance models in which a team, sometimes referred to as the Cloud Center of Excellence, looks after things like strategy, architecture, operations, and, yes, cost. Most of these teams combine IT Management and Cloud technical specialists from common IT domains with Finance. Finance is primarily charged with cost planning, migration financial forecasting, and optimization. Due to financial pressures, Finance tends to say, “We need to get a handle on costs, savings, forecasting, etc.,” but has no direct control over costs. It is now Engineers who directly manage infrastructure and its costs.

The business case is simple. It is a financial paradigm shift in which:

  • Engineers are responsible not only for Operations but now also for Costs.
  • Engineers now have the tools and capabilities to automate and directly impact cost controls.
  • Cost planning and estimation for running cloud workloads are not easily understood or forecasted by Finance.
  • Traditional forms of financial budgeting and on-prem hardware demand planning (such as contract-based budgets and capitalized purchases) do not account for cost variability in consumption-based models.

Finance lacks control in the two primary areas of cost savings:

  • Pre-provisioning: Limited governance and control in the resource provisioning phase.
  • Post-provisioning: Limited governance and control in enforcing infrastructure changes for cost savings.

In the rest of this article, we will define the people, processes, and technologies associated with managing cloud financial practices with Terraform.

The People

To simplify things, we will assume there is some sort of team, i.e., the Cloud Center of Excellence, that is responsible for managing the overall cloud posture.

On this team, there are four core roles:

  • IT Management
  • Finance
  • Engineering (consisting of DevOps and Infrastructure & Operations)
  • Security

In managing the “Cost of Cloud”, we will view Engineering’s role in cost management in the following context:

  • Planning: Pre-Cloud Migration & Ongoing Cost Forecasting
  • Optimizing: Operationalizing and Realizing Continuous Cost Savings
  • Governance: Ensuring Future Cost Savings & Waste Avoidance

The following RASCI model can be used as a baseline of expectations for your team. Moving forward, we will focus on Engineering’s “Responsible” role in these three primary areas.

[Image: Cloud Center of Excellence RASCI Model]

As you can see, Engineering has a higher level of responsibility in today’s infrastructure operations. I have used this and similar models to define the roles and responsibilities of the Cloud Center of Excellence for many organizations. The RASCI model is effective, but make sure to also account for Frequency and Workflow, as these will differ from current IT cost models, and you will want to set expectations accordingly.

The Process — Planning, Optimization, & Governance

Now we are going to take a look at how Engineers can use Terraform at each level of the Cloud Cost Management process to deliver value and minimize additional work. To get started, below is a visualization of how Terraform fits into the Cloud Cost Management Lifecycle.

[Image: Cloud Cost Management Lifecycle Terraform]

The overall process can be summed up as:

  • Start by identifying workloads migrating to the cloud
  • Create the Terraform configuration
  • Run terraform plan to perform cost estimation
  • Run terraform apply to provision the resources
  • Once provisioned, workloads will run and Vendors will provide Optimization Recommendations
  • Integrate the Vendor’s Optimization Recommendations into Terraform and/or the CI/CD pipeline
  • Investigate/analyze the Optimization Recommendations and implement Terraform Sentinel for Cost & Security Controls
  • Update the Terraform configuration and run plan & apply
  • Newly optimized and compliant resources are now provisioned

Section 1 — Planning — Pre-Migration & Ongoing Cost Forecasting

Cloud migrations require a multi-point assessment to determine the potential to move an application/workload to the cloud. Primary factors for the assessment are architecture, business case, the estimated cost for the move, and the ongoing utilization costs budgeted/forecasted for the next 1–3 years on average.

Old models of capitalization and amortization for application/workload costing done by Finance are a thing of the past, and Engineering is now responsible for managing operational costs. With Terraform’s Cost Estimation functionality, users can communicate expected costs more clearly.

Using Terraform configuration files as a standard definition of how an application/workload is costed, you can now use the Terraform Cloud & Enterprise APIs to automatically supply Finance with estimated cloud financial data, or use Terraform’s user interface to give Finance direct access to review costs and, by doing so, eliminate manual engineering oversight.

Planning Recommendations:

  • Use Terraform configuration files as the standard definition of costing across AWS, Azure & GCP for cloud cost planning and forecasting, and provide this information via the Terraform API or role-based access controls within the Terraform user interface to give Finance staff a self-service workflow.
  • Note: Many organizations conduct planning within Excel, Google Sheets, and Web-based tools. To make data usable within these systems, we recommend using Terraform’s Cost Estimates API to extract the data.
  • Use Terraform Modules as standard units of defined infrastructure for high-level cost assessments and cloud demand planning. For example, define a standard set of modules for a standard Java application so that module A + B + C = $X per month. If we plan to move 5 Java apps this year, this gives a quick way to assess potential application run costs before defining the actual Terraform configuration files.
  • Use Terraform to understand application/workload financial growth over time, i.e., cloud sprawl costs.
  • Attempt to structurally align Terraform Organization, Workspace, and Resource naming conventions with the financial budgeting/forecasting process.
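
The module-based estimate described in the recommendations above can be sketched directly in Terraform. This is a minimal sketch rather than an official costing method: the module names and dollar figures are hypothetical placeholders, and real figures would come from Cost Estimation runs of each module.

```hcl
locals {
  # Hypothetical monthly run cost (USD) of each standard module.
  # Replace these placeholder figures with values from your own estimates.
  java_app_module_costs = {
    network  = 40  # module A (assumed figure)
    compute  = 210 # module B (assumed figure)
    database = 150 # module C (assumed figure)
  }

  # Cost of one standard Java application: A + B + C.
  java_app_monthly_cost = sum(values(local.java_app_module_costs))

  # Number of Java applications planned for migration this year.
  planned_java_apps = 5

  # Rough demand-planning figure for the whole migration wave.
  planned_monthly_cost = local.java_app_monthly_cost * local.planned_java_apps
}

output "planned_monthly_cost_usd" {
  description = "Rough monthly run cost for the planned Java app migrations."
  value       = local.planned_monthly_cost
}
```

With the placeholder figures above, one application costs 400 per month and the five planned applications cost 2000 per month. Keeping the per-module figures in one map makes it easy to refresh the estimate when a module changes.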

Getting started with Terraform Cost Estimation is easy, and the basic steps for Terraform Cloud for Business & Enterprise are provided in our Learn Guide. Once it is enabled, each time a Terraform plan is run, Terraform will reach out to the AWS, Azure, and GCP cost estimation APIs and present the estimated cost for that plan, which can then be used within your financial workflow.

Example of Cost Estimation Output in Terraform

[Image: Terraform Cost Estimation]

Example of the Cost Estimation API JSON Payload from Terraform

[Image: Terraform Cost Estimation API JSON Payload]

It is important to note that Terraform Cost Estimation reports costs per workspace. If you would like a higher-level, cross-workspace view, you will need to use the Terraform Cost Estimation API and a reporting tool of your choice. To give it a try, there is a great little project named Tint from Peyton Casper, a HashiCorp Senior Solutions Engineer. The blog post to get started with Tint is “Multi-Cloud Cost Visualization for Terraform,” and the project is hosted on GitHub at peytoncasper/tint. In reality, any standard corporate reporting tool (e.g., Microsoft BI, Tableau, etc.) will work, depending on in-house requirements.

[Image: Example Dashboard from Tint]
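
As one possible way to pull a single estimate out of the Cost Estimation API for reporting, the sketch below reads it with Terraform’s http data source. It rests on several assumptions: the hashicorp/http provider at version 3.0 or later, an API token and a cost estimate ID supplied as variables (both variable names are hypothetical), and the attribute name proposed-monthly-cost as I understand the API payload. Verify these against the API documentation before relying on them.

```hcl
terraform {
  required_providers {
    http = {
      source  = "hashicorp/http"
      version = ">= 3.0"
    }
  }
}

variable "tfc_token" {
  description = "Terraform Cloud API token (hypothetical variable name)."
  type        = string
  sensitive   = true
}

variable "cost_estimate_id" {
  description = "ID of a finished cost estimate taken from a run (hypothetical input)."
  type        = string
}

# Fetch one cost estimate from the Terraform Cloud API.
data "http" "cost_estimate" {
  url = "https://app.terraform.io/api/v2/cost-estimates/${var.cost_estimate_id}"

  request_headers = {
    Authorization = "Bearer ${var.tfc_token}"
  }
}

locals {
  # Assumed JSON:API shape: { "data": { "attributes": { ... } } }
  estimate = jsondecode(data.http.cost_estimate.response_body).data.attributes
}

output "proposed_monthly_cost" {
  description = "Estimated monthly cost after the plan is applied."
  value       = local.estimate["proposed-monthly-cost"]
}
```

For a cross-workspace view, a reporting tool or script would repeat this kind of request for the latest run of each workspace and aggregate the results.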

Section 2 — Optimizing — Operationalizing and Realizing Continuous Cost Savings

Optimization is the continued practice of evaluating the cost-benefit ratio of usage vs. provisioned resources and then adjusting that ratio to be most advantageous to your organization.

That said, many organizations have access to optimization recommendations from Cloud Service Providers such as AWS, Azure, or GCP or popular third-party tools. The main challenge of using these optimization tools is that organizations are not properly taking advantage of the recommendations.

What we see is a disconnect from the Engineering/DevOps workflow (CI/CD pipeline), where Engineering does not engage with these optimization systems. As a result, there is no feedback mechanism and, when Engineering does engage, consuming the recommendations involves a high level of manual intervention.

Automating Optimization Insights into the Provisioning Workflow

It is safe to say that the major CSPs (AWS, Azure, GCP) and the vast majority of third-party tools let you export optimization recommendations via an API or an alternative method. For the purposes of this guide, we will focus on the basic steps for automating the ingestion of optimization recommendations, which come directly from the CSPs or from third parties such as Densify, which maintains a Terraform Module.

[Image: Densify EC2 Optimization Example]

The concepts and code can be used as a model for your own deployment. Please note that each Vendor provides a different set of recommendations, but universally all provide insights on compute, so we will focus on compute as a norm, but any insight that you receive can be consumed based on the pattern below (e.g. compute, storage, DB, etc.).

Basic patterns for consuming optimization recommendations:

Establish a mechanism for Terraform to access the optimization recommendations. We see several common patterns:

  • Manual Workflow: Review the optimization recommendations in the provider’s portal and manually update the Terraform files. Note: This is not optimal because there is no automation, but a feedback loop for optimization must start somewhere!
  • File Workflow: Create a mechanism where optimization recommendations are imported into a local repository via a scheduled process (usually daily).
      • For instance, Densify customers use a script to export recommendations into a densify.auto.tfvars file, which is downloaded and stored in a locally accessible repository.
      • The Terraform lookup function is then used to look up specific optimization updates that have been set as variables.
  • API Workflow: Create a mechanism for optimization recommendations to be extracted directly from the Vendor and stored within an accessible data repository, and use Terraform’s http data source to import and reference the dataset.
  • Ticketing Workflow: This workflow is similar to the File and API workflows, but some organizations insert an intermediary step in which the optimization recommendations first go to a change control system such as ServiceNow or Jira. These systems have built-in workflow and approval logic: a flag is set for an acceptable change and passed as a variable to be consumed later in the process.
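
The API Workflow above can be sketched with the http data source. This is only an illustration under stated assumptions: the endpoint URL variable and the payload shape are hypothetical, and the response_body attribute requires version 3.0 or later of the hashicorp/http provider.

```hcl
variable "recommendations_url" {
  description = "Hypothetical internal endpoint that serves recommendations as JSON."
  type        = string
}

# Read the recommendation dataset at plan time.
data "http" "recommendations" {
  url = var.recommendations_url

  request_headers = {
    Accept = "application/json"
  }
}

locals {
  # Assumed payload shape:
  # { "<system id>": { "recommendedType": "...", "approvalType": "..." }, ... }
  api_recommendations = jsondecode(data.http.recommendations.response_body)
}

output "systems_with_recommendations" {
  description = "IDs of the systems present in the recommendation dataset."
  value       = keys(local.api_recommendations)
}
```

Because the data is read on every plan, a changed recommendation shows up as a proposed change in the next run, which is the feedback loop this section is after.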

Terraform Code Update Examples

In any of these cases, especially if automation is to take place, it is important to maintain key pieces of resource data as variables. Optimization is a function of provisioned size and usage, and the optimization provider will recommend how to size the resource or service (e.g., Compute, DB, Storage) accordingly. We will use Compute as the example, but it is representative of all of them.

At a minimum, it is recommended that you set three variables to perform the optimization update in Terraform with some basic logic. Those variables and logic are:

[Image: Terraform Variable update recommendations]

As an example, we will use Densify as a vendor-supported optimization process, but many HashiCorp customers and users create their own Providers for similar processes. Densify’s Terraform Module can be found in the Terraform Registry, and the code is on GitHub under Densify-dev.

In the following, you will see some basic updates of Terraform code with variables and logic to get you started. Below is an example of the variables created.

# Variable declarations, written with unquoted type constraints (Terraform 0.12+ syntax).
variable "densify_recommendations" {
  description = "Map of maps generated from the Densify Terraform Forwarder. Contains all of the systems with the settings needed to provide details for tagging as Self-Aware and Self-Optimization"
  type        = map(map(string))
}

variable "densify_unique_id" {
  description = "Unique ID that both Terraform and Densify can use to track the systems."
  type        = string
}

variable "densify_fallback" {
  description = "Fallback map of settings that are used for new infrastructure or systems that are missing sizing details from Densify."
  type        = map(string)
}

Next, the Terraform lookup function is used to look up updates/changes in the local optimization recommendations file (i.e., densify.auto.tfvars). The optimization recommendations can also be delivered automatically by Densify using Webhooks and subscription APIs.

locals {
  # Start from the fallback settings, then let a Densify recommendation override them.
  temp_map     = merge(tomap({ (var.densify_unique_id) = var.densify_fallback }), var.densify_recommendations)
  densify_spec = local.temp_map[var.densify_unique_id]

  # Each lookup falls back to "na" when the key is missing.
  cur_type            = lookup(local.densify_spec, "currentType", "na")
  rec_type            = lookup(local.densify_spec, "recommendedType", "na")
  savings             = lookup(local.densify_spec, "savingsEstimate", "na")
  p_uptime            = lookup(local.densify_spec, "predictedUptime", "na")
  ri_cover            = lookup(local.densify_spec, "reservedInstanceCoverage", "na")
  appr_type           = lookup(local.densify_spec, "approvalType", "na")
  recommendation_type = lookup(local.densify_spec, "recommendationType", "na")
}
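
To make the lookups above concrete, here is what a recommendations file might contain. The key names are the ones read by the lookup calls; the system name and every value are made up for illustration and are not taken from Densify’s documentation.

```hcl
# densify.auto.tfvars (illustrative values only)

# Hypothetical system identifier shared by Terraform and the optimization tool.
densify_unique_id = "app-server-01"

# Used when the system has no entry in densify_recommendations.
densify_fallback = {
  currentType     = "m5.large"
  recommendedType = "m5.large"
}

densify_recommendations = {
  "app-server-01" = {
    currentType              = "m5.xlarge"
    recommendedType          = "m5.large"
    approvalType             = "all"
    recommendationType       = "Downsize" # assumed label
    savingsEstimate          = "52.30"    # assumed figure
    predictedUptime          = "98.5"     # assumed figure
    reservedInstanceCoverage = "0"        # assumed figure
  }
}
```

Terraform loads any file ending in .auto.tfvars automatically, which is why a scheduled export into this file is enough to feed new recommendations into the next plan.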

Lastly, you will want to insert some logic to ensure that you handle the recommendation properly: if a recommendation is available, use it; otherwise, keep the current value. Note: Densify also adds some code as part of a change control process for customers that use ServiceNow or Jira (but this can be any change control/ticketing system). These customers can first send the optimization recommendation to one of these external systems for approval and then pass an approval flag in as a variable to ensure that it is an approved change.

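# Use the recommended type only when a recommendation exists and is approved;
# otherwise keep the current instance type.
instance_type = (
  local.cur_type == "na" ? "na" :
  local.recommendation_type == "Terminate" ? local.cur_type :
  local.appr_type == "all" ? local.rec_type :
  local.appr_type == local.rec_type ? local.rec_type :
  local.cur_type
)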

For customers who do not use or do not want a third-party approval system, the recommended changes will be visible in the Terraform plan output. They can also manually update a variable, such as appr_type = false, to avoid using the recommendation, or use similar methods such as Feature Flags and conditional expressions in Terraform to control the applied functionality.
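
A feature flag of the kind mentioned above can be a single boolean variable combined with a conditional expression. This is a minimal sketch: the variable name and the two example instance types are hypothetical, and in practice the current and recommended types would come from the recommendation lookup shown earlier.

```hcl
variable "apply_recommendations" {
  description = "Hypothetical feature flag: set to false to keep the current instance type."
  type        = bool
  default     = true
}

locals {
  # Assumed inputs for illustration only.
  example_cur_type = "m5.xlarge"
  example_rec_type = "m5.large"

  # Use the recommendation only when the flag is on and a recommendation exists.
  effective_type = (
    var.apply_recommendations && local.example_rec_type != "na"
    ? local.example_rec_type
    : local.example_cur_type
  )
}

output "effective_instance_type" {
  description = "Instance type that would be provisioned."
  value       = local.effective_type
}
```

Running a plan with -var="apply_recommendations=false" keeps the current type, which gives operators a simple switch without touching the recommendation data.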

The important point is that we now have a defined process, which can be partially or fully automated, for changing our environment to optimize resources and save money.

Section 3 — Governance — Ensuring Future Cost Savings

The last, critical component of the Cloud Cost Management Lifecycle answers two questions: how do we stop cost overruns from happening again, and how do we ensure a continuous feedback loop for control? I have had this conversation with many organizations that completed optimization exercises only to see costs shoot back up, turning the effort into a game of waste and delayed recovery, so let’s focus on waste avoidance from the start.

So how do we do that with Terraform Cloud & Enterprise? With Sentinel, a product embedded within Terraform for governance & policy. The following steps assume that you will apply what you learned from the optimization recommendations when writing policy for cost control.

Cost Compliance as Code = Sentinel Policy as Code

Terraform Sentinel is a Policy as Code engine that evaluates the resources Terraform manages against policy definitions. Sentinel can be used to define policy on any and all data defined within a Terraform file. Common uses of Sentinel are to ensure that provisioned resources are secure, tagged, and within allowable usage policies and cost.

Focusing specifically on costs, Terraform customers implement policy around three primary areas (but there is no limit, so you can get creative).

Cost Control Areas:

  • Amount: Control the amount of spend
  • Provisioned size: Control the size/usage of the resource
  • Time to live: Control the time to live of the resource

In all three of these areas, you are able to apply policy or controls around things like Terraform Workspaces (e.g. apps/workloads), environments (e.g. prod, test, dev), and tags to ensure that spend and controls are aligned to optimize resources and avoid unnecessary spend.

The following is an example of Sentinel policy output when running a Terraform plan. We will focus on three policies:

  1. passed: aws-global/limit-cost-by-workspace-type
  2. advisory failed: aws-compute-nonprod/restrict-ec2-instance-type
  3. passed: aws-global/enforce-mandatory-tags

Note: Sentinel has three Enforcement Levels: Advisory, Soft-Mandatory, and Hard-Mandatory; please refer to the Sentinel documentation for definitions. The Enforcement Level dictates the workflow and resolution of policy violations. In addition, please note that Terraform Cloud for Business & Enterprise is fully API enabled, and users may interact with the Terraform “UI, CLI, or the API” to integrate fully with their CI/CD pipelines for policy workflow control and with VCS systems such as GitLab, GitHub, and Bitbucket for policy creation and management.

[Image: Terraform Policy Check]
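
For orientation, the three policies and their enforcement levels are typically declared together in a policy set’s sentinel.hcl file. The sketch below uses the policy names and enforcement levels from this article; the source paths are hypothetical and depend on how your policy repository is laid out.

```hcl
# sentinel.hcl (sketch; the source paths are hypothetical)

# A soft-mandatory failure stops the run unless an authorized user overrides it.
policy "limit-cost-by-workspace-type" {
  source            = "./limit-cost-by-workspace-type.sentinel"
  enforcement_level = "soft-mandatory"
}

# An advisory failure is logged as a warning and the run continues.
policy "restrict-ec2-instance-type" {
  source            = "./restrict-ec2-instance-type.sentinel"
  enforcement_level = "advisory"
}

# A hard-mandatory failure always stops the run.
policy "enforce-mandatory-tags" {
  source            = "./enforce-mandatory-tags.sentinel"
  enforcement_level = "hard-mandatory"
}
```

Starting a new policy at advisory and tightening it later is a common way to introduce a control without blocking teams on day one.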

Sentinel Cost Compliance Code Examples

In the aws-global/limit-cost-by-workspace-type policy defined for this Workspace (policies can be defined individually or globally), we have applied monthly limits on how much spend this Workspace can provision, along with an enforcement level. The policy snippet below defines a monthly cost limit (Dev = $200) and an enforcement level. Again, depending on the enforcement level, users will need to address the policy violation accordingly, but most important is that we have a mechanism to control costs before those resources are provisioned.

Sentinel Cost Compliance — Monthly Limits

##### Monthly Limits #####
limits = {
  "dev":   decimal.new(200),
  "qa":    decimal.new(500),
  "prod":  decimal.new(1000),
  "other": decimal.new(50),
}

# Policy set entry that sets the enforcement level for this policy
policy "limit-cost-by-workspace-type" {
  enforcement_level = "soft-mandatory"
}

Sentinel Cost Compliance — Instance Types

As a second example, for a multitude of reasons including compliance and cost, many customers restrict which compute instance types can be provisioned, and potentially set configuration limits based on environment or team. An example of the full policy is aws-compute-nonprod/restrict-ec2-instance-type. In the example below, a policy controls instance sizes in non-prod environments to keep costs lower in these areas, and we can apply a different policy to production if we so choose.

# Allowed EC2 Instance Types
# We don't include t2.medium or t2.large as they are not allowed in dev or test environments
allowed_types = [
  "t2.nano",
  "t2.micro",
  "t2.small",
]

# Policy set entry that sets the enforcement level for this policy
policy "restrict-ec2-instance-type" {
  enforcement_level = "advisory"
}
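
Sentinel is the enforcement mechanism described here. As a complementary, open-source-only sketch, a similar guardrail can be expressed inside a module with a variable validation block (Terraform 0.13 or later). The variable name is hypothetical, and unlike a Sentinel policy this check only protects configurations that actually use the variable.

```hcl
variable "nonprod_instance_type" {
  description = "EC2 instance type for dev/test workloads (hypothetical variable)."
  type        = string
  default     = "t2.micro"

  validation {
    # Mirror the allowed list from the policy above.
    condition     = contains(["t2.nano", "t2.micro", "t2.small"], var.nonprod_instance_type)
    error_message = "Non-prod instances must be t2.nano, t2.micro, or t2.small."
  }
}
```

A plan that passes a disallowed instance type to this variable fails during validation, before any cost is incurred.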

Sentinel Cost Compliance — Enforce Tagging

Lastly, tagging is a critical factor in understanding costs. Tagging enables you to group, analyze, and provide policy around cost optimization.

Terraform with Sentinel provides the capability to enforce tagging at provisioning time and during updates to ensure that optimization can be targeted and governed. Tags are managed in a simple key/value format, and tagging can be enforced across all CSPs. The snippet below comes from a full sample policy for enforcement on AWS, but Sentinel is highly flexible and can be configured similarly for any cloud provider.

### List of mandatory tags ###
mandatory_tags = [
  "Name",
  "ttl",
  "owner",
  "cost center",
  "appid",
]

# Policy set entry that sets the enforcement level for this policy
policy "enforce-mandatory-tags" {
  enforcement_level = "hard-mandatory"
}
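
As a complementary sketch in plain Terraform (not a replacement for the Sentinel policy), a module can also reject inputs that are missing mandatory tags. The variable name is hypothetical, the tag list mirrors the snippet above, and the alltrue function requires Terraform 0.14 or later.

```hcl
variable "tags" {
  description = "Tags applied to every resource in this module (hypothetical variable)."
  type        = map(string)

  validation {
    # Every mandatory tag key must be present in the supplied map.
    condition = alltrue([
      for tag in ["Name", "ttl", "owner", "cost center", "appid"] :
      contains(keys(var.tags), tag)
    ])
    error_message = "Tags must include Name, ttl, owner, cost center, and appid."
  }
}
```

The module-level check gives authors fast feedback, while the hard-mandatory Sentinel policy remains the control that cannot be bypassed.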

Summary

Yes, this article is long, but the title does include “Definitive Guide”. Its real purpose is to make clear that accountability for Cloud Cost Management has shifted to Engineering. Engineering controls the mechanisms for costs and savings like never before.

In addition, as organizations continue to invest in Terraform, IaC, and Cloud Platforms, they can no longer operate within today’s siloed financial and operational processes. To enact savings, you need to enact change, and that is where Terraform comes in: reclaiming and optimizing resources going forward as part of an ecosystem of solutions and feedback mechanisms.

If you have worked on Terraform projects in this space that you would like to highlight, or if you want more information on the subject, please feel free to reach out.

Note: Special thanks to Tony Pulickal for insight and review.

Translated from: https://medium.com/hashicorp-engineering/the-definitive-guide-to-cloud-cost-optimization-with-terraform-d4b7caf16cd2
