AWS Step Functions: how to implement semaphores for state machines

by Yan Cui

Here at DAZN, we are migrating from our legacy platform into the brave new world of microfrontends and microservices. Along the way, we also discovered the delights that AWS Step Functions has to offer. For example…

  • flexible error handling and retry
  • the understated ability to wait between tasks
  • the ability to mix automated steps with activities that require human intervention

In some cases, we need to control the number of concurrent state machine executions that can access a shared resource. This might be a business requirement. Or it could be due to scalability concerns for the shared resource. It might also be a result of the design of our state machine, which makes it difficult to parallelise.

We came up with a few solutions that fall into two general categories:

  1. Control the number of executions that you can start
  2. Allow concurrent executions to start, but block an execution from entering the critical path until it's able to acquire a semaphore (that is, a signal to proceed)

Control the number of concurrent executions

You can control the MAX number of concurrent executions by introducing an SQS queue. A CloudWatch schedule will trigger a Lambda function to:

  1. check how many concurrent executions there are
  2. if there are N executions, then we can start MAX-N executions
  3. poll SQS for MAX-N messages, and start a new execution for each
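The three steps above can be sketched as a scheduled Lambda handler. This is a minimal sketch, not the implementation used at DAZN: the `makeHandler` factory, its option names, and the assumption that each message body is the JSON input for one execution are all hypothetical. The clients are injected so the logic can be exercised without AWS, and the calls follow the AWS SDK for JavaScript v2 shapes (`listExecutions`, `receiveMessage`, `startExecution`, `deleteMessage`).

```javascript
// Hypothetical scheduler: start at most (max - running) new executions per tick.
// `sfn` and `sqs` are AWS SDK v2 style clients (methods return objects with .promise()).
const makeHandler = ({ sfn, sqs, stateMachineArn, queueUrl, max }) => async () => {
  // 1. count RUNNING executions (first page only; follow nextToken if max is large)
  const { executions } = await sfn
    .listExecutions({ stateMachineArn, statusFilter: 'RUNNING' })
    .promise()

  // 2. N running executions leave room for max - N more
  let capacity = Math.max(0, max - executions.length)
  let started = 0

  // 3. poll SQS until the capacity is used up or the queue is empty
  while (capacity > 0) {
    const { Messages = [] } = await sqs
      .receiveMessage({
        QueueUrl: queueUrl,
        MaxNumberOfMessages: Math.min(capacity, 10) // SQS returns at most 10 per call
      })
      .promise()
    if (Messages.length === 0) break

    for (const msg of Messages) {
      // assumption: the message body is the JSON input for the execution
      await sfn.startExecution({ stateMachineArn, input: msg.Body }).promise()
      // delete only after the execution has started, so a failure leaves the task queued
      await sqs
        .deleteMessage({ QueueUrl: queueUrl, ReceiptHandle: msg.ReceiptHandle })
        .promise()
      started += 1
      capacity -= 1
    }
  }
  return { running: executions.length, started }
}
```

Deleting each message only after `startExecution` succeeds means a failed start leaves the task on the queue for the next scheduled run.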

We're not using the new SQS trigger for Lambda here, because the purpose is to slow down the creation of new executions, whereas the SQS trigger would push tasks to our Lambda function eagerly.

Also, you should use a FIFO queue so that tasks are processed in the same order they're added to the queue.
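On the producer side, tasks are sent to the queue instead of starting an execution directly. The helper below is a sketch of the parameters for an SQS `sendMessage` call against a FIFO queue. The function name, the single shared `MessageGroupId`, and the use of the task's `name` as the deduplication ID are assumptions for illustration, not part of the original design.

```javascript
// Hypothetical helper: build SQS sendMessage params for a FIFO queue.
// FIFO queues require a MessageGroupId; ordering is preserved within a group,
// so one shared group keeps all tasks in the order they were added.
const toFifoMessage = (queueUrl, task) => ({
  QueueUrl: queueUrl,
  MessageBody: JSON.stringify(task),
  MessageGroupId: 'state-machine-tasks', // assumed: one group for all tasks
  // required unless content-based deduplication is enabled on the queue
  MessageDeduplicationId: task.name
})

// usage with an AWS SDK v2 client:
//   await SQS.sendMessage(toFifoMessage(queueUrl, { name, bucketName, fileName, mode })).promise()
```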

Block execution using semaphores

You can use the ListExecutions API to find out how many executions are in the RUNNING state. You can then sort them by startDate and only allow the oldest executions to transition to states that access the shared resource.
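As a sketch of that lookup, the helper below pages through ListExecutions and sorts the result by startDate. The names `listAllRunning` and `oldestFirst` are hypothetical, and the client is passed in as an argument (an AWS SDK v2 style `StepFunctions` instance is assumed) so the code can be tried with a stub.

```javascript
// Hypothetical helper: collect every RUNNING execution, following nextToken.
const listAllRunning = async (sfn, stateMachineArn) => {
  const all = []
  let nextToken
  do {
    const params = { stateMachineArn, statusFilter: 'RUNNING' }
    if (nextToken) params.nextToken = nextToken
    const page = await sfn.listExecutions(params).promise()
    all.push(...page.executions)
    nextToken = page.nextToken
  } while (nextToken)
  return all
}

// Oldest first; startDate is a Date in the SDK response.
const oldestFirst = (executions) =>
  [...executions].sort((a, b) => a.startDate.getTime() - b.startDate.getTime())
```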

Take the following state machine for instance.

The OnlyOneShallRunAtOneTime state invokes the one-shall-pass Lambda function and returns a proceed flag. The Shall Pass? state then branches the flow of this execution based on the proceed flag.

OnlyOneShallRunAtOneTime:
  Type: Task
  Resource: arn:aws:lambda:us-east-1:xxx:function:one-shall-pass
  Next: Shall Pass?
Shall Pass?:
  Type: Choice
  Choices:
    - Variable: $.proceed  # check if this execution should proceed
      BooleanEquals: true
      Next: SetWriteThroughputDeltaForScaleUp
  Default: WaitToProceed   # otherwise wait and try again later
WaitToProceed:
  Type: Wait
  Seconds: 60
  Next: OnlyOneShallRunAtOneTime

The tricky thing here is how to associate the Lambda invocation with the corresponding Step Functions execution. Unfortunately, at the time of writing, Step Functions does not pass the execution ARN to the Lambda function. Instead, we have to pass the execution name as part of the input when we start the execution.

const name = uuid().replace(/-/g, '_')
const input = JSON.stringify({ name, bucketName, fileName, mode })
const req = { stateMachineArn, name, input }
const resp = await SFN.startExecution(req).promise()

When the one-shall-pass function runs, it can use the execution name from the input. It's then able to match the invocation against the executions returned by ListExecutions.

In this particular case, only the oldest execution can proceed. All other executions would transition to the WaitToProceed state.

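// `_` is assumed to be lodash; Log and listRunningExecutions are
// defined elsewhere in the project and are not shown here
const _ = require('lodash')

module.exports.handler = async (input, context) => {
  const executions = await listRunningExecutions()
  Log.info(`found ${executions.length} RUNNING executions`)

  // the execution that started first holds the semaphore
  const oldest = _.sortBy(executions, x => x.startDate.getTime())[0]
  Log.info(`the oldest execution is [${oldest.name}]`)

  if (oldest.name === input.name) {
    return { ...input, proceed: true }
  } else {
    return { ...input, proceed: false }
  }
}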
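The handler above admits exactly one execution. If the shared resource can tolerate a few concurrent users, the same idea extends to a counting semaphore. The sketch below is a hypothetical generalisation: `shouldProceed` and the `permits` parameter are not from the original code.

```javascript
// Hypothetical generalisation: let the `permits` oldest executions proceed.
// Each execution is { name, startDate }, as returned by ListExecutions.
const shouldProceed = (executions, name, permits = 1) =>
  [...executions]
    .sort((a, b) => a.startDate.getTime() - b.startDate.getTime())
    .slice(0, permits)
    .some((x) => x.name === name)
```

An execution that does not appear in the list at all gets `false` and simply waits for the next check. That is the safer default here, since ListExecutions is documented as eventually consistent.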

Compare the approaches

Let's compare the two approaches against the following criteria:

  • Scalability. How well does the approach cope as the number of concurrent executions goes up?
  • Simplicity. How many moving parts does the approach add?
  • Cost. How much extra cost does the approach add?

Scalability

Approach 2 (blocking executions) has two problems when you have a large number of concurrent executions.

First, you can hit the regional throttling limit on the ListExecutions API call.

Second, if you have configured a timeout on your state machine (and you should!) then blocked executions can also time out while they wait. This creates backpressure on the system.

Approach 1 (with SQS) is far more scalable by comparison. Queued tasks are not started until they are allowed to start, so there is no backpressure. Only the cron Lambda function needs to list executions, so you're also unlikely to reach API limits.

Simplicity

Approach 1 introduces new pieces to the infrastructure: SQS, a CloudWatch schedule, and a Lambda function. It also forces the producers to change, since they now have to queue tasks instead of starting executions directly.

With approach 2, a new Lambda function is needed for the additional step, but it's part of the state machine.

Cost

Approach 1 introduces a small baseline cost even when there are no executions, because the scheduled Lambda function still runs. However, we are talking about cents here…

Approach 2 introduces additional state transitions, which cost around $25 per million. See the Step Functions pricing page for more details. Since each execution will incur 3 transitions per minute while it's blocked, the cost of these transitions can pile up quickly.
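To get a feel for how quickly this adds up, here is a back-of-the-envelope calculation using only the two figures quoted above. The function and constant names are made up for this sketch, and it ignores the cost of the Lambda invocations themselves.

```javascript
// Rough cost of blocked executions, using the figures quoted above.
const PRICE_PER_TRANSITION = 25 / 1e6 // USD, i.e. about $25 per million transitions
const TRANSITIONS_PER_BLOCKED_MINUTE = 3 // Task -> Choice -> Wait, once per 60s wait

const blockedCost = (executions, minutesBlockedEach) =>
  executions * minutesBlockedEach * TRANSITIONS_PER_BLOCKED_MINUTE * PRICE_PER_TRANSITION

// e.g. 100 executions that each wait an hour:
// 100 * 60 * 3 = 18,000 transitions, roughly $0.45
console.log(blockedCost(100, 60).toFixed(2))
```

At 1,000 executions that each wait ten hours, the same formula gives 1.8 million transitions, or about $45.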

Conclusions

Given the two approaches we considered here, using SQS is by far the more scalable. It is also more cost effective as the number of concurrent executions goes up.

But you need to manage additional infrastructure and force upstream systems to change. This can impact other teams, and ultimately affects your ability to deliver on time.

If you do not expect a high number of executions, then you might be better off going with the second approach.

Translated from: https://www.freecodecamp.org/news/aws-step-functions-how-to-implement-semaphores-for-state-machines-8075650ceb86/
