
by Yan Cui

How to auto-create CloudWatch Alarms for APIs with CloudWatch Events and Lambda

In a previous post, I discussed how to auto-subscribe a CloudWatch Log Group to a Lambda function using CloudWatch Events. The benefit of this is that we don’t need a manual process to ensure all Lambda logs are forwarded to our log aggregation service.

Whilst this is useful in its own right, it only scratches the surface of what we can do. CloudTrail and CloudWatch Events make it easy to automate many day-to-day operational steps, with the help of Lambda, of course.

I work with API Gateway and Lambda a lot. Whenever you create a new API, or make changes, there are several things you need to do:

  • Enable Detailed Metrics for the deployment stage
  • Set up a dashboard in CloudWatch, showing request count, latencies, and error counts
  • Set up CloudWatch Alarms for P99 latencies and error counts

Because these are manual steps, they often get missed.

Have you ever forgotten to update the dashboard after adding a new endpoint to your API? And did you also remember to set up a P99 latency alarm on this new endpoint? How about alarms on the number of 4xx or 5xx errors?

Most teams I’ve dealt with have some conventions around these, but they don’t have a way to enforce them. The result is that the convention is applied only in patches and cannot be relied upon. I find that this approach doesn’t scale with the size of the team.

It works when you’re a small team. Everyone has a shared understanding, and the necessary discipline to follow the convention. When the team gets bigger, you need automation to help enforce these conventions.

Fortunately, we can automate away these manual steps using the same pattern. In the Monitoring unit of my course Production-Ready Serverless, I demonstrated how you can do this in 3 simple steps:

  • CloudTrail captures the CreateDeployment request to API Gateway
  • A CloudWatch Events pattern matches against this captured request
  • A Lambda function enables detailed metrics, and creates alarms for each endpoint
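
To make the second step more concrete, here is a rough Python sketch of what such an event pattern could look like, along with a deliberately simplified matcher. The `matches` helper and the sample event are illustrative assumptions; the real CloudWatch Events matching engine supports more than exact-value lists.

```python
# Event pattern for CloudTrail-recorded CreateDeployment calls to API Gateway.
CREATE_DEPLOYMENT_PATTERN = {
    "source": ["aws.apigateway"],
    "detail-type": ["AWS API Call via CloudTrail"],
    "detail": {
        "eventSource": ["apigateway.amazonaws.com"],
        "eventName": ["CreateDeployment"],
    },
}


def matches(pattern, event):
    """Simplified matcher (an assumption, not the real engine).

    Every key in the pattern must be present in the event. A list means
    "any of these exact values"; a nested dict is matched recursively.
    """
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True


# Hypothetical, heavily trimmed event for illustration only.
sample_event = {
    "source": "aws.apigateway",
    "detail-type": "AWS API Call via CloudTrail",
    "detail": {
        "eventSource": "apigateway.amazonaws.com",
        "eventName": "CreateDeployment",
        "requestParameters": {"restApiId": "abc123", "stageName": "dev"},
    },
}

print(matches(CREATE_DEPLOYMENT_PATTERN, sample_event))
```

Fields that the pattern does not mention, such as `requestParameters`, are ignored by the match, which is why the same pattern works for every API and stage.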

If you use the Serverless framework, then you might have a function that looks like this:

A couple of things to note from the code above:

  • I’m using the serverless-iam-roles-per-function plugin to give the function a tailored IAM role
  • The function needs the apigateway:PATCH permission to enable detailed metrics
  • The function needs the apigateway:GET permission to get the API name and REST endpoints
  • The function needs the cloudwatch:PutMetricAlarm permission to create the alarms
  • The environment variables specify SNS topics for the CloudWatch Alarms
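
For reference, the three permissions listed above could be expressed as an IAM policy document along these lines. This is only a sketch written as a Python dict, not the Serverless configuration itself; the region and the wildcard resource scoping are assumptions you would want to tighten for your own account.

```python
import json

# Hypothetical region, chosen only for illustration.
REGION = "us-east-1"

IAM_STATEMENTS = [
    {
        # GET reads the API name and resources; PATCH updates the stage settings.
        "Effect": "Allow",
        "Action": ["apigateway:GET", "apigateway:PATCH"],
        # Broad scope (assumption): every REST API in the region.
        "Resource": f"arn:aws:apigateway:{REGION}::/restapis/*",
    },
    {
        # Needed to create or update the alarms.
        "Effect": "Allow",
        "Action": ["cloudwatch:PutMetricAlarm"],
        "Resource": "*",
    },
]

policy_document = {"Version": "2012-10-17", "Statement": IAM_STATEMENTS}
print(json.dumps(policy_document, indent=2))
```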

The captured event looks like this:

We can find the restApiId and stageName inside the detail.requestParameters attribute. That’s all we need to figure out what endpoints are there, and so what alarms we need to create.
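
As a minimal sketch of that lookup in Python: the `deployment_target` helper and the trimmed sample event below are hypothetical, and a real CloudTrail event carries many more fields than shown.

```python
def deployment_target(event):
    """Return (restApiId, stageName) from a captured CreateDeployment event."""
    params = event["detail"]["requestParameters"]
    return params["restApiId"], params["stageName"]


# Hypothetical, trimmed event; the IDs are made up for illustration.
event = {
    "detail-type": "AWS API Call via CloudTrail",
    "detail": {
        "eventName": "CreateDeployment",
        "requestParameters": {"restApiId": "abc123", "stageName": "dev"},
    },
}

rest_api_id, stage_name = deployment_target(event)
print(rest_api_id, stage_name)
```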

Inside the handler function, which you can find here, we perform a few steps:

  • Enable detailed metrics with an updateStage call to API Gateway
  • Get the list of REST endpoints with a getResources call to API Gateway
  • Get the REST API name with a getRestApi call to API Gateway
  • For each of the REST endpoints, create a P99 latency alarm in the AWS/ApiGateway namespace
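
The first three steps above might be sketched in Python as follows. The calls use the boto3 names (update_stage, get_resources, get_rest_api) that correspond to updateStage, getResources and getRestApi. The helper names are hypothetical, the client is passed in rather than created here, and the sketch assumes the response includes resourceMethods for each resource and that the API has no more than 500 resources (no pagination).

```python
def enable_detailed_metrics(apigateway, rest_api_id, stage_name):
    """Turn on detailed CloudWatch metrics for every method in the stage."""
    apigateway.update_stage(
        restApiId=rest_api_id,
        stageName=stage_name,
        patchOperations=[
            # "/*/*" applies the setting to all resources and all methods.
            {"op": "replace", "path": "/*/*/metrics/enabled", "value": "true"}
        ],
    )


def list_endpoints(apigateway, rest_api_id):
    """Return (path, HTTP method) pairs for every method on the REST API."""
    # Single page only: pagination is deliberately left out of this sketch.
    response = apigateway.get_resources(restApiId=rest_api_id, limit=500)
    return [
        (item["path"], method)
        for item in response["items"]
        # Resources without methods (e.g. a bare "/") have no resourceMethods.
        for method in item.get("resourceMethods", {})
    ]


def get_api_name(apigateway, rest_api_id):
    """Look up the human-readable name of the REST API."""
    return apigateway.get_rest_api(restApiId=rest_api_id)["name"]
```

In a Lambda handler you would pass in a client created with `boto3.client("apigateway")`. Taking the client as a parameter also makes the functions easy to exercise with a stub.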

Now, every time I create a new API, I will have CloudWatch Alarms to alert me when the 99th percentile latency for an endpoint goes over 1 second, for 5 minutes in a row.
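
An alarm with that behaviour could be described roughly like this. The sketch builds the keyword arguments for boto3's put_metric_alarm; the function name, the alarm naming scheme and the choice of five 60-second periods (rather than, say, one 5-minute period) are assumptions made for illustration.

```python
def p99_latency_alarm(api_name, stage_name, resource, method, topic_arn):
    """Build keyword arguments for CloudWatch's put_metric_alarm call."""
    return {
        # Hypothetical naming scheme.
        "AlarmName": f"{api_name}-{stage_name}-{method}-{resource}-p99-latency",
        "Namespace": "AWS/ApiGateway",
        "MetricName": "Latency",  # reported in milliseconds
        "Dimensions": [
            {"Name": "ApiName", "Value": api_name},
            {"Name": "Stage", "Value": stage_name},
            {"Name": "Resource", "Value": resource},
            {"Name": "Method", "Value": method},
        ],
        "ExtendedStatistic": "p99",
        "Period": 60,            # one-minute data points
        "EvaluationPeriods": 5,  # five in a row = 5 minutes
        "Threshold": 1000,       # 1 second, in milliseconds
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [topic_arn],
    }


# Hypothetical values for illustration.
params = p99_latency_alarm(
    "my-api", "dev", "/users", "GET",
    "arn:aws:sns:us-east-1:123456789012:alarms",
)
print(params["AlarmName"])
```

With a client from `boto3.client("cloudwatch")`, the alarm would then be created with `cloudwatch.put_metric_alarm(**params)`, once per endpoint.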

All this, with just a few lines of code.

You can take this further, and have other Lambda functions to:

  • Create CloudWatch Alarms for 5xx errors for each endpoint
  • Create a CloudWatch Dashboard for the API
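
As a sketch of the first idea, a 5xx alarm can follow the same shape as the latency alarm, using the 5XXError metric in the AWS/ApiGateway namespace. The function name, naming scheme, threshold, period and missing-data handling below are all assumptions that you would tune to your own traffic.

```python
def five_xx_alarm(api_name, stage_name, resource, method, topic_arn, threshold=0):
    """Build put_metric_alarm arguments for 5xx errors on one endpoint."""
    return {
        # Hypothetical naming scheme.
        "AlarmName": f"{api_name}-{stage_name}-{method}-{resource}-5xx-errors",
        "Namespace": "AWS/ApiGateway",
        "MetricName": "5XXError",
        "Dimensions": [
            {"Name": "ApiName", "Value": api_name},
            {"Name": "Stage", "Value": stage_name},
            {"Name": "Resource", "Value": resource},
            {"Name": "Method", "Value": method},
        ],
        "Statistic": "Sum",      # count of errors per period
        "Period": 60,
        "EvaluationPeriods": 1,
        "Threshold": threshold,  # alarm when errors exceed this count
        "ComparisonOperator": "GreaterThanThreshold",
        # Quiet endpoints emit no data points; do not treat that as an error.
        "TreatMissingData": "notBreaching",
        "AlarmActions": [topic_arn],
    }


# Hypothetical values for illustration.
params = five_xx_alarm(
    "my-api", "dev", "/users", "POST",
    "arn:aws:sns:us-east-1:123456789012:alarms",
)
print(params["AlarmName"])
```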

So there you have it! A useful pattern for automating away manual operational tasks.

And before you tell me about the AWS Alerts Serverless plugin by the ACloudGuru folks, yes, I’m aware of it. It looks neat, but it’s ultimately still something the developer has to remember to do.

That requires discipline.

My experience tells me that you cannot rely on discipline, ever. Which is why I prefer to have a platform in place that will generate these alarms instead.

Translated from: https://www.freecodecamp.org/news/how-to-auto-create-cloudwatch-alarms-for-apis-with-cloudwatch-events-and-lambda-b128920857aa/
