On-demand CI/CD infrastructure with GitLab and AWS Fargate

weixin_26752759

Original article: https://medium.com/ci-t/on-demand-ci-cd-infrastructure-with-gitlab-and-aws-fargate-376edc7afcda

In a previous article, I explained how to deploy the GitLab Runner manager and Fargate driver on AWS Fargate with no virtual machine setup. In this way, you can have your GitLab CI/CD jobs running serverless.

In the present article, I will show how you can use AWS Lambda functions to stop the Runner manager hosted on AWS Fargate when there are no CI/CD jobs to process and start it when a new pipeline is triggered. This configuration can significantly reduce the costs when you have considerable idle times between builds.

How GitLab Runner works and the challenges of scaling it down to zero

GitLab Runner is open-source software that runs your jobs and sends the results back to GitLab. In short, it works as an agent that periodically polls a GitLab instance, asking for a pipeline job to process. It executes the assigned job and returns the job output and final status to the GitLab instance.

Since GitLab Runner must actively query the GitLab instance for pending jobs, and not the other way around, it is expected to be always up and running. Given this, one way to achieve "scale down to zero" behavior when idle is to integrate with the GitLab instance so that we can identify when a new pipeline has started and when no more jobs are pending. For this, we can use two integrations provided by GitLab: Webhooks and the GitLab API.

Limitations of our solution

For the sake of simplicity, we did not consider more complex scenarios in the present solution. The most relevant limitations are listed below:

We assume all jobs from the GitLab project are processed by the same Runner. Nevertheless, it is not hard to adapt the solution if you need distinct Runners for different types of builds.
We did not consider group runners. In the present scenario, the Runner is dedicated to processing the builds of a given GitLab project.

Premises

For the scope of this article, we took into account the premises below:

All the necessary AWS infrastructure (VPC network, subnet, security group, and Fargate cluster) has already been created and configured in your AWS account.
Creating the container image and Task Definition for the Runner manager is also out of scope, since both were presented in a previous article.
The AWS SAM CLI is used for all deployments mentioned throughout the text. Refer to the AWS documentation for instructions on how to install and use it.

Solution overview

The image below presents a high-level overview of the solution.

Image showing all components involved in the solution: for example the Lambda functions and the Runner Fargate Task.

The core of the solution consists of two AWS Lambda functions:

The first function is triggered by a GitLab Webhook when pipeline events occur in a given GitLab project. It is responsible for starting a new GitLab Runner manager Fargate Task if one is not already running. Since the AWS Fargate driver uses SSH to connect to other Fargate Tasks, the function also creates an inbound rule in a specified Security Group to allow connections from the Runner's container IP address.
The second function is triggered on a regular schedule by an AWS CloudWatch Events rule and is responsible for stopping the GitLab Runner manager Fargate Task when no more CI jobs are pending execution. To discover pending CI jobs, it queries the GitLab Jobs API for the given GitLab project. Finally, the function removes the inbound rule created in the Security Group by the first function.

Please note this is a simplistic architecture to make it easier for the reader to understand the flow and the idea behind this article. In a more micro-service oriented architecture, the functions presented above could be split into multiple functions with a more focused scope.

It is worth mentioning AWS offers a generous free usage tier for using Lambda functions. Free tier also applies to API Gateway and CloudWatch.

From here, I will focus on detailing the steps necessary to implement the presented solution:

1. Store AWS Lambda functions' configs in AWS Parameter Store
2. Create the API Gateway and AWS Lambda function to start the GitLab Runner
3. Configure a GitLab Webhook to trigger the function on pipeline events
4. Create the CloudWatch event rule and AWS Lambda function to stop GitLab Runner when idle
5. Test the configuration

Step 1: Store AWS Lambda functions' configs in AWS Parameter Store

Both Lambda functions that are part of the proposed solution will need to receive some information about the available AWS infrastructure or your GitLab project to be able to start/stop the Runner manager Fargate Task. We will use the AWS Parameter Store to centralize this information.

1. Go to AWS Parameter Store in your AWS account.
2. Click Create parameter.
3. Name it lambda-gitlab-runner. Note that the functions will search for a parameter with exactly this name.
4. For the Type field, you can choose SecureString or String.
5. For the Value field, enter the following JSON, replacing the attribute values with the correct information:

{
   "clusterName":"yourClusterName",
   "subnet":"subnet-XYZ",
   "securityGroup":"sg-XYZ",
   "runnerTaskDefinition":"yourTaskDefinition",
   "gitlabProjectId":"yourProjectId",
   "gitlabApiPrivateToken":"yourPrivateToken",
   "gitlabHeaderToken":"yourGitLabToken"
}

Below is the explanation for each attribute:

clusterName: Name of the Fargate cluster where the GitLab Runner Task should be started/stopped.
subnet: Subnet where the GitLab Runner Task should be started/stopped.
securityGroup: Security group used by your GitLab Runner Task.
runnerTaskDefinition: Task Definition used to create the GitLab Runner Task.
gitlabProjectId: Your GitLab project ID. You can find it on your GitLab project's home page, below the project name.
gitlabApiPrivateToken: GitLab personal access token required to use the GitLab API. If you have not created one yet, follow the GitLab documentation to generate one.
gitlabHeaderToken: String you will use as the GitLab Webhook Secret Token. You can think of it as a password you create for authenticating requests between the GitLab Webhook and the AWS Lambda function.

6. Click Create parameter.
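
As a rough illustration of how a function might consume this parameter, the sketch below reads lambda-gitlab-runner and parses its JSON value. It is not taken from the published repositories: the load_config helper, the REQUIRED_KEYS check, and the optional ssm_client argument are assumptions made for this example. Only the parameter name and attribute names come from the steps above.

```python
import json

# Parameter name the functions look up, as described in Step 1.
PARAMETER_NAME = "lambda-gitlab-runner"

# Attribute names from the JSON shown above.
REQUIRED_KEYS = (
    "clusterName",
    "subnet",
    "securityGroup",
    "runnerTaskDefinition",
    "gitlabProjectId",
    "gitlabApiPrivateToken",
    "gitlabHeaderToken",
)


def load_config(ssm_client=None):
    """Fetch and parse the JSON configuration (hypothetical helper)."""
    if ssm_client is None:
        # Imported lazily so the parsing logic can be exercised without AWS.
        import boto3
        ssm_client = boto3.client("ssm")
    # WithDecryption is needed for SecureString and ignored for String.
    response = ssm_client.get_parameter(Name=PARAMETER_NAME, WithDecryption=True)
    config = json.loads(response["Parameter"]["Value"])
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise KeyError(f"missing config attributes: {missing}")
    return config
```

Inside a Lambda handler, calling load_config() with no argument would use the function's IAM role, which would need permission to read the parameter.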

Step 2: Create the API Gateway and AWS Lambda function to start the GitLab Runner

To simplify this step, we provided a public Git repository containing the function implementation, as well as an AWS SAM template we will use for deploying both the API Gateway and AWS Lambda function.

Below are the two commands you need to run to deploy the resources. Please refer to the function documentation for more information about the IAM permissions your AWS user needs to complete the deployment.

sam package \
  --template-file template.yml \
  --output-template-file package.yml \
  --s3-bucket <your-s3-bucket>

sam deploy \
  --template-file package.yml \
  --stack-name <your-stack-name> \
  --capabilities CAPABILITY_IAM

Note: Remember to replace the S3 bucket and stack name with the correct values.

If everything works as expected, the deployment will output the Amazon Resource Name (ARN) of the function, as well as the API Gateway endpoint URL used to trigger the function. You will need to provide this URL when configuring the GitLab Webhook in the next step.

Detailing how this function works

This section describes in more detail how the function works. The reader not interested in a deeper understanding of it may jump to the next section.

In summary, the function performs the following steps:

Read config parameters: When the function starts, it reads the lambda-gitlab-runner parameter from the AWS Parameter Store. As presented previously, this parameter contains a JSON document with several configuration values used by the function.
Authentication: The function compares the gitlabHeaderToken configuration value with the value received in the "X-Gitlab-Token" HTTP header of the request. If the values differ, authentication fails.
Start a new Runner Fargate Task: If no Runner manager Task is currently running, a new Task is started with a specific value in the "started-by" field, so that the other function, which stops the Runner, can identify it easily.
Add an inbound rule to Security Group: The function creates an inbound rule in the Security Group specified by the securityGroup configuration parameter to allow SSH connections from the Runner manager.
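
The authentication step above can be sketched in a few lines. This is only an illustration under stated assumptions: it assumes the Lambda receives an API Gateway proxy-style event with a "headers" dictionary, and the is_authorized helper is a hypothetical name, not the repository's own code.

```python
import hmac


def is_authorized(event, expected_token):
    """Return True when the X-Gitlab-Token header matches the configured secret.

    `event` is assumed to be an API Gateway proxy-style event (an assumption
    of this sketch); `expected_token` is the gitlabHeaderToken config value.
    """
    headers = event.get("headers") or {}
    # HTTP header names are case-insensitive, so normalise before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    received = normalized.get("x-gitlab-token") or ""
    # compare_digest avoids leaking the secret through timing differences.
    return hmac.compare_digest(received.encode(), expected_token.encode())
```

A constant-time comparison is a design choice of this sketch. A plain equality check would behave the same functionally.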

Below is the Python code at the core of the function, where you can identify some of the steps described above.

def _process_request(cluster_name, subnet, security_group, task_definition):
    message = None
    # Only start a Runner manager when none is running yet.
    task_count = count_tasks_running(cluster_name, task_definition)
    if task_count == 0:
        task_arn = _run_task(
            cluster_name, task_definition, subnet, security_group
        )
        # Allow SSH connections from the newly started Runner manager.
        _create_ssh_inbound_rule(cluster_name, security_group, task_arn)
        message = "Task successfully created"
    else:
        LOGGER.info("Task already exist, will abort")
        message = "Task already exist on cluster"
    return {"message": message}
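
The helpers called above are not shown in the article. As a hedged sketch of what they might pass to AWS, the functions below build the keyword arguments for the boto3 calls ecs_client.run_task and ec2_client.authorize_security_group_ingress. The STARTED_BY value, the function names, and the choice of assignPublicIp are assumptions made for illustration. The repository may do this differently.

```python
# Hypothetical marker value; the real function uses its own "started-by" string.
STARTED_BY = "lambda-gitlab-runner"


def build_run_task_request(cluster_name, task_definition, subnet, security_group):
    """Keyword arguments for ecs_client.run_task(**request) on Fargate."""
    return {
        "cluster": cluster_name,
        "taskDefinition": task_definition,
        "launchType": "FARGATE",
        "count": 1,
        # Lets the stop function find this task later.
        "startedBy": STARTED_BY,
        "networkConfiguration": {
            "awsvpcConfiguration": {
                "subnets": [subnet],
                "securityGroups": [security_group],
                # Assumption: the task runs in a public subnet and needs
                # a public IP to reach GitLab.
                "assignPublicIp": "ENABLED",
            }
        },
    }


def build_ssh_ingress_request(security_group, runner_ip):
    """Keyword arguments for ec2_client.authorize_security_group_ingress(**request)."""
    return {
        "GroupId": security_group,
        "IpPermissions": [
            {
                "IpProtocol": "tcp",
                "FromPort": 22,
                "ToPort": 22,
                # /32 restricts the rule to the Runner manager's single address.
                "IpRanges": [{"CidrIp": f"{runner_ip}/32"}],
            }
        ],
    }
```

Keeping the request construction separate from the AWS calls makes the logic easy to check without credentials. The dictionaries can be passed directly to the corresponding boto3 client methods.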

Step 3: Configure a GitLab Webhook to trigger the function on pipeline events

Below we show how to configure a GitLab Webhook to trigger the function every time a pipeline event happens in your GitLab project.

1. In the GitLab project, go to the Settings menu and click Webhooks.
2. For the Secret Token field, use the same value you used for the gitlabHeaderToken configuration in the AWS Parameter Store.
3. Fill the URL field with the API Gateway endpoint URL printed in your console when you deployed the function.
4. In the Trigger field, leave only the Pipeline events checkbox selected.
5. Click Add webhook.
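
To check the endpoint without waiting for a real pipeline, one option is to imitate the webhook call by hand. The sketch below only builds the request. The function name and the minimal payload are assumptions of this example, and a real GitLab pipeline event carries many more fields than object_kind.

```python
import json
import urllib.request


def build_pipeline_webhook_request(endpoint_url, secret_token):
    """Build a POST request resembling a GitLab pipeline webhook call."""
    # Minimal stand-in payload; real events include many more attributes.
    body = json.dumps({"object_kind": "pipeline"}).encode("utf-8")
    return urllib.request.Request(
        endpoint_url,
        data=body,
        headers={
            "Content-Type": "application/json",
            "X-Gitlab-Event": "Pipeline Hook",
            # Must match the gitlabHeaderToken stored in Parameter Store.
            "X-Gitlab-Token": secret_token,
        },
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would send it. Keep in mind that a successful call starts a Runner manager Task, just as a real pipeline event would.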

Step 4: Create the CloudWatch event rule and AWS Lambda function to stop GitLab Runner when idle

Similar to what we have done for the first AWS Lambda function, we provided another public Git repository containing the function implementation as well as an AWS SAM template we will use for deploying both the CloudWatch event and AWS Lambda function.

You will need to use similar commands to deploy this new function. Please refer to the function documentation for information about the necessary IAM permissions for the deploy.

sam package \
  --template-file template.yml \
  --output-template-file package.yml \
  --s3-bucket <your-s3-bucket>

sam deploy \
  --template-file package.yml \
  --stack-name <your-stack-name> \
  --capabilities CAPABILITY_IAM

Note: Remember to replace the S3 bucket and stack name with the correct values.

If everything works as expected, the deployment will output the function ARN.

Note: With the default settings, the CloudWatch Events rule triggers this function every 10 minutes. You can customize this value in the SAM template file.

Detailing how this function works

The reader not interested in a deeper understanding of the function may jump to the next section.

In summary, the function performs the following steps:

Search Runner managers currently running: The function first searches for all Runner Fargate Tasks created by the function presented in Step 2. For that, it uses the "started-by" field of the Fargate Task.
Check if there are pending jobs to process: The function then uses the GitLab API to search for jobs in the GitLab project that are currently in the pending or running state, ignoring those being processed by shared Runners.
Remove the Security Group inbound rule: If no CI job to process is found, it removes the inbound rule used to allow SSH connections from the Runner manager.
Stop Runner Fargate Tasks: Finally, the function stops the Runner manager.
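
The first and last steps map onto two ECS API calls, list_tasks (which accepts a startedBy filter) and stop_task. The sketch below shows one way to use them. The helper names and the reason text are assumptions, and the ecs_client argument is expected to be a boto3 ECS client or anything exposing the same two methods.

```python
def find_runner_tasks(ecs_client, cluster_name, started_by):
    """Return ARNs of running tasks carrying the given "started-by" marker."""
    response = ecs_client.list_tasks(
        cluster=cluster_name,
        startedBy=started_by,
        desiredStatus="RUNNING",
    )
    # A single page is enough here: only a handful of Runner managers exist.
    return response.get("taskArns", [])


def stop_runner_tasks(ecs_client, cluster_name, task_arns, reason="No pending CI jobs"):
    """Stop every given task and return how many stop requests were sent."""
    for task_arn in task_arns:
        ecs_client.stop_task(cluster=cluster_name, task=task_arn, reason=reason)
    return len(task_arns)
```

In a Lambda handler, ecs_client would typically come from boto3.client("ecs"), and started_by would be the same marker value the start function used.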

Below we show the Python code for the core part of the function.

def _process_request(
    cluster_name, security_group, gitlab_token, gitlab_project
):
    runner_arn_list = _search_for_runner_manager_tasks(cluster_name)
    if len(runner_arn_list) > 0:
        # Only shut down when GitLab reports no pending or running jobs.
        exist_job = _exist_ci_jobs_being_processed(
            gitlab_project, gitlab_token
        )
        if not exist_job:
            _remove_ssh_inbound_rules(
                cluster_name, runner_arn_list, security_group
            )
            _stop_runner_managers(cluster_name, runner_arn_list)
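
The job check relies on the GitLab Jobs API (GET /projects/:id/jobs with scope[] filters, authenticated through the PRIVATE-TOKEN header). Below is a minimal sketch of such a check, not the repository's implementation. The helper names and the gitlab_url argument are assumptions, and the filter assumes each job's runner object exposes an is_shared flag. Only the first page of results is read.

```python
import json
import urllib.parse
import urllib.request


def build_jobs_url(gitlab_url, project_id):
    """URL listing the project's jobs in the pending or running state."""
    query = urllib.parse.urlencode([("scope[]", "pending"), ("scope[]", "running")])
    project = urllib.parse.quote(str(project_id), safe="")
    return f"{gitlab_url.rstrip('/')}/api/v4/projects/{project}/jobs?{query}"


def fetch_jobs(gitlab_url, project_id, private_token):
    """Call the Jobs API and return the decoded JSON list (first page only)."""
    request = urllib.request.Request(
        build_jobs_url(gitlab_url, project_id),
        headers={"PRIVATE-TOKEN": private_token},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def has_jobs_for_project_runner(jobs):
    """True if any job is not being handled by a shared Runner."""
    for job in jobs:
        # Pending jobs have no runner assigned yet, so `runner` may be None.
        runner = job.get("runner") or {}
        if not runner.get("is_shared", False):
            return True
    return False
```

For example, has_jobs_for_project_runner(fetch_jobs("https://gitlab.com", project_id, token)) would tell the stop function whether the Runner manager should keep running.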

Step 5: Test the configuration

At this point, you should be able to trigger your pipeline and check if the Runner is properly started and stopped by the Lambda functions.

1. In your GitLab project, go to the CI/CD menu and click Pipelines.
2. Click Run Pipeline.
3. Select the correct branch in the Run for field and add any variables your build requires in the Variables field.
4. Click Run Pipeline.

Conclusion

This article presented a tutorial on how to use AWS Lambda functions to keep the GitLab Runner up and running in AWS Fargate only while there are CI jobs to process. We focused on a simple solution, but we believe it can be enhanced and evolved to fit more complex scenarios.

I hope you found this article helpful. Thanks for reading!
