Securing containerized workloads on Amazon ECS and AWS Fargate

Latest recommended article published on 2024-10-07 16:48:35

李白的朋友王维

Article tags: aws, Amazon Web Services, technology, artificial intelligence, re:Invent 2023, generative AI, cloud services

Article link: https://blog.csdn.net/just2gooo/article/details/134814093

Good morning and welcome to our session. Let's start with a quick poll.

How many here believe that securing your cloud applications is something you must do to gain regulatory compliance and certification? Good amount. Good.

How many of you believe that you must secure your cloud applications to protect your customers' personally identifiable information and your own intellectual property and assets? Ok.

And how many of you believe that security should be the number one concern in your organization?

And we certainly feel that way at AWS. Security is our number one concern, and that holds for ECS too: with every feature we develop, we have that tenet in mind, and we go through rigorous testing to ensure that is the case.

My name is Spiros. I'm a product manager for ECS and Fargate, and I'm joined today by Yen. We're going to go through a few of the existing and new features that we have on ECS and Fargate.

So imagine a situation where an employee leaves a company, but for some reason, some of their credentials don't get purged from all the systems. Some of the largest breaches have happened through compromised credentials like that.

AWS follows a shared responsibility security model. We're responsible for security of the cloud, meaning we secure the infrastructure that your applications run on. You are responsible for security in the cloud, meaning safeguarding the applications and your customers' data.

So it's always important to know what your applications are doing and how they're behaving, because you have an amount of resources that you provision and throttle limits that you must meet. You want to make sure that those resources do not fall prey to unauthorized usage. Compliance is also very important.

In fact, it's table stakes for a lot of industries and that is something you must always strive to do in improving your security posture. But more importantly than that, it's also building trust, building trust with your own customers because they give their personal information to you to safeguard and curate, but also with your supply chain partners because they also federate their interfaces with you and provide you with access to their intellectual property.

So it is pretty important to build those trusting working relationships. Today, we'll take a look at the ECS security model and some of the enhanced security features we have developed. Plus, we're going to spend a good chunk of time on a new announcement we had this week, introducing GuardDuty runtime monitoring with ECS and Fargate, and we're going to show you a demo of the ECS operator experience.

If you happened to be in the security session back on Tuesday, we went through the GuardDuty experience there. What you should bear in mind is that we designed this feature thinking that there could potentially be two different people who take advantage of it: a security administrator from the security team and the ECS platform operator from the infrastructure team.

Containers, by definition, build in inherent security by isolating resources and separating them from each other. ECS deploys in two different modes on different compute. There's the managed EC2 case, where this kind of separation happens at the instance level. And there's the serverless mode, AWS Fargate, where that separation happens at the task level. We'll show you a few features we built on EC2 that help it resemble the Fargate experience and isolate resources down to the task level as well.

So when looking at the different deployment models, when you launch ECS on EC2, there are three trust boundaries that you must protect. There's the operating system and kernel, to avoid a bad image from getting loaded. There's the instance itself, so you can safeguard against compromised credentials trying to get access to the resources, as well as protect against any malicious traffic or messaging coming in. And there's the application level. This is where your container runs with your application. Any kind of malware or software vulnerability that gets exploited there can get access not only to the task resources but also potentially to the instance resources, and then propagate across other tasks.

In Fargate, we greatly reduce the potential attack footprint because the only place that someone can attack is the task. And because tasks are isolated from each other, there is no chance someone can go beyond that level and propagate this across many other tasks.

So now let's take a look at how we create additional security features at every step of the ECS stack.

So at the very bottom, we have the operating system and the kernel, we have the container runtime, like Docker or containerd, and we have the ECS agent, which communicates with the ECS service so you can launch your containers in your clusters.

So aside from the compute separation and the infrastructure patching, you can also define security groups and network access control lists, so you can decide what traffic is allowed or disallowed at the subnet level. And you can also have IAM roles for container instances. With that, you can control what type of API calls the ECS agent or container runtime can make.

You can also leverage AWS PrivateLink. This is a way to set up communication between ECS and the VPC. So the traffic doesn't go through the public internet.

Moving one level up. The task is a collection of containers where you run your applications. It's an instantiation of your task definition file where you actually put all the configuration parameters for how you want your containers to run.

So this is the point where you can define IAM roles for tasks when you launch on EC2, because that helps you separate and isolate your user credentials down to the task level, making the EC2 experience look more like Fargate.

It is also where you can set up a dedicated ENI for tasks. In a similar vein, you take network isolation down to the task level and resemble the Fargate experience. And this is also where you set the task execution role, where you define the permissions that the task requires to execute, whether that means allowing it to create log groups, to download container images, and so on.
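
The task-level settings described above can be sketched as a task definition. This is a minimal illustration, not the speakers' configuration: the field names (`taskRoleArn`, `executionRoleArn`, `networkMode`) are those of the ECS task definition format, while the account ID, role names, family, and image are hypothetical placeholders.

```python
import json


def build_task_definition(family, image, task_role_arn, execution_role_arn):
    """Return a minimal ECS task definition as a plain dict."""
    return {
        "family": family,
        # awsvpc gives each task its own elastic network interface (ENI).
        "networkMode": "awsvpc",
        # Credentials the application containers use at runtime.
        "taskRoleArn": task_role_arn,
        # Credentials ECS uses to pull images and write logs for the task.
        "executionRoleArn": execution_role_arn,
        "requiresCompatibilities": ["EC2"],
        "containerDefinitions": [
            {"name": "app", "image": image, "essential": True, "memory": 512}
        ],
    }


# Hypothetical account ID, role names, and image, for illustration only.
task_def = build_task_definition(
    family="demo-app",
    image="public.ecr.aws/amazonlinux/amazonlinux:2023",
    task_role_arn="arn:aws:iam::123456789012:role/demo-task-role",
    execution_role_arn="arn:aws:iam::123456789012:role/demo-execution-role",
)
print(json.dumps(task_def, indent=2))
```

Keeping the task role and the execution role separate means the application never holds the permissions that are only needed to start the task.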

And then you have the container; that's where your application runs. This is a place where you can set container secrets like API keys, user credentials, or certificates. You can also adjust Linux kernel capabilities, either taking away some that your container doesn't need or adding enhancements with AppArmor profiles, for example.

And you have the option of running your container in privileged mode although that gives you root access. So unless it's absolutely necessary, we encourage you to disable that. And ECS gives you a chance to do that.

You can also define user limits, both soft and hard, to protect against a runaway application consuming a lot more than it needs to run, as well as image scanning and image pruning. If you use ECR, it already scans your images to make sure that there are no software vulnerabilities, and ECS makes sure that the version we get is the latest, with all the updates and security patches.
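
As a rough sketch of the container-level controls just listed (secrets, kernel capabilities, privileged mode, and user limits), the helper below applies them to a container definition. The keys `privileged`, `linuxParameters`, `ulimits`, and `secrets` come from the ECS container definition format; the specific limits, the secret ARN, and the image name are made-up examples, and which options are accepted differs between EC2 and Fargate.

```python
def harden_container(container):
    """Return a copy of a container definition with tighter settings."""
    hardened = dict(container)
    # Avoid privileged mode unless it is absolutely necessary.
    hardened["privileged"] = False
    # Drop Linux capabilities the application does not need.
    hardened["linuxParameters"] = {"capabilities": {"drop": ["ALL"]}}
    # Soft and hard limits on open files guard against runaway usage.
    hardened["ulimits"] = [
        {"name": "nofile", "softLimit": 1024, "hardLimit": 4096}
    ]
    # Reference the secret by ARN instead of embedding its value.
    hardened["secrets"] = [
        {
            "name": "API_KEY",
            # Hypothetical secret ARN, for illustration only.
            "valueFrom": "arn:aws:secretsmanager:us-east-1:123456789012:secret:demo-api-key",
        }
    ]
    return hardened


original = {"name": "app", "image": "demo/app:1.0", "essential": True}
app = harden_container(original)
print(app)
```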

And at the very top you have the application. So this is where your core business logic runs. When you run on Fargate, you also can add ancillary functions that run inside the same task as your application itself, things like logging or routing or monitoring.

So some additional features we have here are implementing Transport Layer Security for service-to-service communication, and tag-based access control, so you can decide which actors can take actions and interact with your applications, as well as configuring all your parameters for your observability functions, things like log routing, metrics, debugging, tracing, and so on.

What we announced this week was the introduction of GuardDuty runtime monitoring support for Amazon ECS and Fargate. If you're not familiar with GuardDuty, it is AWS's intelligent threat detection service. ECS has been using GuardDuty for a long time, and that was to get things like VPC flow logs, DNS query logs, and CloudTrail management events. But this is the first time we're actually using it for runtime threat detection. And in fact, it's the first time ever that we're doing it in a serverless setting, with AWS Fargate.

Earlier this year, we announced runtime threat detection with GuardDuty on EKS, which is the managed Kubernetes service that AWS provides. This week, along with this announcement, we also announced a preview on EC2 standalone as well. So between EKS, ECS, Fargate and EC2, now you can enable runtime threat detection on any workload that your organization may demand.

So how does that work? Well, every time a new resource gets spun up in ECS, we collect the telemetry around application behavior and send it over to the GuardDuty service, which applies machine learning, anomaly detection, and threat intelligence from their database to generate more than 30 different security findings specific to runtime threats.

And at that point, you can look at those findings either natively on the GuardDuty console, or you can forward them downstream to other services like Security Hub or Amazon Detective, or even set up an Amazon EventBridge rule to send them over to an APN partner that can aggregate and process them further and perhaps provide remediation and response actions.
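
One common way to forward findings downstream is an Amazon EventBridge rule, and a sketch of such a rule's event pattern follows. The `source` and `detail-type` values are the ones GuardDuty uses for finding events; the `matches` helper is only a simplified local stand-in for EventBridge's matching, and the sample events are invented for illustration.

```python
# Event pattern selecting GuardDuty finding events.
FINDING_PATTERN = {
    "source": ["aws.guardduty"],
    "detail-type": ["GuardDuty Finding"],
}


def matches(pattern, event):
    """Simplified check: each pattern key must hold one of the allowed values.

    Real EventBridge matching also supports nested fields and operators;
    this helper only covers top-level exact matches.
    """
    return all(event.get(key) in allowed for key, allowed in pattern.items())


# Hypothetical events, for illustration only.
finding_event = {
    "source": "aws.guardduty",
    "detail-type": "GuardDuty Finding",
    "detail": {"severity": 8},
}
other_event = {"source": "aws.ecs", "detail-type": "ECS Task State Change"}

print(matches(FINDING_PATTERN, finding_event))
print(matches(FINDING_PATTERN, other_event))
```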

Taking a closer look at how that's done. We have a lightweight GuardDuty runtime monitoring agent that we onboard on every EC2 resource. When you launch ECS on EC2, you can do this the same way you would do it on normal EC2, you can use the AWS Systems Manager to onboard it onto the EC2 instance. And at that point, we can monitor operating system and kernel information using eBPF probes as well as all the application behavior running on all your containers on all your tasks inside that instance.

Another way to onboard that is through an RPM package that's on an S3 bucket. So you could always have the option to also bake that into your own AMI.

The other thing you would need to do is set up a VPC endpoint so that telemetry can flow over to the GuardDuty service where all that telemetry is used to calculate the security findings and be sent downstream. That way that kind of sensitive information does not go through the public internet.

On Fargate, we went even further. Here the GuardDuty runtime monitoring agent is stored in ECR in a private repo. So if you already use the Amazon ECS managed task execution role policy, there is nothing you need to do; Fargate will auto-inject the agent for you. If you don't, then you just have to ensure that the task execution role of your Fargate task has permission to download the image. And then for every new task, either standalone or part of a service, Fargate will pull and auto-inject this agent into the task for you.

That's it. There are no task definitions to change, no images to maintain, and no upgrades to worry about. With every security patch and every upgrade, you're guaranteed that every new task will pull the latest image. The added advantage with Fargate is that GuardDuty will automatically set up the VPC endpoint for you. So the ultimate experience here is that your security administrator can enable this feature, and all of this can happen without the ECS operator lifting a finger. This is important to note because of that separation of responsibilities between the security administrator and the ECS operator. Sometimes we find that, especially on the platform side, infrastructure operators are a little wary of someone else going in and, with one click, saying "I want to run this on all your clusters in this account." So we also created some additional capabilities for the ECS operator to be able to work with this feature as well. And this is the type of thing that we're going to demo right now.

Thanks, Spiros.

So in today's demo, I'll be walking through two different scenarios to show how to enable my ECS workloads with this new GuardDuty ECS runtime monitoring feature, on both Fargate and the ECS on EC2 launch type. This is all from the perspective of the ECS operator, the application owner. In the first demo, I will walk through the steps to opt a specific ECS cluster, with my Fargate workloads, into this new feature. As Spiros mentioned, as part of this feature we support different enablement mechanisms to cater for the different ways that different personas within your organization manage workloads.

For the security administrator, we provide a centralized place where you can enable this ECS runtime monitoring feature, along with automated deployment of the GuardDuty agent, in the same place. That means with one or two button clicks, you can have this feature enabled and have the GuardDuty agent deployed for all your Fargate workloads across all the managed accounts. However, this shouldn't happen until you, the ECS operators, have fully tested it with your Fargate workloads, so that you can make sure the whole integration is safe for your production workloads. This first demo walks through the steps to demonstrate how we achieve this.

In the second one, I will also touch on how to enable this feature for ECS on EC2. At the moment, there is a difference in customer experience between Fargate and ECS on EC2. On Fargate, we support a fully managed, automated, and seamless experience, which means you have a one-click experience where we set up everything on your behalf. For ECS on EC2, there are currently a couple more steps involved: setting up the network by creating the VPC endpoint, and installing the GuardDuty agent.

So next, let's see the feature in action. For today's demo, I'm using the AWS Management Console. Before we opt in our ECS cluster, the first thing we need to check is that the Amazon ECS runtime monitoring account setting is turned on. If you come to the ECS console, on the left side you can see there's an Account settings option. Once you select it, it will redirect you to the page shown on my screen. If you scroll down, you can see there's the Amazon ECS runtime monitoring account setting, and currently it is already turned on.

This account setting is fully managed by GuardDuty, which means you cannot update it through any ECS-supported API. When the security administrator in your company enables the runtime monitoring configuration on the GuardDuty side, it will be turned on automatically.

Once we have verified that the account setting is turned on, we can jump to the cluster page and opt in this cluster. The cluster opt-in process is done by adding a predefined tag to your existing clusters, or by creating your new clusters with it.

To add a tag to your existing cluster, we can come to the cluster page. In the top section, there's an Update cluster button. If you click it, it will redirect you to a page where you can update a subset of the cluster configuration. If you scroll down to the bottom, you can see there's a Tags section.

To opt in this specific cluster, you need to add a predefined tag whose key is GuardDutyManaged and whose value is true. We also support the opposite, which means that if you already have automated deployment enabled for your whole account, you can also opt a specific cluster within your account out of this feature. To do this, you just need to flip this value from true to false, and your cluster will be opted out of automated deployment.
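
The opt-in and opt-out tag can be sketched as a small helper over a cluster's tag list. The lowercase `key`/`value` shape follows the ECS tagging format and the tag key is written here as `GuardDutyManaged`; the `team` tag is a made-up example, and applying the result to a real cluster (through the console or the API) is left out.

```python
GUARDDUTY_TAG_KEY = "GuardDutyManaged"


def set_guardduty_tag(tags, enabled):
    """Return a new tag list with the GuardDuty opt-in tag set to true or false."""
    value = "true" if enabled else "false"
    # Drop any previous value of the tag, keep every other tag untouched.
    kept = [tag for tag in tags if tag["key"] != GUARDDUTY_TAG_KEY]
    return kept + [{"key": GUARDDUTY_TAG_KEY, "value": value}]


# Hypothetical existing tags on a test cluster.
cluster_tags = [{"key": "team", "value": "platform"}]

opted_in = set_guardduty_tag(cluster_tags, enabled=True)
opted_out = set_guardduty_tag(opted_in, enabled=False)
print(opted_in)
print(opted_out)
```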

When you test this feature integration, we highly recommend that you also enable Container Insights. On this console, in the Monitoring section, there's a button you can toggle to enable Container Insights for your cluster. Container Insights is a feature that helps you collect, aggregate, and analyze the metrics and logs coming from your container runtime and underlying infrastructure, so it helps you collect the resource utilization consumed by your tasks, clusters, and services. As Spiros mentioned, when we deploy the GuardDuty agent for your task, it is injected as a sidecar container within your Fargate task.

This means it will consume additional resources within your task. Enabling this feature will help you understand the actual resource usage, such as CPU and memory, consumed by this new GuardDuty container, so you can decide whether you need to adjust your task size accordingly.

Once we have the configuration set up, we can click the Update button to update our cluster. When you opt in this cluster, it won't automatically apply to your running workloads, which means we won't stop and replace your already running workloads. You have to do a manual refresh. This means that if your task is managed by a service, you will need to trigger a new deployment, and if you have standalone tasks, you will have to create new tasks.
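
To plan that manual refresh, one approach is to split running tasks into service-managed and standalone ones. The sketch below relies on the `group` field of an ECS task description, which starts with `service:` for tasks launched by a service; the task ARNs and names are hypothetical, and the actual redeploy or re-run calls are not shown.

```python
def refresh_plan(tasks):
    """Group running tasks into services to redeploy and standalone tasks to re-run."""
    services = set()
    standalone = []
    for task in tasks:
        group = task.get("group", "")
        if group.startswith("service:"):
            # Service-managed task: one new deployment replaces all its tasks.
            services.add(group[len("service:"):])
        else:
            # Standalone task: it has to be launched again by hand.
            standalone.append(task["taskArn"])
    return {"redeploy_services": sorted(services), "rerun_tasks": standalone}


# Hypothetical running tasks, for illustration only.
running = [
    {"taskArn": "arn:aws:ecs:us-east-1:123456789012:task/demo/1", "group": "service:web"},
    {"taskArn": "arn:aws:ecs:us-east-1:123456789012:task/demo/2", "group": "service:web"},
    {"taskArn": "arn:aws:ecs:us-east-1:123456789012:task/demo/3", "group": "family:batch-job"},
]
plan = refresh_plan(running)
print(plan)
```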

Once you have your cluster opted in, it will show up on your GuardDuty dashboard. If you come to the GuardDuty console, under the ECS cluster coverage dashboard, you can see my testing cluster already showing up here. It gives you information such as the coverage status and how many instances or nodes are covered.

So now let's try to create some new tasks and see what my task looks like after I enable this new feature. As we mentioned before, the only change that may be needed in your task depends on where your container image is stored. Since the GuardDuty container image is stored in an ECR repository, you may need to add an execution role to your task definition that has the required ECR permissions, so that Fargate is able to pull the GuardDuty image from the ECR repo.

If you already use the AWS managed ECS task execution role policy for your execution role, there's zero action required on your side. So this is the only change needed in your task definition. And as I mentioned, if you already have your images stored in ECR and already have the ECR permissions, there are no changes required.
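
For an execution role that does not use the managed policy, the ECR pull permissions can be sketched as an IAM policy document. The four actions below are the image-pull actions also granted by the managed task execution role policy; treat the exact minimum set and the broad `Resource` as assumptions to check against the GuardDuty documentation. The `missing_pull_actions` helper is a hypothetical convenience and deliberately ignores wildcards and Deny statements.

```python
import json

ECR_PULL_ACTIONS = [
    "ecr:GetAuthorizationToken",
    "ecr:BatchCheckLayerAvailability",
    "ecr:GetDownloadUrlForLayer",
    "ecr:BatchGetImage",
]


def ecr_pull_policy():
    """Return an IAM policy document allowing image pulls from ECR."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": list(ECR_PULL_ACTIONS), "Resource": "*"}
        ],
    }


def missing_pull_actions(policy):
    """List pull actions not explicitly allowed (wildcards are not expanded)."""
    allowed = set()
    for statement in policy.get("Statement", []):
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        allowed.update(actions)
    return [action for action in ECR_PULL_ACTIONS if action not in allowed]


policy = ecr_pull_policy()
print(json.dumps(policy, indent=2))
print(missing_pull_actions(policy))
```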

Now, since I have already updated my task definition, I can go create some new tasks. I click this Run new task button, and for the launch type I will choose Fargate. For the platform version, this new feature is only supported on platform version 1.4.0 and above, so you can choose either LATEST or platform version 1.4.0.
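
A platform version check along these lines can be sketched as follows. The minimum of 1.4.0 is an assumption based on reading the garbled version number in the transcript; confirm it against the current GuardDuty documentation before relying on it.

```python
def supports_runtime_monitoring(platform_version, minimum=(1, 4, 0)):
    """Return True if a Fargate platform version meets the assumed minimum."""
    if platform_version.upper() == "LATEST":
        return True
    # Compare versions numerically, so "1.10.0" sorts after "1.4.0".
    parts = tuple(int(part) for part in platform_version.split("."))
    return parts >= minimum


for version in ("LATEST", "1.4.0", "1.3.0"):
    print(version, supports_runtime_monitoring(version))
```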

Since I'm going to create a new standalone task, I'll choose Task and select the task definition I just created. One thing I want to mention is that Fargate needs to be able to talk to the ECR service to pull down the GuardDuty image. If your Fargate task is launched in a public subnet, you will need to have a public IP assigned to your task.

If your Fargate task lives in a private subnet, there are two ways to achieve this. The first is to attach a NAT gateway to your private subnet. The second, which is what we recommend, is to create an ECR VPC endpoint so that your Fargate task can talk to the ECR service.
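
The networking rules from the last two paragraphs reduce to a small decision function. This is only a sketch of the reasoning as stated in the talk; the parameter names are hypothetical, and it does not inspect a real VPC.

```python
def can_reach_ecr(subnet_is_public, assign_public_ip=False,
                  has_nat_gateway=False, has_ecr_vpc_endpoints=False):
    """Return True if a Fargate task in this subnet can pull images from ECR."""
    if subnet_is_public:
        # In a public subnet the task itself needs a public IP.
        return assign_public_ip
    # In a private subnet, traffic leaves through a NAT gateway
    # or stays private through ECR VPC endpoints (the recommended option).
    return has_nat_gateway or has_ecr_vpc_endpoints


print(can_reach_ecr(True, assign_public_ip=True))
print(can_reach_ecr(False, has_ecr_vpc_endpoints=True))
print(can_reach_ecr(False))
```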

Once you have this networking set up, we can go create some new tasks. You can see at the bottom this is my task, which is now starting up. If we click the task, we can see which containers are now in my task. The amazon linux one is my application container. And in the Containers section, you can see there's another sidecar container that is auto-injected for you. We didn't change anything in the task definition; we just ran a new task, and this new GuardDuty container was automatically injected for you.

This GuardDuty container is a non-essential container, which means that if there are any problems, if the container fails to start due to a permission or network issue, or if the container exits unexpectedly due to some error, your task won't be stopped. All your application containers will continue in their current status.

Also, in the configuration on the page, you can see that we enforce a maximum memory limit on the GuardDuty container, so that we can make sure it won't consume memory in an unbounded manner. All these actions are meant to make sure that, with this feature integration, the performance and availability of your application workloads won't be impacted.
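
The effect of the non-essential flag can be illustrated with a tiny model of ECS behavior: a task is stopped when an essential container exits, and keeps running when a non-essential one does. The container names and the 128 MiB memory cap are hypothetical values, not the ones Fargate actually injects.

```python
def task_survives_exit(containers, exited_name):
    """Return True if the task keeps running after the named container exits."""
    for container in containers:
        if container["name"] == exited_name:
            # ECS treats a container as essential unless told otherwise.
            return not container.get("essential", True)
    raise KeyError(exited_name)


# Hypothetical task layout: one application container plus the injected sidecar.
task_containers = [
    {"name": "app", "essential": True},
    {"name": "guardduty-agent", "essential": False, "memory": 128},
]

print(task_survives_exit(task_containers, "guardduty-agent"))
print(task_survives_exit(task_containers, "app"))
```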

Now my GuardDuty container is running, and you may start to wonder: since I have an additional container running in my task, how many resources does this new container consume, do I have to be concerned about it, and what new task size do I need for my Fargate task?

To answer all these questions and collect all this data, we can use the Container Insights feature that I just enabled on my cluster.

To see the Container Insights metrics and logs that were collected, we can come to the CloudWatch dashboard. If you scroll down on the left side, you can see at the bottom there's a Container Insights option. If you select it, it redirects you to a page where you can see all the metrics and logs for the resources that have Container Insights enabled.

I can select the cluster where I launched my task. Under resources, you can see there are different levels of resources: I can see the metrics and logs aggregated at the cluster level, and also aggregated by task definition. So I will choose the task definition I just launched. Now you can see all the resource metrics related to CPU utilization and memory utilization, and also networking and disk.

When you test, you can monitor the metrics here, so you can understand what your new task's memory and other resource utilization look like, and then adjust your task size based on the metrics. At the moment, we only provide metrics at the cluster level, service level, and task level.

You may also want to dive into the specific amount of resources consumed by the GuardDuty container. To achieve this, we offer performance logs, which can help you dive into more detail on container-level resource usage.

If you choose the GuardDuty container, there's an Actions button you can select, and then you can choose to view performance logs. It will redirect you to a page that helps you query all the logs related to this container, and you can see the query is already prepopulated.

If you run the query, it will collect all the performance logs for you. If you look at an example log, you can see there's the container name, which is our GuardDuty container. In this log, you can see it tells you the CPU utilized and memory utilized, and it also includes network and disk.
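
Once such performance logs are exported, the sidecar's peak usage can be pulled out with a few lines of code. The field names (`Type`, `ContainerName`, `CpuUtilized`, `MemoryUtilized`) follow the Container Insights performance log format for ECS as I understand it; the container names and all numbers below are invented sample data, not measurements from the demo.

```python
import json

# Hypothetical performance log events, one JSON document per line.
SAMPLE_LOGS = [
    '{"Type": "Container", "ContainerName": "app", "CpuUtilized": 40.0, "MemoryUtilized": 300}',
    '{"Type": "Container", "ContainerName": "guardduty-agent", "CpuUtilized": 3.2, "MemoryUtilized": 48}',
    '{"Type": "Container", "ContainerName": "guardduty-agent", "CpuUtilized": 5.1, "MemoryUtilized": 52}',
    '{"Type": "Task", "CpuUtilized": 45.1, "MemoryUtilized": 352}',
]


def peak_usage(log_lines, container_name):
    """Return the peak CPU and memory seen for one container."""
    cpu = 0
    memory = 0
    for line in log_lines:
        record = json.loads(line)
        # Skip task-level records and records for other containers.
        if record.get("Type") != "Container":
            continue
        if record.get("ContainerName") != container_name:
            continue
        cpu = max(cpu, record.get("CpuUtilized", 0))
        memory = max(memory, record.get("MemoryUtilized", 0))
    return {"cpu": cpu, "memory": memory}


sidecar_peak = peak_usage(SAMPLE_LOGS, "guardduty-agent")
print(sidecar_peak)
```

Comparing the sidecar's peak against the task's reserved CPU and memory under load gives a starting point for resizing the task.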

So I highly recommend that when you test with your cluster and your workloads, you perform some load testing, so you can observe the resource utilization after this feature integration. That way you understand what the new task size should look like, and you can protect your production workload.

Now we have the GuardDuty container running in our Fargate task. The GuardDuty container will start monitoring your container runtime and collect all the telemetry to send to the GuardDuty service. If any threat is detected, GuardDuty will generate a security finding on your GuardDuty dashboard.

The finding also pinpoints the container and task information. If you come to the GuardDuty dashboard and select Findings, you can see there are security findings generated here. To see an example, if you scroll down, you can see there's the cluster information, which tells me which cluster it is. It also shows me the task details and container details. This is to provide enough information that you can quickly pinpoint which containers actually generated the threats, so that you can take action accordingly.
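
If findings are processed programmatically rather than read in the console, the same cluster, task, and container details can be extracted from the finding document. The nesting below (`Resource` > `EcsClusterDetails` > `TaskDetails` > `Containers`) reflects my understanding of the GuardDuty finding format returned by the API and should be verified against the API reference; the sample finding is entirely made up.

```python
def locate_workload(finding):
    """Pull the cluster name, task ARN, and container names out of a finding."""
    cluster = finding.get("Resource", {}).get("EcsClusterDetails", {})
    task = cluster.get("TaskDetails", {})
    return {
        "cluster": cluster.get("Name"),
        "task_arn": task.get("Arn"),
        "containers": [c.get("Name") for c in task.get("Containers", [])],
    }


# Hypothetical finding, trimmed to the fields used above.
sample_finding = {
    "Type": "Execution:Runtime/NewBinaryExecuted",
    "Resource": {
        "EcsClusterDetails": {
            "Name": "test-cluster",
            "TaskDetails": {
                "Arn": "arn:aws:ecs:us-east-1:123456789012:task/test-cluster/abc123",
                "Containers": [{"Name": "app"}],
            },
        }
    },
}

location = locate_workload(sample_finding)
print(location)
```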

So in the first demo, we learned how to enable this feature on Fargate. In the next demo, I will show how to enable this feature for ECS on EC2. There are two more manual steps you need to perform when you enable this feature for ECS on EC2.

The first one is the VPC endpoint creation. When you enable this for workloads on Fargate, the VPC endpoint is automatically created for you, so you can see there's already a GuardDuty VPC endpoint in your VPC console. However, when you enable this feature for ECS on EC2, it needs to be created manually.

Once you have the VPC endpoint created, we need to install the GuardDuty agent on the ECS EC2 instances. There are two ways of doing this. If your instance is managed by AWS Systems Manager, also known as SSM, there's a predefined SSM document that you can take advantage of directly.

To have your EC2 instances managed by SSM, there are some prerequisites you have to fulfill: you need to install the SSM agent on your EC2 instances, and you also have to update your instance profile permissions so that the instances can talk to the SSM service.

When your instances are managed by SSM, you can run this predefined document, which packages all the commands to install or uninstall the GuardDuty agent on your instances. This predefined document is provided by GuardDuty, and you will be able to see it in your own AWS account as well. Then you can run the command.

I already have the page open here. When you run the command, the only parameter you need to fill in is the package name for GuardDuty runtime monitoring. Then you can select which instances you want to install the GuardDuty agent on. I can choose one of the ECS instances I already launched from the list of instances, scroll down, and click the Run button.

Now you can see the status is already in progress. Once the document command completes and the status changes to Success, it means your GuardDuty agent is installed on your instances.

I have one instance that was already launched, with the GuardDuty agent installed using this SSM document. Once the GuardDuty agent is installed, you will be able to see the instances showing up on your GuardDuty dashboard as well.

Similarly, if you come to your GuardDuty dashboard and the runtime coverage page, and select ECS clusters runtime coverage, you can see the container instance is covered under my cluster. And if you select EC2 instance runtime coverage, you can also see the instance showing up on this dashboard.

When the coverage status is healthy, it means the GuardDuty agent is actively running on the instances. So that is for when your instance is managed by SSM.

When your instances are not managed by SSM, and you do not want to perform all the actions needed to enable it, there's another, manual way to install the GuardDuty agent.

The GuardDuty agent RPM is stored in an S3 bucket. So you can run commands to pull down the GuardDuty agent, along with the related signature and public key, from the S3 bucket.

I'll show the commands you need to run here. In the script I already created to install the GuardDuty agent, the first section downloads the GuardDuty agent RPM, signature, and public key from the S3 bucket. The second section verifies the signature. We highly recommend that after you download the RPM, you verify the signature and make sure the RPM has not been tampered with.

Then you can run the command to install the RPM. At the end, you can check the status of the GuardDuty agent. Now I can run this script, and you can see it's installing my GuardDuty agent. On the screen, you can see the state is active, which means it's successfully running.

To automate this, you can make this script part of your user data. Then when you launch instances through an Auto Scaling group or capacity provider, it will run automatically for you, so when your instance is launched, you already have the GuardDuty agent installed on it.

Similarly, if you start to run your ECS tasks on these EC2 instances, the GuardDuty agent will also generate findings on your behalf.

To show an example I already created, which is a finding for ECS on EC2 instances: the difference here is in the bottom section. You can see that it also provides the instance details, which pinpoint which instance it is, along with the other information that you may want to know about your instance.

So that's all for today's demo. I'll hand it back to Spiros.

Thanks, Yen.

So in summary, if there is one takeaway from today's presentation, it is that with the introduction of this feature on ECS and Fargate, as well as the existing support for GuardDuty runtime monitoring for EKS and the preview of doing this for EC2, you have the ability to turn on threat detection for runtime threats across the whole gamut of AWS compute.

And if you're interested in trying this feature, GuardDuty also offers a 30-day free trial for you to try it out. If you're still interested in ECS and Fargate, there's still time, if you don't have other plans, to go to some other breakout and builder sessions.

And if you haven't had the chance to visit our Modern Apps area on the floor, it's at the very back, on the far right. Stop by Serverlesspresso to see how Lambda can make you coffee.

And with that, we would like to thank you for joining our session today. Please remember to give us some feedback in the mobile app. There's still some time; Yen and I will be here to answer any questions you may have. Thank you.
