Securing Kubernetes workloads in Amazon EKS

Good morning, everyone. Welcome to Las Vegas. I'm Micah Hausler. I'm a Principal Engineer on the Amazon EKS team.

I'm George John. I'm a Product Manager on the Amazon EKS team. Thank you all for being here with us today. You're in session K335, Securing Kubernetes Workloads in Amazon EKS.

So we're here today talking about security, and one of the oldest forms of security you can think of is probably a doorway, a door, right? Sometimes you want to keep things out, or you want to keep things in. So we're going to use this theme of entry control throughout our talk today to frame our discussion about security.

To talk about our agenda a little bit, it's going to be three high-level pieces: we're going to talk about securing access into the Kubernetes cluster, securing access from the cluster, and then securing access within the cluster. Those are going to be the main points of our discussion. And in case you're not familiar with Kubernetes — quick show of hands, who here uses Kubernetes, or has used Kubernetes or EKS? OK, so pretty much everybody. Great. Awesome.

If you're not familiar with Kubernetes, a really brief overview: Kubernetes is a container orchestrator. It controls and schedules containers on compute — in EKS, that's EC2 instances and AWS Fargate. You can run all kinds of applications on it: your first-party applications, vendor applications, and so on. And those often interact with other AWS services like S3, Amazon RDS, Amazon EMR, or anything else, right?

So that's just a high-level view of what Kubernetes is and how it works.

Yeah, so we're going to jump over into the first section here.

All right. Imagine you're living in Mycenaean Greece in 1250 BC, and you're tasked with securing the acropolis, the most important part of the city. What would you do? You might build something like this. This is a picture of the Lion Gate in Mycenaean Greece. If you look at it, it's a huge wall fortified with giant stones, with a door that can be shut. For that time, it was pretty cutting edge. In fact, about a thousand years later, when the Greeks saw this, they thought it was built by giants.

So from a very early time, we've known that securing the front gate is very important. Fast forward about 3,300 years to 2018: Amazon EKS had just launched. And as you can imagine, securing the front gate to an EKS cluster was equally important.

The approach we took at that time was a very Kubernetes-native, open-source kind of method to secure your clusters. It meant that you could use Kubernetes permissions along with IAM, defining a mapping through a ConfigMap to say who can access what.

Now, this approach had a couple of tradeoffs. On one hand, it meant you could use IAM — you don't have to create another identity provider. You use IAM along with Kubernetes RBAC to set up access, which means you also get all the goodness of IAM, like auditability with CloudTrail, multi-factor authentication, and things like that. So those are all the good things.

On the other hand, we've been getting customer feedback on some of the experience challenges with this approach. One thing we've heard is that customers have to use multiple APIs to bootstrap a cluster with access. By that, I mean you first use the EKS management APIs to create the cluster, you have to wait for the cluster to be ready, and then you use the Kubernetes APIs to define the mapping in the ConfigMap between IAM principals and Kubernetes permissions. So that's one piece of feedback we've gotten.
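To make that concrete, the object being edited in that second step is the aws-auth ConfigMap in the kube-system namespace. Here's a minimal sketch of that mapping (the role ARNs and group names are made up for illustration):

```sh
# Hypothetical aws-auth ConfigMap mapping IAM principals to Kubernetes users/groups.
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: ConfigMap
metadata:
  name: aws-auth
  namespace: kube-system
data:
  mapRoles: |
    # Worker node role: required so nodes can join the cluster
    - rolearn: arn:aws:iam::111122223333:role/eks-node-role
      username: system:node:{{EC2PrivateDNSName}}
      groups:
        - system:bootstrappers
        - system:nodes
    # An operator role mapped to a Kubernetes group that RBAC bindings reference
    - rolearn: arn:aws:iam::111122223333:role/dev-team
      username: dev-team
      groups:
        - dev-viewers
EOF
```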

So what's the downside of that? It means you can't really use automation like infrastructure-as-code tools to bootstrap a cluster with the access you need in a single step.

Another piece of feedback we've been getting from customers is that the whole experience is a little bit finicky. You have to really make sure you don't make any typos and that the mapping is exactly the way it's supposed to be. If you do make a typo or a mistake, you might get locked out. There are ways to restore access, but still, that's not an ideal experience.

And finally, when you create a cluster, the IAM principal that creates the cluster is automatically granted superuser privileges on the Kubernetes side. Which means that if you, for whatever reason, delete that IAM principal, you kind of get locked out. You can still restore access, but it's not the ideal experience.

Now, if you don't believe me, just look at our roadmap — EKS, if you don't know, has a public roadmap on GitHub — where a lot of customers have given us feedback that this experience is not ideal. But we have some good news for you: we have an upcoming feature from EKS called Cluster Access Management that's going to address most, if not all, of these issues.

Let me hand the mic over to Micah to talk about the APIs, how we built it, and to give a quick demo.

Thanks. So we're really excited about this cluster access management; it's coming very soon. Some of the key highlights are that it's going to provide AWS APIs. As George talked about, today you have this ConfigMap in the Kubernetes API: when you create a cluster, you go to the AWS EKS API using AWS credentials, and then you have to switch to a completely different API to edit what is basically a file in the Kubernetes API.

With this new feature, it's going to be AWS-native APIs, so it'll work great with infrastructure-as-code tools. It also really simplifies access: as a cluster is created, you can add entries even before the cluster is finished creating, so you don't have to wait for the cluster to be ready to start bootstrapping permissions into it.

And then it also provides granular control. Rather than having to give someone cluster-creator permissions and then expect that identity to go and edit the permissions in the cluster, these are all secured with EKS APIs. There are permissions around them, so you can write IAM policy around who can edit access to the cluster.

And here's just a brief screenshot of what this looks like in the management console. When you create a cluster, you'll be able to configure cluster access: do I want the cluster creator to have any permissions? Maybe I don't — if I have some automation creating clusters, maybe that automation, whether it's my infrastructure as code or whatever it is, doesn't actually need access to the cluster.

And then after you create the cluster, if you want to add additional entries through this management API, you can add those here.

This is another view of what that looks like. Additionally, we're going to provide coarse-grained policy access. I'll show this in the demo in a second, but basically you can bootstrap some level of permission for the identities that can access the cluster. There's authentication and authorization.

The first half is the authentication part: who can authenticate to the cluster. The second half is the authorization part: what level of access in the cluster they have. And here's what this API ends up looking like. If you're familiar at all with the AWS CLI, you'll create the cluster and set the authentication mode — I'll show what that looks like in the demo in a second.

And then you'll be able to create what we call access entries and associate those with policies. Post-creation, you can always update that. So we'll go into the demo really quickly, and I'll show you what that looks like.
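As a rough sketch of the calls just described — a hedged example based on the feature as presented in the talk, so exact flag names and policy ARNs may differ from what ships:

```sh
# Create a cluster with an authentication mode and opt out of granting
# the cluster creator admin access (names and ARNs are hypothetical).
aws eks create-cluster \
  --name demo-cluster \
  --role-arn arn:aws:iam::111122223333:role/eks-cluster-role \
  --resources-vpc-config subnetIds=subnet-aaaa,subnet-bbbb \
  --access-config authenticationMode=API_AND_CONFIG_MAP,bootstrapClusterCreatorAdminPermissions=false

# Grant an IAM role access to the cluster (authentication)...
aws eks create-access-entry \
  --cluster-name demo-cluster \
  --principal-arn arn:aws:iam::111122223333:role/cluster-viewer

# ...and attach a coarse-grained access policy to it (authorization).
aws eks associate-access-policy \
  --cluster-name demo-cluster \
  --principal-arn arn:aws:iam::111122223333:role/cluster-viewer \
  --policy-arn arn:aws:eks::aws:cluster-access-policy/AmazonEKSViewPolicy \
  --access-scope type=cluster
```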

So here I have a cluster — it's already been created — and I'm going to run through what this API is going to look and feel like. I'm just setting up some configuration here to use a specific config file, and I'm going to describe my cluster. One of the first things you'll notice when you describe a cluster is that, alongside the settings you're probably familiar with, like the endpoint, we're adding a new field for the authentication mode. This cluster already has API and ConfigMap set.

If you're familiar with the existing method: if you look at the ConfigMap in a cluster, you can see the roles that are defined in it. In this case, I've used a managed node group, and I've already attached that managed node group.

When you use the EKS API for that, it automatically adds this entry into the aws-auth ConfigMap. So you can think of it as an otherwise empty ConfigMap with just the EKS-generated entry. Then, as shown previously, you can update the cluster config, and we have three different modes.

If you have an existing cluster or create a new cluster, the default mode is going to be ConfigMap, so it's going to look and feel the same. Any controls you have around who has access to the cluster, and how you're managing access to the cluster, stay the same. This new method exists, but by default it's not going to be enabled; you have to enable it.

The second mode is API and ConfigMap, which means we look at the entries in the EKS API that you manage, as well as the ConfigMap — that's kind of the migration phase. And the last mode is API only, where we only use the new EKS APIs to configure access to the cluster.
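Switching modes is just a cluster config update; a minimal sketch (the cluster name is a placeholder, and the exact flags could differ at launch):

```sh
# Move an existing cluster from CONFIG_MAP to the migration mode,
# then later to API only once all entries exist in the EKS API.
aws eks update-cluster-config \
  --name demo-cluster \
  --access-config authenticationMode=API_AND_CONFIG_MAP

aws eks update-cluster-config \
  --name demo-cluster \
  --access-config authenticationMode=API
```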

So I've already updated this cluster, which means this call is going to fail, but you'll see API and ConfigMap set on the cluster. Now I'm going to list access entries, and I have two in this cluster. What happened is: I created this cluster, but I haven't personally created any access entries. Because I updated the cluster to API and ConfigMap mode, EKS automatically added the cluster creator for me and added the node role for my nodes to the API.

And I can describe these — I can look at the cluster creator entry, which just shows my principal identity and the ARN — and I can do the same to describe the node role.

The one key difference here is the type: you'll see EC2_LINUX. That just means we do some additional configuration management of the permissions, because different nodes have different permission levels in EKS — Windows nodes have a slightly different set of permissions than the EC2 Linux nodes. That's the type you see there. But for most human users — or not even human users, other machine users — it'll just be a standard type.

And then finally, I just wanted to illustrate that if I try to roll my cluster back to ConfigMap only, it's actually going to fail. We don't support rolling back to the ConfigMap, because we need those entries in the API.

And I'll just show you briefly: this is an AWS config file I'm using just for this demo, and it shows I've got a couple of different profiles. This lets me illustrate different roles accessing the cluster with different levels of permission.

So I've got a base role — that's the role I used to create the cluster, so right now it's an administrator in the cluster — and I've got an additional role that I'm calling cluster admin, and one that's called cluster viewer.

If I want to create a new access entry, I just call the API again. You'll be able to do this with infrastructure as code and anything else when this comes out.

So now I've created this access entry for this viewer. Right now it doesn't have any permissions in the cluster. So let's look, from Kubernetes, at what the built-in cluster roles are — these are the cluster-level permissions in Kubernetes role-based access control, or RBAC.

And I'm going to grep out all the system roles that are used by service accounts. There are really just four: there's admin, which is kind of a lower-level admin; there's full cluster-admin; and there's an edit role and a view role.

So in this new API we created access policies in the EKS API that are roughly analogous to those four Kubernetes roles. Again, as I called out, these are coarse-grained policies that just give you the ability to bootstrap access — to let someone have view access to the cluster, or administrator access to the cluster. You don't have to use them at all if you want to use RBAC, or you can use them together with RBAC.

So here's the list of the policies we're going to have at launch, which again should look pretty familiar: you see the admin policy, similar to the RBAC role, but you also see some additional ones here like nodes and EMR. We're going to be using this new API not just for humans or for other automation that needs to access the cluster — other AWS services that need to access your cluster will use this method as well.

So when you describe access to a cluster, you'll see: OK, if I'm using EMR on EKS, it's using this role with this policy. And now I can associate an access policy.

I'm going to associate that viewer role I have with the EKS view policy. And now, if I switch my AWS profile to use the viewer IAM role, I can call get-caller-identity, which just says: who am I, what is my role? OK, now I'm actually the viewer role, not the cluster creator. And with that, can I get all pods in all namespaces?

Great, I can. Awesome. I'm not the cluster administrator, but now I can access the cluster. But if I try to do any mutation that's not in the view policy — so if I try to create a ConfigMap, just using the kubectl command line with --from-literal to set a key and a value in a ConfigMap — it's actually going to fail. This user doesn't have permission to create ConfigMaps.
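A minimal sketch of those two kubectl calls as described (the ConfigMap name and key are made up):

```sh
# Allowed by the view policy: read-only access across namespaces.
kubectl get pods --all-namespaces

# Denied: the view policy grants no write permissions,
# so creating a ConfigMap fails with a Forbidden error.
kubectl create configmap demo-config --from-literal=foo=bar
```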

And then if I go back to the admin role that I created the cluster with — I'm just calling it the base profile — I can update the cluster config to API only, so that's in progress, and I can run get-caller-identity again and still get my pods. So that's the demo.

So how does this work under the hood? If you're familiar at all with EKS, we use an open source project called aws-iam-authenticator. It's on GitHub; I'm a contributor to it, and we have other folks at AWS who help maintain it as well.

What happens is that you, the Kubernetes client, make a call to Kubernetes, and when you do, you send what's called a bearer token. That token gets interpreted by Kubernetes, which says: OK, I don't know how to handle this token, but I know I have the aws-iam-authenticator authentication webhook enabled — that's managed by EKS — and it knows how to handle that token.

What I'm calling a token is actually a presigned SigV4 request for the GetCallerIdentity API I showed earlier. As a client, when you call GetCallerIdentity, the response you get back from AWS is who you are. And you can presign that URL — if you're familiar with presigned URLs with S3, you can give someone an S3 presigned URL and they can get an object for as long as that signature is valid —

— without having to give your credentials to someone else, right? This works the same way: you can presign a URL for GetCallerIdentity, give it to someone, and they can execute that call — but only that call — and see who you are. So that's how this works.
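On the client side, this is the token that kubectl's credential plugin or the AWS CLI produces; roughly what that looks like (output trimmed and illustrative):

```sh
# Produce the bearer token for a cluster: a base64-encoded, presigned
# sts:GetCallerIdentity URL wrapped in an ExecCredential object.
aws eks get-token --cluster-name demo-cluster
# {
#   "kind": "ExecCredential",
#   "apiVersion": "client.authentication.k8s.io/v1beta1",
#   "status": {
#     "expirationTimestamp": "...",
#     "token": "k8s-aws-v1.<base64-encoded presigned GetCallerIdentity URL>"
#   }
# }
```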

So aws-iam-authenticator executes that request against the Security Token Service, STS, and gets back a response saying who the client is. Then it returns that to the Kubernetes API to say: here's who the user is. And in ConfigMap mode — the existing way things have worked —

— the aws-iam-authenticator process looks up the aws-auth ConfigMap and says: OK, I see this role back from STS; what permission, if any, does it have in the cluster? And then it tells Kubernetes who the user is.

The key difference is that as we introduce the API mode and the API-and-ConfigMap mode, the EKS API is now a source of truth for this too. We push the records you send down to the Kubernetes control plane that we manage, onto a file on disk. The authenticator can merge that, with the EKS API entries taking precedence over whatever is in the ConfigMap when you're in API-and-ConfigMap mode, and then send the result back to the API server to say: here's who the client is. When it's API-only mode, it obviously ignores the ConfigMap and only reads from the file synced down through the API onto disk.

All right. So far we've looked at how you can secure access into the cluster — the front gate of the cluster. In the next section we're going to look at how you can secure access from within the cluster to resources outside.

This is a picture of Fort Jefferson, about 68 miles from the coast of Key West in Florida. Let me ask: if you were given the task of building a facility that secures egress — you want to make sure people don't leave the facility — a good place to build it would be an island surrounded by water, like this one.

Just a little bit of history here: Dr. Samuel Mudd, who helped the assassin of U.S. President Lincoln, was imprisoned here for a period of time. This facility used to be a federal prison, I think in the mid-19th century.

So coming back to EKS: how do you make sure you're able to secure or control what resources your applications access from within the cluster? We have this feature called IAM Roles for Service Accounts, also known as IRSA. We've had it for a couple of years now.

Prior to IRSA, the way you would give your pods running on EC2 worker nodes IAM permissions to access other AWS resources was by leveraging the IAM role attached to the EC2 instance itself. That meant all the pods running on the same EC2 instance would get all the permissions that IAM role had, which might be overly permissive and violate the least-privilege principle.

So in 2019 we launched IRSA. IRSA enabled you to give granular IAM permissions to pods, so you could have multiple pods belonging to multiple applications, all running on the same EC2 instance, but still with different sets of permissions. IRSA has been great, and we have a lot of customers using it.

One of the key things I want to call out is that when we built IRSA, we built it for Kubernetes on AWS, not just EKS on AWS. Let me explain what I mean by that. Today you have multiple ways to run Kubernetes on AWS: you have EKS, which is a managed service; we have EKS Anywhere, so you can run EKS on premises; and you can also run your own self-managed Kubernetes clusters on EC2 instances, where you're responsible for both the control plane and the data plane. We have a lot of customers doing that.

So we wanted IRSA to work across all these environments. When we designed IRSA, we intentionally did not take any dependency on EKS APIs. The way we built it was by leveraging fundamental AWS services like IAM.

If you're familiar with IRSA — actually, quick show of hands, how many of you are using IRSA today? OK, maybe I'd say 60%. For the rest of you: when you use IRSA, you leverage some of the core IAM constructs, like the OIDC provider. You create an OIDC provider in IAM, and you use a role trust policy to trust the OIDC provider you just created.

So there are steps you go through, and the fact that you're not interacting with any EKS APIs means you can use IRSA in EKS, in self-managed clusters, and in other environments. But the feedback we've been getting from customers is that there are some user experience challenges with that.

There are customers, for instance in regulated industries, who have told us that the cluster administrator doesn't always have permissions to administer IAM. So for those customers, when they're setting up IAM roles to be used with IRSA, they have to reach out to the IAM administration team.

That means a lot of back and forth between the cluster administrator and the IAM administrator, which makes the whole process manual, lengthy, and not very automation friendly. That's the feedback we've gotten.

The second one is the scoping of IAM roles. With the roles you use in IRSA today, if you're familiar with it, the trust policy points to the OIDC provider tied to the cluster itself. So if you want to use the same role tomorrow on a new cluster — maybe as part of a blue/green deployment where you brought up a new cluster and want to reuse the role — you now have to go back to the role's trust policy and update the entry to point to the new cluster. And if the roles are managed by a different team, it's the cluster administrator or somebody else reaching out to that team to get this done. Again, from an experience standpoint, not ideal.

The third one is that as you scale the number of clusters in your account, you might run into some limits, like the number of OIDC providers you can create in an account, or the size of the trust policy. There are workarounds — you can get some of these limits increased, or you can duplicate the roles — but again, not a great experience.

And finally, we've heard from customers, especially those with very dynamic environments and ephemeral clusters that come and go, that they want to bootstrap the cluster with all the add-ons and all the permissions in a single step. IRSA today has a dependency on the cluster being in a ready state: after you create the cluster, you have to wait several minutes for it to be ready before you can use it, which again is not great.

So we have some great news for you. If you've been closely following the re:Invent announcements, we announced a feature last Sunday called EKS Pod Identity. Before I hand the stage to Micah, let me emphasize that we have a lot of customers using IRSA; they're happy with it, they're using it successfully, and we will continue investing in and supporting IRSA. At AWS, we take customer feedback seriously, and we've heard from a segment of customers who want a different solution.

So we want to give you choices. IRSA will continue to exist, and EKS Pod Identity is another option for you; based on your use case and requirements, you can decide what works better for you. With that, let me hand it over to Micah.

Thanks, George. Yeah, I'm really excited about this launch. One of the key points is that it really simplifies the trust. As George talked about, updating a role trust policy requires a lot of permissions — you basically have to be an IAM admin.

You're able to extend trust to basically anyone at that point for that role. So this makes it much simpler: you can just say, I trust EKS to distribute this role correctly.

It's also backward compatible, so it works side by side with IAM Roles for Service Accounts. If you're using IRSA today, you can turn on Pod Identity, and it will basically take precedence over IRSA.

And then you can turn off IRSA. One of my favorite features of this is scalable ABAC. If you're not familiar with attribute-based access control, it basically uses attributes from the identity to enforce permissions. The way we've done this is by adding support for IAM role session tags — I'll get into this in the demo and show how powerful it is — and that's a really exciting security feature here.

Additionally, it's auditable like everything else: the new EKS APIs that support this, as well as the AWS API usage by the pods, are all integrated with AWS CloudTrail.

Here's a brief screenshot of what this looks like in the AWS console. At the cluster level, you'll create a pod identity association. This is just a mapping of an IAM role to a Kubernetes namespace and a service account within that namespace: cluster, role, namespace, service account.

And then you'll be able to see a list of those associations in the AWS console as well. To give you an overview of how this works: say you have an IAM administrator who creates a role — we're going to call it secret-reader. They set the trust policy on this role to Allow, with the principal set to pods.eks.amazonaws.com and the actions sts:AssumeRole and sts:TagSession.
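A minimal sketch of that trust policy (the role name is the talk's example; the rest is standard IAM trust-policy JSON):

```sh
# Create the secret-reader role with a trust policy that lets EKS Pod Identity
# assume the role and tag the session on behalf of your pods.
cat > trust-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "pods.eks.amazonaws.com" },
      "Action": ["sts:AssumeRole", "sts:TagSession"]
    }
  ]
}
EOF

aws iam create-role \
  --role-name secret-reader \
  --assume-role-policy-document file://trust-policy.json
```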

If you're familiar at all with using an identity on any compute in AWS — whether that's EC2 or ECS or Lambda or CodeBuild or anything — this is basically how you do it: you trust the service. So one of the key features here is that this will look and feel a lot like any other AWS service you want to provide an identity to.

And so, in order to use this secret-reader role, you might have a cluster administrator, and you might give them permission to create these pod identity associations.

But that's not enough on its own — I can't just give my cluster administrator permission to create pod identity associations. I also have to give them iam:PassRole. PassRole is a key defense mechanism built into IAM that prevents privilege escalation. Just because the secret-reader role trusts EKS doesn't mean your cluster administrator is allowed to use that role; they also have to have PassRole on that role.
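A hedged sketch of what that cluster administrator's policy might look like — the action name and resource ARNs are illustrative, so check the actual EKS and IAM action names for your setup:

```sh
cat > cluster-admin-pod-identity.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ManagePodIdentityAssociations",
      "Effect": "Allow",
      "Action": ["eks:CreatePodIdentityAssociation"],
      "Resource": "arn:aws:eks:us-west-2:111122223333:cluster/*"
    },
    {
      "Sid": "AllowPassingOnlyThisRole",
      "Effect": "Allow",
      "Action": "iam:PassRole",
      "Resource": "arn:aws:iam::111122223333:role/secret-reader"
    }
  ]
}
EOF
```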

So that's one of the key protections in this new API: enforcing that check. Now, what does the session tagging end up looking like, and how can you use it? This is an incomplete example, but just to show how it works: say you have an AWS resource, a Secrets Manager secret, and that secret has tags. Say the tag key is eks-cluster-name and the value is whatever your cluster's name is, and the same for kubernetes-namespace and kubernetes-service-account.

The way IAM role session tags come into play is that you can reference session tags in IAM policy. In this example policy, in the condition, I've said the strings must be equal: the resource tag on the secret must match the value of the same session tag in that session.

So if I have an IAM role that I want to use in a cluster, all those tags on the resource must match the session tags. The great thing about this is that I can reuse this IAM policy — and even the IAM role — across multiple clusters, and not have to write an individual policy or role per cluster that hard-codes the cluster name. I can just rely on the session tags and the resource tags matching.
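The core of that pattern is a condition comparing a resource tag to the session (principal) tag; a minimal, hedged sketch, with the tag key names as I understand the Pod Identity session tags to be named:

```sh
cat > secret-reader-abac.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["secretsmanager:GetSecretValue"],
      "Resource": "*",
      "Condition": {
        "StringEquals": {
          "aws:ResourceTag/eks-cluster-name": "${aws:PrincipalTag/eks-cluster-name}"
        }
      }
    }
  ]
}
EOF
```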

So what that ends up looking like is: you have a container — say we have a yellow cluster — and it assumes the same IAM role, the secret-reader role.

The role session tag that EKS sets when it hands the credentials to the container in your pod is going to be eks-cluster-name: yellow, and the resource tag on the secret — whatever permission you give the pod, whether it's creating a secret or accessing a secret value — must match that session tag.

And it's the same for a blue cluster: the same thing, but you can reuse the same IAM role and the same IAM policy. This is really powerful if you want to enable role reuse. If you have dozens of clusters and you're maintaining all these different roles that are essentially the same except for the hard-coded cluster name, you can do away with that and just keep one.

So this is a brief overview of what those APIs look like, again through the AWS CLI. You can create pod identity associations, where you specify the cluster name, the namespace, the service account, and the role ARN. And later you can update that as well, to a different role.

When you do update to a different role, you have to restart your pod, because the pod will keep running with the previous credentials from the previous session. You can also list all your pod identity associations in one place, and even remove them. So with that, we'll jump into a brief demo.
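A minimal sketch of those calls (cluster, namespace, service account, role, and association ID are placeholders):

```sh
# Map an IAM role to a Kubernetes service account in a namespace.
aws eks create-pod-identity-association \
  --cluster-name redfish \
  --namespace default \
  --service-account demo-sa \
  --role-arn arn:aws:iam::111122223333:role/pod-role

# See everything mapped on the cluster.
aws eks list-pod-identity-associations --cluster-name redfish

# Point an existing association at a different role (restart the pod afterwards),
# or remove it entirely.
aws eks update-pod-identity-association \
  --cluster-name redfish --association-id a-exampleid123 \
  --role-arn arn:aws:iam::111122223333:role/other-role
aws eks delete-pod-identity-association \
  --cluster-name redfish --association-id a-exampleid123
```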

So here I've got a couple of clusters. I'm using kubectl and looking at the contexts I have; I'm really just going to be using the bluefish and redfish clusters.

The first thing I'm going to do is show you the role setup, so I'm going to run through this. I've got a role that I want to grant to my cluster, and I'm going to look at its trust policy document.

This is the trust policy document we saw already. It shows pods.eks.amazonaws.com — that's what it's trusting — and it allows EKS to tag sessions and assume the role.

I added an additional condition here to say: I don't want this role to be used with just any cluster, I only want it for these two clusters. You saw I had a bunch of other clusters; I can add condition keys to say it's only allowed for these clusters.
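I didn't capture the exact condition from the demo, but one plausible way to scope the trust policy to specific clusters is a condition on the calling cluster's ARN — treat the condition key here as an assumption and check the Pod Identity documentation for the keys it actually supports:

```sh
cat > trust-policy-scoped.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "pods.eks.amazonaws.com" },
      "Action": ["sts:AssumeRole", "sts:TagSession"],
      "Condition": {
        "ArnEquals": {
          "aws:SourceArn": [
            "arn:aws:eks:us-west-2:111122223333:cluster/bluefish",
            "arn:aws:eks:us-west-2:111122223333:cluster/redfish"
          ]
        }
      }
    }
  ]
}
EOF
```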

So if I don't want it to be just any cluster in my account, it can be specific ones. OK, let's look at the policies on this role. I've got two: an S3 policy and a Secrets Manager policy.

If we look at the S3 policy, in the first statement I've got three actions — delete, put, and get objects — but with different resources they apply to.

I've got an S3 bucket — it's a little bit long of a bucket name — and you can see on the first line I've got a config prefix. After that, I've put in the session tags as template variables: it must be config/cluster-name/namespace/service-account-name, and then it can do anything under that. On the second line I just have data/, then the cluster name, and then everything under it.

So that's kind of a common pool prefix: anything in the cluster can put, get, and delete objects under it. And finally, I have a logs prefix, where I've put in the Kubernetes pod UID.
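A hedged reconstruction of the shape of that S3 policy — the bucket name is hypothetical, and the session tag key names are what I understand Pod Identity to set, so verify them against the documentation:

```sh
cat > pod-role-s3.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ObjectAccessScopedBySessionTags",
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
      "Resource": [
        "arn:aws:s3:::example-demo-bucket/config/${aws:PrincipalTag/eks-cluster-name}/${aws:PrincipalTag/kubernetes-namespace}/${aws:PrincipalTag/kubernetes-service-account}/*",
        "arn:aws:s3:::example-demo-bucket/data/${aws:PrincipalTag/eks-cluster-name}/*",
        "arn:aws:s3:::example-demo-bucket/logs/${aws:PrincipalTag/kubernetes-pod-uid}/*"
      ]
    },
    {
      "Sid": "ListBuckets",
      "Effect": "Allow",
      "Action": ["s3:ListBucket", "s3:ListAllMyBuckets"],
      "Resource": "*"
    }
  ]
}
EOF
```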

That UID is going to be different for every pod. Why might you want this? Well, say you run a CI system, and you don't want to key off the service account name or even the pod name, because pod names can be duplicated over time — but UIDs won't be. You might want every pod to upload to a different prefix and not be able to overwrite others'. That's an example of where you might want this.

And then finally, the role can list the whole bucket and can list my other buckets. OK.

If we look at the Secrets Manager policy, this one is going to be tag based rather than S3-path-prefix based. I'm going to allow secretsmanager:* — so any Secrets Manager call — as long as the resource tag eks-cluster-name matches.

So the tag on the resource uses the cluster name key, and the value of that tag has to match the session tag value for eks-cluster-name — I'm doing a cluster-level tag here. Then I'm letting it do a few things on all resources: it can get random passwords, so it can generate random passwords, that's fine; it can list all the secrets — it can't necessarily access them, but it can list them. And then a few other permissions: for untagging, I deny untagging.

I don't want to let it untag things, so you can't untag someone else's resource and then tag it as yours and claim it. And I don't let it manage any Secrets Manager policy APIs.
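A hedged reconstruction of the shape of that Secrets Manager policy, not the exact one from the demo. Note that creating a secret has to be checked against the request tags (aws:RequestTag) because the resource doesn't exist yet:

```sh
cat > pod-role-secrets.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "TagMatchedSecrets",
      "Effect": "Allow",
      "Action": "secretsmanager:*",
      "Resource": "*",
      "Condition": {
        "StringEquals": {
          "aws:ResourceTag/eks-cluster-name": "${aws:PrincipalTag/eks-cluster-name}"
        }
      }
    },
    {
      "Sid": "CreateOnlyWithMatchingTag",
      "Effect": "Allow",
      "Action": "secretsmanager:CreateSecret",
      "Resource": "*",
      "Condition": {
        "StringEquals": {
          "aws:RequestTag/eks-cluster-name": "${aws:PrincipalTag/eks-cluster-name}"
        }
      }
    },
    {
      "Sid": "UnscopedHelpers",
      "Effect": "Allow",
      "Action": ["secretsmanager:GetRandomPassword", "secretsmanager:ListSecrets"],
      "Resource": "*"
    },
    {
      "Sid": "NoUntagging",
      "Effect": "Deny",
      "Action": "secretsmanager:UntagResource",
      "Resource": "*"
    }
  ]
}
EOF
```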

So I'm going to go ahead and list my clusters — I've got bluefish and redfish — and I'm going to create a new add-on. Part of this new feature is that we have an add-on that does the credential exchange for the pod.

When a pod starts up — and I'll get into this in a little bit — it will reach out to this agent to get credentials, and the agent will reach out to EKS to do the credential exchange.

So I'm installing the Pod Identity Agent, and that will take just a second. And I'm going to list pod identity associations on my redfish cluster — none yet. Great.

Now we're going to go ahead and create one. We've got our IAM role, we've seen the policy, so we'll go ahead and create it.

And there, it's created: on the cluster redfish, with that pod role. OK, so now we've got a role associated with the cluster.

What I'm going to do next is install a Kubernetes pod. If I kubectl get pods, I don't have any pods yet. I have a deployment here, and with this deployment I'm going to do two things.

I have a service account — that's the service account I referenced in my pod identity association. And it's actually a really simple container: I'm running Amazon Linux 2023, and I've injected a few environment variables.

I put in the bucket name, mapping that through a ConfigMap, and I've given the pod some hints about its own metadata. When you have session tags on a role session credential, you don't necessarily know that you have them.

So I have to inform the pod: here's your cluster name, here's your pod namespace, here's your service account name.

The pod name in this case is the hostname, so I don't inject that here, but I could do that as well, and the same for the pod UID.

Those are injected as environment variables, and I've mounted some ConfigMaps as paths in the pod, which we'll see in a second.
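A minimal sketch of what a deployment like that could look like — the service account name, image, and variable names are placeholders, not the exact manifest from the demo:

```sh
kubectl apply -f - <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pod-identity-demo
spec:
  replicas: 1
  selector:
    matchLabels:
      app: pod-identity-demo
  template:
    metadata:
      labels:
        app: pod-identity-demo
    spec:
      # Must match the service account in the pod identity association
      serviceAccountName: demo-sa
      containers:
        - name: shell
          image: public.ecr.aws/amazonlinux/amazonlinux:2023
          command: ["sleep", "infinity"]
          env:
            # Hints so scripts inside the pod know their own metadata,
            # mirroring the session tags EKS sets on the role session.
            - name: EKS_CLUSTER_NAME
              value: redfish
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
            - name: POD_UID
              valueFrom:
                fieldRef:
                  fieldPath: metadata.uid
EOF
```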

So I run my demo setup — actually, I can just run it from my side. I'm creating some ConfigMaps that contain some setup scripts, and I'm applying my Kubernetes deployment into the cluster.

If I kubectl get pods, I've got my pod running from that deployment. So I'm going to jump into that pod using kubectl exec — I just grab the pod name — and now I have a shell in the pod.

So I'm in my redfish cluster, in a pod in the default namespace, and you can see the pod hostname there. I'm going to run a script — I have a few scripts to show off here.

In the pod, let's run get-caller-identity: what identity does the pod have? OK, it has the pod role. Great, it works.

But let's also look at those environment variables we set — those and a few others. We set the ones for the pod namespace, service account, and pod UID.

But there are a few others that get injected into the pod by AWS for us: the container authorization token file and the container credentials full URI. Those are hints to the SDK that say: here's how you get your credentials — you'll talk to this agent on that URL, and here's the token you need to read from disk and pass to the agent.
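To make the exchange concrete, here's an illustrative sketch of what the SDKs do with those variables — the address and path shown are what the Pod Identity Agent uses as I understand it, and you'd never normally call this yourself:

```sh
# Injected by EKS Pod Identity (example values):
echo "$AWS_CONTAINER_CREDENTIALS_FULL_URI"      # e.g. http://169.254.170.23/v1/credentials
echo "$AWS_CONTAINER_AUTHORIZATION_TOKEN_FILE"  # path to a projected service account token

# The AWS SDKs and CLI do this automatically; shown only to illustrate the flow.
curl -s \
  -H "Authorization: $(cat "$AWS_CONTAINER_AUTHORIZATION_TOKEN_FILE")" \
  "$AWS_CONTAINER_CREDENTIALS_FULL_URI"
# => temporary AccessKeyId / SecretAccessKey / Token for the associated IAM role
```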

We'll go really quick — we'll do the S3 demo first. We saw that policy in S3 with those different prefixes. So we're going to create some random data: just read some random bytes and write them to a temp file.

And we're going to copy that random data into the bucket under data/ — and that fails, right?

We have to use data/ plus our cluster name. So if we do data/cluster-name — the cluster name is an environment variable in our pod — and copy it up there, it works. Great. Awesome.

If we want to list the whole bucket, we can see what's in there: there's our random data we just uploaded, some other data under the bluefish cluster, and some logs from other pods that have run.

So if we try to copy bash history from one of those other pods into our temp folder, it's going to fail, because that's not our pod UID.

If we want to upload our bash history to our own pod's UID prefix under the logs path, that works. And then we'll do Secrets Manager really quick.

If we list all secrets — we don't have any secrets yet. Remember we said we can create a random password; anyone can do that. OK, great, so we can do that.

Now we're going to save one as an environment variable, and then we're going to create a secret.

Does anyone see what's not going to work here? There are no tags — I tried to create it without tags, so it's going to fail.

If I go ahead and add tags to this call — so I add --tags with the key eks-cluster-name and the value set to the cluster name — that's going to work.
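Roughly what that call looks like (the secret name and variable names come from the hypothetical setup sketched above):

```sh
# Succeeds because the request tag matches the session tag on the pod's role session.
aws secretsmanager create-secret \
  --name demo-secret \
  --secret-string "$RANDOM_PASSWORD" \
  --tags Key=eks-cluster-name,Value="$EKS_CLUSTER_NAME"
```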

So if I want to go ahead and get that secret value, I can do that, because I created that secret.

It's in my cluster. If I want to untag it, I can't, because we had an explicit deny there. Same for delete-secret — we can do that, because we own the secret. Actually, I want to run that again really quick, because I want to recreate that secret and show it from the other cluster. Oh, it's already scheduled for deletion. OK.

So in Secrets Manager, deleting a secret is sometimes eventually consistent, and this is one of those times. OK, we can create it. Great. So we're going to drop out of here and go back to our blue cluster.

We're in bluefish, and I've already created the pod, so we don't need to revisit all of that — there's a pod running. If we exec into that pod, we can see the same thing we saw before: we have the same role, we have the environment variables, and it's the bluefish cluster instead of redfish.

But we can run a different demo. OK, we can also create a secret, and we set tags, so it's a bluefish secret. We can list secrets and see all of them: we see the bluefish one, the one we just created, and we can also see the one from before, the redfish secret.

If we try to get the secret value for the bluefish cluster's secret, we can do that. If we try to get it for the redfish cluster's secret, that fails. Similarly for logs: we can copy our own, but we can't copy logs that are not for our pod.

And then there's a surprise — what's that? OK, let's see what that is. Ah, it's some fun ASCII art: EKS. Right, it just works. That's great.

And the way this actually happens — I kind of alluded to this earlier — the SDK in the AWS CLI, and now all the AWS SDKs, will take that token off disk, call the local pod identity agent, and get credentials. I've cut my credentials out of the output here, but you can see that's what it's doing to get credentials. So let's hop back over.

So, as I alluded to, here's what's happening under the covers: you have your application container, and the AWS SDK reads a service account token from disk and calls this local Pod Identity Agent — the add-on we installed. The Pod Identity Agent calls the EKS API with that token and says: I need credentials for this token; you know the details of all the associations and what role this needs. And EKS gives back those role credentials.

All right. So just to quickly recap: we talked about how you can secure access into the cluster with the upcoming Cluster Access Management feature. Then we looked at how you can secure access from within the cluster to the outside — you have multiple options there, IRSA or EKS Pod Identity.

Next, we'll look at how you can secure access within the cluster. Continuing our theme of walls and forts and castles: imagine you're living in medieval England and you are the king. How do you protect yourself? You need to protect yourself from enemies outside the kingdom, and you also need to protect yourself from within.

This is a picture of Dover Castle. If you look at it closely, there are multiple walls encircling the castle, so even if one wall gets compromised, there's another wall protecting the king. This is really an example of defense in depth.

Now, how do you get something similar in EKS? First, let me quickly talk about network policies. Kubernetes has this construct called network policies. By default, there is no network isolation between pods running in a cluster. But let's say you have a microservices application and you want to put in some policies about which pod can communicate with what.

Prior to, I would say, August of this year, the way you could enable network policies on an EKS cluster was by installing a third-party plugin — the default pod networking plugin that we make available, the VPC CNI, did not have support for it. We've gotten some feedback from customers about that.

The fact that you have to run multiple plugins to get network policy adds to pod start latencies. Another piece of feedback is about the overhead in management, support, and cost — you have multiple plugins to manage. And the last one is that by increasing the surface area, you're introducing new failure modes.

So with that, I'm happy to announce that earlier this year, in August, we launched native support for network policies in the VPC CNI. Micah will walk you through what we did there.

Yeah, this is really exciting. It's a single plugin: rather than having to install multiple things just to get network policy, you can use the VPC CNI that comes with EKS and get network policy support.
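If I recall the configuration correctly, enabling it is a single setting on the VPC CNI add-on; a hedged sketch (the cluster name is a placeholder, and the exact key is worth checking against the add-on documentation):

```sh
# Turn on network policy enforcement in the VPC CNI add-on.
aws eks update-addon \
  --cluster-name my-cluster \
  --addon-name vpc-cni \
  --resolve-conflicts PRESERVE \
  --configuration-values '{"enableNetworkPolicy": "true"}'
```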

This is also integrated with the other features we have in our CNI, like EC2 security groups for pods. So in addition to Kubernetes network policy, you can also enforce security groups off-box.

And it's really performant: rather than using an iptables-based way to block network access, it uses eBPF. So what is eBPF, if you're not familiar with it?

eBPF is the extended Berkeley Packet Filter. It's an engine inside the Linux kernel that intercepts system calls as they're made from user space, and it's commonly used for things like tracing, security, and load balancing.

So that's exactly the case we're using it for — security. It's a way to intercept calls without having to recompile the kernel or run custom kernel modules.

So under the hood, how does this work? EKS runs a network policy controller for you on our managed control plane.

If you create a network policy — a Kubernetes resource — our network policy controller sees it and in turn creates a network policy endpoint CRD, an EKS-specific CRD, that captures the rules defined in that network policy. Each node's VPC CNI agent reads from this network policy endpoint to know what to enforce.

So the two key components here are really the network policy controller and the existing agent that's already on the node.

What does an example of this look like? You might have a network policy like this — it's a Kubernetes NetworkPolicy, and the name is demo-app-red-ns. The first part of the spec is the pod selector, so this applies to any pods that have the label app: demo. And for any of those pods, this will allow ingress from namespaces where the namespace name is red.

This is a pretty simple selector, but it just says: I only want to allow traffic in from namespace red; I don't allow it in from anywhere else. And again, this can work together with EC2 security groups, as called out earlier.
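A minimal sketch of that policy as a manifest — it selects the red namespace via the standard kubernetes.io/metadata.name label, and the names and labels follow the description above rather than the exact slide:

```sh
kubectl apply -f - <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: demo-app-red-ns
spec:
  # Applies to pods labeled app=demo in this policy's namespace
  podSelector:
    matchLabels:
      app: demo
  policyTypes:
    - Ingress
  ingress:
    - from:
        # Only allow traffic originating in the "red" namespace
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: red
EOF
```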

So say you have Amazon RDS or Amazon ElastiCache in your VPC, and you've got an EKS node group running your pods — you can use network policy together with the EKS CRDs to restrict pod access to VPC resources as well as cluster resources.

All right. Next, let's look at some of the recent security-related announcements. I'm not going to spend a lot of time on these, but if you search for these terms, it will take you to the right What's New post, blog, or documentation.

The first one is Network Load Balancer support for EC2 security groups. The NLBs you provision for EKS clusters can now be secured with security group rules, so you can have rules saying: my application running behind the NLB can only be accessed by certain endpoints or IP addresses.

The second one is PrivateLink support for the EKS management APIs. This gives you a secure, private way to access the EKS management APIs from within your VPC, so instead of egressing through the internet or a public network, you can reach the EKS management APIs directly over a private network from your VPC.

The third one is some enhancements to Amazon Detective: for the VPC flow logs you collect from EKS, Amazon Detective now offers a lot more analytics and visualization.

The fourth one is an improvement we made to the GuardDuty integration for EKS. Earlier, you could use GuardDuty to monitor all the traffic to the API server. Now we're extending that: you can have a GuardDuty agent running on your cluster's worker nodes to look at the worker node operating system, so GuardDuty can flag OS-level issues for you.

And finally, we launched the capability to sign and verify the container images that are deployed to your EKS cluster. This is enabled by a few different AWS services. There's AWS Signer, a managed signing service, which has made some improvements so you can easily sign an image. You can then store the signed images in a container registry like ECR, and any time an image gets deployed to an EKS cluster, you can have an admission controller — Gatekeeper, Kyverno, or your own admission controller; we have an example in the blog post — verify the images before they are deployed to the cluster itself.

Now, if you're interested in learning more about these, please do search for them online, or you can just ask us after the session.

So to quickly recap: we looked at three facets of how you can secure a cluster. First is securing access into the cluster, and you have two options: the existing ConfigMap way of doing things, which is going to continue to exist — if you like it, you should continue using it — and the upcoming feature launching soon, called Cluster Access Management, which we think is a more streamlined and better experience for setting up access.

Second, we talked about how you can secure access from within the cluster to the outside. You again have two options there: IRSA, which has been around for a couple of years now, and EKS Pod Identity, which streamlines the whole experience of granting IAM permissions to your pods.

And finally, we talked about network policy and how it is natively integrated with the VPC CNI to secure access inside the cluster. We're very excited about these features. Hopefully we've given you the bricks to build your own wall.
