I want to thank you all for joining this session today. I hope you had a good Thanksgiving weekend. For me, the weekend was pretty good. I don't know if there are any college football fans out there, but I'm a University of Michigan alum and I'm a big Michigan football fan.
Yeah, go Blue. We just beat our arch rival Ohio State for the third time in a row over the weekend, so I hope there are no Ohio State Buckeye fans out there who are going to tank my session score today. Sorry about that.
Just to make sure you're all in the right place: this is CMP 105. We're going to talk about how to use capacity blocks, which is a new EC2 provisioning option for getting GPU capacity to run machine learning workloads. We'll also talk a bit about how capacity blocks work with EKS as well.
As an introduction, my name is Jake Siddle. I'm a product manager with EC2. I'm part of the capacity products team, and I'm the product manager for this new capacity blocks product.
To provide an overview of what we're going to cover today: I'm going to start by setting the stage, going over why customers choose AWS for their machine learning workloads, and I'll talk about the different machine learning service offerings that we have within AWS.
After that, I'm going to focus on EC2 provisioning options for accelerated hardware, specifically GPU instances. From there, I'm going to narrow in on this new capacity blocks product and talk about how you can use it and when you should consider using it, sort of a framework for thinking about capacity blocks.
And then I'll finish off with a live demo of the capacity blocks product. As sort of a teaser, I was just looking this morning at the capacity blocks that we have available right now.
Capacity blocks are currently available for P5 instances, which feature NVIDIA H100 GPUs. We have capacity available starting as early as tomorrow, and prices currently start about 25% below the On-Demand rate, if you want to take a look after this session.
Alright. I don't think I can overstate how much machine learning is powering breakthroughs that are impacting every aspect of our lives. And the application of machine learning to develop new products and transform businesses has really been accelerating over the past year or so.
At AWS, we're super proud of the role we've played in democratizing machine learning, making ML accessible to pretty much anybody who wants to use it. That's why over 100,000 customers are running their ML workloads on AWS today.
Over the past year, generative AI has really taken the world by storm, especially now that the general public has directly experienced the power of the latest generative AI models through consumer apps like ChatGPT.
We're seeing more and more customers building and deploying generative AI applications for a really broad range of use cases, from text generation for productivity applications that can improve worker productivity, to applications that enhance creativity, like music generation and video generation apps. There are so many more in between, and we know we're just scratching the surface of what generative AI can do.
We're super excited to continue working with our customers to identify new applications and figure out where gen AI can add value. So with that context, let's talk a bit about why customers are choosing AWS for their ML workloads and the different service offerings that we have available.
You may have seen this chart before; we use it pretty often, so I'll just briefly summarize it. AWS has the most comprehensive set of AI and ML services for all skill levels. We want to offer many different options for customers to choose from, so they always have a solution that's optimized for their specific use case, regardless of customer size or industry.
We organize our different offerings into these three layers to help customers think about what we have available.
In the bottom layer here, we have our ML frameworks and infrastructure. This layer includes our EC2 instances powered by accelerated hardware, things like GPUs or custom ML silicon, and these instances are interconnected by high-speed network interfaces.
Then we have frameworks that are optimized to run on this accelerated hardware as well. We support popular frameworks like PyTorch and TensorFlow, which come prepackaged in our Deep Learning AMIs and Deep Learning Containers that make it easier for anybody to get started with these instances.
Moving up the stack, we have Amazon SageMaker, a managed service that makes it super easy for any developer to build, train, and deploy AI models on AWS. This is a really good option for customers who want to focus on the data science and have AWS handle the heavy lifting of infrastructure management for them.
At the top layer, we have our AI services. Within this layer, we have task-specific services that are powered by foundation models Amazon has developed for specific use cases. For example, we have Amazon CodeWhisperer, a code generation service that helps developers code more efficiently.
We also have Amazon Polly, a text-to-speech service, and these services are super easy to access through simple API calls on a pay-as-you-go basis. In this layer we also have Amazon Bedrock. I know there's been a lot of buzz about it this year; it's a new service where customers can access the latest foundation models from leading AI companies like Anthropic and Stability AI through simple API calls.
One thing I do want to mention here: I'm not sure if folks have heard about our new PartyRock app that we just launched a couple of weeks ago. I highly recommend you check it out if you haven't already.
It's a new Amazon Bedrock playground where you can create a simple gen AI app in literally a few minutes. It's kind of cool, kind of fun to play around with.
For this session, I'm going to focus on the ML frameworks and infrastructure layer, specifically EC2 compute and GPU-based instances.
To give an overview of the instance options that we offer for machine learning workloads, I'm going to start with our GPU-based instances. We kind of separate these by training and inference.
We have our P4 and P5 instances, which are optimized for deep learning workloads, especially deep learning at scale, for customers who want to train deep learning models on distributed clusters. These instances have really high networking performance, and they're optimized for that.
Our P4 instances feature NVIDIA A100 GPUs, and our P5 instances feature H100 GPUs. The P5 instances are the ones we currently offer through capacity blocks; we'll talk about this a bit more later.
These are the highest-performing instances for deep learning currently available in EC2. Then on the inference side, our GPU-based instances are the G5 and G5g instances.
G5 features NVIDIA A10G Tensor Core GPUs. These are great instances for inference performance, and they're super flexible; they come in many different sizes, so you can right-size for your workload.
They can also support small machine learning training workloads as well. G5g features NVIDIA T4G GPUs paired with Graviton2 CPUs, so this can be a really cost-effective option for running inference workloads if you want to use NVIDIA libraries.
We're going to keep investing in our GPU instances; we want to launch the latest and greatest GPUs that we can. But we know we have a lot of customers who are looking for different options as well,
like customers who want to train workloads on custom ML silicon. That's why we partnered with Habana Labs, an Intel company, to launch our DL1 instances.
DL1 instances feature Habana Gaudi custom ML accelerators, and they can be a really cost-effective option for training large ML models. You can use the SynapseAI SDK, which is integrated with PyTorch and TensorFlow, to build your models on DL1.
And then finally, AWS is continuing to invest in our own ML silicon to push the performance and price performance of machine learning training and inference in the cloud. We have our Trainium-based instances for training and our Inferentia1- and Inferentia2-based instances for inference.
These instances offer the best price performance for training and inference in the cloud. You can get started with them using our Neuron SDK, which is integrated with TensorFlow and PyTorch as well.
I want to really highlight our P5 instances, which as I mentioned are currently supported by capacity blocks. We just launched P5 in July, and these are the highest-performing instances currently available for deep learning training.
These instances come in one size, and each instance has eight NVIDIA H100 Tensor Core GPUs. They're capable of up to four times higher performance and up to 40% lower cost to train compared to our previous-generation GPU instances.
A single instance is capable of 16 petaflops of computing performance and has 640 gigabytes of high-bandwidth memory. These instances can also deliver 3,200 Gbps of networking performance,
which is super important for customers who want to scale out clusters of these instances to run distributed training workloads for larger and larger models. We know model parameter counts are continuing to increase,
so networking performance for efficient scaling is super important. Along those lines, when we designed these instances, we wanted to make sure we weren't just thinking about server-level performance; we thought about instance performance in the context of the data center.
And that's why we continue to invest in our EC2 UltraCluster designs.
P5 instances are deployed in our second-generation Amazon EC2 UltraClusters, which are specifically designed for high-performance, large-scale model training. An UltraCluster can now include up to 20,000 H100 GPUs, all connected via nonblocking, petabit-scale network infrastructure.
You get consistent, low-latency connectivity between all the instances that are part of an UltraCluster using our second-generation Elastic Fabric Adapter.
Instances launched into an UltraCluster can also access our high-performance managed storage offering, Amazon FSx for Lustre, with really low latency and high throughput. That way, as you're running your training workloads, you can make sure your instances remain saturated with data and your storage doesn't become a bottleneck for training efficiency.
Alright. So I know this all sounds great; all of our infrastructure offerings are pretty cool. But a challenge that we're constantly working with our customers to overcome is how they can get the GPUs, specifically P4 and P5 instances, they need, when they need them, and where they need them to run their ML workloads.
Over the past year, there have been some really exciting breakthroughs in the experiences and capabilities of foundation models, and because of that, there's been a big uptick in demand across the entire industry for GPUs to train and deploy these types of models.
This growth in demand has really outpaced the supply of GPUs across the entire industry, making GPUs a scarce resource pretty much anywhere you go. As a result, it's been really hard for customers, anywhere they go, to get access to GPUs just when they need them.
What we're hearing is that customers are often having to wait in long lines, with long lead times to get GPU capacity, often with unclear dates of when they're actually going to receive it. And once customers actually get that GPU capacity, they often hold on to it even when they're not using it, just to make sure they still have it the next time they need it. That perpetuates the scarcity problem, because there's a lot of GPU capacity just sitting around idle across the entire industry.
So I want to talk about the provisioning options we have available for GPU instances in EC2, P4 and P5, though this is really focused on P5 right now, that, when used correctly, can help you mitigate these scarcity challenges or bypass them entirely.
First, we have On-Demand Capacity Reservations, or ODCRs for short, which many customers are using right now for GPU capacity in EC2. What ODCRs allow you to do is reserve a specific instance type in a specific Availability Zone for however long you want.
You only pay for the reservation itself when the instances aren't running, and as long as you hold on to that reservation, you have assurance that you can launch all the instances you reserved.
The caveat to ODCRs is that when you create an On-Demand Capacity Reservation, it's subject to capacity availability at the time you create it, very similar to On-Demand instances. So it could take a while to get an ODCR in the first place. For GPU instances like P4 and P5, they're really recommended if you have long-term, sustained usage, where you want to hold on to the reservation for a long period of time and you're willing to plan in advance to get that capacity in the first place.
Another thing to mention about ODCRs is that you can offset the cost of the reservation with Savings Plans, so you get a discount on the On-Demand rate, but Savings Plans do require a one- or three-year term commitment.
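If you prefer to script this rather than use the console, here's a minimal sketch of creating an ODCR with boto3; the instance type, Availability Zone, and count are illustrative placeholders, and the call is still subject to capacity availability when it runs.

```python
# A minimal sketch of creating an On-Demand Capacity Reservation with boto3.
# Instance type, AZ, and count are illustrative; the request can fail if
# capacity isn't available at creation time, just like the talk describes.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-2")

reservation = ec2.create_capacity_reservation(
    InstanceType="p5.48xlarge",
    InstancePlatform="Linux/UNIX",
    AvailabilityZone="us-east-2a",
    InstanceCount=2,
    InstanceMatchCriteria="targeted",  # instances must explicitly target this reservation
)
print(reservation["CapacityReservation"]["CapacityReservationId"])
```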
The other provisioning option I want to mention here is Spot Instances, which have been really useful for a lot of customers trying out P5 instances lately. Spot Instances offer steep discounts over On-Demand rates because EC2 can interrupt the workload at any time to take that capacity back when we need it. But this is a really good option if you have workloads that can tolerate those interruptions.
Where we see a lot of customers using Spot Instances for P5 is when they want to try out the P5 hardware in a really low-stakes, cost-efficient way. For example, if you have custom AMIs where you want to check compatibility with these instances, or libraries where you want to check configuration with this hardware, Spot can be a good option to try. But again, it's subject to capacity availability at the time you provision the instances, just like ODCRs.
And now we have the subject of this talk: EC2 Capacity Blocks for ML, a brand-new provisioning option that we just launched earlier this month, specifically for GPU instances. Right now, capacity blocks support P5 instances in the US East (Ohio) Region.
What you can do with capacity blocks is reserve GPU capacity starting on a future date, for the amount of time that you're going to need.
This option is more flexible and elastic than ODCRs, because you can reserve capacity with shorter lead times, and you don't have to hold on to it in between the bursts of capacity you need for specific workloads.
You can reserve a capacity block with as little as one day's notice, sometimes even the same day, or up to eight weeks into the future.
The more time-flexible you are when you're reserving a capacity block, the better chance you have of finding an available time slot to book. Capacity blocks pricing is dynamic, based on supply and demand; sometimes it can be higher than On-Demand rates, other times lower.
As I mentioned at the start, we currently have capacity blocks for P5 instances starting around 25% below the On-Demand rate. I'm going to focus on capacity blocks for the remainder of the session, talk about how you can use them, and go from there.
So how exactly do capacity blocks work? It's sort of like booking a hotel room, actually. You start by defining your search window, the date range you're looking for capacity within, sort of like searching across a range of dates to compare different hotel options, prices, and rooms.
Then you also provide the number of instances you want to reserve and the duration of time you want to reserve them for, sort of like having a specific hotel room size in mind: a couple of queen beds, a king bed, or a suite.
Once you provide these specifications, EC2 will show you time slots when we have capacity available right now to book, just like an online hotel booking tool such as Expedia or Hotels.com.
These time slots are visible right in the AWS Management Console in real time, and you can book available time slots whenever you want. You get an immediate confirmation that the time slot you've booked is confirmed and that capacity is going to be available for you.
Capacity blocks are super flexible because we offer a range of duration and size options, so they can fit a bunch of different use cases. For example, you can reserve as few as a single P5 instance for up to 14 days, which is really useful if you just want one instance for a set of experiments or tests. A capacity block can also scale up to 64 instances, which is 512 H100 GPUs, a ton of computing power, and that makes them useful for fairly large-scale training runs. You can reserve 64 instances for up to 14 days, and you can reserve back-to-back capacity blocks if you want; there's no restriction on how many capacity blocks you reserve consecutively.
In addition to large-scale training, we have a bunch of sizes in between as well. If you want to fine-tune an ML model or train a small task-specific model, you could reserve a capacity block of four or eight instances for a few days.
And then lastly, capacity blocks can also be useful if, say, you have a product launch coming up and you expect a burst of inference demand. If you want to make sure you have enough GPUs to serve those inference requests without your end customers experiencing an outage, capacity blocks can be useful for scheduling additional GPU capacity around the time of your product launch.
We've been hearing from a bunch of customers who are really excited about this new capacity blocks product. What we've been hearing is that capacity blocks allow customers to operate with these precious GPU resources in a more elastic way than has really been possible across the industry before.
As an example, Leonardo AI is a generative AI startup out of Australia. They use a variety of gen AI models to help their end customers produce production-quality visual assets for things like video games. They've been telling us they're really excited about capacity blocks because they no longer have to hold on to these instances all of the time, which results in a bunch of waste, and they have more flexibility to use the right instances for the right jobs. That value is going to continue to increase as we introduce more instance types with this model over time.
I want to talk briefly about how you can avoid waste with capacity blocks. Before capacity blocks, what we heard from customers was that they would often use ODCRs to reserve up to what their peak needs were expected to be over some time period, even if they had bursty, intermittent peaks. They would combine these ODCRs with Savings Plans to get a discount on the reservation, but that would still result in a lot of reserved capacity going unused in between those peaks.
What capacity blocks allow customers to do now is reduce the amount of capacity they reserve with an ODCR down to their baseline needs and supplement that with capacity blocks just when they need a burst of capacity. You can still cover your baseline ODCRs with Savings Plans to get a better rate there, and then you don't have all the wasted capacity in between your peaks.
A good example of how a customer might use this: say there's a customer with a research team that needs a couple of P5 instances over the course of a year for a continuous queue of testing and experimentation. They can use an ODCR for a couple of instances with a Savings Plan to operate really efficiently for those workloads.
Then let's say that research team wants to do a couple of model training runs over the course of the year, maybe fine-tune a model. Capacity blocks are available for them to burst beyond their baseline and use additional capacity just for those specific workloads.
I want to quickly look at how the cost of capacity blocks can compare to ODCRs for a few different usage patterns.
As a baseline, we'll take a look at how much it would cost to use a single P5 instance over the course of a year. For this exercise, I'm going to assume that capacity block prices, although they're dynamic, average around the On-Demand rate for P5 instances. Based on today's capacity block prices, that's somewhat conservative.
We'll just use that as an assumption here. So as a baseline, let's look at what it would cost to hold on to a single P5 instance for an entire year if you combine a capacity reservation with a one-year, no-upfront Compute Savings Plan.
That would cost about $678,000. If you plan to use that instance for the entire year pretty efficiently, with close to 100% utilization, it's going to be more cost-effective to use an ODCR than capacity blocks, because capacity blocks are not compatible with Savings Plans.
However, if your usage becomes more intermittent, with less than 100% utilization over the course of the year, say on average you're going to use that instance three weeks a month, or two weeks a month, then capacity blocks start to look like a more attractive option.
You can see something like 12% or 42% savings using capacity blocks part of the time versus using an ODCR with a Savings Plan. And I just want to caveat that these calculations assume capacity blocks average around the On-Demand rate of P5 instances; that isn't necessarily always the case, since prices fluctuate above and below On-Demand rates based on demand.
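To make the arithmetic behind those percentages concrete, here's a minimal back-of-the-envelope sketch. The hourly rate and Savings Plan discount are hypothetical placeholders chosen so the totals land near the figures above; they are not published AWS prices, and capacity block pricing is dynamic.

```python
# Back-of-the-envelope comparison: always-on ODCR + Savings Plan versus
# capacity blocks booked only when needed. All rates are assumptions.
HOURS_PER_YEAR = 8760
od_rate = 98.32        # assumed On-Demand $/hour for one p5.48xlarge
sp_discount = 0.21     # assumed 1-yr no-upfront Compute Savings Plan discount

# Holding the reservation all year, discounted by the Savings Plan (~ $678K).
odcr_year = od_rate * (1 - sp_discount) * HOURS_PER_YEAR

# Capacity blocks: assume prices average the On-Demand rate, and you only
# reserve the days you actually use (30.44 = average days per month).
for days_per_month in (30.44, 21, 14):   # full time, ~3 weeks/mo, ~2 weeks/mo
    utilization = days_per_month / 30.44
    cb_year = od_rate * HOURS_PER_YEAR * utilization
    delta = 1 - cb_year / odcr_year
    print(f"{utilization:5.0%} utilization: capacity blocks vs ODCR+SP: {delta:+.0%}")
```

Under these assumptions, capacity blocks come out ahead whenever utilization drops below the Savings Plan's discounted fraction of the year (here, 79%), which is why the two intermittent cases show roughly the 12% and 42% savings mentioned above.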
Alright, now I'm going to switch over and do a quick demo to show how to use capacity blocks from the console.
Cool, alright. So I'm going to walk through how you actually search for and purchase a capacity block from the AWS Management Console.
The first thing you want to do is make sure you're in the Ohio Region, which is currently the Region where we support capacity blocks, and then navigate to the EC2 console page.
Once you're on the EC2 page, go to the left-hand navigation and click on Capacity Reservations. Capacity blocks are considered a type of capacity reservation, alongside On-Demand Capacity Reservations, so we have two types of capacity reservations now, and they're both visible from this console page.
To start searching for available capacity blocks, click "Purchase capacity blocks for ML" here at the top; there's also a button down here. When you enter this workflow, you can exit at any time.
There are a couple of steps to actually reserving a capacity block. First you search around and see when we have capacity available; then, once you find something you actually want to book, that's when you purchase it. You can search around and submit multiple requests.
There's no commitment to actually buy something when you enter this workflow.
Currently, as I mentioned, we just support P5 instances. The default platform option in the console is Linux/UNIX; if you're interested in using something like Red Hat Enterprise Linux, that's available through our APIs.
Then you select the total number of instances you want to reserve. We have offerings ranging from 1 to 64 instances, with predefined options in powers of two across that range.
Then you select the duration, anywhere from 1 to 14 days in one-day increments. I'm just going to keep the default right now.
The last thing you select here is the date range you want to look for capacity within. It defaults to today's date as the start, with 10 weeks out into the future as the latest end date.
I'm just going to keep the defaults here for the initial search. You can click this button, and you see right away what we have available: the start date is November 28th, and the end date is November 29th.
We return a single block offering per search request, and it's always guaranteed to be the lowest-priced capacity block that we have in the date range you specified.
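The same search is available programmatically. Here's a minimal sketch using boto3's DescribeCapacityBlockOfferings; the instance count, duration, and date range below are illustrative.

```python
# A minimal sketch of searching for capacity block offerings with boto3;
# sizes and dates here are illustrative assumptions.
from datetime import datetime, timedelta, timezone
import boto3

ec2 = boto3.client("ec2", region_name="us-east-2")  # capacity blocks: Ohio Region

resp = ec2.describe_capacity_block_offerings(
    InstanceType="p5.48xlarge",
    InstanceCount=4,
    StartDateRange=datetime.now(timezone.utc) + timedelta(days=1),
    EndDateRange=datetime.now(timezone.utc) + timedelta(days=21),
    CapacityDurationHours=7 * 24,  # durations come in one-day increments
)
for offering in resp["CapacityBlockOfferings"]:
    print(offering["CapacityBlockOfferingId"],
          offering["StartDate"], offering["UpfrontFee"])
```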
I'm going to take a look at a different block option here. Let's see if we have a four-instance by seven-day block available, and I'm going to shrink the date range down, assuming I need to complete this workload by December 15th. So let's take a look again.
Alright. Here we have another block that's available on November 28th, and you can see the total upfront price of this capacity block.
Capacity blocks are charged upfront at the time you reserve the capacity; you just pay a lump sum. I believe this one equates to roughly 75% of the On-Demand price of four P5 instances for seven days.
I'm going to proceed to the next step here. Let's assume I'm happy with this capacity block and I want to purchase it.
You can add tags; these are tags that would show up on the capacity reservation resource you'd create here. I'm going to skip that, and then we have this summary page before you actually purchase the reservation.
So I'm going to go ahead and purchase and confirm. Alright, and it's as simple as that; now the capacity is reserved.
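Programmatically, the purchase step would look roughly like this, continuing from the search sketch above; which offering you pick is up to you, and the index here is just a placeholder.

```python
# A minimal sketch of purchasing an offering returned by the earlier search.
offering = resp["CapacityBlockOfferings"][0]
purchase = ec2.purchase_capacity_block(
    CapacityBlockOfferingId=offering["CapacityBlockOfferingId"],
    InstancePlatform="Linux/UNIX",
)
# The reservation ID is what launch templates will target later on.
print(purchase["CapacityReservation"]["CapacityReservationId"])
```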
I now have this reservation that starts on November 28th. You can see the details of the capacity reservation from this reservations overview page, in this table.
I have another reservation that I booked previously that's currently active, so you can see the difference between a pending and an active reservation.
After you purchase a reservation, it goes into payment pending until the payment has been successfully processed. If you have net-30 terms or something like that, your normal invoicing terms apply,
so you'd just be charged after 30 days. Once the payment has been processed successfully, the status changes to scheduled.
You can see some additional information about your reservation here in this table: the start date and end date for the reservation, the total number of instances you reserved, and the available capacity, which will change.
Once a reservation is active, as you can see in this active reservation here, the available capacity indicates how many instances you can launch into the reservation right now.
Before a reservation actually starts, you can do some upfront work to get prepared to launch your instances into it, so you don't have to eat into your reservation time once the reservation is active.
You can create a launch template targeting a capacity block while it's in the payment pending or scheduled state, so I'm going to walk through how to do that here. It looks very similar if you launch an instance into a capacity block with the EC2 launch wizard as well.
I'm going to go ahead and create this launch template and just make up a name here. Alright, I'll select the P5 instance type and choose my key pair.
An important thing when you're creating a launch template or launching an instance into a capacity block: you have to go down into Advanced details and select the capacity blocks purchasing option under Purchasing options.
This is a similar workflow to what you'd use for Spot Instances. The reason we have this is that we want to make sure customers don't inadvertently launch instances into a capacity block reservation without knowing that, at the end of the reservation, any instances that are still running are going to be terminated.
We want to make sure customers are explicit when they target these reservations, knowing about that termination behavior.
Once you select capacity blocks, you can select "Target by ID" from the capacity reservation dropdown here; you need to target capacity blocks by the reservation ID.
That way we know which instances need to be evicted at the end of the reservation when it expires.
Then we can select the capacity block that we want to target. I believe this is the one I just created, this reservation here.
So I'm going to go ahead and target that, and then you can go ahead and create your launch template.
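The equivalent launch template via boto3 looks roughly like this; the template name, key pair, and reservation ID are placeholders. Both settings matter: the capacity-block market type makes the termination behavior explicit, and the target pins the template to one specific reservation.

```python
# A minimal sketch of a launch template that targets a capacity block.
ec2.create_launch_template(
    LaunchTemplateName="p5-capacity-block",  # hypothetical name
    LaunchTemplateData={
        "InstanceType": "p5.48xlarge",
        "KeyName": "my-key-pair",            # placeholder key pair
        # Opt in to the capacity-block purchasing option explicitly.
        "InstanceMarketOptions": {"MarketType": "capacity-block"},
        # Capacity blocks must be targeted by reservation ID.
        "CapacityReservationSpecification": {
            "CapacityReservationTarget": {
                "CapacityReservationId": "cr-0123456789abcdef0"  # placeholder ID
            }
        },
    },
)
```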
So now we have this launch template. As soon as the capacity block is active, you can launch instances into it, and they're just normal EC2 instances. You can stop and start them, or terminate them and relaunch them, over the course of your reservation; they're normal EC2 instances for that entire duration.
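Once the block is active, launching with that template is a single call; a quick sketch, with the counts as placeholders:

```python
# A minimal sketch of launching into an active capacity block using the
# launch template defined above.
ec2.run_instances(
    LaunchTemplate={"LaunchTemplateName": "p5-capacity-block"},
    MinCount=1,
    MaxCount=4,  # up to the reservation's available capacity
)
```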
Alright, I'm going to switch back here.
Alright. So launch template compatibility with capacity blocks is really important if you want to use capacity blocks with an EKS cluster.
The way you'd do this: once you have that launch template created targeting your capacity block, you can go ahead and create an EKS self-managed node group configured to use the launch template you just created.
What that does in the background is AWS creates an Amazon EC2 Auto Scaling group using the launch template you specified.
Once that Auto Scaling group is created, you can create scheduled scaling policies to automatically launch instances into the reservation at the reservation start time, and then scale down the Auto Scaling group when the reservation ends.
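Those scheduled actions might look like the following sketch; the group name, times, and sizes are placeholders tied to the example reservation above.

```python
# A minimal sketch of scheduled scaling actions bracketing a capacity block;
# the ASG name, timestamps, and sizes are placeholders.
from datetime import datetime
import boto3

asg = boto3.client("autoscaling", region_name="us-east-2")

# Scale out when the capacity block begins.
asg.put_scheduled_update_group_action(
    AutoScalingGroupName="eks-p5-capacity-block-ng",   # hypothetical node group ASG
    ScheduledActionName="scale-out-at-block-start",
    StartTime=datetime(2023, 11, 28, 11, 30),          # reservation start (UTC)
    MinSize=0, MaxSize=4, DesiredCapacity=4,
)

# Scale in shortly before the capacity block expires, so instances aren't
# hard-terminated by the reservation ending.
asg.put_scheduled_update_group_action(
    AutoScalingGroupName="eks-p5-capacity-block-ng",
    ScheduledActionName="scale-in-before-block-end",
    StartTime=datetime(2023, 12, 5, 11, 0),            # just before reservation end
    MinSize=0, MaxSize=0, DesiredCapacity=0,
)
```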
When you create these scaling policies for the ASG that's paired with your EKS cluster, then when your reservation begins, EC2 will automatically start launching instances into your capacity block, EKS will run pods on those instances, and you can queue up jobs ahead of time that will then automatically run in those pods.
We do recommend that you use the AWS Node Termination Handler to gracefully drain your pods at the end of your reservation.
The Node Termination Handler is aware of when your Auto Scaling group is going to start scaling in your instances at the end of your reservation,
so you can avoid disruptions to your workload if you still have a job running on your pods.
So with that, here are some useful links to get started with capacity blocks. I encourage you to check out capacity blocks in the AWS Management Console; it's super easy to get started, and there's a link here.
We also have our capacity blocks user guide, as well as the EKS user guide for capacity blocks.
The EKS user guide gives a really detailed, step-by-step process for how to set up a self-managed node group in an EKS cluster.
And then lastly, on the far right, we have a blog post that gives a pretty nice, simple walkthrough of how to use capacity blocks.
So with that, I want to thank you all for your time. If you could please take a moment to complete the session survey, that would be super helpful.
If you have questions, I'm going to hang out over here on the side, and I'd love to chat with all of you. Thanks a lot for your time.