Getting the most performance for your .NET apps from AWS SDK for .NET

Hello. All right, looks like we're live. Thank you all for coming to my talk. My name is Norm Johnson. I'm a Principal Engineer over at AWS, and I focus on all of the .NET tooling that you probably use building your .NET applications at AWS. This includes the .NET SDK and all of our Lambda libraries out there. I work a lot on our .NET CLI tools, and I help out with our Visual Studio tooling as well.

Today we're gonna talk about how you can get the performance you need with the SDK. My goal for you all today is to give you a better understanding of what the SDK is doing, so you have that information to decide if you like the default behavior or how you can customize it to meet your needs. Hopefully all of you will be able to go back home, take some of these techniques, and use them in your applications.

Now, I am probably the worst person at PowerPoint. I usually have someone who makes my PowerPoints, but this year we're just gonna do a lot of code, so don't expect much when it comes to the slides. But before I do this, I want to make sure you all know what I mean when I say the SDK for .NET, because we have a lot of tooling for you as .NET developers.

So when I say the SDK, I am talking about that family of packages we have that all share the AWSSDK prefix. This is where we're essentially one-to-one: there's a package for every service, that service has a service client, and the client has an operation for every service API. It's very one-to-one, and it has a very consistent programming model across all of the services.

You can see in that little code snippet there on the side, that is the basic usage of the SDK, here using S3. You create the service client, there's a request object you set up, and you call the appropriate API, passing in the request and getting back a response. That right there is SDK 101. You probably already know this, and this is not what we're gonna do. This is a 400-level talk; we're gonna get deeper today.
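
For readers of the transcript, the on-screen snippet looked roughly like this; a minimal sketch of the create-client, build-request, call-operation pattern, with placeholder bucket and key names:

```csharp
using Amazon.S3;
using Amazon.S3.Model;

// Create the service client (credentials and region resolved from the environment).
var s3Client = new AmazonS3Client();

// Set up the request object for the operation you want to call.
var request = new GetObjectRequest
{
    BucketName = "my-example-bucket", // placeholder
    Key = "data/report.csv"           // placeholder
};

// Call the API, passing in the request and getting back a response.
using var response = await s3Client.GetObjectAsync(request);
Console.WriteLine($"Content length: {response.ContentLength}");
```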

So the SDK is actually quite old. I've been working on it since basically the beginning, and this is its 14th anniversary. We've had two major version bumps; we're on version 3 right now, which was released in 2015. We had a slight bump in 2016 when we all moved over to .NET Core. But for the most part, we have tried to keep the SDK very stable, not breaking you much at all.

In fact, if you look here comparing v1 of the SDK to v3, the code is actually very similar. The major difference you're gonna see is that in v1 we didn't have that concept of credential profiles that you maybe use today; you had to manually load up your credentials. And of course in .NET we added async/await, so now we've added that in there. If you're still using .NET Framework, you still have the synchronous code path. But the takeaway I want you to get is that, under the covers, we have changed the SDK a lot over the last 14 years, while trying really hard to keep that public API consistent, to the point where you could take code from v1, basically change packages, and it would still mostly work with minor changes. That even goes down to the runtime behavior of the SDK. We try really hard not to surprise you with changes in the SDK, and you'll see that theme as we go through some of our demos today.

Now, I'm going to try to get through all of this. We're going to slam through here; it's a lot of content. I don't know if I'll make it all the way through, but we're going to give it our best try. Like I said, this is a demo talk, and we're gonna take a break from slides because you can see how beautiful they are. Those are my best slides, even.

OK. So here, again, this is basically SDK 101. You can see we're creating the S3 service client, setting up the request object, and sending that off. Now, to construct a service client, you actually need two things: credentials, and a region or endpoint to send the request to. Here you can see I'm calling the default constructor, not passing those in. So by default, the SDK will go and search your environment to try to figure out what credentials to use. Are you running on some compute environment like EC2 or Lambda or ECS? If so, use those attached credentials. If you're running locally and you've got environment variables or a default profile, it will use those. So this code here, with no credentials and no region passed in, is the equivalent of running that whole search yourself.

That's really the same thing as what we do when you construct the client: we have these fallback factories in the SDK that do this whole series of searches for your credentials. Now, the actual construction of the service client itself is pretty small; it's a lightweight object, basically just yet another object, right? But searching for those credentials is what can be slow when you're creating your service client.

So what we highly recommend is that you actually reuse your service clients. Assuming you're using one account, you should have one service client for the duration of the application; stick it in your DI container or whatever you're using, because you don't want to resolve those credentials every time. Also, under the covers inside the SDK, we maintain a little bit of state to track the health of the requests, whether they're working OK, and that can affect our retry behaviors and things like that. So that's another reason why you should share those service clients. And they are thread safe, so it's fine: just stick one in as a singleton in your DI and reuse it throughout the place.

So, what I mean by that: if we look over here, this is your standard .NET Core application. You can see I'm registering the S3 service client in DI. This extension method comes from the AWSSDK.Extensions.NETCore.Setup library, which is our library for integrating the SDK service clients with the DI and configuration systems. And then obviously what you would do is inject it and use your usual dependency injection. That's how you want to reuse that same service client, as shown below.
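
A minimal sketch of that registration, assuming the AWSSDK.Extensions.NETCore.Setup package is referenced:

```csharp
using Amazon.S3;

var builder = WebApplication.CreateBuilder(args);

// Pull AWS options (profile, region, etc.) from appsettings/environment.
builder.Services.AddDefaultAWSOptions(builder.Configuration.GetAWSOptions());

// Register IAmazonS3 as a singleton so one client (with its resolved
// credentials and pooled HTTP connections) is shared for the app's lifetime.
builder.Services.AddAWSService<IAmazonS3>();

var app = builder.Build();
```

From then on you constructor-inject `IAmazonS3` wherever you need it, and the container hands every consumer the same cached client.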

What I don't want to see you do, and I'm gonna come check on you — not really, but, you know — is something like this, right? Here's an API, and on a per-request basis we're resolving credentials: hey, go load up my debug profile from the credentials file. What's really bad, and I've seen a few people do this and it's causing them a lot of problems, is they'll do something like this on a per-request basis: they'll create this InstanceProfileAWSCredentials object. What that object does is fetch the credentials from EC2. So if you do that on a per-request basis, that means on every request you're making an extra network call to the EC2 instance metadata service to fetch those credentials. And that might work fine when you're in your development stages and not at high scale. But once you start getting to higher scale, that instance metadata service can't handle the same throughput as calling down to S3, and you're eventually gonna start seeing errors about not being able to fetch credentials.
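
To make the anti-pattern concrete, here's roughly the shape of the code I'm describing; a sketch with a hypothetical endpoint, showing what not to do:

```csharp
using Amazon.Runtime;
using Amazon.S3;

var app = WebApplication.CreateBuilder(args).Build();

// BAD: resolving credentials on every request. Each new
// InstanceProfileAWSCredentials hits the EC2 instance metadata service
// before the S3 call can even start, and that service won't keep up at scale.
app.MapGet("/buckets", async () =>
{
    var credentials = new InstanceProfileAWSCredentials(); // extra network hop per request
    using var s3 = new AmazonS3Client(credentials);        // throwaway client per request
    var response = await s3.ListBucketsAsync();
    return response.Buckets.Select(b => b.BucketName);
});

app.Run();
```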

So like I said, really do not do that. That is my big takeaway. If you have a multi-tenancy situation, which is usually why I see people do this, you need to build something where you cache either the service clients or those credentials on a per-tenant basis. You don't want to resolve those credentials on a per-request basis. Otherwise you'll be calling me up saying, why is this slow? And I'll say, don't do that. Not that we actually call each other; you'll send me a GitHub issue. That's what it'll be.

All right, so let's switch the slides back up. Like I said, I'll just repeat it again: you can use those fallback factories yourself or let the SDK use them. But the end result is, be sure to cache those service clients. They are thread safe; they are fine to use across the board.

So now let's talk about HttpClient. Let's switch back over. As you can imagine, the HttpClient is the most fundamental piece of our SDK: we're a web service SDK, we're always sending requests over the wire. So let's take a look at how we use it.

Close all that down. I'm gonna forget to set up one of these demos, but we'll give it a try. Here we go. OK, so here's a sample doing exactly what I just told you not to do, where per iteration I am creating an S3 client and actually making a ListBuckets call. I'm gonna start Fiddler up so we can see what's going on under the covers, and run this thing. The suspense...

Now, if we look back at Fiddler, and I know this is small, I'll try to see if I can zoom in here: you can see in that little window that we actually only made one HTTP CONNECT call. Only one time did we actually do that handshake, the TLS negotiation, even though under the covers I created new service clients; it could have been five different handshakes. Under the covers, the SDK is keeping a pool of HttpClients and reusing those. We keep them per service, so there's a pool for S3, a pool for DynamoDB. We use those to make sure we're reducing how often you do that handshake, because that handshake can be one of the costs of those first requests.

You can actually change this behavior. If you look on the config object, there is a property on there to disable that caching. To show you what would happen normally without it, let me turn it off and run that again. So now here's what happens if we didn't cache: you can see every time we would be doing the whole negotiation, and that would be really slow for you.
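
That toggle is one property on the config; a minimal sketch, assuming the `CacheHttpClient` property name on the client config:

```csharp
using Amazon.S3;

var config = new AmazonS3Config
{
    // Opt out of the SDK's internal HttpClient pooling; every new service
    // client now gets a fresh HttpClient (and a fresh TLS handshake).
    CacheHttpClient = false
};

var s3Client = new AmazonS3Client(config);
```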

Now, generally I don't have much reason for you to ever turn it off. The only good reason I've ever seen anyone have is if they're running their client application outside of AWS, maybe going through a sketchy network, and they just want to make sure we're always using a clean connection and not something that's gone stale. That's about the only case where I would recommend turning it off. But you actually have more control in the SDK over how we use HttpClient.

So let's get rid of this. If you want to completely change how we construct the HttpClient, maybe you've got advanced use cases, on our config we have an HttpClientFactory property and you can assign a factory to it. Once you do that, any service clients you create after that point will use that factory for creating their HttpClient.

If you look here at my implementation, you basically have to implement the create method. And this is where, hey, maybe I want to change some special settings on the socket handler myself, or maybe I want to add different handlers, or maybe you want to integrate Polly down at this level if you're using Polly for your circuit breaker. In this example I've created an extra handler wrapping around the socket handler just to do some extra debug logging, and I'm actually adding a header on there too. Maybe you're running your application behind some proxy that only lets requests through with certain headers; you can do that with this capability.
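
A sketch of such a factory; I'm assuming the abstract `HttpClientFactory` base class in Amazon.Runtime and its `CreateHttpClient` method, and the debug handler and proxy header here are made-up illustrations:

```csharp
using System.Net.Http;
using Amazon.Runtime;
using Amazon.S3;

public class CustomHttpClientFactory : HttpClientFactory
{
    public override HttpClient CreateHttpClient(IClientConfig clientConfig)
    {
        // Tune the underlying handler however you need.
        var socketsHandler = new SocketsHttpHandler
        {
            PooledConnectionLifetime = TimeSpan.FromMinutes(5)
        };

        // Wrap it in an extra delegating handler for debug logging.
        var client = new HttpClient(new DebugHandler(socketsHandler));

        // Extra headers are safe: they're ignored by signature calculation.
        client.DefaultRequestHeaders.Add("x-proxy-token", "example-value"); // hypothetical header
        return client;
    }
}

public class DebugHandler : DelegatingHandler
{
    public DebugHandler(HttpMessageHandler inner) : base(inner) { }

    protected override async Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        Console.WriteLine($"Sending {request.Method} {request.RequestUri}");
        return await base.SendAsync(request, cancellationToken);
    }
}

// Any client created with this config uses the custom factory.
var s3Client = new AmazonS3Client(new AmazonS3Config
{
    HttpClientFactory = new CustomHttpClientFactory()
});
```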

Now, the thing to be careful about here: at this stage of the request pipeline, each request has already been signed. So if you mutate that request, like changing the request body or the query string parameters, you're gonna get a signature mismatch. Headers you can change, because when we sign a request we only sign a few specific headers, and any additional headers are ignored as far as the signature calculation goes.

OK. So... did I switch over, or am I just pushing the wrong button? I should be switching back to slides. I don't know why... there we go. Very good. So again, remember that HttpClient behavior; it might look surprising to you, but under the covers we are caching those HttpClients. We do this for performance reasons, and there are also some legacy .NET Framework reasons that we could spend forever on; we won't do that. You can turn that off with that switch, but I think what's more interesting is that you can also provide your own custom HttpClientFactory and do a lot more interesting things at that level if you'd like.

OK. So the next one we're talking about is Security Token Service, STS. This is the service you often use when you want to generate temporary credentials. You often don't use it directly; you use it indirectly, by the fact that maybe your credential profile is set up to do an assume role to another account. STS is used for that assume role; it's used for web identity and SAML federation, generating tokens, things like that. And it has an interesting, surprising behavior for developers that I wanted to show you.

OK, so here is an application. You can see I'm not passing in anything, so it's going to use that fallback factory; it's gonna find these environment variables that say I want to use us-west-2 as my region and I want to use my assume-role test profile. This is a profile I set up in my credential file that's gonna do an assume role. If I start up Fiddler and run this... I have four tables in my account. If we go back to Fiddler, and let's see if we can zoom in again: you can see there's the request to DynamoDB to list the tables, and that's going off to us-west-2. But then you can see the call to STS is not going to us-west-2; it is going to us-east-1. This is the default behavior of the SDK. The reason it does this is that the .NET SDK predates STS being a regionalized service. When STS first launched, it was a us-east-1-only service; later on they added regional endpoints, and newer SDKs that came out since then default to those. But again, we try really hard not to change runtime behavior, and we haven't had a major version since 2015, so we still default to us-east-1.

Now, this is potentially gonna cause you some longer latency; maybe you're running all the way over in the Australia region or whatever, so you have to go from your region all the way to us-east-1. And you're also now relying on the status of two different regions for an application you wanted isolated to one. You can change that pretty easily, though. Let me go to my cheat sheet, so I don't do something bad in front of everybody. There is an environment variable you can specify that says, for STS, use regional endpoints. So if I do the same thing we just did and watch Fiddler... right, now you can see we've kept STS in region; we did not leave the region to go to STS. So we don't have a latency cost going over to us-east-1, and we're not worried about things happening in other regions while our application is working in one spot. Whenever we do do a v4, this will be the new default behavior, but we haven't done a v4 in a long time. So this is one that might not seem obvious, and I highly recommend always setting that environment variable. Or, if I switch back over to slides — this button doesn't seem to like me, so if you guys want to switch the slides for me — you can see you can either set the environment variable or, in your credential profile, use that sts_regional_endpoints setting to make sure you're using the regional endpoint.
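
For reference, the two ways to opt in look like this; the profile name and role ARN are placeholders:

```
# As an environment variable:
AWS_STS_REGIONAL_ENDPOINTS=regional

# Or per profile in the shared config file (~/.aws/config):
[profile assume-role-test]
sts_regional_endpoints = regional
role_arn = arn:aws:iam::123456789012:role/ExampleRole   # placeholder
source_profile = default
```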

All right, now let's talk about retry behavior. By default, if you get an error on a request, the SDK will retry it three times, except for DynamoDB, which for legacy reasons we actually retry 10 times, with exponential backoff between each attempt. Now, we don't retry every error. We retry if it's something like an IOException, if you got a throttling error from the service, or if the service returned some sort of 500 error. So I want to show you how the SDK does some of that.

Yeah. So here's a simple little application that is basically just going to hammer a DynamoDB service client. On the config, you can use MaxErrorRetry to say how many retries you want. Since DynamoDB's default is 10 and I want this to fail faster, I changed it to three. This is just gonna go and pound the heck out of the service. Except I'm actually not going to send it to DynamoDB: I have a really terrible server here that I've written, and all it does is return back 500s. It's gotta be probably the worst thing ever written.
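
The config knob from the demo, as a minimal sketch:

```csharp
using Amazon.DynamoDBv2;

var config = new AmazonDynamoDBConfig
{
    // DynamoDB defaults to 10 retries for legacy reasons; fail faster here.
    MaxErrorRetry = 3
};

var dynamoClient = new AmazonDynamoDBClient(config);
```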

All right, let's try it. So now we've got the SDK, I think it's spinning up 10 tasks, and it's sending those 10 tasks over to our dummy server and constantly getting failures. You can see initially it's taking about 10 seconds for each of those to go through all of the retry behavior; part of that is also just waiting on .NET's throttling of all these tasks. But you can see it's failing, and the SDK is maintaining state inside itself and starting to realize nothing's working. Eventually, and you can see it just happened right there, it decides this is just an unhealthy system, I should stop hammering it, and it switches over to a fast-failure mode: it just stops doing the retries and fails for you. And once you resolve the situation, maybe you've got a VPC routing issue or something that's causing all the failures and you fix it, those requests will start working, the SDK will realize, oh, the situation is healthy again, and it will go back to its normal retry behavior. So this is another reason why I highly recommend sharing your service clients: so we can maintain that state and you can have these types of features.

Another thing you can do with retries in the SDK; let me get this example started up. I have this zip codes table — you're going to see a lot of this zip codes DynamoDB table example — where I'm again pounding on the system, and this time it's going to go to DynamoDB for real. I'm gonna start this up now because it takes a minute to get to the part I think is interesting for us. This is essentially loading up a bunch of zip codes, but I'm using a table that is terribly underprovisioned. I can't say the word. So it is, of course, gonna start getting some throttling errors.

Now, it's not gonna get them right away, because DynamoDB is gonna say, oh, you're getting a spike, I can handle the spike. But eventually it realizes you're not just a spike; this is the throughput you think you can run at, and you didn't provision for it. So it's gonna start sending us a lot of throttling errors as we keep pounding on this thing. And I'm stalling while I'm waiting to get to that point, on a conference network, and drinking a bit of water. See, that's why you do these demos: you have some time to drink some water. Oh, there we go, there we go, starting to get the failure mode. And you see, once DynamoDB decides you've passed your spike allowance, we've got tons of errors, right?

OK, so I provisioned that terribly, and you should auto scale your tables or use pay-per-request. But here I think I have the read units at like one or two, so you can see really quickly I'm in the hundreds of errors here. Something you can do to change this behavior, though: as you can see up here, I set my retries to three, and we also have this RetryMode. RetryMode on the config has three different settings. You've got Legacy and Standard; those are basically the same. The difference with Standard — because I did look this up the other day — is just a slight difference in how we compute the exponential backoff. The third one is the interesting one we're gonna talk about: Adaptive.

So I'm gonna spin that up again so we can start seeing it. What Adaptive does is essentially keep track of all these throttling errors you're getting and realize, oh, you're getting throttling errors, let's throttle the SDK; let's slow the SDK down to try to match what you actually can get through. And this did not get going while I did all that talking; that was very sad. So hopefully what we're gonna see here in a minute is, we'll still get some errors, because we have to see some of those errors to start detecting the throttling, but it will slow things down and the rate of errors will be much less. It's another drinking opportunity... going, going, going... there we go. So you can see now we're starting to get errors, but not near as much, right? At this point in the old example we were quickly in the hundreds; here it's much more balanced. And here I set the retries down to three; if I had left it at its default of 10, there's a good chance Adaptive would have adapted within that time. So if you find yourself where you can't change the provisioning and you want to handle that, you can try Adaptive to make the SDK match what it can actually get through to your service.
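
Switching modes is one more property on that same config; a minimal sketch:

```csharp
using Amazon.DynamoDBv2;
using Amazon.Runtime;

var config = new AmazonDynamoDBConfig
{
    // Adaptive feeds a client-side rate limiter with the throttling errors
    // it observes, slowing the SDK to match the throughput it can get through.
    RetryMode = RequestRetryMode.Adaptive,
    MaxErrorRetry = 3
};

var dynamoClient = new AmazonDynamoDBClient(config);
```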

All right. OK, can you switch back to the slides? So again, retries: you can change the number of retries with that MaxErrorRetry property. I see a lot of people either change it up or, a lot of the time, actually set it to zero because they're building their own retry behavior. A lot of customers are fully invested in something like Polly and they want that to be their thing, so they don't want the SDK retrying. Setting MaxErrorRetry to zero is how you tell the SDK: just make the request and I will handle it. The SDK will go into a fast-failure mode when it detects things are failing for you. And again, I'm gonna repeat one more time, like a broken record: this is why you want to share your service clients across your application. And you can try that adaptive retry mode if you find you're getting a lot of throttling errors and you want the SDK to match what it can get through.

All right, so, DynamoDB high-level libraries. In the SDK we've got two different high-level libraries for DynamoDB. We've got the document model, which is where you create a Document, basically a grab bag of data, and store and load that in DynamoDB. And we have the data model, which we also call the object persistence model; that's where you have .NET types and store and load those in the system.

These are very large high-level libraries; I could spend a whole session on them. But what I really want to focus on is how you can improve the startup time, which has been a contentious issue for many people who've used these libraries.

So in this example here, this is using the document model. Basically, you load a Table up and call GetItem up here. Fun random trivia for you: I've been using Fiddler a lot to see under the covers; another thing you can do is, on the service clients, there are some events you can attach to and actually see what's going on in there.

So I'm gonna run this, and you're gonna see that even though we loaded one item, we actually made two calls for that operation: we did a DescribeTable and a GetItem call for that zip code in my DynamoDB table. Now, that DescribeTable is coming right from this LoadTable call. The way this high-level library is written, we essentially need the metadata of your table to be able to make those low-level service calls.

Now, this DescribeTable can cause you issues. First off, during startup I'm making two different operations. DescribeTable is also a control-plane API; it doesn't have the same throughput limits as, you know, the GetItems and PutItems. And you have to remember this high-level library was written before things like Lambda existed. So now you have the case where you've got lots of Lambda execution environments starting up, and they're all doing that DescribeTable; that can cause you a throttling issue.

The other sort of architectural problem is that this LoadTable method is synchronous, right? But the .NET SDK on .NET Core — or modern .NET, whatever you want to call it — is async only. So this code has to do a blocking async call to work, and that can cause thread starvation depending on the scaling of your application.

So, I think it was in August, we sent out a new version of this high-level library that lets you turn this DescribeTable off. Essentially, what you need to do is, instead of us going to the service and getting the metadata, you need to give us the metadata.

So we have this TableBuilder that you use for that. You can see it's got AddHashKey, AddRangeKey; you give us the metadata of the table, and we'll use that instead of fetching it from the service. I am not gonna try and type that all in front of you; I'm gonna go back to my cheat sheet. You can see for my example here, I'm saying it's my zip code table, I've got some indexes, that's my hash key, and I don't have a range key in my situation. I run that, and no more DescribeTable call.
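
Roughly what I typed from the cheat sheet; the table and key names are from my demo, and treat the exact builder calls as illustrative of the new TableBuilder API:

```csharp
using Amazon.DynamoDBv2;
using Amazon.DynamoDBv2.DocumentModel;

var client = new AmazonDynamoDBClient();

// Describe the table's schema in code instead of calling DescribeTable.
var table = new TableBuilder(client, "ZipCodes")
    .AddHashKey("ZipCode", DynamoDBEntryType.String)
    .Build();

var item = await table.GetItemAsync("98101"); // placeholder zip code; no DescribeTable
```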

So that gets rid of it, and that can really improve things, especially in that Lambda case, because like I said, this was designed before Lambda existed, and doing that DescribeTable in a Lambda environment has caused a lot of people struggles. So I highly recommend switching over to that.

Now, the other high-level library we have is our object persistence model. This is where you create that context object; I've got a zip codes table, I load that up and get an item from it. And that has the same problem: when I call that, it is eventually doing a DescribeTable and a GetItem.

So we want to get rid of it there too. What you can do is, on the config of the context, we have a new property to disable that fetching. That gets rid of the DescribeTable call, but we still need the metadata of your system; we just need it.

So, we've always had the ability on our .NET types to add attributes, right? We've always had these, so you could customize the mapping between the metadata we find in DynamoDB and what your actual .NET type is.

Now, if you disable the fetching, those attributes become the source of truth; we use all those attributes to define the table. So you have to make sure you really define everything in there — you put all your hash keys in there — and that's what we're going to use. With that in place, we've got no more DescribeTable; all gone.
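
Putting that together, a sketch of a fully attributed POCO with metadata fetching disabled; I'm assuming the `DisableFetchingTableMetadata` property name from the updated context config, and the schema is my demo's:

```csharp
using Amazon.DynamoDBv2;
using Amazon.DynamoDBv2.DataModel;

[DynamoDBTable("ZipCodes")]
public class ZipCodeEntry
{
    [DynamoDBHashKey]
    public string ZipCode { get; set; }

    [DynamoDBProperty]
    public string City { get; set; }
}

var contextConfig = new DynamoDBContextConfig
{
    // With fetching disabled, the attributes are the source of truth,
    // so every key and index must be declared on the type.
    DisableFetchingTableMetadata = true
};

var context = new DynamoDBContext(new AmazonDynamoDBClient(), contextConfig);
var entry = await context.LoadAsync<ZipCodeEntry>("98101"); // no DescribeTable call
```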

Now, you might find yourself in the situation where maybe you aren't the one that wrote the POCO; it came from another team and you don't have access to change those attributes. What you can do then is combine what we just talked about in the document model with the context object.

So I'm gonna take this table definition out, bring it over here after I create my context object, and then register that table definition on the context. Now I didn't have to change my POCO, but I put the definition here, and that is also a way to get rid of the DescribeTable calls.
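
Combined, it looks roughly like this; treat `RegisterTableDefinition` as my recollection of the new method's name:

```csharp
using Amazon.DynamoDBv2;
using Amazon.DynamoDBv2.DataModel;
using Amazon.DynamoDBv2.DocumentModel;

var client = new AmazonDynamoDBClient();

var context = new DynamoDBContext(client, new DynamoDBContextConfig
{
    DisableFetchingTableMetadata = true
});

// Build the table definition in code and hand it to the context, so the
// POCO itself doesn't need the full set of mapping attributes.
var table = new TableBuilder(client, "ZipCodes")
    .AddHashKey("ZipCode", DynamoDBEntryType.String)
    .Build();

context.RegisterTableDefinition(table);
```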

Everyone's good? I can tell you're all good, right? All right, slides again, please.

So again, for the high-level libraries, I highly recommend getting rid of that DescribeTable call. This was added to the SDK within the last month or two; it's very recent. You can use either that TableBuilder or, if you're using the context, the attributes on your types. OK: S3 streaming.

This is a common request we've gotten over the years: people writing web applications want to upload large streams straight through to S3.

All right, so here I've got a web application that's going to take the stream it gets from the request and upload it straight to S3 using our TransferUtility. I'm going to run this client application to send up the request, and... it goes boom, right? This is what we've heard a lot of people saying: hey, I want to just stream this, but it doesn't work. And that's because S3 requires the content length; we have to know how big the object is.

So we expect the stream to be seekable to be able to push it up there. Again, a couple of months ago we updated the SDK to essentially make it support streaming for you. So I'm gonna switch now to a newer version of the SDK; make sure I save, save, save, rebuild, because I just keep messing that part up. I run that, and now let's send over our request... and you can see it's actually uploading it now.

So how did that work? Because, you know, I'm on the .NET team, not the S3 team; I can't change S3's architecture to no longer need the content length. Our TransferUtility lives on top of S3, and it's what we've always used to say: oh, you're uploading a small object, just do the regular PutObject request; you're uploading a large object, then let's switch over to the multipart API.

So now, in the new version, we've essentially added a third state, which is: hey, the stream is not seekable, I don't know how big this is. What we do then is switch to a multipart API mode: we read that stream and buffer it up to a part size, reading up to five megs into memory. Once we have that, we send that part over to S3. So we're basically doing it in five-meg chunks.

So what I really want you to take away is that this feature is now there and it can be very useful, but keep in mind we are going to use essentially a five-meg buffer to make it work. You have to decide if that trade-off works for your application.
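
A sketch of that streaming path in a minimal web app; the bucket and key names are placeholders:

```csharp
using Amazon.S3;
using Amazon.S3.Transfer;

var app = WebApplication.CreateBuilder(args).Build();
var transferUtility = new TransferUtility(new AmazonS3Client());

app.MapPost("/upload", async (HttpRequest request) =>
{
    // Request.Body is a non-seekable stream; newer SDK versions buffer it
    // in ~5 MB parts in memory and push each part via the multipart API.
    await transferUtility.UploadAsync(
        request.Body,
        "my-example-bucket",     // placeholder bucket
        "uploads/incoming.bin"); // placeholder key
    return Results.Ok();
});

app.Run();
```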

All right, I think we're doing OK on time, because now we're getting to the big one. All right, one more time on slides; I actually put a few more things on slides here. So again, it was version 3.7.202 that came out in August when we added that feature, so just make sure you're using that version. You don't have to make any code changes; it'll just work for you. But keep in mind that you are using a five-meg buffer to make it work, OK?

So: .NET Native AOT. This is gonna be our last major subject today. The .NET team over at Microsoft has been working a lot in .NET 7, and definitely in .NET 8, to add Native AOT support to the platform. If you don't know, Native AOT is the process of essentially compiling your .NET application into a native executable: a Linux, Mac, or Windows executable application.

The advantage of Native AOT is it can give you really fast startup time, because you're not booting up the .NET runtime anymore in this situation. For those who are really concerned about their cold start performance in Lambda, this can be a really big deal.

Now, since you're essentially compiling your code, your dependencies' code, and all of the .NET runtime into a native executable, the whole system relies on trimming to keep the size manageable.

Now, as great as it is to get that cold start performance, there are some challenges with using Native AOT. We've been using .NET — I've been using it since the 1.0 days, so 20-some years, right? — and we've built a lot of patterns on using reflection and things like that.

Reflection in AOT is limited. You can't just do reflection off of an unknown object type in Native AOT; all of the types you're using have to be known at compile time. This means the community, including AWS, has to work on updating our libraries to really, truly support Native AOT. The last problem is sort of a logistics problem, just sort of a challenge: wherever you're running your Native AOT application, like Lambda, you also have to build it on that same platform.

Few of us are probably doing our dev work right on a Linux laptop or anything like that; we're most likely working on a Windows or Mac laptop, but here you have to build on Linux. We do have some tooling out there that helps you with that. We can do what's called a container build; we do it in Visual Studio and in the CLI, where essentially we have a Lambda build image. We have one for .NET 6 and 7, and we'll have one for .NET 8, hopefully soonish, where we'll do the compilation inside the container for you on Linux. And I'm sure a lot of your CI systems are, under the hood, Linux as well.

All right. OK, so here is, again, that zip code table I keep using all over the place, where we're just gonna go and load up some zip codes. Now, to compile this into Native AOT — let me close some of these windows I've accumulated here; oh, that's the dummy server — you essentially pass in the MSBuild property PublishAot, and you can see right there, it's generating native code.

The other two switches you can pretty much ignore; I have those on there to help print out some extra diagnostic information that we're going to look at. But it's basically either set on the command line or in your csproj file to say PublishAot.
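
For reference, the publish invocation is just the standard .NET one (my extra diagnostic flags omitted):

```sh
# Publish as Native AOT; equivalently, set <PublishAot>true</PublishAot>
# in the csproj. The runtime identifier must match the target platform.
dotnet publish -c Release -r linux-x64 /p:PublishAot=true
```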

Now, the thing to look at, though, is this giant wall of yellow, and I'm not expecting you all to read it. This is the SDK compiling, and it's saying: hey, the SDK is doing things that are not safe inside Native AOT. The SDK, as I said, is 14 years old; we've got lots of fun reflection tricks, and we do dynamic assembly loading, as we'll talk about with STS, things like that. Those things don't work in an AOT environment.

Now, to work around that — because we've been talking about AOT since .NET 7 — in our Lambda blueprints we've defaulted the trim mode to partial. What partial means is: only trim the assemblies that are marked as trimmable. If an assembly is not marked, the entire thing gets compiled in.

Now, the con of that is, one, you're obviously including the whole thing, and if we look, this is producing about a 15-meg executable for this application. And this is how Microsoft describes it: if you're getting any trim warnings, that means essentially the behavior of your assembly could change inside an AOT environment.

So when you're compiling for AOT, you need to test not just your code; you also need to really test the dependency code that has trim warnings, to make sure it isn't changing on you in an AOT environment. You need to do this testing, and do it thoroughly, in an AOT environment.

So the good news, and why I wanted to bring this up: just about two weeks ago, as part of our support for .NET 8, we pushed out the new 3.7.300 version of the SDK. So with that version, let me go run the compiler again.

So with the 3.7.300 version of the SDK, we added a new target to the SDK. The SDK already has .NET Framework 3.5 and 4.5 targets, plus .NET Standard 2.0 and .NET Core 3.1, and now we've added a .NET 8 target. And with that, we've also gone through the SDK and made sure we addressed all of the trim warnings.

So now you know that we're not changing the behavior on you. It also means the size here went from 15 megs to about 11 megs. Now, I forgot to show you the first half of this, but I'll show you the second part. This is a tool I use called Sizoscope, done by one of the engineers on the .NET team; it lets me look at some of that diagnostic information from those extra flags and analyze the size of things.

So, by trimming — I meant to show you this beforehand — before this, core here was 800 K and the DynamoDB assembly was 700 K. Now you can see core has shrunk to about 500 K, and DynamoDB is now just 20 K, right? All those high-level libraries that would normally be in there are gone, because I'm not using them. And if we look at the models folder, you can see it's essentially just GetItem and anything used by it. So it really shrinks that down.

I think this is interesting too: because the DynamoDB assembly we're pulling in is trimmed, the compiler can also trim the rest of the framework even more, because it realizes which parts of core we're not using. For example, DynamoDB is a JSON-based service, so it doesn't need an XML parser like some of our other services do. All of that's gone.

So we trimmed about a meg out of our stuff, but the overall size shrunk by about five megs, because it's using less stuff.

Now, I said that we addressed all of the trim warnings in that version. That does not mean we've made everything work in the SDK. There are some pieces that, because of their history of too much reflection, we have basically marked as not supported right now.

For example, the reason I'm using the low-level library here is that the DynamoDB high-level libraries do not work; they do a lot of reflection. So if I take some of that code and pop it in here, you're going to see I get a warning. It's a little small over here, you probably can't see it, but essentially what it's telling you is that you are using code that requires unreferenced code, code not known at compile time.

So if you're getting these warnings, this is your sign: don't call it, and figure out how to refactor your code so you don't use those things. Now, the amount of code we have like this is quite small, but there are a few places, and you'll run into them.

Now, the other major challenge you will find when you switch to AOT: in our core component, we have what we call optional runtime dependencies. STS is the prime example. STS is used by a lot of our credential providers that are inside core, but we can't have core depend on STS, because STS depends on core; we've got a fun circular dependency problem there, right? And you only need it in a few use cases as well.

So we've always had this trick in the SDK where, essentially, if you're using one of the components that needs STS, we dynamically load that assembly and call it through that way — which is exactly what you cannot do in Native AOT.

To get around that, we've added what we call a registry, where you can essentially pre-register with core: these are the dependencies you should use.

So, just to reiterate here: if I go and set the profile to my assume-role test profile, that same profile from the other demo, and I run this inside Visual Studio — so this is not Native AOT — it's gonna give me a warning, or error, that says it can't find STS. That's how the SDK has worked outside of Native AOT. And what you do to fix it is just add yourself a package dependency on STS, so we can dynamically load that assembly, and it all works.

So that's how you've been using this today. But if I go into a Native AOT environment — I've got to build this again, which gives me a chance to take a water break; there we go — and now if I go run this, it blows up. It blows up really fast, because it's Native AOT, but it does blow up.

Now it's blowing up, and it's gonna give us an error message that says you're missing that runtime dependency. With that error message, we're trying to make sure we tell you exactly what you need to do to fix it: essentially, you need to go register which STS client to use.

I'm gonna go create my STS client, and then we have the Amazon runtime dependency global registry instance, and there are these register methods. These are essentially all of the things we have that are optional runtime dependencies.

So if you get an error, it's going to tell you: hey, you need to register this instance with this register method. So I need to go put STS in there and — big font — OK?
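
The fix is essentially one line; this is a sketch from memory of the registry API, so treat the namespace and method name as approximate and trust the actual error message for the exact register call:

```csharp
using Amazon.RuntimeDependencies;
using Amazon.SecurityToken;

// Pre-register an STS client instance so core's credential providers can
// find it without the dynamic assembly loading that Native AOT forbids.
GlobalRuntimeDependencyRegistry.Instance.RegisterSecurityTokenServiceClient(
    new AmazonSecurityTokenServiceClient());
```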

So now that should work. Yeah, I don't know if we need to really watch me do this again, but essentially you can trust me that once I compile this, when it gets to the point where it's using that credential provider to assume roles, it's gonna check that registry first, find that we registered the client there, and — there we go — it avoids the assembly loading that you just can't do in Native AOT.

Now, I have a Lambda function here that is essentially doing the same thing we just talked about: it takes a zip code as input and gives you back the output. I'm gonna save you all the excitement of watching me deploy this in every possible permutation, and I'm gonna switch back over to slides, if I can get help on that.

Here we go. So this table shows the different permutations you can do deploying Lambda with .NET 8 today, right? With .NET 8 you can deploy a self-contained .NET 8 Lambda function, and you see the cold start performance there is about 3300 milliseconds. Keep in mind these are approximations: since we have a Lambda function that's making one service call to DynamoDB, there's always a lot of variation in the cold starts, but that's an approximation. And its deployment bundle is a 35-meg zip file. If you look at our managed runtime, which is probably what most people here are using for Lambda, that brings your cold start time to 1900 milliseconds, with a much smaller deployment bundle.

And we're working on the .NET 8 managed runtime; it'll hopefully be out there as soon as we can. But I'm gonna bet you that the performance number is gonna be in that same rough ballpark. So what gets really interesting is when we switch over to AOT.

So without the SDK being trimmed, using the version that came out earlier this month: you can see on .NET 7 it was at about 1300 milliseconds, and for .NET 8 it went down to 1200 milliseconds. That's all full props to the Microsoft team; that was the work they did to improve the performance of Native AOT, nothing that we did. But the golden version here is if you use Native AOT with the latest version of the SDK: you can see that cold start performance is just 950 milliseconds.

So that is half the time you're gonna get with our managed runtime. If cold start latency is your biggest concern with using Lambda, then Native AOT is something to really look into to bring that down.

Now, I mentioned there are a few things in the SDK that don't work with Native AOT. The top two are the ones you're most likely to care about — not to say anything about the other ones, I just think these two are the most common — which are the DynamoDB high-level libraries and the Extensions.NETCore.Setup package. They both do a lot of reflection on things that are not known at compile time.

These things on the list do not mean we can never make them work. I view our 3.7.300 version as our first stake in the ground: we've gone through the SDK, annotated everything, and fixed what was fairly easy to fix. Now it's kind of up to you all to tell us where to go next. If you need these things, you need to let us know. This is why everything we do is open source: you can let us know on GitHub, and that helps us prioritize when it's time for us to invest more in this area.

You know, there are things we could do, but they're just not easy. We can invest in source generators and things like that, the same thing Microsoft has been doing with their stuff; we can do those too. We just need you to help prioritize that work.

So again, to recap: if you want to use this, be sure to be using 3.7.300, which literally came out the week before .NET 8 did, so it's very new. If you're building with Native AOT, you want to avoid having any trim warnings, so if you're finding any from us, let us know and we'll try to get them fixed. And — I don't have a demo on this one — all of the Lambda libraries, or most all of them, those Amazon.Lambda NuGet packages, we have gone through as well, added the .NET 8 target, and addressed their trim warnings.

We wanna make sure, especially with Lambda, because I think that's the prime use case where you would use this, that you have as smooth a process as possible.

The one thing I know is going to catch some people is those optional runtime dependencies. So just be aware of having to register those with that global dependency registry. You should see an error right away, say, oh, I need to do this, and just add that one line of code. You can add either an instance or a factory to that registry, depending on your use case.

OK, we made it through all of it. We covered a lot of platform features and a lot of service features, and hopefully you learned something you can take back to your teams to help with how you build your applications. I've got a few other links there. I know some people on our team did a talk — it was yesterday, but it will be on YouTube — on how to write .NET Lambda functions; I know they touched a little bit on AOT there as well.

Again, everything we do is out there on GitHub; we're a very open-source-friendly team. So if you have feedback, please give it there. And we have a few channels where we put out our .NET blog posts as well.

Thank you very much. I'm here all week, and if anyone wants to catch up with me, I would love to hear your use cases on .NET, so feel free to reach out and I would gladly catch up and chat with you.

If you liked this session and you want sessions like this, please fill out your surveys. And yes, if you didn't like it, you can fill it out too. That's all good; I can take it.

But thank you all again so much for coming to my talk.
