Best practices for serverless developers

Good morning, everybody. Welcome to Poor Use Cases for Generative AI, hopefully you're in the right room. No, don't worry, don't worry, we're in this beautiful room with a lovely audience, and being all together is such a treat.

And to the people in the simulcast room somewhere else in Vegas, hello to you and thank you for joining us. We're with you in spirit. And if you're watching the recording afterwards, a high five to everybody in the future. Thanks so much for watching.

My name is Julian Wood, I'm a Developer Advocate in the Serverless team. I love teaching and helping people build serverless applications, and I also act as your voice internally to make sure we're building the best products and features.

And I'm joined by the one and only, oh that's great, Chris Munns. I used to lead Developer Advocacy for Serverless here at AWS. These days I work as part of our Startup team as the Tech Lead for North America.

Cool. Chris is gonna be talking lots more later, but for now: this is a big topic, and talking about best practices for serverless, you know, we unfortunately don't have all four days of re:Invent. So there is a bit of a small warning, we are gonna be covering a lot and we're gonna be going quickly, to give you as many best practices as we can and as many jumping-off points, with links to more content and information to dive even deeper.

So the slides and the recording will be available later, which means you don't necessarily have to take pictures if you don't want to. This resources page, which I'm gonna share again at the end, already has the slides and lots of other links to best practices, so you can have all the things you need to build your serverless applications.

I've actually done two previous talks on this topic at re:Invent in previous years. If you haven't seen them, the links are also in the resources page, lots of best practices over there. So we just want to cover some new stuff today and some more things to think about.

So Chris and I just need to take a breath, you probably need to take a breath. It's gonna be a lot today. Everybody ready? Let's go.

OK, so a question we may have been initially thinking about is: well, all this serverless stuff, you know, is it a fad or is it actually the future? Well, we commonly think of the start of serverless as Lambda, like nine years ago at re:Invent.

But S3 and SQS, some of our foundational services, are very much serverless, and they launched in 2006, way before Lambda, and even before EC2 in 2008. The initial fundamental building blocks of AWS were built before we even introduced the idea of being able to rent servers.

And in fact, you could actually say that the cloud was born serverless in the same way. And it's only gonna get more and more serverless as time progresses and the new announcements, even at re:Invent, come along.

And we're gonna keep providing easier and easier ways to run and operate your applications. So when Lambda was launched in 2014 and the industry sort of created this weird term "serverless", it was designed to help people with this mental model of running code without managing servers or infrastructure.

But over the past sort of nearly decade now that we've been doing this, that has evolved a little bit more. And what we try and think of serverless as now is more in terms of building applications, beyond just running code, to many more things.

And today, many people I speak to have a mental model of serverless as being closer to delivering value for customers without having to manage complex infrastructure capabilities.

And what that actually translates to on a day-to-day basis is that you're delegating the outcomes of building on the cloud to people who are experts on those outcomes.

And if you think about what development in the cloud looks like today, well, you need to understand how to develop for distributed services, how to manage failures at large scale, and how to manage availability and, of course, performance.

And you know, you've got other complexities of managing maybe large fleets of ephemeral compute and storage and networking that come in and out in a virtual capacity. And you've got network connectivity between various resources. These all need to be managed with permission constructs and everything that also comes with that.

And all of this requires, of course, a certain level of expertise. And over the nearly decade or so we've been doing the serverless thing, that expertise has become the norm. But learning all this cloud expertise isn't the actual value you get from doing the cloud work.

The actual value is delivering value to your customers and being able to build cool things for them. And what we see more and more is builders leveraging AWS's expertise in delivering these best practices. That's what we call the term "well-architected outcomes", and these are things like security and scale and performance and availability.

So the builders can focus their efforts on the differentiated work that they need to do for their customers. And when building serverless applications, we actually evolve our building blocks from infrastructure primitives, things like load balancers and instance types, networking and storage.

Instead, we use application constructs like databases, functions, queues, workflows and many more things. And this distinction is where I think some people miss the full value proposition of AWS when we talk about less infrastructure to manage. AWS really has the broadest selection of services to offer those application constructs: EventBridge, Step Functions, Lambda, DynamoDB, and all our managed services including Redshift, ElastiCache, Managed Kafka and others. They're all offering, or moving to, a more serverless model where we bake in the well-architected goodness for you.

And so I'd rather you consider serverless as a strategic mindset and approach to how you can build applications. And certainly the events over the past years and the economic environment we're in have universally sharpened the focus on business value.

What Werner spoke about in his keynote this morning was cost and value and efficiency and speed in enabling real customer value. So today, serverless can be thought of more as that operational model of being able to run or build applications without having to focus on the undifferentiated muck, as we like to call it, of managing low-level infrastructure.

And this allows you to build within the cloud, taking advantage of all the features, security, agility and scale, not just building on top of the cloud with a whole bunch of abstractions that may make you do a lot more work.

And the benefits are clear, getting apps faster from prototype into production, fast feedback loop, which helps you iterate quickly for your business. And we do need to measure things. I really actually like the DORA metrics from the team who've, you know, really had a huge influence on making DevOps successful.

And there are four metrics for working out how well you can release application software: deployment frequency, how often you can release to production; lead time for changes, the time from commit to running in production; change failure rate, the percentage of deployments causing a production failure; and time to restore service, how long it takes to recover from a production failure if you do have one.

And the first two metrics correlate to speed, how quickly you get features into the hands of your customers, while the other two correlate to quality, how good those changes are. Because the reality is there's no point being really, really quick and rushing things into production if you then have to go back and redo the work to restore functionality.

But Dave Farley, also of big DevOps fame, says, "If you want high speed, you must build high quality systems. And if you want high quality systems, you must build them quickly as a series of small changes." And excitingly, this means there isn't a tradeoff between speed and quality; they actually work together when you do it right.

And serverless is really a great way to achieve this and improve your DORA metrics, iterating with small changes quickly. AWS brings you agility and cost benefits that you can expect when you build on top of AWS. And the thought process I'll leave you with here is: innovation comes from speed, and speed means doing less. So to do less, go serverless.

The next topic to cover is being service-full: using configuration rather than code and using managed services and features where possible. We often talk about a serverless application as, you know, maybe a Lambda function, which we know can be written in a number of languages, or of course you can bring your own, with an event source which triggers your Lambda function based on a change or a request; it then performs some action on that request and sends it to another service. This is a very common Lambda-based application.

But what if the event source directly talks to a destination service? You don't have to maintain your own code. This is a direct service integration, what is called being service-full.

And a great quote from one of the fathers of Lambda, Tim Wagner, who says "You must use Lambda when you need to transform data, not just to transport data. If you're just copying data around, well of course there are gonna be other ways to do that."

Another thing to think about is how much logic you're squeezing into your code. Are you adding, you know, more and more functionality into your Lambda function, doing everything possible in code, if-thens, decision trees, all those kinds of things? It becomes what we call a "Lambda-lith", getting a little bit large and unwieldy.

Or, to look at it another way, how little of the code in your Lambda function is actually doing work when it runs? A whole lot of code in your function that isn't doing much is still adding complexity: you've got to have tests against it, you've got to secure it, and you're not actually using it.

And this often comes from good intentions when you're moving to the cloud. You've got an application that sits on premises, maybe in a container or a VM, and a lot of the components and functions of the app are in a single place. That was a good thing.

And so you move it to the cloud, and of course wisely you think, "I'm gonna choose Lambda for my compute, stick an API in front of it and maybe S3 for some of the storage", but all those components and a lot of that complexity just move into your Lambda function.

And when you are ultimately ready, you should be migrating all those components into different discrete services, as it shows here, using the best service for the job: move your front end to S3, get API Gateway to handle your caching, your routing and maybe your throttling, use your various messaging services to asynchronously offload transactions to a workflow, use the native service error handling and retries, and then also split your Lambda functions into more discrete, targeted components.

This all helps you scale your application, provides high resilience, improved security and hopefully even better costs. And as part of this, it does help to make your functions modular and single purpose. If you can, instead of having, you know, a huge single Lambda function that does a whole bunch of things, rather have multiple functions that each do a single thing.

For example, if you have a single image processing function that changes the format, creates a thumbnail and adds it to a database, which is what's going on here, think about having three different Lambda functions that each do one step. This also improves performance, as you don't have to load extra code that you don't need.

And you can improve security, as each function can be scoped down to only what it needs to do. But back to the Lambda-lith that people use: it might seem reasonable to have an app where API Gateway catches all the requests and routes them downstream to a single Lambda function.

And then the Lambda function itself contains logic to branch internally and run the appropriate code, based on the inbound event method, the URL or the query parameters. And this works, and yes, it does scale, but operationally it means that the security, the permissions, the resources, the memory allocation and the performance are applied to the whole function.
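
To make that concrete, here's a minimal Python sketch of that single-function routing pattern; the routes and helper functions are made up purely for illustration.

```python
# A minimal sketch of the "Lambda-lith" pattern: one function handling every
# API Gateway route by branching on the inbound event. Routes and helpers are
# hypothetical examples.
import json

def create_order(body):
    # stand-in for real business logic
    return {"created": body}

def get_order(order_id):
    # stand-in for real business logic
    return {"id": order_id}

def handler(event, context):
    method = event.get("httpMethod")
    path = event.get("path", "")

    # Every route shares this one function's IAM role, memory setting and scaling.
    if method == "POST" and path == "/orders":
        result = create_order(json.loads(event.get("body") or "{}"))
    elif method == "GET" and path.startswith("/orders/"):
        result = get_order(path.rsplit("/", 1)[-1])
    else:
        return {"statusCode": 404, "body": json.dumps({"message": "not found"})}

    return {"statusCode": 200, "body": json.dumps(result)}
```

Splitting this into one handler per route, or per sensible group of routes, lets each function carry only the permissions, memory and scaling behaviour it actually needs, which is the tradeoff discussed next.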

And when you think about splitting this up into more granular functions, now this is an extreme example where every single API route is a separate Lambda function, which of course does have the benefits of very granular IAM permissions and being able to manage your scaling per individual function.

But operationally, this is gonna be a lot to manage. And so, particularly for web apps like this and other scenarios, it does make sense to be pragmatic about how you group your web routes and your functions. You know, too many functions can be an operational burden, and too few can have overly broad security permissions and resource issues.

So grouping your functions, your Lambda functions based on maybe bounded contexts or permission groups or common dependencies or maybe your initialization time can give you the best of both worlds - effective permissions and resource management and operational simplicity.

So, more on using services when building distributed applications: another aspect to think about is how you can effectively use orchestration and choreography as communication methods for your workflows, and, rather than writing your own code, manage this as configuration.

Now in orchestration, you have a central coordinator which is gonna manage the service to service communication and coordination and ordering in which the services are used.

Now choreography is slightly different and it communicates without tight control. Events flow between the services without any central coordination. And many applications use both choreography and orchestration for different use cases.

Step Functions is an example: it's an orchestrator doing that central coordination and ordering to manage a workflow. And EventBridge is a great choreographer when you don't need strict ordering or workflows and just need events to flow seamlessly without centralized control.

And at ReInvent this year, we've been demoing a great app which I've loved doing, which shows how these two can work together.

serverless.video is a live video streaming application built with serverless technologies. Make sure to take a look. We are bringing you live broadcasts from AWS experts all throughout re:Invent. And after the live broadcast, you can watch the content on demand, and all of this is managed with a serverless backend.

There are a number of microservices managing the channels, video streaming and publishing, and doing post-processing of videos, which has a really cool, flexible plug-in architecture where different builders can build functionality to do a whole bunch of different things. That could be transcribing the speech to text, generating the video titles with generative AI using an optimized integration with Amazon Bedrock, and doing some content moderation.

And this uses both EventBridge and Step Functions working effectively together. It uses EventBridge to pass information between the microservices, and each microservice then does what it needs to do and asynchronously places a finished event on the event bus. An individual microservice, like the video processing service, then uses Step Functions to do its orchestration.

It's got decision logic, like whether to use Lambda or Fargate for the compute depending on the video length. Step Functions makes that decision, does the orchestration part, and when it finishes it emits an event so the rest of the microservices can react. Very powerful.

The plug-in manager service also uses Step Functions to handle the video processing timeline using various lifecycle hooks, so the speech-to-text and the gen AI title generation all work in a particular order. Again, when finished, the plug-in manager service puts an event back on the event bus and the other services can react. Extremely flexible and, of course, super scalable. And with Step Functions there are great opportunities to remove code.

The state machine on the left is doing quite a lot of logic with Lambda functions, but they're pretty much just invoking other AWS services. So you can optimize this with direct SDK integrations, like this, implementing the same business logic without running and paying for a Lambda function. And obviously you can mix and match and transition gradually; you have complete control over what your workflow contains. So that's the story of how choreography and orchestration work together.

As we see when running serverless.video, each of the microservices can act independently, and within each microservice the bounded context decides what happens. Step Functions can help with any orchestration within the service, and then you use events to communicate between the microservices, which is a very effective way to build distributed applications.

If you didn't know, there are actually two types of Step Functions workflows. Standard Workflows run for up to a year and are asynchronous. Express Workflows, on the right, are fast and furious: they're built for high throughput, they can run for a maximum of five minutes, and they can also be synchronous. Standard and Express Workflows have different pricing models.

And the cool thing is, with the different cost structures, Express Workflows can also be significantly cheaper. In this example here, you can use Express Workflows as the workflow runs synchronously in under five minutes, and it's actually half a second faster to run per execution, so there's a performance boost too. A million Standard Workflow executions would cost $420, but using Express Workflows this is $12.77. So that is seriously quite a big cost benefit as well.

But the even better story is how they can both work together, nesting Express Workflows within Standard Workflows. That allows you to run long-running Standard Workflows that can support callbacks and other kinds of things, and then nest Express Workflows for high speed and high scale, which return to the parent Standard Workflow when they complete. A great way to get the best of both worlds, which is also happy for your budget.

Now, when building Step Functions workflows, you don't have to start from scratch. Our team has put together the Serverless Workflows Collection, prebuilt open source Step Functions workflows on serverlessland.com, with a whole bunch of patterns that you can literally just pick up and get going with as soon as you need to. There are also other options for reducing code.

The same goes for API Gateway. Do you have Lambda functions that serve only as a proxy between API Gateway and downstream services? Well, you can optimize them as well. You can configure API Gateway to connect directly to multiple AWS services such as DynamoDB, SQS, Step Functions and many more. Once again, no need to use Lambda just as a proxy. There are many other ways to reduce code and use native service integrations.

It's a common pattern to consume DynamoDB Streams using a Lambda function to parse the events and then put them on an EventBridge event bus, maybe for a downstream service to take some action when a new customer is added to a database, for example. Well, EventBridge Pipes, if you're not aware, is another part of the EventBridge family, and it allows you to do just this, but with configuration rather than Lambda code. You configure the pipe to read from DynamoDB, and then there's a built-in integration to send the event to an event bus.

And the pipe actually uses the same polling mechanism under the hood as the Lambda event source mapping, but the code to move the data is handled for you. You just manage your configuration, which doesn't need security patching or any maintenance. It's a winner in my book.
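
As a rough illustration of what that configuration looks like, here's a hedged sketch using boto3; it assumes the pipes API client and uses placeholder names and ARNs throughout.

```python
# Hypothetical sketch: an EventBridge Pipe that reads a DynamoDB stream and
# forwards records to an event bus, replacing a "transport-only" Lambda function.
# All names and ARNs are placeholders; assumes boto3's "pipes" client.
import boto3

pipes = boto3.client("pipes")

pipes.create_pipe(
    Name="customers-stream-to-bus",
    RoleArn="arn:aws:iam::123456789012:role/pipes-role",  # placeholder role
    Source="arn:aws:dynamodb:us-east-1:123456789012:table/customers/stream/2023-01-01T00:00:00.000",
    SourceParameters={
        "DynamoDBStreamParameters": {"StartingPosition": "LATEST"}
    },
    Target="arn:aws:events:us-east-1:123456789012:event-bus/app-bus",  # placeholder bus
)
```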

So with all of the serverless stuff, remember the best performing and cheapest Lambda function is the one you actually remove and replace with a built-in integration. But don't get too excited about that, that's not the whole story. Over to Chris.

Thanks, Julian. So Julian's obviously just covered a whole bunch of ways that you can build serverless applications without thinking about Lambda. Lambda was obviously the thing that kind of started the world of serverless here for us at AWS; we actually didn't call Lambda a serverless product when we launched it, but obviously we've seen this concept and this world kind of grow around it.

Now, Julian talked a little bit about Lambda, the model that we have where you have an event source, you have a Lambda function, and you have the things that your Lambda function does. And one of the unique things that Lambda brought to the industry that we didn't have before was an ability to directly invoke application code behind an API as a service.

Now, today, there are over 140 different services that can invoke Lambda functions on your behalf. And there are three ways that they do that: synchronously, asynchronously, or via what we call a stream or poll-based model, otherwise known as event source mappings. Now, usually the different services do this on your behalf, but you can also use the API directly to invoke these functions.

One of the things that we did, based on feedback we'd been hearing from our customers for many years, was back in April of 2022 we announced Lambda function URLs. This gives you the ability to invoke a Lambda function directly from an HTTP endpoint, essentially looking very similar to a webhook. So you've got a couple of different ways now that you can invoke Lambda functions: integrated with the platform, via its API directly, or via the webhook model.
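
As a quick sketch of the difference between those invocation models, this is what calling the Lambda API directly looks like with boto3; the function name is a placeholder.

```python
# Invoking a function through the Lambda API directly (boto3).
# "my-function" is a placeholder name.
import json
import boto3

lambda_client = boto3.client("lambda")

# Synchronous invoke: wait for and read the result.
sync_resp = lambda_client.invoke(
    FunctionName="my-function",
    InvocationType="RequestResponse",
    Payload=json.dumps({"hello": "world"}),
)
print(json.loads(sync_resp["Payload"].read()))

# Asynchronous invoke: Lambda queues the event and returns immediately (202).
async_resp = lambda_client.invoke(
    FunctionName="my-function",
    InvocationType="Event",
    Payload=json.dumps({"hello": "world"}),
)
print(async_resp["StatusCode"])
```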

Now, when it comes to thinking about performance in Lambda, we kind of say that there's one primary knob that you can turn, and again, maybe to lean a little bit on Werner's joke from his keynote earlier today, of kind of cranking the knob for performance. Essentially, what we do is we give you the ability to configure the memory of a Lambda function, and what comes with that is a proportional amount of CPU and essentially networking throughput.

So today, you can configure Lambda functions anywhere from 128 megabytes up to 10 gigabytes, and again, that gives you this proportional amount of CPU and network bandwidth. Now, customers often ask, or they're trying to understand when they are performance bound, you know, how do I get more access to CPU? And again, that is the primary way that you do it.

This is an example here. This diagram is not entirely accurate; it doesn't scale completely linearly, there are some steps to it. But essentially, as you increase the amount of memory, you get that proportionate amount of CPU, and so at 10 gigabytes, you get up to six cores.

Now, technically, before that point we start exposing the cores to you, but essentially what we're doing behind the scenes is limiting the power of those cores until you get up to the maximum memory configuration. And so you do end up, at some point, with six cores that you can make use of. But the key aspect of this that makes it successful for you is that your code has to support the ability to run across cores.

So technically a Lambda function tops out on single-core performance somewhere between about 1.5 and 1.8 gigabytes of memory. So if your application code is not multi-threaded, that's where you're going to see basically the maximum payoff in terms of CPU performance. Again, you might need the additional memory for your function for other needs, but when it comes to CPU, that's going to be where you top out.
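
Here's an illustrative sketch of what "supporting multiple cores" can mean in a Python function; the workload is made up, and it uses Process and Pipe because, at the time of writing, multiprocessing.Pool relies on /dev/shm, which the Lambda environment doesn't provide.

```python
# CPU-bound work fanned out across processes so a function configured above
# ~1.8 GB (where additional vCPUs appear) can actually use them.
# The "crunch" workload is invented for the example.
from multiprocessing import Process, Pipe
import os

def crunch(chunk, conn):
    conn.send(sum(i * i for i in chunk))  # stand-in for real CPU-heavy work
    conn.close()

def handler(event, context):
    chunks = [range(n, n + 1_000_000) for n in range(0, 4_000_000, 1_000_000)]
    processes, parents = [], []
    for chunk in chunks:
        parent, child = Pipe()
        proc = Process(target=crunch, args=(chunk, child))
        proc.start()
        processes.append(proc)
        parents.append(parent)
    results = [parent.recv() for parent in parents]
    for proc in processes:
        proc.join()
    return {"total": sum(results), "vcpus": os.cpu_count()}
```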

So, ways to think about this. Let's assume that I have two different functions. One is configured for two gigabytes of memory and it runs for one second. The other one is configured for one gigabyte of memory and it runs for two seconds. Effectively, these are the exact same when it comes to cost: running for half as much time with twice as much memory is the same as running for twice as long with half as much memory.

Now, how about this one here? I have a function that's configured for 128 megabytes and it runs for 10 seconds, and then I have a function configured for one gigabyte and it runs for one second. The answer in this case is that the one with one gigabyte configured is the lower cost one.
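
The GB-second arithmetic behind those comparisons looks like this; the per-GB-second price is illustrative only (roughly the published x86 us-east-1 rate at the time of writing) and it ignores the per-request charge.

```python
# Back-of-envelope duration x memory cost math for the examples above.
PRICE_PER_GB_SECOND = 0.0000166667  # illustrative rate; check current pricing

def invocation_cost(memory_gb, duration_s):
    return memory_gb * duration_s * PRICE_PER_GB_SECOND

print(invocation_cost(2, 1))       # 2 GB for 1 s    -> 2.0 GB-seconds
print(invocation_cost(1, 2))       # 1 GB for 2 s    -> 2.0 GB-seconds, same cost
print(invocation_cost(0.128, 10))  # 128 MB for 10 s -> 1.28 GB-seconds
print(invocation_cost(1, 1))       # 1 GB for 1 s    -> 1.0 GB-seconds, cheaper
```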

So why is this happening? Typically, as you're getting more CPU power, your application code is able to run faster. Where do you see this? Almost any place that you have a Lambda function calling out to another service. So, a number of years ago, for HTTPS across the industry, TLS certificate key sizes increased from 1024 to 2048 to the 4096 bits that you sometimes see these days, and that required a significant increase in CPU in order to handle the encryption of the traffic back and forth between the source and the destination.

So even if all your function does is talk to a single HTTPS endpoint, more memory will give you a faster function. Now, I said there's basically just one knob that we give you for performance. That's kind of a lie. There's another toggle that we give you, which is the type of CPU that you run your Lambda functions on.

So we launched back in 2014 with x86 64-bit processors; today, we also have Graviton2. With Graviton2, you do get better price performance. You know, again, you do want to test your application, depending on what your code does, whether or not it's going to be supported on Graviton. But generally speaking, when we see customers move to Graviton, they find success in being able to both save money and have functions run faster.

Now, you don't have to blindly stumble into doing this. We've got a number of ways that you can explore it. One is with the Lambda Power Tuning tool. This is an open source project that was started by a member of our community who's now a member of our staff, and it's been wonderfully supported by the community for many years. What it allows you to do is take a function configuration, push a bunch of test invocations at it, and have it test different configurations.

So we see here in this diagram that I've got a number of different memory configurations that it's testing. What it can come back and tell me, or what I can deduce from the data, is: this is the lowest cost configuration, this is the fastest configuration. The lowest cost and the fastest may not always be the same, and so it depends on what you're looking for. If I have a synchronous invocation, I probably care a lot more about performance. If I have an asynchronous invocation, or one of the event source mapping or poll-based functions, I probably care a little bit more about cost.

Generally speaking, I'm not looking for things that are consuming SQS to be fast, necessarily. The same thing goes for when you're working with Graviton. So you can basically take your function on x86 and run it through power tuning, then take your function on Graviton and run it through power tuning, and the power tuning tool allows you to compare and contrast those two runs.

And so we can see here that the Graviton-configured function ran 27% faster and was 41% cheaper, again for the workload that was used in this test. So, a free and easy tool for you to use that gives you the ability to test these different configurations.

We also have another tool inside of AWS, which is called AWS Compute Optimizer. This gives you a whole bunch of information. It is constantly looking at your functions and how they perform over time, and it gives you the ability to look at the different options for tuning memory up or down based on performance and what you need. So again, another tool in the toolbox for when your functions are actually running in production, to see: hey, does this seem like it's configured well? Should I think differently about configuring it?

The next thing I want to talk about here is the AWS Lambda execution environment lifecycle.

We're gonna talk about everyone's favorite topic here, which is cold starts. And I know we've got AJ somewhere in the room down front here, uh, who gave a great talk on demystifying cold starts earlier this week.

So cold starts - what is this? I feel like I've been talking about cold starts for half of my life here at Amazon. But essentially what this is, is that when Lambda needs to create a new worker environment to run your code, there's a period of time where we have to bring up that environment and make it available to you.

Now, there are a couple places where this happens due to actions that you take. There's a couple places where this happens due to actions that we have to take. But the real key thing that I want you to understand here is the line that's in purple, which is that our data shows that cold starts impact less than 1 percent of all production function invokes.

So again, if you have a production workload and you have any sort of consistency or normalcy of traffic, generally speaking, cold starts should be pretty far out on the tail end of your traffic. Now, for some of you that are running against synchronous based workloads, you've got APIs, you've got consumers on the other end of that, maybe that 1 percent might be not acceptable to you. So we'll talk about how you can overcome some of these challenges later today.

Other times that you'll see cold starts - if you deploy new function versions, you deploy new code to your Lambda functions, that's going to cause us to have to basically swap out the environments for you and then they'll spin back up as traffic comes into them again. We'll talk about how you can get past that as well.

On our side, Lambda is a managed compute service. We take care of a bunch of things under the hood for you and that's part of the magic of what Lambda does. So from time to time we actually do have to what we call "reap" these environments, take them back away from you for various reasons - keep the instances fresh, keep the operating system, various code patches, stuff like that.

Again, for the managed runtime configurations of Lambda, we're taking care of a lot of these things on your behalf. And so we have to take care of those things. Another is failure, right? As Werner has always said for many years now, everything fails all the time. And so eventually you have a problem potentially. And so again, you could see environments get kind of swapped out from under you.

Now, if you break down this function lifecycle and we look at where the cold start does happen, what happens inside of it is a number of things:

  1. We have to create that new execution environment. We basically have to find in our pool of resources, a host that we want to run your code on.
  2. We then have to download your code or the OCI image if you use the container packaging.
  3. We have to then kick up your runtime again, whether it's a managed runtime, a custom runtime or the OCI image.
  4. And then we have to run what's called your function pre-handler code.

Then after that point, your function is warm and it's ready to execute upon the event that's been sent into it.

Now, basically in a managed runtime world on Lambda, this is where there's kind of a demarcation between what you can control and what you can't control. So essentially, everything up to the end of starting the runtime is on us.

The Lambda team spends a lot of time over the years here, shaving down milliseconds and nanoseconds, improving jitter and trying to make everything that comes on our side of this line here, faster and faster and faster.

Let's talk a little bit about how the composition of a Lambda function impacts things. So this is some example pseudo code here - nothing really amazing going on. Although Julian did beat me up for having a dash in a function name and saying that that wasn't clean Python. So this is apparently clean Python.

But what we see here is I've got kind of two sections here that are part of my initialization of my function. This is code that's going to run in that init period during a cold start before my actual invocation. And then I have my handler function. And the handler function is where we look to execute your business logic and we pass the event into during an actual event invoke.

And then if you follow a best practice of ours, one of the things that we encourage you to do is to take your kind of core business logic, not have that in the handler, but have it in separate functions or separate parts of code inside of your application.

Some of you will wrap up your own business logic into other packages that you might include. Some of you might use layers and containers for this. And it really helps with, you know, portability testing, keeping the handler nice and kind of short and clean and concise. And so again, a general kind of best practice that we recommend overall for Lambda.
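
Here's a minimal sketch of that structure: clients created once in the init phase, a thin handler, and the business logic in its own function so it can be tested on its own. The table name and event shape are placeholders.

```python
# Minimal sketch of the recommended function composition.
import os
import boto3

# --- init / pre-handler code: runs once per cold start ---
dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table(os.environ.get("TABLE_NAME", "orders"))  # placeholder table

# --- business logic: importable and unit-testable without Lambda ---
def record_order(order):
    table.put_item(Item=order)
    return {"status": "stored", "id": order["id"]}

# --- handler: just unwraps the event and delegates ---
def handler(event, context):
    return record_order(event["detail"])
```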

Now there are some things that you could do to help make the init be faster again for your functions:

  • One here is that you only really want to import the things that you need. So some of the various SDK libraries that you might be using will allow you to selectively import just certain aspects, right? So don't import a huge, huge library that's got tens of thousands of lines of code if you only really need a small subset of that.

  • Another thing that you can do is to lazy initialize various libraries based on need. So you might have something where, inside of a function, let's say that I have potentially two different logic paths - one might use S3, one might use DynamoDB. I can essentially, at the needed time inside of my code, decide to further initialize those aspects. Again, with the way that Lambda works, once they've been initialized in a warm environment, that will stick around going forward. So again, it depends on what you're trying to do - are you looking to get through init really fast? Are you looking to get through Lambda invocations really fast? That's something that you should check; there's a small sketch of this just after this list.

  • Someone should wake up because their alarm went off and I'm sorry if I put you to sleep.
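
Here's a small sketch of those first two ideas in Python: import only the pieces you need, and lazily create heavier clients on the code path that actually uses them. The event shape and service choices are examples, not a prescribed pattern.

```python
# Selective imports plus lazy initialization of clients, created on first use
# and then reused while the environment stays warm.
from datetime import datetime  # import the piece you need, not a whole toolbox

import boto3

_s3 = None
_ddb = None

def s3():
    global _s3
    if _s3 is None:  # only pay the init cost on the path that needs it
        _s3 = boto3.client("s3")
    return _s3

def ddb():
    global _ddb
    if _ddb is None:
        _ddb = boto3.client("dynamodb")
    return _ddb

def handler(event, context):
    if event.get("type") == "archive":
        return {"buckets": len(s3().list_buckets()["Buckets"])}
    return {"tables": ddb().list_tables()["TableNames"],
            "checked_at": datetime.utcnow().isoformat()}
```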

Now again, we've got a bunch of other guidance here for what to think about in that init pre-handler code. Again, I'm not going to go bullet point by bullet point through this, but this is stuff that you kind of want to loosely be aware of, right? Don't load it if you don't need it. And we see lots of people bring lots of tools and things into their Lambda functions, lots of excess code. And try to keep that as minimal as you can. Try to lazy initialize shared libraries. Try to think about how you establish connections.

So sometimes establishing connections in init makes sense, sometimes you're better off waiting for during the first kind of invoking your function and connecting at the time of need. And then also re-establishing connections inside your handler - think about how you use state during your functions.

So sometimes people like to bring state in early and then they maybe don't need it for every invoke. And so it kind of sits around again and can slow them down early on.

And we'll talk a little bit more here about pre-warming or pre-initializing your functions.

Now, X-Ray can help you identify this, as well as a number of other tools that we have here from industry partners of AWS. So we see here I've highlighted where the initialization is. I can go further with X-Ray and actually instrument the individual things that are happening inside of my Lambda function, so that I get really good, deep data on what's happening during initialization.

And again, I think this is something that is part of your testing environments, part of your testing your functions. You really want to measure this to understand the impact.

Now there are a couple of other variations of the Lambda function life cycle that we see:

  • One is if you use a capability called Extensions - Extensions give you the ability to plug in code that exists outside of the actual execution of your function and respond to events or things that are happening inside your function. So we have many partners here at AWS that have released extensions that allow you to do things like inspect an event, inspect performance, look at what's happening, say on the wire over the network, provide things like access to parameter stores, key value stores for various things, different logging tools and agents and so forth. When you have those in your code, it shifts the optimization line over a bit, shifts that line of shared responsibility, because the extension performance then becomes something that the third party partner or your team has to think about. And so that's something that you end up owning. We have seen some of these partners need to tweak things over time where the extension hasn't been as optimized as it could.

  • The next model that we have is what happens with SnapStart for Java functions. So SnapStart was released last year. What SnapStart does is it basically goes and completes the full init of your code ahead of time, and then it takes a snapshot, effectively like an image, of that running execution environment and makes it available going forward for your Lambda function. And so basically, every new invoke that happens for a function with SnapStart starts from that pre-initialized environment. This can be a really great benefit for Java-based functions, the language that we support today. It's also a language that historically has struggled with init performance.

Now, SnapStart - again, I pretty much just encourage customers that are using Java to just use it. It works really, really well. There are a couple of nuances you want to think about, like how you connect to databases, because that will be frozen in the image that the environments then reuse over time. But beyond that, there's no additional cost for this, there's no other tooling that you need to use, there's no special packaging, nothing changes in your CI/CD pipelines. You basically toggle it in the config and it helps make your Java functions much, much, much faster on that init.

Now, there are other optimization things that you could do across pretty much all the different runtimes that we have. And there's a bunch of talks that have happened this week covering optimizations inside of Lambda in general, kind of coding best practices.

And what I would say is that a lot of these are just general best practices, either SDK best practices or again, best practices for the given runtime that you're working with.

Now, one really cool hack that I like to see every now and then - this is a super secret trick that I've learned over a couple of decades of working in IT - is upgrade your stuff! So one of the best things that you could do, sometimes one of the cheapest, easiest, like dumbest, laziest wins is like just run the latest version of something.

And there's been a lot of examples over the years where moving to a minor version on a runtime gets you a performance win. And you're like, oh my god, all I did was deploy a new version and I'm saving money and my stuff runs faster and that's awesome.

So keep on top of version updates, you know, keep on top of your dependency updates. I'm like, yes, dependabot can be annoying for security things. But there's a lot of stuff that happens, especially when you're including code, where minor tweaks, minor new versions, whatever it might be, can lead to graphs like this where you see a major drop off just by moving to a new runtime.

And so I love to see wins like this.

Now, one thing that isn't necessarily a performance win but can help logically inside your functions, how you think about logging - and we've just had a bunch of new stuff come out with logging here in the last couple of weeks, both pre-Reinvent and then during this week here - and certain observability tools.

You know, one of the things that you could do is now have the ability to control log levels for your Lambda functions. Now, not necessarily a performance thing, but definitely a cost thing that we see with Lambda - people who are aggressively logging, it's going to lead to higher costs. So now you can set the log level, control the log format and the outputs of those.

You also have the ability to use the infrequent access log class in CloudWatch. Again, it's going to help you save money and just make, you know, overall things better.

Now, one thing that I'm also a huge, huge fan of is the Power Tools for Lambda. This basically helps automate a whole bunch of best practices, guidance in your function - how you think about coding for your functions, how you think about how you handle and process events that come in.

That team has been cranking on at full steam. They became an official team inside of AWS earlier this year under Heitor Lessa, who's an incredible member of our team here at AWS. And so the Power Tools are something that we're seeing really gain adoption inside of AWS and outside of AWS. And it's kind of just best practices in a box.
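
As a hedged sketch of what that looks like in Python, assuming the Powertools for AWS Lambda (Python) package; the service name is a placeholder and the log level could equally come from configuration.

```python
# Structured logging with a controllable level, plus tracing, via Powertools.
from aws_lambda_powertools import Logger, Tracer

logger = Logger(service="orders", level="INFO")
tracer = Tracer(service="orders")

@logger.inject_lambda_context   # adds request ID, cold start flag, etc. to each log line
@tracer.capture_lambda_handler  # records the invocation as an X-Ray segment
def handler(event, context):
    logger.debug("full event received")   # dropped at INFO level, so no log cost
    logger.info("processing order")
    return {"ok": True}
```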

And so I definitely encourage you to look at Power Tools. The last thing I'll talk about here is another thing that you can do, which is turning on CloudWatch Lambda Insights. Lambda Insights gives you more data about how your Lambda functions perform.

Now, this is good for testing, good for production and good for diagnosing things. Maybe you don't want to keep it on all the time - you are going to be paying for this data in CloudWatch - but it gives you a bunch of metrics and information that you wouldn't otherwise get with the default CloudWatch metrics for Lambda functions, such as the CPU usage and the network usage.

And in those cases it can help you think about: OK, I should tune up memory, I should think about my configuration for functions a little differently. Now, we've got a great learning guide on this on Serverless Land, Cost Optimization for AWS Lambda, with a whole bunch of stuff in there that can help you think about cost, and cost very much aligns to performance in this world.

So let's get into another fun topic here: concurrency. Concurrency is basically the number of your execution environments that are running at any point in time. And this is definitely another topic that I find people have struggled with over the years. We're not talking about a per-second rate here; concurrency in Lambda works a little bit differently.

One of the other aspects here about Lambda is that a Lambda worker environment or execution environment can only process a single event at a time, right? We do not today support the ability for you to have multiple events inside of that.

Now, this is a little different if you're using Lambda to process things like SQS messages, where we do batching. A batch still comes through as a single event, even though it contains multiple individual messages. So again, regardless of invocation model, regardless of everything that might be sitting in front of it, you have a single environment processing a single request or event at any point in time.

So when we think about this, let's take a scale of time here in a window, I have a function that has just gotten an invocation. It's going to have that little bit of cold start, go through my init code and then it's going to execute and run my logic. And so again, this is only processing that single request.

So all of a sudden I start to get more requests in. Because that first environment is essentially locked on that first event, what does it do? It causes some cold starts as new environments are spun up. And so we see here that I have these two new function invocations that came in; they both cause a cold start and then they start processing their events.

While those three are still running, I get two more that come in. Again, all three of those first environments are still basically busy or tied up, and so I now have two more environments that have to go through that cold start and begin processing the events coming to them.

However, I can see here that at some point my first environment becomes free. And so another invoke comes in, and the Lambda service behind the scenes is able to say: aha, I have an environment that's already warmed up and running, I'm going to pass the event to that. And so, depending on where you're looking on this timeline, we now have effectively three concurrency, four concurrency, and so on.

And so eventually, as more of these events come in, the service says: I have warmed environments. We keep warmed environments around for a period of time based on idleness, the scale of your function and a number of other factors. And again, outside of some of the other things I'll talk about here in a moment, you really don't get to control this; it's something that we take care of and try to optimize on your behalf.

And so as we see events 7, 8, 9 and 10 come in, they're able to use these warm environments. However, 9 came in during a time when there was no environment free, and so again, that caused an init. So think about how concurrency actually works over a period of time here; this is kind of a loose scale of time.

What we see here is that during time period one I have one concurrency, during time period two still one concurrency, and at time period three still one concurrency. But then, as this expands over time, you see again where these function environments are active and where they're being utilized, and that is how you think about the concurrency at that point in time.

So again, this is not necessarily a request per second type of model. This is just a point in time way of thinking about what's happening with my functions. Now, one thing that can happen that you might see from time to time if you're using tools like X-ray or other observability tools is you might see a disconnect in between a cold start and a function invocation or inside of an environment.

So one thing that can actually happen is, as we see up top here, the worker environment on that top level for the first function came in, did its init, did its invocation. And then at some point I got a second invocation that came in while the first environment was tied up, and so we started to do a cold start for a new worker environment.

However, before that init finished, the first function environment became free again and said: I'm available for an invoke. And so behind the scenes, we sent the invoke to that first worker. Now at this point, the second worker becomes available at some point in time for a future invocation. But if you're using tools like X-Ray, you might see that you had a cold start and then no execution happened for a really long period of time, and then all of a sudden you see the actual invoke of the Lambda function happen, kind of detached. And so this shows up as a gap in tools like X-Ray. But again, know that what happened here is that we're optimizing for the performance of your application, and so we're basically giving the freshest worker environment the invoke when we can.

Now, talking a little bit here about TPS. TPS starts to play a role when you talk about downstream systems. If I'm talking to a relational database, I only have so much capacity and ability to work with that database, or maybe I have a third party API that I'm working with and I might be constrained by some limit of that third party, effectively transactions per second, relative to the concurrency that you have and the time that it takes your functions to run.

So we can see here that if I have 10 invocations and they each take a second to run, effectively I have 10 TPS. If my functions took half as long, I would be able to fit up to 20 of them in that time period. And so again, I don't actually have more than 10 concurrency; it's just that that 10 concurrency is working faster because each invocation takes a shorter duration. So performance here is a factor of concurrency and the duration your function runs for, and that combination is what looks like TPS.
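
The arithmetic is simple enough to sketch; this is just the concurrency-versus-duration relationship described above, with illustrative numbers.

```python
# Effective TPS is concurrency divided by average duration, and vice versa.
def effective_tps(concurrency, avg_duration_s):
    return concurrency / avg_duration_s

def concurrency_needed(target_tps, avg_duration_s):
    return target_tps * avg_duration_s

print(effective_tps(10, 1.0))      # 10 concurrent invocations at 1 s each -> 10 TPS
print(effective_tps(10, 0.5))      # same concurrency, half the duration   -> 20 TPS
print(concurrency_needed(15, 1.0)) # a 15 TPS downstream limit at 1 s -> cap near 15 concurrency
```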

So if I have a downstream service that I'm talking to and they're going to maybe start throttling me back at 15 TPS, I then have to think about that as a factor of the amount of concurrency that I want to allow to go to that downstream service. And so we have options for how you can control this. We have a concept called reserved concurrency. This allows me to set a threshold of how much concurrency I want to allow a function to have, and this gives me the ability to protect downstream services without having to worry about overwhelming or causing errors in that other service that I might be talking to.
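
A hedged boto3 sketch of that control; the function name is a placeholder, and the value simply mirrors the 15 TPS example at roughly one-second durations.

```python
# Capping a function with reserved concurrency so it can't overwhelm a
# downstream service limited to roughly 15 TPS (at ~1 s average duration).
import boto3

lambda_client = boto3.client("lambda")

lambda_client.put_function_concurrency(
    FunctionName="orders-writer",          # placeholder function name
    ReservedConcurrentExecutions=15,
)

# Setting the value to 0 stops all invokes, the "off switch" idea mentioned next:
# lambda_client.put_function_concurrency(
#     FunctionName="orders-writer", ReservedConcurrentExecutions=0)
```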

And again, there are a bunch of cool things that you can do with treating it as an off switch in times when you've got downstream issues or impacts. Now, one thing that we do have in Lambda that can help you avoid cold starts is a capability called provisioned concurrency. With provisioned concurrency, you configure it for a certain value and then we go and effectively pre-warm those environments for you, and we try to keep those warmed environments available for you always.

So if, say, we need to reap or take back a worker environment, or a piece of hardware fails, we'll then go back and re-provision those functions for you. And so we see here that you configure your function, you turn on provisioned concurrency with a value of 10, and we go and run all of those inits. You see all those inits happen in parallel, and at some point your events come in and they land on those already warmed environments, so you won't see a cold start in front of those environments.

Now again, you are setting this for an initial value - in this case 10. So if an 11th request came in at this point in time and all my workers were busy, that would just be an on-demand invoke, and that would cause a cold start for that function.

One of the other cool things about provisioned concurrency is that it has a slightly different cost model. If you use it really, really well, you can actually save money with provisioned concurrency over the on-demand pay model for Lambda. It does vary slightly by region, but at somewhere around 60% utilization - in this case, us-east-1 - when you've utilized at least 60% of your provisioned concurrency for a given function, it actually becomes cheaper than running that Lambda function purely on demand.

So again, this is both a cost and a performance knob that we give you for your Lambda functions. And so you can think of how we could apply this on top of our workload. Let's assume that we have some idea of the traffic that's going to come to our application that's backed by Lambda. At some point in time we've looked at this workload, and what we could do is establish a baseline of provisioned concurrency that we always want to have configured for this function.

We could then use tools like auto scaling to actually turn provisioned concurrency up and down against that demand. And so, as long as this environment is at least 60% utilized - again, us-east-1 - I'm saving money and increasing the performance of my application by removing cold starts from it. And you do want to keep on top of this, right?

It would be a mistake for me to set the provisioned concurrency for this application at 100, which is kind of the top bar here, because then I might have environments that for periods of the day were not getting invoked. You really do want to find the model where you can either set a baseline that covers the majority of your traffic over time, or fluctuate based on the need of the day.
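
Here's a hedged sketch of that baseline-plus-auto-scaling idea with boto3; the function name, alias, capacities and target value are all placeholders to adjust for your own traffic.

```python
# A provisioned concurrency baseline on an alias, with Application Auto Scaling
# tracking utilization so the value follows demand instead of sitting at the peak.
import boto3

lambda_client = boto3.client("lambda")
autoscaling = boto3.client("application-autoscaling")

# Baseline of warm environments on the "live" alias.
lambda_client.put_provisioned_concurrency_config(
    FunctionName="orders-api",
    Qualifier="live",
    ProvisionedConcurrentExecutions=20,
)

# Let Application Auto Scaling move it between 20 and 100 based on utilization.
autoscaling.register_scalable_target(
    ServiceNamespace="lambda",
    ResourceId="function:orders-api:live",
    ScalableDimension="lambda:function:ProvisionedConcurrency",
    MinCapacity=20,
    MaxCapacity=100,
)
autoscaling.put_scaling_policy(
    PolicyName="track-provisioned-concurrency-utilization",
    ServiceNamespace="lambda",
    ResourceId="function:orders-api:live",
    ScalableDimension="lambda:function:ProvisionedConcurrency",
    PolicyType="TargetTrackingScaling",
    TargetTrackingScalingPolicyConfiguration={
        "TargetValue": 0.7,  # aim to keep provisioned concurrency ~70% utilized
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "LambdaProvisionedConcurrencyUtilization"
        },
    },
)
```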

Now, another thing to talk about when we talk about concurrency is how fast Lambda functions can scale over time. Previously, we had a model of an account concurrency quota, where for a given region you had a total amount of concurrency and then you had a burst rate inside of that. We've now gone and changed the burst model. This came out just about two weeks ago, and it applies to basically all functions that exist.

So now what you end up with is a maximum increase in concurrency of 1,000 instances, or 1,000 worker environments, over a period of 10 seconds for every function. To help visualize this a little bit better, I'll show you the old model that we had. Previously, the burst rate depended on the region. What you had was an initial burst rate - in many regions, 3,000 - so that meant you could go from zero workers to 3,000 pretty much almost instantaneously, and then we had this kind of stepping scale over time where we could add 500 per minute to that inside of your account.

So basically what it looked like is that over about 12 minutes, you could get to 10,000 concurrency. OK, and this is starting at zero, right? For a production workload, you're typically not starting at zero, unless you're doing things like deploying a new function that isn't using provisioned concurrency. With the new model, it looks like this.

So what we're able to do now is actually get to that 10,000 concurrency in 90 seconds. Essentially, this is the fastest way at AWS to get a whole lot of compute power behind an application, and it just got faster with this. So again, a really interesting change, with a lot of interesting stuff happening behind the scenes on scale. With that, I'm gonna hand it back off to Julian to take us home.

Thanks, Chris. Wow, I love how Lambda is so flexible and scalable. Anybody like Lambda? Excellent. You know, you can run your code in the best possible, well-architected way. So in this section, I'm going to talk about the software lifecycle. You have your services, you have your code; well, how does this all fit together from your workstation out into the world?

Now, if you are building serverless applications, just please use a framework; it's gonna make your life so much easier. There are serverless-specific infrastructure as code frameworks to define your cloud resources. From AWS, we've got AWS SAM, the Serverless Application Model, and you can also use the CDK, which allows you to build CloudFormation in familiar programming languages. Both generate CloudFormation.

There are a number of great third-party tools as well, from the ones shown here and even others. But you really want to be using a framework to build your serverless applications, and get into the habit of starting with infrastructure as code rather than in the console.

But if, like me, visual is your thing, also have a look at Application Composer, which you can now also jump to from the Lambda and Step Functions consoles and which, as announced today in Werner's keynote, is also available in your IDE with VS Code. It's got a great drag-and-drop interface to build applications, and not just serverless ones.

It works with all CloudFormation resources, and you can import existing stacks to see what they look like, which is great for understanding what you already have. It actually syncs with your local file system, so you can build visually in the console or your IDE and generate the infrastructure as code at the same time. Two for one, best practices built in; isn't that good?

And, as with the serverless workflows, you don't have to start from scratch. You've got the Serverless Patterns Collection on serverlessland.com: more than 700 sample infrastructure as code patterns across many languages, many services and different service integrations. I'm sure there's likely one for your use case which you can just copy and use in your applications. And because it's all open source, you can even submit your own, and why not help out your fellow builders.

A traditional developer workflow is often done on your local machine to get fast feedback while you're developing your applications, and developers then think they need to have their entire app locally and run everything locally. However, when you're building cloud applications, this works slightly differently, because sure, you've got code that you're developing, but there's also a lot of other stuff you're connecting to: integrations with other services, sending messages and events, maybe connecting to other APIs or talking to databases.

And so it can be tempting to try to emulate all these things locally, to build all these services on your laptop so you can do everything. But this is hard; it's really gonna be hard to get everything working and, critically, to keep things up to date. So try to avoid doing this if you can. Now, locally you can do some stuff.

Sparingly, you can use some mock frameworks, for example. If you've got some complex logic and you want to do some testing for that, you can mock your event payloads, provide that input and check your output, and that's a really good thing to do.
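
As a small sketch of that idea, here's a unit test that supplies a mocked, API Gateway-style event payload to a hypothetical handler and asserts on the output. The handler, event shape and values are all illustrative.

```python
import json

# Hypothetical handler under test (e.g. src/app.py)
def handler(event, context):
    body = json.loads(event["body"])
    total = sum(item["price"] * item["qty"] for item in body["items"])
    return {"statusCode": 200, "body": json.dumps({"total": total})}

def test_handler_with_mocked_event_payload():
    # Mocked event payload: provide the input, check the output
    event = {
        "body": json.dumps({"items": [{"price": 5.0, "qty": 2}, {"price": 1.5, "qty": 4}]})
    }
    response = handler(event, context=None)
    assert response["statusCode"] == 200
    assert json.loads(response["body"])["total"] == 16.0
```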

But ideally, we want the best of both worlds: this sort of quick local iteration and also the cloud. You want to iterate locally on your business logic and then run your code in a cloud environment as quickly as possible.

And so SAM has SAM Accelerate, which helps you with just this. While you're developing, you can iterate against the cloud with the speed of local development. CDK watch also does a similar thing. These allow you, really cleverly, to work in your local IDE and sync the changes to cloud resources. They really quickly update code to test against other resources in the cloud, without waiting for CloudFormation to deploy.

And you can also use SAM logs to get aggregated logs and traces directly in your terminal, so you don't have to jump into the console. This makes what developers call the inner loop really super quick, using both cloud and local resources. And this really does change the way you build serverless applications in the cloud, giving you the best of both worlds: a fast local development experience, using real cloud resources.

Now, just thinking back to those DORA metrics I was talking about earlier, about getting things into production quicker. Remember, we want both speed and quality. Well, automated testing is the way to get there. Good testing is an investment in that speed and in that quality, and it helps ensure that your systems are developed efficiently, accurately and, of course, with high quality. You want to have good test coverage from your code all the way through your CI/CD pipelines, so you can confidently get features into production.

Now, of course, there are a number of places where tests are important. You should, of course, unit test your Lambda function code when developing locally, and then automatically in the cloud through your pipelines. You can use test harnesses; these are super useful to generate inputs and then to receive the outputs. And then you want to be testing service integrations in the cloud as quickly as possible. Maybe you're gonna define some integration tests, maybe you're gonna pick two or three services, and then develop your full end-to-end testing for your whole application.
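
One way to exercise a service integration in the cloud is to invoke the real, deployed function from a test rather than an emulator. Here's a hedged sketch using boto3 and pytest; the function name is a made-up example, and your pipeline would supply credentials and the deployed stack's outputs.

```python
import json
import boto3
import pytest

FUNCTION_NAME = "orders-service-dev-OrdersHandler"  # hypothetical deployed function name

@pytest.fixture(scope="module")
def lambda_client():
    return boto3.client("lambda")

def test_orders_handler_in_the_cloud(lambda_client):
    """Integration test: invoke the deployed function and assert on its response."""
    payload = {"body": json.dumps({"items": [{"price": 5.0, "qty": 2}]})}
    result = lambda_client.invoke(
        FunctionName=FUNCTION_NAME,
        Payload=json.dumps(payload).encode("utf-8"),
    )
    body = json.loads(result["Payload"].read())
    assert result["StatusCode"] == 200   # the invoke itself succeeded
    assert body["statusCode"] == 200     # the handler returned success
```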

And then, of course, you want to move towards also testing in production to prioritize speed. This isn't only testing in production; it's testing in production in addition to everything else. You can use things like canary deployments, which allow you to develop things locally, push them to the cloud, and introduce changes more slowly in defined increments rather than having a big-bang, all-at-once approach. Feature flags also help you to introduce code effectively and then back out really quickly if you do have a problem. And observability is absolutely key: observability tooling is super critical to measure what's happening and to understand if things are changing, and good rollback procedures then allow you to reduce the risk and increase your agility. Again, another jumping-off point; there's plenty more to say about testing than we've got time for today.
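
As a very small illustration of the feature-flag idea, here's a sketch where a new code path is gated behind a configuration value so it can be switched off quickly without touching the rest of the code. Real implementations often use a dedicated configuration or flag service; the environment variable and pricing logic here are purely hypothetical.

```python
import os

def new_pricing_enabled() -> bool:
    """Simple feature flag read from configuration; flip it off to back out quickly."""
    return os.environ.get("ENABLE_NEW_PRICING", "false").lower() == "true"

def calculate_price(items):
    if new_pricing_enabled():
        return sum(i["price"] * i["qty"] for i in items) * 0.9  # new, flagged code path
    return sum(i["price"] * i["qty"] for i in items)             # existing behaviour
```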

Dan Fox and some other experts have written a superb testing learning guide on serverlessland.com, which has loads of information, and the link is in the resources page. There are examples for various programming languages; super helpful. But now let's switch and look a little bit at ops.

Again, the biggest barrier to agility when building applications is often a lack of time spent on the things that matter. CIOs want development teams to focus on innovation and move with speed. But today, what are most developers doing? They're spending a lot of time on operations and maintenance.

So then you ask the question: what does ops actually do in a serverless world? Well, I think there's a lot. With serverless, the operational management required to run and scale applications is handled for you by AWS in the cloud. But not only is NoOps not a reality, operators are actually more important than ever. Ops is different, but the role isn't any less important. The cool thing is it actually becomes less manual and more strategic, taking on a wider role within the organization, so you can actually operate safely and with speed.

And there are two approaches to ops. There's the free-for-all. Now, of course, this isn't a reality for production applications, but at the extreme end it lets devs go as fast as they can and do it the way they want. But obviously that's gonna risk bad code: you're gonna have poor code going out, you're gonna have reliability issues, and it could be as bad as even legal issues.

Then, on the other end of the spectrum, you've got everything being centrally controlled. You've got a central team that's gonna take control of the release pipeline, maybe do all the provisioning of the resources, handle all the security and all of the troubleshooting. It's gonna be lower risk, of course, because it's very understandable, but it's obviously gonna be a lot slower because of the dependencies and the time lags.

So we actually want it both ways. We want it fast, to get features out with really fast iteration, but we also want it to be safe, with a low risk to the business. And this is why we use this concept of guardrails. These are processes and practices that reduce both the occurrence and impact of undesirable application behavior. They are rules that you can define to stop the bad stuff happening, and obviously you want to express them as code.

Now, there are many examples: things like enforcing your pipelines, maybe not making things public, logging and tracing, whether you need access to a VPC, tags, log groups, encryption settings; a whole bunch of stuff in the list here. These are things you want to ensure actually get done, and they need to be checked at as many stages as you can. Of course, while building your application, the so-called shift left, so you can catch those things early on, but also at various stages during your automated pipeline and while your apps are running.

And so you have proactive controls, which you can use to check and block resources before they're deployed: you can use linting and CloudFormation Guard, which are super useful. AWS Config, if you haven't used it, is super helpful to get a view of your cloud resources, and you can define rules to check the compliance of those resources before they're actually deployed into production.
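
To show what "guardrails as code" can look like as a proactive control, here's a hedged sketch of a pipeline step that scans a CloudFormation template (in JSON) and blocks the build if simple rules fail. The rules, resource properties and file path are illustrative; in practice you'd express the same checks with tools like cfn-lint, CloudFormation Guard or AWS Config rules rather than hand-rolling them.

```python
import json
import sys

# Two example guardrails: S3 buckets must block public access, Lambda functions
# must configure tracing. Purely illustrative organizational rules.
RULES = {
    "AWS::S3::Bucket": lambda props: "PublicAccessBlockConfiguration" in props,
    "AWS::Lambda::Function": lambda props: "TracingConfig" in props,
}

def check_template(path: str) -> list[str]:
    """Return a list of findings for resources that violate a guardrail."""
    with open(path) as f:
        template = json.load(f)
    findings = []
    for name, resource in template.get("Resources", {}).items():
        rule = RULES.get(resource.get("Type"))
        if rule and not rule(resource.get("Properties", {})):
            findings.append(f"{name}: fails guardrail for {resource['Type']}")
    return findings

if __name__ == "__main__":
    issues = check_template(sys.argv[1])   # e.g. the template your framework generated
    for issue in issues:
        print(issue)
    sys.exit(1 if issues else 0)           # a non-zero exit blocks the pipeline stage
```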

And then on the other side, you've got the detective controls while your app is running, to ensure that your app still stays in compliance, checking for vulnerabilities and config issues on a continual basis. AWS Config is still helpful here, and you can also use Amazon Inspector for ongoing vulnerability management.

Again, another jumping-off point: there's a learning guide with lots of information on implementing governance in depth for serverless applications, and the link is in the resources page, so I suggest you take a look. Now, DevOps has been superb at fostering organizational collaboration. But we're asking a lot of developers to take on more and more in building applications, particularly when we're building serverless applications and we're using application constructs rather than infrastructure primitives. And giving developer teams full control over everything is intimidating and it's gonna be complex, especially with the governance.

And so the concept of platform engineering recognizes that a lot of the operational and governance issues actually don't need to be surfaced to the developers directly. You absolutely want to increase developer productivity and the pace of software delivery; developers want to get their stuff done using self-service enablement and great tooling to work with their applications.

A central platform team can provide some of the best practices across your whole organization to manage, govern and run your applications. But I also want to caution you just a little bit against building one huge platform to rule them all. The job of a platform team should not be about building a platform; it should be about enablement and integrating other platforms.

And you probably wanna have many teams doing this, enabling many platforms that your devs can use: security platforms, logging platforms, dev tooling, and integrations with other things. You want your platform teams to work closely with your dev teams to understand how the platforms are actually being used within the business, to better enable people to use them. If you don't do that, it's just gonna become another isolated silo, and probably a very expensive one.

And here are just some examples of the kind of things platform teams can get involved with to help your developers. Look at this list: observability, CI/CD pipelines, deployment strategies, cost management, security.

So if anybody says that serverless doesn't require ops, they certainly don't know what they're talking about, and you're gonna send them my way. I'll tell them what to do.

So, we said we would cover a lot in the time today. Is everybody still OK, still breathing? Good, good. Well, you probably need some time to digest this all. That was part of the plan and what we thought.

So we've got this resources link. We talked about how serverless lets you focus and concentrate on your customers; how you can build with great serverless services, connecting different services together; obviously the awesome power of Lambda that Chris was talking about; and then the whole software delivery lifecycle and how you can get things into production.

But of course, we haven't even started; we don't have another hour, and there are many more best practices and optimizations available. The linked resources page includes all the links in this presentation and a whole bunch more, so we'd suggest you have a look at that as well. Of course, you can continue your AWS learning: you can do Skill Builder courses, ramp-up guides and digital badges, just some cool ways to learn more about serverless development.

And of course, we mentioned it a few times today: serverlessland.com, your best resource for all things serverless on AWS.

So, with a deep breath from Chris and me, we really appreciate you joining us today. It's your fourth day of Reinvent and you've survived this far. Hopefully we've given you some things to think about. And if you really like deep technical content, please rate the session in the survey in the mobile app; a five-star rating lets us know that you are absolutely hungry for more. Our contact details are here, and we will be around for a bit in the foyer if you do have any questions. Enjoy the rest of your day.
