Developing serverless solutions

My name is Walid Al Khatib. I'm a technical instructor at AWS. And our session today is on developing serverless solutions.

Our session is divided into three modules, and we'll start with Module One, which is building asynchronous event-based solutions.

In this module, we'll talk about asynchronous event sources for Lambda, and we'll focus on two services that act as asynchronous event sources: Amazon EventBridge and Amazon SNS, the Simple Notification Service.

We'll also compare EventBridge and SNS.

So before we talk about how Lambda is invoked asynchronously, let's first talk about asynchronous processes in general.

As opposed to a synchronous process, where the client immediately expects a response back from the server, an asynchronous request is a non-blocking call.

So what that means is the client can continue to send additional requests without waiting for an explicit response back.

So for example, if you walk up to a cashier at a diner and you ask for an order of a pie, the cashier, acting as the client here, will take that and pass it to the kitchen manager.

The kitchen manager might respond with some type of acknowledgment indicating that your request was successfully received.

And from that point on, the kitchen manager will pass the event to the baker. The baker will bake the pie, and then the waiter will bring the pie back to your table.

But since this is asynchronous, that doesn't stop any other customer from placing additional orders for pies. And that's how Lambda is invoked asynchronously.

So when using event sources like Amazon EventBridge or Amazon SNS, those are services that invoke Lambda asynchronously.

So what this means is that when they pass the event to the Lambda service, the Lambda service will take the event and place it in its internal queue.

And if it can successfully place the event in its queue, it will return an HTTP 202 indicating that I could successfully place your event in my queue.

And then at some point, the Lambda service will invoke the Lambda function with the event in the queue.

If our Lambda function fails, we have 0 to 2 retries which are built in with asynchronous invocations.
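As a rough illustration of that flow, here's a minimal boto3 sketch of invoking a function asynchronously; the function name and payload are placeholders, and the 202 status code is what the Lambda service returns once it has queued the event.

```python
import json
import boto3

lambda_client = boto3.client("lambda")

# "Event" invocation type = asynchronous: Lambda queues the event internally
# and returns immediately instead of waiting for the function to finish.
response = lambda_client.invoke(
    FunctionName="pie-function",                    # placeholder function name
    InvocationType="Event",
    Payload=json.dumps({"snackType": "pie"}),       # placeholder payload
)

# 202 Accepted means the event was placed on Lambda's internal queue.
print(response["StatusCode"])  # expected: 202
```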

Now, because the client doesn't get an explicit response back from the server, what strategies can you use as a client to understand the status of your request?

Well, for example, if we're talking about a customer at a diner, the customer can simply walk up to the cashier and ask, is my order ready? That's analogous to client polling.

The second case would be that the cashier handed the customer some kind of device that notifies them when their order is ready, which is analogous to a webhook.

And thirdly, the waiter can simply come back to the customer's table with the order, which is similar to a WebSocket API, where we have a persistent two-way communication channel.

So those are three common strategies that we, as clients, can use to understand the status of our request.

So now let's focus on those two services. We'll start with Amazon EventBridge.

So Amazon EventBridge is a service that lets us create and manage serverless event buses. An event bus is a resource that enables publisher-subscriber communication across our applications, across AWS services, and also across third-party software-as-a-service applications.

Within EventBridge, what we do is create an event bus and then create rules, which are essentially just JSON patterns to match against incoming events.

So if I create a rule and I have an incoming event that matches that rule, I can send the event to a target, which could, for example, be a Lambda function that automates some kind of action.

EventBridge comes with a lot of different features, and we'll talk about some of them today. One example is event scheduling.

So with EventBridge, I can create a scheduled rule if I wanted to trigger my Lambda function, for example, every day at a certain time.
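As a hedged sketch of what that could look like with boto3, here's a scheduled rule that fires every day at 09:00 UTC and targets a Lambda function; the rule name, function ARN, and schedule are placeholders, and the function's resource policy still needs to allow EventBridge to invoke it.

```python
import boto3

events = boto3.client("events")

# Scheduled rule on the default event bus: fires every day at 09:00 UTC.
events.put_rule(
    Name="daily-pie-report",                 # placeholder rule name
    ScheduleExpression="cron(0 9 * * ? *)",
    State="ENABLED",
)

# Point the rule at a Lambda function target (ARN is a placeholder).
events.put_targets(
    Rule="daily-pie-report",
    Targets=[{
        "Id": "pie-report-fn",
        "Arn": "arn:aws:lambda:us-east-1:123456789012:function:pie-report",
    }],
)
```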

Now, with EventBridge, there are three different types of event buses we can have.

Firstly, we have the default event bus. This is not one that you create; it's automatically created in every region of your account. And this is where AWS services send their events by default whenever they go through a state change.

For example, when an EC2 instance is terminated, that's a state change that would generate an event and send it to the default event bus.

In addition to that, we can also create a custom event bus for our custom applications. For example, if a user on our web application clicks on a button, we might want to generate an event and send that to our custom event bus.

And then for every third-party software-as-a-service application we integrate with, we also have a separate event bus that it routes events to, and we'll talk more about that in a second.

But after we create the event bus, we then create rules to match against these events. And if a rule is matched, we can send the event to a target.

So what does a rule look like? Well, on the right-hand side, you see two example rules, and then we see a sample event. Again, rules are just JSON patterns.

And in this case, both rules are matched, because the source and detail fields match what's in the incoming event.

Each of these rules can have different targets. So maybe one triggers a Lambda function, and the other sends a notification through SNS.
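To make that concrete, here's a hedged boto3 sketch of creating one such rule on a custom bus and pointing it at a Lambda target; the bus name, pattern fields, and ARN are illustrative, not the exact values on the slide.

```python
import json
import boto3

events = boto3.client("events")

# A rule is just a JSON pattern matched against incoming events.
pattern = {
    "source": ["bakery.store"],            # illustrative source value
    "detail": {"snackType": ["pie"]},      # illustrative detail field
}

events.put_rule(
    Name="pie-rule",
    EventBusName="serverless-bus",         # custom bus created earlier
    EventPattern=json.dumps(pattern),
    State="ENABLED",
)

# Each rule can have its own targets, for example a Lambda function.
events.put_targets(
    Rule="pie-rule",
    EventBusName="serverless-bus",
    Targets=[{
        "Id": "pie-function",
        "Arn": "arn:aws:lambda:us-east-1:123456789012:function:pie-function",
    }],
)
```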

Now, for some applications, it might be useful for us to access past events, and this is something we can also do with Amazon EventBridge: we can enable an event archive.

What this does is archive every event that passes through our event bus. We don't have to archive every single one, though; we can also use filter policies to select which events we want to save.

And then, after they've been saved to this archive, we can replay them. So for example, if our Lambda function had some kind of bug and didn't do what it was intended to do, then after debugging we might replay those events from our archive.
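As a sketch of what that looks like programmatically, the EventBridge API exposes CreateArchive and StartReplay; the names, ARNs, pattern, and time window below are placeholders.

```python
from datetime import datetime, timezone
import boto3

events = boto3.client("events")

BUS_ARN = "arn:aws:events:us-east-1:123456789012:event-bus/serverless-bus"

# Archive events from the bus; an optional pattern filters what gets saved.
events.create_archive(
    ArchiveName="pie-archive",
    EventSourceArn=BUS_ARN,
    EventPattern='{"source": ["bakery.store"]}',   # illustrative filter
    RetentionDays=30,
)

# Later (for example after fixing a bug), replay a time window back to the bus.
events.start_replay(
    ReplayName="pie-replay-1",
    EventSourceArn="arn:aws:events:us-east-1:123456789012:archive/pie-archive",
    EventStartTime=datetime(2024, 1, 1, tzinfo=timezone.utc),
    EventEndTime=datetime(2024, 1, 2, tzinfo=timezone.utc),
    Destination={"Arn": BUS_ARN},
)
```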

In addition to that, EventBridge also supports a feature called schema discovery. What this does is, after an event gets sent to our event bus, save that event's schema in a schema registry.

So we might use this if we're receiving events from third-party data sources and we're not sure what the schema of those events looks like.

Once those events get sent through our event bus, their schemas are saved in that schema registry. From that point, we can actually download code bindings and use them in our IDEs.

And this allows our developers to interact with those events natively as class objects in their code. So this is another feature that really streamlines the development of event based applications.

So I'm going to jump into the console now and I'll show you how to create an event bus. We'll also create a rule and we'll set the target for that rule as a Lambda function.

We'll set the source of events as an API in Amazon API Gateway, and then we'll also test this rule from Cloud9, our cloud-based IDE.

So I'm in the console here. I'm first gonna head over to Amazon EventBridge and I'll simply search for that. And from Amazon EventBridge, I'll start by heading over to event buses.

You'll see that I already have a default event bus. Again, this is created for every region by default.

I'm going to create a custom event bus and give it a name of serverless bus. And you can see we can enable the event archive and schema discovery here.

I'm going to leave those as disabled and scroll to the bottom here and simply create this. And now you can see I have my serverless bus.

Now, what I'm going to do next is create a rule. So I'll go to my rules tab, click on create rule, and give this rule a name; I'll call it pie rule, and I'm creating this rule for my serverless bus.

So I'll select that and then click on next. I am going to receive events from API Gateway; these will be custom events from API Gateway.

So the source here is going to be other. I'm going to scroll all the way down to where it says event pattern; here's where we set the JSON pattern, and I actually have one here in VS Code.

I'm just going to copy it and paste it into that field, and then we'll talk through it. This is essentially saying that if an incoming event has its source set to bakery store, has its detail-type field set to snacks, and has a detail of snack type set to pie, then this rule will be matched.

And if this rule is matched, then I can send the event to a target, which in our case will be an AWS Lambda function. You can see all of the different AWS service targets I can choose from here, but I'm going to choose Lambda function, and there's a function in my account called pie function.

So I'll select that and click on next. I won't configure any tags; I'll just go to the last page and create this rule.

So now, if I use this drop-down and select my serverless bus on the rules page, we have this one pie rule, and it says that if any event matches this rule, then we'll send the event to this pie function.

So what we've done so far is set up the second half of the workflow, where our event bus sends those events to the Lambda function. We still need to configure the first half, which is API Gateway acting as the source.

So that's what I'll do next. I'll jump into this tab here where I have API Gateway open and I already have an HTTP API that I'll use.

So I'll use this API, which currently only has one route. I'm going to add a new route, and this will be a POST route; we'll call this route new snack. I'm going to create this, and now you will see the new snack route is exposing this POST method.

And now I'm going to attach an integration to it, and this will be an integration with, of course, Amazon EventBridge. The integration action in this case will be PutEvents, because we're putting events onto our event bus.

And then if I scroll down, here's where we can configure how that event is encoded when it hits the event bus.

So if you remember from our rule, the source has to be bakery store, the detail type has to be snacks, and the detail has to have snack type set to pie.

So when API Gateway sends this event, what I'm going to do here is hard-code the source to bakery store. I'm also going to hard-code the detail type to snacks, so every event sent through this API will have those.

Now, the logic is going to come from the detail field. So whether or not our Lambda function is invoked depends on whether or not, inside the request body, the detail has snack type set to pie. Remember: snack type set to pie. We'll take a look at a couple of example POST requests in a second to show you what I mean.

But I'm going to use the request placeholder here. So from the input request, I'm going to look in the body, and then into a field called detail; within that, I should have snack type set to pie.

And so that's where the dynamic aspect will come from. I do need to give API Gateway an IAM role so that it has permissions to put events.

And I do have my ARN, my Amazon Resource Name, here for a role that I'll use. Then, lastly, I also need to specify the name of my event bus.

We called our event bus serverless bus, so I'll enter that and create this. So now I've completed that first half, where API Gateway will send events to the event bus.

So now we can begin testing this. In order to test my API, I'm going to jump into my stages, and within my default stage, I can grab the invocation URL.

So this is what I'll use to invoke my API, and then within VS Code, I also have two sample POST requests that I'm going to use.

As you can see, the POST request on top has, within the detail field of the body, snack type set to pie, so this one should invoke my Lambda function.

Whereas the one on the bottom has snack type set to cake, so this one should not. And you can see here, I do have a placeholder to input my invocation URL.

So I'll paste that in for both of these, and now I can begin sending these requests to test them. I'll do that through Cloud9, our cloud-based, browser-based IDE, through the terminal here.
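For reference, the two requests look roughly like this if you express them in Python with the requests library; the invoke URL, route name, and body field names are placeholders standing in for the values copied from API Gateway and the demo's sample requests.

```python
import requests

# Placeholder for the HTTP API invoke URL copied from the default stage.
INVOKE_URL = "https://abc123.execute-api.us-east-1.amazonaws.com"

# Request 1: the detail has snack type "pie", so the rule should match
# and the pie function should be invoked.
requests.post(f"{INVOKE_URL}/new-snack", json={"detail": {"snackType": "pie"}})

# Request 2: snack type "cake" does not match the rule, so no invocation.
requests.post(f"{INVOKE_URL}/new-snack", json={"detail": {"snackType": "cake"}})
```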

I can send POST request number one. And I did face an error here about illegal characters. Interesting, let me see what's going on. It looks like it's the part after new snack. OK, what I'll do here is quickly grab that POST request again; I do have it somewhere else.

So back in Cloud9, I'm just going to paste it in here and grab my API Gateway invocation URL one more time. We'll see if this works now. Ah, the issue was a stray new line. The demo gods were not with me for a second there, but there we go, it worked. [applause]

All right, so that was POST request one. I'm also going to do that with POST request number two; so instead of the snack type of pie, I'm going to say cake. We'll copy that and send it. OK, and as you can see, we don't get a response back from Lambda, because this is an asynchronous invocation.

So if we wanted to know what the status of this request is, where can we go? Well, we can jump into the Lambda function itself. I'm going to jump into my pie function here, and then I can jump into its logs to check whether or not it was invoked. Clicking through to CloudWatch Logs is going to take me to the log group for this Lambda function, and here I should see if I have any log streams. I do have one log stream here, and if I click on that, you'll notice that all of these log events are part of the same request, because the request ID is the same; it ends in 15d, as we can see here.

So which request was this? This was the request where snack type was set to pie; the one set to cake, of course, did not invoke the function. Awesome, right? So that was EventBridge. Now we're going to move on to Amazon SNS, the Simple Notification Service.

So SNS is also a service that gives us a publisher-subscriber mechanism for communication, both application-to-application and application-to-person. And SNS supports many different subscription types: for example, sending messages to Lambda functions, sending messages through email or SMS, or sending messages to an HTTP or HTTPS URL. SNS also supports features like FIFO topics, first in, first out, if we did want to preserve the order of messages.

Now, within SNS, what we do is create a topic, and then we have a number of publishers and a number of subscribers. So by default, whenever a publisher sends a message to the topic, all the subscribers will receive it. However, with SNS, we also have control in terms of selectively choosing what each subscriber should receive.

In this case, you can see Subscriber A, which is a Lambda function, and Subscriber B, which is an SQS queue, both have a filter policy: we're saying that Subscriber A should only receive messages where the attribute is set to pie, and Subscriber B should only receive those where the attribute is set to donut. Subscriber C has no filter policy, so it will receive every message. With SNS, we can also have different pipelines or workflows act as subscribers to a single topic, and this way we can fan out a single message to multiple concurrent processes.

So in this case, we might have an event storage and backup pipeline up top, which is backing up every message to Amazon S3. On the bottom, we might have some kind of search or indexing pipeline, where we store every message in a table in DynamoDB so that we can later index or search through them. In either case, both of them have an SQS queue acting as the subscriber to a single SNS topic. So every message being sent by Lambda here is fanned out to both of these pipelines and processed in different ways.

With that said, now we're going to quickly show you how we can create an SNS topic, set a Lambda subscriber, and also preview how attribute-based message filtering works.

So I'm going to jump into SNS; I don't need API Gateway anymore, so I'm just going to search for it over here. Here, I'm going to create a topic, and this will be a standard topic. All I'm really going to do is give it a name; I'll call it pie topic, and I'll scroll all the way to the bottom and create it.

Now that we have our topic, we need to add a subscription, so our Lambda function is going to act as a subscriber. The subscription protocol, as you can see, supports many different protocols here; we're going to use AWS Lambda, and then I do have to specify the endpoint for my Lambda function. So I'm going to jump back into my Lambda function and grab the Amazon Resource Name, which is the unique identifier for it, and then paste it in here as the endpoint.

And now, before I create the subscription, I am going to enable a filter policy, so only messages that have certain attributes are going to invoke this Lambda function. I do have an SNS filter policy here. Essentially, what this is saying is that the attribute has to have a key called type, and its value has to be either pie or pies, with a capital or lowercase P. I'm going to paste that in, and then we're going to create this subscription, and now we can begin testing it.
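In case you'd rather script it, a minimal boto3 sketch of that subscription and filter policy looks roughly like this; the ARNs are placeholders, and the function still needs a resource-based permission allowing SNS to invoke it.

```python
import json
import boto3

sns = boto3.client("sns")

TOPIC_ARN = "arn:aws:sns:us-east-1:123456789012:pie-topic"              # placeholder
FUNCTION_ARN = "arn:aws:lambda:us-east-1:123456789012:function:pie-function"

# Subscribe the Lambda function, but only deliver messages whose "type"
# attribute matches the filter policy values.
sns.subscribe(
    TopicArn=TOPIC_ARN,
    Protocol="lambda",
    Endpoint=FUNCTION_ARN,
    Attributes={
        "FilterPolicy": json.dumps({"type": ["pie", "pies", "Pie", "Pies"]}),
    },
    ReturnSubscriptionArn=True,
)
```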

So I'm going to start sending messages through my topic: publish message, and I'll give this message a body like, for example, a new pie flavor has been added to the store. I'm also going to copy this so I can use it in my next message. And this is not really going to impact whether or not Lambda receives it; it's really the attributes, right?

So within the message attributes, I have to have an attribute named type, the value of which has to be pie or pies. So I'll use pie here. This one should invoke my Lambda function, because the attribute was there. If I publish another one with the same body, but now maybe instead of pie I'll say a new doughnut flavor, then within my attributes the type will be donut.
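Publishing those two test messages with boto3 would look roughly like this; the topic ARN is a placeholder, and only the first message satisfies the filter policy.

```python
import boto3

sns = boto3.client("sns")
TOPIC_ARN = "arn:aws:sns:us-east-1:123456789012:pie-topic"  # placeholder

# Matches the filter policy, so the Lambda subscriber is invoked.
sns.publish(
    TopicArn=TOPIC_ARN,
    Message="A new pie flavor has been added to the store",
    MessageAttributes={"type": {"DataType": "String", "StringValue": "pie"}},
)

# "donut" does not match the filter policy, so the Lambda subscriber is skipped.
sns.publish(
    TopicArn=TOPIC_ARN,
    Message="A new doughnut flavor has been added to the store",
    MessageAttributes={"type": {"DataType": "String", "StringValue": "donut"}},
)
```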

So this one should not invoke my Lambda function. Again, because this invokes Lambda asynchronously, we don't get the status of the request back. In order to check that, we can jump back into our CloudWatch logs.

So, back in our log group, we'll see we have another new log stream; again, this is just for one single request. And if I expand this, we'll see this was the request where the message attribute of type was set to pie, not the one where it was set to donut.

All right, so there are similarities between EventBridge and SNS, but there are also features that will help us decide which one works best for our use case. For example, one notable feature of EventBridge is its integration with third-party software-as-a-service applications like Shopify, Datadog, MongoDB, and so on. It also gives us a larger number of source and target choices, and more advanced routing rules: by setting those JSON patterns, we can create more sophisticated rules. It also gives us that schema registry, which helps our developers standardize how they write events. And the retention period is 24 hours: in case any events are undeliverable, it will retry them for up to one day.

Whereas with SNS, if you're looking to create an email list or send out texts, it supports those subscription types. If you're looking to implement something like a webhook or WebSocket pattern, it supports HTTP and HTTPS. It also has the option of first-in-first-out topics, and it supports very wide fan-out, so it can fan out to a large number of subscribers. It can also act as a dead-letter queue for a Lambda function. Remember how we said that with asynchronous invocations, Lambda has zero to two retries built in? Well, if those retries are exhausted, it can send the event to a dead-letter queue, which can be either an SQS queue or an SNS topic. And retry policies for server-side errors extend over multiple days.

So I hope everyone has a better idea about how Lambda is invoked asynchronously and the differences between EventBridge and SNS. We're going to jump now into Module Two, which is building poll-based event-driven solutions.

So here we'll talk about how Lambda is invoked using a poll-based event source, and we'll focus on a few services, particularly Amazon SQS, the Simple Queue Service, and also Kinesis Data Streams and DynamoDB Streams.

So how is Lambda invoked using a poll-based event source? Or, first, how do poll-based event sources work in general?

As producers are processing events from the client, they're sending those events as messages or records into a queue or stream. And then on the other side, we have our consumers, which are actively polling that data source, so that queue or that stream, and attempting to grab a batch of those records and process them as a batch.

So here, as opposed to the previous case where the event source was pushing an event, the consumer is actively polling that stream. Now, with Lambda, when using event sources like Amazon SQS or Kinesis Data Streams, that polling process is built in. So Lambda understands how to poll that queue or stream, and it will attempt to grab a batch of records from the stream.

If it can process the records successfully, it will move on to the next batch. If it cannot, so if any of the records in that batch fails, it's going to retry that batch continuously, over and over again, until it succeeds or until that batch of records expires off the stream. That's also known as the poison pill scenario, because it's going to continuously retry, retry, retry, and so it's going to block all the other batches, right?

And so we'll talk about ways we can tackle this as well. Now, just to lay out some differences between queues and streams: with queues, every individual message in our queue has data value, because every individual message could correspond to a single transaction. Whereas with streams, it's really the aggregation of multiple records that gives us actionable data.

In terms of the message rate, a queue's message rate is variable, whereas with streams we have a continuous stream of records. With regards to message processing, with a queue each message is intended for a single consumer, and after the consumer successfully processes it, it deletes the message off the queue. Whereas with streams, we can have multiple consumers processing the exact same records on the stream, and they do not delete records off the stream; records are only deleted after the retention period has been reached.

And then, with regards to retries: with a queue, if any message fails, the message will become visible again in the queue.

Whereas with streams, as we said, if a batch of records fails, it's going to continuously retry that batch over and over again until it succeeds or it expires.

So let's focus now on Amazon SQS, the Simple Queue Service. If we take this back to our diner, we could have multiple customers placing orders at the same time, and we don't want to overload our baker with too many requests.

So what we can do is take those requests and put them in some kind of queue. And now the baker can actively poll that queue whenever the baker is ready to process orders.

Now, when we talk about using SQS as an event source for Lambda, with regards to Lambda concurrency, Lambda is always going to start with five concurrent invocations. So for example, in this case we have nine batches of messages in our queue; Lambda starts with five concurrent invocations, and then it can scale up by 60 additional invocations per minute until it reaches some type of concurrency limit or until all the batches in the queue are taken care of.

Now, if we take a look at invocation A up top: if it processes that batch of messages successfully, it will delete those messages from the queue. If it doesn't, so if any message from that batch fails, the entire batch will become visible again in the queue and become available to another consumer. That relates to a parameter we set on our queue called the visibility timeout: the time in seconds that a message remains invisible after being picked up by a consumer.

So for example, after invocation A picks up that batch, it will become invisible to other invocations for a period of time known as the visibility timeout. And that's what allows us to avoid duplicates.

So what do we control when using a queue as an event source for Lambda? On the queue, firstly, we have to select the type of queue. This can be either a standard or a FIFO queue; if we want to preserve the order of messages, we'll use a FIFO queue. For the visibility timeout, there's a rule of thumb: we want to set it to at least six times the function's timeout.

And then we also have to think about the redrive policy for the queue. So are we using a dead-letter queue, in the sense that if our Lambda function attempts to process a single message over and over again and it continuously fails, at some point we want to send that message to a dead-letter queue? That's something we need to set.

A dead-letter queue can also limit the frequency of bottlenecks. And then on the function itself, we set the batch size and the function timeout. With regards to the batch size, if we have faster workloads, we'll set this to a larger amount; if our workloads are slower, we'll set it to a lower amount.
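Pulling those settings together, here's a hedged boto3 sketch of a queue with a visibility timeout, a redrive policy pointing at a dead-letter queue, and an event source mapping with a batch size; all names, ARNs, and numbers are illustrative.

```python
import json
import boto3

sqs = boto3.client("sqs")
lambda_client = boto3.client("lambda")

DLQ_ARN = "arn:aws:sqs:us-east-1:123456789012:donut-dl-queue"  # placeholder

# If the function timeout is 30s, the rule of thumb says visibility timeout >= 6x.
sqs.create_queue(
    QueueName="donut-queue",
    Attributes={
        "VisibilityTimeout": "180",
        "RedrivePolicy": json.dumps({
            "deadLetterTargetArn": DLQ_ARN,
            "maxReceiveCount": "2",   # after 2 failed receives, move to the DLQ
        }),
    },
)

# Wire the queue up as an event source for the function, 10 messages per batch.
lambda_client.create_event_source_mapping(
    EventSourceArn="arn:aws:sqs:us-east-1:123456789012:donut-queue",
    FunctionName="donut-function",
    BatchSize=10,
)
```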

And then, also with regards to partial failures: this needs to be implemented as logic in your Lambda function code. So for example, if our batch size is five and four of those messages succeeded but one message failed, the entire batch will become visible again in the queue. Sometimes we only want to make the message that failed visible again in the queue, not the entire batch. If we did want to implement that, we do have to account for partial failures, which is additional logic in our code.
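One common way to implement that partial-failure logic, assuming the event source mapping is configured with the ReportBatchItemFailures function response type, is to return the IDs of only the failed messages, roughly like this; process_order is a hypothetical stand-in for your business logic.

```python
import json

def handler(event, context):
    """Process an SQS batch and report only the failed messages back to Lambda."""
    failures = []
    for record in event["Records"]:
        try:
            body = json.loads(record["body"])
            process_order(body)  # hypothetical business logic
        except Exception:
            # Only this message becomes visible again, not the whole batch.
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}

def process_order(order):
    # Placeholder for real processing logic.
    pass
```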

And then, generally, we also want to account for idempotency. If it happens that our Lambda function did process the same message twice, we don't want the second processing to change the outcome of the first time that message was processed; for example, not updating a value in a database twice.
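One common way to make the handler idempotent, sketched here under the assumption of a hypothetical DynamoDB table keyed by message ID, is a conditional write that only succeeds the first time a message is seen.

```python
import boto3
from botocore.exceptions import ClientError

# Hypothetical tracking table with "messageId" as its partition key.
table = boto3.resource("dynamodb").Table("processed-messages")

def process_once(message_id, order):
    try:
        # The conditional write fails if we've already recorded this message ID,
        # so the side effect below runs at most once per message.
        table.put_item(
            Item={"messageId": message_id},
            ConditionExpression="attribute_not_exists(messageId)",
        )
    except ClientError as err:
        if err.response["Error"]["Code"] == "ConditionalCheckFailedException":
            return  # duplicate delivery: we already processed this message
        raise
    apply_side_effect(order)  # hypothetical, e.g. update an order total

def apply_side_effect(order):
    pass
```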

So now we're going to jump back into the console and quickly show you how we can set up an SQS queue as an event source for a Lambda function and we'll also test it out.

So here I'm going to jump into SQS, the Simple Queue Service. And you can see here I have my donut dead-letter queue. A dead-letter queue is just another SQS queue, but we'll have all of our failed messages in this queue, so we can have another process that looks over it, or maybe a manual process, let's say.

So what I'm going to do here is create a new queue, and this will be a standard queue. I'm going to give it a name; I'll call it donut queue. If you scroll down, here is where you can set all of your queue configuration parameters, things like the visibility timeout. I'm going to keep all of these as defaults.

I'm going to scroll to the bottom, where it says dead-letter queue, and I'm going to enable this and set it to my donut DL queue, my donut dead-letter queue. And here is where you set the maximum receive count. So what is the threshold? In my case, for example, two.

So if a message cannot be processed after being picked up twice, meaning it's still not deleted off the queue, SQS is automatically going to send it to our dead-letter queue. And I'm going to go ahead and create this. Now that I've created it, I'm going to add a Lambda trigger.

So if I scroll down, you'll see there's a tab for that. I'm going to configure a Lambda function trigger, and I'm going to use the donut function and save this. It does take a couple of seconds for this trigger to be ready, so we can just refresh this until we see that it's enabled. OK, there we go.

So it's enabled now. So now I can jump into my queue and send and receive messages, and we'll just send one message here; we'll say, for example, test message one, and I'll send this message.

And now I'm going to jump into my donut function to view the logs. So I'm going to jump into that donut function. And again, the code here is just going to log the event; it's going to JSON-stringify it and log it.

So I'm going to jump into my monitoring tab and view CloudWatch logs. So that now is going to take me to the log group for this Lambda function, which is the donut function. We have one log stream here and you'll see that this was the message we just sent. So the body here is test message one. Ok.

So that was queues. Now let's talk about streams; here we'll talk about both Kinesis Data Streams and DynamoDB Streams. With a stream, if you tell me something like one dozen, that on its own doesn't tell me much. But if you tell me one dozen: six jelly, three sprinkles, and so on and so forth, that's what gives me actionable data that I can work with.

So it's really the aggregation of multiple records that gives us actionable data. With a stream, Lambda maintains a pointer to the last successfully processed record, and this polling process, as we said, is built in. If Lambda can successfully process the batch, it's going to move its pointer to the next batch. If it cannot, so if any record fails, it's going to retry that batch over and over again until it succeeds or until the batch expires.

So what do we control when using a stream as an event source? Firstly, on the stream, this depends on whether we're using Kinesis Data Streams or DynamoDB Streams. If we're using Kinesis Data Streams, we have direct control over the number of shards in our stream. The number of shards controls our throughput and has a direct correlation to the number of Lambda invocations we're going to have running at the same time.

For example, if I have three shards in my stream, I'll have three Lambda invocations, one per shard. With DynamoDB Streams, it works the same way in that the number of shards is going to impact the number of Lambda invocations I have. However, we don't directly control the number of shards with DynamoDB Streams; this is going to depend on how much throughput is set for our table.

With DynamoDB, throughput is the number of RCUs and WCUs, or read and write capacity units, and this can scale automatically depending on the amount of traffic that's coming in. So if my RCUs and WCUs increase automatically, the number of shards in my stream will automatically go up, and the number of concurrent Lambda environments will also scale.

On the function, again, we control things like the batch size. We also have a parameter called concurrent batches per shard. This defaults to one, and that's why we have one invocation per shard. But if we wanted to process things a bit faster, we could increase this to something like two, let's say, in which case we'll have two invocations per shard.

And also, the batch window is something we can configure. This is the time in seconds that Lambda will wait for our batch to fill up. For example, let's say we only have eight records in a shard and our batch size is 10; at some point, you just want Lambda to pull those eight. So you can set the batch window, which is somewhat like a timeout. And then there are error handling options as well.

So we spoke about that poison pill scenario, where Lambda continuously retries the failed batch. We can deal with that in two different ways. The first strategy is called bisect batch on error. This is a little bit easier to implement, because it's simply something we have to enable on the event source. What it does is iteratively cut the batch in half until we isolate the record that failed. But this does not guarantee that we won't face duplicates.
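Here's a hedged boto3 sketch of a Kinesis event source mapping that sets the batch size, batch window, concurrent batches per shard, and the bisect-on-error behavior just described; the stream and function names and the numbers are placeholders.

```python
import boto3

lambda_client = boto3.client("lambda")

lambda_client.create_event_source_mapping(
    EventSourceArn="arn:aws:kinesis:us-east-1:123456789012:stream/orders-stream",
    FunctionName="orders-function",            # placeholder function
    StartingPosition="LATEST",
    BatchSize=100,                             # records per invocation
    MaximumBatchingWindowInSeconds=5,          # don't wait forever for a full batch
    ParallelizationFactor=2,                   # concurrent batches per shard
    BisectBatchOnFunctionError=True,           # split failing batches in half
    MaximumRetryAttempts=3,                    # stop retrying a poison batch eventually
)
```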

If we did want to guarantee that, we'd have to go for another error handling strategy called checkpointing, which requires additional logic in our code. And then, like with queues, we always have to account for idempotency; what we said about queues is also the case with streams.

So we spoke about both Kinesis Data Streams and DynamoDB Streams, and there are differences between them. For example, the polling rate, which is built in with Lambda: with Kinesis Data Streams, it polls every shard once per second; with DynamoDB Streams, it polls every shard four times per second. With Kinesis Data Streams, we have direct control over the number of shards, so we can manually scale this to increase the number of Lambda invocations, and we can also use something like an auto scaling policy to scale the number of shards, maybe based on a schedule, if we wanted to.

Whereas with DynamoDB Streams, we don't have direct control over the number of shards. As we said, it's going to scale with the amount of throughput for our table, and we can have auto scaling enabled, in which case traffic going to the table can automatically drive the number of Lambda invocations we have running.

So I hope that gave you a better idea about polling event sources for Lambda, specifically SQS, Kinesis Data Streams, and DynamoDB Streams, and some differences between them. Remember, with streams it's really the aggregation of multiple records that gives us actionable data, and SQS supports FIFO queues if we wanted to preserve message ordering. Our last module is on continuous integration, continuous delivery, and continuous deployment.

So here we'll talk about the importance of CI/CD in serverless applications. We'll talk about tools that we can use in a serverless pipeline, we'll talk about AWS SAM, which is the Serverless Application Model, and we'll also end with best practices for automation.

So every deployment pipeline is made up of several stages, and the question is which parts of this pipeline we can automate. For example, the source stage is where we implement source and version control; this is where our developers are committing updates. Once those code changes are pushed, that takes us to the build stage. Here's where we compile the code and generate a deployable artifact, something like a Docker image.

We're also running unit tests on that build. And then within the testing phase, here's where we'll carry out all of our other types of tests, so not the unit tests, but the integration tests, performance tests, load tests, and so on.

And then finally, if those succeed, we can push into production and here we'll carry out production tests and monitoring. So what do those terms mean when we talk about continuous integration or CI?

That means every code check-in initiates a build. So every time developers check in code, that's going to automate a build job, generate a deployable artifact, and also run unit tests on the build. And because we have those unit tests, our developers have instant feedback; for example, if there's a broken build, they're directly aware of that.

Now, continuous delivery extends continuous integration by delivering a production-ready build. So in addition to building it, we're also going to deploy it into a testing or staging environment, where we carry out all of our other types of tests, like integration tests. And if those succeed, now we have a production-ready build.

The question is, are we automatically deploying it to production? Well, with continuous delivery, we have a manual approval step: before we push to production, someone has to manually approve it or reject it. And that's the difference between continuous delivery and continuous deployment. With continuous deployment, the entire pipeline is automated. Of course, we have to have a lot of confidence in our pipelines and a very robust testing strategy.

Now, the benefits of CD are that it gives our developers a repeatable and consistent upgrade process. Also, with modern applications, we're typically implementing a microservices architecture, so instead of having just a single large pipeline, we have many smaller deployment pipelines, and the development teams can really focus on their individual microservices: when they want to carry out updates, which tools work best for their pipelines, and so on. Any issue with one pipeline is not going to impact any other service. And it also gives our developers and end users continual access to the latest version of our application, for example for implementing beta or user acceptance testing.

So AWS gives us a range of different CI/CD services that allow us to build out these pipelines. For example, for our source stage, we could be using something like AWS CodeCommit, which is our managed Git source control service. Our developers can push code to CodeCommit, and that can automate a build job in AWS CodeBuild, which is our continuous integration engine; it can compile the code, build a deployable artifact, and also run unit tests on the build.

If those unit tests succeed, we might move into some kind of testing stage. Now, the service that really orchestrates this workflow is AWS CodePipeline. That's the service we typically start with: we'll create a pipeline, and here's where we set the different stages and define the transitions from one stage to the next.

For example, for our deployment stage, we could be using CloudFormation, which is our infrastructure-as-code engine, to automatically deploy infrastructure. Now, in addition to CloudFormation, two tools at the bottom of the slide here, AWS SAM, the Serverless Application Model, and the AWS CDK, the Cloud Development Kit, can also be used to define our infrastructure using code, and they both operate on top of the CloudFormation engine.

So AWS SAM is great for deploying our serverless resources. It makes it easy for us to define things like our Lambda functions, tables in DynamoDB, and APIs in API Gateway. It supports standard CloudFormation syntax, so we have the ability to incorporate dynamic data through the use of parameters, which we'll talk about soon, and it also supports a guided option.

So when using the SAM CLI, we have the guided option, which walks us through every step when we deploy the stack. And then the CDK can be used for all of our other infrastructure components, including the infrastructure needed for the CI/CD pipeline itself. And the CDK is an imperative framework.

So instead of using declarative languages with SAM, like JSON or YAML, the CDK allows our developers to continue using programming languages they're already used to, like Python, Java, .NET, JavaScript, and TypeScript. And again, it works on top of the CloudFormation engine.
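As a small illustration of that imperative style, here's a hedged CDK v2 sketch in Python that defines a single Lambda function; the stack name, asset path, and handler are placeholders.

```python
from aws_cdk import App, Stack
from aws_cdk import aws_lambda as _lambda
from constructs import Construct

class PieServiceStack(Stack):
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        # Plain Python code; `cdk synth` turns this into a CloudFormation template.
        _lambda.Function(
            self, "PieFunction",
            runtime=_lambda.Runtime.PYTHON_3_11,
            handler="app.handler",                # placeholder module.function
            code=_lambda.Code.from_asset("src"),  # placeholder asset directory
        )

app = App()
PieServiceStack(app, "PieServiceStack")
app.synth()
```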

So it's going to take, for example, a Python script like the one above and synthesize it into a CloudFormation template. Here's an example of what a full production pipeline might look like. As developers push code to CodeCommit, that can automate a build job in AWS CodeBuild, which can compile and package the code, generate a deployable artifact, and also run unit tests.

If those unit tests succeed, we can go to our testing stage. Here is where we can deploy the application using a CloudFormation template. And one notable feature of CodePipeline is the ability to directly invoke a Lambda function to run a test; so after deploying to testing, we might want to run some kind of stubbed integration test.

If those succeed, we'll push into staging, where we'll deploy with CloudFormation again. Another thing we can do with CodePipeline is integrate with a lot of different third-party testing tools, for example using Runscope for API tests, and then we can also have a built-in manual approval action as well.

So before we deploy to production, someone might need to manually approve or reject this. Awesome. So the last section is on best practices for automation. The first thing we want to make sure of is that we're automating tests wherever possible. So after our developers carry out their manual code reviews, we want to automate all of our other tests.

That means our static code analysis checks through our lint and syntax tools, and our unit tests. Once we deploy to testing, here's where we'll automate our mocked or stubbed integration tests. And then when we push to staging, here's where we'll automate tests against real production services. And finally, in production, we might be implementing canaries, where we gradually deploy to a smaller subset of users before rolling out the update to everyone.

And we could also be using pre-traffic and post-traffic Lambda tests, which we'll actually talk about next. So, AWS SAM: in our AWS SAM templates, we have a couple of different fields that enable seamless deployments and also make these deployments safe.

For example, you can see here that we're using the AutoPublishAlias field, which is basically going to take our Lambda alias and automatically point it to the new Lambda function version that we're publishing, so we don't have to manually repoint our alias.

We're also using a deployment preference of linear 10 percent every 10 minutes. What this is going to do is gradually shift 10 percent of traffic to the new version of the Lambda function every 10 minutes.

AWS SAM can also monitor up to 10 CloudWatch alarms, which you can see there. So if any of these alarms are breached, or go into the alarm state, this entire deployment will stop. And then you can see at the bottom, under hooks, we're also using pre-traffic and post-traffic Lambda function hooks.

These are different Lambda functions we can run to carry out tests before and after we shift traffic. So how might this work practically? For example, here we're using linear 10 percent every 10 minutes. The first thing it's going to do is run the pre-traffic Lambda function; if that Lambda function test fails, the entire deployment will stop.

If it succeeds, we'll begin shifting 10 percent of traffic every 10 minutes, from, for example, Lambda function version one to Lambda function version two. In the meantime, SAM can monitor those CloudWatch alarms; if any of them go into the alarm state, the entire deployment will stop. If everything succeeds and 100 percent of traffic has been shifted, we'll run the post-traffic Lambda function.

And if that fails, the entire deployment will roll back; but if it succeeds, then it succeeds. Another best practice is to have separate AWS accounts for each of our environments. This not only allows us to isolate traffic, but also allows us to isolate access control, because the IAM principals, so the roles and users we create in one account, are really only scoped to that account.

That means they won't be able to interact with resources in different accounts. So that's one best practice to keep in mind. And then we can have a pipeline that spans across these accounts.

Another thing is to use a single SAM template across environments. So instead of having one SAM template per environment, we'll have a single SAM template that acts as a single source of truth. And then for any configurations that change from one environment to the next, for example the name of a database or an S3 bucket, let's say, we can use parameters.

So we can parameterize those things, which lets us make sure that we're not changing the template each time we deploy it to another environment. And then finally, if we are dealing with secrets, which are also configuration values that change from one environment to the next but are confidential, for example database connection strings or API keys, then we want to make use of an external service like AWS Systems Manager Parameter Store.

Before we talk about that, just be aware that AWS SAM does support standard parameters, but these should only be used for low-risk data, because they will be visible to other users of the account. They can also be passed to Lambda functions as environment variables. In addition, SAM provides support for API Gateway stage variables.

So these are environment variables, or name-value pairs, whose value we can change depending on which API stage we hit. So I can have two different stages for an API; maybe they're invoking different versions of my Lambda function, and I can control that using a stage variable, which can also be specified in our SAM templates.

But as we said, for any confidential config or secrets, we'll use something like AWS Systems Manager Parameter Store, which supports the use of SecureString parameters. That essentially means these parameters are encrypted at rest, and they can be referenced directly within our SAM templates and accessed directly from our code at runtime.
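Reading such a SecureString at runtime is a short boto3 call; the parameter name below is a placeholder.

```python
import boto3

ssm = boto3.client("ssm")

# WithDecryption=True returns the decrypted SecureString value.
param = ssm.get_parameter(
    Name="/pie-app/prod/db-connection-string",   # placeholder parameter name
    WithDecryption=True,
)
connection_string = param["Parameter"]["Value"]
```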

So that's the end of Module Three. I hope everyone has a better idea about CI/CD with serverless applications. We spoke about AWS SAM, the Serverless Application Model, and the CDK, the Cloud Development Kit. We spoke about different best practices and features, like using the deployment hooks in our SAM templates to run pre-traffic and post-traffic tests, and also the notion of defining a single SAM template across environments.

So that is the end of our session. I hope everyone enjoyed it. Just remember that you can visit us at the Training and Certification booth or the AWS Certification Lounge, both of which are at the Venetian. You have the chance of entering the JAM challenge to win prizes and swag. And then, in your own free time, remember that you do have access to over 100 labs on AWS Builder Labs; you can take down that link.

There are a lot of other great events happening this week, so if you did want to learn more about the times or locations of these sessions, scan that QR code. I'll keep it up here for some time in case you want to scan it. And then finally, AWS Skill Builder is our online digital platform for on-demand courses and online learning. There is a seven-day free trial for the individual subscription; you can scan the QR code there to gain access to that trial.

You'll have access to over 600 on-demand courses for free, along with a lot of other training material that you can use, like labs and gamified trainings like Cloud Quest. So that's the QR code to Skill Builder. And thank you again so much for your time today. Please remember to take the session survey; we do appreciate your feedback. Other than that, if anyone has any questions, I'm happy to talk after the session. Have a great rest of re:Invent!
