Advanced event-driven patterns with Amazon EventBridge

李白的朋友王维

Last modified 2023-12-06 18:47:25

Tags: aws, Amazon Web Services, technology, artificial intelligence, re:Invent 2023, generative AI, cloud services

First published 2023-12-05 16:29:47

Article link: https://blog.csdn.net/just2gooo/article/details/134810923

A warm welcome to the talk. How are you all? It feels like first day, first show. It's awesome to be here, and I'm honored to be talking about one of my favorite services, Amazon EventBridge.

My name is Sheen Brisals and I'm an AWS Serverless Hero. I'm also happy to share that I am currently writing a book on serverless on AWS with my colleague, Luke. It's expected early next year.

Now, AWS re:Invent is an occasion, a celebration of technology, and also an opportunity to share and learn from each other. Just out of curiosity, how many of you are new to EventBridge, or new to event-driven architecture? Be brave. Perfect.

So hopefully, by the end of this session, you will be inspired to take the message back to your teams and organizations and put an event-driven architecture using EventBridge in place. It's a packed agenda, but don't worry, I will take you through it in parts, incrementally. I will stop, recap and move forward.

We will cover the basics of event-driven architecture, then look at why and how EventBridge comes into the picture, and look at some of the best practices when you think about events, before we dive deep into some of the event-driven patterns. I will close with a few FAQs, questions people commonly ask when they come across EventBridge and event-driven architecture.

Now, before we move forward, I want you to go back not that long ago, to last year's re:Invent, when Werner mentioned these two important things: asynchrony and event-driven. Understanding these two is crucial for us to move forward in today's talk.

So let's look at the two important things, asynchrony and event-driven. To understand asynchrony, you need to understand synchronous, right? Most of you must be familiar with this simple pattern: an application asks a service, say, to get the price of a product, and the service gives back the price. You, as a customer with the app, are happy. It's a synchronous flow.

Now, let's take it forward. Here we have two services, so this is service-to-service communication. Let's say service C is order fulfillment and service D is payments. Before an order gets shipped, you need to capture (settle, or transfer the funds for) the previously authorized payments.

So typically service C, once every few hours or depending on how busy the system is, sends a batch of thousands of payment IDs to service D to do the capture. Of course, service C is not going to hang around, because the capture can take hours, days, or maybe even weeks.

So what service D does is send an acknowledgement back, maybe with a batch ID. Now we have a problem, right? How does service C know the progress of the batch of payments? And it can submit multiple batches during the day.

OK. So typically service D will have some kind of status store, and it will open up an API endpoint for service C to poll or query. Now it looks OK. But from service C's point of view, it poses challenges: how soon should it start polling, for how long, and how frequently should it keep calling?

So this is the situation with the typical command-query, CQRS kind of pattern. Let's take it one step further. What if service C submits the payments like it did, but instead of service C asking for the status, service D pushes the status to C as events?

Here we are getting into event-driven territory, right? Rather than making C wait, D sends out an event for every payment as it is captured. Service C can then react: perform the order shipment, et cetera.

Equally, it is also common for services to push messages onto queues for other services to consume. The point here is that you start to see the loose coupling, or decoupling, of these microservices. That is one of the fundamentals of asynchronous event-driven architecture.

So what is EDA? It's an architectural concept where services communicate through events, using an asynchronous invocation and implementation pattern.

Now, if you look at a simple event-driven architecture, there are four main elements. Here is a simple architecture: a service publishes events onto a bus or a broker, they are consumed by two or more consumers or target services, and those then carry on with what they're supposed to be doing.

So the four things: there is a producer or publisher, there is a consumer or target for the events, and then the two most important pieces, the broker or bus and, of course, the events. When it comes to a producer or publisher, the common concept in the industry is that the producer needs to be agnostic of its consumers. It shouldn't know about them.

Now, that's true if you have a SaaS platform or something like that. But typically, if one of your colleagues' teams is the consumer sitting behind the bus, obviously you get to know them, right? So there will be some understanding and influence. But in general, the producer should be agnostic.

The other important point is: when you send an event, don't pack the entire thing in; share just what is required. This is like a least-privilege principle for data sharing, the equivalent of the one in security. Overpacking events is a mistake teams commonly make. From the consumer's point of view, they also have responsibilities.

Consumers cannot always expect events to arrive in order, so they should be able to cope with events arriving in different orders. And most importantly, idempotency: there is no guarantee that an event is going to be delivered just once. You may receive the same event multiple times.

So you should be able to guard against those situations. Now that takes us to the next one, the event broker, or carrier, or bus. This is where EventBridge comes in. A broker supports you in ingesting events from multiple producers or publishers and then safely delivering them to several targets.
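As a minimal sketch of the idempotency guard described above: the consumer remembers the identity of every event it has processed and skips repeats. The class name, the in-memory set and the sample event are illustrative assumptions, not something shown in the talk; a real consumer would need a durable store for the processed ids.

```python
from dataclasses import dataclass, field


@dataclass
class IdempotentConsumer:
    """Processes each event id at most once (in-memory sketch only)."""

    processed_ids: set = field(default_factory=set)
    handled: list = field(default_factory=list)

    def handle(self, event: dict) -> bool:
        """Return True if the event was processed, False if it was a duplicate."""
        event_id = event["id"]
        if event_id in self.processed_ids:
            return False  # duplicate delivery: skip the side effect
        self.handled.append(event["detail"])  # stand-in for the real side effect
        self.processed_ids.add(event_id)
        return True


consumer = IdempotentConsumer()
sample = {"id": "evt-1", "detail": {"paymentId": "p-100"}}  # hypothetical event
first = consumer.handle(sample)
second = consumer.handle(sample)  # same event delivered again
```

Here the second delivery is recognized by its id and ignored, so the side effect runs only once.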

In between, brokers may also provide you with rules, transformation capabilities and other features. Now let's get to EventBridge. Typically, when someone starts new with serverless, they start with experimentation because everything is new. They need to experiment, and that's fine.

So they keep adding things. They probably start with a Lambda function and keep growing the serverless thing. But then you get confused: the moment you have an SQS queue in place, your colleague will say, hey, now you need a DLQ, a dead-letter queue, to capture all the errors.

So you add a DLQ. Once you have a DLQ, someone else will say, oh, who is going to handle the messages coming to your error queue? You need a mechanism for that. OK, so you add a Lambda function to cover it. Now you're confused. What do I do? Should I put the events back on the original queue, or should I send them to a destination, to EventBridge?

This is typically how the architecture grows, and all of a sudden you see a tangled event-driven architecture in front of you. Your architect will come along and say, hey, microservices, anyone? That's what you should be thinking of. OK, let's do microservices. You then draw boundaries, push all the services around, and you're happy.

But the problem is that you try to deploy one microservice and you find everything else gets deployed. What we have is a tightly coupled service architecture. This is where EventBridge comes in.

When you have microservices, they all produce events. Rather than always crisscrossing into each other's resources, they stay within their own boundary and share events via an EventBridge event bus. Events can even flow between buses, and the whole situation becomes harmonious: microservices are independently deployable, each one is separate, and everything is fine.

Now, what is EventBridge? If you're not familiar with it, it's just an event bus, and it makes it easy to connect applications by carrying events from publishers to targets, right?

So what makes up EventBridge? At its core, there are three parts. The first, the main part, is the event buses. If you are completely new to EventBridge, go back and check your AWS console; you will find a default event bus there. That is the AWS event bus; all AWS services push events to the default bus.

Then on top, you can create your own custom event buses, and AWS also supports partner event buses. On one side you have producers pushing messages, or events, onto the event buses; on the other side you have consumers receiving those events.

But as I usually say, the power of EventBridge is in between: the filtering and routing rules. That's where the magic happens. That's where you identify: OK, I need this event to go to these targets and perform these transformations or this logic, and things like that. So that's the core of EventBridge. On top of that, you have several other features as well.

OK, let's move on to the fourth main element of event-driven architecture: events. Every event should have a unique identity and carry just the necessary data, as I said earlier. But when someone is new to event-driven architecture, it's hard to conceptualize events.

That's why you need to spend time designing your events, because some teams make mistakes here. Typically, if you have been to any event storming session, the first thing you hear from the coordinator is: guys, think in the past tense when it comes to events. It's not always easy for us to think that way, but that's how it is, because an event says something happened in the past.

Now, if you look at the EventBridge event schema, this is pretty much it. It has a bunch of attributes, and the most important part is the detail attribute. That's where your payload, the event data, goes. You can provide any valid JSON in there. But if you spend some time thinking about structure, it makes things a lot easier than just dumping in everything or anything you like. Think in terms of a structure across your team, or teams, and organization.

So split it into metadata and data. What is metadata? In metadata, you can carry your own event ID or identity and your own version of the event schema. You can say which domain and which service the event is coming from, and what type of event it is. You can also set a TTL: how long the event should be kept by a consumer. The imagination is yours; it's your organization. So think that way, and have some structure that works for you.

The data section is basically the instance of the data. This is where the particular payment, the specific order, the insurance details, these kinds of things, go in.
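To make the metadata and data split concrete, here is a small sketch that builds one entry in the shape the EventBridge PutEvents API expects (`Source`, `DetailType`, `Detail` as a JSON string, `EventBusName`). The metadata field names, the source string and the bus name are assumptions chosen for illustration; the talk leaves the exact structure up to each organization.

```python
import json
import uuid
from datetime import datetime, timezone


def build_entry(source: str, detail_type: str, data: dict, *, domain: str,
                service: str, version: str = "1.0", bus: str = "default") -> dict:
    """Build a PutEvents entry whose detail is split into metadata and data."""
    metadata = {
        "event_id": str(uuid.uuid4()),  # our own event identity
        "version": version,             # version of our event schema
        "domain": domain,
        "service": service,
        "created_at": datetime.now(timezone.utc).isoformat(),
    }
    return {
        "Source": source,
        "DetailType": detail_type,  # past tense: something already happened
        "Detail": json.dumps({"metadata": metadata, "data": data}),
        "EventBusName": bus,
    }


entry = build_entry(
    "payments.capture",            # hypothetical source name
    "PaymentCaptured",
    {"paymentId": "p-100", "orderId": "o-42"},
    domain="payments",
    service="capture-service",
    bus="orders-bus",              # hypothetical custom bus
)
# With boto3 installed and credentials configured, the entry could be sent with:
#   boto3.client("events").put_events(Entries=[entry])
```

Keeping the split consistent across teams means a consumer can always find the schema version and origin in the same place, whatever the payload is.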

Now, there are two ways I usually classify events. I know it's debatable, but it's worth remembering, and you can tailor it as you need. When it comes to categories, domain events are the pure form of events. These are the events you share with other domains, beyond your bounded context. These are the gold events for you. And then you have operational events.

These are the events that say, oh, this third-party system is up or down, or the service has raised an alert: all these sorts of things, internal to your boundary, so that you can take action. And then, of course, there are local internal events. They just flow within; they never go outside of your boundary.

And then transformed events are more anonymous events. A source event comes into EventBridge, and as part of the event rule you apply some transformation. Rather than sending everything from the event, you send just the data, because you know the metadata is not necessary for this particular target. That's a transformed event, in my way of saying it. And then, of course, you have the AWS events.

Except for the AWS events, we usually call all other events custom events, because those are our events; we create them and push them mostly to custom event buses. On the types: again, it depends how you classify. You can mark an event as a data event, or as a query, where you get back an answer or a response. Again, there's no hard and fast rule here. It's up to you how it helps your teams to have the structure in place.

OK. Now, a recap. We looked at asynchrony and event-driven, and at the elements of an event-driven architecture. We took a look at EventBridge, what it provides, and how to structure events. All good? OK, let's move on.

So let's change gear and look at event-driven patterns. When it comes to patterns, I like this quote a lot. If you're familiar with the old Gang of Four design patterns, or Gregor and Bobby's patterns, or microservice patterns, there are so many pattern books out there. But the thing is, in modern architecture, when you work with serverless and the cloud, there are many patterns hidden within your architecture. Take, for example, Amazon API Gateway, which most of us use all the time. Do you know that API Gateway itself is a pattern? Similarly, the dead-letter queue, the DLQ, is actually a pattern, but we have it as a service or product that we use all the time, so we don't see it as a pattern; we see only the outer architectural patterns. That's why I like this quote, and we need to keep it in mind when we work with patterns in serverless.

Let's start with one of the basic, simple patterns: choreography. It has a twin; the other side of it is orchestration. Though they are twins, they are completely dissimilar and do different things. But when choreography and orchestration come together, as we will see later on, you can do wonders with serverless. The name comes from choreographing a dance. When you choreograph, you have one dancer or a group of dancers, and they know what to do when the tune or music plays. You don't need to instruct each one; they perform the steps that need to be done.

Now, let's take an example. You bought a piece of kitchen equipment and you want to register it. If you register, you get loyalty points, and they also give you a discount for your next purchase. They also need to inform customer service, for issues, and inform the manufacturer. It looks simple and sequential. But not everything needs to be this way. There are dependencies: you need to register a product in order to generate loyalty points, and you need to have a discount code before somebody can email it. At the same time, there are things that can happen in parallel, like loyalty points computation and discount code generation, and some of the other things. So how do we take this into a choreographed event-driven architecture with EventBridge? Let's see.

We have an app that interacts with the product registration service, and the product registration service emits an event to say that a product was registered. Suddenly, different services come alive: oh, I want that, I want to do something with that event. That's how event-driven architecture happens. The EventBridge rules send these events to the different target services. Then, once the promotion is computed, a discount-code-generated event comes out, and somebody else is interested in it, to email the code to customers. And of course, there is the manufacturing system, a third-party application that also needs to be sent certain details; that can be done with an EventBridge rule as well. What we can use here is an API destination, which I will talk about later on. So this is a simple choreographed event-driven architecture with Amazon EventBridge.

OK, so things to remember. This is coordination; there is no central control instructing each one what to do. It decouples the services. And tomorrow, if you want to bring up a new microservice, it's easy to spin one up and make it part of the whole ecosystem. Then there is idempotency, as I mentioned earlier on. And for observability, adding tracing attributes or values and carrying them all the way through is crucial.
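The choreography in the product registration example can be sketched with a tiny in-memory bus standing in for EventBridge. Everything here (the `InMemoryBus` class, the event names, the handlers) is a hypothetical illustration of the flow, not EventBridge's API: each service subscribes to the events it cares about and nobody instructs anybody.

```python
from collections import defaultdict
from typing import Callable


class InMemoryBus:
    """Toy stand-in for an event bus: routes events by detail type."""

    def __init__(self) -> None:
        self._subscribers = defaultdict(list)
        self.log = []  # detail types in the order they were published

    def subscribe(self, detail_type: str, handler: Callable) -> None:
        self._subscribers[detail_type].append(handler)

    def publish(self, detail_type: str, detail: dict) -> None:
        self.log.append(detail_type)
        for handler in list(self._subscribers[detail_type]):
            handler(detail)


bus = InMemoryBus()
emails = []


def compute_loyalty(detail: dict) -> None:
    bus.publish("LoyaltyPointsComputed", {"productId": detail["productId"], "points": 50})


def generate_discount(detail: dict) -> None:
    bus.publish("DiscountCodeGenerated", {"productId": detail["productId"], "code": "SAVE10"})


def email_customer(detail: dict) -> None:
    emails.append(detail["code"])  # stand-in for sending the email


# Each service reacts only to the events it is interested in.
bus.subscribe("ProductRegistered", compute_loyalty)
bus.subscribe("ProductRegistered", generate_discount)
bus.subscribe("DiscountCodeGenerated", email_customer)

bus.publish("ProductRegistered", {"productId": "kettle-1"})
```

Adding a new service tomorrow is one more `subscribe` call; the product registration service does not change.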

OK. Now, you go to a developer conference and they will say code is a liability. You sit there and wonder: I don't understand this. I know financial liability, but what does "code is a liability" mean? You go to the next session and they say the code you write today is legacy tomorrow. You think, OK, I'm done, my job is over, because I can't program anything. You go to the final session and someone like Werner will say all the code you ever write is business logic. You think, that's it, business people and stakeholders are taking over; my job is done. Not necessarily. What this implies is: don't do unnecessary programming or implementation.

OK, why? If you reduce the number of functions you write, there are a bunch of benefits. Less code means fewer security worries, fewer debugging hassles and all sorts of things, and maybe a reduction in your cost. This is where the next pattern comes in: the functionless integration pattern.

It is also known as low-code or codeless, those sorts of terminologies, but the concept is the same: you don't write custom functions when there is no need. So what is functionless? For that, you need to understand what is function-full. You must have come across this architecture pattern hundreds of times.

Now, if you look at the Lambda function, ask what its role is. Is it doing any business logic, or is it just shifting the data from the API payload onto the queue? If it is just shifting data, that's not its strength. Lambda functions are compute; you use them when you have logic to implement. So this is an area where you need to think about whether you can be functionless. Rather than writing a function, you think: can I use native integrations to achieve the same thing? This is the starting point of your functionless thinking.

Now, if you think of API Gateway, it supports over 40 integrations with AWS services. And also Step Functions: it provides so many opportunities, especially with the SDK integrations, to interact directly with services without needing to write Lambda functions. And at last year's re:Invent, EventBridge Pipes was announced. This is a sort of one-to-one pipeline where you can do transformation and connect with target services, in most cases without having to write a Lambda function.

OK, now let's take an example: somebody registering an account with an online system. They add personal data and give their payment details. These days, you also need to capture their consent: whether they can be contacted via email, whether they can get promotional material, et cetera. For most businesses, this is not their strength, so they usually use a third-party application: they send the data, and only when necessary do they query and get the details back. Typically you will have some kind of API that you invoke just to send the data; you don't expect anything else back.

Now, this is an asynchronous invocation, because it doesn't need to happen when the customer registers; it can happen behind the scenes. The API is supplied by the third party, it has quotas, and it can use different authentication mechanisms, OAuth or others. So when you think of implementing this as an architecture, you have a simple API Gateway that returns an acknowledgement and pushes the details onto a queue. Then you may have Pipes, and for the logic you have, let's say, a Step Functions workflow with different steps for different things. And there is a Lambda function dedicated to contacting the third-party application. It all looks fine. There's nothing wrong here.

Now, what happens if one day the system is down, or there is a flaky network connection? To take care of this, your architecture now needs dead-letter queues in place. And when you have queues, you need to know how to handle them. So you add more complexity to your architecture.

This is where one of the native integrations of EventBridge comes in: API destinations. You send an event to EventBridge, EventBridge has a rule with a target, and the target here is an API destination that hits the endpoint on the third-party system.

Now, what is an API destination? Before I go there, have a look at this Lambda function in terms of the functionless concept we just discussed. If it's just doing transformation, can you perform what this Lambda function does with Step Functions intrinsic functions and things like that? Then you may be able to get rid of the Lambda function altogether and just have your step emit an event onto the bus. This again is refining your architecture, making sure that you use the optimal approaches or patterns as you build your serverless applications.

Now, what is an API destination? An API destination is basically an HTTP endpoint that is a target for your EventBridge rule, and you can use it as a functionless way of invoking your external targets. Typically, the customer-registered event comes to the event bus, and you have a filter to recognize it: OK, this is the customer-registered event, so I need to do something. You may do some event transformation, and then hand it to the API destination.

An API destination has two parts. You need a connection; this is where all the authentication and credentials are handled. And then you have your endpoint with all your headers and so on. That gets the event to the final target, which is your external application or a different service in a different domain.

If I show you the structure of an API destination, it has a connection and the target HTTP endpoint. For the connection, EventBridge supports these three forms of authorization. I'm not sure if anyone is still using basic, but it's there; then there is the API key mechanism; and of course OAuth, where you supply your client credentials and EventBridge deals with the token behind the scenes. For the endpoint, you can hit any custom endpoint, and also EventBridge partner endpoints. This is so cool: if one of the partners you work with is part of the EventBridge API destination partner scheme, it becomes simple and easy.

OK, so that's the API destination. One of the important points with API destinations is credentials. EventBridge keeps the credentials in Secrets Manager. When you hear Secrets Manager, you may think, oh, that's going to be costly, because every secret is about 40 cents per month, and then there are API invocation costs, et cetera. The good news is that irrespective of how many times the invocation happens, millions or billions of times, EventBridge absorbs the cost. We don't pay the Secrets Manager cost ourselves; it's all covered by EventBridge. You can also add rate limiting, and it supports retries.

Another thing to be mindful of is that the timeout is five seconds. If the target takes longer, the request is dropped and goes into retry mode. I wrote a blog post a while ago that goes into the details of how we can do the whole thing.
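The two parts of an API destination, the connection and the endpoint, can be sketched as the keyword arguments one would pass to the boto3 EventBridge client's `create_connection` and `create_api_destination` calls. The helper names, resource names, ARN and URL below are hypothetical placeholders, and the sketch only builds the dictionaries so that it runs without AWS credentials.

```python
def connection_params(name: str, api_key_name: str, api_key_value: str) -> dict:
    """Arguments for events.create_connection using API key authorization."""
    return {
        "Name": name,
        "AuthorizationType": "API_KEY",  # other options: BASIC, OAUTH_CLIENT_CREDENTIALS
        "AuthParameters": {
            "ApiKeyAuthParameters": {
                "ApiKeyName": api_key_name,    # header name the third party expects
                "ApiKeyValue": api_key_value,  # kept in Secrets Manager by EventBridge
            }
        },
    }


def api_destination_params(name: str, connection_arn: str, endpoint: str,
                           rate_limit_per_second: int = 10) -> dict:
    """Arguments for events.create_api_destination."""
    return {
        "Name": name,
        "ConnectionArn": connection_arn,
        "InvocationEndpoint": endpoint,
        "HttpMethod": "POST",
        "InvocationRateLimitPerSecond": rate_limit_per_second,  # protects the target's quota
    }


conn = connection_params("consent-connection", "x-api-key", "example-secret")
dest = api_destination_params(
    "consent-destination",
    "arn:aws:events:eu-west-1:111122223333:connection/consent-connection/example",  # placeholder
    "https://consent.example.com/v1/consents",  # placeholder third-party endpoint
    rate_limit_per_second=5,
)
# With boto3 available:
#   events = boto3.client("events")
#   events.create_connection(**conn)
#   events.create_api_destination(**dest)
```

The rule's target then points at the API destination, so no Lambda function sits between the bus and the third-party API.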

So let's move on. When you think of a reliable application, you think of resiliency, high availability, all sorts of things. These are crucial when it comes to serverless and event-driven architecture, especially with distributed microservices. There are different ways to achieve them. And why do we need all these things? Because, as I just showed you in the previous case, the network connection can be flaky, the system could be down for maintenance or whatever, and the traffic pattern could vary, completely surprise everyone and take the whole thing down. These are all eventualities you come across in a production environment almost every day.

That's why we need to build these capabilities into our architecture when we design these solutions. One very common, popular pattern we can use is the circuit breaker pattern. What I'm going to do here is explain the circuit breaker pattern, and I will also add how you can take care of failures with a retry mechanism.

If you are new to circuit breakers, it's simply the concept from electrical circuits. A circuit is closed when everything goes through fine. Here, the green arrow means it's closed. Everything is happy: a 200 OK goes all the way through to your client. It's a synchronous invocation. Happy.

In an open circuit situation, it's the same pattern, but there is a problem reaching your target: it's down, or something happened. So your circuit is now marked as open. That means you're not going to reach the target; instead, you return a failure or error response to your client or consumer.

These are the two most common states. There is also half-open, et cetera, but let's leave that for the time being to keep things simple. OK?

Now, the circuit breaker: I usually call this the manager, because in our implementations the manager's responsibility is to know when to declare a circuit open and when to close it. Applications rely on this status before deciding what to do with the request they have in hand.

A simple thing we can do is store the status of the target system's endpoint, and then react based on it. For example, before you invoke the third party, you check the status. If the circuit is closed, good to go: OK, I'll call it, and that goes through. Then another request comes along and you check; the status is fine, but when you try to hit the target, you see that it's not going through. For some reason, the circuit is open.

So you update the status to say: hang on, there's some problem. You have some logic to identify when to declare an open circuit.

That means the error response goes back and your system takes care of things. You're not waiting for the target to come back alive, adding latency to the request, and things like that.

When your application or logic decides whether to declare the circuit closed or open, it is based on certain things. You probably won't declare a closed circuit as soon as one request goes through; you may want to try five or 10 requests within a short period of time.

This is part of your circuit manager logic: the threshold that you usually set. So this is the simple way of doing it.
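The threshold logic of the circuit manager can be sketched as a small state holder. The class name, the default thresholds and the decision to track only consecutive outcomes are assumptions for illustration; the half-open state is left out, as in the talk, and the time window is ignored to keep the sketch short.

```python
from dataclasses import dataclass


@dataclass
class CircuitManager:
    """Decides when to declare the circuit OPEN or CLOSED from call outcomes."""

    failure_threshold: int = 3  # consecutive failures before opening
    success_threshold: int = 5  # consecutive successes before closing again
    state: str = "CLOSED"
    _failures: int = 0
    _successes: int = 0

    def record(self, succeeded: bool) -> str:
        """Record one call (or heartbeat) outcome and return the resulting state."""
        if succeeded:
            self._failures = 0
            self._successes += 1
            if self.state == "OPEN" and self._successes >= self.success_threshold:
                self.state = "CLOSED"
        else:
            self._successes = 0
            self._failures += 1
            if self.state == "CLOSED" and self._failures >= self.failure_threshold:
                self.state = "OPEN"
        return self.state


manager = CircuitManager(failure_threshold=2, success_threshold=3)
# Two failures open the circuit; it closes only after three successes in a row.
states = [manager.record(ok) for ok in (False, False, True, True, True)]
```

A single success after an outage does not close the circuit here, which mirrors the point about waiting for several requests to go through first.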

If you take it one step further, you can even build this as a service. For example, say you have a critical third-party or external application that many systems rely on. What you can do is set up a dedicated status checker.

You have a simple scheduler that comes alive once in a while, maybe once every minute or two, performs a heartbeat check of the external application, and sets the status in a DynamoDB table, for example. Every time that happens, the table emits an event, a stream event, and there is a handler capturing these as they come along. This is basically the manager, because it receives the status updates as they come.

It can then decide when to make the status changes, as I mentioned earlier on. The benefit is that the same manager can update other places. For example, it can also keep the status in an SSM Parameter Store parameter. If it's within your boundary, your service can check the SSM parameter to see the status. If it's another microservice or another application, you can emit an event, the operational event I mentioned earlier on.

That can be consumed by different applications to get the status details. And of course, you can also add an API Gateway endpoint; it can simply query the relevant field from the DynamoDB table and report the status back.

So this is evolving a simple circuit breaker implementation. Of course, there are different ways to do it, but this is something worth experimenting with if you have these sorts of use cases.

As I mentioned, there are different ways for different consuming parties. Within your own account, within your own bounded context, you can use certain things. If it's external, you can go via events or APIs. That keeps everything simple and clean.

Now, one of the common things you do when you have a circuit breaker is fail fast. That means if your circuit is open, you can't go through, so you immediately fail back. You are not holding on to resources or adding latency. For that, you do a simple check of the service status and decide what to do.

The request just goes back as an error. It's the simplest and most common case you will find, and it's fine in most cases. But the problem is, if you've got a customer's order in your hands, you can't simply fail back and say, sorry, I can't do anything. You have to keep the data and resubmit it to your downstream application, so that customers can get their orders delivered, et cetera.

This is where we need to think of not just failing fast; you also need to have a replay mechanism in place.

In this case, when there is a failure, when you can't get to the other end, you write the failed request somewhere so that when the circuit is back, you can replay it. There are three common ways. The two you will most often find implemented are an SQS queue and a DynamoDB table, depending on your query requirements, access patterns and things like that. The third option you can now use is EventBridge archive and replay.
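Failing fast while keeping the request for later replay can be sketched as follows. The `FailFastClient` class, the `deque` standing in for the SQS queue, DynamoDB table or archive, and the fake `send` function are all illustrative assumptions; for brevity the circuit here opens on the first connection failure rather than after a threshold.

```python
from collections import deque


class FailFastClient:
    """Fails fast when the circuit is open, but keeps requests for replay."""

    def __init__(self, send) -> None:
        self.send = send          # callable that delivers a request downstream
        self.circuit_open = False
        self.pending = deque()    # stand-in for a queue, table or archive

    def submit(self, request: dict) -> str:
        if self.circuit_open:
            self.pending.append(request)  # no call attempted: fail fast
            return "QUEUED"
        try:
            self.send(request)
            return "SENT"
        except ConnectionError:
            self.circuit_open = True
            self.pending.append(request)  # keep the data for resubmission
            return "QUEUED"

    def replay(self) -> int:
        """Call once the target is back: resend stored requests in order."""
        self.circuit_open = False
        replayed = 0
        while self.pending:
            self.send(self.pending[0])  # only drop it after a successful send
            self.pending.popleft()
            replayed += 1
        return replayed


target = {"up": False}  # simulated state of the downstream system
delivered = []


def send(request: dict) -> None:
    if not target["up"]:
        raise ConnectionError("target unavailable")
    delivered.append(request["orderId"])


client = FailFastClient(send)
r1 = client.submit({"orderId": "o-1"})  # call fails, circuit opens, request kept
r2 = client.submit({"orderId": "o-2"})  # circuit open: kept without calling
target["up"] = True
replayed = client.replay()
```

The customer's order is never lost: it is either sent or stored until the circuit closes again.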

OK. Right. Stay with me. It's going to get a little bit trickier if you are new to archive and replay.

Now, what is EventBridge archive and replay? It's simple. Like how you set a filter, you identify a pattern for events and say: these events I want to keep in an EventBridge archive. They go to this archive, and you decide how long to keep them there. Then you can replay events from the archive for a time frame, from this time to this time: OK, come on, replay the events from this particular archive.

So it's a simple mechanism, but really useful, and it helps in several situations. Now, you know the simple thing: you do a status check and then decide what you need to do. Sometimes it's useful, especially when you work with events, to have the status of certain things reflected in your event.

So for example, a successful submission or invocation is a 200 OK. Fine, nobody is going to question that. But if you get an error and it is due to a data issue or a validation issue, you can classify it as a hard error. However many times you resubmit, it is not going to go through, because it's your problem: you have a problem with the data. And the third type is a retriable status.

This is where your third party is down, or you get a 500 service error, et cetera. You need to collect those events and retry. So it's useful if you can have this sort of classification in your events. It helps with the archive and replay mechanism.
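
The three-way classification above could be sketched as follows. The status labels, the `metadata` and `data` field names and the exact status-code ranges are assumptions made for illustration; the talk only names the three categories.

```python
def classify_status(http_status: int) -> str:
    """Map a downstream HTTP status code to one of three event statuses."""
    if 200 <= http_status < 300:
        return "SUCCESS"      # went through, nothing more to do
    if http_status >= 500:
        return "RETRIABLE"    # downstream problem, worth replaying later
    return "HARD_ERROR"       # data or validation issue, resubmitting won't help


def build_event_detail(order_id: str, http_status: int) -> dict:
    """Build an event detail that carries the status as metadata."""
    return {
        "metadata": {"status": classify_status(http_status)},
        "data": {"order_id": order_id},
    }
```

Keeping the status in a metadata section, separate from the business data, means a filter pattern can match on it without knowing anything about the payload.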

So this is the simple metadata I mentioned. I show it here with the status, and based on the status you can now set a rule in EventBridge to capture those retriable-status events and push them into an archive.

So this is a typical filter rule you may have. Of course your situations will vary, but this is the basic thing.

Now, archive creation is simple. All you need is a bit of CloudFormation, or other ways of doing it, with a filter pattern, and that's how it is set up. Now, replaying events from the archive.
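
As an alternative to CloudFormation, an archive with a filter pattern can also be created through the API. The sketch below builds the request for the boto3 `create_archive` call; the source name, the metadata layout, the archive name and the retention period are hypothetical values, not taken from the talk.

```python
import json

# Hypothetical pattern: keep only events whose metadata marks them as retriable.
RETRIABLE_PATTERN = {
    "source": ["orders.service"],                       # assumed source name
    "detail": {"metadata": {"status": ["RETRIABLE"]}},  # assumed metadata layout
}


def build_archive_request(bus_arn: str, archive_name: str, retention_days: int) -> dict:
    """Build keyword arguments for the EventBridge create_archive call."""
    return {
        "ArchiveName": archive_name,
        "EventSourceArn": bus_arn,  # the event bus the archive listens to
        "Description": "Retriable events kept for replay",
        "EventPattern": json.dumps(RETRIABLE_PATTERN),
        "RetentionDays": retention_days,
    }


def create_archive(request: dict) -> dict:
    """Send the request to AWS. Needs boto3 and credentials at call time."""
    import boto3  # imported lazily so the builder works without the SDK

    return boto3.client("events").create_archive(**request)
```

One archive per purpose keeps the pattern narrow, which also makes the later replay window easier to reason about.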

So you have events pooled here, and your system is now back up, so you have some kind of events flowing through. You can have logic implemented, say a Lambda handler, that knows: OK, this is up now, I need to replay from this particular archive. This is all part of how you set it up. You'll say: OK, it went down at this point, and it is back up at this point.

So I should replay the events from this point to this point. Then EventBridge will replay those events, pushing them back onto the same bus, and you can have your handler deal with them.
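
A replay over the outage window can be started with the boto3 `start_replay` call. The sketch below only builds and validates the request; the ARNs and the replay name are supplied by the caller and are placeholders here.

```python
from datetime import datetime, timezone


def build_replay_request(
    archive_arn: str,
    bus_arn: str,
    replay_name: str,
    down_at: datetime,
    up_at: datetime,
) -> dict:
    """Build keyword arguments for the EventBridge start_replay call."""
    if up_at <= down_at:
        raise ValueError("the replay window must end after it starts")
    return {
        "ReplayName": replay_name,
        "EventSourceArn": archive_arn,   # the archive to read from
        "EventStartTime": down_at,       # when the downstream went down
        "EventEndTime": up_at,           # when it came back up
        "Destination": {"Arn": bus_arn}, # the bus the archive was created on
    }


def start_replay(request: dict) -> dict:
    """Send the request to AWS. Needs boto3 and credentials at call time."""
    import boto3  # imported lazily so the builder works without the SDK

    return boto3.client("events").start_replay(**request)
```

A handler that notices the service is back up could call `build_replay_request` with the recorded down and up timestamps and then `start_replay`.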

One thing you'll notice I highlighted here is the replay name. You might be thinking that wasn't part of your event structure. AWS adds this attribute to the event when it replays it, for a reason: so that you can differentiate the replayed event from the original event and avoid events going around in a cycle. This is important, and you can set the replay name so that you can set the filter patterns accordingly.
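
Replayed events carry a top-level `replay-name` field, so rules and handlers can tell them apart from originals. A minimal sketch, assuming a hypothetical `orders.service` source:

```python
import json

# Rule pattern for the normal flow: match only events that were never replayed.
ORIGINAL_ONLY = {
    "source": ["orders.service"],          # assumed source name
    "replay-name": [{"exists": False}],
}

# Rule pattern for the retry flow: match only events coming out of a replay.
REPLAYED_ONLY = {
    "source": ["orders.service"],          # assumed source name
    "replay-name": [{"exists": True}],
}


def is_replayed(event: dict) -> bool:
    """Return True when the event was delivered by an archive replay."""
    return "replay-name" in event


def pattern_json(pattern: dict) -> str:
    """Serialise a pattern into the string form that rule APIs expect."""
    return json.dumps(pattern)
```

Routing replayed events through their own rule lets the retry handler process them without the original rules picking them up a second time.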

So, key points. There's no ordering guarantee with EventBridge as of now. And speed: there is no way to control the speed when you replay; it just pushes everything, so you need to have a buffering or queuing mechanism downstream to cater for that.

Now, I usually recommend granular archives. Don't go for one global archive and dump everything in it; keep one archive for each specific purpose. That helps a lot with managing the archives.

Another crucial point: there is a delay between the last event being pushed into the archive and when you can replay it. There is typically a gap of five minutes or more before the latest event that went into the archive can be replayed. So keep that in mind. If your use case is fine with that, then this is the best option.

And again, I have a blog detailing all these things. OK, let's move on to one of the other prominent patterns: orchestration. This is the twin of the choreography I mentioned.

Now, with orchestration, the name comes from the orchestra: there is always a controller, right, instructing which part of the orchestra does what.

So it's a similar concept here. Usually when we say orchestration, AWS Step Functions immediately comes into the picture, because that's the state machine orchestration.

Now, I usually talk about three types of orchestration. It's all about keeping things simple and clean: in-service orchestration, cross-service and distributed. Distributed and cross-service are kind of similar; I'll show you the way I differentiate them.

In-service is simple. Let's say you have a domain, it has a microservice, and that has a step function. Let's keep it at a domain level and not go into the bounded context; it's fine. An API invokes the step function and it does some logic. As you see here, everything is within that microservice; there is no arrow going outside the boxes. So this is in-service. This is the perfect and simplest form: there is no dependency, there is no hard wiring, et cetera.

Cross-service slightly broadens the orchestration. Here it's a similar service, but it needs to reach out to other microservices within the domain, or even in other domains. It reaches out mostly via API calls. Typically you would have a Lambda function to do that. But recently, in the last day or so, there's a new feature announcement: from Step Functions you can now directly hit HTTP targets. So that again makes functionless life simple and easy, right?

So this is cross-service. Let's move on to the next one, distributed orchestration. Let me bring in three domains to explain the concept. Stick with me; this can probably confuse some of you, but follow me. Let's say there are three domains, and in each domain you have three microservices.

So, for example, domain A is the controller, or the primary orchestrator. Think in terms of, say, insurance claim processing, because there you have legal parties coming in, or car dealers, or manufacturers, and so on and so forth.

So it can be a complicated process. Now say task A2 here requires something to be done by service B, and this could take hours to complete. And on the other side you have a different task in your primary orchestrator that reaches out to a different service, and that may take weeks to complete.

So the main orchestrator now needs to wait until it gets answers from these different services. I'm just showing them as orchestration workflows, but they could be other implementations as well.

Now, how can we do that? This is where orchestration and choreography come together. The way to do this is that A2 pushes an event with what we know as task tokens, the task tokens of Step Functions.

So a token within an event goes to the service. Similarly, for the other service on a different parallel arm, a different event goes out with a different token. And these tasks wait there until the respective tokens come back to the step function. I'm not showing the event bus or EventBridge here, just for clarity.

So the event comes back and the task moves forward. Until that point, it will just stay there. I will expand this in a little more detail for you to understand.

So the task emits an event with a token. It goes to EventBridge, and there's a consumer, microservice C, which consumes it and knows: OK, I have an event, I have a token, I need to keep the token and send it back in my response. That response goes back to EventBridge, and then the originating microservice consumes that event with the token.

It has a handler to process that event, and the handler submits the token to the step function, and the step function carries on. So this is roughly how the task token mechanism works, and it is so powerful. If you haven't tried it, I would recommend that you try it out. It's so cool. It's only supported with Standard workflows in Step Functions at the moment.

Now, how do you add an event with a token to the bus? It's simple. As part of your task there is something called wait for task token. That is the step that emits the event and stays there until the token is back. You can inject the token by referencing the task token; that's the Step Functions construct. You can attach it to any of your attributes. It doesn't need to be a task token attribute; you can use any attribute name, so long as your consumers know: OK, this attribute carries a token that I should honor and return.
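
The wait-for-task-token step described above can be sketched as a Step Functions task state that puts an event on a bus and pauses. The state is built here as a Python dictionary that serialises to Amazon States Language; the bus name, source, detail type, the `callback_token` attribute name and the `claim_id` input field are all hypothetical.

```python
import json


def build_wait_state(bus_name: str, next_state: str, timeout_seconds: int) -> dict:
    """Build a task state that emits an event and waits for its token to return."""
    return {
        "Type": "Task",
        # EventBridge integration with the callback (wait for task token) pattern.
        "Resource": "arn:aws:states:::events:putEvents.waitForTaskToken",
        "Parameters": {
            "Entries": [
                {
                    "EventBusName": bus_name,
                    "Source": "domain-a.orchestrator",  # assumed source
                    "DetailType": "TaskRequested",      # assumed detail type
                    "Detail": {
                        # The attribute name is free; consumers only need to
                        # know that it carries a token they must send back.
                        "callback_token.$": "$$.Task.Token",
                        "claim_id.$": "$.claim_id",     # assumed input field
                    },
                }
            ]
        },
        "TimeoutSeconds": timeout_seconds,
        "Next": next_state,
    }


if __name__ == "__main__":
    print(json.dumps(build_wait_state("gatekeeper-bus", "ProcessResult", 72000), indent=2))
```

`$$.Task.Token` reads the token from the state's context object, and the `.$` suffix tells Step Functions to resolve the value as a path rather than a literal string.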

So, key points in terms of distributed orchestration. You can use multiple task tokens; they are all unique. You can use SQS or SNS or whatever. For example, I showed step functions on the other services, but that could be anything. And understanding the timeout and the heartbeat is important.

What happens is, when you emit the token, you set a timeout: I want to wait for, say, 20 hours. If the token hasn't come back after 20 hours, the task will time out. This is where the retry status messages I mentioned earlier help, because then the token handler can understand them and extend the heartbeat.

So it won't time out; it will wait longer, allowing time for the other service to complete and the token to come back before it carries on. So you can report success or failure or, as I mentioned, extend the timeout as well.
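
A token handler along these lines could decide between success, failure and heartbeat based on the status carried in the returned event. The event layout (`callback_token`, `metadata.status`, `data`, `reason`) is assumed for illustration; the three client methods are the boto3 Step Functions calls `send_task_success`, `send_task_failure` and `send_task_heartbeat`.

```python
import json


def handle_response_event(event: dict, sfn_client, token_attribute: str = "callback_token") -> str:
    """Return the token to Step Functions according to the event status.

    sfn_client is expected to behave like boto3.client("stepfunctions").
    """
    detail = event["detail"]
    token = detail[token_attribute]
    status = detail["metadata"]["status"]  # assumed metadata layout

    if status == "SUCCESS":
        # The waiting task resumes with this output.
        sfn_client.send_task_success(
            taskToken=token, output=json.dumps(detail.get("data", {}))
        )
        return "success"

    if status == "RETRIABLE":
        # The other service needs more time: keep the waiting task alive.
        sfn_client.send_task_heartbeat(taskToken=token)
        return "heartbeat"

    # Anything else is treated as a hard error and fails the waiting task.
    sfn_client.send_task_failure(
        taskToken=token, error="HardError", cause=detail.get("reason", "unspecified")
    )
    return "failure"
```

A heartbeat resets the task's heartbeat clock (`HeartbeatSeconds`); as far as I know it does not change the overall `TimeoutSeconds`, so that value still needs to cover the longest wait you expect.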

So again, there is a blog that you can follow if you are interested. And the final one: bounded context. We all know domain-driven design and bounded contexts, right? Now, how many of us respect boundaries with events as we do with APIs? With APIs, we have all sorts of things: payload contracts, security, et cetera. But when it comes to events, we are so relaxed; we just send the events all around.

So there is a pattern we can use, which I call the gatekeeper pattern. It is a way of guarding your domain or bounded context boundary. To explain: say you have a payment processing system which has a bunch of microservices, and it has an internal bus that all the events flow into. But at some point, it needs to send certain details and invocations to external targets or applications.

The finance domain needs domain events, for example. Then you have a third-party application that requires some data to be sent. And then you have a checkout bounded context that may be interested in certain events from the payments boundary as well.

This is a situation you can simply manage with a single bus. But a different thought process I usually recommend to teams is to separate the external communications onto a different event bus within your bounded context, what I call the gatekeeper bus, or you can call it an external bus.

So the idea here is that the internal event bus doesn't care about cross-account concerns or who the external consumers are. Its focus is purely within the bounded context, dealing with the microservice events, all sorts of events. Whereas the external or gatekeeper bus only deals with the domain events that need to go out, or the cross-account rules that allow you to share events with other domains or other bounded contexts. It's just a way of separating concerns and keeping things simple.
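
One way to wire the two buses is a rule on the internal bus that forwards only domain events to the gatekeeper bus. The sketch below builds the requests for the boto3 `put_rule` and `put_targets` calls; the rule name, the `category` metadata field and the IAM role are assumptions, not something the talk specifies.

```python
import json


def build_gatekeeper_forwarding(internal_bus: str, gatekeeper_bus_arn: str, role_arn: str):
    """Build put_rule and put_targets arguments for internal-to-gatekeeper forwarding."""
    rule = {
        "Name": "forward-domain-events",  # hypothetical rule name
        "EventBusName": internal_bus,
        # Assumed convention: domain events are flagged in the event metadata.
        "EventPattern": json.dumps(
            {"detail": {"metadata": {"category": ["domain_event"]}}}
        ),
        "State": "ENABLED",
    }
    targets = {
        "Rule": rule["Name"],
        "EventBusName": internal_bus,
        "Targets": [
            {
                "Id": "gatekeeper-bus",
                "Arn": gatekeeper_bus_arn,
                # Role assumed to allow events:PutEvents on the gatekeeper bus.
                "RoleArn": role_arn,
            }
        ],
    }
    return rule, targets


def apply_forwarding(rule: dict, targets: dict) -> None:
    """Send both requests to AWS. Needs boto3 and credentials at call time."""
    import boto3  # imported lazily so the builder works without the SDK

    client = boto3.client("events")
    client.put_rule(**rule)
    client.put_targets(**targets)
```

All cross-account and consumer-specific rules then live on the gatekeeper bus, so the internal bus only ever needs this one outward-facing rule.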

So that's it. As I mentioned, it's just a mechanism to keep things clear and simple in your implementation. The gatekeeper bus is the one that will have all the rules and so on.

Now the key points. As with EventBridge in general, it reduces complexity, but you also need to be thinking about the ordering issues with EventBridge, et cetera. It can act as a typical anti-corruption layer. You can easily set up a microservice to go with the gatekeeper bus and have all your transformation logic in there.

For example, if one consumer requires the events to be transformed into CloudEvents, you can have that covered as well. So we come to the end of the patterns.
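
A transformation of that kind could look like the sketch below, which maps a standard EventBridge event envelope onto the CloudEvents 1.0 attributes. The mapping choices (for example using `detail-type` as the CloudEvents `type`) are my assumptions, not something the talk prescribes.

```python
def to_cloudevent(event: dict) -> dict:
    """Map an EventBridge event envelope to a CloudEvents 1.0 structure."""
    return {
        # Required CloudEvents attributes.
        "specversion": "1.0",
        "id": event["id"],
        "source": event["source"],
        "type": event["detail-type"],
        # Optional attributes.
        "time": event["time"],
        "datacontenttype": "application/json",
        "data": event["detail"],
    }


if __name__ == "__main__":
    sample = {
        "id": "6a7e8feb-b491-4cf7-a9f1-bf3703467718",
        "source": "payments.service",      # hypothetical producer
        "detail-type": "PaymentSettled",   # hypothetical event type
        "time": "2023-11-27T10:00:00Z",
        "detail": {"payment_id": "p-123"},
    }
    print(to_cloudevent(sample))
```

Keeping this mapping in the microservice behind the gatekeeper bus means internal producers never need to know which external format a consumer asked for.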

So let's look at three of the common questions I get asked every time. The first question people often ask is: how many event buses should I have? Can I have an enterprise-wide bus, or a domain bus, or a bounded context bus? You can have combinations of these.

The problem with the enterprise bus is that you have all the different events going in, so you need to have some kind of governance built on top of EventBridge. For example, how can you automatically onboard and offboard producers and consumers? These are the things you need to be mindful of. It's not that simple when you have different domains pushing and consuming events.

A domain-level event bus is a bit simpler, but it still has complexity, depending on the setup of your business domain. Again, it would be useful to have that sort of governance in place, and schema validation, et cetera. A bounded context bus is easier still, because that sits within your own two-pizza team boundary. So you decide, with the gatekeeper bus, how to share events, et cetera.

The next common question is: should I use EventBridge or Kinesis? Can I replace Kinesis with EventBridge? The way I put it is that Kinesis has a purpose. It is there for cloud-scale event ingestion or streaming; I call it just a streamer. Whereas I consider EventBridge a more refined event handling mechanism for your microservices or applications; you don't just dump everything into EventBridge. So it's more of a choreographer. And of course there are differences in payload sizes and how long events can be kept. These things obviously differ from service to service, but have that in mind. Unless you have a valid use case, leave to Kinesis what Kinesis does best, and handle EventBridge events the way that suits EventBridge, right?

And finally, again a common question: there are three different services, which one should I use? Again, there are commonalities. Of course, they are all asynchronous, and there are commonalities in terms of purpose. SQS is for queuing and buffering. SNS is the typical pub/sub model with topics. EventBridge is a broker, a kind of coordinator choreographing your services and the message flow.

With SQS, in some cases you can adjust the pace through its characteristics. Whereas with EventBridge at the moment there is no way to do that; you need to put a queue or something on the consumption side to slow things down. FIFO is another differentiator: as of now, EventBridge does not support FIFO, or ordered events. And then there is batching.

With SQS, when you process, you can pull from one up to 10,000 messages in one batch. We don't have that capability with EventBridge at the moment. So these are the basic concepts, differentiators and similarities you probably need to keep in mind. And that's all.

So thank you all so much for listening, and please complete the survey. Thank you.
