Dive deep into Amazon DynamoDB

If you're not here for that 330, now is the time to find another room.

A couple of months ago, I posted this on LinkedIn. And the presentation which we're going to do is a deep dive into DynamoDB.

The approach we're going to take is this: a number of you gave me questions which you wanted me to answer. So we're going to be talking about DynamoDB on-demand, provisioned mode, transactions, streams, and global tables. But basically I'm going to try and answer the questions which all of you gave me.

So for those of you who gave me these questions, thank you. And like it says up there, my name is Amrith. I'm a senior principal engineer on the DynamoDB team. I've been with this team for about four years now. I've worked with databases for about 30 years. And when I joined this team a little over three years ago, there was one thing I noticed which was a little awkward coming from a SQL background.

Quick show of hands: how many of you here have a SQL background? Good. Ok. One of the things which I missed was a nice CLI, something like isql or mysql. So I made myself one; that's the QR code down there. Try it out. It's open source. If you have ideas for things which it can help you with, let me know. In my day job, I work with the engineering team which builds DynamoDB and operates DynamoDB, the service. But I spend a lot of time talking to customers to understand the challenges which they have with data, the things they would like to do.

And one of the things which I hear often (I heard this yesterday, twice) is: we're going to be migrating to DynamoDB, and we're not really sure whether DynamoDB can handle our traffic.

So before we get to the regular presentation, let me just deal with that question. If you wonder whether DynamoDB can handle your traffic, here are some numbers for you.

DynamoDB is fully managed. DynamoDB is serverless. What that means is if you bring your workload to DynamoDB, your data is going to be stored right along with some of our largest customers', and your requests are going to be served by the exact same hardware which is serving those requests.

So if any of you here is concerned whether DynamoDB is going to be able to handle your traffic from your application, we got you covered.

So with that out of the way, and I'm going to be referring to these numbers multiple times during this presentation: that's half a million requests a second per customer, and there are hundreds of them; a billion an hour. So keep that in mind, that's the scale at which we're operating.

So when I talk about how we do things in DynamoDB, it's because we want to deal with that scale and we want to give you predictable response times at any scale, whether you're doing 1,000 requests per second or a million a second: consistent responses. Ok.

So the first topic we're going to talk about is the two capacity modes. And this is going to be the format of the rest of this presentation: I'll tell you the three questions which I'm going to talk about, and then we'll go about answering them.

How does DynamoDB scale, and how does on-demand really work?

The second question is how does rate limiting in DynamoDB work, and the third is which should I use, on-demand or provisioned?

So those are the three. Also, this is not the first time we're doing a deep-dive talk on DynamoDB. There was one in 2018; one of my colleagues, Jaso, did that. If you have not watched that, I strongly recommend that you do. I'm not going to be repeating the same material here.

So these two talks go together, just as Alex's talk and this one go together. So how does DynamoDB scale? You store data in DynamoDB in tables. When you create a table, we ask you for exactly three things: What's the name of the table? What's the primary key? And what is the billing mode? The billing mode is the capacity mode: is it on-demand or provisioned?

Now, we ask you for a fourth thing: we ask you for a credit card number, but that's beside the point. We take your partition key, compute a cryptographic hash, and distribute your data into multiple partitions.
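To make that concrete, here's a minimal sketch of that create-table call using the AWS SDK for Python (boto3). The table name and key attribute names are made up for illustration:

```python
import boto3

# A minimal sketch of creating a table: name, primary key, and billing mode.
# The table and attribute names here are placeholders.
dynamodb = boto3.client("dynamodb")

dynamodb.create_table(
    TableName="orders",                                   # 1. the name of the table
    AttributeDefinitions=[
        {"AttributeName": "customer_id", "AttributeType": "S"},
        {"AttributeName": "order_id", "AttributeType": "S"},
    ],
    KeySchema=[                                           # 2. the primary key
        {"AttributeName": "customer_id", "KeyType": "HASH"},   # partition key
        {"AttributeName": "order_id", "KeyType": "RANGE"},     # sort key
    ],
    BillingMode="PAY_PER_REQUEST",                        # 3. the capacity mode (on-demand);
                                                          #    use "PROVISIONED" plus
                                                          #    ProvisionedThroughput otherwise
)
```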

DynamoDB scales horizontally by dividing your data into partitions. We try to keep each partition to approximately 10 gigabytes. We store multiple copies of your data; we store n copies. It says down here in the small print: you will hear other presenters talk about three copies. I don't like that, because three doesn't change and n does; sometime soon we might have more or fewer copies. So we store multiple copies in different AZs for availability and durability.

We scale by distributing your data horizontally, and we serve your requests, half a million requests a second, by distributing your data into multiple partitions. That way each partition gets a small fraction of that load.

So the question again is: is there a limit? DynamoDB is serverless, which means you don't have to manage any servers, but you've got actual data. It's sitting on actual nodes, and there are a finite number of hosts. Therefore there is a limit. But in reality, that number is so large, you don't have to worry about it. Half a million requests a second, hundreds of customers doing that.

So if you have a large load or a large data set, 200 terabytes of data, we can easily handle that for you. And we do that a little piece at a time: small partitions, 10 gig each, many of them, to handle your load. Ok.

So, to understand how on-demand really works (on-demand is something which became more popular after the last of these deep-dive presentations), you need to understand how rate limiting works.

So if an application is going to be serving any substantial traffic, it's not going to be one instance of the application; there are going to be many instances of the application. Each instance of the application establishes a connection to a regional endpoint, which is our request router fleet. And again, down here is where your data is actually stored.

When a request is received from your application, the request router does a few things. First, it authenticates you, it authorizes you, and so on, and then it asks: are you within the number of requests per second which you're allowed? For a provisioned-mode table, you tell us what that limit is; for an on-demand table, it's your account limit. Are we good to serve that request or not?

If we are good to serve the request, we send it down to a storage node, and the storage node will do some partition-level controls. Remember, the storage node is shared by everybody: you, and the people with large tables and small tables. So each partition gets a small slice of the storage node's capacity.

So we have guardrails in place here to make sure that your response time and everybody's response time is going to be predictable, single-digit milliseconds or thereabouts. So there are two sets of limits: at the table level, enforced by the request routers, and at the storage node level, enforced for each partition.

Now, doing this rate limiting when you've got tens of thousands of request routers is a hard trick. And we do that using a mechanism called global admission control.

So global admission control, which you'll hear me refer to as GAC, is a highly distributed rate limiting system: tens of thousands of request routers, and every time a request is received, we verify whether we should or should not serve that request. And we do that, along with the guardrails on the storage node, to make sure that everybody gets predictable response times.

So let's look at a picture of how this actually plays out. Global admission control maintains a table-level token bucket. Quick show of hands: how many people know what token buckets are? Ok.

A token bucket is a very simple, highly efficient rate limiting algorithm. The way it works is very simple: we add tokens to a bucket at some fixed rate, depending on whatever rate we want to enforce. When you want to serve a request, you see if there are tokens in the bucket. If there are tokens in the bucket, you serve the request; if there are no tokens in the bucket, you don't.

If there are tokens in the bucket, you take a token out, serve the request, and move on. Ok. So GAC maintains the table-level token bucket, but it would be very, very difficult, every time there's a request, half a million times a second, a billion times an hour, for a request router to go to GAC and say, I want to serve a request, can I do it? Nope.

Instead, request routers maintain a virtual token bucket. So, two parts: a token bucket maintained by GAC, and request routers maintaining a virtual token bucket. So let's see how good I am at PowerPoint animations. There you go. I can do something.

Tokens fill the token bucket at GAC at some rate. They fill up to a certain point, after which we don't accumulate any more tokens. And then when a customer comes along, an application comes along with a request, that request makes its way across to a request router.

Well, that request router has no tokens. So it goes to GAC and gets not one token but a small stack of tokens. Ok. It uses one of those tokens from there, it serves the request, and we're done.

Now, this application has got numerous connections to numerous request routers. Before long, you're going to have many request routers which have tokens. Periodically, the request routers go back to GAC and say: you gave me a stack of tokens, I served so many requests, I have so many left.

A good analogy for this is a large sports venue with dozens of entrances. You have people manning the entrances who have to let all of you in. If, every time a person needs to be let in, the security person has to contact some control center and say, I want to let one person in, those walkie-talkies are going to be really, really busy. Instead, you have numerous control points at numerous entrances.

If the person there asks the central place, and that place says, let in 50 people and then let me know, or call me in a minute and tell me what's going on, that reduces the traffic. And each entry point can let in some number of people. You accomplish the same thing with much less communication. That's basically what GAC does: request routers receive stacks of tokens, they serve requests, and they report back.

So here are those three steps: request routers fetch tokens from a central token bucket, they store them in their virtual token buckets, they serve requests, and they publish statistics. And now GAC has a very good understanding of the actual load on the table.
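To make the token bucket idea concrete, here's a small illustrative sketch in Python. This is not DynamoDB's implementation, just the shape of the idea: a central bucket refills at a fixed rate, and a router-side virtual bucket fetches small stacks of tokens from it instead of asking for every request. The rate, burst, and stack size are made-up numbers:

```python
import time

class TokenBucket:
    """Plain token bucket: tokens accrue at `rate` per second, up to `burst`."""
    def __init__(self, rate, burst):
        self.rate, self.burst = rate, burst
        self.tokens, self.last = burst, time.monotonic()

    def _refill(self):
        now = time.monotonic()
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now

    def take(self, n=1):
        """Take up to n tokens; return how many were actually granted."""
        self._refill()
        granted = min(n, int(self.tokens))
        self.tokens -= granted
        return granted

class VirtualBucket:
    """Router-side bucket: grabs a small stack of tokens from the central
    bucket instead of asking the central bucket for every single request."""
    def __init__(self, central, stack_size=50):
        self.central, self.stack_size, self.local = central, stack_size, 0

    def try_serve(self):
        if self.local == 0:                                   # out of local tokens,
            self.local = self.central.take(self.stack_size)   # fetch a stack
        if self.local > 0:
            self.local -= 1                                   # serve the request
            return True
        return False                                          # throttle

# e.g. a table allowed 1,000 requests/second, shared by many routers
central = TokenBucket(rate=1000, burst=1000)
router = VirtualBucket(central)
print(router.try_serve())
```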

So as a customer, you set a table at, let's assume it's a provisioned-mode table, half a million requests per second. We're going to serve half a million requests per second over maybe 10,000 request routers. But we're not going to be talking to GAC for each request, only periodically.

So at scale, that's how we do rate limiting. The advantage of this is, in addition to doing admission control, we now have a very, very good understanding of the actual rate of traffic on the table. We know the rate, we know the rate over time, and we capture that telemetry.

So I'm going to mention two concepts here, and I'm going to build on those in the next couple of slides. A split is a mechanism by which we divide a partition into multiple pieces and create multiple partitions to divide load. Knowing the traffic on the table, we can preemptively split to serve more load. Hold that thought.

The other thing which we do: storage nodes hold partitions, and different partitions serve different amounts of traffic over time. We've noticed that some storage nodes may be running hotter than others, so we move replicas around to load balance the storage nodes. Again, customers want predictable response times at any scale. All of these things are happening behind the scenes; you don't have to do anything for this. You create the table, load data, run traffic; we do all this for you.

So hold these two thoughts, preemptive splits and replica moves, because we're going to build on them later. So, admission control in DynamoDB: two parts, at the table level and at the storage node, partition level.

So the question then is: which should you use in your application? And this is one which almost every customer conversation I have goes into: should I use on-demand, should I use provisioned? To understand that, you need to understand why we built on-demand in the first place.

When DynamoDB launched, you created a table and you told us what the provisioned capacity was. But the promise of DynamoDB has always been that it is fully managed and it's serverless, and fully managed means we should not be asking you the question, how many RCUs and how many WCUs. Create a table, load data, run your application.

The rest should be magic. So we're trying to get to that magic. At the end of the day, the choice between provisioned and on-demand mode is something which you should make on a case-by-case basis. And it always comes down to cost. What is the cost of serving the request, and what is the cost of throttling the request? And the cost is not just what you pay DynamoDB. A lot of you build event-driven applications with streams or Lambdas or KDS downstream from DynamoDB; are those systems scaled?

We have no problem serving half a million requests per second. No problem. Is the rest of your application downstream scaled up? What is that going to cost you? Compute those costs and decide. Do you want provisioned mode? Do you want on-demand? Do you want rate limiting or not?

So, the reason we built on-demand. This chart is something which we build internally when we're looking at applications. It shows time on the x-axis, and it shows the keys which were accessed over time. And this is a very narrow strip of a much wider chart; I think this shows 5,000 keys over some period of time. This is an ideal application: the load is evenly distributed over space and time.

How many of you have an application which looks like this? For those of you who are watching this on video: no hands went up. Ok? This is what real applications look like. Once in a while you get a burst of traffic; once in a while there are some keys which get hot. And customers told us: yeah, I understand you want this ideal traffic. This is our reality. Fully managed, serverless, deal with it.

We built on-demand to deal with exactly these kinds of things. This is reality. So when we launched initially with provisioned mode, this is a blog post which the QR code there gets you to. We gave you this example of an application whose load went up and came back down. With auto scaling, you can slowly step this table up in provisioned mode, and what you pay is the area under the red curve. Ok. This looks very nice. Why do we need on-demand?

Well, consider this case. This is also a table with auto scaling. The table was provisioned initially for 7,500, let's call it RCUs, it may be WCUs, it doesn't matter, 7,500 requests per second, and it was actually serving about 4,000. No problem. And then the traffic shoots up.

Customers told us, when we get these kinds of spikes and they're short-lived, you should just deal with it. Ok. So we introduced burst capacity here. This burst lasted more than five minutes, and then we started throttling, and it took the customer or auto scaling to actually dial up the table before we started serving the traffic again.

So this was a solution. Customers told us they wanted bursts, and we dealt with bursts, but this was not all. What if this burst was going to last longer? What if we don't want this throttling to happen? So we built on-demand in response to customers who said the following: our application traffic is going to go up over time, organic growth, deal with it, we don't want to do anything. Ok. Once in a while, we're going to get bursts; deal with those too.

So we asked, how much should the burst be? Some said, you know, 2x, 1.5x, 5x; some said 100x. But we finally came down to: anything under 2x of your previous peak, we'll just deal with it. If there's a larger burst, more than 2x, ok, this is a compromise; there may be some throttling.

So what we do in GAC is figure out what the utilization on your table is. And if your table is on-demand, we say we will pre-split the table, a preemptive split of the table, so that we can always handle 2x. Done for you; you don't have to do anything further. GAC tells us what the actual traffic is on the table, we pre-split, and we deal with 2x. More than 2x, we will serve the traffic, but it might involve a short period of throttling while we split the table and then serve the traffic.

Now, behind the scenes, we continue to innovate with this, but here's the same blog post. This table was provisioned for, I think, 10,000 requests per second, or this table had served 10,000 requests per second at some point in time. Recently, it was serving 4,000 to 5,000. It shot up to 18,000. No problem; we'll deal with it, preemptively split to 2x. And then this customer's traffic actually dropped down to zero. What did they pay at this point in time? Nothing. On-demand is pay exactly for what you use.

So if you have a workload which is variable and you want scale to zero, there's your answer. Behind the scenes, we continue to innovate on these things. Quick show of hands here, those of you who have used DynamoDB for a while: when was the last time you got an email which said, we have a maintenance window and your table is going to be unavailable from some time to some time? Anybody? We don't have maintenance windows. We do all of these things behind the scenes, 100% available.

So when we change the architecture of how DynamoDB works, you don't notice it. Fully managed, serverless; that's the promise to you. So which should you use, on-demand or provisioned? It's entirely a matter of cost. What is the cost of not serving the request? Is it acceptable for you to not serve the request? If you absolutely want to serve the request, move to on-demand mode. The best practice we use within Amazon, and I'll suggest this to everybody here:

Start with on-demand till you understand your application, till you understand the patterns. Once you do, make the decision whether you want to move to provisioned or not. With auto scaling, you can automatically dial up a table. We have a customer who has three peak loads during a day: breakfast, lunch, and dinner. They use auto scaling with timed scaling to dial up the table and dial it down. They understand their predictable workload. That's the best thing for them for cost.
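For that kind of predictable daily pattern, timed scaling means scheduled actions in Application Auto Scaling. Here's a hedged sketch with boto3; the table name, cron expressions, and capacities are placeholders, not recommendations:

```python
import boto3

# Illustrative only: scheduled scaling of a provisioned table's write capacity
# around a known lunch peak.
aas = boto3.client("application-autoscaling")

aas.register_scalable_target(
    ServiceNamespace="dynamodb",
    ResourceId="table/orders",
    ScalableDimension="dynamodb:table:WriteCapacityUnits",
    MinCapacity=100,
    MaxCapacity=5000,
)

# Dial the floor up before the lunch peak...
aas.put_scheduled_action(
    ServiceNamespace="dynamodb",
    ResourceId="table/orders",
    ScalableDimension="dynamodb:table:WriteCapacityUnits",
    ScheduledActionName="lunch-scale-up",
    Schedule="cron(30 11 * * ? *)",
    ScalableTargetAction={"MinCapacity": 3000, "MaxCapacity": 5000},
)

# ...and back down afterwards.
aas.put_scheduled_action(
    ServiceNamespace="dynamodb",
    ResourceId="table/orders",
    ScalableDimension="dynamodb:table:WriteCapacityUnits",
    ScheduledActionName="lunch-scale-down",
    Schedule="cron(0 14 * * ? *)",
    ScalableTargetAction={"MinCapacity": 100, "MaxCapacity": 5000},
)
```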

We have other customers who say, I don't understand this workload, it can burst any time, and it's too expensive for us to lose that traffic; we've got to get that traffic. They use on-demand. What is the cost to you of serving the request? What is the cost of throttling? That's the choice you have to make. Till you understand that, start with on-demand. Once you understand it, you're in a better place to choose.

All right, let's switch to the second of four topics: transactions. For those of you who are new to DynamoDB, who think DynamoDB is a NoSQL database, you may be surprised to hear we're talking about transactions. I did a lecture recently at Boston University to a bunch of students, and when I started talking about transactions, someone said, this is not an RDBMS, you don't have transactions. Yes, we do. Customers told us they want to build applications where we can update multiple items at the same time. A transaction: either both happen or neither happens, or all of them happen or none of them happen. So we built transactions.

So here are three questions which I picked from, I think, a half dozen about transactions. What isolation guarantees do we provide? Quick refresher: we support standard ACID transactions, and the isolation level we support is serializable. Quick show of hands: everybody familiar with what serializable is? Ok. So let's talk about what serializable is.

If multiple things happen at approximately the same point in time, a system is serializable if an external observer can look at all of those things and say: first A happened, then B happened, then C happened. If an external observer is able to explicitly order multiple things one after another, the system is serializable.

So if you send two transactions to DynamoDB at exactly the same instant, all of the first will be done, then all of the second: serializable. So two transactions are always serializable with each other. If you have a non-transactional read or write, each individual read and write will serialize with the transaction.

So if you do a batch get or a batch write, each individual get or put will serialize with the transaction, not the entire batch. If you do a query or a scan, each item fetched in a query or a scan is an individual fetch. Let me say that again: if you do a query or a scan, each individual item fetched in the query or scan is an individual get. Transactions will serialize with each of those. Ok?

If you need to use transactions, remember that we will guarantee atomicity, and consistency is always going to be eventual consistency. But we strongly support isolation: serializable isolation. A couple of people who I was talking with while preparing this presentation said, no, but DynamoDB does dirty reads. No, we never do dirty reads. We will never ever expose a written but uncommitted operation. Never. Ok.

So how do transactions actually work? This is only a one-hour presentation; we can't go into everything here. But if you're interested in all the details of how transactions work, some of my colleagues here wrote this paper. It was presented at USENIX ATC 2023; I think it was this year. It goes into a lot of detail about some really cool innovations which we came up with here.

So you have requesters; they speak to DynamoDB, they talk to a request router, we authenticate and authorize your request, we verify whether we should serve it or not, admission control. And then we have a standard transaction coordinator; for those of you who are familiar with two-phase commit systems, a standard transaction coordinator.

The actual implementation of the transactions is much smarter than that, because transaction coordinators traditionally, with RDBMSs, include distributed lock managers. Distributed lock managers are code for unpredictable response times. DynamoDB is all about predictable response times. So we do have a transaction coordinator, we do use two-phase commit, but we do it with no distributed locking.

So if you're now curious how we do it, there's a nice paper for you to read. Talking to people about this: we do distributed transactions, we do have a transaction coordinator, but you don't do begin/end transactions, you don't do multi-request transactions. How can you do transactions? How many people here are familiar with begin/end or multi-request transactions? Ok.

So on the right is what you would typically see in a relational database: you do begin, do a read or write, select, insert, update, select, whatever, and then you decide commit or rollback. That's the end. That's not how a DynamoDB transaction goes. You identify all the things which you want to do and you hand them over in one go, a single request.

So the question is, why do we do single-request and not begin/end? And the answer is relatively simple. The reason customers use DynamoDB is we guarantee predictable response time at any scale. For those of you who have built applications with begin/end style transactions:

How often have you debugged a situation where your application appears to hang? Somebody else has got a lock on the table. You go find that person; they did begin, they did some changes, they went to get coffee. How do you fix that? Don't give people coffee? No, that doesn't work either. So we came up with these things which give us predictable response times without having a distributed lock manager, and the way we do that is by having single-request transactions. We know, as soon as the request is received, all the things we need to do in that transaction. But it comes at a cost. A multi-request transaction, traditional begin/end, lets you intersperse reads and writes: you can begin, you can read, you can see what the value is, you can decide whether you want to write or not, all of these things. You can have complete flow control in your client application. You can't do that here, because you have to tell us up front all the things you want to do in the transaction.

So let me tell you: you lose absolutely nothing with single-request transactions. Everything which you can do with begin/end, you can do with a single-request transaction. And here's how you do that. If you think most easily with begin/end, great: identify all the reads and writes, make two little lists, reads and writes. Do all the reads first. Once you do the reads, then construct a single write transaction with everything which you want to write, and turn all the stuff which you read, the invariants, into check conditions: between the time when you read and the time when you do the transaction, nothing has changed in the world. Everything which you can do with begin/end you can do with a single-request transaction. And, oh by the way, predictable response times at any scale, no distributed lock managers. That's the reason we did it this way, and you lose nothing in the process.
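Here's a hedged sketch of that read-then-write pattern with boto3: read the items first, then submit one TransactWriteItems call whose condition expressions assert that what you read hasn't changed. The table, keys, and the balance invariant are made up for illustration:

```python
import boto3

ddb = boto3.client("dynamodb")

# Illustrative only: table name, keys, and the "balance unchanged" invariant
# are placeholders.

# Step 1: do all the reads first.
src = ddb.get_item(TableName="accounts", Key={"account_id": {"S": "A"}})["Item"]
dst = ddb.get_item(TableName="accounts", Key={"account_id": {"S": "B"}})["Item"]

# Step 2: hand everything over in one go. The values you read become condition
# checks, so the transaction fails if anything changed in between.
ddb.transact_write_items(
    TransactItems=[
        {
            "Update": {
                "TableName": "accounts",
                "Key": {"account_id": {"S": "A"}},
                "UpdateExpression": "SET #bal = #bal - :amt",
                "ConditionExpression": "#bal = :seen_a",   # invariant from the read
                "ExpressionAttributeNames": {"#bal": "balance"},
                "ExpressionAttributeValues": {
                    ":amt": {"N": "25"},
                    ":seen_a": {"N": src["balance"]["N"]},
                },
            }
        },
        {
            "Update": {
                "TableName": "accounts",
                "Key": {"account_id": {"S": "B"}},
                "UpdateExpression": "SET #bal = #bal + :amt",
                "ConditionExpression": "#bal = :seen_b",
                "ExpressionAttributeNames": {"#bal": "balance"},
                "ExpressionAttributeValues": {
                    ":amt": {"N": "25"},
                    ":seen_b": {"N": dst["balance"]["N"]},
                },
            }
        },
    ]
)
```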

Ok, single-request versus multi-request transactions. I was down in Brazil a couple of weeks ago, and a customer had built this very, very sophisticated application with transactions, and they posed this problem to us: my application is up and running, it does a transaction, and before it receives a response from DynamoDB, the application restarts. The application is coming back up. It sees that it sent a transaction request down and it doesn't know what the status is. How do we deal with application recovery?

Quick show of hands: everybody understand the question being asked? Yep. Ok, short answer: we deal with this all the time. One of the things with the cloud which you have got to keep in mind, when you're building an application in the cloud and you're building it at the kind of scale we're talking about, is to assume that your application is going to die at the most inappropriate time possible. You've got to deal with it. So we built DynamoDB transactions to deal with exactly this. Along with your transaction, you specify a client request token; that's a unique identifier for the transaction. If you execute the same transaction again within 10 minutes and you provide the same client request token, we will not re-execute the transaction for you. We'll just tell you what happened the last time around. Ok?

Just a quick tip: construct the client request token as a hash of the request. That way you don't have to store the request token somewhere, and your recovery doesn't have complications. You'll understand why when I show you this picture. I have a client which comes along and makes a transaction with a client request token. It makes its way over to DynamoDB. DynamoDB is processing it, but the client has restarted; it's dead. DynamoDB completes the transaction. It succeeds, but we have no place to send the response; and whether it succeeds or fails is immaterial, it completes it, but we have no place to send this response. The application restarts. It sends the exact same transaction again with the same client request token. DynamoDB says: oh, I've seen this before, I'm not going to do it again; it succeeded, keep going, or it failed, and you can handle the recovery in your application. So if your application terminates unexpectedly, transactions will help you know what happened.
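Here's one way that tip might look in Python. The serialization scheme is an assumption, just one option, and the truncation reflects the token's length limit (36 characters, if memory serves):

```python
import hashlib
import json

def client_request_token(transact_items):
    """Derive the idempotency token from the request itself, so a restarted
    client can rebuild the same token without having stored it anywhere.
    Canonical JSON is an assumption, not a requirement; the token just has
    to be reproducible and short enough for the API."""
    digest = hashlib.sha256(
        json.dumps(transact_items, sort_keys=True).encode()
    ).hexdigest()
    return digest[:36]

# On retry after a crash, the same items produce the same token, so DynamoDB
# reports the outcome of the original transaction instead of re-executing it:
# ddb.transact_write_items(
#     TransactItems=transact_items,
#     ClientRequestToken=client_request_token(transact_items),
# )
```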

All right. For those of you who've studied human behavior, about 30 minutes is the limit that people can sit and follow, especially a boring presentation. So if we could do popcorn, I would, but we can't. So this is the best I can give you for humor.

So we'll just do a quick change of pace so everybody can reset for the next part of the presentation.

Recall that we said tables have a name, a partition key, and a sort key; that's the primary key. We divide stuff into partitions, we have n replicas, and so on. For those of you who have been following DynamoDB for a while, those of you who watched Jaso's presentation from 2018: a replica has a tree and a log, and a replica is the thing which serves your requests. It serves your reads and writes, but it does other things too. We have n replicas; if one of the replicas terminates, the way in which recovery used to work is we would go to one of the other replicas and say, what's the state of this replica? So the replica is actually doing a lot of things. Streams notifications? Go to the replica. If you want to take a snapshot, go to the replica. But DynamoDB also said we want predictable response times at any scale, and doing all of these things started introducing tail latency.

So one of the things which we have started moving towards is that the replicas and the storage nodes will only serve reads and writes, because all of the other things can be served just fine by reading a log stream. You want to get streams notifications? Go to a log. You want to know what the current state of a replica is? You don't have to go to the replica itself; you can reconstruct it from the log. So we built this, and we came up with this abstraction called the log service, which has substantially simplified our integration and our ability to build a lot of other features.

We recently launched export to S3; you export to S3 from a table, and we don't touch a replica, we go to the log. One of the other things we do quietly in the background: we rehydrate a partition on some host which is not a storage node, and we checksum the partition, we checksum the replica, and then we go back to the actual replica and say, what's your current checksum? So we identify if there's any data corruption, completely behind the scenes. Indexing, streams, global tables, backup and restore: all of these rely on this new abstraction called the log service.

And now that we're talking about streams and global tables, you'll see why I stuck that in there. So with streams, the three questions which folks wanted to have answered were these: What guarantees do you provide? Why do you have shards, and why are there so many of them, with shard rotation and splitting? Let me point out, this is DynamoDB Streams, not Kinesis Streams.

So how many of you here are writing applications using DynamoDB Streams directly? A small number, understood. If you use Lambda, this is probably still interesting, but Lambda abstracts much of that stuff for you. So these two questions kind of go together: what guarantees do you provide, and what are stream shards and why are there so many?

So here's the architecture for streams. Mutations to a table happen with inserts, updates, and deletes. Those go to the replicas, they kick off log entries, the log entries go to the log service, and there's a streams listener; the streams listener then turns them into shards. And when you make a streams request, those go to the streams routers, not the request routers: ListStreams, DescribeStream, GetShardIterator, and then GetRecords. So you make the request to the streams router. It identifies which shard you're talking about and it gets you the data from the shard.

The question is, why do you have so many shards? So if you have a table and you're going to do, I don't know, 10 writes per second, you're probably doing that from a single-threaded application. 10 writes per second you can do from a single-threaded application, no problem. You can also probably process those changes as they happen in a single-threaded application, no problem. But what if you're one of those customers who's doing half a million requests per second? That's not coming from a single-threaded application, and I can't reasonably expect that you will process half a million mutations a second in a single-threaded application.

DynamoDB scales horizontally. Sharding of streams is a mechanism by which we give you the ability to process a high rate of transactions, or a high rate of streams notifications, in your application. So when you make a connection and you request data for a stream, we get your data from the right shard, again so you can handle high throughput.

So why do we shard, and what guarantees do we provide? Important to note: this is DynamoDB Streams, not Kinesis Streams. We guarantee strict ordering of all mutations on an item, and a mutation will occur exactly once in the stream. Kinesis Streams does not give you the ordering guarantee, and it says that there may be duplicates and you have to deal with that; that's the distinction between the two. On a per-item basis, every mutation which actually happens to the item will appear in the stream exactly once. If you attempt a write and there's a condition check failure, we didn't actually change the item, and nothing shows up in the stream. If you actually change the item, it does.

Ok. So why split shards, and why do you rotate? Now, for those of you who raised your hands when I asked how many of you write applications with streams: this is hard, I'm sorry, but there's a reason we do the things which we do. So to explain that, let me just give you a simple graphical illustration. When you call DescribeStream, you get a list of shards. Some shards are closed, some shards are open. What's the difference? Every shard has an identifier of some kind, a parent, a starting sequence number, and an ending sequence number. An open shard has no ending sequence number; a closed shard has an ending sequence number.

I make a table. I say I'm going to do 10 WCUs, 10 writes per second. Yeah, one shard is more than enough. We give you one shard. It's the root shard. It's open; it's open because it has no ending sequence number. Time goes on, and about four hours later, we close that shard and we create a new one. This is the process of shard rotation, and we'll talk about why in a second. Some more time goes on, and I start driving, I don't know, 1,500 writes per second. 1,500 writes per second we don't believe you can deal with in a single thread. We give you two shards: we close the single shard and we give you two child shards, and the two child shards say, my parent is A. So there's a root shard, a shard rotation, a shard split; we split so that we can handle a higher rate of writes. And over time you end up with something which looks like this: a root shard, a shard rotation, a split, another split, a shard rotation, and so on.

The reason we do all of these things is, again, we do splits to handle load, so that you can handle a higher rate of writes per second. But in the underlying implementation for streams, each of these shards is hosted on a particular instance. We maintain those instances, and we noticed that over time some of those hosts were running hotter than others, and we needed a mechanism to load balance internally. Everything which we could think of would be a change which you, the customer, would notice.

We wanted to minimize the number of things which you would have to deal with in your applications. We knew we had to support splits. We knew that you would have to deal with a situation where a shard closed and multiple shards now say, that's my parent. So we said, let's just do this thing called rotation, where a shard closes and a new shard says, that's my parent; not two, but just one. So rotation is just a mechanism which we use internally to deal with load balancing.

So we implemented that, and we figured about once every four hours is a nice time to rotate shards; everything works great. I worked with a customer who got very annoyed when we briefly changed that four hours. If you want a timer which goes off every four hours, buy an alarm clock. Do not assume that four hours is going to be cast in concrete; it could change any time.

Sometimes it may happen sooner, because one of the underlying hosts has to be restarted. The other warning I'll give you is this: when you're building applications natively against DynamoDB Streams today, you notice that when a stream shard split happens, a shard splits and you have two child shards. Don't assume it's always going to be two in the future; it could be three in the future. We could have shard consolidation as well, where multiple shards are coalesced.
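If you are one of the folks reading the stream natively, here's a hedged sketch of walking the shard tree with boto3: process a parent before its children, and don't assume any particular number of children. Pagination of DescribeStream and resharding that happens mid-read are glossed over:

```python
import boto3

streams = boto3.client("dynamodbstreams")

def read_stream(stream_arn):
    """Walk the shard tree: parents before children, any number of children."""
    desc = streams.describe_stream(StreamArn=stream_arn)["StreamDescription"]
    shards = desc["Shards"]   # may need pagination via LastEvaluatedShardId
    by_parent = {}
    for shard in shards:
        by_parent.setdefault(shard.get("ParentShardId"), []).append(shard)

    def process(shard):
        iterator = streams.get_shard_iterator(
            StreamArn=stream_arn,
            ShardId=shard["ShardId"],
            ShardIteratorType="TRIM_HORIZON",
        )["ShardIterator"]
        while iterator:
            page = streams.get_records(ShardIterator=iterator, Limit=1000)
            for record in page["Records"]:
                print(record["eventName"], record["dynamodb"]["Keys"])
            iterator = page.get("NextShardIterator")
            # An open shard never "ends"; stop once we've caught up for now.
            if not page["Records"] and "EndingSequenceNumber" not in shard["SequenceNumberRange"]:
                break
        # Don't assume two children; follow however many claim this parent.
        for child in by_parent.get(shard["ShardId"], []):
            process(child)

    # Start from shards whose parent is unknown (trimmed or never existed).
    known_ids = {s["ShardId"] for s in shards}
    for shard in shards:
        if shard.get("ParentShardId") not in known_ids:
            process(shard)
```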

When you call DescribeStream, we will always give you a list of shards, and each shard will say who its parent is. Do not assume that it's always going to be two-way splits. Ok? With that caveat and warning out of the way, let's get to the last section, which is global tables.

How are we doing for time? I think we'll be good. So how do global tables work? What are the consistency and ordering guarantees? And for a global table with streams, will the mutations in all regions be identical?

Just like customers wanting to build event-driven applications led us to build streams, customers wanted to build globally distributed applications, so we built global tables.

If your data is all stored in one place, let's say us-east-1, your customers who happen to be in US East will get good response times, because, you know, speed of light and all of that stuff. But if your customer happened to be in Australia and your data happened to be in us-east-1, they would probably not get the best experience. You want your data to be close to your customers.

Data replication has been around for a long time. But DynamoDB is fully managed, and we want to give you the ability to focus on your application. So we will deal with data replication and we will deal with conflict resolution for you. You create a table, say in us-east-1 or us-west-1, and you say, give me replicas in all of these other places. We take care of all of that for you. You focus on building the application.
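Mechanically, adding a replica is an UpdateTable call. Here's a hedged sketch with boto3; the table name and region are made up, and the table is assumed to already have DynamoDB Streams (new and old images) enabled, which, as I recall, is a requirement:

```python
import boto3

# Illustrative only: table name and region are placeholders.
ddb = boto3.client("dynamodb", region_name="us-east-1")

ddb.update_table(
    TableName="orders",
    ReplicaUpdates=[
        {"Create": {"RegionName": "ap-southeast-2"}},  # add a replica in Sydney
    ],
)
# Adding another region (say sa-east-1 for the customers in Brazil) would be
# another UpdateTable call of the same shape once this replica is active.
```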

The customer in Australia connects to the local DynamoDB table in Australia, the customer in Brazil connects to the one in Brazil, and so on. We take care of the replication for you. But that means, when there are conflicts, what happens? You can write the same item in multiple places. We provide conflict resolution for you. Very simple: last writer wins. When the system reaches a steady state, the last write of an item is the version of the item in every location. Ok.

Again, speed of light: the latency of replication between regions, depending on your choice of region pairs, is usually less than a second, even in the worst case. You focus on building the application; we take care of all of this for you.

So how do global tables work? Here's a simple illustration of a global table in n regions. I make writes to the table in region 1. Notice that construct here, the log service, again. We read the stream of mutations, there's a replicator in region 1, and all it does is write the data to all of the other regions. We do not go directly to the storage nodes for this.

So, a table in four regions: I do a write in region 1, and the replicator drives that traffic through the request routers in regions 2, 3, and 4. Those of you who use global tables may wonder why the write capacity in all regions needs to be the same; this is why. The read capacity can be different in other regions; the write capacity has to be the same. Ok, I do a write in region 2, region 3, region 4: full mesh replication, completely managed for you.

So what are the consistency guarantees and ordering guarantees? Fairly straightforward: eventual consistency. Customers want predictable response times, so when you do a write, the write is done locally and then propagated to the other regions; eventually consistent. Most recent wins; in active-active systems, most recent wins. Ok. What is the ordering guarantee? If you make updates to a table in multiple locations, you will always see time appear to move forward. An older update will never be applied over a newer one. Ok, bear that in mind. Most recent wins, which means that if an older update happens to come over the wire, we will not do anything with it. And again, the latency depends on the speed of light; usually sub-second. If you use us-east-1 and us-east-2, which a lot of customers do, it is well below that, but sub-second is the number we like to use.

So this is probably the hardest of the questions. Customers are building applications with streams and global tables, and the question is this: will the streams events in all regions be identical? Quick show of hands: everybody understand the question? Ok. So what do you think the answer is? Nice picture, but what's the answer? Do you think it's going to be the same in all regions or not? Yes? No? How many for yes? How many for no? Ok, they don't seem to know. The correct answer is mostly yes, which is actually no, but you're right; it's most often yes. Certainly, you know, I don't want the yeses to feel bad here. So it's most often yes. Here's the exception.

I do a write in region 1; by the way, here time moves downwards. So I do a write in region 1. A very short while later, I do a write in region 2. Suppose no more writes are being originated in any region after that. What should the end state of the system be? Of course, it should be that the most recent write wins. Ok. I did a write in region 1, the write actually happened, so there's a streams notification here. Great. I did a write in region 2; that write got propagated over to region 1, and the write from region 1 got propagated over to region 2. Again, this is eventually consistent; there's no guarantee on the ordering here.

So now what happens? Write 1 occurred before write 2. What happens to write 1 when it arrives in region 2? It gets ignored. Ok. Time always appears to move forward, never backward. Since you did write 2 to the item here in region 2, there's a notification; by the way, this streams notification could have happened before the older write arrived. It doesn't really matter; time never moves backwards. And of course, write 2 arrives in region 1. What happens to it? Does it get applied? Yes, it gets applied, so there's a notification. So the answer is: most often, yes.

So if you're building an application and you're using global tables and you're using streams, do not assume that every write which is done to an item will actually be reflected in the stream in every region. You need to handle that in your application. Ok?

So, a quick recap of the things we talked about. How do we scale? We scale horizontally: we divide tables into partitions. You need to specify a primary key and a billing mode. The partition key is used to distribute data onto partitions. Everything in DynamoDB operates on shared infrastructure. It is fully managed and serverless for you. As a result of that, we have table-level and partition-level limits. Global admission control is the thing which enforces the table-level limits.

Oh, there's a mention here of IOPS dilution. How many of you here have used DynamoDB long enough to know what IOPS dilution is? Good. It doesn't happen anymore; it doesn't exist. We still hear about it once in a while from customers. We do predictive splits, we do adaptive moves. And I highly recommend, when you're building your application, start off with on-demand mode; you've got enough to work on in your application, don't be bothered about dialing up, dialing down, and all that. Once you understand your workload, then make the choice about what you need to do.

We support ACID transactions. They are strictly serializable: transactions serialize with each other, and they serialize with individual reads and writes. Remember, query and scan are not single reads; they're multiple reads. We never return uncommitted items; we never do dirty reads. We do single-request transactions for predictable performance, and we don't have complex lock managers.

If you want to convert a multi-request transaction to a single request, do all the reads first, then do the writes in a transaction and include everything you read as check conditions; those are the invariants for you. Everything which you can do in begin/end transactions you can do in a single-request transaction. Deal with crash recovery in your application: use the idempotency token, the client request token, if you use DynamoDB transactions.

Streams enable you to build event-driven applications. We give you strict ordering and exactly-once delivery; this is DynamoDB Streams, not Kinesis Streams. We shard and split for scalability; we rotate for internal load balancing, and both of them follow the same construct: a shard ends, a new shard appears, and some number of new shards may appear.

Global tables make it easy to build globally distributed applications. We take care of replication, we take care of conflict resolution for you, and the propagation delay is usually on the order of sub-second. And I promised you a list of other interesting reading material.

The first one is Jaso's talk from 2018, before I joined DynamoDB. This is one of the talks which I listened to; I highly recommend you listen to that one.

Alex and I did this one last year; it covers data modeling. I think Alex has other sessions this year, and there are other sessions on data modeling as well.

And these are two papers which were published at USENIX. The first one talks about how we built a scalable database, and the second talks about transactions in great detail.

And with that, let me say that I highly recommend you give us feedback, because we'd like to tailor the content of future events to the things you're interested in. So please take a minute to provide feedback. That's all I've got.
