Achieving scale with Amazon Aurora Limitless Database

Hello and welcome. Today's session is going to be about scaling relational databases - databases so big that they go beyond the limits of what a single machine can handle. If you're in this room, you likely already deserve congratulations, because you and your teammates have solved an important business problem. But now you have another problem on your hands: as your application scales, the data size and query volume that your database needs to handle are going to grow exponentially, and it's on you to make sure it can keep up.

My name is Christopher Heim and I'm a product manager on Amazon Aurora. I'm gonna talk to you today about Amazon Aurora Limitless Database, which is a new scaling capability for Aurora that lets you scale far beyond the limits of existing Aurora clusters.

When we talk about scaling databases, there are really three things that we can do. We can work harder, we can work smarter or we can get help.

Working harder means scaling up the instance that's running our database using more vCPUs or using a higher percentage of those vCPUs that we've provisioned.

Working smarter might mean optimizing your SQL, optimizing the way that your application is using the database or using some of the latest innovations from Amazon Aurora, including Aurora IO optimized or optimized reads.

The third way is to get help and for really big workloads, this is the only thing that you can do and it means putting more workers on the job, more physical machines serving the same database workload.

Now for reads, this is relatively straightforward, and it's something we accomplish on Amazon Aurora through read replicas. Scaling writes, on the other hand, is a bit more complicated, and the most common approach for relational databases is a technique called sharding.

Sharding is a long established technique for scaling the write throughput of relational databases. Traditionally your application talks to one database and that database contains all the data that your application needs. When you shard a database, you spread that data across multiple instances and each instance becomes the home for a subset of the data within the database. And if we do this right, these instances can work in parallel to achieve a much greater scale than any single instance ever could.

So, sharding brings scale, but sharding also brings problems. When you split a single database into multiple databases, a whole new set of problems falls on you when you're developing an application. For starters, simply querying the database isn't necessarily straightforward. Now there are multiple databases, and your application needs to know where each piece of data lives and which database to connect to for it. Things get even more complicated if you have a query or transaction that spans multiple databases. You might end up implementing something like a join within your application, which is something most people would much rather leave to a relational database with a sophisticated query planner.

Beyond querying, there's the question of consistency. Once you break a database apart, a lot of the basic functionality that we rely on relational databases for begins to break down in a sharded database. There are no consistency guarantees, there's no ACID transactions between multiple databases. And again, that's something that falls on you to manage. In reality, transaction management is so complicated that most people developing applications aren't going to do it on their own. And so you'll just have to find a way to live without it, which in itself is not easy.

There's also the question of maintenance. As the number of databases multiplies, routine maintenance actions can become increasingly complex. Performing an upgrade or a patch or taking a backup without the kind of consistency guarantees that we normally get from a database can be extremely complex as can debugging or optimizing your workload against a sharded database.

So while sharding does allow you to scale your writes on a relational database, it also creates a very complicated picture and a lot of problems that you have to solve when you're developing against it.

So we heard all of these problems from you, our customers, and we developed Amazon Aurora Limitless Database, which is now available in limited preview for the PostgreSQL-compatible edition of Aurora. The fundamental idea behind Limitless Database is to give you the power of a sharded database with the simplicity of a single database, by providing extreme scalability within a managed service.

Limitless Database makes it easier than ever to scale your workloads on Aurora. So what is Limitless Database? It's a serverless deployment of Aurora that automatically scales beyond the limits of a single instance. It uses a distributed architecture to scale, but it presents you with a single interface. So you can use it much more like a single database than a sharded database.

Importantly, Limitless Database offers transactional consistency across the entire system. This is a very complex problem and we're going to talk more specifically about how we solved it in a moment. All of this means that you can scale your workloads to millions of transactions per second and work with petabytes of data all within a single Aurora cluster. And that last part is really important because it means that you get many of the Aurora features that you're accustomed to today, like the durability of Aurora storage, the high availability of multi-AZ deployments. And we've even updated tools like Performance Insights and CloudWatch so you can debug your Limitless Database workload using familiar interfaces.

So if those are some of the benefits of Limitless Database, how is it that we actually go about using it? To illustrate this, we're going to take a look at a very simple sample schema, drastically oversimplified. We've got three tables - customer, order and order details. We're going to assume we have the world's simplest e-commerce application and that we use this schema, this database to process orders and collect the appropriate sales or VAT tax based on the city or country that our customer lives in.
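To make that concrete, here's a minimal plain-PostgreSQL sketch of that schema. The column names and types are invented for illustration - the talk only names the tables - and the tax rate table is the one that comes up a little later in the walkthrough:

```sql
-- Illustrative schema only; the actual demo schema from the talk is not shown,
-- so these table and column definitions are assumptions.
CREATE TABLE customer (
    cid     BIGINT PRIMARY KEY,      -- customer ID, used later as the shard key
    name    TEXT NOT NULL,
    city    TEXT NOT NULL,
    country TEXT NOT NULL
);

CREATE TABLE orders (
    order_id   BIGINT PRIMARY KEY,
    cid        BIGINT NOT NULL,      -- which customer placed the order
    order_date DATE NOT NULL
);

CREATE TABLE order_details (
    order_id BIGINT NOT NULL,
    line_no  INT NOT NULL,
    item     TEXT NOT NULL,
    amount   NUMERIC(12,2) NOT NULL,
    PRIMARY KEY (order_id, line_no)
);

CREATE TABLE tax_rate (
    country TEXT NOT NULL,
    city    TEXT NOT NULL,
    rate    NUMERIC(5,4) NOT NULL,   -- e.g. 0.0825 for an 8.25% sales tax
    PRIMARY KEY (country, city)
);
```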

And now we're going to look at how we would scale this database using a Limitless Database. Within Limitless Database, we've introduced two new types of tables which are shard tables and reference tables.

Shard tables are distributed across the multiple instances that power the database called shards. You choose a column of a shard table to be the shard key and the value of that column determines which shard a row maps to. So for the example of our orders table, we might use the customer ID field as the shard key here. Each color represents all the data that has a given customer ID, so all the data for a given customer.

Now if we create this as a shard table within a Limitless Database, Limitless Database will automatically spread this data across instances. And you can see that in Shard 3, we've actually mapped two customers to the same shard. That indicates that in reality, multiple customers are getting mapped to each shard.

So one of the convenient things about choosing the customer ID for our shard key is that then if we need all of the data for a given customer, it all resides on a single shard. And there's no need for us to move data around the system or perform complicated joins to get the data for a single customer. In fact, Limitless Database can send that query to a single shard. And if we do that across our entire application, send lots of queries to individual shards, we can get a high degree of parallelism and a whole lot of scalability out of the system.

So we also had another table in our schema, which was the customers table. And we'll likely also want to make this a shard table. And it might be convenient if all of the data for a given customer from both tables was located together. Because then if we join those tables or if we needed all of the data from both the customers table and the orders table, it could again be gotten from a single shard, we could send that query to one shard, operate in parallel and get a lot of scale.

So in Limitless Database, we call this co-locating tables, which is a way to specify to Limitless Database that these two tables share the same shard key and that all the data with the same shard key value should be sent to the same shard.

So here we're going to distribute the data from the customer table right alongside the data from the order table.

Now we have one more table in our schema, which was the tax rate table, and this is a good candidate for a reference table. Reference tables, instead of being distributed across the shards, are present in full on every shard. A good use case for this is a table that you need to join frequently to sharded tables but that doesn't lend itself naturally to sharding. Our tax rate table is also relatively small and has a relatively low write volume compared to the other tables.

The benefit of having that table present fully on each shard is that now whenever we join it, we can get more of those single shard queries, more parallelism and again, more scaling.

So if that's sort of the conceptual approach to how you would map your schema into Limitless Database, let's talk about the syntax you would use to actually go and create the schema.

If this is the CREATE TABLE statement for our customer table, the approach we've taken to creating sharded and reference tables within Limitless Database is to make it as much as possible like using existing Aurora and existing PostgreSQL. And so we've left the CREATE TABLE statement alone and instead to indicate that you want to create a sharded or reference table, we've introduced a new concept which is the CREATE TABLE MODE which you set using a session parameter.

By setting this CREATE TABLE MODE to SHARD, for any CREATE TABLE statement you run after that, Limitless Database will know that you want to create a shard table, and it will automatically distribute the data for that table across all the instances.

Now, for sharded tables, we need to set a shard key and we use the same method for that. It's a session parameter where we set the SHARD KEY and here we're gonna set it to the CID field.
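As a rough sketch of what this looks like in SQL - the session parameter names shown here follow the preview's rds_aurora.limitless_* naming, but treat them as approximate and check the current documentation for the exact spelling and value format:

```sql
-- Tell Limitless Database that subsequent CREATE TABLE statements define shard tables.
-- Parameter names are approximate; verify against the Limitless Database documentation.
SET rds_aurora.limitless_create_table_mode = 'sharded';

-- Choose the shard key column(s); here we shard the customer table by customer ID.
SET rds_aurora.limitless_create_table_shard_key = '{"cid"}';

-- The CREATE TABLE statement itself stays plain PostgreSQL DDL.
CREATE TABLE customer (
    cid     BIGINT PRIMARY KEY,
    name    TEXT NOT NULL,
    city    TEXT NOT NULL,
    country TEXT NOT NULL
);
```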

And that's all we have to do to create a shard table. I think there are two important things about this syntax. The first is that because these are session parameters, once you set them, they stay set. So this isn't something that you need to run for every single table. If you're creating a lot of shard tables, you can set the CREATE TABLE MODE to SHARD and you're off to the races.

Similarly because we've left the CREATE TABLE statement from the PostgreSQL syntax intact, the DDL that you build for your Limitless Database is entirely PostgreSQL compatible and you can run this in any PostgreSQL environment without errors.

Creating a co-located table is very similar. It's another shard table. But we've introduced one other session parameter that you need to set, which is the COLOCATE WITH parameter. So if this is our order table, we'll set the COLOCATE WITH parameter to the already existing customer table. And again, we're already in context where we've set the fact that this is a shard table and that we're using the CID field to shard it.

So those first two statements are actually optional if they're already set in the session. Taken together, these statements indicate that we're going to create another shard table and that it's going to be co-located with the customer table, so the data from both tables - rows with the same shard key value - will be located on the same shard.
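Under the same caveat about exact parameter names, a co-located order table might be created roughly like this (the collocate-with parameter name is again an approximation, and the note about including the shard key in the primary key is an assumption worth verifying):

```sql
-- Still in 'sharded' mode with the shard key set to cid from before.
-- Ask Limitless Database to co-locate the new table with the customer table,
-- so rows sharing a cid value land on the same shard.
-- Parameter name is approximate; verify against the documentation.
SET rds_aurora.limitless_create_table_collocate_with = 'customer';

CREATE TABLE orders (
    order_id   BIGINT NOT NULL,
    cid        BIGINT NOT NULL,    -- shard key column, matching customer.cid
    order_date DATE NOT NULL,
    PRIMARY KEY (cid, order_id)    -- the shard key typically needs to be part of the key
);
```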

And there's that third table which is tax rate, this is gonna be a reference table like we discussed. Can anyone guess the syntax for creating a reference table? You can shout it out.

Nothing. We got nothing. Almost! You set the MODE to REFERENCE. And because reference tables don't have shard keys and aren't co-located, that's all we need to do. Once we set the mode to REFERENCE, any CREATE TABLE statement we run will create a reference table.

The reason NOTHING was a very good guess, and the reason it isn't nothing, is that we also support creating standard tables. When you set the CREATE TABLE MODE to STANDARD and run a CREATE TABLE statement, it creates a regular Aurora PostgreSQL table that lives in the same cluster as your shard and reference tables.
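Continuing the same hedged sketch, a reference table and a standard table would look roughly like this (mode values and parameter names are approximations of the preview syntax; the audit_log table is a hypothetical example just to show the standard mode):

```sql
-- Reference table: present in full on every shard. No shard key, no co-location.
SET rds_aurora.limitless_create_table_mode = 'reference';

CREATE TABLE tax_rate (
    country TEXT NOT NULL,
    city    TEXT NOT NULL,
    rate    NUMERIC(5,4) NOT NULL,
    PRIMARY KEY (country, city)
);

-- Standard table: an ordinary Aurora PostgreSQL table in the same cluster.
SET rds_aurora.limitless_create_table_mode = 'standard';

CREATE TABLE audit_log (
    id        BIGSERIAL PRIMARY KEY,
    logged_at TIMESTAMPTZ NOT NULL DEFAULT now(),
    message   TEXT NOT NULL
);
```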

So that's a bit of an overview of how you would use Limitless Database and bring your schema into it. To talk in more detail about the architecture and how we've built it, I'd like to introduce David Wyatt, a Senior Principal Technologist on Aurora.

When you create a shard group, there are just two things you really need to configure besides giving it a name: the max ACU and the compute redundancy.

The max ACU controls how big the shard group can get. Think of it as a budget for your shard group, with the aggregate of all vertical and horizontal scaling counting against that budget - it's how big everything can get in total.

Compute redundancy controls the HA nature of the shards. So we'll talk about this in some detail.

Let's look at a compute redundancy setting of zero. This is the base setting - no failover and no HA for the shards - and it's the base infrastructure you get.

As always, storage is 3AZ durable - everything in storage is 3AZ durable, no configuration needed for that.

The routers and the shards are spread across availability zones. Entropy happens. And it's likely that at some point during the life of your system, one or more components may failover from the AZ that they're in to another AZ.

And we decided it's best to just always keep these spread across AZs such that a failure of one out of an AZ that you were trying to concentrate everything in wouldn't create an abnormal performance profile.

Because the routers don't own any data, they don't need dedicated failover resources. If one router dies, another one of the routers that's in another AZ can pick up the work while we add a new router back to the fleet if necessary.

However, if a shard were to fail, the data on that shard would become unavailable until the host is reprovisioned.

So this is compute redundancy one: we add a failover target for each shard in a separate AZ. Here, shard S1 has a failover target in AZ2, and so forth. If one of these shards were to fail, the failover target can very quickly assume ownership of the workload and take over the storage that shard was managing.

Then compute redundancy two gives you a failover target in two AZs. You can have zero, one, or two based on how much resiliency you want. Of course, you do need to separately configure HA for your primary writer using typical Aurora reader nodes; those are in the cluster but not in the shard group configuration.

Now let's go into data distribution. Chris showed you an example of shard tables. I have a slightly different schema here based on the pgbench schema from the Postgres benchmarking tool.

The interesting thing on this slide is that you see the same syntax Chris used before: I set the create table mode to shard, I set a shard key, and I created the table. But now take a look at what this table actually is. If I do a describe of the table, it shows that it's a partitioned table, that it's hash partitioned, and that the partitions cover ranges of values.

So we use something called hash range partitioning to distribute your keys. We take the shard keys that you give us (and it can be a composite key), we hash them into a 64 bit space and then we assign ranges of this space to the shards.
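As a rough illustration of the idea - not the engine's actual internal hash function or fragment boundaries - standard PostgreSQL can already show how a key value lands somewhere in a 64-bit hash space, and a fragment then owns a contiguous range of that space:

```sql
-- hashtextextended() is a built-in PostgreSQL 64-bit hash function; Limitless
-- Database's real hashing and fragment ranges are internal and may differ.
SELECT hashtextextended('customer-619', 0) AS hash_value;

-- Conceptually, a shard owning the fragment [lower_bound, upper_bound) owns every
-- row whose shard-key hash falls inside that range:
--   lower_bound <= hash_value < upper_bound
```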

These ranges are called table fragments - that's how we refer to them - and ownership of each fragment is assigned to a specific shard. The storage itself is Aurora distributed storage, which is why we call it a data access shard: it doesn't have disk drives and isn't actually persisting the data itself, but we call it a shard for short.

We call the entity on the router that references a table fragment a table fragment reference - pretty straightforward - and it's just a bit of metadata but not any actual data.

Interestingly, the routers are all in sync, and our DDLs are strongly consistent and easy to reason about. So when you do a CREATE TABLE, it doesn't eventually propagate to every router; it's strongly consistent and ACID, with the same Postgres semantics as today.

Though it's not visible to the user, I do want to talk briefly about table slicing, which is something we do internally. Within a shard, we take the table fragment and slice it into finer-grained slices, which are sub-ranges of the range that the fragment owns.

Why do we do this? Well, first off, it gives us the ability to do better intra-shard parallelism because we have multiple slices we can work on and they actually form the basis of our horizontal scale out.

Speaking of which, let's take a brief look at how we horizontally scale out a shard. When we scale a shard, we call it a shard split. During a shard split, we move table slices from one shard to a new node based on our heat management's observation of load and other knowledge of the system. And of course, we honor the co-location rules.

One of the cool things we do is leverage the underlying Aurora storage, which has technologies like cloning, copy-on-write, and storage-level replication. So when we do a shard split, we're able to do almost all of the work down in the distributed storage system, which avoids putting extra load on the hot shard you're trying to split.

We can also add routers if need be, though it's a little less interesting since they don't hold any unique data - they just get new references to tables.

Chris also talked about reference tables. Again, the syntax: set the create table mode to reference and create your table. One thing to point out is that these are also strongly consistent, like everything else in this system. They are not eventually replicated - a write to a reference table is a distributed transaction that writes in parallel across all the copies and is atomically committed. So it's fully ACID.

The main purpose is for join pushdown. So if you have data that you frequently join with shard tables but is not itself sharded, we can push down joins to the shards and execute them more efficiently. And the use case is really for very frequently read or joined with data but not frequently written.
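For example, a query like the one below (using the illustrative schema from earlier) joins shard tables to a reference table on the customer's shard key, so the whole join can run locally on whichever shard owns that customer:

```sql
-- Join shard tables (customer, orders) with a reference table (tax_rate).
-- Because tax_rate is present on every shard, the join can be pushed down and
-- executed entirely on the single shard that owns cid = 619.
SELECT o.order_id,
       o.order_date,
       t.rate
FROM   orders   AS o
JOIN   customer AS c ON c.cid = o.cid
JOIN   tax_rate AS t ON t.country = c.country AND t.city = c.city
WHERE  c.cid = 619;
```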

If you were to use your high write throughput tables as reference tables, you wouldn't really benefit from the aggregate scaling of the system because every node would be doing the same duplicate work.

Now let's talk about transactions and how they work in Limitless Database.

We'll start with what we support: we support the read committed and repeatable read isolation levels, and we support them in a way that is consistent, as if it were a single system. This is very important - to the transaction, it looks like you are working in an unsharded, single-system database.

Here's a refresher before we go further:

In read committed mode, each query sees the latest data committed before that query began, so every query in the transaction can see different data.

In repeatable read, which is a stronger isolation level, you see the latest data committed before your transaction began, and every query in the transaction sees the same data.
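In plain PostgreSQL terms - and, per the talk, unchanged in Limitless Database - you pick the level per transaction like this (accounts is an illustrative table name):

```sql
-- Repeatable read: every statement in the transaction sees the same snapshot.
BEGIN ISOLATION LEVEL REPEATABLE READ;
SELECT sum(balance) FROM accounts;   -- first read
SELECT sum(balance) FROM accounts;   -- guaranteed to return the same total
COMMIT;

-- Read committed (the PostgreSQL default): each statement sees data committed
-- before that statement began, so the two sums below may differ.
BEGIN ISOLATION LEVEL READ COMMITTED;
SELECT sum(balance) FROM accounts;
SELECT sum(balance) FROM accounts;
COMMIT;
```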

These are standard database semantics, and Postgres honors them. Our design goal was to maintain those semantics in the distributed system, and that's hard. It turns out it's actually really difficult to do in a distributed database while maintaining performance - if you're willing to be slow, it's not that hard, but doing it while being performant and scalable is tough.

In Postgres, it works by keeping track of all running transactions on the node. So it's fine in a single node, but it's not gonna yield a scalable distributed system if everyone needs to know about everything.

So to get good scalability, you need parallelism and to get parallelism, you need to minimize or eliminate coordination between components. So a centralized transaction manager is out, you can't do that.

Transactions in a SQL-based relational database can be short or long, implicit or explicit, and they can touch a single shard or multiple shards. When a transaction begins, we have no idea what it's going to do, and we don't want to make it declarative on your part, where your application has to tell us at the start of a transaction what it's going to do for the scope of the transaction. We didn't want any changes there.

A SQL statement arrives at a router and then we have to send it out to different shards. It arrives at the router at one point in time, but it's going to hit different shards at different times, so there is no single time of execution.

We have to maintain order so that reads see writes regardless of which router they go to. And we need to be able to do consistent restores. As I said, you're not restoring a bunch of individual clusters - there's one backup, one snapshot, one point-in-time restore. And when you do a point-in-time restore, we have to bring back the entire system, entirely consistent at that point in time, with all of your transactions correct: no torn transactions, no missing transactions.

We solved this with bounded clocks using special clocks and a custom algorithm. And this is pretty complicated, but I've got some time and I'll try to make it easier with a thought experiment.

Everybody in this room is carrying at least one clock - everyone's got a phone, a watch, something. But none of these clocks are exactly the same. So we've got several hundred people here who each think it's a different time.

Imagine if you were to look at your phone's clock and very quickly write down the time you saw. You pass the time that you wrote down to your neighbor and you ask them if the time is before or after what their phone shows. If you did this very quickly, it's possible that the time you wrote down is actually in the future of the time that they think it is from their phone.

So while, from your perspective, you wrote this down in the past, from your neighbor's perspective you wrote down something that hasn't even occurred yet - it's in the future.

Now, while we can't all agree on what the current time is, we can agree that it's definitely after probably like 12:55 and it's definitely before 1:05. We can all agree on that.

So instead of writing down the current time, which is probably somewhere around 1pm, you wrote down 1:05pm and then you kind of held on to that piece of paper for a little bit and you waited a while and you waited till you were completely confident that everyone in this room thinks it is absolutely after 1:05pm.

So once you're confident that it's after 1:05pm, you hand the paper to your neighbor. Your neighbor looks at their phone, and their phone says 1:06 or 1:10 or something. And the ordering is absolutely guaranteed: the time you wrote down comes before the time they see on their phone.

Sounds a little weird, right? But EC2 gives us a capability to do this and not with 10 minute estimates but with microsecond level precision.

EC2 time sync delivers us three pieces of time information. It delivers us the current time, which is approximate system time, and tells us the earliest possible time, meaning that it is absolutely guaranteed that the current time is after this earliest possible time. And it tells us the latest possible time it could be, which means that the current time is definitely before this latest time. And these bounds between the earliest possible time and the latest possible time are in microsecond range.

So we integrated this deeply into Postgres. We changed the way tuple visibility in Postgres works: instead of being based on a monotonic series of transaction IDs and knowing which transactions are running, we created time-based snapshots and time-based commits.

I'll talk you through the details of this. We implemented time based knowledge of these bounded clocks within Postgres. This gives us global read after write - if you do a commit, you're guaranteed to see that data again, no matter which part of the system you're running on.

And we also implemented both a fast path one phase commit for shard local writes and a durable ACID two phase commit for multi-shard transactions.

Let's go through an example of how this works. Stick with me here - we've got three transactions...

Transaction one is going to do a select in repeatable read isolation mode. It selects the balance from an account where the branch ID is 619, which lives on one shard, and it gets a result. What happens here is that the router receives the select statement and gets its time, which is time 100. The protocol between the routers and the shards is not pure PostgreSQL protocol; it's an extended protocol, and we're able to pass down time contexts to establish snapshots at specific times. So the router tells the shard that owns 619 to execute the query at time 100, and it gets the result 704.

Now, transaction T2 is going to do a select, followed by an update of a different account on a different shard. In this case, the router gets time 103, executes on the shard that owns 801 with a snapshot of 103, and gets a result of 1 - so the balance for account 801 in this example is 1. Then it does an update, sets the balance to 1001, and issues a commit.

So what do we do at commit time? Well, the router knows that this transaction only touched a single shard, so it does a one-phase commit and delegates it all to the shard. The shard assigns a commit time of 110, and it will acknowledge the commit when all the writes are durable on disk, which is completely standard database behavior. But it also has a special trick: it waits to ensure that the earliest possible time is past 110 before it acknowledges the commit.

Now, that may sound like we're adding latency in the commit path. But we do this in parallel with the disk I/O, and given that this is a network-based system with network packets going back and forth, and EC2's time granularity is microseconds, we rarely if ever add any additional latency to guarantee this time difference.

Now, if we were to start another transaction that selects this data, we would get a time that's guaranteed to be beyond 110 - say 125. We execute the query at 125 and we see the updated row, 1001, as you would expect.

But now if I go back to T1 and select this data - remember, I'm in repeatable read. One of the cool things here is that transaction T1 has never talked to this shard before; it didn't establish any existing context on this shard. But it's able to say, give me a snapshot at time 100 in the past and give me the result. And because it's repeatable read, we get the correct result for that isolation level, which is a value of 1, not the updated 1001.
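A rough SQL rendering of that sequence, using an illustrative pgbench-style accounts table sharded by branch ID (the table and column names are assumptions, and the logical times in the comments are from the example, not anything you'd see in query output):

```sql
-- Session 1 (T1): repeatable read, snapshot established at time 100.
BEGIN ISOLATION LEVEL REPEATABLE READ;
SELECT balance FROM accounts WHERE bid = 619;    -- single shard, returns 704

-- Session 2 (T2): read, then update, on a different shard; one-phase commit.
BEGIN;
SELECT balance FROM accounts WHERE bid = 801;    -- snapshot at 103, returns 1
UPDATE accounts SET balance = 1001 WHERE bid = 801;
COMMIT;                                          -- shard commits at time 110

-- Session 3: a brand-new transaction started after that commit gets a time
-- guaranteed beyond 110 (say 125) and sees the new value.
SELECT balance FROM accounts WHERE bid = 801;    -- returns 1001

-- Back in Session 1 (T1): still repeatable read, still snapshot time 100,
-- even though T1 has never talked to this shard before.
SELECT balance FROM accounts WHERE bid = 801;    -- returns 1, not 1001
COMMIT;
```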

So that's multi-shard, distributed repeatable read. Now let's talk about read committed. If we can do repeatable read, read committed is easy, right? But there's actually a really cool case I'm going to show you here.

T1 is going to sum the balances of all the accounts in our bank - a sum operation. What happens is that the router receives this query, takes time 100, and then in parallel tells all the shards to execute the sum of their local data at time 100. They send back their partial results, the router aggregates them, and it tells you that the total amount of money in your bank is 10 million.

So that's just a read committed view of the sum of everything: 10 million.

What is T2 going to do? It's going to move money from one account to another, and those two accounts happen to be on different shards. It starts a transaction, updates one account to take out 500, then updates a second account to add the 500 back in. Again, these live on different shards.

Here we're in read committed mode. The router gets time 103 and executes an update statement on one of the shards at 103. Then, for the next statement - it's read committed - it gets a new time and executes at 107 on the other shard. Then it decides to commit.

So now we're in a trickier commit situation. The router sees that we've touched multiple shards, so we have to do a two-phase commit in order to be atomic. It sends a prepare message to the shards, in parallel, and the shards each individually respond with the time they prepared at: one prepared at 118, the other at 112.

The router collects these times, compares them with some other information and the clock it has now, and says, OK, this transaction is going to commit at 120. So the router locally commits the transaction at 120, does the same thing of acknowledging the client once it's durable and once the earliest possible time is past 120, and then it tells the shards to transition from the prepared state to the committed state at 120.

So all the shards will have the same commit time for their portion of the transaction. Now let's go over to transaction T3 and do the sum of account balances again. This is a timing thing, and it's interesting here: in this example, the router gets time 116, which is actually between the two prepare times.

It does the same thing with the aggregate sum, sends it out to all the shards at 116, and gets back 10 million. The interesting thing is that you can run these updates thousands of times a second, moving money between shards and committing in two-phase transactions, and you can run that sum separately - and that sum will always tell you 10 million, because we get a consistent view across all the shards when we do the sum.

So even if money has been debited from an account on one shard and credited to an account on another shard, and you look at it while the money is moving, you will always see either all of that transaction or none of it. Again, this is basic database isolation, but it's done in a distributed system, and that's the thing that makes it special.
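A sketch of that scenario in SQL, again against the illustrative accounts table (the account IDs are made up); the point is that the concurrent sum always sees the transfer entirely or not at all:

```sql
-- T2: move 500 between two accounts that live on different shards.
BEGIN;  -- read committed (the default)
UPDATE accounts SET balance = balance - 500 WHERE aid = 1;     -- executes at time 103
UPDATE accounts SET balance = balance + 500 WHERE aid = 9999;  -- executes at time 107
COMMIT;  -- two-phase commit; every shard records the same commit time, 120

-- T1 / T3: run this as often as you like while transfers are in flight.
-- Each execution picks one snapshot time and fans out to all shards at that
-- time, so the total is always 10,000,000 - never a half-applied transfer.
SELECT sum(balance) FROM accounts;
```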

The same goes for snapshot restore. If you were to do a point-in-time restore at, say, time 119, you would not get that update transaction; if you restored at 120 or 121, you would. Transactions come back guaranteed all-or-nothing.

So in summary, we give the same repeatable read and read committed semantics as Postgres, and all of the reads are consistent. There's no need for read quorums, because of the way the transactions and Aurora storage work, so you get fast reads even if nodes are failing over. Commits with single-shard writes scale linearly out to millions per second, so a well-sharded, well-behaved application will scale out very well. Distributed commits to multiple shards still offer all the ACID guarantees; because you're doing a two-phase commit there, there's more overhead than the one-phase commit, but it still works and it still performs.

All right, we'll go into SQL compatibility and queries a little bit. This system is fundamentally Aurora PostgreSQL. We are still PostgreSQL wire compatible, so you use all your same client drivers; the PostgreSQL parser is there and the PostgreSQL semantics are there. We have broad coverage of the surface area of the SQL engine. The PostgreSQL SQL surface area is massive - there's a lot of stuff there, and in a distributed system it's hard to implement all of the PostgreSQL semantics - but we have broad coverage over most of the SQL feature set, and we support a selected set of the extensions that are available in Aurora today.

A little bit about query execution basics and how we put this together. The foundation of the architecture for queries is PostgreSQL foreign tables. Foreign tables have come a long way in the last several major releases of PostgreSQL and give us a really good foundation to build on, but there are limitations in foreign tables, so we made a lot of enhancements inside the core PostgreSQL engine to get past those and to give us the kind of performance, scalability, and features we wanted in this offering.

We then tied it together by writing a custom foreign data wrapper built on all of these. That foreign data wrapper integrates with the time-based transaction system, with the fully managed shard group concept - our view of the topology, who's in and out of the cluster, shard splits, things like that - and with numerous performance optimizations.

A brief bit of query flow: queries flow from the router to the shards and back. The router receives the query from the client and does some initial planning. It has to figure out which shards might be involved in the query, and it has to figure out joins - which joins can be pushed down to a shard? A join with a reference table, for example, can be pushed down, and some aggregates can be pushed down, things like that.

Then it creates partial queries and sends them out to the various shards along with the transaction context it has set up. Each shard receives its partial query from the router and then does its local planning. One way to make the system scalable is to not have each router know everything: the shards know the details about what data they have, and they're in the best position to make decisions on index access paths, table scans, local joins, things like that.

So we keep that contained within the shard, which keeps the routers a little more lightweight. The shard does its local planning, executes, and sends the results back to the router. Depending on the query, that might be a straight pass-through back to the client, or the router may have to do final joins, filters, and aggregations.

Here's an example of a PostgreSQL explain plan. Naturally, you'll get the best performance, the lowest latency, and the best overall scalability when your queries are scoped to a single shard. To have the most success with a sharded system, you want the bulk of your operations to be well-behaved, targeted single-shard operations.

This is just an example where we use an optimized fast path: we identify single-shard operations and push the query down. We've also made some pretty good enhancements to the EXPLAIN output in PostgreSQL.
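For instance, a shard-key-scoped query like the one below is the kind of statement the router can fast-path to a single shard; on a Limitless Database endpoint the enhanced EXPLAIN output shows where the work went (the exact plan text isn't reproduced here, since it varies):

```sql
-- A single-shard query: the WHERE clause pins the shard key, so the router can
-- push the whole statement to the one shard that owns cid = 619.
EXPLAIN (VERBOSE)
SELECT o.order_id, o.order_date
FROM   orders AS o
WHERE  o.cid = 619;
```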

There's a lot of material in this presentation we won't have time to cover around the usability and management improvements we've made to give you a single view of the system.

We have a Limitless Database version of pg_stat_activity, which gives you a single view of all the activity going on across all shards and routers. We have a Limitless Database equivalent to pg_locks, which will show you lock tracing throughout the system.

And in the EXPLAIN plan, we can also bubble the query plan up to the router, so you can see just what plan the router chose without what the shards did, or we can bubble up what all the shards are doing as well in your EXPLAIN output.

The tooling integration extends to other places as well - Performance Insights, CloudWatch, things like that. It's a fully built-out ecosystem.

There's some other stuff we can do. I just talked about how single-shard operations are best, but some things are not single-shard and they're still great - those tend to be embarrassingly parallel things. For example, creating an index on a shard table: if you have lots of shards, the work fans out, each shard churns on its piece, and for each shard you add you get a reduction in time.

So index creates, and things like PostgreSQL ANALYZE and VACUUM, run very fast when you scale out, because you reduce the amount of work that has to be done on each system. Aggregates - sums, mins, maxes - can all be pushed down and run in parallel.

So for operations that are inherently parallel, sharding lets you put a ton of vCPUs and RAM against the problem and run very quickly.
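A couple of hedged examples of those embarrassingly parallel operations, using the illustrative schema from earlier - each shard handles its own share of the work:

```sql
-- Index build: every shard builds its local piece of the index in parallel,
-- so adding shards shortens the wall-clock time.
CREATE INDEX orders_order_date_idx ON orders (order_date);

-- Maintenance also runs shard-locally in parallel.
ANALYZE orders;
VACUUM orders;

-- Simple aggregates are pushed down: each shard computes a partial count/min/max
-- over its fragments and the router combines the partial results.
SELECT count(*), min(order_date), max(order_date) FROM orders;
```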

So that's the technical deep dive we have today. If you're interested in Aurora Limitless Database, I encourage you to join the preview; you can sign up using the link in the QR code on this slide.

So thank you all very much for taking time out of your day to attend today.
