Hello, everyone. So have you spent time managing database capacity, worrying about your estimates, worrying about downtime, provisioning extra resources for, you know, just in case scenarios? Because there's only so much you could worry about.
My name is Anna Zhang. I'm a product manager with Aurora. And in this session, we're gonna talk about exactly that. We're going to talk about how Aurora Serverless v2 can help you reduce your operational complexity for managing database capacity and save on cost.
So just a high-level overview of what you're going to see in the next 45 minutes to an hour. We're gonna start off with an overview and motivation to set the context for how Aurora Serverless v2 can help you. And then we'll jump directly into the auto scaling capabilities of Aurora Serverless v2.
Now, when it comes to your critical and demanding workloads, it's not just auto scaling capabilities that you need, you need all those other additional features of Aurora. So we're going to talk about how Serverless works with those features. We're going to touch a little bit on billing, and then we're gonna bring it all together with a customer case study to show you how Intuit has seen cost savings with Aurora Serverless v2, and we'll close it off with how you can get started today.
And throughout the session, we'll show a couple of demos so you can see things in action.
Now, Aurora Serverless v2 is a feature of Aurora, which is a MySQL and Postgres compatible relational database that is built for the cloud. You get the performance and availability that you expect from a commercial database, but at just 1/10 the cost. It is also the fastest growing service in the history of AWS.
Now, at a very high level, when you're building your application, you figure out how many database resources you need and you provision your database based on that. As your workload grows beyond your estimates, you provision a bigger machine and you do a failover.

Now, let's say your application is coming to end of life; you do the reverse. You reevaluate how many resources and how much capacity you need, and you course-correct.
Now, let's say your organization has hundreds or thousands of applications, each of which could be backed by a single database or multiple databases. Now, for managing capacity, instead of worrying about one database, you have those hundreds or thousands of databases to worry about.

What if you are an ISV, or you are running multi-tenant applications? You could be following a tenant-per-database model for complete isolation, or you could be grouping your tenants together to get better utilization.

Now, it is difficult for DBAs to keep up with hundreds or thousands of tenants. So again, all of a sudden you have to worry about all of these databases, and over time the requirements for these databases can also vary. So you have to think about the requirements of each and every one of them, and every time you have to switch machines, there's downtime involved.
So all of this is to say: database capacity management is hard.

Now, a simple solution could be: let's get the biggest possible machine, so you don't have to think about moving between machines back and forth, and just use that. That's not a realistic solution. Somebody has to pay for that.

So then you look for some tradeoffs. Now, if you want to closely follow the requirements of your variable workload, you need expert knowledge from folks who understand the database and understand your workload. And any time you have to move to a different size database, you have to take downtime.

So what if your estimate is wrong and you end up provisioning something that's less than the peak capacity you require? Here in the diagram, I'm showing provisioning for average capacity, but it applies to anything less than the peak your application requires: there's going to be application degradation. So that's not ideal.

So one thing you can do is just provision for peak capacity, so whenever you need those resources, you have them. But again, this is an expensive solution, because most of the time you're not using those resources, but you continue to pay for them.
So none of these scenarios are ideal.
Now, this is where Aurora Serverless v2 comes in. It is the fastest adopted feature in the history of Aurora. We launched last year and it continues to be the fastest adopted feature that we have.
What does it give you? It is an auto scaling, on-demand feature for Aurora. What it provides is this: as your application load varies, the database scales automatically; you don't have to do any of that. The database takes care of all of it, and you pay for whatever resources the database is consuming. And with this new version, we make sure that we follow your workload closely. So we give you a very frictionless database capacity management experience.
Now, for the next couple of slides, we're gonna go under the hood: how does auto scaling happen for Aurora Serverless v2?
Now, how should you think about it? When you are provisioning a database, you're thinking about a fixed-size instance. So for example, in Aurora, you could be provisioning an r6g.large, which gives you 16 GiB of memory and two vCPUs. For the purpose of this conversation, I'm going to refer to that as a provisioned Aurora database.
What happens in Serverless? Instead of picking this fixed instance size, you give it a capacity range and based on your workload, the database is going to scale within that capacity range.
So on the screen there's a screenshot of where you put the capacity range, the minimum and the maximum. Now, think of Serverless as just another instance that you're putting in your cluster. Instead of specifying a fixed instance size, you're giving it a range, but it's just like any other instance.

And it uses the same engine code as provisioned Aurora. It has the same security posture as provisioned Aurora as well. In terms of versions for Serverless, we support MySQL 8.0 onwards, and for Postgres, it's 13 onwards. And we have built it such that it's easy to switch between Serverless and provisioned.

So if you have provisioned Aurora or Serverless, you're not locked into either; you can go back and forth between them easily. And we'll talk more about that later on.
Now, I talked about how the database is going to automatically scale capacity. Now, how do we measure this capacity? So we use a high level abstraction for capacity measurement called Aurora Capacity Units or ACUs.
What do you get with it? With one ACU, you get 2 GiB of memory, and for CPU and networking resources, it is similar to what you get with provisioned Aurora.

In terms of starting capacity, the database can start as small as 0.5 ACUs, which is equal to 1 GiB. So that's a pretty small database. That's what you can start off with, and it can go all the way up to the maximum database capacity.

So at provisioning time, you give a minimum database capacity in ACUs and a maximum database capacity. The minimum is the starting point; this is where your database will start off, and at all times you will be paying for at least that minimum database capacity.
Think of the maximum as your budget control measure. At no point will the database scale beyond this maximum database capacity. And when you're thinking about what that number should be, keep in mind: what are the peaks that your workload experiences? What features are you using? Because you want to make sure that your database has all the resources it needs available, so it can scale up to that.

So these are a couple of things to think about when deciding what minimum and what maximum to set.
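To make this concrete, here's a minimal sketch of provisioning a Serverless v2 cluster with boto3, the AWS SDK for Python. The identifiers and the 0.5 to 16 ACU range are example values, not recommendations.

```python
import boto3

rds = boto3.client("rds")

# Create an Aurora PostgreSQL cluster with a Serverless v2 capacity range.
# MinCapacity is the floor you always pay for; MaxCapacity is the budget cap.
rds.create_db_cluster(
    DBClusterIdentifier="demo-serverless-cluster",  # example name
    Engine="aurora-postgresql",
    MasterUsername="postgres",
    MasterUserPassword="change-me",  # use AWS Secrets Manager in practice
    ServerlessV2ScalingConfiguration={
        "MinCapacity": 0.5,  # ACUs; the database starts (and idles) here
        "MaxCapacity": 16,   # ACUs; scaling never exceeds this
    },
)

# Instances in a Serverless v2 cluster use the special "db.serverless"
# instance class and inherit the cluster-level capacity range.
rds.create_db_instance(
    DBInstanceIdentifier="demo-serverless-instance-1",
    DBClusterIdentifier="demo-serverless-cluster",
    Engine="aurora-postgresql",
    DBInstanceClass="db.serverless",
)
```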
Now, that's just about capacity. When should the database be scaling? What factors does the database look at to make the decision of when to scale?

There are a couple of factors the database looks at, or deciders, as we call them.

The first one is CPU utilization. Now, it's not just the workload that you're running: if anything is running in the background that the database uses to keep itself healthy, any of the background processes, a classic example would be purge, the database will scale up if it requires those resources.

Memory utilization is another one. If there's load on any of the internal memory structures, the database adjusts capacity to make sure your workload has those resources available.

And the third factor is network throughput. If the database requires more network resources for anything, it's going to scale up to provide your workload those resources.

To make the scaling decision: if any one of these factors requires resources, the database is going to scale up. So let's say it needed a little bit of CPU resources; it's gonna scale up. But for scaling down, none of these factors should require resources before the database initiates the scale-down process.
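As a purely conceptual sketch (this is not Aurora's internal code, just the decision rule as described above), the asymmetry between scaling up and scaling down looks roughly like this:

```python
from dataclasses import dataclass

@dataclass
class Deciders:
    """The three scaling deciders described above."""
    cpu_needs_more: bool
    memory_needs_more: bool
    network_needs_more: bool

def should_scale_up(d: Deciders) -> bool:
    # ANY decider under pressure triggers a scale-up.
    return d.cpu_needs_more or d.memory_needs_more or d.network_needs_more

def may_scale_down(d: Deciders) -> bool:
    # Scale-down is considered only when NONE of the deciders need resources.
    return not should_scale_up(d)
```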
Now, that's about the factors we're looking at. How does the database actually scale?

We scale by what we call in-place scaling. We take a running database process and give it more resources: more CPU, more memory. Whatever the database requires, we give it those resources in place.

The other property is non-disruptive scaling. The way we built the architecture under the hood, we're not switching machines or anything like that, so there's no disruption cost while scaling. You could be running hundreds or thousands of transactions, and the database will still scale when it requires resources.

So we provide a non-disruptive scaling experience.
The third point I want to highlight is that Aurora Serverless v2 scales in fine-grained scaling increments. What we mean by that is, if your database requires just a little bit of resources, it has the capability to scale by just a small amount.

The smallest scaling increment is 0.5 ACUs. So, for example, if the database is at 12 ACUs and just needs a little push, it can scale up to 12.5 to give it those resources.

So at all points, we want to make sure the database follows your workload very closely, so you're not paying for extra resources.
And lastly, when your workload requires resources, the database should react instantly. So Serverless scales instantly to give your workload those resources; there's no lag.

And the way we designed the scaling experience, we want scaling to be predictable. It's instant, yes, but it also shouldn't behave one way one day and a different way the next.

So we provide a very consistent scaling experience, based on a token-based system. Imagine each instance gets a bucket of tokens characterized by two parameters: a bucket size and a refill rate.

Each instance size comes with a bucket that the instance can consume instantly. So if any of the deciders says the database needs resources, the database can scale right away with whatever is available in the bucket. That gives you the instant scalability.

Now, what if you need more resources? The bucket is also being refilled at a sustainable rate, so your instance can continue to scale up to make sure your workload has the resources it requires.

Now, what determines the bucket size and the refill rate? It's based on the size of your instance. There's a trend line showing how the bucket size changes: it varies with instance capacity. The bigger the instance capacity, the bigger the bucket, so the more it can scale instantly. The same goes for the refill rate; the bigger the instance, the faster the refill rate. All of which means: the bigger the instance, the faster your database can scale.

So when you're testing your application, look at the scaling rate. If the scaling rate isn't up to what your application requires at the current instance size, what can you do? You can change your minimum database capacity, because as I said, that's the starting point, and it changes the size of the database. So if you want a faster scaling rate, raise the minimum ACUs and you will get it.
So for example, if you have a launch coming up and you already know there is going to be a surge, you can increase that minimum to make sure you have that faster scaling rate.
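Here's a purely illustrative token-bucket model in Python. The bucket size and refill rate formulas below are made-up numbers; Aurora's actual values are internal and vary with instance capacity. But the mechanism, spend the bucket for instant scaling, then continue at the refill rate, looks conceptually like this:

```python
class TokenBucket:
    """Illustrative model of the Serverless v2 scaling budget (not real values)."""

    def __init__(self, capacity_acu: float):
        # Assumption for illustration only: bigger instances get bigger
        # buckets and faster refill rates, as described in the talk.
        self.bucket_size = capacity_acu * 2.0   # hypothetical tokens
        self.refill_rate = capacity_acu * 0.5   # hypothetical tokens per second
        self.tokens = self.bucket_size

    def grant(self, wanted_acu: float, elapsed_sec: float) -> float:
        # Refill at a sustainable rate, capped at the bucket size.
        self.tokens = min(self.bucket_size,
                          self.tokens + self.refill_rate * elapsed_sec)
        # Instant scaling spends whatever is currently in the bucket.
        granted = min(wanted_acu, self.tokens)
        self.tokens -= granted
        return granted
```

This also shows why raising the minimum capacity speeds up scaling: a larger instance starts with a bigger bucket and a faster refill rate.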
Now, that's how the database scales. Let's go into a little more detail about the buffer pool: what happens to it during scaling? The buffer pool is the database's primary cache, and it also scales when the database scales. The parameters you're looking at are innodb_buffer_pool_size for MySQL and shared_buffers for Postgres. We scale these parameters when the database is scaling and requires more resources. In terms of memory allocation, most of it goes to the buffer pool; around 25% goes to heap.

Now, when the database needs to scale down, it does so through a combination of least-frequently-used and least-recently-used algorithms.
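As a rough worked example of that memory split (approximate and for intuition only; the exact allocation is internal to Aurora):

```python
def approx_buffer_pool_gib(acus: float) -> float:
    # 1 ACU = 2 GiB of memory; roughly 75% to the buffer pool and
    # around 25% to heap, per the split described above (approximate).
    total_gib = acus * 2.0
    return total_gib * 0.75

print(approx_buffer_pool_gib(8))  # 8 ACUs = 16 GiB total -> ~12 GiB buffer pool
```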
Now, let's quickly look at how the whole buffer pool scaling works with a simple animation. As pages are read, they get hotter and hotter. And when you have new pages coming in from storage, you don't know what the future access pattern is, so we want to make sure they're not immediately evicted. They come in as warm pages, and as they get read, they become hotter and hotter, and those are the least likely to be evicted. It's the pages depicted in gray, the cold pages, that are most likely to be evicted when the buffer pool needs to scale down. So we evict those pages, reclaim memory, and shrink the buffer pool.
Now, I talked about how scaling up is instant, because when you need those resources, you want to get them right away. But for scaling down, we want to be cautious; we want to make sure we're not prematurely evicting pages that you may need, forcing you to go fetch them again. So for scaling down, we take a more cautious approach, and in the demo you'll see this intentional stepwise decrease during scale-down.

Now, the database can continue to scale down until it reaches the minimum that you configured. But there could be scenarios where it doesn't come down to the minimum. Why is that? We talked about how, if your database needs resources, whether for the queries you're running or for background processes, the database scales to the level where it can sustain that work.

So there are some cases where you're not running any workload, but something else is going on in the database. It could be background processes such as vacuum or purge that require the resources. If your minimum is, let's say, 0.5 ACUs, and the resources required are more than that, then the database is not gonna come down to the minimum you set; it's gonna stay at the level where it has those resources.

The second factor could be features. Again, if your minimum is 0.5 ACUs and you're using features such as Global Database, which require more resources, the database will stay at the level where it can operate that feature healthily. Performance Insights is another example of a feature that may require more resources than a 0.5 ACU database can sustain.

Another factor is if you have a large storage volume attached to your database. In those scenarios, the database will also stay at the level where it can sustain that. And then there are failover priorities.

If you have set a priority tier of 0 or 1, and we're gonna talk in more detail about priority tiers later on, that could be another factor keeping the database from going down to the minimum you set.

So overall, what you should keep in mind is that the database capacity will stay at the level where the database has all the resources it requires for your workload. You can set the min and the max, but the database will automatically scale to make sure it can sustain that workload.
Now, let's quickly talk about how parameters are handled. I've given away half of this slide already, because we talked about buffer pools. So there are some parameters that we automatically adjust during scaling; the buffer pool is one of them, and for Postgres, shared_buffers is that parameter. Then there are other parameters that change when you change your maximum capacity setting, for example max_connections. And of course, always look at the documentation for the up-to-date list; here I'm just showing a few of them for Postgres, and similarly for MySQL. So: parameters that change during scaling, and parameters that change when you change the maximum capacity.
OK. So that's all about how scaling happens. But is there going to be enough capacity lying around for your database to actually scale? We can understand everything about scaling, but what if there's no capacity? How do we make sure there is enough capacity available?

First, we do capacity planning per region; with our experience of running Aurora and RDS, we can make projections based on that. So that's one. Secondly, we do call this serverless, but at the end of the day, we are running on actual machines with finite capacity.

Now, what if an underlying host runs out of capacity? What do we do then? Aurora has the capability to move instances around non-disruptively to make more room within that host, so your database can scale up. And while we're doing this background movement of instances, we preserve state: we preserve connections, your buffer pool. So from your side, you won't see the difference. We're just moving things around to make sure there's enough capacity available for your database to scale up.

Now, this is just a little sneak peek into how we manage heat at scale. What you're seeing on the graph is host numbers on the x-axis and capacity on the y-axis. Each blue dot represents an instance; the purple dots scattered all over are very large instances, and a vertical line composed of blue dots is a single host containing all of those instances. This shows that when a host reaches a certain threshold, we move instances around to make sure there's enough room. A black arrow will appear to show how instances are moved: away from the host, to leave more room for the other databases on it that were growing. Again, just a small glimpse to give you an idea of how it all works at scale.

Now, one thing is understanding how scaling works; but what can you do to monitor your database? With Aurora, there are various capabilities that give you monitoring: for your instance, for your operating system, for the database engine. With Aurora Serverless v2, all of this still works, and we have built additional metrics and features on top of that, so that what you need for Aurora Serverless is available. We're gonna touch on a couple of key metrics only.
So let's start with CloudWatch. We have a metric called ServerlessDatabaseCapacity, which represents the current capacity of your instance. When you're provisioning, you only give a minimum and a maximum; at any point in time, what is the capacity of the database? This is the metric that tells you that. So here, you see constant capacity in ACUs, then it scales up, you see that, then it scales down. Then we have ACUUtilization, which is the current capacity as a percentage of the maximum.

What this tells you is: if it's getting too high, you're reaching 80%, 90%, whatever threshold you set, either you go optimize your workload so it requires fewer resources, or you increase your maximum database capacity so you have additional resources available.

Another one is CPUUtilization, which again is a percentage of CPU based on the maximum ACUs. Again, if it's coming close to a threshold and your database requires more resources, you can raise the maximum database capacity to give it more.

And the last one I want to touch on is FreeableMemory. You've seen this in provisioned Aurora as well; it's the total memory available to the database engine. If it reaches a point where not a whole lot of memory is available, again, you can increase the maximum database capacity to give it more resources.
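As a minimal sketch, here's how you might pull the last hour of ServerlessDatabaseCapacity readings with boto3 (the instance identifier is an example):

```python
import boto3
from datetime import datetime, timedelta, timezone

cloudwatch = boto3.client("cloudwatch")

# Fetch the last hour of current-capacity readings (in ACUs) for one instance.
resp = cloudwatch.get_metric_statistics(
    Namespace="AWS/RDS",
    MetricName="ServerlessDatabaseCapacity",
    Dimensions=[{"Name": "DBInstanceIdentifier",
                 "Value": "demo-serverless-instance-1"}],
    StartTime=datetime.now(timezone.utc) - timedelta(hours=1),
    EndTime=datetime.now(timezone.utc),
    Period=60,  # one datapoint per minute
    Statistics=["Average"],
)
for point in sorted(resp["Datapoints"], key=lambda p: p["Timestamp"]):
    print(point["Timestamp"], point["Average"], "ACUs")
```

The same call with MetricName="ACUUtilization" gives you the percentage-of-maximum view described above.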
The next feature I want to touch on is Performance Insights, which gives you a simple yet powerful tool to monitor your workload in terms of active sessions. You want to pay attention to this dotted line, the estimated vCPUs, that I've highlighted; the text is super tiny. Where it goes up is where scaling is happening. You see that blue bar because the buffer pool scaled up but still needs to warm up; once it warms up, the blue bar, which is your data file reads, becomes smaller. So Performance Insights can help you identify bottlenecks as well, and from there you can optimize your workload or increase your maximum ACUs to give your database more resources.
Now, let's look at a simple demo to see how you can get started with Aurora Serverless v2 and see scaling in action.
This is the RDS console. Creating a database here is very similar to what you do today with Aurora: pressing the orange button, you get into the same workflow for creating a database. Nothing is different. You have standard create, you have easy create; the same options you're already familiar with in Aurora. For engine type, select Amazon Aurora, and when you're looking at editions, you have the MySQL-compatible edition and the Postgres-compatible edition. Serverless works for both of them.

For MySQL, as I mentioned, the supported versions are 8.0 and onwards, so new versions that come in will be supported at the same time as provisioned Aurora. If we look at Postgres editions, it's Postgres 13 and onwards, and as newer versions arrive, those are gonna be supported with Serverless as well.

So you give it a name and a password. Now, instance configuration is where you'd normally select a fixed instance, an r5.large or any of these. This is where the setting changes: now you're selecting Serverless, and you get this range that you have to specify, with defaults provided. You have the minimum ACUs, which is the starting point for your database, and the maximum ACU setting. When you're just trying it out, you can go with the minimum and maximum available.

For availability and durability, for this demo I'm going with all defaults, and I'll leave the other options as they are. So this was to show that to create a Serverless v2 instance, you just have to specify that capacity range; everything else can stay the same. And when you refresh your cluster view, you will see that you have a Serverless v2 instance, and it shows the range you selected.
To show scaling, we're using an app where you get a question and you select an answer; it's an event-driven app, and this is just a sample question to show it. When workload comes to the database, the database should adjust capacity to follow it.

I'm gonna use a sample data set here, and we're just initiating the start of the workload. What you should see is that when this workload starts, the database adjusts capacity, which we're going to observe through this CloudWatch dashboard.

Now, what does the blue line show? That's your database capacity in ACUs. When the workload starts, you're gonna see two more lines, red and brown: one is users coming in and registering, and the other is users casting their votes. Give it a second and you'll see the workload appear, and then you should see this blue line respond to that incoming workload.

So here the red and brown lines overlap; that means the workload has started, and you see the blue line responding as the workload increases. You see this almost vertical blue line: this is where instant scalability happens. Whatever is in the bucket, the database gets, and it scales up to the point where it can sustain the workload.
If more resources are needed, it will continue scaling, either to the point where it can sustain the workload or up to the maximum that you have configured. More users are coming in and casting their votes, and the database capacity stays at the level where it can sustain this workload.

Now the red and brown lines have gone down, meaning users have stopped registering and no voting is happening. But the database capacity, the blue line, didn't drop vertically right away. We pause, because we want to make sure we don't evict those buffer pool pages prematurely. So you see this intentional step-down in database capacity: we scale down a little bit at a time to check whether your workload still needs that capacity, and then it continues to scale down. And then you get the final results; it's a pretty even mix, so you can tell the numbers are fudged. I didn't want to take any sides.

So this is how scaling works with Aurora Serverless v2: you get instant scale-up, which you saw as an almost vertical line, and for scale-down it takes a very cautious approach to make sure resources stay available and we're not prematurely evicting buffer pool pages.
Now, that was all about how auto scaling works. For your critical, demanding workloads, what about all those other features you're using with Aurora? How do those work with Serverless v2?
First, let's look at where Serverless actually fits in. This is a very high-level overview of the architecture, where I want to highlight some key points. With Aurora's architecture, you get separation of storage and compute. In this diagram, what you're seeing spans two availability zones: you have provisioned Aurora instances, r6g instances in AZ1 and AZ3, on top of the storage layer. Where does Serverless fit in? At the compute layer: any one of these nodes could be Serverless.

So you could add a Serverless reader; it's just like any reader that you have with Aurora. If your reads are variable in nature, you can make all your readers Aurora Serverless v2, or if your workload is spiky in nature overall, you can make all of these instances Aurora Serverless v2, so your entire cluster is a Serverless cluster.

This basically shows that you can mix and match your provisioned instances and your Serverless instances. Now, what about storage? We use the same storage layer as provisioned Aurora. Those nodes, the writer and the readers, operate on the same storage layer; we're not replicating storage per instance. And storage is spread across three availability zones, always. This is what gives you the high availability that Aurora has.
We save six copies of data across these availability zones. If something happens to one copy, no worries, you won't even see the difference; there are five other copies lying around, and keep in mind you're only paying for one copy. Even in scenarios where a whole AZ goes down, again, you won't see the difference, because there are four other copies there, and everything just continues as it is. As your workload grows and the data size grows, the storage layer also automatically scales up; you don't have to do anything.

If the data size reduces for some reason, the storage layer will also shrink. All of this happens automatically, and you only pay for the storage you're actually using.
Now, what about high availability and read scalability with Aurora Serverless v2? With Aurora, you get the capability to create up to 15 read replicas, or readers, and with Serverless v2 it's exactly the same; you can use any combination, mix and match, depending on your workload. For Serverless, what you do is specify the capacity range at the cluster level, and all instances inherit that capacity range.

With Aurora, you have the capability of automatic failover, where if something happens to the writer, it fails over to a reader instance. The way it determines which one is based on priority tiers between 0 and 15: the smaller the priority tier, the higher the priority.

Now, in the provisioned world it's easier, because you know the size of the database; there's no guesswork involved. If something happens, you fail over to that particular machine. But with Serverless, you don't know what size that instance is at.
So what we have done is, for priority tiers 0 and 1, we have made it such that if your reader instance has a priority of 0 or 1, it's going to follow the capacity of the writer, commonly known as reader-follows-writer behavior.

So if you have set those priority tiers, that instance could be running its own workload, but it will also make sure it's ready to take over for the writer in case something happens. For read scalability, if you don't want this coupled behavior, all you have to do is set your priority tier to 2 or higher, any number from 2 to 15.

In this example, I've shown the extreme case of 14 and 15. These reader instances won't look at what's going on with the writer. And, same as provisioned, you can spread these Serverless instances across different availability zones for high availability; a Multi-AZ Aurora cluster is backed by a four-nines availability SLA, and the same applies for Serverless. None of that changes.
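As a small sketch (instance names are examples), the priority tier is the PromotionTier parameter when you create an instance with boto3:

```python
import boto3

rds = boto3.client("rds")

# Tier 0-1 readers follow the writer's capacity (reader-follows-writer),
# so they're always sized to take over on failover.
rds.create_db_instance(
    DBInstanceIdentifier="failover-ready-reader",  # example name
    DBClusterIdentifier="demo-serverless-cluster",
    Engine="aurora-postgresql",
    DBInstanceClass="db.serverless",
    PromotionTier=1,
)

# Tier 2-15 readers scale independently of the writer, based only on
# their own read workload (the decoupled behavior described above).
rds.create_db_instance(
    DBInstanceIdentifier="reporting-reader",  # example name
    DBClusterIdentifier="demo-serverless-cluster",
    Engine="aurora-postgresql",
    DBInstanceClass="db.serverless",
    PromotionTier=15,
)
```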
In this animation, I want to quickly show you how high availability and read scalability work with Serverless instances. Here you're seeing two applications, a primary application and a reporting app, and you're seeing endpoints. Aurora provides the capability to create various endpoints: you have the cluster endpoint, your reader endpoint; here I'm showing two different types.

So you have a cluster endpoint. The application talks to the cluster endpoint, which always connects to your writer instance. Workload comes to the writer instance, and it scales up. The writer is scaling up; now, any readers in this cluster that are priority tier 0 or 1 will also scale up, following the reader-follows-writer behavior.

You could also be grouping together other reader instances for a separate reporting app that has its own, differently spiky behavior. Here we have a reporting app that talks to a custom endpoint, which in turn talks to those two instances in this particular example.

Now, you saw that when the writer scaled, nothing happened to these instances. But when load comes from the reporting app, with their priority tiers of 14 and 15, these instances scale up on their own. So if you don't want the coupled behavior, make sure to set your priority tiers to 2 or higher, 2 to 15.
Now, if your application needs to be global, or you want disaster recovery across regions, what you do is enable Global Database. What that does is replicate storage to a secondary region; this is storage-layer replication, and under the hood we also create replication servers and agents that you don't see. With Global Database, you get low-latency read scalability across these different regions.

In terms of lag, think of it as typically less than one second. Global Database gives you the ability to create up to five secondary regions, and in each region you can again create up to 15 reader nodes, so in total you can have up to 90 readers to scale your reads.

It also has the capability to provide write forwarding for occasional writes from your secondary regions back to your primary region, and it's available for both engines, MySQL as well as Postgres. And in case of a disaster, if one region goes down, a secondary region can take over; typically you'll see a downtime of less than one minute. So where does Serverless fit in?
Any of these compute nodes could be Serverless nodes as well; I've marked Serverless with the vertical arrows. When these nodes are idling and nothing's going on, you only pay for the minimum capacity. In case of a region failure, when the secondary takes over, the Serverless instances will scale up to support that workload. And if you're using these regions for read scalability, these instances will scale up when there's workload on them.
Now, what if you're running applications that open and manage lots and lots of connections, perhaps to get quick response times? This can happen if you have serverless applications, Lambda functions that can scale out to hundreds or thousands of concurrent connections, and you want your database to keep up with that.

You may not be using all these open connections, but they're still consuming database resources, and you want to save your database resources for their actual job: running your queries, running your workload.

What you can do in this scenario is use RDS Proxy, a fully managed, highly available database proxy that is supported for Aurora Serverless v2 as well. You place the proxy in front of the database, and it takes the brunt of opening and managing all of these connections.
What it does is take lots of those application connections and map them to a smaller number of database connections by sharing connections. It also gives you better availability: you get up to 66% faster failover times, because there are no DNS propagation delays.

The proxy is constantly watching what's going on with the writer. If anything goes wrong and there's a failover, RDS Proxy connects to the new writer, and in the meantime it holds all of those application connections open for you. So you get better availability for your application.

It also gives you better security by integrating with AWS Secrets Manager, and you can use IAM authentication with RDS Proxy as well.

So all of those RDS Proxy features work as-is for Serverless too.
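As a hedged sketch (the names, ARNs, and subnet IDs below are placeholders), putting a proxy in front of the cluster with boto3 looks roughly like this:

```python
import boto3

rds = boto3.client("rds")

# Create a proxy that pools application connections in front of the database.
# The secret holds database credentials; the role lets the proxy read them.
rds.create_db_proxy(
    DBProxyName="demo-proxy",
    EngineFamily="POSTGRESQL",
    Auth=[{
        "AuthScheme": "SECRETS",
        "SecretArn": "arn:aws:secretsmanager:us-east-1:123456789012:secret:demo",  # placeholder
        "IAMAuth": "DISABLED",
    }],
    RoleArn="arn:aws:iam::123456789012:role/demo-proxy-role",  # placeholder
    VpcSubnetIds=["subnet-aaaa1111", "subnet-bbbb2222"],  # placeholders
)

# Point the proxy's default target group at the Aurora cluster.
rds.register_db_proxy_targets(
    DBProxyName="demo-proxy",
    DBClusterIdentifiers=["demo-serverless-cluster"],
)
```

Applications then connect to the proxy's endpoint instead of the cluster endpoint.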
Now, what if you're doing analytics, saving all of that data to get insights from it? In those scenarios where you're building an analytics pipeline, Aurora makes it easy for you with zero-ETL integration, so you don't have to build complicated pipelines to get insights from your data.

Let's say you're using Lambda functions, and you're using Redshift Serverless to get insights out of your data. In this end-to-end pipeline, for your transactional database, you can use Aurora Serverless v2.

So you can go all the way from data to insights in a serverless pipeline, without building any of those complicated integrations, thanks to zero-ETL. And you're only paying for the resources you're using, on the Aurora Serverless side as well as on the Redshift Serverless side.
Now, how does billing happen? What's different about Aurora Serverless v2?

In terms of billing for Aurora, you get billed on various dimensions: instances, storage, I/Os, backup storage. Everything stays the same for billing except the instance dimension.

Aurora Serverless v2 falls under instance billing: the unit is ACU-hours, and we do per-second billing for Serverless.
A simple example: if you're using 10 ACUs per hour for a month, that's 730 hours, and if you look at the price for your region, in this example we're using US East, you get $876 for the month.

So for billing purposes, the instance dimension is where you look; everything else stays the same.
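The arithmetic behind that example, as a quick sketch (the $0.12 per ACU-hour figure is the us-east-1 rate that makes the example work out; always check current pricing for your region):

```python
ACU_HOUR_PRICE_USD = 0.12  # assumed us-east-1 rate; verify for your region
HOURS_PER_MONTH = 730

def monthly_compute_cost(avg_acus: float) -> float:
    # Per-second billing averages out to ACU-hours over the month.
    return avg_acus * HOURS_PER_MONTH * ACU_HOUR_PRICE_USD

print(monthly_compute_cost(10))  # 10 ACUs around the clock -> 876.0 USD
```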
Now, we also introduced another cluster configuration that we call Aurora I/O-Optimized. What this gives you is better price predictability for I/O-heavy workloads, and you also get better price performance through storage-layer enhancements we made for this feature.

What happens is that you get billed for your compute and your storage, and there are no charges for read and write I/Os. There's an uplift on the compute and the storage, and as a result of that, you're not getting charged for read and write I/Os.

And where does Serverless come in? As I mentioned, it's at the compute layer, so there's a 30% uplift on compute and a 125% uplift on storage consumption.

This way you get predictability on your price for those I/O-heavy workloads. And I/O-Optimized is available for all flavors: it's supported for provisioned as well as Serverless.
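A rough break-even sketch under those uplifts (illustrative only; it ignores backups and other billing dimensions, and the dollar inputs are made up):

```python
def standard_cost(compute: float, storage: float, io: float) -> float:
    # Standard config: pay compute + storage + per-request I/O charges.
    return compute + storage + io

def io_optimized_cost(compute: float, storage: float) -> float:
    # I/O-Optimized: 30% compute uplift, 125% storage uplift, no I/O charges.
    return compute * 1.30 + storage * 2.25

# Example month: $876 compute, $100 storage, $600 of I/O charges.
print(standard_cost(876, 100, 600))  # 1576.0
print(io_optimized_cost(876, 100))   # 1363.8 -> cheaper for this I/O-heavy mix
```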
Now, let's bring it all together to see how one of our customers got cost savings by using Aurora Serverless v2.

Intuit is a financial management company; think Credit Karma, TurboTax.
They started working with us a while back. They had a set of requirements around what they needed from their database; they already had provisioned databases, and they were looking for more cost savings.

The key requirements were around read scalability: they want to make sure they can scale their reads, that if there's a primary node failure, the database recovers from it, and that if there's a region-wide failure, disaster recovery is also available.

And with all those features, they want it delivered in a cost-effective way, so that at no point are they overpaying for their resources.

So, with all those requirements: they were already using provisioned Aurora, all these features worked well for them, and they were looking for ways to reduce cost.
This was the previous architecture. There are two regions; I'm only showing one of the clusters, spanning two regions. They have a cluster endpoint talking to the writer, which is a 4xlarge, and a reader endpoint in the same region with another provisioned instance. And for disaster recovery, there's another provisioned instance, also a 4xlarge.

So at all times they're paying for that provisioned disaster recovery instance, even though they're not using it. And at the bottom, you see the storage layer and so on.
So what they did was start evaluating Aurora Serverless v2, to see whether they could get cost savings for variable workloads where they're paying for resources they're not using.

They tested it, and they started seeing cost savings. In their after architecture, you see that they replaced the provisioned instance in the secondary region: the disaster recovery instance is now Aurora Serverless v2, and it stays at the minimum when it's not being used.

So they're only getting billed for that minimum capacity; when it gets used, the instance scales up to support the workload. On the primary side, they also added Serverless, with future plans of adding more.
So in this architecture, they're using some of the key Aurora features, deploying across multiple AZs, disaster recovery with Global Database, and they added Serverless v2 to get the cost savings.

With this architecture, they moved not just their non-production databases but their production databases as well, and were able to get up to 55% cost savings.
Now, how can you get started? In the demo, I showed how to create a new Serverless database from scratch; it's the same as provisioned, you just specify the database capacity range.

Now, what if you already have a provisioned cluster? To get started there, you can create a reader in your existing cluster. It's your typical reader; it becomes a Serverless v2 reader when you specify the capacity.

And once you've tested that everything works as expected, you can just do a failover, and your Serverless instance becomes the writer for that cluster. You can then keep the provisioned reader around, or you can delete it.
Now, that was if you have a provisioned cluster. With the launch of Aurora Serverless v2, we have two versions of Aurora Serverless: v1 and v2.

So if you are one of our customers using Aurora Serverless v1, how can you upgrade that cluster to Serverless v2 today?
We have made optimizations recently that can help simplify that for you. We've added additional parameters to the ModifyDBCluster API that let you do an in-place change from Serverless v1, the previous version, toward the new v2.

So you don't have to do backup and restore anymore. With this in-place change, think of it as similar to a failover: less than 30 seconds of downtime, and from Serverless v1 you get to a provisioned Aurora cluster.

As we talked about, Serverless v2 is built on top of a provisioned cluster, so this is where the in-place upgrade to provisioned Aurora comes in. Then the versions are different: with v1 it's MySQL 5.7, for example, and you have to upgrade to 8.0.

You can use a blue/green deployment to safely and securely do that version upgrade. And then the last step is just to add a reader, which can be a Serverless reader.
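As a hedged sketch of that first step with boto3 (the cluster name and interim instance class are examples; check the documentation for the exact procedure for your engine and version):

```python
import boto3

rds = boto3.client("rds")

# Step 1: convert the Serverless v1 cluster to a provisioned cluster in place.
# AllowEngineModeChange opts in to the engine-mode conversion.
rds.modify_db_cluster(
    DBClusterIdentifier="legacy-v1-cluster",  # example name
    EngineMode="provisioned",
    AllowEngineModeChange=True,
    DBClusterInstanceClass="db.r6g.large",  # interim provisioned size, example
    ApplyImmediately=True,
)
```

From there, the blue/green deployment handles the version upgrade, and adding a db.serverless reader plus a failover (as shown earlier) completes the move to Serverless v2.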
I'm going to show you a quick demo of the different steps; it's going to go by really quickly. The upgrade itself takes a little bit of time, but I've sped it up in the video so we can cover it all.

You're starting with the previous version of Aurora Serverless, which is v1. All you have to do is run the modify command with those additional parameters we added; that's what does the in-place change.

Once you run it, the cluster configuration changes: it now becomes a provisioned Aurora cluster. You have the cluster with one instance, and you can run a blue/green deployment on that cluster to upgrade the version; in this example, I'm going from 5.7 to 8.0.
So you have a blue cluster and a green cluster, and the green cluster is where the upgrade happens. This is where I'm going to add a Serverless instance, and I'm gonna do all my testing on that green cluster.

Once I refresh, as you can see, there are two instances in the cluster, a writer and a reader. The reader is the Serverless reader we just added. We're going to do a failover to make it the writer; again, less than 30 seconds of downtime.

So this green cluster now has a primary that is a Serverless instance with the capacity range we configured. You can test this to make sure everything's working as expected, and then, for your blue/green deployment, you can do the switchover once you've made sure everything is working according to plan.
Now, to summarize and bring it all together: what you saw in the last 45 minutes was how Serverless can help you reduce your operational complexity and save on cost.

It does so through automatic, in-place, non-disruptive scaling. We built it on the same architecture as provisioned Aurora, so it works exactly the same way, with the additional auto scaling capabilities.

So all of the features are also supported: Multi-AZ deployments, read replicas, Global Database. You briefly saw how billing works.

You saw how one of our customers, Intuit, was able to get cost savings by using Aurora Serverless v2. And lastly, you saw how you can get started today, using the APIs or a few clicks in the RDS console, to quickly get going with Aurora Serverless v2.
That is all I have for today. Thank you very much. I'm really excited to see what you build with Aurora Serverless v2, and don't forget to do the survey.

Thank you so much, everyone.