Dive deep into different AWS DMS migration options

So what is today's session? Is everyone in the right one? Yeah. Migration options, an AWS DMS deep dive. This is a 300-level session, which means we are going to get into some of the tech. I know over the course of the week you're probably going to get what's known as death by PowerPoint, and we're going to try not to have too much of that in this session. We've got some slides and a demo. We wanted to have two demos, but it was getting to the point where this was going to run over 60 minutes, so we've cut it down to one. But it's a pretty good one; I think you'll find it interesting.

So let's see what it is we're gathered here today to talk about. It's not a wedding, I'll tell you right now; it's not going to be anything remotely close to that interesting. We are going to talk about migrations, and not the African savannah sort either. I know, maybe I'm leading you a bit astray with this one, but we're going to focus on migrations of the data type, and on replications as well, because honestly, the difference between a migration and a replication is kind of saying tomato, tomahto. It's much the same thing.

So we know the migrations that animals go on can be challenging, long, arduous, what have you. There might be some similarities with data migrations, but we try to make them as simple as possible. We're all about trying to give you the easy button for your migration, to make it straightforward. What we're here to do today is help you understand our migration tooling, primarily the architecture behind DMS: how it works, what different options you have to migrate, when to use which one, what's happening behind the scenes, and what factors you need to consider. And maybe I should be a little clearer about what I mean when I say a migration. A migration is moving your data. It could be moving your data from on premises to the cloud. It could be moving data inside the cloud. It could be moving data between different database types inside the cloud. Really, just think of it as moving data and keep it simple. We're also going to spend some time talking about what's new since last year. Time hasn't stood still; we've added a lot of new features and capabilities to the product, so if you've used DMS in the past, it could be a little bit different right now.

And as I said, we have a demo to give you a break from the PowerPoint. So let's just jump right in, get a deeper understanding of it, and see how we can help you with your migration projects in particular. That's what we're here to do today: to see how we can apply this technology, the offerings that AWS has, to your challenges and your migration projects.

My name is John. I lead product for AWS DMS. And over there we have Ryan. He leads our technical field community, so he spends a lot of time in this as well. We're going to tag-team through this session, and hopefully you come out having learned something new.

All right. Does anybody have this button on their keyboard? I'd like one. I don't have one yet. I mean, a sticker over the Enter key doesn't quite do the same thing, right? So we're getting to the point where we can have something similar. We're not there quite yet, but we're trying to make it as straightforward as possible. So again, we're going to spend some time talking about the different options for your migration today.

All right, as I alluded to, we are going to be talking a fair bit about the Database Migration Service. It focuses on migrating data from one database to another, with a wide variety of sources and targets. I think it's something like 21 different sources and 19 or 20 targets; I can't keep it straight, so don't quiz me on it. Our web page is the best place to look, because I can tell you that list changes every year. Again, it can be a migration from on premises to the cloud, or from an Oracle database on premises to one on EC2, maybe from that same Oracle database to RDS. Or you could go from an Oracle database to Aurora, or to a data warehouse, or something like that. It doesn't really matter. It's just about going from one database or data warehouse type to another.

And as I said, one of the main strengths of this service is that it's actually all about replications. A little side story I've told a few times in the past, so apologies if you've heard me say this before. When you launch a service at Amazon, you go through a number of different bureaucratic processes, shall we say, one of which is deciding what you're going to name the thing. I've been with Amazon for quite a while now, it's scary, we're getting on 10 years, and I wrote the narrative and was in the room with Andy Jassy, who was our CEO at the time, trying to name this service. We were pushing for DRS, the Database Replication Service. And he said, no, no, we want people to migrate to the cloud. Remember, this is going back quite a while now. So we're going to call it DMS. Don't let the name confuse you: it's a replication service. That's what it does. It just happens to move stuff, hence migration.

All right. You're not the first person to think about migrating; we have been around a while. A million plus: the number we're officially allowed to quote at the moment is more than 1.2 million databases, but it just sounds better to say more than a million, so that's what I've got on the slide. Lots of customers have done it, across all sorts of sectors and verticals, public, private, financial and banking, same thing. But anyway, lots of different people have moved, and it's always kind of neat to talk about us, and when I say us, I mean Amazon.

A bit of trivia for you, as you are all aware: AWS launched as an offshoot of Amazon because, hey, we got pretty good at managing services, so maybe we'd make them available for the public to use, and presto, you got AWS. Greatly simplified.

A side fact: for a number of years, Amazon did not use AWS. Amazon had its own data centers and then there were AWS data centers, and the two shall not meet. Well, eventually they did, because it was really silly to maintain two sets of data centers. So Amazon migrated to AWS. 2019 is when we finished the database migration and got off Oracle onto cloud-native and open source systems.

If you think it can't be done, Amazon is a great example of being able to do it, because, and I need to look at the number to make sure it's right, we had over 7,500 databases with more than 75 petabytes of data that was moved. Just a coincidence that it was two 75s, but at any rate, that's how much there was. And that was way back in October 2019. If we could move that much back then, we can move more now. So it is possible.

All right, on to the next thing: migration tools and services at AWS. If anybody can name all of the services, I'd buy you a beer. I can't keep them all straight, not a chance, right? We've got a whole range of services and products available for you to migrate.

We're going to spend some time playing in this area today, so we're going to talk about the DMS product and its different functionalities and capabilities. But know that with every migration project, there are other things available to help you. We have solutions architects, people like Ryan, who can help you out with the technical challenges of your particular migration. We of course have third-party partners who are trained and have worked with countless customers during their migrations. AWS has its own professional services group; they can give you a hand as well, that's always an option. And then we actually have a sub-team, kind of part of DMS, that we call DMA advisors, who can offer free guidance and advice if you need them as part of a migration too. And I'm not going to talk too much about the programs at the bottom, but note that there are always financial incentives, or enablement, or training, or things like that to help you get going with your migration, and that's what those programs at the bottom are about.

So, first step on the path of migration: do you know what you have? This may not always be applicable for everyone in the room. You say, yeah, I've got it, I'm in IT, I know everything that's running. Well, I can tell you that I used to run a program called Database Freedom, and we used to go around and do what we called a getting-started workshop, where we'd do an analysis of customers' databases and say, OK, show us what you've got and we'll recommend a path for migration to the cloud. And some customers complained, because before we went on site we said, yeah, we need you to do this thing we call the homework. Now, I know many people haven't done homework since university, but that was the idea, and the homework was: tell us what you've got, and then we'll offer you some advice on it.

Well, I remember walking into one customer and they were like, thank you so much for forcing us to do this, because we found we could turn off a third of our databases before you even walked in the door. Think of all the money that saved. And on top of this, we actually found a data center we didn't know we had. It happens. Now, slight exaggeration there: I believe it was a few racks in a data center from a third party, but nonetheless, they called it a data center. So there you go.

So that's where Fleet Advisor comes in. DMS Fleet Advisor is a functionality of DMS that we launched two years ago, geared towards customers with large fleets of databases. It's a free capability. The general idea of it is: we'll go in there, look at what you've got, see what kind of infrastructure it's running on, and recommend what you should migrate to at AWS, in an automated fashion. This isn't an in-depth analysis; it's more like, let's scan everything, look at what you're running, look at the performance characteristics, and here's what we think you should migrate to.

All right. So how does Fleet Advisor work? I did say this was a 300-level course, didn't I?

So it has a single on-premises collector that gets installed. What we do is install this collector on premises. It connects to your LDAP server to get information about all your servers out there, and then once it's got the database servers, it tries to connect to each of those and gather some information about them. What does it do with it? Let's try the button again. There we go.

It collects all of this data about those servers. And you can look at it first: if you don't trust us for any reason, you can look at the data the collector has gathered before anything leaves. It just puts it in a file folder.

It's kind of like a CSV file, and you can look at all of that, and then we upload it securely to S3. The data goes into S3, a bucket in your account, and then it gets loaded into Fleet Advisor. And again, this is just a capability of DMS, so it's loaded into the service, where we then analyze it. We look at the usage patterns of that database: how often it's running, what the CPU looks like. We look at things like the licensing of that database and say, oh, this thing's expired, or you're running an unsupported version, big red alert there, it's something you should consider.

And then at the end, we're going to recommend: hey, you should consider migrating this to, I don't know, an RDS Oracle database, with this many CPUs; you can use Standard Edition because RDS handles the failover; you can use Multi-AZ for failover, for example. Or hey, this is a candidate you could consider moving to RDS PostgreSQL. And the best part of all of it is that it tells you how much it's going to cost, at least an estimate; it's grabbing the information from the price list. And because it was originally developed as a monitoring tool, it will continually monitor your databases as long as you want it to, because of course you get spikes and troughs over the course of a month in your database. And it'll say, actually, we thought you only needed, I don't know, a 16-core processor, but now I notice there's a big spike, so maybe you should consider going to 32, something along those lines.
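
For reference, here is a minimal boto3 sketch of pulling the Fleet Advisor inventory and target recommendations back out once the collector data has been analyzed. It assumes a recent boto3 release that includes the Fleet Advisor APIs, and simply prints whole response items since the exact field names may differ from what's shown in the console.

```python
import boto3

# Fleet Advisor is exposed through the regular DMS API surface.
dms = boto3.client("dms", region_name="us-east-1")

# Inventory of the database servers the on-premises collector discovered.
databases = dms.describe_fleet_advisor_databases(MaxRecords=100)
for db in databases["Databases"]:
    print(db)  # engine, version, server, and collector details

# Target recommendations Fleet Advisor generated for those databases,
# including the cost estimate derived from the public price list.
# (Recommendations have to be started first, e.g. via StartRecommendations.)
recs = dms.describe_recommendations(MaxRecords=100)
for rec in recs["Recommendations"]:
    print(rec)
```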

So that's how Fleet Advisor works. All right, so you've found stuff. Now what? Well, you probably need to migrate it to something. Now, in a lot of cases that migration can be pretty straightforward. If you're running Postgres on premises, you're probably just going to migrate it to RDS for PostgreSQL, no conversion required. But say you're running a commercial system, say you're running something like SQL Server, and you don't want to pay for it anymore, you don't like getting those bills from Microsoft, and it's a system that you own, right? So the applications using it are all things that you control, or what have you.

So now you're thinking about converting that database from one thing to another. DMS Schema Conversion allows you to do just that: convert a schema from one database type to another automatically. Now, it doesn't get you 100% of the way there, but 85 to 90% of the way through, it'll help convert that schema from one database type to another.

The first step with a schema conversion is an assessment. It looks at the schema and says, OK, I can convert 82% of this, and here are the areas where I'm going to have some difficulty and might need some manual intervention. Now, you may be thinking: hold on, he was just talking about Fleet Advisor. What's the difference between the recommendation that Fleet Advisor makes and the one that Schema Conversion makes? Schema Conversion is much deeper. It goes and looks at every procedure, function, table, data type, what have you, and says, yes, this is going to work, or no, it's not. Fleet Advisor is a bit more like: we see what you're running, we see how heavy the usage is, we see what kind of infrastructure you're running on today, we see your licensing, and we think you should move to X. But if that X is a conversion, you want to do that deeper analysis, and that next step is Schema Conversion.

The reason for the separation is that Fleet Advisor is quick; it can scan thousands and thousands of servers in a limited amount of time. Schema Conversion can take a bit of time; it does a deep-dive analysis. We've seen some of the biggest, most complicated conversions run for hours as they do that analysis, so it can happen.

One thing I do want to highlight, though, is that there are two things you will find in the schema conversion space from us: we've got DMS Schema Conversion and we've got the AWS Schema Conversion Tool. It's not quite tomato, tomahto, but it kind of is. SCT was our first launch; it's a desktop tool. DMS Schema Conversion, by contrast, is integrated into the service. It doesn't have quite the same breadth of sources and targets that SCT does, but it's a service: you don't need to install a local tool, you don't need to worry about the CPU capabilities of your laptop, you don't need to manage it yourself, and you don't need to talk to your information security team to be allowed to install it. DMS Schema Conversion is just a feature of the service, so it uses the CPU of the cloud as opposed to the CPU of your laptop, and it is also free. So it's available for you to use.

How does it work? There are a couple of things we haven't talked about that aren't on the screen. We have something called an Orchestrator, which orchestrates the DMS Schema Conversion workflows, and we have a Loader, which loads all the metadata for the schema. But as we get going, we start with an Iterator, which iterates through each of the schema objects. Then we read that source code, and that source code is parsed. The Parser splits things into different components, so we've got a procedure, we've got a function, we've got a table, what have you. And then the Parser puts things into an abstract syntax tree; that's what AST stands for. Again, these are internals of DMS Schema Conversion that you don't really need to worry about, but if you want to know how it works: we're putting everything into a generic tree which is abstracted away from how that particular database operates.

And then, of course, we go and resolve it, catching any conflicts. You know, we've got this column over here and this procedure over there; make sure the procedure isn't referring to a column that doesn't exist, that there are no errors like that. And when we're all good and it's resolved, no problems there, then we convert it from one type to another. Say you're going from Oracle to MySQL. Once that conversion has happened, we put it into a target abstract syntax tree. So the source tree was written in, what did I say, Oracle? Yes, the source one was in Oracle; the target one is now going to be MySQL, so that's your MySQL abstract syntax tree. And then of course we have to print it, so you can view it on the screen and see what the results are. Eventually you can save it as SQL code or apply it directly to your database, and we iterate like that through the entire schema.
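
To make that pipeline concrete, here is a purely illustrative Python sketch of the iterate, parse, resolve, convert, print flow described above. None of these class or function names come from DMS itself; they are hypothetical stand-ins for the stages just walked through.

```python
from dataclasses import dataclass

@dataclass
class AstNode:
    """Generic tree node, abstracted away from any specific database engine."""
    kind: str          # e.g. "table", "procedure", "column-ref"
    name: str
    children: list

def convert_schema(schema_objects, parse, resolve, convert, printer):
    """Hypothetical sketch of the conversion stages, not DMS internals."""
    results = []
    for obj in schema_objects:               # Iterator: walk each schema object
        source_ast = parse(obj)              # Parser: source code -> source AST
        resolve(source_ast)                  # Resolver: check cross-references
        target_ast = convert(source_ast)     # Converter: source AST -> target AST
        results.append(printer(target_ast))  # Printer: target AST -> SQL text
    return results                           # save as SQL or apply to the target
```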

So there's no magic in DMS Schema Conversion. I'd love to stand on stage and say, oh, we've got the best new AI capabilities in there. Not yet, let's just put it that way. Right now it's purely rules-based. So that's how Schema Conversion works today.

So you've converted the schema. And again, I want to reiterate: you don't necessarily have to convert a schema. If you're just going from one database type to a managed version of the same type, or from on premises to the cloud, no conversion is needed; it's only if you're going from one type to another. But now that you've converted that schema, you've got to move the data.

So, to move the data, option one is DMS with a logical replication instance. Does this sound familiar? I think I called the service the Database Replication... no, sorry, it's the Migration Service; I'll let that go. But behind the scenes, DMS is a logical replication service, and in essence what we do here is install it on an EC2 box that we manage on your behalf and that you can access. The idea is that we've got this network container that encapsulates everything, sort of hybrid between your VPC and our VPC where we manage it. The idea being: you can control the connectivity to your databases, the data is all contained inside your container, and we can still manage it by making sure the host is up to date.

So that's the actual EC2 box that's running there. We patch the Linux system on it; you don't have to worry about it. And then of course we maintain some sort of management connection to make sure things are up and running, so if there's ever any interruption, we can fail over to another instance or something like that. But that's the general idea: we encapsulate things this way.

And then we have this logical replication engine that goes out and queries the source database. Essentially we're doing a SELECT statement to get the data out, bringing it through the replication engine, transforming it if required, and inserting it into the target. And then, when you're dealing with the actual replication aspect of things, we are mining the logs on the source, which could be a binlog or could be a WAL, depending on your database engine type, and applying that to the target, funneling it through the replication instance.

Ideally, no data is going to reside on that replication instance; it just funnels through it. The only time any data lands on the replication instance, in the attached storage, is if there's some sort of delay, if your target is super slow compared to your source or something like that. Changes might get cached there, but then they get flushed out as it catches up.

All right, that was option one, what I'd call DMS classic. It's an EC2 box. You've said, I want that EC2 box to be an m5.xlarge, and that's what you got. It doesn't matter whether you actually need that much processing power or that much memory, more or less; that's what you said, so that's what you get. And let me tell you, there can occasionally be problems if you under-provision with that version. So we try to make it simple. Going back to what I was saying about the easy button, we're looking to make this as straightforward as possible.
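
As a point of reference, provisioning that classic path yourself via the API looks roughly like this boto3 sketch. The identifiers, endpoint ARNs, and table mappings are placeholders, the instance class is just an example of the dms.* naming convention, and the task settings are deliberately minimal.

```python
import json
import boto3

dms = boto3.client("dms", region_name="us-east-1")

# Option one: you pick the instance class and storage yourself.
instance = dms.create_replication_instance(
    ReplicationInstanceIdentifier="classic-repl-instance",
    ReplicationInstanceClass="dms.r5.xlarge",   # example class name
    AllocatedStorage=100,        # GB of attached storage for cached changes
    MultiAZ=False,
)
instance_arn = instance["ReplicationInstance"]["ReplicationInstanceArn"]

# Wait until the instance is available before creating a task on it.
dms.get_waiter("replication_instance_available").wait(
    Filters=[{"Name": "replication-instance-arn", "Values": [instance_arn]}]
)

# A full-load-and-CDC task that funnels data through that instance.
dms.create_replication_task(
    ReplicationTaskIdentifier="orders-full-load-and-cdc",
    SourceEndpointArn="arn:aws:dms:...:endpoint:SOURCE",   # placeholder
    TargetEndpointArn="arn:aws:dms:...:endpoint:TARGET",   # placeholder
    ReplicationInstanceArn=instance_arn,
    MigrationType="full-load-and-cdc",
    TableMappings=json.dumps({
        "rules": [{
            "rule-type": "selection", "rule-id": "1", "rule-name": "1",
            "object-locator": {"schema-name": "public", "table-name": "%"},
            "rule-action": "include",
        }]
    }),
)
```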

Serverless is one step in that direction. You don't have to worry about what size of replication instance to deploy anymore. How many people here are truly replication or migration experts? How do you know how large a server to deploy? Somebody has asked me this, sorry, not somebody, many people have asked me: can you just give me a table that says this server size for these conditions? No. There are too many variables out there: how busy is your source database, how busy is the target, what's the structure of the tables, what's your network throughput like. All of that makes it really hard to say, this is the size of replication instance you should use.

So that's where Serverless comes in: we handle choosing the right size. Now, how does the thing work? Here's the architecture, and there's a lot to it. OK, things are not totally straightforward. Again, this is meant to be a deep-dive course, so this is just so you know what it is; you don't really need to worry about all of this. But this is the general idea of how it works.

Generally speaking, what it comes down to is that at the start, we don't know what size of replication instance to deploy either. So we run a bunch of queries on your source system to understand what it looks like, what the structure is, how busy it is, and all that sort of stuff. Then we use our provisioning service to deploy an initial size of replication server. Now, you want to have some control over this.

So you don't need to let it run wild if you're afraid of what the bill might be at the end of the month, or what have you. When you deploy what we call a replication, because we don't worry about an instance anymore, you specify what we call a DMS Capacity Unit, or DCU. You give it a minimum and you give it a maximum, and we're not going to go under or over those.

I would suggest you don't set the minimum way too low; otherwise, really, what's the point of Serverless? You'll just always run at that smallest size. Also, if you know your system is a pretty busy database, a pretty big database, don't put the minimum down at the lowest setting possible, because if you have spikes and troughs, and there's a drop, DMS scales down, and then demand suddenly spikes up again, there's going to be a bit of a delay for it to scale back up.

So we decide what capacity to use at the start by running all those queries, bounded by the minimum and maximum you provide. And then as the thing goes along, we're monitoring just the CPU and memory usage of DMS itself and scaling up and down as required.

Now, we don't scale up and down instantly. We scale up, I think it's once per hour, and we scale down once every two hours; I always get those numbers slightly wrong. So it's not going to yo-yo; we're not going to go up and down by the second, it's going to follow the peaks and troughs of the day. So that's how Serverless works.
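
For comparison with option one, here is a hedged boto3 sketch of creating a DMS Serverless replication with minimum and maximum DCU bounds. The endpoint ARNs and capacity values are placeholders; check the current API reference for the full set of ComputeConfig fields available in your region.

```python
import json
import boto3

dms = boto3.client("dms", region_name="us-east-1")

# Serverless: you create a replication *config*, not a replication instance.
config = dms.create_replication_config(
    ReplicationConfigIdentifier="orders-serverless",
    SourceEndpointArn="arn:aws:dms:...:endpoint:SOURCE",   # placeholder
    TargetEndpointArn="arn:aws:dms:...:endpoint:TARGET",   # placeholder
    ReplicationType="full-load-and-cdc",
    ComputeConfig={
        "MinCapacityUnits": 2,    # don't set this too low for a busy source
        "MaxCapacityUnits": 16,   # hard ceiling DMS may scale up to
        "MultiAZ": True,
    },
    TableMappings=json.dumps({
        "rules": [{
            "rule-type": "selection", "rule-id": "1", "rule-name": "1",
            "object-locator": {"schema-name": "public", "table-name": "%"},
            "rule-action": "include",
        }]
    }),
)

# Start it; DMS sizes the initial capacity from queries against the source,
# then scales within the DCU bounds based on its own CPU and memory usage.
dms.start_replication(
    ReplicationConfigArn=config["ReplicationConfig"]["ReplicationConfigArn"],
    StartReplicationType="start-replication",
)
```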

Option three, because hey, we like options: homogeneous migrations. Homogeneous migrations are a bit different from heterogeneous ones. DMS can do homogeneous migrations in its standard form, no worries, but it's kind of overkill. It's like bringing, I don't know, I'm trying to think of a good analogy here, an overly complicated toolbox when you're just trying to do something simple, right?

So with DMS homogeneous migrations, we're not using the same logical replication engine of DMS. We are not going in there, extracting the data, bringing it into an intermediate data type, and writing it into the target; that's overly complicated. With DMS you also normally need to worry about copying the schema over, right? But with homogeneous migrations, well, why don't we just use the native tools?

Who knows how to move, for argument's sake, a MySQL database better than MySQL? Or who knows how to move Postgres better than Postgres? Now, you may have gone and looked at the documentation on AWS's website about moving a database that way, and there are a lot of steps: setting up your network security, where to deploy the tools, how to make sure you've got the right access rules, all that sort of stuff. Oh, and do you have an S3 bucket to put things in? It gets a little tricky.

What we've done with DMS homogeneous migrations is wrap those native tools and take away all the complexity for you. We're handling all that security, we're making sure things are transitioned through S3 as they should be, we're setting up the replication properly on the source and the target. We're essentially making it so you can automate the migration in a way you couldn't with the standalone tools.

So if you wanted to move, for argument's sake, 100 Postgres databases with homogeneous migrations, because DMS uses an API like every other AWS service, you can script it so that you move all 100 of those databases with DMS, essentially using the native tools underneath the covers, and you've got a complete migration that's all been managed. And heck, you can even turn on Multi-AZ if you want failover in case there's any interruption.
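
A rough sketch of what that scripting could look like is below, assuming the homogeneous data migration APIs in a recent boto3 release (CreateDataMigration / StartDataMigration). The project identifiers and role ARN are hypothetical; in practice each migration project would already be wired to its own data providers and instance profile.

```python
import boto3

dms = boto3.client("dms", region_name="us-east-1")

# Hypothetical: one pre-created migration project per source database.
project_ids = [f"pg-fleet-project-{i:03d}" for i in range(100)]

for project in project_ids:
    # Create a homogeneous data migration inside each project...
    migration = dms.create_data_migration(
        DataMigrationName=f"{project}-full-load",
        MigrationProjectIdentifier=project,
        DataMigrationType="full-load-and-cdc",
        ServiceAccessRoleArn="arn:aws:iam::123456789012:role/native-migrations",  # placeholder
        EnableCloudwatchLogs=True,
    )
    # ...and kick it off, so the whole fleet moves without manual console work.
    dms.start_data_migration(
        DataMigrationIdentifier=migration["DataMigration"]["DataMigrationArn"],
        StartType="start-replication",
    )
```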

We're essentially making it as straightforward as possible, going back to that easy button, if you will. So, I've rambled on a fair bit now. I think it's time for Ryan to do something, so I'll pass it over to him.

OK. Oh, did I press the wrong button? Yeah, we're done with the slides; we'll jump into the demo.

All right. So I'd like to show you this. It's about an 11- or 12-minute demo, so I'll move pretty quick. OK.

So if you've looked in the DMS console recently, you'll see we have three new sections: migration projects, instance profiles, and data providers. These all fall under Convert and Migrate. I'll walk you through what those look like today, and we're going to start with data providers.

So if we go in and open up data providers: a data provider is just a place to store connection information about your data stores, and I've pre-created two for you here today, re:Invent source and re:Invent target. We'll look at re:Invent source, and I'll walk you through the information you have to put in there. It's not a lot, as you can see: I gave it a name and the engine type, and I specified my host name, the port, and the database name. At the bottom we can see that this data provider is associated with a migration project; data providers can be associated with one or many migration projects.
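
If you'd rather script that step than click through the console, a data provider can be created along these lines with boto3. The name, host, and credentials-free settings block are illustrative; the exact Settings structure depends on your engine.

```python
import boto3

dms = boto3.client("dms", region_name="us-east-1")

# Same fields the console asks for: a name, the engine, host, port, database.
provider = dms.create_data_provider(
    DataProviderName="reinvent-source",                  # illustrative name
    Engine="postgres",
    Settings={
        "PostgreSqlSettings": {
            "ServerName": "source-db.example.internal",  # placeholder host
            "Port": 5432,
            "DatabaseName": "reinvent",
            "SslMode": "require",
        }
    },
)
print(provider["DataProvider"]["DataProviderArn"])
```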

If we were to look at the target, we would see different information but virtually the same screen. So let's jump into instance profiles. Once I open up instance profiles, we see that I've pre-created this dms-reinvent-vpc profile here. An instance profile is where you configure your network and security settings. What I've done here is create the subnet group you see here and associate it with this instance profile, and based on the subnets I choose, it determines whether this is publicly accessible or not. If you're migrating from on prem or a different VPC, you're required to use the public endpoints, but if you're migrating from within the same VPC, you can use the private addresses. And you'll notice, when you set up and run your first data migration, it'll take a couple more minutes to get to a running state than your subsequent runs.

And that's because behind the scenes, we're setting up the networking infrastructure that's needed to support your migration. We may be doing things like deploying NAT gateways or configuring VPC peering. But once that's there, it persists, and you won't have to wait those two, three, or four minutes, whatever it is, to get going again, which is really a trivial amount of time for most migrations.
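
The equivalent API call looks roughly like the following sketch; the profile name, subnet group, and security group are placeholders.

```python
import boto3

dms = boto3.client("dms", region_name="us-east-1")

# An instance profile carries the network and security settings that a
# migration project will use: subnet group, security groups, public vs. private.
profile = dms.create_instance_profile(
    InstanceProfileName="dms-reinvent-vpc",           # illustrative name
    SubnetGroupIdentifier="reinvent-subnet-group",    # pre-created subnet group
    VpcSecurityGroups=["sg-0123456789abcdef0"],       # placeholder
    PubliclyAccessible=True,  # required when the source is on prem or in another VPC
)
print(profile["InstanceProfile"]["InstanceProfileArn"])
```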

And then we're going to talk about migration projects. So I open up migration projects, and we can see I've got my pre-created migration project here. A migration project is a logical container, a grouping of the resources that are going to be used in this migration, and that's meant to make it repeatable for you, because customers typically don't just set this up, run it once, and go straight to production. You set this up, you kick the tires on the service, you do some stress testing, some load testing, and you develop your runbooks until you're really comfortable with how this works.

OK, and then you schedule your production migration at a time that's convenient for you. So let's go ahead and click into there. We'll look at the data migrations tab and see that we don't have a data migration yet, so we're going to create one as part of this demo. And the data migration, right, is meant to be that repeatable process. So we'll go ahead and click on Create data migration; that opens up, and we'll see what we have to put in there. OK?

So I'm going to give this a name, reinvent-demo. I'm going to select full load for this demo; CDC is optional. And then I'm going to turn on CloudWatch logs. This is optional, but I highly recommend it; there's a lot of essential information in CloudWatch that tells us about the status of a migration and what exactly is going on. We'll open up advanced settings. There is only one setting here: the number of jobs. It defaults to eight; I'm going to set it to three, and I'll tell you why later.

And then of course we need an IAM role, so I created the native-migrations role, and I did that just by looking at our public documentation, which lists all the permissions and trusted entities needed to support this migration. So we'll go ahead and get that started. We'll click on Create data migration, that comes up almost immediately, and we can see we have our reinvent-demo.
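
For anyone scripting the same flow, those console clicks map roughly onto a single API call like this sketch, assuming the CreateDataMigration API in a recent boto3; the project identifier and role ARN are placeholders. Back in the console, the same data migration shows up right away.

```python
import boto3

dms = boto3.client("dms", region_name="us-east-1")

data_migration = dms.create_data_migration(
    DataMigrationName="reinvent-demo",
    MigrationProjectIdentifier="reinvent-migration-project",   # placeholder
    DataMigrationType="full-load",       # CDC is optional; "full-load-and-cdc" also exists
    EnableCloudwatchLogs=True,           # highly recommended for troubleshooting
    NumberOfJobs=3,                      # the advanced setting shown in the demo
    ServiceAccessRoleArn="arn:aws:iam::123456789012:role/native-migrations",  # placeholder
)
print(data_migration["DataMigration"])
```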

So we'll drill into that, and we'll see a few facts: we've got a status of Ready, the type is full load, and we can see when it was created, our source and target data providers, the engine type, our public IP address, the fact that we have CloudWatch logs on, and that the number of jobs is set to three.

So let's go ahead and get this running. We'll click Start. This typically takes a couple of minutes to get going, so in the interest of time we're going to skip ahead. We'll see we have a status of Starting, and CloudWatch logging is on, but we don't have a link yet. Then, once it starts, we see that the load is running, and now we've got a link for our CloudWatch logs. We'll talk about monitoring in a second.

Let's jump to our source system. So I've got a source Postgres database on EC2, and customers always ask me, how much load is this going to put on my system? Right, this is a critical database for me, we don't have a lot of headroom. So we'll look at this system here: I'll run the top command, filter by CPU, and you can see that we've got Postgres running and we have two COPY commands running.

We are not consuming a lot of CPU, we are not consuming a lot of memory, and our load average is near zero. So let's look at this another way: I'll break out of this and we'll look at the I/O profile. We'll run iostat, and you can see that we're not doing a lot of I/O either; we're 99-point-something percent idle, with not a lot of reads and writes. OK?

So let's log into our source database and see why that is. OK, I'm going to show you two things here: we'll look at the tables we're migrating and their size, and then we're going to look at the activity that homogeneous migrations is doing on our behalf.

So you can see here we've got two relations. They are about six gigabytes each, which is pretty small, but we've got a short amount of time for this demo today, and this is just a publicly available data set which I trimmed down.

And next, we're going to look at a view Postgres has; it's called pg_stat_activity, and pg_stat_activity reports all the queries that are running in your database. So we'll run that query and see the results. OK?
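
If you want to reproduce that check yourself, a small Python sketch against the source database might look like this, using psycopg2 and hypothetical connection details; the WHERE clause just narrows the view down to the COPY statements the migration issues.

```python
import psycopg2

# Hypothetical connection details for the source Postgres instance.
conn = psycopg2.connect(
    host="source-db.example.internal", port=5432,
    dbname="reinvent", user="dms_user", password="example-password",
)

with conn, conn.cursor() as cur:
    # pg_stat_activity reports every query currently running in the database;
    # filtering on COPY surfaces the work the homogeneous migration is doing.
    cur.execute(
        "SELECT pid, state, query "
        "FROM pg_stat_activity "
        "WHERE query ILIKE 'COPY %'"
    )
    for pid, state, query in cur.fetchall():
        print(pid, state, query)
```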

So we can see here we have two COPY commands running. Now, it's interesting, because we set the number of jobs to three, but we only have two tables, right? And pg_dump isn't going to parallelize individual tables across threads. So you can configure the number of jobs from 1 to 50, but it isn't advantageous, and actually works against you a little, if you don't have that many tables. Also, configuring it up toward 50 could potentially put more impact on your source, so just something to be aware of. And what you'll see here is that we're copying these tables to standard out, and on the target, what you'll see is that we're copying these tables from standard in.

So the majority of this migration is happening in memory, and that has a lot of benefits. One, of course, is performance; it's hugely advantageous. Two, we're not doing a lot of I/O on your source system, so we're not putting a lot of load on it. And three, if you're storage-constrained on prem, you don't need a lot of storage to leverage native migrations to move to one of our managed services. So there are a ton of benefits in doing it this way.

Now we're going to go ahead and talk about monitoring. So I'll jump back into the DMS console and we will pull up CloudWatch.

OK. So I'll click on my CloudWatch logs link here. This takes just a second or two to populate, and what we'll see here is the logging for the run: we're assuming the role that we created, we're fetching our credentials from Secrets Manager, we're logging into our databases, and we're getting the version information about these databases.

Next, we run queries to determine which tables we're going to migrate, so we'll see our source capture list populate in a second. You can see here are our two tables; we get an estimated number of rows based on statistics, about 80 million rows each. And next we're going to see that, OK, now it's running.

So at this point our migration is in progress; we've got a "loading still in progress" message. There's a lot of good information here in CloudWatch. You can also see some of this within the native logging, right? Depending on how you have your logging or auditing configured, you can see the queries that native migrations is running and any DDL that we are performing.

So that's also helpful. One more thing we want to look at within CloudWatch here, because we've talked about storage a couple of times, and I want to point out exactly how much storage we're using. We had about 12 gigabytes of data to move in a short time for this migration, and we're using about 168,000 bytes on the source system. That's all. So it's less than a megabyte to move 12 gigabytes of data.

So again, if you're storage-constrained, this has a lot of benefits. We'll jump back into CloudWatch and see that, hey, this is finishing up. So let's go look at our target. We'll jump out, I'll connect to my target in RDS, and I'll show you the connection info. You can see here that I'm connected to re:Invent target and I've got RDS in the host name there, so you know we're on our managed service.

So we'll look at the relations in this target, and you can see that we moved our two tables; they're about six gigabytes each. So this migration, in that short amount of time, was successful. I showed you resources and consumption on the source, so let's look at the target.

So I'll jump into CloudWatch. We'll go to our five-minute view here, and you can see that we don't use a lot of CPU; we spike up to like 10 or 11%, and that's about it. We don't have a high DB load average, we don't have a lot of disk queueing at all, and free memory stays really flat. I had scaled down the instance a couple of hours before, which is why you see that. We see, of course, some network receive and transmit throughput. But what we really see is: OK, yes, of course, we're generating transaction logs, and we're writing about 80 or 85 megabytes a second.

So on the target side, this is really just one big write operation for us. Keep that in mind when you are setting up and scaling your target and thinking about performance. And speaking of that, now that we're on the managed service, we don't have to query pg_stat_activity; we can leverage Performance Insights and see exactly what we're doing. Like I said, we're copying from standard in.

So we're reading from memory and persisting to our new database in RDS, and after that we run a couple of queries to get a count, so that we can tell the user exactly how many rows we migrated. OK, so that is it for the demo, and I've got a few slides left.

OK. Premigration assessments. So what's the expression, measure twice, cut once? It's the same with your migrations. We don't want you to run a migration and get hours into it only to encounter errors we could have helped you avoid. Premigration assessments are just a check box in your DMS task: you can turn it on, and it will generate a report for you by querying your source once your task connects. It loads that report into S3, and you can view it right in the console. It checks for things like unsupported data types, or you're using LOBs but the target column isn't nullable, or the source tables don't have primary keys, or you're trying to create a CDC task but your tables don't have primary keys on them. And we're constantly adding checks here. So it's a really easy thing, just a check box to turn on, and I would recommend using it.
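
The same assessment can be kicked off via the API, along the lines of this boto3 sketch. The task ARN, role, and results bucket are placeholders; the report lands in S3 and is viewable in the console as described above.

```python
import boto3

dms = boto3.client("dms", region_name="us-east-1")

# Run the premigration checks (unsupported data types, missing primary keys,
# LOB nullability issues, and so on) before starting the actual task.
run = dms.start_replication_task_assessment_run(
    ReplicationTaskArn="arn:aws:dms:...:task:EXAMPLE",                      # placeholder
    ServiceAccessRoleArn="arn:aws:iam::123456789012:role/dms-assessment",   # placeholder
    ResultLocationBucket="my-dms-assessment-results",                        # placeholder bucket
    AssessmentRunName="premigration-check",
)

# List the individual check results once the run completes.
results = dms.describe_replication_task_individual_assessments(
    Filters=[{
        "Name": "replication-task-assessment-run-arn",
        "Values": [run["ReplicationTaskAssessmentRun"]["ReplicationTaskAssessmentRunArn"]],
    }]
)
for item in results["ReplicationTaskIndividualAssessments"]:
    print(item["IndividualAssessmentName"], item["Status"])
```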

All right, data validation, and this is a big one for customers, so I'll tell you a little bit about how it works. Data validation ensures that your data matches between your source and target databases, even if they're different engines. It does that by querying both source and target. For full load, we do it in one pass: we query the source, we query the target, and we create the output, which you can view in the console. It's stored in S3, and it also lives in a table on your target database, awsdms_validation-something or validation report, I can't remember the exact name.

Whereas full load is a one-time compare, for CDC the validation runs on an ongoing basis. As your replication is running, we create what we call partitions: we basically chunk up the data into batches of 30,000 records and call each one a partition. It's configurable, between 10,000 and 50,000; depending on how much memory you have on your replication instance, you may want to turn that up. Then we do our compares, by hashing the data on each side and comparing it. If it matches, we move on and don't write any errors. If it doesn't match, then it depends on your configuration settings: do you want to error out on any mismatch, or do you want to retry?

Maybe your source system was just very busy and updating really quickly, so we get a mismatch, but then you retry 30 seconds later and you're good to go. We've also got custom validation rules. Think about NULLs: different engines compare or treat them really differently; some use the ANSI SQL standard and others do not. So with DMS validation tasks, you can create custom validation rules, like: hey, if you see a NULL, just replace it with empty white space, and then we won't get errors when we're doing our comparison. You can also configure the number of threads and, as I said earlier, the number of errors or retries that you want to allow.
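
As an illustration, validation is driven through the task settings JSON; a minimal sketch of the relevant block looks like this, with thresholds roughly mirroring the behavior described above (exact option names and defaults are worth confirming against the task-settings documentation).

```python
import json

# ValidationSettings block inside a DMS task-settings document.
task_settings = {
    "ValidationSettings": {
        "EnableValidation": True,
        "ThreadCount": 5,                   # parallel compare threads
        "PartitionSize": 30000,             # records hashed and compared per partition
        "FailureMaxCount": 10000,           # stop validating after this many failures
        "RecordFailureDelayInMinutes": 5,   # wait before re-checking a mismatched record
    }
}

# Passed as a JSON string when creating or modifying the replication task,
# e.g. ReplicationTaskSettings=json.dumps(task_settings).
print(json.dumps(task_settings, indent=2))
```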

This is also available for Amazon S3 as a target, and that was a big ask from customers who may be archiving data or creating a data lake or something like that, and then want to validate the data once it lands in S3. So now DMS can do that.

OK, let's talk about some common challenges that I see. As John said, I lead our technical field community, so I work with a lot of customers on migrations and DMS. The first one is instance sizing. And I know we have Serverless now; it's relatively new. But, like John said, we don't have any magical calculator that can tell you exactly what instance size to pick. What I typically see with customers is that if they're struggling, or they're hitting some sort of performance or memory issue, their instance is usually really undersized.

They may have a very large migration, they've got to move terabytes of data, but they're using something like a t3.medium or a t3.small, which has just a little bit of memory. And memory is going to be king for your replication instance, right? We want all this data to flow through memory for maximum performance; otherwise we have to persist it to the storage volume associated with your replication instance, which is just a typical gp3 volume.

So when you're looking at your migrations and instance sizing, keep an eye on CloudWatch and really watch free memory and swap usage. If free memory is coming way down and swap is spiking, you're under memory pressure, and you may want to consider scaling up for better performance.
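
As a rough illustration of that kind of monitoring, the sketch below polls a couple of replication-instance metrics from CloudWatch with boto3. The instance identifier is a placeholder, and the metric names (FreeMemory and SwapUsage in the AWS/DMS namespace) are my assumption of the metrics being referred to, so verify them against the DMS monitoring documentation.

```python
from datetime import datetime, timedelta, timezone
import boto3

cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")
end = datetime.now(timezone.utc)
start = end - timedelta(hours=3)

# Assumed metric names for a DMS replication instance; confirm against the docs.
for metric in ("FreeMemory", "SwapUsage"):
    stats = cloudwatch.get_metric_statistics(
        Namespace="AWS/DMS",
        MetricName=metric,
        Dimensions=[{
            "Name": "ReplicationInstanceIdentifier",
            "Value": "classic-repl-instance",   # placeholder identifier
        }],
        StartTime=start,
        EndTime=end,
        Period=300,
        Statistics=["Average"],
    )
    for point in sorted(stats["Datapoints"], key=lambda p: p["Timestamp"]):
        print(metric, point["Timestamp"], point["Average"])
```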

The second is limitations, or what I'll call supported capabilities. Before getting too far down the path of learning DMS and setting up your migration, make sure the versions and features you're using on whatever source engine are supported by the product today. We don't want you to get too far down the road and then say, oh, we need this capability, log this PFR. We're happy for that feedback, but it does take a while to ship new functionality. The third is network bandwidth. This is another one we see a lot, where customers have a very small pipe to AWS and can only use a fraction of it for their data migration, but they've got a 50-terabyte database. So just be realistic about how much data you can push through your connection to AWS.

The fourth I'll talk about is large objects. Sometimes we see customers that have a lot of large object (LOB) data, and the individual objects are also very large, and that can affect the performance of your migration. So it's important to understand the different modes we have for moving LOB data.

Whether that's inline or limited LOB mode; and if you don't need the LOBs as part of the migration, consider handling them outside of it, maybe backfilling after the fact, or even moving them to S3. The fifth I'll talk about is parallelization. This is a performance improvement, but it's also really important because it helps customers finish their migrations within their cutover windows, which are generally dictated by the business.

So the business may say, OK, this weekend you've got to get this migration done. And typically I see customers whose full load is taking two days or two and a half days, because it's a multi-terabyte database, but they're not using any of the built-in parallelization. If you have partitions, it's really just a check box or a flag to use multiple threads, and you can specify how many threads you want; or you can use multiple tasks and set up things like boundary ranges if you have a column that fits that.
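
For example, the per-table parallel load is configured through a table-settings rule in the task's table mappings; a minimal sketch, assuming a partitioned source table, looks like this (schema and table names are placeholders).

```python
import json

# Table-mapping document with a selection rule plus a table-settings rule
# that asks DMS to load the table's partitions in parallel during full load.
table_mappings = {
    "rules": [
        {
            "rule-type": "selection", "rule-id": "1", "rule-name": "include-orders",
            "object-locator": {"schema-name": "public", "table-name": "orders"},
            "rule-action": "include",
        },
        {
            "rule-type": "table-settings", "rule-id": "2", "rule-name": "parallel-orders",
            "object-locator": {"schema-name": "public", "table-name": "orders"},
            "parallel-load": {"type": "partitions-auto"},
            # For non-partitioned tables, "type": "ranges" with "columns" and
            # "boundaries" splits the load across explicit key ranges instead.
        },
    ]
}

# Passed as TableMappings=json.dumps(table_mappings) on the replication task.
print(json.dumps(table_mappings, indent=2))
```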

And what I see is that once I work with customers who are at two days or two and a half days, we can get down to three or four hours. So now their full load takes three or four hours, they don't have two days of cached changes sitting on their replication instance that need to be applied, and CDC doesn't take as long to catch up, of course.

So maximizing the performance of your full load, I think, is really important for any large-scale migration. And the last one I'll talk about is operations, or operational excellence. We have our Well-Architected Framework, and operational excellence is one of its pillars. What I'm talking about here is understanding how to monitor your migration or your replication, understanding the CloudWatch metrics, and having runbooks and playbooks for things that happen.

Like, let's say the storage volume on your replication instance fills up because something happened to your source or your target, even briefly. OK, so what happens then? This is something you should know and have an SOP for before starting your production migration. What if you need to resize that volume? Things like that.

So just treat DMS like you would your database or your containers or anything else, and apply the same sort of frameworks and runbooks to DMS that you do to the rest of your infrastructure.

OK, what's new? John covered some of this; I'll just highlight a couple of things on these slides. One is the AWS Glue Data Catalog integration. So now, when you replicate data to S3 with DMS, we can integrate with the Glue Data Catalog so that you get a schema for your data and can then run queries against it in S3. I mentioned data validation for S3; that's here because it was a big highlight and a big customer ask. The last one I'll touch on is AWS DMS 3.5.1. This was a big release for us, and it shows that we continue to put a lot of investment and support into the product.

DMS 3.5.1 added support for PostgreSQL 15, partitioning and performance improvements for MongoDB and DocumentDB elastic clusters, data validation for extended data types, PostgreSQL-to-Redshift spatial data support, and Babelfish as a target.

All right. So I will end with a quote. "The first step towards getting somewhere is to decide that you are not going to stay where you are" - JP Morgan.

Thank you all for coming.
