Empower your commercial business with data replication and resiliency

Hello and welcome to our session. Before we get started today, we'd like to get a good idea of who we're talking to.

Can I see a raise of hands? How many of you in here are business executives? Alright, if I can see through these lights. Alright. And how about how many of you are technical builders? Ok, good. We've got a good mix. I'd say it's about 30% business executive, 70% builder or maybe some other undecided. We'll see.

Thank you. Welcome to our session and good afternoon. Before we kick it off, I'd like to take a few moments for each of you to ask yourselves a question: If a disaster happens tomorrow, do you understand the impact to your business? Let me ask that one more time - if a disaster, any disaster, happens to your business, do you understand the impact?

Excellent. Here at AWS, over the past few years, we've experienced a dramatic shift: events we previously considered rare, whether anticipated or unanticipated, have become the new normal and redefined what we call business as usual. These shifts can be especially challenging for commercial businesses, who can't absorb the shock as quickly as funded startups or large enterprises.

A few recent events, and I'm sure each of you can think of them in your head, have given rise to the creation or updating of business continuity strategies in an increasingly volatile landscape. And that is exactly why we're here to talk about business continuity today.

Hi, I'm Teresa Steffens, AWS Sales Leader for our Commercial Business and Field CSO. How many of you here in this room have felt pressure to grow your business over the past few years? I hear some laughs. I see lots of hands. I see lots of nods. Anyone here looking to gain some confidence in your business continuity strategy? I see some hands as well.

As a Field CSO I have the opportunity to work with over 500 commercial customers across many different industries. I get the opportunity to talk to executives every day, some of you are in this room today. I can tell you all of the executives I've talked to feel pressured to grow their business and most of them lack confidence in their business continuity strategy.

That is of course, until they meet Archana and me. In today's session, we'll walk through why resiliency matters in competitive business environments and how every commercial executive can prioritize growth in certain and uncertain times.

Let me change gears a little bit and put all the technical builders in the spotlight. How many of you are worried that you may face some sort of data loss in your critical applications that could put your company's brand value at risk? I definitely am, all the time. Anyone here looking to learn preventive strategies that could help avoid data disruptions and help you maintain your company's brand value? I definitely am.

Welcome to the session today on “Empower Your Commercial Businesses with Data Replication and Resiliency”. Today, we’ll walk through:

  • What is a business continuity strategy and why does it matter?
  • Common challenges that commercial businesses face and how to overcome those
  • Implementing simple, cost-effective solutions for your data replication strategy
  • And we’ll do all of that through a fun business scenario where you'll get to see Archana's and my Oscar-winning acting skills.

And if we've done our job well, today you'll walk away with a few key takeaways:

  • You’ll understand the importance of identifying business goals and your business continuity strategy
  • How to evaluate tradeoffs and design factors in your data replication strategy
  • And how to pick the right tool for the right job to achieve resiliency and empower your commercial business

Ok. Although we may not want to go back there, let's take a look at 2022 and the common themes that arose:

  • We faced record high, rising inflation rates
  • A looming recession
  • Supply chain issues
  • Rising costs of labor and skill
  • High energy rates
  • Global conflict
  • Climate change
  • Cyber espionage
  • General uncertainty

Sounds like a movie trailer, right? Despite these detrimental impacts, why did some businesses flourish in 2022 while others did not?

I bet I know what you're thinking - money talks. Historically, larger enterprises have recovered faster than other businesses in a disaster or crisis, largely due to cash flow. But cash flow does not dictate resiliency. In 2023 we live in a world where the keys to growth are agility, speed, and resiliency. Commercial executives can, and deserve to, think big about their business continuity without running out of cash flow and resources.

Take Tsubota for example. Tsubota is a global clinical trial technology company and an AWS customer of Archana's and mine. Because Tsubota delivers clinical trial technology solutions to support mission-critical moments of life-sustaining trials, it is crucial that Tsubota's customers have quick access to their applications around the clock.

Collaborating with their AWS account team, Tsubota built a resilient infrastructure on AWS, achieving virtually 24x7 uptime and reducing its recovery point objective by a factor of four. According to Tsubota, on AWS their infrastructure supports the increasing complexity of clinical trials so their customers can focus on life saving research.

Pretty impressive! In competitive business environments, in certain and uncertain times, the demand for executives to grow revenue and drive profit continues to rise. Executives globally are eager to grab the estimated $3 trillion in EBITDA uplift opportunity that can be enabled when businesses leverage the cloud for mission critical workloads.

Tapping into that estimated $3 trillion EBITDA uplift depends largely on the resiliency of those workloads on the cloud to drive scalability, agility, and availability. A business continuity plan keeps the business operational during a disaster or crisis, accommodating elements of the business beyond the IT workload.

The definition of a business continuity plan is the same for enterprise businesses and commercial businesses, and oftentimes so are the expectations: meet enterprise-level customer demand, grow revenue and drive profit, and satisfy customer demand for always-on, anywhere access.

The difference is the means by which this plan is executed and these goals are achieved. Enterprise businesses oftentimes have fewer limits on funding, resources, and technology. Commercial businesses have to understand how they can leverage limited resources effectively.

Can anyone here relate to that? Anyone here wear a lot of hats in their role? I thought so. When exploring and creating your plan, you may be faced with a number of challenging scenarios that make the process seem daunting:

  • Funding resource availability
  • Technical debt
  • Experience and training gaps

In today’s session, we’ll walk through a few key strategies and how you can overcome these challenges - specifically by implementing simple, cost-effective data replication solutions.

A business continuity plan may sound like a disaster recovery strategy, but they differ: a disaster recovery strategy alone focuses on restoring data and IT infrastructure after a disaster, whereas a business continuity plan focuses on restoring the business beyond the IT workload - allowing executives to focus on growth.

A disaster recovery strategy is a subset of a business continuity plan. It is a set of tools and procedures that restore a business's vital information processes and systems following a disaster. Implementing a disaster recovery strategy can reduce the impact to revenue by meeting customer demand for always-on, anywhere access - and empowering employees to weather any storm.

Archana and I will walk you through how to do just that, specifically for your data strategy. Archana, take it away!

ARCHANA: Thank you, Teresa, for highlighting the business impact and importance of resiliency. So we understand why it is important to build resilient applications, and also the challenges that commercial businesses face. Let's now dive deep and see how we can leverage data replication strategies to overcome these challenges.

In simple terms, data replication is creating and maintaining copies of data across multiple systems, locations and databases. But why do we really replicate data?

First, to ensure business continuity - we can back up and use the copy of data to recover from any disaster or disruption.

Second, performance and scalability - we want to provide low-latency data access to all our users across the globe.

Third, to satisfy compliance requirements - you might have regulatory or data residency requirements.

And lastly, to monetize the data - you have accumulated volumes of data which you can monetize using analytics and reporting.

Let’s walk through an example scenario to understand this further.

So Teresa, imagine you are the CEO of an ecommerce website. Your business is doing pretty well, but with a very limited number of users. You came across this McKinsey report that states there is a $3 trillion EBITDA uplift that can be enabled by migrating onto the cloud. You definitely want to grab that opportunity, right? So you decide you want to migrate onto AWS.

So what would be your first requirement?

TERESA: Hm. So my website needs to have all the features. I need product display and product information. I need all of my customers to be able to place orders and process payments easily. My top priorities would be for my infrastructure to be scalable, durable, and available 24/7, so I don't miss out on any orders, and thus revenue, and I provide an optimal customer experience.

ARCHANA: That's a long list of requirements, right? Well, it becomes easy if you can leverage some data replication strategies and AWS services. Let's look at how.

Let's imagine I am the engineer responsible for building this entire AWS architecture. The first thing that comes to my mind is: what are my data sources? So I start by identifying the data sources. It's an ecommerce website, so I am thinking about all the orders and transactions data that I would have. So I'm thinking of a relational database, and I decide to go with Amazon Aurora, which is a MySQL- and PostgreSQL-compatible relational database built for the cloud.

Then I'm thinking about all the different products that I will be selling on the website. I don't want to be tied down to a specific schema here, because they could belong to different categories. So I'm thinking I'll go with a NoSQL database, and I decide to choose Amazon DynamoDB, which is a key-value NoSQL database built for the cloud.

Lastly, what about all the images, video, or audio files that describe my products? I want to store them in an object storage service. So I decide to go with Amazon Simple Storage Service, or Amazon S3.

Once I identified all these data sources, I came up with an initial architecture. As you can see in the diagram, I have my website running on Amazon Elastic Compute Cloud, or Amazon EC2. Then I have the three data sources to keep my orders and transactions, product information, media files, and so on. And currently I have set everything up in one single region.

TERESA: Archana, this looks great! Can we take it to production tomorrow?

ARCHANA: Not really. This does take care of all the initial requirements that you asked for, but what about global expansion? We want to make sure that we can handle all of the user base, keep the data as close to the users as possible, and have multi-region resiliency too - because remember, you talked about 24x7 uptime, being highly available, scalable, and all that. So I want to go further and make sure that this is also resilient across multiple regions.

TERESA: I guess you're right, Archana. Thank you.

ARCHANA: Before we move on to how we can change that initial architecture and make it more resilient across multiple regions, there are a few terms that I want to explain so that we are all on the same page.

Two key metrics are crucial here: Recovery Time Objective and Recovery Point Objective.

Recovery Time Objective, or RTO, defines how quickly your application must be available again after a disaster happens.

And Recovery Point Objective, or RPO, defines how much data loss your application can tolerate - or rather, how old the data can be after an application recovers from a disaster.

Both of these key metrics are measured in hours, minutes, and seconds, and the lower the number, the less downtime and the less data loss.
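
To make these two metrics concrete, here is a minimal sketch, with entirely hypothetical timestamps, of how you would measure the RPO and RTO you actually achieved in an incident and compare them against your targets:

```python
from datetime import datetime, timedelta

# Hypothetical incident timeline (all values are illustrative).
disaster_at = datetime(2023, 6, 1, 14, 0)       # when the disruption hit
last_replica_at = datetime(2023, 6, 1, 13, 45)  # newest data safely replicated
restored_at = datetime(2023, 6, 1, 15, 30)      # when the app was serving again

rpo_achieved = disaster_at - last_replica_at    # data loss window: 15 minutes
rto_achieved = restored_at - disaster_at        # downtime: 1 hour 30 minutes

# Compare against business targets; lower is better for both metrics.
rpo_target = timedelta(minutes=30)
rto_target = timedelta(hours=2)
print(rpo_achieved <= rpo_target)  # True: we lost less data than allowed
print(rto_achieved <= rto_target)  # True: we recovered within the target
```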

So now that we understand the key metrics, let's go one data source at a time, see how we can use some data replication strategies, and make each one more resilient across multiple regions.

The first one is Amazon S3. This is the object storage service that provides industry-leading scalability, performance, security, and data availability.

Amazon S3 provides eleven nines (99.999999999%) of durability. It also provides robust replication features. It's a fully managed service with simple, easy-to-configure, out-of-the-box features, which helps us overcome the challenges that we talked about: limited resources, limited funding, and experience or skills gaps.

So let's look into the replication that Amazon S3 provides.

Replication enables automatic, asynchronous copying of data between multiple S3 buckets. These buckets could be in the same AWS account or across multiple AWS accounts, and in the same region or in multiple regions.

And there are four different types of replication that it supports (a configuration sketch follows this list):

  • The first is Same Region Replication - if you want to be highly available and highly resilient within a single region, you can use Same Region Replication.

  • Cross Region Replication - when you want to replicate data across multiple regions, you can leverage Cross Region Replication.

  • Batch Replication - an on-demand option for replicating objects that already existed before replication was configured, or for retrying objects that previously failed to replicate, to a different bucket or region.

  • And lastly, Bidirectional Replication - let's say there was a disaster in one region while you were replicating to a backup region, and you failed over to that backup region. Bidirectional replication makes sure that any new changes made in the backup region are also copied back to the original, or primary, region.
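
As a concrete illustration, here is a minimal boto3 sketch of configuring Cross Region Replication. The bucket names, account ID, and IAM role are hypothetical; both buckets must already exist with versioning enabled, and the role must grant S3 permission to replicate:

```python
import boto3

s3 = boto3.client("s3")

# Hypothetical names - replace with your own resources.
SOURCE_BUCKET = "shop-assets-us-east-1"
DEST_BUCKET_ARN = "arn:aws:s3:::shop-assets-us-west-2"
REPLICATION_ROLE_ARN = "arn:aws:iam::123456789012:role/s3-replication-role"

s3.put_bucket_replication(
    Bucket=SOURCE_BUCKET,
    ReplicationConfiguration={
        "Role": REPLICATION_ROLE_ARN,
        "Rules": [
            {
                "ID": "replicate-all-cross-region",
                "Status": "Enabled",
                "Priority": 1,
                "Filter": {},  # empty filter = replicate every object
                "DeleteMarkerReplication": {"Status": "Disabled"},
                "Destination": {"Bucket": DEST_BUCKET_ARN},
            }
        ],
    },
)
```

Same Region Replication uses the same API with a destination bucket in the same region, and Batch Replication for pre-existing objects runs as an S3 Batch Operations job.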

The next data source we talked about was Amazon Aurora. Amazon Aurora is a MySQL- and PostgreSQL-compatible relational database built for the cloud. It keeps six copies of your data, two in each of three Availability Zones. You can create up to 15 read replicas to provide low-latency data access. And it also provides automated multi-region replication - again as a fully managed service with simple, easy-to-use, out-of-the-box configuration.

We are able to overcome the challenges that we talked about. Let's look into some of the features that Aurora provides for replication.

The first one is Aurora Global Database. Global Database is designed for globally distributed applications. You create a primary database cluster and one or more secondary database clusters in other regions, and you can promote a secondary to be the primary in case a disruption happens in the primary database cluster.

All the write operations are performed on the primary, and read operations can be served from the secondary. When a disaster or a failover happens, writes are redirected to the secondary, which has now been promoted to be our primary database cluster.

It also provides two different options: Global Database switchover and failover. Switchover is for planned operations, when you just want to do a database rotation or a planned migration. Failover is for when a disaster happens in your primary database cluster and you need to fail over to the secondary.
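
For illustration, here is a minimal boto3 sketch, with hypothetical cluster identifiers, account ID, and regions, of promoting an existing regional Aurora cluster into a Global Database, adding a secondary region, and failing over:

```python
import boto3

rds_primary = boto3.client("rds", region_name="us-east-1")
rds_secondary = boto3.client("rds", region_name="us-west-2")

# Wrap the existing regional cluster (hypothetical ARN) in a global cluster.
rds_primary.create_global_cluster(
    GlobalClusterIdentifier="shop-global",
    SourceDBClusterIdentifier=(
        "arn:aws:rds:us-east-1:123456789012:cluster:shop-orders"
    ),
)

# Add a read-only secondary cluster in another region.
rds_secondary.create_db_cluster(
    DBClusterIdentifier="shop-orders-secondary",
    Engine="aurora-mysql",  # must match the primary cluster's engine
    GlobalClusterIdentifier="shop-global",
)

# Disaster recovery: promote the secondary to be the new primary.
# (For planned rotations, switchover_global_cluster plays the same
# role without data loss.)
rds_secondary.failover_global_cluster(
    GlobalClusterIdentifier="shop-global",
    TargetDbClusterIdentifier=(
        "arn:aws:rds:us-west-2:123456789012:cluster:shop-orders-secondary"
    ),
)
```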

Next is Aurora Database Cloning. With database cloning, you can quickly and cost-effectively clone your database to a secondary database. These clones can be used for dev and test environments, or for parallel production environments. The creation of a clone is nearly instantaneous, because it works against a point-in-time snapshot of the data and only copies pages as they change.

Lastly, Aurora also provides Backtrack. Backtracking means rewinding your database to a specific point in time. This can be used to undo destructive DDL or DML actions - for example, if you ran a DELETE without a WHERE clause and want to go back to a time just before that operation happened. Backtracking can be done multiple times; it is a non-destructive operation.
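
As a sketch, assuming a hypothetical Aurora MySQL-compatible cluster that was created with backtracking enabled (a non-zero backtrack window), rewinding after an accidental DELETE might look like this:

```python
from datetime import datetime, timedelta, timezone

import boto3

rds = boto3.client("rds", region_name="us-east-1")

# Rewind the cluster (hypothetical identifier) to five minutes ago,
# e.g. just before a DELETE without a WHERE clause was executed.
# Non-destructive: you can backtrack again, forward or backward,
# anywhere within the configured backtrack window.
rds.backtrack_db_cluster(
    DBClusterIdentifier="shop-orders",
    BacktrackTo=datetime.now(timezone.utc) - timedelta(minutes=5),
)
```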

Next up is Amazon DynamoDB. DynamoDB is a fully managed NoSQL database. It provides three-way replication, global tables, and point-in-time recovery features. Along with this automated replication it provides automated backups, and being a fully managed service, it again gives us easy-to-use, configurable features that we can leverage for our data replication strategy.

Let's look into how it provides single-region replication. We used Amazon DynamoDB for our product information. Every time a new product is added, DynamoDB does three-way replication, which means the data is always replicated to three Availability Zones. So you have three copies of your data within a single region.

For multi-region, we have Global Tables. DynamoDB Global Tables are fully managed, serverless, multi-region, and multi-active database tables. What that means is, when you have a globally distributed application, or an application used by global users, you can leverage Global Tables and specify which regions you want to replicate that data to.

In this diagram, you can see we have three different regions. Assuming you have users across those three regions, you can use a Global Table to replicate the data to all three.

It provides disaster proofing with multi-region replication. It is also easy to set up, and no application changes are required. It also provides multi-active tables. What that means is that if a disaster happens in one of the regions, or in one of the replica tables, no failover is required, because every item you write is replicated to all three regions here - to every replica table you have configured.
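
Here is a minimal boto3 sketch of that setup, with a hypothetical table name. Global tables (version 2019.11.21) require a stream of new and old images on the table, and replicas are then added in the regions you choose:

```python
import boto3

ddb = boto3.client("dynamodb", region_name="us-east-1")

# Hypothetical product table; global tables need a stream of
# new and old item images enabled.
ddb.create_table(
    TableName="shop-products",
    AttributeDefinitions=[
        {"AttributeName": "product_id", "AttributeType": "S"}
    ],
    KeySchema=[{"AttributeName": "product_id", "KeyType": "HASH"}],
    BillingMode="PAY_PER_REQUEST",
    StreamSpecification={
        "StreamEnabled": True,
        "StreamViewType": "NEW_AND_OLD_IMAGES",
    },
)
ddb.get_waiter("table_exists").wait(TableName="shop-products")

# Turn it into a multi-active global table by adding a replica region.
ddb.update_table(
    TableName="shop-products",
    ReplicaUpdates=[{"Create": {"RegionName": "us-west-2"}}],
)
```

Each additional region is added the same way; once the replicas are active, the application can read and write in whichever region is closest to the user.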

So this is the final architecture. As you can see, we have a primary region and a secondary region for our three data sources.

We are using features such as:

  • S3 Cross Region Replication
  • Amazon Aurora Global Database replication
  • And DynamoDB Global Tables

For our website that is running on EC2, we can also leverage another fully managed service, AWS Elastic Disaster Recovery, which can help you replicate the configuration of your servers, the EC2 instances, from your primary to your secondary region.

We are also showing Amazon Route 53 here, which can be leveraged for DNS mapping and also to configure failover: in case the primary region goes down, Route 53 can redirect traffic to the secondary region.

We can use Amazon Route 53 health checks to check the health of the services running in our primary region, and Amazon CloudWatch for observability - monitoring, logging, and so on.
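
As a sketch, with a hypothetical hosted zone, domain, and endpoints, the failover routing described above can be wired up like this: a health check probes the primary endpoint, and DNS answers flip to the secondary when it fails:

```python
import boto3

r53 = boto3.client("route53")

# Health check against the primary region's endpoint (hypothetical domain).
hc = r53.create_health_check(
    CallerReference="shop-primary-hc-001",  # any unique string
    HealthCheckConfig={
        "Type": "HTTPS",
        "FullyQualifiedDomainName": "primary.shop.example.com",
        "Port": 443,
        "ResourcePath": "/health",
        "RequestInterval": 30,
        "FailureThreshold": 3,
    },
)

# Failover records: PRIMARY answers while healthy, SECONDARY otherwise.
r53.change_resource_record_sets(
    HostedZoneId="Z0000000EXAMPLE",  # hypothetical hosted zone
    ChangeBatch={
        "Changes": [
            {
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": "shop.example.com",
                    "Type": "CNAME",
                    "TTL": 60,
                    "SetIdentifier": "primary",
                    "Failover": "PRIMARY",
                    "HealthCheckId": hc["HealthCheck"]["Id"],
                    "ResourceRecords": [
                        {"Value": "primary.shop.example.com"}
                    ],
                },
            },
            {
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": "shop.example.com",
                    "Type": "CNAME",
                    "TTL": 60,
                    "SetIdentifier": "secondary",
                    "Failover": "SECONDARY",
                    "ResourceRecords": [
                        {"Value": "secondary.shop.example.com"}
                    ],
                },
            },
        ]
    },
)
```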

So this is our final architecture: it provides multi-region resiliency, handles our global users, and lets us expand our business.

What do you think about this architecture, Teresa?

TERESA: Archana, now I think this looks great! There's one question I've been thinking about though - what about my cost of resources? Do I need to hire someone for this?

ARCHANA: Not necessarily, because we have used all managed services here, and we are leveraging out-of-the-box, easy-to-configure features. So we can train our existing staff to understand these different configurations and then use them. We should be good to go with this!

TERESA: Fantastic! This looks great. Let's do it!

ARCHANA: So now we have our ecommerce website up and running, able to handle multiple users across different regions, resilient, and so on. What's next?

We have been accumulating so much data over this period of time. We want to make sure we can also monetize that data to grow our business further.

We can create a data warehouse using a service such as Amazon Redshift: copy the data from these three sources, or any other sources that you might have, bring it into Redshift, and create your data warehouse.

You can also use the data sharing feature of Redshift, which lets you easily share live data across multiple clusters without copying it, and then run analytics and reporting on top of it. You can leverage the data warehouse for recommendations, personalization, and machine learning, get insights out of your data, and eventually monetize that data. That definitely makes our CEO happy!
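
As a sketch of that data sharing setup, using the Redshift Data API with hypothetical cluster, schema, and namespace identifiers, the producer side could create and grant a datashare like this:

```python
import boto3

rsd = boto3.client("redshift-data")

# SQL run on the producer cluster (all identifiers hypothetical).
# The consumer cluster's namespace GUID receives access and can then
# create a database from the share and query it without copying data.
statements = [
    "CREATE DATASHARE sales_share;",
    "ALTER DATASHARE sales_share ADD SCHEMA sales;",
    "ALTER DATASHARE sales_share ADD ALL TABLES IN SCHEMA sales;",
    "GRANT USAGE ON DATASHARE sales_share "
    "TO NAMESPACE '11111111-2222-3333-4444-555555555555';",
]

for sql in statements:
    rsd.execute_statement(
        ClusterIdentifier="shop-warehouse",  # producer cluster
        Database="dev",
        DbUser="admin",
        Sql=sql,
    )
```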

Let's wrap up this example scenario, and our Oscar-winning acting skills.

So we had a business objective to grow our revenue and profit in certain and uncertain times with a business continuity strategy.

How can we achieve that?

By building highly available and durable applications, with low latency and high performance, and by making sure we have a disaster recovery strategy.

And lastly, monetize our data.

How we were able to achieve that:

  • Using AWS solutions, we used managed services to take away the undifferentiated heavy lifting and let us focus on business innovation.
  • We used simple, configurable features.
  • And we can leverage AWS training resources to understand what these different configurations mean and how to configure them.
  • And then get started.

We were able to overcome all the different challenges that Teresa explained on the initial slides: limited resources, limited funding, experience and skills gaps, and technical debt.

There are, however, some design considerations we must understand for our data replication strategy:

  • The key metrics, RTO and RPO
  • Always work backwards from your business goal. What are my expectations? What does the business really need? Understand those key metrics, and then build your data strategy.
  • What is the size of the data that I'm trying to replicate?
  • Are there any security and compliance requirements? If you have data residency requirements, you cannot necessarily copy that data to a different region, so consider that when choosing your replication strategy.
  • Synchronous versus asynchronous replication - what does my application specifically require? What kind of data am I handling?
  • And lastly, logical versus physical replication.

Any time we are choosing our data replication strategy, these are the design considerations that one must think about.

There are also some tradeoffs when building a distributed system: consistency, availability, and partition tolerance - the CAP theorem - need to be considered.

The CAP theorem states that we can only pick two out of these three. Achieving consistency and availability is very common within a single region. But when we talk about a multi-region setup, we need to consider the latency of traffic, because data is transferred between multiple regions, across distance, and so on.

Thus, highly distributed systems typically compromise on strict consistency, favoring availability and partition tolerance - which is usually referred to as an eventually consistent data model.
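
DynamoDB makes this tradeoff visible at the API level. As a small sketch, using the hypothetical shop-products table from earlier, each read can choose between the default eventually consistent mode and a strongly consistent one:

```python
import boto3

ddb = boto3.resource("dynamodb", region_name="us-east-1")
table = ddb.Table("shop-products")  # hypothetical table from earlier

# Default read: eventually consistent. Cheaper and highly available,
# but may briefly return stale data right after a write.
eventual = table.get_item(Key={"product_id": "p-123"})

# Strongly consistent read: reflects all successful prior writes in
# this region, at the cost of higher latency and lower availability.
strong = table.get_item(
    Key={"product_id": "p-123"},
    ConsistentRead=True,
)
```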

Ok. In conclusion, in today's session, we talked through a few key takeaways:

  • We talked about the importance of identifying business goals and your business continuity strategy
  • Evaluating design factors and tradeoffs in your data replication strategy
  • And how to pick the right tool for the right job to achieve resiliency and empower your commercial business

I'll leave you with this - as you look at this image, build your business a trampoline rather than a safety net so that when your business inevitably faces a disaster or crisis, you bounce back higher than ever.

Thank you for your time today. We can now move on to any questions in the room.

余额充值