Accelerate innovation with end-to-end serverless data architecture


李白的朋友王维

Article tags: aws, Amazon Web Services, technology, artificial intelligence, re:Invent 2023, generative AI, cloud services

Article link: https://blog.csdn.net/just2gooo/article/details/134807926


All right, welcome everybody to this session on accelerating innovation with end-to-end serverless data architecture. I'm Vincent Owski, Principal Analytics Solutions Architect at AWS, based in E. My role is to support customers in building data platforms on AWS, and today I'm with Sandipan.

Hi, everyone. Good morning. Thank you very much for joining us in this session. I'm Sandeepan Bhowmik. I work as an Analytics Solutions Architect at AWS, based in London. I have been working at AWS for around three years, and over the past decade and a half I've been working with customers, mainly as a data professional, building and designing large-scale data platforms. At AWS, I help customers modernize their data platforms in the cloud and optimize for price performance.

Thank you, Sandy. So the objective for today is really for you to understand the benefits of using serverless in a modern data platform.

OK. And for this, we built an imaginary migration for an imaginary company, Octank, an online retailer that is moving from a monolithic platform, with its challenges, to a fully serverless and microservice-based architecture. We picked this imaginary scenario to show you the full scope of benefits.

OK. But it's inspired by lots of feedback from our customers, and we will also cover a few examples at the end of the session. Sandipan, the floor is yours.

Thank you, Vincent. So I think we are past that stage where we usually start these sessions saying that data is the new oil, the new fuel, currency, bread and butter and whatever, right? We are past that stage; we are actually seeing organizations becoming data driven now and growing massively year over year. However, we still see a large chunk of organizations that are not able to reap the benefit of the data that they produce and store, and a massive chunk of them do not even have a data strategy in place. So we want to stress the importance of having a data strategy in place. What we have seen through studies is that if you have a modern data strategy in the cloud in place in your organization, you are 8.5 times more likely to grow 20% year over year, right? Couple that with the reduction in operational costs when you move to the cloud, and you can realize a massive return on investment over time.

Now, I'm talking about modern data strategy and the importance of it. But what is that, right? I've got three words that you can take away and remember. A modern data strategy, simply put, is the backbone that you can form to make those game-changing decisions and to innovate. So the three words are these. First, modernize: move from your legacy tech to modern technologies. Second, unify: break down the data silos in different departments in your organization, bring data together, and govern the data in one place so that you can give the right user access to the right data set at the right time. Third, innovate: once you have modernized your applications and unified your data, you can really start innovating with technologies like generative AI.

Now, obviously, I'm talking about modernize, unify and innovate, but a data strategy also has other dimensions, like mindset, culture, the people you have got in your organization, the skills they have got, what kind of tools they use, and then the technology choices, right? In this session, we will focus on technology choices; we will not focus on the other parts right now. And when we talk about technology in the cloud, obviously, there is an underlying infrastructure, right? There are still servers in the data center.

Now, to be honest, managing servers, procuring them, doing maintenance tasks every week, security patching, reminds me of the old days. It's tedious, it's mundane, it's boring. And what it does is take precious time out of your business for firefighting, right? So you are not actually creating stuff, you are managing stuff on a daily basis. That's where we think serverless becomes the key for a modern data strategy for your infrastructure.

Now, what is serverless? When you choose a serverless service in your architecture, what you do is offload the burden of managing infrastructure to us. We have got expert teams in the data center who manage that for you, and because that is their job, they can innovate and provide better performance and better efficiency in managing servers, right? So as customers, what you do is simply connect to API endpoints, use the service, and pay for how much you use. And that gives you the leverage to start spending time on what adds value to your business rather than firefighting with infrastructure.

Now, what we will do is take an example: an imaginary company, Octank. The company is imaginary; the stories are real. OK? So we have collected anecdotes from customers, what we hear in our daily conversations with customers, the challenges that they face, and how we advise them to transform.

So if you look at this architecture, it's pretty simple. Octank is an e-commerce company. They provide an application where customers can go and browse a catalog, so they have a cataloging service, the product catalog, where you can look at products. Customers can search for products, so they have the search service in the application. They get recommendations for new or related products, so they have this recommendation system service running as well. And then they have ordering, obviously, so you go and order products. Now, at the back of this application sits a large database, a PostgreSQL database.

Now, any changes or updates in data that are relevant for analytics, for business intelligence, for machine learning, stuff like that, get streamed into a Hadoop platform that they host on premises, right? Then data engineers in the organization connect to that platform using things like Jupyter notebooks to do data exploration, create Spark applications, run data processing jobs, et cetera. And then there are data analysts who consume the data for business intelligence purposes. They run business intelligence reports on structured data, and that structured data is stored in another PostgreSQL database, which is optimized for business intelligence queries.

Now, this is a very simple architecture and looks pretty stable, but is Octank happy? Octank is an e-commerce company. On days like Black Friday, their employees are so tense about whether the application will be stable, whether it will support the load that they get on the system, that they themselves can't go Black Friday shopping, right? When it comes to patching, fixing bugs and stuff like that, because it is a monolithic data platform, there is always tension in making changes to one part of the system, because it might take down the whole system. Even with planned maintenance, if they make a small change in one part of the system, they have to take the whole system down, which is really expensive for an e-commerce company. Their developers have to use the tools that are dictated by the architecture; they cannot bring the tools that they are comfortable with and skilled at and connect them to the architecture, right? And it really blocks them from experimentation and innovation. If they have a new idea and want to experiment with it, they have to procure new servers, set them up and bring them together, and it takes a long time. So they cannot really run low-cost experiments in iterations to generate new ideas and develop them.

So what we will discuss here is how we could take an architecture like that to a serverless microservices architecture and what benefits Octank might realize. I will hand over to Vincent to talk about the benefits, and then I will come back to talk about examples and close things off.

Thank you. OK. So see it as a reference architecture for a fully serverless data platform. There are operational systems on the left side and analytics systems on the right side. We will start with the operational systems. Octank decided to move from this single engine to purpose-built databases, and this is very important. In other words, we don't think that a one-size-fits-all approach can be good for our customers. You will probably reach some limits at some point and have some trade-offs on cost, performance or capabilities. So we really recommend that you choose the right product for the right job. If you look into the catalog system, it requires fast access to product catalog information, mostly in a key-value access pattern. OK? So Octank decided to move this part to DynamoDB, our serverless NoSQL database. They kept the ordering system on PostgreSQL because they need strong consistency here. OK? But they moved to Aurora Serverless to benefit from the serverless experience. It's the same thing for the search system and the recommendation system. For the search system, they were using very inefficient indexes on PostgreSQL. They moved to Amazon OpenSearch Serverless to benefit from the full-text search capabilities, and they also benefited from similarity search. OK. So using vectors in the database, they are more efficient in search. The recommendation system is a typical component that requires a different approach. OK. For recommending products to customers, you typically navigate through relationships between customers and products and you try to find some similarities. So this is where a graph database shines. OK. And here Octank decided to move to Neptune Serverless. So they use the right product for the right job for the operational systems, and they are using the same approach for the analytics systems.

On the right side, we are now ingesting orders from the ordering system through Amazon MSK Connect and MSK Serverless, our managed and serverless Kafka products, into a data lake. OK? For the data ingestion part they are using Glue jobs. Glue is our serverless data integration tool that is built on top of Spark and can provide Spark compute resources very quickly to ingest your data. It supports streaming and batch. We will look at scalability later, but because Glue provides a fast startup time, it opens new capabilities for Octank. OK? So they can move first to streaming ingestion, but they can also accelerate the pace of data ingestion for batch. OK? Imagine you have resources that can come up in seconds. That means you can reduce your batch ingestion to micro-batch, near real time and even event-driven processing. So you can ingest files just when they arrive in S3, with a notification.
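To make the event-driven ingestion idea concrete, here is a minimal sketch of a Lambda-style handler that reacts to an S3 event notification and starts a Glue job for each new file. The job name `orders-ingestion`, the `--source_path` argument and the injectable `glue_client` parameter are assumptions made for illustration; the talk does not show Octank's actual code.

```python
from urllib.parse import unquote_plus

GLUE_JOB_NAME = "orders-ingestion"  # hypothetical job name


def handler(event, context=None, glue_client=None):
    """Start one Glue job run per object announced in an S3 event notification."""
    if glue_client is None:
        import boto3  # imported lazily so the sketch can be exercised without AWS
        glue_client = boto3.client("glue")

    run_ids = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        response = glue_client.start_job_run(
            JobName=GLUE_JOB_NAME,
            # Hypothetical job argument carrying the file to ingest.
            Arguments={"--source_path": f"s3://{bucket}/{key}"},
        )
        run_ids.append(response["JobRunId"])
    return run_ids
```

Passing the client in as a parameter keeps the handler easy to exercise locally with a stub before wiring it to a real bucket notification.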

OK. After the data is stored in the data lake on Amazon S3, they can process it using Spark with Amazon EMR Serverless. Here they typically clean, enrich, join, aggregate and prepare the data for consumption. When the data is ready, they expose it through Amazon DataZone to the different teams at Octank. DataZone is our data marketplace product where you can catalog data; consumers can then go to this portal, navigate through the different assets, search for the data that they need and request access, and the owners can grant access to these teams with an approval workflow. So it allows Octank to really remove the silos, OK, share the data across the organization and have different teams using the data differently.

So typically data analysts want to build dashboards on top of the data. They will use Amazon QuickSight, our business intelligence tool, to create dashboards, including using generative AI capabilities with Q, and they query the data with Amazon Redshift Serverless, our cloud data warehouse product, which provides very fast response times on low-latency and highly concurrent workloads for BI. Data engineers need more ad hoc analysis. OK? They do some exploration, they don't want to manage anything, they just want to connect and run a query. That's all. So at Octank the data engineers can now use Amazon Athena, our serverless SQL engine built on top of Trino, to directly query the data, do some experimentation, validate results, et cetera.

One of the direct outcomes of this migration for Octank is the capability to extend the platform. OK. They have access to the full catalog of products in AWS and they don't need to maintain the infrastructure anymore. So they can bring in new capabilities like SageMaker, our machine learning product. So they hired data scientists who can just access the data on the lake, experiment, train models and build an inference endpoint to expose the results directly into the operational systems. That's why you have an arrow coming back to the recommendation system. That's a typical improvement that they make: they enrich the recommendation system with a machine learning model and provide more accurate recommendations.

So that's the overall picture, OK, with the full capabilities, fully serverless. We will now go into the details of the benefits, and we will start with ease of use. Sandipan already explained the characteristics of serverless: you don't need to maintain infrastructure. OK. I just want to show you how simple it is to create and use a serverless product, with Amazon Aurora as an example. So Octank moved its ordering system to Aurora, and this is again an example of the change in mindset. OK? You are not a builder of the database anymore.

You are a consumer of the database and you can focus on development. So here we just select the version that is compatible with RDS. We configure the database with admin credentials. But what is important is the next phase, the instance configuration. You see that there is no instance, no operating system, nothing to manage. You just provide the minimum and maximum capacity for the database, and it will automatically adapt to the workload. You can of course have resiliency with multi-AZ, and you can set the connectivity to your private subnet if you want to secure your workload and isolate it. And that's it. Nothing more. OK? So again, that's very simple. That's less time spent on building the database and more time on using it, building applications and bringing features that have value for your customers.
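The console steps described here can also be expressed as API requests. The sketch below only builds the request payloads for the RDS `create_db_cluster` and `create_db_instance` calls, so it runs without AWS access. The identifiers, the admin user name and the capacity values are illustrative assumptions, not values from the talk.

```python
def aurora_serverless_v2_requests(cluster_id, min_acu, max_acu, subnet_group):
    """Build request payloads for an Aurora Serverless v2 PostgreSQL cluster.

    Capacity is expressed in Aurora capacity units (ACUs); the database
    scales between the minimum and the maximum on its own.
    """
    if not 0 <= min_acu <= max_acu:
        raise ValueError("expected 0 <= min_acu <= max_acu")

    cluster = {
        "DBClusterIdentifier": cluster_id,
        "Engine": "aurora-postgresql",
        "MasterUsername": "octank_admin",   # hypothetical admin user
        "ManageMasterUserPassword": True,   # let the service manage the secret
        "DBSubnetGroupName": subnet_group,  # private subnets for isolation
        "ServerlessV2ScalingConfiguration": {
            "MinCapacity": min_acu,
            "MaxCapacity": max_acu,
        },
    }
    # The writer uses the special "db.serverless" class: no instance size to pick.
    writer = {
        "DBInstanceIdentifier": f"{cluster_id}-writer",
        "DBClusterIdentifier": cluster_id,
        "Engine": "aurora-postgresql",
        "DBInstanceClass": "db.serverless",
    }
    return cluster, writer
```

With boto3 available, the two dictionaries could be passed as `rds.create_db_cluster(**cluster)` and `rds.create_db_instance(**writer)`; the only sizing decision left is the capacity range.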

OK. In general, you will see an improvement in development velocity and a faster time to market for new features. And this is what matters for your customers and your business.

So let's see an example. Octank fully embraced this microservice model and developed a microservice ETL on top of Amazon EMR Serverless. The idea was to give the developers, the data engineers, full autonomy over both the business code and the infrastructure code. They maintain everything in the same code repository. They can do that using AWS CDK, the infrastructure-as-code product that follows a programming approach, as opposed to a declarative approach. And they can also go one step beyond this, which is maintaining their own development process with a self-mutating CI/CD pipeline.

That means in a single code repository, a team can manage the business logic, which is the value for you, but also the infrastructure that runs this business logic, and also the infrastructure that allows them to develop and deploy it. Let's see how it works.

It starts with data engineers working locally on their laptop, using VS Code for example. They develop their Spark logic using an EMR container. With this container, they develop locally and have the exact same environment as in the AWS cloud. They can run unit tests. When they are happy with their logic, they create the infrastructure code, that is, what will run this code. And they also create the CI/CD code, the infrastructure supporting the CI/CD pipeline. This is what we call the CDK pipeline stack. OK?

They deploy this stack, it creates the CI/CD pipeline, and now they can push code to a code repository. When they push code, there is a build stage that takes the Spark business code, builds it and runs the unit tests. There is also a build of the CDK code. But what is important is this stage: if they push a change in the CDK pipeline stack, the CI/CD pipeline will self-mutate. So that means they can change it dynamically if they have a new feature or a new test. If they want to experiment, they don't need to rely on another team to adapt the CI/CD process for them. They just push the code.
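The self-mutation step can be hard to picture, so here is a toy model of the control flow in plain Python. It does not use the CDK library; the function name `run_push` and the dictionary shape of a pipeline definition are invented for illustration. The idea it sketches: after the build, the pipeline compares its deployed definition with the one in the pushed code, updates itself if they differ, restarts, and only then deploys the application stages.

```python
def run_push(deployed_definition, pushed_definition):
    """Simulate one push through a self-mutating pipeline.

    Returns the pipeline definition in effect afterwards and a log of steps.
    """
    log = ["build"]  # build business code and infrastructure code

    if pushed_definition != deployed_definition:
        # The pipeline stack itself changed: update the pipeline first,
        # then restart the run so the new definition is the one executed.
        log.append("self-mutate")
        deployed_definition = pushed_definition
        log.append("build")

    # Deploy the application stack through each stage in order.
    for stage in deployed_definition["stages"]:
        log.append(f"deploy:{stage}")
    return deployed_definition, log


if __name__ == "__main__":
    current = {"stages": ["staging"]}
    pushed = {"stages": ["staging", "production"]}
    print(run_push(current, pushed)[1])
```

In this model, adding a `production` stage is just a code change pushed by the data engineers; no other team has to edit the pipeline for them.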

After that the application stack which contains all the resources necessary to run the Spark code is deployed in a staging environment. That's a typical step where you need to run integration test, validate the results. And if you are happy with the results, you can promote the code to production environment.

OK. So this is a kind of ideal situation where the data engineer is really responsible for and owns everything that is part of the application. The application is self-contained: the logic, the infrastructure and the development process.

If you think about this microservice architecture, you can dedicate resources, and that's obviously easier to scale because you don't need to coordinate with another team; you build your own scaling strategy. But choosing serverless also brings lots of new capabilities in scalability, because you are using products that are built on top of our expertise, and we have lots of experience in scaling.

Let's take an example with AWS Glue. Glue has very fast auto scaling. OK? It can adapt the amount of resources in your Spark cluster very quickly, scaling up and down based on demand. That is pretty valuable for the streaming part, when they ingest the CDC from the ordering system, because the load is not constant. But it's also valuable for batch ingestion, because a job can start very quickly, and here we're talking about tens of seconds to start a job. That means they can do micro-batching.

As I mentioned already, they can have near-real-time processing: they can react to a file arriving in S3, just listen to the events and trigger the ingestion job. So it also brings new capabilities, and they get more value for their customers because they have fresher data in their analytics platform.

So how do we do that, how do we scale fast in Glue? We are using the warm pool approach. That's a typical pattern when you want to provide fast start time and fast auto scaling: you just pre-create resources, wait for requests, and allocate these resources to the jobs that you submit in Glue.

We developed a custom scheduler in Spark. When you submit a job, your Spark driver contains all the logic, all the intelligence; it knows how many tasks it needs to process and how many executors it needs, and it connects directly with this warm pool to request resources. So we are able to provide resources, typically Spark executors in the cluster, in a matter of seconds.

Even if the pool does not have enough capacity, it can auto scale. OK? We have a fallback mechanism on EC2. So that means we will be able to create resources quickly, in less than 10 seconds.
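The warm pool pattern with a fallback can be sketched in a few lines. This is a simplified model for illustration, not how Glue is implemented internally: the class name `WarmPool` and the executor labels are invented, and the "cold" branch stands in for creating fresh capacity when the pool is empty.

```python
from collections import deque


class WarmPool:
    """Hands out pre-created executors first, then falls back to cold starts."""

    def __init__(self, size):
        # Resources are created ahead of time and wait for requests.
        self._idle = deque(f"warm-{i}" for i in range(size))
        self._cold_started = 0

    def acquire(self, count):
        """Return `count` executors, preferring warm ones."""
        executors = []
        for _ in range(count):
            if self._idle:
                executors.append(self._idle.popleft())  # fast path
            else:
                # Fallback path: the pool is empty, so create new capacity.
                self._cold_started += 1
                executors.append(f"cold-{self._cold_started}")
        return executors

    def release(self, executors):
        """Return executors to the pool so later jobs start warm."""
        self._idle.extend(executors)


if __name__ == "__main__":
    pool = WarmPool(size=2)
    print(pool.acquire(3))
```

A job asking for three executors from a pool of two gets two warm ones immediately and one created on demand, which mirrors the warm start versus cold start distinction in the talk.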

This direct integration between the Spark scheduler and this resource provider brings fast auto scaling and fast task start time. We are monitoring this, and between Glue version one and version two we have seen a big improvement in the start time distribution. OK?

We have seen that a warm start decreased from less than one minute to less than 10 seconds, and even a cold start decreased from less than 10 minutes to approximately 35 seconds. So that's very fast. And again, it really opens new capabilities. You can do micro-batching and event-driven ingestion.

So that's the first model, using a warm pool: just pre-creating resources and allocating them to your jobs. But we can go beyond that and provide even faster creation and auto scaling. And this is typically what we do in Amazon EMR for Apache Spark.

So we are able to provide sub-second startup for your code to execute. I'm not talking about starting the EMR cluster. You know that a Spark cluster on EMR is long to start; it takes 10 to 15 seconds. Here, I'm talking about sub-second startup time, where your Spark cluster is ready to execute your code.

And for this we are using Firecracker. It's the same technology that we use in Lambda to provide very fast startup. It's a kind of virtualization on top of the containerization. So we can create instances in milliseconds with the same isolation, same network resources, etc.

But we also want to improve on this and avoid this constraint on the EMR cluster. For this, we use the snapshot feature of Firecracker. That means we can pre-start virtual machines and snapshot them after the EMR cluster has initialized. So when you request resources in EMR, we start the instance already up and running in the EMR cluster.

So we are really able to start very, very quickly. I think the time to load the page is longer than the time to start the Spark cluster. That means you can be very aggressive on scaling, and also on starting lots of jobs and lots of sessions, and you typically don't need to manage a warm pool.

So these are two examples of how we manage auto scaling in serverless products. These features are natively available to you. That's also something you don't need to care about.

OK. And that's one of the big benefits of choosing serverless. Moving to microservices also brings lots of benefits for resiliency, because if you can create more infrastructure and dedicate infrastructure to different applications without additional overhead, you will have a reduced blast radius when you have an outage.

So remember, Octank had this monolithic Hadoop cluster with all the workloads. That's typically the left side of the slide. If you have an outage on the infrastructure, it will impact all the applications and all the workloads running on it.

With a microservice approach, you just isolate resources and it only impacts the application that is running on this infrastructure. Another advantage is when you use serverless AWS products, you have some built in capabilities for resiliency.

I will take the example of Aurora Serverless here which is multi AZ by design and that's also something you don't need to care about. It's natively available in the product.

In Aurora Serverless, we have the separation of compute and storage, and we abstract this for you. That means when you write data, it is copied six times across three AZs. OK. So it's super durable and highly available.

And because you also have this separation of compute and storage, you can scale the compute independently. You can have read replicas, up to 15 readers, in different Availability Zones. So that means if you have a failure, an outage in one AZ, a read replica can take over the writer role and you can continue to run your application and your workload without any interruption.

So those are typical things. Again, we take that responsibility from you. OK? You don't need to manage it, you just rely on native features.

What about security? Security is our top priority at AWS. Think about the shared responsibility model that we follow, where we are responsible for building a secure cloud and you are responsible for building applications that are secure on top of the cloud. There is a major impact here when using serverless.

This is the typical situation where you have self-managed infrastructure like EC2 and you manage your workload. AWS is responsible for securing the layers, but we stop at the compute, storage, database and networking layer.

So that means you need to secure everything that is on top of this, as opposed to serverless, where we take care of more layers and more things for you. So you are only responsible for securing the application layer.

And this is, again, what is important: you don't spend time on securing the infrastructure, you spend time on building application features for your business. Moving to a microservice architecture also has an impact on security. We can compare here with the monolithic Hadoop cluster with YARN. Octank was using this shared infrastructure, and I think if you know Hadoop, you know how painful it is to secure it.

OK. You typically need Kerberos on top of this, and you need to operate a Kerberos domain. You need to be sure that all the frameworks on top of Hadoop are secured with Kerberos, and you also need to secure the user interfaces. So it's clearly very painful, and this is where microservices can help because, again, you dedicate infrastructure to a workload. And that's typically the case with EMR Serverless.

So now Octank can submit multiple jobs, and because each Spark cluster is isolated, they use the job execution role to specify the permissions, and they are sure that there is no mixing of permissions between jobs. So they can have fine-grained permissions very easily.
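As a small illustration of per-job permissions, the sketch below builds one EMR Serverless `start_job_run` request per job, each with its own execution role. The application ID, role ARNs, job names and script paths are hypothetical placeholders; only the request field names come from the EMR Serverless API.

```python
def job_run_request(application_id, job_name, role_arn, entry_point):
    """Build the request for one Spark job with its own execution role."""
    return {
        "applicationId": application_id,
        "name": job_name,
        # Permissions are scoped per job through the execution role.
        "executionRoleArn": role_arn,
        "jobDriver": {"sparkSubmit": {"entryPoint": entry_point}},
    }


# Hypothetical jobs: each one gets a role that can only reach its own data.
JOBS = {
    "orders-etl": "arn:aws:iam::123456789012:role/orders-etl-role",
    "catalog-etl": "arn:aws:iam::123456789012:role/catalog-etl-role",
}

requests = [
    job_run_request(
        application_id="00example0app",  # placeholder application ID
        job_name=name,
        role_arn=role,
        entry_point=f"s3://octank-artifacts/{name}/main.py",
    )
    for name, role in JOBS.items()
]
```

With boto3, each dictionary could be submitted as `emr_serverless.start_job_run(**request)`. Keeping one role per job is what avoids mixing permissions between jobs.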

Now, you may think that if we create more resources with a microservice approach, it may be more expensive. I will hand back to Sandipan, who will explain this.

Thank you, Vincent. So Vincent has been talking about using purpose-built systems, right? Purpose-built serverless services for each use case. With so many services together, I know, we get this question from customers: how much are we going to spend? Right. And I'm pretty sure that you are thinking about that as well.

Now, when we talk about cost with serverless services, we talk about the total cost of ownership, right? When you use a serverless service, you do not pay for the service when you are not using it, and you do not pay for over-provisioned resources. So you're not actually paying for a lot of resources when you don't need them, which you do with an on-premises server, right? There you would provision for 80% utilization, and when your utilization is at 20% you are still paying for those 80% utilization resources, right? You don't need to do that with serverless. You also have reduced operational overhead, so with operational efficiency you stop spending time on firefighting and managing, and that saves you cost. And lastly, combining all of those things, you are enhancing your productivity; you are innovating and creating new products for your company, which brings more revenue and brings down the total cost of ownership over time.
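A quick worked example may help with the over-provisioning argument. The numbers below are invented for illustration: capacity is provisioned for a peak of 80 units, but the peak only lasts 4 hours a day and the remaining 20 hours run at 20 units. The sketch also assumes the same unit price in both models, which is a simplification; real prices differ between provisioned and serverless offerings.

```python
def provisioned_cost(peak_units, hours, unit_price):
    """Cost when capacity for the peak is paid for during every hour."""
    return peak_units * hours * unit_price


def usage_based_cost(hourly_usage, unit_price):
    """Cost when only the units actually used in each hour are paid for."""
    return sum(hourly_usage) * unit_price


if __name__ == "__main__":
    usage = [80] * 4 + [20] * 20      # hypothetical daily profile
    fixed = provisioned_cost(80, 24, 1.0)
    on_demand = usage_based_cost(usage, 1.0)
    print(fixed, on_demand)           # 1920.0 720.0
```

Under these assumptions the usage-based bill is 720 against 1920 for the provisioned one, and the gap grows the more time the workload spends far below its peak.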

I will give you an example with Amazon Redshift Serverless. Think about using a data warehouse, right? With traditional data warehouses, you procure the data warehouse and pay for licenses over time. It is very costly. On top of that, you have to have administrators who do admin tasks, manage the servers and do maintenance jobs, and all of those things take time and money as well. With serverless, if you see the darker box here, this is the time that your data warehouse is available to your users; they can go and run queries. But you are just paying for the times when the queries run, right? So just those boxes. That means although you have this data warehouse available, you're not paying for it all the time; you're just paying for it when you send queries to it. And Redshift Serverless has machine-learning-powered maintenance jobs. So things like auto vacuuming, freeing up memory, updating statistics, the normal data warehousing maintenance jobs that you do, Redshift does automatically for you, which means that you are not spending time on them, right?
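The "pay only for the boxes where queries run" idea can be modelled as the total time covered by at least one running query. The sketch below merges overlapping query intervals and sums them. It is a simplified model under stated assumptions: the function name is invented, times are plain seconds, and it ignores any minimum charge or billing granularity that the real service applies.

```python
def active_seconds(query_intervals):
    """Total seconds during which at least one query is running.

    `query_intervals` is an iterable of (start, end) pairs in seconds.
    Overlapping queries are counted once, not once per query.
    """
    total = 0
    current_start = current_end = None
    for start, end in sorted(query_intervals):
        if current_end is None or start > current_end:
            # A gap with no queries: close the previous busy window.
            if current_end is not None:
                total += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlap: extend the current busy window.
            current_end = max(current_end, end)
    if current_end is not None:
        total += current_end - current_start
    return total


if __name__ == "__main__":
    # Two overlapping morning reports and one isolated query later on.
    print(active_seconds([(0, 60), (30, 90), (600, 660)]))  # 150
```

In this example the warehouse is available the whole time, but only 150 seconds count as active.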

So bringing all of them together, it really helps you reduce your total cost of ownership with serverless services. Now, another important thing: when you use our services, it becomes much easier to improve your efficiency. So sustainability is something that we think about seriously. When we talk about architectures with our customers, we advise them to embed sustainability into their architecture and think about it as a non-functional requirement. That's why we have the sustainability pillar in our Well-Architected Framework as well.

When we think about sustainability, when it comes to actual like data platforms and solution architecture, the main focus is maximizing efficiency and reducing wastage, right? So think about this. When you use a serverless service, the actual servers running in the data center that we manage for you, we are not blocking them for your workloads. When you don't use those, we can use those servers for other workloads. So we are not reserving hardware in the data center. Right?

Then again, with efficiency: reduced operational overhead, reduced manual tasks, you reduce errors, you maximize efficiency, right? You make sure that you are utilizing all of the resources that you have invested in, and that's what we make sure of in the data center as well. We come up with innovations in the data center that help us manage the servers better. Overall, over time, it brings down your energy consumption, which results in lower carbon emissions.

Again, I'll give you an example with an unpredictable workload like business intelligence. We took the example of Octank, right? Business intelligence workloads are really unpredictable. You have analysts running reports; you could have days where too many analysts run too many reports at the same time on the data warehouse. And then you will have times where there are no reports running during the day, or days where no one runs anything. Now, with services like Amazon Redshift Serverless, which are really optimized for serving BI workloads, you can really make sure that you run your data warehouse lean, right? That's what you need to focus on; you don't want to run a bulky data warehouse.

So when you run it lean with serverless services, you can get resources provisioned really quickly on demand. So when there is an hour in the morning where a lot of reports run, Redshift will automatically provision extra compute. And then when that subsides, when reports stop running, it will take that compute away. That makes sure that what you're running throughout the day is really lean, that you still don't affect the peak times, and that your queries are served even when you're sending a large number of queries to Redshift.

Now, we talked about a lot of benefits that serverless provides. But what it results in is how you innovate in your organization. When you think of innovation, to take an idea and develop it into a product, into a business, you have to run experiments, and you need to make sure that you can run multiple experiments at low cost. Imagine doing that on premises or even in self-managed applications: it's really difficult to control the cost, and it's really difficult to run them in iterations, multiple times. And then you need to have the agility to take advantage of opportunities.

So when generative AI came up in the world, many businesses took the opportunity to create applications that help their businesses create new products. When COVID hit, many organizations took that opportunity to serve their customers better, right? So innovation is really important.

Think about it from the serverless side. We are talking about Octank, the e-commerce company. A company like that, which switches to a serverless architecture, can really start thinking about cutting-edge use cases in generative AI, right? So Octank started thinking about some of these use cases that relate to their industry and picked one of the departments, customer support, where they can start building products that help their customers.

So what they came up with is this live call analytics and agent assistance system. It helps Octank's customer support executives understand customer sentiment in real time. The system provides an assistant which summarizes the customer sentiment, takes the knowledge from the customer data and other data available in the organization, uses large language models to summarize that, and puts it back to the agent as advice on what the agent should tell the customer, right, in real time.

Now, think about building a system like that, right? You see the calls coming in, getting transcribed in real time from speech to text, then streamed through Kinesis Data Streams. And then Lambda uses that, binding multiple machine learning models, using summarization from large language models, getting the knowledge from Amazon Kendra, and then putting it back to the call center agent as advice, the next best action for their customers.
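As a rough illustration of the flow just described, here is a minimal Python sketch of a Lambda-style handler that reads transcript segments from a Kinesis-triggered event and turns them into advice for the agent. The base64-encoded `kinesis.data` field follows the event shape Lambda receives from Kinesis; everything else is an assumption made for illustration: the `text` field in each segment, and the three placeholder functions standing in for the sentiment model, the knowledge search, and the large language model summarization. It is not the actual Octank implementation.

```python
import base64
import json
from typing import Callable, Dict, List


def decode_kinesis_records(event: dict) -> List[dict]:
    """Decode the base64-encoded JSON payloads of a Kinesis-triggered event."""
    segments = []
    for record in event.get("Records", []):
        payload = base64.b64decode(record["kinesis"]["data"])
        segments.append(json.loads(payload))
    return segments


def build_agent_advice(
    segments: List[dict],
    detect_sentiment: Callable[[str], str],
    search_knowledge: Callable[[str], List[str]],
    summarize: Callable[[str, List[str]], str],
) -> Dict[str, str]:
    """Combine sentiment, retrieved knowledge, and a summary into agent advice."""
    # "text" is a hypothetical field name for one transcribed utterance.
    transcript = " ".join(segment["text"] for segment in segments)
    sentiment = detect_sentiment(transcript)
    passages = search_knowledge(transcript)
    advice = summarize(transcript, passages)
    return {"sentiment": sentiment, "advice": advice}


# Placeholder stand-ins (assumptions): a real system would call a sentiment
# model, a search index, and a large language model at these points.
def placeholder_sentiment(text: str) -> str:
    return "NEGATIVE" if "refund" in text.lower() else "NEUTRAL"


def placeholder_search(text: str) -> List[str]:
    return ["Made-up knowledge passage about the refund policy."]


def placeholder_summarize(text: str, passages: List[str]) -> str:
    return f"Suggested next step based on {len(passages)} knowledge passage(s)."


def handler(event: dict, context: object = None) -> Dict[str, str]:
    """Entry point in the shape Lambda expects: (event, context)."""
    segments = decode_kinesis_records(event)
    return build_agent_advice(
        segments, placeholder_sentiment, placeholder_search, placeholder_summarize
    )
```

Passing the three steps in as functions keeps each one swappable, which is the Lego-block idea in small: you can experiment with a different model or search backend without touching the rest of the handler.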

There are so many services, right? And all of them are serverless here. When you think about developing these kinds of services, you have to stitch them together like Lego blocks, right? And then you have to make sure that you can experiment, develop, and improve this over time, and you can only do that efficiently when you move towards a microservices, event-driven, serverless architecture. Then it becomes really easy to start building these applications.

Now on AWS, when it comes to data and machine learning, we have got a full stack of serverless services that will help you with any use case, whether you're ingesting data, storing data, processing it, building search applications, generative AI, or data governance with Amazon DataZone, right? So you get that full stack of serverless services that you can put together to build applications, build new products, and monetize your data, which in turn generates revenue. That's where it ties back to the statistic I was mentioning earlier: when you have a modern data strategy in place, you are 8.5 times more likely to grow in revenue.

Now, we have been talking about examples with imaginary companies. I want to talk about two customers who have leveraged serverless services to improve their operations and reduce the burden on their teams in terms of managing servers.

So BMW. Everyone knows BMW? In case you don't, BMW is a German motor manufacturing company. They had a big Hadoop cluster running on prem: time consuming, hard to scale. It took their teams a lot of time. It did not have any self-serve capability for the analysts to build jobs on top of it. So they started using AWS Glue for structured data ingestion. When they started using Glue as a serverless service, they could access APIs to programmatically trigger and automate their ingestion jobs, so that they spend less time firefighting with ingestion jobs. And then Glue has got the self-serve capabilities with the cataloging. So with the catalog, analysts started to find data easily and also build jobs by themselves using Glue Studio. This helped them self-serve data. And because they could self-serve data, it increased the adoption of their data platform, right? So they could build trust within their organization for the users to use data.
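To make "trigger programmatically" concrete, here is a minimal Python sketch of starting a Glue job run through the API. `start_job_run` with `JobName` and `Arguments` is the Glue operation exposed by boto3; the job name, the `--source_path` argument, and the bucket path are hypothetical, and this is not BMW's actual setup.

```python
from typing import Any


def trigger_ingestion(glue_client: Any, job_name: str, source_path: str) -> str:
    """Start one run of an existing Glue job and return its run ID.

    glue_client is expected to behave like boto3.client("glue").
    """
    response = glue_client.start_job_run(
        JobName=job_name,
        # Job arguments are passed as string key/value pairs; this key is
        # a hypothetical parameter that the job script would read.
        Arguments={"--source_path": source_path},
    )
    return response["JobRunId"]
```

With real credentials this would be called as `trigger_ingestion(boto3.client("glue"), "orders_ingest", "s3://example-bucket/orders/")`, for example from a scheduler or an event handler. Taking the client as a parameter keeps the function easy to test without touching AWS.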

NextGen Healthcare is an American software company that provides healthcare software for electronic health records and patient management systems. They were managing multiple data warehouses, and it was really hard for them to manage all of those. Their teams had huge operational overhead, lots of manual effort. They moved to Redshift Serverless. Once they moved to Redshift Serverless, they could take advantage of the autonomics that Redshift Serverless provides. We announced a new one yesterday, the AI-powered performance optimization feature, where there is a slider and you can choose between cost and performance, right? So you can be in the middle to balance cost and performance, or you can optimize for cost or for performance depending on your needs. And Redshift will automatically, through machine learning, decide how to optimize queries when users hit the data warehouse with queries, right?

So these kinds of autonomics really help customers, in this case NextGen, to reduce the overhead, lower the cost over time, and then leverage Redshift for innovation using features like Redshift ML. You can start developing machine learning models in simple SQL on top of your data warehouse. So that's the leverage you can take with services like this.
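As a hedged illustration of "machine learning models in simple SQL", the Python sketch below only assembles a Redshift ML `CREATE MODEL` statement as text. The clause layout (`FROM`, `TARGET`, `FUNCTION`, `IAM_ROLE`, `SETTINGS` with `S3_BUCKET`) follows the basic documented form of the statement; the model, table, column, function, and bucket names are all made up, and nothing here is NextGen's actual usage.

```python
from typing import Sequence


def create_model_sql(
    model: str,
    table: str,
    features: Sequence[str],
    target: str,
    function: str,
    s3_bucket: str,
) -> str:
    """Build a basic Redshift ML CREATE MODEL statement as a string.

    Identifiers are inserted as given, so this is only suitable for trusted,
    hard-coded names, not for user input.
    """
    columns = ", ".join(list(features) + [target])
    return (
        f"CREATE MODEL {model}\n"
        f"FROM (SELECT {columns} FROM {table})\n"
        f"TARGET {target}\n"
        f"FUNCTION {function}\n"
        f"IAM_ROLE default\n"
        f"SETTINGS (S3_BUCKET '{s3_bucket}');"
    )
```

Once a model like this is trained, the function named in the `FUNCTION` clause can be called from ordinary `SELECT` queries, which is what lets analysts stay in SQL.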

Now you must be thinking, that's all great, but where do we start? Right. What is your next action? We can help you depending on where you are in your organization's journey. For example, let's say you don't have a data strategy in place. We can help you with our teams of data strategists, who can work with you to create a data strategy from scratch. If you are on prem and want to migrate to the cloud, and you have got some databases or traditional data warehouses running that you want to modernize, we have got the reimagined data program, where our modernization experts will work with your team to create the modernization and migration strategy. So you can avail of that program. And thirdly, if you're already running workloads in the cloud, you have got data warehouses and data systems running in the cloud, we can engage experts who can do a Well-Architected review of your architecture and provide you best practices on how to optimize it for each of the benefits we discussed, but also provide you a re-architecture proposal on how you can actually move to serverless services and leverage all the features that serverless provides.

So when it comes to innovation with generative AI, building new data products, reducing cost in your organization, your data is your differentiator. Having a modern data strategy in place is where you start. Create that modern data strategy, focus on your technology choices, and see how you can leverage serverless services to deliver on the data strategy.

Thank you very much for your time. We will be around to answer any questions, so please feel free to approach us. You can also get in touch with us on LinkedIn; the QR code will take you to our LinkedIn profiles. And please fill out the survey for us. Tell us what you think about the services, about serverless, about this session. Give us some feedback. It's really important for us. Thank you very much. Thank you for your time.
