Putting your data to work with generative AI

Please welcome Vice President of Technology at AWS, Mai-Lan Tomsen Bukovec.

My name is Mai-Lan Tomsen Bukovec and I'm a Vice President of Technology at AWS. Now, I have been working on AWS cloud services for over 13 years, and I focus on those large-scale data services that are essential for any cloud application, like storage, streaming, messaging, and observability.

Now, every modern business is a data business and I have spent a lot of time over the years talking to customers about how to use data to drive their digital transformations in the cloud. Now, we are having that same conversation about how to use your own data with generative AI.

Generative AI, as many of you know, is a kind of AI that creates new content, whether that is imagery or text or audio or video. And it's based on the patterns and data learned by the underlying model. Now, that generative capability as you all know, can transform how businesses think and operate. But to do that, you need to understand the role of your own data in generative AI. And that's what we're talking about today, how to customize your generative AI applications and your AI system with your own business data.

We're also going to unpack the top three data initiatives that are central to generative AI and that span industries and geographies. Now, if you are somebody who works with data in an organization and you aren't already talking about these three initiatives, you'll want to start. They are just so critical for businesses that are adopting generative AI.

And as part of this conversation today, I'm going to help you connect the dots between these three critical data initiatives and the most important announcements for data this week. My goal is not only to share what data organizations across the world are doing for generative AI but also how you can use AWS to put your data to work.

Now, data is growing at an incredible rate, powered by consumer activity, sensors, business analytics, and many other drivers, and that data growth is driving a flywheel for generative AI.

Foundation models, or FMs, are trained on massive data sets from sources like Common Crawl, which is an open repository of data that contains petabytes of web page data from the internet. Now, enterprises, companies like yours, are using smaller private enterprise data sets for additional customization of FM responses, creating what is really a new intermediate set of data. I'll talk more about that today.

These customized models will in turn drive more generative AI applications which through customer interaction, create even more data for this data flywheel. So what data is going into that flywheel?

IDC forecasts the amount of data generated on an annual basis, and they predict 22% growth in data over the next five years; by 2027, the world will have 229,000 exabytes. Now, that data is going to include structured and unstructured data, but the vast majority of it is going to be unstructured data, like video, PDF files, audio, and text files. 90% of that data is going to be unstructured, and it dominates the growth of data in the upcoming years.

Now, the quality of that data matters for generative AI, because higher quality data improves the accuracy and reliability of the model response. In a recent survey of CDOs, we found that almost half, 46%, of CDOs view data quality as one of their top challenges to implementing customized generative AI. And 93% of those CDOs said that an end-to-end data strategy, and its role in making generative AI custom to their business, is one of the most important things they can do.

So it's not just data that matters, it's high quality enterprise data that is safely and correctly and responsibly used.

So let's start by understanding the relationship of data all up with FMs. Now, as many of you already know FMs are large neural networks trained on massive data sets to develop a broader generalized understanding. And these capabilities can then be customized for specific tasks.

Now, because these models are trained on vast amounts of data, they are able to learn more complex representations and patterns. And many of the leading FMs are trained on data stored in Amazon S3. Like the most recent Falcon 180 billion parameter model is trained using data in S3. And Anthropic uses S3 to store hundreds of petabytes of training data and model parameters.

And because those FMs are trained on multimodal data, for example, both text and images, they understand different modalities which in turn enriches their generalized understanding of tasks. These are the FMs that are driving a new generation of applications.

So one of the questions I hear from customers all the time is should I use an existing FM or train a new model? And the vast majority of the time my answer will be use an existing FM because these models are just so capable and they're evolving so rapidly in their generalized understanding and knowledge. Plus, you can customize the responses of these models or the models themselves with your own enterprise data. And that process got even easier this week and that's what I'm going to be talking about in a little bit.

Now, folks, some of you are already deeply knowledgeable in ML and AI, and I'm going to be telling you a few things that you already know, but we have others in our audience who are learning. So I'm going to give a super quick summary of some ways to customize generative AI experiences. And for those of you who already know how FMs work, we're going to be digging in shortly with details about how customers at scale like Adobe and Pinterest are doing generative AI with their custom data and how AWS is helping. So hang tight, let's talk about those three techniques for customizing an FM with your own data. And I'm going to start with the easiest first.

Many customers use prompt engineering to customize their foundation model. And that's because it's both simple and it's cost effective and it lets you refine your inputs or your prompts for generative AI so that you get on target outputs and optimal results.
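As a rough illustration of the prompt-engineering path, here is a minimal Python sketch that shapes a prompt and sends it to a Bedrock-hosted model. The model ID, region, and prompt are placeholders, and the request body follows the Claude 2 format that Bedrock documents for Anthropic models; treat it as a sketch rather than the exact setup discussed in the talk.

```python
import json
import boto3

# Minimal prompt-engineering sketch against Amazon Bedrock (assumed region and
# model ID; the "Human:/Assistant:" body shape is the documented Claude 2 format).
bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

# The "engineering" is in the prompt itself: role, constraints, and tone.
prompt = (
    "\n\nHuman: You are a support assistant for an outdoor-gear retailer. "
    "Answer in two sentences, in a friendly tone, and only about our products.\n"
    "Question: What should I look for in a winter sleeping bag?"
    "\n\nAssistant:"
)

response = bedrock.invoke_model(
    modelId="anthropic.claude-v2:1",
    body=json.dumps({"prompt": prompt, "max_tokens_to_sample": 300, "temperature": 0.2}),
)
print(json.loads(response["body"].read())["completion"])
```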

Now, you can customize the outputs of an existing model using a technique called retrieval augmented generation (RAG). And you don't need to retrain your model if you're using RAG; that's really important. With RAG, external data is used to augment your prompts, and that data can come from multiple sources, including document repositories, APIs, and databases. RAG helps the model adjust its output with data retrieved as needed from these knowledge sources.

Now, when you fine-tune an existing FM, you're using a smaller data set from your own domain, and it's basically creating a new model for you from your prepped data set. It's the same idea with continued pre-training, which you can do with some FMs: you're picking up where the FM provider left off, training the model on data sets in your enterprise to extend both the generalized and specialized knowledge of the model.

Now, I'm going to walk through these three different techniques in the context of Amazon Bedrock, because I think Amazon Bedrock is the easiest and best way to evaluate which technique is right for you and which model you want to use. If you haven't tried it yet, you should. I think Amazon Bedrock is amazing. It has a lot of capabilities today, it's evolving at a super rapid rate, and as of this week, it supports all three of these capabilities to customize your model responses with your data.

So let's talk about each of these in more detail, but we're going to talk about them in the context of your data's characteristics. If you have data that changes often, like weather data, product inventory, or news, you aren't going to want to update your model every single time your data changes, which would be constantly if you're passing in recent weather patterns. For fast-changing data, we find most of our customers tend to use RAG. RAG helps your model respond with the most recent data as context, and that context influences the response. You're not changing the model itself; instead, RAG adds data into the context of the user's request or question to steer the model's response in the right direction.

Say, for example, you want your FM to know about last week's sales reports. A RAG-based system can incorporate that data into the FM's response. RAG works by retrieving relevant facts from an external knowledge source, like a file in S3, and it grounds the FM on the most accurate and up-to-date information.

Now, with RAG, you preload information based on your own data into the context of the LLM by taking your data and creating embeddings for it. Embeddings are numerical representations of your text, images, video, or audio that capture semantic meaning, and at the end of the day that meaning is stored in a really fancy index. That semantic awareness can mean the difference between a model knowing whether you're talking about a financial table or a dinner table, or whether, when you use the word bike, you mean a motorcycle or a bicycle.
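To make the embedding idea concrete, here is a minimal sketch using the Amazon Titan text-embeddings model on Bedrock and plain cosine similarity. The model ID and request shape follow the Bedrock documentation; the two toy documents and the query are illustrative, and a real system would store the vectors in a vector index rather than compare them in memory.

```python
import json
import boto3
import numpy as np

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

def embed(text: str) -> np.ndarray:
    # Titan text embeddings return a list of floats capturing semantic meaning.
    resp = bedrock.invoke_model(
        modelId="amazon.titan-embed-text-v1",
        body=json.dumps({"inputText": text}),
    )
    return np.array(json.loads(resp["body"].read())["embedding"])

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

docs = ["Q3 sales report for the EMEA region", "Recipe for banana bread"]
doc_vectors = [embed(d) for d in docs]
query_vector = embed("What were last quarter's European sales?")

# The semantically closest document is what a RAG system would pull into the prompt.
scores = [cosine(query_vector, v) for v in doc_vectors]
print("Best match:", docs[int(np.argmax(scores))])
```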

And as of this week, we also have the Claude 2.1 model available in Bedrock. What's super interesting about Claude 2.1 is that it has a 200,000-token context window, which is super helpful for RAG because it means you can pass in as much as a 500-page document as a prompt.

Now, RAG is the easiest way to customize model responses with your own data, but customers can do fine-tuning as well. Fine-tuning an existing FM means you're customizing that FM on your domain-specific data, which gives it context about your underlying enterprise.

Now, what's interesting about fine-tuning is that you can not only specialize the model to your domain and your context, but you can also specialize it to the tone or style of that sample data set. So when you're fine-tuning, you're providing examples that set not only the domain context but also the style you want to inject into your customized model.

Now, this takes a little bit of data prep to get to that high quality curated data set, but it can really impact the relevance of the responses for the model. Now keep in mind models are only as good as the data that they learn from. So make sure you're using high quality data when you're fine tuning.

And as of this week, fine-tuning in Bedrock is generally available for Cohere Command, Meta Llama 2, Amazon Titan Text, and Amazon Titan Multimodal Embeddings, with Amazon Titan Image Generator in preview, and fine-tuning support for Anthropic Claude is coming soon. You have a lot of choice for fine-tuning and customization.

Now, the third way that you can customize a model is to do continued pre-training. In this case, you're picking up where the FM provider left off, and you're using larger unstructured and unlabeled data sets stored in your enterprise, like thousands of text documents in your S3 data lake. And today, Bedrock supports continued pre-training with Amazon Titan models.
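As a rough sketch of how these customization jobs are kicked off programmatically, here is a hedged example using the Bedrock create_model_customization_job API. The role ARN, bucket paths, base model ID, and hyperparameter values are placeholders, not recommendations; switching the customizationType flips the same call from fine-tuning to continued pre-training.

```python
import boto3

bedrock = boto3.client("bedrock", region_name="us-east-1")

# Fine-tuning: labeled prompt/completion pairs prepped from your own domain data.
bedrock.create_model_customization_job(
    jobName="support-tone-finetune-001",
    customModelName="titan-text-support-tone",
    roleArn="arn:aws:iam::111122223333:role/BedrockCustomizationRole",
    baseModelIdentifier="amazon.titan-text-express-v1",
    customizationType="FINE_TUNING",
    # Use "CONTINUED_PRE_TRAINING" instead to pick up where the FM provider left
    # off, pointing trainingDataConfig at larger unlabeled text in your data lake.
    trainingDataConfig={"s3Uri": "s3://my-prepped-datasets/finetune/train.jsonl"},
    outputDataConfig={"s3Uri": "s3://my-prepped-datasets/finetune/output/"},
    hyperParameters={"epochCount": "2", "batchSize": "1", "learningRate": "0.00001"},
)
```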

Many of our customers, by the way, are going to be using multiple techniques with multiple FMs: they're going to use FMs as is, they're going to fine-tune, they're going to add RAG for context, and sometimes they're going to do continued pre-training of a model. We see that customers at scale often have a collection, really an ensemble, of models that drive their AI system.

And I can't think of a better person to talk about how to use generative AI at scale than Alexandru Costin, the VP of Adobe Firefly and Sensei, which power generative AI in Adobe products for thousands of creators today.

Please join me in welcoming Alexandru, everybody.

My name is Alexandru Costin. I drive the generative AI agenda at Adobe, and I'm here to talk about how we use some of the AWS capabilities, how we think about our data, and to share some of the learnings from our journey.

At Adobe, we've been serving our customers for four decades, and we've helped them navigate many technological disruptions: digital photography, the internet, the social era. But last year, we realized that a new era was happening, the AI era. For us, it was a very important moment, and we decided to invest early and start doubling down on transforming the Adobe product lines into AI-enabled products.

And we talked to many people across all our segments. We talked to consumers, we talked to small businesses, students, and education, we talked to our creative professional users, and we talked to enterprises. And we asked them: what do you think Adobe should do to bring generative AI to all of you the right way?

And they told us they need particular things, because they're very specific about creating and editing content. They told us they want control; they want to make sure they can materialize what's in their mind's eye and be able to create and tell the story they want to tell.

They told us integration into our product line is very important, and for that we want to bring those capabilities inside the products they live in and use, like Photoshop.

They also told us that they want to customize it, be able to create content on brand and create content variations at scale. But most importantly, they told us, they need us to create content that is safe for commercial use. Meaning they gave us this strong signal that we need to think really deeply about the data we use to train our models.

And this is what helped us create a series of models this year across image generation, vector generation for illustrations, and design generation. I'm going to show you a quick video with some of the capabilities we've launched throughout 2023.

Thank you. We're very proud of what we've accomplished, but we're here to talk about data. Before we do that, though, what was very important for us was the success those capabilities had with our customers. Seeing that training on the right data and giving them the capabilities they needed led to amazing success and use of those capabilities.

For example, Generative Fill, which is how the Firefly model is integrated into Photoshop, is the most used feature in Photoshop today; it's used at a rate 10x higher than any other feature we've ever introduced in Photoshop's history. And throughout the year, more than 4 billion images have been generated with Firefly.

And again, a very important point for our customers was how we train these models and the data we train them on, so let me walk you through how we do it, reiterating what Mai-Lan said when we talked about data. We have the Adobe Stock marketplace, a marketplace of content, hundreds of millions of images, illustrations, and videos that Adobe offers to our customers.

And we have stock contributors that participate in this marketplace. We've decided to take this as the foundational data set we want to train our models on. And of course, the stock images are stored in AWS. But we've also decided to enrich this data with a lot of embeddings and augmentations to make it better for training these models and increase their quality.

And there are a lot of models involved in this process: models that make the data better for training, and also models that participate in the generation, when we think about the whole process of how we train our models.

So we have Adobe Stock, hundreds of millions of assets. Those assets are already curated and moderated, using both human moderators and AI, and they do not contain trademarks, intellectual property, or recognizable characters, because we want to make sure our model cannot generate those.

So our data is filtered, and this gives us the confidence that our model cannot generate a particular brand, logo, or recognizable character.

We take all this data and then, using various LLMs and other classifiers, we augment it. We create precomputed embeddings that help us not only add quality to the data but, at the same time, increase the speed at which we train, because having these precomputed embeddings allows us to train faster, not having to recompute them every time we run large distributed training jobs and load the data. Loading data fast and keeping the GPUs occupied is a very important part of the equation.

So we keep data in S3, we keep data in ElastiCache and in FSx, and we continuously synchronize it with the database in Adobe Stock, check for lineage, and make sure we have traceability of how models are trained and which data they were trained on, to make the models better.

We also have a process called RLHF; you might have heard of reinforcement learning from human feedback. We collect likes, dislikes, and downloads from our apps and feed those back into the training data to teach the model to generate more of the assets our customers would love.

So at the macro level, when we decided to go about it, and again this was 2022 for us, we created a team responsible for creating data sets as a product. This data team has the sole role of taking data, preparing it for training, compacting it in diverse ways, computing these embeddings, and sharing it with hundreds of Adobe and Firefly researchers.

So they can focus on training the models, making sure that the data they train on is very high quality.

We operate with petabytes of data: petabytes of raw data, petabytes of embeddings. When we fine-tune for a new modality like illustrations, we add more petabytes, and vectors and videos are going to increase the quantity of data we have to operate on. Using AWS's scalable solutions enabled us to move very fast and not worry about any of those data sizes; the Amazon solutions helped us operate at that scale.

But we've built software on top. We used open source to stream our data to the training machines. So we package the data and store it in S3. We're very excited about S3 Express One Zone, which will allow us to keep the data closer and faster to the training nodes, and also cheaper. We compact the data into shards and stream them to the training machines.

So with multimodality, it's very important to invest in a software layer, or use some of the offerings AWS is bringing to the table, to make sure your GPUs stay occupied all the time during training.

And for us at Adobe, while we decided to train our own FMs, we're a large company and we think our customers need us to invest in our own FMs, we've also tried to stay ahead of the regulations. We have always practiced responsible AI, and we've invested in something called Content Authenticity, which enables us to bring transparency not only to how we train on data but also to whether documents are AI generated or not.

We're working with different governments and giving them advice on how to improve and regulate training, to make sure that everybody in the world and governments can have a say in how models are created to make sure they're trained responsibly.

And we also invest in heterogeneous computing, because this is the challenge we're seeing these days: not only do you need to train on your data, but sometimes you need to add some of your customers' data to create better products. I was very excited to see the announcement of AWS Clean Rooms, which might give us the opportunity to bring data sets together in a clean way without interference.

And if you train with data from the internet, there are emerging threats, things like changing laws but also data poisoning. There are new techniques happening out there where artists are trying to protect their data by labeling it in novel ways.

So you need to really invest in how you collect, manage, and govern your data sets to make sure your models, be they RAG-based or fine-tuned, are high quality.

Finally, I hope our story helped you understand how we succeeded at Adobe in creating many models, with many more in the pipeline, by really investing in data. We think it's still exponential; we live in exponential times. So it might feel late for some of you to come in, but I think the biggest change is still ahead of us in terms of how these generative models will change industries and hopefully make all knowledge workers and businesses better.

We do think more data transparency and governance will be needed and regulations will enforce that in various geographies. And I do hope that you will also take this to heart and start investing in your own data sets, start training and fine tuning your models in order to succeed and embrace this generative AI wave.

I want to thank the AWS team again for the partnership. We wouldn't have been able to succeed without them. And I think Mai-Lan will join me on stage in a second. Thank you.

All you need to know is your data. And earlier this year, customers were able to use Canvas to access multiple FMs on Amazon Bedrock or through SageMaker JumpStart. And now with Canvas's no-code interface, customers can upload a data set, select an FM and Canvas automatically starts helping customers build their custom models.

All right, we've talked about how you can create and use custom data sets. The second data initiative that every data practitioner needs to think about is leveraging and extending your existing data architecture with new generative AI applications.

Here's a fact: customers do not want to create new data architectures for what is essentially a new application type. They want to take advantage of the systems that already exist to store and use data for other business applications, and they want their generative AI applications to follow the rules of the enterprise for data access, governance, and compliance.

Think of those generative AI applications as a new application type that sits on top of an existing data foundation. That means you want to plug in your existing data sources like your S3 data lakes and you want to use other data building blocks that you and your organization are already familiar with and using today in your data architecture.

So I talked earlier about how important it is to keep the vector data up to date with the most recent data, because that's how you get fast, accurate, and relevant responses using RAG. Now, with AWS, customers want to use their existing data stores to store embeddings too. They can use Amazon OpenSearch, which is a great choice if your use case involves search, because it provides semantic search: it combines vector and full-text search in a single query.

If you're using Amazon Aurora PostgreSQL or Amazon RDS for PostgreSQL, you can use the pgvector extension as your vector store. These are great choices if you're already using those relational databases and you want to join the vector data with the traditional table data in your queries.
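As a hedged sketch of what the relational-database option looks like, here is pgvector used from Python with psycopg2. The connection details, table, and toy three-dimensional vectors are placeholders (in practice you would use your embedding model's dimension, for example 1536 for Titan text embeddings); the point is that vector similarity and ordinary SQL predicates live in the same query.

```python
import psycopg2

# Placeholder connection details for an Aurora PostgreSQL or RDS for PostgreSQL endpoint.
conn = psycopg2.connect(
    host="my-aurora-cluster.cluster-abc123.us-east-1.rds.amazonaws.com",
    dbname="appdb", user="app", password="example-password",
)
cur = conn.cursor()

cur.execute("CREATE EXTENSION IF NOT EXISTS vector;")
cur.execute("""
    CREATE TABLE IF NOT EXISTS product_docs (
        id bigserial PRIMARY KEY,
        body text,
        embedding vector(3)  -- toy dimension; match your embedding model in practice
    );
""")

# <-> is pgvector's distance operator: join semantic similarity with normal filters.
cur.execute("""
    SELECT id, body
    FROM product_docs
    WHERE body ILIKE %s
    ORDER BY embedding <-> %s::vector
    LIMIT 5;
""", ("%sleeping bag%", "[0.12, -0.40, 0.88]"))
print(cur.fetchall())
conn.close()
```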

Another option is Amazon Kendra, which is an end-to-end managed service that automates the entire process, from ingestion connectors for over 20 source systems to the generation of the vector embeddings. And just a couple of days ago, we introduced vector capabilities in Amazon Neptune, our managed graph database service, as well as vector support for Amazon MemoryDB, which is our in-memory, Redis-compatible, fully durable data store.

If you want another vector database provider, like Pinecone, you can select the one you want directly from Amazon Bedrock. Our goal here is to give you choice, so you can use your preferred technology, the one where you already have the skills in your organization, and just extend it to include vectors as part of your new AI system.

Now, when you build on your existing architecture, you can also leverage and extend the data pipelines that are already in place today. Many of our customers use AWS streaming technologies like Amazon MSK, Amazon Managed Service for Apache Flink, and Amazon Kinesis to do real-time data prep for traditional ML and AI. You can take those workflows and extend them to capture changes to your data and make those changes available as real-time updates to your vector store.

You can keep your fine-tuning data sets up to date with integrated data streaming into S3 using Amazon Firehose, and you can make other changes to extend those existing workflows to work with your new AI system. Basically, if your data infrastructure is already built using AWS services, you are most of the way there; you just extend it to work with generative AI, along with managed services like Amazon Bedrock, which is purpose built for generative AI.
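As one hedged sketch of that extension, here is a Lambda-style handler that consumes a Kinesis stream of changed documents, re-embeds each record with Titan embeddings on Bedrock, and upserts it into an OpenSearch index. The stream shape, endpoint, index name, and field names are assumptions for illustration, and authentication to OpenSearch is omitted for brevity.

```python
import base64
import json
import boto3
from opensearchpy import OpenSearch

bedrock = boto3.client("bedrock-runtime")
search = OpenSearch(
    hosts=[{"host": "my-domain.us-east-1.es.amazonaws.com", "port": 443}],
    use_ssl=True,  # real code would also configure auth (e.g. SigV4 or basic auth)
)

def handler(event, context):
    for record in event["Records"]:
        doc = json.loads(base64.b64decode(record["kinesis"]["data"]))
        resp = bedrock.invoke_model(
            modelId="amazon.titan-embed-text-v1",
            body=json.dumps({"inputText": doc["text"]}),
        )
        embedding = json.loads(resp["body"].read())["embedding"]
        # Upsert by document id so the RAG index always reflects the latest version.
        search.index(index="docs-vectors", id=doc["id"],
                     body={"text": doc["text"], "embedding": embedding})
```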

We continue to add new capabilities to all of our services so that your existing data architectures bridge easily into generative AI applications. Sometimes that means taking generative AI capabilities and putting them into our AWS data services, which helps you manage your existing data architectures better.

For example, Amazon DataZone this week announced support for automated descriptions using generative AI, an LLM-driven capability, so you can easily enrich your business catalog by automatically creating comprehensive business descriptions and context for your data sets.

In other cases, bridging generative AI from your existing architectures means new features like AWS Glue Data Quality, which introduced the ability to simplify how you improve the quality of the data sets used for training and inference. Glue Data Quality automatically detects anomalies in your data set by analyzing data statistics using ML algorithms, telling you about hidden data quality issues and unusual data patterns.

Now, here's another example of how we extend your capabilities on existing services. We have a lot of customers that use Kubernetes for ML training and inference, and the health of those virtual machines is critical to completing a generative AI workflow. So earlier this month, we announced CloudWatch Container Insights with enhanced observability for monitoring EKS clusters.

Now, these metrics are going to help any EKS user, but they're especially helpful if you're doing generative AI workflows like fine-tuning. Key metrics like GPU utilization are included by default, but you also have access to other incredibly useful metrics like power draw, container CPU, or encoder latency. These metrics are aggregated by cluster, namespace, job, or pod, so you can monitor and maximize uptime for your EKS-hosted workflow.
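As a hedged sketch of how you might watch one of those metrics, here is a CloudWatch query from Python. The get_metric_statistics call and the ContainerInsights namespace are standard; the specific metric name, dimension, and cluster name below are assumptions to check against what your own EKS cluster actually emits.

```python
from datetime import datetime, timedelta
import boto3

cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")

stats = cloudwatch.get_metric_statistics(
    Namespace="ContainerInsights",
    MetricName="node_gpu_utilization",  # assumed metric name; verify in your account
    Dimensions=[{"Name": "ClusterName", "Value": "training-cluster"}],
    StartTime=datetime.utcnow() - timedelta(hours=1),
    EndTime=datetime.utcnow(),
    Period=300,
    Statistics=["Average"],
)
for point in sorted(stats["Datapoints"], key=lambda p: p["Timestamp"]):
    print(point["Timestamp"], round(point["Average"], 1))
```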

And if you look across your data architecture, you are going to see places all over your data strategy and architecture where the fast pace of AWS innovation helps builders of generative AI applications.

Let's take Amazon S3. Amazon S3 has well over 700,000 data lakes today and we have customers that have terabytes or exabytes of data. Amazon S3 begins where the internet ends when it comes to specializing FMs for your enterprise. Here's why - the data in S3 is inherently high quality and that is because it's already being used for business operations like analytics and fraud detection. That data is already at work in your enterprise. And so it is a very short step to take that high quality data you have in your data lake and use it to customize your AI system.

When you are using enterprise data in S3, you are typically already working with a great source of high quality data for generative AI. And when we extend Amazon S3's capabilities, as we are always doing, generative AI practitioners get the benefit of that, as do developers of any application type.

Here's a specific example. Earlier this week, S3 announced S3 Access Grants. Data lake customers are super excited about this because it gives you the ability to apply fine-grained controls all the way down to the prefix level in S3. But it also helps generative AI developers who are using shared data sets for RAG, fine-tuning, and continued pre-training, because it lets you maintain strict, logged access controls over those shared data sets, and it even offers integration with third-party identity providers.

So again, everything we're building into our AWS building-block services like Amazon S3 is going to help any application developer, but we also have the capability of extending those into generative AI workflows. This fine-grained control I talked about really matters for AI systems, because you're often working with shared data sets.

For example, you might want to use the same prepped data set for fine-tuning different models, or you might have one set of critical documents that you use for RAG-based in-context learning with different models. And with Access Grants as of this week, you have a fully auditable way to provide specific access, like time-based permissions or read-only permissions, right down to individual prefixes.
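As a hedged sketch of what that looks like in practice, here is the Access Grants data-access call from Python: a training or RAG job requests temporary credentials scoped to one prefix of the shared data set. The account ID, bucket, and prefix are placeholders.

```python
import boto3

s3control = boto3.client("s3control", region_name="us-east-1")

# Ask Access Grants for short-lived credentials scoped to one shared prefix.
grant = s3control.get_data_access(
    AccountId="111122223333",
    Target="s3://shared-training-data/finetune/legal-corpus/*",
    Permission="READ",
    DurationSeconds=3600,
)
creds = grant["Credentials"]

# The job then reads with these vended, time-limited, prefix-scoped credentials,
# so every access is both constrained and logged.
scoped_s3 = boto3.client(
    "s3",
    aws_access_key_id=creds["AccessKeyId"],
    aws_secret_access_key=creds["SecretAccessKey"],
    aws_session_token=creds["SessionToken"],
)
```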

So if you think about that concept of the shared data set that's already in use in data lakes today, for business analytics, fraud detection, personalized advertising, you name it, that's going to apply to generative AI applications as well.

Well, speaking of data lakes, Pinterest was one of the first to adopt a data lake architecture many years ago and they have already deployed generative AI at scale for use with their data lake. So I am delighted to invite Dave Burgess who is the VP of Data Engineering for Pinterest to come and share with you how Pinterest puts their data to work for generative AI today. Dave.

Good afternoon everyone. I'm Dave Burgess VP of Data Engineering at Pinterest. We're going to talk about extending our existing data lake with generative AI.

So Pinterest is the visual inspiration platform where people come to search, save, and shop the best ideas in the world. We have 482 million monthly active users around the world, and 1.5 billion pins, which are images, are saved every week.

We've created a really agile engineering culture on the AWS cloud, where we can rapidly develop and deploy software in production at scale. We've been in the AWS cloud for 13 years, since we were born. We run thousands of experiments in parallel to win or learn. We can train and deploy ML models into production within a day. And we have dozens of ML use cases that together execute hundreds of millions of ML inferences per second.

This generates upwards of 80 million events per second, which we log, process, glean insights from, and use for ML training. As a result, we store an exabyte of data in our data lake, and Pinterest has one of the largest data lakes on Amazon S3 on the planet.

So Pinterest is a mix of AWS technology and open source software. Our data consumers range from engineers and data scientists to product managers and executives; we really have a data-driven culture and a data-driven business. All of these users create queries in Querybook, which Pinterest open sourced, and in the Superset analytics user interface.

We execute these queries via Presto and Spark, open source big data engines that run on Amazon EKS. We have tiered our data based on schema design, data quality, and documentation, and we store all of our metadata, such as schemas, fields, and metric definitions, in DataHub, which is an open source data catalog.

The goal of our highest tier one data sets is to enable 80% of our company's queries to be executed. And even with these capabilities and the infrastructure, we felt that we could improve our analytics productivity even further using generative AI.

So first we identified our analytics productivity pain points. What I'm about to describe is how we significantly improved our analytics productivity using large language models and RAG with our existing data lake on AWS.

On any given day, our users and our business intelligence team have many, many different analytics questions that they need to answer, and they're constantly working to figure out the answers. These questions range from knowing what data to use, how the fields and metrics are defined, what the quality of that data is, where it came from, and whether it's trustworthy, to how to write the queries in SQL.

We saw an opportunity here to solve these pain points by using RAG. And the main way that we did this was to automatically generate the SQL queries from text questions. We found that the text to SQL generation is about 97% accurate given the right table to use.

However, we do need to find that right table, that right tier one table to use in the SQL query. And for this, we needed to have text descriptions of our tables. So we use large language models to actually generate the text descriptions of our tables from the table schemas as well.

Here's a demo of text to SQL in Querybook which runs in production and has been open sourced. So you can use it if you like. You type in a question, Querybook generates the SQL and executes it for you. You can also edit the SQL and Querybook will verify the syntax is correct. And if you would like, you can automatically generate a title for your query as well if you want a little bit of panache.

So here's a glimpse of the generative AI elements which we added to our data lake for text to SQL. And this was built on top of AWS. This was done by two engineers, very, very capable engineers in a matter of a couple of months working part time on this as a side project. So it really is quite easy to go and set this up.

When the user asks a question, we first need to find the data tables to use, using Amazon OpenSearch. This is a distributed search and vector engine, and OpenSearch holds the table schemas, descriptions, the tier of each table, and example queries, which it uses to recommend the best tables to use.

And then once we have the right tables to use, we can then create a prompt for the large language model and generate the SQL via the large language model.
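As a hedged sketch of that two-step flow, here is one way it could look with OpenSearch for table retrieval and a Bedrock-hosted model for SQL generation. The index name, fields, endpoint, and model are assumptions for illustration; Pinterest's actual implementation lives in the open sourced Querybook and differs in detail.

```python
import json
import boto3
from opensearchpy import OpenSearch

search = OpenSearch(
    hosts=[{"host": "table-metadata.us-east-1.es.amazonaws.com", "port": 443}],
    use_ssl=True,
)
bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

question = "How many pins were saved per country last week?"

# Step 1: retrieve the most relevant (ideally tier one) table schemas and descriptions.
hits = search.search(index="table-metadata", body={
    "size": 3,
    "query": {"multi_match": {
        "query": question,
        "fields": ["table_name", "description", "columns"],
    }},
})["hits"]["hits"]
schemas = "\n\n".join(h["_source"]["schema_ddl"] for h in hits)

# Step 2: prompt the LLM with the schemas plus the question and ask for SQL.
prompt = (f"\n\nHuman: Given these table definitions:\n{schemas}\n\n"
          f"Write a single SQL query that answers: {question}\n\nAssistant:")
resp = bedrock.invoke_model(
    modelId="anthropic.claude-v2:1",
    body=json.dumps({"prompt": prompt, "max_tokens_to_sample": 500}),
)
print(json.loads(resp["body"].read())["completion"])
```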

Ok, so what was the impact of this? Being able to automatically generate SQL from text has resulted in a 40% gain in productivity for our product analysts, data scientists, product managers, and engineers, and most of this comes from speeding up data discovery and query creation time: finding the right table to use automatically and executing the SQL.

We've used off-the-shelf large language models for this use case. Some of the latest ones give excellent results, with 97% accuracy. And we've also used these large language models for other internal developer productivity use cases too.

We find most of the differentiation is in the prompts and the data that we feed into these large language models. And we've got an example here of the prompts that you can use, if you'd like, by going to this URL, to generate the SQL from the prompt.

I'd like to thank the Pinterest engineering team, Pinterest as a whole, and also Mai-Lan and team for a really close partnership with AWS. Thank you.

A 40% productivity gain from that. They have an amazing data lake, they talked about an exabyte of data, but they also talked about how generative AI is in use at scale in their engineering workforce to make sense of all that data.

And part of the reason why Dave and Pinterest were able to move so quickly to generative AI is that they were starting from their existing data architecture, which was based on Amazon S3 for their data storage. From there, it was just a short step to integrating LLMs into their existing system.

It's just one of many examples of how companies can put AI systems in place very quickly when they're starting from AWS services, and that is because AWS is constantly innovating for what customers want. I want to give you some examples of new releases this week that are really going to matter for generative AI practitioners.

Here's one example: the data path between compute and storage is incredibly important for generative AI workflows. Speed matters on that data path because it speeds up model training and the inference process. So, to accelerate training workflows, we went into many different parts of that data path between compute and storage.

For example, we've significantly sped up the default command-line performance for data retrieval from S3 on instance types that are very commonly used for generative AI applications like training, such as Trn1, P4d, and P5 instances. And because we know so many customers are using EKS to orchestrate their distributed training jobs, as of this week we introduced a new CSI driver for S3, so you can natively provision and mount S3 buckets and access S3 objects through a file system interface directly from Kubernetes. If you are using Kubernetes on EC2 or EKS, you should use this: you get high aggregate throughput without any changes to your application code. That's new.

This week, we're also optimizing performance for commonly used frameworks in ML/AI workflows. We know that customers use Python for ML/AI training, so we accelerated the performance of the AWS Python SDK for S3 access, which automatically incorporates optimizations like automated timeouts, retries, and request parallelization. Again, customers don't have to write a single line of code; these optimizations are just going to speed up ML/AI workflows like data loading and checkpointing.

We've also just added a file cache to S3's open source client-side FUSE connector, which we call Mountpoint for Amazon S3, and that file cache is going to speed up your multi-epoch training jobs and reduce the cost of repeated access to that data.

And just last week, we introduced a new Amazon S3 Connector for PyTorch, an open source project for the PyTorch framework. For those of you doing ML today, you know that PyTorch is an incredibly popular machine learning framework for building and training models. Now you can use the Amazon S3 Connector for PyTorch to automatically optimize data loading and checkpointing performance for your ML training workloads.
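As a hedged sketch of using that connector, here is the s3torchconnector package pointed at a placeholder bucket and prefix. The transform here just pulls raw bytes; a real training pipeline would decode images or tensors and wrap the dataset in a torch DataLoader for batching and multi-worker loading.

```python
from s3torchconnector import S3MapDataset

def to_sample(obj):
    # Each dataset item is a streaming reader over a single S3 object.
    return obj.key, obj.read()

dataset = S3MapDataset.from_prefix(
    "s3://my-training-data/images/",  # placeholder bucket and prefix
    region="us-east-1",
    transform=to_sample,
)

for key, payload in dataset:
    print(key, len(payload))
    break  # hand decoded samples to the training loop instead of breaking
```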

And as Adam said during the keynote, S3 this week launched a single-zone, high-performance storage class called S3 Express One Zone. It gives you the highest-performance, lowest-latency cloud object storage, with consistent single-digit millisecond latency, and Express One Zone lets you scale to millions of access requests per minute. And with request costs that are 50% less than S3 Standard, you can reduce your total workload costs across compute and storage by up to 60% compared to S3 Standard, because your jobs are just going to run faster.

Now, Express One Zone is much, much faster than any other storage class in S3. S3 Standard was the fastest storage class, but now Express One Zone is 10 times faster than Standard.

These are just a few examples of the innovation throughput we have across our AWS services that helps generative AI builders on AWS.

Now I come to the third data initiative that data organizations in every industry and every geography are paying attention to: I believe that every data organization needs to be its own best auditor.

Now, for generative AI, you should know what data sets you're using for training, customization, fine-tuning, and RAG, and you need to know how your models are making decisions. In a rapidly moving space like generative AI, you need to look around the corner and become your own best auditor. Every organization has to prepare for a world of regulation and compliance. We know it's coming, and we know it's going to be different in different countries. In order to be prepared, you need to do your own auditing today, and you need to do it in a fully automated way, because that is the only way you're going to be able to scale your use of generative AI.

Now, today you store, protect, and govern your enterprise data sets. There are new data sets in town for generative AI, everything from the prepped data set for fine-tuning, to embeddings, to the generated data you create. So when I talk about being your own best auditor, I am always including these intermediate data sets created and used by your AI system. That can be the evaluation data sets you use to test the accuracy of your model, that's one example. It's the embeddings I talked about before, it's the prompt engineering data sets you use for RAG, and it's the staging data sets you use for things like customizing CodeWhisperer. These are just some of the many new intermediate data sets in this new world of generative AI. And if you're going to be your own best auditor, you need to make sure you're storing those new intermediate data sets using the same access and security model you have in place for the rest of your data architecture, and that you're logging how those data sets are being used in your generative AI workflow.

I can't emphasize enough how critical this is for enterprise-grade generative AI. One way to think about this is the explainability of data in generative AI. As an example, any prompt response that your generative AI application gives needs to reflect the user's permissions to the underlying data, because it's super important that the user can only access the data they would be able to see anyway if they weren't using a generative AI application.

Now, that fundamental is built into all of our AI services at launch, like CodeWhisperer and Amazon Q. But we also incorporate that concept into in-context learning and FM customization. For example, for RAG use cases, you can rely on the existing user-level permissions and fine-grained access control in the embeddings and in vector databases like OpenSearch.

Today, AWS customers use different AWS services for auditing, like CloudTrail, DataZone, CloudWatch, and OpenSearch, and they govern and monitor the usage of their data. You can extend all of those services into your AI system. If you are using AWS managed services for generative AI, you have the capabilities I talked about; they're already built in.

We launched our generative AI capabilities with CloudTrail support, because we know that in a world where auditing and compliance are just a matter of time, you need to have that audit trail built in. So anytime you create a data source in Amazon Q, it's logged in CloudTrail. You can use CloudTrail events to list all the API calls made by CodeWhisperer. And Amazon Bedrock has over 80 CloudTrail events that you can use for auditing right now, to understand how your FMs are being used through Bedrock.

For example, if you want to know what actual prompts are used with the foundation model, you can configure Bedrock's model invocation logging, which will record the input and the output of each InvokeModel call. You can see who invoked the model and when it was invoked. If you have customized a model with your own data set, Bedrock will log where that data set is stored in S3. All of this gives you transparency into what data you are using for what particular training, and it's built into Bedrock.
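As a hedged sketch of turning that on, here is the Bedrock logging configuration call from Python. The log group, role ARN, and bucket are placeholders; once this is set, each invocation's input and output land in the destinations you configure.

```python
import boto3

bedrock = boto3.client("bedrock", region_name="us-east-1")

bedrock.put_model_invocation_logging_configuration(
    loggingConfig={
        "cloudWatchConfig": {
            "logGroupName": "/bedrock/invocation-logs",
            "roleArn": "arn:aws:iam::111122223333:role/BedrockLoggingRole",
        },
        "s3Config": {"bucketName": "my-bedrock-audit-logs", "keyPrefix": "invocations/"},
        "textDataDeliveryEnabled": True,
    }
)
```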

It's not just the core audit capabilities of Bedrock; you can use other AWS services to get even more layered protection of your sensitive data. I'll give you an example: with Bedrock model invocation logging, you can send your invocation logs to CloudWatch, where you can turn on a capability called sensitive data protection. Once you have CloudWatch sensitive data protection, CloudWatch is going to automatically detect and mask over 100 types of sensitive data over any egress path for your logs, which prevents PII from ever being accessed by your users through the invocation logs from Bedrock.

AWS believes deeply in responsible AI, and you'll see that belief in our services. For example, in order to provide transparency on model decisions, Bedrock agents show and log the intermediate steps that the model wants to take. That is model decision explainability that can go into your CloudTrail logs to help you be your own best auditor. In CodeWhisperer, the reference log lets you review references to code recommendations that are similar to training data.

AWS gives you data transparency in our managed stack because we know how critical auditability is to your enterprise. You also have to think about transparency in responsible AI, and that means explicitly sharing the details and context needed to use AI correctly and responsibly.

AWS launched AI service cards, which are a form of responsible AI documentation, because they provide customers with a single place to find information about intended use cases and how to use the services we provide safely and with best practices. And this week we launched six new AI service cards, including Amazon Transcribe Toxicity Detection and others.

Now, as part of responsible AI, as of this week AWS is also providing intellectual property (IP) indemnification coverage for the outputs of Amazon Titan models and Amazon CodeWhisperer. So if you use one of these generative AI applications or models I just talked about from AWS and somebody sues you for IP infringement, AWS will help you defend that lawsuit, which covers any judgment against you or settlement costs. That's responsible AI in the way we work. We're also investing in automating the adoption of responsible AI.

It's one thing to say that you do responsible AI, it's another thing to do it, and it's next level to automate doing it at scale. That's what you get in Bedrock. For example, Guardrails, which was announced in preview this week, lets you specify topics that you want your generative AI application to avoid and automatically filters out queries and responses in restricted categories. These guardrails work across all FMs, including custom fine-tuned ones, and with Agents for Amazon Bedrock.

Another way to automate responsible AI is to build it into your FM selection. Amazon SageMaker Clarify, which we also announced in preview this week, gives you the ability to evaluate and select your FM based on responsible AI criteria. For example, when you're comparing models, even models that perform the same function like summarization and are in the same family, like Falcon 40B versus Falcon 180B, you'll find that every model performs differently across dimensions like accuracy, robustness, and toxicity. And now you can use Amazon SageMaker Clarify, in Bedrock or in SageMaker Canvas, to evaluate these models based on responsible AI metrics.

Your goal is to be your own best auditor, so you are prepared for whatever comes your way for compliance in the future. And AWS is here to help, whether you're using AI applications like Amazon Q, managed ML/AI services like Bedrock, or individual services like CloudTrail across multiple parts of your application.

Now, we are just at the start of this journey. I think it's important to note that it's not just about how our models learn about our business context through our data. It's about how we, as individuals and organizations, learn too, about generative AI techniques, about responsible AI, and about how to navigate the world of taking your data and bringing it to your AI system.

Now, at Amazon, we have a leadership principle called Learn and Be Curious, and it says we are never done learning and always seek to improve ourselves; we're curious about new possibilities and act to explore them. That's what we're all collectively doing with generative AI, and AWS is with you every step of the way.

Thank you, everybody. Have a great rest of your day.
