A new era: The path to generative AI in public sector

All right. Oh, wow, that's perfect. How's everybody doing today? We're going to talk about a path to generative AI for the public sector today. The way we came up with this topic is that generative AI is moving really, really fast, and that's a good thing, because more and more capabilities are coming out every day that can help us in our professional jobs and our personal lives with this generative AI technology.

And so we wanted to help our public sector customers find a way to start using it for production applications that help their missions and their employees. We've learned a lot over the past year or so of generative AI, helping customers find ways to deploy generative AI solutions in their own space. What we wanted to do was bring that experience back to you, so you can learn from the mistakes and lessons of those people, and then make new mistakes and learn new lessons yourself instead of repeating past ones.

So, I've been supporting the public sector for probably 20 years now. My primary focus has been the Department of Defense. I started out in software engineering and worked my way through, and in the last five or six years I've really been focusing on big data and data analytics problems. And with me I have Aaron, who I'm going to let introduce himself.

Good afternoon, everyone. I'm Aaron Sack, and I'm a machine learning solutions architect at AWS, where I support the public sector. I've been at AWS for four years, always supporting public sector customers, so I'm excited to talk about generative AI and machine learning today. Back to you, John.

I feel bad. I forgot to introduce myself; I gave you all my background, but I'm John Troll, and I'm a principal solutions architect at Amazon.

So quickly, today we're going to split the talk up. I'm going to go through the first part, talking about some of the steps to get to generative AI, and then Aaron is going to talk you through the rest of the presentation as we go through the agenda.

So just to start off, I want to give an overview of generative AI. I know that if you're here, you've probably read about it or been playing with it, but we might have different definitions. So I want to give you the history of where it came from, so you can understand where we came from and where we're going with it.

Basically, in 2017 a paper came out called "Attention Is All You Need." That paper really solidified the transformer model, which is what's used to train and create generative AI today. And you're probably sitting there saying, 2017? We haven't been talking about generative AI since 2017. That's because the next big transformation happened in November of 2022, when ChatGPT came out with a really intuitive user interface for interacting with these models. It's kind of like that iPhone moment, right?

You used to have these flip phones, and then the iPhone came out, and suddenly you're saying, wow, look at all the power I can have in the palm of my hand. Well, when ChatGPT created that UI, it enabled anybody from age 6 to 80 to start using generative AI models. And so that's why you see that gap between 2017 and 2022, when we actually started using these models. In fact, the application they put out was the fastest-growing application in internet history: it went to a million users in five days and 100 million within two months, mainly driven through social media.

So, anybody here used ChatGPT before? Yeah, I figured most of you would. I remember the first time I became aware of it, almost a year ago. It was Christmas, I was at my in-laws' house, and my nieces and nephews were playing with it: "Hey, Uncle John, tell me a little bit about yourself and what you do." And the next thing you knew, they had a haiku that was all about me. And I thought, I know what a haiku is. It's a poem. I guess that's the extent of what I know about haikus; I certainly couldn't write one. So I thought, oh, this is kind of a novel idea: I can write haikus with the internet, and it can generate them for me.

And I think it's kind of coincidental that when we released PartyRock (if anybody's looked at PartyRock, it came out a week or two ago), the first sample application it has generates a haiku. So apparently haikus are a lot more popular than I thought they were. But that was my first exposure to it.

So now that we have the history, what exactly is generative AI? Generative AI is the idea of creating something new. That new thing could be text, images, video, or audio, but it's creating new things. It's not like when I search the internet for an image of a cat and see all these images of cats in my results; those are images that already exist and have been labeled as cats. What generative AI does is generate a brand-new image of a cat that has never existed before. In fact, the cat doesn't exist, right? It's just a facsimile of a cat generated by the AI. So it's creating new things that have never been seen before.

Traditionally, when you look at models from before generative AI came about, those models would do one thing really well. If you look at Amazon Textract, it does one thing really well. If you look at Amazon Translate, it does translations really well, but it's not doing multiple tasks. What generative AI allows us to do is perform multiple tasks with the same model, and in fact with an untrained model.

With generative AI, you're not training the model to do any specific task at all. It just has so much information and so much knowledge from the training it's done that it can understand how to do multiple tasks at once. And again, all of this is based on the transformer model that came out of that 2017 paper, "Attention Is All You Need."

So how do we take what we understand about generative AI and make it something we can bring to production, to use for our customers, our employees, and our missions? What we're going to walk through today is what we call the generative AI life cycle. It's a few steps to help you get from where you are today to a production system.

If you look at the life cycle, there are four steps. I'm going to go over the first two, and then I'm going to hand it off to Aaron, who is going to take you the rest of the way home. So just to dive in, the first step is to scope out a use case. I've had a lot of customers come up to me and say, "Hey, John, we need generative AI. How can you help?" And my first question is, "Great. What do you want to do with it?" "We want generative AI." All right.

So to use generative AI correctly, you really have to have a use case. The main reason I bring this up is that there are lots of good generative AI use cases, but there are also AI use cases that aren't good fits for generative AI; not every use case is going to fit the generative AI model. Finding the right use case for a generative AI solution is really that first step, and it allows you to bring measurable business value into what you're doing.

So to get us warmed up, we're going to play a little game. You're just going to vote by raising your hands. You have a 50/50 chance, it's yes or no, so don't worry. There are three questions and no real prizes. We're going to go through some different use cases and determine: do we think this is a good use case for generative AI?

The first one is text summarization. What this means is, let's say I have a 30-page document and I don't have time to read it; give me five paragraphs about what this 30-page document says. Do we think that's a good use case for generative AI? Raise your hand if you think it is. Yes, it is. See, you're doing well; you're batting a thousand right now. That's a really good example, and a lot of people are using generative AI for this today.

The next one is object detection. What we're talking about here is: can I take an image or video and say, tell me how many bicycles are in this image, or how many women are in this image, or how many people are wearing a white shirt? Is that a good use case for generative AI or not? Do you think it's a good use case?

Now, generative AI is moving really fast, like I said earlier. So while today this is not the best use case for generative AI (there are other models, like convolutional neural networks, that can do this better), there's a lot of work being done on this with AI, to the point where, by the time this session hits YouTube, that could be wrong. And again, that's a good thing, right? We're able to learn how to use these models to do more and more over time. But right now there are better ways to do object detection than generative AI.

Our next example is image creation. Now, if you listened to me earlier when I said what generative AI is, this should be an easy one. Do you think image creation is a good use case for generative AI? Yeah, it's right in the name: creation.

It's a lot of fun if you've never played with any of the image models, like Stable Diffusion, and done some random image generation. I recommend giving it a try sometime. It is a good use case for generative AI.

When we take a look at generative AI use cases, we break them up into three categories: enhancing the customer experience, boosting employee productivity, and optimizing business processes. If you look underneath each of these categories, you'll see examples of the use cases that fit there.

For instance, take chatbots. If you go to your insurance company or an airline and get that virtual assistant that pops up in your web browser that you don't want to talk to, a lot of those are now transitioning to generative AI, and they're getting a lot better at communicating and making you feel like you're talking to a person.

There are also things like conversation analysis: what's the sentiment of this person's conversation? Are they happy? Are they unhappy? Those are good use cases under enhancing the customer experience. When we look at employee productivity, we already talked about text summarization. That's a huge one. If you've ever had to read a 30-page document just for five bullet points, now you can just get the five bullet points.

As a former developer, code generation is another powerful one for me. Look at what we're able to do with things like CodeWhisperer: generate code, or even document code or add error handling to make sure it's clean. Code generation is just a huge use case now, and it's exploding. And then there are the optimized business processes. The one I find most interesting is cybersecurity: using generative AI to help detect security events and guide remediation. "Hey, I found this issue, and by the way, here's how you might want to solve it."

So you can see, despite haikus being really cool and all, generative AI can do a lot more than just write poems. The next step we're going to talk about is how to select the model. You also have the ability to train your own models, and if you listened to Swami this morning talking about how you can train models and the advancements we're making there, it's really cool. But we're going to stick to how to enhance an existing model rather than train a new one, and the main reason why is that training models is expensive.

In fact, if you go back and read what people were saying about generative AI seven months ago, they said there were only going to be three or four foundation models; there was no reason to have more than that. But now look: there are easily a few dozen models. That's because training has been made easier over time, but it is still a costly and time-consuming process to train your own model. So if you can use an existing model and just enhance it, which is what Aaron is going to talk about next, it's a much better path.

So we're going to stick to using existing models for this talk rather than creating models from scratch. Before we talk about choosing a model, let's talk about what a large language model actually is. When I was growing up, I had an older sister, and one of my favorite hobbies was to annoy her, right? Any way I could find to annoy my older sister.

"And so I, I learned this little trick where I could listen to what she was saying and guess the next word she was gonna say and say it with her. So when she was talking to my mom or talking to her friends, I would just repeat everything she was saying, but try and repeat it as fast as she was saying it and it really irritated her. But that's kind of what a large language model is doing, right? It's trying to guess what that ne next word's gonna be.

If you look at the example here, "Today I went to the potato," you can easily see that "potato" doesn't make sense; I wouldn't go to a potato. So the large language model is going to throw out that word, and it's left with "store," "mall," "office." What it does then is ask: do I have more context to go by? If the sentence before this was "I don't have enough groceries to make dinner," then in this sentence it would make more sense to say "Today I went to the store." That's how the large language model starts guessing the next word.
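To make that concrete, here's a toy sketch of the idea. This is not how a transformer works internally; it's just an illustration of picking the next word from a probability distribution that shifts when context is added, and all the scores are invented:

```python
import math

# Toy illustration only: the model assigns scores to candidate next
# words, and earlier context shifts those scores. Numbers are invented.
scores = {"store": 2.0, "mall": 1.8, "office": 1.7, "potato": -4.0}

def next_word(scores, context=""):
    scores = dict(scores)
    if "groceries" in context:   # the prior sentence mentioned groceries,
        scores["store"] += 2.0   # so "store" becomes far more likely
    # Softmax the scores into probabilities; take the most likely word.
    total = sum(math.exp(s) for s in scores.values())
    probs = {w: math.exp(s) / total for w, s in scores.items()}
    return max(probs, key=probs.get), probs

word, probs = next_word(scores, context="I don't have enough groceries to make dinner.")
print(word)                      # store
print(f"{probs['potato']:.4f}")  # near zero: "went to the potato" is thrown out
```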

Now, I've used the term "large language model," and I've also talked about foundation models. Just so you understand: a large language model is a type of foundation model, kind of the way a Toyota is a car, and there are more types of foundation models than just large language models. But for today's presentation, we might use "large language model" and "foundation model" interchangeably. Don't confuse the two in general; I just wanted you to know there are other types of foundation models besides large language models.

So again, we're all just trying to find the next word, and language models themselves have been around for years. We used to think of them as large: if you look at something like sequence-to-sequence models, we were training those with over a million parameters, two or three million, and we used to think, wow, that's large, right? Two million parameters. To explain what a parameter is: if you remember back in your high school days, you have the equation for a line. You'd have a graph with x and y axes, a y-intercept, and a slope, and you get the equation for the line, y = mx + b. Well, your slope, which is m, and your y-intercept, which is b, those are parameters.

When we're talking about these large language models, we're talking about hundreds of millions of parameters in multidimensional space, versus the two-dimensional space of a line. And with today's large language models, we're going from 100 million parameters to 100 billion parameters. So when you look at some of these models and see a 9-billion-parameter or a 100-billion-parameter model, that's what they're talking about: all the parameters that get adjusted during training so the model gives better responses.
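As a rough illustration of how parameter counts scale from a line to a model, here's a quick back-of-the-envelope sketch; the layer sizes are invented for illustration, not any specific model's:

```python
# A line y = m*x + b has exactly two parameters: slope m, y-intercept b.
line_params = 2

# One dense layer mapping 4,096 inputs to 4,096 outputs has a 4096x4096
# weight matrix plus 4,096 biases. (Layer size is illustrative.)
hidden = 4096
layer_params = hidden * hidden + hidden   # 16,781,312

# Stack 100 such layers (ignoring attention and embedding tables for
# simplicity) and you're already past 1.6 billion parameters.
print(line_params, layer_params, 100 * layer_params)
```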

That's one of the things that has changed as models have gotten better: how many parameters we use. The other thing we're able to do now is train on more tokens. When I talk about tokens, for simplicity's sake, consider a token a word. We're now training models on a trillion tokens, and I'm throwing around a lot of big numbers, billions and trillions, which can be hard to quantify.

So look at a trillion: the English version of Wikipedia is about 4.3 billion words, which is less than 1% of a trillion. When we train models on a trillion tokens, you can see how that scales out from English Wikipedia; it's just a tremendous amount of data. Another way to think about it: if we count seconds, a trillion seconds is roughly 31,700 years. It's not exact, but roughly. These are big numbers, and what we've learned is that balancing the parameters and the tokens together really allows us to adjust and find the right performance for these large language models, which is how we're using them today.

So how do we start to choose these models? Amazon Bedrock was released earlier this year. We had a lot of customers say, "Hey, it's hard to deploy models, so how can I evaluate models if I can't deploy them?" So we came up with Amazon Bedrock, a point-and-click way to quickly access models and experiment with them against your business use cases, to see: is this the right model to solve the business problem I have, so I can take it to production later?

If you listened to Swami this morning, you saw that we've extended some of those models today, and again, it comes back to generative AI moving really, really fast. This slide is a little out of date because of the new announcements we made this morning, but here are the models currently supported under Bedrock: models from AI21 Labs; Anthropic, which is now on Claude 2.1; Meta's Llama 2; and of course Amazon's own Titan models, plus the new models we talked about in this morning's keynote.

So Bedrock gives you the ability to start using and interacting with these models today, so you can test out those business use cases and ask: is this the right model to take me to production? Because if you look under each of these models, they all do multiple things, but some of them do certain things better than others. So you really want to experiment with them and figure out which model best fits the use cases you have for your business needs.
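As a minimal sketch of what that experimentation looks like in code, here's roughly how you invoke a model through the Bedrock runtime API with boto3. The request-body shape varies by provider; this assumes the Anthropic Claude completion format, and the prompt text is illustrative:

```python
import json
import boto3

# The Bedrock runtime client handles invocation (the separate "bedrock"
# control-plane client lists and manages models).
bedrock_runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

# Each provider defines its own body format; this is Claude's completion
# format. Swap modelId and body to experiment with other models.
body = json.dumps({
    "prompt": "\n\nHuman: Summarize this 30-page policy in five bullet points: ...\n\nAssistant:",
    "max_tokens_to_sample": 500,
    "temperature": 0.2,
})

response = bedrock_runtime.invoke_model(
    modelId="anthropic.claude-v2:1",
    contentType="application/json",
    accept="application/json",
    body=body,
)
print(json.loads(response["body"].read())["completion"])
```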

Data security and our customers' security is job zero here at AWS. So of course, we want to make sure that any data you provide is not reused inside those models, because you don't want your data to leak through a model. We protect any data you provide, and Aaron is going to walk through how some of that works later today. We also support standards such as GDPR and HIPAA, and all of your data is encrypted both at rest and in transit. You can also create what are called private endpoints so that all traffic stays on the Amazon backbone; none of the traffic going to and from these models goes over the open internet.

I know a lot of people in the public sector use GovCloud instead of our commercial regions. So right now, with Bedrock not yet available in the GovCloud regions, how can you start doing this in GovCloud? We have the ability to do that as well. With Amazon SageMaker, you can launch and test these models in our GovCloud regions. If you're not familiar with SageMaker, it's a notebook-style capability where you can run code to build models, interact with them, and share them across your organization. It's all preconfigured for you, so you just spin it up with a click of a mouse, and it actually offers more models to choose from, because you're choosing those models yourself.

If you take a look at what's available in SageMaker today, these are the different models you can bring into SageMaker and start experimenting with to see how they fit your needs. And the same applies on this side of the house: SageMaker understands that your data security is really important to you. It's important for our commercial users, but especially for the public sector, and we know that's a concern. So again, none of your data is going to be used to retrain any models; it's not going to go back to the foundation model provider at all. All of that communication can occur over the Amazon backbone, so you're not moving your data across the open internet, and it's all encrypted in transit and at rest as well.
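As a rough sketch, deploying one of those open models to your own endpoint with the SageMaker Python SDK's JumpStart interface looks something like this. The model ID and instance type are illustrative; check the JumpStart catalog for what's available in your region, including GovCloud:

```python
from sagemaker.jumpstart.model import JumpStartModel

# Pick an open model from the JumpStart catalog; this ID is illustrative,
# and availability differs by region.
model = JumpStartModel(model_id="huggingface-llm-falcon-7b-instruct-bf16")

# Deploy to a real-time HTTPS endpoint in your own account. The instance
# type (and its cost) depends on the model size.
predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.2xlarge",
)

# Invoke the endpoint like any other SageMaker endpoint.
print(predictor.predict({"inputs": "Write a haiku about re:Invent."}))

# Tear the endpoint down when you're done experimenting.
predictor.delete_endpoint()
```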

So those are two ways you can start to experiment with these different models, so you can choose the right one to get you to production. And with that, I'm going to hand it over to Aaron, who is going to take you to the next step of the life cycle.

Thanks, John. Can you hear me? So, awesome overview. We have two tools today: Bedrock, with API-based flexibility for large language models, and SageMaker, which, as we mentioned, offers open-source models that can be deployed, including in the GovCloud regions, which is important for our public sector customers. I want to cover a little of the behind-the-scenes of how you interact with a large language model. Some of this might be background for some folks in the audience, but it might be new as well.

In machine learning, we often create new words, and it seems like we confuse our users, so I want to dispel any myths and give a baseline of what some of these words mean. I want to start with the prompt. "Prompt" is a new word for a model; traditionally in machine learning, we never talked about a prompt, but with a large language model, you have these prompts. The prompt is nothing more than the input to the model: what goes into the model. Again, John showed how we're completing the next word. So in this example, we're saying "the sky is ___." Our prompt is "the sky is," we feed that to a large language model, and the model starts to complete it: "the sky is blue," and it keeps going, "the sky is blue on a sunny day; on a rainy day, the sky is gray." Is that a good response from a large language model? I guess? This actually shows how difficult it is to evaluate the output of a large language model, because if we asked 10 people to complete the sentence "the sky is ___," we might get 10 different answers.

Models are very sensitive to the prompt. If you had asked me nine or twelve months ago whether we'd be able to guide these models with prompts alone, I would have been skeptical. But having worked with generative AI for the last 12 months, I can tell you that you can get a lot of performance out of your models with just prompt construction and prompt engineering. They're very, very sensitive to these prompts.

So let's talk about what a prompt is. It's the input to a model, but how do we construct one? The best way I like to think about a prompt is: be very specific. Treat it like you're a kindergarten teacher. I have a first grader at home, so if I'm going to talk to a first grader, I'm going to be very specific, right? I'm not going to be suggestive about what I want. I'm going to be specific: this is the task we want to solve, and this is the output I'm looking for. It works just the same with a prompt.

So start simple, and iterate with your prompts. Expect that you're going to iterate, because the first prompt you use isn't always going to work right away; you'll be iterating over time. You'll also see prompt templates, where you might have a persona, some instruction, context or additional detail, and the desired output format. Think of that as a template you fill out when creating your prompt to help guide the large language model.

In this example, we're using a few pieces of that prompt template. The first is the instruction: "extract the names of the people in the following text." That's the task I want the model to solve. Then I give it some context: what is the following text? Here's a blurb about Amazon.com, and I'm asking it to extract the names of the people. But, by the way, I want it in a specific format: "People:" and then the names as a comma-separated list.

Now, if I give the model this desired format in the prompt, I get the output "People: Jeff Bezos, Andy Jassy," the two people who were mentioned. This is another example of using a large language model for what's known as named entity recognition in natural language processing: extracting proper names out of raw text. We used a large language model here and guided it. It was never trained to be an entity extractor, right? We guided it to be an entity extractor through prompt construction alone.
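As a minimal sketch, that instruction/context/format pattern is easy to express as an ordinary string template. The wording here is illustrative, not the exact prompt from the slide:

```python
# A simple prompt template with instruction, context, and output format.
# The template text is illustrative; tune the wording for your model.
PROMPT_TEMPLATE = """{instruction}

Text:
{context}

Desired format:
People: <comma-separated list of names>"""

prompt = PROMPT_TEMPLATE.format(
    instruction="Extract the names of the people in the following text.",
    context=(
        "Amazon.com was founded by Jeff Bezos, and Andy Jassy became "
        "its CEO in 2021."
    ),
)
print(prompt)  # send this string to the model as the prompt
```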

So what happens if we prompt our model and it's not working, and this isn't the output we're looking for? What do we do? One of the common terms and techniques developed over the last 12 months is what's known as few-shot prompting. Again, another made-up machine learning word, but all it means is: give the model some examples. The "few shot" is the examples, a few examples. Humans actually work the same way, right? If I were taking a physics class at a university and there was a really difficult subject I was trying to learn, the first thing I might ask the instructor is: can you give me a few examples? Can we work some problems together? And as a human, I go, oh, it's making sense, it's really clicking now. It works the same for large language models.

So we can give it a few examples, and the model can then guide itself. This is one way to improve your prompts. The example here is known as sentiment analysis, another traditional natural language processing technique, where we're analyzing text for sentiment. This is a pretty simple prompt for these large language models. We're saying: "This is awesome." OK, that's positive. "This is bad." Negative. "That movie was rad." Positive. Now we want to classify the sentiment of the next piece of text: "What a horrible show." And because I've been giving it examples, it knows to complete the next word, and the next word is going to be "negative," right? It's classified the sentiment, and I've given it three examples. So few-shot prompting works very well in your prompt templates, and I typically go to it when the initial prompt doesn't work right away.
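A minimal sketch of that few-shot prompt as a plain string; the "//" delimiter is just one common convention, not anything the model requires:

```python
# A few-shot sentiment prompt: three labeled examples, then the text we
# actually want classified. The model completes the final label.
few_shot_prompt = """Classify the sentiment of each sentence.

This is awesome! // Positive
This is bad. // Negative
That movie was rad! // Positive
What a horrible show! //"""

# Sent to the model, the most likely completion of the last line is
# "Negative": the examples alone taught it the task.
```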

So I might try few-shot right away and see how that works. Now, what if we want to include our own data, or ask a large language model about a topic it has never seen before? This matters for recent topics: these large language models are trained on a corpus of data, and there's some time horizon on that corpus. Maybe the model doesn't know anything after July 2023.

So if I ask the model who won the Formula One race in Las Vegas last weekend, it has no idea. It's going to make something up; it's going to hallucinate. Now, in a way, I sometimes view humans as large language models, and we can act just the same.

If we give it some context, or if I take in some context, I can then answer questions about something I don't know anything about. So let's assume John doesn't know anything about Formula One. John's our large language model, and I'm going to ask him: John, can you read the paper from Las Vegas over the weekend? So he reads the article; he reads the Wikipedia article about Formula One.

Now I can ask him questions about it. Who won the race, John? He's going to know; he's read the article. How many people were in attendance in Las Vegas? He's read the article; he can answer that. He's summarizing and processing that information just like a large language model will. And the technique for doing this with a large language model, where you want to use your own data or new data, is called retrieval-augmented generation, or RAG.

All this means is that we provide the model with context in the prompt. We are not training the model at all; the model is static. We're still doing prompt construction, but now we're stuffing relevant context into the prompt.

So say I'm an organization with 20,000 documents in a repository, and I want to talk to those documents: search them, ask questions of them. How would I do that? I would use a process like retrieval-augmented generation, where I first take those documents and chunk them up. I might do that by sentence, by paragraph, or by page; that's a knob we can turn.

I'm going to chunk those up and convert them into vector embeddings. If you use some of our managed services, this is all handled for you, but we have a lot of customers who are still building their own RAG-based architectures. We're going to store those embeddings in a vector database, which just allows us to retrieve the important pieces of the text.

So in the prompt, I'm not going to stuff all 20,000 documents. I want to stuff the three most important documents, or the three most important paragraphs, for the question I'm trying to answer. Maybe I'm asking it to summarize, or to compare and contrast different documents; it's going to go find those documents first.

So in retrieval-augmented generation, the R is for retrieval: go find the documents that are important, find the relevant material first. Then, for the generation, we augment it: we include that important context in the prompt. Here you can see there are just three chunks I'm going to include, and those become part of the prompt to the large language model.
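Here's a minimal end-to-end sketch of that flow, assuming Bedrock's Titan embedding model and a plain in-memory cosine-similarity search standing in for a real vector database. The corpus is made up, and each document is treated as a single chunk for brevity:

```python
import json
import math
import boto3

bedrock_runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

# A tiny invented corpus standing in for the 20,000-document repository.
documents = [
    "Race report: the grand prix ran in Las Vegas on Saturday night, "
    "and weekend attendance figures were published in the local paper.",
    "Facilities memo: the office printer on floor 3 is out of toner.",
    "HR bulletin: open enrollment for benefits closes at month end.",
]

def embed(text):
    # Titan's embedding request body is {"inputText": ...}; the response
    # JSON carries an "embedding" vector.
    resp = bedrock_runtime.invoke_model(
        modelId="amazon.titan-embed-text-v1",
        body=json.dumps({"inputText": text}),
    )
    return json.loads(resp["body"].read())["embedding"]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# 1. Chunk and embed; this list of (chunk, vector) pairs stands in for
#    a real vector database.
index = [(doc, embed(doc)) for doc in documents]

# 2. Retrieve: embed the question and pull the most similar chunks.
question = "How many people attended the race in Las Vegas?"
q_vec = embed(question)
top_chunks = sorted(index, key=lambda item: cosine(q_vec, item[1]), reverse=True)[:2]

# 3. Augment and generate: stuff only the retrieved context into the prompt.
context = "\n\n".join(chunk for chunk, _ in top_chunks)
prompt = (
    f"\n\nHuman: Using only the context below, answer the question.\n\n"
    f"Context:\n{context}\n\nQuestion: {question}\n\nAssistant:"
)
# Send `prompt` to a text model with invoke_model, as shown earlier.
```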

So this is a way to use relevant data: your own proprietary data, data outside the training corpus that was used to create the model. Now, what if I want to change the behavior of a model completely? I don't want it to just know about my documents; I want to change the model itself.

So instead of a chat-based or instruction-based model, maybe I want a model that just takes in a transcript and produces a report. That's its only job. This is the business problem; we're working backwards from a business problem, and it's going to be a report generator. I'm going to take a transcript and produce a report.

Well, if I have lots of examples (maybe I've been doing this in my organization for the past 10 years, so I have lots of these raw transcripts and the human-generated reports), I can fine-tune a large language model. Here we are changing the weights; a typical approach is to use an adapter, a low-rank adapter, to change the model behavior. Either way, we're changing the behavior.

Earlier, with the RAG-based approach, we were changing the knowledge of the large language model. When we fine-tune, we're changing the behavior: we're creating a report generator, or a sentiment analyzer, or whatever task you might have; you can fine-tune your model for it. This is no different from supervised machine learning: in fine-tuning, we have input and output examples.

What's my input in this example? A transcript. And my output is a report. That's supervised machine learning. You might also hear about pre-training, another topic in large language models, which is where we just want to increase the lexicon of the large language model.

So maybe it's medical information; we might do some pre-training and fine-tuning. Both of these are available not only in SageMaker but also in Bedrock. Fine-tuning is now generally available in Bedrock as well; that was announced this week. So we have lots of opportunities to craft our prompts, include our own data, and fine-tune these models to make them our own.
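As a sketch, the supervised part is just pairs of inputs and outputs. Bedrock's fine-tuning (model customization) jobs, for example, take JSONL records of prompt/completion pairs staged in S3; the records below are invented placeholders:

```python
import json

# Each training record pairs an input (a transcript) with the desired
# output (the report a human would have written). Contents are invented.
examples = [
    {
        "prompt": "Transcript: Caller reported a water outage on Elm St...",
        "completion": "Report: A water outage was reported on Elm St...",
    },
    {
        "prompt": "Transcript: Caller asked about permit renewal dates...",
        "completion": "Report: The caller requested permit renewal dates...",
    },
]

# Write the JSONL training file, then upload it to S3 and point a
# Bedrock model-customization (fine-tuning) job at it.
with open("train.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")
```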

So now, instead of using a generic open-source model, we might fine-tune it, and I have a report model, a legal model, an HR model, where I've changed the behavior over time.

OK. So we've talked about how to use the model, what a prompt is, few-shot learning with examples, and RAG-based architectures. How do we use these? If large language models just sit on the shelf, they're not important, right? We want to use them in production; we want to productionize them. How do we do that on AWS?

I'm going to cover two different areas of productionizing and deploying your large language model. The first one is Bedrock. Again, Bedrock is a fully managed API that gives you the flexibility of selecting different large language models under the hood. When you communicate with a Bedrock endpoint, it might be a Llama 2 endpoint or a Titan embedding endpoint. These models are deployed in a Bedrock service account on AWS, and you interact with the model from your virtual private cloud, your VPC.

When you interact with that model, none of the data from that interaction is stored. None of your data will be used for downstream training of the model, none of your data is stored in the Bedrock service account, you control encryption at rest, and encryption is automatically applied in transit when communicating with the Bedrock service account.

You have options for how you communicate with that service account. You can communicate over public IP space, over the public internet, still encrypted in transit and at rest. But we have other customers who never want any of their data, even encrypted in transit, to leave the AWS network backbone.

For them, you can configure your VPC to use PrivateLink to communicate directly with the large language model, so your data never traverses public IP space, never touches the public internet. If you've ever used any of our other AI services, like Rekognition or Transcribe or Textract, it's a very similar interaction: those are models deployed by AWS that you use through a simple API, very similar to Bedrock.
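As a rough sketch of that PrivateLink setup with boto3 (the VPC, subnet, and security-group IDs are placeholders for your own resources; the service name follows the usual com.amazonaws.&lt;region&gt;.&lt;service&gt; pattern for the Bedrock runtime):

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Create an interface VPC endpoint so calls to the Bedrock runtime stay
# on the AWS backbone instead of traversing the public internet.
resp = ec2.create_vpc_endpoint(
    VpcEndpointType="Interface",
    VpcId="vpc-0123456789abcdef0",                     # placeholder
    ServiceName="com.amazonaws.us-east-1.bedrock-runtime",
    SubnetIds=["subnet-0123456789abcdef0"],            # placeholder
    SecurityGroupIds=["sg-0123456789abcdef0"],         # placeholder
    PrivateDnsEnabled=True,  # bedrock-runtime calls resolve to the endpoint
)
print(resp["VpcEndpoint"]["VpcEndpointId"])
```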

So what if you want to use SageMaker? SageMaker is very similar, but it's not fully managed behind just an API: you first have to deploy the large language model, and there are a couple of options for how you deploy. You can use SageMaker JumpStart for deployment. When you deploy a SageMaker model to a real-time endpoint, it's deployed in the SageMaker service account, very similar to how models are deployed in the Bedrock account.

However, that is your model to interact with. You get an HTTPS RESTful interface to the SageMaker real-time endpoint hosting your large language model, and you can interact with that endpoint just as you can with Bedrock, either over public IP space or using PrivateLink to communicate from your VPC directly to the SageMaker inference endpoint.
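A minimal sketch of calling such an endpoint through the SageMaker runtime API; the endpoint name and JSON payload shape are illustrative and depend on the model container you deployed:

```python
import json
import boto3

smr = boto3.client("sagemaker-runtime", region_name="us-east-1")

# Invoke a deployed real-time endpoint over its HTTPS interface. The
# endpoint name and payload format depend on your deployed model.
response = smr.invoke_endpoint(
    EndpointName="my-llm-endpoint",  # placeholder name
    ContentType="application/json",
    Body=json.dumps({"inputs": "Summarize the following transcript: ..."}),
)
print(json.loads(response["Body"].read()))
```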

So you have lots of options for how you configure this, and again, the same security and privacy is in place for SageMaker as for Bedrock: encryption at rest and in transit. Now, these foundation and large language models can't perform end-to-end tasks on their own. They can do a lot with the prompt, and we're going to show some examples of that. But what if we have a more complicated task with additional steps, the kind a human would break down and execute?

We want to give it a task and have it break that task down, and that's where agents come in. Agents for Bedrock were announced as generally available this week; they've been in preview for a number of months. What is an agent for Bedrock? An agent is something that takes an action for you. It orchestrates a task; it does a small piece of the work. And agents can interact with external APIs.

So let me give you an example. Maybe I work at a university, and the task I want to solve quickly, without a developer in the loop, is to personalize an email to all the students who have applied to the university but haven't completed their application, personalized from the department they're interested in.

I could give that instruction to a large language model with agents, and it would start to break the task down. First, find all the students who have applied: which ones started applications but haven't finished them? What departments are these students interested in? Then create a personalized email about the next step they need to take, even personalized from the department itself.

So maybe we want to add departmental information and personalize it to the person. Then I want to send that email, and by the way, if they don't open the email, go ahead and send a text to follow up in seven days; we can have the agent execute that as well. So I've given a very simple instruction, just as I would if John and I worked together at the university, and now a large language model can execute that task. That's what agents do: break a complicated, more complex task down into individual tasks. And with agents for Bedrock, you can even inspect and troubleshoot the traces.
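For reference, here's roughly what invoking an agent looks like once it has been configured in Bedrock. The agent and alias IDs are placeholders, and the agent's action groups (which wire it to your student-records and email APIs) are set up separately in the Bedrock console or API:

```python
import uuid
import boto3

agents_runtime = boto3.client("bedrock-agent-runtime", region_name="us-east-1")

# Hand the agent a natural-language task; Bedrock orchestrates the steps
# against the action groups (external APIs) the agent was configured with.
response = agents_runtime.invoke_agent(
    agentId="AGENT1234",        # placeholder agent ID
    agentAliasId="ALIAS1234",   # placeholder alias ID
    sessionId=str(uuid.uuid4()),
    inputText=(
        "Email every applicant with an unfinished application, "
        "personalized from their department of interest."
    ),
)

# The response is a stream of events; collect the completion chunks.
for event in response["completion"]:
    if "chunk" in event:
        print(event["chunk"]["bytes"].decode())
```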

So you can see what actions the Bedrock agent took, and if they're not quite right, you can tweak them over time. OK. We've talked a lot about large language models and generative AI and what the great use cases are. Now I want to show two different demos of generative AI, and I'm going to start with one about an executive order.

Earlier this month, an executive order came out from the White House. I think it was on November 1st, and it was on the responsible use of AI in the federal government. It's very descriptive about what actions each department needs to take and when, so it has requirements in it; you can see an example here. I think it's 36 pages. If you've never read an executive order, it's a bit dense. So we thought: how can we summarize this and extract information from this document from the perspective of different departments within the federal government?

Obviously, our federal government customers are going to be reading this, but let's simplify it a bit in this example. We created this quick demo where we upload the document, the executive order, which happens to be a PDF. We drag and drop it over, and this could apply to any executive order. OK, you can see that the executive order is now uploaded.

In the background, we're using Textract to extract that information from the PDF. So we have raw text, and now I want to summarize and extract information from this document from the perspective of a specific department. I want to start with the Department of Energy: what are the requirements and important points in the executive order for the Department of Energy? Behind the scenes, I'm using a prompt template, swapping in the department and asking the large language model to summarize the order from that department's perspective.

I'm also going to ask it to extract all the requirements for that department and sort them chronologically, so I know what actions I need to take right away. And you can see it's already working. It says that Section 4.2 directs the Secretary of Energy to take steps to understand and mitigate AI safety risks, and it goes on to Section 5.1 for new researchers. So there are very specific requirements on these different departments. At the bottom of our analysis, we've summarized those requirements and sorted them chronologically. We did not sort them; we asked the large language model to do that in our prompt template. And here are bullet-sized points: the first is that within 90 days they must develop and implement a plan for building model evaluation tools, then within 120 days, within 180 days. You can see we've extracted these requirements, and it took four seconds for this large language model to read those 36 pages and pull out the information. So that's the Department of Energy.
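A minimal sketch of the pipeline behind a demo like this, assuming Textract for the text extraction and a department-parameterized prompt template. The template wording is illustrative, not the demo's exact prompt, and the synchronous Textract call shown here handles single-page documents; multi-page PDFs use the asynchronous API via S3:

```python
import boto3

textract = boto3.client("textract", region_name="us-east-1")

# Pull the raw text off the uploaded document with Textract.
with open("executive_order.pdf", "rb") as f:
    result = textract.detect_document_text(Document={"Bytes": f.read()})
raw_text = "\n".join(
    block["Text"] for block in result["Blocks"] if block["BlockType"] == "LINE"
)

# Department-parameterized prompt template; swap the department per run.
TEMPLATE = """From the perspective of the {department}, summarize the
executive order below, then extract every requirement assigned to that
department and sort the requirements chronologically by deadline.

Executive order:
{document}"""

prompt = TEMPLATE.format(department="Department of Energy", document=raw_text)
# Send `prompt` to your chosen Bedrock model, as shown earlier.
```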

If you've read the executive order, the Department of Commerce has a lot of work to do, and that's the next example, where we evaluate the order from the purview of the Department of Commerce. It's very similar: again, we're using the prompt template behind the scenes and swapping out the department, but asking the same questions: summarize this, extract the requirements, and tell me what actions I need to take. Now you can see the sections are ever so slightly different, and we're seeing views from the perspective of the Department of Commerce.

If you're a deep researcher in generative AI, this is a softball for large language models. But it's still amazing that it can extract information, sort information, and understand dates and times. What if we want a more difficult demo? That example didn't use any additional tricks, though we did use the executive order as context: thinking back to prompt construction, part of the context was the 36 pages of extracted executive order. In the next demo, I want to crank up the complexity just a little bit.

This application was developed internally by the social responsibility impact team at AWS, and it's a simple demo about evaluating proposals. At AWS, if you have an idea, typically your first task is to write it down: write a paper about it, a one-page paper, a six-page paper. So we're going to evaluate those proposals.

To evaluate them, we first think about our persona. Here, our persona is a member of that social responsibility impact (SRI) team evaluating a proposal. This, again, goes in the prompt. We also have a rubric: how do I grade this proposal? How I grade it might be different from how John grades it if we're on different teams. Here we have a social impact rubric, so we look at the impact potential, the feasibility, and the benefit. And by the way, we're not going to weight these evenly: potential impact is going to be 40%, innovation is going to be 25%. We can adjust how important each pillar is in our rubric. OK, so we have a persona and a rubric for how we're going to grade.

Let's go ahead and upload an example proposal. This one is about telehealth, and the social impact of telehealth and health equity on being able to access health care. I'm going to paste in the raw text of this proposal; it's only a couple of pages long. We'll use the same persona and rubric we just looked at, and all of this information becomes part of the prompt. Again, we're not changing the large language model at all; this is basically prompt construction. And here's our rubric again, with those pillars. OK.

So let's take a look. This one is analyzing right now, but I already ran one earlier, so let's look at that output; it takes about 20 seconds to complete the analysis. How did the large language model grade the proposal against the rubric?

We can see the content repeated at the top, and we can see the score, a numerical score. Notice that my rubric didn't say anything about a numerical score; it had weights, and we prompted the model to include a numerical score. This is a 78. I think that's a C-plus, maybe. Then we have scores for each pillar: for equity, a score of 90; for feasibility, a score of 70. And we also get assessments: for each pillar in the rubric, how did the proposal do? But not only how we assessed it, also how to make it better. So we have a separate section of recommendations for each area: if you want to focus on sustainability, you should change these things in your proposal.

And remember, at the beginning of today's talk, we said large language models can do more than one task. So here, while we're grading this proposal, we also asked it to extract entities at the bottom. This was just a little extra: extract all the people who are mentioned, and any AWS services that are mentioned. So we threw in named entity recognition at the end, and we're doing all of this in a single prompt. A large language model can typically do more than you think in your simple prompts when you start simple.

But think about the impact this proposal example could have on education. We uploaded a persona specific to our organization, with a rubric specific to our organization. What if I change it to high school chemistry? What if I change it to a government proposal for an RFI or RFP? What if I want the RFP to automatically extract all of the requirements and grade our proposal response against those requirements? We can use a large language model to do that, and we should, to be efficient and move faster.

OK. So this is all great; we've talked a lot about generative AI, and it's a great story when these models are available. But how do you get started? How do you actually put this into production? There are a couple of different ways, and I want to give you a few tips.

First of all, there's a course on generative AI that came out just a few months ago. It's on Coursera and was developed with DeepLearning.AI. If you're at all familiar with Dr. Andrew Ng, originally from Stanford, it was developed with Dr. Ng and some of my AWS colleagues. The course gets quite deep and can get a bit technical, but there's also an introduction and an easier track to take. And this isn't the only training available.

We also have workshops available. It's really an amazing world we live in: you can just go to YouTube and search for Bedrock examples, Bedrock tutorials, how to get started with large language models today. But for me, it always comes back to where we started: how do we select the right use case? We don't do machine learning for the sake of doing machine learning; we're not doing generative AI to do generative AI. We're solving a problem, whether a business problem, a mission problem, or an organizational problem, and we're working backwards from that problem. There are still going to be use cases where you could use generative AI but probably shouldn't; it might be overkill, and we'll work with you on determining which ones are good use cases for generative AI. The other piece, though, is that the bar to experiment is extremely low.

I get questions from customers all the time: "Hey, can we translate Perl into Python with a large language model?" And typically my answer is, I don't know, we should try it. The bar to experiment is so low; let's try it. That doesn't mean we have to move to production right away, but we can fail fast, see what works, and determine our prompts quickly. Getting your teams up to speed is the second thing: training, whether that Coursera course or the other trainings available on YouTube, and then, most importantly, getting your hands on the keyboard and working.

I work with federal government customers all the time on proofs of concept. At AWS, John does as well; solutions architects will work hand in hand with your team on short proof-of-concept engagements. How do we try something out quickly? How do we act as a technical coach to show you the path and help you avoid the pitfalls we've seen other customers run into? So if you're not already doing proofs of concept with AWS, engage your AWS account team.

And the last piece of getting started with generative AI is the Innovation Center. We invested $100 million in the Generative AI Innovation Center, where we work backwards from a customer problem, hand in hand with customers and developers, to show how to solve these problems and get a product or a proof of concept in place. This has been extremely popular over the last three months. And just this morning, they announced an innovation center focused on Anthropic, a specific foundation model provider, starting just after the new year. So there are lots of opportunities to get started.

So just to recap the tools we have available at AWS to get started: Bedrock is a fully managed API giving you the flexibility of selecting different foundation models while ensuring your data is secure and private, with encryption at rest and in transit. Your data is your differentiator, and it stays your data, period. Using generative AI on AWS doesn't change any of that.

We talked about deployment options: if we want to interact with these large language models and put them in production, how do we do it? We have options in both Bedrock and SageMaker, and within SageMaker, you'll also hear the term SageMaker JumpStart. At the end of the day, AWS has everything you need to get started on generative AI today and to build powerful applications, especially if you're already an AWS customer. If you're already using AWS services, it's a natural next step.

We do have a couple of links here that we'll share, and this deck will go out as well, with some QR codes for getting started with Bedrock. The first QR code is the Bedrock documentation. The second is a step-by-step tutorial, a YouTube video one of our developers put together on how to get started with Bedrock, sharing their screen and walking through it. And the third is for when you say, "OK, great, now I'm ready to get my hands on the keyboard and use Bedrock. How do I get started?" That's a deep-dive workshop for using Bedrock today.

So I hope you've learned something today. We've covered a lot, from prompt construction to use cases to Bedrock versus SageMaker, and when you might use one versus the other. John and I want to thank you for your time today.
