Generative AI: Asking for a friend

Oh, what a start. Hello everybody, and thank you for joining us at this session, Generative AI: Asking for a Friend. How is everybody's re:Invent? Excellent. I'll just let you know that we're here in Mandalay Bay, and we're also being joined on simulcast by an audience in the MGM Grand. How are you? I think I heard them. Yeah, and in the Venetian as well.

My name's Mike Chambers and I'm a Developer Advocate for Amazon Web Services. And my name is Tiffany Souterre; I'm also a Developer Advocate, specializing in AI/ML at AWS. And yeah, we'll start off by doing one of these. I don't know if you've seen these; I haven't worked with one before, but it's a bit of a survey, so you can all join in no matter where you're watching from.

You can take a picture of this QR code, or you know the thing, you tap, tap, tap, and you should get a poll. And we're asking this question: has generative AI changed the way you live? Has it changed the way you live, Mike? It absolutely has, because I'm standing on this stage; I don't think I'd be doing that without it. What about you? It has definitely changed the way I live too, but that's probably because we work in this field, so it has a huge impact on our work. We'd be very curious to know if it has changed the way you live. Absolutely. And, oh, look at that, we've got some results coming in live. Excellent. Yes and no; that's very indecisive. That's surprising; we were not expecting that. Actually, I was thinking it would lean a little more toward yes. I guess we didn't actually take part in it ourselves. OK, well, that's interesting. It would be interesting to know, at the end of this presentation, whether you have similar feelings or whether you're a little more positive about generative AI moving forward. So we can look at that as well. Shall we hop into the agenda for today?

Yes. So today we are going to answer a few questions: What is generative AI? What are language models? How does text generation work? What are foundation models? How do you use LLMs? How do you customize LLMs? What is retrieval augmented generation? What are agents? And we're also going to talk about Amazon CodeWhisperer.

Absolutely. And so I think it's important to look at those questions we're going to answer, covering some of the fundamentals, but we also want to get through to some of the more current topics people are talking about in generative AI, and make sure everyone has an understanding of those as well. And if time allows, maybe we can open up to questions from the room here in Mandalay Bay.

Yeah. So first, what is generative AI? We hear a lot about this, especially over the last year, but most of the time people don't know exactly what it is. It's actually a subset of artificial intelligence. Before all of the generative AI hype, you might remember that in machine learning we had models that were very good at classifying things. For example, if you gave one an image to analyze, it could tell you whether there was a dog or a cat in the image. It's always dogs and cats.

Yeah. Well, a dog or a cat, or a car. So it was able to do recognition, but it was not able to draw a cat or draw a dog. And that's generation: generative AI is specialized in generating new data that it has not seen before in its training data set. So you could now send a prompt, "a golden retriever wearing glasses and a hat, as a portrait painting," and it would generate that image, an image that has never been created by anyone before; the AI generated it. It's the same with a question: you can have a conversation with an AI, and it will give you back an answer that is completely customized to the question you've asked.

We're doing another one of these. We're not going to do this the whole time; it's not one slide, then another question, then we invite you all up on stage. Let's do just one more, then we'll crack on, and we'll have one more towards the end as well. So: has generative AI changed the way you work? Before, we asked about the way you live; what about the way you work? Is it ingrained in your workflow? Is it something you're actually developing with at the moment? Hopefully you got that in time. A little bit more this time; that's interesting. Actually, we didn't ask whether it has changed the way you work in a positive way; hopefully it has. Well, that would be a debatable question, and we probably don't want to have that debate here now. We'll go to a coffee shop afterwards and talk about it, off camera. All right, OK. Look, I appreciate you answering those questions, thank you. I hope you found it interesting to see what everybody else thinks as well.

So, can you tell us, what are language models? What are language models? Well, we're talking about generative AI, and one of the key developments, and where we're seeing the most business value at the moment, is in the language models we're using. So I wanted to explain a little bit about where these have come from. Why is it that we're talking about language models today, when we weren't necessarily talking about them so much a couple of years ago? I want to briefly recap the road to generative AI: how did we get to where we are now? And it all starts at the beginning, with word embeddings.

So word embeddings are a statistical way of drawing relationships between words; that's really all they are. We take as many words as we can find in natural language, say everything we can find on Wikipedia, or the complete works of Shakespeare, whatever else we can find, and start to analyze it to see the relationships between words. And this is something we've been doing for a while; this road goes back at least ten years, maybe a little longer, in terms of really serious work in this space.

And I can't show you word embeddings in the space they normally exist in, because it's a multi-dimensional space, and unfortunately, as much as we are a technology company and this is a technology conference, my display only shows two dimensions. I do apologize. So this is a two-dimensional graph, and we'll look at a simulated idea of what the relationships between words might look like.

And here it comes again, the dog and the cat thing; it's a common theme in machine learning that we talk about dogs and cats and abalone. Anyway, this is where "dog" might sit in a two-dimensional vector space: we've seen the word "dog" used in natural language. Now let's put it in the context of something else; yes, it's going to be "cat." And we see that "dog" and "cat" coexist in this space; they relate to each other in these two dimensions. And the way we calculate the relationship between those words is by the way they are used in the language.

So cats and dogs are often seen close to the same words, like running after something, or eating a biscuit, or playing with something. This is how, mathematically, we can say those words are close to each other: because they are often used with the same other words. And the juxtaposition to that is "car": "car" is not particularly related to "dog" or "cat" in this space, although dogs do sometimes chase cars.

But it is related to "bus"; it's a different kind of transport, another vehicle. And so if we put all the words of natural language onto this graph, of which this is just a subset, you can see that it does map out; you can see the relationships between words. And these relationships between words, these word embeddings, are the basis of language models.

And the language models that we're using today share their roots with this. Just to add a little bit mathematically: the location of the words is the position of their vectors. You might have heard of this; when we vectorize text, we give each word a position in that space. That's exactly it. And in years gone by, we would use a number of different algorithms, including the word2vec family, to help us process large amounts of text and come out with these basic language models. And this allowed us to work in the field of what we used to call, and I guess still do call, natural language processing, NLP.
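To make that concrete, here is a minimal sketch of the idea in Python. The vectors below are invented two-dimensional stand-ins (real word2vec embeddings have hundreds of dimensions), but the cosine similarity calculation is the standard way to measure how "close" two words sit in the embedding space:

```python
import numpy as np

# Illustrative 2-D "embeddings" -- these numbers are invented purely
# for demonstration; real embeddings are learned from text.
embeddings = {
    "dog": np.array([2.0, 8.0]),
    "cat": np.array([2.5, 7.5]),
    "car": np.array([8.0, 2.0]),
    "bus": np.array([7.5, 2.5]),
}

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Similarity of direction: close to 1.0 = related, lower = unrelated."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine_similarity(embeddings["dog"], embeddings["cat"]))  # high, ~1.0
print(cosine_similarity(embeddings["dog"], embeddings["car"]))  # much lower
```

Run as-is, dog/cat scores near 1.0 while dog/car scores much lower, mirroring the clusters on the slide.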

And the kinds of tasks we could do back then, and still do now, are things like text summarization: we could train a model to do text summarization. We could train a model to do question answering, or sentiment analysis, and also spoken language understanding, which is basically the technology behind products like Alexa, where it understands human intent. So that's what we did.

And with those kinds of language models, how could we use them to generate text? Well, we try to predict the next word. There's going to be a lot of that in this presentation: trying to predict the next word. In this particular example, we have a sentence, we know the last word of that sentence, and we're going to try and predict the next word using very simple word embedding techniques.

Now, at the moment we can't see the rest of that sentence, and the reason is that the model can't either. It's just looking at the last word, "lazy," and figuring out what the next word might be. Machine learning is all about making predictions. And based on the fact that the word is "lazy," we could guess that maybe the next word is "person": a lazy person. It makes sense from a language perspective. But is it right? We don't really know, because we can't see the rest of the sentence.
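A toy way to see this limitation is a bigram model that, like the example on the slide, only ever looks at the last word. This sketch uses an invented mini-corpus purely for illustration; any text would do:

```python
from collections import Counter, defaultdict

# Invented mini-corpus purely for illustration.
corpus = ("the quick brown fox jumps over the lazy dog . "
          "a lazy person sleeps all day . "
          "the lazy person yawns . every lazy person naps .").split()

# Count which word follows each word. Like the slide's example, the
# model sees only the last word -- no wider context at all.
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

# Given only "lazy", the most frequent follower wins: "person",
# even though the actual sentence continues "...the lazy dog".
print(following["lazy"].most_common(1))  # [('person', 3)]
```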

What we needed to do was find a way to get more of that context, more of what was in the rest of the sentence. And so we, and I'm talking about us as a field, worked on this over time and came up with recurrent neural networks and technologies like that, and those recurrent neural networks can actually see more of the context, more of the sentence.

And so now we can see a little bit more of what's going on here. By the way, the more words we try to include in this calculation, the bigger the growth, not quite exponential, technically, but big, in the amount of compute power and memory required to make the calculation. So it's not particularly efficient by today's standards. And now we can see "fox jumps over the lazy" something, and we're trying to predict the next word.

Now, I need you all to pause your squishy human brains, which probably know exactly what's coming next, and think about it from the perspective of the model. The model is again just looking at similarities between words. And so it might make the prediction of "rabbit." That could be it, right? That makes sense: a fox and a rabbit might coexist in the same space. So we're making predictions about what the next word might be.

The point is that this older technology wasn't very good; it wasn't very accurate at predicting what we wanted, because it just didn't have the ability to take in the whole context of the sentence. So what changed? Why are we here now? Why are we all so familiar with chatbots capable of doing amazing things? Why are we talking about generative AI? The pivotal moment was in 2017.

Now, I'm not going to go through very many white papers in this session, don't worry. But please lock the doors; we're not letting anybody out. This white paper, "Attention Is All You Need," was released in 2017, and it really changed everything, because inside this paper they discussed this diagram. And I'm now going to spend the next week explaining everything that's in this diagram. This is the transformer architecture, and it is honestly a much more complicated architecture, but very, very powerful.

And two, you would all leave, so I'm not going to keep doing that. But trust me, with the temperature turned down, it will keep coming back with "gambling," and now we can adjust it by typing in something different there.

But we'd have to turn that randomness, that temperature, back up in order for it to start generating more varied things. So this is useful, right?

So if we're doing something where we're asking the model to produce code, which is something we could do, then maybe we don't want it to be creative; we want it to be a little more deterministic. Whereas if we're looking for it to write a poem, then we could move that needle all the way to the top.

Top P and Top K are similar kinds of controls that again influence the diversity of the words the model considers for its generation.
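Here's a rough sketch, with made-up logits and vocabulary, of how these three controls interact during sampling. It isn't any particular model's implementation, just the standard temperature/top-k/top-p recipe:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_next_token(logits, temperature=1.0, top_k=None, top_p=None):
    """Sample one token id from raw logits, the way LLM decoders do.

    temperature < 1 sharpens the distribution (more deterministic),
    temperature > 1 flattens it (more diverse). top_k keeps only the k
    most likely tokens; top_p keeps a small set of the most likely
    tokens whose cumulative probability stays within p.
    """
    logits = np.asarray(logits, dtype=float) / max(temperature, 1e-8)
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()

    if top_k is not None:
        cutoff = np.sort(probs)[-top_k]          # k-th largest probability
        probs = np.where(probs >= cutoff, probs, 0.0)
    if top_p is not None:
        order = np.argsort(probs)[::-1]          # most likely first
        keep = np.cumsum(probs[order]) <= top_p
        keep[0] = True                           # always keep the top token
        mask = np.zeros_like(probs, dtype=bool)
        mask[order[keep]] = True
        probs = np.where(mask, probs, 0.0)

    probs /= probs.sum()
    return int(rng.choice(len(probs), p=probs))

# Invented vocabulary of candidate next words with made-up logits.
vocab = ["gambling", "casinos", "shows", "weather"]
logits = [3.0, 2.0, 1.0, 0.1]
print(vocab[sample_next_token(logits, temperature=0.1)])  # almost always "gambling"
print(vocab[sample_next_token(logits, temperature=2.0)])  # much more variety
```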

And there's more information you can get from the info bar, so I'll let you have a look at that in your own time.

Other kinds of configuration options here include things like maximum length. This is super useful. Because I'm only generating one word, I could literally bring that down to a small number.

Now, this is tokens, not words, and there's a nuanced difference between the two, but I could bring it down to something quite small. And if I were to write here, "Write a summary of Las Vegas," rather than just a one-word completion, I'm really pushing it here.

And if I reduce this, sorry, bring this down to, I don't know, something under 100, and press run, then we get some kind of generation. But look, it finishes partway through, and you're all scanning it now to see if it says anything libelous, but it stops halfway through a sentence.

Now, if we were using the API, it would actually tell us that the output had been truncated, cut off, so we could switch that back up. And this is useful, because when we're using large language models, when we're using generative AI, we're often billed at the word or token level.

So having some control over it with maximum length is super useful. If I push that up to the top and press run again, we hopefully get something that goes all the way to the end. And that's pretty fast, right? And a pretty comprehensive answer.

Hopefully. And obviously that was a super, super simple prompt. If we wanted to get a little more in depth, we could give it quite a lot of context about the kind of thing we want it to do. So that went OK, I think, even if the demo of the creativity setting didn't quite work. Alrighty.
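For reference, here's roughly what that demo looks like through the API rather than the console, using boto3 and the Amazon Titan text model. The model ID and the request/response field names follow the Titan schema at the time of writing, so treat this as a sketch and check the current Bedrock documentation:

```python
import json
import boto3

# Assumes AWS credentials and Bedrock model access are already set up.
bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

body = json.dumps({
    "inputText": "Write a summary of Las Vegas.",
    "textGenerationConfig": {
        "maxTokenCount": 50,   # deliberately small, like the demo
        "temperature": 0.7,
        "topP": 0.9,
    },
})

response = bedrock.invoke_model(
    modelId="amazon.titan-text-express-v1", body=body
)
result = json.loads(response["body"].read())["results"][0]

print(result["outputText"])
# Unlike the console, the API reports when output was cut off by the
# token limit, e.g. completionReason == "LENGTH" rather than "FINISH".
print("completion reason:", result["completionReason"])
```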

So how can you customize a large language model? We just looked at how we can prompt in different ways; subtly, we did a couple of different prompts, one-word or not, and we can see that prompting in different ways can produce different kinds of outputs.

But we've probably all heard of hallucination. Who's heard of hallucination? Yeah, we've all heard of it. One of the challenges of using generative AI as a whole is: how do we control the output even more than with the kinds of things we've just looked at?

So how can we customize a large language model? Now, Tiffany was talking before about foundation models. One of the things you could do is go and build your own model. But as you suggested before, that's probably not the best idea. It's not a rainy-weekend project, right? It's not. It's going to cost you a lot of money, and it's going to take you many months.

So it's certainly not the first port of call; if you're struggling with the outputs from the models, I wouldn't recommend starting off by building your own model. Another thing you can do is continuation training. You can take a foundation model that already exists, and if you have access to the actual model itself, an open source model, for example, you can take that model, get your own training data, and carry on training it.

That's totally possible, but it's still quite a big job, so not necessarily your first port of call either. You can also fine-tune, which is subtly different from continuation training. This is where we give the model examples of "for this prompt, this is the answer I would like," and you can use big data sets like that to refine, or fine-tune, the model to make it more task-specific for the kind of thing you want to do.

So if you're working in a specific industry, or, more to the point, your business works in a certain way where you want the generations to be a certain way, and you've already got a data set of "I like my things summarized like this," then you can fine-tune, and that works too. But still, I wouldn't necessarily recommend that as the first port of call. I think the first thing to do is take a look at in-context learning.

And this is where we start to look at the topics you might have heard spoken about this week, including in Adam's keynote, where we talk about retrieval augmented generation. Yeah.

So basically, you have your LLM, which is non-specialized; you have your plain vanilla foundation model, and you send it the prompt, that's the query. And what you can do for in-context learning is give more data with that prompt.

So what happens here is that you send the request, the query you want to send to the LLM, but you also give it very accurate data that could help the foundation model answer the question. And that data can come from different sources: it could come from vector databases, or from an API that you can query and that sends the data back into the prompt.

Basically, what you do is enrich your prompt with more information, and that's RAG, retrieval augmented generation. And you have a nice explanation, a nice metaphor, for what RAG is. You're saying this has something to do with me? I don't know. OK.
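Stripped of any specific product, the RAG flow just described fits in a few lines. This sketch fakes the retrieval step with naive keyword overlap so it stays self-contained; a real system would embed the query and run a similarity search against a vector database. The documents and query are invented:

```python
def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Stand-in retriever: ranks documents by word overlap with the query.
    In practice this would be an embedding + vector-database lookup."""
    def overlap(doc: str) -> int:
        return len(set(query.lower().split()) & set(doc.lower().split()))
    return sorted(documents, key=overlap, reverse=True)[:k]

documents = [  # hypothetical private company data the model never saw
    "Our return policy allows refunds within 30 days of purchase.",
    "Support hours are 9am-5pm Pacific, Monday through Friday.",
    "The 2024 product line launches in March.",
]

query = "What is the return policy?"
context = "\n".join(retrieve(query, documents))

# The augmented prompt: retrieved facts first, then the user's question.
# This combined string is what actually gets sent to the LLM.
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
print(prompt)
```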

Yeah, look. So, retrieval augmented generation. Sorry, I stole your punch line. Yes. We want to explain a little bit about this and how it works; there are some misconceptions out there, and it's actually a really interesting architecture. So let's step through it, and I thought the way we could do that would be with a little story.

So let's not think of a large language model as a computer model for a moment; let's think of it as somewhat of a person. I'm not usually a fan of anthropomorphizing, of making human characters out of large language models, but I'm a fan of this one.

So this, we haven't got a name for this character. The wizard? Wizz the Wizard. This is Wizzy the Wizard, just named. And Wizz the Wizard is learning magic. This is our large language model, for the sake of this demonstration.

And initially, obviously, an untrained model can't do very much; poor Wizz the Wizard doesn't know how to do any magic at all. So what do we do? I think we all know: we send Wizz the Wizard to school, a wizard school, where Wizz the Wizard learns how to do magic.

And so we are essentially pre-training our large language model at this point. So now we can do magic, right? This is great: I know how to do everything. So, and I'm not Wizz the Wizard, of course, but I go out with my wand, I see a troll, and I need to protect people from it. What do I do? I don't know how to do that. I don't know how to protect people from trolls. I know how to cast spells, I know how to swish the wand, but I don't know how to actually deal with the troll.

So, what do we do? Well, one thing we could do, I suppose, is go back to school. Could we do that? We could. But by that point, you know, the troll has caused havoc and we haven't been able to save people from it.

So we don't go back to school and learn every single spell that ever existed; we want to be a little bit more in the moment. Instead, we grab our spell books. Our spell books contain all the information we need in order to perform the task that is right in front of us, right now. And this is what retrieval augmented generation does, because we can give the large language model, the wizard, access to our data, the spell books.

So we don't always have to go back to the knowledge the model had when it was initially trained. And one of the interesting things about this, I was about to say the LLM, but going back to the wizard: when it comes out of school, its knowledge stops at the end of its training. And it's the same with LLMs, you know; a model has knowledge only up until the day it was trained.

But if you want data that was generated after that, if you ask an LLM about something that happened this week when it was trained in 2021, it won't be able to recall that data, because it has never seen it. And that's the thing with the spell books: you can have a book that is up to date, and the LLM can simply query that data and stay current with fresh data. It's the brand-new spells. Brand-new spells, exactly. Absolutely. Now you can fight it.

So why would we ever have continuation training? Why would we have fine-tuning, then, if we can just do this? Well, imagine the situation where Wizz the Wizard goes to the spell books, opens one up, reads it, and says: I don't understand this. This is higher-level magic. I don't understand some of the ways and methods in here.

At that point, that's when we send Wizzy the Wizard back to training school and do some fine-tuning or some continuation training. That's the kind of time we look for that kind of thing: OK, now you understand the other concepts, so you can work with the data that you have.

So in a different, non-wizarding world: imagine you're working in financial services, or in medical applications, or legal applications, something where the language is somewhat different from the everyday language the model was initially trained on. That's where you can do continuation training, and maybe some fine-tuning, to bring the model up to speed so that it knows how to work in that world.

It still doesn't need to know the specifics of everything, because we can give it that through retrieval augmented generation, through the spell books of that world. So, agents. Another term you might have heard this week, again in Adam's keynote; we mentioned a couple of things there, and one of them was agents and tools.

So what are agents? Well, we actually already have all the components in this story to talk about what agents are. So we'll go back to it. And if you want to take a picture of this slide and say "this is the kind of slide I saw at re:Invent," I don't know, here we go: this is agents.

Agents are the ability to send messages elsewhere, the ability to interact with other systems. So in this particular case, yes, Wizzy the Wizard can send messages off, can request things, can send information and queries away and get responses back. In the large language model world, in our applications world,

this might mean connecting into an API, booking a flight, sending an email, even doing something as simple as finding out the current date and time. Because think about it: a large language model has no idea what the current date and time is. So having an agent for that is a genuinely useful thing to have.
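A minimal sketch of that idea: the model's output is checked for a tool request, the tool runs, and the observation goes back to the model. The JSON convention here is invented for illustration; real agent frameworks define their own request formats:

```python
import datetime
import json

# A hypothetical tool answering something the model cannot know
# from training data alone: the current date and time.
def get_current_datetime() -> str:
    return datetime.datetime.now().isoformat()

TOOLS = {"get_current_datetime": get_current_datetime}

def run_agent_step(model_output: str) -> str:
    """Minimal agent loop step: if the model asks for a tool, run it and
    feed the result back; otherwise the output is the final answer."""
    try:
        request = json.loads(model_output)
    except json.JSONDecodeError:
        return model_output  # plain text -> final answer
    if not isinstance(request, dict):
        return model_output
    tool = TOOLS[request["tool"]]
    observation = tool(**request.get("args", {}))
    # In a real agent, this observation is appended to the conversation
    # and the model is invoked again to continue reasoning.
    return f"Observation: {observation}"

# Simulated model output asking to use the tool:
print(run_agent_step('{"tool": "get_current_datetime"}'))
```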

So that's our architecture diagram for this session; I hope you like it. Going back to the less wizardly world, we have Amazon Bedrock, which is basically a tool that can help you use everything we've talked about, but more easily, in a fully managed way.

Amazon Bedrock is a framework where you have, as we showed earlier, all the foundation models available, so you can query them with an API. You also have knowledge bases. So the foundation model is the wizard, and the knowledge base is the book of spells: this is where you put all of the data that's relevant to your business.

Anything the LLM obviously hasn't seen before in training; it could be data that is private to your company, internal data. A foundation model doesn't know specifically what your company is working on or the data you hold internally.

But what you can do is put all of this data inside an S3 bucket, give Bedrock the URL to your S3 bucket, and Bedrock will take care of everything for you. It will take the data in your S3 bucket, vectorize it, and put it in a vector database, and it becomes something the model can query.
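In code, querying a Bedrock knowledge base looks roughly like this with boto3's RetrieveAndGenerate API. The knowledge base ID and model ARN below are placeholders, and the exact request shape should be checked against the current documentation:

```python
import boto3

# Assumes a Bedrock Knowledge Base has already been created over an S3
# bucket and synced; "YOUR_KB_ID" and the model ARN are placeholders.
client = boto3.client("bedrock-agent-runtime", region_name="us-east-1")

response = client.retrieve_and_generate(
    input={"text": "What does our internal travel policy say about flights?"},
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": "YOUR_KB_ID",
            "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-v2",
        },
    },
)

# Bedrock retrieves the relevant chunks, augments the prompt, and
# generates the answer in one managed call.
print(response["output"]["text"])
```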

Also in Bedrock, you have access to agents. And as Mike said before, agents are able to perform tasks for you if you give them the right API and the right permissions. So if you want to book a flight, for example, or send an email, and you have the right API to do that, then you can specify it here in Bedrock. Or book a new broomstick as well.

And we are contractually obliged, and anybody who's been at AWS conferences for a while will know this: this, like nothing else, is undifferentiated heavy lifting. So we are required to say that term. We're not really, but that is absolutely what this is taking care of.

As Tiffany was saying, taking your data and actually putting it into retrieval augmented generation involves a number of steps which are fairly tedious and laborious to do, and this will do them for you. Yeah, you could build the entire thing on your own: you could build your own vector database, you could vectorize with your own embedding models.

You could, yeah, you could create your own agents. But that's a lot of work that you don't have to do if you use a completely, fully managed framework. Absolutely. And so, the last question. Shall we do one more question? One more question, and then we'll take live questions.

Do you think generative AI will change the world in the future? And again, frame that in whatever context you like, but I think we're framing it in a positive context here, given the kinds of things that hopefully you can see you can do with generative AI. Will this still be 50/50? I don't know; I was surprised by the earlier answers.

Oh, yes. OK, I think we were very convincing. And a cheer for it being a positive change to the world. A little bit. Yes.
