Explore Amazon Titan for language tasks

Hello everyone. My name is Brent Swidler. I'm a Principal Product Manager on Amazon Bedrock, and I specifically focus on the development of our Titan Text family of models.

I'm joined today by Ben Snively, who's a Senior Principal Solutions Architect, and Sachin, who's an Application Architect from Electronic Arts.

Over the next hour, we're going to go through a quick introduction to Amazon Bedrock just to set the stage. Then I'll go into the details of the Amazon Titan family of models and what's been announced over the past few days.

We'll then go into some architectures and demos from Ben, to show exactly how the Titan family can integrate into different systems, along with best practices. Then Sachin will go into details about how Electronic Arts is using the models today, and we'll end with a summary of what we covered.

For those who aren't familiar with Amazon Bedrock: it's a platform that allows you to choose from a variety of different foundation models, both text generation models and image generation models today. You can choose from a dropdown list of what I think is now 17 different foundation models inside of Bedrock, use them out of the box, or customize them to your needs.

You can also connect them to data sources through the GA of Bedrock Knowledge Bases, and connect them to external systems through function calling with Bedrock Agents. We've also announced Guardrails for Bedrock, which lets you put guardrails around model behavior to define exactly what you want for your end users, on top of the 17 models available.

Model evaluation is now in preview as well, where you can send multiple prompts into one model, or into multiple models, and see exactly which one does what you want, in both a human-based and an automated way.

As for the set of models inside Amazon Bedrock today: we have models from AI21 Labs, Anthropic, Cohere, Meta (the Llama family of models), and Stability AI, and then our first-party models, called Amazon Titan, which are obviously what we're going to go through. When we look at the portfolio of the Amazon Titan family, there are the text models and the multimodal models, and as you can imagine, this is going to continually grow and be iterated on.

As of this week, Titan Text Express is generally available and Titan Text Lite is generally available. And a couple of months ago, we announced Titan Text Embeddings as generally available as well.

Titan Text Lite is a more compact model that allows you to do rapid iterations of something like fine-tuning or other customization. Titan Text Express has broader functionality, with function calling and support for more robust RAG applications. It's multilingual across 100-plus languages: generally available in English, with the other 99-plus languages in preview. It also supports code generation and rich text formats such as JSON and SQL.

A few other things to note: the context window for Lite is 4K tokens, and the context window for Express is 8K tokens. On the embeddings side, Titan Text Embeddings went GA in September. We've also announced Titan Image Generator in preview.

You should be able to sign up for access to that as well. For Titan Image Generator and the Titan Multimodal Embeddings model, there are separate sessions, so we won't go into too much detail on those today.

The majority of the focus will be on the Titan Text Express and Titan Text Lite models. One of the things we think about on our side is customer usage: if I were to poll the audience today, I'd ask how often you use these models by leveraging just the parametric knowledge, the underlying knowledge embedded into the models, versus how often you use your own data sources.

I would say the majority is skewed towards using your own data, as opposed to relying on the model's inherent knowledge to answer questions. So when we go through different prompt types, we often discuss them in three general categories.

There are open-ended prompts, like "Who was the first president of the United States?": an open-ended question that relies on the model's parametric knowledge to answer. The next layer is closed-ended prompts, where the entire prompt is confined to the content that you provide.

So, "Based on the provided context, what is the answer to this question?" or "Summarize this document." These types of prompts are the completely closed-ended variations. And then there are hybrid variations, where you rely on the content that you provide as well as content that comes from the parametric knowledge of the model.

The reason I bring this up is that when working with customers on the design of consumer-facing or internal applications, you should think about what knowledge you want the model to have and where you want that knowledge to come from.

If you want to rely on the model's inherent knowledge, you're moving more towards open-ended applications; if you want to be confined to the data you bring, you move more towards the closed-ended variations.

When we think about model customization, there are multiple ways you can shape the content and the abilities of the model. First, think about the controlling dimensions: do you want to control the behavior, or do you want to control the knowledge? Controlling the behavior is about output formats and the different ways the model can shape a completion, whereas controlling the knowledge is about what information the model is actually using in each of these scenarios.

These are the two dimensions of the chart here, with knowledge control on the y-axis and behavioral control on the x-axis. We're going to expand out from there, to show exactly how you can adapt the models to control the behavior and/or the knowledge. The most straightforward approach, and probably the easiest to implement, is prompt engineering.

That just means adding the information and the instructions, in natural language, into the actual prompt of the model, without having to do any sort of training. Prompt engineering allows you to control both the knowledge and the behavior, but only to an extent; there are additional ways to exert control when you have a lot more data. With prompt engineering, you're confined by the context window that you can give to the model.

A larger context window might appear to be better, but the cost of doing something like that might outweigh the benefits.
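
To make the closed-ended pattern concrete, here's a minimal sketch of prompt engineering with your own content injected into the prompt; the document text and instruction wording are illustrative assumptions, not a prescribed format.

```python
# A minimal closed-ended prompt: instruct the model to answer only from the
# supplied context rather than its parametric knowledge. The document text
# and instruction wording are illustrative, not a prescribed format.
document = "Titan Text Express supports an 8K-token context window ..."  # your content

prompt = f"""Answer the question using only the provided context.
If the answer is not in the context, reply "I don't know."

Context:
{document}

Question: What is the context window of Titan Text Express?
Answer:"""
```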

If we go up the chart here, the first thing we see is controlling the knowledge. Putting information into the prompt is one way to do this, through prompt engineering. A variation of this that a lot of folks are familiar with (this is a 300-level discussion, so I'll assume some level of understanding) is retrieval-augmented generation: the idea that the model can draw on your own content in order to give you the outputs you want.

If you ask a query, instead of relying on the model's own understanding of the world, you point it at an external data source that holds your information, or whatever information you want to impart to the model, and that information comes back as part of the prompt. You're combining search and generation: the search results come back into the model, the model consumes that information, and it outputs the right result for you. This is RAG, which is another way to control the knowledge.
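
As a rough sketch of that flow, assuming a boto3 `bedrock-runtime` client: embed the query with Titan Text Embeddings, search your vector store, and stuff the hits into a closed-ended prompt. The `search_vector_store` helper is hypothetical, a stand-in for whatever vector database you use.

```python
import json
import boto3

bedrock = boto3.client("bedrock-runtime")

def embed(text: str) -> list[float]:
    # Embed the query with Titan Text Embeddings.
    resp = bedrock.invoke_model(
        modelId="amazon.titan-embed-text-v1",
        body=json.dumps({"inputText": text}),
    )
    return json.loads(resp["body"].read())["embedding"]

query = "What are the sessions on Titan?"
# Hypothetical helper: k-NN search over your document embeddings
# (OpenSearch, Pinecone, an in-memory index, etc.).
passages = search_vector_store(embed(query), top_k=5)

# Closed-ended prompt: the retrieved passages come back as part of the prompt.
prompt = ("Answer using only this context:\n"
          + "\n".join(passages)
          + f"\n\nQuestion: {query}")
```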

In the most extreme variation of controlling the knowledge, we announced continued pre-training, specifically for the Titan models. The way we usually train the models is that you train on an enormous corpus of text, the trillions of tokens, and then we do human alignment.

However, we don't have all of your data, and we won't. So what we offer is continued pre-training, which allows you, in an unsupervised fashion, to inject more of your data into the underlying parametric knowledge of the model.

You're able to embed your information into the model through continued pre-training. It's as simple as pointing to an S3 bucket that holds your corpus of documents in a JSONL file: you point the API at that S3 bucket and kick off training, and you don't need prompt/completion pairs.
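
A sketch of what kicking that off could look like with boto3's `create_model_customization_job` on the control-plane `bedrock` client, assuming the unlabeled JSONL format of one `{"input": ...}` record per line; the names, ARNs, and S3 paths are placeholders.

```python
import boto3

bedrock = boto3.client("bedrock")  # control plane, not bedrock-runtime

# Training data: a JSONL file in S3 with one unlabeled record per line,
# e.g. {"input": "<a chunk of your corpus>"} -- no prompt/completion pairs.
bedrock.create_model_customization_job(
    jobName="titan-cpt-demo",                  # placeholder names and ARNs
    customModelName="my-titan-express-cpt",
    roleArn="arn:aws:iam::123456789012:role/BedrockCustomizationRole",
    baseModelIdentifier="amazon.titan-text-express-v1",
    customizationType="CONTINUED_PRE_TRAINING",
    trainingDataConfig={"s3Uri": "s3://my-bucket/corpus/train.jsonl"},
    outputDataConfig={"s3Uri": "s3://my-bucket/corpus/output/"},
)
```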

Your whole purpose here is to train the model to embed the knowledge you want it to have. So those are the knowledge controls: prompt engineering, RAG, and continued pre-training.

Going along the x-axis, we're talking about behavioral controls. Prompt engineering, again, is one way to do that: you just add the instructions you actually want into the prompt.

The other one is fine-tuning. Fine-tuning differs from continued pre-training in that with fine-tuning you have prompt and completion pairs. What you're doing in this scenario is aligning the model to how you want it to respond, given a specific instruction and a prompt.

In these cases, you again point to an S3 bucket, but here you provide prompt and completion pairs, as opposed to a completely unsupervised method of training the model.
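
The same customization API covers this case; what changes, as I understand it, is the `customizationType` and the dataset, which now pairs each prompt with the completion you want. The records and names below are invented for illustration.

```python
import boto3

bedrock = boto3.client("bedrock")

# Fine-tuning data: JSONL with labeled prompt/completion pairs, e.g.
#   {"prompt": "Summarize this ticket: ...", "completion": "<ideal summary>"}
#   {"prompt": "Classify the sentiment: great game!", "completion": "positive"}
bedrock.create_model_customization_job(
    jobName="titan-ft-demo",                   # placeholder names and ARNs
    customModelName="my-titan-express-ft",
    roleArn="arn:aws:iam::123456789012:role/BedrockCustomizationRole",
    baseModelIdentifier="amazon.titan-text-express-v1",
    customizationType="FINE_TUNING",
    trainingDataConfig={"s3Uri": "s3://my-bucket/pairs/train.jsonl"},
    outputDataConfig={"s3Uri": "s3://my-bucket/pairs/output/"},
)
```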

The reason I go through all of this is to give a bit of background on some of the design criteria and system designs we went through in the development of the Titan models, up to the GA release yesterday.

There's also significant documentation on the prompt engineering guidelines for Titan Text, as well as the model service card. The model service card goes into detail on responsible usage.

It covers how we went about the design of our responsible AI strategy, as well as a lot of the system design criteria we followed, both in how we designed for prompt engineering and how we trained for things like function calling.

That comes into play if you use something like Bedrock Agents. That's all I wanted to go through before we get into the demos, so I'm going to hand it off to Ben to go into more detail on the usage of Titan.

Alright. So for the first example, consider that for your enterprise, your organization, your government agency, or your nonprofit, data is a critical asset. In this example, we're going to show how to take a massive amount of data, a large transcript in this case, and synthesize it and extract information from it. It's a knowledge-type exercise.

And we're going to do that using an interview transcript. What you see here is essentially a notebook we're going to bring up, with an interview transcript that has already been produced. In your architecture, this could be generated by the AI service Amazon Transcribe and populated in S3, or you could have a transcript already.

The point is that in a larger system we have a need to generate a transcript, and now we want to analyze it and derive more information from it. So we have this interview transcript in S3, and we're going to start analyzing it.

Here we have some code, and as we start executing it, you'll see the transcript show up underneath the cell. As I scroll down, you'll see a very large amount of text representing the interview between Werner Vogels and Dr. Swami Sivasubramanian.

It's a very long transcript, and what we want to do is take the entire transcript and ask: what are the four key highlights, in 10 words or less each? So we're going to execute the cell.

What you'll see in the code is that we're calling Titan Text Express and passing the prompt; right at the very top there, you saw the context with the prompt.

We provided that interview. It was that closed-ended prompt: based on this interview (in a RAG architecture you'd say "only use this document," which is an even better example of that), generate the top four key takeaways. And what we see here are the results: AI and machine learning are putting data to work; the complexity of algorithms and use cases is increasing, so AI and ML are able to solve more and more complicated issues; and the next step up is generative AI, large language models, and foundation models. All of that was extracted out of that large amount of text.
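
The notebook cell likely boils down to something like this sketch, using the Bedrock runtime `InvokeModel` API and the Titan text request shape (`inputText` plus `textGenerationConfig`); the transcript path and generation parameters are placeholders.

```python
import json
import boto3

bedrock = boto3.client("bedrock-runtime")
transcript = open("interview_transcript.txt").read()  # placeholder path

body = {
    "inputText": ("Based only on the following interview, list the four key "
                  "highlights in 10 words or less each.\n\n" + transcript),
    "textGenerationConfig": {
        "maxTokenCount": 512,   # illustrative values
        "temperature": 0.0,
        "topP": 0.9,
    },
}
resp = bedrock.invoke_model(
    modelId="amazon.titan-text-express-v1",
    body=json.dumps(body),
)
print(json.loads(resp["body"].read())["results"][0]["outputText"])
```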

Next, we could take that same information and say: now I want to generate an article from it. So in this task we say, "Write me a narrative based on that interview." Same interview you saw earlier; we're calling Titan Text Express, now asking it to write an article. It calls the endpoint, and in a moment you'll see the text.

Here we see Swami, VP of Database, Analytics, and Machine Learning, and a pretty good article the model wrote, behind this blue screen, extracted from that interview. Titan Text Express is really powerful for taking this kind of information and asking it to do these tasks: synthesizing information and pulling out key points. I could have asked for things like key dates or persons, different types of extraction techniques, to get that information out.

In this next example, we're going to show that RAG example. Brent talked about RAG and using it in a closed-ended prompt strategy, to use documents in your organization, documents that might not have been used in the pre-training phase of the foundation model.

Here we're using Bedrock Knowledge Bases. The nice thing is that we pointed the knowledge base at this year's re:Invent sessions, saying: use all these re:Invent sessions, with their titles, as your knowledge base. Bedrock Knowledge Bases automatically synced that, created the chunks for us, used Titan Embeddings, and stored the vectors in OpenSearch for us. All we had to do was set that up in the console, and it did all the work to populate the OpenSearch cluster behind, again, this blue screen.

Knowledge Bases supports different knowledge repositories: I could have used Pinecone or an in-memory store. I chose Amazon OpenSearch Serverless in this case. What I'm showing here is the OpenSearch Serverless collection; that's the endpoint we're going to call, and I could go in here and call this endpoint directly.

The question I asked (I should have highlighted that) is: "I'm heading to re:Invent this year. What are the sessions on Titan?" We're asking a question to pull those Titan sessions back, and here I asked for the top five sessions. In the RAG architecture, part of the definition when you're creating the knowledge base is specifying the similarity function for the vectors and how many to return. Then we apply that closed-ended prompt strategy and call Titan Text Express to get our answer. I purposely planned to show this example at this session, "Explore Amazon Titan for language tasks."
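
A sketch of running that same query against a knowledge base through the managed `RetrieveAndGenerate` API on the `bedrock-agent-runtime` client, which does the retrieval and the closed-ended generation in one call; the knowledge base ID and model ARN are placeholders.

```python
import boto3

agent_rt = boto3.client("bedrock-agent-runtime")

resp = agent_rt.retrieve_and_generate(
    input={"text": "I'm heading to re:Invent this year. "
                   "What are the sessions on Titan?"},
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": "KB12345678",  # placeholder
            "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/"
                        "amazon.titan-text-express-v1",
        },
    },
)
print(resp["output"]["text"])  # generated answer, grounded in the retrieved chunks
```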

Related to Titan through Bedrock, I can also do this through the console. This is where I configured the knowledge base. Here we have the data source, and I configured this data source to pull that information in and sync it into the knowledge base. Over here, within the sync job, you can see the embeddings as well, so we'll pull those up; down here we can see the Titan Embeddings configuration, and this is the vector store, the same vector store you saw in the Jupyter environment.

Over on the right-hand side, I can also test it through the console. Right now, the console is using Claude for the inference portion of the large language model, but in the Jupyter environment I used Titan Text Express. From there, we can build on that example: we have Bedrock, and we're using a knowledge base in OpenSearch, encoded through Titan Embeddings to create our vectors.

Now, we can also plug in Lambda to do a ReAct-style architecture, reasoning and acting, to use RAG for additional information based on our organization and then take action. In this example, we have a Lambda function that calls a DynamoDB table and can pull things like case information and ticket information: the sorts of use cases where you want to fulfill an action based on an observation. The way this works is that within Bedrock, you can define an interface definition; this interface definition requires a first name and a last name.

Through the large language model integration, as the agent evaluates the trace of what action should be performed, it determines whether a function needs to be called. This one is using Claude for the agent, so we've created the agent based on a Claude model, and it's using Titan Embeddings in the knowledge base, backed by OpenSearch Serverless.
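
A rough sketch of the Lambda side of that action group: the agent passes the parameters it extracted (first name, last name), the function looks the ticket up in DynamoDB, and the response goes back in the shape the agent expects. The event/response fields follow the Agents for Amazon Bedrock action-group contract as I understand it, and the table and key names are invented.

```python
import json
import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("Tickets")  # invented table name

def handler(event, context):
    # Parameters the agent extracted from the conversation, e.g. firstName/lastName.
    params = {p["name"]: p["value"] for p in event.get("parameters", [])}
    item = table.get_item(
        Key={"firstName": params["firstName"], "lastName": params["lastName"]}
    ).get("Item", {})

    # Response in the action-group shape the agent expects (as I understand it).
    return {
        "messageVersion": "1.0",
        "response": {
            "actionGroup": event["actionGroup"],
            "apiPath": event["apiPath"],
            "httpMethod": event["httpMethod"],
            "httpStatusCode": 200,
            "responseBody": {"application/json": {"body": json.dumps(item)}},
        },
    }
```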

We asked what the Titan sessions are, and we got that back. Here we can then ask, "What's my ticket information?" (this is all fake data); we configured this agent to know about a function that gets ticket information, so it can fetch the results. And over on the side here, you can see the trace of what the agent is doing through the console, and see, based on the interaction, when it goes to the knowledge base versus when it wants to perform the Lambda function within Bedrock. With that, we're going to switch over to Sachin for the next section. Thanks so much, Ben.

Hi, everyone. Hope you're all having a wonderful time at Las Vegas, or, as my flight attendant put it, Lost Wages. My name is Sachin Kal, and I work as an Application Architect at Electronic Arts. Today we'll look at some of the opportunities within our organization where we have leveraged generative AI along with Bedrock and Titan. Before we dive deep, let me introduce who we are and what we do.

I'm part of the Player Experience team at Electronic Arts. We primarily build platforms and tools that help the business generate and distribute world-class content for our players, facilitate the hosting of live events such as EA Play, and support our players when they run into an issue or an error while playing the games. As you can imagine, we deal with millions of players worldwide, so our workloads are complex, unpredictable, and need to scale when required.

With the help of Bedrock and foundation models, we hope we can build generative applications that are easy to develop, cost-effective, and scalable when needed. Today I have four different opportunities within our area that I'd like to talk about. The first two have to do with productivity enhancement for our teams; the third is about enhancing our players' experience; and the fourth, most importantly, is about our business teams getting feedback and knowledge from social media posts.

Before going through the first opportunity, I'd like to do a quick audience poll, if that's okay with you all. Quick show of hands: who here develops code in their day-to-day work? Nice. Next question, again a show of hands: have you ever felt you had to compromise while writing unit tests because of a time crunch or business priorities? Awesome, I see some hands. I see a lot of hands. That's nice.

We all know that unit testing is a critical part of the development lifecycle. Unit tests help us make sure the product doesn't have errors when it moves into the next phases of the software development lifecycle. But sometimes, because of business priorities or a time crunch, we may have to relegate them to second-class citizens. So what we did is give our developers a tool: with the click of a button, they can generate automated unit tests, at least the boilerplate ones, so they can focus on writing the edge-case scenarios and more complex tests. That way we help them improve the efficiency of their development process and also shift responsibility left, towards the developers, making sure the product released to QA is properly tested against multiple complex scenarios and is ready for QA to run more regression tests. Before going into a demo,

I have a quick high-level architecture. What you see on the left side is developers interacting with an in-house-built plugin or tool, powered by a service behind the scenes that's in turn powered by Amazon Bedrock with a Titan foundation model. When the developer triggers the plugin, the code and the metadata are sent to the right side with the appropriate prompts, which generates the test cases and unit tests and returns them to the code editor.
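
EA hasn't shared the implementation, but the backend step plausibly reduces to something like this: take the source and metadata from the plugin, wrap them in a test-generation prompt, and call Titan through Bedrock. The prompt wording and parameters here are guesses, not EA's actual service.

```python
import json
import boto3

bedrock = boto3.client("bedrock-runtime")

def generate_unit_tests(source_code: str, class_name: str) -> str:
    # Illustrative prompt; the real service's prompt is not public.
    prompt = (f"Write JUnit tests for the Java class {class_name} below. "
              "Cover the main paths and the obvious edge cases.\n\n" + source_code)
    resp = bedrock.invoke_model(
        modelId="amazon.titan-text-express-v1",
        body=json.dumps({
            "inputText": prompt,
            "textGenerationConfig": {"maxTokenCount": 2048, "temperature": 0.2},
        }),
    )
    return json.loads(resp["body"].read())["results"][0]["outputText"]
```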

I'm not as brave or courageous as Ben, so I have a recorded demo, and I'll just play that. What you see is a code editor here, and a simple Java function that converts emotions to emojis.

The developer has just triggered the plugin, and in a matter of a few seconds we'll see the tests that get generated and displayed. So I have about 10 tests that Titan, behind the scenes, generated automatically.

Now the developer can review them, add more tests if needed, or change something; at the least, they have boilerplate tests to start with. The second opportunity I'd like to talk about is again to do with productivity enhancement.

This one is mainly for our quality engineers. Why should developers have all the fun, right? Our quality engineers spend a lot of time making sure the products don't have errors, so this is to help them in their day-to-day jobs.

Today, when our quality engineers get a requirement from the business, they spend a lot of time analyzing the requirement and writing the test scenarios, even before thinking about regression tests; for a complex requirement, this obviously takes a lot of time. It's also sometimes not that efficient, I should say, because most of the time is spent analyzing and writing the basic test scenarios, before even thinking of the complex ones and the edge cases. So, like we did for the developers, we wanted to give a simple plugin tool to our quality engineers.

With the click of a button, they can generate the test scenarios, and they can add more complex ones if needed, thereby improving quality further and making sure the code doesn't fail in production. We hope this enhances the reliability of our applications as well, because the test scenarios now cover a wide range of real-world scenarios, and we make sure the more complex ones are covered.

I'm sorry if you're hearing white noise; I'm just trying to make sure I'm audible. Is it good? Okay, awesome. Thank you.

Here is the high-level architecture. You'll see a pattern across the four opportunities; I want to show that it's simple. On the left side, you see a quality engineer interacting with a custom-built button within Jira. When triggered, it takes the requirement and any other metadata from the Jira ticket and passes it on to the backend service, which again is powered by Amazon Bedrock and Titan. Whatever scenarios are generated are displayed to the quality engineer for review, for making changes, and for adding more.

Let's look through a quick demo. Here again is Jira, where we have our custom button set up. The quality engineer hits the custom button, and behind the scenes Bedrock with the Titan model goes to work, taking in the description and any other metadata from the requirement. In a matter of a few seconds, we'll see the scenarios displayed in the modal.

Now the quality engineers can review it, add additional scenarios if needed, or make other changes. At the least, they have a basic starting point, they can keep adding more complex scenarios, and they can spend more time on regression and automation test suites.

The third opportunity I'd like to talk about is next-generation player chat. Today, when our players run into an issue while playing the games, or have a general question, they either go to the EA Help website or chat with a customer service agent to get their questions answered. As we all feel, we'd rather have our players having fun playing games than dealing with this kind of stuff.

What we'd like to do with generative AI is provide them an easy way, within the game, to get their questions answered or their errors fixed without even leaving the game. With the help of generative AI, mainly Amazon Titan, we've created this help application within the games, through which players can have a seamless, conversational chat experience and get the answers they need, thereby reducing response times and drastically improving their experience within the games.

Again, the high-level architecture follows the same pattern: on the left side you now see players interacting with the help chat (the help application, mainly), and that's again powered by Amazon Bedrock and Titan behind the scenes.

Here we also have OpenSearch with the knowledge base, mainly knowledge articles it can refer to, in order to provide a proper solution to our players and answer any of their questions.

This is a sample prototype app. In a moment, you'll see the player asking a question, something to do with an NFS (Need for Speed) game. In a matter of a few seconds, you'll see the LLM responding with step-by-step details of what they should do to get their question answered.

The last, and most important, opportunity I'd like to talk about is social insights. During the game seasons, our game teams and business teams are very busy with marketing activities: making sure we get proper feedback from our players, making sure we address all of it, and getting the social insights from the marketing campaigns.

As you can imagine, during these busy, hectic game seasons this is a huge task for our game teams, who spend a lot of time analyzing player feedback and gauging sentiment across all these different social media campaigns. Just as we did for our internal teams and players,

we'd also like to provide an easy way for our business teams to get these social media insights. What we've built for them is an app through which they can go to any social media campaign, pull the social media posts, and get a sentiment read or a summary of all the different feedback within that content.

Again, the high-level architecture follows the same pattern: here we have the game studio team interacting with the admin portal, the app we built for them, which again is powered by a service behind the scenes using Amazon Bedrock and Titan, and they get the insights by hitting a URL.
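
A sketch of the summarization half of that service; `fetch_comments` is a hypothetical, platform-specific helper, and with an 8K context window the comments would need truncation or batching in practice.

```python
import json
import boto3

bedrock = boto3.client("bedrock-runtime")

def summarize_campaign(url: str, top_n: int = 800) -> str:
    comments = fetch_comments(url, top_n)  # hypothetical platform-specific helper
    prompt = ("Summarize the overall sentiment of these comments, and list any "
              "improvements the commenters are asking for:\n\n"
              + "\n".join(comments))  # may need truncation/batching to fit 8K tokens
    resp = bedrock.invoke_model(
        modelId="amazon.titan-text-express-v1",
        body=json.dumps({
            "inputText": prompt,
            "textGenerationConfig": {"maxTokenCount": 1024, "temperature": 0.3},
        }),
    )
    return json.loads(resp["body"].read())["results"][0]["outputText"]
```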

Let's look at it in action. What I have here is an F1 23 game trailer, and in a second you'll see tons of comments on this one social media post. This is the prototype app I'd like to show: the URL is given, and I'm asking it to analyze the top 800 comments. In a matter of a few seconds, you should see the insights from this particular social media post.

The video is not trimmed, which is why you see a bit of a delay; I haven't removed any time from the execution. You can see clearly that it captures the past feedback, whatever the comments have been addressing, and it also gives the improvements: not just the past feedback, but also any improvements that the players or other YouTubers have been talking about within the comment section.

As a final thought, I'd like to say that using generative AI with Bedrock and Titan helps us in multiple areas within our organization. One, we've seen how it helps developers write unit tests, automate the testing process, and write more complex tests. Two, it helps our quality engineers write test scenarios and expand the horizon by including complex test cases.

Three, it helps our players especially, when they have an issue, directly within the game application. And four, for our business teams and game studio teams, it delivers social insights from the different social media campaigns.

We hope that with Bedrock and Titan we can drive innovation within the industry and take it forward. With that, I'll hand it back over. Thank you.
