Responsible AI in the generative era: Science and practice

Good afternoon, everyone. Thanks for coming to the breakout session on responsible AI in the generative era. My name is Michael Kearns and I'm an Amazon Scholar, which means I divide my time between AWS AIML and the computer science faculty at the University of Pennsylvania.

At AWS, along with my good friend and colleague Peter Hallinan, I'm very involved in all of our responsible AI efforts within AWS - from both the technical and policy perspectives, and everything in between.

The plan for today is that I'm going to take a little time at the beginning to talk about some of the science behind responsible AI these days, especially the challenges to that science emerging in the generative AI era. Then Peter will take over after a while and talk about how we turn that underlying science into the practice of responsible AI in AWS AIML products and services. Hopefully we'll leave plenty of time for Q&A at the end.

I don't think I need to say too much about generative AI to this audience these days. I've been in this field a long time and I sort of miss the days when my non-work friends didn't care about what I did, instead of having to talk about generative AI at every dinner party.

But you know, this is just a little demo showing the type of thing that goes on behind the scenes in something like a large language model, where there's some initial prompt or context, as we might call it. Then the underlying model is computing the probability distribution over the next word, given the context so far. Of course, then the sequence is one word longer, and so you can apply the same process to the next token or word, et cetera.

There is great power in this. First of all, it's scientifically remarkable that just by solving this apparently myopic problem of predicting the next-word distribution, you get so many things for free: syntax, semantics, compelling, coherent text in a style of your choosing. And because of the randomization involved - not always picking the most likely next word, but actually drawing from the underlying distribution - you get variation in the output: the same prompt can generate completely different output the next time.
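To make that loop concrete, here is a minimal toy sketch of the sample-then-extend process. The next-token distribution below is invented purely for illustration; a real LLM would compute it from billions of parameters, but the surrounding loop is the same idea.

```python
# A toy sketch (not a real LLM) of the loop described above: at each step a
# model yields a probability distribution over the next token given the
# context so far, we sample from it (rather than taking the argmax), append
# the token, and repeat. The distribution function here is invented.
import random

def toy_next_token_distribution(context):
    """Stand-in for the model: returns {token: probability} given the context."""
    if context[-1] == "the":
        return {"cat": 0.5, "dog": 0.3, "umbrella": 0.2}
    return {"sat": 0.4, "ran": 0.4, "the": 0.2}

def generate(prompt_tokens, n_steps, temperature=1.0):
    context = list(prompt_tokens)
    for _ in range(n_steps):
        dist = toy_next_token_distribution(context)
        # Sampling, not argmax, is what makes repeated runs differ; temperature
        # reshapes the distribution (lower = closer to argmax).
        weights = [p ** (1.0 / temperature) for p in dist.values()]
        next_token = random.choices(list(dist), weights=weights, k=1)[0]
        context.append(next_token)
    return " ".join(context)

print(generate(["the"], n_steps=5))  # e.g. "the cat ran the dog sat"
```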

This is an incredibly powerful technology that has become part of popular culture and society at large. There are already a very large number of existing and rapidly emerging use cases for generative AI, including writing tools that check grammar, spelling, and style or suggest alternative phrasings; productivity tools, like using an LLM to summarize a meeting transcript; aids for creative content; and code generation, which greatly expands the universe of people who can participate in software development.

The excitement around generative AI is understandable and well justified - it brings great promise of innovation and productivity. But it also presents some new risks and challenges that Peter and I think about at AWS and try to address using a mixture of underlying science and procedures/policies around the general AIML workflow.

If I had to say what most distinguishes generative AI from all the AI/ML before, it's the open-endedness of the output - both the input/prompt/context, and the fact that instead of making point predictions, you're generating freeform, open-ended content. Of course that generality is part of the great power compared to traditional AI models, but as we'll see, it's also the source of some challenges in curbing undesirable behaviors in certain contexts.

Let me talk about the science of responsible AI in the pre-generative era, or the "before times" if you will. A good example is consumer finance: you have historical loan applications for which you granted the loan, so you know the outcome - whether it was repaid or defaulted. A natural thing to do is build a predictive model trained on that data, so that given a new loan application you've never seen, it makes an accurate classification.

This is a very targeted problem - you know exactly what you want to solve: giving loans only to creditworthy individuals. And you're making narrow predictions, just outputting yes or no, 0 or 1.

So what might you be concerned about from a responsible AI perspective? Of course, how you train the model and the source of the data. If you have historical data generated by human lending officers with demographic biases, you shouldn't expect training a model on that data to eradicate the bias - rather it will likely perpetuate it.

If you're worried about fairness and equity issues, you have to define fairness in a satisfying way and then enforce it in the trained model. A typical definition might be equalizing false rejection rates across different populations - you don't want the rate at which you deny creditworthy applicants to vary widely between racial groups, gender groups, and so on. We know a lot these days about how to train models to enforce those kinds of constraints.
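To make the constraint concrete, here is a minimal sketch of the quantity being equalized - the false rejection rate per group - with an acceptance check against a tolerance. The data, group labels, and tolerance are invented for illustration; in practice the constraint would be enforced during training rather than only audited afterward.

```python
# A minimal sketch of the fairness check described above: compare false
# rejection rates (creditworthy applicants who were denied) across groups.
def false_rejection_rate(y_true, y_pred):
    """Fraction of truly creditworthy applicants (y_true == 1) predicted as denials (0)."""
    creditworthy = [(t, p) for t, p in zip(y_true, y_pred) if t == 1]
    return sum(1 for _, p in creditworthy if p == 0) / len(creditworthy)

def fairness_gap(y_true, y_pred, groups):
    """Largest difference in false rejection rate between any two groups."""
    rates = {}
    for g in set(groups):
        idx = [i for i, gi in enumerate(groups) if gi == g]
        rates[g] = false_rejection_rate([y_true[i] for i in idx],
                                        [y_pred[i] for i in idx])
    return max(rates.values()) - min(rates.values()), rates

# Invented example: require the gap to stay below a chosen tolerance.
y_true = [1, 1, 0, 1, 1, 0, 1, 1]
y_pred = [1, 0, 0, 1, 1, 0, 0, 1]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
gap, per_group = fairness_gap(y_true, y_pred, groups)
print(per_group, "gap:", round(gap, 3), "ok:", gap <= 0.1)
```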

So the fact that we have this very targeted use case and prescribed model outputs gives us leverage on solving fairness from a scientific/technical perspective.

Now consider: what would the analogous thing be for a large language model? Concretely, what would it mean for an LLM to be "fairer" in how it treats different demographic groups? I want to convince you this is a much more difficult problem - more of a definitional problem than a technical one.

For example, consider gender bias in the completion/selection of pronouns continuing a prompt. If I give a prompt like "Dr. Hansen studied the patient's chart carefully", I might expect continuations referring to Dr. Hansen to assign either male or female pronouns. But LLMs from a few years ago would invariably use male pronouns.

One thing I might ask for is that the frequency of male vs female pronouns be approximately equal when some occupation is mentioned. But it's much more nuanced - there could be things in the prompt that change what distribution I want over pronouns. If I say "Dr. Hansen has a beard" maybe I don't want 50/50 male/female pronouns anymore. Or if instead of Dr. Hansen, I mention a WNBA player, do I want some frequency of male pronouns?
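Even the naive version of that check - measuring pronoun frequencies over many sampled completions of the same prompt - already takes some machinery. The sketch below is a hedged illustration; `sample_completion` is a hypothetical stand-in for whatever LLM you are probing.

```python
# A sketch of the naive check: sample many completions of an occupation
# prompt and count gendered pronouns. `sample_completion` is a hypothetical
# stand-in for a real model call.
from collections import Counter
import re

def pronoun_counts(completions):
    counts = Counter()
    for text in completions:
        tokens = re.findall(r"[a-z]+", text.lower())
        counts["male"] += sum(t in {"he", "him", "his"} for t in tokens)
        counts["female"] += sum(t in {"she", "her", "hers"} for t in tokens)
    return counts

def male_pronoun_share(prompt, sample_completion, n=200):
    completions = [sample_completion(prompt) for _ in range(n)]
    counts = pronoun_counts(completions)
    total = counts["male"] + counts["female"]
    return counts["male"] / total if total else None

# e.g. male_pronoun_share("Dr. Hansen studied the patient's chart carefully.",
#                         sample_completion=my_llm_call)
```

As the prompt variations above show, even deciding what share this function "should" return is the hard part.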

I'm belaboring this to point out that even defining what we mean by "fairness" for something like pronoun selection is very difficult to articulate - too many conditions could change what we want the distribution to be, across too many occupations.

So you can see that even defining fairness for LLMs requires new approaches - the science we had until recently won't suffice. Similarly, privacy concerns have morphed in the generative era.

For traditional AI/ML, privacy might mean ensuring that a trained model doesn't let a third party reverse engineer private financial information used in the training data. But now, with open-ended output, there are additional concerns - the model might not output private information verbatim, but some close variant that, with a little work, allows it to be figured out.

The generative era hasn't only exacerbated traditional responsible AI concerns - it has introduced entirely new ones as well, which you've likely read about:

  • Veracity - hallucinations, the LLM making up apparently factual information that is verifiably false

  • Toxicity of output - less of a concern with 0/1 predictions, but very relevant to freeform text generation

  • Privacy concerns bleeding into intellectual property concerns - our legal/policy framework for privacy or copyright didn't anticipate training models on artistic data/creative writing and the implications of that

Let me talk more about hallucinations. I'm sure you're all familiar with the concept and may have experienced LLM hallucinations firsthand. The prompt here asks the LLM to tell us about some papers I've written, which I can definitively say do not exist. But the LLM happily generates seemingly factual information about these imaginary papers.

Let me get to where I can read it. It starts off by saying that Michael Kearns is a prominent computer scientist - thank you very much - and it correctly names several areas in which I'm a researcher. Then it lists a bunch of papers. Interestingly, it seems to have mainly picked up on my alter ego in game theory rather than machine learning, probably because once it started generating tokens from the lexicon of game theory, it continued to do so - that's what the context reinforced.

What I can say about these papers, if we went through each one, is that two of them are entirely fictitious - papers I never wrote, with co-authors I never wrote with - and the papers don't even exist, but they're all plausible. The co-authors exist as real people whose names I recognize. None of them are entirely correct: two of them have real titles, but with different co-authors than the papers I actually wrote. One of them lists Ken Arrow, the late Nobel Prize-winning economist, as a co-author. I wish I had written a paper with Ken Arrow, but I didn't. This is what we mean by hallucination. And if you understand how an LLM works, it's not hard to see why it happens: the model isn't actually going out and looking up citations, it's just generating words from the lexicon associated with whatever my footprint in the training data was at the time the model was trained.

Okay, let's talk about toxicity and safety for a second. Again, what I'm trying to do with my time here is not to point out what the solutions are - which people like me are actively thinking about - but what some of the nuances are, even in figuring out what you want to enforce. Take toxicity and safety. There are many passages that are quotations from famous writings and novels. If an LLM outputs a quotation that might be considered offensive by some or even many people, but clearly identifies it as a quote from a well-known text, do we want to suppress that? It probably depends on the context and on who's reading it. We wouldn't want to put it in front of children, but we also wouldn't want a blanket suppression, because that might amount to a form of censorship.

In a similar vein, what about people's opinions that others might find distasteful, but that are clearly marked as the opinions of that individual? One of the things Peter and I have realized over time in the generative era is that we need to find some middle ground between the generality of things like large language models and enough specificity in the use case that we can start to imagine what the constraints might look like. What do I mean by that? If I'm using a large language model as a creative writing aid and I'm writing novels or short stories for adults, I might have a certain tolerance for content that some people would call toxic - vulgar language, for instance. If I'm using an LLM as a writing aid for children's books, I obviously have zero tolerance for that. Once you say, even within a broader use case like content generation, exactly what the use case is and who the audience is, you can come closer to articulating the constraints you want to enforce - ideally writing them down in an almost mathematical, formal language so that you can constrain the training process to obey them.

Many of you are probably familiar with the fact that a lot of today's large language models approach some of these problems with so-called guardrail models - things like toxicity detectors that are applied both to the prompt and to the output, perhaps with a knob that sets your tolerance for how much toxicity, and what types, you'll allow. Because if you look at the literature on toxicity classification and detection, it's not just toxic or not toxic - it's what type of toxicity. Is it vulgar language? Is it images? Is it other disturbing content? And what is its severity?
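As a rough illustration of that guardrail pattern, a wrapper around a model call might look like the sketch below. The threshold values, the `toxicity_scores` classifier, and the `call_llm` endpoint are all hypothetical stand-ins, not a real AWS API; the point is just the shape of the check and the per-use-case knobs.

```python
# A minimal sketch of the guardrail pattern described above: screen both the
# prompt and the generated output against per-category toxicity scores, with
# a configurable tolerance. `toxicity_scores` and `call_llm` are hypothetical
# stand-ins for a real classifier and a real model endpoint.
DEFAULT_THRESHOLDS = {"vulgarity": 0.2, "hate": 0.05, "violence": 0.1}

def violations(text, toxicity_scores, thresholds):
    scores = toxicity_scores(text)  # e.g. {"vulgarity": 0.7, "hate": 0.01, ...}
    return [cat for cat, limit in thresholds.items() if scores.get(cat, 0.0) > limit]

def guarded_generate(prompt, call_llm, toxicity_scores, thresholds=DEFAULT_THRESHOLDS):
    if violations(prompt, toxicity_scores, thresholds):
        return "[prompt blocked by guardrail]"
    output = call_llm(prompt)
    if violations(output, toxicity_scores, thresholds):
        return "[response blocked by guardrail]"
    return output

# A children's-book application would tighten the knobs, e.g.
# thresholds={"vulgarity": 0.0, "hate": 0.0, "violence": 0.0}; an adult
# creative-writing aid might relax some of them.
```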

Intellectual property is another area that is now front and center in the generative era, because now we have concerns like this example, where we've gone to an image generation model with a text prompt that says, "Create a painting of a cat in the style of Picasso." This sits in some strange gray zone between privacy and things like copyright. For the Picasso estate or the Andy Warhol estate, the original images are not private data - they're not like your loan application or my home address. They're in the public domain, and perhaps the artists or writers wanted them to be in the public domain, but they didn't anticipate their being used to train a model that then engages in what is perhaps stylistic appropriation. Again, it's hard to say exactly what we want: in one context you might call it a hallucination, and in another a generalization; in one case you might call it stylistic appropriation, and in another, appropriate creativity.

This, I think, is an area that won't have purely technical or scientific solutions; it will have to involve our legal system adapting to the unforeseen training of these models on this kind of data. Let me conclude by saying that I've told you about a lot of problems - problems that have gotten harder not just to solve, but even to conceptualize and define what we want in the generative era. But there is quite a bit of science going on. I know many people in my field who have more or less stopped doing whatever they had been doing for their entire careers to work on responsible AI, so there are a lot of very smart people thinking hard about these issues. Some of them, I think, will have technical solutions, kind of like the fairness constraints I described in consumer lending; others are going to require a strong public policy, legal, and regulatory framework. To mention just a few directions: things like red teaming, and watermarking in order to identify content that was generated by an LLM, so that people aren't fooled by AI-generated content in inappropriate ways.

The somewhat distasteful phrases "model disgorgement" and "machine unlearning" refer to an emerging science around how you take a model that has been trained and remove the effects of particular pieces or subsets of the training data on its output - ideally in a way that doesn't require retraining the model from scratch, which, if you know the size and expense of training these models, is completely infeasible to do on an ongoing basis every time somebody wants their data removed from the training process. So there's a lot going on scientifically, and I now want to turn it over to Peter to talk about how we take all of this science, and the challenges that generative AI presents, and operationalize it within the context of AWS products and services. Peter, take it away.

Peter Hallinan: All right, thank you, Michael. For a little bit of context, I lead a central team at AWS that advances the science and practice of responsible AI, which means we live the ambiguities Michael has been discussing for the past 20 minutes every day of our lives. Let me start with a little level setting, to make sure that everybody is on the same page.

I want to observe that ML software is not like traditional software. There are a couple of fundamental differences from which a lot of things flow. The first is that with ML solutions we spec with data, as opposed to traditional software, where we spec with human language. That causes problems right up front. If you're an ML team and you ask a data team to build you a nice dataset, one of the first things they are likely to do is try to deliver very high quality data, which means they give you an extremely detailed set of guidelines for the people building the dataset. What that really means is that they have tried to write down in English what the manifold is that they're asking the data team to label - which completely contradicts the point of going to data in the first place. If you could write it down in English, you would have done that, the way historical feature engineering did. So that's a big difference right off the bat.

The second is that with traditional software, customers do not expect to test it - it should just work right out of the box. With ML solutions, customers must test. This is completely different: once the solution is deployed, someone may submit data whose underlying assumptions differ from the assumptions on which the model was trained, and only the person submitting the data can know that. So there's a testing expectation, and with that testing expectation comes the expectation that you know how to build datasets, that you know how to judge the results, and so on - another big difference. The third is that with traditional software, when you release version n+1, you expect it to work as well as or better than version n on every single input. Not so with ML solutions: new releases will work better with respect to some metric, for example the average, which means things can stop working for a particular person or a particular input that worked before. Again, you have to have processes in place to handle that. None of this changes with generative AI. So that's the base level of stuff we have to worry about.
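To illustrate that third point, here is a small sketch of a release check that looks at per-example changes rather than just the aggregate metric. The data is invented; the pattern is the point.

```python
# A sketch of the point above: version n+1 can improve the aggregate metric
# while regressing on specific inputs, so release checks should look at
# per-example changes, not just the average. Data is invented.
def accuracy(preds, labels):
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

def regressions(old_preds, new_preds, labels):
    """Inputs the old version got right that the new version now gets wrong."""
    return [i for i, (o, n, y) in enumerate(zip(old_preds, new_preds, labels))
            if o == y and n != y]

labels    = [1, 0, 1, 1, 0, 1, 0, 1]
old_preds = [1, 0, 1, 0, 1, 0, 0, 0]   # version n:   accuracy 0.50
new_preds = [1, 0, 0, 1, 0, 1, 0, 1]   # version n+1: accuracy 0.875 overall
print(accuracy(old_preds, labels), "->", accuracy(new_preds, labels))
print("inputs that regressed:", regressions(old_preds, new_preds, labels))  # [2]
```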

All of that means you have a shared responsibility between providers of models and deployers of models. That's a fundamental point. On top of that, we layer what we consider the dimensions of responsible AI. Michael has talked about a number of these, but there are more still. In the age of generative AI we have the issue of controllability, which is about making sure you can steer and monitor your model to achieve the desired system behaviors. In the worst case, some people may be worrying about Skynet and something running amok - well, do you have control levers you can employ to steer this thing, and do you have monitoring mechanisms to make sure it is in fact doing what you expect? Then we have privacy and security, which Michael talked about, and safety, fairness, veracity, and robustness.

We also have explainability, transparency, and governance. All of these are issues you have to juggle when deploying ML solutions - properties of your application, of your model, of your supply chain - on top of the shared responsibility model. Let me just pause: I hope this isn't simply raising a lot of concern. It is a lot to juggle, but there is a way forward. We have a framework that we use to think about how we make our investments in this area, and leading it, obviously, are our investments in science.

Then we have a lot of work that we put into translating the science into practice. Thirdly, we spend time thinking about how we bake all of this into the entire ML dev ops cycle. Finally, we also think about the people - how do we engage stakeholders? How do we educate people, et cetera?

So I'm going to cover the 2nd, 3rd and 4th boxes and try and leave everybody optimistic that in fact, we can make progress in building and operating responsibly.

Okay, so second box theory to practice, let's talk about some secrets to success. I think Michael was highlighting this one in a number of cases - defining application use cases narrowly. I'm going to dig into each of these to make sure that they're clear.

Second one - matching the process, your development process, your operating processes to the level of risk.

Third, treating data sets as product specs.

Fourth, distinguishing application performance by data set.

And fifth, operationalizing the shared responsibility model.

Okay, so these are just five things to consider. Let's dig into each of these.

What does it mean to have a narrow application case? A lot of people think, okay, face recognition - that's a use case. It's not. Here are just three flavors:

  • Retrieving an image from a gallery could be for a found child - you want to look up the child in a gallery of missing children.

  • Celebrity recognition - looking up a celebrity in a database of production video.

  • A virtual proctoring application.

But each of these, although they employ face recognition, have different kinds of variation that you have to worry about - different kinds of biases, different consequences for an error, different ways in which you tune the system so that you have fewer of the errors that are more costly.

For example, with retrieving images in the missing child case - you really want to return as many matches as you possibly can. But in looking up a celebrity to find a clip in a video, you're gonna get lots of hits - so there you're not gonna tune it for recall.
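A hedged sketch of what that tuning difference amounts to in practice: the same gallery similarity scores, but a different decision threshold depending on which errors are more costly. The scores and thresholds are invented for illustration.

```python
# A sketch of the tuning point above: identical similarity scores, different
# decision thresholds depending on which errors cost more.
def matches(candidate_scores, threshold):
    """Return gallery entries whose similarity meets or exceeds the threshold."""
    return [name for name, score in candidate_scores if score >= threshold]

gallery_scores = [("child_0412", 0.91), ("child_0077", 0.64),
                  ("child_0213", 0.58), ("child_0989", 0.31)]

# Missing-child search: favor recall - a missed match is far more costly than
# asking a human to review a few extra candidates.
print(matches(gallery_scores, threshold=0.5))

# Celebrity lookup in production video: hits are plentiful, so favor precision
# with a higher threshold and return only confident matches.
print(matches(gallery_scores, threshold=0.85))
```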

In the generative AI case, you have the same kinds of issues. Suppose you want to catalog a product versus persuade someone to buy it - a human would write different things for each.

When you're cataloging a product, you are intending there to be a broad demographic - anybody can read the catalog. You want to worry about veracity, consequences like brand damage, lost sales, returns. When tuning it, you're gonna favor neutrality of language, clarity, completeness.

But when you try to persuade someone to buy, you're probably going to target a narrow demographic. You're going to have additional issues with unwanted bias - how is the LLM thinking about this demographic? You have greater consequences if it shows unwanted bias in terms of representative harm, for example. When you tune it, you're probably gonna try to zero in not on a complete description of the product, but on the particular problem that this demographic is most interested in and the benefit to that demographic.

So it's the same thing - traditional AI, generative AI - zoom in on a narrow use case. I just can't emphasize this enough because if you go broad, you're going to get into trouble. You won't be able to define what bias really means. You're going to end up doing worst case risk analysis which will tell you not to do it. So this is absolutely critical as a first step.

Second thing - risk based approach. If your application is recommending music versus identifying a tumor - very different consequences for getting this wrong. So you need to have a systematic approach to assessing risk within the organization.

The first piece of advice is to align with the NIST framework. The second is to realize that people often worry about risk to their own organization, but in this situation we're worrying about risk to the stakeholders. It's not about saving your bacon, it's about saving theirs.

So you want to identify all the stakeholders - both in the development process, imagine the people labeling stuff for toxicity - and on the usage side, people who might not be included in building it. Go through, identify the stakeholders, identify potential events, negative or positive, estimate likelihood and impact of each event, aggregate the risks and then choose development and operating processes appropriate to the level of risk.
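As a hedged sketch of what that process might look like once it's written down, here is a toy risk register. The stakeholders, events, scores, and the simple likelihood-times-impact scoring rule are illustrative assumptions, not a prescribed AWS methodology.

```python
# A toy risk register for the process described above: enumerate
# stakeholder/event pairs, score likelihood and impact, rank, and aggregate.
risks = [
    # (stakeholder, event, likelihood 0-1, impact 1-5) - all invented values
    ("loan applicant", "creditworthy applicant denied",       0.10, 4),
    ("data labeler",   "prolonged exposure to toxic content", 0.30, 3),
    ("end user",       "model output leaks personal data",    0.02, 5),
]

def score(likelihood, impact):
    return likelihood * impact

ranked = sorted(((score(l, i), s, e) for s, e, l, i in risks), reverse=True)
for value, stakeholder, event in ranked:
    print(f"{value:4.2f}  {stakeholder:15s}  {event}")

# The aggregate (e.g. the maximum, or a sum) then drives which development
# and operating processes you adopt.
print("aggregate risk:", max(v for v, _, _ in ranked))
```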

This is really not easy to do - you can get in extended arguments with people on what the actual likelihood and impact of a particular event is. It takes a social process of forming, storming, norming and performing before you get everybody able to consistently assess risk. But it's critical.

Then, treating datasets as specs - as we said before, datasets are particularly key here. Take a look at what's actually in the input and anticipate global diversity. For example, with face recognition you may be familiar with skin tone scales that have 5-10 values, but the actual distribution is a banana-shaped curve in RGB space - much more complicated. You want to accommodate the full diversity that's out there.

You have to think about the difference between intrinsic and confounding variation, make sure your data sets have it, and use lots of different data sets.

With AI, your dataset is encoding your design policies.
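As a small illustration of that point, a dataset-as-spec mindset implies auditing coverage of the attributes you decided matter. The attributes and minimum counts in this sketch are illustrative assumptions, not a prescribed taxonomy.

```python
# A minimal sketch of treating the dataset as a spec: audit coverage of the
# metadata attributes your design policy says must be represented.
from collections import Counter

def coverage_report(examples, attribute, minimum_per_value):
    """examples: list of metadata dicts; flags under-represented values."""
    counts = Counter(ex[attribute] for ex in examples)
    return {value: ("OK" if n >= minimum_per_value else "UNDER-REPRESENTED")
            for value, n in counts.items()}

examples = [{"skin_tone_bin": "light", "lighting": "indoor"},
            {"skin_tone_bin": "light", "lighting": "outdoor"},
            {"skin_tone_bin": "dark",  "lighting": "indoor"}]
print(coverage_report(examples, "skin_tone_bin", minimum_per_value=2))
# {'light': 'OK', 'dark': 'UNDER-REPRESENTED'}
```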

Distinguishing application performance by dataset - this is critical to internalize. Performance is a function of an application and a dataset, not just the application. I know this goes against everything people want - they want to say this model is good or bad, this model works or doesn't work. But that's really not the way to think about it.

The issue is how well does a model or application perform on a particular dataset. So over time model performance may increase as you improve it against a test dataset. But there might be some other test dataset on which it does even better over time, or some dataset on which it gets progressively worse over time. All of these are possibilities.

What this really means is that when you're developing, in both traditional AI and generative AI, you have to worry about two development trajectories - the trajectory of the model and the trajectory of the dataset - because very often people are evolving the dataset; it's not constant. And every time you evolve the dataset, you're jumping from dataset A to B, or A to C.
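A minimal sketch of the bookkeeping that implies: report performance keyed by both model version and evaluation dataset version, never as a single number. The figures below are invented for illustration.

```python
# Track performance as a function of (model version, dataset version), as
# described above. All numbers are invented.
results = {
    # (model_version, dataset_version): accuracy
    ("model_v1", "eval_v1"): 0.82,
    ("model_v2", "eval_v1"): 0.88,   # better on the old eval set...
    ("model_v1", "eval_v2"): 0.79,
    ("model_v2", "eval_v2"): 0.74,   # ...worse once the eval set evolves
}

def report(results):
    for (model, dataset), acc in sorted(results.items()):
        print(f"{model} on {dataset}: {acc:.2f}")

report(results)
# "Is model_v2 better than model_v1?" has no answer without naming the dataset.
```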

So another critical point in terms of doing this practically - in terms of sharing responsibility upstream and downstream:

If you're a model provider, like Amazon with its Titan family, you need to anticipate downstream use cases.

If you're downstream, you need to define your application use cases narrowly, but upstream it's even harder - when you anticipate diverse downstream use cases, you still have to treat them narrowly and say "these are some things you really shouldn't do and these are some things we're really designing for."

On both sides, assess risk and select the process.

On both sides you're building datasets - upstream you build them to train and evaluate; downstream you may not be building the model, but you still need to evaluate it, so you still have to build datasets.

Testing the component on anticipated data downstream, testing end to end - just because an LLM or foundation model has been tested out the wazoo doesn't mean you can assume you can just drop it into an application and you're okay. You should test your application end to end and make sure this whole thing is really working as you expect.

Sending feedback - if you're using a vendor to help build a dataset and you learn something, give them feedback. Similarly if someone downstream deploys an Amazon service and has an issue, they have to let you know. Communication has to go upstream and downstream.

And then you also have to act on what you've learned - that's absolutely critical.

So if you're going to be deploying generative AI within your organization, either internally or in products you build on top of it, think about having processes in place to deal with each of these issues.

One example of the way we do this is the AWS AI Service Cards we introduced last year. They describe services not from the point of view of a scientist or an engineer - we have model cards and datasheets for that - but aim to educate a more non-technical audience about appropriate usage of a particular service.

That's just one example. So let's move to the third box. First we talked about theory to practice; now that we have a set of good practices, how do we embed them throughout the entire ML dev ops cycle?

Here's a cartoon of the dev ops cycle - everyone probably has a different favorite cartoon, and this is just one of them. The question is where we think about these issues, and the answer, as you can probably expect, is absolutely everywhere. At every single stage you have to sit down and think about how controllability, fairness, privacy, security, and robustness play out in that particular block of the cycle.

Let me pick one example, right at the very beginning: the ML problem formulation box on the left-hand side. A fundamental question is simply whether ML is appropriate at all. How well do humans perform on the same task? What task are humans really solving? Might your system be repurposed? Ask the big questions, and give yourself the space and time to do it.

Let me give you a few examples. We have had people ask, can you build a loitering detector? Say you have security video of your store and you want to know if someone is loitering so your security team can take some action. Well, loitering is a value judgment, and you don't want the system making value judgments - how would it even do that? That's not a good system to be thinking about building. Can you instead detect some number of people standing in a location for a certain time period? Yes - that's much more practical.
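As a small sketch of that value-free reframing, the detector reduces to counting people in a zone and flagging dwell time above a threshold. The detection stream, zone, and thresholds below are illustrative assumptions layered on top of whatever person detector you use upstream.

```python
# A sketch of the reframed, value-free version: flag when the number of
# people detected in a zone stays above a threshold for a minimum duration.
def dwell_alerts(frames, min_people=1, min_seconds=300):
    """frames: list of (timestamp_seconds, people_count_in_zone)."""
    alerts, start = [], None
    for ts, count in frames:
        if count >= min_people:
            start = ts if start is None else start
            if ts - start >= min_seconds:
                alerts.append((start, ts, count))  # flags each frame past the limit
        else:
            start = None
    return alerts

# e.g. person counts sampled every 60 seconds from an upstream detector:
frames = [(0, 0), (60, 1), (120, 1), (180, 1), (240, 1), (300, 1), (360, 1), (420, 0)]
print(dwell_alerts(frames))  # [(60, 360, 1)] - someone present for 300+ seconds
```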

All right, stylish clothing two years out - could you predict that? What are your input signals going to be? I'd say this one is a question mark. Try it, but good luck.

Then this one: write ad copy for a red and blue, 64-inch diameter golf umbrella. In general, can you write persuasive ad copy? Maybe you can, if you use demographic groups defined by the task a particular person cares about. But you may get into a lot of trouble with LLMs if you try to use standard definitions of demographic groups - intersections of gender, age, and ethnicity, say - because the LLM hasn't really encoded, and how could it, what the biases of those groups might be.

So ask those questions. And if you reflect on this, I think it's also good to appreciate that you have a very powerful tool in deciding what your system will do and what it won't do. You do not have to build a system that handles the full diversity of the planet in every possible situation. If you had to do that every time, you would probably never get off the ground - how would you get all the data you needed? It would be enormously expensive, and you might end up building something that people didn't even want; you have to learn that.

So you have this lever, which is transparency. You can elect to say to folks: these are the circumstances in which the system is intended to be used, and this is how it works; everything else is either unknown or at risk, so please don't do it - or do it with full knowledge that you're going to have to do extra testing.

So you have this lever that you can choose to operate. Across the ML life cycle, we deploy a number of things. The primary one to make everyone aware of is people: we have a partner network, a generative AI innovation center, and solution architects. There's a lot of complexity, as Michael pointed out and as I've taken you through up to this point, and those folks can be extremely helpful.

We also have tools in SageMaker: Data Wrangler, Ground Truth, Clarify, Model Monitor, and ML governance tools such as model cards. And we have the internal team that focuses on helping all of our services advance the science and practice of responsible AI.

We have a lot of investment in this, and you can see the outputs in services like Amazon CodeWhisperer and Amazon Titan. On the CodeWhisperer side, data is private and secure, and we have content filtering, built-in security scanning, attribution for sourcing, and indemnification.

On the Titan side, we again have privacy, security, and content filtering. We have human alignment through reinforcement learning and supervised fine-tuning, and knowledge enhancement via RAG, orchestration, and customization. So there's a lot of investment going into these, all using the principles we've talked about previously.

All right, let me finally switch to the bucket around people. I've walked through a lot of challenges, and if you're in an organization that's either trying to deploy a generative AI application internally or build one for its own customers, there's going to be a process you go through: building awareness, establishing foundational skills, seeing capabilities emerge, and finally baking it into your operations. You have to set out on that journey explicitly.

And as you go down this journey, you can't just expect it to happen without thinking it through. One thing I want to call out is that responsible AI is not something purely for your technical teams, your scientists, and your engineers; it's absolutely crucial that your product managers be at the center of it.

At Amazon, we have a kind of tiered view. Foundational principles around human rights and sustainability are things we pursue across the company. On top of that sits the set of normal things any product manager worries about: use case, accuracy, feature set, latency, cost, uptime.

And on top of that, you have all the responsible AI properties we've been discussing. You're inevitably going to find yourself in situations where the things on the bottom are things you cannot trade off; the things in the middle are things people trade off today in the normal course of business - do I want this many features at a higher cost, or fewer features at lower cost, that kind of thing - and the responsible AI properties are also going to have potential tradeoffs.

Those tradeoffs need to be made explicit. And again, if you've gone down the path of having narrow use cases, it's much easier to understand what the tradeoffs are and to communicate them so that people can have a good experience.

The last thing - I know it may sound like a bit much - is to participate. You're in your org, trying to develop your capabilities within your org, but there's a lot of activity on the policy side, the legislative side, and the regulatory side, and you need to participate in those processes as much as you can. ISO/IEC 42001 is going to be a linchpin of the EU AI Act, and there are roughly 30 other standards also in development. The folks developing these standards are trying to do the best they can, but the more input they have from people who are actually building and deploying applications, the more likely the standards will actually be effective.

That's why, for example, we participated in the White House voluntary commitments and why we participate in lots of standards bodies - and we'd encourage everyone else to do the same.

If you want further information, we have three pointers for you. The first lets you hear more from Michael; the second lets you learn more about the White House commitments; and the third gives you more information on getting started with generative AI using our services.

All right, I want to thank you at this point. We've got about 11 minutes left.
