Best practices for Amazon CodeWhisperer

Rory: Everybody's so excited to have you all here. This is fantastic. All right. So you know what they're here for? They're here to see us.

My name is Rory Richardson and I've been working in tech a long time. There was this one time when a developer was trying to ask me out, but I had to pass a gauntlet of questions before he would make the effort to flirt with me. One of his questions was: what's your favorite language? He wasn't talking about French. So that's my question to you all: what languages are you using? Raise your hand if you're using Python. Of course, I've got a little Python somewhere. Java? C, in any flavor?

Mhm. Oh, that wasn't so popular there. Wait, watch this: Scala? Oh, I see you. I see you, COBOL. I've got more COBOL in New York. All right. Well, I'm Doug Seven. I'm the general manager for Amazon CodeWhisperer, which is a developer productivity tool powered by generative AI to help with coding. And that is what we're here to talk about today, and most importantly, the best practices for how to use a coding companion like CodeWhisperer.

My name is Rory Richardson and I'm the director across all the developer tools we're gonna be talking about this week. I requested to do this session because I think CodeWhisperer, and a lot of the gen AI-enabled services we're releasing in the developer space, are kind of changing the game: who's going to develop, how they develop, how fast they develop, how cheaply they develop, how fast we can create innovation. And I'm super excited that we get to work in this space over the next few years.

Yeah. Now, I was walking around before the session asking how many people had used CodeWhisperer before. So give me a little — all right. So we're gonna start with a little bit of baseline, so we understand how CodeWhisperer works and what it does. Then we're gonna transition into some best practices that we've seen from how people are using CodeWhisperer and what they're able to do with it, and talk about all kinds of different things you can do and different ways you can work with the tool. So off to the races we go. All right.

So the interesting thing that has happened this year is that generative AI has changed the game in a lot of cool ways. I think in January, we were looking at the developer experience very differently than we are today. And if I think about the rate of change from when CodeWhisperer went GA in April to where we are now, it has been truly remarkable to see how fast we can leverage gen AI to create an even better developer experience.

One of the things about being a developer is that a lot of the time there's undifferentiated heavy lifting that isn't necessarily adding much value. So what if we took the same principles we've applied to AWS and applied them to the developer lifecycle — how do we remove the undifferentiated heavy lifting? That's sort of our mantra: what can we take off a developer's plate so that they can be more productive, more creative, and produce better quality code?

So with that, let's talk a little bit about what CodeWhisperer is — and before we talk about how it works, let me quickly show an example of what it does. CodeWhisperer is installed through the AWS Toolkit, which is an extension you can get for VS Code, which is what I'm using here today, or for the family of JetBrains products. So here I've got the AWS Toolkit installed, and you see that it says "including CodeWhisperer." That's how I get CodeWhisperer. Then I log in — there are a couple of different ways; we're not going to talk about that today. I just want to show you the primary use case for CodeWhisperer.

I'm going to start in JavaScript — I heard a few people saying they like JavaScript. Let me move this out of the way. The idea behind CodeWhisperer is that it's ambient: it's there, kind of listening to you and watching what you're doing. So as I'm writing code — let's say I'm going to write a function to do a topological sort — I write this, and then down at the bottom you see a little thing that says CodeWhisperer with a spinning circle, and you see this gray text show up.

That's CodeWhisperer making a suggestion. You notice there are no line numbers with it — it hasn't been written to my file yet. That's just CodeWhisperer inferring, from what I've written so far, what I'm trying to do. It understands from the word "function" and from the phrase "topo sort" that maybe I'm trying to do a topological sort, so it's recommending some code for me. And if I like the code it recommended, I just press the Tab key and that code is written into the file.

So now you see I've got line numbers. Now, if you're astute, you'll notice that the suggestion included another function — a DFS function — that is not in my code yet. So as I go down to other lines, I can trigger CodeWhisperer to see if there's anything else it wants to do, and it realizes DFS is not defined yet, so it defines a function called DFS. One of the ways we work with CodeWhisperer is it starts by suggesting just the function signature. If we get the function signature right, then when I go to a new line, it fills in the rest of the function for me.

So we want to make sure we get the idea right first, and then we fill in the gap. And now I've written my topological sort just by writing code and having CodeWhisperer kind of think ahead for me — much like if Rory and I were writing code together and I start to write, and she says "I got this," takes the keyboard, and writes it out for me. The difference is that this 20-line file of code, which might have taken me a minute or two, maybe three, maybe five, maybe 20 if I didn't know how to do this and had to go search it up somewhere — this was done in a matter of seconds, just by CodeWhisperer being ambient, paying attention, seeing what I'm doing, understanding the intent of what I'm trying to accomplish, and then giving me that code suggestion.
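The suggestions vary from run to run, so the on-screen code isn't reproduced here, but a sketch of the kind of suggestion described — a topological sort plus the DFS helper filled in afterwards — might look like this (names and structure are illustrative):

```javascript
// Sketch of a CodeWhisperer-style suggestion for a topological sort.
// The graph is an adjacency map: { node: [nodes it points to] }.
function topologicalSort(graph) {
  const visited = new Set();
  const result = [];
  for (const node of Object.keys(graph)) {
    if (!visited.has(node)) {
      dfs(graph, node, visited, result);
    }
  }
  // Nodes finish in reverse-dependency order, so reverse at the end.
  return result.reverse();
}

// The DFS helper the suggestion referenced before it was defined:
// visit all descendants of a node before recording the node itself.
function dfs(graph, node, visited, result) {
  visited.add(node);
  for (const neighbor of graph[node] || []) {
    if (!visited.has(neighbor)) {
      dfs(graph, neighbor, visited, result);
    }
  }
  result.push(node);
}
```

For example, `topologicalSort({ a: ['b'], b: ['c'], c: [] })` yields `['a', 'b', 'c']`: each node appears before the nodes it points to.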

So let's go back to our slides to talk about how that actually worked and what happened. Oops — back up one second. There we go. I mentioned that it's installed into your IDE with the AWS Toolkit, so it's there, and once you authenticate into CodeWhisperer, it's just ambient. You don't have to intentionally interact with it; it's just present. As you're typing, it's paying attention to what you're doing and finding the right time to trigger this loop where it goes to the service, takes your input as what we call a prompt, and sends it into a large language model — a foundation model — which processes that intent. And from what that foundation model knows and understands about writing code, it says: oh, I think this is a topological sort.

"Here's what I know about topological sorts," and it writes net-new code for how to do that. It's not pulling from a library or anything like that; it's actually writing the code based on all the knowledge it has about how to write code, and then returning that code. That's happening in under a second, and it's happening continuously as you're writing code. There are a few other things on the slide, like the reference tracker and the security scan. We'll talk about how those work as we go through the session; we won't worry about them right now.

I want to ask Rory to talk about how we got here. Technologically, there were two key reasons why generative AI took off this year. One: massive amounts of data — we were able to look at information differently than we ever have before. And the second is cost: we were able to make these queries affordable enough to enable really complex ways of interacting with that data.

So this year AI is no longer the purview of just data scientists; there was a democratization that made it possible for everybody to play with it. I don't know about you, but I have a 13-year-old son. The first thing I asked him to do back in February and March was to start doing his homework with ChatGPT. It's made me extremely unpopular with the other parents, but it is the future. It's the new baseline. I have to teach my son how to be the "plus one" on top of the content we can create with generative AI.

I personally was super excited by this, because his homework improved rather dramatically overnight. Instead of him thinking about syntax — do I have the right conjugation? — what I saw was that his stories, his narratives, got exponentially better, because he was able to express his creativity and not worry about the nuts and bolts of how he was writing. This was back in February and March, so when I started talking to Doug and other people on the team, I was like: oh my gosh, if we could do that for developers — if we could unlock their creativity with syntactically perfect code — how much faster could we be? How much happier would we be? What is the limit of what we can do with this?

Yeah, I never have to write another topological sort, that's for sure. So, this explosion of language model — or foundation model; we use the terms sort of interchangeably — capability: if you go back even as little as four or five years, the size of a state-of-the-art foundation model was measured in millions of parameters. Now we're measuring in billions, and potentially trillions, of parameters. All that really means is how big and how capable the model is, and those models have evolved fast.

And then us being able to drive down the price of training and hosting those models — through things like our Trainium and Inferentia chips — made this all possible. So how does it work? You've got this concept of a language model, which takes in massive amounts of data. In the case of something like CodeWhisperer, we're talking about all the source code we can get our hands on.

So: all the publicly available source code, plus a little bit of our own Amazon source code — particularly the best-practice code for how to use Amazon APIs. All of that goes into essentially a large machine-learning process to build a deep neural network. As these tokens come in — and when we talk about tokens, we mean pieces of the input context — that "function topo sort" could be broken down into multiple tokens: it's "function," it's "topo," and it's "sort."

So multiple tokens go into the model, and the model processes those tokens. It's doing a predictive analysis: based on these things, what do we think comes next, given the entire body of data we know and understand? And since CodeWhisperer is trained exclusively on software development practices and software codebases, that makes it ideal for this kind of environment. But these foundation models can do lots of different things.

You've seen text generation models and image generation models and all kinds of other things. There are all kinds of different foundation models, and they come in different sizes with different capabilities. This one's about code: we've designed the CodeWhisperer model specifically around software development and trained it to do these things.

I think what Doug said there was particularly interesting, because in training the models around code — or even operational practices — we learned how to optimize the LLM differently than when you're using natural language. Natural language has synonyms; natural language has cultural context. Code has a lot less, so the model itself has to be trained differently.

So there's a broad spectrum of applications where we could spend our time leveraging gen AI, but it seemed like a natural adaptation to focus on things that were fairly regular and see how far we could go. Out of all of these options, this is where we are for this particular session: how are we gonna help people code, how do we make it easier and safer for people to get started coding, and then how do we scale it?

So you may have seen some announcements: we support Rust, SQL, and C#, and just recently we announced CloudFormation, the AWS CDK, and — most importantly, one of my favorites — Terraform. Moving up the stack into infrastructure as code gives you even more capabilities. The idea that we supported SQL I thought was pretty interesting, because — how many of you are DBAs, or have ever played around with databases? I'm a DBA, so I'm speaking from personal experience: what can supporting SQL mean for writing more efficient queries, based on the past patterns you've been using within your organization?

What will that mean? Being able to extend past the application tier into the database tier gives a more unified, universal experience across any application you're trying to build. Additionally, we announced support for Visual Studio. We already supported JetBrains and VS Code, as well as the Lambda console, SageMaker Studio, and Glue Studio.

So the word that Doug used is key: how do we make this ambient? How do we make you forget it's even there, so that it becomes ubiquitous and you don't remember how you wrote code before — because you have a companion, a best friend, sitting right beside you helping you write great code.

So I'm gonna take over here for a second, because now you understand how this works — and maybe you came in already knowing how these foundation models work and what they do. I want to show that there are different use cases, so we're going to transition now. I feel like we've laid the groundwork: not everybody coming in here had worked with CodeWhisperer before, but now you all know kind of what it is, how it works, and what it does. We're going to spend the rest of the time showing you things and getting into some of these best practices.

But I just want to show you that there are kind of two modes to how this works. The one I showed you earlier: I was just writing code, and CodeWhisperer was thinking ahead for me, saying, "hey, based on what you've written, this is what I think you want to do." It's doing its best, based on what little context I'm giving it, to suggest the code that I need. And it did a pretty good job.

But because of how these language models and foundation models work, I don't have to restrict my input context — what I provide to the language model — to just code. It can also be natural language. So I can come in here and say, "I want a function to perform a topological sort." CodeWhisperer comes up and says: ok, you want a function called topological sort? That seems reasonable. I'll hit Tab to accept that, I'll go to the next line, and it says: ok, now here's some code for you.

Now, what's interesting is that it's largely doing the same thing. Some of the names are a little different, but if you look at the code, it's basically doing the same thing as last time. But because I'm adding natural language to my input, it's adding natural language to its output. It says: oh, you like comments? I guess we'll give you some comments.

The other thing that's interesting: it would be presumptuous of us to think we could predict what you were doing on the first guess. So when we deliver a code suggestion to your IDE, in most cases we actually deliver more than one. If I use my arrow keys, I can scroll through and see — you can see a subtle change there, mostly in the comments and the name of the helper method. As I scroll through, I get different examples of what the foundation model thinks you're trying to accomplish. Our goal is to make the first suggestion you see the best suggestion you see. That's not always the case — sometimes what you're looking for might be the second, third, fourth, or even fifth suggestion — but more often than not, we've found people tend to accept the first recommendation they see. So we want to make sure that's the best one, and I can accept that.

But the same thing happened here: this created a method called topological sort recursive. As I move down a couple of lines, you see at the bottom it says CodeWhisperer — it's kind of cut off on the screen. There's a little indicator that tells you what CodeWhisperer is doing, and right now it's ambient, dormant; it's just hanging out, not doing anything. But I need to create this additional method, topological sort recursive. I can write it myself — or, if CodeWhisperer didn't automatically provide a suggestion, this is one of my favorite tips: manually trigger CodeWhisperer when you want it. What we find is that people who do this are two to three times more likely to accept the suggestion they get.

Everything you've seen up until now is us deciding when to give you a suggestion. We have an ML classifier that runs in your client and tries to find the best time to give you a suggestion based on what you're doing — when you're most likely to accept the suggestion you're given. But if it doesn't pop up and I want one — like right now, it didn't give me one — I'm on a Mac, so I can hit Option+C; on Windows, you'd hit Alt+C. That just triggers CodeWhisperer to come up with a suggestion.

So it came back and said: ok, you want a recursive function to perform the sort? That seems reasonable. And then it says: ok, well, here it is. Now you see CodeWhisperer is spinning and doing its thing, and I've got that same function. This is functionally the same as the code from the first example, but because of my use of natural language in the input context, I got natural language in the output as well. So if you like commented code, this is a great way to get it. By the way, just because you use natural language in the prompt doesn't mean you'll always get comments in the code — CodeWhisperer is looking at what you're doing and trying to match your style, in a sense. Ok?

Back to — yeah, go back to me. It is me. So we were talking about this input prompt idea. When we talk about CodeWhisperer matching your style, what we're talking about is this term that has started to pop up a lot: prompt engineering. Prompt engineering is just a fancy way of saying figuring out the right things to write in the input context to get what you want in the output.

One of the things that's really fascinating about how these foundation models work is that they look at the context coming in and, where they can, try to mimic that context in what they put out — they try to pattern-match. In the last example, I started with a comment, so we saw a lot of comments in the code. In the on-screen example, we're writing some code. You said you like Python — round of applause for Python; I heard you, and I changed it right before the session. It's basically looking at that structure: oh, you've got a function signature, you've got a comment block that describes what you're doing, and then you've got the function below that. So when I go to prompt for another one, it does the same thing.

So let me give you a real-world example of that. We'll keep going with the JavaScript thing. It is a little weird for JavaScript, because normally your comments would be outside the function — you usually have a doc-string comment block above the function. But if you go to write a new function and you have comments inside of it, CodeWhisperer can understand that and replicate it.

In this case, we're asking it to parse a CSV file full of songs — this is one of my personal favorites, because I like doing things with songs — and the comment describes what it wants. As I go down to a new line, it says: ok, we'll start by creating a song array. Or I can try different things and see what's available to me. It works through it: it's gonna split — ok, that makes sense. Then it comes to the for loop and figures out what to do with it (it didn't close it off for me), and then it's gonna return the songs. So that all makes sense.

Then if I want to continue with something similar and I type "function saveCSVSongs(path)", it's going to do something very similar to what I've already done — because, again, the foundation model wants to pattern-match. It looks at that input context and says: oh, I can pattern-match that in the output. So it gives me a function with that comment style, with the parameters in the comment, and then the function body doing the things it's supposed to do.
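To make the pattern concrete, here's a hypothetical reconstruction of the two functions from the demo. The names, the comment style, and the title/artist CSV shape are assumptions (the on-screen code isn't reproduced in the talk), and for a self-contained sketch these operate on strings rather than file paths:

```javascript
// Illustrative sketch: parse a CSV of songs, one "title,artist" per line.
function parseCSVSongs(csv) {
  // start by creating a song array, as the suggestion in the demo did
  const songs = [];
  const lines = csv.trim().split('\n');
  for (const line of lines) {
    // split each line into its fields
    const [title, artist] = line.split(',');
    songs.push({ title, artist });
  }
  return songs;
}

// The follow-up function: given the same comment style and structure in
// the context, a suggestion tends to mirror it for the "save" direction.
function saveCSVSongs(songs) {
  // serialize the songs back into the same "title,artist" format
  return songs.map((song) => `${song.title},${song.artist}`).join('\n');
}
```

The point of the example isn't the CSV handling itself — it's that the second function inherits the structure (comment placement, naming, loop style) established by the first.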

So, from a best-practice standpoint: if you start by putting something in the context to let the foundation model know how you like to do things — the patterns you like to use — then the foundation model will try to replicate that. If you start with an empty file and start typing, it's gonna do its best to do what's necessary. But if you give it a little extra help, you're going to get a little more of what you like back. This, by the way, just means I didn't have to spend time writing the comments and filling in the parameters and all the stuff that I don't really like doing but have to do, because it's part of how you do good software development.

This continues on — we're talking about context, and context is all about what goes into the model. Again, we have that ML classifier running on the client; it looks for the right time to trigger the model and then gathers whatever is in what we call the context window. Now, when we first released CodeWhisperer — for those of you who had used it before — our context window was limited to the file you were working on. It only ever looked at that file. But sometimes the thing you're working on is related to something in another file. So we expanded the context window to what we call cross-file context: if you have other files open, we'll look across all those open files as the full context window.

So if you're doing something in one file that's related to something in another file, we understand that and can use it. Let me give you an example of how that works. And again, from a best-practices standpoint, the idea here is: if you're working on something related to something else, have both of those files open.

So I'm gonna close a couple of these out — I don't need these — and open up two files. Let me split this to give us a little more space. So I've got two files here. In the eval utilities file, I've defined a function called cal_es that takes references and hypotheses as its input arguments. And over in this other file, I'm working away and I want to reference that.

So if I type "result = cal_e", the response coming back — the code suggestion — understands that cal_es needs references and hypotheses as its input arguments. It knows that because it can see it in the other file; those two files are my context window. If I close that file and do it in just the single file, we no longer have that cal_es function in our context window. So if I did the same thing and typed "result = cal_es"…

Now CodeWhisperer is going to do its best to figure out what cal_es is by looking at its entire body of knowledge and basically making the best guess it can at what cal_es is — a very different result than when I had the other file open.
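A sketch of the cross-file setup being described: the demo only shows cal_es's signature, so the body below is a hypothetical placeholder scorer purely for illustration. With both files open in the editor, the suggestion in the second file can use the signature from the first:

```javascript
// --- eval_utilities.js (open in one editor pane) ---
// Hypothetical body: the demo defines only the signature
// cal_es(references, hypotheses); this placeholder scores the
// fraction of hypotheses that exactly match a reference.
function cal_es(references, hypotheses) {
  if (hypotheses.length === 0) return 0;
  const matches = hypotheses.filter((h) => references.includes(h)).length;
  return matches / hypotheses.length;
}

// --- main file (open in the other pane) ---
// With eval_utilities.js in the context window, typing "result = cal_e"
// can be completed with the correct argument names and order:
const references = ['the cat sat'];
const hypotheses = ['the cat sat', 'a dog ran'];
const result = cal_es(references, hypotheses);
```

With the utilities file closed, the same completion has to guess at cal_es from the name alone, which is the "very different result" the demo shows.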

So again, going back to best practices: if you're working on things that are related, have both those files open, because the context window spans all those open files — up to a degree. If you have 100 open files, it might be too much. Oh, excuse me.

Let's take this a little step further, because I've been trying to teach my son to code since he was four. Are you all familiar with Code Monkeys from MIT? You can teach your kids to code with Python, trying to get the monkey to the banana. We've been doing this since he was four; he's 13 now.

I convinced him to use ChatGPT begrudgingly, because the immediate reaction was: "Mom, I'm not doing any more of your coding examples. I'm tired of writing arrays. I'm not touching ChatGPT." Then he realized it didn't involve code, and he was super excited.

Only to realize a couple of months later that what I'd been trying to do since he was four is kind of useless if we're able to interface and write code in a natural language format. What's the point of teaching Python? Well, that's one of the things: CodeWhisperer writes Python, right? You remember they all like Python — but they don't have to; they have options now.

I'm just saying, I think they speak English too, because they're listening to us. English is one of those languages that was very popular — it's doing pretty well. I know more people who can speak English than Python. So you can write comments, and if you write really good comments, CodeWhisperer is gonna infer from them and be able to write code.

I don't know about you, but when I wrote code, I hated writing the comments. But the better the comments are, the less code you actually have to write — because we're able to infer and interpret what you're trying to get done from really good comments. Ok?

Let me take this a step further. More often than not, if you start using tools like this and you read the documentation — because you all read the documentation, I know this — you'll find that it says short, succinct prompts get the best result. Which is generally true: short and succinct, like "I wanna write a function to do a topological sort," and I get the result I want.

But what's interesting is that these language models are very, very capable. If you provide the input context in the right kind of structure, the foundation models can understand it and do something really compelling with it.

So in this case, I'm going to write a function, and I'm providing the requirements to the foundation model much more verbosely. Normally you wouldn't see this example in documentation, because we say short and succinct. But in fact, you can be verbose if you're clear — clarity is really the key here.

One of the things that's been really fascinating to watch — I spend a lot of time with our customers talking about how this works, and I watch them use it — is that learning how to prompt an AI is a skill all on its own. The way you think about approaching this kind of changes.

So there is a learning curve. The tool itself is really easy to use, but there's a learning curve to thinking about how to prompt the AI.

I was here last year when we first put CodeWhisperer into public preview, and I was sitting down with a customer doing a demo. I handed him the keyboard and said: hey, do whatever you want — go off script, do your thing. And he started typing, and what he was typing and what he was telling me he was doing didn't match. He thought what he was typing matched what he was saying to me, but it didn't.

So he was getting frustrated because he wasn't getting the results he expected — and it was because he wasn't typing what he was thinking; he was typing what he would have typed just writing code, without an AI helper. Learning how to prompt the model is a skill in and of itself.

Once I got him to just type what he was saying to me, he got the results he wanted. So in this case, I've got an example: I want to write this sum function that takes a string as an argument. Right there I'm off the beaten path, because most sum functions would take numbers as their input. So I'm giving it a very specific requirement for what I want it to do.

But then I've added five requirements for what the function should be able to do. If I just had the first line — write a sum function that takes a string and adds the things together — I'd get a very simple function. But because I've got all these requirements, I can get a great example. I might get "throw a not-implemented exception" — the easy way to go home early — or I can get this example that's quite verbose in terms of doing all the things my requirements defined.
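The on-screen requirements list isn't reproduced in the talk, so the five below are illustrative (the "greater than 1000" case comes from the tests mentioned later in the demo). The shape of the prompt, though, is exactly the technique being described: a verbose, clear comment block followed by the kind of implementation it can produce:

```javascript
// Write a sum function that takes a string of comma-separated numbers
// and adds the numbers together. (Illustrative requirements:)
// 1. An empty string returns 0.
// 2. A single number returns that number.
// 3. Multiple numbers are added together.
// 4. Numbers greater than 1000 are ignored.
// 5. Non-numeric characters throw an error.
function sum(input) {
  if (input === '') return 0;
  return input.split(',').reduce((total, part) => {
    const value = Number(part);
    if (Number.isNaN(value)) {
      // Note: casting to a number discards the original character,
      // which is why the generated test fails later in the demo.
      throw new Error('non-numeric characters are not allowed');
    }
    // requirement 4: ignore values over 1000
    return value > 1000 ? total : total + value;
  }, 0);
}
```

Each requirement line maps to a visible branch in the function, which is what makes verbose-but-clear prompts effective here.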

Ok — I mentioned we try to give you the best example first; sometimes we give you the shortest example first, like a not-implemented exception. But here, all I had to do was hit the arrow key to get over and find that, yup, sure enough, this is what I was looking for. It's even got the exceptions being thrown — all of those things.

What's really interesting, though, is that I can use this in other ways. I'm gonna save this — this is in TypeScript — so I'm gonna go to source and run… hold on a second, give me a second, I forgot what I'm doing. I want to compile the index.ts TypeScript file. Oh, simpler than I thought. Ok, so I'll just do that.

So I've compiled the TypeScript file — we saw the JavaScript file pop up there; that was just a little brain fart on my part. What I'm gonna do now is open up a test file. And I'm gonna have to go back here — you know what, I realized I didn't export this function.

So let me export the function — oops, I don't need all that — and let me just recompile. Now I'm ready to do something similar, where I've got basically the same text. You can imagine all I did here was copy and paste out of a requirements document — because you all get very good, clean requirements documents when you're working on code.

So you just copy and paste from the requirements document, and all I did was change the first line to say: hey, test the sum function. Now I can come in here and say, ok, I wanna do this — it's thinking — and then it comes up and says, ok, here's a describe block. I'll go ahead and just accept it.

What's interesting about this is that it automatically knows from the requirements that sum takes a string, and that it's got to test for a single number, test for numbers greater than 1000 — all these different tests. It didn't end my file, so I'll end my file, save that, and then — sorry, give me a second, let me do that again.

Um, I need to run... oops, not npm test, I want jest test. Ok, I think jest is simple, isn't it? Just do that. Yep, sure enough. So I got four passing and one failed test. So somewhere in between all of this, there's one failed test. And in fact, I can tell you the reason this test is failing, because I've done this demo before: it's the "non-numeric characters are not allowed" test.

It was expecting the letter "a" as part of the error message, because that's what the test passed in. But instead it got "not a number" as the message, which really had to do with the fact that I cast everything as numbers here. So there's a simple fix for this. I'm not going to worry about it here, but the idea is that I can do all of that and get accelerated.
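To make that failure concrete, here's a minimal sketch of that kind of bug and its fix. The demo's actual code is TypeScript and isn't shown in full, so the function name, input format, and messages below are assumptions, not the demo's code: the point is that validating tokens before casting lets the error message name the offending character, which is what the generated test expected.

```python
def sum_numbers(text: str) -> int:
    """Sum comma-separated numbers given as a string.

    Hypothetical stand-in for the demo's `sum` function.
    """
    total = 0
    for token in text.split(","):
        token = token.strip()
        # Validate before casting: the error can then name the bad
        # token ("a"), instead of a generic "not a number" from int().
        if not token.lstrip("-").isdigit():
            raise ValueError(f"non-numeric characters are not allowed: {token}")
        total += int(token)
    return total

print(sum_numbers("1,2,3"))  # 6
```

With validation up front, a test asserting that the message contains the offending character ("a") passes, which is the one-line class of fix the demo alludes to.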

And now I just had one little problem to find and fix, and it's actually relatively simple. It gives me all kinds of cool stuff. But there are other cool things you just witnessed in there. Again, really clear and clean prompts, even if they're verbose, can get you really complex results, so you can really accelerate yourself. It's not just, hey, give me a simple function; I can give a lot of requirements and I can get a lot of stuff back.

And in this case, unit tests are just code too, so I could use it to write some unit tests. Again, I'm providing the requirements. I'm saying, hey, this is what I want from you, and it's going to go generate that stuff for me. But there was one more interesting thing that happened.

I was in the command line, in the terminal in VS Code, and I typed in cw space ai. CodeWhisperer AI for the terminal, CodeWhisperer for the command line, is something we launched about a week ago, and it adds autocomplete and natural-language-to-bash-command translation for over 500 CLIs.

The terminal and CLIs are this fascinating space where the interface that you as a developer work with has not fundamentally changed in over 30 years. It's a black box with a blinking prompt. And the single biggest problem I personally have witnessed is that it's hard to remember all the commands and all the things you do.

The number of times I've walked through development teams' offices and seen git cheat sheets hanging on the wall, because nobody can remember all the git commands or all the other things, is the indication that this was needed. The idea is that I don't have to remember all the commands anymore. I can just say, hey, CodeWhisperer, here's what I'm trying to accomplish, and it will say, here's the command you need, and it will give me the option to then execute that command, or refine that command, or just cancel out, because maybe it didn't get it right, or maybe I expressed myself incorrectly.

So this works for the AWS CLI, this works for GitHub, this works for Mongo; I mean, 500 CLIs, autocomplete and natural language to command, which is super cool. Right back at you. So, do you want to do a full demo of the command line interface, or do you want to keep going?

Let me just show you a little bit more, since you asked. So here's the thing: I showed that as being integrated into the VS Code terminal. When you install CodeWhisperer for the command line... right now it's in preview. We announced it a week ago. It's in preview, it's available for macOS currently, and it will be available for Windows soon. And it integrates right into your terminal, as well as into the terminal that's embedded in your IDE.

So I've got a little CodeWhisperer icon up at the top that shows that I've got it installed. Whoops, that's not the one I want to hit; I want to hit settings. So I've got it installed, and it can do the CLI completions, it can do the English-to-command translations, it's got all kinds of getting-started info for me, and then it works right in my terminal, regardless of whether that's in the IDE or standalone.

So I can come in here and I can say cw ai. That's the prompt to say we're ready to give you a natural language command. And I could say, uh, reverse my last git commit, and then it comes up and gives me the command. You can apply it. It's ok. Like, it's the command line, you guys. I saved you like 100 hours this week alone. Just that. You know it's a compliment when they take pictures of your slides.

No, they took pictures. I'm telling you, they're appreciating what the importance of this is. No, it's like the developer version of a clap is taking pictures. Ok. Exactly. Wait, you fixed the command line.

So, but wait, there's more. Because what we've also heard from you guys is, wouldn't it be great if we could remediate and fix security stuff earlier on? Because we've noticed that most development teams don't always follow security best practices. They aren't necessarily that careful about what code they got from Stack Overflow or anywhere else. Can you help us write better code that's in compliance with what we want to do as an organization?

So now we're going to talk a little bit about reference tracking, making sure the code comes from a place that you want it to, and security scanning, which we initially had at GA. But wait, there's more: now you'll be able to remediate based on what the scan finds. Wait, don't tell them that. Oh, sorry. And then there's responsible AI, and then I need to stop talking.

So if you think about how you're creating governance and security at the developer level, we're trying to reimagine how to make that easier. And just like we've had a shared security model at AWS for, I don't know, like a decade, we're going even further in helping build the tools that you need for a governance model as well. There, better? That's great. Ok, thanks.

Well then. So the first thing on this slide is reference tracking. Reference tracking is a really important responsible AI attribute, and I'm going to show you how it works. But the important thing to understand is, again, if we were to sit down and write code together, and we were working on something like that sum function I was just writing, and Rory said, wait a minute, I've seen this before, I know we can just go copy and paste that code from this open source repo and we'll just use that... that's a perfectly fine way to do software development. It's a very common practice: hey, look, this problem has already been solved, let me just go get it from where it's already been solved and use it here.

The issue is, where it's been solved may be licensed by somebody; it's owned by somebody, there's a copyright on that code. Oftentimes, when you get code from an open source repo somewhere, it's got an MIT license or an Apache license or an ISC license or something associated with it, which puts some responsibility on you as a developer to do something. In some cases, it's simply attributing where you got that code from. In other cases, it could be more complex than that. And so reference tracking is all about replicating that developer experience of, hey, I know where this code came from and I want to do the right thing.

When we first started building CodeWhisperer, we felt that responsible AI was a really key pillar. If we're going to build an AI that is going to be collaborative with you in the coding process and write some of the code that you're working on, then we want to make sure that we are doing the right thing, so that you don't find yourself in a bad situation. And open source code is a really important part of that.

So you may have realized... I'm gonna delete this code that I had here, just to show you something that happened, because I didn't point it out when it happened. Some of you may have noticed it and some of you may not have, so I want to show you what happened.

So I'm going to generate the same code. The input prompt, the context, hasn't changed, so the output that I get is gonna be the same. And now what you'll notice is there's a little adornment right above the suggestion that says "Reference code under MIT. View full details in the CodeWhisperer reference log."

What this is saying is, hey, here's a code suggestion, but just know that some portion of it includes code that is a high-similarity match to code that's in an open source repo with a license associated to it. You can use it if you want, but just know that you're choosing to use some licensed open source code when you do. And so I say, ok, I'm willing to do that, and I hit Tab and accept it, and that little adornment goes away.

Now I forget all about it. So what I can do is, on the sidebar here, is the AWS Toolkit, and there's a Developer Tools section, and CodeWhisperer sits underneath that, and I can come to this little spot that says "Open code reference log". It shows up twice here because I did it once and didn't tell you about it, and then I deleted it and did it again where I told you about it.

What you're seeing is it telling you that, ok, at this time, you accepted a code recommendation that had this particular code in it, and this code had a high similarity match to this particular repo. So in this case, it's the Advent of Code from 2022, which is actually where I got this example from. It was a coding challenge last year during the Advent of Code, which is a series of daily coding challenges.

And so it's just saying, hey, the solution is already there, it's under an MIT license, you can use it if you want. And I can follow the link to the repo to look at it, to decide: is this code unique enough that I need to attribute it? Do whatever I need to do. But that's all there for me. And in fact, in this case, it was this line of code. So if I hover over that line of code, it tells me, hey, this is the referenced line of code, it's under an MIT license, so you can use it or not.

The other thing you can do is, if you're like, look, we don't use open source code in what we build, so I don't want any suggestions that have open source code references in them, filter those out, I don't want to see them. At the individual developer level, you can do that by coming into the CodeWhisperer settings (let me close this bottom thing), and there's a setting for "Include suggestions with code references".

So if I uncheck that, I won't get that suggestion anymore. So I uncheck it, I come back to that code, we'll delete it one more time, and then I go to generate the code. It gives me those not-implemented exceptions... oh, it gave me the reference code. Why is that? Why did that happen? I'll have to double-check why.

This setting also works at a policy level for an organization. So if you're using CodeWhisperer Professional, which is the version that we make for organizations, there's that same setting you can apply, and it applies organizationally. So you can set that as policy for all your developers.

All right, back to Rory.

All right. So this is what I gave away on the previous slide, which he was a little annoyed by: it's not enough to notice when there is a security vulnerability if you don't have the capacity to remediate it. So recently we announced being able to do AI-powered code remediation.

We currently support Java, Python, and JavaScript, with more languages to come. And this is able to resolve challenges much earlier in your development cycle, versus waiting for a security check after you're code complete.

We have new language support for security scanning, including Terraform. And I think this is kind of interesting to note: we've been talking a lot about application code here, but then we also talked about databases, and now I'm talking about infrastructure as code.

If you have a ubiquitous coding companion that is able to stretch across different personas, effectively you're kind of changing the role of developers. The developer of tomorrow is not exactly the same as the developer of the past, because we saw how you could create code from a natural language interface.

I had a customer two weeks ago say, oh, you mean English is the new coding language? And I was like, oh, he's right. So when you're thinking about how to prepare your organization, or how to prepare for the future, my recommendation is to start playing with this now, because this is how we're going to be able to be the plus-one to what it's delivering.

All right, let me show this in action. So, one comment on the last demo: I realized why I still got a code reference, and that's because I'm logged in to the Professional version, and by policy we have said we accept code suggestions with references. That's why I got it.

So my clicking in the IDE didn't change that setting, because that setting is a policy for my entire organization. I discovered that as I was doing this; I forgot which way I was logged in. You can log in as an individual user, and it's completely free to use; you log in with what's called a Builder ID, I think I mentioned that at the beginning. Or, in my case, if you take a look at this, I am logged in with my IAM Identity Center credentials, which are my organizational credentials, and so the ability to accept references is an organizational policy. This is using my Amazon credentials.

So let's talk about this security checking. I'm gonna show you how this works. You guys like Python, so here it is. I've just got a bunch of examples here where, you know, I've written a bunch of code, and at any point in time as a developer, I can invoke a security scan. And the reason for this is, again, going back to: if we're building an AI that is a collaborator with you in writing code, we do a lot of work to make sure that the code suggestions we're providing are good quality, don't include errors, don't include issues. But as soon as the human and the AI start collaborating, we can't guarantee what's happening. Not to say anybody here is going to do something they shouldn't, but we just don't know what's going to happen. And so one of our principles early on was: we want to put the tools in your hand to understand what's happening and what you can do. So we added in this security scanning capability. The idea here is that at any point in time as a developer, whether I'm using the free version of the product or the Professional version, I can come in here, into the AWS Toolkit, go into the CodeWhisperer section, and there's this option to run a security scan. So I'm just gonna click that, and this takes a little bit of time.

So it's telling me this is gonna be counted as one of my scans. If I'm using the free version of the product, I get 50 scans a month, and if I'm using the paid version of the product, I get 500 scans a month.

Now, I know what you're thinking: as developers, you probably want to run more than 500 security scans a month. You're excited about security scanning; you just want to do it constantly. So hopefully 500 is enough, and that's per developer, per month.

And so this completed, and it says I can, you know, see the scanned files. Oops. So here's my scanned file. It found a handful of things; it says, oh, there are some issues here. So I can pick one of these and say, oh, "unsanitized input is run as code". It tells me where that error is, and I can hover over it, and it tells me this is a common error, that unsanitized input is run as code.
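That finding typically points at patterns like calling `eval` on user input. As a hedged illustration (this is my own sketch of the vulnerability class, not the demo's code or CodeWhisperer's suggested fix), the standard library's `ast.literal_eval` parses literal values without executing arbitrary code:

```python
import ast

# Vulnerable pattern: eval() executes whatever the user typed,
# e.g. "__import__('os').system('...')" runs a shell command.
def parse_config_unsafe(user_input: str):
    return eval(user_input)  # would be flagged: unsanitized input run as code

# Safer pattern: literal_eval only accepts Python literals
# (numbers, strings, lists, dicts, ...) and raises on anything else.
def parse_config_safe(user_input: str):
    return ast.literal_eval(user_input)

print(parse_config_safe("{'retries': 3, 'verbose': True}"))
```

The safe version fails loudly on anything that isn't a plain literal, which is usually the behavior you want when the input came from outside your program.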

I can come down here, and it tells me a little bit about that: what the problem is and what it does, and I can view the details and see what it is. And as I look at these (I've got a bunch of different examples), I look here and say, oh, well, here's an insecure hashing problem in this case.

So, not for every issue you might find, but for some of the more common ones (and we're going to continue to expand how many of these we cover), you'll see that we tell you, here's what the problem is, and this line of code that you have should actually be this line of code.

In this case, we're saying, oh, you're using a SHA-1 hashing algorithm; you should really be using a SHA-256 hashing algorithm. So I just click "Apply fix", the fix is applied, and I can rerun my security scan. So I can go through this.
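The before-and-after of that kind of fix looks roughly like this in Python. The exact replacement CodeWhisperer generates isn't shown in the session, so treat this as an assumed sketch using the standard `hashlib` module:

```python
import hashlib

data = b"password-reset-token"

# Flagged: SHA-1 is considered broken for collision resistance
# and shouldn't be used for security-sensitive hashing.
weak_digest = hashlib.sha1(data).hexdigest()

# Applied fix: swap in SHA-256 from the same module; the call
# shape is identical, only the algorithm name changes.
strong_digest = hashlib.sha256(data).hexdigest()

print(len(weak_digest), len(strong_digest))  # 40 64
```

Because the call shape is the same, this is exactly the kind of one-line substitution that a one-click fix can apply safely.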

The idea here is one-click bug fixes for security vulnerabilities and common best practices that aren't being followed. I can run the security scan over and over at any given time.

So you see, it's telling me to rerun the security scan to see if the fix is applicable. Any given time you run the security scan, we'll give you up to four quick fixes. So let's say, in my case, I think there were five or six findings; four of them would have the quick fix available. As I go and apply those quick fixes, I can run it again, and the next group will get them.

And part of that is we don't want the scan to take too long while we're generating all the code for all the things that might have to be fixed. So we do it in small chunks: you get a few fixes and you can work on those, then a few more and you can work on those.

The goal here is that, as you work through your product development and your software development, the number of security bugs remediated goes up and the number shipped to production goes down, because it's as easy as a click to fix the security issues.

I kind of expected applause on that one. Am I not pretty good? I mean, it was smooth, and you know, it's kind of slick. This is a tough crowd. That's because most of the time developers don't necessarily care about security when you have to ask for it. I mean, really, we're begging for it. Don't make me beg.

So another cool thing that we announced in preview is customizations. Initially, back in the day (which was six months ago), we commonly thought, well, the bigger I can make the training model, the better results I can possibly get, because it's going to look at all this data and the most common patterns are going to be surfaced; I need massive amounts of data to train my model.

And yet we found that sometimes being more picky about what goes into your model yields different results, sometimes better results. So for those of you who work in organizations that have large corpuses of code, maybe the patterns of how you develop code aren't in line with the corpuses of code that exist in open source, or even at Amazon.

So being able to point CodeWhisperer at your code repositories, so it learns how you like to write code in order to predict and make those suggestions, can yield better results for you, because it's based on how your organization codes.

So that's customizations: being able to train the model based on your environment. So can you show us a little bit? Yeah, I can. And this is based on real-world issues, right?

We use CodeWhisperer internally at Amazon, but what we found when we first rolled it out is that it was marginally useful, marginally beneficial. The issue was that so much of our code depends on internal frameworks and internal libraries and internal APIs that we've developed that aren't published publicly.

So CodeWhisperer has no knowledge of them, and we needed to solve this. What we did is we built this customization capability, as we said, to point CodeWhisperer at some code and say, go reason over this code and understand it, so that the suggestions match what I'm trying to do.

And so I've got a relatively trivial example here, but just so you can see what happens, I'll talk through it so it makes a little bit of sense. This is really meant for an organization where you point at a large corpus of proprietary code, and we build this customization as a resource in your own AWS account.

So it doesn't go into the foundation model itself. Conceptually, think of it as a layer on top of the foundation model that is for your organization only, so nothing in it ever goes to any other customer.

And so here I can go into the AWS Toolkit, and you can see there's an option if I'm logged in with the Professional version (this is why I was logged in with the Professional version): I have an option to select a customization. It will pop up, and it tells me that right now I'm using the CodeWhisperer foundation, or default, model, but I have a couple of other customizations available to me.

These are customizations my organization created. I could have one, or I could have many in my organization, for different use cases or different teams or different purposes. And I'm going to stay with the foundation model for a second, just to show you this.

So I'm gonna use this hatchet functionality to do what's called a chop. This is about sending log files up to S3 and doing things with them. And I'm going to come down here and say I want to do a hatchet chop, and it says the best guess it has is that hatchet.chop takes a level enum as its first argument and a string as its second argument.

That's just a guess that it has, based on I'm not even sure what, but it's not accurate. It's close, but it's not correct. And so, using my customization... I can come in here and I'll choose this one.

When a customization is created, if I'm a user in that organization, then (much like I got a little pop-up here that said suggestions are now coming from this customization) I'll also get a notification that says a new customization is available, and I'll be able to go select it. Whether I can is based on whether or not I was given permission to access that customization.

So now that I have the customization running, I come down here and I want to do that chop functionality. And this is sort of subtle, but what you'll notice is it says hatchet.chop with the string as the first argument and the level as the second argument. Ok?

So the first one was a close guess, based probably on some context that it has from what we're doing. But the second one is actually accurate to how the API is built: hatchet.chop takes a string as its first argument and a level enum as its second argument. That was only possible because we have the customization built to understand that code base.
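To make the difference concrete, here's a hypothetical re-creation of that internal API. The real hatchet library is Amazon-internal and never shown, so every name and behavior below is an assumption made purely for illustration:

```python
from enum import Enum

class Level(Enum):
    INFO = "info"
    ERROR = "error"

def chop(message: str, level: Level = Level.INFO) -> str:
    """Hypothetical internal logging call: message first, level second.

    In the demo, the real version ships log data to S3; here we just
    format the log line so the argument order is visible.
    """
    return f"[{level.value}] {message}"

# Without the customization, CodeWhisperer guessed chop(level, message);
# with the customization, the suggestion matches the real signature.
print(chop("upload to S3 complete", Level.INFO))  # [info] upload to S3 complete
```

Argument order is exactly the kind of detail a foundation model can only guess at but a customization trained on the owning repository gets right.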

It's a relatively trivial example, but it shows the difference between what the generic, or foundation, model knows and what the customization knows. I always equate it to this: the foundation model is kind of like the developer that you hire into your company on their first day. They know a lot about software development, they know a lot about programming, but they know nothing about your internal APIs and your internal frameworks and your internal tools. With the customization, it's like that developer who's been at your company for three years. They know how all this stuff works, they know what the frameworks are, they know what the APIs are, and they know how to write code the way you write code. That's where the power of the customization comes from.

All right, a little applause, I thought. Yeah. Yeah. You're welcome. Hm. It's only Monday. But I mean, come on, we've got to train them to clap a little. I know.

All right. So the future is bright. What you saw today, these are a lot of foundational elements that are absolutely necessary in order to create innovation throughout the organization, right? How are you building faster and more securely, with better stuff? It's a theme. But let's take it a step further and imagine what else we can do with this. What are the next logical steps?

Once we have these foundational elements in place, with security at code inception, how much faster could we move? What if, you know, we're saying, ok, great, you can write code from natural language? That means you can spend your time on really complex and interesting code versus really boring stuff.

We're shifting who is doing what, by removing the undifferentiated heavy lifting of being a developer, in a way that, honestly, I've never seen before in how we're evolving. If you think about... ah, this is a good metaphor. This is a great metaphor.

This is my favorite one. In the 1960s, how many people could write assembly language, ones and zeros? Like 1,000? I wasn't born. I'm not sure. I know, I'm way older than you. Face cream. Uh, 5,000. But then we got Fortran, and Fortran was a layer of abstraction where you didn't have to be Neo reading the ones and zeros. And we got COBOL, and then we got C; we got bad C, we got good C, we got Python.

Each layer of abstraction opened the aperture of the number of people who were going to be able to build. How many people are going to build tomorrow? What are they gonna build? How are they going to do it? Is it gonna be your kids? Is it gonna be you?

So I hope this session inspires you. Yeah, this is a great foundation, but I'm even more excited about what's gonna happen in the next 6 to 12 months, as we democratize how to take advantage of generative AI in these foundational elements to build really, really cool stuff.

I'm going to say one thing, and then I'm going to say thank you, and then there's gonna be this thunderous applause for the people in the room next door; I think they missed something. But the idea here is that we wanted to introduce you to CodeWhisperer, but also introduce you to some of these best practices, and they keep evolving. This is such a new space that we're constantly finding new and better ways to inform these models and to provide context, and we're changing how the models work constantly.

We actually roll out new model updates pretty regularly. So the capability of the model improves, the way it responds improves, the amount of context it understands improves. So the best practices we're showing you here, in terms of how to do the prompting and how to get the results you want, are just the beginning, and we're going to continue to publish content that's going to help you understand how to do that.

So pay attention to our social feeds and our blogs and things like that as we publish more content. And with that, I want to say thank you. You guys figured it out. I like it. Have a great re:Invent. Thank you.
