Transforming deployment: Deep dive into Backstage on AWS

Latest recommended article published 2024-10-16 12:49:26

litaibai-04

Article tags: aws, Amazon Web Services, technology, artificial intelligence, re:Invent 2023, generative AI, cloud services

Article link: https://blog.csdn.net/littlechenlin/article/details/134784064

Hi, everyone. It's amazing to see everyone here. My name is Brian Landis. I am a senior solutions architect. I've actually been here for about 5.5 years, primarily supporting our global and industry accounts. But lately we've been really focusing on platform engineering, developer portals and, of course, Backstage. If you are new to re:Invent, welcome. I hope you're having a good time, maybe gambling or, you know, drinking a little bit. If that's not your style, there are tons of restaurants and good food and shows like Cirque du Soleil. And if you're anything like me and you hate Vegas, don't worry, we're almost out of here.

Before I really kick it off, I just really do want to thank you. I know how hard it is to get around re:Invent. I also wanted to say thank you to anybody that's in the Venetian right now. We're actually casting to them at the same time. So I just want to say wholeheartedly, thank you so much for being here.

So yeah, welcome to OPN404, Transforming deployment: Deep dive into Backstage. I'm here with my friend Mihai from Spotify. I'm very lucky to have him here. And then also my colleague Neil. Both of them will introduce themselves in a couple of minutes. Let me go into what we're going to be talking about.

Quite simply, this is basically the flow. I will ask Mihai to come up here and talk a little bit about how Spotify looks at their Backstage and, more importantly, talk about the developer experience. We'll then shift to Backstage on AWS, and then we'll talk about some integration patterns that we see.

So yeah, Mihai, would you like to take us backstage?

Thank you. Our human needs have evolved. In the grand scheme of things, on a scale of time, it feels like yesterday we were still learning to master and control fire, whereas nowadays we fuel rockets on their way to the stars. On the same scale of time, it feels like yesterday we were still building our first fishing tools, whereas nowadays we are crafting beautiful violins and pianos to perfect the sound. Or, to come closer to our industry, it feels like yesterday we were attaching a wire to a kite in a thunderstorm to discover electricity, only to witness the rise of AI a few centuries later. Our human needs have been constantly evolving. But so have our responsibilities.

Ladies and gentlemen, my name is Mihai. I'm an engineering manager in Backstage at Spotify, and I'm here to talk about how our industry is no exception to that evolution. The state of developer tooling, for that matter, followed suit. Developers who once focused primarily on coding find themselves nowadays responsible for so much more. The distribution of responsibilities has changed dramatically as well. Developers have to carry a much higher cognitive load: how to distribute, how to scale, how to manage databases, how to manage cloud spend. How does one remember all of that? This ever-changing transition mandates us to re-enable our developers to do what they do best, which is to innovate. But as the tools proliferated over the last decades, all these challenges compounded. The rise of IDPs, which stands for internal developer platforms, has addressed this problem. How, you may ask? Well, by reusing the tools to increase knowledge sharing, which leads to fewer interruptions, which leads to less cognitive load, which leads to less attrition and ultimately to better outcomes for the business.

So with that in mind, I think it's safe to say that IDPs act as a force multiplier. Now, platform engineering teams have become increasingly common in organizations, and so have internal developer portals. When one has to develop an application now, they not only have to worry about the application itself but also keep in mind things such as security constraints and compliance, CI/CD pipelines, observability, network controls and so on. The IDP is that self-service interface that's highlighted here in blue. It's that central layer of all the tooling within that stack. Backstage is that layer. It is an open source platform for building developer portals. It was created at Spotify to tame the chaos in the wake of a hyper-growth period in the history of our company. Since then, it has been open sourced, donated and graduated in the CNCF. The CNCF is the Cloud Native Computing Foundation, also home to Envoy and Kubernetes amongst others. It's seen exponential community growth since then, becoming one of the largest gatherings of IDP practitioners, which actually makes me reflect that Backstage, I don't think it's being created anymore. It's now being knitted, if you will, by a beautiful, ever-growing community.

Our core philosophy is simple: a single pane of glass to aggregate all the information, ownership matters, and empowerment through extensibility. Speaking of which, extensibility is the front-runner power of Backstage and follows a pizza model, if you will. The dough is the initial core Backstage open source repository, available to begin with. The toppings are individual plugins, each of which adds its own flavor to Backstage. The pizza itself is that freshly cooked and deployed single instance of Backstage that is maintained by one productivity team. To quickly recap, the holy grail of Backstage, if you will, consists of the following: Software Catalog for finding things, Software Templates for creating things, TechDocs for documenting things, and then the last one, Search. Well, one search to rule them all.

We recently ran an impact analysis at Spotify on our over 6,000 developers to try to better understand how Backstage is affecting them on a daily basis. And the results we're seeing are encouraging. We see more developer activity, we see more code changes deployed in less time, and our software is deployed more often and stays deployed for longer.

That said, let's have a look at what Backstage actually looks like at Spotify. One second, I need to turn on the oven. There we go. So I mentioned a little bit earlier that Backstage is powerful primarily due to its extensible plugin architecture. It was built with modern technologies and common frameworks such as React and Node.js, and it has also stayed cloud agnostic and vendor neutral.

Now, plugins are just added functionality and they can come from pretty much anywhere. If you have specific, bespoke needs in your organization, you can build those internal plugins yourself. If not, you can maybe reuse some of the plugins that are already available out there in the community. So it's safe to say, I suppose, that whether you are a pepperoni fan or a quattro formaggi fan or a different kind of pizza fan, there is a little bit for everyone, bundled as npm packages.

You can browse around the grocery store to find all of these plugins in the npm mirrors. In this particular case, I chose the catalog, so both the frontend and the backend plugins are available out there. But speaking of plugins, this is the Home page plugin, one of our open source plugins, highly customizable and incidentally our default welcome page that all the developers at Spotify see on a daily basis.

Now, one interesting fact about our internal instance is that despite it being the oldest instance there is, it's actually not the largest. At the last count that I did, we had roughly 45,000 components, which is roughly the total number of assets that we have indexed in the catalog. But there are organizations out there that have four or five times that number, which only proves that scalability-wise, Backstage actually goes far beyond our current needs.

But that said, let's go back to the Home plugin. I have, at first glance, all the information that I need in a given day, both for myself and for my team, which in this case is Infinite Sharks. Now, I know that may sound strange. We do embrace an autonomy culture at Spotify, so we have a deliciously complicated relationship with naming things. Maybe that's a story for another time, how we name them at Spotify. But for now, let's just assume we all love animals in Backstage.

What you see here in the frontend is essentially just a React app, and you can configure it the way you would normally expect to, with the caveat that you get a few things for free, right? You don't have to worry about navigation or how to consume APIs from other plugins, because all of that is standardized for you. Similarly, under the hood in the backend, you primarily have Node Express services, and you can define your endpoints the same way you would normally do.

Now, I wanted to showcase three specific plugins for you today. Let me increase the font a little bit. The first one is simple yet powerful, because it's a visual mapping of all the technologies that we prefer and recommend to our developers to use on a daily basis. Now, from a technical perspective, this is simple enough. It's just rendering some content in the frontend, so nothing complicated there. Yet by implementing a Backstage plugin for this, we are able to answer a question developers have on a daily basis: which technologies should they use?

Now, to build on the complexity, let's move on to the second plugin. I hope that renders well. That is Software Templates. I mentioned a little bit earlier that fragmentation is one thing that slows developers down. Well, it turns out that another thing that slows developers down is the actual process of creation. Imagine I just joined the company. I have a brilliant idea on how to, I don't know, implement a backend system that will optimize our playlist management system or something.

Now, instead of focusing on implementing that backend, my first reaction would be: how do I fit that into the existing ecosystem? How do I bring in the network controls? How do I make sure that I have the right CI/CD pipeline set up, and all of that? So suddenly, instead of focusing on what I actually have to do, I'm thinking about all that boilerplate code. To solve that, please meet Software Templates, whose purpose is to create repeatable templates: give a simple UI to developers, have them provide a bit of metadata on the component they want to build, and let automation do the heavy lifting for them.

Let's see what that looks like. We're going to see in a second some of the templates that we have at Spotify. What you see here are all the templates that we have internally, be they backends in various technologies, frontends, machine learning tooling, data pipelines, mobile components, you name it. But this is just a mere subset of what is available out there in the community. For that matter, some of the things that we have internally come from open source and vice versa.

Let's have a look at that. For the purpose of the demo today, I chose a backend in Go, just because, to be really fair, my skills in Go are very, very rusty nowadays. So I chose to create one of those. Any minute now. There we go. So this is the UI that I was mentioning a little bit earlier. I'm prompted to input a couple of initial values, simple things such as defining a component name and adding a description. I plan to own this repository, so it's going to go under my own account, and for the rest of the tooling, I'm just going to go with the defaults that Software Templates provides. I can always worry about these deployment configurations later on. And there it is. What happens now under the hood is that Software Templates... there we go, an error. Hopefully that's going to go well with an automatic retry.

Excellent. So what happens under the hood there is that all the prompted values that I've inputted earlier of that metadata plus the preloaded skeleton are being combined together in order to scaffold me the component that I needed. So there are two things that have happened:

Number one, a repository was created for me and number two, the associated component for that repository has been generated and registered in catalog so that I can find it later.
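The combination of prompted values and a preloaded skeleton can be pictured with a minimal sketch. This is not the Backstage scaffolder itself, which has its own templating engine and actions; it uses Python's `string.Template` as a stand-in, and the skeleton contents and the component name are hypothetical.

```python
from string import Template

# Hypothetical skeleton: file paths mapped to templated file contents.
SKELETON = {
    "README.md": "# $name\n\n$description\n",
    "catalog-info.yaml": "metadata:\n  name: $name\nspec:\n  owner: $owner\n",
}


def scaffold(skeleton: dict, values: dict) -> dict:
    """Render every skeleton file with the values collected from the form."""
    return {
        path: Template(body).substitute(values)
        for path, body in skeleton.items()
    }


# Values a developer might type into the template form (all made up).
files = scaffold(
    SKELETON,
    {
        "name": "playlist-optimizer",
        "description": "Go backend demo",
        "owner": "infinite-sharks",
    },
)

for path, content in files.items():
    print(f"--- {path} ---")
    print(content)
```

In the real flow, the rendered files would then be pushed to a new repository and the component registered in the catalog, which are the two outcomes described above.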

Let's see how this looks in GitHub. In this particular case, I have a fully fledged application here in GitHub, not only with the backend in Go that I was looking for, but also the Kubernetes manifests, so that I can worry about those later. I've got a Dockerfile in there, the generated build files for the CI/CD pipeline and, last but not least, a catalog-info.yaml file, which contains the metadata needed so that the catalog can actually register this later on.
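For readers who have not seen one, the shape of a catalog descriptor can be sketched as follows. The field names follow the Backstage Component descriptor format as I understand it (`apiVersion`, `kind`, `metadata`, `spec` with `type`, `lifecycle` and `owner`); the component name, owner and helper function are hypothetical. The sketch prints JSON, which is generally also valid YAML.

```python
import json


def component_descriptor(name, description, owner,
                         component_type="service", lifecycle="experimental"):
    """Build the metadata a catalog-info.yaml file would carry for one component."""
    return {
        "apiVersion": "backstage.io/v1alpha1",
        "kind": "Component",
        "metadata": {"name": name, "description": description},
        "spec": {"type": component_type, "lifecycle": lifecycle, "owner": owner},
    }


# Hypothetical values, matching what a template form might have collected.
descriptor = component_descriptor(
    name="playlist-optimizer",
    description="Go backend demo",
    owner="infinite-sharks",
)

print(json.dumps(descriptor, indent=2))
```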

Speaking of the catalog, let's see how that looks. This is how it looks in the catalog. So in a matter of minutes, I have a fully functional repository and I have the component associated with it ready to be used. So now I can go about implementing the initial backend that I had in mind without having to worry about all of that boilerplate code.

Now, last but not least, Spotify has its own list of plugins, if you will, that apply to every single component that we have across the fleet. One of those plugins is the Soundcheck plugin, whose purpose is to push and drive best practices across the fleet, not just in any way, but in a fun way. On the right-hand side, you'll see that we have multiple tracks, or programs, each of them corresponding to one tech health initiative, be that security compliance or test certification. For the purpose of the demo, I chose the test certification one, whose purpose is basically to improve quality by measuring how we monitor testing within the components.

You will notice that for this particular track we have badges and multiple levels, each of those levels in turn consists of individual checks, and each of those checks can be configured. Now, internal research has actually shown that by splitting into levels and into granular checks, there is a higher chance of adoption among developers. For those of you that have played RPG games in the last two decades, like I have, this should hardly be a surprise, because it's basically a gamified way to drive, one second, there we go, to drive ownership and responsibility for implementing these checks.
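The track, level and check structure can be illustrated with a small sketch. This is not Soundcheck's actual data model or API; the `Level` class, the level names and the check names are hypothetical, and the rule that a level only counts once all lower levels pass is an assumption made for the example.

```python
from dataclasses import dataclass


@dataclass
class Level:
    name: str
    checks: list  # names of the checks that must all pass for this level


def highest_level(levels, passed_checks):
    """Return the name of the highest level reached, or None.

    Levels are ordered from lowest to highest; a level is only reached
    when every lower level has been reached as well (assumed rule).
    """
    achieved = None
    for level in levels:
        if all(check in passed_checks for check in level.checks):
            achieved = level.name
        else:
            break
    return achieved


# Hypothetical "test certification" track.
TRACK = [
    Level("bronze", ["has-unit-tests"]),
    Level("silver", ["coverage-reported"]),
    Level("gold", ["flaky-tests-tracked"]),
]

print(highest_level(TRACK, {"has-unit-tests", "coverage-reported"}))
```

Splitting the track this way is what lets a team move up one small step at a time instead of facing one large change.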

And by bringing all of this playfulness, if you will, into Backstage, we not only make the developers aware of these changes, but we're also putting the tools in their hands to actually take control of that and drive it forward. Now, under the hood, what happens is that we actually have real-time access to the data so that we can, let me pause right there, actually track in real time how our tech health initiatives are going across our squads and how much time our squads and developers spend on them.

In this particular case, I chose a mobile dashboard that follows our mobile components on their road to following the best practices across the fleet. And you will notice there that despite us not yet being at 100%, if we scroll down a little bit, on the individual granular levels we have achieved that. So rather than going for a large change all at once, splitting into levels allowed us to gradually increment those changes and make sure that we put this value in front of the developers.

Now, in conclusion, what I want to leave you with is that at Spotify we've learned over all this time that the right set of tools in the hands of developers can actually act as a bicycle for the mind. So when we ended up open sourcing this project three years ago, we asked ourselves: what if, instead of it being in our hands, it was in everyone's hands? But for more on that, I'll hand it off to Brian.

Awesome. Thank you. Alright. Yeah, that was amazing. Thank you so much. I always really enjoy how Spotify homes in on the developer experience. And as you can see, having a full-blown Backstage built out and giving those self-service capabilities can truly change how companies work.

What I wanted to do is dive a little bit more into the AWS side and some of the customer stories. We truly do believe that these types of portals and platforms can change how your organization works. I'm going to talk a little bit about that, but I just want to mention, it's definitely a journey, right? I'm going to try to give a couple of good tips in a couple of minutes.

One thing I do want to mention is Toyota. I'm actually very lucky. I was very close to this group and I saw the entire journey. They started very small, with a small group of people, which you should always do. Then after about six months, and especially by treating their platform as a product, they were able to roll this out to, currently, the North America team.

But a big thing that I want to mention, just walking through, is that they're a very large enterprise. It used to take quarterly pushes for deployments. By using Backstage, and again, with that single pane of glass and reduced cognitive load, they can now seriously ship weekly, which is a huge change for that team.

Another big thing, too, is that while the central enterprise platform team was building out Backstage, they were working with individual business units, asking where they could drive value. And so, as you see on the next bullet point, they've reduced time by 8 to 12 weeks for about 25 teams, the equivalent of about 250k per team. So that's time spent spinning up infrastructure, getting your repo, and also hands on keyboard, right? Having your developers maybe focusing on something that they shouldn't have to, right?

Also, one big thing that we see a lot of customers do, and what was super important for TMNA, or Toyota, who you see on the screen, is that they wanted to really focus on cloud optimization and control. And so since they launched it in 2022, they're saving tons of money, and I'm really happy that they're able to do that. What I mean by that, by the way, is chargebacks and cost and usage reports. Any person that goes to Chauffeur, which is what they call their Backstage, can actually see how much they're spending per catalog item, right? So basically people can see where to optimize and so on.
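Showing spend per catalog item comes down to grouping billing line items by the catalog entity they belong to. The sketch below is a minimal illustration, not Toyota's implementation; it assumes each cost line item has already been tagged with an entity reference, and the record fields, entity names and amounts are all made up.

```python
from collections import defaultdict


def cost_per_entity(line_items):
    """Sum cost per catalog entity and sort from most to least expensive."""
    totals = defaultdict(float)
    for item in line_items:
        totals[item["entity"]] += item["cost_usd"]
    return dict(sorted(totals.items(), key=lambda kv: kv[1], reverse=True))


# Hypothetical line items, as they might come out of a cost and usage report
# once resources have been tagged with their catalog entity.
LINE_ITEMS = [
    {"entity": "component:checkout-api", "service": "AmazonECS", "cost_usd": 120.0},
    {"entity": "component:checkout-api", "service": "AmazonRDS", "cost_usd": 80.0},
    {"entity": "component:docs-site", "service": "AmazonS3", "cost_usd": 5.5},
]

report = cost_per_entity(LINE_ITEMS)
for entity, total in report.items():
    print(f"{entity}: ${total:.2f}")
```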

I also want to mention that, again, when we see these platforms being built out, they started very small. They found one team, one set of developers, and they started building out templates. Now they actually have over 40 approved templates, and these range from maybe just adding TechDocs, some documentation, to single-page applications using something like React or Angular, with all the best practices baked into the repository: Kinesis, Kubernetes, Outposts, Datadog, HashiCorp. Basically, it's helping these engineers self-service their capabilities with almost no human interaction, right? You can self-service and go about your day.

One more thing, just to lean a little bit more into how AWS integrates with Backstage and how we support it. We absolutely love the Backstage community. It's one of my favorite communities. It's grown exponentially. And I also just want to give a small shout-out to the Backstage maintainers. They've been spending a lot of time building.

What we have done is spend time building out an AWS implementation for using IAM credentials within your Backstage. So you can assign IAM roles to your Backstage to act on your behalf with your AWS resources while still maintaining the least-privilege approach that you should probably take with these things. This is how you're able to integrate with TechDocs, search and Kubernetes, and we've also extended it to things like DynamoDB, right, and other types of tooling. We have also been building out a couple of AWS plugins, specifically integrations with something like AWS Proton. Neil, who's on stage, actually helped build this, so he's definitely going to be talking a little bit more about how we think about approaching plugins.
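To make the least-privilege idea concrete, here is a minimal sketch of the two IAM policy documents a Backstage task role might need just to read TechDocs from S3. It assumes Backstage runs as an ECS task (so the trusted principal is `ecs-tasks.amazonaws.com`); the bucket name and helper functions are hypothetical, and a real deployment would add statements for whatever other services it integrates with.

```python
import json


def task_trust_policy():
    """Trust policy letting ECS tasks assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "ecs-tasks.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def techdocs_read_policy(bucket):
    """Permissions limited to reading one TechDocs bucket, nothing else."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
            {
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket}",
            },
        ],
    }


# "example-techdocs-bucket" is a placeholder name.
trust = task_trust_policy()
permissions = techdocs_read_policy("example-techdocs-bucket")
print(json.dumps(permissions, indent=2))
```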

With the AWS Proton service plugin, basically, this is extending Proton to Backstage to help with your deployments and upkeep of infrastructure. The last plugin that I'll mention is AWS Code Services. So, just like in that catalog-info.yaml that Mihai showed, you can add ARNs to it that point to CodeBuild, CodeDeploy and CodePipeline. Because again, what you're trying to establish for your developers is that single pane of glass, right? You want to keep them in that React app. You don't want 500 Chrome tabs, you want two, right? Maybe GitHub and your Backstage.
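As a rough illustration of what pointing a catalog entry at an AWS resource involves, the sketch below parses an ARN taken from an entity's annotations. The ARN layout (`arn:partition:service:region:account-id:resource`) is the standard one; the annotation key `example.com/codepipeline-arn`, the account ID and the pipeline name are hypothetical and are not the plugin's actual annotation names.

```python
from typing import NamedTuple


class Arn(NamedTuple):
    partition: str
    service: str
    region: str
    account_id: str
    resource: str


def parse_arn(value):
    """Split an ARN into its six colon-separated parts (the resource may contain colons)."""
    parts = value.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn":
        raise ValueError(f"not a valid ARN: {value!r}")
    return Arn(*parts[1:])


# Hypothetical annotations block from a catalog-info.yaml file.
annotations = {
    "example.com/codepipeline-arn":
        "arn:aws:codepipeline:us-east-1:111122223333:playlist-optimizer-pipeline",
}

pipeline = parse_arn(annotations["example.com/codepipeline-arn"])
print(pipeline.service, pipeline.region, pipeline.resource)
```

A plugin backend could use the service and region from the parsed ARN to decide which AWS API to call on the developer's behalf.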

OK. So when we start talking about how to build this out, I want to mention a couple of key things. When you are building out an internal developer platform, make sure that you have a landing zone. I know some folks may already know what that is, but it's a very specific way of separating your multi-account environment so that you have compliance, security and standardization at an account level, plus other mechanisms like an account vending machine, and so that you have a little bit more control of your overall environment.

One thing I want to mention is that we will be absolutely sticking to the shared services account. That's where Backstage should be hanging out, because what we're going to be able to do is spin up infrastructure in these other accounts that you see, with development, staging and production, and also work with AWS Organizations, which you see here. Control Tower helps here, or Terraform Landing Zone, or CloudFormation Landing Zone, but make sure you have this set up.

Alright, how do we deploy this thing? We really thought about this for a while. This is actually the most common pattern we see. A lot of times we run into folks that might be using ECS or EKS, and we really kicked the ball around to decide which one we wanted to show. But truly, this is a React app. It is a Node app, and a lot of times we think that simplicity is the best way to go.

So just to walk through it, the first thing that you would set up, or make sure that you have, is your DNS. It's Route 53 in this particular example, but you can use something like Infoblox or any other DNS. You would ingress into the VPC and hit an ELB, or Elastic Load Balancer, specifically an ALB, or Application Load Balancer. You could tie in Cognito here, and use ACM, which is our Certificate Manager, if you wanted to have Cognito as your authentication to make this simple. Again, we're just making it straightforward. Like I said, the common pattern that we see most is people using Elastic Container Service with AWS Fargate, which is our serverless compute, to run the actual Backstage.

Very simply, again, don't handle servers, that undifferentiated heavy lifting. That's why we suggest going with Fargate. You might have something like an Auto Scaling group to help scale depending on how many customers are coming into your environment. But nevertheless, we highly suggest sticking with something like ECS on Fargate to connect to the different parts of the functionality that Mihai just showed.

We highly suggest using Aurora or RDS for PostgreSQL, specifically trying to use serverless options to help keep the Postgres database highly available. Also, we highly recommend connecting OpenSearch to the search capabilities of Backstage. There's a lot that we see here out of the box. When you just launch Backstage, there's usually something running called Lunr. It's good, but it doesn't have the kind of enterprise search that a lot of folks may be used to with something like OpenSearch Service.

I also do want to mention, we are considering integrating with something like Kendra, so you can have multiple data sources, say Confluence, S3, SharePoint, and you can actually search everything through Backstage, right? Single pane of glass. Last but not least, good old S3. We highly recommend storing all your TechDocs in S3 for the high availability. You can turn on Intelligent-Tiering if you want to save some money. But again, I highly suggest storing everything in S3.

Quick note: you would have to set up your CI/CD pipeline so that it picks up your docs folder, compiles the docs and deploys them to S3. So you would have some CI/CD build here. And one last comment: definitely make sure you're using Secrets Manager and hiding your secrets, or using something like Vault.
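The pieces described above (Postgres, TechDocs in S3, secrets kept out of the repository) meet in the Backstage app config. The sketch below assembles the relevant fragment from environment-style values, on the assumption that the secrets were injected into the container's environment from Secrets Manager. The config keys (`backend.database.client` set to `pg`, `techdocs.builder` set to `external`, `techdocs.publisher.type` set to `awsS3`) reflect the Backstage documentation as I recall it and should be checked against the current docs; the variable names and values are hypothetical.

```python
REQUIRED = ("POSTGRES_HOST", "POSTGRES_USER", "POSTGRES_PASSWORD", "TECHDOCS_BUCKET")


def build_app_config(env):
    """Assemble a Backstage-style config fragment, failing fast on missing settings."""
    missing = [key for key in REQUIRED if not env.get(key)]
    if missing:
        raise KeyError(f"missing settings: {', '.join(missing)}")
    return {
        "backend": {
            "database": {
                "client": "pg",
                "connection": {
                    "host": env["POSTGRES_HOST"],
                    "port": int(env.get("POSTGRES_PORT", "5432")),
                    "user": env["POSTGRES_USER"],
                    "password": env["POSTGRES_PASSWORD"],
                },
            }
        },
        "techdocs": {
            # "external" means docs are built by CI/CD, not by Backstage itself.
            "builder": "external",
            "publisher": {
                "type": "awsS3",
                "awsS3": {"bucketName": env["TECHDOCS_BUCKET"]},
            },
        },
    }


# Placeholder values; in a real task these would come from the environment.
example_env = {
    "POSTGRES_HOST": "backstage-db.example.internal",
    "POSTGRES_USER": "backstage",
    "POSTGRES_PASSWORD": "injected-at-runtime",
    "TECHDOCS_BUCKET": "example-techdocs-bucket",
}

config = build_app_config(example_env)
print(config["techdocs"]["publisher"]["type"])
```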

We've been talking a lot about Backstage. I want to mention a bit more, which is a couple of additional personas that we see. In fact, a lot of times the question we receive most is: how do you actually do this? How do you actually get this going? So I'm going to try to bring that up a little bit.

As many of you may or may not know, platform engineering has really taken the world by storm. I think a good way to put it is basically having a focal point inside your organization and pushing people through that focal point to enable DevOps at scale, specifically through automation and self-service capabilities.

Normally, there will be a platform engineering team. They may already exist inside your company or startup. They're usually a builder tools team or developer experience team, maybe DevOps or SREs. We usually see a two-pizza team style, which is about 12 to maybe 14 folks working together on different aspects of the platform.

What they would do then is define that platform as much as they can in Terraform or any infrastructure as code, anything that would be scalable or meant for your organization. So this might be Kubernetes Helm charts or, again, ECS, what have you.

There is still an architecture review and a security review, right? You want to build these best practices into the actual platform, maybe using something like a Well-Architected review or the Well-Architected Framework to establish high availability, best practices and cost optimization.

But nevertheless, there's still going to be some type of review, and then you store everything in Git. Git operations, or GitOps, is your best friend. Also, there are a lot of new changes coming to Backstage where you can now have declarative infrastructure for your frontends as well, right? So the Backstage maintainers are heading this route.

This is important because you will have version control, you can have auditability, and if anything breaks or blows up, you can just pull and roll back.

When we look at the actual platform itself, I want to mention a couple of major things. Obviously, you would have some type of cloud that would help you on your cloud journey. This might be AWS, this might be on-prem, this might be edge, but nevertheless you have your cloud strategy and some type of compute or cloud at the bottom. We already talked about landing zones a little bit, but this is a good way to have segregation and control of your accounts and of how you would be deploying resources into them.

Next, we see tons of different types of orchestrators. You need some type of orchestrator to be able to spin up the two rows above it. So normally we'll see Kubernetes, maybe something with Crossplane. We will see Terraform, Cloud Development Kit, CloudFormation. These tools would be able to spin up infrastructure on the right-hand side.

I guess that would be your left. They are also able to connect into those two rows, right? So these orchestrators would provide tenancy, CI/CD, observability, ingress and so on.

Finally, as we've been talking about, Backstage as a developer portal would sit on top of this platform. As we joke around, it's kind of like putting lipstick on a pig, right? You're trying to make it pretty. You're trying to make a platform look nice. But more important is abstraction.

You want to abstract all this complexity away from your developers. What I normally also see is that when you get started on this journey, most folks will pick developers first. They will find one team and ask: how can we improve your life? They will take metrics to prove out value, not only to more teams that may join later, but maybe to your leadership, or maybe just for yourselves or the developers that are using it.

But definitely, they start setting up these feedback loop mechanisms and measuring them. Once you start onboarding more developers, what I also normally see is the builders start coming in. This is almost a catch-all, but this could be your DevOps, SREs, what have you, putting things in dashboards so that anybody that is technical can start getting more value out of these platforms.

I want to mention, too: do not forget about your data scientists. I have a joke with one of my buddies who's at one of my customers. He told me one time, "I didn't know I had to learn all this crap just to do my job." And basically what he wanted was just a Jupyter notebook. That's a key theme of what we're talking about here: instead of you learning 5,000 tools, how about I just extend my platform to you, right? And what we normally see is that data scientists really get a lot of value and a lot of joy from using one of these portals.

Last but not least, I'll make one last comment. We actually do see the entire business come to these Backstage portals. So don't forget, this could be for everybody. And we actually see people tie something like Bedrock or LLMs to the Backstage portal as well.

Last but not least, as you can see here, you'd be deploying to these different VPCs and accounts, or whatever you may deem fit. This is a busier slide, but I think it's one of the most important things that we talk about, and then Neil will definitely come on and get much more technical. But this is what we answer questions about a lot: again, how do you do this?

So we've been talking about architecture patterns and standardization. Toyota is a good example, right? When you need a single-page application, just use this template. Codify your templates, right? Standardize, maybe on Terraform or CloudFormation, slowly but surely, again using Handlebars and the scaffolder to help you there. We've already talked about the self-service portal tons of times, but more important are the next couple of bullets. Tracking your financial operations is huge.

No one wants to have a Lambda function that spends $50 billion, right? But it's also nice to just know what you're spending. We actually see customers gamifying this, saying: all right, if you want to deploy a Node app, you could pick serverless or you can pick containers, and this is the cost difference. You will actually see them coming to serverless. Obviously, this would be set up however you deem fit for your cloud strategy. But we'll also see, just like Mihai showed, that people gamify this, right?
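The "this is the cost difference" comparison shown to a developer at template time is simple arithmetic. The sketch below illustrates it with entirely hypothetical prices and workload numbers; these are not AWS list prices, and a real portal would pull rates from its own pricing data.

```python
def monthly_cost_serverless(requests, avg_seconds, price_per_request, price_per_second):
    """Pay-per-use model: cost scales with request count and run time."""
    return requests * (price_per_request + avg_seconds * price_per_second)


def monthly_cost_containers(tasks, hours, price_per_task_hour):
    """Always-on model: cost scales with the number of running tasks and hours."""
    return tasks * hours * price_per_task_hour


# All numbers below are made up for illustration.
serverless = monthly_cost_serverless(
    requests=1_000_000, avg_seconds=0.2,
    price_per_request=0.0000002, price_per_second=0.00001,
)
containers = monthly_cost_containers(tasks=2, hours=730, price_per_task_hour=0.05)

cheaper = "serverless" if serverless < containers else "containers"
print(f"serverless: ${serverless:.2f}, containers: ${containers:.2f}, cheaper: {cheaper}")
```

With a different traffic profile the comparison can flip, which is why showing both numbers side by side is more useful than a fixed recommendation.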

So whichever team saved the most money in a month: cool, you just won a free Echo Dot. But again, you're helping enable innovation while you're building out these platforms.

Another big thing is that your education and training should be part of these portals. What you're really trying to do is build a community. The last thing you want to do is spend six months building out a platform that no one wants to use because you shoved it onto them.

Sometimes we'll see folks say, for example: this gentleman right here made a cool new project and saved the company billions of dollars. He's a superstar. Well, cool, get him to write a blog post. Have a blog section on your home page and show that there's some awesome work being done within your company: built by your team, for your team, to solve your cloud problems. Next, organizational change and management.

What I mean by that is RBAC: certain teams getting only certain plugins or certain functionality. Again, we're talking about guardrails and paved roads, and then also, of course, future innovation, so that you and your teams can spin up AWS accounts and go play around. Some customers will slap 200 bucks or so on an account and, once it's done, delete the account. But again, you're helping enable innovation while you're building out these platforms.
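The idea of certain teams getting only certain plugins can be sketched in a few lines. This is a minimal illustration only: the team names, plugin names, and the `visible_plugins` helper are all hypothetical, and this is not the API of Backstage's permission framework.

```python
# Hypothetical mapping of teams to the portal plugins they are allowed to see.
TEAM_PLUGINS = {
    "data-science": {"catalog", "notebooks"},
    "payments": {"catalog", "ci-cd", "cost-insights"},
}

# Plugins every user gets, regardless of team (an assumption for this sketch).
DEFAULT_PLUGINS = {"catalog"}


def visible_plugins(teams):
    """Return the sorted union of plugins allowed for a user's teams."""
    allowed = set(DEFAULT_PLUGINS)
    for team in teams:
        # Unknown teams contribute nothing beyond the defaults.
        allowed |= TEAM_PLUGINS.get(team, set())
    return sorted(allowed)
```

A user in several teams simply gets the union, which keeps the guardrail additive rather than requiring explicit deny rules.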

For the three in orange, look at third-party engagement and at eliminating tech debt if possible while you're standardizing. It's not going to solve everything in a day, but you're working towards that. And then also, again, just general cloud management.

One last thing I want to mention is that we often see CCoEs, or cloud centers of excellence, use these platforms to get what they need done. Again, it takes a minute, but we find that this is super helpful for teams. And last but not least, do not forget about that community.

A lot of times we'll see customers who, instead of just having a normal yarn start for Backstage, will change the CSS framework and change everything so that it doesn't look like Backstage. For example, I think American Airlines calls theirs Runway and changed the colors to their scheme. Zalando has Sunrise. Again, Toyota has Chauffeur. So make this for your team and build it as a product: the platform as a product. And yeah, you do you.

All right, thanks, Brian.

Hey, everyone. My name is Neil Thompson. I'm a principal container specialist at AWS. For the last couple of years I've been working with folks like Brian to understand how AWS customers can leverage, and are leveraging, Backstage to achieve what both Mihai and Brian have shown you so far.

So in this section, we're going to take a deeper dive into some patterns that we've seen in talking to all these customers. Hopefully you'll understand how to achieve some of these things with AWS services and can take them back to your own platform initiatives.

Oh, I had the wrong button. So, in general, application source code isn't all you need for your workloads. Given that you're at re:Invent, I imagine that some if not all of your workloads require at least some AWS resources to run. That could be compute to run on, or dependencies like an RDS database, ElastiCache, or S3 buckets. You generally need something.

And as Mihai showed in his demo, there are a few different ways that Backstage can intersect with infrastructure. So what we're going to do is break down these areas a little more and look at what we've seen with customers.

First, when you bootstrap a new project as Mihai showed, how can we get infrastructure along with the application source code? Second, once we've got that infrastructure, how can we get the metadata back into the Backstage catalog with a feedback mechanism? And third, how can we then implement the plugins that Brian mentioned earlier to show things like CodePipeline in Backstage, zooming out a little to look at how we build them and how they reference that AWS infrastructure?

So when it comes to actually scaffolding infrastructure with software templates, we'll take a look at a really simple example to frame the problem. We'll start with an example similar to Mihai's. Maybe a developer wants to kick off a new Go microservice, and in our case we might deploy that to ECS. The first thing they'll do is come to Backstage, pick our carefully crafted software template from the platform team, and complete the software template process. The scaffolder will kick off and build out our Git repository. It will hopefully give us a fully working Go microservice with all of our best practices baked in: our logging, our configuration, all of that, ready to go.

But ideally, we also want something like infrastructure as code to go alongside that, and it can live within the same software template as our application source code. Now, there are lots of different ways you can go about doing this, and this is obviously one very simple example. Some of our customers like to store their IaC right alongside their source code; some like to put it in a different repository. There are unlimited ways to put this together, but we're just going to use this as a really simple example.
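To make the idea of one template carrying both application code and infrastructure as code concrete, here is a minimal sketch. It uses Python's standard `string.Template` as a stand-in and is not the scaffolder's own templating engine; the skeleton paths, the truncated file contents, and the `render_skeleton` helper are all hypothetical.

```python
from string import Template

# Hypothetical skeleton: each path and body may contain ${placeholders}.
# The bodies are deliberately truncated stubs, not complete files.
SKELETON = {
    "cmd/${name}/main.go": (
        "package main\n\n// ${name} is owned by ${owner}.\nfunc main() {}\n"
    ),
    "infra/main.tf": (
        'resource "aws_ecs_service" "${name}" {\n  name = "${name}"\n}\n'
    ),
    "catalog-info.yaml": (
        "apiVersion: backstage.io/v1alpha1\nkind: Component\n"
        "metadata:\n  name: ${name}\nspec:\n  owner: ${owner}\n"
    ),
}


def render_skeleton(skeleton, values):
    """Substitute values into every path and body of the skeleton.

    Raises KeyError if a placeholder has no matching value, so a
    half-filled template never reaches the new repository.
    """
    return {
        Template(path).substitute(values): Template(body).substitute(values)
        for path, body in skeleton.items()
    }


files = render_skeleton(SKELETON, {"name": "orders", "owner": "team-a"})
```

The point of the sketch is only that source code, infrastructure definitions, and catalog metadata come out of the same set of inputs, so they start life consistent with each other.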

So you could essentially build your software template with your code and something like a set of Terraform, so that out of the box the developers are ready to go. In our case, we'll use AWS CodePipeline and CodeBuild for our CI/CD process. You could use whatever tool you want here. Obviously, organizations tend to have pretty strong opinions on this area of their tech stack.

This is obviously going to build, test, and package our application, so maybe something like building a container image and pushing it to ECR. But once it's done that, we also want it to provision all of our AWS resources as well. Essentially it's just applying our Terraform, our infrastructure as code, to get ideally everything the developers need to go all the way to production without having to raise any more tickets or do anything else. Now again, we talk to a lot of customers who have separate pipelines for their infrastructure as code, or it runs in a separate process. That's fine; you can do that here. Again, we're just trying to keep this example really simple, but this is the most common pattern that we see.

The main thing we're touching on here is that when you're running through the scaffolder in Backstage, it's not provisioning resources itself. It is relying on your existing CI/CD tools and infrastructure as code to do that infrastructure provisioning. And although there's nothing technically stopping you from running your Terraform in the scaffolder, it's not designed for that repeatable infrastructure evolution as you constantly change your infrastructure.

So ideally Backstage is seeding this repository, and then your existing tools are doing a lot of the work. Now, a lot of the customers we've talked to find value, once that infrastructure has been provisioned, in helping their developers understand how it relates to the components that it belongs to.

In terms of what Mihai talked about with ownership, we want to model not just the software components and the infrastructure dependencies, but also the relationships between them, so we can understand what belongs where.

Backstage comes with a rich entity model out of the box. Earlier, we saw a Component in that catalog-info.yaml to model the component itself. We can also model resources and the relationships between them to get this relationship view essentially out of the box, if we do it correctly. This is built into Backstage, and if you build out these entity models correctly, you get this on every application page.

So how do we actually make this work? In this case, this is a Resource kind. We saw the Component kind earlier; this is an example of a Resource, and the intention of the Resource kind is exactly what we're using it for here.

Now, this isn't any sort of definitive example of how to use this; these are flexible entity models. But this is an example of what we've seen work. In this case, we're maybe modeling an RDS cluster that a team is using for their application.

We're going to use annotations to add extra metadata that might be useful for whatever purposes: maybe the ARN of the resource so we can uniquely identify it, the name of the database, the region. Whatever is useful to drive other capabilities in your Backstage, you can add here.

And if you've used Kubernetes a lot, and we see quite a lot of crossover, this will probably look very familiar in terms of overall structure; the annotations section and the kind especially jump out. The entity model is modeled, at least in spirit, after the Kubernetes model, so you'll feel pretty at home here if you've used Kubernetes.

Finally, at the bottom, and importantly, we make sure we capture those relationships. You can see here we've indicated the team that owns this resource, for that ownership model, and we've also specified a component. So we're saying that the Catalog API component owns this database. This, when combined with a component, is essentially how we build up that relationship view that we saw on the previous slide. And Backstage will just do the rest of the work for you.
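The slide itself is not reproduced in this transcript, so the following is only a sketch of what such a Resource entity might look like, built as a Python dictionary. The `apiVersion`, `kind`, `spec.type`, `spec.owner`, and `spec.dependencyOf` fields come from the Backstage entity model; the annotation keys under `example.com/`, the function name, and the choice of `dependencyOf` to express the link to the component are assumptions made for illustration.

```python
import json


def rds_resource_entity(name, owner, component, arn, region, db_name):
    """Build a Backstage Resource entity describing an RDS cluster."""
    return {
        "apiVersion": "backstage.io/v1alpha1",
        "kind": "Resource",
        "metadata": {
            "name": name,
            "annotations": {
                # Hypothetical annotation keys; pick whatever your plugins expect.
                "example.com/aws-arn": arn,
                "example.com/aws-region": region,
                "example.com/db-name": db_name,
            },
        },
        "spec": {
            "type": "database",
            # The team that owns this resource.
            "owner": owner,
            # Entity reference to the component that uses this resource.
            "dependencyOf": [f"component:{component}"],
        },
    }


entity = rds_resource_entity(
    name="catalog-api-db",
    owner="team-a",
    component="catalog-api",
    arn="arn:aws:rds:us-east-1:123456789012:cluster:catalog-api-db",
    region="us-east-1",
    db_name="catalog",
)
print(json.dumps(entity, indent=2))
```

The owner and the component reference are what let the catalog draw the relationship view; the annotations are free-form metadata for other tooling to read.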

Now, one of the first questions I had when I looked at Backstage was: how do I actually feed that back? It's all very well me showing you a bunch of YAML, and we saw how the scaffolder works. But if the infrastructure as code is creating the infrastructure, how do we know what to put back into Backstage? So this is essentially the question of how we close this feedback loop. We want to get that metadata back in, and we don't want to do it by hand.

One approach that we've seen work across a few different customers is essentially exporting some form of infrastructure metadata directly from the CI/CD system, either to an intermediate repository like an S3 bucket or maybe a DynamoDB table. Some people have an API that sits in the middle. But essentially we want to get that metadata out, either directly as the YAML or in some intermediate format. Once we do that, we can rely on core Backstage capabilities to do a lot of the work for us.

Backstage has a great set of features termed catalog discovery, where it can ingest metadata from a whole bunch of different places to build that catalog up from disparate data sources. It can pull from Git repositories, and it can pull from S3 buckets. There's a whole bunch of integrations that you can find in the Backstage documentation, and it supports S3 out of the box. So we can throw this metadata into S3, and Backstage will regularly pull all of it in. That means your catalog evolves along with your infrastructure, and you're not really doing a whole lot of work.
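A minimal sketch of the export step, assuming the pipeline runs Terraform and has the parsed result of `terraform output -json` available (a mapping of output names to objects with `value` and `sensitive` fields). The annotation prefix, the object key layout, and both function names are hypothetical. The upload is passed in as a callable so that the sketch stays runnable without AWS credentials, and the descriptor is serialized as JSON only because that is in the standard library; a real pipeline would more likely emit YAML.

```python
import json

ANNOTATION_PREFIX = "example.com/tf-"  # hypothetical annotation namespace


def entity_from_terraform_outputs(outputs, name, owner):
    """Turn parsed `terraform output -json` data into a Resource entity."""
    annotations = {}
    for key in sorted(outputs):
        output = outputs[key]
        # Never copy sensitive outputs into the catalog.
        if output.get("sensitive", False):
            continue
        annotations[ANNOTATION_PREFIX + key.replace("_", "-")] = str(output["value"])
    return {
        "apiVersion": "backstage.io/v1alpha1",
        "kind": "Resource",
        "metadata": {"name": name, "annotations": annotations},
        "spec": {"type": "database", "owner": owner},
    }


def export_entity(entity, put_object):
    """Serialize the entity and hand it to an uploader.

    `put_object(key, body)` is supplied by the caller, for example a thin
    wrapper around an S3 client in a real pipeline.
    """
    key = f"catalog/{entity['metadata']['name']}.yaml"
    put_object(key, json.dumps(entity, indent=2))
    return key


# Usage with an in-memory dict standing in for the bucket.
bucket = {}
outputs = {
    "db_arn": {"value": "arn:aws:rds:us-east-1:123456789012:cluster:demo", "sensitive": False},
    "db_password": {"value": "not-for-the-catalog", "sensitive": True},
}
entity = entity_from_terraform_outputs(outputs, name="demo-db", owner="team-a")
export_entity(entity, bucket.__setitem__)
```

Once objects like this land in a bucket that catalog discovery is configured to read, the catalog picks them up on its regular refresh, which is the feedback loop described above.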

So this is a relatively simple approach, but it seems to be working for a bunch of the customers that we've talked to. We also have a blog post that covers most of this approach. Traveloka is a customer of ours who is using Backstage in production, and they were kind enough to write up their experience of feeding metadata back into Backstage. In their case, they are exporting the OpenAPI specifications that they use with Amazon API Gateway into Backstage, to build a developer portal that basically builds itself from the infrastructure as code in their deployment process. At that point, developers in their organization can find APIs, understand the OpenAPI specifications, which Backstage can render using something like Swagger, and learn how to consume them. So this is an internal API portal for their developers.

The last area of infrastructure with Backstage that we're going to touch on today is something that could probably be a talk in itself: building plugins that display something about AWS resources in Backstage. Brian mentioned earlier that we have a couple of plugins that we've produced over the last couple of years, including one for Proton. This is our AWS Code Services plugin, specifically the CodePipeline integration. You can see here it's showing the execution history of a given CodePipeline for an application. It's available in a CI/CD tab in Backstage, which is one of the default tabs you get in the application view; it's just empty by default. So we provided an implementation of this tab that pulls from CodePipeline, so that teams can see the status of their CI/CD process without even leaving Backstage.

We also built an entity card, which is more of a summary view that can go on the overview page. This is a concrete example of how we're integrating Backstage with AWS services using plugins, and we have an endless number of ideas for other integrations. So if you have integrations that you've been wanting to see, please come and talk to us afterwards. We'd love to hear ideas. We've heard some surprising ones.

Now, in terms of how we actually relate the component to AWS infrastructure: generally, the pattern that we've been using is to leverage the annotations section of the component. It's a pretty simple approach, which probably surprises no one. It just uses the ARN of the CodePipeline. When the page renders, the Backstage plugin checks for the existence of this ARN. If it exists, it queries the API, and the React plugin we built for Backstage displays the component that you just saw.
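The annotation check described here can be sketched as follows. This is an illustration in Python rather than the plugin's actual code (which the talk describes as a React plugin), and the annotation key `example.com/codepipeline-arn` is hypothetical. The ARN layout used, `arn:partition:service:region:account-id:resource`, is the general AWS format.

```python
PIPELINE_ANNOTATION = "example.com/codepipeline-arn"  # hypothetical key


def parse_arn(arn):
    """Split an ARN into its six colon-separated parts."""
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn":
        raise ValueError(f"not a valid ARN: {arn!r}")
    _, partition, service, region, account, resource = parts
    return {
        "partition": partition,
        "service": service,
        "region": region,
        "account": account,
        "resource": resource,
    }


def pipeline_for(entity):
    """Return the parsed pipeline ARN for an entity, or None if not annotated.

    Returning None lets the caller render nothing (or an empty state)
    for components that have no pipeline configured.
    """
    annotations = entity.get("metadata", {}).get("annotations", {})
    arn = annotations.get(PIPELINE_ANNOTATION)
    if arn is None:
        return None
    parsed = parse_arn(arn)
    if parsed["service"] != "codepipeline":
        raise ValueError(f"annotation does not point at CodePipeline: {arn!r}")
    return parsed
```

Parsing the region and account out of the ARN is what tells the caller which regional endpoint and which account to query for execution history.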

Now, our approach for identifying AWS resources is very straightforward, and it certainly works for a lot of the initial use cases. But as we talked to more customers about this plugin, or the plugins in general, we figured out that it becomes quite hard as your infrastructure scales. A lot of people, even with CodePipeline, have multiple pipelines for a single application. We don't really want to comma-separate a bunch of ARNs in here.

So as your infrastructure grows and your AWS accounts get more complex, this starts to run into some scaling issues. Between the resource feedback mechanism that we touched on and the annotations, we've got a couple of areas where customers are having to put in work, and there's some friction with some of the approaches that we've taken ourselves.

So one of the things we are looking at at AWS right now, and talking to customers about, is whether there are more generic ways to publish integrations that solve both of these problems, using services like AWS Config, which can aggregate all of your resources across all of the AWS accounts in your organization into a central place. That would give us, for example, the ability to say, "I want all the CodePipelines that are associated with Catalog API in your AWS definitions." It would also give us the opportunity to build what's called a custom entity provider for Backstage. That way we can extend catalog discovery with specific AWS integrations that understand Config and can ingest straight from it, without any intermediate repositories or anything custom, which is one of the downsides of the approach that we saw earlier.
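The kind of lookup described here can be sketched against a simulated, already-aggregated inventory. The records below are made up for illustration and only imitate the shape of an aggregated resource listing; the tag key `component` and the function name are hypothetical. `AWS::CodePipeline::Pipeline` is the AWS Config resource type for pipelines.

```python
def pipelines_for_component(inventory, component, tag_key="component"):
    """Return ARNs of all pipelines tagged as belonging to a component."""
    return sorted(
        record["arn"]
        for record in inventory
        if record["resourceType"] == "AWS::CodePipeline::Pipeline"
        and record.get("tags", {}).get(tag_key) == component
    )


# Simulated inventory spanning two accounts (illustrative data only).
INVENTORY = [
    {
        "resourceType": "AWS::CodePipeline::Pipeline",
        "arn": "arn:aws:codepipeline:us-east-1:111122223333:catalog-api-build",
        "tags": {"component": "catalog-api"},
    },
    {
        "resourceType": "AWS::CodePipeline::Pipeline",
        "arn": "arn:aws:codepipeline:eu-west-1:444455556666:catalog-api-deploy",
        "tags": {"component": "catalog-api"},
    },
    {
        "resourceType": "AWS::S3::Bucket",
        "arn": "arn:aws:s3:::catalog-api-assets",
        "tags": {"component": "catalog-api"},
    },
]
```

Resolving resources by tag means a component with several pipelines needs no list of ARNs in its annotations; the tag on each pipeline is the only link to maintain.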

So again, if this is something that you've thought about or are wrestling with in Backstage, if you happen to be using it, we'd love to hear your experiences. We hear so many different perspectives that it would be great to hear yours.

The second pattern that we're going to look at is similar to what Mihai showed earlier: codifying checks and guidance. Basically, how can you surface your organization's feedback and best practices directly to developers so that they can align with them?

Now, with this idea of shift left that Mihai talked about, essentially giving developers responsibility, it really becomes critical to codify these guidelines so that your developers can understand what they have to implement and measure how well they are adhering to them, without someone else tracking it for them.

Examples of these can be code or documentation quality metrics, from things like SonarQube. They could be measures of whether you're building secure workloads. They could also be operational readiness checks: are you ready to go to production?

The great thing about Backstage is that it lets us contextualize this information from an app view. Instead of seeing account-wide or even organization-wide sets of information, it reduces it down to something more consumable.

There are a couple of different plugins available that you can use to build these sorts of checks into Backstage. The Tech Insights plugin that you can see on this screen is available in upstream Backstage; you can install it in your Backstage and get up and running implementing custom checks. The second plugin we'd suggest is Soundcheck, which Mihai showed earlier. In terms of governance on AWS, there are obviously a ton of services you can use to build these guardrails and best practices and have them implemented and checked. Where we see Backstage really enabling this is that contextualization.
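The general shape of codified checks, facts collected about a component and rules evaluated against them, can be sketched as below. This is a simplified illustration and not the Tech Insights or Soundcheck API; the fact names, check names, and thresholds are all hypothetical.

```python
# Each check is a named predicate over a dictionary of facts about one component.
CHECKS = {
    "has-owner": lambda facts: bool(facts.get("owner")),
    "has-docs": lambda facts: bool(facts.get("has_techdocs", False)),
    "coverage-at-least-80": lambda facts: facts.get("coverage", 0) >= 80,
}


def evaluate_checks(facts, checks=CHECKS):
    """Run every check against the facts and return pass/fail per check."""
    return {name: bool(check(facts)) for name, check in checks.items()}


def scorecard(results):
    """Summarize results as (passed, total) for a checklist-style view."""
    return sum(results.values()), len(results)


facts = {"owner": "team-a", "has_techdocs": False, "coverage": 91}
results = evaluate_checks(facts)
```

Because the facts are gathered per component, the results can be shown on that component's page, which is the contextualization the talk is describing.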

So instead of seeing all of the Security Hub checks for everything in your organization, can you open an application in Backstage and just see the Security Hub findings for that app? This could also apply to things like the AWS Well-Architected Tool, to identify architectural risks. It really just allows us to take all of that and pull it into that single pane of glass that the developers don't need to leave.

One example that we heard from customers, which kind of surprised us because I hadn't thought about it, was the Well-Architected Tool. They were asking to be able to use the Well-Architected Tool to define their architectural risk profiles, capture that from their workloads through questionnaires, and display that feedback directly in Backstage, either through something like a UI widget, which is just a custom entity card, or by integrating with something like Tech Insights so you get more of a checklist-style view.

This is, again, something that we're looking at. Whether it's Security Hub or the Well-Architected Tool, what services would you be interested in seeing here? We're hoping to publish plugins like this in the coming year, based on what we hear from the community.

The final area that we're going to touch on real quick is cost optimization. It's unsurprising to hear that customers are tired of having their FinOps teams chase developers around, telling them to spend less money, and they want to shift that left as well. In order to do that, we need to tell developers how much they're spending. But ideally, we also want to give them actionable ways to reduce the cost, and not just leave them to figure it out.

The Cost Insights plugin that you can see here has a bunch of capabilities, but the two main ones that we care about are exactly the ones we just talked about. It can show development teams how much they spend, and it can also give actionable recommendations, for example: you should migrate this workload to Graviton to save cost. Now, it's up to you to provide that data.

Backstage doesn't understand AWS, and it doesn't understand your infrastructure providers. So we need to implement the Cost Insights API, which essentially comes empty out of the box. Thankfully, this isn't too difficult. We get a TypeScript contract that we can fill in with our own set of data, and how to actually do that is something we're trying to provide more of an opinion on.

We'd love to be able to provide a default implementation of the Cost Insights API that uses something like the AWS Cost and Usage Reports to aggregate this cost information from all across your organization. At that point, Backstage can just send Athena queries. This is the same stack that's used by the Cost Intelligence Dashboards, if you're familiar with those from AWS; things like Kubecost and the broader CUR ecosystem use the same set of data.
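As a rough sketch of what "Backstage can just send Athena queries" might involve, the function below builds a per-service cost query for one application. The column names `line_item_product_code`, `line_item_unblended_cost`, and `line_item_usage_start_date` follow the Cost and Usage Report schema as exposed in Athena, and user tags appear as `resource_tags_user_<tag>` columns. The table name, the tag used to identify the application, and the function itself are assumptions for illustration, and the date comparison may need adjusting to your table's column types.

```python
import re
from datetime import date

_IDENTIFIER = re.compile(r"^[a-z0-9_]+$")
_TAG_VALUE = re.compile(r"^[A-Za-z0-9_.-]+$")


def cost_by_service_query(table, tag_column, component, start, end):
    """Build an Athena SQL string for cost per AWS service for one component.

    Inputs are validated against strict patterns because they are
    interpolated into the SQL text.
    """
    for identifier in (table, tag_column):
        if not _IDENTIFIER.match(identifier):
            raise ValueError(f"unsafe identifier: {identifier!r}")
    if not _TAG_VALUE.match(component):
        raise ValueError(f"unsafe tag value: {component!r}")
    return (
        "SELECT line_item_product_code AS service, "
        "SUM(line_item_unblended_cost) AS cost "
        f"FROM {table} "
        f"WHERE {tag_column} = '{component}' "
        f"AND line_item_usage_start_date >= DATE '{start.isoformat()}' "
        f"AND line_item_usage_start_date < DATE '{end.isoformat()}' "
        "GROUP BY line_item_product_code "
        "ORDER BY cost DESC"
    )


query = cost_by_service_query(
    table="cur_table",  # hypothetical table name
    tag_column="resource_tags_user_component",  # hypothetical tag column
    component="catalog-api",
    start=date(2023, 11, 1),
    end=date(2023, 12, 1),
)
```

The result of a query like this is the raw material a Cost Insights implementation would reshape into whatever the plugin's TypeScript contract expects.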

What you can end up with is something like this. We've prototyped this integration, showing things like a per-service breakdown for a given application over the preceding month, or nine months, or however long it is. This is just one example of how you can segment the data.

At the end of the day, you have the whole CUR to play with. You could segment by environment, by region, by account ID. It's really up to you. There are tons of possibilities: the flexibility of Cost Insights, combined with the ability to query whatever you want from the Cost and Usage Reports, is a pretty powerful combination.
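Segmenting by an arbitrary dimension is a small aggregation once the rows are in hand. The sketch below works over made-up rows; the field names `environment`, `region`, and `cost` and the function name are hypothetical and stand in for whatever columns your own query returns.

```python
from collections import defaultdict


def segment_costs(rows, dimension):
    """Total the cost of rows grouped by one dimension, largest first."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[dimension]] += row["cost"]
    return dict(sorted(totals.items(), key=lambda item: item[1], reverse=True))


# Illustrative rows only; real data would come from the cost query results.
ROWS = [
    {"environment": "prod", "region": "us-east-1", "cost": 10.0},
    {"environment": "prod", "region": "eu-west-1", "cost": 2.5},
    {"environment": "dev", "region": "us-east-1", "cost": 5.0},
]

by_environment = segment_costs(ROWS, "environment")
by_region = segment_costs(ROWS, "region")
```

The same rows answer different questions depending on the dimension chosen, which is the flexibility being described.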

So to recap what we've talked about today,

We looked at empowering innovation and experimentation using software templates, not just for your application source code but for your infrastructure as well.

We talked about reducing developer cognitive load by feeding things like the infrastructure, along with various other aspects, back into the Backstage catalog.

Surfacing best practices and organizational guidelines, and helping our developers adhere to them, is super critical, as is enabling cost optimization and crowdsourcing it to help your organization save money.

Some parting thoughts:

Backstage.io is the best place to go for documentation on Backstage itself and an overview of its features. That will inevitably lead you to GitHub, which is where you'll find the code.
We are also showing a demo at the Modern Applications and Open Source Zone, and there are a couple of different sessions. I know we've been teasing a lot of integrations in this session; that was partly on purpose. So if you want to see a lot of this stuff actually working, please drop by at one of these times and at least one of us will be there. Whether you just want to talk about Backstage or you want to see any of the things that we've touched on today, we're happy to dive a lot deeper than we've been able to in this session.

This is where you can find the Modern Applications and Open Source Zone. It's over in the Venetian, in the Expo hall. There's a section sort of in the back, right-ish, where you can find all sorts of experts on these topics.

It's in the same area as the Expo, so whenever the Expo is open, that's where you can find folks.

But with that, we are going to close off this section on Backstage. We're really glad that you spent the time with us today. If you have any questions, please come up to the front afterwards and we're happy to chat.

Thank you.