What developers want from internal developer portals

李白的朋友王维

Article tags: aws, Amazon Web Services, technology, artificial intelligence, re:Invent 2023, generative AI, cloud services

Article link: https://blog.csdn.net/just2gooo/article/details/134867116

All right. So we will be talking about what developers want from IDPs. IDP obviously stands for a lot of different things. We've heard identity providers; in this context, we're talking about internal developer portals.

And so one of the big things we always think about is as you're rolling out an internal developer portal, how do you drive adoption? How do you drive engagement? How do you actually get people using the portal that you've built?

And so a little bit about me and the company that I'm here with. My name is Ganesh. I'm one of the co-founders and CTO of Cortex. Cortex is an internal developer portal. It's really designed to help developers take ownership over their software, their services, their components, the quality of those things, and the reliability, in a very efficient way without manual overhead.

And the whole goal of Cortex is to create a culture of continuous improvement, reliability, enabling productivity. And you'll hear me talk about continuous improvement a lot, and it's a theme that will keep coming up, because when you think about developer portals, what we really want to do is help development teams get a little bit better than they were yesterday and the day before and the day before. You keep doing that and you have a great organization at the end of it all.

So today's agenda, we're gonna be talking a little about the tech landscape. What's happened over the last five years, the last decade? How do internal developer portals help with some of this stuff? And how do we help developers improve their own work as a result of all this change?

So, the last five years of modernization. Why now? Why are we talking about developer portals now? We have had the introduction of microservices, right? Over the last 10 years, microservices have become a big thing, and there have been ups and downs in that journey. But generally people have headed in that direction of: let's build small reusable things, let's have them all work together. And it's enabled development teams to work in parallel, efficiently, and ship independently. In theory, assuming you're not building a distributed monolith, you can move faster as an organization.

Kubernetes has kind of accelerated this, where people can build very powerful internal platforms, deploy faster, operate things at scale, learn from each other. Obviously, cloud computing has been a big trend. It's made hosting cheaper and storage cheaper, again for the most part. And it's enabled teams to move faster, deploy quicker and get things to production faster.

And last but not least, developers just have more tooling available to them, right? Every time we see a problem, a company, a product, something spins up to help developers with those issues. Whether it's CI/CD, whether it's developer portals, better monitoring, alerting, productivity tools, all these kinds of things have come up to help engineering organizations ship better.

And so at the end of it all, engineering organizations have become more efficient in the aggregate, to a point. And so while microservices and Kubernetes and all this stuff have enabled parallel work, it's also made it easier to build even more complex software.

Now you have organizations with hundreds of services named after Game of Thrones characters or things that you don't know, and you get paged at 3am for something and you have no idea what it does. And so that volume and that complexity have made it very difficult to track ownership. You get paged for something. What does this thing do? Who owns it? Who do I talk to? A vulnerability comes up: who's accountable for fixing it? It's very hard to capture that at scale.

Cloud hosting, yes, makes things easier and cheaper. But because of that, we tend to use more of it. We're more free with deploying things to the cloud. And so now you deploy something, you haven't touched it in a year, maybe it's not even used anymore. But now you're paying for all that compute, for all that storage.

And obviously everyone today is thinking about cloud cost optimization. But with all the tools, the tool sprawl, the complexity, everything that we've now given developers to ideally make them better, we've now created more process and more noise, and the signal-to-noise ratio is horrible, right?

Every developer has got five different teams asking them to do things. They have a production checklist they need to go through to get something into production. Security is top of mind for everybody. And none of these things are bad. These are all good things that we should be doing as organizations, but developers have been bottlenecked. There are so many things we need to think about.

So how do we cut through that noise and how do we make things even more efficient? And so when we took a look at the landscape of what tools are out there today to help organizations, we kind of put it in this graph of: do tools help drive quality or do they help drive velocity? At the end of the day, those are the kinds of things we're thinking about as an organization.

And so we know one aspect of this is like monitoring tooling, APM, DevOps platforms. And so those are really focused on performance, right? It's like, are things performing the way we expect them to? Are we hitting or breaching our SLOs? Are things at runtime, in the wild, doing the things we expect them to do? It could also be things around CI/CD and stuff like that, where it's like, how do we move more quickly?

You have things like engineering intelligence and metrics. So it's more like the output of teams: are we shipping fast enough? Are we shipping often enough? How many incidents do we have? Are we getting bottlenecked in PR reviews? Stuff like that is interesting. It doesn't necessarily drive action, but these are tools that exist to help us kind of assess the situation of our organizations.

And finally, you have more static information about what engineering teams are doing. That can be Git, that can be Jira, that can be Confluence. So code, things we're working on, how things work. It's more static, it's more like human-driven knowledge.

And so each of these things kind of falls in different buckets, right? So when you think about CI/CD and things like that, they're really around quality, like, are we doing things the right way? They're not so much driving velocity. To some degree they are, but for the most part you think about them as quality gates.

But when you think about development platforms or APM and things like that, it's around quality, but you're also trying, you know, you're not driving as much velocity anymore. And so this is kind of the market today. It's kind of little bits and pieces of everything across the entire spectrum.

But what's missing is, like, how do we collect all this information and create a cohesive picture across that whole stack, right? I have a service, and all of its code is in Git. There are things being worked on that are being tracked in Jira. That thing is deployed to three different clusters, in prod and staging and dev. It's being monitored by Datadog and maybe something else and maybe something else. It's being deployed by some pipeline, and maybe you have different pipeline tools across the organization.

You have some spreadsheets somewhere tracking some of this information. You have Confluence tracking docs. So how do you create a cohesive picture to say: this thing is deployed here, and this is its documentation, and these are its monitors and its SLOs? And then we can use this picture to help us answer questions and drive more productivity that way.

And so that's where internal developer portals come in. Obviously, I wouldn't be giving this talk if it wasn't about developer portals. And so the idea behind developer portals is there's a couple of key requirements that a developer portal must have, right?

The first thing is a system of record. That's kind of how we describe it: a developer portal, at its core, is a system of record for all the information about things in your stack. So that's kind of that cohesive entity graph of everything in your engineering tooling. How do I create a single picture, a single source of truth for everything that's out there? The system of record, the cataloging, all that stuff should be part of your developer portal.
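To make the system-of-record idea concrete, here is a minimal sketch in Python. It is not Cortex's data model; the `Entity` shape, the field names, and the sample data are all assumptions made up for illustration. It only shows the general technique of merging per-tool records, keyed by service name, into one entity per service.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Entity:
    """One catalog entry assembled from several tools (hypothetical shape)."""
    name: str
    owner: Optional[str] = None
    repo: Optional[str] = None
    environments: list = field(default_factory=list)
    monitors: list = field(default_factory=list)


def build_catalog(git_repos, deployments, monitors, owners):
    """Merge per-tool records, keyed by service name, into one entity each."""
    catalog = {}

    def entity(name):
        # Create the entity the first time any tool mentions the service.
        return catalog.setdefault(name, Entity(name=name))

    for name, repo_url in git_repos.items():
        entity(name).repo = repo_url
    for name, env in deployments:
        entity(name).environments.append(env)
    for name, monitor in monitors:
        entity(name).monitors.append(monitor)
    for name, team in owners.items():
        entity(name).owner = team
    return catalog


# Sample inputs standing in for data pulled from Git, a deploy tool,
# a monitoring tool, and a team directory.
catalog = build_catalog(
    git_repos={"checkout": "git@example.com:shop/checkout.git"},
    deployments=[("checkout", "prod"), ("checkout", "staging"), ("search", "prod")],
    monitors=[("checkout", "p99-latency")],
    owners={"checkout": "payments-team"},
)
```

In this sample, `search` shows up only in the deployment data, so its entity exists but has no owner and no repository, which is exactly the kind of gap a single picture makes visible.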

The second thing is: how do I drive action? OK, I'm collecting all this data. What now? Why did I collect all this data? It's because I'm trying to drive best practices, standards. We have things that we care about as an organization. How do I get people moving against that? How do I get the organization tracking towards that?

So now you're telling people: here's what we have, here's what's good, and here's how we're doing against that. And finally, how do we make it easy for people to do that? Can we help them self-serve the things that we're asking them to do, if we're asking them to follow best practices and standards around production checklists or security or whatever that is?

Can we now give them self-serve tooling to make that a one-click process, or a couple of clicks, or whatever that may be? These are the three components that must be met in an IDP for it to be successful.

And so, you know, when you think about developer portals, I think one of the things that's happened in the market is you think a lot about like there's a platform team who's building an IDP and they're kind of working in isolation, thinking about like our consumers are the developers.

But, you know, we want to think about it from a product manager standpoint, right? So if you're a product manager on a platform team building an IDP, you would go talk to developers and do research. That's exactly what we did.

So we went out and we talked to a bunch of developers and engineers who are looking at developer portals. So not necessarily our customers, but just other developers who are thinking about IDPs. And we asked them about their pains and why they're looking at IDPs in the first place. And we have three key takeaways.

The first thing: it's impossible to find information in an organization. The second: it's unclear to developers what they should be focused on. Am I doing the stuff that's like feature development? Am I looking at productivity, reliability, security? What should I be working on?

And finally, templating, golden paths, self-serve. Things like golden paths are probably the number one thing that developers want: make it easy for me to do the things that you're asking me to do.

So the first thing, finding information. Most people say they can't find information. It's complexity, microservice sprawl, information sprawl. You have 10 different tools, like we talked about. Developers just wanna be able to go to one place and find information. What's out there? What does it do? Who owns it? Who can I talk to? Where is it deployed? Can I find all that information in a single place? And if I have that, then so many different processes can be built on top of that.

And so the number one thing that we hear is finding information as a pain. The second thing, and we're seeing this kind of trending over time, is that as information becomes more and more difficult to find, prioritization is becoming more important. Because now you have platform teams asking things of people, you have SRE teams asking things of people, you have security teams. OK, what do I do? Like, I've got so many things on my plate. What do I work on?

And so giving developers clear prioritization around the things that they should be focused on, what they should be working on, can be and should be one of the things that a developer portal solves for. Kind of secondary to that, we also hear that there's a lot of noise from many tools. You have your security tools yelling at you. Datadog and Splunk and PagerDuty and all these other tools are yelling at you about something.

You just have a lot of information and, again, going back to that signal-to-noise, there's a ton of things happening. So how do you assess all that information and make informed decisions about what you should be working on? And developers are telling us that noise from those tools is making it very difficult to figure out: what should I be working on?

And finally, templating, golden paths, self-serve. So an IDP should give people an easy way to get started with things, to scaffold things, to self-serve. It's like, hey, if you're asking me to do things, make it easy for me to do those things. So if I'm starting up something new, if I'm starting a new project, if I'm adding things to projects, can you make it easy for me to go and do those things?

And so where does Cortex fit in? How does a developer portal, how does Cortex, help you with some of these initiatives? And so the way we think about this is we actually take those core concepts from IDPs and we expand and build upon them. And so the system of record is the core foundation of this.

And so the way we think about the system of record is: how much of it can we automate? So can we pull information from your existing integrations, your tech stack? And instead of you telling us a bunch of information, can we go out and help you discover a lot of this information and pull it into a single place, while giving you the flexibility and the extensibility to define your data models the way you see fit?

So how you talk about your organization, how you talk about your taxonomy: giving that flexibility while underlying that with all the different integrations, so that we can find information for you and keep it up to date, which is one of the biggest problems with a system of record. I've got a bunch of data, and it's gone out of date. With Cortex, a lot of that is automated.

The second thing is, on top of that, we've built our scorecarding product: the ability to drive automated checks against the data, and using that data to drive things like package upgrades, platform migrations, production checklists, security standards, service standards, all the things you care about that, maybe today, you have a Confluence page tracking, or a spreadsheet or something.

How can we take all that and automate it, take it out of the hands of a few people, and drive that action across the organization, and then tell developers exactly what they should be focused on? And then finally, self-service. So can we give people tools to go and do those things themselves?

And so the first thing is a system of record. I think one of the things that people struggle with is just collecting all the information into their IDP. We talked to a lot of people who have tried an IDP before and they say: we are six months in, we still don't know what we have. There's tons of information out there. How do we collect all this stuff and make sense of it?

And so one of our philosophies is: look at your IDP as a pointer to your other tools. Can you create an entity graph by bringing in data from other tools to then create that cohesive experience? And so a lot of that is based on the integrations that we have. So we have over 50 integrations at this point, and a lot of it can be convention over configuration.

So you tell us, like, hey, we're using this standard tagging scheme. OK, well, boom, we've got this entire picture of your organization covered now, because we can discover so much information about it. So how much can we pull from your existing tools, so you're not keeping things manually updated?

For example, ownership. Ownership of software is such a hard problem because people join teams, leave teams. There are reorgs. It's not very stable as a concept in engineering teams. And so one of the things we do, as an example, is we integrate with the other IDPs, identity providers like Okta and Google Groups and all these things, but also HRIS tools like Workday, where we can sync all of your teams and team structures into Cortex.

So you can use that for ownership. So now if people leave or join the team, we're keeping that information up to date automatically so that you can have accurate ownership at any given time. Catalogs are automatically syncing. So we talked about ownership, documentation dependencies. All these things can be pulled in automatically from the variety of tools that you have.
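As a rough illustration of keeping ownership in sync, the sketch below reconciles catalog team membership against groups exported from an identity provider. The function name, the dictionary shapes, and the sample people are hypothetical; a real integration would read groups through the provider's own API, which is not shown here.

```python
def sync_ownership(catalog_teams, idp_groups):
    """Make catalog team membership match the identity provider's groups.

    Both arguments map a team name to a set of member names.
    Returns the updated team map and a list of (change, team, person) tuples.
    Teams that exist only in the catalog are left out of the result; a real
    sync would need a policy for those.
    """
    updated = {}
    changes = []
    for team, members in idp_groups.items():
        current = catalog_teams.get(team, set())
        for person in sorted(members - current):
            changes.append(("added", team, person))
        for person in sorted(current - members):
            changes.append(("removed", team, person))
        updated[team] = set(members)
    return updated, changes


# Sample data: one person left the team and one joined.
teams, changes = sync_ownership(
    catalog_teams={"payments": {"ana", "bo"}},
    idp_groups={"payments": {"bo", "cy"}},
)
```

Running the sync on a schedule, rather than editing team lists by hand, is what keeps the ownership data from going stale after a reorg.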

The really cool thing about Cortex is, once you have all this data in the catalog, we have our own DSL called Cortex Query Language, which allows you to then go and query that data. So you can now ask it very complicated questions, like: how many of my tier one services that are in production have been affected by a bad version of Log4j, have not been deployed in the last 24 hours, and don't have an owner associated with them?

And boom, there's a list of services that are at highest risk. When something like Log4j hits the next time, you're able to query that data and correlate data across different integrations, because you have a system of record that connects different data sources into a single place.
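The kind of cross-tool question described here can be sketched as a plain Python filter. This is not Cortex Query Language syntax, which the talk does not show; the record fields, the service names, and the list of bad versions are assumptions for illustration only.

```python
from datetime import datetime, timedelta

# Hypothetical catalog records, each combining data from several tools.
NOW = datetime(2023, 12, 1, 12, 0)
SERVICES = [
    {"name": "billing", "tier": 1, "env": "production", "owner": None,
     "deps": {"log4j-core": "2.14.1"}, "last_deploy": NOW - timedelta(days=30)},
    {"name": "search", "tier": 1, "env": "production", "owner": "search-team",
     "deps": {"log4j-core": "2.14.1"}, "last_deploy": NOW - timedelta(days=30)},
    {"name": "emails", "tier": 2, "env": "production", "owner": None,
     "deps": {"log4j-core": "2.14.1"}, "last_deploy": NOW - timedelta(days=2)},
    {"name": "cart", "tier": 1, "env": "production", "owner": None,
     "deps": {"log4j-core": "2.14.1"}, "last_deploy": NOW - timedelta(hours=3)},
]


def at_risk(services, now, library, bad_versions):
    """Tier-1 production services on a bad library version, with no deploy
    in the last 24 hours and no owner."""
    cutoff = now - timedelta(hours=24)
    return [
        s["name"]
        for s in services
        if s["tier"] == 1
        and s["env"] == "production"
        and s["deps"].get(library) in bad_versions
        and s["last_deploy"] < cutoff
        and s["owner"] is None
    ]


risky = at_risk(SERVICES, NOW, "log4j-core", {"2.14.1"})
```

Each condition draws on a different source (tiering, deploy history, dependency scanning, ownership), which is why the question is only answerable once those sources are joined in one place.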

The second thing in Cortex is prioritizing action. So there's a lot of different ways we do that. The first thing is the developer home page. So for a developer, if I log in, I get an immediate picture of the things that matter to me the most. So my services and the infrastructure that I own, the health of those services, my pull requests, the reviews that are assigned to me, my Jira tickets, my scorecards and action items.

So if I'm out of compliance with things, it'll tell me all the things that I should be focused on. So it's one place where I can go to get a lay of the land without having to go to six or seven different tools. And we send alerts for these things too. So if you have a Jira ticket or a PR or a notification or something you're out of compliance with, we can go out and reach out to developers for you.

So you don't have to be going and prodding and poking at people. With all this information in the catalog, the second thing is scorecarding. So if you have migrations, if you have audits, compliance standards, best practices, service maturity things, can you take those things and automate them away with all that information we're collecting from your tools, and immediately tell you, hey, 50% of your services are meeting the basic level of maturity, 30% of your services are in that middle, silver level of maturity, and so on and so forth?
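A scorecard of this kind can be sketched as a ladder of levels, where each level is a set of automated checks and a service holds the highest level whose checks, and all lower levels' checks, pass. The level names, checks, and service fields below are invented for illustration and are not Cortex's scorecard format.

```python
from collections import Counter

# Hypothetical maturity ladder, lowest level first.
LEVELS = [
    ("bronze", [lambda s: s.get("owner") is not None]),
    ("silver", [lambda s: s.get("has_runbook", False),
                lambda s: s.get("on_call", False)]),
    ("gold", [lambda s: s.get("slo_defined", False)]),
]


def level_of(service):
    """Highest level reached; stops at the first level with a failing check."""
    achieved = "none"
    for name, checks in LEVELS:
        if not all(check(service) for check in checks):
            break
        achieved = name
    return achieved


def distribution(services):
    """Percentage of services at each level, rounded to whole percent."""
    counts = Counter(level_of(s) for s in services)
    return {level: round(100 * n / len(services)) for level, n in counts.items()}


services = [
    {"name": "a", "owner": "team-a"},
    {"name": "b", "owner": "team-b", "has_runbook": True, "on_call": True},
    {"name": "c"},
    {"name": "d", "owner": "team-d", "has_runbook": True, "on_call": True,
     "slo_defined": True},
]
report = distribution(services)
```

Because every check is a function over catalog data, the same report can be recomputed automatically instead of being maintained by hand in a spreadsheet.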

So how do we gamify, how do we push developers to be better across the organization, but also track things like migrations across the board? And what's really cool about the platform is it allows you to build exemptions, because not every team is the same, not every service is the same. So being able to say, hey, this thing doesn't apply to me: can we build that into the product to make it really relevant to every single organization?

Reporting. Obviously, I think this is one of the things that everyone asks for. With all this information in there, can you report on this in different ways? And so being able to break down things like reliability or service maturity across product areas, functionalities, teams helps you make more informed decisions.

So instead of saying, oh my god, it looks like our monitoring sucks, you can say, like, hey, how many of our services are actually meeting our monitoring standards? Are those teams doing well on the other metrics that we care about? If not, what are the things we should be pushing them to do more? Do we have teams that are maybe falling behind, where we need to be prioritizing more tooling, more infrastructure, more resources?

And so now you're doing more data-driven analysis of your organization versus just trying to look at your dashboards and say, like, oh my god, what do I do next?

Self-service. This is kind of the next big theme of an IDP that we were talking about. And so one of the core components that we offer in Cortex is a scaffolder. So one of the things that happens, obviously, in more mature service organizations is you end up building a lot of new services, right? Across organizations, people are spinning things up all the time.

And so if you have these scorecards which are telling you, hey, here's all the things you should be doing, here's what good means, here's the golden path, how do we make that easy? It says, hey, by the way, if you want to meet all those standards and you're creating a new service, here's a one-click way to do that, that's gonna get you 80% of the way there already.

And so scaffolders are a great way of doing that. And the way we do this is by using an open source framework called Cookiecutter. If you're familiar with that, it's a boilerplate generation tool. So you can give developers a way to come in and say, I want to create a new Spring Boot service, fill out a bunch of information which you're fully customizing, and then Cortex will create the repository, generate the code, add it to the service catalog, kind of really control the end-to-end process.
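The scaffolding idea can be sketched with nothing but the Python standard library. This is a stand-in for the technique, not Cookiecutter itself (which renders Jinja2 templates from a template directory) and not Cortex's scaffolder; the template contents, file names, and the `scaffold` helper are hypothetical.

```python
import tempfile
from pathlib import Path
from string import Template

# Hypothetical project template: relative path -> file body with placeholders.
TEMPLATE = {
    "README.md": "# $service_name\n\nOwned by $owner.\n",
    "service.yaml": "name: $service_name\nowner: $owner\ntier: $tier\n",
}


def scaffold(target_dir, context):
    """Render every template file into a new directory named after the service."""
    root = Path(target_dir) / context["service_name"]
    root.mkdir(parents=True, exist_ok=False)  # refuse to overwrite a project
    for rel_path, body in TEMPLATE.items():
        (root / rel_path).write_text(Template(body).substitute(context))
    return root


# The "form" a developer fills out becomes the rendering context.
with tempfile.TemporaryDirectory() as tmp:
    project = scaffold(tmp, {"service_name": "orders",
                             "owner": "payments-team", "tier": "1"})
    readme = (project / "README.md").read_text()
    descriptor = (project / "service.yaml").read_text()
    created = sorted(p.name for p in project.iterdir())
```

Putting the ownership and tier fields into the generated descriptor is what lets a new project start out already meeting the catalog's basic standards.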

And at the end of it all, developers have a new project that's meeting all of your standards from day one. Then actions. So how do you help people orchestrate and do things from within the portal? Because if your portal is just a bunch of information, no one's going back in there. You need to put things in the portal where people are taking action, where they're doing things in the portal, to drive that feedback loop of adoption.

And so actions are a way to drive these kinds of triggers to external tools, to internal tools, to trigger some sort of workflow outside of the system. So basically HTTP payloads with no-code, low-code UIs. So for example, what we use this for internally, as we dogfood Cortex, is to deploy our backend to Google Cloud.

And so we have a button where developers can go in. They open the service catalog, they click on a service, they hit the deploy button. And behind the scenes it's going off, it's triggering the GitHub Actions workflow, it's orchestrating that and sending information back to Cortex. And so now developers have a one-click way to deploy tools, instead of telling every new developer that joins: hey, this is the pipeline you gotta go and run, here's the right parameters you should fill in.

It's like, go to Cortex, click the deploy button and you're good to go. So now you're building that self-serve into the platform itself.
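For a sense of what such an HTTP payload can look like, the sketch below builds a request against GitHub's REST endpoint for creating a workflow dispatch event, which starts a workflow that declares a `workflow_dispatch` trigger. How Cortex wires its deploy button internally is not described in the talk; the organization, repository, workflow file, inputs, and helper name here are placeholders.

```python
import json
import urllib.request


def build_deploy_request(owner, repo, workflow_file, ref, inputs, token):
    """Build (but do not send) a workflow-dispatch request for GitHub Actions."""
    url = (
        f"https://api.github.com/repos/{owner}/{repo}"
        f"/actions/workflows/{workflow_file}/dispatches"
    )
    body = json.dumps({"ref": ref, "inputs": inputs}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder values; a real call needs a token allowed to run workflows.
request = build_deploy_request(
    owner="example-org",
    repo="backend",
    workflow_file="deploy.yml",
    ref="main",
    inputs={"environment": "staging"},
    token="<token>",
)
# Passing `request` to urllib.request.urlopen would actually trigger the run.
```

The point of putting this behind a button is that the parameters a new developer would otherwise have to learn (which workflow, which ref, which inputs) are fixed once in the action definition.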

And then the last thing we'll touch on is plugins. So with Cortex, you can actually build on top of the platform with the plugin system. So you can build your own React and TypeScript components and UIs, embed them inside of the portal, embed them throughout the service catalog.

So if you want to build your own custom deploy tooling, if you want to build your own Kubernetes control plane and expose that, you can do that inside of the portal. So you're not restricted to those integrations we were talking about earlier. You can build on top of our platform and really extend its capabilities with anything you want.

So the way we think about this platform is: let us give you the plumbing. Don't do the boring stuff. Don't build an AWS integration, don't go and build an integration with Datadog or PagerDuty. Let us do those things, so you can spend time building plugins that are very custom to your workflows and the things that matter to you the most.

And last but not least, this is kind of how we think about the IDP space as a whole. So when you think about what people are trying to do with Cortex or IDPs, as we talked about at the start, it is continuous improvement. How do we get people to keep getting better over time? And so the way we've seen customers and people really roll out IDPs successfully is by breaking it down into this framework. You can't just jump straight to self-serve. That's gonna be setting yourself up for failure, because you don't know: what am I optimizing for? Why am I improving this particular experience?

And so the way you do that is you start by aggregating information. What information do we have? What's out there? How do I collect this information? Then, let's say, OK, now with this information, where are we today? What are the standards we care about? What is the baseline that we are at today, so that we can measure where we're going next?

Then using initiatives, you drive action, tell developers clear things they should be focused on to improve reliability or maturity or the things you care about. Once you start doing that, you'll see opportunities for enablement and optimization.

Say, hey, we're asking people to do these three things. That's really hard. Let's go build some self-serve experiences to make that easier. So now you're creating a feedback loop where you're aggregating information, you're assessing it, you're telling people what action to take, you're optimizing it, and you keep doing that.

And now all of a sudden you've created a flywheel with your IDP. And so with that, that is IDPs, adoption, everything that we think about, and a little bit about Cortex. Thank you all so much.
