Scalable load balancing and security on AWS with HAProxy Fusion

I'm coming to you with an official HAProxy hat, although I'm not going to wear it for the entire presentation; that would be too much.

So I believe that HAProxy needs very little introduction. How many of you have heard of HAProxy before? A lot of people, then.

HAProxy is an open-source load balancer, and one of the most used software load balancers in the world today. I believe a lot of the companies here on this expo floor are using HAProxy for various use cases.

I want to talk to you about load balancing on AWS with HAProxy and our technology.

As I mentioned, my name is Jacob and I'm a Solution Architect at HAProxy. I'm not going to give you a live demo on my laptop, because I built the demo for you: if you go to reinventdemo.haproxy.com, you can follow along with a live picture of what I'm load balancing.

That screenshot is outdated; right now I'm load balancing around 1.5 million requests per second on six AWS instances, t3.xlarge ones. So take a look. And just to introduce HAProxy: we're the world's fastest software load balancer, and I have the benchmark to prove it.

Our founder, Willy, did a blog post in 2021 about using HAProxy on Amazon Graviton instances and load balanced over 2 million requests per second on a single instance. I don't recommend you actually run everything on a single instance, because some high availability would be recommended, right? But you certainly can.

So that's why we're one of the fastest load balancers in the world.

What I'm going to do today is talk a little bit about what comes up when I talk to customers about load balancing. Customers often have specific issues that result in latency on requests, because their requests pass through a lot of hops.

They also run high-scale load balancing with millions of requests per second, and they need to optimize for latency and ultimately protect traffic against security threats, right?

A lot of people using HAProxy come to us wanting to talk about these specific issues, and that's what I'm going to talk about today. It starts really simple, right?

Whether you're using AWS already or you're moving from a legacy on-premises load balancer, you say, hey, I'm going to build an app. It starts simple; how hard could it be? You build an application server, you put a load balancer in front of it, you spin it up on an EC2 instance, and the good news is that it usually works.

Those projects are usually extremely successful, and they become a beachhead for moving to the cloud. Then you suddenly wake up and you have more and more apps; you have them in multiple AWS regions, and you're actually moving traffic between those regions. But you also have on-premises locations that are still legacy data centers, and you keep deploying more and more apps, right?

So you become successful and you have more apps, which means you have even more stuff. And just like when a new manager starts at a company and is suddenly invited to your Slack: Kubernetes has entered the chat. Now not only do you have a lot of regions with applications, you have Kubernetes, and suddenly you're dealing with new Kubernetes issues. One of the things I think Kubernetes was made for is to make networking simpler; instead, it makes it more complicated.

So now you suddenly have a lot of Kubernetes clusters, and I have a lot of people calling and saying, hey, I now want to do Kubernetes multi-cluster, multi-region failover. I want a Kubernetes cluster in every AWS region, and I want to move traffic between them as my clusters fail or get upgraded, and things like that.

So you have all of these issues with these new regions and new applications. What that usually means is that your load balancers start to sprawl. The sprawl is fine at the start, but then you wake up and you have 3,000 load balancers, 3,000 applications, and databases everywhere, right?

And you have a lot of requests, so your request path, as I call it, grows even longer as you go through all of these layers.

First of all, you have a customer that requests some API, or your website, or your app, and you need to figure out how to get that traffic into your cluster. So you do IP anycast or Route 53 DNS load balancing to get traffic in.

Then you do DDoS protection, so you protect the traffic against DDoS; then you put a web application firewall in front of everything; then you do rate limiting; then you apply access control lists, like an IP access control list; and then you route the traffic, whether that's just sending it to some containers, clusters, or EC2 instances, or routing it like an API gateway based on specific rules.

And then high scale multiplies these problems, because you're not just doing all of this, you're doing it across millions of requests.

Raise your hand if you have an application doing hundreds of thousands of requests per second. I have customers that do hundreds of thousands of requests per second, or even 10 million requests per second across thousands of apps.

And on top of all of that, a lot of these are separate layers, right? You have DDoS protection as a layer, rate limiting as a layer, a web application firewall as a layer. These layers grow in vendors and costs, but also in hops and latency.

What that means for you is that you can't optimize your costs, because you're using a lot of resources for load balancing; management takes time; you're doing a lot of vendor management; and ultimately you're also increasing the latency of your requests, which matters especially if latency matters to you. If a request goes through many hops to get to your application and back, you're increasing its latency.

So we need to solve that, right? And we're going to solve it with HAProxy. How are we going to solve it? With simplification, and ultimately with speeding things up.

I have a friend who was one of the original SREs, back when SRE wasn't even a word yet. He used to say that nothing is harder to scale than complexity, and that to scale, you very often need to simplify.

So what we're going to do is simplify, first by collapsing the entire request path using HAProxy as the load balancer. We're basically going to create a single layer of load balancing with HAProxy, where we add all of the routing, all of the access control lists, all of the rate limiting, the web application firewall, and everything else into one layer that can run across all of your regions, along the lines of the sketch below.
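
To make that concrete, here is a minimal sketch of what such a collapsed edge layer could look like in HAProxy configuration. Everything in it, the certificate path, the allowlist file, the backend names and addresses, is a hypothetical placeholder rather than the demo's actual setup; rate limiting is sketched a bit later.

```
# Hypothetical collapsed edge layer: TLS termination, an IP access
# control list, and API-gateway-style routing in a single frontend.
frontend fe_edge
    bind :443 ssl crt /etc/haproxy/certs/site.pem
    # Access control: deny anything not in the allowlist file
    acl allowed_src src -f /etc/haproxy/allowlist.acl
    http-request deny if !allowed_src
    # Routing: API calls and web traffic go to different pools
    use_backend be_api if { path_beg /api }
    default_backend be_web

backend be_api
    server api1 10.0.1.10:8080 check
    server api2 10.0.1.11:8080 check

backend be_web
    server web1 10.0.2.10:8080 check
```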

We're obviously going to scale it with EC2 Auto Scaling groups, so we can actually scale it based on your traffic.

And we're going to send the traffic from your applications into it. I personally think that some of the DDoS layers AWS provides are really good, so I would always recommend keeping those. But otherwise we're going to simplify and collapse everything into a single layer, which means we're already asking the load balancer to do a lot.

In the end, everything is being load balanced, right? And now we can ask the load balancer to do even more. As soon as we implement some of this, people usually come and say, hey, can you now do bot management, and can you also do GeoIP detection so I can send traffic to the region closest to the user?

I can keep piling on things to add to the load balancer, sometimes even offloading single sign-on to it, so my application doesn't have to worry about that. The GeoIP case, for example, looks roughly like the sketch below.
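
One hedged way to wire up that GeoIP routing: HAProxy's map_ip converter looks the client address up in a CIDR-to-region map file, and the result picks a regional backend. The map file, region names, and backends here are all hypothetical.

```
# Hypothetical GeoIP routing using a CIDR-to-region map file.
# Each line of geoip.map looks like: 203.0.113.0/24 us-east-1
frontend fe_geo
    bind :443 ssl crt /etc/haproxy/certs/site.pem
    # Look up the client IP; fall back to us-east-1 if unmatched
    http-request set-var(txn.region) src,map_ip(/etc/haproxy/geoip.map,us-east-1)
    # Pick the backend dynamically from the resolved region
    use_backend be_%[var(txn.region)]

backend be_us-east-1
    server web1 10.0.1.10:8080 check

backend be_eu-west-1
    server web1 10.1.1.10:8080 check
```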

So again, we're going to simplify: we collapse everything into a single layer of HAProxy, which lets me speed up requests by reducing the number of hops and therefore the latency of my requests.

And ultimately, since, as we claim, HAProxy is really fast, by processing each request really fast as well.

So I'm going to take a minute to reintroduce HAProxy overall.

HAProxy is an open-source load balancer used by millions of companies. I work for HAProxy Technologies, the company behind it. First of all, it gives me very efficient performance: it scales to millions of requests per second, with built-in auto scaling and routing of requests like an API gateway, or just routing based on whatever I need, whether that's sending /api one place and /api/v2 somewhere else, as in the example below.
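
As a quick illustration of that routing, a minimal sketch; the backend names are made up:

```
# Hypothetical path-based routing: /api/v2 and /api go to
# different pools; everything else falls through to the web tier.
frontend fe_api
    bind :80
    # Match the most specific prefix first
    acl is_api_v2 path_beg /api/v2
    acl is_api    path_beg /api
    use_backend be_api_v2 if is_api_v2
    use_backend be_api_v1 if is_api
    default_backend be_web
```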

But then, as we simplified, we also added what I call multi-layered security. It lets me block attacks, for example from bots or specific attackers, on the load balancer itself; fingerprint requests to do bot management; and implement rate limiting. And not just simple rate limiting, where I don't want to allow more than 10 requests per second or something like that.

But even dynamic rate limiting: say, only rate limit if the number of requests during this hour today is higher than the 99th percentile of requests during the same hour yesterday. And ultimately, I can control access via IP address or almost anything else. A basic version of that rate limiting is sketched below.
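
Here is a minimal sketch of the basic, static form of that rate limiting, using a stick table to track per-client request rates. The threshold and table sizes are placeholders; the dynamic, percentile-based variant from the talk would compare against a limit computed outside HAProxy from historical traffic and fed back in.

```
# Hypothetical per-client rate limiting with a stick table.
frontend fe_ratelimit
    bind :80
    # Track each source IP's request rate over a 10-second window
    stick-table type ip size 1m expire 1m store http_req_rate(10s)
    http-request track-sc0 src
    # Static threshold as a stand-in for the dynamic, percentile-based
    # limit, which would be computed externally and pushed in
    http-request deny deny_status 429 if { sc_http_req_rate(0) gt 100 }
    default_backend be_app
```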

And some of you will recognize this use case of asking the load balancer to do more: I've talked to a customer who did load balancing between VPCs and wanted to rate limit based on the VPC ID the request came from.

So we implemented extracting the VPC ID from the proxy protocol header sent by the AWS NLB, and we rate limited based on that instead of the source IP, because the source IP wasn't usable: the VPCs could have the same subnets, so we couldn't tell sources apart by IP. A sketch of that idea follows.
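
A heavily hedged sketch of that idea: it assumes HAProxy 2.9+'s fc_pp_tlv sample fetch and AWS's custom proxy protocol v2 TLV (type 0xEA), which carries the VPC endpoint ID. I haven't reproduced the customer's exact configuration, so treat every detail as illustrative.

```
# Hypothetical: rate limit by the VPC endpoint ID delivered in the
# proxy protocol v2 TLV from an AWS NLB, instead of the source IP.
frontend fe_vpce
    bind :8443 accept-proxy
    # Track request rates per TLV value (first byte is an AWS subtype)
    stick-table type string len 64 size 10k expire 30s store http_req_rate(10s)
    http-request set-var(txn.vpce) fc_pp_tlv(0xEA)
    http-request track-sc0 var(txn.vpce)
    http-request deny deny_status 429 if { sc_http_req_rate(0) gt 100 }
    default_backend be_api
```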

Now you've collapsed everything into a single layer of load balancing, but you still have many instances of that layer across all of your regions. So what we often do is centralize the management of all of these load balancers through what we call the HAProxy Fusion control plane, a single control plane that lets me manage all of my load balancers at once.

I can simplify management by implementing an Auto Scaling group and registering any new load balancers that come in. I can say, hey, spin up a new EC2 instance or container running a load balancer, connect it to the control plane, and it will inherit all of the configuration, be added to the traffic distribution, and start serving traffic.

I don't need to do configuration management on top of that, because Fusion does a lot of it for you: it configures all of the nodes and prevents config drift, so nobody can log into a load balancer and change its configuration on their own, leaving you wondering why one of your 100 load balancers is serving traffic differently. It also promotes what I call, and what's going to be relevant a little later, immutability of infrastructure.

Why upgrade all of my load balancers in place when I should actually replace them? I should make it so simple to add a new load balancer to my load balancing layer, put it in my cluster, and spin down the old ones, just like I would in Kubernetes.

I would add a pod and spin down the old one. And ultimately, when I'm running multiple millions of requests per second, it's hard for me to figure out where all of these requests are going, because sure, I can send all of the traffic data into some log monitoring tool, but it might not be HAProxy-aware.

So I can actually take the control plane and put all of the observability in there, because it is HAProxy-aware. And on top of that, and I was at a different conference a couple of weeks ago where this was the topic of the week, there's multi-cluster routing with Kubernetes, where customers say, hey, I want two Kubernetes clusters, each with a different version of Kubernetes, or a different version of my application, or an A/B test.

And I want to load balance traffic between them, right? So now I can do service discovery to find the pods of all of my applications from the load balancer, send traffic to them automatically, and do something like a blue-green deployment between two Kubernetes clusters, along the lines of the sketch below.
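
A minimal sketch of such a blue-green split, assuming the pod endpoints have already been discovered (by Fusion's service discovery or otherwise); the addresses and the 90/10 split are illustrative:

```
# Hypothetical blue-green split across two Kubernetes clusters.
backend be_checkout
    balance roundrobin
    # "Blue" cluster pods take 90% of the traffic
    server blue1 10.0.1.21:8080 weight 90 check
    server blue2 10.0.1.22:8080 weight 90 check
    # "Green" cluster pods take 10% while the new version bakes
    server green1 10.1.1.21:8080 weight 10 check
    server green2 10.1.1.22:8080 weight 10 check
```

Shifting the weights over time moves traffic gradually from the old cluster to the new one, and dropping the blue servers' weights to zero completes the cutover without a reload of anything but the load balancer layer.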

Just to recap: what we talked about is using HAProxy as the load balancer to get better performance by reducing the number of hops and using a faster load balancer; simplifying the management; optimizing the observability of the whole load balancing layer; and ultimately using all of that to reduce request latency, reduce the recovery time objective if something fails, and centralize the management of my load balancing.

So thank you so much for coming today. I would love to continue the conversation; we're at booth 251, and I'm happy to answer any questions as well. And I saw some people raise their hands, so I have a t-shirt for you if you're using HAProxy already. Please come by, and please complete the session survey in the app as well.
