Run a hybrid cloud environment at the edge with Amazon ECS Anywhere

I know that this session runs from 7:45 to 8:45, so I appreciate your time and attendance on this topic and hope you get a lot out of it.

Today, I want to talk about building a hybrid cloud environment at the edge with Amazon ECS Anywhere. We're going to go through basically why, and then how. So let me start out by talking first about what hybrid cloud is, and I want to present it as the basic concept of your hardware combined with the power of the cloud.

So for example, in this diagram, you see you could have an on-premises server, you could have a point of sale, you could even have a vehicle that's internet connected, or you could just have Internet of Things devices; I've used Raspberry Pis before. These devices run your code, run your application, but you want more: you want to be able to use the power of cloud services as well, like Amazon EC2, AWS Fargate, or even AWS Lambda, and benefit from the cloud too.

Now, there are a few reasons why. The first reason I want to talk about is capital expenditure investment. This is a classic one: when you've invested in on-premises hardware, like a data center, you've already spent money on it, and you want a return on your investment. So even if you do have a plan to move to the cloud, or you like cloud services, you're never going to want to just abandon your on-premises data center before it actually breaks down; you want to get your value out of it.

But capital expenditure investment can be a double-edged sword, because sometimes you don't want to spend more money. If you reach a point where your data center isn't sized big enough, you're trying to think: do I really want to spend a bunch more upfront money on on-premises servers? Maybe I want to make use of the cloud just to make sure that this growth in capacity requirements is going to last before I invest in buying on-premises hardware for the long term.

Now, another reason for hybrid cloud is compliance requirements. This is a classic one in any industry that has sensitive data, for example health care data, or an industry that operates in European regions, where the General Data Protection Regulation requires that data be kept in the same physical region as the end user, even down to the country level in some cases.

So you may have a requirement that your data be handled in a certain way or kept in a certain location, which requires you to have a hybrid approach.

Or AWS may not have infrastructure in your particular locale. Data gravity and proximity is another reason: sometimes the data is just so big, so heavy, and there's so much of it that moving it to and from the cloud is not feasible.

We've seen this with some clients that have absolute petabytes of data on premises, maybe even on old tapes or something like that, and you're not necessarily going to want to digitize all of that and move it to the cloud.

Alternatively, you have a requirement to work with that data on premises. For example, a video rendering studio; this is a classic one. If you are building a big-budget film these days, you're probably using a 3D rendering studio, and the artists are working with asset files that could be gigabytes in size, maybe even tens of gigabytes, and they're working with that locally on their desktop workstations, oftentimes on premises.

So they have an asset store; they need to pull in those textures, pull in those 3D models, run test renders, and they do that locally inside the studio. But then they also want to be able to render segments of the finished film, and they want to render those quickly.

So now we're able to combine the power of the cloud for rendering the end product with the ability to keep data and assets and do some local processing as well. Classic example of hybrid cloud.

And the last benefit of hybrid cloud I want to talk about is consistent operations. There are many different ways of doing hybrid cloud, and the easiest, sort of first way you might think about it is: well, I'll treat the cloud as one thing and I'll treat the on-premises operations as a different thing.

But the reality is that hybrid cloud gives you a fantastic ability to use consistent operations. You can actually have the same API locally as you do in the cloud, and you can create a system that allows you to move workloads between one and the other at will, with minimal extra complexity. We'll talk about how that works.

Now with the concepts of hybrid cloud introduced, we need to talk about why containers.

And there are a few key benefits of containers I like to talk about, the first being velocity. Containers increase your ability to deliver an application quickly by giving you a prebuilt base to start from. The classic example is installing your runtime: I was thinking back a couple of months ago, and I realized that it had been several years, probably five to seven years, since I bothered actually installing Node.js or Python directly on a host, because now everything is inside of a container.

The host operating system is just an empty shell into which I'm able to bring a prebuilt container image and benefit from the velocity of someone else having prepared that container image for me. I didn't have to set up installers or create a system for deploying software onto that host; I just brought my container image.

Reduced risk is another important reason for containers.

First of all, containers automate builds, and we know that automation reduces the risk of things breaking later on.

Additionally, when you have things automated with container builds, it allows you to create reproducible builds, which leads to increased quality. Normally, if you're running things by hand or running your own build scripts, you're likely to make mistakes, or there will be edge cases in delivering software to an end computer that are hard to reproduce and hard to handle.

So the advantage of Docker containers for delivering software is that you build once and then you're able to deliver that pattern to as many computers as you want.

And that one delivery will either succeed or fail once, rather than saying: I'm going to pull this off of npm, I'm going to pull this package and install it, I'm going to pull this binary, installing from all these different sources. And then, last but not least, I want to talk about operational excellence. Containers allow you to focus on delivering your business logic rather than focusing as much on the hardware and the operational aspects.

So you can have a strong division: you can have developers who just think about the code and provide the container, and then operations people who focus on providing the platform that executes that container on demand, as needed, and quality ends up being better for both in many cases.

So containers give you an application artifact that works everywhere. You can use it for local development on your laptop, you can use it on an on-premises server, and you can also use that same container image in the cloud. Because it's a standard format, it will run the same in all these places, reliably and reproducibly.

But when you're doing a containerized deployment, there's one more piece that you need if you want to have fantastic containers, and that is a container orchestrator. You see, when you have one container, let's say on your local development laptop, life is good. It's very easy to spin up that container, rebuild it, and restart it as needed.

But what happens when you have 10 containers, 100 containers, 1,000 containers?

Now you run into an issue of keeping track of all these containers. How are you going to know if one of them crashes and needs to be restarted? What if the demand for a particular application rises and falls over time? You need to scale that up and down.

How are your end clients who are using your system going to get traffic to these containers as you have a large number of them?

So this is where the orchestrator helps out. It allows you to set up a high-level command, and the high-level command could be something like: I want to run 10 copies of this container, I want to register them into this load balancer, and I want to scale the number of containers up and down dynamically according to CPU utilization.

Scaling kicks in if the CPU goes over a certain threshold, let's say 80%. So what's going to happen is that the orchestrator ties all this together. The orchestrator talks to your compute to launch your containers, launches the number of containers that you require, and watches over those containers.

If the containers crash, it restarts them. It gathers metrics from the containers and responds accordingly: if CPU goes too high, it launches more containers, and it will even reconfigure other resources, either on premises or in the cloud, in accordance with the list of containers.

So for example, a load balancer: you don't have to go into a load balancer and add container IP addresses by hand anymore, because the orchestrator is doing that for you on the fly as containers are stopped and started.
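
The desired state described above can be sketched as the two request payloads you would hand to AWS, shown here as plain Python dicts in the shape boto3's `ecs.create_service` and Application Auto Scaling's `put_scaling_policy` expect. The cluster, service, and target group names below are placeholders, not values from this talk.

```python
# Sketch of the orchestrator's "desired state": run 10 copies, register them
# into a load balancer, and scale on CPU. All names and ARNs are placeholders.

service_request = {
    "cluster": "demo-cluster",
    "serviceName": "web-app",
    "taskDefinition": "web-app:1",
    "desiredCount": 10,  # "run 10 copies of this container"
    "loadBalancers": [
        {
            # "register them into this load balancer"
            "targetGroupArn": "arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/web/abc123",
            "containerName": "web",
            "containerPort": 80,
        }
    ],
}

# Target-tracking policy: add or remove containers to hold average CPU near
# the threshold ("scale up and down according to CPU utilization").
scaling_policy = {
    "PolicyName": "cpu-target-tracking",
    "PolicyType": "TargetTrackingScaling",
    "TargetTrackingScalingPolicyConfiguration": {
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ECSServiceAverageCPUUtilization"
        },
        "TargetValue": 80.0,  # "if the CPU goes over 80%"
    },
}
```

The point is that this is the whole instruction; the orchestrator does the reconciliation, restarts, and load balancer registration itself.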

There's a variety of tools that go into the container services landscape, and I want to show some of the ones that you'll hear about again and again if you go through different container sessions here at re:Invent and as you look through the documentation. The first core concept that a lot of people need, and that I'm going to talk about, is application networking.

So: service discovery and service mesh. How do you actually know where containers are running? If you have a large pool of capacity, a bunch of VMs or a bunch of EC2 instances or a bunch of physical servers in a data center rack, how are you going to know what is running on which server, and what address do you use when you want to talk to a particular container?

AWS Cloud Map is an in-cloud solution for that problem. There's also management: I talked about the benefits of using an orchestrator, and there are two different orchestrators you can use, Amazon Elastic Container Service and Amazon Elastic Kubernetes Service.

Amazon ECS is the main one we're focusing on today, but you should also know that there is Amazon EKS Anywhere, which provides a Kubernetes deployment and distribution that's designed to run on premises and be compatible with what we have in the cloud.

There is a very fundamental difference in architecture, which we will discuss later in the slides, but you should know that both exist. And I also want to talk about the hosting layer.

So on the cloud side, there are multiple options for how you want to host your containers. Amazon EC2 is the one that provides the cheapest way to say: give me this much CPU and memory capacity for my application to run.

But AWS Fargate provides a more hassle-free, easier experience where you natively think about containers and don't have to think about VMs anymore. And the reason why this is important is that when you're dealing with EC2 instances, you have to choose how many EC2 instances to run.

AWS Fargate allows you to think just in terms of the containers: I want to run 10 containers, give me 10 containers. Whereas with EC2, you have to think: do I have enough EC2 instances to run 10 containers, or what size of EC2 instance do I need to host 10 containers on?
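
That EC2-side capacity question is just arithmetic, and a minimal sketch makes the difference concrete. The container and instance sizes below are hypothetical examples, and this deliberately ignores ECS's real bin-packing placement logic.

```python
import math

def instances_needed(containers, task_cpu, task_mem_gib, instance_cpu, instance_mem_gib):
    """How many identically sized instances does it take to host N containers?
    Whichever dimension (CPU or memory) is the bottleneck decides how many
    tasks fit per instance. Simple illustration, not ECS's actual scheduler."""
    per_instance = min(instance_cpu // task_cpu, instance_mem_gib // task_mem_gib)
    return math.ceil(containers / per_instance)

# Hypothetical sizes: 10 containers at 1 vCPU / 2 GiB each, on hosts with
# 4 vCPU / 16 GiB. CPU is the bottleneck (4 tasks per host), so 3 hosts.
print(instances_needed(10, 1, 2, 4, 16))  # -> 3
```

With Fargate, none of this math exists on your side: you ask for 10 containers and get 10 containers.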

And then the other component is the container registry. Once you've built that container locally, where are you going to store it, and how are you going to get it into the cloud so that these cloud services can actually make use of it?

Amazon Elastic Container Registry is a solution for that.

Now, that was the cloud side of things, but there's also the hybrid side of things. So this is bridging the gap between the cloud and what you have on premise or on your own hardware.

So you're going to see AWS Outposts listed there first. This is very cool, and if you stopped by the expo center earlier, you may have seen some AWS Outposts hardware, which is very cool to see. I always love seeing those beautiful servers sitting there on the shelf.

So think of an Outpost as AWS hardware inside your data center. With that, you get an actual rack, or a server to slot into your rack, which is the same hardware you would expect to be using if you were provisioning an EC2 instance in the console. Amazon ECS Anywhere is your hardware, just being orchestrated by AWS.

So you're not actually buying or renting any physical hardware; you're providing your own hardware, and we're providing management to help you utilize that hardware on premises.

Amazon Elastic Kubernetes Service also works with Outposts, and there's also Amazon EKS Anywhere, which I talked about before.

AWS IoT Greengrass is a very cool service for managing containers specifically on IoT devices. So: sensors; the manufacturing industry loves this one. We've also seen it adopted in certain connected vehicles.

Farmers use this. Basically, it's for devices whose connectivity may go up and down: sometimes they're connected, sometimes they're not, and maybe a device needs to store some data while it's disconnected and then upload all that data when it reconnects. That is a great use case for Greengrass. And then there's Snowball Edge.

This is a physical hardware device that you can ship back and forth between an AWS data center and your on-premises location. Snowball comes in a variety of sizes, all the way down to a small device that can essentially fit in an envelope, which allows you to ship terabytes of data to and from the cloud without paying the same ingress costs or hitting networking bandwidth constraints.

But it also comes as a rugged computer with a bunch of CPUs, graphics cards, and memory in it. This allows you to essentially run EC2 on premises. So people use this at festivals.

They use it for shooting film on location, things like that, where they need the same power of EC2 and certain AWS services, but brought with them to a rugged location, in what is essentially a protective case that keeps that computer safe.

And then when you fill it up with data, you might just send it back to AWS and say: hey, I want you to put all this data into an S3 bucket so I can start processing it in the cloud.

Now, I want to dive a little bit into the focus of this talk, which is Amazon ECS Anywhere and how it works, starting with that spectrum I showed earlier, so you understand where Amazon ECS Anywhere lives.

If you think about capacity on AWS, it ranges all the way from AWS Regions down to your own hardware with Amazon ECS Anywhere. And as you go from left to right on this diagram, it gets more specific and closer to you or your end customer.

An AWS Region: think of it as gigantic buildings full of computers, located in one particular place, usually far from a city center, with multiple buildings separated by geographical distance.

This is a full-service installation that has all the AWS services in it, and it has a ton of capacity that allows you to go elastically up and down in terms of how much capacity you're utilizing.

AWS Local Zones are smaller data centers located inside city centers. There's one in LA, and ones in certain other cities that we're deploying to. This is more for when you want to be closer to your customer, for example to establish your own points of presence or a content delivery network that's closer to your end customer.

AWS Wavelength is applications being deployed onto hardware inside a 5G network. So this gets even closer to your end customer.

Right there in the cell tower, or right in the cell provider's data center, extremely close to mobile applications. Outposts gets even closer: right there inside your building, inside your data center.

AWS will bring in a rack of hardware, or a server, and install it for you, make sure that it runs, and handle maintenance for it as well. But if you want to manage your own hardware, that's where Amazon ECS Anywhere lives.

So I want to talk about how it works by showing this diagram, so we can understand the pieces. We have the Region, and the Region is where Amazon ECS fundamentally runs.

The ECS control plane itself never runs on your hardware.

The only thing that runs on your hardware is what's called the ECS agent, a lightweight agent which establishes a connection back to the control plane living inside the AWS Region. And you'll see a couple of different pieces here inside your server VM: the ECS agent, the SSM agent, the operating system, and containers. As we build up the flow here, you'll see how everything connects together.

So we have the ECS agent, which connects back to the Amazon ECS control plane, and we have the SSM agent, standing for Systems Manager, which connects back to AWS Systems Manager. Both of these agents run on your hardware to allow the cloud access to control it.

Now, interestingly, you do not need a public IP address for your hardware, and you don't have to open up any networking firewall rules or anything along those lines. It's designed so that as long as your hardware has an internet connection, it establishes its own outbound connection to the AWS Region, and over that connection the AWS services communicate back down to send instructions to those agents. This makes it very secure: you don't have to open up any firewall port rules or anything to AWS.

So let's go through the process of bootstrapping. You have your data center, and you have your hardware in there. Let's say you're using VMware on premises, so you have a VM. You need to start providing a few things for Amazon ECS Anywhere to work, the first being an operating system. Obviously, you need to run some kind of operating system inside your VM, and the operating system is your responsibility, because you're the one installing it on your own hardware.

But then, on top of that, the first thing that gets installed is the SSM agent, the Systems Manager agent. When that agent installs, it's going to use an activation key that you have prepared inside of Systems Manager. So you go to Systems Manager and say: I would like to register a device, and you get an activation key back. The activation key can be used to activate one machine, or up to 1,000 machines at a time, I believe, and you can also specify an expiration on how long the activation key lasts. You share that activation key out to all of the machines in your network as you install the SSM agent, and the SSM agent sends the activation key back to Systems Manager and says: hey, I would like to become a registered piece of hardware with Systems Manager.
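
The activation step above can be sketched as the parameters you would pass to Systems Manager's CreateActivation API (the shape boto3's `ssm.create_activation` accepts). The role name, description, and instance name here are illustrative placeholders.

```python
# Sketch of requesting an activation for an ECS Anywhere fleet. One activation
# key can register many machines; an optional ExpirationDate bounds its life.
activation_params = {
    "Description": "ECS Anywhere fleet for warehouse site A",  # placeholder
    "IamRole": "ecsAnywhereRole",   # role the registered machines will assume
    "RegistrationLimit": 1000,      # "one machine or up to 1,000 machines"
    "DefaultInstanceName": "warehouse-a",
}

# The response contains an ActivationId and ActivationCode. You hand those to
# the install script on each machine; the SSM agent sends them back to
# Systems Manager to register the hardware. Elided values left as-is:
example_response_shape = {
    "ActivationId": "<activation-id>",
    "ActivationCode": "<activation-code>",
}
```
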

If approved, the SSM agent will generate a key pair locally, similar to an SSH key pair. The private key stays on the hardware; the public key gets shared up to SSM and registered inside of SSM. This allows SSM to validate that the hardware is who it says it is. And the purpose of that is for SSM to be able to send credentials back to your hardware in a secure manner: it can encrypt those credentials using the public portion of the key, and then the only one who can decrypt those credentials is the holder of the private portion of the key, which in this case is your particular piece of hardware running that particular SSM agent.

Every single piece of hardware has its own key pair, so SSM has a whole list of all the public keys it can use, but all those private keys stay on your hardware, inside your data center, inside your device. The SSM agent uses this whole process to decrypt the credential, and at the point that it has a credential, it can communicate with any other AWS service in the cloud. But the main one for the functionality of Amazon ECS Anywhere is the ECS control plane.

So the ECS agent starts up and says: I have a role, the role allows me to talk to Amazon ECS, and I am now able to register myself as a managed instance inside of Amazon ECS. Now ECS has a list of all the computers that have been registered with it, and it starts keeping track of things like CPU: the ECS agent looks at the number of CPU cores and the amount of memory in the device and says, here's the pool of resources that are available to run an application.

Now, you can go into the Amazon ECS console or the API and issue those commands, like I said earlier: run 10 copies of my application and hook it up to this load balancer. Amazon ECS will communicate back down to the ECS agent and say: I would like you to run a container. The ECS agent communicates with a Docker engine that runs locally inside your VM. Once again, this is something that you do have to install; we provide a helper script that goes through this whole bootstrapping process, so from your perspective it looks like just one command to run. But all these components, the SSM agent, the ECS agent, and the Docker engine, are running locally on your hardware.

The Docker engine then spins up containers as instructed by the ECS agent. And once again, all those connections, the SSM agent connection and the ECS agent connection, are outbound connections from your hardware. They communicate over the internet, or via a PrivateLink that you establish, directly to the service in the cloud. But the cloud never establishes a connection to your hardware; it only communicates back over the connection that was established by those agents.

And the cool thing about this is that once this whole flow is set up, the ECS agent can bring in other credentials. That first set of credentials established by the SSM agent is just a top-level credential that allows Amazon ECS Anywhere to function. But each of the containers that runs on top of your VM can have its own unique credentials that authorize that service to talk to a certain subset of resources inside of AWS.

So for example, in this diagram, an S3 bucket or CloudWatch. An example of that might be gathering the logs out of an application and uploading them to CloudWatch, or just an application that needs to store persistent data in S3. Following this process, you can bootstrap all the way from a bare OS in a VM in the data center to these agents bringing in all the configuration necessary for an application to function as if it were running in a production cloud environment.

And the cool thing about it is that it sounds complicated, like there are a lot of components there, but the reality is these agents are very lightweight. They're very thin agents; they don't consume a lot of resources. In fact, this was a fun little project I set up, with Raspberry Pis connected to ECS and registering as capacity. This right here was a 16-core cluster with, I think, 64 gigabytes of memory. It registered with the cloud, and then Amazon ECS was placing tasks onto this hardware.

Now, obviously, this is not a particularly powerful piece of hardware. These devices didn't even have active cooling; I was just using a heat sink on the processor. So it wasn't super powerful, but I could do quite a bit of processing on it, because the agent wasn't consuming a lot of overhead.

That's an important distinction from a lot of other container orchestrators, where you're running a full database and all this logic on your own hardware. And this gets into the key use cases, the fun part. The first one I want to talk about is consistent hybrid workloads.

So remember from earlier, when I was talking about use cases for hybrid: consistent operations. The cool thing about Amazon ECS Anywhere is that you now have one set of APIs that can deploy both to cloud resources and to on-premises resources. The two core APIs you use day to day with ECS are the RunTask API and the CreateService API.

RunTask is used to run a single container on demand and run it to completion, until it exits. This would be used if you have a batch job, or a script that you schedule to run, maybe on a cron job, something along those lines, where you just want to run it from top to bottom until it ends. You don't really want to think about where it runs; you want Amazon ECS to figure that out and use the available capacity, but you do want to run something on demand.

CreateService is for when you have a website, an API, something along those lines that needs to be up and running at all times and needs to restart if it ever crashes. Either way, those two APIs are Amazon ECS APIs that can deploy to any of those locations: in a Region, in a Local Zone, in Wavelength, on an Outpost, on managed hardware, or even on your own hardware.
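
The two APIs can be sketched side by side as boto3-shaped payloads. With ECS Anywhere, you target your registered on-premises machines via the EXTERNAL launch type; the cluster and task definition names below are placeholders.

```python
# RunTask: run one container on demand, to completion (batch job, cron-style
# script). ECS picks available capacity; EXTERNAL means ECS Anywhere hosts.
run_task_request = {
    "cluster": "hybrid-cluster",
    "taskDefinition": "nightly-report:3",
    "launchType": "EXTERNAL",  # place onto customer-managed hardware
    "count": 1,
}

# CreateService: keep N copies alive at all times and restart them on crash
# (a website or API). Same API surface, cloud or on premises.
create_service_request = {
    "cluster": "hybrid-cluster",
    "serviceName": "api",
    "taskDefinition": "api:7",
    "launchType": "EXTERNAL",
    "desiredCount": 10,
}
```

Swapping `launchType` to `FARGATE` or `EC2` (with the appropriate cluster) is the whole difference between deploying to the cloud and deploying to your own hardware.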

It's one unified entry point for launching an application anywhere, and I want to talk about some example customers that have benefited from this particular approach, the first one here being Tempus Ex. Tempus Ex is a provider of video transcoding, and they needed a way to consume all this data from webcasts and different sports leagues around the world. You'll see some of the results there; I don't want to just read all that text, that's kind of boring. But the core idea is that it facilitated processing speeds of up to 40 times faster.

That leads into another use case: edge orchestration challenges. As you operate an application at the edge, you'll find that you need to do certain maintenance on it. We've become used to cloud services that are maintained, patched, and upgraded by AWS engineers, but the reality is that when you're operating your own software on your own hardware, someone has to manage those patches and upgrades. And for a setup like Kubernetes, there are a lot of different things that need to be upgraded.

Kubernetes requires you to upgrade not just the orchestrator, but also the etcd database, the storage state. And you also have to upgrade the different agents on the nodes that are providing the capacity for Kubernetes. So you're going to have to manage all those components, and this problem becomes harder to deal with the more locations you're dealing with.

Let's say you're a restaurant chain, or a business that works with a lot of different warehouses, or a lot of mobile devices like cars. If you're running a complete orchestrator in each of these locations, they all need to be patched, and this problem becomes harder the more locations you have.

Now let's compare and contrast that with the way this works with Amazon ECS. Remember, with Amazon ECS the agents connect back to the cloud, and the control plane is centralized in one place inside the cloud, where AWS engineers are the ones managing, upgrading, and patching it. The only thing that you have to upgrade and patch is the agents themselves.

So you've removed several different components that you would have had to orchestrate and update in all these different locations, and replaced that with the vast majority of the operational overhead happening on the cloud side. And the cool thing is that these agents are backwards compatible, too. So there's technically not even a reason you would need to upgrade the ECS agent or SSM agent unless there was a bug, or there was a new feature of the platform that you wanted to adopt.

In many cases, you could leave a site running on an older version of the agent, and the agent will happily connect back to the Amazon ECS control plane even if the control plane has been updated. So that's one of the nice benefits of Amazon ECS that I've run into.

An example of a customer who benefited from that is 3DI. 3DI makes video streaming software, and they watch over a lot of different third-party data centers. If I remember correctly, it's IP cameras: all of their customers are gathering IP camera data on premises from different security cameras, and obviously that creates a ton of different installations at different businesses and warehouses, different places with security cameras all around the world.

If they had to go in and manage the software on all of those, it would be a significant burden. But by using ECS Anywhere, they're able to keep the bulk of that centralized operational overhead in the cloud. And like I said, they could even use an old version of the agent and leave that on premises, or if they really want to, they can upgrade the agents. Once again, lightweight agents are very easy to upgrade compared to upgrading a database or the entire configuration stack of a heavyweight orchestrator.

Now, I want to talk about another feature of ECS Anywhere that I particularly like, which is GPU scheduling. This has become one of the key use cases of Amazon ECS Anywhere as more people adopt machine learning and are trying to train machine learning models.

You can scale and place applications on more than just the CPU and memory dimensions. You can now also say: I want this application to have a GPU core attached to it. The ECS agent is aware of the GPUs that might be on that piece of physical hardware and schedules workloads accordingly.
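
Declaring that GPU dimension can be sketched as a container definition inside an ECS task definition, using the `resourceRequirements` field alongside CPU and memory. The image name and sizes below are placeholders.

```python
# Sketch of a container definition that asks the scheduler for a GPU.
# ECS will only place this container onto registered hosts with a free GPU.
gpu_container_definition = {
    "name": "trainer",
    "image": "my-registry/ml-trainer:latest",  # placeholder image
    "cpu": 2048,      # ECS expresses CPU in 1/1024ths of a vCPU, so 2 vCPU
    "memory": 8192,   # in MiB, so 8 GiB
    "resourceRequirements": [
        {"type": "GPU", "value": "1"}  # "I want a GPU core attached to it"
    ],
}
```
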

I mentioned the video rendering studios before: 3D rendering loves GPUs. They need GPUs for rendering films and film frames. But one customer that is using GPUs as well is Kepler.

Kepler provides a machine learning service that uses video cameras to monitor elderly people in care residences. It's essentially watching over them in case they fall down, get hurt, or have a medical emergency. Maybe they're not even able to press the call-for-help button, but the machine learning algorithm is able to detect that and call a nurse to save their life much faster.

Now, obviously, to run that model they need particular hardware, such as GPUs, and they need some way to orchestrate all of that across all these care homes distributed around the world. So they benefit from Amazon ECS Anywhere there. There are some other core features of Amazon ECS Anywhere that I like. One is ECS Exec.

ECS Exec is built into ECS as a way to get a shell inside of a running container. The reason why this is important is that traditionally, if you have a device that's running containers or being used as compute capacity, how are you going to connect to it if you need to debug? Well, it really feels bad to open port 22 to the world to allow SSH access from anywhere.

In particular, now you have to give that device a public IP address on the internet, you have to add firewall rules, and you have to make sure you don't have somebody brute-forcing in there, trying to crack the password, getting into your device, and running their arbitrary code on it.

So I talked earlier about how the SSM agent opens a channel back up to the cloud.

Well, ECS Exec uses that channel to provide essentially a built-in bastion host inside of SSM. You connect to AWS Systems Manager to initiate your session, and then Systems Manager connects down over that channel that the SSM agent opened up. And so you're able to get a shell on a container on that remote host without ever having SSH installed, and without ever having port 22 open to the world, and you control access using IAM policies. So you can give each of the IAM users on your account different access to different services, and authorize who's able to connect to SSM, and then through SSM down to the particular containers that might be running in a particular rack or location around the world.
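A minimal sketch of that flow with the AWS CLI (the cluster, service, task, and container names here are hypothetical, and the client machine needs the Session Manager plugin installed):

```shell
# Exec has to be enabled on the service (or via --enable-execute-command
# when running a standalone task) before its tasks accept exec sessions.
aws ecs update-service \
  --cluster edge-cluster \
  --service my-service \
  --enable-execute-command \
  --force-new-deployment

# Open an interactive shell in a running container. The connection is
# tunneled through Systems Manager over the channel the SSM agent opened;
# no inbound port and no SSH daemon on the device.
aws ecs execute-command \
  --cluster edge-cluster \
  --task 0123456789abcdef0 \
  --container app \
  --interactive \
  --command "/bin/sh"
```

Who may call `ecs:ExecuteCommand`, and against which cluster or task, is then just an IAM policy question.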

And then last but not least, CloudWatch monitoring. One of the classic problems of running an application is what happens when the logs start to stack up and you're trying to rotate them. Next thing you know, you get an alert that disk space is running out on that particular host. And what do you do with those logs? Do you ship them off to a storage device somewhere, or just delete them because you don't really care about them?

Well, Amazon ECS Anywhere gathers logs and metrics out of the box. It's a built-in feature of Amazon ECS, because it has that IAM role that allows it to communicate back out to other AWS services. One of those is CloudWatch Logs, so it can gather the logs of your application and ship them off that piece of physical hardware and up into the cloud for storage, exploration, and querying later on. Even if that particular piece of hardware is destroyed, you will still have the telemetry from that task; you will still have the log lines the task had written prior to the hardware being destroyed.
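This is configured per container with the `awslogs` log driver in the task definition; a sketch, with hypothetical group and region names:

```json
"logConfiguration": {
  "logDriver": "awslogs",
  "options": {
    "awslogs-group": "/ecs/edge-app",
    "awslogs-region": "us-east-1",
    "awslogs-stream-prefix": "edge",
    "awslogs-create-group": "true"
  }
}
```

The task role's credentials are used to push the log lines to CloudWatch Logs as they are written, so nothing of value lives only on the device's disk.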

And so one more example customer that I want to talk about is Just Walk Out technology by Amazon. Just Walk Out technology is being used in different stores like the Amazon Go store, and you've probably seen it: they have all those cameras that watch from overhead, and as you walk into the store, you can just grab something off the shelf and walk out, and you get a bill later on. You never even have to talk to a cashier. I love the concept.

So obviously they want to be able to scale this out fast to many stores, and you'll see the quote there: "As we continue to scale Just Walk Out technology, we look for ways to accelerate our deployment processes, and ECS Anywhere helps us to expand faster by maintaining the same deployment processes, metrics and tooling on premises and in the cloud." A classic example of being able to use that same operational workflow both in the cloud and on premises.

So I've talked about several different customers at a fairly high level, but I want to bring up one customer who's going to do a deep dive on how they used Amazon ECS Anywhere, and that is Cam Mac from Ocado Technology.

Welcome, Cam.

Cheers. Good evening, everyone. Like Nathan, I wasn't quite sure what the attendance was going to be like, especially given the time of day and the location we're in, right? This is Vegas. So my name is Cam Mac, from Ocado Technology. I'm head of product for an area that is responsible for providing connectivity and compute for people and automation.

Before I start talking about ECS Anywhere at Ocado, I thought it might be useful to share a little bit about who we are and what we do at Ocado Technology. We are solving some of the toughest technological challenges of our age. I've been quite fortunate to have been with the company for two decades, and over the last 20 years we have been transforming online grocery through cutting-edge tech. Our engineers build and support solutions for ensuring we have the right stock in the warehouse, an ecommerce offering to allow our customers to buy the products, and solutions to fulfill and deliver the orders. Rather than buying off-the-shelf technology to support our needs, we decided to build our own, because what was out there did not match our exacting standards.

Now we license our technology to retailers all over the world. Today, a 2,500-strong team of developers across 12 development centres in eight different countries is developing our industry-leading capabilities in automation, robotics, machine learning, and more. We have over 500 patents covering our technology estate, and we are tremendously proud to have been able to solve some really tough problems and develop solutions that are being used by retailers all over the world.

Our advanced capabilities in machine learning and artificial intelligence enable us to achieve amazing outcomes: fresher food, greater convenience, wider choice, and the lowest rates of food waste, all in a business model that brings the best economic returns for our business. This is why some of the world's largest retailers are using our technology to become leaders in their markets.

We are delivering highly automated warehouses like this one all over the world; more on how these work later on. Our cutting-edge technology supports the online operations of 11 of the most innovative and forward-thinking retailers in the world. Close to where we are today, in North America, we have Sobeys and Kroger; we have a number of retailers in Europe; and out in Australasia and the Far East we have Aeon and Coles. Since producing this slide, we have also signed a contract with Lotte in South Korea, so we have a new retailer come on board.

So what is our proposition? Well, it's OSP, the Ocado Smart Platform. OSP is the most advanced platform for online grocery in the world. It's the end-to-end suite of capabilities for ecommerce, fulfillment, and logistics solutions, using advanced technologies such as machine learning and artificial intelligence. Through using OSP, retailers can build loyalty and win market share.

We are built for the cloud and adopt a microservices architecture. The beauty of this model is that it is extremely scalable. Warehouses can be as big as seven football pitches to serve a large area, or scale back to serve an area with smaller population density. There are also micro-size facilities that offer quick delivery to customers in urban areas.

So what makes the Ocado way of doing online grocery different? First of all, grocery is the most difficult retail segment to deliver online profitably. There are some hard problems and hard challenges that we face: big order sizes of 50-plus items; products that need to be stored frozen, chilled, or at room temperature; short shelf life; low margins; low tolerance for substitutions; and yet high expectations for on-time delivery.

Ocado entered the sector in the UK determined to take a new approach. Unlike most traditional retailers with an online presence, Ocado fulfills its orders without the need for bricks-and-mortar stores. This allows for greater efficiency and flexibility. The OSP warehouse is the most sophisticated of its kind in the world. There are standard, micro, and mini-size warehouses in the ecosystem.

Goods from our suppliers arrive at the inbound area and are decanted into storage bins. The bins enter the hive at the center of the warehouse, comprising a grid and thousands of bots collaborating with each other like a swarm of bees. The precise orchestration of this system allows us to collect a 50-plus-item order in just a few minutes: a key ingredient in the recipe for quick delivery. A pick station allows an operator or a robot to pick into the customer order. Total labor hours used to fulfill an order run at just 15 minutes, compared to one hour and 14 minutes at a supermarket store.

We are cloud native at Ocado Technology; even our most latency-sensitive bot orchestration system runs in the cloud. But there is still a need for edge-based provisioning of on-premises compute. Depending on the size of the warehouse, we would typically have about 50 to 200 devices running different kinds of workloads to support key business functions such as pick and decant. And therein lies our challenge: how do we enable this for multiple warehouses in a highly repeatable way?

Well, that's why I'm here today: to tell you about how we leverage ECS Anywhere at Ocado Tech. The problem set for us was clear: we needed a way to allow our engineers to deploy container workloads to devices scattered in warehouses across the world. We knew that some of these critical business functions could require multiple workloads. Our engineers are super proud of the Ocado Technology platform, our in-house-built suite of tools, pipelines, and governance frameworks that enables us to build and deploy applications in a consistent way.

It was incredibly important for us to leverage this and not reinvent the wheel, and regardless of whether we're deploying into the cloud or out at the edge, we wanted to ensure the engineers had the same experience. In addition, we also had the following considerations. Simplicity: we wanted to avoid creating something that required specialist training, or creating complex interfaces which would not only distract our engineers from using the product but also reduce their time to focus on innovation.

At Ocado Technology we strongly believe in continuous deployment, which means that anything that slows down productivity is not a good thing. To make it even easier for our engineers, we also factored in the need to deploy to devices in logical groups. For example, on a Monday I may choose to only update pick stations in Kroger warehouses, and on a Tuesday some other set of devices for some other retailer. The permutations are endless, and we needed to provide support for this. Support can become incredibly difficult due to the sheer number of devices.
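One built-in ECS mechanism for this kind of logical grouping (a sketch, not necessarily how Ocado implemented it, and with hypothetical attribute names) is custom instance attributes plus placement constraints. Attributes can be applied to registered instances with `aws ecs put-attributes`, and then a task or service can be constrained to matching instances:

```json
"placementConstraints": [
  {
    "type": "memberOf",
    "expression": "attribute:station-type == pick and attribute:retailer == kroger"
  }
]
```

A deployment on Monday would target `station-type == pick` for one retailer; a different expression targets a different device group the next day, all from the same control plane.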

This meant we needed to manage our devices in a way that worked both for our staff located at the retailer's warehouse and for our engineers supporting the product back at base. Making it easy for those who support our products was paramount. Lastly, we needed it to be easy for engineers to move over to using the product, not just for their first application but for any application.

So why ECS Anywhere? Before we started our proof of concept with ECS Anywhere, we undertook research into the technologies available to us. We considered a number of options, including Greengrass, Rancher, and even a combination of IoT with an in-house-built orchestration layer. The choice was clear, however: over and above the great coverage ECS Anywhere has of our requirements, it was a product that we felt at home with.

This is because at Ocado Technology we are already a heavy AWS ECS user. We all know the economic and technological benefits of using managed services. We want to build our products without reinventing the wheel, and we want our engineers to focus on their missions, not on infrastructure. This sits firmly in line with our product strategy.

Our use case today centers on the need to deliver workloads to devices within the walls of a warehouse, where you have supporting services such as VPNs. We know that one day we may need to deploy to devices located anywhere with just an internet connection; ECS Anywhere gives us that straight out of the box.

One of the key benefits of ECS Anywhere is that it allows you to scale out applications in the cloud to devices out at the edge. This works really well for most vanilla scenarios. Through our proof of concept, we identified that this logic was not entirely compatible with our use case, and we worked collaboratively with the ECS Anywhere team to tweak the back plane to overcome this issue. We're tremendously proud of the outcome.

So how did we go about assembling all of this? What we have here is a very high-level simplification of the components that make up the solution. We have an ECS cluster per warehouse; this separation allows us to deploy per warehouse, which we thought was the right balance. The applications developed by our engineers are pushed as containers with the appropriate tags and attributes. This is done using our deployment tool, which integrates with our deployment pipeline using CloudFormation. Together, this gives us the ability to deploy a particular workload to a particular device.

And at the receiving end is the compute device. The ECS agent is deployed as part of the standard operating system build. Information that allows us to identify the device is collated by our in-house platform agent, which passes it on to the ECS agent upon successful validation that the device is an Ocado asset. The auto-registration process enables the device to receive its next task. It's as simple as that.
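For reference, the generic ECS Anywhere registration flow (not Ocado's in-house automation, and with hypothetical role, cluster, and region names) looks roughly like this:

```shell
# In the cloud: create an SSM activation tied to the IAM role
# the device will assume when it phones home
aws ssm create-activation --iam-role ecsAnywhereRole

# On the device: fetch the install script, which installs the SSM agent,
# Docker, and the ECS agent, then registers the host with the cluster
# as EXTERNAL capacity using the activation ID and code from above
curl -o /tmp/ecs-anywhere-install.sh \
  "https://amazon-ecs-agent.s3.amazonaws.com/ecs-anywhere-install-latest.sh"
sudo bash /tmp/ecs-anywhere-install.sh \
  --region us-east-1 \
  --cluster warehouse-cluster \
  --activation-id "$ACTIVATION_ID" \
  --activation-code "$ACTIVATION_CODE"
```

Baking the agent into the OS image and wrapping the activation step, as Ocado describes, turns this into a zero-touch process per device.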

So after approximately four months of product development and a roll-out period, we now have about 1,300 devices running across 18 different warehouses in five different countries. This is helping five of our retailers in their missions to use OSP to gain competitive advantage in the online grocery market in their territories. Phase two of the roll-out plan will add devices at five more sites in the next quarter, across two new warehouses for two retailers, and this will add another 300 to 400 devices as well.

We're also seeing an incredibly low overhead in maintaining the product, and we have a high degree of confidence that it will continue to scale to support the known projected number of warehouses and retailers. Beyond the pick and decant use case, we're also seeing other business functions express interest in how we can help them deliver workloads to their devices. This ranges from decant imaging, where we use vision technology to detect unwanted packaging and guide the operator to remove it, through to developing the charging solution for the next-generation bot.

Hopefully you have enjoyed today's journey. We learnt about the challenges faced by retailers and how Ocado has developed innovation to help them turn these challenges into opportunities. This is achieved through the Ocado Smart Platform, the end-to-end suite of ecommerce, fulfillment, and logistics solutions. As part of the fulfillment operation, we needed a way to deploy container workloads to devices across the globe, and we achieved this through integrating ECS Anywhere. The integration was highly collaborative.

We can't thank the ECS Anywhere team enough for the way they responded, the way they engaged, and how they worked tirelessly to support the tight deadlines of the project. The end result: we have a product that easily allows our engineers to deliver workloads to devices across the globe. The solution runs off a single control plane, and we have not had to upskill our engineers, due to their familiarity with ECS.

Thank you again. My name is Cam Mac, from Ocado Technology, and we hope you can also benefit from using ECS Anywhere in your organization to build great products that serve the needs of your business.

Thank you very much, Cam; that's high praise indeed. If you're interested in Amazon ECS Anywhere, feel free to reach out. I'm sure Cam can answer some of your questions, and you'll also find my Twitter handle up there, @nathankpeck. DMs are always open, and I'm happy to answer some of your questions or connect you to an engineer inside of Amazon ECS Anywhere who can help you build your incredible platform as well.

Thank you so much for attending this late evening session, and I look forward to seeing you around.
