Rocket science: Process, store, and analyze engine test data on AWS

Good morning, everyone. Thank you for joining us on day one of re:Invent. Everyone looks pretty fresh, energized, and excited. It's good to have a session on Monday instead of a Thursday or Friday afternoon, when people are a little more tired. Thank you for coming here.

We hope that you enjoy the session and get out of it what you came for. We are very glad that we were able to organize this session, and with it we are kicking off our Aerospace and Satellite track at re:Invent this year.

We have a bunch of space-related sessions lined up. We just finished one on software-defined radio, we have this one, and we have many more coming through the week. We wanted to make the track exciting this year, which by default it is, because we are talking about space, or at least that's what I would hope, especially for people in the industry. We also wanted to make it relevant, so this year we are talking about use cases and customer stories that are happening right now: the solutions customers are using and the problems they are currently solving.

And we wanted to make the track inclusive as well, because we believe that space is for everybody. So that brings us to today's session. My name is Yudhijit Das Gupta. I am the Head of Solutions Architecture for the Americas and Asia Pacific, and together we are the AWS Aerospace and Satellite team.

Today we will be discussing how to process, store, and analyze engine test data, the data that is generated during hot fire engine testing. We chose this very specific topic, and gave it a somewhat catchy title, because we believe this use case is niche but very relevant for our customers. Right now they are going through hot fire engine testing, this problem is coming up, and we would like to provide a solution. We will also talk about what customers are using today.

Joining me on stage we will have Matthew Lyden, who is a Solutions Architect with our AWS Aerospace and Satellite team, and we have Nathan Mc., who is a Principal Solutions Architect, also in the Aerospace and Satellite team. Moving on.

On the agenda today, we will give a brief description of what hot fire engine testing is. Some of you might already be aware of it; for some it will be new. We will talk about the common customer needs that we are seeing from the field and from the industry. Like all solutions architecture, there are trade-offs, so we will discuss design trades and the decisions that we make while designing these architectures, and we'll give some examples of those architectures as well.

And the part I am most excited about: getting started. You come, you attend a session, and then we would like you to go and implement it, in your own way or in the way we described. So how do you get started after the session? We will have a Q&A round where we will take questions, but we will also be around after the session if you want to come and talk to any of us.

Before we move on, we are doing a quick poll. If you can use your cell phones to scan this and answer a polling question for us, that will give us some data about what makes you excited about the session: whether you are in the industry, whether you're designing rocket engines yourselves, or whether you are just here to learn more. We'll give this about a minute or so. I'm getting some indication from our tech support that the polling results are coming in, so that's good. We will close the poll in about 10 seconds.

OK, just finding out how many rocket scientists there are in the crowd. So most of the audience works with time series related data, which is something we thought would come up, so this is a relevant use case from the data analysis and data storage perspective. And the second largest group of people think rockets are cool. That's a given. Awesome. OK, moving on.

So let's start the discussion. This might be familiar to people who are in the industry, but just to give a brief description, the hardware development life cycle involves designing, simulating, and testing, in a continuous improvement cycle of those designs. The tools that we use are CAD tools, computer-aided design tools that assist engineers in designing and simulating a component.

What the digital simulation tools can do is catch a design issue before you invest time and material in it, so they are catching the issues before you make the investment. After you machine the parts and after they are assembled at the factory, they are then tested to ensure that the actual component functions as expected in the simulation and meets the expected performance.

Let's have Matt Lyden talk about the most common physical tests that we are seeing from our customers.

Thank you. Thank you all for taking the time to join us today to talk about rocket science. I just want you to take a moment to think about how dramatic a rocket launch is, right? The components and the materials that are going up into space, we want to make sure that they work as designed.

So before an engine or its parts are ever launched or put on the pad, they go through a number of physical tests to make sure that they perform as expected. Some of those tests can be putting the part in a pressure chamber to measure whether it can withstand leaving the atmosphere and going up into space, or testing thermal resilience.

Can the component, the engine, withstand the heat of the exhaust and the cold of space as well? There are acoustic vibration tests, where we literally vibrate the part to make sure nothing falls apart or comes loose, so that the launch forces placed on the part don't shake it apart. And then also electromagnetic interference, making sure that as the component leaves Earth's magnetic field it still operates effectively, especially if there are electronics in there.

The ultimate test is often referred to as a hot fire engine test. You could think of this as essentially launching the rocket but not having it move. We put the engine on a test stand, which is essentially a mount that holds the engine in place as if it were on the rocket, but without it moving. And here's a really good example of what that looks like.

The engine is ignited and things heat up, and that's why we call it a hot fire. This is where the components are operated in actual running conditions. There'll be an engineer on site, usually a few, monitoring to make sure everything's going well.

All around the test stand are sensors outfitted to measure things like the heat and the flow rates of the propulsion components and the fuel. There are two key areas we're looking at here: the combustion chamber, where the propellant is mixed together and ignited, and the nozzle, which directs the thrust and helps steer the rocket.

All around are the sensors, capturing data at really high sample rates and passing it off to specialized data acquisition systems. So engineers now have a heap of data. It's not immediately useful, right? It needs to be processed first, and this data is usually synchronized around the T-zero mark: you have T-minus while you're counting down, and then T-plus.

If you have different data acquisition systems, you first have to consolidate all of those so that you're working on the same time scale, right? The time series data that you're collecting, you want it to be accurate for all of the sensors. And then you're taking the raw data from those sensors and coming up with measured values.

So the thrust, the specific impulse, which you can think of as the fuel efficiency of a rocket engine, and also the pressure and the temperature in both the nozzle and the combustion chamber. We'll be using hot fire as the example for this architecture, because it's really cool and you get awesome pictures like this, but it does apply to more than just hot fire engine testing, right?

If you're testing different components or different engines, it's a similar process. So now I'll pass it off to Nate to talk about some of the customer needs that we have heard talking with customers, and our design choices.

Thanks, Matt. So, as we've been working with customers on these workloads, there are a number of common themes that have come up for customer needs. The first is faster time to results. Really what we're talking about there is getting the processed data into the hands of engineers so they can make the decisions they need to make about what to do next, whether to change the design or to optimize the design.

For a lot of folks in this industry right now, it's highly competitive, so the quicker they can iterate, the faster they can make their products better and the more competitive they'll be in the market. So that becomes really important.

Often there are a couple of bottlenecks that we hear folks are working around. The first is the actual compute and data movement necessary to do the data processing, and we're often talking about using scalable cloud resources to try to finish the data processing faster and to move the data more quickly, to reduce those bottlenecks.

The other is that in a lot of these use cases we've seen out in the field, there are still a lot of manual processes where human action is needed to get data through different processing steps, even stuff as simple as an engineer having to get a flash drive or a data card out of a data acquisition system, into a laptop, and put the data where it can be accessed. Sometimes the bottleneck is waiting for people to go do those tasks, and we talk about addressing those through automation.

The next common theme we've heard is the need for scalable and cost-effective storage. This data is important. It's expensive to generate, it's a big investment to be able to collect, and customers often want to keep it for long periods of time. But we've heard a number of anecdotes about challenges here.

I had an engineer tell me about being asked to delete a bunch of engine testing data from a shared drive because the shared drive was full and they didn't have the resources to expand the storage, which she was not particularly pleased with. And that's a more common theme than you would think.

So, we'll be talking about taking advantage of scalable cloud storage to be able to allow folks to preserve that data for as long as they need to in a cost-effective manner.

The third is improved data governance, and I mean this in two different senses; I'm kind of using data governance in two ways here. The first is data discovery. A number of organizations we've talked to say their challenge is finding historical data and keeping the data organized. It's this regular problem of: I want to find the data from that test I ran that time, but it's literally on a shared drive somewhere, nobody remembers exactly how we organized it, and it's hard to figure out what's what.

The other use case that I'm kind of putting in this data governance bucket is for organizations that have either regulatory requirements or maybe just company policy requirements around data retention, data preservation, et cetera: having the visibility to know that those requirements are being met, and that the data is being maintained and preserved in the way and for the amount of time they need it to be.

And finally, extracting more value from that data, back to that theme that we'll talk about over and over again. These tests are expensive to operate. Companies are already making a big investment in collecting this data in their hardware development process, and they want to extract more value out of it and take advantage of things like advanced analytics or even machine learning to get even more value out of this data after they generate it.

And with that, we'll jump into our final polling question for the session, I promise. We got some interesting data on the first one in terms of what folks are interested in. But if you do have a hardware development process that you're working through, similar to the one we're talking about, which of those different categories, if any, are most meaningful to you in your organization? We'll give everybody about another 40 seconds or so to have a chance to click through everything there.

We've been joking, as we've been preparing for the session, about what I was going to do if everybody selects "other." And I think I've just decided that if you all select "other," I'm going to ask all of you to meet me in the hallway after the session so I can find out what "other" is.

Alright, it looks like folks are wrapping up. We'll give it just a couple more seconds and we'll swap over to the results.

Alright, let's look at the results: extracting more value from data and faster time to results. Only two votes for "other." OK, both of you in the hallway after the session, please, if you have time. I'd love to know.

Alright, we can switch back over to the slides. Thanks, everybody, this is useful data for us. I'm gonna hand it back to Matt, who's gonna walk us through the setup for the example architecture we're gonna be talking through.

Yeah. So before we dive into the specifics, and we are gonna go piece by piece through how this architecture works and what we recommend, we're first gonna talk about the big design trades and decisions that we've had to make. We'll take a look at the overall process, at a high level, to set the stage for that.

First, we have our test stand, usually out at a remote site somewhere, and we'll have to ingest that data into AWS, into the cloud, and keep it in long-term storage so that we're not deleting those old archives. So we'll ingest that data in, then we'll do batch processing on it: take the raw data values, transform them, and reduce them down to the things the engineers care about, and then spit out results, the actual information they're trying to get from each test. We'll store those results in a place that is long term and easy to access for the right purpose, and then pass that off to the analysis tools that the engineers are working with. We'll cover each of these in more detail, but I wanted to set the stage before Nate talks to you about the design trade-offs.

So we're going to start with a couple of design trade-offs, and the reality is the rest of the session is going to be talking about design trade-offs and architecture. But there are two that I wanted to talk about up front, in part because I think they're big, important ones to think about, and in part because in the example architecture we're going to show, we've already made these decisions. So I want to walk you through how we made these decisions in case you need to make different ones.

We're going to handle these two up front, and then we're actually going to walk end to end through a data flow and an example architecture. The first that I want to talk about is: to time series database or not to time series database. A lot of times when we talk to customers who have time series data, using a time series database is kind of a default assumption when picking a data store. The way I like to think about it is to work backwards from the data consumer, because I think that is usually what most meaningfully drives the right data store, and I like to think about that in terms of two key factors.

The first is what interface best benefits the data consumer: how are they going to interact with it and what interface do they need? The second is what the actual data access pattern is, and what data store I can use that makes the consumer's most common data access patterns efficient or scalable, or meets the requirements the consumer has in terms of latency, data throughput, things like that.

For time series databases, in my mind, there are two things out of those categories I tend to look to a time series database for. The first is time series specific query language and features: providing interfaces in the query language that let you do things like interpolation of missing values, which can be really important for some use cases, or being able to push certain kinds of analysis into the time series database and effectively use it for some of the data analysis, for example statistical analysis or even FFT transforms, and having a query language that lets you do that.

The other thing I tend to think of time series databases for is performance: when you need low latency for transactional-ish queries, where you expect really, really low latencies and to be able to hit that time series store over and over and over again.

In this workload, when we're talking about rocket engine testing, I actually would suggest there's an alternative: looking at object storage. There are a couple of reasons why I think that tends to be the better fit, and a lot of the successful systems I've seen out in the wild have opted for object storage. We'll talk a lot about that in the rest of the talk because that's what's in our example architecture.

The biggest thing is we're talking about storing the data on Amazon S3. It's high durability, it's low cost, and it's very intuitive. These engine tests, and a lot of hardware development testing, yes, it's time series data and it's streaming, since we're measuring a real physical process that's taking place, but they're discrete tests, right? It'll be a single test or a series of firings within the test. And so thinking about those discrete tests and storing the data that way sometimes becomes a little bit more intuitive, since we are talking about what amounts to a discrete process.

At the end of the day, I think the other benefit to object storage, which we'll be talking about in the example architecture, is that by taking advantage of S3, we have integrations with a bunch of other data and analytics services we can use to automate other parts of the data processing.

So let me take a step back. My default recommendation, like I said, is object storage, and that's what's in the example architecture we'll talk about for most of the rest of the session. My parting thought on time series databases is this: if you do have a workload that looks like this one and you do think you need a time series database, I'd suggest taking a look at Amazon Timestream. It's built from the ground up to handle time series data. It's scalable, fully managed, and serverless, which is a nice benefit. It has an in-memory storage tier if you need that kind of query latency. It also has, and I think this is really interesting for this workload, a magnetic archive tier that's very cost effective for long-term archived data, if you want to be able to keep it in a time series database, which is often a challenge with traditional time series databases.
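If you do go the Timestream route, here is a minimal sketch of what writing a few sensor samples into it could look like with boto3; the database, table, dimension, and measure names are hypothetical choices for illustration, not something from the session.

```python
import time
import boto3

# Hypothetical database/table names; assumes they already exist in Timestream.
tsw = boto3.client("timestream-write", region_name="us-east-1")

def write_samples(test_id: str, sensor: str, samples: list[tuple[float, float]]) -> None:
    """Write (t_offset_seconds, value) samples for one sensor of one test."""
    t0_ms = int(time.time() * 1000)  # in practice, the firing's T-zero timestamp
    records = [
        {
            "Dimensions": [
                {"Name": "test_id", "Value": test_id},
                {"Name": "sensor", "Value": sensor},
            ],
            "MeasureName": "reading",
            "MeasureValue": str(value),
            "MeasureValueType": "DOUBLE",
            "Time": str(t0_ms + int(t_offset * 1000)),  # milliseconds since epoch
        }
        for t_offset, value in samples
    ]
    tsw.write_records(DatabaseName="engine_tests", TableName="hotfire", Records=records)

write_samples("HF-2023-042", "chamber_pressure_psi", [(0.00, 512.3), (0.01, 514.1)])
```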

But with all that said, I would suggest: if in doubt, look at object storage and consider starting with S3. The other thing I'd say on this is that it's a bit of what we tend to call a two-way door. For example, with Timestream you can unload data from Timestream into S3, and vice versa, you can load data from S3 into Timestream. So as long as your data volumes haven't grown too large yet, you have a fairly easy path to move back and forth, try both, and see which performs best for your workload.

The other big design trade we're going to talk about is streaming versus batch, and I bet this one's gonna be a little spicy with this audience based on the survey. I tend to lean towards batch for these architectures as being less complex and easier to scale the resources for. A lot of times the reaction I get is: OK, well, this is time series data, we're monitoring a real-time process, why not use streaming? I'll talk a little bit more about that in a minute, but I would come back to the idea that effectively these are discrete tests. For a lot of the systems we've seen, depending on the kind of data acquisition systems the customer is using, they're kind of already built around batch, because they have to close the test before they can get the data files to start processing them. And so it becomes very intuitive and the architecture becomes much simpler.

So my recommendation is going to be: if in doubt and you don't know what you need, start with batch. It'll be easier to get up and running and easier to troubleshoot and reason about. Then as you mature, if you need streaming, build a streaming fast path on top, and you can fully move over to that if and when you're ready to do so. Where I think the complexity of a streaming architecture is worth it is when you really do need real-time, low-latency access to the data.

For example, you've got analysis processes that need to run on the data while the test is executing, or you want to enable remote engineers to review the data as the test is executing.

The other that we tend to see is very, very low end-to-end system latency, right? That time to results: being able to drive that latency from tens of minutes down potentially into tens of seconds or lower, using a streaming architecture to effectively pipeline the processing steps. So you don't have to wait for a data transfer to finish before you can process your first byte, and you don't have those kinds of chained time-to-last-byte delays.

But again, in rough numbers, we're talking about getting from tens of minutes down to tens of seconds, and needing to make that switch over to streaming to drive those last minutes of latency out. The other thing we've seen is a set of customers who effectively, I'll call it, test like they fly. What I mean by that is they use what amounts to some subset of their command and control systems and their avionics systems, and that becomes their test system.

I've seen that as kind of a divergence in the industry: some people go that direction and use that one kind of system for everything; some people have a very complex, separate data acquisition system and apparatus that measures a lot more things than they even necessarily measure in flight, and so they have a totally separate path for the test stand and for how the data comes out of it.

But in the case where folks are effectively using what amounts to a version of the command and control system they would use in flight to manage the test data, they've already had to make that investment in a streaming architecture and take on that complexity, and they're getting the benefit of being able to build it once and use it for both use cases.

But I come back to the idea that, if in doubt, I would start with batch unless you know for sure that you need to do one of the things that a streaming architecture enables.

And with that, I'm gonna hand it back over to Matt, who's gonna start walking us through the example architecture.

Yes. So we're starting with a batch architecture and we're gonna be using objects, right? Because this is a single test event: we're testing a rocket engine somewhere out on a stand and we're treating that as one event. The first question we need to ask is how we bundle all that data together as a single event. You can upload individual files, but that requires you to check that they're all there before you do any processing.

A really common and simple approach that we see customers using is just to bundle all of that up into a single item, usually a zip file or a single archive, and treat that as all of the data for that particular test.

Now remember, this isn't the calculated values; this is just the raw data coming off the sensors on the test stand, plus any metadata or associated information about when the test took place and which engineer was responsible for making sure the test went well.

All of that associated data, weather, things like that, that might affect the test. So we'll treat that as a single file, and you can upload that to Amazon Simple Storage Service, S3, using the SDK, the Software Development Kit, or the command line interface if you have systems already running on the ground out at the test stand, or just have an engineer upload it through the console or whatever tools you like. It's easy to get that data up into the cloud.
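For illustration, a minimal sketch of that upload step using the Python SDK (boto3); the bucket name, key layout, and metadata fields are hypothetical choices, not part of the talk.

```python
import boto3

s3 = boto3.client("s3")

# Hypothetical bucket and key layout; one zip bundle per test.
bucket = "engine-test-raw-data"
test_id = "HF-2023-042"
s3.upload_file(
    Filename="/data/daq/HF-2023-042_raw.zip",   # bundle produced by the DAQ system
    Bucket=bucket,
    Key=f"raw/{test_id}/{test_id}_raw.zip",
    ExtraArgs={"Metadata": {"test-id": test_id, "engineer": "j.doe"}},
)
```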

Once that data is in S3, it's very easy to set up lifecycle policies to make sure that it's archived, and we'll talk about how that is automated a little bit later on. But one thing to note is that once that data is uploaded into S3, it creates an object-created notification, which an event bus like EventBridge can listen for and route to specific processing tasks, and this automates it for us.

So whenever a new test is finished and the data is uploaded, that processing task can kick off for us. And like I said, we can also automate the archiving process. You don't have to delete the data after you process it; you can keep it there so you can go back and look at it. It's very important in the aerospace and satellite industry that we have auditing and can look back on previous events. If anything ever goes wrong or if we have to change our designs later on, we can go back and see what took place, reprocess it, or piece things back together.
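Here is a rough sketch of what that EventBridge wiring could look like with boto3, assuming the bucket has EventBridge notifications turned on; the rule name, bucket, prefix, and target ARN are all hypothetical.

```python
import json
import boto3

events = boto3.client("events")

# Match "Object Created" events for new test bundles under the raw/ prefix.
rule_name = "engine-test-raw-upload"
events.put_rule(
    Name=rule_name,
    EventPattern=json.dumps({
        "source": ["aws.s3"],
        "detail-type": ["Object Created"],
        "detail": {
            "bucket": {"name": ["engine-test-raw-data"]},
            "object": {"key": [{"prefix": "raw/"}]},
        },
    }),
    State="ENABLED",
)

# Route matching events to the processing step (here, a hypothetical Lambda that
# submits the Batch job; the Lambda also needs a resource-based permission for EventBridge).
events.put_targets(
    Rule=rule_name,
    Targets=[{
        "Id": "submit-batch-job",
        "Arn": "arn:aws:lambda:us-east-1:123456789012:function:submit-engine-test-job",
    }],
)
```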

Using our S3 Glacier archive tier gives you that long-term storage at very low cost, and it makes it a lot easier to automate that process as well. There is another subset of data we have to consider, which we're calling the calibration data, but essentially this is data that's independent from the test, right?

You have all these sensors out on the stand and you want to make sure that they're measuring things correctly. Oftentimes engineers will go out and calibrate their sensors from time to time to make sure things are operating and collecting information correctly. We'll need this information later on when we actually process the data and correct any miscalibrations on the sensors.

Because these are independent, taken either before or after a test, we'll store them in a different S3 bucket and we'll turn on versioning there as well. So we have our data from the test and we have our calibration data.
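As a sketch of those two storage settings with boto3, versioning on the calibration bucket and a Glacier transition on the raw-data bucket; the bucket names, prefix, and 90-day timing are illustrative assumptions.

```python
import boto3

s3 = boto3.client("s3")

# Enable versioning on the calibration bucket so older calibration files are
# preserved when new ones are uploaded.
s3.put_bucket_versioning(
    Bucket="engine-test-calibration",
    VersioningConfiguration={"Status": "Enabled"},
)

# On the raw-data bucket, transition test bundles to a Glacier tier after 90 days.
s3.put_bucket_lifecycle_configuration(
    Bucket="engine-test-raw-data",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "archive-raw-test-data",
                "Status": "Enabled",
                "Filter": {"Prefix": "raw/"},
                "Transitions": [{"Days": 90, "StorageClass": "GLACIER"}],
            }
        ]
    },
)
```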

Now we need to actually process it, and AWS has a wide number of tools to do the actual compute work, crunch the numbers, and take the data and come up with results. AWS Lambda functions are really good if you're just doing very simple processing. We've seen customers who have CSV files that come off of the data acquisition systems and they just need to run them through some Python code to do a really quick analysis and data reduction on the information.

But for larger operations where you have a little bit more compute going on, or you have different data sources that you need to break down, we would recommend using AWS Batch as a batch processing service. What this lets you do is use a container to actually run the compute, and this is really great for customers who already have an investment in their code or their development process: they can reuse the same code to do the data processing.

What Batch will also do is orchestrate the actual compute required to run that container. So you can have Batch use, for example, AWS Fargate, our serverless container service, to run the container and come up with the results.
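A minimal sketch of what kicking off that Batch job might look like from Python; the job queue, job definition, container command, and S3 paths are hypothetical stand-ins for whatever your processing code expects.

```python
import boto3

batch = boto3.client("batch")

# Hypothetical Fargate-backed job queue and a job definition whose container image
# holds the existing data-reduction code; the S3 key would come from the
# EventBridge event that triggered this step.
response = batch.submit_job(
    jobName="reduce-HF-2023-042",
    jobQueue="engine-test-fargate-queue",
    jobDefinition="engine-test-data-reduction:3",
    containerOverrides={
        "command": [
            "python", "reduce.py",
            "--input", "s3://engine-test-raw-data/raw/HF-2023-042/HF-2023-042_raw.zip",
        ],
        "environment": [{"name": "CALIBRATION_BUCKET", "value": "engine-test-calibration"}],
    },
)
print("Submitted Batch job:", response["jobId"])
```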

So what's actually happening in the container? What is going on with the data that's coming off the test stand? We have our single file, or single object, with all of the raw sensor data: the thrust, the flow measurements, the temperature, the heat, the vibrations from accelerometers if you have those. We capture all of that data and then unpack it from the single file.

The first thing that usually happens is what we call a data reduction. Engineers don't necessarily need to see the entire event. They might be looking for specific changes in thrust, so throttling the engine up or down to see what happens during that event, or, for larger systems, gimbaling, moving the engine. Depending on what the customer is doing and what they're looking for in that test, they'll isolate specific time periods within that test and just focus on those.

Then they'll take the data and make the calculations. Earlier we talked about specific impulse, the chamber pressures, the deltas, and the temperatures, and they'll compare those against what they expect. The designs we talked about earlier, the computational fluid dynamics, the CAD simulations, those are the expected results we'll compare against: is this a pass or fail, within the allowed limits of what we expected when we created the design?

With that, customers will then generate a summary of the information: OK, did this meet our expected results, here is the data, and create information that is useful for the engineers rather than having them sift through all the raw data.
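To make that concrete, here is a small sketch of the kind of reduction and summary calculation the container might do with pandas; the column names, time window, and Isp limits are hypothetical, and a real pipeline would also apply the calibration corrections first.

```python
import pandas as pd

G0 = 9.80665  # standard gravity, m/s^2

def reduce_firing(raw: pd.DataFrame, window=(0.0, 10.0), isp_limits=(300.0, 360.0)) -> dict:
    """Reduce one firing: isolate a time window and compute summary values.

    Assumes `raw` has columns: t (s, relative to T-zero), thrust_n (N),
    fuel_kgps and ox_kgps (propellant mass flow, kg/s), chamber_pressure_pa (Pa).
    """
    # Data reduction: keep only the time period the engineers care about.
    cut = raw[(raw["t"] >= window[0]) & (raw["t"] <= window[1])]

    # Specific impulse ~ thrust / (total mass flow * g0), averaged over the window.
    mdot = cut["fuel_kgps"] + cut["ox_kgps"]
    isp = (cut["thrust_n"] / (mdot * G0)).mean()

    return {
        "peak_thrust_n": cut["thrust_n"].max(),
        "mean_isp_s": isp,
        "peak_chamber_pressure_pa": cut["chamber_pressure_pa"].max(),
        "duration_s": window[1] - window[0],
        # Compare against the expected limits from simulation: pass/fail.
        "isp_pass": isp_limits[0] <= isp <= isp_limits[1],
    }
```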

So with this output summary, I'll pass it back over to Nate, who will talk about how we're storing that data for access over time. Thank you.

I'm actually gonna grab a quick sip of water here, pardon me. So there are three different types of data that we need to think about storing. The first, which we've been talking about a lot, is the time series data, and in the use cases I've seen, that's the bulk of the data. The largest volume of the data is the actual time series data coming out of data reduction.

But there are two more we'll want to talk about. One is what I'll call summary values: outputs that summarize the entire test and are not themselves time series. Sometimes those are pass/fail values about whether things were within parameters; sometimes they're single numbers that summarize the test, for example a peak thrust or a 90th percentile temperature, single values that summarize what happened but are not time series data.

The other is metadata. We have things like start and stop times of tests and firings, the name of the responsible engineer for the test, information like the serial number of the device under test, and all of that needs to be stored along with the test data.

So we'll talk about all three of these different types of data and how we would recommend storing them. For the time series data, we've already talked a little bit about taking advantage of object storage, and I'm going to dive a little bit deeper into that. I recommend looking at storing that data on S3, in what I'll call a portable format, something like CSV or Parquet.

Part of the reason for that, and I'll talk about Parquet more in a second, is that by using those formats, what you get is integration with other services that can work with that data on S3, which we'll talk about, and access to a broader ecosystem of services that can process that data and automate its processing.

I think Parquet is particularly interesting. Historically it's thought of more as an analytics format, that's the world it comes out of, but I'm seeing it used more and more for engineering use cases, because, back to what I talked about earlier in the data store conversation, it can be really efficient for the types of data access we tend to see from the consumers of this data, where oftentimes they're going to want to take a particular measurement or a subset of measurements across an entire time range.

Parquet, as a columnar format, makes those kinds of data accesses more efficient. It also has really good built-in compression options that can reduce data volume and reduce the cost of storing the data. The trade-off, though, is that while that's great for data access, as an interface Parquet comes out of more of an analytics world.

We've started seeing more uptake on it in the engineering world, but if you have, for example, specific simulation tools that don't know how to read a Parquet file, it's not a lot of use to you. So if that's the case, you've got a trade-off to think about: whether the format that stores the data and allows efficient access causes you too much inconvenience or cost in terms of integration with tools that may not support it.

And whether you want the complexity of having to add, for example, a translation layer on data retrieval if something expects a TDMS file instead of a Parquet file. And to give you an example of some of those integrations I was talking about earlier, rather than just reading the data back from S3 directly:

Amazon Athena is another service you could potentially use to read this data back, if it's stored in those formats. Athena is a service that will let you write SQL-like queries, or even now Spark jobs, that it will automatically run on the data as it sits in S3. The compute in Athena is serverless, it's a managed service, and it's divorced from the data.

So you can run SQL queries on the data as it sits in S3 without having to move it into another database or data store, which could be really interesting, especially for these use cases where access to a lot of this data, especially once it's been archived for a while, may be fairly infrequent in a lot of the customer situations I've seen.
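As a sketch of that pattern: writing a firing's reduced time series to Parquet on S3 with pandas, then pulling one measurement back over a time range with an Athena query from boto3. The bucket, database, table, and column names are hypothetical, and it assumes a Glue/Athena table has already been defined over that S3 prefix.

```python
import boto3
import pandas as pd

# Write the reduced time series for one firing to Parquet on S3 (needs pyarrow and s3fs).
reduced = pd.DataFrame({
    "t": [0.00, 0.01, 0.02],
    "thrust_n": [0.0, 1.2e5, 2.4e5],
    "chamber_pressure_pa": [1.0e5, 2.1e6, 4.0e6],
})
reduced.to_parquet("s3://engine-test-results/timeseries/test_id=HF-2023-042/firing_1.parquet")

# Query one measurement over a time range with Athena, assuming a "timeseries" table
# in an "engine_tests" database defined over that prefix.
athena = boto3.client("athena")
query = athena.start_query_execution(
    QueryString="""
        SELECT t, chamber_pressure_pa
        FROM timeseries
        WHERE test_id = 'HF-2023-042' AND t BETWEEN 0.0 AND 10.0
        ORDER BY t
    """,
    QueryExecutionContext={"Database": "engine_tests"},
    ResultConfiguration={"OutputLocation": "s3://engine-test-results/athena-output/"},
)
print("Athena query started:", query["QueryExecutionId"])
```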

Another example is AWS Glue, which again we tend to think of as a data and analytics service. But if you're trying to enable more advanced analytics, looking at this time series data across multiple tests, potentially multiple years, and the data volumes grow, using a distributed data processing tool via something like Glue could be a very efficient way to process that data.

So by using these formats that are portable and integrate with these other tools, you get a little bit of built-in flexibility and agility to use those other tools, even if you don't necessarily need them today.

There we go. Mostly we've been talking about the time series data, so now let's talk about those other two types of data. I've introduced the concept of what I'm calling a test catalog here. The purpose of this test catalog is to capture those other two pieces of data: metadata and parameters from the test, and summary metrics from the firing or from the test, depending on how you think about and organize that data.

The purpose of putting all of those into this database is to solve two problems. One is the data governance one I talked about at the beginning of the talk, where we've talked to a lot of customers who struggle to find that test they ran that time, or even want to be able to search for a test that meets a set of parameters.

So we're capturing all of that data in a single system so that it can be found, stored in a way that's searchable, separate from the raw time series data, and providing a pointer to that raw data. I think the other thing that pulling this data into a single data store lets us do is reporting and analytics across the different tests that are in that catalog, back to pulling those summary metrics up.

We've seen a lot of customer use cases where they're interested in being able to do analysis and say: OK, if I look at these metrics, how does this engine perform in family, compared to its peers? Do I have trends over time that are interesting? We've heard a number of anecdotes from customers that that's something they want to be able to do, but not all of them have figured out how to do it with their current systems.

Based on that description, I think sometimes the assumption I hear out in the field is that it sounds a lot like a use case you might use something like OpenSearch or a data warehouse for. I've already given it away on the slide here, but in terms of starting simple, I actually think I would start with a relational database.

Sometimes everybody squints at me kind of funny: a relational database? What would you use a relational database for here? But hear me out, I think there are actually a couple of reasons why this may make a lot of sense and be pretty efficient for the type of data we're talking about here.

First, at least in the systems I've seen out in the field, and catch me afterward and tell me if you've got data contrary to this, generally what I've seen is that the data that would live in this test catalog is fairly small and there's a light load on this system.

We're effectively inserting a row into a table every time we do a firing, which for most of the folks I've talked to, even if they test frequently, is not that frequent; we're talking about a couple of rows a day.

And generally, even with all of the summary metrics and test parameters, it's 100 or fewer columns if you thought of it as one big table. So scalability isn't really a huge concern there.

The other benefit to using a relational database here, and I'm specifically recommending looking at Aurora Serverless, and if you're gonna make me pick a flavor, I'm gonna pick Postgres because I like Postgres better, but you could use MySQL as well,

is that it gets us easy integration with a number of other tools that know how to talk to a relational database. So you get a really big selection of tools that you can use as a front end for that data, which we'll talk about in a little bit.

The other reason is that with Aurora we're looking at a managed database service, so it's managed, you don't have to manage the compute. And by picking a serverless variant of Aurora, the cost of the database is based on the storage and the number of queries.

So with a database like this, which in most cases is going to be a fairly light load, a very light write load and a pretty light read load, and generally a fairly constrained amount of data, we expect that that will be really cost efficient. That's part of why I decided to put it up here.

So hopefully that explains why I haven't gone crazy and why I think a relational database might actually be a really cost-effective, simple fit. Obviously, if it's not, if your data is larger than that, if you outscale it in time, you've got a lot of easy options to move up to systems that can scale a little bit further or give you a little bit more flexibility.

For example, you could move that data into OpenSearch, or you could move it into Redshift. But I would suggest that, if in doubt, Aurora Serverless is probably a good place to start here for storing that data and making it available.
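For illustration, a sketch of what a simple test catalog table on Aurora PostgreSQL might look like, with one row inserted per firing; the schema, column names, and connection details are hypothetical, not a schema from the session.

```python
import psycopg2  # any Postgres client works; the connection details below are placeholders

DDL = """
CREATE TABLE IF NOT EXISTS test_catalog (
    test_id              TEXT NOT NULL,
    firing_number        INTEGER NOT NULL,
    part_number          TEXT,
    serial_number        TEXT,
    responsible_engineer TEXT,
    start_time           TIMESTAMPTZ,
    stop_time            TIMESTAMPTZ,
    duration_s           DOUBLE PRECISION,
    peak_thrust_n        DOUBLE PRECISION,
    mean_isp_s           DOUBLE PRECISION,
    isp_pass             BOOLEAN,
    raw_data_s3_key      TEXT,  -- pointer back to the raw time series bundle in S3
    PRIMARY KEY (test_id, firing_number)
);
"""

conn = psycopg2.connect(
    host="engine-test-catalog.cluster-example.us-east-1.rds.amazonaws.com",
    dbname="catalog", user="app", password="change-me",
)
with conn, conn.cursor() as cur:
    cur.execute(DDL)
    cur.execute(
        "INSERT INTO test_catalog (test_id, firing_number, responsible_engineer, duration_s, "
        "peak_thrust_n, mean_isp_s, isp_pass, raw_data_s3_key) "
        "VALUES (%s, %s, %s, %s, %s, %s, %s, %s)",
        ("HF-2023-042", 1, "j.doe", 10.0, 2.4e5, 342.1, True,
         "raw/HF-2023-042/HF-2023-042_raw.zip"),
    )
```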

And with that, I will hand it back over to Matt, who's going to talk about managing the data reduction code.

Yeah, very briefly. One advantage of doing all this in the cloud is that we can start to separate the actual testing of the engine from the software required to process the testing data.

Typically with aerospace, everything is very closely integrated, right? It's kind of one-off cases. And we haven't yet talked about how we get the code that does the processing in that container up into AWS.

So one thing we would definitely advise is using something like AWS CodePipeline to manage the software that is doing that data reduction and processing.

If there's one common trend coming from all of this, it's that archiving and version control of all of this information is really important for aerospace customers.

Having something like CodePipeline and a code repository of your choice, like AWS CodeCommit,

lets you do that archiving and version control of your software as well. CodePipeline will also introduce manual checks: who was the person who made the change to the code, and who was the person who reviewed it and approved it before it got sent over to the container repository, where it is then used by the processing?

So CodePipeline can help automate that. It also supports encryption at rest, and you get audit logs any time someone interacts with the code that is doing the processing of the data.

So you can look back on it and say: who made the changes to this code, and which version were we using for this test? If you find out later on that your calibration corrections have changed, and maybe a sensor was picking up incorrect information and now you have incorrect values contributing to your data,

you can roll it back or change the code version and rerun that processing on the test data. That's why we introduced this here, and it's another advantage of having this running in the cloud, where you can go back and reuse that data if you need to.

And now Nate's gonna wrap us up with how we are actually consuming the data that we've processed and stored.

Thank you. So, this is going to be a common theme: think about your data consumer, think about the interfaces they need, think about what tools are going to make that data use efficient for them.

We're going to talk about four different ways to enable data access once we've processed and stored this data. The first is enabling access to the actual time series data. I know that says SageMaker; bear with me for a second, I'm going to explain why we're not talking about ML just yet.

Generally, I've seen two things, and I've got a recommendation to think about. A lot of folks out in the field have built what amounts to custom user interfaces to enable access to this data, interfaces that generate the dynamic plots their organization likes, the specific things they want to look at.

They've just gone ahead and made the investment of building effectively a plotting web UI to enable access to this time series data and go and view it.

The other suggestion I have, which I think has been picking up a little bit of steam lately, is looking at JupyterLab notebooks as an analysis environment. The reason I think that's interesting, for those not familiar, is that JupyterLab notebooks are effectively a Python environment that lets you build up, step by step, a runbook of your data processing steps and see the results.

You can ask a question of the data and see the answer, ask another question of the data and see the answer, and you can iteratively build through it in that way.

For engineers doing data analysis who are willing to learn a little bit of Python if they don't already know it, and we're starting to see more and more Python used in these environments, they can take advantage of that environment and of software tools like NumPy, SciPy, et cetera, to actually build their data analysis.

But as they're doing that, if you do it in a JupyterLab notebook, you're effectively building a runbook of that analysis. If it's more exploratory to start with, that's fine; that environment will let you do that.

But as you find that you're doing the same analysis over and over again and getting comfortable with that process, you're already building steps towards turning it into a runbook that other engineers, who are less comfortable doing it from scratch, can build off of, and you're well on your way to building automation and getting to the point where effectively what you've ended up with is just a Python script.

And that could become a subsequent batch job that you just automate, once you're comfortable that the plots or the output analysis it's generating are what you need.
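As an example of the kind of notebook cell this ends up being, here is a sketch that reads one firing's Parquet back from S3 and looks at a vibration spectrum with SciPy; the bucket, column names, and 10 kHz sample rate are assumptions for illustration.

```python
import pandas as pd
import matplotlib.pyplot as plt
from scipy import signal

# Read one firing's reduced time series back from S3 (needs s3fs + pyarrow).
df = pd.read_parquet("s3://engine-test-results/timeseries/test_id=HF-2023-042/firing_1.parquet")

# Quick look at chamber pressure over the firing.
df.plot(x="t", y="chamber_pressure_pa", title="HF-2023-042 firing 1")

# Vibration spectrum from an accelerometer channel, assuming a 10 kHz sample rate.
fs = 10_000
freqs, psd = signal.welch(df["accel_g"].to_numpy(), fs=fs, nperseg=4096)
plt.figure()
plt.semilogy(freqs, psd)
plt.xlabel("Frequency (Hz)")
plt.ylabel("PSD (g^2/Hz)")
plt.show()
```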

There are a number of good ways to host JupyterLab notebooks so you don't have to do it all yourself. One I think is interesting is just taking advantage of SageMaker notebooks or SageMaker Studio.

Yes, on its surface SageMaker is a machine learning service, but one of the things it does is run hosted JupyterLab notebooks.

It's not the only way, but it's one of the ones I would suggest. And back to the top: if you are interested in exploring this data with machine learning, you're already part of the way there, right?

The second is that we tend to see a lot of use cases where customers want to analyze this data with what I'll call, for lack of a better term, thick-client simulation or analysis software: the kind of stuff you traditionally would have run on an engineering desktop or laptop.

My suggestion is to look at VDI; I would specifically recommend looking at AppStream 2.0 first as a way to move those data processing workloads close to the data, reduce the amount of data movement you have to do, and, for folks for whom this data is very large, save the time of having engineers repeatedly download those large data files and handle them locally.

We talked about the test catalog, and I organized these in a strange way because the test catalog is at the top and QuickSight is on a third line here, but imagine QuickSight connected to the test catalog up at the top.

My suggestion here is that QuickSight gives you an easy way to put a front end on that test catalog, both to just give access to the tabular data in the test catalog,

so engineers can go back historically, review the tests that are available, and have pointers to the actual raw time series data in whatever environment they're going to review it in,

but also, by putting QuickSight in front of that data and by storing some of the summary values and pass/fail values in that test catalog, you enable reporting and analysis, and so trending within families and things like that becomes something you can do in QuickSight.

Again, this is another thing that some organizations may choose to build into a custom web UI and make that investment, but for getting started, I think QuickSight is a great way to enable access to that data once it's stored.

And again, part of the benefit is that by using a relational database, or something QuickSight can talk to, we get an easy, managed front end to enable access to that data without having to write a custom web application if you don't want to.

And last, enabling direct access to the data for other processes and other tools: being able to use the SDKs, CLI tools, shell scripts, what have you, as scrappy or as integrated as you need it to be, to enable direct data access for users with authorized access to that raw data.
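A small sketch of that direct access with boto3, either downloading an object with the caller's own credentials or handing out a time-limited presigned link; the bucket and key here are hypothetical.

```python
import boto3

s3 = boto3.client("s3")
key = "timeseries/test_id=HF-2023-042/firing_1.parquet"  # e.g. the pointer stored in the test catalog

# Direct download for a script or tool running with its own AWS credentials.
s3.download_file(Bucket="engine-test-results", Key=key, Filename="firing_1.parquet")

# Or hand an engineer a time-limited link without sharing credentials.
url = s3.generate_presigned_url(
    "get_object",
    Params={"Bucket": "engine-test-results", "Key": key},
    ExpiresIn=3600,  # one hour
)
print(url)
```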

And with that, I'm going to hand it back over to Matt, who's gonna wrap us up and talk about how you might be able to get started.

Yeah, thanks. So we have our architecture: we take the data off of the test stand, we've processed it, and now we have it stored in a way that lets us do two things.

The first is bulk analysis across all of our tests, not just a specific engine or a specific component. If you have different engines and different test stands doing different things, having that consolidated in one view makes it really easy to do bulk analysis or some of those machine learning tasks that Nate mentioned earlier.

It also gives you a historical view. You can look back over time to see how the engine or component is trending.

What is our progress looking like? Are we making progress? Is the engine becoming more efficient, is the specific impulse better, are thermals within better limits, or are we regressing? Do we need to make some changes to the design?

Nate mentioned using QuickSight to create a really quick dashboard, and we have an example of one here. I apologize for it being small, but some of the things we're looking at here are the specific test ID, the part number, and which engineer was in charge of the test.

Also the firing number, so in which sequence the engine was ignited. If you have an engine that's doing multiple lights, and you want to ignite the engine multiple times to make the spacecraft move, those would be different thruster firings, right?

So you want to isolate those out and see which one we're analyzing here. Also the duration, the time that it took, and then, importantly, the S3 object ID where that raw data is, right?

The engineer can just look at this dashboard, this high-level view of the test, and if they notice something they want to dive deeper into, select that S3 object ID and get the raw data to bring into their engineering tools to do more analysis on it.

The exciting thing about this is that each of the services we talked about is already available in AWS today, so you can go ahead and use them, use this architecture, and build off of it.

And for our customers who operate in the US and have really strict ITAR restrictions and compliance requirements, all of the services are also available in the GovCloud regions as well.

So it gives you a really good foundation for your data that you can build on and integrate with other applications. If you're doing different tests, or if you have, say, a parts division in charge of creating the components, separate from your testing division,

you can integrate that with other internal tools you might have, because you have that foundation of data that you can build on top of.

So, yeah, we definitely recommend you take a look at the architecture, and come talk to us afterwards. We encourage you to build on this; rocket engine testing is never the same, right? Everyone does things a little differently.

So please iterate off of it, and we'd love to learn how you're using this. With that, we'll open it up for some Q&A if anyone has questions.

And yeah, thank you all so much. Please, please, please do the survey; it's really important that we get the feedback and we really appreciate it. Thank you so much for joining us.

Thanks to the audio-visual crew, thank you to the staff for ensuring a smooth session, and thank you all for coming. Have a good night.
