AI acceleration at the edge

Alex: Right, brilliant. Thank you very much everyone for coming along to our session today. We're gonna be spending the next hour talking to you about deploying, scaling, and creating solutions for the edge, utilizing AI to accelerate them. I'm Alex White, sales engineer with Intel. Joining us are some of our customers and partners, as well as Mohan from our cloud team, to talk about all of this together.

So just to set the agenda for today: we're gonna start off by talking about the true business outcomes of AI and how it can be truly revolutionary across all businesses, of all sizes, and of course across every type of workload as well. Then we're gonna look at exactly where to deploy edge AI; the edge is not a single spot, it's a broad spectrum, and it's about deploying the right workload at the right location to make sure you get the most benefit from your data and pass that through to your customers as well.

And as we go through, we're going to be introducing some of our customers and partners here to talk about how they're able to create AI solutions, integrate AI into their customers' workflows, and of course bring AI to those extreme environments as well.

Towards the end, Mohan's gonna come on and talk about using our latest generation Xeon processors to accelerate AI on the CPU, utilizing the inbuilt acceleration with AMX in those processors to truly make the most of the hardware you have, especially when you deploy it in EC2 instances in the cloud as well.

So with that, before we get into anything else, can I have a quick show of hands from everyone in the audience: who already has an AI strategy for their business at the edge? I see a few hands; that's very good.

So historically, many AI workloads were deployed at the edge out of necessity, not because of the value that AI can bring: things like connectivity issues or latency requirements, where you have to have your data and your processing physically as close as possible to the platform itself. But as we've progressed and there's more and more compute available at the edge, we're now seeing a strong move towards deploying AI at the edge for its own value, which truly revolutionizes the experiences your customers expect at the edge, as well as making sure the data you have at the edge is processed and you stay ahead of the competition. And this truly isn't just a topic for large digital-first enterprises. This is something which is gonna be truly revolutionary across all segments and all businesses. To stay on top of this and ahead of your competition, having an edge AI strategy is gonna be really important going forwards.

So now you know there's a lot of value here and it's imperative that you deploy edge AI solutions: where exactly should you deploy? There are many different areas which make sense for different deployments. First of all, we have the cloud and enterprise, and of course that's utilizing AWS services like SageMaker to create hybrid experiences: having your centralized model trained in the cloud and then distributing that inference far and wide. This is really powerful; you're able to get the benefits of AWS services, the cloud, the scale, and then of course make sure the inference happens as quickly as possible as well.
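To make that train-in-the-cloud, infer-at-the-edge pattern concrete, here is a minimal sketch using the SageMaker Python SDK; the entry script, role ARN, bucket, and version strings are placeholder assumptions, not anything shown in the session.

```python
# Minimal sketch: train a model centrally on SageMaker, then pull the
# artifact down for distribution to edge devices. All names below
# (script, role, bucket) are hypothetical placeholders.
from sagemaker.pytorch import PyTorch

estimator = PyTorch(
    entry_point="train.py",                               # placeholder script
    role="arn:aws:iam::123456789012:role/SageMakerRole",  # placeholder role
    instance_type="ml.m5.xlarge",                         # Xeon-backed instance
    instance_count=1,
    framework_version="2.1",
    py_version="py310",
)
estimator.fit({"train": "s3://my-bucket/training-data/"})  # placeholder bucket

# estimator.model_data points at the trained artifact in S3, which can
# then be shipped to edge devices for local inference.
print(estimator.model_data)
```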

Then we also have the edge itself, which spans the traditional on-prem edge or industrial edge and of course the far edge as well. We're seeing an increasing movement towards the far edge, so that you're able to process even faster, distribute further, and of course be more agile with that compute as well.

The other side, which is often forgotten, is very much around clients and workstations. Mobile clients and workstations are a key area of deployment and data creation for data scientists doing that complete AI journey, as well as running training on their high-performance workstations. And in temporary environments as well, client devices can be set up in new areas running inferencing locally, and of course accelerating on smaller machines to improve the end-user experience as well.

And of course, with this there are many different types of architectures which make more or less sense depending on where you deploy: CPUs, GPUs, FPGAs, etc. all have very different hardware architecture requirements, and commonly there are very different software architectures you have to build to make that possible. At Intel, we're taking a very holistic approach, looking at how we can make sure that all the different hardware architectures, which have their benefits in those different areas of deployment, can be easily utilized, and that models can be deployed across multiple architectures, so that you as customers and partners can deploy with the right hardware and are not constrained to those individual deployments as well.

One of the key things for making the most of data and AI deployments, for businesses and individuals alike, is being able to deploy, iterate, and fine-tune fast. This has been a common struggle for many years now: deploying AI workloads can be very time-consuming, and with that, you have to have a lot of specialized knowledge within your organization as well. At Intel, we're making sure that we can make that journey much easier for all of you.

First of all, we have a really powerful new software tool called Intel Geti. Intel Geti enables you to create and train a computer vision model in a fraction of the time it has historically taken, as well as using a fraction of the data to deploy these workloads. And this is going to be revolutionary for many organizations that are just starting to look at what the right type of deployment is going to be: I want something which captures the insights, but I don't know exactly where to start. This is a really powerful way to get those initial computer vision models up and running.

Now, once you have a computer vision model, and this can be in any sort of format like TensorFlow, Caffe, PyTorch, ONNX, and of course your Geti models, the Intel Distribution of OpenVINO toolkit is a really powerful toolkit that enables you to choose exactly where you want to deploy. With OpenVINO, it's a single command-line input, and from there it creates an IR (intermediate representation) version of your model. With that, we're able to have hardware abstraction for your models: you can choose exactly the right type of architecture which makes sense. Moving a model from a CPU to an Intel GPU is as simple as a single flag within your command line when you spin up your inference, and these are going to be truly important technologies as we see more and more segments and workloads come on board with utilizing edge AI.
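As a rough illustration of that flow, here is a hedged sketch using the OpenVINO Python API (assuming a recent, 2023 or later, release); the model path is a placeholder.

```python
# Sketch of the OpenVINO flow described above: convert a model to IR,
# then pick the target device with a single string. Paths are placeholders.
import openvino as ov

core = ov.Core()

# One-step conversion from a framework format (e.g. ONNX) to OpenVINO IR.
model = ov.convert_model("model.onnx")    # placeholder input model
ov.save_model(model, "model_ir.xml")      # writes the IR .xml/.bin pair

# Hardware abstraction: moving from CPU to an Intel GPU is just the
# device string, mirroring the single command-line flag mentioned above.
compiled_cpu = core.compile_model(model, device_name="CPU")
compiled_gpu = core.compile_model(model, device_name="GPU")
```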

One of the other really important things is the emergence of new sector-specific tools which enable much quicker adoption in segments where there may be historical geographic requirements or regulatory requirements as well. Federated learning is a great example of one of those technologies: it enables you to have a single, centrally trained model to which multiple organizations and entities can contribute training data and inference data, helping create a bigger picture and a more accurate model, with the resulting inference deployed out to the edge, and none of those entities ever have access to each other's data. So you're securing the data, but you're also taking advantage of a much larger data set to help with the outcomes as well.
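As a conceptual illustration only (not tied to a specific framework such as Intel's OpenFL), here is a minimal federated-averaging round in NumPy: each party trains locally and shares only weights, never raw data.

```python
# Illustrative federated averaging: each organization computes a local
# update on its private data; only weights travel to the coordinator.
import numpy as np

def local_update(global_weights: np.ndarray, private_data) -> np.ndarray:
    """Stand-in for one organization's local training step."""
    gradient = np.random.randn(*global_weights.shape) * 0.01  # toy gradient
    return global_weights - gradient

def federated_round(global_weights, participants):
    updates = [local_update(global_weights, p) for p in participants]
    return np.mean(updates, axis=0)   # aggregate without seeing any raw data

weights = np.zeros(10)
for _ in range(5):                    # five training rounds
    weights = federated_round(weights, participants=[None, None, None])
```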

Now, we've talked about the value of AI at the edge. We've talked about where to deploy and how to overcome some of those challenges. But the next question which is commonly asked is: how do we make sure these deployments are secure? Of course, you're trusting your organization's decision-making to this, and with that, we need to make sure we secure these edge platforms as well. When we look at security, it's very important that we take a holistic approach, and this really means not just securing the workload, not just securing the OS, but truly securing the entire platform as well.

Using technologies like SGX and TDX at the edge enables you to create secure enclaves, so you can have your model and your workload isolated in an individual enclave on the platform. If you're running in a multi-tenant environment or you simply don't trust the host, that's a really important way to overcome this as well. Also, making sure that your supply chain is secure is very important. These devices are gonna be shipped to multiple locations, possibly over long distances, where you might not be able to trust that the device which arrives at the edge is exactly the same as it was when it left your facility.

So at Intel, we're aligned to a lot of industry standards to make sure security protocols are pulled through and can be utilized as easily as possible on these edge devices. One example is FDO (FIDO Device Onboard), a standard created by the FIDO Alliance, ensuring that supply chain security is utmost as well. So if you have your fleet management tool set, these devices can be onboarded securely, and any workload which is subsequently spun up can be truly trusted, backed by a hardware root of trust as well.

Now, I'm gonna take a couple of minutes to look in a little more detail at the key workload and key technology for edge AI which is truly driving a lot of today's edge inference demand, and that is of course computer vision. Computer vision is truly a technology which is gonna touch every single segment: for example, in retail, creating smarter self-checkouts; in industrial, doing defect detection in real time across many streams in parallel. It's gonna be revolutionary for improving the quality of product output as well as delivering those next-generation experiences to customers.

But when we look at computer vision, it's a very commonly used phrase. What exactly is computer vision, and what are you trying to achieve with it? With computer vision, there are typically three things that you're looking to do. First of all, there's identification. This is where you have a video frame which you're looking to process, and you want to understand exactly what is in that frame: is there a car? Is there a bike? Is there a plane? With that, you may also want to be doing classification and tracking: classifying exactly what and where the object is, looking at exactly where in the frame it was previously, tracking it over time, and with that being able to extract those local insights on the device.
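As a minimal sketch of that identification step, here is what running an SSD-style detector on one frame with OpenVINO might look like; the IR path, input size, and output layout are assumptions for illustration, not a model shown in the session.

```python
# Minimal sketch of the identification step: run an SSD-style detector
# on one frame with OpenVINO and report what is in it.
import cv2
import numpy as np
import openvino as ov

core = ov.Core()
detector = core.compile_model("detector.xml", device_name="CPU")  # placeholder IR

frame = cv2.imread("frame.jpg")  # one frame from the video feed
blob = cv2.resize(frame, (300, 300)).transpose(2, 0, 1)[np.newaxis].astype(np.float32)

# Assumed SSD-style output: [image_id, class_id, confidence, x_min, y_min, x_max, y_max]
for det in detector(blob)[detector.output(0)].reshape(-1, 7):
    if det[2] > 0.5:  # confidence threshold
        print(f"class={int(det[1])} conf={det[2]:.2f} box={det[3:7].round(2)}")
```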

The third type of use case for computer vision is really around text extraction. This is a really important way to take text, symbols, and icons from video feeds and images and bring them forward into a digital format. With this, you can apply things like natural language processing models, so you can understand the true sentiment of that text, as well as text summarization models, so you get a true understanding of exactly what people are saying, what text is being input, and what that output really means, in a much more digestible format as well.
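A hedged sketch of that text-extraction flow follows, using pytesseract for OCR and a Hugging Face sentiment pipeline as stand-ins; an edge deployment would more likely use OpenVINO-optimized OCR and NLP models, and the image path is a placeholder.

```python
# Sketch: extract text from a frame, then run sentiment analysis on it.
# pytesseract and the default HF pipeline model are illustrative stand-ins.
from PIL import Image
import pytesseract
from transformers import pipeline

text = pytesseract.image_to_string(Image.open("frame.jpg"))  # placeholder frame

sentiment = pipeline("sentiment-analysis")  # downloads a default model
print(text)
print(sentiment(text[:512]))                # truncate to the model's context
```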

And where is this used? A really great example is with one of our partners, ai.io, who are truly revolutionizing the way that sports and training academies utilize video data to improve experiences. So with that, please welcome Jonathan Lee on stage for ai.io.

Jonathan: Thanks Alex. I'm excited to talk about ai.io, where our mission is to democratize sports opportunities. We believe that everyone, anywhere in the world, should have the opportunity to be scouted by and potentially play for a professional team, using cutting-edge sports tech that was once only available to elite athletes. To do that, at ai.io we have a portfolio of products that bring AI to the cloud as well as to the edge, and Intel has been a key partner at every step.

It begins with AI Scout. AI Scout is how we bridge the gap between the aspiring athlete and the team: the pro team, the club, the university, the national team, the federation. When a player downloads the AI Scout app, they are immediately presented with a unique opportunity: an opportunity to be connected to those organizations. The app has a series of carefully selected drills. These drills have been designed and selected by our sports science team in conjunction with the pro teams, and they are intended to measure your athleticism, for example your speed, your agility, your coordination, as well as your technical ability: passing, shooting, dribbling.

When a player does the drills, they capture the videos using the app, and those videos get streamed up to the cloud, and that's where our AI models sit. Our AI models are trained on Intel Gaudi on AWS DL1 instances and optimized using OpenVINO to run on Intel Xeon Scalable processors in the AWS cloud. The models give us 2D and 3D information about how the athlete moves, where the ball is, where the cones are. So we see not just how the athlete moves but how she interacts with the ball: is she using her left foot or right foot? How far is the ball from her? We get a comprehensive view of how the athlete performs.

That data then translates into a score and a benchmark where we compare the athlete to other athletes in her demographic, and we provide that information to the club. The club can then look and see: hey, I want a midfielder, good speed, left-footed, and they can invite that person to trial for their club. Because we run on Xeon, that helps us keep our inference costs manageable, which is important for us because this is truly something we want to democratize and globalize. So whether you're in the UK or snowy Canada or India, now everyone has the same opportunity. And we partner with top teams: in the English Premier League, we partner with Chelsea and Burnley, and with all of Major League Soccer. To date, 111 players have been trialed for, selected by, or signed by pro clubs or national teams.

If you look under the hood of AI Scout, you find 3DAT. 3DAT, or 3D Athlete Tracking, is our collection of biomechanics and human movement analysis driven by AI and computer vision.

3DAT was originally developed for the Tokyo Olympics and has been used in the last two Games in broadcast as well as for coaching and training elite athletes. And this is an example of computer vision: we detect over 20 key points on the body to extract a 2D and then a 3D skeleton from the athlete. All this is done without any special sensors or suits, just computer vision and AI. And we can use any video source: it could be a mobile phone, it could be a broadcast camera, it could even be historical video.

With that 3D skeleton we extract from the video, we then run another set of models called inverse kinematics, which takes that 3D skeleton and turns it into natural, actual human movement. And from that 3D model, we can then compute over 1,400 biomechanics metrics, which gives us again a deeper insight and a comprehensive view into how the athlete moves.

These models were trained on Olympic athletes: track and field athletes, sprinters, jumpers, throwers, and winter sports athletes in snowboarding, speed skating, and figure skating. It is the largest and most diverse set of athlete data in the world, and it is what powers 3DAT. And 3DAT continues to get better as we train it on the AI Scout data.

As you'll see in a second, in our AI Labs, AI Labs is where we bring AI to the edge: we use AI in an elite performance lab to train and test athletes. And 3DAT underpins both AI Scout and AI Labs.

We can bring our solutions to your training facility, deployed as a permanent or semi-permanent solution or a mobile solution. In this case, when I say mobile, I don't mean a phone; I mean a truck that drives into the facility, drops off the trailer, the shell comes out, it's like a transformer, and then the glass sides push out. The networking, the hardware, the equipment is all there. You just bring the athletes, and here's what it looks like.

It is a collection of gold-standard sports science testing equipment mixed with AI, including 3DAT, where we bring all this data together in real time, connected and aggregated. So as an athlete, you would do one of these drills: it could be a gait analysis, a 3D body scan, or a cognitive test like reaction time. All that data gets processed and aggregated in real time, and there's a leaderboard that shows you how you're doing compared to your peers. Of course, the athletes love leaderboards.

That was the Chelsea training facility. And this is also where we bring our edge AI: as I mentioned earlier, our models are optimized using OpenVINO, which not only allows us to bring them from cloud to edge, it also allows us to use different Intel hardware, whether it's Xeon or a PC. We can tailor-fit the solution to what we need.

So in this case, you're looking at an athlete doing a countermovement jump: hands on hips, jumps up. Traditionally, you could measure things like how high she jumps, her vertical, and also how much force output she's generating using the force plates. But now with 3DAT and our multi-camera solution, we're adding another layer of data, which is her biomechanics. So we're combining the other information with things like how stable her knees are when she lands. So clearly, this is really cutting edge.

All that data is aggregated again into our control center, where we or the club can see the data from the lab and the data from AI Scout all in one place. And if you want a little slice of what it's like to be tested like an elite athlete, go visit the Intel booth, where ai.io has an activation that we created with Intel. You do a series of drills, physical and cognitive, you get a score, and you can see how you rank versus your fellow re:Invent attendees.

Alex: All right, thanks so much, Jonathan. I hope you really got a sense there of the importance of having optimized models and having the flexibility to deploy them in the right place for your workload and for your customers as well.

So now we're going to look a little bit beyond CPU-based inference and look at how we accelerate this with Intel accelerators as well. As part of the demo we've been running on the Intel booth, and there's still a little time for you to come and see this, we're running a complete media pipeline on a GPU accelerator on an edge blade as well.

The Intel Flex Series 140 GPU is an ideal GPU to accelerate edge inference, mainly because of its four discrete media engines, meaning that as you create media pipelines, all of your encoding workloads and the decoding workloads from your camera streams can be done directly on the GPU as well.

And with this, we've been able to build out a complete solution architecture which truly brings edge-native components together with cloud-native services. In this example, we're bringing RTSP camera streams onto the system (those will be network-based cameras), doing the decode directly on the GPU, and then doing video processing.

A common example of this is running an OpenVINO model at this stage, directly in line in the media pipeline. For example, in the Intel demo, we're identifying cars passing on a piece of road. With this, we're classifying the type of car and the color of the car, and then we're identifying where each car has come from and where it's going. From that, you can extract insights such as its speed and its route, and then generate analytics around that.

The next stage is all about re-encoding the stream to be more efficient on the network. As you bring these streams off the edge and onto the network, you want to conserve as much bandwidth as possible. So we're able to utilize AV1 natively on the GPU, which is a much more efficient codec than the likes of H.264 and H.265, making it a much more efficient way to transfer that data away from the edge.
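Put together, the stages look roughly like the sketch below. This CPU-side OpenCV sketch only shows the data flow; the production build keeps decode, inference, and the AV1 encode on the Flex GPU's media engines (e.g. via GStreamer / Intel DL Streamer). The RTSP URL and model path are placeholders.

```python
# Simplified data flow of the demo pipeline: ingest RTSP -> decode ->
# infer -> re-encode. Placeholders throughout; the real pipeline runs
# decode/inference/AV1 encode on the GPU, not in OpenCV on the CPU.
import cv2
import openvino as ov

core = ov.Core()
detector = core.compile_model("vehicle-detector.xml", device_name="GPU")  # placeholder IR

cap = cv2.VideoCapture("rtsp://camera.local/stream")  # placeholder camera URL
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    # ...preprocess the frame, run detector(...), extract car type/color,
    # track direction and speed, then hand the annotated stream to the
    # encoder (AV1 on the GPU in the real pipeline) before it leaves the edge.
cap.release()
```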

And of course, that goes back into the cloud, into an AWS Region, to make sure we're aligning to AWS services and taking advantage of them as well. We're using KVS to create the WebRTC streams, and Greengrass to help provision and control the edge device: a truly cloud-native approach, while making sure those edge elements bridge the gap as well.
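For the cloud side, creating the KVS signaling channel that the edge device uses for WebRTC can be sketched with boto3 as below; the channel name and Region are placeholders, and device provisioning itself would be handled by Greengrass.

```python
# Hedged sketch: create the Kinesis Video Streams signaling channel used
# for the WebRTC session. Name and Region are placeholders.
import boto3

kvs = boto3.client("kinesisvideo", region_name="us-west-2")
channel = kvs.create_signaling_channel(
    ChannelName="edge-camera-webrtc",   # placeholder channel name
    ChannelType="SINGLE_MASTER",        # one edge "master", many viewers
)
print(channel["ChannelARN"])
# The edge device then fetches its signaling endpoints from this channel
# to negotiate the WebRTC peer connection with cloud or browser viewers.
```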

And of course, with these solutions you might be deploying in areas which have additional thermal requirements, you might have dust concerns, and you'll want a system which is gonna be rock-solid stable for many years at a time. We need systems which make it possible to deploy in the right location with the right form factor.

Let me please welcome Michael Greene from OnLogic to talk a little bit more about edge hardware and accelerating edge AI workloads. Michael, over to you.

Michael: Thank you so much, Alex. I head AI solutions at OnLogic. I joined the company around nine years ago, a little shy of nine years now, and most of that time I've actually been leading engineering and product management. It's been really exciting seeing the company grow by almost a factor of eight in revenue during this time, but I recently made the switch because I'm really excited about the prospects of AI and all those new opportunities.

And as Alex pointed out at the beginning, we're only at the very early stage of this, and there is so much opportunity in it. That's why I'm now focusing on this area. So let's start with a brief few words about OnLogic. We're a global provider of industrial computing solutions and a leading hardware provider for the edge. The edge is really where we focus.

There, we often encounter environments with very hot temperatures, low temperatures, particulates, or dirt. In many such cases, reliability is really key for our customers, as is the ability to purchase these products over many years, unlike a commercial computer that's going to have a new successor after a year or two. So that is key.

For all of these things, that is where OnLogic focuses. On the slide here, you see some of our main product lines; we refer to them as industrial, rugged, panel, and edge servers. In terms of markets, we serve diverse markets: manufacturing, logistics, in-vehicle, automation and robotics, medical, you name it.

In terms of customers, we serve customers like Amazon, General Dynamics, GE, Hitachi, UPS, and so forth, and I was told that this year so far we've served more than 7,000 unique customers. What OnLogic is all about is making these solutions possible for our customers, and increasingly that means providing not only hardware but also helping connect the dots for customers building industrial IoT, edge computing, and edge AI solutions.

Let's talk a little bit about how we go about that. It combines a whole bunch of elements where we team up with the right partners to make that happen for our customers. I'll show this here on the layer diagram, and we'll start at the bottom with the hardware.

On the hardware side, we work very closely with Intel, who has a great portfolio for the edge: specifically, a large variety of CPUs in different performance ranges that are also very capable in the environments these computers go into, with the embedded lifecycles that are exactly what the edge requires.

The other thing where we've seen really nice development over the last few years, and expect to continue into the future, is features: the ability to process AI workloads, like Alex was just talking about, and increasing capabilities on the GPU side, really enabling applications to run just on what's in the CPU package.

In some cases, that means not requiring separate accelerators or graphics cards. For example, we have one customer, a global technology provider, working on an innovative baggage-handling solution. With previous generations, the solution required a graphics card; with the latest Intel generation, that can be eliminated, which makes a lot of things easier and much more straightforward for the customer.

And for those customers whose applications require it, we also support graphics cards and accelerators, like the Intel graphics cards.

Moving on to the operating system level, we work closely with partners including Red Hat. Red Hat is a leader in open source technology and has optimized solutions for the edge, so that's why we partner with them.

Moving on to the edge orchestration and application software layer, we work with Intel OpenVINO, mentioned already, which is great for customers to benefit from the underlying Intel platform, as well as GSI AI. GSI AI is a leader on the software side in the edge AI space, with great skills in solving real-world problems across various industries.

And then AWS, a great provider on the cloud side: we also have some of our hardware certified for those services and have the process in place so that, whatever our customers need, we can provide those AWS certifications.

So now, let's look at an example of how we support our customers when it comes to edge AI solutions. First of all, the fear is often: it's gonna be very complex, it's gonna be a time sink, it's gonna need data scientists, it's gonna be a huge endeavor. The whole idea here is to make this much more straightforward, much simpler for the customer.

Together with Red Hat, Intel, and GSI AI, we at OnLogic have been working on a solution to make this easier for our customers. You can view it as an adaptable turnkey solution that customers can apply.

In this example, you see two computers in the center. They are connected to the machine on the left, which is a bottling machine. One of the computers is collecting data that relates to predictive maintenance. The other one, the upper one, is collecting data including camera data; you can't see the camera here, but it also performs quality assessment of the bottles coming down the line.

On these units, we run the inference, and they are connected to the edge server that you see on the right-hand side, which is also an OnLogic product. That can then connect up to the cloud, for all the benefits the cloud has to offer, as well as to a dashboard.

This dashboard can be located on the line or remotely, wherever the customer would like to have it. So this is a very flexible solution that we've been developing, and we're currently working with a number of customers on implementations of it.

And this is just an example; like the chart earlier, it's just a subset of the partners we work with. At OnLogic, we're committed to continuing on this path, and we are deepening our partnerships and our solutions offerings.

As part of that, Alex also mentioned the KVS demo at the Intel booth that features a Karbon 800 system, and there are a lot of other opportunities here at the event to see OnLogic hardware. So please keep your eyes open for the orange computers from OnLogic. With that, I would like to thank you for listening, and I'd also like to do a big shout-out to our partners and friends at Intel for giving us the opportunity to speak here. Thank you.

Alex: Brilliant. Thank you very much, Michael. It really is true that there's not a single edge device which fits every need. You need to be able to adapt to form factor and power requirements, and of course align to those connectivity needs as well. This is even more extreme when you look at moving AI compute and inference to the far edge: not just using centralized edge systems to do that compute, connecting the cameras and sensors into a single platform, but truly running the inference and handling that data locally on the sensor, the transmitter, the camera itself. With this, you have additional challenges with remote manageability. If you've got a far edge device or a smart camera set up high in a city, it's very difficult to physically interact with that device. So achieving a single platform which gives you remote manageability and local AI inference is truly important here as well.

And of course, the far edge scales, and there's opportunity to deploy at the telco edge for RAN hubs as well. But to look in a little more detail at what these devices running at the far edge actually are, and how they can directly integrate into industrial machinery, it's my pleasure to welcome Alan from Arduino to spend a little more time talking to you about far edge devices as well.

Welcome, Alan, thanks.

Alan: Hi, everyone. I'd like to introduce myself: my name is Alan Gagnon. I apologize for having no inside voice there. I work as a solutions architect at Arduino, and I'm thrilled to be here today to discuss Arduino's partnership with Intel and AWS. I want to personally thank Intel for allowing me to join this discussion. Arduino offers advanced edge hardware with integrated sensors, a developer IDE and libraries, cloud computing built on AWS using Intel processors to generate ML models, and the flexibility for any IIoT or R&D department to develop and deploy edge AI applications to the factory floor.

The benefits of AI/ML tools on the edge with Arduino include, but are not limited to:

  • Build and train predictive models with just a few lines of code
  • Accelerate and test model performance in the cloud before going to production
  • Securely scale developed models to any edge device
  • Leverage automatic labeling tools to simplify object detection and audio segmentation
  • Start using pre-made models out of the box and fully own your ML algorithms; all the tools are open source and royalty-free
  • Expand existing Arduino projects with machine learning capabilities
  • Run machine learning tools on any OS or cloud platform
  • Compatible with multiple Nano, Nicla and Portenta Arduino products

The Nicla family of boards has three variants, all of them with integrated sensors and capable of running ML models at the edge. Machine learning predictive maintenance is a common deployment we see across this product line: customers develop a model of normal operation and create cloud triggers to alert operations or the OEM when an anomaly is detected. The most common use is with the Nicla Sense ME collecting vibration data. Nicla Vision is used for solutions needing to detect anomalous motion, and Nicla Voice detects anomalous sounds or classifies the type of sound when integrated with the Neural Decision Processor by Syntiant.

Bottling, packaging, and logistics facilities deploy Nicla Vision on the edge to process vast amounts of image data through its dual-core STM32 processor. Image detection is able to detect and classify various scenarios. One example is whether a bottle cap is properly secured: an ML model running on the edge can detect if the bottle cap is missing, set incorrectly, or not tightened because it is sitting too high on the bottle. After detection, the errant bottle is autonomously removed from the conveyor at the quality inspection station integrated near the Nicla Vision sensor.

In packaging and logistics centers, it is crucial to monitor products as they move from station to station. The Arduino Portenta H7 is a powerful system-on-module which is expandable through a set of high-density connectors. In this use case, the Portenta H7 includes an Arduino Portenta Vision Shield, which features an integrated camera, stereo microphones, and an Ethernet port. This solution leverages OpenMV to program a custom barcode detection algorithm in Python that is integrated with the Arduino Cloud or directly with AWS infrastructure.

Nicla Voice is an edge device co-developed with Syntiant that integrates their Neural Decision Processor, which enables customers to quickly and easily deploy deep learning models on power-constrained devices. Nicla Voice also includes BLE connectivity, battery management, a microphone, and even an IMU. A fault detection use case leverages all these components: the Syntiant chip can detect an emergency from a fall and further validate that anomaly by listening for the patient's cry for help. Connected to a phone app, an emergency call can be placed to 911 or to their emergency contacts.

And with that, I would like to hand it off to Mohan from Intel.

Mohan: Thank you, Alan. So we saw all these compelling use cases from customers and partners. Now let's look at the actual technology from Intel that enables this AI transformation. First, let's look at the latest Intel 4th Gen Xeon processor. There are multiple AWS instances available, as you can see here; these were launched earlier this year, and we work closely with the Amazon EC2 team to make sure that these instances perform to the best of their ability. To the left, you see multiple accelerators: this processor has acceleration for different areas of computing.

One of the things I want to call out here is AMX. As you know, AI involves a lot of matrix multiplication, and AMX is an accelerator custom-built to perform matrix multiplication in a very optimized manner. All the instances we show here have AMX; the other accelerators we have (DSA, IAA, and QAT) are available in the bare-metal instance.

The other thing I want to mention is that with these 7-series Intel EC2 instances, there's an option called Flex where customers can get a 19% lower price than the predecessor. And the other thing we've seen is that with AMX, customers can see up to a 300% improvement in performance for their common AI workloads.

Now, let's go through a small demo showcasing what AMX is about...

[AMX demo transcript]
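The demo itself isn't reproduced here, but a sketch of the kind of thing it shows follows: running inference in bfloat16 with Intel Extension for PyTorch, whose bf16 paths dispatch matrix multiplications to AMX tiles on 4th Gen Xeon. The toy model and shapes are assumptions, and the sketch presumes intel-extension-for-pytorch is installed.

```python
# Sketch of an AMX-style demo: bfloat16 inference via Intel Extension
# for PyTorch (IPEX). On a 4th Gen Xeon, bf16 matmuls can use AMX tiles.
import torch
import intel_extension_for_pytorch as ipex

model = torch.nn.Sequential(          # toy stand-in for a real model
    torch.nn.Linear(1024, 1024),
    torch.nn.ReLU(),
    torch.nn.Linear(1024, 1024),
).eval()

model = ipex.optimize(model, dtype=torch.bfloat16)  # one-line optimization

x = torch.randn(64, 1024)
with torch.no_grad(), torch.autocast("cpu", dtype=torch.bfloat16):
    out = model(x)
print(out.shape)
```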

So now that we've looked at the hardware capabilities we bring to the table, let's look at the actual AI pipeline. Traditionally, AI is associated with training, but AI is much more than training. There are three distinct phases in AI. First is data processing and cleansing, where you're ingesting the data and making it appropriate for use in the modeling that happens in the next phase. In the modeling phase, also called training, we take the data and create our model. And last but not least is deployment, where we take the trained model and put it into production; that's also called inference.

Intel works with the AI ecosystem: we work with open source partners, and we show a small subset of them here. We work to optimize these frameworks, libraries, and environments so that they operate optimally with the hardware features that we shared earlier.

Modin, for example, is a commonly used data processing platform, and we work with Modin and AWS to make sure that Modin runs in an optimized manner on our instances. Similarly, you've probably heard of TensorFlow and PyTorch; these are the most commonly used frameworks, and we work upstream with the open source providers to make sure that these work well with our hardware capabilities.

OpenVINO, which you've heard a lot about in this session, is also optimized to run on Intel on AWS. One important thing that we do: it's not just about having hardware and software, it's also about making the software easy to consume. We provide developers with very simple one-line code changes that make a lot of these platforms perform well.

We show you examples of Modin here, which accelerates pandas: it can accelerate pandas by up to 90x with just a one-line code change, and we show you the code change. Similarly, scikit-learn, a traditional machine learning platform, we accelerated by 38x with just another one-line code change.

TensorFlow, as you've all heard of: since we work with the open source ecosystem, TensorFlow can be sped up by around 3x with just a one-line code change.
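For reference, here is roughly what those three one-line changes look like in practice (actual speedups depend heavily on the workload and hardware):

```python
# 1. Modin: swap the pandas import; the rest of the API stays the same.
import modin.pandas as pd

# 2. Intel Extension for Scikit-learn: patch before importing sklearn
#    estimators so they dispatch to the optimized implementations.
from sklearnex import patch_sklearn
patch_sklearn()

# 3. TensorFlow: oneDNN optimizations are toggled with an environment
#    variable (enabled by default in recent releases):
#    export TF_ENABLE_ONEDNN_OPTS=1
```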

Now, how many of you have heard of generative AI? That's probably something we've heard everywhere, in every session at this conference. The thing we're all familiar with is that these are large language models and they solve general problems. But when we talk to our customers, even though the general models answer a lot of the common questions, specific customer use cases require specific domain knowledge.

So there's a need for a smaller model that is dedicated to a particular customer's use case. What we see is that even though large language models are pretty popular, for a customer to leverage one in an accurate and more efficient manner, they will have to take a specific instance of it and adapt it.

There are a lot of common language models available on platforms such as Hugging Face and others that customers can use as a starting point. With Intel Xeon, what you can do is take these publicly available models, bring your data in, and train, or what we call tune, these models. So you can tune these models with your specific data and make them work better with your application.

And as you know, at the edge we are always resource-constrained: there are memory limitations and CPU limitations, so you want to make sure that you have a small model that fits into the infrastructure you have at the edge. That's what this tuning and customization of these models does for you, and that's where Intel Xeon actually shines, because you're able to take the customer data and tune the model for their application.
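As a hedged sketch of that pattern, here is what fine-tuning a small public Hugging Face model on CPU might look like; the model name, dataset, and argument names assume a recent transformers release, and a real Xeon run would typically add bf16/IPEX to engage AMX.

```python
# Sketch: start from a public model and tune it on your own data on CPU.
# Model, dataset, and hyperparameters are illustrative placeholders.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

name = "distilbert-base-uncased"                      # public starting point
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name, num_labels=2)

data = load_dataset("imdb", split="train[:1%]")       # stand-in for domain data
data = data.map(lambda ex: tokenizer(ex["text"], truncation=True), batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=1, use_cpu=True),
    train_dataset=data,
    tokenizer=tokenizer,   # enables padding via the default collator
)
trainer.train()            # the tuned, smaller model better fits edge constraints
```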

So to summarize, I recommend you use Intel solutions to run AI efficiently. Intel is ubiquitous, as you know, from the edge to the cloud, so it's a good platform, as you've seen from all the examples we've shown you, to run efficiently and at scale. We also encourage you to use Intel's latest 4th Gen processors, as they have a lot of advanced capabilities for AI and will make sure that your AI models run efficiently.

We also showed that we do a lot of software work to optimize these AI capabilities, so with a one-line change, you can see a lot of improvement in your performance.

And last but not least, we have a couple of hours left on the show floor. Please visit our booth, and here are some of our partner booths as well. I also want to point you to some useful links and blogs we've written on these topics. I know it's Thursday, the final big day of AWS re:Invent, and I'm grateful to you all for attending. Thank you.

余额充值