Data drives transformation: Data foundations with AWS analytics

Please welcome Vice President of Analytics, AWS, G2 Krishnamoorthy.

Hello and welcome. Thank you for joining us today. I know it is a late Thursday event; I hope you're having a fabulous re:Invent so far. And I know you are here because you see the opportunity in putting data to work: to delight your customers, to optimize your business, and to help everyone in your company do better in their jobs with data.

Now, if you are not stretched thin and have unlimited budget, you may not get as much from this talk. But if you are among the many who are running hard just to stay in place and are stretched to take advantage of the opportunity to innovate, like generative AI, then this session is for you.

I am G2 Krishnamoorthy from AWS, and I'll be joined by Paul from adidas, Sunil from GlobalFoundries, and Tom from UC Irvine. We have a great agenda for you today. We want to help you build a robust and nimble data foundation for all your needs. We want to help you simplify your data landscape, and we want to help you empower your users by putting the power of data in their hands. We will cover some exciting launches, show some cool demos, share some inspiring stories, and we will spend some time talking about one of my favorite topics: food. I think data and food have a lot in common.

Let me explain. In this digital age, data fuels innovation like food fuels us. For innovators, being data driven is a lot like eating healthy: you know, fresh ingredients, balanced nutrition, no added sugar. Everyone knows that they should, but it's hard for most. And we know the payoff is big. Eating right means happy and healthy lives for years. And studies show that businesses that are data driven have happier customers and make a lot more money, maybe 20% more money. But only one out of four organizations feel they are turning data into a strategic asset. Why is that?

Maybe they are not working backwards. At Amazon, we are known for our working backwards approach. How would you apply that to your data strategy? Would you start with a beautiful solution architecture? No. Would you start by engaging some customers or consultants and spend a couple of years developing a comprehensive data strategy? Wrong. Working backwards means starting with a funded business initiative. Think about the value you want to create for your customers or your business, like building a generative AI chat system that makes your customer support better. As you are building this project, think about the capabilities that you need and expand your data foundation for it. This flywheel of building expertise in existing and emerging technologies, and the culture and the processes to succeed with it, builds your data foundation incrementally for your needs of today and the future. Data is your transformation differentiator, and if your data is not ready, even the most capable large language model will not deliver on the business value.

So what do you need in your data foundation? You must have access to a comprehensive set of tools for all your needs, from the tried and true data warehouses to the new and shiny large language models. They must be built to work together to make your jobs easy. It must have governance built in end to end, and it must tap into the power of machine learning and generative AI to make things easier, faster, cheaper.

So today, we will talk about four areas: how to build a robust and nimble data foundation; how to simplify your data landscape; how to empower your users by giving them easy access to data and intuitive experiences to innovate and collaborate; and how to enrich every application with analytics and machine learning.

Now, I'm going to break this down with some food analogies. Show of hands, any foodies here? As for me, I love desserts, and my favorite dessert is mangoes, the fruit. It's really sweet, it is healthy because it's natural, and it's something I grew up with. Think about the strategy to eat healthy. Not every meal is the same. You have sit-down dinners with your family, you have bento box lunches that you take to work, and you have that on-the-go snack at your kid's soccer practice, just like your data users and use cases. Your strategy needs to help you eat right in each case, with fresh ingredients and balanced nutrition, and it must fit with your lifestyle. Like for me, I'm a vegetarian, so getting good protein takes some work. And you need good guardrails to keep that junk food and added sugar in check.

Similarly, for your data strategy, you need a comprehensive set of tools and built-in governance, and it must fit your unique needs on compliance, on scale, on performance, or on cost. Our lives are crazy busy, and we need things to be simple, like a local farm-to-table delivery service that makes getting fresh ingredients easy and sustainable. I'll talk later about how zero-ETL will help you break down your data silos and keep your data fresh without breaking a sweat.

And you need to unleash a culture of experimentation and creativity among your users, like fusion cuisine. I'm amazed at what creative chefs are doing with jackfruit as a vegetable. You know, growing up, I saw it as a sticky, smelly, but very tasty fruit. You want to empower your users to be as creative with data, to find tasty innovations for your business using the best ingredients in your data pantry.

Finally, you need to infuse great experiences powered by analytics and machine learning in every application, just like the convenience of grab-and-go food. It's fast, and it's right where and when you need it.

Now, your business environment is complex and constantly changing. You have new use cases like generative AI coming online, your existing workloads are scaling, and you may need to integrate an acquisition or comply with a new regulation. Our vision is working backwards from your needs, and your success is our funded business initiative.

Let me summarize what I'm hearing from you as what is most important for your business. First, you need friction-free, near real time access to all your data no matter where it lives: databases, log streams, apps, marketplaces, warehouses, lakes, in AWS, on premises, or in other clouds. And you need hassle-free management of the data, getting control of the copies of data and of ETL. You want to make all of this data available proactively, to be discovered, understood, and used to innovate by everyone in your business, but with trusted guardrails to curate and protect this data to meet your compliance obligations. And you need access to the broadest set of tools to innovate and act on this data, from SQL to Spark to Python to search to dashboards and generative AI. And you need simpler, more intuitive experiences to write queries, to code in your notebook, to build dashboards or applications, so that you spend most of your time on your business initiative. And generative AI is a game changer here.

Finally, you need to stretch your dollar and have the peace of mind of relying on secure and reliable operations. So let's see how we achieve this with the AWS data, analytics, and machine learning services.

You know, at first glance, this looks like a jumble of icons. Hang on. Each one of these services was built working backwards from customers like you, and let me show you how they come together to realize the vision that we just talked about.

AWS offers the most comprehensive set of services to manage your data, from databases to streams and logs to data warehouses, lakes, applications, and data marketplaces. For an in-depth discussion on databases, please look at Rahul's innovation talk; they went into a lot of detail there. For analytics, we are breaking down silos between data warehouses and lakes, and with the zero-ETL integrations, we are reducing the hassle of ETL and the need to copy data around. All this data is governed through a cohesive set of governance tools that enable you to understand, curate, protect, and collaborate. AWS IAM Identity Center enables you to govern with your organizational identities from providers like Okta or Entra. Glue Data Catalog and Lake Formation manage your lakes and warehouses alike. Glue Data Quality makes your data trustworthy, and Clean Rooms enables you to collaborate across organizations. And DataZone brings it all together, unlocking enterprise-wide discovery and governed sharing to put all your data to work.

And we offer the broadest set of tools to innovate and act on all your data, giving you easy access to innovations in open source and our own, all with the best price performance by taking full advantage of Amazon's deep infrastructure innovations like Graviton, which is up to 30% more efficient for analytical workloads. And we give you flexible deployment options to best suit your needs: as managed services, managed clusters, or serverless.

And we hear you loud and clear on the need for simplified experiences for builders and users. We are investing deeply in query editors, in notebooks, in QuickSight, and in applications to accelerate time to value for everyone, and I'll share more about how we are putting generative AI to work on this as we go. As we talked about, your business environment is complex and constantly changing, and you need a comprehensive suite of data, analytics, and ML services that are built to work together, under unified governance, and with intuitive experiences. And we will continue to simplify and invent, working backwards from your needs.

So I would like to start with an inspiring grab-and-go business initiative from one of our customers, adidas. Please join me in welcoming Paul Vasu, Vice President of Platform Engineering at adidas.

For healthy living, we need great foods like mangoes, but we also need to do some sports. And today I'll bring you a sports story. Leading platform engineering at adidas, I can speak at length about the many meaningful ways we use the AWS cloud, whether it's for product creation and design, for our future ERP solution, the data and analytics stack, or our massively scaling ecommerce platform. Today, however, is about our purpose. Our purpose at adidas is: through sports, we have the power to change lives. And the lives I'm going to talk about are those of young, enthusiastic soccer players, or how we call them anywhere else in the world, footballers. They too deserve affordable technology that will enable them to measure, to collect data, learn from it, and improve their performance. An innovation team at adidas has set off to serve just that, and I brought a video to explain how adidas Team FX puts five metrics in focus: top ball speed, distance traveled, top running speed, kick count, and explosiveness.

Adidas redefines performance tracking in a highly competitive sport. They can compare almost everything: how many passes I did, how many sprints I did. And it's all about the data. I need to improve; I need to have the data. I need to know my own data to compare to others. You get details from every single day, so you can do better the next day. Just to recap, the solution is made up of an ML-powered sensor that goes into a sole that goes into the shoe. After each training, after each game, the players get access to the data and the metrics that enable them to learn. And we heard fantastic feedback on the reliability and accuracy of the models and the solution as a whole. Yet having launched through COVID, it didn't quite yield the expected results.

"But we heard back from all over Europe, from a group of people that asked us the same thing and that was to evolve from a player centric fun experience to a professional team solution. And they were coaches, coaches of those teams, uh and they saw the potential and uh reached out, we listened to them and together with an AWS digital innovation team, we have set off to use the mechanism G two already mentioned working backwards.

We went to understand deeply: what do they need? What are the needs of the players and the coaches at the same time? And how do we build a solution that intertwines with their day-to-day lives and helps them improve, players, coaches, and teams alike?

Now, having that in hand, the PRFAQ and the storyboards, next up was the design board. Good news there: the back end was already running on AWS, mostly powered by serverless technologies, and they were scalable and cost efficient, but they couldn't quite cater to all the use cases that we had in front of us for the professional solution.

Now, next to the open source services, we needed to bring in something else, something new. And this is what the team has done, together with an AWS prototyping team in Europe: bring in two new fit-for-purpose data stores, with Amazon Timestream for metrics and Amazon MemoryDB for Redis for fast access to leaderboard data, very much needed in a team context.

Now, also on the data streaming side, we revamped the stack and enabled real time data flows with Kinesis Data Streams and Kinesis Data Analytics. And the performance has been astonishing. We've run tests with over 1 million requests per second, and it took under 70 milliseconds for data to propagate and show up in the app.
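To make the streaming piece concrete, here is a minimal sketch of what a producer writing one player-metric event into Kinesis Data Streams could look like. The stream name, region, and event fields are illustrative assumptions, not the actual adidas schema.

```python
import json
import boto3

# Hypothetical producer: pushes one player-metric event into a Kinesis
# data stream (stream name, region, and fields are assumptions).
kinesis = boto3.client("kinesis", region_name="eu-central-1")

event = {
    "player_id": "player-42",
    "session_id": "training-2023-11-30",
    "metric": "top_ball_speed_kmh",
    "value": 87.4,
    "ts": "2023-11-30T17:05:12Z",
}

# Partitioning by player keeps one player's events ordered within a shard.
kinesis.put_record(
    StreamName="teamfx-metrics",
    Data=json.dumps(event).encode("utf-8"),
    PartitionKey=event["player_id"],
)
```

A downstream analytics application (Kinesis Data Analytics in the talk) would then aggregate such events into the leaderboard and session metrics.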

Now, the pace has also been fantastic in implementing this: three months to be MVP-ready. And at the end, we had a demo on our pitch in Germany, and after a training session and a game, everyone saw compelling results and a fantastic experience, the players, the coaches, and our teams as well. They were convinced, such that this solution is now launching across Europe to teams and clubs. And speaking of experience, let's have a look at the app. The teams get a team space; the coaches can define events, look at the timeline, look at the player statistics, and make sure that they set up the next trainings in the best way possible to improve everyone's performance and make sure that on game day, they have the best team on the pitch.

There's much more to come, of course; things like social integrations and social features and game summaries will soon be available. What I'm most excited about, and available today in fact, is intensity feedback, because this proves to be a great unlock for predicting and preventing injuries.

So, isn't it remarkable that data and analytics technology helps us build and improve our enterprises, but also achieve our purposes, in our case, changing people's lives through sports? I'm very, very excited about all the rest of the updates and announcements that G2 has for us. Thank you, and back to him.

Wasn't that an amazing, inspiring example of a data-driven experience that enables everyday athletes to train with a world-class coach, improve their personal records, and avoid injuries? Now, let's take a quick poll. Where does your time go? Does it go into innovation or into keeping the lights on? Let's take a minute to get your input. Please use the link, and folks, if you are watching live, you can vote too.

Now, while you're voting, I would like to share a quick story. I'm an amateur runner, and one of my bucket list items is to run the original marathon, you know, from Marathon to Athens, Greece. In 2010, I was one of the lucky 12,000 who signed up to run the 2,500th anniversary of the marathon. Now, like many amateur runners, while training I got overzealous and overtrained, and I hurt my knee four weeks before the race and had to drop out.

Now, I was thinking, listening to Paul: if I had trained with an app like the one that he talked about, but for runners, maybe I would have checked off running the original marathon from my bucket list. Let's give a few more moments for folks to put their input into the poll.

OK, let's check the result. You know, I see that keeping the lights on is showing up as a large portion. I'm excited to see that there is a lot of innovation happening too. But the good news here is that you can start wherever you are and then build out your data foundation one funded business initiative at a time, incrementally.

All right. So we saw the grab-and-go snack from adidas, embedding analytics in the application to deliver engaging experiences. Now let's talk about the data strategy and the data foundation that will enable this meal and all your other meals.

First, we need to talk about governance. Governance is often viewed as a constraint on innovation, like many fad diets, you know, no carbs, eat only kale, all about control and impossible to sustain. You want an approach that is flexible and in balance with your lifestyle, like switching to salads for lunches or Greek yogurt for snacks, tracking your weight and what you're eating in an app, and checking in with a coach who's going to hold you accountable, right?

Governance frees your data for innovation with the right guardrails. And that's why we built Amazon DataZone: to make it easy for data owners to share their data proactively so that everyone can find the data they need easily and work with it from their favorite tools seamlessly. And we are adding a lot more governance features.

Often, access policies are tied to people's roles, like prescription drugs are handled only by the trained pharmacist at your grocery store. Say employee personal information is available only to HR, or the current quarter's sales are restricted to the finance team. Using such organizational information for access control today is tedious: you need to do complex mapping with tags and roles. Not anymore. Trusted identity propagation with AWS IAM Identity Center propagates the end user identity and all its attributes, like which department they belong to or which security groups they are part of, to AWS services to make access control decisions.

So you can create a policy in Lake Formation to restrict current quarter sales data to members of the finance group only. And that new finance hire can use this data from dashboards in QuickSight and query it from the Redshift query editor from day one, without having to file any tickets with the BI or data warehousing teams.

And there are other similar use cases, like ensuring a seller in an ecommerce marketplace can only see the product funnel data for the products they have listed. With the multi-dialect SQL views feature, you can build a secure SQL view that joins the product funnel data with the seller's product listing data to return only the right subset. And when the seller updates their product listings, their product funnel access automatically adapts. This pattern of fine-grained access control using SQL views is not new, but it used to work only within a single engine.

What is remarkable is that you can now do so with Lake Formation centrally, across many engines: Redshift, Trino, and Spark, across the AWS analytics services.
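As a rough sketch of the pattern, here is what such a row-filtering view could look like, submitted through the Redshift Data API. All table, view, and workgroup names are illustrative assumptions, and the exact DDL for defining a multi-dialect Data Catalog view is not shown here; this only illustrates the secure-view idea itself.

```python
import boto3

# Illustrative secure view: each seller sees only the funnel rows for
# their own listings. Names below are assumptions, not a real schema.
sql = """
CREATE VIEW seller_funnel AS
SELECT f.*
FROM   product_funnel f
JOIN   product_listings l
  ON   f.product_id = l.product_id
WHERE  l.seller_id = current_user;
"""

redshift_data = boto3.client("redshift-data")
redshift_data.execute_statement(
    WorkgroupName="analytics-wg",  # assumed Redshift Serverless workgroup
    Database="dev",
    Sql=sql,
)
```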

Let's talk data quality. In many cases, static rules that check for valid values are simply not enough. Take candy sales: they're highly seasonal and spike during Halloween or Valentine's Day. Your data quality rules should alert you when the candy sales this Halloween are not growing as expected compared to last Halloween, and they should not alert you when sales return to normal, as expected, come November. Expecting users to understand the dynamic nature of the thousands of data sets they work with and manually tweak the thresholds is simply not scalable.

That's why I'm really excited about the anomaly detection and dynamic rules in AWS Glue Data Quality. With this new feature, AWS Glue Data Quality will automatically gather statistics about the data and use machine learning to surface the seasonality. And you can create new data quality rules with dynamic thresholds to monitor and alert on your candy sales.
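For a sense of what a dynamic threshold looks like, here is a minimal sketch of creating a DQDL ruleset with boto3, where the row count is judged against the trailing average of recent runs rather than a fixed number. Database and table names are placeholders, and the exact set of DQDL functions available should be checked against the Glue documentation.

```python
import boto3

glue = boto3.client("glue")

# Dynamic rule: today's row count must be at least 60% of the average of
# the last 10 runs. Names are placeholders for illustration.
ruleset = """
Rules = [
    RowCount > avg(last(10)) * 0.6,
    Completeness "sale_amount" > 0.95
]
"""

glue.create_data_quality_ruleset(
    Name="candy-sales-dq",
    Ruleset=ruleset,
    TargetTable={
        "DatabaseName": "sales_db",
        "TableName": "candy_sales",
    },
)
```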

Now, all with the serverless simplicity and cost efficiency of AWS Glue. Now, with some use cases, like the vaccine research during COVID-19, innovation needs data from partners across an industry for the right breakthrough. Think of it like a big potluck dinner that becomes richer because of the diversity of dishes.

Sometimes you only want to share the derived insight, not the underlying data, because that data may be highly sensitive or is your competitive differentiation. It would be like sharing the dish but not your grandmother's secret recipe.

AWS Clean Rooms makes collaborating across organizations on sensitive data easy and flexible. With just a few clicks, you can create a clean room by customizing controls on both queries and outputs, add your collaborators from other companies, and start innovating, no copying of data required. And with the new features adding support for differential privacy and training ML models on sensitive data, organizing a data potluck party with your partners in other organizations has never been easier.
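As a minimal sketch of the setup step, here is what creating a collaboration could look like with boto3. The account ID, names, and abilities are illustrative assumptions; a real clean room would also attach analysis rules constraining what queries members may run on each configured table.

```python
import boto3

cleanrooms = boto3.client("cleanrooms")

# Hypothetical two-party collaboration: OrgA can query and receive
# results, OrgB contributes data only. All values are placeholders.
cleanrooms.create_collaboration(
    name="research-potluck",
    description="Share derived insights across partners without copying data",
    creatorDisplayName="OrgA",
    creatorMemberAbilities=["CAN_QUERY", "CAN_RECEIVE_RESULTS"],
    members=[
        {
            "accountId": "111122223333",  # partner AWS account (placeholder)
            "displayName": "OrgB",
            "memberAbilities": [],
        }
    ],
    queryLogStatus="ENABLED",
)
```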

Data and projects are growing much faster than most budgets, so our customers are looking for ways to do more with the same or less. That's why AWS is continuously focused on innovation that reduces cost without sacrificing performance.

Take the example of the new OR1 instance family for Amazon OpenSearch. OR1 reimagines managed clusters in a cloud-native architecture: data is stored synchronously in S3, enabling unparalleled durability and zero data loss, and recovery from node failures is now automatic and takes just a few minutes.

OR1 is up to 80% better in indexing throughput and overall 30% more efficient compared to the current managed instances. Now OpenSearch customers can scale their deployments reliably and economically without compromising performance.

And we are making Redshift better too. Our customers love Redshift data sharing to share live data for read access, often in a data mesh architecture. Now these data shares are writeable too. With this feature, customers can separate their ETL jobs from their interactive reporting and get better performance for their dashboards at lower overall cost, because the ETL can spin up and scale independently in its own compute, only when needed.
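Here is a rough sketch of the producer-side SQL for such a datashare, submitted through the Redshift Data API. The share, schema, workgroup, and namespace values are placeholders, and since write access to datashares is the newly announced part, the exact DDL should be verified against the Redshift documentation.

```python
import boto3

redshift_data = boto3.client("redshift-data")

# Producer side: expose the sales schema as a datashare and grant the
# consumer namespace access. All identifiers are placeholders.
statements = [
    "CREATE DATASHARE reporting_share",
    "ALTER DATASHARE reporting_share ADD SCHEMA sales",
    "ALTER DATASHARE reporting_share ADD ALL TABLES IN SCHEMA sales",
    "GRANT USAGE ON DATASHARE reporting_share TO NAMESPACE "
    "'<consumer-namespace-guid>'",
]

for stmt in statements:
    redshift_data.execute_statement(
        WorkgroupName="etl-wg",  # assumed producer workgroup
        Database="dev",
        Sql=stmt,
    )
```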

In fact, we have improved the price performance of engines across the board with the AI-driven optimization that Peter talked about on Monday. Redshift delivers up to six times better price performance compared to any other cloud data warehouse, as measured by the industry standard TPC-DS benchmark, and it is up to seven times better on real-world concurrent workloads. We made EMR Spark 70% more efficient over the past year, and EMR Spark is now over five times faster than the latest Apache Spark open source release on the TPC-DS benchmark.

And you will see big savings by adopting managed services for Apache Flink and Apache Kafka compared to the respective open source releases. And we have made it really easy for you to understand your AWS cost and usage, with your AWS billing data available right inside QuickSight. The prebuilt cost and usage dashboard makes it easy for you to get an overview of your spend and understand trends, and you have the full power of QuickSight to deep dive and drive action across your organization.

This is dedicated to my technical team. A well-organized kitchen ensures that every ingredient is readily available when needed. Similarly, a data lake has to ensure the relevant information is at your fingertips, ready to be analyzed and utilized to enhance the decision making process. This accessibility is the cornerstone of agility in responding to market trends.

This realization helped us architect our data lake for our future needs, enabling our 2030 vision. At this stage, we push data into the data lake from multiple sources, from on premises and various systems. That includes our manufacturing execution system, tool data, sensor data, and our engineering data, which are the measurements of our products to ensure the right quality is met. And that's augmented with our enterprise data, which is very crucial for supply chain planning.

All of this data we push into AWS through various technology stacks. Within the AWS zone, we use native AWS products as much as possible, and the data flows through a raw zone, then is curated and aggregated. In the aggregated zone, we use Redshift as our backbone; most of our BI tools and analytics solutions leverage the data from Redshift. And there are data types that don't belong in an OLAP environment like Redshift but are still needed for our use cases. We are able to let the data live where it belongs and seamlessly access it leveraging the Spectrum and Athena frameworks.

This enables the users to seamlessly access the data and not worry about where the data sits. And in order to make all of this happen, all the data needs to be properly discovered and governed; we leverage Glue for that and are in the process of extending it with DataZone.
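As a minimal sketch of the access pattern described here, the following shows how Redshift Spectrum can expose Glue Data Catalog tables so queries can join warehouse data with data left in the lake. The cluster, database, role, and schema names are assumptions for illustration, not GlobalFoundries' actual setup.

```python
import boto3

redshift_data = boto3.client("redshift-data")

# Map a Glue Data Catalog database into Redshift as an external schema,
# so lake tables (e.g., lake.sensor_data) can be joined with local tables.
redshift_data.execute_statement(
    ClusterIdentifier="gf-analytics",  # assumed provisioned cluster
    Database="dev",
    DbUser="analyst",
    Sql="""
        CREATE EXTERNAL SCHEMA IF NOT EXISTS lake
        FROM DATA CATALOG
        DATABASE 'curated_zone'
        IAM_ROLE 'arn:aws:iam::123456789012:role/SpectrumRole';
    """,
)
```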

Moreover, just as a gourmet dish evolves with every iteration, the semiconductor manufacturing process benefits from continuous improvement. The data lake empowers us to apply advanced analytics, machine learning, and AI to uncover patterns, optimize processes, and predict future trends.

Like the previous slide, which was dedicated to my technical team, this one is dedicated to my business users, who always ask: so what? You have a great architecture, you have built a great data lake; so what? What does it mean for the business? That's what this is all about.

This shows that earlier, we were not able to compare our tool data to identify which tool performs best in class across all the fab units; that was just not possible with our on-premises infrastructure due to the sheer scale of infrastructure that is needed. Now, with the data lake, we are able to compare tool performance across all the fabs against the best in class and identify what's causing the variance in the tools.

The semiconductor manufacturing process is a complex one, but for simplicity's sake, let's compare it to baking a cake. The difference here is that it takes three months to bake this cake. So every day, every hour, as it goes through multiple processes, the recipe needs to be followed exactly. And there are many factors that could affect the quality of those products. Earlier, we had to spend a semi-automated but largely manual effort to identify defect patterns and identify what's causing those defects, and engineers were spending more than 12 hours classifying those images, which are really hard to identify visually.

And now, with the data lake powered by machine learning, we are able to do image classification, and that same task we are able to accomplish in less than three minutes. Another use case is predictive maintenance. In semiconductors, heavy capex investment is involved in buying tools, so anytime a tool is taken down for maintenance, that's a significant drain on productivity. Now, with the data lake and machine learning, we are able to move in the direction of predictive maintenance.

Earlier, it was mostly consumption-based and time-based: whether you needed maintenance or not, at this time you had to do the maintenance. Now we are able to leverage predictive maintenance and extend the tool life; that means more products and better planning for the maintenance activity.

So what's next? We have built the architecture, and we are able to generate the business value. But what's next? The ML framework we have built over time, for more than a year now, we are able to extend for our gen AI use cases, building the right guardrails and the center of excellence around enabling gen AI use cases. We have the data, we have the machine learning structure in place, and now we are boldly getting into the gen AI world. We are very sure we can make ML use cases more ubiquitous across all of the business, not just in core manufacturing. And we are in the process of building our data products, data and analytics products, that not only make the lives of engineers in the company easier but also enable our external customers to leverage this value.

Just as a chef's creativity knows no bounds when armed with a well-stocked kitchen, the semiconductor industry can push the boundaries of innovation with the arsenal of insights unlocked by a data lake powered by the AWS cloud. Thank you all. Over to G2.

Thank you, Sunil, for that amazing value that your team has created using the OneGF data foundation built on AWS. Now, unlike the fried kind, your chips are healthy and keep our economy ticking. Now, let's switch from building your data strategy to simplifying your data landscape.

Farm to table is all about removing the middlemen and bringing things right to your fingertips. At the last re:Invent, we shared our vision for a zero-ETL future, to enable easy access to all your data from all your favorite tools without the burden of writing and managing ETL, like this little girl who has removed all the middlemen to get to that farm-fresh tomato. We are delivering integrations to make that possible for you.

Now, with a simple configuration, you can bring your data in Aurora MySQL, Aurora PostgreSQL, RDS for MySQL, or DynamoDB right into Redshift in near real time, for fast SQL analytics and interactive dashboards. And with the high performance integration between Spark and Redshift, all this data is ready for big data processing in EMR and machine learning in SageMaker. And that's not all: we are bringing zero-ETL access to your data in applications like Salesforce and Amazon Connect, and to your logs in CloudTrail.

With these integrations, your data in these applications will appear as tables in your AWS Glue Data Catalog with just a few clicks, no copying of data needed, and all your favorite tools, Redshift, EMR, Athena, SageMaker, just work.
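For a concrete flavor of the "simple configuration" step, here is a minimal sketch of creating an Aurora-to-Redshift zero-ETL integration with boto3. Both ARNs and the name are placeholders, and availability of specific sources should be checked against the documentation.

```python
import boto3

rds = boto3.client("rds")

# Hypothetical zero-ETL integration: replicate an Aurora cluster into a
# Redshift Serverless namespace in near real time. ARNs are placeholders.
rds.create_integration(
    IntegrationName="orders-to-redshift",
    SourceArn="arn:aws:rds:us-east-1:123456789012:cluster:orders-aurora",
    TargetArn=(
        "arn:aws:redshift-serverless:us-east-1:123456789012:"
        "namespace/11111111-2222-3333-4444-555555555555"
    ),
)
```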

Let's see this in action with a demo using Salesforce.

OK. So meet Bob, Kyle, and Maya. They work at an ecommerce company catering to outdoor enthusiasts, and they are tasked with coming up with a better marketing campaign for the holidays, powered by data and ML in Salesforce Data Cloud and AWS.

First, Bob is preparing the data that Kyle needs using Salesforce Data Cloud. With just a few clicks, he's bringing new data, like product catalog, purchase history, or web engagement, from Redshift into Data Cloud. Once he has this, he can enhance the Data Cloud data models using this additional data, like the purchase history, and share this graph of data models back with Kyle.

Now, Kyle spins up his SageMaker Data Wrangler, connects to this Data Cloud data, and prepares the data as he normally would, building a purchase propensity model, and he says: Bob, you are ready to go.

Bob can, with just a few clicks, bring that model into Data Cloud and activate it. And he can use this model to extend the Salesforce Data Cloud models, like adding a purchase propensity score in the Data Cloud. With this new purchase propensity score, he can then create a marketing campaign using Data Cloud marketing segmentation to identify the high-propensity customers.

And now he needs to get it to Maya. To do that, he creates a new share in Redshift, adds all the necessary data models, including the customer profiles, and then hits save. Now Maya can find these tables right in her query editor; she does the analysis and visualizes it in QuickSight. Done.

As you can see, with zero-ETL integration between AWS analytics and machine learning services and Salesforce Data Cloud, Bob, Kyle, and Maya were able to seamlessly collaborate and launch this new campaign.

So I want to go back to an important data type: logs. Your application logs, in CloudTrail or in your S3 data lake, are best analyzed using OpenSearch. You can now simply configure your data lake tables as a data source in OpenSearch and start querying, with no indexing needed to get started. Of course, you would enable indexing to make your dashboards fast and responsive. With all these zero-ETL integrations, you now have easy access to all your data from all your favorite tools, with unified governance.

Now, bento boxes are a marvel at managing your food. They are flexible, organized, and easy to use. It would be fantastic to have a bento box for your data lake. Transactional data lake formats like Apache Iceberg are really great for keeping data fresh using near real time ingestion. We support transactional data lakes broadly, including in Redshift, as we just announced. But often this generates too many small files, which makes your queries slow and expensive.

Now you can turn on storage optimization on your Apache Iceberg tables in the AWS Glue Data Catalog with just one click. Let the system monitor changes to the table and automatically trigger compaction, so that your queries are faster and you save money.
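For those who prefer the API to the one click, here is a minimal sketch of enabling compaction through the Glue table optimizer API. The account ID, names, and role are placeholders; the role needs read/write access to the table's S3 location.

```python
import boto3

glue = boto3.client("glue")

# Enable automatic compaction for an Iceberg table; Glue then monitors
# the table and rewrites small files in the background. Values are
# placeholders for illustration.
glue.create_table_optimizer(
    CatalogId="123456789012",
    DatabaseName="lake_db",
    TableName="events_iceberg",
    Type="compaction",
    TableOptimizerConfiguration={
        "roleArn": "arn:aws:iam::123456789012:role/GlueOptimizerRole",
        "enabled": True,
    },
)
```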

Anyone here use a meal kit service? I think they are a great simplifier: you get to make and enjoy a healthy and delicious home-cooked meal, but without the hassle of shopping and prepping. With serverless, you get to do the same. You get to focus on your application or your analysis and rely on us, AWS, for all of the heavy lifting of the infrastructure. That's why we offer a serverless option for all our analytics services, the only one in the industry to do so.

And I'm really excited about the addition of the vector engine for Amazon OpenSearch Serverless. The vector engine efficiently stores billions of vector embeddings generated from large language models, like the ones in Amazon Bedrock, and will respond to your vector search queries in milliseconds. And you can combine it with the powerful text search capability of OpenSearch and easily implement hybrid and RAG patterns to bring engaging generative AI experiences to your applications without any of the hassle of managing infrastructure.
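To make the vector pattern concrete, here is a minimal sketch of the index and query shapes OpenSearch uses for k-NN search, via the opensearch-py client. The endpoint, index name, and tiny 4-dimensional vectors are toy assumptions; real Bedrock embeddings have hundreds to thousands of dimensions, and a serverless vector collection additionally requires SigV4 authentication, which is omitted here.

```python
from opensearchpy import OpenSearch, RequestsHttpConnection

# Toy client; a real serverless collection needs SigV4 auth credentials.
client = OpenSearch(
    hosts=[{"host": "my-collection.us-east-1.aoss.amazonaws.com", "port": 443}],
    use_ssl=True,
    connection_class=RequestsHttpConnection,
)

# Index with a k-NN vector field alongside the raw text.
client.indices.create(
    index="docs",
    body={
        "settings": {"index": {"knn": True}},
        "mappings": {
            "properties": {
                "embedding": {"type": "knn_vector", "dimension": 4},
                "text": {"type": "text"},
            }
        },
    },
)

# Retrieve the 5 nearest neighbors to a query embedding (RAG retrieval step).
client.search(
    index="docs",
    body={
        "query": {
            "knn": {"embedding": {"vector": [0.1, 0.2, 0.3, 0.4], "k": 5}}
        }
    },
)
```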

Now, I would like to welcome Tom Andriola, Vice Chancellor of UC Irvine.

When I first came to the university, I had spent a career beforehand in consulting, as a chief information officer, and as an enterprise software and data solutions provider, and I did that living and working on four different continents. So when I came to the university, I asked a question that I typically would ask when I dropped into a new environment: what's the core business that we're in? And I heard many different perspectives, but one stuck with me, and that was one professor who said: Tom, we're in the business of creating the future.

You see, the purpose of a research university is to push the boundaries of discovery and create new knowledge. And then, through teaching, we give that knowledge to our students, who run into the world, work for established companies, create new companies, become policymakers, and shape tomorrow. So you see, we're in the business of creating the future.

This stuck with me as I tried to define what my role would be, but there was one missing piece, and that was: we're also an organization, a business, with over 50,000 constituents in the form of students, patients, faculty members, doctors, and staff who are making decisions every day. And in a data-driven organization, we have to make better decisions today than we did yesterday, consistently make them better, make them faster, and increasingly at scale.

And so I believe every organization needs to reinvent itself as a data company, just as our university is now a data company. Let me give you a very pragmatic example, especially if you're a parent: student success. We believe that we're creating a new definition of what we mean by student success to meet society's expectations of us. And this is the third generation. The first generation was actually pretty straightforward: college was very exclusive, a very small number of people got in, you put the bar very high, and you could almost guarantee success.

But then, as society realized that everyone should have access to universities, it became harder. We let in an increasingly diverse set of individuals. At UCI, 87% of our students identify as nonwhite, for example. They come from a variety of backgrounds and circumstances, and to support them and generate a level of student success,

"We built a lot of scaffolding around their university experience in the form of people. Now, we all know that that's a good model but not a very scalable model and some of the challenges we have in higher education. And I'm a parent of three and concerned about the cost is we need a different model.

And so Student Success 3.0 for us is really rethinking the model of how we drive student success in an increasingly digital world. Student Success 3.0 means it's going to be powered by digital: more and more technology interactions that leave data signals, aggregating that data, and using analytics on both structured and unstructured data to understand our students, thinking longitudinally, meaning from the first time a student expresses an interest in coming to our university, through the first time they land a job after university, and connecting with them as lifelong learners.

What are all the digital touch points that we have, and how do we think about that? Not just tracking the data about what happens in their classes, but what student groups are they in? What internships and practicums have they gotten involved with as they developed their core competencies for career readiness? These are all holistic ideas that we borrow from other industries but that have not yet come into higher education.

And then also, how do we stop thinking about the class of 2024 and start thinking about each and every student as an individual on their own personal journey toward the goals that they define? And we're empowering them through data, in the same way that the data from this Oura ring that I wear empowers me to make better decisions about my health. This is what we believe Student Success 3.0 is all about.

Now, I was taught that any technology presentation must have a slide with boxes and arrows pointing at the boxes. So the bingo card has been punched. And of course, this is our architecture, but I'm going to talk about three mindset shifts that have come with this architecture as we work with AWS.

Mindset shift number one, you'll recognize this one: it's from data warehouse to data lake. But we're going beyond, in talking about what I think of as the data resort, because having the data lake doesn't make it an interesting place to go and stay, right? You have to have services, things that draw people there and keep people there, that add value to their visit. G2 is not going to come to my data resort unless I'm serving mango on the menu, right?

The second shift for us is one from data hoarding to data sharing to thinking in data ecosystems. You see, universities were designed to be open and collaborative environments, and we are that way around our research. But I found our data assets, our data mentality, were not that way. And so much of the services available to us in the AWS platform and the ecosystem, whether it's Redshift, or zero-ETL, or end-to-end governance, give us the opportunity to really think about how we share data across our enterprise.

And when we talk about ecosystems, what we mean is: how do we connect with other organizations? Whether that's another organization like ours that we want to aggregate our data with to study complex problems, or the continuation of education from the community college system to the four-year institution, how do we put those ecosystems together?

And then the third shift is one from thinking of AWS as a vendor to a strategic thought partner. You see, we didn't know, and we still don't completely know, what Student Success 3.0 should look like. But working backwards was a great framework for us to define a future state and then work backwards to how we needed to enable it, what competencies we needed in the organization, and what data we needed to bring together.

And so I put this slide up here to show you where the world is going. This is a picture from a class that is a collaboration between my office and the dean of the business school. The class is called Into the Metaverse, where 30 students are learning about immersive experiences as a technology stack, as a business growth concept, and as a place for them to learn and interact with each other. They'll be doing their final projects in an immersive experience, with a headset.

Here's the thing: when we talk about this from the standpoint of Student Success 3.0 and data lakes, this is going to kick off new data that we've never seen before. I doubt it's going to be structured data. We're going to get data on things like interactions, portal jumps, eye movements. We needed an architecture that could be futuristic in its mindset and agile enough to accept new forms of data, understanding how you handle structured and unstructured data and the tools that sit on top of it.

You see, I believe in the world we know today and the world that's coming at us, an increasingly digital world, the certainty I know is that there will be more data. The uncertainty is where it will take us. But I believe the collision of those two worlds is a world where mango is infinitely available, and it will be our future. Thank you.

I think mango and data are going to trend together now.

So I'm missing a couple of slides. All right. I think that Tom's example of reimagining student success powered by data really resonates with me. I just sent my daughter to college, and as a parent, I have a lot of apprehension, and this is actually amazing and inspiring.

So we built a robust and nimble data strategy, and we simplified the landscape, all in support of a funded business initiative. Now it's time to get the users cooking. So what do they want? They want easy access to all the right data, they want intuitive experiences to work with it, and they need to be able to collaborate.

Now, one of the hardest parts of working with data is finding it and understanding it. Imagine trying to eat healthy without all the nutritional facts or expiry dates or allergy information. But we do this with our data all the time. It would be wonderful if all the data in DataZone came with great documentation about what's in it and how to use it. But documentation can be tedious. Maybe not, with the great new feature of AI recommendations for descriptions in DataZone.

Data owners can simply tap into generative AI to describe the data in business terms, describe the appropriate usage, and more. Just click the button, review the recommendations, refine, and hit publish.

Now, having Alexa in the kitchen, helping you through the right recipes and the steps one by one, is very helpful. Wouldn't it be wonderful if we had an assistant just like that for our data tasks? We heard you, and I'm really excited that Amazon Q is coming to your favorite data tools.

Now you can specify your data integration task in natural language and let Amazon Q build you a ready-to-deploy job, in code or as boxes and arrows; just test, refine, and deploy to production. It can help you with debugging your failures and how to fix them. It's like having an expert in Spark, Python, and AWS Glue working with you on all your tasks.

Amazon Q can help you with SQL tasks too. Just specify the query you want in natural language, and Q will generate the right SQL matching the schema of your warehouse or lake. And generative AI is coming to OpenSearch as well. You can dive into your logs, get automated summaries, and more using the generative AI toolkit. You can take it for a spin in the playground today.

So we prepared the data with Glue, analyzed it with Redshift, and shared it with great documentation in DataZone. And we did it all much faster with the power of Amazon Q and generative AI.

Now we need to put it in the hands of every user to enable timely, data-driven decisions, and the best way to do that is with QuickSight. You know, they say that we eat with our eyes first. A beautifully presented dish, or a data story, makes it easy for you to communicate your insights to your team and engage them to take the right action. And Amazon Q can help you with that too.

With Amazon Q in QuickSight, users can translate their analysis into a compelling data story, summarizing the analysis, sharing the insights, and describing and tracking the actions that the organization needs to take. And Q will help you with other BI tasks as well. An author can simply specify what they want to see, and Q will generate the dashboard. Leaders can better understand using executive summaries and explore the data with natural language.

Let's see a demo of how this works. We meet Deepa, Kevin, and Tracy. They work at an auto insurance company, and they are working together to reduce fraudulent claims.

First, we meet Deepa. She's a data engineer, and she is using the generative AI powered descriptions to make the claims data easily discoverable by her teammates. She reviews the generated documentation, validates the suggested usage and column descriptions, and then hits publish.

Now, Kevin can come in and find the data using his own words, you know, using "accident data" instead of "claims", and still find the data. With one click, he is ready to work with it. Let's continue the journey with Kevin.

Kevin can see the data in his query editor. Now, this is a new table for Kevin, but he now has Amazon Q. He can simply specify, using natural language, what he wants to understand, and Amazon Q will generate the SQL for him. And he's ready to update the claims analysis dashboard with this new data.

So first, Kevin wants to create a new calculated field to bucketize the deductible values as low, medium, or high. He can use Q for that. Then he wants to add a new chart that correlates the size of the claims with the size of the deductible. That looks good; let's add it to the dashboard. And then he enables the generative BI features, summaries and stories, for this dashboard.

Let's finish up with Tracy, our business user. Tracy starts the analysis by reviewing the executive summary. It shows that Saturdays and New Hampshire are outliers. She gets curious and starts digging in. Now, what could be going on in New Hampshire on Saturdays? You can explore the data using natural language and follow where the data takes you.

Once she is done, she can create a data story summarizing the findings and recommended actions for her team: just add the key visuals saved along the way in the analysis and put Amazon Q to work.

Now, this story shines a light on the rise in claims without a police report. There is a high likelihood of these being staged accidents, so Tracy recommends building a partnership between the insurance investigators and the local law enforcement officials, and builds the right visual to track progress over time. Easy as pie.

So let's recap. Data is your transformation differentiator, and to turn data into a strategic asset, you need a robust and nimble data foundation. You get there by building a modern data strategy: simplify your landscape, empower your users, and embed data-driven experiences in every application. And you can rely on us, AWS, as your trusted partner to build with. To get your data flywheel going, get started with building your data strategy using the Data-Driven Everything program.

Talk to the AWS Generative AI Innovation Center to prototype your big generative AI ideas, and choose from over 150 training courses to build your team's expertise in data, analytics, and machine learning.

Thank you. Please take a few moments to give us feedback, and have a wonderful rest of re:Invent.
