Building interoperability and data collaboration workloads with AWS

李白的朋友王维

Article tags: aws, Amazon Web Services, technology, artificial intelligence, re:Invent 2023, generative AI, cloud services

Article link: https://blog.csdn.net/just2gooo/article/details/134813441

Hello, everyone and welcome to our breakout session on building interoperability and data collaboration workloads on AWS, focused on advertising, marketing and technology customers.

I'm Shaila Mathias, Business Development for AWS Clean Rooms. I'm joined by Adam Solomon, Head of Business Development for Data Collaboration Apps at AWS, Dvor Golo, General Manager at AWS, and Justin de Brabant, Chief Product Officer at ActionIQ.

Today, Adam and I will kick off the session by sharing how data interoperability and collaboration can drive business value for advertising, marketing and technology customers. We will share the use cases that the services AWS has developed support, as well as the solution design related to these strategic business areas.

Next, we'll invite Justin up on stage to share more about his company, ActionIQ, and how they're uniquely positioned in the customer data platform space by using AWS services that reduce friction and increase privacy by enabling data interoperability and collaboration directly from the AWS cloud.

Last of all, Dvor is gonna wrap up and share how you can learn more about the services we talk about today as you prepare to evaluate and test them within your own enterprise.

So let's begin by defining what it is I mean when I say data interoperability and collaboration. Interoperability is the ability of computer systems or software to exchange and make use of information. Data collaboration is the action of working with another entity to produce something new or better.

Companies across advertising and marketing, whether you are an enterprise brand, a measurement company, a customer data platform or a media owner, thrive and grow by knowing who their customers are, unifying the data they have on their customers and safely using that data with their partners for advertising and marketing to drive business outcomes.

We know from speaking to our customers that having deduplicated, consolidated data is of tremendous value. We heard from a retail customer of AWS that having this consolidated data allows them to personalize every experience they have with their customers, whether that's in store, online or via mobile.

A CDP customer on AWS told us that their customers, like that retailer, are hampered by a limited understanding of who their customers are. And not only do they want to unify the data for every consumer, but they want to enrich it with additional attributes such as what those consumers do outside of the retailer's physical or virtual walls.

And finally, a measurement company on AWS told us that advertisers like that retailer want to use that deduplicated, consolidated data with their media partners so they can better understand the effectiveness of their advertising. For example, for a retailer, how many of their buyers signed up for a loyalty program as a result of an ad campaign?

But the reality is companies like that retailer are faced with challenges before they can even think about collaborating with partners. Data is incomplete and fragmented. Most companies like that retailer have multiple applications and systems that manage the consumer experience. A single customer of that retailer can have multiple records and IDs stored across those systems, applications and data stores. So connecting this data into a unified view is challenging and complex.

Companies also want to unify and connect their data with greater governance to protect consumer privacy. And more often than not, they want this to take place without needing to copy or maintain a copy of their data outside of their AWS cloud environment.

Having unified customer views provides the foundation for any company to understand who their customers are so you can personalize every interaction you have with them.

Companies like that retailer want to build on that unified view to develop that all-important customer 360 view. For example, allowing that retailer to know who Shaila Mathias is demographically, what life stage I fit into, who's in my household, what my buying preferences are.

But many times the data that retailer has on me isn't enough to assemble that complete view. This is where the retailer needs a little bit of help from their partners (customer data platforms, data providers, media owners) to get that 360 view of their customer in a privacy-safe way, especially for advertising and marketing use cases.

With the changing digital landscape, experiencing signal loss and increased consumer privacy regulation, your partners come in different flavors depending upon the use cases you're looking to solve for:

Audience enrichment - allowing that retailer to know that Shaila Mathias just added a pet cat to her household and that I'm in market for a new house, which my partner would find very interesting because he doesn't think we need a new house.
Media planning - allowing that retailer to understand how many people like me, other in-market home buyers, they can reach who are active on a specific media channel.
Audience activation - allowing that retailer to safely use the identity data they have on me to reach me and my other in-market home buyer cohort on a specific media channel.
And finally, optimization and measurement - allowing that retailer to understand the effectiveness of their advertising to people like me so that they can better optimize their tactics moving forward and better personalize their messaging to their own customer.

So like we've discussed, customers like that retailer want to unify and manage their data, they want to collaborate with many partners for lots of different use cases, but not at the cost of consumer privacy or revealing their intellectual property to their partner. This is the problem statement that we've spent a significant amount of time trying to understand so that we can create AWS services that meet our customers' needs and solve these pain points.

I'm gonna turn it over to Adam to introduce you to the services we built for advertising and marketing customers that do just that.

Adam: Thank you so much, Shaila.

Earlier this year, AWS launched two new services that are specifically focused on the customer needs and use cases that Shaila just described:

AWS Entity Resolution - Which helps customers match, link and enhance related records across applications, channels and data stores using flexible, configurable workflows that can take only minutes to set up. This helps customers with that use case where they want to unify all that data and create a unified view of their customers.
AWS Clean Rooms - Which helps customers match, analyze and collaborate with any company on the AWS cloud to generate new insights without revealing your raw data or moving it from your AWS environment. This helps in the situation where I have data, you have data, we want to analyze our collective data sets, but I don't want to share my data with you, you don't want to share your data with me - how do we do that? AWS Clean Rooms.

Let's illustrate with an example from travel and hospitality. Let's say I'm an airline. Most likely I have a loyalty program. Now in an ideal world, one person has one account, but we don't live in an ideal world. What typically happens is that some consumers have multiple loyalty accounts. Why do they do that? Maybe they forgot their login, maybe they were on the plane trying to access wifi, they got frustrated, they created a new account. So that airline has a need to deduplicate and link those loyalty accounts together.

Or maybe, based on the marketing needs of that company, they want to know when two or more people are in the same household, so they could use AWS Entity Resolution for that as well.

But let's say an individual buys a ticket but, for whatever reason, does not use their loyalty account ID. Now you have a purchase in one database and you have another database for their loyalty accounts. Once again, in an ideal world, the airline would like to link those together into one unified view of that consumer.

Now, let's say for a marketing program, the airline has pulled together all this information, they see what tickets you purchased and they know that you have certain loyalty status, but that's all they know about you. How do they learn more? Well, they could collaborate with a hotel or a credit card company or a streaming TV provider. In order to perform that collaboration there may be hesitation by all involved. The airline may not want to share their data with the streaming TV provider and vice versa.

AWS Clean Rooms is a service that can be used between the airline and their partners to perform analysis across the data sets to learn more about these consumers without exchanging the granular content of the data.

AWS Entity Resolution and AWS Clean Rooms - two distinct services, but as they say in the commercial, "Who doesn't like two great tastes that taste great together?" And it does hold true for these services.

Now, what kind of benefits can customers expect when using these two services?

First, composability. Not only can customers use AWS Entity Resolution or AWS Clean Rooms, there are hundreds of AWS services that can be used in combination to enhance the solutions that customers build. Whether it's databases, analytics services or ML services used in combination with AWS Entity Resolution and AWS Clean Rooms, AWS knows composability.

Easy setup. We made it a priority to use services like AWS Glue Data Catalog so that customers, once they index all their data and create a metadata catalog, can use it across the services. Easy setup: top priority.

One of our core tenets when developing these services is that we want to minimize the movement of data. With both of these services, if you have your data in an AWS data lake, the first step in using either of them is not to move your data anywhere else; it is to leave it in place. Zero ETL. Do not move your data at AWS.

There are hundreds of thousands of marketers, agencies, system integrators, advertising technology companies, marketing technology companies, publishers and media platforms. If you want to collaborate for advertising and marketing use cases, all your friends are here at AWS.

And finally, like most AWS services, these services are pay as you go. No subscription, no upfront commitment. Only pay for what you use.

Now, let's see how it works:

AWS Entity Resolution - You have your data in a data lake on Amazon S3. We make it very easy and straightforward to map the schema of your data in up to 20 data lakes on AWS to a schema that AWS Entity Resolution can recognize - name, address, phone number, email or even custom fields. You map that one time. Now the system is ready to go.

There are three different ways that you can match and unify your data with AWS Entity Resolution:
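The one-time schema mapping described here can be pictured with a small sketch. This is plain Python, not the service's API: the source names, column names and the `to_canonical` helper are all hypothetical, chosen to match the airline example.

```python
# Hypothetical source layouts for two data stores in the airline example.
# Each mapping says which source column feeds which recognized attribute
# (name, address, phone, email). Names here are illustrative assumptions.
SCHEMA_MAPPINGS = {
    "loyalty": {
        "member_email": "email",
        "member_phone": "phone",
        "full_name": "name",
    },
    "ticketing": {
        "contact_email": "email",
        "passenger": "name",
        "mailing_addr": "address",
    },
}


def to_canonical(source: str, row: dict) -> dict:
    """Rename a row's mapped columns to canonical attribute names.

    Columns without a mapping are dropped, so only the fields intended
    for matching are carried forward.
    """
    mapping = SCHEMA_MAPPINGS[source]
    return {canonical: row[col] for col, canonical in mapping.items() if col in row}
```

Once every source is expressed in the same attribute names, the matching step no longer needs to know which system a record came from.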

You can perform deterministic rule based matching where you can have a prioritized waterfall of deterministic matching.
We have machine learning matching featuring a machine learning model developed here at AWS. You can use machine learning to unify those records together.
And we also have provider integrations with LiveRamp, TransUnion and Unified ID 2.0. Once again, this whole notion of composability - have it your way. You can mix and match the style of service you want to use for your data matching.

When you perform these Entity Resolution workflows, the output goes into an S3 bucket, which could be the starting point for your data collaboration with AWS Clean Rooms.

With AWS Clean Rooms, you can have up to five collaboration partners. Each partner once again keeps their data in their own AWS data lake on S3. No data movement.
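To make the "prioritized waterfall of deterministic matching" concrete, here is a minimal sketch in plain Python. It is not how the service is implemented: the rule list, field names and `assign_match_ids` function are assumptions for illustration, and the logic is deliberately simple (a record joins the first earlier record it matches under the highest-priority rule, with no later merging of groups).

```python
from typing import Optional

# Hypothetical rules, highest priority first. Every field in a rule must
# be present and equal for two records to match under that rule.
RULES = [("email",), ("phone", "last_name"), ("last_name", "address")]


def normalize(value: Optional[str]) -> Optional[str]:
    """Lowercase and collapse whitespace; treat empty values as missing."""
    if value is None:
        return None
    cleaned = " ".join(value.lower().split())
    return cleaned or None


def rule_key(record: dict, rule: tuple) -> Optional[tuple]:
    """Build the comparison key for one rule, or None if a field is missing."""
    values = [normalize(record.get(field)) for field in rule]
    if any(v is None for v in values):
        return None
    return tuple(values)


def assign_match_ids(records: list) -> list:
    """Return one match ID per record, trying rules in priority order."""
    index = {}  # (rule position, key) -> match ID of the first record seen
    match_ids = []
    next_id = 1
    for record in records:
        keys = [(i, rule_key(record, rule)) for i, rule in enumerate(RULES)]
        keys = [(i, key) for i, key in keys if key is not None]
        match_id = None
        for entry in keys:  # keys are already in priority order
            if entry in index:
                match_id = index[entry]
                break
        if match_id is None:
            match_id = next_id
            next_id += 1
        for entry in keys:
            index.setdefault(entry, match_id)
        match_ids.append(match_id)
    return match_ids
```

In the airline example, two loyalty accounts with the same email would link under the first rule, while a ticket purchase with a different email but the same phone and last name would fall through to the second rule.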

Each collaboration partner sets their own analysis rules on the data. Everyone gets to lay down what kind of rules that they want enforced on their data for analysis purposes. So if you have a column, let's say it's a hashed email that you only want to use for matching and you never want it to be revealed in the clear, you can set those rules. You can also set things like minimum aggregation constraints where you can say unless there are 25 rows that are returned in a query, I don't want anything returned. Each participant in the collaboration can set those rules.
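The two rules mentioned here, a column that is used only for matching and a minimum aggregation threshold, can be sketched in plain Python. This is an illustration of the idea and not the AWS Clean Rooms API: the `overlap_counts` function and the column names are hypothetical, and the default of 25 comes from the example in the talk.

```python
from collections import defaultdict


def overlap_counts(airline_rows: list, partner_rows: list, min_users: int = 25) -> dict:
    """Count overlapping users per (loyalty tier, partner segment).

    The hashed email is used only to join the two data sets; it never
    appears in the output. Any group with fewer than `min_users` distinct
    users is suppressed entirely.
    """
    partner_segments = {row["hashed_email"]: row["segment"] for row in partner_rows}
    users_by_group = defaultdict(set)
    for row in airline_rows:
        segment = partner_segments.get(row["hashed_email"])
        if segment is not None:
            users_by_group[(row["loyalty_tier"], segment)].add(row["hashed_email"])
    return {
        group: len(users)
        for group, users in users_by_group.items()
        if len(users) >= min_users
    }
```

With 30 overlapping gold-tier members and 5 silver-tier members, only the gold group would be returned under the default threshold; the silver group is withheld because it is too small.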

And when you execute a collaboration, you can get an aggregate metric as the output or, in the case where you want a list of IDs or attributes as an output, you can have that output to S3 as well. So in this example, you're the airline: you want to unify your data and you want to perform analysis with a streaming TV provider or a hotel or whoever your partner is. This is how you put those services together.

Now, at AWS, we recognize that customers across many industries have advertising and marketing needs. It doesn't matter if you're in travel and hospitality, automotive or software: everyone has customers, everyone wants to get to know their customers better. Everyone wants to acquire their next great customer. But we're also cognizant of the fact that within those companies, there are different personas that we engage every day. We like to say that at AWS, everyone's a builder. But our reality is that when we're interacting with customers, some personas within the company are more technically proficient than others.

So we've made it a priority here at AWS to make sure that we have the right tools, solutions and capabilities for all manner of persona at our customers' companies. So if you are a technical builder and you want to be hands-on, of course, we have a full array of services that you can use here at AWS, including AWS Entity Resolution and AWS Clean Rooms. We also provide guidance and turnkey solutions to get you going faster. But an area that's really important to us is to work hand in hand with our partners to make sure that customers who lean more towards a buyer persona or a composable hybrid persona have the tools, the solutions and the capabilities that they need, brought to life through our partners.

And a great example of one of those partners is ActionIQ. Please join me in welcoming the Chief Product Officer for ActionIQ, Justin de Brabant.

Justin: Thank you. As you said, my name is Justin de Brabant. I'm the Chief Product Officer at ActionIQ. We're a customer data platform. By way of introduction, my background is in data and data infrastructure. I've been in this space really since the early days of big data. Initially, I did a PhD in databases. I was at Brown and MIT right around the time when things like Hadoop were just coming out and really transforming how we invest in data and infrastructure. I realized I didn't want to go into academia and spent a number of years in a consulting practice, consulting around data lake deployments in the big data 1.0 era. I realized I wanted to build, and eventually, after a few misguided years at a hedge fund and in finance, I joined ActionIQ a little over eight years ago, and now I'm the Chief Product Officer.

So, what are we gonna talk about today? A lot of data. I actually can't get this out of my head: we were at Thanksgiving and my daughter was giving an impression of what her dad does for work, and she went and got my AirPods and walked around the house and just said, "data, data, data" a bunch of times. So that's kind of what's in the back of my head as I'm going through this talk. I don't think I'm gonna disappoint her.

So, a quick agenda for today. I'll tell you a little bit about ActionIQ and talk about the CDP space in a general sense, since I assume most folks here are not super familiar with it. I'll talk about the trend in the CDP space around what we call composability and interoperability, talk about the architecture that enables that composability, and then talk about how identity and data collaboration are two trends in that composable market and what that looks like today in our collaboration with AWS.

So, first about ActionIQ. We are a customer experience hub built on a composable CDP. The company is a little over eight years old. We're based in New York. We have a little over 200 employees globally, and we're backed by Andreessen, Sequoia and March Capital.

As you can see here, the focus of ActionIQ has always been enterprise customers. Most of our customers are Global 2000. The reason that's relevant is because what we do involves a lot of scale and complexity of data and channels, and I'll talk a little bit more about how that impacts some of the decisions we made from an architecture and partner standpoint.

So, talking about scale: before we get into what we do and how we do it, let's just anchor on a couple of numbers. We are talking about real big data here. I know people throw that term around quite a bit. We deal with a lot of data and, importantly, a lot of complexity of data. Some of the examples Adam gave around the difficulty of stitching together different data sets: we do that at scale every day for our customers. We ingest a little over a petabyte of data every single day, a combination of batch and real-time data, usually coming from multiple sources and being reconciled into one unified customer 360 view for analytics.

What are we doing with that data? Once we've ingested it into the platform, whether it's building audiences, getting dashboards or running models, broadly speaking we categorize this as queries that we're running on the data, and we do a little over 1 million per week across all of our customers.

And then activations are what we are actually doing with that data. As the name ActionIQ implies, the goal is to drive actions that drive personalized customer experiences. And we do a little over 2.3 trillion activations over this calendar year on behalf of our customers.

So it's not just that we're ingesting a lot of data, we're doing a lot of analytics and queries on top of that data. And then ultimately, we're activating out very large volumes across the whole ecosystem of customer experience or CX channels.

So, big data. Now what's our goal with that data? Our mission from day one has always been to transform customer experiences with the power of data. We believe strongly that brands can improve how they interact with their customers, especially as they have more data about their customers.

There's a lot of potential in that data. Customers give a lot of signals about what they're interested in or what they're not interested in. But at that scale, there's a lot of nuance and a lot of complexity in doing that analysis. In doing that right, you can ultimately build much better relationships with your customers and make their experiences with your brand much more personalized. And importantly, it's not just about the personalization as a one-off, but making sure that's consistent across every single channel touch point that they have with you. Whether it be paid media, email or mobile push, it all needs to be very consistent. And historically, it has not been. Some of it's been batch and blast. When there is personalization, it's a single campaign and a single channel. It's not connected with other experiences that you might have: you call customer support, and they don't realize that you were part of a marketing campaign and received an offer. You try to get that offer. It's frustrating. Everything has been disconnected historically.

So we set out to change that, and the broader CDP category has evolved around that promise. When the category started a little over seven years ago, there were two big promises that were made. One was to break down traditional data silos. At the time especially, the data wasn't consolidated in a cloud data warehouse, right? There certainly was momentum in that direction. But when we went and talked with big enterprise brands, the challenge was: hey, we have a data mart that marketing has, we have some Hadoop cluster that the CIO built, we have data that is actually still in the channels, maybe email open and click activity data.

There's digital behavioral data, clickstream data, that lives within Google or Adobe's environment and only there. And it's never been integrated into our own first-party data infrastructure ecosystem. So if you wanna make use of all that data, the first step is to consolidate it, break down those silos and provide this kind of unified customer view.

And then finally, again, the activation of that: being able to take that data, get insights, and make decisions and orchestrate across every outbound customer experience channel.

So what does that look like? If you're at all familiar with the CDP space, you are familiar with the obligatory butterfly slide where you have data on the left and destinations and channels on the right. Pretty straightforward: unify the data, provide insights, decisioning and orchestration, and then ultimately activate out across the downstream destination channels, wherever the customer is.

And again, it's not just marketing, it's all of customer experience. It could be CRM systems or call center systems or certainly outbound channels, the website, etc.

Now, importantly, CDPs were never really designed to do that last-mile delivery, right? They're doing the decisioning, deciding who's part of the campaign, what offer and maybe what content they should get. But still the ESP does the delivery, Google delivers the actual ads, etc.

So it's more about integrating with the full ecosystem as opposed to replacing it. So, sounds great. But a lot's changed. As the CDP market was maturing over the last seven years, the data infrastructure and data lake market was maturing as well.

And when I said that we set out initially and the first challenge we had to solve was breaking down those traditional data silos, what happened in parallel to us doing that was that brands and companies and CIOs were doing that themselves, right? They were investing in migrating to the cloud. Big data 2.0, right? Building out data lakes, whether it be S3 and EC2 and Athena, or Redshift, or Snowflake, or Databricks. Everybody's investing in these big cloud data migrations to these modern systems.

And what's been interesting about that is, if a brand has invested probably years and many, many millions of dollars to get the data in the cloud and get it into one place, then that promise of breaking down these traditional silos for the CDP starts to sound less appealing, right? The data is already there. Why do we need a CDP to come in and manage their own copy of the data?

So all this was happening in parallel, and what you're seeing now is a shift, I certainly think for the better, because I don't think CDPs were ever intended to be data lakes or data marts. CDPs are evolving to really embrace the infrastructure that the brands are building themselves.

And that's where this trend around composability is coming from. Increasingly what you're seeing is CDPs that are really coming in and figuring out how to integrate in novel ways with the client's cloud data warehouses and ecosystems. And even on prem: we have a number of customers in regulated industries where we deploy in a composable way with on-prem systems, right? This is not something that's strictly cloud related, though certainly the cloud makes it easier.

And these systems work and can scale technically and also commercially. Commercially is important because if you have terabytes and terabytes of data a day and you have it in a CDP and you have it in your cloud data lake, well, guess what? It's expensive, right? Storing and querying data in multiple places is expensive.

CDPs are increasingly looking at how to complement existing investments in infrastructure. Again, how do you come into the data lake project and not force a kind of reinvention or reinvestment, but come in and say: hey, we're going to complement that work. We're going to build on top of it, we're going to have quicker time to value, we're not going to force you to duplicate data infrastructure and technology across the stack just to make use of the CDP.

And then the last one is really the conclusion of this: this deployment model now needs to be zero copy and warehouse first. So it's a move away from CDPs being bundled and having their own copy of the data, and towards a model where the data warehouse, the cloud data lake, the lakehouse, whatever you want to call it, is the primary compute and storage layer, and the CDP comes in and integrates on top of that in a way that doesn't force copying of data outside of that ecosystem.

There are many advantages to this: time to value, governance, security. Depending on who you ask in the organization, everybody wins in this model, assuming the cloud data warehouse is there and mature and has the data necessary to get going. So what does this look like?

As you can imagine, because we've been at this for eight years, we initially invested heavily in this bundled model where ActionIQ was managing the data. And I showed those numbers around ingesting a petabyte of data on a daily basis. That's for bundled deployments where we are managing the data directly.

We built our entire infrastructure on spot instances and S3, right? So we run all those jobs on a daily basis in our own infrastructure on spot instances and S3. But in the last couple of years, we've really invested in this more composable model, which is the bottom layer here, where the client is ingesting data into their own storage and compute ecosystem (Redshift, an S3 data lake, et cetera) and we're essentially pushing these queries down.

And so in this model, we become essentially a data virtualization layer where you see the data through our applications, but we're never actually managing a direct copy of that data. We're essentially federating these queries down to the client's infrastructure, never needing to copy the data back. The only thing that comes back is really the results of these queries, which in the context of CX queries is usually just sets of user IDs that have qualified for an offer or a campaign, right? So it's really just sets of user IDs, probably anonymized. We're talking about orders of magnitude reduction in data. And that's what we ultimately integrate with the downstream CX channels. Certainly we have customers that do that integration themselves. But also usually that last-mile integration is a point of value for the CDP, and so we're managing that ourselves.
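The pushdown pattern described here can be sketched with Python's built-in `sqlite3` module standing in for the client's warehouse. Everything in it is an assumption for illustration (the `purchases` table, the spend threshold, the function names); it is not ActionIQ's implementation. The point it shows is that the query runs where the data lives and only qualifying user IDs come back.

```python
import sqlite3


def make_demo_warehouse() -> sqlite3.Connection:
    """Create an in-memory database standing in for the client's warehouse."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE purchases (user_id TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO purchases VALUES (?, ?)",
        [("u1", 120.0), ("u1", 90.0), ("u2", 40.0), ("u3", 300.0)],
    )
    return conn


def build_audience_sql(min_spend: float) -> tuple:
    """Translate a (hypothetical) audience definition into SQL plus parameters."""
    sql = (
        "SELECT user_id FROM purchases "
        "GROUP BY user_id HAVING SUM(amount) >= ? "
        "ORDER BY user_id"
    )
    return sql, (min_spend,)


def run_in_warehouse(conn: sqlite3.Connection, sql: str, params: tuple) -> list:
    """Execute the query inside the warehouse; only user IDs are returned."""
    return [row[0] for row in conn.execute(sql, params)]


warehouse = make_demo_warehouse()
sql, params = build_audience_sql(min_spend=200.0)
audience = run_in_warehouse(warehouse, sql, params)
```

The raw purchase rows never leave the warehouse connection. The caller holds only the short list of IDs, which is what would be handed to a downstream channel.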

So when we think about this though, it's, it's not just about ok, well, the world was bundled and now the world is composable because anybody who's ever done one of these data lake projects knows that it's a process and it will probably never end. New data sources are coming on. New ways to use the data are always coming into play.

And so the decision points are more of a spectrum. It's not that all the data exists in the cloud data warehouse or all the data exists in the CDP. It's really about picking and choosing where to fall on that spectrum at any given point in time, given the readiness of the cloud data warehouse.

Certainly, if all the data is there, it's modeled correctly and it's ready to go, sit on top of it, fully composable. Great. We have a number of customers that deploy purely in that model. But maybe the data for all the use cases isn't there. Maybe 80% of it is. What do you do in that case? Do you not do those use cases? Well, the CMO doesn't love that answer.

And so what we find is a lot of customers are actually starting to deploy in what we call a hybrid composable model, where it's composable first. If the data is available in the warehouse, we integrate with it there and push all the queries down, but we also enable them to manage some of the data directly within our infrastructure and our CDP: maybe real-time data sets and use cases, or data that hasn't yet been integrated into the data warehouse where we have native integrations with that application to bring it in. It could be a number of different reasons.

And in this hybrid composable deployment model, what we actually enable is for customers to pick and choose, on a table-by-table basis, where the data lives and where it's queried. And we provide an abstraction layer over that data in our applications that really abstracts away a lot of those implementation details. So for the end user using our applications, building audiences, building journeys and ultimately triggering the CX activations, they don't need to know or care where the data lives and where it's queried. It all looks like a simple drag-and-drop interface to them. They can build audiences and get insights; whether the query is running on Redshift or ActionIQ-managed infrastructure is irrelevant to a marketing team.

And that's our goal, right? When we talk about composability, you want to abstract away the complexity of the implementation details of the underlying infrastructure while increasingly supporting a complex ecosystem of infrastructure, potentially spanning multiple different sources, but still provide a simple, easy-to-use interface for non-technical users. Because that is a really critical requirement when we talk about CX. It's not just that you can have data analysts and more technical resources that have access to the data. You really need to democratize that access throughout the organization and provide a simple interface for these non-technical users to do that.

And then importantly, you need to be able to evolve with the brands over time, right? If the goal is to be 100% composable, fantastic: you need to provide mechanisms to migrate data from bundled to composable in a way that doesn't disrupt campaign operations and the day-to-day business of the marketing teams. OK? So this is data, right? And this has been the trend around composable in the CDP space over the last couple of years.

We're all in on this. It's super exciting. And we've partnered with AWS for a number of these innovations. But what about identity, right? Where does identity fit in the grand scheme of what we're talking about here?
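A table-by-table placement like the one described here could be represented, in a very reduced form, as a lookup that the application consults before running a query. The table names, engine labels and `plan_query` function below are hypothetical; this is a sketch of the idea, not ActionIQ's design.

```python
from collections import defaultdict

# Hypothetical catalog: where each table lives and is queried.
# "warehouse" = the client's own environment (for example Redshift),
# "managed"   = CDP-managed infrastructure (for example real-time data).
TABLE_LOCATIONS = {
    "purchases": "warehouse",
    "loyalty_accounts": "warehouse",
    "web_events_realtime": "managed",
}


def plan_query(tables: list) -> dict:
    """Group the tables an audience needs by the engine that will query them."""
    plan = defaultdict(list)
    for table in tables:
        if table not in TABLE_LOCATIONS:
            raise ValueError(f"unknown table: {table}")
        plan[TABLE_LOCATIONS[table]].append(table)
    return dict(plan)
```

The marketer building an audience only names the tables. Moving a table from managed to warehouse later is a one-line catalog change, which is one way the migration from bundled to composable can avoid disrupting existing audiences.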

Well, while the data infrastructure landscape has been changing a lot in the last couple of years, the identity landscape has been changing a lot as well. Much has been written, and much hand-wringing has gone on, over the disappearance of third-party cookies, eventually, at some point, right?

The reason why it's so important is because that was the identity fabric of the digital ecosystem at least. And it's going away. So there's a shift to different sources of identity and, importantly, which aligns with the vision of the CDP, a shift towards more first-party data, right?

Advertisers and marketers realize that, hey, this massive ecosystem of anonymous third-party cookie-based data is no longer going to be available. We need to shift to really owning our data strategy, being first-party data driven, and then augmenting that with second-party data collaboration or third-party data partnerships, right?

So it's not that second- and third-party data are going away. It's just that the nature of them and how you integrate them needs to be first-party driven, and then you layer on different second- and third-party strategies to that.

Um, and you can see here, you know a lot of graphs up into the right spending, right, spending investments in identity and identity services um increasing as brands figure out how to um shift to this first party data strategy and and integrate new um and future proof identity solutions into that ecosystem.

And then, as we've been talking about, we don't want to lose this ideal of moving towards a composable architecture, where you don't have to copy the data outside of that ecosystem, as we figure out how to work with identity and second-party data collaborators.

You want to do it in a warehouse-centric way where you don't have to copy the data. It's still zero copy, just like the other architecture that we showed. So what does this mean?

I think the logical conclusion is that identity really should live where the data is. Identity needs to be stitched into the ecosystem of what you're doing in the cloud data warehouse.

And it's likely coming from the major cloud providers or data infrastructure providers, as additional products and services that they're layering onto the core infrastructure they provide.

It needs to be done in a privacy-centric way, and you need ways to collaborate with your partners in a private, secure and governed way.

And bundled solutions, I think, are increasingly going to be a thing of the past. More and more brands, at least aspirationally, are embracing this idea of composable, where they control their data strategy and it lives in their ecosystem. They want applications, whether CDPs or others, to plug in and complement that, and to do it in a zero-copy way that still delivers on the promise and value of what that application is trying to do.

And you're seeing that right? You're seeing these composable solutions emerging in the market.

And this is where, over the last couple of years, we've really been partnering with AWS.

As we shifted our strategy from bundled to composable on the data and infrastructure side, we looked for partners in the ecosystem around identity and data collaboration as well, and AWS stood out in their investment in that. So what does that look like?

Architecture isn't supposed to be perfect, but what we have here on the left is the client's or customer's cloud data environment and a number of services they're leveraging: storage, compute and data modeling are all within that same cloud environment.

And then what we're layering on here is the identity and the data collaboration, all within that client's cloud environment, completely owned and managed by them.

And then over on the right, you have the CDP deployed in a composable fashion: again, queries get pushed down and federated to that client's cloud data warehouse, building audiences, orchestrating these campaigns and ultimately triggering these experiences.

And then the last-mile delivery is the channels that handle inbound and outbound: email, mobile, push, web, et cetera.

So composability works, and it's increasingly important with identity and data collaboration as well. You shouldn't break the model, and really that's what we get with AWS Entity Resolution. Profiles and identities can live within the client's ecosystem. You never have to move the data. It's built on top of the same storage and the same catalogs that are used for the data lake. That's really powerful.

It's really easy to get going. So what does that look like? Starting from the data storage and collection, you have the ingest and normalization of data. One thing that we did find in our POC is that you do have to do some upfront modeling and cleaning of the data. Doing things like email verification and physical address standardization before you go into the matching really improves the results of the matching algorithm.
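To make the cleaning step concrete, here is a minimal sketch in Python of what "standardize before you match" might look like. It is an illustration only, not the pipeline used in the POC: the abbreviation table, the email pattern and all function names are assumptions made for this example, and a production pipeline would likely rely on a fuller postal standard or a dedicated verification service.

```python
import re

# Hypothetical abbreviation table (an assumption for this sketch); a real
# pipeline would use a fuller postal-standard list or an address service.
STREET_ABBREVIATIONS = {
    "street": "st",
    "avenue": "ave",
    "road": "rd",
    "boulevard": "blvd",
    "apartment": "apt",
    "suite": "ste",
}

# Deliberately loose syntax check; it does not prove the mailbox exists.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def normalize_email(raw):
    """Trim and lowercase an email; return None if it is not plausibly valid."""
    if raw is None:
        return None
    email = raw.strip().lower()
    return email if EMAIL_PATTERN.match(email) else None


def normalize_address(raw):
    """Lowercase, drop punctuation, collapse whitespace, abbreviate street words."""
    cleaned = re.sub(r"[^\w\s]", " ", raw.lower())
    tokens = [STREET_ABBREVIATIONS.get(tok, tok) for tok in cleaned.split()]
    return " ".join(tokens)


def normalize_record(record):
    """Return a copy of the record with cleaned email and address fields."""
    return {
        **record,
        "email": normalize_email(record.get("email")),
        "address": normalize_address(record.get("address") or ""),
    }
```

With this in place, "123 Main Street, Apartment 4B" and "123 MAIN ST. APT 4B" normalize to the same string, so a downstream exact-match rule can treat them as the same address.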

So you ingest that data, you write the normalized data to S3, register it with Glue, and map it to AWS Entity Resolution. You take those resolved profiles, and they become the customer 360 foundation for segmentation, audience building and ultimately orchestration. That whole orange box in the center can exist either in ActionIQ-managed infrastructure or in client-managed AWS infrastructure. That's the power of this: you can really drag and drop. If you have the ability to host this within your AWS environment, great, we'll just consume the output of that. If not, we can provide these services out of the box as well. So it's a really powerful story to, again, meet brands wherever they are in this journey of investing in their infrastructure and migrating their applications to be more composable.

So what did we do to verify this? We tested with a large retailer. We had a couple of requirements going in, and it had to integrate within our infrastructure and ecosystem. This was a client that was already deployed on a previous solution. For the matching to be at the standards that we had set for that client, we had to support a combination of rules-based and ML-based identity resolution. These are really important, and they're very complementary to each other. I'm sure there's going to be much talk about AI at the conference today, and that's great. Certainly ML-based matching can augment here, but a lot can be done with just simple rules as well. So you need the ability to do both, especially since rules-based matching can be done much more incrementally and typically much more cost-effectively.
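As a rough illustration of how far "just simple rules" can go, the sketch below groups records that agree on every field of at least one rule, using a union-find structure so that matches chain together. It is a generic sketch, not the matching logic of AWS Entity Resolution or ActionIQ; the function name, the rule format and the field names are all assumptions for this example.

```python
def match_by_rules(records, rules):
    """Group record ids whose values agree on every field of at least one rule.

    records: list of dicts, each with a unique, comparable "id".
    rules: list of field-name tuples, e.g. [("email",), ("last_name", "phone")].
    Returns a sorted list of sorted id groups.
    """
    parent = {rec["id"]: rec["id"] for rec in records}

    def find(x):
        # Walk up to the root, shortening the path as we go (path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[max(root_a, root_b)] = min(root_a, root_b)

    for rule in rules:
        first_seen = {}  # rule key -> id of the first record with that key
        for rec in records:
            key = tuple(rec.get(field) for field in rule)
            if any(value in (None, "") for value in key):
                continue  # a rule never matches on missing values
            if key in first_seen:
                union(first_seen[key], rec["id"])
            else:
                first_seen[key] = rec["id"]

    groups = {}
    for rec in records:
        groups.setdefault(find(rec["id"]), []).append(rec["id"])
    return sorted(sorted(ids) for ids in groups.values())
```

Because each rule is an exact comparison on a key, it can be evaluated with a hash lookup per record, which is part of why rules-based matching tends to be cheap; an ML-based matcher would typically be layered on for the fuzzy cases that no exact rule catches.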

And then scale. Scale is really important. We have customers with hundreds and hundreds of millions of customers in their universe, and you need to be able to do identity resolution at that scale and on a very iterative basis.

So what did we find? One was faster time to value versus the previous solution. As was mentioned, there was an emphasis with AWS Entity Resolution on building it so that less technical users could easily configure it. We found that the amount of configuration was in the Goldilocks zone: it was just right. There was enough out of the box that we didn't have to rethink this from scratch every single time. But when we were dealing with the nuances of that customer's data and their use case, we could configure and fine-tune it using the UI that was provided. We were able to deploy this within a couple of days and get results very quickly.

And then lower cost. This was a function of the fact that AWS Entity Resolution was built from the ground up to be incremental. There are really two approaches to identity resolution. One is brute-force batch, where you look at the whole universe and match across everything on a daily basis. You need some notion of a persistent ID, so you keep some state between runs, but you're taking into account all the data and all the matching logic every time this runs.

The much more nuanced approach is to do the matching only on data that has changed since the last run. And guess what: data doesn't change all that much. A lot of the time there's a lot of new data, but it's not as if old data that influenced a previous match is being updated. That can happen, but it usually stabilizes pretty quickly. So it's pretty inefficient to take the brute-force approach and rematch on all the historical data every day. With AWS Entity Resolution, we found that it was really easy to get that more incremental matching logic out of the box, which improved performance and lowered overall cost, since it is a pay-as-you-use model.
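The difference between the two approaches can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how AWS Entity Resolution works internally: the class name, the single-field exact match and the content fingerprint used to detect changes are all hypothetical choices for illustration. The point is only that persistent IDs and a little state between runs let each run skip records that have not changed.

```python
import hashlib
import json


class IncrementalMatcher:
    """Keeps state between runs so each run only matches new or changed records."""

    def __init__(self, key_field="email"):
        self.key_field = key_field
        self.fingerprints = {}   # record id -> hash of the last seen content
        self.key_to_pid = {}     # match key -> persistent id
        self.record_to_pid = {}  # record id -> persistent id
        self._next_pid = 1

    @staticmethod
    def _fingerprint(record):
        payload = json.dumps(record, sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()

    def run(self, records):
        """Match only records whose content changed; return how many were matched."""
        processed = 0
        for record in records:
            fp = self._fingerprint(record)
            if self.fingerprints.get(record["id"]) == fp:
                continue  # unchanged since the last run, keep the existing match
            self.fingerprints[record["id"]] = fp
            key = record[self.key_field]
            if key not in self.key_to_pid:
                # First time this key is seen: mint a new persistent id.
                self.key_to_pid[key] = f"p{self._next_pid}"
                self._next_pid += 1
            self.record_to_pid[record["id"]] = self.key_to_pid[key]
            processed += 1
        return processed
```

In this toy model a second run over yesterday's data plus one new record does matching work for one record instead of the whole universe, while previously assigned persistent IDs stay stable. A brute-force batch would redo the matching for every record on every run.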

And I think Clean Rooms is a very clear extension of this, because the identity that you've created is now the fabric with which you can collaborate across different ecosystems in a secure and governed way. Through the AWS Entity Resolution and AWS Clean Rooms integration, you can extend that same architecture: the output of ActionIQ audiences and campaigns can tie into AWS Clean Rooms and allow different brands and publishers to interact with each other's data and get insights, again in a secure and governed way, without ever needing to move or copy the data.

So they share the same core identity, the integration works without ever having to move the data, it all happens within the client's ecosystem, and AWS Clean Rooms provides the mechanisms for privacy, security and governance of that data collaboration. It's a win-win.
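To illustrate why a shared identifier makes this kind of collaboration tractable, here is a small Python sketch of an audience overlap query that returns only aggregate counts. It is not the AWS Clean Rooms API, and it runs in one process purely for illustration. The function name, the data shapes and the minimum group size (a control commonly associated with clean rooms, but not something described in this talk) are all assumptions for the example.

```python
from collections import Counter


def overlap_by_segment(brand_ids, publisher_rows, min_group_size=2):
    """Count shared identities per publisher segment, hiding small groups.

    brand_ids: iterable of resolved ids held by the brand.
    publisher_rows: iterable of (resolved_id, segment) pairs held by the publisher.
    Only aggregate counts are returned, never row-level ids.
    """
    brand_id_set = set(brand_ids)
    counts = Counter(
        segment
        for resolved_id, segment in set(publisher_rows)  # de-duplicate pairs
        if resolved_id in brand_id_set
    )
    # Suppress segments whose overlap is too small to report safely.
    return {seg: n for seg, n in counts.items() if n >= min_group_size}
```

The join only works because both sides key their rows on the same resolved identity; without that common identifier the two data sets would have nothing reliable to overlap on.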

OK, so that's it for me. I think Dvor is going to summarize, recap and talk about what's next.

Thank you, Justin.

My name is Dvor Golo. I'm the General Manager in AWS Applications for a couple of these customer data services that we've been talking about. In general, I would say the mission that Justin just mentioned, transforming customer experiences with the power of data, is one I think everyone can relate to. We're here thinking about how these types of services can help our companies create better experiences for their customers, and that is really at the heart of what we're trying to build.

We talked about the need to begin with a common identity, a common set of capabilities that help in this entity resolution space. We've heard from many companies, like ActionIQ and others across different industries, that this initial layer of resolving identity is the first step. How do you bring together information and data that is oftentimes unstructured and messy, and that sits in different data repositories and different applications? How do you bring that together, stitch it with one common identifier, and link and deduplicate records so that you can get a better view of those customers? We see this as a first step towards that mission.

Second is this concept of interoperability. That layer of identity and linking of records also helps you interoperate that data set with other data sets. When you think about the data collaboration space, it's not only within your own company that you're trying to link different data sets, for example from customer support, marketing or sales, but also with information that resides in partners' data sets. This is where, pulling it back to data clean rooms, that concept comes in: resolving that identity and having a common identifier makes the data collaboration problem much easier to solve.

So this combination of entity resolution and data collaboration is a common theme that we've heard from our customers. Within AWS, we are aligning to that mission of helping transform these experiences, and we're doing it our way: we're building, and we're hearing back from our customers. They need AWS to bring these composable data services that can either plug into an existing application or stack that they have, or help them build from the ground up to solve these problems.

What this means for our customers is that we can bring these assets for entity resolution and clean rooms together to ultimately land on a unified view of the customer. Many customers already operate on their data within AWS. So one of the key questions in building these services, as I think Adam and Justin both mentioned, is: how do we minimize data movement?

Bringing it back to the benefit that AWS can bring to all of you: we acknowledge that there are hundreds of thousands of data lakes in AWS already, and you have your data there. How can we make it super easy for you to operate directly on that? That has been part of our mission. We think we can help our customers transform those experiences if we make the data they have all over AWS much more usable.

So that's one key component. Second, we talked about how easy it is to set up, but it's also about how easy it is to configure. Companies like ActionIQ, which is a CDP, have a different set of requirements than, say, a retailer or a travel and hospitality company. So with these services, what we try to do is make sure they are defined by rules and configurations that can be made specific to your use cases.

So I encourage you, as you learn about our services, to really test them out and see how they adapt to your data needs and your use cases, and of course come back to us and let us know how well they adapted.

Adam talked about the extensive network, and this is what we're trying to build. Within AWS, there are so many different companies tackling this space. When you think about all of them trying to resolve identities and enable data collaboration, it becomes super powerful to have an extensive network of data assets out there that can enable new use cases in this area.

And again, on the concept of pay as you go: we really want to make sure that we deliver a set of capabilities at the right price point for our customers, because we believe that faster time to value and lower cost are among the key things we want to deliver.

So in general, I just want to say these are new services with which we're looking to help companies across different industries tie together the identity space and the data collaboration space, ultimately creating that unified view of customers. Pointing back to the mission, that can help you transform the experiences you enable across your channels.

So as we wrap up, I just want to mention some key takeaways. You have the QR codes here to look at more details for AWS Entity Resolution as well as AWS Clean Rooms, the two services that we spoke about extensively today.

Also, do not miss the keynote from Swami. That's going to be a great one, coming up at 8:30 tomorrow. And later today, we have the Advertising and Marketing booth, where you can come over and visit us and we can talk more.

And lastly, we will be taking some questions over on that side of the hallway, so feel free to stop by. We'll head over there. I just want to say thank you. This has been a great session, thanks to our customers. Have a great re:Invent. Thank you all.
