How Choice Hotels is unifying guest profiles to drive personalization

Thank you, Claire. So before I begin, a little bit about who Choice Hotels is, at the expense of doing a little bit of slide reading, but making our PR team very happy because I will say exactly what they told me to say.

Choice Hotels is one of the largest lodging franchisors in the world, with nearly 7,500 hotels in 43 countries and 22 different brands. That includes brands you might have heard of, such as Comfort, Quality, Cambria, WoodSpring, and, in the Americas region, the Radisson brands.

Choice Hotels has a history of innovation: we developed the first internet-based, massively distributed property management system, and more recently we built the first new reservation system developed by a hotel company in the last 30 years, which incidentally happens to run on AWS.

So the problem we're seeking to solve - the hotel space has many different ways customers can book stays. They can walk into a property directly, they can call up the front desk, they can call a global reservation center. They can also use the ChoiceHotels.com website or mobile apps. They could book through a travel agent. They could use an online travel agency such as Expedia or Booking.com.

Each of these channels generates customer data slightly differently. And while there are really good specifications for things like rates, inventory, and availability, that's much less true when it comes to customer profiles. Uniquely identifying customers who stay only a few times a year is also challenging, because their customer data can change between our interactions with them.

But we see massive advantage in uniquely identifying our customers. We can use this for personalization, such as sort-order optimizations to make sure we're putting the hotels a customer is most likely to book at the top. Or we could change the images on the website to show the type of trips they like to book, be it family vacations or couples getaways. Or we could even show them their most recent stays to make it easy to rebook.

But I'd like to give a more concrete example. Here we have my example customer, Alexis Customer. She walks into one of our Choice Hotels properties and books on the spot. She hands her ID to the front desk agent, and the agent enters her information exactly as it appears on her ID. So the profile gets created with her email address and her full name, Alexis.

Later, she has a great time at her stay and decides to come back and stay with us again. This time she books it herself using the ChoiceHotels.com website. Well, this time she uses the nickname she goes by in everyday life, Lexi, which doesn't exactly match the Alexis that was entered the first time.

Well, you might say, hey, she gave the same email, why wouldn't you just match on that? It turns out email is not as unique as you would like it to be. There are plenty of cases where multiple people share an email: a husband and wife sharing an email, a role-based work email tied to a position, or simply a work email that multiple people use.

Because of that, the fear of accidentally matching customers and giving one access to data they shouldn't have leads us to be very conservative in our matching. And once you have more than one result, the problem gets even worse, because the likelihood you'll get a single result back in the future gets even lower.

The third example here shows that even if she provided the same exact name, Alexis or Lexi, on a future booking, some online travel agencies, or OTAs, will end up providing us with a temporary email instead of the one she provided. So now we're back to having just a first name and last name again, which simply isn't enough to make a match.

So enter AWS's new solution, Unified Profiles for Travelers and Guests on AWS. It has documented, specialized schemas that are specific to the travel space. You can search, retrieve, delete, and merge customer profiles using the Amazon APIs. It's tightly integrated with other Amazon services, as you would expect. And it provides rule-based matching for identity resolution when you have exact matches, as well as AI-based matching for the fuzzier scenarios where there isn't an exact match.

It's still a very young AWS service and has the potential for many new features in the future. AWS has dedicated a service team to the travel segment that's working on things like privacy, ID resolution, and enrichment.

But I want to talk today about how we used it in our POC this past fall on our actual customer production data. We loaded over 100 million customer profiles from our data analytics platform which runs on Redshift, as well as booking data from our distribution engine, and clickstream data from our website and mobile apps.

For the clickstream data, we loaded customer searches and profile views, as well as logins and bookings, to help tie the rest of the clickstream data back to actual results.

So the first thing we had to do was get the data out of Redshift, and Redshift's best way to get a large amount of data out is an UNLOAD statement. The problem is that UNLOAD statements operate on one row per record, and customer data is not that flat. It's very hierarchical and nested.

So we had to come up with a way to create this nested object from a flat record, while still doing it at scale. We could have unloaded these pieces separately and stitched them together later in code. But we decided to be a little bit scrappy and instead did a combination of string concatenations and LISTAGG calls to create interleaved separators that we could break apart later downstream.
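Here's a minimal sketch of that scrappy approach, with hypothetical table and column names: LISTAGG collapses the child rows into one delimited string per profile, and interleaved separators ('|' between records, '^' between fields) let a downstream step split it back apart.

```python
# Sketch only: table, column, bucket, and role names are illustrative.
import boto3

redshift = boto3.client("redshift-data")

# Dollar-quoting ($$...$$) avoids escaping quotes inside UNLOAD.
sql = """
UNLOAD ($$
  SELECT p.profile_id,
         p.first_name,
         p.last_name,
         LISTAGG(a.line1 || '^' || a.city || '^' || a.postal_code, '|')
           WITHIN GROUP (ORDER BY a.address_id) AS addresses
  FROM customer_profile p
  LEFT JOIN customer_address a USING (profile_id)
  GROUP BY 1, 2, 3
$$)
TO 's3://example-bucket/upt-extract/profiles_'
IAM_ROLE 'arn:aws:iam::111122223333:role/ExampleUnloadRole'
GZIP PARALLEL ON;
"""

redshift.execute_statement(
    WorkgroupName="example-workgroup",  # or ClusterIdentifier + DbUser
    Database="analytics",
    Sql=sql,
)
```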

It actually worked really well, because the UPT format for profiles was clean enough that it could be done that way. But other message types, for instance the reservation message, were so complicated that there's no way that approach would have worked, and we would have had to fall back to multiple extracts and stitching the data together later.

Thankfully, the reservation data as well as the clickstream data were already all nicely nested in JSON format on Kafka for us. So we were good to go.

We then wrote a Lambda that read the data either from S3 or via a Kafka trigger for the data coming in across Kafka. That Lambda read its mapping configuration from Parameter Store. We used JSONPath expressions to do the mapping for the simple scenarios, but then we hybridized the approach and also had the ability to do more complicated transformations in Python code.
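Here's a minimal sketch of that hybrid mapping Lambda; the Parameter Store path, mapping format, and transform names are assumptions for illustration.

```python
# Sketch only: parameter path, mapping format, and event shape are assumed.
import json
import boto3
from jsonpath_ng import parse  # third-party: pip install jsonpath-ng

ssm = boto3.client("ssm")

# Python hooks for fields too complicated for a JSONPath alone.
TRANSFORMS = {
    "upper": lambda v: v.upper(),
}

def load_mapping(object_type: str) -> dict:
    # Mapping config lives in Parameter Store, e.g. /upt/mappings/reservation
    value = ssm.get_parameter(Name=f"/upt/mappings/{object_type}")
    return json.loads(value["Parameter"]["Value"])

def map_record(record: dict, mapping: dict) -> dict:
    out = {}
    for target_field, spec in mapping.items():
        matches = parse(spec["path"]).find(record)
        value = matches[0].value if matches else None
        if value is not None and "transform" in spec:
            value = TRANSFORMS[spec["transform"]](value)
        out[target_field] = value
    return out

def handler(event, context):
    # Assumes records arrive as JSON strings in event["records"], whether
    # the trigger was an S3-delivered batch or a Kafka event source.
    mapping = load_mapping("reservation")
    return [map_record(json.loads(r), mapping) for r in event.get("records", [])]
```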

An example of this is that we had durations in our reservation messages that looked like PT144H, that is, 144 hours. And while Java doesn't care if you represent your durations that way, Python follows the ISO standard a little more strictly, and therefore required us to reduce that down to something like six days, P6D, instead of 144 hours.
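The fix itself is only a few lines of Python; here's a minimal sketch of that reduction.

```python
# Collapse an hours-only ISO 8601 duration like "PT144H" into days
# ("P6D") so stricter parsers downstream accept it.
import re

def normalize_duration(raw: str) -> str:
    match = re.fullmatch(r"PT(\d+)H", raw)
    if not match:
        return raw  # already in a shape downstream parsing accepts
    hours = int(match.group(1))
    days, remainder = divmod(hours, 24)
    if remainder == 0:
        return f"P{days}D"
    return f"P{days}DT{remainder}H"

assert normalize_duration("PT144H") == "P6D"
assert normalize_duration("PT30H") == "P1DT6H"
```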

Once that was done, we were able to write the result onto Amazon Kinesis, which we used for our ingestion. The solution, which is built on Amazon Connect, was able to load at about 800 transactions per second, so we were able to use the Kinesis-based ingestion exclusively. It does support S3-based file ingestion, but we decided it was easier to have a single path to load the data in. And since it was our history load, we could afford to wait a little longer for it to finish. We provisioned throughput on the Kinesis streams to make sure it got loaded in a timely fashion.
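The producing side of that single Kinesis path is straightforward; here's a minimal sketch, where the stream name and partition key choice are assumptions.

```python
# Sketch only: stream name and partition key are illustrative.
import json
import boto3

kinesis = boto3.client("kinesis")

def send_batch(records: list[dict], stream_name: str = "upt-ingest") -> None:
    kinesis.put_records(
        StreamName=stream_name,
        Records=[
            {
                "Data": json.dumps(r).encode("utf-8"),
                # Partition by profile id so one profile's updates stay ordered
                "PartitionKey": str(r.get("profile_id", "unknown")),
            }
            for r in records[:500]  # PutRecords accepts at most 500 per call
        ],
    )
```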

Okay, so that's what it took to get the data into the solution. Now, let's talk about the matching. So we ran with four matching rules:

  1. First name, last name, email
  2. Phone
  3. Address
  4. And then the fourth one is a little bit more interesting.

One thing we noticed in our data was that we had a number of records from multi-room reservations where the first room would have very rich customer data, but the other rooms on that same reservation might only have a first name and last name. And if those profiles made it into the solution, they were never going to match with anyone, because it's just a name.

So what we did is add this fourth rule on the last booking ID, which allowed us to match these customers coming in on a multi-room reservation with just a first name and last name.
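As a sketch of how rules like these could be wired up, assuming the rule-based matching API of Amazon Connect Customer Profiles underneath the solution; the domain name and the custom LastBookingId attribute key are illustrative, not our production configuration.

```python
# Sketch only: domain name and attribute keys (especially the custom
# "Attributes.LastBookingId") are assumptions for illustration.
import boto3

profiles = boto3.client("customer-profiles")

profiles.update_domain(
    DomainName="example-guest-domain",
    RuleBasedMatching={
        "Enabled": True,
        "MatchingRules": [
            {"Rule": ["FirstName", "LastName", "EmailAddress"]},
            {"Rule": ["PhoneNumber"]},
            {"Rule": ["Address"]},
            # Rule 4: catches the sparse extra-room profiles described above
            {"Rule": ["FirstName", "LastName", "Attributes.LastBookingId"]},
        ],
        # Let matches from any of the four rules merge automatically
        "MaxAllowedRuleLevelForMerging": 4,
        "MaxAllowedRuleLevelForMatching": 4,
    },
)
```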

One other thing we did with this solution is enable automatic merging for rule-based matches. This meant that as the data came in, the solution would automatically merge profiles if they matched any rule and emit an event on the EventBridge bus.

A rule on that EventBridge bus then executes a Lambda when it fires, and we were able to load this data back into Redshift for further analysis with Tableau or SageMaker, or with our marketing partners.

I will note, however, that the events emitted on a merge contain just the IDs of the profiles, not the full rich profile. The reason for this is that you can actually have profiles that are too rich to fit on a messaging system. Once you accumulate a bunch of clickstream data or even reservations, your profile grows beyond the message size limits.

So those matches are written to S3, and the merged profiles, once complete, are available on S3 if you want them. But we just used the profiles as they already existed back in Redshift.
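As an illustration of handling those ID-only events, here's a minimal sketch that re-fetches the full profile via the Customer Profiles SearchProfiles API; the EventBridge detail field name and the bucket are assumptions.

```python
# Sketch only: the "profileIds" detail field and bucket name are assumed.
import json
import boto3

profiles = boto3.client("customer-profiles")
s3 = boto3.client("s3")

def handler(event, context):
    detail = event["detail"]
    for profile_id in detail.get("profileIds", []):  # assumed field name
        found = profiles.search_profiles(
            DomainName="example-guest-domain",
            KeyName="_profileId",  # predefined search key
            Values=[profile_id],
        )
        for profile in found.get("Items", []):
            # Stage to S3; a later COPY job loads it into Redshift
            s3.put_object(
                Bucket="example-merge-landing",
                Key=f"merges/{profile_id}.json",
                Body=json.dumps(profile, default=str).encode("utf-8"),
            )
```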

We also ran the AI matching, which is configurable to run up to three times a week at a particular time of day. But we have let Amazon know that we want to run it even more often. Claire's laughing at me now.

I'll talk about the results of the AI-based matching in a minute, but we saw that we couldn't just rely on the AI-based matching alone. We wanted to additionally do manual review.

And so there is a manual review UI that you can use to look at these different matches and either accept or reject them. If you accept one of those matches, it emits an event, just like a rule-based match, back on the EventBridge bus, and you can handle it exactly the same way.

So that's enough about the matching; I'll talk more about our specific results with it in a second. But I did want to talk a little bit about the other use case for the solution, which is as a real-time data store for something like our ChoiceHotels.com website.

You can directly search the profiles once they're in there by first name, last name, email, phone, or, most importantly, customer ID. We saw constant access times regardless of query volume: about 63 milliseconds on average and about 97 milliseconds at the p90.
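A lookup like that is a single API call. Here's a minimal sketch assuming the Customer Profiles SearchProfiles API underneath, where "_email" is one of its predefined search keys; the domain name and email are illustrative.

```python
# Sketch only: domain name and email value are illustrative.
import boto3

profiles = boto3.client("customer-profiles")

resp = profiles.search_profiles(
    DomainName="example-guest-domain",
    KeyName="_email",  # other predefined keys include _phone and _fullName
    Values=["alexis@example.com"],
    MaxResults=5,
)
for p in resp.get("Items", []):
    print(p["ProfileId"], p.get("FirstName"), p.get("LastName"))
```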

We'd love to see this get even lower, because the more time you have during rendering, the more you can do with the data. But the solution is able to handle real-time queries even at scale.

Alright, let's dive further into the results of our proof of concept. We loaded 108 million profiles into it. Once Amazon split those into profile objects, that resulted in 435 million objects: addresses, phones, loyalty information, and so on. So 435 million objects, about a 4-to-1 ratio to our number of profiles.

It ran the four matching rules I described earlier, which produced 18 million matches from rule-based matching alone. After merging, that left us with about 90 million profiles.

We then ran the AI-based matching on that, and it found 7 million additional matches. When we dug into those 7 million, 2 million of them had above a 90% confidence score.

Some interesting things we saw when we manually looked at these matches:

First of all, we had a number of customers with first name/last name swaps, where we didn't know which was actually the correct order. So one of our takeaways was that we're going to try to corroborate with some of our third-party partners, like Neustar, to figure out the correct order.

We also saw the solution was very successful at catching nicknames, like William to Bill or Alexis to Lexi. And sometimes it was even a little too effective.

We had cases where we saw relatively high-confidence matches even when it was a husband and wife in the same household. When you think about it, they have the same address, the same phone, the same last name, the same email address. And if even the first letter of the first name was the same, the solution might sometimes say, hey, yes, this could be a match.

That's one of the reasons that led us to say that we really believe that there should be manual review of these AI matches.

Another interesting area that we learned about was anonymous values. You wouldn't believe the number of different ways anonymous values are represented for addresses. People came up with things like "123 Any Street" or "123 Main Street" or "no address". If they're engineers, they just put the word "no" in. I haven't seen an emoji yet, but I'm pretty sure that's probably coming at some point.

The deal here is that customers are trying to come up with a random value because they don't want to give you the real data. And what happens is that people randomly pick the same values, and when two people with the same name happen to pick the same value, now you're making a match on it. The way to avoid this is to exclude those addresses from the solution by nulling out the values. So this is something we're going to look into more in the future.
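A minimal sketch of what that nulling-out could look like on our side, with a denylist seeded from the placeholder values we actually saw; the field names are illustrative.

```python
# Sketch only: field names are illustrative; the denylist comes from
# placeholder values observed in the data.
PLACEHOLDER_ADDRESSES = {
    "123 any street",
    "123 main street",
    "no address",
    "no",
}

def scrub_address(record: dict) -> dict:
    line1 = (record.get("address_line1") or "").strip().lower()
    if line1 in PLACEHOLDER_ADDRESSES:
        # Drop the whole address so it can never drive a false match
        for field in ("address_line1", "city", "state", "postal_code"):
            record[field] = None
    return record
```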

We also saw cases where travel agencies were sending us their own address as the customer's address, which effectively, again, turns it into an anonymous value.

It goes without saying, but data cleanliness is very, very, very important and the AI matching was really helpful in helping us understand our data and find where these matches might occur.

As I alluded to earlier, we still believe that manual review of the AI matching is absolutely necessary and we are very happy with the UI that Amazon is developing and look forward to it developing even further to help us with those manual reviews.

I know it's going to be a big cost up front to go through and manually look at 2 million matches. But once it's done, then the ongoing cost should be manageable.

Regarding cost: as you can imagine, travel and hospitality companies like Choice have hundreds of millions of profiles, and developing a solution that can store, manage, merge, and support real-time queries over that many profiles is a challenging proposition.

So we've been working closely with Amazon to make sure the solution has a cost structure that aligns with that: one we can manage and control while still getting value from the solution.

One last nugget we saw as we went through: because we loaded the clickstream data in, we have a bunch of profiles that are only clickstream data. These are people who did searches on the website but never converted, and we're going to need to develop a way to manage that and delete those profiles sooner so we don't end up paying for them long term. That needs to be managed separately from our longer-term profiles.
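A minimal sketch of that cleanup, assuming we can identify the clickstream-only profile IDs upstream; the domain name is illustrative.

```python
# Sketch only: how the clickstream-only IDs get identified is assumed
# to happen upstream (e.g. in Redshift); this just expires them.
import boto3

profiles = boto3.client("customer-profiles")

def expire_clickstream_only(profile_ids: list[str],
                            domain: str = "example-guest-domain") -> None:
    for pid in profile_ids:
        profiles.delete_profile(DomainName=domain, ProfileId=pid)
```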

I want to let you guys know they do have a demo in the Travel and Hospitality area, in Area 7 back behind me. I encourage you to go check it out.

Overall, we're very excited about this solution. We learned a ton about our customer data and we think this solution has great value for others in the travel and hospitality industry. So with that, thank you for your time and have a wonderful evening.
