HBO Max achieves scale and performance with Amazon CloudFront

Hello, everyone and thank you for joining us. This is session NET 312 - HBO Max Achieves Scale and Performance with Amazon CloudFront. My name is Tal Shalom. I'm a principal product manager with the Amazon CloudFront service team.

Today, I'm excited to be joined by two guests from the HBO Max team - Jay and Vikrant. I'll let them introduce themselves.

Jay: Hi, I'm Jay Boisseranc. I'm the director of engineering for video backend systems at Warner Brothers Discovery and I'm really excited to be sharing our journey with CloudFront with you today.

Vikrant: Hello everyone. My name is Vikrant Kelkar and I am a staff software engineer at Warner Brothers Discovery. I am responsible for CDN strategy, architecture, implementation, and operations, and I'm very excited to be here today.

Tal: Thank you. We have a packed agenda for you in this session. In the first part, I will focus on how you can optimize your media delivery for improved quality of experience for your viewers. I will talk about some best practices for configuring CloudFront and your overall delivery, how you can test those configurations before you apply them to production, and I will share with you some new options for support from CloudFront.

In the second part, Jay and Vikrant will take the stage and share HBO Max's journey with CloudFront. They will talk about some of the telemetry measurements they take to improve their quality of experience. And finally, they will share some of the features and capabilities they're exploring with CloudFront going into 2023.

To start, let's take a moment and look back at the 60-year evolution of how we consume our media, from the reel-to-reel playback tape of the sixties to today's mobile devices, and how user experience and expectations have evolved over time.

So in the late seventies and eighties, we had the VCR that brought movies of our choice to the living room. And then in the nineties, the DVD enhanced the experience with widescreen movies, 8 channels of digital audio, camera angles, multiple languages, and more.

And then in the late nineties and 2000s, digital media became popular as downloadable content to our PCs and laptops. And today we can walk with our mobile phone and watch almost any favorite content of ours from anywhere, thanks to high speed networks and cloud enabled solutions that allow new experiences for media delivery.

Now, to achieve that user experience - the quality of experience for our viewers - there are multiple considerations to take into account. A few of them are fundamental: the performance, scale, and availability of the network that connects the viewers to the content.

AWS continues investing in expanding the Amazon CloudFront network. Today we have 450 points of presence globally with 13 regional edge caches. Those points of presence are strategically placed closer to your viewers, which helps reduce delivery latency and enhance the quality of experience.

Beyond the network that we build and operate, we also deploy embedded PoPs within ISP networks. In 2019, we deployed our first generation of embedded PoPs in ISP networks - a full rack deployed in different ISPs. This year we launched our second generation of embedded PoPs. These are single-server units that can be deployed deep in the ISP network, getting much closer to viewers and helping with peak sports events, game downloads, and popular VOD releases.

So far this year, we have deployed over 220 of those embedded PoPs. They help us manage traffic, help the ISP offload some of the traffic, and mitigate issues at the last mile for the viewer.

So we talked about the network performance, scale, availability. And now let's talk about configuration - what you can do when you set up your workload, your workflow end to end in order to improve the quality of experience for your viewers.

Let's start on the left side with the origins. CloudFront supports origins whether they are on AWS or not: you can use AWS Elemental Media Services as your origins, or if you process your video on Amazon EC2 you can set that up as your origin. If your video assets are already packaged for delivery and stored in S3, you can use S3 as your origin. And if you still run your video processing or keep your assets on-prem or at a third-party location, you can use that as an origin as well.

A single origin might have issues, so to maintain high availability and not to impact quality of experience, you want to have another origin as a failover. CloudFront lets you set up an origin failover group with primary and secondary origins and you set criteria for the failover between those origins.

Another option is to set up CloudFront Origin Shield. Origin Shield is an additional caching layer behind the PoPs and the mid-tier regional edge caches. It protects your origin from the overwhelming request rates that would otherwise arrive from all those PoPs around the world at once.

When you have Origin Shield, you can also connect third-party CDNs in a multi-CDN architecture. You can connect them to CloudFront and have Origin Shield as the single entry point making the fetches from your origin, thereby reducing the request rate to your origin - all the requests from all the regions and the regional edge caches go through Origin Shield.

Now if you have two origins, you can associate a separate Origin Shield with each of them, each in a different region. The reason is that you want to maintain the lowest latency possible between an origin and its Origin Shield, because all the requests funnel through it. So if there is a failover, you still maintain low latency from the region where the failover origin is set up.
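To make this concrete, here is a minimal sketch of the relevant pieces of a CloudFront distribution configuration, written as the Python dictionaries you would pass to boto3's create_distribution. The bucket names, origin IDs, and regions are placeholders, not anyone's actual setup.

```python
# Sketch: origins with per-origin Origin Shield, plus an origin group for failover.
origins = {
    "Quantity": 2,
    "Items": [
        {
            "Id": "primary-video-origin",
            "DomainName": "video-us-east-1.s3.amazonaws.com",  # hypothetical bucket
            "S3OriginConfig": {"OriginAccessIdentity": ""},
            # Origin Shield in the same region as the origin keeps latency low.
            "OriginShield": {"Enabled": True, "OriginShieldRegion": "us-east-1"},
        },
        {
            "Id": "failover-video-origin",
            "DomainName": "video-eu-west-1.s3.amazonaws.com",   # hypothetical bucket
            "S3OriginConfig": {"OriginAccessIdentity": ""},
            "OriginShield": {"Enabled": True, "OriginShieldRegion": "eu-west-1"},
        },
    ],
}

origin_groups = {
    "Quantity": 1,
    "Items": [
        {
            "Id": "video-origin-group",
            # Fail over to the secondary origin on these status codes.
            "FailoverCriteria": {
                "StatusCodes": {"Quantity": 3, "Items": [500, 502, 504]}
            },
            "Members": {
                "Quantity": 2,
                "Items": [
                    {"OriginId": "primary-video-origin"},
                    {"OriginId": "failover-video-origin"},
                ],
            },
        }
    ],
}
```

Cache behaviors then point at the origin group's Id rather than at a single origin, so the failover itself is handled by CloudFront.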

Now, the failover criteria I mentioned before - how you switch between origins - has multiple configuration parameters. You have connection attempts: if the first connection doesn't respond well, you can make another one. You have a connection timeout, and a response timeout: how long to wait for the origin to answer a request you send it. There is also a keep-alive timeout. These settings are very important, especially when you decide what your workload is.

So if you are serving VOD or live, the configuration might be completely different. And if you use adaptive bitrate - meaning you serve your content in multiple renditions at different quality levels - that also has an impact, because now the player makes multiple requests for different resolutions and qualities.

And the single most impactful parameter for these configurations is your video segment length. The segment is one part of the video delivery: we have a manifest and we have segments. The manifest points to your segments, and the segments are the parts of the video that are packaged for delivery.

The video segment length determines how you want to set up your connection timeout and your response timeout, so let's put it into context.

Let's take a use case: low-latency live video with a 4-second segment, okay? Typically most players keep at least two segments in their buffer. If you control the player, you can get telemetry and know exactly what's going on, but you don't always have that control. Either way, the idea is to keep this buffer full at all times so you don't interrupt the experience.

Now what happens if you set the response timeout to 4 seconds? If there is any issue at the origin and it does not respond within 4 seconds, you have lost a full segment's worth of time in the buffer of all your viewers, and that can cause interruptions.

So you want to mitigate it fast, and to do that you want to reduce the response timeout to at most half your segment length, or even less. Let's say we set it at 2 seconds: if the origin does not respond in 2 seconds, I either retry or fail over to the other origin. That gives me enough time to make the failover request for the segment, send it back, and fill up the buffer without interrupting the viewer.

And the same applies to connection attempts: if I configure two attempts, the first try waits two seconds and then I make a second try. If that works, great; but if not, it's another two seconds, which means four seconds in total, which means I've again lost a segment's worth of time, right?

So this is a consideration you need to take into account when you create your workflow in order to maintain that quality of experience with fewer interruptions on the viewer side.
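As a rough sketch of that reasoning, here is what the timeout-related fields of a custom origin could look like when derived from the segment length. The packager hostname is a placeholder, and the exact numbers depend on your workload.

```python
# Sketch: derive origin timeouts from the segment length (4 s in the example above).
SEGMENT_LENGTH_SECONDS = 4

origin = {
    "Id": "live-packager",
    "DomainName": "packager.example.com",   # placeholder origin hostname
    # Two attempts at 2 s each still fit inside one segment duration.
    "ConnectionAttempts": 2,
    "ConnectionTimeout": 2,
    "CustomOriginConfig": {
        "HTTPPort": 80,
        "HTTPSPort": 443,
        "OriginProtocolPolicy": "https-only",
        # Response (read) timeout at half the segment length, so a slow origin
        # can be retried or failed over before the player's buffer runs dry.
        "OriginReadTimeout": SEGMENT_LENGTH_SECONDS // 2,
        "OriginKeepaliveTimeout": 5,
    },
}
```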

Another parameter that goes with all these settings is the manifest cache policy. In live video, the manifest gets updated with the new segments as the packager produces them, so the new manifest lists the new segments as they become available.

So you might think: if I'm generating those segments on the fly every 4 seconds, why would I need to cache the manifest at all? Four seconds is a lot of time. But what if you have a massive number of viewers and a lot of players making requests at the same time? Those requests, arriving from many different PoPs, could overwhelm the origin.

To reduce that load and serve faster, you want to cache the manifest, and that caching setting applies through all CloudFront layers. So if you set it to, say, a 4-second cache, then for that segment length all the requests for the refreshed manifest are served within that timeframe.

But that might be stretching the window too far, so it's recommended to reduce it to 2 seconds, or even 1 second - that's still okay, because you will absorb most of the requests at the edge location when the players make them.

Segment caching can be different, and it's recommended to split your manifest cache expiry from your segment cache expiry. You can cache segments for, say, the duration of a segment or twice that. But if your workload allows the player to skim back and replay, say, the last half hour, you might want to cache segments longer so viewers can seek back. Again, it's up to your workload how you decide to serve your content.

So split these into two different cache behaviors and allow different cache expiry parameters for each.
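One way to express that split with boto3, keeping the numbers from the 4-second-segment example, is two cache policies attached to separate cache behaviors. The policy names and TTL values below are placeholders.

```python
import boto3

cf = boto3.client("cloudfront")

def ttl_policy(name, min_ttl, default_ttl, max_ttl):
    """Minimal cache policy config where only the TTLs matter."""
    return {
        "Name": name,
        "MinTTL": min_ttl,
        "DefaultTTL": default_ttl,
        "MaxTTL": max_ttl,
        "ParametersInCacheKeyAndForwardedToOrigin": {
            "EnableAcceptEncodingGzip": False,
            "EnableAcceptEncodingBrotli": False,
            "HeadersConfig": {"HeaderBehavior": "none"},
            "CookiesConfig": {"CookieBehavior": "none"},
            "QueryStringsConfig": {"QueryStringBehavior": "none"},
        },
    }

# Manifests: cache for only 1-2 seconds so players always see a fresh segment list.
cf.create_cache_policy(CachePolicyConfig=ttl_policy("live-manifests", 1, 2, 2))

# Segments: cache for at least a segment or two, longer if you allow seeking back.
cf.create_cache_policy(CachePolicyConfig=ttl_policy("live-segments", 4, 8, 1800))
```

Each policy would then be referenced from its own cache behavior, for example a path pattern like *.m3u8 or *.mpd for manifests and *.ts or *.m4s for segments.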

The last point I want to mention here is negative caching - negative caching is caching of an error response. You might ask: why would I need to cache my errors? If there's an error, I'm going to request again because I want a response.

But again, when you're at scale with millions of viewers, those retries can overwhelm the origin. What you want to do is keep a short cache expiry on those negative (error) responses - just long enough to give the origin time to recover if it needs a moment to produce that segment or refresh the manifest.

Even one second of negative caching on an error helps a lot: players don't overwhelm the origin, you avoid overloading the network, and you still leave yourself time to make the request again and get the segment.
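In a CloudFront distribution configuration this lives in the CustomErrorResponses section; a sketch with a 1-second error-caching TTL (the status codes are chosen only for illustration) looks like this:

```python
# Sketch: cache origin errors for just 1 second so retries don't pile up on the
# origin, while keeping the recovery window short.
custom_error_responses = {
    "Quantity": 2,
    "Items": [
        {"ErrorCode": 404, "ErrorCachingMinTTL": 1},
        {"ErrorCode": 503, "ErrorCachingMinTTL": 1},
    ],
}
```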

So again, this is especially important when you go low latency. We spoke about configuration; another thing you can consider for low latency and improving your delivery is the use of HTTP/3. HTTP/3 is the third major version of the HTTP standard. It's built on QUIC, which runs over UDP, and it brings a few performance benefits. I'm not going to dive too deep into it - we had a session with Jim Roskind, the known inventor of the QUIC protocol, and you can catch it on replay; I think his session was yesterday. But the main advantage this protocol gives you is faster connection setup. A faster connection setup means a faster first byte to the viewer, which means lower latency in startup time, right?

It also has an improved mechanism to recover from packet loss wherever on the network it occurs, which helps reduce rebuffering errors on the player side. Another enhancement is a mechanism to support roaming between different networks, especially between mobile cellular and Wi-Fi. And it's encrypted by default with TLS 1.3, so it's also secure. Vikrant is going to present some metrics they explored for their delivery, comparing HTTP/3 with QUIC against delivery without it. If you want to set up HTTP/3, it's very simple: through the CloudFront console you can just check the box to enable HTTP/3, or you can do it through the CLI or API calls.
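Through the API, the same toggle is a single field on the distribution configuration. A boto3 sketch (the distribution ID is a placeholder) might look like this:

```python
import boto3

cf = boto3.client("cloudfront")
dist_id = "EDFDVBD6EXAMPLE"  # placeholder distribution ID

resp = cf.get_distribution_config(Id=dist_id)
config, etag = resp["DistributionConfig"], resp["ETag"]

# Serve HTTP/3 alongside HTTP/1.1 and HTTP/2; clients that support QUIC will use it.
config["HttpVersion"] = "http2and3"

cf.update_distribution(Id=dist_id, DistributionConfig=config, IfMatch=etag)
```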

Another piece of functionality I want to talk about is edge compute. If you're thinking about shifting functionality from your origin to the edge, CloudFront provides two options. One is Lambda@Edge, which is typically used for more complex, longer-running functionality. For example, you can fetch a manifest and do some manipulation on it - if you have some type of advertisement you want to embed into the manifest, you can do that and then serve it, or you can do personalization for a specific user based on the user ID. Lambda@Edge can make network calls, so you can call other services - for example, AWS Elemental MediaConvert to do transcoding on the fly if you need it. You can also collect metrics and send them to an analytics system: if you have custom headers or a custom query string, you can pick that up and forward the information to another system that collects your metrics for the incoming requests.
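As a hedged example of what such a function might look like, here is a hypothetical Lambda@Edge origin-request handler in Python that rewrites a manifest request based on a device-capability header. The header name and URI convention are made up for illustration and are not anyone's actual scheme.

```python
# Hypothetical Lambda@Edge origin-request handler (Python runtime).
# It reads a device-capability header and rewrites the manifest URI so the
# origin serves a pre-filtered rendition set.
def handler(event, context):
    request = event["Records"][0]["cf"]["request"]
    headers = request["headers"]

    # "x-device-profile" is a hypothetical header; use whatever your clients send.
    profile = headers.get("x-device-profile", [{"value": "sdr"}])[0]["value"]

    if request["uri"].endswith("/master.m3u8") and profile == "sdr":
        # Point SDR-only devices at a manifest without HDR / Dolby Vision variants.
        request["uri"] = request["uri"].replace("/master.m3u8", "/master-sdr.m3u8")

    return request
```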

For more latency-sensitive, lightweight use cases, we also have CloudFront Functions. These are very fast functions that run at the edge - execution time is no more than one millisecond - and they're meant for really lightweight functionality. For example, if you need to rewrite a URL, or validate a token and then decide whether to pass the request through or send a redirect to a login system, you can do that with a CloudFront Function. You can also do header manipulation or redirects for any other purpose, right there at the edge, shifting that functionality away from your origin.
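CloudFront Functions themselves are written in JavaScript, but creating one can be scripted the same way as the rest of the configuration. Here is a sketch using boto3 with a tiny illustrative URL-rewrite function; the function name and logic are placeholders.

```python
import boto3

cf = boto3.client("cloudfront")

# Tiny CloudFront Function (JavaScript) that appends a default manifest name
# to bare directory requests. Purely illustrative.
function_code = b"""
function handler(event) {
    var request = event.request;
    if (request.uri.endsWith('/')) {
        request.uri += 'master.m3u8';
    }
    return request;
}
"""

cf.create_function(
    Name="rewrite-manifest-uri",
    FunctionConfig={"Comment": "URI rewrite at the edge", "Runtime": "cloudfront-js-1.0"},
    FunctionCode=function_code,
)
# After testing, publish the function and associate it with a cache behavior's
# FunctionAssociations (EventType "viewer-request").
```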

So we spoke about all those features and how you can use them to improve quality of experience. But you have traffic running in production, and you want to test changes before actually making them. For that, last week we launched CloudFront continuous deployment. This is a new feature of CloudFront that lets you create a staging environment with a different configuration. The nice part is that you can select which traffic you want to send to that staging environment: you can route requests carrying a specific header - giving you a kind of A/B environment - or send a certain percentage of traffic. That lets you test your new configuration, check how it impacts the quality of experience for your viewers, and then, if everything works as intended, apply it to your production environment. And the good thing is you don't need to change your DNS or create a completely new environment - you use the same DNS as your production traffic.
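A sketch of the API side of this, where the staging distribution DNS name and the traffic weight are placeholders, could look like the following:

```python
import boto3

cf = boto3.client("cloudfront")

# Route a small slice of production traffic to a staging distribution that
# carries the configuration change under test.
cf.create_continuous_deployment_policy(
    ContinuousDeploymentPolicyConfig={
        "StagingDistributionDnsNames": {
            "Quantity": 1,
            "Items": ["d111111abcdef8.cloudfront.net"],  # placeholder staging distribution
        },
        "Enabled": True,
        "TrafficConfig": {
            "Type": "SingleWeight",
            "SingleWeightConfig": {"Weight": 0.05},  # send ~5% of traffic to staging
        },
    },
)
```

Header-based routing ("Type": "SingleHeader") is the alternative when you want to steer specific test clients rather than a percentage of all traffic.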

So this is the latest release from CloudFront. And last, I want to mention the new support option for live events and premiere releases. In the AWS Elemental team we have a media event management team that supports events from A to Z, helping you set up the encoders and all the work through delivery. Recently we added a specialized team that supports CloudFront, so now you have full support from the ingest of the video all the way to delivery. The team will work with you prior to your event: they will look at your configuration and architecture and recommend best practices. During the event they will be watching dashboards; if you have a war room, they can be in that war room and work with you. They have access to internal CloudFront configuration and immediate access to our engineering team. And after the event, they will work with you on a retrospective report: what happened, and whether anything needs to change for a future event. There is a link at the bottom if you want to connect with that team - please go to that link and reach out to us. And now I'm going to hand the stage over to Jay and Vikrant to tell you about HBO Max's journey with CloudFront.

Jay: Thank you, Tal. Vikrant and I are now going to walk you through the HBO Max journey with CloudFront, including how we use it as part of our multi-CDN strategy and how we use it to manage and prepare for large-scale events like House of the Dragon.

So I want to start by introducing you to HBO Max. HBO Max builds on the legacy of HBO. HBO just celebrated its 50th anniversary - we're really proud of that - and during those 50 years HBO has become synonymous with quality premium content: shows like The Sopranos and Game of Thrones. We wanted the user experience to match that content, so the purpose of HBO Max was to introduce a premium user experience to match that premium content. If you don't remember anything else from this slide, remember: premium user experience is what we're all about. HBO stands for Home Box Office - it's about bringing the box office experience into your home - and that's in our DNA and in how we think about solving every problem: how do we make sure users have a premium experience? There are a few other numbers on here. We have about 95 million subscribers; that's HBO Max and Discovery Plus combined - we're now part of Warner Brothers Discovery and we report our numbers as a combined subscriber count. We have both ad-supported and ad-free tiers, we're in 61 countries across the world, we have both VOD and live streaming support, and more than 13,000 hours of programming.

Now I'd like to walk you through a few key milestones in the history of HBO Max. We'll start with launch, and you'll notice the headline here is "direct to consumer takes center stage." What that meant for us: historically we had other streaming options for HBO - HBO Go and HBO Now - but they were really just supplemental to other methods of consuming HBO content. When AT&T acquired Time Warner, and HBO as part of that, they shifted our strategy so that we would match that premium content, as I mentioned, with a premium user experience. The focus then was on how to build this new premium user experience from the ground up to take advantage of not just the HBO content but also the deep Warner Brothers and Turner catalogs.

So as you see, the launch was in May 2020. We had been working for months getting ready for it - all our war rooms planned, meetings planned. Of course, you all know that in March of 2020 we all got sent home. The date didn't change; we were still going to launch on the same date. So we had to quickly pivot all of our strategies from launching in war rooms to launching remotely, looking at each other through cameras - we're all used to that now, but at the time it was completely new and a very different experience. And we managed to pull it off: we launched on the day we wanted to, and it was one of the smoothest launches I've been part of, so we're very proud of that.

The next milestone I want to talk about is Popcorn. Popcorn was supporting the day-and-date releases of the entire Warner Brothers 2021 movie slate on HBO Max: the same day a movie launched in theaters, it launched on HBO Max. It kicked off with Wonder Woman 1984 on Christmas Day of 2020. In order to support this new initiative, we introduced a whole bunch of new capabilities and immersive experiences in the product: HDR, Dolby Vision, Dolby Atmos.

The whole idea of this was, again, to bring that box office experience into your home. It was a crazy time - that's a lot of new technology and a lot of new work. It was on our roadmap, but not for that time period, and with the announcement we shifted our strategy to focus on it. In the tech industry, November and December are usually when you can combine holidays and vacation, get a chance to unwind, and start getting ready for the next year.

That was not the case for us at all that year. We worked long hours and long days right through Christmas Day, and while many of you were opening presents with your kids on Christmas morning, we were all huddled over our computers watching dashboards to make sure the launch was as successful as it could be - and it was. The next two time periods here, ads and global, were all about expanding the capabilities and the subscriber base for HBO and HBO Max.

You'll notice it's not a typo that they both say June 2021. Originally they were supposed to be spaced far apart, but that Popcorn effort I talked about got in the way and shifted us to the right. Unfortunately, the other date didn't shift, so we ended up with everything in June 2021. The ad tier required us to build, for the first time, a completely separate tier for consuming and buying the HBO Max ad-supported offering, including SSAI integration.

We launched that at the beginning of June. At the end of June we then switched to a global strategy: we went from a single country, the United States, to 30 countries overnight - we launched in all of LATAM overnight, going from 1 to 30. This required not just new infrastructure and new capabilities in these new territories, but also handling things like getting multi-language support right and handling live events and live sporting events in Latin America. So a lot of things happened in a very short time period.

And as I mentioned, with this rapid growth - the timeline from May 2020 to June 2021 is really only about a year - we had to have a very resilient, very robust CDN architecture to support it. I'm going to turn it over to Vikrant to walk us through that.

Vikrant: Thank you, Jay. The HBO Max platform delivers millions of streams around the world every day. In order to support that massive volume, a solid CDN infrastructure is required, so we can ensure the highest quality of user experience and have resiliency and redundancy when needed. Our platform is integrated with multiple CDNs, and with a multi-CDN strategy we have extended our global footprint to get closer to each user, improving latency and making our delivery cost model efficient as well.

The top criteria used to onboard a CDN are: first, a CDN that's capable of handling our daily peaks as well as traffic bursts; second, that it has global presence; third, that we get access to their premium support; and fourth, that they have the right tools and feature set to accommodate our operational needs.

We are very happy to have CloudFront as one of our key CDN providers, delivering millions of streams around the world. CloudFront has a strategically spread-out edge network to support the necessary bandwidth, and they also provide premium support and event support. We have been using CloudFront for many years.

The HBO Max infrastructure has been running on the AWS cloud as well. Our journey with CloudFront started with HBO Max's predecessor platforms, HBO Go and HBO Now. In order to support Game of Thrones traffic, extensive capacity commitments were required, and after working with the AWS account management team and the Amazon CloudFront team, the necessary capacity was made available to us on very short notice.

During the onboarding process we found CloudFront to be very responsive, as we had to complete our setup and configuration in a very short timeframe. We had a successful Game of Thrones season, and CloudFront has continued to be one of our trusted key CDN partners ever since. We just reviewed HBO Max's multi-CDN strategy on the earlier slide.

Now we are going to look at a high-level architecture of how video delivery works with the integrated CDNs. In step 1, a user selects an asset for playback, and the client platform initiates a request for the asset manifest to a CDN endpoint. The CDN endpoint checks whether that object is in its cache and whether it is fresh; if it is, then in step 2 it sends the response back to the user from its cache.

If that object does not exist in its cache, it talks to its parent, the regional edge cache, to obtain a copy of that object. The regional edge cache and parent perform the same logic on the object's validity, and if the object needs to be fetched or refreshed, then in step 3 they request a copy from the origin.

Upon successful delivery of the manifest object to the regional edge cache, and in turn back to the client platform, the client platform continues to make requests for video segments using the same steps and logic until the end of the play session.

If the origin responds with an HTTP error in step 4, the CDN will go to the failover origin and request a copy of that object, and in step 6 the object gets delivered from the failover origin, provided it was available there. If it isn't, the CDN responds with the appropriate error response.

We use Amazon S3 as our video origin, where we maintain a large catalog of video assets. This catalog gets replicated to other AWS regions around the globe. CloudFront and all our CDNs are configured to have a secure handshake with the origin, and they are also configured to authenticate with the origin so that only the CDNs are ingesting content, for security compliance.
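For the CloudFront leg of that, one common pattern is origin access control with a bucket policy that only trusts a specific distribution. The bucket, account, and distribution IDs below are placeholders, and third-party CDNs would authenticate differently (for example with signed origin headers).

```python
import json
import boto3

s3 = boto3.client("s3")

# Sketch: restrict the video catalog bucket so only one CloudFront distribution
# (via Origin Access Control) can fetch objects. All identifiers are placeholders.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowCloudFrontOAC",
            "Effect": "Allow",
            "Principal": {"Service": "cloudfront.amazonaws.com"},
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::video-catalog-bucket/*",
            "Condition": {
                "StringEquals": {
                    "AWS:SourceArn": "arn:aws:cloudfront::111122223333:distribution/EDFDVBD6EXAMPLE"
                }
            },
        }
    ],
}

s3.put_bucket_policy(Bucket="video-catalog-bucket", Policy=json.dumps(policy))
```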

All our CDNs are also configured to ingest content from the failover origin in case there are connectivity issues in a specific region. This is just the high-level architecture of how video-on-demand delivery works using the integrated CDNs. However, there is a lot that goes on behind the scenes, and Jay is going to walk us through some of those behind-the-scenes complexities.

Jay: Yeah, thank you, Vikrant. So there's a lot going on on this slide, and we're going to take some time to unpack it, but I really want you to take two messages away from it. All these video services in the back here are really trying to accomplish two things: get the right CDN and the right manifest. And both of those have the same goal - again, to ensure the premium user experience.

So what does the right CDN mean? It means a CDN that's able to continuously provide a good experience to the user and deliver the content they need - no latency, no lag. And what does the right manifest mean? The right manifest means the best user experience we can give that user for their specific device and specific network conditions.

We would love everybody to watch everything in Dolby Atmos and Dolby Vision - it's a fantastic experience - but the reality is that not every device and not every network supports that. So as we walk through this, I'll go through it one step at a time. Okay, I lied - two came on at the same time there, but they do happen at the same time. That's getting ready for the actual playback experience.

I really just wanted to let you know that this content ingestion has two parts. We ingest the video itself and the manifests, and those go to our S3 origin as Vikrant described. But at the same time we ingest metadata about that video and that manifest: the codecs that are available, the languages, the run lengths. There's a lot of information that gets ingested into our video URL service, which stores it for later use.

Now shift your focus back over to the right-hand side, where you see a kind of crude-looking phone. That's our user, and our user is browsing, trying to decide what they want to play. They find a video they want to watch - The Sopranos, season one, episode one. An excellent choice; it's a fantastic series, one of my favorites.

So they push that play button Tal mentioned earlier, and hopefully things play. The first thing that happens is a request goes back to our video URL service, and the client says: hey, video URL service, this user wants to play Sopranos season one, episode one, and here's information about their device and their network and all the things that might impact your decision on how you want to play it for them.

The video URL service says: okay, I know about Sopranos season one, episode one; let me find the metadata for it. Now I need to make sure I get the two things I want - the right CDN and the right manifest. So it goes and asks the CDN service: hey, CDN service, I've got this user who wants to play some video; here's information about his device, his network, and his location. The CDN service uses all of that, looks at a lot of different parameters that we monitor, and returns a response that says: here's the CDN I would like you to use.

So now the video URL service has the right CDN - it's got half of what it needs. It knows a lot about how the device, the capabilities of the device, and the network all interact with each other, so it builds up a parameter list and says: okay, here's the basic URL that describes everything about that manifest, and here's a set of parameters that will tell our dynamic manifest service how to filter it. And that - oops, there's my wrong button push - then goes back to the client with all that information in it.

The client then requests the manifest. The manifest request goes to our dynamic manifest service - most of the time it's fronted by CloudFront, and most of the time that manifest is actually returned from the cache. But in this case nobody has watched Sopranos season one, episode one in a while, so the request gets forwarded on to our dynamic manifest service, which then requests the master manifest that has everything in it -

every possibility that video could have. It filters that manifest, builds the right manifest, and returns it via CloudFront to the device. The user is then able to continue with the normal path: request the playback from the right CDN with the right videos, the CDN pulls from the origin if it hasn't cached them, and the user is now able to enjoy James Gandolfini in all his glory.
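The talk doesn't show the dynamic manifest service's internals, but the idea of filtering a master manifest down to what a device supports can be sketched in a few lines. This HLS-flavored example and its codec strings are illustrative assumptions only.

```python
import re

def filter_master_playlist(master: str, supported_codecs: set) -> str:
    """Return a copy of an HLS master playlist with unsupported variants removed.
    Illustrative only - a real service also handles audio/subtitle groups, DRM, etc."""
    out, skip_uri = [], False
    for line in master.splitlines():
        if skip_uri and line and not line.startswith("#"):
            skip_uri = False           # drop the URI belonging to the removed variant
            continue
        if line.startswith("#EXT-X-STREAM-INF"):
            m = re.search(r'CODECS="([^"]*)"', line)
            codecs = set(m.group(1).split(",")) if m else set()
            if codecs and not codecs <= supported_codecs:
                skip_uri = True        # skip this variant tag and its URI line
                continue
        out.append(line)
    return "\n".join(out) + "\n"

# Example: keep only AVC video and AAC audio variants for a device without HDR support.
# filtered = filter_master_playlist(master_text, {"avc1.640028", "mp4a.40.2"})
```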

So the next steps are really about how we choose the right CDN, and I'm going to hand it over to Vikrant to talk about some of the metrics we use to pick the right CDN and monitor our CDNs.

Vikrant: Thank you. We all know how important it is to monitor metrics for today's businesses, and it is extremely essential to monitor video health and CDN health for the HBO Max platform. The HBO Max platform uses its own homegrown monitoring and alerting solution; it is also hosted on AWS and uses Grafana for its dashboards. Metrics are collected from client platforms and calculated in real time.

In the current example, these data points are broken down by CDN; however, they can also be broken down by many different dimensions, like users, geography, ISP, ASN, device metadata, asset metadata, or even user metadata. We are going to look at a few of these key metrics that are extremely essential to ensure we are providing the highest quality of user experience to our users.

The first metric we are going to look at is video start failure. This KPI gives us insight into how many failures happen during startup. There are many components involved in video startup: there could be device-related issues, network-related issues, application-related issues, CDN issues, DRM issues, or even entitlement issues. An error code needs to be generated in order to mark a session as a failure.

Similarly for video playback failure: when playback was in a paused state and was reinitiated, we closely monitor failures around that resumed playback. As with video start failure, an error code needs to be generated by some endpoint to tag the session as a failure. For both of these metrics, the lower the value, the higher the success rate - that's what we usually look for.

Next is exit before video start, which is a user behavior. Sometimes it takes longer than normal to start the video because there were delays at some endpoint, and we closely monitor which endpoints are causing those delays and try to fix or mitigate them based on where the delays are coming from.

Rebuffering ratio: there is always an initial buffering event to fill the player's buffer for smooth and consistent playback; the rebuffering ratio, however, captures any additional delays in downloading video segments. For this metric, the lower the value, the better the experience for our users.

Video startup time: this KPI tells us how long it took to play the first frame of the video. A lower value generally reflects greater cache hits for objects as well as low latency and good network bandwidth - so again, a lower value means videos are starting very quickly for our customers.

For average bitrate: in adaptive bitrate delivery, we would like our users to play video at the highest bitrate possible, because the highest bitrate translates into the highest quality of video playback. So for this metric, the higher the value, the better.

All these metrics are a balancing act. For example, if we want to lower our video start time or rebuffering time, the bitrate ladder could be capped at lower bitrates. However, that would be a trade-off on quality, and we make no compromises on the quality of video in the HBO Max world.
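HBO Max's exact formulas aren't spelled out in the talk, but the usual shape of these KPIs can be sketched from per-session telemetry. The field names and definitions below are assumptions for illustration, not the platform's actual implementation.

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class Session:
    request_ts: float        # when the user pressed play
    first_frame_ts: float    # when the first frame rendered
    play_seconds: float      # seconds of content actually played
    rebuffer_seconds: float  # stall time after the initial buffering
    bytes_delivered: int     # video bytes downloaded
    failed: bool             # an error code marked this session as a failure

def kpis(sessions: List[Session]) -> Dict[str, float]:
    """Common streaming KPIs; definitions are illustrative only."""
    started = [s for s in sessions if not s.failed]
    watch = sum(s.play_seconds for s in started) or 1e-9
    stalls = sum(s.rebuffer_seconds for s in started)
    return {
        "video_start_failure_rate": 1.0 - len(started) / len(sessions),
        "median_video_startup_time_s": sorted(
            s.first_frame_ts - s.request_ts for s in started
        )[len(started) // 2],
        "rebuffering_ratio": stalls / (watch + stalls),
        "average_bitrate_bps": sum(s.bytes_delivered * 8 for s in started) / watch,
    }
```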

All these metrics are also part of our alerting system: if a metric goes beyond certain thresholds, an alert is triggered to our on-call engineer. Once the alert is received, the on-call engineer takes a deep dive into the data using the many different dimensions I mentioned earlier, and once the root cause is identified, the engineer applies the proper mitigation based on our runbooks.

For these metrics, overall consistency is important, and you can see CloudFront, shown in orange here, delivering consistent performance across all these key metrics.

Beyond these video and CDN metrics, we also monitor metrics on the CDN side, like cache hit ratio, responses by HTTP status code, response times, origin ingest bandwidth, and peak throughput at the edge.

Quality of performance is paramount for media businesses, and to stay ahead of the curve, HBO Max keeps evolving to improve our performance and our customers' experience. This year we enabled QUIC on select CDNs for our supported clients. We closely partnered with the Amazon CloudFront team and enabled QUIC on all our CloudFront distributions, for both video on demand and live. The adoption rate by our clients was up to 10%, and you can see the difference QUIC made to the customer experience in terms of rebuffering ratio and video start time.

Each color on this graph represents a different CDN, and it's easy to spot that rebuffering is higher on the CDN that currently does not support the QUIC protocol - you can see that this CDN has a much higher rebuffering ratio.

On this slide, we are going to look at a couple of interesting traffic patterns. The graph on your left-hand side is for a live-streamed sports event in Latin America where six games were scheduled in one time window. The event kicked off with two simultaneous games, and when those games were about to finish, four more simultaneous games kicked off. As you can see, it created an interesting tidal viewership pattern in the concurrent plays.

Can anybody guess where the first two games ended and the next four games were about to start? Any guesses? Everybody is silent. The first half, the second half? Yeah, let's see - so that's where the first two games ended, and right around 2:40 the next four games kicked off, and viewership kept climbing.

Can you also guess where halftime happened? Yep - right at four o'clock, that's where halftime ended, and the games probably got more interesting, as viewership just kept climbing.

On the right-hand side, the graph compares HBO Max's prime-time hours with HBO Max's prime-time releases. The light blue color represents HBO Max's prime-time hours, where you see linear traffic growth for our business-as-usual traffic, while the purple color shows thundering traffic bursts for shows like Game of Thrones, House of the Dragon, or Euphoria, to name a few. You can see that at the scheduled time viewership just kept climbing, creating a huge spike, and at the peak it formed a tent-pole pattern.

To meet these massive spikes in demand, the HBO Max team works through the logistics for months in advance of the event. Based on projections and historical data, the required capacity is calculated, and the team ensures that this capacity is preemptively scaled at many different layers of our platform. We also need to make sure that CDN capacity is available to support these peaks. We closely partnered with the Amazon CloudFront team and developed a semi-automated pre-warming process for our high-profile events. The AWS Infrastructure Event Management team, along with the AWS account management team and solutions architects, works with our infrastructure team to make sure we are prepared for these high-profile events. The AWS media event management team provided us with event support during House of the Dragon for CloudFront delivery.

Next, Jay is going to talk about what we'll be looking forward to from CloudFront, as well as from the HBO Max platform.

Jay: All right. Normally there might be a lot more on our roadmap here, but we're at an interesting time in the merger of the company. WarnerMedia and Discovery merged and became Warner Brothers Discovery, and as part of that we have a new strategy that's really focused on taking the catalog from HBO Max and the catalog from Discovery Plus and combining them into a new, enhanced user experience launching next year. That's where a lot of our focus is, and that's where we'll go from that point - we'll continue our launches into the rest of the world.

I did want to highlight one additional thing. One of the things Tal brought up was the configuration management - the continuous deployment, the blue/green deployment. We're looking forward to that, and we're glad it launched. It's going to really help us simplify how we manage configuration changes on CloudFront specifically, so that we can test things in a staging environment, or on a small percentage of traffic, before we roll them out to full production. That's something I think is really key. But I don't want to leave you with just "hey, there's not a whole lot on our roadmap" - I just can't tell you a lot of those details right now.

So we do have a sizzle reel for you, to give you a little taste of what's coming up on HBO Max.

[Sizzle reel plays.]

Tal: Thank you, Jay and Vikrant. This was so exciting - thank you for sharing this journey, the metrics, and everything you have done so far to deliver this quality of experience to your users.

Thank you, everyone. We can take questions now, and if you want to learn more about CloudFront for media delivery, this is the link to check out. If there are any questions, you're welcome to ask - we're also going to stay here a few minutes after, so you can meet us off the stage. And don't forget to complete the survey in your mobile app. Thank you again.
