

Instagram plays a critical part in forming meaningful communities where people can connect with each other and share what matters most to them. To facilitate these connections, we build high-quality sharing experiences that we can take pride in. One way we improve the Instagram experience is by improving audio quality.

Instagram’s Music Sticker song suggestions for the pop music genre

What is Audio Quality?

Audio quality is a measure of how closely the audio we deliver to Instagram apps matches the original uncompressed audio file. Instagram delivers compressed audio to enable smooth video playback with fewer stalls caused by rebuffers. However, in exchange for smoother playback, compression introduces the risk of artifacts. Some examples of compression artifacts are reduced clarity in high-frequency sounds, weaker bass, and added noise. These differences collectively lower the audio quality perceived by listeners.

Improving Audio Quality

Instagram’s video system has several levers that affect audio quality. The audio codec selection, sample rate, and bitrate all contribute to the quality of the audio encoding.

Different audio codecs apply different levels of lossy compression, and they perform differently on different types of content. Given the scale and range of Instagram’s content, it’s important to rigorously evaluate which codecs best fit the content and to put metrics in place to track audio quality. Rather than spend significant engineering time building an audio quality metric up front, we pursued the simplest solution first and aimed to demonstrate, using existing engagement metrics, that Instagram listeners care about audio quality. Changing the audio codec was not the simplest solution, so we kept AAC as the audio codec for our audio quality improvement experiment.

Sample rate sets the upper bound of the frequencies that our audio encodings can represent correctly. The Nyquist-Shannon Sampling Theorem says that: “A band limited continuous-time signal can be sampled and perfectly reconstructed from its samples if the waveform is sampled over twice as fast as its highest frequency component.” Instagram uses an industry standard 44.1kHz sample rate, which can represent frequencies up to 22.05kHz. That is more than enough to convey the 20kHz maximum that most people can hear, so we ruled out sample rate as a variable worth changing.

Bitrate, measured in kilobits per second (kbps), is the amount of data used to encode each second of audio, so for a clip of a given length it varies linearly with the number of bits in the audio file. In other words, a higher bitrate means more data and less compression in the audio encoding. This allows the compressed audio encoding to retain more features of the original audio file with fewer compression artifacts. When the bitrate is too low, the encoder removes audio details that it considers less important. Since we kept the audio codec and sample rate constant, and bitrate was simple to change, we chose to vary the bitrate in our audio quality improvement experiment.
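
To make the sample rate and bitrate arithmetic concrete, here is a minimal Python sketch. It computes the highest frequency a 44.1kHz sample rate can represent and compares an uncompressed stream against the bitrates discussed in this post. The 16-bit depth and two channels are assumptions chosen for illustration (the post does not state them), and the function names are hypothetical.

```python
def nyquist_limit_hz(sample_rate_hz: float) -> float:
    """Highest frequency a given sample rate can represent."""
    return sample_rate_hz / 2.0


def pcm_bitrate_kbps(sample_rate_hz: int, bit_depth: int, channels: int) -> float:
    """Bitrate of uncompressed PCM audio in kilobits per second."""
    return sample_rate_hz * bit_depth * channels / 1000.0


def compression_ratio(pcm_kbps: float, encoded_kbps: float) -> float:
    """How many times smaller the encoded stream is than the PCM stream."""
    return pcm_kbps / encoded_kbps


SAMPLE_RATE_HZ = 44_100  # from the post
BIT_DEPTH = 16           # assumption: CD-style 16-bit samples
CHANNELS = 2             # assumption: stereo

nyquist = nyquist_limit_hz(SAMPLE_RATE_HZ)
pcm = pcm_bitrate_kbps(SAMPLE_RATE_HZ, BIT_DEPTH, CHANNELS)

print(f"Nyquist limit: {nyquist:.0f} Hz")
print(f"Uncompressed PCM: {pcm:.1f} kbps")
for encoded_kbps in (64, 96, 128):
    ratio = compression_ratio(pcm, encoded_kbps)
    print(f"{encoded_kbps} kbps is about {ratio:.0f}:1 compression")
```

Under these assumptions the uncompressed stream is 1411.2kbps, so 64kbps, 96kbps, and 128kbps correspond to roughly 22:1, 15:1, and 11:1 compression. The lower the bitrate, the more detail the encoder has to discard.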

The Bitrate Experiment

Prior to our audio quality improvement efforts, Instagram’s default bitrate for audio in videos was 64kbps. The microphone on a phone doesn’t produce a rich audio signal, so despite the low bitrate, Instagram’s audio compression performed well for most content. However, as Instagram creators started posting studio-produced audio content (e.g. music recordings), it became clear that 64kbps was not sufficient for delivering high quality audio. We received reports that Instagram’s audio sounded “blown out” or was too low quality for artists to want to share certain songs on Instagram. When we tested the Instagram app, we observed common compression artifacts. For example, in Instagram’s Music Sticker Stories, we noticed that snare drums, cymbals, voice, and reverb sounded drier and thinner in the compressed audio than they did in the original recordings.

Unfortunately, we can’t simply increase the bitrate for all content. Overall bandwidth is limited, so audio and video have to share it, and this is a zero-sum game. High quality video has a bitrate so high that the difference between 64kbps and 128kbps audio has a negligible impact on playback rebuffers. However, in low bandwidth situations we serve video at much lower bitrates. In these situations, an extra 64kbps of audio can make a substantial difference to the playback experience.
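
The zero-sum tradeoff can be illustrated with a short Python sketch that compares how much the total stream grows when audio moves from 64kbps to 128kbps. The two video bitrates below (4000kbps and 200kbps) are hypothetical values picked only to represent a high quality and a low bandwidth case; they are not Instagram’s actual encoding settings.

```python
def total_kbps(video_kbps: float, audio_kbps: float) -> float:
    """Total stream bitrate: video and audio share the same bandwidth."""
    return video_kbps + audio_kbps


def relative_increase(video_kbps: float, old_audio_kbps: float,
                      new_audio_kbps: float) -> float:
    """Fractional growth of the total stream when the audio bitrate changes."""
    old_total = total_kbps(video_kbps, old_audio_kbps)
    new_total = total_kbps(video_kbps, new_audio_kbps)
    return (new_total - old_total) / old_total


# Hypothetical video bitrates, for illustration only.
HYPOTHETICAL_VIDEO_KBPS = {"high quality": 4000.0, "low bandwidth": 200.0}

for label, video_kbps in HYPOTHETICAL_VIDEO_KBPS.items():
    growth = relative_increase(video_kbps, 64.0, 128.0)
    print(f"{label}: {video_kbps:.0f} kbps video, total stream grows {growth:.1%}")
```

With these example numbers, the same 64kbps audio increase grows the high quality stream by under 2% but grows the low bandwidth stream by about 24%, which is why the change matters far more when bandwidth is scarce.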

While we can increase the audio bitrate, we must weigh the tradeoffs between audio quality and video quality. Increasing this bitrate for all content is particularly risky, since we know that most content has simple audio and will not benefit from the audio side of the tradeoff. In our experiment, we aimed to make the right quality tradeoff for the right content.

Content and Community Specific Quality Preferences

To find the strongest signal on Instagram listeners’ preferences for audio quality, we considered ways to focus our audio quality improvements. From our previous experiments on visual quality, we knew that quality of experience is subjective and varies by content type and community type. Audio quality sensitivity depends on each listener’s attention to audio details and the quality of the playback speaker (e.g. the device’s default external speaker or headphones). We worried that some Instagram listeners with low-end mobile phone speakers might not pay attention to general audio quality. Musicians, on the other hand, know Instagram as a platform where they can create music communities, so we suspected that many Instagram listeners would be sensitive to music audio quality.

We expected to see the strongest correlations between audio quality and engagement in Instagram’s music content where the audio frequency range is wide and full. To obtain this signal, we ran a targeted audio quality improvement test on the product where we expected audio quality to make the biggest impact: Music Sticker Stories.

a music sticker that plays a song by Relient K

Music Sticker Stories Experiment

To avoid diluting the results with non-music content, we leveraged Instagram’s video and audio encoding tag system to zoom in on Stories audio encodings in the A/B test. All audio encodings in the control group used our default 64kbps bitrate. We ran two test groups: one where the audio encodings used a 96kbps bitrate and another where the audio encodings used a 128kbps bitrate.

In the experiment results, we saw clear engagement wins from improved audio quality in Music Sticker Stories. The 128kbps test group delivered the best results. We measure video engagement by watch time (i.e., time spent watching videos) and view percent (i.e., the percentage of a video a viewer finishes watching). Both watch time and view percent improved despite regressions in visual quality and rebuffers. We expected the regressions in visual quality and rebuffers because we shifted our bandwidth usage from video to audio. However, the engagement metric wins exceeded our expectations. These metrics demonstrated that Instagram viewers are more willing to watch complete Music Sticker Stories videos, even with playback performance regressions, because the audio quality is better.
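
As a rough illustration of the two engagement metrics, the Python sketch below computes watch time and view percent per experiment group from a list of view records. The `View` record, its field names, and the sample values are all hypothetical and made up for the example; they are not Instagram’s logging schema or experiment data.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class View:
    group: str         # hypothetical label: "control_64", "test_96", "test_128"
    watched_s: float   # seconds the viewer actually watched
    duration_s: float  # total length of the video in seconds


def watch_time_s(views: list[View]) -> float:
    """Watch time: total time spent watching videos."""
    return sum(v.watched_s for v in views)


def view_percent(views: list[View]) -> float:
    """View percent: average share of each video watched, capped at 100%."""
    return mean(min(v.watched_s / v.duration_s, 1.0) for v in views) * 100.0


def summarize(views: list[View]) -> dict[str, tuple[float, float]]:
    """Map each group to its (watch time in seconds, view percent)."""
    by_group: dict[str, list[View]] = defaultdict(list)
    for v in views:
        by_group[v.group].append(v)
    return {g: (watch_time_s(vs), view_percent(vs)) for g, vs in by_group.items()}


# Made-up records purely to exercise the functions; not experiment results.
sample = [
    View("control_64", watched_s=5.0, duration_s=10.0),
    View("control_64", watched_s=10.0, duration_s=10.0),
    View("test_128", watched_s=8.0, duration_s=10.0),
]
for group, (seconds, percent) in summarize(sample).items():
    print(f"{group}: watch time {seconds:.0f}s, view percent {percent:.0f}%")
```

Comparing these per-group summaries between the control and test groups is the kind of readout an A/B test like this one relies on; a real analysis would also need significance testing, which this sketch omits.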

Future Improvements

Increasing the audio bitrate for Music Sticker Stories is only the beginning of delivering a personalized video quality of experience to the Instagram community. To help us make the right tradeoffs between audio quality, visual quality, and smooth playback, we are considering building bandwidth-aware audio ABR (i.e., adaptive bitrate) and content identification (i.e., identifying which video content has music).

Many thanks to my great team members who helped to make this happen: Donald Chen, Haixia Shi, Chris Ellsworth, Bill Phillips, and Mackenzie Pearson.

Donald Chen (Android) and Chris Hsu (Server) are software engineers on the Instagram Media Infrastructure team.

If you want to learn more about this work or are interested in joining one of our engineering teams, please visit our careers page or follow us on Facebook or Twitter.



