Emoji in Qt 6.9
January 28, 2025 by Eskil Abrahamsen Blomfeldt
Emoji are quirky and fun, but they are also one of the world's most popular writing systems. In 2022, it was estimated that 92% of the world's online population used emoji to express themselves.
Supporting color fonts is a prerequisite for supporting emoji, and Qt has had such support on macOS and iOS since Qt 5.2. For Windows and Linux/Android (Freetype), support came a bit later, in Qt 5.7. But as the domain has evolved, Qt has not quite kept up. In Qt 6.9, we fill in the gaps and modernize our emoji/color font support across all platforms.
The word emoji itself has a history of slightly different meanings, and the idea of using pictographic fonts or emoticons to express emotions certainly predates the current understanding of the word. In Qt, we use the Unicode definition: A colorful, cartoony pictograph inlined in text. In this interpretation, the symbols that we typically know as emoji can also be presented as "text" (basically meaning monochrome), but an "emoji presentation" requires the use of a color font.
In this blog post, I will first go through some improvements made to emoji handling itself in Qt, then look at the history of color font formats and how they will be supported in upcoming releases.
Emoji segmenter
As with other writing systems, the Unicode standard has ranges of characters designated as "emoji" (i.e. defaulting to emoji presentation). But the emoji concept is a bit broader than that: For instance, certain characters that are text by default can be turned into emoji by appending the control character variation selector 16 (U+FE0F).
One example is the airplane (U+2708) character, which is a text dingbat, added to Unicode 1.1 back in 1993. It should be presented in monochrome by default: ✈. However, by appending the VS-16 to it, and creating the sequence U+2708, U+FE0F, we ask for the emoji presentation of the same character instead: ✈️.
In addition, certain text presentation characters take on a special meaning when combined with emoji using zero-width joiners (U+200D). For instance, joining the woman emoji (U+1F469 👩) with the staff of Aesculapius (U+2695 ⚕) in the sequence U+1F469, U+200D, U+2695 resolves to a woman health worker emoji: 👩⚕.
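To make the two kinds of sequences concrete, here is a minimal sketch in plain C++ (not Qt code) that spells them out as code points and encodes them as UTF-8. The function and constant names are made up for this illustration, and the encoder skips validation for brevity.

```cpp
#include <string>
#include <vector>

// Encode a sequence of Unicode code points as UTF-8.
// Illustration only: no validation of surrogates or out-of-range values.
std::string encodeUtf8(const std::vector<char32_t> &codePoints)
{
    std::string out;
    for (char32_t cp : codePoints) {
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}

constexpr char32_t Airplane = 0x2708;
constexpr char32_t VariationSelector16 = 0xFE0F;
constexpr char32_t ZeroWidthJoiner = 0x200D;
constexpr char32_t Woman = 0x1F469;
constexpr char32_t StaffOfAesculapius = 0x2695;

// Text presentation: the bare dingbat.
std::string airplaneText() { return encodeUtf8({Airplane}); }

// Emoji presentation: the same character followed by VS-16.
std::string airplaneEmoji() { return encodeUtf8({Airplane, VariationSelector16}); }

// ZWJ sequence: woman + zero-width joiner + staff of Aesculapius.
std::string womanHealthWorker()
{
    return encodeUtf8({Woman, ZeroWidthJoiner, StaffOfAesculapius});
}
```

Writing any of these strings to a UTF-8 terminal or text widget should show the corresponding presentation, provided the font stack has a glyph for the full sequence.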
Now, the font and shaper take care of selecting the correct glyph in these cases, but in order for that to happen, we need to apply the emoji font to the full sequence of characters, which means we need to know which characters belong to the sequence. This is where the problem arose: In Qt, text is initially split into "script items" according to writing system. Based on the writing system and font query, a main font is selected for each item, and a fallback mechanism ("font merging") queries other similar fonts whenever specific characters are not supported by the main one.
This system was written before the introduction of emoji, and it did not account for the fact that specific sequences such as these - combining characters from different Unicode blocks - should default to a specific subset of fonts. A lot of the issues that we saw were addressed with ad hoc heuristics over the years, so for most use cases it would work as expected, but the core problem was always there. We ended up with complicated, ad hoc code in Qt and a set of broken corner cases - some depending on which fonts were available on the target system.
In Qt 6.9, Google's emoji-segmenter has been introduced to Qt. This is a small parser for the emoji specification in Unicode, and it allows us to detect substrings that are intended to be displayed in color. We treat emoji as a writing system in its own right when resolving fonts for these substrings, prioritizing color fonts.
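As a rough illustration of what segmenting means here, the following toy sketch splits a code point sequence into emoji and non-emoji runs. It is not the emoji-segmenter grammar and not Qt's code: the `defaultsToEmoji()` range check is a crude stand-in for the real Unicode property data, and all names are hypothetical. It only handles VS-16 and ZWJ, the two cases discussed above.

```cpp
#include <cstddef>
#include <vector>

struct Run {
    std::size_t start;   // index of the first code point in the run
    std::size_t length;  // number of code points in the run
    bool emoji;          // true if the run should get emoji presentation
};

constexpr char32_t VS16 = 0xFE0F; // variation selector 16
constexpr char32_t ZWJ = 0x200D;  // zero-width joiner

// Crude stand-in for "defaults to emoji presentation" (assumption, not real data).
bool defaultsToEmoji(char32_t cp)
{
    return cp >= 0x1F300 && cp <= 0x1FAFF;
}

// Returns the length of the emoji sequence starting at index i, or 0 if none.
std::size_t emojiSequenceLength(const std::vector<char32_t> &text, std::size_t i)
{
    const bool hasVs16 = i + 1 < text.size() && text[i + 1] == VS16;
    if (!defaultsToEmoji(text[i]) && !hasVs16)
        return 0;
    std::size_t end = i + (hasVs16 ? 2 : 1);
    // Absorb "ZWJ + element (+ VS-16)" groups, so that joined text characters
    // such as U+2695 stay inside the emoji sequence.
    while (end + 1 < text.size() && text[end] == ZWJ) {
        end += 2;
        if (end < text.size() && text[end] == VS16)
            ++end;
    }
    return end - i;
}

// Split the text into alternating emoji / non-emoji runs.
std::vector<Run> segment(const std::vector<char32_t> &text)
{
    std::vector<Run> runs;
    std::size_t i = 0;
    while (i < text.size()) {
        const std::size_t seq = emojiSequenceLength(text, i);
        const bool emoji = seq > 0;
        const std::size_t len = emoji ? seq : 1;
        if (!runs.empty() && runs.back().emoji == emoji)
            runs.back().length += len; // extend the current run
        else
            runs.push_back({i, len, emoji});
        i += len;
    }
    return runs;
}
```

The point of the exercise is the last step: once the emoji runs are known, a color font can be resolved for each whole run instead of character by character.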
The segmenter is on by default, so no action is needed to reap the benefits of this. It can, however, be disabled by a flag in QTextOption, for a little bit less overhead when you know the text will not contain emoji. If needed, the whole feature can also be disabled either by setting the environment variable QT_DISABLE_EMOJI_SEGMENTER=1, or by passing -no-emojisegmenter when you configure Qt.
Color font formats
Detecting emoji correctly is an important piece of the puzzle, but what ultimately has a greater impact on users is support for the specific font file they want to use to display those emoji.
When the idea of color fonts was first introduced, multiple different standards were created, each backed by different major players. The whole area has been quite fragmented ever since.
- Apple backed the SBIX format: Bitmap fonts containing embedded JPEG, TIFF or PNG images to represent the glyphs.
- Google backed the CBLC/CBDT format: Bitmap fonts capable of containing PNGs as well as uncompressed image data.
(In practice, fonts in both these formats will typically embed PNG data due to its efficient, lossless compression and wide support.)
- Adobe and Mozilla backed the OpenType-SVG format: Embedded SVG files representing the glyphs.
- And Microsoft backed the COLR/CPAL format: In the original version of this format (v0), glyph outlines are stored in the same vector format as in monochrome fonts, but a glyph can consist of multiple such outlines layered on top of each other, each with a different predefined fill color.
Since the first two of these store glyphs as bitmap images, they can contain any image the designer creates. The downside is that they will look blurry when scaled to sizes beyond those of the images stored in the font.
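The scaling problem can be sketched with a few lines of arithmetic. The following is an illustrative strike selection, not what any particular font engine does: it picks the smallest embedded bitmap that is at least as large as the requested size and reports the scale factor that must be applied. The strike sizes in the usage note are hypothetical.

```cpp
#include <algorithm>
#include <vector>

struct StrikeChoice {
    int strikeSize;  // pixel size of the embedded bitmap that gets used
    double scale;    // factor applied to that bitmap; > 1 means upscaling
};

// Assumes strikeSizes is non-empty.
StrikeChoice chooseStrike(std::vector<int> strikeSizes, int requestedSize)
{
    std::sort(strikeSizes.begin(), strikeSizes.end());
    // Prefer the smallest strike that is at least as large as requested,
    // since downscaling keeps the image sharp.
    auto it = std::lower_bound(strikeSizes.begin(), strikeSizes.end(), requestedSize);
    // If nothing is large enough, fall back to the largest strike and upscale.
    const int chosen = (it != strikeSizes.end()) ? *it : strikeSizes.back();
    return { chosen, static_cast<double>(requestedSize) / chosen };
}
```

With strikes of 32, 64 and 128 pixels, a request for 48 pixels uses the 64-pixel bitmap scaled by 0.75, while a request for 512 pixels has to stretch the 128-pixel bitmap by a factor of 4, which is where the blur comes from.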
Google's "Noto Color Emoji" font was originally a CBDT font, containing emoji with nice gradients, but with obvious scaling artifacts at large sizes, as can be seen in this screenshot.
The latter two formats are vector formats and do not suffer from the same blurry look as the bitmap fonts. OpenType-SVG is the most expressive of all the formats, but due to the complexity of creating an SVG renderer, it never really saw very wide adoption. To my knowledge (and I might be very wrong about this), it is primarily supported by Adobe's tools and Firefox.
The original version (v0) of the COLR format is, by contrast, straightforward for existing rasterizers to support. But since it only supports single-color fills, it is limited to quite "flat"-looking emoji.
Here is a rendering of a glyph from the COLRv0 version of the Twemoji font. It consists of a set of monochrome shapes layered on top of each other.
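The layering idea is simple enough to sketch in a few lines. This is a toy model, not a font parser: the outlines are assumed to be already rasterized into tiny 4x4 inside/outside masks, fills are opaque, and the type and function names are invented for the example.

```cpp
#include <array>
#include <cstdint>
#include <vector>

struct Color { std::uint8_t r, g, b, a; };

// One layer of a color glyph: an ordinary outline glyph plus a palette entry.
struct Layer {
    std::uint16_t glyphId;       // which outline to draw
    std::uint16_t paletteIndex;  // which predefined color to fill it with
};

constexpr int Size = 4;
using Mask = std::array<std::array<bool, Size>, Size>;   // rasterized outline
using Image = std::array<std::array<Color, Size>, Size>; // rendered glyph

// Paint the layers bottom to top; later layers cover earlier ones.
// "outlines" is indexed by glyphId (a simplification for this sketch).
Image renderColorGlyph(const std::vector<Layer> &layers,
                       const std::vector<Mask> &outlines,
                       const std::vector<Color> &palette)
{
    Image image{}; // starts fully transparent
    for (const Layer &layer : layers) {
        const Mask &mask = outlines[layer.glyphId];
        const Color fill = palette[layer.paletteIndex];
        for (int y = 0; y < Size; ++y) {
            for (int x = 0; x < Size; ++x) {
                if (mask[y][x])
                    image[y][x] = fill;
            }
        }
    }
    return image;
}
```

Since each layer is just a regular outline with a flat fill, an existing monochrome rasterizer only needs a loop like this on top to support the format.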
So the initial state was that all the formats had their own pros and cons, and different backends prioritized support for different font formats.
But in 2019, Google proposed a new version of the COLR/CPAL format to compromise between the needs of scalability and renderer simplicity: COLRv1. This format introduced gradients and Porter-Duff compositing to the vector layers, and thus gives crisp illustrations at any size, with approximately the same feature set as SVG fonts, but on top of a very simple scenegraph architecture.
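For readers unfamiliar with the two building blocks mentioned here, this is a minimal sketch of what they compute per pixel: two Porter-Duff operators and a two-stop linear gradient. It is a simplified illustration with made-up names, not the COLRv1 data model, and it interpolates premultiplied colors directly, which is only exact for opaque color stops.

```cpp
#include <algorithm>

struct Rgba { float r, g, b, a; }; // premultiplied alpha, components in [0, 1]

// Porter-Duff "source over": result = src + dst * (1 - src.a).
Rgba sourceOver(const Rgba &src, const Rgba &dst)
{
    const float k = 1.0f - src.a;
    return { src.r + dst.r * k, src.g + dst.g * k, src.b + dst.b * k, src.a + dst.a * k };
}

// Porter-Duff "source in": keep the source only where the destination has coverage.
Rgba sourceIn(const Rgba &src, const Rgba &dst)
{
    return { src.r * dst.a, src.g * dst.a, src.b * dst.a, src.a * dst.a };
}

// Two-stop linear gradient evaluated at position t, clamped to [0, 1].
Rgba linearGradient(const Rgba &from, const Rgba &to, float t)
{
    t = std::max(0.0f, std::min(1.0f, t));
    return { from.r + (to.r - from.r) * t,
             from.g + (to.g - from.g) * t,
             from.b + (to.b - from.b) * t,
             from.a + (to.a - from.a) * t };
}
```

A renderer walks the glyph's paint graph and applies operations like these to each covered pixel, which is why the result stays crisp at any size: nothing is sampled from a pre-rendered image.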
Seen here is the same glyph as before, but from the newer, COLRv1 version of Noto Color Emoji. It has the nice gradients, but now scales to any size without any loss of precision.
Support in Qt
Support in older versions of Qt has been as fragmented as the area itself: On Windows, we supported COLRv0, which was the only natively supported format when the feature was added to Qt in 2016.
On Freetype (default on Linux/Android, but usable on all platforms) we only supported the CBDT bitmap fonts, which, in turn, was the only supported format in Freetype in 2016.
The exception was on macOS/iOS, where system support for different formats has been keeping up with development, and without requiring the use of new APIs. So support was gradually extended in Qt for free.
This meant that you could get emoji presented in the target platform's default font, but including your own, custom emoji fonts in a cross-platform app would be a lot of work.
In Qt 6.9, our support on Windows and Freetype has also been extended, so that bitmap color fonts, COLRv0 and COLRv1 are now supported across all platforms. Note that OpenType-SVG fonts are still not supported by Qt and there is currently no plan to implement this, given the current cross-platform traction of the COLRv1 format.
In addition, a new API has been introduced to enable overriding the system default emoji font, making it easy to bundle custom emoji fonts with your application.
Note: For Freetype in particular, COLR/CPAL support is also extended to the upcoming patch release of Qt 6.8.x, since the default emoji font on Android 15 depends on it. For apps that target Android 15 and wish to continue using older Qt versions, we recommend bundling the CBDT version of Noto Color Emoji with your app.
If you have questions or additional information, feel free to add a comment here, or send me a message on BlueSky. For bug reports, though, please use our bug reporting tool, since they risk being overlooked otherwise.