How to improve OpenGL display efficiency, and the overall architecture of the Android Graphic subsystem

The question arose from: how can you improve OpenGL display efficiency?

https://stackoverflow.com/questions/23131472/how-to-improve-opengl-es-display-performance-in-android

https://stackoverflow.com/questions/23261662/how-to-use-graphicbuffer-in-android-ndk

https://github.com/fuyufjh/GraphicBuffer

https://stackoverflow.com/questions/25564203/what-is-wrong-when-i-use-eglimage-replace-glreadpixels-in-ndk-program

This article is based on a translation of Google's overview of the Graphic architecture: http://source.android.com/devices/graphics/architecture.html
Much of the wording and terminology here reflects my own understanding, and may not be entirely accurate.

This document describes the essential elements of Android’s “system-level” graphics architecture, and how it is used by the application framework and multimedia system. The focus is on how buffers of graphical data move through the system. If you’ve ever wondered why SurfaceView and TextureView behave the way they do, or how Surface and EGLSurface interact, you’ve come to the right place.

Some familiarity with Android devices and application development is assumed. You don’t need detailed knowledge of the app framework, and very few API calls will be mentioned, but the material herein doesn’t overlap much with other public documentation. The goal here is to provide a sense for the significant events involved in rendering a frame for output, so that you can make informed choices when designing an application. To achieve this, we work from the bottom up, describing how the UI classes work rather than how they can be used.

We start with an explanation of Android’s graphics buffers, describe the composition and display mechanism, and then proceed to the higher-level mechanisms that supply the compositor with data.

This document is chiefly concerned with the system as it exists in Android 4.4 (“KitKat”). Earlier versions of the system worked differently, and future versions will likely be different as well. Version-specific features are called out in a few places.

At various points I will refer to source code from the AOSP sources or from Grafika. Grafika is a Google open-source project for testing; it can be found at https://github.com/google/grafika. It’s more “quick hack” than solid example code, but it will suffice.

BufferQueue and gralloc


To understand how Android’s graphics system works, we have to start behind the scenes. At the heart of everything graphical in Android is a class called BufferQueue. Its role is simple enough: connect something that generates buffers of graphical data (the “producer”) to something that accepts the data for display or further processing (the “consumer”). The producer and consumer can live in different processes. Nearly everything that moves buffers of graphical data through the system relies on BufferQueue.

The basic usage is straightforward. The producer requests a free buffer (dequeueBuffer()), specifying a set of characteristics including width, height, pixel format, and usage flags. The producer populates the buffer and returns it to the queue (queueBuffer()). Sometime later, the consumer acquires the buffer (acquireBuffer()) and makes use of the buffer contents. When the consumer is done, it returns the buffer to the queue (releaseBuffer()).

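BufferQueue itself is not a public API, but the four-call protocol is easy to model. Below is a minimal, hypothetical Java sketch of that cycle; the class is a toy stand-in rather than the real platform code, and simply pairs a free list with a queued list.

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;

    // Toy model of the BufferQueue protocol; not the real platform class.
    public class ToyBufferQueue {
        public static class Buffer {
            final byte[] pixels;
            Buffer(int size) { pixels = new byte[size]; }
        }

        private final BlockingQueue<Buffer> free;   // buffers the producer may dequeue
        private final BlockingQueue<Buffer> queued; // filled buffers awaiting the consumer

        public ToyBufferQueue(int count, int size) {
            free = new ArrayBlockingQueue<>(count);
            queued = new ArrayBlockingQueue<>(count);
            for (int i = 0; i < count; i++) free.add(new Buffer(size));
        }

        // Producer side: grab a free buffer, fill it, hand it back.
        public Buffer dequeueBuffer() throws InterruptedException { return free.take(); }
        public void queueBuffer(Buffer b) { queued.add(b); }

        // Consumer side: take a filled buffer, use it, return it to the free list.
        public Buffer acquireBuffer() throws InterruptedException { return queued.take(); }
        public void releaseBuffer(Buffer b) { free.add(b); }
    }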

Most recent Android devices support the "sync framework". This allows the system to do some nifty things when combined with hardware components that can manipulate graphics data asynchronously. For example, a producer can submit a series of OpenGL ES drawing commands and then enqueue the output buffer before rendering completes. The buffer is accompanied by a fence that signals when the contents are ready. A second fence accompanies the buffer when it is returned to the free list, so that the consumer can release the buffer while the contents are still in use. This approach improves latency and throughput as the buffers move through the system.

Some characteristics of the queue, such as the maximum number of buffers it can hold, are determined jointly by the producer and the consumer.

The BufferQueue is responsible for allocating buffers as it needs them. Buffers are retained unless the characteristics change; for example, if the producer starts requesting buffers with a different size, the old buffers will be freed and new buffers will be allocated on demand.

The data structure is currently always created and “owned” by the consumer. In Android 4.3 only the producer side was “binderized”, i.e. the producer could be in a remote process but the consumer had to live in the process where the queue was created. This evolved a bit in 4.4, moving toward a more general implementation.

Buffer contents are never copied by BufferQueue. Moving that much data around would be very inefficient. Instead, buffers are always passed by handle.

Gralloc HAL

The actual buffer allocations are performed through a memory allocator called "gralloc", which is implemented through a vendor-specific HAL interface (see hardware/libhardware/include/hardware/gralloc.h). The alloc() function takes the arguments you'd expect — width, height, pixel format — as well as a set of usage flags. Those flags merit closer attention.

The gralloc allocator is not just another way to allocate memory on the native heap. In some situations, the allocated memory may not be cache-coherent, or could be totally inaccessible from user space. The nature of the allocation is determined by the usage flags, which include attributes like:

• how often the memory will be accessed from software (CPU)
• how often the memory will be accessed from hardware (GPU)
• whether the memory will be used as an OpenGL ES ("GLES") texture
• whether the memory will be used by a video encoder

For example, if your format specifies RGBA 8888 pixels, and you indicate the buffer will be accessed from software — meaning your application will touch pixels directly — then the allocator needs to create a buffer with 4 bytes per pixel in R-G-B-A order. If instead you say the buffer will only be accessed from hardware and as a GLES texture, the allocator can do anything the GLES driver wants — BGRA ordering, non-linear “swizzled” layouts, alternative color formats, etc. Allowing the hardware to use its preferred format can improve performance. Some values cannot be combined on certain platforms. For example, the “video encoder” flag may require YUV pixels, so adding “software access” and specifying RGBA 8888 would fail. The handle returned by the gralloc allocator can be passed between processes through Binder.

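For a feel of how the flags combine, here is a small sketch. The flag names and values follow hardware/libhardware/include/hardware/gralloc.h in AOSP, but the Java wrapper is purely illustrative, since gralloc is only reachable from native code.

    // Hypothetical Java mirror of a few gralloc usage flags; the numeric values
    // follow hardware/libhardware/include/hardware/gralloc.h in AOSP.
    public final class GrallocUsageDemo {
        static final int GRALLOC_USAGE_SW_WRITE_OFTEN   = 0x00000030; // CPU writes often
        static final int GRALLOC_USAGE_HW_TEXTURE       = 0x00000100; // GLES texture
        static final int GRALLOC_USAGE_HW_VIDEO_ENCODER = 0x00010000; // video encoder input

        public static void main(String[] args) {
            // CPU-rendered buffer that GLES will sample from: fine almost everywhere.
            int ok = GRALLOC_USAGE_SW_WRITE_OFTEN | GRALLOC_USAGE_HW_TEXTURE;
            // Software access plus encoder often conflicts: encoders tend to want YUV.
            int risky = GRALLOC_USAGE_SW_WRITE_OFTEN | GRALLOC_USAGE_HW_VIDEO_ENCODER;
            System.out.printf("ok=0x%08x risky=0x%08x%n", ok, risky);
        }
    }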

SurfaceFlinger and Hardware Composer


Having buffers of graphical data is wonderful, but life is even better when you get to see them on your device’s screen. That’s where SurfaceFlinger and the Hardware Composer HAL come in.

SurfaceFlinger's role is to accept buffers of data from multiple sources, composite them, and send them to the display. Once upon a time this was done with software blitting to a hardware framebuffer (e.g. /dev/graphics/fb0), but those days are long gone.

When an app comes to the foreground, the WindowManager service asks SurfaceFlinger for a drawing surface. SurfaceFlinger creates a "layer" - the primary component of which is a BufferQueue - for which SurfaceFlinger acts as the consumer. A Binder object for the producer side is passed through the WindowManager to the app, which can then start sending frames directly to SurfaceFlinger. (Note: The WindowManager uses the term "window" instead of "layer" for this and uses "layer" to mean something else. We're going to use the SurfaceFlinger terminology. It can be argued that SurfaceFlinger should really be called LayerFlinger.)

For most apps, there will be three layers on screen at any time: the “status bar” at the top of the screen, the “navigation bar” at the bottom or side, and the application’s UI. Some apps will have more or less, e.g. the default home app has a separate layer for the wallpaper, while a full-screen game might hide the status bar. Each layer can be updated independently. The status and navigation bars are rendered by a system process, while the app layers are rendered by the app, with no coordination between the two.

(Translator's note: quite a few phones, such as some Samsung models, use hardware buttons instead of an on-screen navigation bar.)

Device displays refresh at a certain rate, typically 60 frames per second on phones and tablets. If the display contents are updated mid-refresh, “tearing” will be visible; so it’s important to update the contents only between cycles. The system receives a signal from the display when it’s safe to update the contents. For historical reasons we’ll call this the VSYNC signal.

The refresh rate may vary over time, e.g. some mobile devices will range from 58 to 62fps depending on current conditions. For an HDMI-attached television, this could theoretically dip to 24 or 48Hz to match a video. Because we can update the screen only once per refresh cycle, submitting buffers for display at 200fps would be a waste of effort as most of the frames would never be seen. Instead of taking action whenever an app submits a buffer, SurfaceFlinger wakes up when the display is ready for something new.

When the VSYNC signal arrives, SurfaceFlinger walks through its list of layers looking for new buffers. If it finds a new one, it acquires it; if not, it continues to use the previously-acquired buffer. SurfaceFlinger always wants to have something to display, so it will hang on to one buffer. If no buffers have ever been submitted on a layer, the layer is ignored.

Once SurfaceFlinger has collected all of the buffers for visible layers, it asks the Hardware Composer how composition should be performed.

Hardware Composer

The Hardware Composer HAL (“HWC”) was first introduced in Android 3.0 (“Honeycomb”) and has evolved steadily over the years. Its primary purpose is to determine the most efficient way to composite buffers with the available hardware. As a HAL, its implementation is device-specific and usually implemented by the display hardware OEM.

The value of this approach is easy to recognize when you consider “overlay planes.” The purpose of overlay planes is to composite multiple buffers together, but in the display hardware rather than the GPU. For example, suppose you have a typical Android phone in portrait orientation, with the status bar on top and navigation bar at the bottom, and app content everywhere else. The contents for each layer are in separate buffers. You could handle composition by rendering the app content into a scratch buffer, then rendering the status bar over it, then rendering the navigation bar on top of that, and finally passing the scratch buffer to the display hardware. Or, you could pass all three buffers to the display hardware, and tell it to read data from different buffers for different parts of the screen. The latter approach can be significantly more efficient.

As you might expect, the capabilities of different display processors vary significantly. The number of overlays, whether layers can be rotated or blended, and restrictions on positioning and overlap can be difficult to express through an API. So, the HWC works like this:

  1. SurfaceFlinger provides the HWC with a full list of layers, and asks, "how do you want to handle this?"
  2. The HWC responds by marking each layer as "overlay" or "GLES composition."
  3. SurfaceFlinger takes care of any GLES composition, passing the output buffer to HWC, and lets HWC handle the rest.

Since the decision-making code can be custom tailored by the hardware vendor, it’s possible to get the best performance out of every device.

Overlay planes may be less efficient than GL composition when nothing on the screen is changing. This is particularly true when the overlay contents have transparent pixels, and overlapping layers are being blended together. In such cases, the HWC can choose to request GLES composition for some or all layers and retain the composited buffer. If SurfaceFlinger comes back again asking to composite the same set of buffers, the HWC can just continue to show the previously-composited scratch buffer. This can improve the battery life of an idle device.

Devices shipping with Android 4.4 (“KitKat”) typically support four overlay planes. Attempting to composite more layers than there are overlays will cause the system to use GLES composition for some of them; so the number of layers used by an application can have a measurable impact on power consumption and performance.

You can see exactly what SurfaceFlinger is up to with the command adb shell dumpsys SurfaceFlinger. The output is verbose. The part most relevant to our current discussion is the HWC summary that appears near the bottom of the output:

    type    |          source crop              |           frame           name
------------+-----------------------------------+--------------------------------
        HWC | [    0.0,    0.0,  320.0,  240.0] | [   48,  411, 1032, 1149] SurfaceView
        HWC | [    0.0,   75.0, 1080.0, 1776.0] | [    0,   75, 1080, 1776] com.android.grafika/com.android.grafika.PlayMovieSurfaceActivity
        HWC | [    0.0,    0.0, 1080.0,   75.0] | [    0,    0, 1080,   75] StatusBar
        HWC | [    0.0,    0.0, 1080.0,  144.0] | [    0, 1776, 1080, 1920] NavigationBar
  FB TARGET | [    0.0,    0.0, 1080.0, 1920.0] | [    0,    0, 1080, 1920] HWC_FRAMEBUFFER_TARGET
 
 

This tells you what layers are on screen, whether they’re being handled with overlays (“HWC”) or OpenGL ES composition (“GLES”), and gives you a bunch of other facts you probably won’t care about (“handle” and “hints” and “flags” and other stuff that we’ve trimmed out of the snippet above). The “source crop” and “frame” values will be examined more closely later on.

The FB_TARGET layer is where GLES composition output goes. Since all layers shown above are using overlays, FB_TARGET isn't being used for this frame. The layer's name is indicative of its original role: On a device with /dev/graphics/fb0 and no overlays, all composition would be done with GLES, and the output would be written to the framebuffer. On recent devices there generally is no simple framebuffer, so the FB_TARGET layer is a scratch buffer. (Note: This is why screen grabbers written for old versions of Android no longer work: They're trying to read from The Framebuffer, but there is no such thing.)

The overlay planes have another important role: they’re the only way to display DRM content. DRM-protected buffers cannot be accessed by SurfaceFlinger or the GLES driver, which means that your video will disappear if HWC switches to GLES composition.

The Need for Triple-Buffering

To avoid tearing on the display, the system needs to be double-buffered: the front buffer is displayed while the back buffer is being prepared. At VSYNC, if the back buffer is ready, you quickly switch them. This works reasonably well in a system where you’re drawing directly into the framebuffer, but there’s a hitch in the flow when a composition step is added. Because of the way SurfaceFlinger is triggered, our double-buffered pipeline will have a bubble.

Suppose frame N is being displayed, and frame N+1 has been acquired by SurfaceFlinger for display on the next VSYNC. (Assume frame N is composited with an overlay, so we can't alter the buffer contents until the display is done with it.) When VSYNC arrives, HWC flips the buffers. While the app is starting to render frame N+2 into the buffer that used to hold frame N, SurfaceFlinger is scanning the layer list, looking for updates. SurfaceFlinger won't find any new buffers, so it prepares to show frame N+1 again after the next VSYNC. A little while later, the app finishes rendering frame N+2 and queues it for SurfaceFlinger, but it's too late. This has effectively cut our maximum frame rate in half.

We can fix this with triple-buffering. Just before VSYNC, frame N is being displayed, frame N+1 has been composited (or scheduled for an overlay) and is ready to be displayed, and frame N+2 is queued up and ready to be acquired by SurfaceFlinger. When the screen flips, the buffers rotate through the stages with no bubble. The app has just less than a full VSYNC period (16.7ms at 60fps) to do its rendering and queue the buffer. And SurfaceFlinger / HWC has a full VSYNC period to figure out the composition before the next flip. The downside is that it takes at least two VSYNC periods for anything that the app does to appear on the screen. As the latency increases, the device feels less responsive to touch input.

SurfaceFlinger with BufferQueue

Figure 1. SurfaceFlinger + BufferQueue

The diagram above depicts the flow of SurfaceFlinger and BufferQueue. During frame:

1.red buffer fills up, then slides into BufferQueue
2.after red buffer leaves app, blue buffer slides in, replacing it
3.green buffer and systemUI shadow-slide into HWC (showing that SurfaceFlinger still has the buffers, but now HWC has prepared them for display via overlay on the next VSYNC)

The blue buffer is referenced by both the display and the BufferQueue. The app is not allowed to render to it until the associated sync fence signals.

On VSYNC, all of these happen at once:

1.Red buffer leaps into SurfaceFlinger, replacing green buffer
2.Green buffer leaps into Display, replacing blue buffer, and a dotted-line green twin appears in the BufferQueue
3.The blue buffer’s fence is signaled, and the blue buffer in App empties
4.Display rect changes from <blue buffer + SystemUI> to <green buffer + SystemUI>

The System UI process is providing the status and nav bars, which for our purposes here aren’t changing, so SurfaceFlinger keeps using the previously-acquired buffer. In practice there would be two separate buffers, one for the status bar at the top, one for the navigation bar at the bottom, and they would be sized to fit their contents. Each would arrive on its own BufferQueue.

The buffer doesn’t actually “empty”; if you submit it without drawing on it you’ll get that same blue again. The emptying is the result of clearing the buffer contents, which the app should do before it starts drawing.

We can reduce the latency by noting layer composition should not require a full VSYNC period. If composition is performed by overlays, it takes essentially zero CPU and GPU time. But we can’t count on that, so we need to allow a little time. If the app starts rendering halfway between VSYNC signals, and SurfaceFlinger defers the HWC setup until a few milliseconds before the signal is due to arrive, we can cut the latency from 2 frames to perhaps 1.5. In theory you could render and composite in a single period, allowing a return to double-buffering; but getting it down that far is difficult on current devices. Minor fluctuations in rendering and composition time, and switching from overlays to GLES composition, can cause us to miss a swap deadline and repeat the previous frame.

(Translator's notes: the deferred HWC setup corresponds to SurfaceFlinger's setUpHWComposer() pass, which hands the prepared layer list to the HWC so it can decide who composites what. And in theory, if the app dequeued a buffer right after VSYNC, finished rendering within that same period, and composition completed instantly, double-buffering really would suffice; that is simply asking too much of current devices.)

SurfaceFlinger’s buffer handling demonstrates the fence-based buffer management mentioned earlier. If we’re animating at full speed, we need to have an acquired buffer for the display (“front”) and an acquired buffer for the next flip (“back”). If we’re showing the buffer on an overlay, the contents are being accessed directly by the display and must not be touched. But if you look at an active layer’s BufferQueue state in the dumpsys SurfaceFlinger output, you’ll see one acquired buffer, one queued buffer, and one free buffer. That’s because, when SurfaceFlinger acquires the new “back” buffer, it releases the current “front” buffer to the queue. The “front” buffer is still in use by the display, so anything that dequeues it must wait for the fence to signal before drawing on it. So long as everybody follows the fencing rules, all of the queue-management IPC requests can happen in parallel with the display.

Virtual Displays

SurfaceFlinger supports a “primary” display, i.e. what’s built into your phone or tablet, and an “external” display, such as a television connected through HDMI. It also supports a number of “virtual” displays, which make composited output available within the system. Virtual displays can be used to record the screen or send it over a network.

Virtual displays may share the same set of layers as the main display (the “layer stack”) or have its own set. There is no VSYNC for a virtual display, so the VSYNC for the primary display is used to trigger composition for all displays.

In the past, virtual displays were always composited with GLES. The Hardware Composer managed composition for only the primary display. In Android 4.4, the Hardware Composer gained the ability to participate in virtual display composition.

As you might expect, the frames generated for a virtual display are written to a BufferQueue.

Surface and SurfaceHolder


The Surface class has been part of the public API since 1.0. Its description simply says, “Handle onto a raw buffer that is being managed by the screen compositor.” The statement was accurate when initially written but falls well short of the mark on a modern system.

The Surface represents the producer side of a buffer queue that is often (but not always!) consumed by SurfaceFlinger. When you render onto a Surface, the result ends up in a buffer that gets shipped to the consumer. A Surface is not simply a raw chunk of memory you can scribble on.

The BufferQueue for a display Surface is typically configured for triple-buffering; but buffers are allocated on demand. So if the producer generates buffers slowly enough — maybe it’s animating at 30fps on a 60fps display — there might only be two allocated buffers in the queue. This helps minimize memory consumption. You can see a summary of the buffers associated with every layer in the dumpsys SurfaceFlinger output.

Canvas Rendering

Once upon a time, all rendering was done in software, and you can still do this today. The low-level implementation is provided by the Skia graphics library. If you want to draw a rectangle, you make a library call, and it sets bytes in a buffer appropriately. To ensure that a buffer isn’t updated by two clients at once, or written to while being displayed, you have to lock the buffer to access it. lockCanvas() locks the buffer and returns a Canvas to use for drawing, and unlockCanvasAndPost() unlocks the buffer and sends it to the compositor.

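A minimal sketch of that software path, assuming holder came from a SurfaceView via getHolder():

    import android.graphics.Canvas;
    import android.graphics.Color;
    import android.graphics.Paint;
    import android.view.SurfaceHolder;

    // Draw one frame in software; lockCanvas() blocks until a buffer is free.
    void drawFrame(SurfaceHolder holder) {
        Canvas canvas = holder.lockCanvas();   // dequeues and locks a buffer
        if (canvas == null) return;            // surface not ready yet
        try {
            canvas.drawColor(Color.BLACK);     // clear previous contents first
            Paint paint = new Paint();
            paint.setColor(Color.RED);
            canvas.drawRect(100, 100, 300, 200, paint);
        } finally {
            holder.unlockCanvasAndPost(canvas); // queues the buffer for composition
        }
    }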

As time went on, and devices with general-purpose 3D engines appeared, Android reoriented itself around OpenGL ES. However, it was important to keep the old API working, for apps as well as app framework code, so an effort was made to hardware-accelerate the Canvas API. As you can see from the charts on the Hardware Acceleration page, this was a bit of a bumpy ride. Note in particular that while the Canvas provided to a View’s onDraw() method may be hardware-accelerated, the Canvas obtained when an app locks a Surface directly with lockCanvas() never is.

When you lock a Surface for Canvas access, the “CPU renderer” connects to the producer side of the BufferQueue and does not disconnect until the Surface is destroyed. Most other producers (like GLES) can be disconnected and reconnected to a Surface, but the Canvas-based “CPU renderer” cannot. This means you can’t draw on a surface with GLES or send it frames from a video decoder if you’ve ever locked it for a Canvas.

The first time the producer requests a buffer from a BufferQueue, it is allocated and initialized to zeroes. Initialization is necessary to avoid inadvertently sharing data between processes. When you re-use a buffer, however, the previous contents will still be present. If you repeatedly call lockCanvas() and unlockCanvasAndPost() without drawing anything, you’ll cycle between previously-rendered frames.

The Surface lock/unlock code keeps a reference to the previously-rendered buffer. If you specify a dirty region when locking the Surface, it will copy the non-dirty pixels from the previous buffer. There’s a fair chance the buffer will be handled by SurfaceFlinger or HWC; but since we need to only read from it, there’s no need to wait for exclusive access.

The main non-Canvas way for an application to draw directly on a Surface is through OpenGL ES. That’s described in the EGLSurface and OpenGL ES section.

SurfaceHolder

Some things that work with Surfaces want a SurfaceHolder, notably SurfaceView. The original idea was that Surface represented the raw compositor-managed buffer, while SurfaceHolder was managed by the app and kept track of higher-level information like the dimensions and format. The Java-language definition mirrors the underlying native implementation. It’s arguably no longer useful to split it this way, but it has long been part of the public API.

Generally speaking, anything having to do with a View will involve a SurfaceHolder. Some other APIs, such as MediaCodec, will operate on the Surface itself. You can easily get the Surface from the SurfaceHolder, so hang on to the latter when you have it.

APIs to get and set Surface parameters, such as the size and format, are implemented through SurfaceHolder.

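The usual pattern is to implement SurfaceHolder.Callback and grab the Surface only once it exists; a minimal sketch (the class name is made up for illustration):

    import android.view.Surface;
    import android.view.SurfaceHolder;
    import android.view.SurfaceView;

    class MySurfaceClient implements SurfaceHolder.Callback {
        MySurfaceClient(SurfaceView view) {
            view.getHolder().addCallback(this); // the surface is created asynchronously
        }
        @Override public void surfaceCreated(SurfaceHolder holder) {
            Surface surface = holder.getSurface(); // safe to use from here on
            // hand it to GLES, a MediaCodec decoder, the camera, etc.
        }
        @Override public void surfaceChanged(SurfaceHolder holder, int format, int w, int h) { }
        @Override public void surfaceDestroyed(SurfaceHolder holder) {
            // stop rendering before returning; the buffers go away after this
        }
    }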

EGLSurface and OpenGL ES


OpenGL ES defines an API for rendering graphics. It does not define a windowing system. To allow GLES to work on a variety of platforms, it is designed to be combined with a library that knows how to create and access windows through the operating system. The library used for Android is called EGL. If you want to draw textured polygons, you use GLES calls; if you want to put your rendering on the screen, you use EGL calls.

Before you can do anything with GLES, you need to create a GL context. In EGL, this means creating an EGLContext and an EGLSurface. GLES operations apply to the current context, which is accessed through thread-local storage rather than passed around as an argument. This means you have to be careful about which thread your rendering code executes on, and which context is current on that thread.

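With the Java-side EGL14 bindings, the setup just described looks roughly like the sketch below (error handling omitted; the helper name is made up, and the window object could equally be a SurfaceTexture or SurfaceHolder):

    import android.opengl.EGL14;
    import android.opengl.EGLConfig;
    import android.opengl.EGLContext;
    import android.opengl.EGLDisplay;
    import android.opengl.EGLSurface;
    import android.view.Surface;

    EGLSurface createGlOnSurface(Surface surface) {
        EGLDisplay display = EGL14.eglGetDisplay(EGL14.EGL_DEFAULT_DISPLAY);
        int[] version = new int[2];
        EGL14.eglInitialize(display, version, 0, version, 1);

        int[] attribs = {
                EGL14.EGL_RENDERABLE_TYPE, EGL14.EGL_OPENGL_ES2_BIT,
                EGL14.EGL_RED_SIZE, 8, EGL14.EGL_GREEN_SIZE, 8, EGL14.EGL_BLUE_SIZE, 8,
                EGL14.EGL_NONE };
        EGLConfig[] configs = new EGLConfig[1];
        int[] num = new int[1];
        EGL14.eglChooseConfig(display, attribs, 0, configs, 0, 1, num, 0);

        int[] ctxAttribs = { EGL14.EGL_CONTEXT_CLIENT_VERSION, 2, EGL14.EGL_NONE };
        EGLContext context = EGL14.eglCreateContext(display, configs[0],
                EGL14.EGL_NO_CONTEXT, ctxAttribs, 0);

        // Connects this EGLSurface as the producer of the window object's BufferQueue.
        EGLSurface eglSurface = EGL14.eglCreateWindowSurface(display, configs[0],
                surface, new int[] { EGL14.EGL_NONE }, 0);

        // "Current" is per-thread state: this thread now renders to eglSurface.
        // A real app would keep display and context around, not just the surface.
        EGL14.eglMakeCurrent(display, eglSurface, eglSurface, context);

        // After issuing GLES draw calls, EGL14.eglSwapBuffers(display, eglSurface)
        // queues the rendered buffer to the consumer.
        return eglSurface;
    }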

The EGLSurface can be an off-screen buffer allocated by EGL (called a “pbuffer”) or a window allocated by the operating system. EGL window surfaces are created with the eglCreateWindowSurface() call. It takes a “window object” as an argument, which on Android can be a SurfaceView, a SurfaceTexture, a SurfaceHolder, or a Surface — all of which have a BufferQueue underneath. When you make this call, EGL creates a new EGLSurface object, and connects it to the producer interface of the window object’s BufferQueue. From that point onward, rendering to that EGLSurface results in a buffer being dequeued, rendered into, and queued for use by the consumer. (The term “window” is indicative of the expected use, but bear in mind the output might not be destined to appear on the display.)

EGL does not provide lock/unlock calls. Instead, you issue drawing commands and then call eglSwapBuffers()to submit the current frame. The method name comes from the traditional swap of front and back buffers, but the actual implementation may be very different.

Only one EGLSurface can be associated with a Surface at a time — you can have only one producer connected to a BufferQueue — but if you destroy the EGLSurface it will disconnect from the BufferQueue and allow something else to connect.

A given thread can switch between multiple EGLSurfaces by changing what’s “current.” An EGLSurface must be current on only one thread at a time.

The most common mistake when thinking about EGLSurface is assuming that it is just another aspect of Surface (like SurfaceHolder). It’s a related but independent concept. You can draw on an EGLSurface that isn’t backed by a Surface, and you can use a Surface without EGL. EGLSurface just gives GLES a place to draw.

ANativeWindow

The public Surface class is implemented in the Java programming language. The equivalent in C/C++ is the ANativeWindow class, semi-exposed by the Android NDK. You can get the ANativeWindow from a Surface with the ANativeWindow_fromSurface() call. Just like its Java-language cousin, you can lock it, render in software, and unlock-and-post.

To create an EGL window surface from native code, you pass an instance of EGLNativeWindowType to eglCreateWindowSurface(). EGLNativeWindowType is just a synonym for ANativeWindow, so you can freely cast one to the other.

The fact that the basic “native window” type just wraps the producer side of a BufferQueue should not come as a surprise.

SurfaceView and GLSurfaceView


Now that we’ve explored the lower-level components, it’s time to see how they fit into the higher-level components that apps are built from.

The Android app framework UI is based on a hierarchy of objects that start with View. Most of the details don’t matter for this discussion, but it’s helpful to understand that UI elements go through a complicated measurement and layout process that fits them into a rectangular area. All visible View objects are rendered to a SurfaceFlinger-created Surface that was set up by the WindowManager when the app was brought to the foreground. The layout and rendering is performed on the app’s UI thread.

Regardless of how many Layouts and Views you have, everything gets rendered into a single buffer. This is true whether or not the Views are hardware-accelerated.

A SurfaceView takes the same sorts of parameters as other views, so you can give it a position and size, and fit other elements around it. When it comes time to render, however, the contents are completely transparent. The View part of a SurfaceView is just a see-through placeholder.

When the SurfaceView’s View component is about to become visible, the framework asks the WindowManager to ask SurfaceFlinger to create a new Surface. (This doesn’t happen synchronously, which is why you should provide a callback that notifies you when the Surface creation finishes.) By default, the new Surface is placed behind the app UI Surface, but the default “Z-ordering” can be overridden to put the Surface on top.

Whatever you render onto this Surface will be composited by SurfaceFlinger, not by the app. This is the real power of SurfaceView: the Surface you get can be rendered by a separate thread or a separate process, isolated from any rendering performed by the app UI, and the buffers go directly to SurfaceFlinger. You can’t totally ignore the UI thread — you still have to coordinate with the Activity lifecycle, and you may need to adjust something if the size or position of the View changes — but you have a whole Surface all to yourself, and blending with the app UI and other layers is handled by the Hardware Composer.

It’s worth taking a moment to note that this new Surface is the producer side of a BufferQueue whose consumer is a SurfaceFlinger layer. You can update the Surface with any mechanism that can feed a BufferQueue. You can: use the Surface-supplied Canvas functions, attach an EGLSurface and draw on it with GLES, and configure a MediaCodec video decoder to write to it.

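As a sketch of the last option, a decoder can be aimed straight at the Surface; the 320x240 format below is a stand-in for values that would really come from MediaExtractor:

    import java.io.IOException;
    import android.media.MediaCodec;
    import android.media.MediaFormat;
    import android.view.Surface;

    // Decode directly into the SurfaceView's Surface; SurfaceFlinger consumes the frames.
    MediaCodec startDecoder(Surface surface) throws IOException {
        MediaFormat format = MediaFormat.createVideoFormat("video/avc", 320, 240);
        MediaCodec codec = MediaCodec.createDecoderByType("video/avc");
        codec.configure(format, surface, null, 0); // surface = output target
        codec.start();
        return codec; // feed input from MediaExtractor; releaseOutputBuffer(i, true) renders
    }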

Composition and the Hardware Scaler

Now that we have a bit more context, it's useful to go back and look at a couple of fields from dumpsys SurfaceFlinger that we skipped over earlier on. Back in the Hardware Composer discussion, we looked at some output like this:

    type    |          source crop              |           frame           name
------------+-----------------------------------+--------------------------------
        HWC | [    0.0,    0.0,  320.0,  240.0] | [   48,  411, 1032, 1149] SurfaceView
        HWC | [    0.0,   75.0, 1080.0, 1776.0] | [    0,   75, 1080, 1776] com.android.grafika/com.android.grafika.PlayMovieSurfaceActivity
        HWC | [    0.0,    0.0, 1080.0,   75.0] | [    0,    0, 1080,   75] StatusBar
        HWC | [    0.0,    0.0, 1080.0,  144.0] | [    0, 1776, 1080, 1920] NavigationBar
  FB TARGET | [    0.0,    0.0, 1080.0, 1920.0] | [    0,    0, 1080, 1920] HWC_FRAMEBUFFER_TARGET
 
 

This was taken while playing a movie in Grafika’s “Play video (SurfaceView)” activity, on a Nexus 5 in portrait orientation. Note that the list is ordered from back to front: the SurfaceView’s Surface is in the back, the app UI layer sits on top of that, followed by the status and navigation bars that are above everything else. The video is QVGA (320x240).

The “source crop” indicates the portion of the Surface’s buffer that SurfaceFlinger is going to display. The app UI was given a Surface equal to the full size of the display (1080x1920), but there’s no point rendering and compositing pixels that will be obscured by the status and navigation bars, so the source is cropped to a rectangle that starts 75 pixels from the top, and ends 144 pixels from the bottom. The status and navigation bars have smaller Surfaces, and the source crop describes a rectangle that begins at the top left (0,0) and spans their content.

The “frame” is the rectangle where the pixels end up on the display. For the app UI layer, the frame matches the source crop, because we’re copying (or overlaying) a portion of a display-sized layer to the same location in another display-sized layer. For the status and navigation bars, the size of the frame rectangle is the same, but the position is adjusted so that the navigation bar appears at the bottom of the screen.

Now consider the layer labeled “SurfaceView”, which holds our video content. The source crop matches the video size, which SurfaceFlinger knows because the MediaCodec decoder (the buffer producer) is dequeuing buffers that size. The frame rectangle has a completely different size — 984x738.

SurfaceFlinger handles size differences by scaling the buffer contents to fill the frame rectangle, upscaling or downscaling as needed. This particular size was chosen because it has the same aspect ratio as the video (4:3), and is as wide as possible given the constraints of the View layout (which includes some padding at the edges of the screen for aesthetic reasons).

If you started playing a different video on the same Surface, the underlying BufferQueue would reallocate buffers to the new size automatically, and SurfaceFlinger would adjust the source crop. If the aspect ratio of the new video is different, the app would need to force a re-layout of the View to match it, which causes the WindowManager to tell SurfaceFlinger to update the frame rectangle.

If you’re rendering on the Surface through some other means, perhaps GLES, you can set the Surface size using the SurfaceHolder#setFixedSize() call. You could, for example, configure a game to always render at 1280x720, which would significantly reduce the number of pixels that must be touched to fill the screen on a 2560x1440 tablet or 4K television. The display processor handles the scaling. If you don’t want to letter- or pillar-box your game, you could adjust the game’s aspect ratio by setting the size so that the narrow dimension is 720 pixels, but the long dimension is set to maintain the aspect ratio of the physical display (e.g. 1152x720 to match a 2560x1600 display). You can see an example of this approach in Grafika’s “Hardware scaler exerciser” activity.

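A sketch of that trick, assuming surfaceView is the game's SurfaceView:

    import android.view.SurfaceHolder;
    import android.view.SurfaceView;

    // Render into fixed 1280x720 buffers regardless of the panel resolution;
    // the display processor scales the result up to the frame rectangle.
    void useFixedSizeBuffers(SurfaceView surfaceView) {
        SurfaceHolder holder = surfaceView.getHolder();
        holder.setFixedSize(1280, 720);
        // e.g. setFixedSize(1152, 720) to match a 2560x1600 panel without letterboxing
    }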

GLSurfaceView

The GLSurfaceView class provides some helper classes that help manage EGL contexts, inter-thread communication, and interaction with the Activity lifecycle. That’s it. You do not need to use a GLSurfaceView to use GLES.

For example, GLSurfaceView creates a thread for rendering and configures an EGL context there. The state is cleaned up automatically when the activity pauses. Most apps won’t need to know anything about EGL to use GLES with GLSurfaceView.

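A minimal sketch (the renderer class is made up for illustration):

    import android.opengl.GLSurfaceView;
    import javax.microedition.khronos.egl.EGLConfig;
    import javax.microedition.khronos.opengles.GL10;

    class ClearRenderer implements GLSurfaceView.Renderer {
        @Override public void onSurfaceCreated(GL10 gl, EGLConfig config) { }
        @Override public void onSurfaceChanged(GL10 gl, int w, int h) {
            gl.glViewport(0, 0, w, h);
        }
        @Override public void onDrawFrame(GL10 gl) {
            gl.glClearColor(0f, 0f, 0.3f, 1f);    // runs on GLSurfaceView's GL thread
            gl.glClear(GL10.GL_COLOR_BUFFER_BIT); // eglSwapBuffers() happens after this
        }
    }

    // In an Activity: the view owns the render thread and the EGL context.
    //   GLSurfaceView view = new GLSurfaceView(this);
    //   view.setRenderer(new ClearRenderer());
    //   setContentView(view);
    // Remember to forward onPause()/onResume() to view.onPause()/view.onResume().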

In most cases, GLSurfaceView is very helpful and can make working with GLES easier. In some situations, it can get in the way. Use it if it helps, don’t if it doesn’t.

SurfaceTexture


The SurfaceTexture class is a relative newcomer, added in Android 3.0 (“Honeycomb”). Just as SurfaceView is the combination of a Surface and a View, SurfaceTexture is the combination of a Surface and a GLES texture. Sort of.

When you create a SurfaceTexture, you are creating a BufferQueue for which your app is the consumer. When a new buffer is queued by the producer, your app is notified via callback (onFrameAvailable()). Your app calls updateTexImage(), which releases the previously-held buffer, acquires the new buffer from the queue, and makes some EGL calls to make the buffer available to GLES as an “external” texture.

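A sketch of the consumer side, assuming texId is a GL_TEXTURE_EXTERNAL_OES texture name already created on the GLES thread:

    import android.graphics.SurfaceTexture;

    // The app as BufferQueue consumer: get notified, then latch the newest buffer.
    SurfaceTexture createConsumer(int texId) {
        final SurfaceTexture st = new SurfaceTexture(texId);
        st.setOnFrameAvailableListener(new SurfaceTexture.OnFrameAvailableListener() {
            @Override public void onFrameAvailable(SurfaceTexture ignored) {
                // May fire on an arbitrary thread: just signal the GLES thread here.
            }
        });
        return st;
    }

    // Later, on the thread that owns the GLES context:
    //   st.updateTexImage();  // releases the old buffer, acquires the new one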

External textures (GL_TEXTURE_EXTERNAL_OES) are not quite the same as textures created by GLES (GL_TEXTURE_2D). You have to configure your renderer a bit differently, and there are things you can’t do with them. But the key point is this: You can render textured polygons directly from the data received by your BufferQueue.

You may be wondering how we can guarantee the format of the data in the buffer is something GLES can recognize — gralloc supports a wide variety of formats. When SurfaceTexture created the BufferQueue, it set the consumer’s usage flags to GRALLOC_USAGE_HW_TEXTURE, ensuring that any buffer created by gralloc would be usable by GLES.

Because SurfaceTexture interacts with an EGL context, you have to be careful to call its methods from the correct thread. This is spelled out in the class documentation.

(From the class documentation: "SurfaceTexture objects may be created on any thread. updateTexImage() may only be called on the thread with the OpenGL ES context that contains the texture object. The frame-available callback is called on an arbitrary thread, so unless special care is taken updateTexImage() should not be called directly from the callback.")

If you look deeper into the class documentation, you will see a couple of odd calls. One retrieves a timestamp, the other a transformation matrix, the value of each having been set by the previous call to updateTexImage(). It turns out that BufferQueue passes more than just a buffer handle to the consumer. Each buffer is accompanied by a timestamp and transformation parameters.

The transformation is provided for efficiency. In some cases, the source data might be in the “wrong” orientation for the consumer; but instead of rotating the data before sending it, we can send the data in its current orientation with a transform that corrects it. The transformation matrix can be merged with other transformations at the point the data is used, minimizing overhead.

The timestamp is useful for certain buffer sources. For example, suppose you connect the producer interface to the output of the camera (with setPreviewTexture()). If you want to create a video, you need to set the presentation time stamp for each frame; but you want to base that on the time when the frame was captured, not the time when the buffer was received by your app. The timestamp provided with the buffer is set by the camera code, resulting in a more consistent series of timestamps.

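Both values are read back right after latching a frame on the GLES thread; a small sketch:

    import android.graphics.SurfaceTexture;

    void latchFrame(SurfaceTexture st, float[] texMatrix /* float[16] */) {
        st.updateTexImage();                  // acquire the newest queued buffer
        st.getTransformMatrix(texMatrix);     // fold this into the texture coordinates
        long timestampNs = st.getTimestamp(); // e.g. camera capture time, in nanoseconds
        // use timestampNs as the presentation time when muxing frames into a video
    }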

SurfaceTexture and Surface

If you look closely at the API you’ll see the only way for an application to create a plain Surface is through a constructor that takes a SurfaceTexture as the sole argument. (Prior to API 11, there was no public constructor for Surface at all.) This might seem a bit backward if you view SurfaceTexture as a combination of a Surface and a texture.

Under the hood, SurfaceTexture is called GLConsumer, which more accurately reflects its role as the owner and consumer of a BufferQueue. When you create a Surface from a SurfaceTexture, what you’re doing is creating an object that represents the producer side of the SurfaceTexture’s BufferQueue.

Figure 2.Grafika’s continuous capture activity

In the diagram above, the arrows show the propagation of the data from the camera. BufferQueues are in color (purple producer, cyan consumer). Note “Camera” actually lives in the mediaserver process.

Encoded H.264 video goes to a circular buffer in RAM in the app process, and is written to an MP4 file on disk using the MediaMuxer class when the “capture” button is hit.

All three of the BufferQueues are handled with a single EGL context in the app, and the GLES operations are performed on the UI thread. Doing the SurfaceView rendering on the UI thread is generally discouraged, but since we’re doing simple operations that are handled asynchronously by the GLES driver we should be fine. (If the video encoder locks up and we block trying to dequeue a buffer, the app will become unresponsive. But at that point, we’re probably failing anyway.) The handling of the encoded data — managing the circular buffer and writing it to disk — is performed on a separate thread.

The bulk of the configuration happens in the SurfaceView’s surfaceCreated() callback. The EGLContext is created, and EGLSurfaces are created for the display and for the video encoder. When a new frame arrives, we tell SurfaceTexture to acquire it and make it available as a GLES texture, then render it with GLES commands on each EGLSurface (forwarding the transform and timestamp from SurfaceTexture). The encoder thread pulls the encoded output from MediaCodec and stashes it in memory.

TextureView


The TextureView class was introduced in Android 4.0 (“Ice Cream Sandwich”). It’s the most complex of the View objects discussed here, combining a View with a SurfaceTexture.

Recall that the SurfaceTexture is a “GL consumer”, consuming buffers of graphics data and making them available as textures. TextureView wraps a SurfaceTexture, taking over the responsibility of responding to the callbacks and acquiring new buffers. The arrival of new buffers causes TextureView to issue a View invalidate request. When asked to draw, the TextureView uses the contents of the most recently received buffer as its data source, rendering wherever and however the View state indicates it should.

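A minimal sketch of hooking one up (the listener class is made up; the producer could be a camera or a decoder):

    import android.graphics.SurfaceTexture;
    import android.view.Surface;
    import android.view.TextureView;

    class PreviewClient implements TextureView.SurfaceTextureListener {
        @Override public void onSurfaceTextureAvailable(SurfaceTexture st, int w, int h) {
            Surface producer = new Surface(st); // hand this to a camera or decoder
        }
        @Override public void onSurfaceTextureSizeChanged(SurfaceTexture st, int w, int h) { }
        @Override public boolean onSurfaceTextureDestroyed(SurfaceTexture st) {
            return true; // true = TextureView releases the SurfaceTexture for us
        }
        @Override public void onSurfaceTextureUpdated(SurfaceTexture st) {
            // a new frame was latched; TextureView has already invalidated itself
        }
    }
    // usage: textureView.setSurfaceTextureListener(new PreviewClient());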

You can render on a TextureView with GLES just as you would SurfaceView. Just pass the SurfaceTexture to the EGL window creation call. However, doing so exposes a potential problem.

In most of what we’ve looked at, the BufferQueues have passed buffers between different processes. When rendering to a TextureView with GLES, both producer and consumer are in the same process, and they might even be handled on a single thread. Suppose we submit several buffers in quick succession from the UI thread. The EGL buffer swap call will need to dequeue a buffer from the BufferQueue, and it will stall until one is available. There won’t be any available until the consumer acquires one for rendering, but that also happens on the UI thread… so we’re stuck.

The solution is to have BufferQueue ensure there is always a buffer available to be dequeued, so the buffer swap never stalls. One way to guarantee this is to have BufferQueue discard the contents of the previously-queued buffer when a new buffer is queued, and to place restrictions on minimum buffer counts and maximum acquired buffer counts. (If your queue has three buffers, and all three buffers are acquired by the consumer, then there’s nothing to dequeue and the buffer swap call must hang or fail. So we need to prevent the consumer from acquiring more than two buffers at once.) Dropping buffers is usually undesirable, so it’s only enabled in specific situations, such as when the producer and consumer are in the same process.
