GEM v. TTM

By Jonathan Corbet
May 28, 2008
Getting high-performance, three-dimensional graphics working under Linux is quite a challenge even when the fundamental hardware programming information is available. One component of this problem is memory management: a graphics processor (GPU) is, essentially, a computer of its own with a distinct view of memory. Managing the GPU's memory - and its view of system RAM - must be done carefully if the resulting system is intended to work at all, much less with acceptable performance.

Not that long ago, it appeared that this problem had been solved with the translation table maps (TTM) subsystem. TTM remains outside of the mainline kernel, though, as do all drivers which use it. A recent query about what would be required to get TTM merged led to an interesting discussion where it turned out that, in fact, TTM may not be the future of graphics memory management after all.


A number of complaints about TTM have been raised. Its API is far larger than is needed for any free Linux driver; it has, in other words, a certain amount of code dedicated to the needs of binary-only drivers. The fencing mechanism (which manages concurrency between the host CPUs and the GPU) is seen as being complex, difficult to work with, and not always yielding the best performance. Heavy use of memory-mapped buffers can create performance problems of its own. The TTM API is an exercise in trying to provide for everything in all situations; as a result it is, according to some driver developers, hard to match to any specific hardware, hard to get started with, and still insufficiently flexible. And, importantly, there is a distinct shortage of working free drivers which use TTM. So Dave Airlie worries:

I was hoping that by now, one of the radeon or nouveau drivers would have adopted TTM, or at least demoed something working using it, this hasn't happened which worries me... The real question is whether TTM suits the driver writers for use in Linux desktop and embedded environments, and I think so far I'm not seeing enough positive feedback from the desktop side

All of these worries would seem to be moot, since TTM is available and there is nothing else out there. Except, as it turns out, there is something out there: it's called the Graphics Execution Manager, or GEM. The Intel-sponsored GEM project is all of one month old, as of this writing. The GEM developers had not really intended to announce their work quite yet, but the TTM discussion brought the issue to the fore.


Keith Packard's introduction to GEM includes a document describing the API as it exists so far. There are a number of significant differences in how GEM does things. To begin with, GEM allocates graphical buffer objects using normal, anonymous, user-space memory. That means that these buffers can be forced out to swap when memory gets tight. There are clear advantages to this approach, and not just in memory flexibility: it also makes the implementation of suspend and resume easier by automatically providing backing store for all buffer objects.
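
For the curious, the sketch below shows roughly how a buffer object might be tied to such anonymous, swappable storage inside the kernel. The gem_object structure and gem_object_create() helper are purely illustrative (they are not the actual GEM code), but shmem_file_setup() is the real in-kernel interface for creating anonymous, swap-backed files.

    #include <linux/err.h>
    #include <linux/fs.h>
    #include <linux/kref.h>
    #include <linux/shmem_fs.h>
    #include <linux/slab.h>

    /* Illustrative buffer object - not the real GEM structure. */
    struct gem_object {
        struct file *filp;     /* shmem file providing swappable backing store */
        size_t size;
        struct kref refcount;
    };

    static struct gem_object *gem_object_create(size_t size)
    {
        struct gem_object *obj;

        obj = kzalloc(sizeof(*obj), GFP_KERNEL);
        if (!obj)
            return ERR_PTR(-ENOMEM);

        /*
         * Back the object with an anonymous shmem file; its pages live in
         * the page cache and can be pushed out to swap when memory gets
         * tight, just like ordinary anonymous user-space memory.
         */
        obj->filp = shmem_file_setup("gem object", size, 0);
        if (IS_ERR(obj->filp)) {
            struct gem_object *ret = ERR_CAST(obj->filp);

            kfree(obj);
            return ret;
        }

        obj->size = size;
        kref_init(&obj->refcount);
        return obj;
    }

Since the backing store is an ordinary (if unlinked) file, its pages are managed by the normal page cache and swap machinery, which is what makes the suspend and resume story simpler.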


The GEM API tries to do away with the mapping of buffers into user space. That mapping is expensive to do and brings all sorts of interesting issues with cache coherency between the CPU and GPU. So, instead, buffer objects are accessed with simple read() and write() calls. Or, at least, that's the way it would be if the GEM developers could attach a file descriptor to each buffer object. The kernel, however, does not make the management of that many file descriptors easy (yet), so the real API uses separate handles for buffer objects and a series of ioctl() calls.
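
A user-space sketch of what that handle-plus-ioctl() style of access might look like appears below. The structure layouts and ioctl numbers are made up for illustration (loosely following the description above) and should not be read as the definitive GEM interface.

    #include <stdint.h>
    #include <string.h>
    #include <sys/ioctl.h>

    /* Illustrative request structures; the real GEM API may differ. */
    struct gem_create {
        uint64_t size;       /* in:  requested object size, in bytes */
        uint32_t handle;     /* out: handle naming the object on this fd */
        uint32_t pad;
    };

    struct gem_pwrite {
        uint32_t handle;     /* object to write into */
        uint32_t pad;
        uint64_t offset;     /* byte offset within the object */
        uint64_t size;       /* number of bytes to copy */
        uint64_t data_ptr;   /* user pointer to the source data */
    };

    /* Hypothetical ioctl numbers in the driver-specific range. */
    #define GEM_IOCTL_CREATE  _IOWR('d', 0x40, struct gem_create)
    #define GEM_IOCTL_PWRITE  _IOW('d', 0x41, struct gem_pwrite)

    /* Upload data into a buffer object without ever mapping it. */
    static int gem_upload(int drm_fd, const void *data, size_t bytes,
                          uint32_t *handle_out)
    {
        struct gem_create create = { .size = bytes };
        struct gem_pwrite pwrite;

        if (ioctl(drm_fd, GEM_IOCTL_CREATE, &create) < 0)
            return -1;

        memset(&pwrite, 0, sizeof(pwrite));
        pwrite.handle = create.handle;
        pwrite.offset = 0;
        pwrite.size = bytes;
        pwrite.data_ptr = (uint64_t)(uintptr_t)data;

        if (ioctl(drm_fd, GEM_IOCTL_PWRITE, &pwrite) < 0)
            return -1;

        *handle_out = create.handle;
        return 0;
    }

The idea is that the copy happens in the kernel, which knows where the object's pages currently live and can do any needed cache maintenance itself.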


That said, it is possible to map a buffer object into user space. But then the user-space driver must take explicit responsibility for the management of cache coherency. To that end there is a set of ioctl() calls for managing the "domain" of a buffer; the domain, essentially, describes which component of the system owns the buffer and is entitled to operate on it. Changing the domains (there are two, one for read access and one for writes) of a buffer will perform the necessary cache flushes. In a sense, this mechanism resembles the streaming DMA API, where the ownership of DMA buffers can be switched between the CPU and the peripheral controller. That is not entirely surprising, as a very similar problem is being solved.
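
Concretely, a driver which has mapped an object would bracket its CPU accesses with domain changes, much as users of the streaming DMA API call dma_sync_single_for_cpu() and dma_sync_single_for_device() around CPU accesses to a DMA buffer. The sketch below continues with the hypothetical ioctl names used above; the domain bits and structure layout are illustrative only.

    #include <stdint.h>
    #include <string.h>
    #include <sys/ioctl.h>

    /* Illustrative domain bits and set-domain request - not the real API. */
    #define GEM_DOMAIN_CPU  0x1
    #define GEM_DOMAIN_GPU  0x2

    struct gem_set_domain {
        uint32_t handle;        /* object whose ownership is changing */
        uint32_t read_domains;  /* who may read the object next */
        uint32_t write_domain;  /* who may write it (at most one) */
    };

    #define GEM_IOCTL_SET_DOMAIN  _IOW('d', 0x42, struct gem_set_domain)

    /* Fill a mapped object from the CPU, then hand it back to the GPU. */
    static int cpu_fill_then_release(int drm_fd, uint32_t handle,
                                     void *map, size_t size)
    {
        struct gem_set_domain sd = {
            .handle = handle,
            .read_domains = GEM_DOMAIN_CPU,
            .write_domain = GEM_DOMAIN_CPU,
        };

        /* Pull the object into the CPU domain before touching the mapping;
           the kernel performs whatever cache flushing that requires. */
        if (ioctl(drm_fd, GEM_IOCTL_SET_DOMAIN, &sd) < 0)
            return -1;

        memset(map, 0, size);   /* CPU access is now coherent */

        /* Hand ownership back to the GPU. */
        sd.read_domains = GEM_DOMAIN_GPU;
        sd.write_domain = 0;
        if (ioctl(drm_fd, GEM_IOCTL_SET_DOMAIN, &sd) < 0)
            return -1;

        return 0;
    }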


This API also does away with the need for explicit fence operations. Instead, a CPU operation which requires access to a buffer will simply wait, if necessary, for the GPU to finish any outstanding operations involving that buffer.
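
On the kernel side, that amounts to an implicit wait at the top of every CPU access path. A schematic (and entirely hypothetical) version of the pattern might look like the following; wait_event_interruptible() is a real kernel primitive, while the object state shown here is invented for the example.

    #include <linux/types.h>
    #include <linux/wait.h>

    /* Hypothetical per-object state; "active" is cleared when the GPU
       retires its last command touching this buffer. */
    struct gem_object_sync {
        wait_queue_head_t retire_wq;
        bool active;
    };

    /*
     * Implicit synchronization: rather than handing user space a fence
     * object to wait on, every CPU access path (pread, pwrite, domain
     * changes) simply blocks here until the GPU is done with the buffer.
     */
    static int gem_wait_for_gpu(struct gem_object_sync *obj)
    {
        return wait_event_interruptible(obj->retire_wq, !obj->active);
    }

User space never sees a fence; it just notices that the call took a while.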


Finally, the GEM API does not try to solve the entire problem; a number of important operations (such as the execution of a set of GPU commands) are left for the hardware-specific driver to implement. GEM is, thus, quite specific to the needs of Intel's driver at this time; it does not try for the same sort of generality that was a goal of TTM. As described by Eric Anholt:


The problem with TTM is that it's designed to expose one general API for all hardware, when that's not what our drivers want... We're trying to come at it from the other direction: Implement one driver well. When someone else implements another driver and finds that there's code that should be common, make it into a support library and share it.

The advantage to this approach is that it makes it relatively easy to create something which works well with Intel drivers. And that may well be a good start; one working set of drivers is better than none. On the other hand, that means that a significant amount of work may be required to get GEM to the point where it can support drivers for other hardware. There seem to be two points of view on how that might be done: (1) add capabilities to GEM when needed by other drivers, or (2) have each driver use its own memory manager.

The first approach is, in many ways, more pleasing. But it implies that the GEM API could change significantly over time. And that, in turn, could delay the merging of the whole thing; the GEM API is exported to user space, and, as a result, must remain compatible as things change. So there may be resistance to a quick merge of an API which looks like it may yet have to evolve for some time.

The second approach, instead, is best described by Dave Airlie:


Well the thing is I can't believe we don't know enough to do this in some way generically, but maybe the TTM vs GEM thing proves its not possible. So we can then punt to having one memory manager per driver, but I suspect this will be a maintenance nightmare, so if people decide this is the way forward, I'm happy to see it happen. However the person submitting the memory manager n+1 must damn well be willing to stand behind the interface until time ends, and explain why they couldn't re-use 1..n memory managers.

One other remaining issue is performance. Keith Whitwell posted some benchmark results showing that the i915 driver performs significantly worse with either TTM or GEM than without. Keith Packard gets different results, though; his tests show that the GEM-based driver is significantly faster. Clearly there is a need for a set of consistent benchmarks; performance of graphics drivers is important, but performance cannot be optimized if it cannot be reliably measured.


The use of anonymous memory also raises some performance concerns: a first-person shooter game will not provide the same experience if its blood-and-gore textures must be continually paged in. Anonymous memory can also be high memory, and, thus, not necessarily accessible via a 32-bit pointer. Some GPU hardware cannot address high memory; that will likely force the use of bounce buffers within the kernel. In the end, GEM will have to prove that it can deliver good performance; GEM's developers are highly motivated to make their hardware look good, so there is a reasonable chance that things will work out on this front.
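
If the backing store is a shmem file, as in the earlier sketch, one plausible way around the bounce-buffer problem is to restrict where that file's pages may be allocated in the first place. The helper below is hypothetical, but mapping_set_gfp_mask() and the GFP flags are standard kernel interfaces.

    #include <linux/fs.h>
    #include <linux/gfp.h>
    #include <linux/pagemap.h>

    /* Keep an object's backing pages where a 32-bit GPU can reach them. */
    static void gem_limit_to_32bit_dma(struct file *backing_shmem)
    {
        gfp_t mask = GFP_HIGHUSER;

        mask &= ~__GFP_HIGHMEM;   /* no high memory the GPU cannot see */
        mask |= __GFP_DMA32;      /* allocate below the 4GB boundary */

        mapping_set_gfp_mask(backing_shmem->f_mapping, mask);
    }

The cost, of course, is giving up the use of high memory for those objects.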


The conclusion to draw from all of this is that the GPU memory management problem cannot yet be considered solved. GEM might eventually become that solution, but it is a very new API which still needs a fair amount of work; whatever happens, much development remains to be done in this area.


Source: http://lwn.net/Articles/283793/
