Introduction to PCIe ATS

ATS Architectural Overview

DMA access time can be significantly lengthened due to the time required to resolve the actual physical address.

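On the host side, an IOMMU translates device addresses to physical memory addresses and checks device access permissions. However, if all PCIe devices are performing DMA at the same time, the TA (Translation Agent) and the ATPT (Address Translation and Protection Table) become a bottleneck and degrade system latency.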

To mitigate these impacts, designs often include address translation caches in the entity that performs the address translation. In a CPU, the address translation cache is most commonly referred to as a translation look-aside buffer (TLB). For an I/O TA, the term address translation cache or ATC is used to differentiate it from the translation cache used by the CPU.

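In other words, to address translation performance, designers usually implement an address translation cache. On the CPU side it is usually called a translation look-aside buffer (TLB); on the I/O side it is called an address translation cache (ATC).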

While there are some similarities between TLB and ATC, there are important differences. A TLB serves the needs of a CPU that is nominally running one thread at a time. The ATC, however, is generally processing requests from multiple I/O Functions, each of which can be considered a separate thread.

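Unlike a TLB, an ATC typically handles requests from multiple threads at once.

This difference makes it hard to size a host-side ATC for the number of I/O Functions in the system.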

The benefits of having an ATC within a Device include:

  • Ability to alleviate TA resource pressure by distributing address translation caching responsibility (reduced probability of “thrashing” within the TA)
  • Enable ATC Devices to have less performance dependency on a system’s ATC size
  • Potential to ensure optimal access latency by sending pretranslated requests to central complex

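A device-side ATC holds translations obtained from the TA and ATPT, which reduces how much device performance depends on the size of the system's cache.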
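
The thrashing argument can be made concrete with a toy simulation. The sketch below is only an illustration under stated assumptions: `LruTranslationCache`, `fake_table_walk`, the LRU replacement policy, the cache sizes, and the round-robin access pattern are all hypothetical and do not come from the PCIe specification. It compares one shared 16-entry cache serving four Functions against four 8-entry per-device caches.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # smallest page size ATS allows


def fake_table_walk(function_id, page_number):
    # Hypothetical stand-in for a TA/ATPT lookup; the mapping is made up.
    return (function_id << 20) | page_number


class LruTranslationCache:
    """Toy LRU cache keyed by (function_id, page_number)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def lookup(self, function_id, address):
        page_number, offset = divmod(address, PAGE_SIZE)
        key = (function_id, page_number)
        if key in self.entries:
            self.entries.move_to_end(key)  # mark as most recently used
            self.hits += 1
            translated_page = self.entries[key]
        else:
            self.misses += 1
            translated_page = fake_table_walk(function_id, page_number)
            self.entries[key] = translated_page
            if len(self.entries) > self.capacity:
                self.entries.popitem(last=False)  # evict least recently used
        return translated_page * PAGE_SIZE + offset


def simulate(caches_by_function, functions=4, pages=8, rounds=10):
    # Functions take turns, each cycling over its own small working set.
    for _ in range(rounds):
        for page in range(pages):
            for fn in range(functions):
                caches_by_function[fn].lookup(fn, page * PAGE_SIZE)


# One shared cache for all Functions (models a host-side cache).
shared = LruTranslationCache(capacity=16)
simulate({fn: shared for fn in range(4)})

# One small cache per Function (models device-side ATCs).
per_device = {fn: LruTranslationCache(capacity=8) for fn in range(4)}
simulate(per_device)

print("shared:", shared.hits, "hits,", shared.misses, "misses")
print("per-device misses:", sum(c.misses for c in per_device.values()))
```

In this particular setup the shared cache never hits, because the interleaved working set of 32 pages cycles through a 16-entry LRU cache, while each per-device cache misses only on the first touch of each page. Real hit rates depend on the actual replacement policy and traffic pattern.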

There are a number of considerations a Function or software can use in deciding which address ranges are worth caching in the ATC; for example:

  • Memory address ranges that will be frequently accessed over an extended period of time or whose associated buffer content is subject to a significant update rate
  • Memory address ranges, such as work and completion queue structures, data buffers for low-latency communications, graphics frame buffers, host memory that is used to cache Function-specific content, and so forth

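When a device Function accesses an address that is updated frequently, or makes accesses that are especially latency-sensitive, an ATC can significantly improve software performance.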

Address Translation Services (ATS) Overview

ATS uses a request-completion protocol between a Device and a Root Complex (RC) to provide translation services.

  1. ATS Requests follow the same routing and ordering rules as Non-Posted Memory Read Requests
  2. ATS Requests may be outstanding on one or more TCs (Traffic Classes)

After the TA receives an ATS Translation Request, it:

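  1. Checks whether the Function has ATS enabled
  2. Checks whether the Function has permission to access the requested address range
  3. Checks whether a translation can be provided to the Function
    1. Whether the translation request is compliant
      1. The page size must be a power of two, and the address must be aligned to the page size
      2. The smallest page size is 4096 bytes
    2. To improve utilization of system resources, the Function must be informed of the minimum translation or invalidate size, and the Function must support that size. The smallest minimum translation size must be 4096 bytes
  4. The TA tells the RC whether the request succeeded or failed, and the RC generates an ATS Translation Completion and returns it to the Device Function through the RP (Root Port)
    1. The RC returns at least one ATS Translation Completion for each ATS Translation Request
      1. A successful translation can result in one or two ATS Translation Completion TLPs per request.
      2. An RC may pipeline multiple ATS Translation Completions, and the Completions may be returned in any order
      3. The RC must return the Completion on the same TC as the ATS Translation Request
    2. If the requested address is not valid, the RC must return a Completion indicating that the address is not accessible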
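
The checks in steps 1 to 3 can be sketched as a small validation function. This is a simplified model rather than the TA's real logic: `FunctionState`, `check_translation_request`, the `allowed_ranges` list, and the returned status strings are hypothetical names chosen for illustration, and a real TA consults the ATPT instead of a Python list.

```python
from dataclasses import dataclass, field

MIN_PAGE_SIZE = 4096  # smallest page size / minimum translation size


@dataclass
class FunctionState:
    # Hypothetical per-Function state as seen by the TA.
    ats_enabled: bool
    # (start, end) untranslated address ranges the Function may access,
    # with end exclusive.
    allowed_ranges: list = field(default_factory=list)


def is_power_of_two(value):
    return value > 0 and (value & (value - 1)) == 0


def check_translation_request(fn, address, size):
    """Return a status string for a translation request (toy model)."""
    # Step 1: the Function must have ATS enabled.
    if not fn.ats_enabled:
        return "ats-disabled"
    # Step 2: the Function must be allowed to access the whole range.
    if not any(start <= address and address + size <= end
               for start, end in fn.allowed_ranges):
        return "no-access"
    # Step 3: page size is a power of two and at least 4096 bytes...
    if not is_power_of_two(size) or size < MIN_PAGE_SIZE:
        return "bad-size"
    # ...and the address is aligned to that page size.
    if address % size != 0:
        return "misaligned"
    return "ok"


fn = FunctionState(ats_enabled=True, allowed_ranges=[(0x10000, 0x20000)])
print(check_translation_request(fn, 0x10000, 4096))
print(check_translation_request(fn, 0x10800, 4096))
```

The first call passes every check; the second lies inside the permitted range but is not aligned to the 4096-byte page size, so the sketch reports it as misaligned.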

After the Device Function receives the ATS Translation Completion, it:

1. Decides, based on whether the Completion indicates success, whether to use translated addresses for subsequent traffic requests
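
A minimal sketch of that decision follows. The `completion` dictionary and its keys, and the function name `next_request`, are hypothetical; only the AT field encodings (00b for untranslated, 10b for translated) are taken from the PCIe specification. Falling back to an untranslated request on failure is an assumption of this sketch, and a real Function may instead report an error or stop accessing the range.

```python
AT_UNTRANSLATED = 0b00  # AT field value for an untranslated request
AT_TRANSLATED = 0b10    # AT field value for a translated request


def next_request(untranslated_address, completion):
    """Pick the address and AT value for a follow-up memory request.

    `completion` is a hypothetical dict summarizing one ATS Translation
    Completion: 'success', 'untranslated_base', 'translated_base', 'size'.
    """
    if completion["success"]:
        offset = untranslated_address - completion["untranslated_base"]
        # Only use the translation for addresses it actually covers.
        if 0 <= offset < completion["size"]:
            return completion["translated_base"] + offset, AT_TRANSLATED
    # Assumption: otherwise send the original address and let the TA
    # translate (or reject) it on the host side.
    return untranslated_address, AT_UNTRANSLATED


ok = {"success": True, "untranslated_base": 0x10000,
      "translated_base": 0x80000000, "size": 4096}
failed = {"success": False}

print(next_request(0x10010, ok))
print(next_request(0x10010, failed))
```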
