Implementation Architecture of the Intel IOMMU on Linux

1. Detecting whether the platform supports DMAR devices

./drivers/pci/dmar.c:

int __init early_dmar_detect(void)
{
	acpi_status status = AE_OK;

	/* if we could find DMAR table, then there are DMAR devices */
	status = acpi_get_table(ACPI_SIG_DMAR, 0,
				(struct acpi_table_header **)&dmar_tbl);

	if (ACPI_SUCCESS(status) && !dmar_tbl) {
		printk (KERN_WARNING PREFIX "Unable to map DMAR\n");
		status = AE_NOT_FOUND;
	}

	return (ACPI_SUCCESS(status) ? 1 : 0);
}

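This function is reached during memory initialization, through the call to pci_iommu_alloc():

./arch/x86_64/mm/init.c:528:pci_iommu_alloc();

It decides whether DMAR devices are supported by looking up the DMA Remapping table:

./include/acpi/actbl1.h:64:#define ACPI_SIG_DMAR "DMAR" /* DMA Remapping table */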

/*******************************************************************************
 *
 * FUNCTION:    acpi_get_table
 *
 * PARAMETERS:  table_type      - one of the defined table types
 *              Instance        - the non zero instance of the table, allows
 *                                support for multiple tables of the same type
 *                                see acpi_gbl_acpi_table_flag
 *              ret_buffer      - pointer to a structure containing a buffer to
 *                                receive the table
 *
 * RETURN:      Status
 *
 * DESCRIPTION: This function is called to get an ACPI table. The caller
 *              supplies an out_buffer large enough to contain the entire ACPI
 *              table. The caller should call the acpi_get_table_header function
 *              first to determine the buffer size needed. Upon completion
 *              the out_buffer->Length field will indicate the number of bytes
 *              copied into the out_buffer->buf_ptr buffer. This table will be
 *              a complete table including the header.
 *
 ******************************************************************************/

2. Initializing the Intel IOMMU device

./drivers/pci/intel-iommu.c:

int __init intel_iommu_init(void)
{
	int ret = 0;

	if (no_iommu || swiotlb || dmar_disabled)
		return -ENODEV;

	if (dmar_table_init())
		return -ENODEV;

	iommu_init_mempool();
	dmar_init_reserved_ranges();

	init_no_remapping_devices();

	ret = init_dmars();
	if (ret) {
		printk(KERN_ERR "IOMMU: dmar init failed\n");
		put_iova_domain(&reserved_iova_list);
		iommu_exit_mempool();
		return ret;
	}
	printk(KERN_INFO
	       "PCI-DMA: Intel(R) Virtualization Technology for Directed I/O\n");

	force_iommu = 1;
	dma_ops = &intel_dma_ops;
	return 0;
}

This function is called from pci_iommu_init() in arch/x86_64/kernel/pci-dma.c:

static int __init pci_iommu_init(void)
{
#ifdef CONFIG_CALGARY_IOMMU
	calgary_iommu_init();
#endif

	intel_iommu_init();

#ifdef CONFIG_IOMMU
	gart_iommu_init();
#endif

	no_iommu_init();
	return 0;
}

pci_iommu_init() is in turn registered as an initcall in the same file:

/* Must execute after PCI subsystem */
fs_initcall(pci_iommu_init);

2.1 dmar_table_init

Parses the DMAR table. Each DMAR entry is printed in turn by

dmar_table_print_dmar_entry(entry_header);

which produces dmesg output similar to the following:

ACPI DMAR:Host address width 36

ACPI DMAR:DRHD (flags: 0x00000000)base: 0x00000000fed90000

ACPI DMAR:DRHD (flags: 0x00000000)base: 0x00000000fed91000

ACPI DMAR:DRHD (flags: 0x00000001)base: 0x00000000fed93000

ACPI DMAR:RMRR base: 0x00000000000ed000 end: 0x00000000000effff

ACPI DMAR:RMRR base: 0x000000007f600000 end: 0x000000007fffffff

Each entry is then dispatched on its type:

switch (entry_header->type) {
case ACPI_DMAR_TYPE_HARDWARE_UNIT:
	ret = dmar_parse_one_drhd(entry_header);
	break;
case ACPI_DMAR_TYPE_RESERVED_MEMORY:
	ret = dmar_parse_one_rmrr(entry_header);
	break;
default:
	printk(KERN_WARNING PREFIX
		"Unknown DMAR structure type\n");
	ret = 0; /* for forward compatibility */
	break;
}

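Two kinds of entry are parsed:

DRHD - DMA Engine Reporting Structure

RMRR - Reserved Memory Region Reporting Structure

For each DRHD entry, a register function puts the corresponding physical DMA remapping unit on a list. Each RMRR is likewise put on a global list.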
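The dispatch loop above can be sketched in user space as follows. This is only an illustration under stated assumptions, not the kernel's code: `walk_remapping_structures`, `struct walk_result` and `rd16` are hypothetical names, and the input is assumed to be the raw bytes that follow the 48-byte DMAR header. It shows the two properties the parser relies on: every structure starts with a 2-byte type and a 2-byte length, and unknown types are skipped by their length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Remapping structure types handled by the switch above. */
enum { DMAR_TYPE_DRHD = 0, DMAR_TYPE_RMRR = 1 };

/* Hypothetical tally of what the walk found. */
struct walk_result {
    unsigned drhd;     /* number of DRHD structures */
    unsigned rmrr;     /* number of RMRR structures */
    unsigned skipped;  /* unknown types skipped via their Length field */
};

/* Read a 16-bit little-endian value (ACPI tables are little-endian). */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/*
 * Walk the remapping structures in buf[0..len). Returns 0 on success,
 * -1 if a structure is truncated or reports a length shorter than its
 * own 4-byte header.
 */
static int walk_remapping_structures(const uint8_t *buf, size_t len,
                                     struct walk_result *out)
{
    size_t off = 0;

    assert(buf != NULL && out != NULL);
    out->drhd = out->rmrr = out->skipped = 0;

    while (off < len) {
        uint16_t type, length;

        if (len - off < 4)
            return -1;                   /* header does not fit */
        type = rd16(buf + off);
        length = rd16(buf + off + 2);
        if (length < 4 || length > len - off)
            return -1;                   /* would stall or overrun */

        switch (type) {
        case DMAR_TYPE_DRHD: out->drhd++;    break;
        case DMAR_TYPE_RMRR: out->rmrr++;    break;
        default:             out->skipped++; break; /* forward compat */
        }
        off += length;
    }
    return 0;
}
```

Rejecting a length below 4 matters in practice: a zero length from a buggy BIOS would otherwise keep the loop on the same offset forever.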

2.2 iommu_init_mempool

Creates slab caches for several frequently used structures:

struct iova

struct iommu_domain

struct device_domain_info

2.3 dmar_init_reserved_ranges

Initializes the reserved ranges. Two kinds of range must be reserved:

1. IOAPIC ranges shouldn't be accessed by DMA

2. Reserve all PCI MMIO to avoid peer-to-peer access
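The effect of reserving these ranges is that no DMA address handed to a device may fall inside them. A minimal sketch of that check, assuming inclusive ranges, is shown here. `struct addr_range`, `ranges_overlap` and `hits_reserved` are hypothetical names chosen for illustration; the kernel keeps its reserved ranges in an IOVA domain rather than a flat array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical inclusive address range [start, end]. */
struct addr_range {
    uint64_t start;
    uint64_t end;
};

/* Two inclusive ranges overlap if neither lies wholly before the other. */
static bool ranges_overlap(struct addr_range a, struct addr_range b)
{
    return a.start <= b.end && b.start <= a.end;
}

/* True if the candidate range touches any of the reserved ranges. */
static bool hits_reserved(struct addr_range cand,
                          const struct addr_range *reserved, size_t n)
{
    assert(reserved != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (ranges_overlap(cand, reserved[i]))
            return true;
    return false;
}
```

An allocator built on this would simply retry with another candidate whenever `hits_reserved` returns true.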

2.4 init_no_remapping_devices

Graphics driver workarounds to provide unity map

Most GFX drivers don't call the standard PCI DMA APIs to allocate DMA buffers, so such drivers break when the IOMMU is enabled. To work around this issue, two options were added. Once graphics devices are converted over to use the DMA API, this entire patch can be removed.

a. intel_iommu=igfx_off. With this option, a DMAR unit that has only gfx devices under it is ignored. This mostly affects integrated gfx devices. If the DMAR unit is ignored, the gfx devices under it get physical addresses for DMA.

b. intel_iommu=gfx_workaround. With this option, a 1:1 mapping of the whole of memory is set up for gfx devices, that is, the physical address equals the virtual address. In this way gfx devices use physical addresses for DMA. This is primarily for add-in card GFX devices.
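How the two option names might be picked out of a comma-separated `intel_iommu=` value can be sketched as below. This is a user-space illustration, not the kernel's parser: `struct gfx_opts` and `parse_intel_iommu_opts` are hypothetical names, and only the two options described above are recognized.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical flags corresponding to the two workarounds. */
struct gfx_opts {
    bool igfx_off;        /* ignore DMAR units that cover only gfx devices */
    bool gfx_workaround;  /* set up a 1:1 mapping for gfx devices */
};

/* Parse a value such as "igfx_off" or "foo,gfx_workaround". */
static void parse_intel_iommu_opts(const char *str, struct gfx_opts *o)
{
    assert(str != NULL && o != NULL);
    o->igfx_off = false;
    o->gfx_workaround = false;

    while (*str) {
        size_t tok = strcspn(str, ",");   /* length of the current token */

        if (tok == 8 && strncmp(str, "igfx_off", 8) == 0)
            o->igfx_off = true;
        else if (tok == 14 && strncmp(str, "gfx_workaround", 14) == 0)
            o->gfx_workaround = true;
        /* any other token is ignored in this sketch */

        str += tok;
        while (*str == ',')
            str++;                        /* skip separators */
    }
}
```

Comparing the token length as well as its prefix keeps a longer, unrelated token from being mistaken for one of the known options.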

2.5 init_dmars

Initializes the DMAR data structures.

TBD: diagram of the data-structure relationships

dma_ops = &intel_dma_ops;

static struct dma_mapping_ops intel_dma_ops = {
	.alloc_coherent = intel_alloc_coherent,
	.free_coherent = intel_free_coherent,
	.map_single = intel_map_single,
	.unmap_single = intel_unmap_single,
	.map_sg = intel_map_sg,
	.unmap_sg = intel_unmap_sg,
};

3. DMAR ACPI table structure

The system BIOS is responsible for detecting the remapping hardware functions in the platform and for locating the memory-mapped remapping hardware registers in the host system address space. The BIOS reports the remapping hardware units in a platform to system software through the DMA Remapping Reporting (DMAR) ACPI table described below.

3.1 DMA Remapping Reporting Structure

Field | Byte Length | Byte Offset | Description
Signature | 4 | 0 | "DMAR". Signature for the DMA Remapping Description table.
Length | 4 | 4 | Length, in bytes, of the description table including the length of the associated DMA remapping structures.
Revision | 1 | 8 | 1
Checksum | 1 | 9 | Entire table must sum to zero.
OEMID | 6 | 10 | OEM ID
OEM Table ID | 8 | 16 | For DMAR description table, the Table ID is the manufacturer model ID.
OEM Revision | 4 | 24 | OEM Revision of DMAR Table for OEM Table ID.
Creator ID | 4 | 28 | Vendor ID of utility that created the table.
Creator Revision | 4 | 32 | Revision of utility that created the table.
Host Address Width | 1 | 36 | This field indicates the maximum DMA physical addressability supported by this platform. The system address map reported by the BIOS indicates what portions of this address space are populated. The Host Address Width (HAW) of the platform is computed as (N+1), where N is the value reported in this field. For example, for a platform supporting 40 bits of physical addressability, the value of 100111b is reported in this field.
Flags | 1 | 37 | Bit 0: INTR_REMAP. If Clear, the platform does not support interrupt remapping. If Set, the platform supports interrupt remapping. Bits 1-7: Reserved.
Reserved | 10 | 38 | Reserved (0).
Remapping Structures[] | - | 48 | A list of structures. The list will contain one or more DMA Remapping Hardware Unit Definition (DRHD) structures, and zero or more Reserved Memory Region Reporting (RMRR) and Root Port ATS Capability Reporting (ATSR) structures. These structures are described below.
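The header fields above translate directly into byte offsets. The following sketch, which works on a raw table image in user space, checks the signature, length and checksum and decodes the Host Address Width and the INTR_REMAP flag. The function and macro names are hypothetical, and the table is assumed to be little-endian as ACPI tables are.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DMAR_HDR_LEN          48   /* offset of Remapping Structures[] */
#define DMAR_OFF_LENGTH        4
#define DMAR_OFF_HAW          36
#define DMAR_OFF_FLAGS        37
#define DMAR_FLAG_INTR_REMAP  0x01

/* Read a 32-bit little-endian value. */
static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Signature, length and checksum checks on a raw table image. */
static bool dmar_table_valid(const uint8_t *t, size_t len)
{
    uint8_t sum = 0;

    assert(t != NULL);
    if (len < DMAR_HDR_LEN || memcmp(t, "DMAR", 4) != 0)
        return false;
    if (rd32(t + DMAR_OFF_LENGTH) != len)
        return false;
    for (size_t i = 0; i < len; i++)
        sum = (uint8_t)(sum + t[i]);   /* all bytes must sum to 0 mod 256 */
    return sum == 0;
}

/* Host Address Width in bits: the field stores N, the width is N + 1. */
static unsigned dmar_haw_bits(const uint8_t *t)
{
    return (unsigned)t[DMAR_OFF_HAW] + 1;
}

/* True if the platform reports interrupt remapping support. */
static bool dmar_intr_remap(const uint8_t *t)
{
    return (t[DMAR_OFF_FLAGS] & DMAR_FLAG_INTR_REMAP) != 0;
}
```

With the example from the table, a field value of 100111b (0x27, decimal 39) gives a width of 40 bits.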

3.2 Remapping Structure Types

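Every remapping structure begins with a type field and a length field. The type identifies the kind of DMA-remapping structure, and the length gives the size of that structure. The table below lists the defined type values: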

Value | Description
0 | DMA Remapping Hardware Unit Definition (DRHD) Structure
1 | Reserved Memory Region Reporting (RMRR) Structure
2 | Root Port ATS Capability Reporting (ATSR) Structure
>2 | Reserved for future use. For forward compatibility, software skips structures it does not comprehend by skipping the appropriate number of bytes indicated by the Length field.

Note: BIOS implementations must report these remapping structure types in numerical order, i.e., all remapping structures of type 0 (DRHD) are enumerated before remapping structures of type 1 (RMRR), and so forth.

3.3 DMA Remapping Hardware Unit Definition Structure

A DMA-remapping hardware unit definition (DRHD) structure uniquely represents a remapping hardware unit present in the platform. There must be at least one instance of this structure for each PCI segment in the platform.

Field | Byte Length | Byte Offset | Description
Type | 2 | 0 | 0 - DMA Remapping Hardware Unit Definition (DRHD) structure
Length | 2 | 2 | Varies (16 + size of Device Scope Structure)
Flags | 1 | 4 | Bit 0: INCLUDE_PCI_ALL. If Set, this remapping hardware unit has under its scope all PCI compatible devices in the specified Segment, except devices reported under the scope of other remapping hardware units for the same Segment. If a DRHD structure with INCLUDE_PCI_ALL flag Set is reported for a Segment, it must be enumerated by BIOS after all other DRHD structures for the same Segment. A DRHD structure with INCLUDE_PCI_ALL flag Set may use the 'Device Scope' field to enumerate I/OxAPIC and HPET devices under its scope. If Clear, this remapping hardware unit has under its scope only devices in the specified Segment that are explicitly identified through the 'Device Scope' field. Bits 1-7: Reserved.
Reserved | 1 | 5 | Reserved (0).
Segment Number | 2 | 6 | The PCI Segment associated with this unit.
Register Base Address | 8 | 8 | Base address of remapping hardware register-set for this unit.
Device Scope[] | - | 16 | The Device Scope structure contains one or more Device Scope Entries that identify devices in the specified segment and under the scope of this remapping hardware unit.
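Decoding the fixed part of a DRHD structure from raw bytes follows the offsets in the table. The sketch below is a user-space illustration with hypothetical names (`struct drhd_info`, `parse_drhd`); it is not the kernel's `dmar_parse_one_drhd`, which works on ACPICA structure definitions instead of byte offsets.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DRHD_FIXED_LEN            16    /* bytes before Device Scope[] */
#define DRHD_FLAG_INCLUDE_PCI_ALL 0x01

/* Hypothetical decoded view of one DRHD structure. */
struct drhd_info {
    bool     include_pci_all;  /* Flags bit 0 */
    uint16_t segment;          /* PCI Segment Number */
    uint64_t reg_base;         /* Register Base Address */
    size_t   scope_bytes;      /* size of the trailing Device Scope[] */
};

static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint64_t rd64(const uint8_t *p)
{
    uint64_t v = 0;

    for (int i = 7; i >= 0; i--)       /* little-endian: last byte is MSB */
        v = (v << 8) | p[i];
    return v;
}

/* Returns 0 on success, -1 if p does not hold a well-formed DRHD. */
static int parse_drhd(const uint8_t *p, size_t avail, struct drhd_info *out)
{
    uint16_t length;

    assert(p != NULL && out != NULL);
    if (avail < DRHD_FIXED_LEN || rd16(p) != 0)
        return -1;                     /* too short, or Type is not DRHD */
    length = rd16(p + 2);
    if (length < DRHD_FIXED_LEN || length > avail)
        return -1;

    out->include_pci_all = (p[4] & DRHD_FLAG_INCLUDE_PCI_ALL) != 0;
    out->segment = rd16(p + 6);
    out->reg_base = rd64(p + 8);
    out->scope_bytes = (size_t)length - DRHD_FIXED_LEN;
    return 0;
}
```

Fed the third DRHD from the dmesg sample in 2.1 (flags 0x1, base 0xfed93000), this yields `include_pci_all` set and `reg_base` equal to 0xfed93000.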

3.3.1 Device Scope Structure

The Device Scope Structure is made up of one or more Device Scope Entries. Each Device Scope Entry may be used to indicate a PCI endpoint device, a PCI sub-hierarchy, or devices such as I/OxAPICs or HPET (High Precision Event Timer). In this section, the generic term 'PCI' is used to describe conventional PCI, PCI-X, and PCI-Express devices. Similarly, the term 'PCI-PCI bridge' is used to refer to conventional PCI bridges, PCI-X bridges, PCI Express root ports, or downstream ports of a PCI Express switch. A PCI sub-hierarchy is defined as the collection of PCI controllers that are downstream to a specific PCI-PCI bridge. To identify a PCI sub-hierarchy, the Device Scope Entry needs to identify only the parent PCI-PCI bridge of the sub-hierarchy.

Field | Byte Length | Byte Offset | Description
Type | 1 | 0 | The following values are defined for this field. 0x01: PCI Endpoint Device - The device identified by the 'Path' field is a PCI endpoint device. This type must not be used in Device Scope of DRHD structures with INCLUDE_PCI_ALL flag Set. 0x02: PCI Sub-hierarchy - The device identified by the 'Path' field is a PCI-PCI bridge. In this case, the specified bridge device and all its downstream devices are included in the scope. This type must not be used in Device Scope of DRHD structures with INCLUDE_PCI_ALL flag Set. 0x03: IOAPIC - The device identified by the 'Path' field is an I/O APIC (or I/O SAPIC) device, enumerated through the ACPI MADT I/O APIC (or I/O SAPIC) structure. 0x04: MSI_CAPABLE_HPET - The device identified by the 'Path' field is an HPET device capable of generating MSI (Message Signaled Interrupts). HPET hardware is reported through the ACPI HPET structure. Other values for this field are reserved for future use.
Length | 1 | 1 | Length of this Entry in Bytes. (6 + X), where X is the size in bytes of the 'Path' field.
Reserved | 2 | 2 | Reserved (0).
Enumeration ID | 1 | 4 | When the 'Type' field indicates 'IOAPIC', this field provides the I/O APIC ID as provided in the I/O APIC (or I/O SAPIC) structure in the ACPI MADT (Multiple APIC Descriptor Table). This field is treated as reserved (0) for all other 'Type' values.
Start Bus Number | 1 | 5 | This field describes the bus number (bus number of the first PCI Bus produced by the PCI Host Bridge) under which the device identified by this Device Scope resides.
Path | 2 * N | 6 | Describes the hierarchical path from the Host Bridge to the device specified by the Device Scope Entry. For example, a device in an N-deep hierarchy is identified by N {PCI Device Number, PCI Function Number} pairs, where N is a positive integer. Even offsets contain the Device numbers, and odd offsets contain the Function numbers. The first {Device, Function} pair resides on the bus identified by the 'Start Bus Number' field. Each subsequent pair resides on the bus directly behind the bus of the device identified by the previous pair. The identity (Bus, Device, Function) of the target device is obtained by recursively walking down these N {Device, Function} pairs. If the 'Path' field length is 2 bytes (N=1), the Device Scope Entry identifies a 'Root-Complex Integrated Device'. The requester-ids of 'Root-Complex Integrated Devices' are static and not impacted by system software bus rebalancing actions. If the 'Path' field length is more than 2 bytes (N > 1), the Device Scope Entry identifies a device behind one or more system software visible PCI-PCI bridges. Bus rebalancing actions by system software modifying bus assignments of the device's parent bridge impact the bus number portion of the device's requester-id.
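The path walk described for the 'Path' field can be sketched as follows. Resolving each intermediate pair requires the secondary bus number of the bridge it names, which real code reads from PCI configuration space. Here that lookup is replaced by a caller-supplied table, so `struct bridge`, `secondary_bus` and `resolve_scope_entry` are hypothetical names and the bridge table is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Resolved PCI identity of a device. */
struct bdf {
    uint8_t bus, dev, fn;
};

/* Hypothetical record of one PCI-PCI bridge and the bus behind it. */
struct bridge {
    uint8_t bus, dev, fn;   /* where the bridge itself sits */
    uint8_t secondary;      /* bus number directly behind it */
};

/* Look up the secondary bus of the bridge at (bus, dev, fn), or -1. */
static int secondary_bus(const struct bridge *br, size_t n,
                         uint8_t bus, uint8_t dev, uint8_t fn)
{
    for (size_t i = 0; i < n; i++)
        if (br[i].bus == bus && br[i].dev == dev && br[i].fn == fn)
            return br[i].secondary;
    return -1;
}

/*
 * Resolve one Device Scope Entry to a (bus, dev, fn) triple.
 * Layout used: Length at offset 1, Start Bus Number at offset 5,
 * Path at offset 6. Returns 0 on success, -1 on a malformed entry
 * or an unknown bridge.
 */
static int resolve_scope_entry(const uint8_t *e, size_t avail,
                               const struct bridge *br, size_t nbr,
                               struct bdf *out)
{
    size_t length, npairs;
    uint8_t bus;

    assert(e != NULL && out != NULL);
    if (avail < 8)
        return -1;                        /* header plus one pair */
    length = e[1];
    if (length < 8 || length > avail || (length - 6) % 2 != 0)
        return -1;
    npairs = (length - 6) / 2;            /* N in the table above */
    bus = e[5];

    /* Every pair except the last names a bridge to step through. */
    for (size_t i = 0; i + 1 < npairs; i++) {
        int next = secondary_bus(br, nbr, bus, e[6 + 2 * i], e[7 + 2 * i]);

        if (next < 0)
            return -1;
        bus = (uint8_t)next;
    }
    out->bus = bus;
    out->dev = e[6 + 2 * (npairs - 1)];
    out->fn = e[7 + 2 * (npairs - 1)];
    return 0;
}
```

For N=1 the loop body never runs and the bridge table is not consulted, which matches the statement that root-complex integrated devices are unaffected by bus renumbering.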

3.4 Reserved Memory Region Reporting Structure

BIOS may report each such reserved memory region through the RMRR structures, along with the devices that require access to the specified reserved memory region. Reserved memory ranges that are either not DMA targets, or memory ranges that may be the target of BIOS-initiated DMA only during the pre-boot phase (such as from a boot disk drive), must not be included in the reserved memory region reporting. The base address of each RMRR region must be 4KB aligned and the size must be an integer multiple of 4KB. BIOS must report the RMRR reported memory addresses as reserved in the system memory map returned through methods such as INT15, EFI GetMemoryMap etc. The reserved memory region reporting structures are optional. If there are no RMRR structures, the system software concludes that the platform does not have any reserved memory ranges that are DMA targets.

The RMRR regions are expected to be used only for USB and UMA Graphics legacy usages for reserved memory. Platform designers must avoid or limit reserved memory regions since these require system software to create holes in the DMA virtual address range available to system software and its drivers.

Field | Byte Length | Byte Offset | Description
Type | 2 | 0 | 1 - Reserved Memory Region Reporting Structure
Length | 2 | 2 | Varies (24 + size of Device Scope structure)
Reserved | 2 | 4 | Reserved.
Segment Number | 2 | 6 | PCI Segment Number associated with devices identified through the Device Scope field.
Reserved Memory Region Base Address | 8 | 8 | Base address of 4KB-aligned reserved memory region.
Reserved Memory Region Limit Address | 8 | 16 | Last address of the reserved memory region. The reserved memory region size (Limit - Base + 1) must be an integer multiple of 4KB.
Device Scope[] | - | 24 | The Device Scope structure contains one or more Device Scope entries that identify devices requiring access to the specified reserved memory region. The devices identified in this structure must be devices under the scope of one of the remapping hardware units reported in DRHD.
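The two constraints on an RMRR region, a 4KB-aligned base and a size of (Limit - Base + 1) that is a multiple of 4KB, can be checked with a few lines of arithmetic. The helper names below are hypothetical; the sketch is a sanity check a parser could apply, not something the original code is known to do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RMRR_ALIGN 4096ULL

/* True if [base, limit] satisfies the RMRR alignment and size rules. */
static bool rmrr_range_valid(uint64_t base, uint64_t limit)
{
    if (limit < base)
        return false;
    if (base % RMRR_ALIGN != 0)
        return false;
    /* (limit - base + 1) % 4096 == 0, written so that the +1 cannot
     * overflow when the range spans the whole 64-bit space. */
    return (limit - base) % RMRR_ALIGN == RMRR_ALIGN - 1;
}

/* Size in bytes of a valid region (not the full 64-bit space). */
static uint64_t rmrr_size(uint64_t base, uint64_t limit)
{
    assert(rmrr_range_valid(base, limit));
    return limit - base + 1;
}
```

Both RMRR lines in the dmesg sample from 2.1 pass: 0xed000 to 0xeffff is 0x3000 bytes (three pages), and 0x7f600000 to 0x7fffffff is 0xa00000 bytes.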

3.5 Root Port ATS Capability Reporting Structure

This structure is applicable only for platforms supporting Device-IOTLBs as reported through the Extended Capability register. For each PCI Segment in the platform that supports Device-IOTLBs, BIOS provides an ATSR structure. The ATSR structures identify PCI Express Root Ports supporting Address Translation Services (ATS) transactions. Software must enable ATS on endpoint devices behind a Root Port only if the Root Port is reported as supporting ATS transactions.

Field | Byte Length | Byte Offset | Description
Type | 2 | 0 | 2 - Root Port ATS Capability Reporting Structure
Length | 2 | 2 | Varies (8 + size of Device Scope Structure)
Flags | 1 | 4 | Bit 0: ALL_PORTS. If Set, indicates all PCI Express Root Ports in the specified PCI Segment support ATS transactions. If Clear, indicates ATS transactions are supported only on Root Ports identified through the Device Scope field. Bits 1-7: Reserved.
Reserved | 1 | 5 | Reserved (0).
Segment Number | 2 | 6 | The PCI Segment associated with this ATSR structure.
Device Scope[] | - | 8 | If the ALL_PORTS flag is Set, the Device Scope structure is omitted. If the ALL_PORTS flag is Clear, the Device Scope structure contains Device Scope Entries that identify Root Ports supporting ATS transactions. All Device Scope Entries in this structure must have a Device Scope Entry Type of 02h.
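The rule for deciding whether ATS may be enabled behind a given Root Port reduces to a small predicate over the ATSR fields. The sketch below assumes the Device Scope entries have already been resolved to (bus, device, function) triples; `struct atsr_view`, `struct root_port` and `root_port_has_ats` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical (bus, device, function) triple for a resolved Root Port. */
struct root_port {
    uint8_t bus, dev, fn;
};

/* Hypothetical decoded view of one ATSR structure. */
struct atsr_view {
    bool all_ports;                  /* Flags bit 0: ALL_PORTS */
    uint16_t segment;                /* Segment Number */
    const struct root_port *ports;   /* resolved Device Scope entries */
    size_t nports;                   /* 0 when ALL_PORTS is Set */
};

/* True if the ATSR reports ATS support for this Root Port. */
static bool root_port_has_ats(const struct atsr_view *a, uint16_t segment,
                              struct root_port rp)
{
    assert(a != NULL);
    if (a->segment != segment)
        return false;                /* ATSR covers a different segment */
    if (a->all_ports)
        return true;                 /* every Root Port in the segment */
    for (size_t i = 0; i < a->nports; i++)
        if (a->ports[i].bus == rp.bus && a->ports[i].dev == rp.dev &&
            a->ports[i].fn == rp.fn)
            return true;
    return false;
}
```

Software would enable ATS on an endpoint only when this predicate holds for the Root Port above it.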
