linux中pci设备知识
Linux PCI设备驱动实际包括Linux PCI设备驱动和设备本身驱动两部分。PCI(Periheral Component Interconnect)有三种地址空间:PCI I/O空间、PCI内存地址空间和PCI配置空间。其中,PCI I/O空间和PCI内存地址空间由设备驱动程序使用,而PCI配置空间由Linux PCI初始化代码使用,用于配置PCI设备,比如中断号以及I/O或内存基地址。
1.1.1 内核工作
Linux内核主要就做了对PCI设备的枚举和配置;在Linux内核初始化时完成的。
对于PCI总线,有一个叫做PCI桥的设备用来将父总线与子总线连接。作为一种特殊的PCI设备,PCI桥主要包括以下三种:
1). Host/PCI桥: 用于连接CPU与PCI根总线,第1个根总线的编号为0。在PC中,内存控制器也通常被集成到Host/PCI桥设备芯片中,因此Host/PCI桥通常也被称为“北桥芯片组(North Bridge Chipset)”。
2). PCI/ISA桥: 用于连接旧的ISA总线。通常,PCI中类似i8359A中断控制器这样的设备也会被集成到PCI/ISA桥设备中。因此,PCI/ISA桥通常也被称为“南桥芯片组(South Bridge Chipset)”
3). PCI-to-PCI桥(以下称为PCI-PCI桥): 用于连接PCI主总线(Primary Bus)和次总线(Secondary Bus)。PCI-PCI桥所处的PCI总线称为主总线,即次总线的父总线;PCI-PCI桥所连接的PCI总线称为次总线,即主总线的子总线。
1.1.2 遍历
从Host/PCI桥开始进行探测和扫描,逐个“枚举”连接在第一条PCI总线上的所有设备并记录在案。如果其中的某个设备是PCI-PCI桥,则又进一步再探测和扫描连在这个桥上的次级PCI总线。就这样递归下去,直到穷尽系统中的所有PCI设备。其结果,是在内存中建立起一棵代表着这些PCI总线和设备的PCI树。
每个PCI设备(包括PCI桥设备)都由一个pci_dev结构体来表示,而每条PCI总线则由pci_bus结构来表示。
1.1.3 配置
PCI设备中一般都带有一些RAM和ROM 空间,通常的控制/状态寄存器和数据寄存器也往往以RAM区间的形式出现,而这些区间的地址在设备内部一般都是从0开始编址的,那么当总线上挂接了多个设备时,对这些空间的访问就会产生冲突。所以,这些地址都要先映射到系统总线上,再进一步映射到内核的虚拟地址空间。
配置就是通过对PCI配置空间的寄存器进行操作从而完成地址的映射。
1.1.4 数据结构
pci_driver数据结构定义在:include/linux/pci.h文件中。
struct pci_driver {
struct list_head node;
const char *name;
const struct pci_device_id *id_table; /* must be non-NULL for probe to be called */
int (*probe) (struct pci_dev *dev, const struct pci_device_id *id); /* New device inserted */
void (*remove) (struct pci_dev *dev); /* Device removed (NULL if not a hot-plug capable driver) */
int (*suspend) (struct pci_dev *dev, pm_message_t state); /* Device suspended */
int (*suspend_late) (struct pci_dev *dev, pm_message_t state);
int (*resume_early) (struct pci_dev *dev);
int (*resume) (struct pci_dev *dev); /* Device woken up */
void (*shutdown) (struct pci_dev *dev);
int (*sriov_configure) (struct pci_dev *dev, int num_vfs); /* PF pdev */
const struct pci_error_handlers *err_handler;
struct device_driver driver;
struct pci_dynids dynids;
};
pci_dev也定义在include/linux/pci.h文件中。
详细描述了一个PCI设备几乎所有的硬件信息,包括厂商ID、设备ID、各种资源等.
struct pci_dev {
struct list_head bus_list; /* node in per-bus list */
struct pci_bus *bus; /* bus this device is on */
struct pci_bus *subordinate; /* bus this device bridges to */
void *sysdata; /* hook for sys-specific extension */
struct proc_dir_entry *procent; /* device entry in /proc/bus/pci */
struct pci_slot *slot; /* Physical slot this device is in */
unsigned int devfn; /* encoded device & function index */
unsigned short vendor;
unsigned short device;
unsigned short subsystem_vendor;
unsigned short subsystem_device;
unsigned int class; /* 3 bytes: (base,sub,prog-if) */
u8 revision; /* PCI revision, low byte of class word */
u8 hdr_type; /* PCI header type (`multi' flag masked out) */
u8 pcie_cap; /* PCI-E capability offset */
u8 msi_cap; /* MSI capability offset */
u8 msix_cap; /* MSI-X capability offset */
u8 pcie_mpss:3; /* PCI-E Max Payload Size Supported */
u8 rom_base_reg; /* which config register controls the ROM */
u8 pin; /* which interrupt pin this device uses */
u16 pcie_flags_reg; /* cached PCI-E Capabilities Register */
struct pci_driver *driver; /* which driver has allocated this device */
u64 dma_mask; /* Mask of the bits of bus address this
device implements. Normally this is
0xffffffff. You only need to change
this if your device has broken DMA
or supports 64-bit transfers. */
struct device_dma_parameters dma_parms;
pci_power_t current_state; /* Current operating state. In ACPI-speak,
this is D0-D3, D0 being fully functional,
and D3 being off. */
u8 pm_cap; /* PM capability offset */
unsigned int pme_support:5; /* Bitmask of states from which PME#
can be generated */
unsigned int pme_interrupt:1;
unsigned int pme_poll:1; /* Poll device's PME status bit */
unsigned int d1_support:1; /* Low power state D1 is supported */
unsigned int d2_support:1; /* Low power state D2 is supported */
unsigned int no_d1d2:1; /* D1 and D2 are forbidden */
unsigned int no_d3cold:1; /* D3cold is forbidden */
unsigned int d3cold_allowed:1; /* D3cold is allowed by user */
unsigned int mmio_always_on:1; /* disallow turning off io/mem
decoding during bar sizing */
unsigned int wakeup_prepared:1;
unsigned int runtime_d3cold:1; /* whether go through runtime
D3cold, not set for devices
powered on/off by the
corresponding bridge */
unsigned int d3_delay; /* D3->D0 transition time in ms */
unsigned int d3cold_delay; /* D3cold->D0 transition time in ms */
#ifdef CONFIG_PCIEASPM
struct pcie_link_state *link_state; /* ASPM link state. */
#endif
pci_channel_state_t error_state; /* current connectivity state */
struct device dev; /* Generic device interface */
int cfg_size; /* Size of configuration space */
/*
* Instead of touching interrupt line and base address registers
* directly, use the values stored here. They might be different!
*/
unsigned int irq;
struct resource resource[DEVICE_COUNT_RESOURCE]; /* I/O and memory regions + expansion ROMs */
bool match_driver; /* Skip attaching driver */
/* These fields are used by common fixups */
unsigned int transparent:1; /* Transparent PCI bridge */
unsigned int multifunction:1;/* Part of multi-function device */
/* keep track of device state */
unsigned int is_added:1;
unsigned int is_busmaster:1; /* device is busmaster */
unsigned int no_msi:1; /* device may not use msi */
unsigned int block_cfg_access:1; /* config space access is blocked */
unsigned int broken_parity_status:1; /* Device generates false positive parity */
unsigned int irq_reroute_variant:2; /* device needs IRQ rerouting variant */
unsigned int msi_enabled:1;
unsigned int msix_enabled:1;
unsigned int ari_enabled:1; /* ARI forwarding */
unsigned int is_managed:1;
unsigned int is_pcie:1; /* Obsolete. Will be removed.
Use pci_is_pcie() instead */
unsigned int needs_freset:1; /* Dev requires fundamental reset */
unsigned int state_saved:1;
unsigned int is_physfn:1;
unsigned int is_virtfn:1;
unsigned int reset_fn:1;
unsigned int is_hotplug_bridge:1;
unsigned int __aer_firmware_first_valid:1;
unsigned int __aer_firmware_first:1;
unsigned int broken_intx_masking:1;
unsigned int io_window_1k:1; /* Intel P2P bridge 1K I/O windows */
pci_dev_flags_t dev_flags;
atomic_t enable_cnt; /* pci_enable_device has been called */
u32 saved_config_space[16]; /* config space saved at suspend time */
struct hlist_head saved_cap_space;
struct bin_attribute *rom_attr; /* attribute descriptor for sysfs ROM entry */
int rom_attr_enabled; /* has display of the rom attribute been enabled? */
struct bin_attribute *res_attr[DEVICE_COUNT_RESOURCE]; /* sysfs file for resources */
struct bin_attribute *res_attr_wc[DEVICE_COUNT_RESOURCE]; /* sysfs file for WC mapping of resources */
#ifdef CONFIG_PCI_MSI
struct list_head msi_list;
struct kset *msi_kset;
#endif
struct pci_vpd *vpd;
#ifdef CONFIG_PCI_ATS
union {
struct pci_sriov *sriov; /* SR-IOV capability related */
struct pci_dev *physfn; /* the PF this VF is associated with */
};
struct pci_ats *ats; /* Address Translation Service */
#endif
phys_addr_t rom; /* Physical address of ROM if it's not from the BAR */
size_t romlen; /* Length of ROM if it's not from the BAR */
};
static inline struct pci_dev *pci_physfn(struct pci_dev *dev)
{
#ifdef CONFIG_PCI_IOV
if (dev->is_virtfn)
dev = dev->physfn;
#endif
return dev;
}
1.1.5 pci初始化流程
pci这块代码在两个地方,一个是driver/pci另一个是arch/x86/pci中。
1.1.5.1 关于入口
pci系统中入口函数处理subsys_initcall之外,还有arch_initcall,postcore_initcall等等。
pci系统的初始化工作有内核来完成,在drivers/pci/probe.c文件中,调用postcore_initcall(pcibus_class_init);函数,在sys/class/下创建一个pci_bus目录
drivers/pci/pci-driver.c文件,postcore_initcall(pci_driver_init); 注册pci总线,并在/sys/bus/下创建了一个pci目录
drivers/pci/pci-acpi.c文件中,调用arch_initcall(acpi_pci_init);
arch/x86/pci/init.c文件中
arch_initcall(pci_arch_init);体系架构相关,对于64 bit x86来说使用CONFIG_PCI_DIRECT的方式进行访问PCI配置空间。在内核编译时候可以指定。
drivers/pci/slot.c文件中,subsys_initcall(pci_slot_init); 创建/sys/bus/slots文件。