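Problem Description
On the SDX65 platform, the system crashes intermittently when a QPS615 switch is attached over PCIe; RAM dump analysis points to a NoC error.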
Problem Description: A NoC timeout error occurs while IPA is trying to access the HSP doorbell registers
Problem Recovery: Disable L0s on the SDX65 and the QPS615 switch
Problem Occurrence: MST lab testing
Problem Scenario: Modem data transfer
Log
TZ - - 0x00000006 (SOC_ERR_FATAL_NOC_ERROR ) on CPU 0
Change Description
PCI: Disable L0s support for SDX65 with QPS615 on CPE platform
NoC timeout errors are seen when an HSP is attached over the QPS615
switch and IPA accesses HSP-specific registers.
At the time of the failure, the link state in the PARF register dump
showed that the link was in L0s.
Disable the L0s state as a workaround (vetted by the hardware
verification team); this change is expected to have minimal power
impact.
Code Change
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index bbe4e6d1ba4bd..eeb4e2af29ff8 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -2334,6 +2334,22 @@ DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x10f1, quirk_disable_aspm_l0s);
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x10f4, quirk_disable_aspm_l0s);
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x1508, quirk_disable_aspm_l0s);
+/*
+ * QPS615 PCIe-PCI bridge devices cause AER timeout errors on the upstream
+ * PCIe root port when L0s is enabled in CPE platform with SDX65.
+ * Disable L0s for both QPS615 and SDX65 when QPS615 switch is
+ * present.
+ */
+static void quirk_disable_aspm_qps615_l0s(struct pci_dev *dev)
+{
+ struct pci_dev *p;
+
+ pci_disable_link_state(dev, PCIE_LINK_STATE_L0S);
+ p = pci_get_device(PCI_VENDOR_ID_QCOM, 0x0308, NULL);
+ pci_disable_link_state(p, PCIE_LINK_STATE_L0S);
+}
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_TOSHIBA, 0x0623, quirk_disable_aspm_qps615_l0s);
+
static void quirk_disable_aspm_l0s_l1(struct pci_dev *dev)
{
pci_info(dev, "Disabling ASPM L0s/L1\n");
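Review note: the quirk in the diff passes the result of pci_get_device() straight to pci_disable_link_state(). pci_get_device() returns NULL when no matching device exists, and otherwise returns the device with its reference count raised, so a more defensive variant would check for NULL and call pci_dev_put() when done. The sketch below models only that control flow in user space. struct pci_dev and the mock_* helpers are hypothetical stand-ins written for this note, not the kernel definitions, and this is not the patch that was merged.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins for the kernel type and helpers. */
#define PCIE_LINK_STATE_L0S 1

struct pci_dev {
    int refs;             /* models the device reference count */
    int disabled_states;  /* models the ASPM states that were disabled */
};

/* NULL models "no device with the looked-up vendor/device ID is present". */
static struct pci_dev *mock_root_port;

/* Like pci_get_device(): returns NULL or a device with a reference taken. */
static struct pci_dev *mock_get_device(void)
{
    if (mock_root_port)
        mock_root_port->refs++;
    return mock_root_port;
}

/* Like pci_dev_put(): drops one reference; NULL is tolerated. */
static void mock_dev_put(struct pci_dev *dev)
{
    if (dev)
        dev->refs--;
}

/* Like pci_disable_link_state(): must not be handed a NULL device. */
static void mock_disable_link_state(struct pci_dev *dev, int state)
{
    assert(dev != NULL);
    dev->disabled_states |= state;
}

/* Control flow of a more defensive quirk. */
static void quirk_disable_l0s_checked(struct pci_dev *dev)
{
    struct pci_dev *p;

    /* Always disable L0s on the switch that triggered the quirk. */
    mock_disable_link_state(dev, PCIE_LINK_STATE_L0S);

    p = mock_get_device();
    if (!p)
        return;           /* second device absent: nothing more to do */

    mock_disable_link_state(p, PCIE_LINK_STATE_L0S);
    mock_dev_put(p);      /* release the reference taken by the lookup */
}
```

In the kernel the same shape would use pci_get_device(), pci_disable_link_state() and pci_dev_put() directly; whether the extra checks are needed on this platform is a judgment for the patch owner.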
Change Verification
Change Verification: Verified by MST; no issues were seen for more than 10 days
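In addition to soak testing, one way to confirm that the quirk took effect is to inspect the ASPM Control field (bits 1:0 of the PCIe Link Control register) of each device. The following is a minimal user-space sketch that walks the capability list in a raw configuration-space buffer and returns that field. The function name pcie_aspm_control is an assumption made for this example; the register offsets follow the standard PCI configuration layout and are defined locally.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PCI_STATUS              0x06 /* 16-bit Status register */
#define PCI_STATUS_CAP_LIST     0x10 /* capability list is present */
#define PCI_CAPABILITY_LIST     0x34 /* pointer to the first capability */
#define PCI_CAP_ID_EXP          0x10 /* PCI Express capability ID */
#define PCI_EXP_LNKCTL          0x10 /* Link Control offset in the capability */
#define PCI_EXP_LNKCTL_ASPM_L0S 0x01 /* L0s entry enabled */
#define PCI_EXP_LNKCTL_ASPM_L1  0x02 /* L1 entry enabled */

/*
 * Return the ASPM Control bits (0..3) from a raw config-space image,
 * or -1 if the buffer is too short or has no PCI Express capability.
 */
int pcie_aspm_control(const uint8_t *cfg, size_t len)
{
    assert(cfg != NULL);

    if (len < 0x40)
        return -1;

    uint16_t status = (uint16_t)(cfg[PCI_STATUS] | (cfg[PCI_STATUS + 1] << 8));
    if (!(status & PCI_STATUS_CAP_LIST))
        return -1;

    uint8_t pos = cfg[PCI_CAPABILITY_LIST] & 0xFC;

    /* Capabilities live at 0x40 and above; cap the hop count to avoid loops. */
    for (int hops = 0; pos >= 0x40 && hops < 48; hops++) {
        if ((size_t)pos + 2 > len)
            return -1;
        if (cfg[pos] == PCI_CAP_ID_EXP) {
            if ((size_t)pos + PCI_EXP_LNKCTL + 2 > len)
                return -1;
            return cfg[pos + PCI_EXP_LNKCTL] & 0x03;
        }
        pos = cfg[pos + 1] & 0xFC; /* follow the next-capability pointer */
    }
    return -1;
}
```

On a Linux target the buffer could be filled from /sys/bus/pci/devices/<BDF>/config, read as root since unprivileged reads are limited to the first 64 bytes. With the quirk applied, the PCI_EXP_LNKCTL_ASPM_L0S bit would be expected to read as clear on both the QPS615 and the SDX65 side. The LnkCtl line of `lspci -vv` should report the same field.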