[Dawn Blossoms Plucked at Dusk] RT four-digit series: modifying the RT-UFL flash algorithm for DQS and QE
1 Introduction
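Recently a customer used a QSPI flash (Puya Semiconductor P25Q16H) as the XIP memory in an RT1050 project. The first download always failed, but after a power cycle the next download succeeded and the app ran. The flash algorithm in use was RT-UFL, the "super" download algorithm. This symptom is usually caused by the QE (Quad Enable) bit not being set on a brand-new QSPI flash. So, based on the QE bit position of the customer's flash, I added the matching QE enable code to the SDK flexspi_nor_polling_transfer example and asked the customer to run it from RAM, to check whether the new-chip programming problem remained once QE was enabled.
However, the customer could not even get flexspi_nor_polling_transfer to run to completion. According to the customer's earlier description, the hardware can run code from RAM, and the download fails the first time but works on the next attempt, so the hardware can in fact run the app. From these symptoms, my first guess was that the new problem was related to the FlexSPI DQS pin being occupied; normally it is recommended to leave FlexSPI DQS floating. The code given to the customer runs FlexSPI at 120 MHz. If DQS is used by something else, an internal sample clock source for FlexSPI read data of "Read strobe provided by memory device and input from DQS pad" will cause problems. I asked the customer to check the hardware again, and indeed DQS was used as a control pin for another circuit on the customer's board. In this situation there are two points to observe:
First, keep the FlexSPI clock at or below 60 MHz.
Second, configure the internal sample clock source for FlexSPI read data as: Dummy read strobe generated by FlexSPI controller and looped back internally (FlexSPIn_MCR0[RXCLKSRC] = 0x0).
This article therefore focuses on how to prepare QE test code that matches the QE bit position of the customer's QSPI flash, how to handle the case where DQS is used by other circuitry, and how to modify and test the matching RT-UFL algorithm.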
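The two rules above can be captured in a small helper. This is only a sketch: `rx_clk_choice_t` and `pick_rx_sample_clock` are hypothetical names, not SDK APIs. The enum values mirror the FlexSPIn_MCR0[RXCLKSRC] encoding, and the behavior for a free DQS pad (loopback from the DQS pad at the requested frequency) is an assumption, not something stated in this article.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values follow the FlexSPIn_MCR0[RXCLKSRC] field encoding. */
typedef enum {
    RXCLKSRC_LOOPBACK_INTERNAL = 0x0, /* dummy read strobe looped back internally */
    RXCLKSRC_LOOPBACK_DQS_PAD  = 0x1  /* dummy read strobe looped back from the DQS pad */
} rxclksrc_t;

/* Hypothetical result type, for illustration only. */
typedef struct {
    rxclksrc_t rxclksrc;
    uint32_t   flexspi_hz;
} rx_clk_choice_t;

/* Upper limit used in this article when DQS is shared with other circuitry. */
#define DQS_SHARED_MAX_HZ 60000000U

static rx_clk_choice_t pick_rx_sample_clock(bool dqs_pad_free, uint32_t requested_hz)
{
    rx_clk_choice_t c;

    assert(requested_hz > 0U);
    if (dqs_pad_free) {
        /* Assumption: a floating DQS pad can be used for the loopback. */
        c.rxclksrc   = RXCLKSRC_LOOPBACK_DQS_PAD;
        c.flexspi_hz = requested_hz;
    } else {
        /* DQS is in use: loop back internally and cap the clock at 60 MHz. */
        c.rxclksrc   = RXCLKSRC_LOOPBACK_INTERNAL;
        c.flexspi_hz = (requested_hz > DQS_SHARED_MAX_HZ) ? DQS_SHARED_MAX_HZ : requested_hz;
    }
    return c;
}
```

For the customer's board, `pick_rx_sample_clock(false, 120000000U)` selects internal loopback at 60 MHz.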
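2 Software and hardware preparation
To reproduce the customer's problem, we need the corresponding hardware and software, the flash algorithm, and standalone code for testing QE.
2.1 Hardware preparation
MIMXRT1050-EVKB: rework the on-board resistors to switch from the default HyperFlash to QSPI flash. Rework: USE QSPI FLASH (mount R153~R158, DNP R356, R361~R366).
Remove the original ISSI QSPI flash at U33 and solder on the Puya Semiconductor P25Q16H used by the customer.
Because the customer uses J-Link, prepare a J-Link Plus for download and debugging.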
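2.2 flexspi_nor_polling_transfer software preparation
SDK 2.14.0 code: flexspi_nor_polling_transfer, used to test QE on its own.
Application code: for example, led_blinky.
RT-UFL flash algorithm source: https://github.com/JayHeng/RT-UFL
J-Link driver: this article uses JLINKV768B; newer versions also work.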
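2.2.1 P25Q16H QE bit position
Figure 1
As shown, QE sits at the usual position: bit 9 of the status register.
The corresponding LUT read and write commands are as follows:
Figure 2
For writes, the command is 0x01 and it writes 2 consecutive bytes. For reads, however, the two status register bytes have separate commands, 0x05 and 0x35. Keep this in mind when manipulating the QE bit on its own.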
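To make the byte layout concrete, here is a minimal sketch of how the 16-bit QE value maps onto the two bytes that follow the 0x01 write and onto the single byte returned by the 0x35 read. The helper names are hypothetical; the byte order (S7..S0 first, S15..S8 second) is the usual 0x01 write order and matches how a little-endian MCU lays out the `uint32_t` buffer passed to FlexSPI.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* QE is bit 9 of the 16-bit status register (value used in this article). */
#define FLASH_QUAD_ENABLE 0x0200U

/* Hypothetical helper: split a 16-bit status value into the two data bytes
 * sent after the 0x01 command, low byte (S7..S0) first. */
static void status_to_write_bytes(uint16_t status, uint8_t out[2])
{
    assert(out != NULL);
    out[0] = (uint8_t)(status & 0xFFU); /* S7..S0  */
    out[1] = (uint8_t)(status >> 8);    /* S15..S8 */
}

/* Hypothetical helper: test QE in the single byte returned by the 0x35 read,
 * which carries S15..S8, so QE (bit 9 overall) appears as bit 1. */
static bool qe_set_in_upper_status(uint8_t upper_status)
{
    return (upper_status & (uint8_t)(FLASH_QUAD_ENABLE >> 8)) != 0U;
}
```

With `FLASH_QUAD_ENABLE`, the write payload is 0x00 followed by 0x02, and a 0x35 read of 0x02 means QE is set.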
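2.2.2 flexspi_nor_polling_transfer code preparation
This code is mainly used to test enabling and disabling the flash QE bit, as well as erase, program, and read of the external flash.
The code changes are: adapt the LUT commands to the P25Q16H; add functions to read, set, and clear QE; change the FlexSPI frequency and use the internal loopback sample clock for the DQS case.
The relevant code is as follows:
LUT commands: flexspi_nor_polling_transfer.c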
const uint32_t customLUT[CUSTOM_LUT_LENGTH] = {
    /* Normal read mode -SDR */
    [4 * NOR_CMD_LUT_SEQ_IDX_READ_NORMAL] =
        FLEXSPI_LUT_SEQ(kFLEXSPI_Command_SDR, kFLEXSPI_1PAD, 0x03, kFLEXSPI_Command_RADDR_SDR, kFLEXSPI_1PAD, 0x18),
    [4 * NOR_CMD_LUT_SEQ_IDX_READ_NORMAL + 1] =
        FLEXSPI_LUT_SEQ(kFLEXSPI_Command_READ_SDR, kFLEXSPI_1PAD, 0x04, kFLEXSPI_Command_STOP, kFLEXSPI_1PAD, 0),
    /* Fast read mode - SDR */
    [4 * NOR_CMD_LUT_SEQ_IDX_READ_FAST] =
        FLEXSPI_LUT_SEQ(kFLEXSPI_Command_SDR, kFLEXSPI_1PAD, 0x0B, kFLEXSPI_Command_RADDR_SDR, kFLEXSPI_1PAD, 0x18),
    [4 * NOR_CMD_LUT_SEQ_IDX_READ_FAST + 1] = FLEXSPI_LUT_SEQ(
        kFLEXSPI_Command_DUMMY_SDR, kFLEXSPI_1PAD, 0x08, kFLEXSPI_Command_READ_SDR, kFLEXSPI_1PAD, 0x04),
    /* Fast read quad mode - SDR */
    [4 * NOR_CMD_LUT_SEQ_IDX_READ_FAST_QUAD] =
        FLEXSPI_LUT_SEQ(kFLEXSPI_Command_SDR, kFLEXSPI_1PAD, 0xEB, kFLEXSPI_Command_RADDR_SDR, kFLEXSPI_4PAD, 0x18),
    [4 * NOR_CMD_LUT_SEQ_IDX_READ_FAST_QUAD + 1] = FLEXSPI_LUT_SEQ(
        kFLEXSPI_Command_DUMMY_SDR, kFLEXSPI_4PAD, 0x06, kFLEXSPI_Command_READ_SDR, kFLEXSPI_4PAD, 0x04),
    /* Read extend parameters */
    [4 * NOR_CMD_LUT_SEQ_IDX_READSTATUS] =
        FLEXSPI_LUT_SEQ(kFLEXSPI_Command_SDR, kFLEXSPI_1PAD, 0x81, kFLEXSPI_Command_READ_SDR, kFLEXSPI_1PAD, 0x04),
    /* Write Enable */
    [4 * NOR_CMD_LUT_SEQ_IDX_WRITEENABLE] =
        FLEXSPI_LUT_SEQ(kFLEXSPI_Command_SDR, kFLEXSPI_1PAD, 0x06, kFLEXSPI_Command_STOP, kFLEXSPI_1PAD, 0),
    /* Erase Sector */
    [4 * NOR_CMD_LUT_SEQ_IDX_ERASESECTOR] =
        FLEXSPI_LUT_SEQ(kFLEXSPI_Command_SDR, kFLEXSPI_1PAD, 0x20, kFLEXSPI_Command_RADDR_SDR, kFLEXSPI_1PAD, 0x18),//0xD7
    /* Page Program - single mode */
    [4 * NOR_CMD_LUT_SEQ_IDX_PAGEPROGRAM_SINGLE] =
        FLEXSPI_LUT_SEQ(kFLEXSPI_Command_SDR, kFLEXSPI_1PAD, 0x02, kFLEXSPI_Command_RADDR_SDR, kFLEXSPI_1PAD, 0x18),
    [4 * NOR_CMD_LUT_SEQ_IDX_PAGEPROGRAM_SINGLE + 1] =
        FLEXSPI_LUT_SEQ(kFLEXSPI_Command_WRITE_SDR, kFLEXSPI_1PAD, 0x04, kFLEXSPI_Command_STOP, kFLEXSPI_1PAD, 0),
    /* Page Program - quad mode */
    [4 * NOR_CMD_LUT_SEQ_IDX_PAGEPROGRAM_QUAD] =
        FLEXSPI_LUT_SEQ(kFLEXSPI_Command_SDR, kFLEXSPI_1PAD, 0x32, kFLEXSPI_Command_RADDR_SDR, kFLEXSPI_1PAD, 0x18),
    [4 * NOR_CMD_LUT_SEQ_IDX_PAGEPROGRAM_QUAD + 1] =
        FLEXSPI_LUT_SEQ(kFLEXSPI_Command_WRITE_SDR, kFLEXSPI_4PAD, 0x04, kFLEXSPI_Command_STOP, kFLEXSPI_1PAD, 0),
    /* Read ID */
    [4 * NOR_CMD_LUT_SEQ_IDX_READID] =
        FLEXSPI_LUT_SEQ(kFLEXSPI_Command_SDR, kFLEXSPI_1PAD, 0x9F, kFLEXSPI_Command_READ_SDR, kFLEXSPI_1PAD, 0x04),
    /* Enable Quad mode */
    [4 * NOR_CMD_LUT_SEQ_IDX_WRITESTATUSREG] =
        FLEXSPI_LUT_SEQ(kFLEXSPI_Command_SDR, kFLEXSPI_1PAD, 0x01, kFLEXSPI_Command_WRITE_SDR, kFLEXSPI_1PAD, 0x04),
    /* Enter QPI mode */
    [4 * NOR_CMD_LUT_SEQ_IDX_ENTERQPI] =
        FLEXSPI_LUT_SEQ(kFLEXSPI_Command_SDR, kFLEXSPI_1PAD, 0x35, kFLEXSPI_Command_STOP, kFLEXSPI_1PAD, 0),
    /* Exit QPI mode */
    [4 * NOR_CMD_LUT_SEQ_IDX_EXITQPI] =
        FLEXSPI_LUT_SEQ(kFLEXSPI_Command_SDR, kFLEXSPI_4PAD, 0xF5, kFLEXSPI_Command_STOP, kFLEXSPI_1PAD, 0),
    /* Read status register */
    [4 * NOR_CMD_LUT_SEQ_IDX_READSTATUSREG] =
        FLEXSPI_LUT_SEQ(kFLEXSPI_Command_SDR, kFLEXSPI_1PAD, 0x05, kFLEXSPI_Command_READ_SDR, kFLEXSPI_1PAD, 0x04),
    /* Read status register, upper byte (0x35); this byte contains QE */
    [4 * NOR_CMD_LUT_SEQ_IDX_READSTATUSREG1] =
        FLEXSPI_LUT_SEQ(kFLEXSPI_Command_SDR, kFLEXSPI_1PAD, 0x35, kFLEXSPI_Command_READ_SDR, kFLEXSPI_1PAD, 0x04),
    /* Erase whole chip */
    [4 * NOR_CMD_LUT_SEQ_IDX_ERASECHIP] =
        FLEXSPI_LUT_SEQ(kFLEXSPI_Command_SDR, kFLEXSPI_1PAD, 0xC7, kFLEXSPI_Command_STOP, kFLEXSPI_1PAD, 0),//0xC7
};
flexspi_nor_flash_ops.c: QE read/write functions
status_t flexspi_nor_enable_quad_mode(FLEXSPI_Type *base)
{
    flexspi_transfer_t flashXfer;
    status_t status;
    uint32_t writeValue = FLASH_QUAD_ENABLE;
#if defined(CACHE_MAINTAIN) && CACHE_MAINTAIN
    flexspi_cache_status_t cacheStatus;
    flexspi_nor_disable_cache(&cacheStatus);
#endif
    /* Write enable */
    status = flexspi_nor_write_enable(base, 0);
    if (status != kStatus_Success)
    {
        return status;
    }
    /* Enable quad mode. */
    flashXfer.deviceAddress = 0;
    flashXfer.port = FLASH_PORT;
    flashXfer.cmdType = kFLEXSPI_Write;
    flashXfer.SeqNumber = 1;
    flashXfer.seqIndex = NOR_CMD_LUT_SEQ_IDX_WRITESTATUSREG;
    flashXfer.data = &writeValue;
    /* QE is in the second status byte here, so two bytes are written. */
    flashXfer.dataSize = writeValue <= 0xFFU ? 1 : 2;
    status = FLEXSPI_TransferBlocking(base, &flashXfer);
    if (status != kStatus_Success)
    {
        return status;
    }
    status = flexspi_nor_wait_bus_busy(base);
    /* Do software reset. */
    FLEXSPI_SoftwareReset(base);
#if defined(CACHE_MAINTAIN) && CACHE_MAINTAIN
    flexspi_nor_enable_cache(cacheStatus);
#endif
    return status;
}
status_t flexspi_nor_disable_quad_mode(FLEXSPI_Type *base)
{
    flexspi_transfer_t flashXfer;
    status_t status;
    uint32_t writeValue = 0x0;//FLASH_QUAD_ENABLE;
#if defined(CACHE_MAINTAIN) && CACHE_MAINTAIN
    flexspi_cache_status_t cacheStatus;
    flexspi_nor_disable_cache(&cacheStatus);
#endif
    /* Write enable */
    status = flexspi_nor_write_enable(base, 0);
    if (status != kStatus_Success)
    {
        return status;
    }
    /* Disable quad mode: write 0 to both status register bytes. */
    flashXfer.deviceAddress = 0;
    flashXfer.port = FLASH_PORT;
    flashXfer.cmdType = kFLEXSPI_Write;
    flashXfer.SeqNumber = 1;
    flashXfer.seqIndex = NOR_CMD_LUT_SEQ_IDX_WRITESTATUSREG;
    flashXfer.data = &writeValue;
    flashXfer.dataSize = 2;
    status = FLEXSPI_TransferBlocking(base, &flashXfer);
    if (status != kStatus_Success)
    {
        return status;
    }
    status = flexspi_nor_wait_bus_busy(base);
    /* Do software reset. */
    FLEXSPI_SoftwareReset(base);
#if defined(CACHE_MAINTAIN) && CACHE_MAINTAIN
    flexspi_nor_enable_cache(cacheStatus);
#endif
    return status;
}
status_t flexspi_nor_QE_register(FLEXSPI_Type *base, uint32_t *QEvalue)
{
    /* Wait status ready. */
    bool isBusy;
    uint32_t readValue = 0U; /* only 1 byte is read, so clear the upper bytes first */
    status_t status;
    flexspi_transfer_t flashXfer;
    flashXfer.deviceAddress = 0;
    flashXfer.port = FLASH_PORT;
    flashXfer.cmdType = kFLEXSPI_Read;
    flashXfer.SeqNumber = 1;
    flashXfer.seqIndex = NOR_CMD_LUT_SEQ_IDX_READSTATUSREG1;
    flashXfer.data = &readValue;
    flashXfer.dataSize = 1;
    do
    {
        status = FLEXSPI_TransferBlocking(base, &flashXfer);
        if (status != kStatus_Success)
        {
            return status;
        }
        if (FLASH_BUSY_STATUS_POL)
        {
            if (readValue & (1U << FLASH_BUSY_STATUS_OFFSET))
            {
                isBusy = true;
            }
            else
            {
                isBusy = false;
            }
        }
        else
        {
            if (readValue & (1U << FLASH_BUSY_STATUS_OFFSET))
            {
                isBusy = false;
            }
            else
            {
                isBusy = true;
            }
        }
        *QEvalue = readValue;
    } while (isBusy);
    return status;
}
QE bit position: app.h
#define FLASH_QUAD_ENABLE 0x0200
QE operation code: flexspi_nor_polling_transfer.c
PRINTF("Get the QE bit value before QE enable!\r\n");
uint32_t QEvalue=0;
status = flexspi_nor_QE_register(EXAMPLE_FLEXSPI, &QEvalue);
if (status != kStatus_Success)
{
    return status;
}
PRINTF("QE=%X!\r\n",(uint8_t)QEvalue);
#if 1
status = flexspi_nor_disable_quad_mode(EXAMPLE_FLEXSPI);
if (status != kStatus_Success)
{
    return status;
}
PRINTF("Get the QE bit value after QE disable!\r\n");
status = flexspi_nor_QE_register(EXAMPLE_FLEXSPI, &QEvalue);
if (status != kStatus_Success)
{
    return status;
}
PRINTF("QE=%X!\r\n",(uint8_t)QEvalue);
#endif
PRINTF("Enable the QE bit value !\r\n");
/* Enter quad mode. */
status = flexspi_nor_enable_quad_mode(EXAMPLE_FLEXSPI);
if (status != kStatus_Success)
{
    return status;
}
status = flexspi_nor_QE_register(EXAMPLE_FLEXSPI, &QEvalue);
if (status != kStatus_Success)
{
    return status;
}
PRINTF("QE=%X!\r\n",(uint8_t)QEvalue);
FlexSPI frequency change: flexspi_nor_polling_transfer.c, app.h
flexspi_device_config_t deviceconfig = {
    .flexspiRootClk = 60000000,
    .flashSize = FLASH_SIZE,
    .CSIntervalUnit = kFLEXSPI_CsIntervalUnit1SckCycle,
    .CSInterval = 2,
    .CSHoldTime = 3,
    .CSSetupTime = 3,
    .dataValidTime = 0,
    .columnspace = 0,
    .enableWordAddress = 0,
    .AWRSeqIndex = 0,
    .AWRSeqNumber = 0,
    .ARDSeqIndex = NOR_CMD_LUT_SEQ_IDX_READ_FAST_QUAD,
    .ARDSeqNumber = 1,
    .AHBWriteWaitUnit = kFLEXSPI_AhbWriteWaitUnit2AhbCycle,
    .AHBWriteWaitInterval = 0,
};
static inline void flexspi_clock_init(void)
{
    const clock_usb_pll_config_t g_ccmConfigUsbPll = {.loopDivider = 0U};
    CLOCK_InitUsb1Pll(&g_ccmConfigUsbPll);
    CLOCK_InitUsb1Pfd(kCLOCK_Pfd0, 24); /* Set PLL3 PFD0 clock 360MHZ. */
    CLOCK_SetMux(kCLOCK_FlexspiMux, 0x3); /* Choose PLL3 PFD0 clock as flexspi source clock. */
    CLOCK_SetDiv(kCLOCK_FlexspiDiv, 5); /* flexspi clock 60M. */
}
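The 360 MHz and 60 MHz figures in the comments above follow from simple arithmetic, sketched here with hypothetical helper names. It assumes the usual i.MX RT1050 relationships: PLL3 (USB1 PLL) runs at 480 MHz, a PFD output is PLL x 18 / frac, and the FlexSPI divider field divides by (value + 1). Verify these against the reference manual for your part.

```c
#include <assert.h>
#include <stdint.h>

#define USB1_PLL_HZ 480000000U /* assumed PLL3 output frequency */

/* PFD output = PLL * 18 / frac, with frac in the valid range 12..35. */
static uint32_t pfd_output_hz(uint32_t pll_hz, uint32_t frac)
{
    assert(frac >= 12U && frac <= 35U);
    return (uint32_t)(((uint64_t)pll_hz * 18U) / frac);
}

/* The divider field holds (divisor - 1), so a value of 5 divides by 6. */
static uint32_t flexspi_root_hz(uint32_t pfd_hz, uint32_t podf)
{
    return pfd_hz / (podf + 1U);
}
```

With frac = 24 the PFD runs at 360 MHz; a divider value of 5 then gives 60 MHz, while a divider value of 2 would give 120 MHz.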
Loop back internally: app.h
#define EXAMPLE_FLEXSPI_RX_SAMPLE_CLOCK kFLEXSPI_ReadSampleClkLoopbackInternally
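2.2.3 Testing the modified flexspi_nor_polling_transfer
Downloading the modified code to RT1050 RAM and running it gives the following result:
Figure 3
The figure shows that QE can be read, cleared, and set correctly. The first read returns QE = 2, which means QE is already enabled (the 0x35 read returns the upper status byte, where QE is bit 1, i.e. 0x02); this is because the QSPI flash used here had already been operated on. A brand-new chip reads 0 by default, meaning QE is not enabled. The figure also shows that the modified code erases, programs, and reads the external flash correctly, so the changes work: the LUT, the QE position, and the DQS handling (60 MHz + internal loopback) are all functional.
2.3 APP preparation
Use the led_blinky code from the SDK. The main changes are the frequency and readSampleClkSrc in the FCB.
evkbimxrt1050_flexspi_nor_config.c is modified as follows: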
const flexspi_nor_config_t qspiflash_config = {
    .memConfig =
        {
            .tag = FLEXSPI_CFG_BLK_TAG,
            .version = FLEXSPI_CFG_BLK_VERSION,
            .readSampleClkSrc = kFlexSPIReadSampleClk_LoopbackInternally,
            .csHoldTime = 3u,
            .csSetupTime = 3u,
            .controllerMiscOption = (1u << kFlexSpiMiscOffset_SafeConfigFreqEnable),
            .deviceType = kFlexSpiDeviceType_SerialNOR,
            .sflashPadType = kSerialFlash_4Pads,
            .serialClkFreq = kFlexSpiSerialClk_60MHz,
            .sflashA1Size = 8u * 1024u * 1024u,
            …
}
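This code is used later for debug testing with the new flash algorithm. Build it to generate a .srec file for the J-Flash programming test.
3 RT-UFL J-Link flash algorithm modification
After downloading the RT-UFL source, modify it for the two factors discussed above:
First, QE enable; second, DQS being used by other circuitry.
For the setup in this article, RT-UFL still initializes FlexSPI through the ROM API using option words.
Based on the description of each option field, select:
OPTION 0: 0xc0000201
OPTION 1: 0x0
This corresponds to the following case:
Figure 4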
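To see what 0xc0000201 encodes, the word can be split into 4-bit fields. The sketch below assumes the option0 layout of the ROM API's serial_nor_config_option_t (max_freq in bits 3:0 up to tag in bits 31:28) as I understand it; check it against the reference manual and Figure 4 before relying on it. `option0_fields_t` and `decode_option0` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed field layout of option0, one 4-bit field per nibble. */
typedef struct {
    uint8_t max_freq;          /* bits  3:0  */
    uint8_t misc_mode;         /* bits  7:4  */
    uint8_t quad_mode_setting; /* bits 11:8  */
    uint8_t cmd_pads;          /* bits 15:12 */
    uint8_t query_pads;        /* bits 19:16 */
    uint8_t device_type;       /* bits 23:20 */
    uint8_t option_size;       /* bits 27:24 */
    uint8_t tag;               /* bits 31:28 */
} option0_fields_t;

/* Hypothetical helper: extract each nibble of option0 into its own field. */
static void decode_option0(uint32_t option0, option0_fields_t *f)
{
    assert(f != NULL);
    f->max_freq          = (uint8_t)((option0 >> 0)  & 0xFU);
    f->misc_mode         = (uint8_t)((option0 >> 4)  & 0xFU);
    f->quad_mode_setting = (uint8_t)((option0 >> 8)  & 0xFU);
    f->cmd_pads          = (uint8_t)((option0 >> 12) & 0xFU);
    f->query_pads        = (uint8_t)((option0 >> 16) & 0xFU);
    f->device_type       = (uint8_t)((option0 >> 20) & 0xFU);
    f->option_size       = (uint8_t)((option0 >> 24) & 0xFU);
    f->tag               = (uint8_t)((option0 >> 28) & 0xFU);
}
```

Under this layout, 0xc0000201 decodes to tag = 0xC, quad_mode_setting = 2 (the "set bit 1 in Status Register 2" case named in the RT-UFL source comment), max_freq = 1, and every other field 0.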
3.1 RT-UFL code modification
This article uses the Keil project:
\RT-UFL-1.0\build\mdk
The code changes are as follows:
Ufl_main.c: ufl_set_target_property
case kChipId_RT105x:
    uflTargetDesc->flexspiInstance = MIMXRT105X_1st_FLEXSPI_INSTANCE;
    uflTargetDesc->flexspiBaseAddr = MIMXRT105X_1st_FLEXSPI_BASE;
    uflTargetDesc->flashBaseAddr = MIMXRT105X_1st_FLEXSPI_AMBA_BASE;
    // P25Q16H: QE is set via bit 1 in Status Register 2 {.option0.U = 0xc0000201, .option1.U = 0x00000000},
    uflTargetDesc->configOption.option0.U = 0xc0000201;
    uflTargetDesc->configOption.option1.U = 0x0;
Ufl_romapi.c: readSampleClkSrc configuration
status_t flexspi_nor_auto_config(uint32_t instance, flexspi_nor_config_t *config, serial_nor_config_option_t *option)
{
    // Wait until the FLEXSPI is idle
    register uint32_t delaycnt = 10000u;
    while(delaycnt--)
    {
    }
    status_t status = flexspi_nor_get_config(instance, config, option);
    if (status != kStatus_Success)
    {
        return status;
    }
    // DQS is used by another circuit on this board, so force internal loopback
    config->memConfig.readSampleClkSrc = kFlexSPIReadSampleClk_LoopbackInternally;
    return flexspi_nor_flash_init(instance, config);
}
FlashDev.c
struct FlashDevice const FlashDevice = {
    FLASH_DRV_VERS,     // Driver Version, do not modify!
    "MIMXRT_FLEXSPI",   // Device Name
    EXTSPI,             // Device Type
    0x60000000,         // Device Start Address
    0x00800000,         // Device Size in Bytes (8 MB)
    256,                // Programming Page Size
    0,                  // Reserved, must be 0
    0xFF,               // Initial Content of Erased Memory
    100,                // Program Page Timeout 100 mSec
    5000,               // Erase Sector Timeout 5000 mSec
    // Specify Size and Address of Sectors
    0x1000, 0x00000000, // Sector Size 4 kB (2048 sectors in 8 MB)
    SECTOR_END
};
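The sector and page counts implied by this descriptor are easy to check by hand. A minimal sketch with hypothetical helper names; the 2 MB figure in the note below assumes the P25Q16H is a 16 Mbit part, as its name suggests.

```c
#include <assert.h>
#include <stdint.h>

/* Number of erase sectors in a device of the given size. */
static uint32_t sector_count(uint32_t device_bytes, uint32_t sector_bytes)
{
    assert(sector_bytes != 0U && (device_bytes % sector_bytes) == 0U);
    return device_bytes / sector_bytes;
}

/* Number of program pages that fit in one erase sector. */
static uint32_t pages_per_sector(uint32_t sector_bytes, uint32_t page_bytes)
{
    assert(page_bytes != 0U && (sector_bytes % page_bytes) == 0U);
    return sector_bytes / page_bytes;
}
```

With the values in FlashDev.c, the declared 8 MB window (0x00800000) holds 2048 sectors of 4 KB, each made of 16 pages of 256 bytes. A 2 MB device would occupy only the first 512 sectors of that window.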
FlashOS.h: this mainly ensures that the build is of the UFL_L0 type and defines the flash page and sector sizes
#define FLASH_DRV_SIZE_OPT (0)
#if (FLASH_DRV_SIZE_OPT == 0)
#define FLASH_DRV_PAGE_SIZE (0x100)
#define FLASH_DRV_SECTOR_SIZE (0x1000)
#elif (FLASH_DRV_SIZE_OPT == 1)
#define FLASH_DRV_PAGE_SIZE (0x200)
#define FLASH_DRV_SECTOR_SIZE (0x1000)
#elif (FLASH_DRV_SIZE_OPT == 2)
#define FLASH_DRV_PAGE_SIZE (0x200)
#define FLASH_DRV_SECTOR_SIZE (0x10000)
#endif
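Build the project to get: MIMXRT_FLEXSPI_UV5_UFL.FLM
Rename it to: MIMXRT_FLEXSPI_UV5_UFL_P25Q16H.FLM
3.2 Updating the J-Link flash algorithm
With the J-Link driver installed, the next step is to use the RT-UFL algorithm.
Follow this article to replace the J-Link flash algorithm with the RT-UFL algorithm:
https://www.cnblogs.com/henjay724/p/14942574.html
Then, on top of that, add the algorithm just modified for the P25Q16H, as follows:
(1) Copy RT1050_P25Q16H_JLINK\program\JLinkDevices.xml from the attachment to:
C:\Program Files\SEGGER\JLINKV768B
Figure 5
The current modification looks like this:
Figure 6
Note the device name: MIMXRT1050_UFL_P25Q16H
(2) Copy: RT1050_P25Q16H_JLINK\program\MIMXRT_FLEXSPI_UV5_UFL_P25Q16H.FLM
to: C:\Program Files\SEGGER\JLINKV768B\Devices\NXP\iMXRT_UFL
Figure 7
The MIMXRT_FLEXSPI_UV5_UFL_P25Q16H.FLM file here is the modified algorithm built above.
(3) Run C:\Program Files\SEGGER\JLINKV768B\JLinkDLLUpdater.exe to update the J-Link DLL used by the IDE (IAR).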
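3.3 Algorithm download test
To use an external J-Link with the MIMXRT1050-EVKB, disconnect J33 on the EVKB and plug the JTAG connector into J21.
3.3.1 Download test with J-Flash
First, use the modified EVKB-IMXRT1050 flexspi_nor_polling_transfer to clear the QE bit, simulating a brand-new QSPI flash chip. The result is as follows:
Figure 8
The J-Flash test result is as follows:
Figure 9
As shown, J-Flash programs the flash successfully on the first attempt.
3.3.2 led_blinky app debug test
Clear the QE bit to simulate a brand-new QSPI flash chip, as in Figure 8.
The app demo uses the IAR project; select J-Link as the debugger:
Figure 10
Figure 11
Note that the device must be set to the device name of the modified RT-UFL algorithm. To do so, edit the settings->xxx.jlink file generated by an IAR debug session as follows:
Figure 12
The key settings are override = 1 and device set to the device name of the modified algorithm.
Debug test result:
Figure 13
As shown, debugging succeeds, and the algorithm that ran is the modified UFL algorithm.
Running at full speed, the LED blinks.
This shows that the algorithm, hardware, and code all now support the new P25Q16H QSPI flash.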
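4 Summary
When using a new QSPI flash, first check the QE bit position and whether DQS is used by other circuitry, then prepare the matching RT-UFL flash algorithm. By default the UFL algorithm supports the vast majority of flash chips; for the QE and DQS-in-use cases, it only needs minor tweaks. With the algorithm modifications described in this article, the customer's programming problem was solved. For other QSPI flash devices, the same method can be used to adapt the flash algorithm to your own project's needs.
Attachment download link: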
https://community.nxp.com/t5/i-MX-RT-Knowledge-Base/RT10XX-RT-UFL-modification-for-QSPI-QE-and-DQS-factor/ta-p/1740129