现象:
Linux环境下,如果文件系统类型为EXT3,由于不支持快速预分配,DB2表空间扩容会非常慢,在扩容器间观察到有 SQLO_LT_SQLB_POOL_CB__readLotch 和 SQLO_LT_SQLB_PTBL__pool_table_latch 等待现象,示例
Database Member 0 -- Active -- Up 4 days 01:34:48 -- Date 2020-06-08-08.40.39.493862
Latches:
Address Holder Waiter Filename LOC LatchType HoldCount
0x00000002018D0470 14 0 Unknown 1391 SQLO_LT_sqeWLDispatcher__m_tunerLatch 1
0x00007F57E872AE20 53 31213 Unknown 5598 SQLO_LT_SQLB_POOL_CB__readLotch 1
0x00007F57E872AE20 53 31219 Unknown 5598 SQLO_LT_SQLB_POOL_CB__readLotch 1
0x0000000203577C28 53 0 Unknown 720 SQLO_LT_preventSuspendIOLotch 1
0x00007F57E872AEA0 53 0 Unknown 5598 SQLO_LT_SQLB_POOL_CB__ptfLotch 1
0x0000000203577CE8 31213 0 Unknown 3780 SQLO_LT_sqeSuspendIOCB__m_suspendIOCBLatch 1
0x00007F57E6C536F0 31219 31214 Unknown 3095 SQLO_LT_SQLB_PTBL__pool_table_latch 1
0x0000000203576C08 31219 0 Unknown 2460 SQLO_LT_sqlmon_dbcb__inSnapshotLatch 1
并且以下语句会由于上面Latch等待HANG住:
同一个表空间下创建新表、db2 list tablespaces
查询mon_get_bufferpool()、MON_GET_database()、sysibmadm.tbsp_utilization
当然,还可能会引发其他问题,这里不作讨论。
其中 "alter tablespace extend"的stack如下(latch holder):
<StackTrace>
------FUNCTION + OFFSET------
write + 0x002d
sqlowrite + 0x02c3
sqloSetFileSize + 0x07a5
_Z22sqlbResizeDMSContainerP12SQLB_POOL_CBijjP12SQLB_GLOBALS + 0x0166
_Z36sqlbServiceResizeDMSContainerRequestP14SQLB_pfParIoCbP12SQLB_POOL_CBP16SQLB_POOLCONT_CBP12SQLB_GLOBALS + 0x010e
_Z33sqlbPerformDMSContainerOperationsP12SQLB_POOL_CBS0_mP16SQLB_POOLCONT_CBbP26SQLB_AS_CONT_AND_PATH_INFOP12SQLB_GLOBALS + 0x124d
_Z13sqlbAlterPooltP26SQLB_TABLESPACE_ATTRIBUTESbP9SQLP_LSN8P12SQLB_GLOBALS + 0x2299
_Z13sqldPoolAlterP8sqeAgenttP26SQLB_TABLESPACE_ATTRIBUTES + 0x043d
_Z17sqlrlAlterTbspaceP8sqlrr_cbPhsP19sqlrl_tbspace_parms + 0x060e
_Z27sqlnq_alter_tablespace_stmtPP8stknode_i10actiontypePhP3loc + 0x006d
_Z12sqlnp_smactnP8sqlnp_cbi + 0x0a8d
_Z12sqlnp_parserP8sqlnp_cb + 0x07df
_Z10sqlnp_mainP12sqlnq_stringbP3locPP9sqlnq_qur + 0x02a1
...
</StackTrace>
"create table t3(id int) in tbs4"的stack如下:
<StackTrace>
------FUNCTION + OFFSET------
_ZN17SQLO_SLATCH_CAS6418getConflictComplexEm + 0x0579
_ZN17SQLO_SLATCH_CAS6411getConflictEm + 0x0051
_ZN12sqlpValLotch12getLatchOnlyEmPKcm + 0x011c
_Z20sqlbCommonWriteSetupP12SQLB_GLOBALSP8SQLB_BPDP9SQLB_PAGEP11SQLO_IO_REQP15SQLB_RANGE_INFOP21SqlbLocalOpenFileInfobt + 0x1414
sqlbWritePage + 0x0704
_Z13sqlbUnfixPageP11SQLB_FIX_CB + 0x10d7
_Z10sqldAddRowP13SQLD_DFM_WORK + 0x02c2
_Z13sqldInsertRowP13SQLD_DFM_WORKi + 0x01a9
_Z8sqldoisrP8sqeAgenttPtP15SQLD_OBJTAB_REC + 0x04c3
_Z23sqldInternalCreateTableP8sqeAgentP19SQLD_TABLECREATE_CB + 0x0577
_Z15sqldTableCreateP8sqeAgentP19SQLD_TABLECREATE_CB + 0x00e8
_Z7sqlrlctP8sqlrr_cbP8sqlrg_pdPttS3_S3_tP5doid1S3_hhhP10sqlrl_rngsP10sqlrl_sstrS9_ + 0x04e7
_Z8sqlrlcrtP8sqlrr_cbPhsS1_sP8sqlrg_pdP9sqlrg_hpdP17sqlrl_table_parmsP11sqlrg_vdhdrihP18sqlrg_datapartinfo + 0x18ef
_Z22sqlnq_create_table_endP9sqlnq_qtbmiS0_RmP10sqlrl_fldsP25sqlnq_base_table_seq_info + 0x0d9e
_Z17sqlnq_table_defn1PP8stknode_i10actiontypePhP3loc + 0x0138
...
</StackTrace>
查询 mon_get_bufferpool(null,-2)的stack如下:
<StackTrace>
------FUNCTION + OFFSET------
_ZN17SQLO_SLATCH_CAS6418getConflictComplexEm + 0x0579
_ZN17SQLO_SLATCH_CAS6411getConflictEm + 0x0051
_Z21sqlbLatchPoolR_inlineP12SQLB_POOL_CBibiPKc + 0x08e9
_ZN31sqlrwFilteredTablespaceIterator7executeEv + 0x019d
_ZN29sqlrwBufferpoolMetricsBuilder7executeEP15SQLB_BufferPool + 0x01ac
_ZN31sqlrwFilteredBufferpoolIterator7executeEv + 0x0114
_Z25sqlrwGetBufferpoolMetricsP8sqlrr_cbP29sqlrwGetBufferpoolMetricsArgsPPvPl + 0x0135
_Z30sqlrwGetWLMTableFunctionResultP8sqlrr_cbP20sqlrw_rpc_tf_requestPPvPlb + 0x01a2
_Z36sqlrwGetWLMTableFunctionMergedResultjPPv + 0x01f6
_Z29sqlerTrustedRtnCallbackRouterjPPv + 0x00bc
monGetBufferpool + 0x18da
...
</StackTrace>
查询 MON_GET_database(-2)的stack如下:
<StackTrace>
------FUNCTION + OFFSET------
_ZN17SQLO_SLATCH_CAS6418getConflictComplexEm + 0x0579
_ZN17SQLO_SLATCH_CAS6411getConflictEm + 0x0051
_Z21sqlbLatchPoolR_inlineP12SQLB_POOL_CBibiPKc + 0x08e9
_ZN31sqlrwFilteredTablespaceIterator7executeEv + 0x019d
_Z23sqlrwGetDatabaseMetricsP8sqlrr_cbPPvPl + 0x0a52
_Z30sqlrwGetWLMTableFunctionResultP8sqlrr_cbP20sqlrw_rpc_tf_requestPPvPlb + 0x03ea
_Z36sqlrwGetWLMTableFunctionMergedResultjPPv + 0x01f6
_Z29sqlerTrustedRtnCallbackRouterjPPv + 0x00bc
monGetDatabase_v105fp4 + 0x02aa
...
</StackTrace>
查询 sysibmadm.tbsp_utilization 的stack如下:
db2fmp stack 如下
<StackTrace>
------FUNCTION + OFFSET------
_ZN17SQLO_SLATCH_CAS6418getConflictComplexEm + 0x0579
_ZN17SQLO_SLATCH_CAS6411getConflictEm + 0x0051
_ZN12sqlpValLotch12getLatchOnlyEmPKcm + 0x011c
_Z21sqlbLatchPoolR_inlineP12SQLB_POOL_CBibiPKc + 0x0e4e
_Z30sqlbSnapshotTablespaceEstimatejP16sqeLocalDatabaseb + 0x01ae
_Z12sqlmonszagntj13sqm_entity_idP6sqlmaiPjP5sqlca + 0x0534
_Z15sqlmonszbackendP12SQLE_DB2RA_T + 0x093f
_Z8sqlesrvrP14db2UCinterface + 0x1542
_Z19sqleMappingFnServerP5sqldaP5sqlca + 0x04de
...
</StackTrace>
list tablespaces的stack如下:
如果只有Extend tablespace和list tablespaces在执行,那么 list tablespaces 在等前者所持有的 SQLO_LT_SQLB_POOL_CB__readLotch。 如果还有别的应用在同时执行,比如 select * from sysibmadm.tbsp_utilization,那么list tablespaces可能在等sysibmadm.tbsp_utilization所持有的 SQLO_LT_SQLB_PTBL__pool_table_latch
<StackTrace>
------FUNCTION + OFFSET------
_ZN17SQLO_SLATCH_CAS6418getConflictComplexEm + 0x0579
_ZN17SQLO_SLATCH_CAS6411getConflictEm + 0x0051
_ZN12sqlpValLotch12getLatchOnlyEmPKcm + 0x011c
_Z12sqlbQryPoolsP12SQLB_GLOBALSP13SQLO_MEM_POOLPijPP17SQLB_TBSPQRY_DATA + 0x076b
_Z15sqlbOpenPoolQryP12SQLB_GLOBALSjPi + 0x0042
_Z17sqlbotsq_route_inP12SQLE_DB2RA_TP8sqeAgent + 0x0055
_Z8sqlesrvrP14db2UCinterface + 0x13b2
...
</StackTrace>