Some customers run into the question of how to set the host HBA queue depth. In fact, this value depends on the environment and is not a single fixed number. The article below describes a method for calculating it.
Obviously, different storage vendors have different FA port buffer sizes, so no one fixed value can be used in the calculation.
Generally, a best-practice value is set once during planning and design; the parameter is not dynamically adjusted every time the configuration changes.
参考如下:
Product:
CLARiiON CX4 Series, VNX Series
Description:
Performance analysis on the CLARiiON or VNX shows "Queue full" issues occurring on the FC ports and there is poor response time in VMware.
VMware event: VMK_SCSI_DEVICE_QUEUE_FULL (TASK SET FULL) = 0x28
vmkernel: ... Command 0x12 to device ... failed H:0x0 D:0x28 P:0x0 Possible sense data: 0x0 0x0 0x0.
The maximum queue depth on each Storage Processor Fibre Channel port is 1600. When a large number of initiators and LUNs are used then at times of peak I/O, the port queues can become full. At these times a "QFULL" status is returned to the HBA. When this is received the ESX host decreases its LUN queue depth to just 1 I/O. If no further QFULL status messages are received, ESX will slowly increase the queue depth, taking up to a minute to get back to normal.
Resolution:
Some OS and HBA drivers can limit their queue depths on a per path basis, which is normally referred to as the target queue depth. ESX limits the queue depth on a per LUN basis for each path.
To avoid overloading the storage processor (SP) front-end FC ports, the ideal maximum queue depth setting can be calculated using a combination of the number of initiators per SP port and the number of LUNs in use by ESX. Other initiators are likely to be sharing the same SP ports, so these will also need to have their queue depths limited (see solution emc204523). Normally, there would only be one path from the ESX server to any given SP port (this is best practice), so the calculation of the maximum queue depth would be:
Q = 1600 / (I * L)
Q is the Queue Depth or Execution Throttle, which is the maximum number of simultaneous I/Os for each LUN on any particular path to the SP
I is the number of initiators per SP port, which is equivalent to the number of ESX hosts plus all other hosts sharing the same SP ports
L is the number of LUNs for ESX which are sharing the same paths, which is equivalent to the number of LUNs in the ESX storage group
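The formula above can be sketched as a small helper (the function name is illustrative, not part of the KB):

```python
def max_queue_depth(port_queue_limit, initiators_per_port, luns_per_host):
    """Q = port limit / (I * L), rounded down so the port queue is never exceeded."""
    return port_queue_limit // (initiators_per_port * luns_per_host)

# CLARiiON/VNX SP FC ports allow 1600 outstanding I/Os;
# with 10 initiators per port and 16 LUNs per storage group:
q = max_queue_depth(1600, 10, 16)
print(q)  # 10
```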
Two ESX parameters should be set to this Q value. These are the queue depth of the storage adapter and "Disk.SchedNumReqOutstanding". However, "Disk.SchedNumReqOutstanding" is often set to a lower value than the HBA queue depth in order to prevent any particular virtual machine (VM) from completely filling up the HBA queue by itself and starving other VMs of I/O. If this is currently the case in the ESX environment, these settings should be decreased in proportion to each other. For example, if the HBA queue depth is 64 and "Disk.SchedNumReqOutstanding" is 32 (the default setting), then to reduce the QFULL events, the HBA queue depth could be set to 32 and "Disk.SchedNumReqOutstanding" set to 16.
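The proportional reduction described above can be expressed as a quick sketch (the helper name is hypothetical):

```python
def scale_settings(hba_depth, sched_outstanding, target_hba_depth):
    """Reduce the HBA queue depth to the target value and scale
    Disk.SchedNumReqOutstanding by the same factor, preserving their ratio."""
    factor = target_hba_depth / hba_depth
    return target_hba_depth, max(1, round(sched_outstanding * factor))

# The KB example: 64/32 halved to control QFULL becomes 32/16
print(scale_settings(64, 32, 32))  # (32, 16)
```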
For example, a farm of ten ESX servers has four paths to the CLARiiON (via two HBAs each), and these FC ports are dedicated for use by ESX (which makes keeping queue depths under control easier). There are multiple storage groups in this example to keep each ESX server's boot LUN private, but each storage group has sixteen LUNs. This leads to the following queue depth:
Q = 1600 / (10 * 16) = 10
In practice a certain amount of over-subscription would be fine because all LUNs on all servers are unlikely to be busy at the same time, especially if load balancing is used. So in the example above, a queue depth of sixteen should still not cause QFULL events under normal circumstances.
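To see why a depth of sixteen over-subscribes the port yet is normally safe, the worst case can be checked (illustrative helper):

```python
def worst_case_outstanding(queue_depth, initiators, luns_per_initiator):
    """Worst case: every initiator fills its queue on every LUN at once."""
    return queue_depth * initiators * luns_per_initiator

print(worst_case_outstanding(10, 10, 16))  # 1600 -- exactly at the port limit
print(worst_case_outstanding(16, 10, 16))  # 2560 -- over-subscribed; safe only
                                           # because all LUNs are rarely busy at once
```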
QFULL events are not logged in the SP logs, but are recorded in Unisphere (or Navisphere) Analyzer archive files (.NAR or .NAZ) in Release 26 or later. See solution emc204523 for further details.
For further information on changing these settings, see the following VMware knowledge base articles:
http://kb.vmware.com/kb/1267
http://kb.vmware.com/kb/1268
For more information on this, refer to Primus solution "emc274169".