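top shows that the overall system load is very low, but the sys utilization of CPU2 is above 90%. Looking at the running processes, the kipmi0 process is using 100% of a CPU.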
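The same observation can be reproduced from a script. The following is a minimal sketch, assuming a Linux /proc filesystem and the procps version of ps; the function names cpu_sys_share and list_kipmi are made up for illustration. Unlike top, /proc/stat counters are cumulative since boot, so the figures are long-run averages rather than a live view.

```shell
#!/bin/sh
# cpu_sys_share [STAT_FILE]
# Print, for every CPU, the percentage of time spent in system mode.
# Input is /proc/stat format: cpuN user nice system idle iowait irq softirq steal ...
# Counters are cumulative since boot, so this is an average, not a live sample.
cpu_sys_share() {
    awk '/^cpu[0-9]+ / {
        total = 0
        # Sum user..steal (fields 2-9); guest time is already included in user.
        for (i = 2; i <= 9 && i <= NF; i++) total += $i
        if (total > 0) printf "%s %.1f\n", $1, 100 * $4 / total
    }' "${1:-/proc/stat}"
}

# list_kipmi
# Show the ps header plus any kipmi kernel threads, with the CPU they last ran on.
list_kipmi() {
    ps -eo pid,psr,pcpu,comm | awk 'NR == 1 || $4 ~ /^kipmi/'
}

# Typical use on the affected host:
#   cpu_sys_share
#   list_kipmi
```

A CPU whose system share stands far above the others, combined with a kipmi0 entry in the second listing, matches the symptom described above.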

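A first look at IPMI:
IPMI stands for Intelligent Platform Management Interface. It is an industry standard for managing the peripheral devices used in Intel-architecture enterprise systems. The standard was defined by Intel, HP, NEC, Dell, SuperMicro and other companies. With IPMI, users can monitor the physical health of a server, such as temperature, voltage, fan status and power supply status. More importantly, IPMI is an open, free standard, so users pay nothing extra to use it.
Since the IPMI forum created the standard in 1998, it has gained the support of more than 170 vendors and has gradually become a complete hardware management specification covering servers and other systems (such as storage, network and communication equipment). The latest version of the standard is IPMI 2.0, which improves on the earlier versions in many ways: servers can be managed remotely (including remote power on/off) over a serial port, a modem or a LAN, and security, VLAN and blade support have been enhanced. IPMI provides an intelligent way to handle the many tasks involved in monitoring, controlling and automatically recovering servers. The standard applies to different server topologies and to Windows, Linux, Solaris, Mac or mixed operating system environments. In addition, because IPMI operates independently of the host operating system, it keeps working even when the server itself is malfunctioning or cannot provide service for any reason.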

Looking through the IPMI-related kernel source, I found a patch proposed in 2009 (it has since been merged into mainline and modified again):

[PATCH] limit CPU time spent in kipmid (version 4)

Signed-off-by: martin.wilck@xxxxxxxxxxxxxx
--- linux-2.6.29.4/drivers/char/ipmi/ipmi_si_intf.c    2009-05-19 01:52:34.000000000 +0200
+++ linux-2.6.29-rc8/drivers/char/ipmi/ipmi_si_intf.c    2009-06-04 15:30:34.855398091 +0200
@@ -297,6 +297,9 @@
 static int force_kipmid[SI_MAX_PARMS];
 static int num_force_kipmid;
 
+static unsigned int kipmid_max_busy_us[SI_MAX_PARMS];
+static int num_max_busy_us;
+
 static int unload_when_empty = 1;
 
 static int try_smi_init(struct smi_info *smi);
@@ -927,23 +930,56 @@
     }
 }
 
+#define ipmi_si_set_not_busy(timespec) \
+    do { (timespec)->tv_nsec = -1; } while (0)
+#define ipmi_si_is_busy(timespec) ((timespec)->tv_nsec != -1)
+
+static int ipmi_thread_busy_wait(enum si_sm_result smi_result,
+                                 const struct smi_info *smi_info,
+                                 struct timespec *busy_until)
+{
+    unsigned int max_busy_us = 0;
+
+    if (smi_info->intf_num < num_max_busy_us)
+        max_busy_us = kipmid_max_busy_us[smi_info->intf_num];
+    if (max_busy_us == 0 || smi_result != SI_SM_CALL_WITH_DELAY)
+        ipmi_si_set_not_busy(busy_until);
+    else if (!ipmi_si_is_busy(busy_until)) {
+        getnstimeofday(busy_until);
+        timespec_add_ns(busy_until, max_busy_us*NSEC_PER_USEC);
+    } else {
+        struct timespec now;
+        getnstimeofday(&now);
+        if (unlikely(timespec_compare(&now, busy_until) > 0)) {
+            ipmi_si_set_not_busy(busy_until);
+            return 0;
+        }
+    }
+    return 1;
+}
+
 static int ipmi_thread(void *data)
 {
     struct smi_info *smi_info = data;
     unsigned long flags;
     enum si_sm_result smi_result;
+    struct timespec busy_until;
 
+    ipmi_si_set_not_busy(&busy_until);
     set_user_nice(current, 19);
     while (!kthread_should_stop()) {
+        int busy_wait;
         spin_lock_irqsave(&(smi_info->si_lock), flags);
         smi_result = smi_event_handler(smi_info, 0);
         spin_unlock_irqrestore(&(smi_info->si_lock), flags);
+        busy_wait = ipmi_thread_busy_wait(smi_result, smi_info,
+                                          &busy_until);
         if (smi_result == SI_SM_CALL_WITHOUT_DELAY)
             ; /* do nothing */
-        else if (smi_result == SI_SM_CALL_WITH_DELAY)
+        else if (smi_result == SI_SM_CALL_WITH_DELAY && busy_wait)
             schedule();
         else
-            schedule_timeout_interruptible(1);
+            schedule_timeout_interruptible(0);
     }
     return 0;
 }
@@ -1213,6 +1249,11 @@
 MODULE_PARM_DESC(unload_when_empty, "Unload the module if no interfaces are"
                  " specified or found, default is 1. Setting to 0"
                  " is useful for hot add of devices using hotmod.");
+module_param_array(kipmid_max_busy_us, uint, &num_max_busy_us, 0644);
+MODULE_PARM_DESC(kipmid_max_busy_us,
+    "Max time (in microseconds) to busy-wait for IPMI data before"
+    " sleeping. 0 (default) means to wait forever. Set to 100-500"
+    " if kipmid is using up a lot of CPU time.");
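The decision made by ipmi_thread_busy_wait() is easier to follow with concrete numbers. The sketch below is a simplified shell model of that function, not kernel code: time is a plain integer in microseconds passed in by the caller, the string CALL_WITH_DELAY stands in for SI_SM_CALL_WITH_DELAY, and the return convention is inverted to match the shell (0 means "keep busy-waiting", where the C function returns 1).

```shell
#!/bin/sh
# BUSY_UNTIL models busy_until: a deadline in microseconds,
# or -1 when no busy-wait window is open (ipmi_si_set_not_busy).
BUSY_UNTIL=-1

# busy_wait MAX_BUSY_US SMI_RESULT NOW_US
# Returns 0 when the thread may keep busy-waiting (the C code returns 1),
# and 1 when the window has expired and the thread should sleep.
# Call it directly, not in $(...), because it updates BUSY_UNTIL.
busy_wait() {
    max_busy_us=$1
    smi_result=$2
    now_us=$3
    if [ "$max_busy_us" -eq 0 ] || [ "$smi_result" != CALL_WITH_DELAY ]; then
        # No limit configured, or nothing to wait for: close any open window.
        BUSY_UNTIL=-1
    elif [ "$BUSY_UNTIL" -eq -1 ]; then
        # First delayed result: open a window of max_busy_us microseconds.
        BUSY_UNTIL=$((now_us + max_busy_us))
    elif [ "$now_us" -gt "$BUSY_UNTIL" ]; then
        # Window expired: reset the state and tell the caller to sleep.
        BUSY_UNTIL=-1
        return 1
    fi
    return 0
}
```

With a limit of 100, a call at time 1000 opens a window ending at 1100, a call at 1050 still allows busy-waiting, and a call at 1101 reports that the thread should sleep. With a limit of 0 the function never asks for a sleep, which corresponds to the "wait forever" default in the parameter description.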

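According to the patch description and the parameter description at the end of the patch, kipmid_max_busy_us can be set to a value between 100 and 500 when kipmid uses a lot of CPU time.
The original email follows; its purpose is to reduce the overhead of kipmid: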

Hi all,

I am sorry for the long silence. I am sending here a new version of my patch which takes into account Bela's suggestions (well, most of them).
I compiled and tested it with 2.6.29.4, the results are similar as before. By setting kipmid_max_busy_us to a value between 100 and 500, it is possible to bring down kipmid CPU load to practically 0 without loosing too much ipmi throughput performance.

Please give me some feedback whether this patch will get merged, and if not, what improvement is needed.

Regards
Martin
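Before tuning anything it is worth confirming that the running kernel actually carries this parameter. A minimal sketch, assuming the ipmi_si module exposes its parameters under /sys/module/ipmi_si/parameters (the path used later in this note); the function name kipmid_param_supported is made up for illustration.

```shell
#!/bin/sh
# kipmid_param_supported [PARAM_DIR]
# Succeeds if the ipmi_si module exposes kipmid_max_busy_us.
# PARAM_DIR defaults to the real sysfs directory; pass another
# directory for a dry run.
kipmid_param_supported() {
    [ -e "${1:-/sys/module/ipmi_si/parameters}/kipmid_max_busy_us" ]
}

if kipmid_param_supported; then
    echo "kipmid_max_busy_us is available"
else
    echo "kipmid_max_busy_us not found: ipmi_si is not loaded, or the kernel lacks the patch"
fi

# When the module is not loaded, "modinfo -p ipmi_si" lists the
# parameters that the installed module file accepts.
```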

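The overhead of kipmid comes from the way it is implemented, which I will not dig into for now. The patch does show, however, that setting kipmid_max_busy_us changes how kipmid is scheduled and thereby lowers its CPU usage.
Here is what can be found online about IPMI's CPU usage: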

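Fix: none needed
No fix required. You should ignore increased CPU utilization as it has no impact on actual system performance.
kipmid uses spare CPU resources to carry out some interface auto-tuning tasks. The thread runs at nice 19 (see set_user_nice(current, 19) in the patch above), so it gives way to any other runnable work.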

Temporary reduction (takes effect immediately; CPU usage drops below 10%):

echo 100 > /sys/module/ipmi_si/parameters/kipmid_max_busy_us
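The same change can be wrapped in a small helper that validates the value and reads it back. This is a sketch under stated assumptions: the function name set_kipmid_max_busy_us and its optional second argument are made up for illustration, and writing to the real sysfs file requires root.

```shell
#!/bin/sh
# set_kipmid_max_busy_us VALUE [PARAM_DIR]
# Write VALUE (microseconds) to the module parameter and print what the
# file contains afterwards. PARAM_DIR defaults to the real sysfs
# directory; pass another directory for a dry run.
set_kipmid_max_busy_us() {
    value=$1
    param="${2:-/sys/module/ipmi_si/parameters}/kipmid_max_busy_us"
    case $value in
        ''|*[!0-9]*)
            echo "value must be a non-negative integer" >&2
            return 2
            ;;
    esac
    [ -w "$param" ] || { echo "cannot write $param" >&2; return 1; }
    echo "$value" > "$param" && cat "$param"
}

# Typical use as root, followed by a look at kipmi0 in top:
#   set_kipmid_max_busy_us 100
```

The setting is lost when the module is reloaded or the machine reboots, and writing 0 restores the default behaviour described in the parameter text.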

Permanent reduction (edit the configuration file; takes effect after the module is reloaded or the system is rebooted):

To make the change persistent, configure the options for the ipmi_si kernel module.
Create a file in /etc/modprobe.d/, e.g. /etc/modprobe.d/ipmi.conf, that contains the options line. The command below writes it:
# Prevent kipmi0 from consuming 100% CPU

echo "options ipmi_si kipmid_max_busy_us=100" > /etc/modprobe.d/ipmi.conf
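To keep the explanatory comment in the file together with the options line, the file can be generated by a short helper. This is a minimal sketch: the function name write_ipmi_conf and its optional file argument are made up for illustration, the default path is the one used above, and an existing file at that path is overwritten.

```shell
#!/bin/sh
# write_ipmi_conf VALUE [FILE]
# Write a modprobe options file for ipmi_si containing a comment line
# and the kipmid_max_busy_us option. FILE defaults to
# /etc/modprobe.d/ipmi.conf, which requires root to write.
write_ipmi_conf() {
    value=$1
    file=${2:-/etc/modprobe.d/ipmi.conf}
    {
        echo "# Prevent kipmi0 from consuming 100% CPU"
        echo "options ipmi_si kipmid_max_busy_us=$value"
    } > "$file"
}

# Typical use as root:
#   write_ipmi_conf 100
# After the module is reloaded or the system is rebooted, confirm with:
#   cat /sys/module/ipmi_si/parameters/kipmid_max_busy_us
```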