Linux PM core - Device Power Management

  • Understand Device Power Management Basics

1.Device Power Management Basics

  Most of the code in Linux is device drivers, so most of the Linux power management (PM) code is also driver-specific. Most drivers will do very little; others, especially for platforms with small batteries (like cell phones), will do a lot.

2.Two Models for Device Power Management

  Drivers will use one or both of these models to put devices into low-power states.

2.1.System Sleep model

  Drivers can enter low-power states as part of entering system-wide low-power states like “suspend” (also known as “suspend-to-RAM”), or “hibernation” (also known as “suspend-to-disk”).

  This is something that device, bus, and class drivers collaborate on by implementing various role-specific suspend and resume methods to cleanly power down hardware and software subsystems, then reactivate them without loss of data.

  Some drivers can manage hardware wakeup events, which make the system leave the low-power state. This feature may be enabled or disabled using the relevant /sys/devices/…/power/wakeup file (for Ethernet drivers the ioctl interface used by ethtool may also be used for this purpose); enabling it may cost some power usage, but let the whole system enter low-power states more often.
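  As a rough illustration of the sysfs interface mentioned above, the following userspace sketch writes "enabled" or "disabled" to a wakeup attribute file. The helper name set_wakeup is hypothetical, and the caller is assumed to supply the full path to the device's power/wakeup file (the path is elided in the text above and is not filled in here).

```c
#include <stdio.h>

/*
 * Hypothetical helper: write "enabled" or "disabled" to a power/wakeup
 * attribute file whose full path is supplied by the caller.
 * Returns 0 on success, -1 on failure.
 */
int set_wakeup(const char *wakeup_path, int enable)
{
    FILE *f = fopen(wakeup_path, "w");
    int ok;

    if (!f)
        return -1;

    ok = fputs(enable ? "enabled\n" : "disabled\n", f) >= 0;
    if (fclose(f) != 0)
        ok = 0;

    return ok ? 0 : -1;
}
```

  Writing to the real attribute normally requires root privileges, and the attribute only accepts the change for devices that are capable of waking the system.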

2.2.Runtime Power Management model

  Devices may also be put into low-power states while the system is running, independently of other power management activity in principle. However, devices are not generally independent of each other (for example, a parent device cannot be suspended unless all of its child devices have been suspended).
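  The parent/child constraint can be modelled in a few lines of userspace C. This is only a sketch under simplifying assumptions: struct demo_dev, demo_dev_init and demo_runtime_suspend are made-up names, and the real PM core tracks considerably more state than a single counter.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for a device; not the kernel's struct device. */
struct demo_dev {
    struct demo_dev *parent;
    int active_children;   /* children that are not suspended yet */
    bool suspended;
};

/* Mark a device active and account for it in its parent. */
void demo_dev_init(struct demo_dev *d, struct demo_dev *parent)
{
    d->parent = parent;
    d->active_children = 0;
    d->suspended = false;
    if (parent)
        parent->active_children++;
}

/* Returns 0 on success, -1 if some child is still active. */
int demo_runtime_suspend(struct demo_dev *d)
{
    if (d->suspended)
        return 0;
    if (d->active_children > 0)
        return -1;          /* parent must wait for its children */

    d->suspended = true;
    if (d->parent)
        d->parent->active_children--;
    return 0;
}
```

  With one parent and two children, suspending the parent fails until both children have been suspended.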

  Moreover, depending on the bus type the device is on, it may be necessary to carry out some bus-specific operations on the device for this purpose. Devices put into low power states at run time may require special handling during system-wide power transitions (suspend or hibernation).

  For these reasons not only the device driver itself, but also the appropriate subsystem (bus type, device type or device class) driver and the PM core are involved in runtime power management. As in the system sleep power management case, they need to collaborate by implementing various role-specific suspend and resume methods, so that the hardware is cleanly powered down and reactivated without data or service loss.

  There’s not a lot to be said about those low-power states except that they are very system-specific, and often device-specific. Also, that if enough devices have been put into low-power states (at runtime), the effect may be very similar to entering some system-wide low-power state (system sleep) … and that synergies exist, so that several drivers using runtime PM might put the system into a state where even deeper power saving options are available.

  Most suspended devices will have quiesced all I/O: no more DMA or IRQs (except for wakeup events), no more data read or written, and requests from upstream drivers are no longer accepted. A given bus or platform may have different requirements though.

  Examples of hardware wakeup events include an alarm from a real time clock, network wake-on-LAN packets, keyboard or mouse activity, and media insertion or removal (for PCMCIA, MMC/SD, USB, and so on).

  Runtime PM hierarchy: (figure not available)

3.Device Power Management Operations

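  In a system, devices are the most numerous components and also the largest power consumers, so device power management is the core of Linux power management. Its central operation is to put a device into an appropriate state (for example, powered off or asleep) at an appropriate time (for example, when it is no longer in use or is temporarily idle). That is the purpose of the device PM callbacks: they define a uniform way for devices to enter similar states, in step with one another, at specific points in time.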

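  In older kernels, these PM callbacks were scattered across the large data structures of the device model: suspend, suspend_late, resume_early and resume in struct bus_type, and suspend and resume in struct device_driver, struct class and struct device_type. This was poorly encapsulated: as devices grew more complex, plain suspend and resume no longer covered all power management needs, so the set of PM callbacks had to be extended, and every extension meant changing those data structures.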

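  Newer kernels therefore gather these callbacks into a single data structure, struct dev_pm_ops, and the higher-level structures only need to reference it. Adding or changing PM callbacks then no longer requires touching the higher-level structures. For compatibility with the old design, the kernel also kept the legacy suspend/resume-style callbacks described above, but their use is discouraged.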

  include/linux/pm.h:
  struct dev_pm_ops {
          /* system-wide power transitions */
          int (*prepare)(struct device *dev);
          void (*complete)(struct device *dev);
          int (*suspend)(struct device *dev);
          int (*resume)(struct device *dev);
          int (*freeze)(struct device *dev);
          int (*thaw)(struct device *dev);
          int (*poweroff)(struct device *dev);
          int (*restore)(struct device *dev);
          int (*suspend_late)(struct device *dev);
          int (*resume_early)(struct device *dev);
          int (*freeze_late)(struct device *dev);
          int (*thaw_early)(struct device *dev);
          int (*poweroff_late)(struct device *dev);
          int (*restore_early)(struct device *dev);
          int (*suspend_noirq)(struct device *dev);
          int (*resume_noirq)(struct device *dev);
          int (*freeze_noirq)(struct device *dev);
          int (*thaw_noirq)(struct device *dev);
          int (*poweroff_noirq)(struct device *dev);
          int (*restore_noirq)(struct device *dev);
          /* runtime power management */
          int (*runtime_suspend)(struct device *dev);
          int (*runtime_resume)(struct device *dev);
          int (*runtime_idle)(struct device *dev);
  };

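  These callbacks fall into two groups: one handles system-wide power transitions and the other handles runtime power management. They have to be implemented by the individual device drivers, so a driver developer needs to know, when designing each driver, when each callback is invoked, whether it needs to be implemented, and how.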
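  To make the two groups concrete, here is a small userspace model of a driver filling in such an operations table. It is a sketch only: struct device and struct dev_pm_ops are cut-down stand-ins for the kernel types, and the foo_* names are hypothetical. A real driver would include the kernel headers and usually rely on the kernel's helper macros instead.

```c
#include <stddef.h>

/* Cut-down stand-ins for the kernel types, for illustration only. */
struct device {
    int powered;
};

struct dev_pm_ops {
    /* system-wide power transitions */
    int (*suspend)(struct device *dev);
    int (*resume)(struct device *dev);
    /* runtime power management */
    int (*runtime_suspend)(struct device *dev);
    int (*runtime_resume)(struct device *dev);
    int (*runtime_idle)(struct device *dev);
};

/* Hypothetical driver callbacks: a real driver would save and restore
 * registers, gate clocks, and so on. */
static int foo_power_off(struct device *dev)
{
    dev->powered = 0;
    return 0;
}

static int foo_power_on(struct device *dev)
{
    dev->powered = 1;
    return 0;
}

/* Callbacks the driver does not need stay NULL (runtime_idle here). */
const struct dev_pm_ops foo_pm_ops = {
    .suspend         = foo_power_off,
    .resume          = foo_power_on,
    .runtime_suspend = foo_power_off,
    .runtime_resume  = foo_power_on,
};
```

  Leaving a member unset is how a driver signals that it has nothing to do in that phase; callers are expected to check for NULL before calling.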

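3.1.dev_pm_ops in the Device Model

  Many data structures in the Linux device model carry a pointer to struct dev_pm_ops, while struct device itself holds a struct dev_pm_info and a pointer to a struct dev_pm_domain, as shown below: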

  struct bus_type {
          ...
          const struct dev_pm_ops *pm;
          ...
  };

  struct device_driver {
          ...
          const struct dev_pm_ops *pm;
          ...
  };

  struct class {
          ...
          const struct dev_pm_ops *pm;
          ...
  };

  struct device_type {
          ...
          const struct dev_pm_ops *pm;
  };

  struct device {
          ...
          struct dev_pm_info      power;
          struct dev_pm_domain    *pm_domain;
          ...
  };

3.1.1.Runtime power management functions in dev_pm_ops

  The ->runtime_suspend(), ->runtime_resume() and ->runtime_idle() callbacks are executed by the PM core for the device’s subsystem, which may be one of the following:

  • PM domain of the device, if the device’s PM domain object, dev->pm_domain, is present.
  • Device type of the device, if both dev->type and dev->type->pm are present.
  • Device class of the device, if both dev->class and dev->class->pm are present.
  • Bus type of the device, if both dev->bus and dev->bus->pm are present.

  If the subsystem chosen by applying the above rules doesn’t provide the relevant callback, the PM core will invoke the corresponding driver callback stored in dev->driver->pm directly (if present).

PM core callback priority:

  The priority order of the callbacks, from high to low, is: PM domain, device type, class and bus type. A higher-priority callback always takes precedence over a lower-priority one. The PM domain, bus type, device type and class callbacks are referred to as subsystem-level callbacks in what follows.
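  The selection rules can be sketched as a small userspace model. Everything below is a simplified assumption: the structures are reduced to the members the rules mention, struct device_class stands in for the kernel's struct class, pick_runtime_suspend is a made-up name, and the *_cb functions exist only so the result can be inspected.

```c
#include <stddef.h>

struct device;
typedef int (*pm_cb)(struct device *dev);

/* Reduced stand-ins for the kernel structures. */
struct dev_pm_ops    { pm_cb runtime_suspend; };
struct dev_pm_domain { struct dev_pm_ops ops; };
struct device_type   { const struct dev_pm_ops *pm; };
struct device_class  { const struct dev_pm_ops *pm; };  /* models struct class */
struct bus_type      { const struct dev_pm_ops *pm; };
struct device_driver { const struct dev_pm_ops *pm; };

struct device {
    struct dev_pm_domain *pm_domain;
    const struct device_type *type;
    const struct device_class *cls;
    const struct bus_type *bus;
    const struct device_driver *driver;
};

/* Pick the subsystem-level callback by priority, then fall back to the
 * driver's callback if the chosen subsystem does not provide one. */
pm_cb pick_runtime_suspend(const struct device *dev)
{
    const struct dev_pm_ops *ops = NULL;

    if (dev->pm_domain)
        ops = &dev->pm_domain->ops;
    else if (dev->type && dev->type->pm)
        ops = dev->type->pm;
    else if (dev->cls && dev->cls->pm)
        ops = dev->cls->pm;
    else if (dev->bus && dev->bus->pm)
        ops = dev->bus->pm;

    if (ops && ops->runtime_suspend)
        return ops->runtime_suspend;
    if (dev->driver && dev->driver->pm)
        return dev->driver->pm->runtime_suspend;
    return NULL;
}

/* Marker callbacks used only to see which level was selected. */
int domain_cb(struct device *dev) { (void)dev; return 0; }
int bus_cb(struct device *dev)    { (void)dev; return 0; }
int driver_cb(struct device *dev) { (void)dev; return 0; }
```

  Note that once a PM domain is present, its operations table is the one consulted, even if a bus-level callback also exists; the driver callback is only a fallback for a missing entry.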

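3.1.2.The power and pm_domain members of struct device

  • power
    power is a variable of type struct dev_pm_info. It mainly stores PM-related state, such as the current power_state, whether the device can wake up the system, whether prepare has completed, and whether suspend has completed.

  • pm_domain
    A PM domain (power domain) is defined per device. Structures such as bus_type, device_driver, class and device_type essentially stand for the driver side, and it is natural for drivers to carry out power management operations. However, for various reasons the kernel allows devices without a driver to exist. Power management for such devices is handled through the device's PM domain.

  • When a device is registered, device_pm_init is called to initialize its struct dev_pm_info.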

Note:
  The core methods to suspend and resume devices reside in struct dev_pm_ops, pointed to by the ops member of struct dev_pm_domain, or by the pm member of struct bus_type, struct device_type and struct class. They are mostly of interest to the people writing infrastructure for platforms and buses, like PCI or USB, or device type and device class drivers. They also are relevant to the writers of device drivers whose subsystems (PM domains, device types, device classes and bus types) don’t provide all power management methods.

3.1.3. dev_pm_ops of the platform bus

  drivers/base/platform.c:
  struct bus_type platform_bus_type = {
      .name       = "platform",
      ...
      .pm     = &platform_dev_pm_ops,
  };

  static const struct dev_pm_ops platform_dev_pm_ops = {
      .runtime_suspend = pm_generic_runtime_suspend,
      .runtime_resume = pm_generic_runtime_resume,
      USE_PLATFORM_PM_SLEEP_OPS
  };

  #ifdef CONFIG_PM_SLEEP
  #define USE_PLATFORM_PM_SLEEP_OPS \
      .suspend = platform_pm_suspend, \
      .resume = platform_pm_resume, \
      .freeze = platform_pm_freeze, \
      .thaw = platform_pm_thaw, \
      .poweroff = platform_pm_poweroff, \
      .restore = platform_pm_restore,
  #else
  #define USE_PLATFORM_PM_SLEEP_OPS
  #endif

As shown above, with CONFIG_PM_SLEEP enabled the definition expands to:

static const struct dev_pm_ops platform_dev_pm_ops = {
    .runtime_suspend = pm_generic_runtime_suspend,
    .runtime_resume = pm_generic_runtime_resume,
    .suspend = platform_pm_suspend, 
    .resume = platform_pm_resume, 
    .freeze = platform_pm_freeze, 
    .thaw = platform_pm_thaw, 
    .poweroff = platform_pm_poweroff, 
    .restore = platform_pm_restore,                                                          
}; 
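  The conditional-initializer technique behind USE_PLATFORM_PM_SLEEP_OPS can be tried outside the kernel. In the sketch below, DEMO_PM_SLEEP plays the role of CONFIG_PM_SLEEP, and every demo_* and DEMO_* name is hypothetical; the structure is a reduced stand-in for struct dev_pm_ops.

```c
#include <stddef.h>

#define DEMO_PM_SLEEP 1   /* stand-in for CONFIG_PM_SLEEP; remove to disable */

struct device;

struct dev_pm_ops {       /* reduced stand-in for the kernel structure */
    int (*suspend)(struct device *dev);
    int (*resume)(struct device *dev);
    int (*runtime_suspend)(struct device *dev);
};

int demo_suspend(struct device *dev)         { (void)dev; return 0; }
int demo_resume(struct device *dev)          { (void)dev; return 0; }
int demo_runtime_suspend(struct device *dev) { (void)dev; return 0; }

/* When the option is set, the macro expands to a list of designated
 * initializers; otherwise it expands to nothing and the members stay NULL. */
#ifdef DEMO_PM_SLEEP
#define USE_DEMO_PM_SLEEP_OPS \
    .suspend = demo_suspend, \
    .resume = demo_resume,
#else
#define USE_DEMO_PM_SLEEP_OPS
#endif

const struct dev_pm_ops demo_pm_ops = {
    .runtime_suspend = demo_runtime_suspend,
    USE_DEMO_PM_SLEEP_OPS
};
```

  Because the macro ends with a trailing comma and C permits a trailing comma in an initializer list, both the expanded and the empty form compile without further changes to the table.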

  Taking pm_generic_runtime_suspend as an example, it ends up calling the runtime_suspend function registered by the platform driver, which for the i.MX I2C driver is i2c_imx_runtime_suspend.

  int pm_generic_runtime_suspend(struct device *dev)
  {
      const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
      int ret;

      ret = pm && pm->runtime_suspend ? pm->runtime_suspend(dev) : 0;

      return ret;
  }

  static const struct dev_pm_ops i2c_imx_pm_ops = {
      SET_RUNTIME_PM_OPS(i2c_imx_runtime_suspend,
                 i2c_imx_runtime_resume, NULL)
  };
  #define I2C_IMX_PM_OPS (&i2c_imx_pm_ops)
  ...
  static struct platform_driver i2c_imx_driver = {
      .probe = i2c_imx_probe,
      .remove = i2c_imx_remove,
      .driver = {
          .name = DRIVER_NAME,
          .pm = I2C_IMX_PM_OPS,
          .of_match_table = i2c_imx_dt_ids,
      },
      .id_table = imx_i2c_devtype,
  };
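  The forwarding from the bus-level helper to the driver callback can be reproduced in a userspace model. The sketch mirrors the logic of pm_generic_runtime_suspend quoted above; the structures are reduced stand-ins, and demo_i2c_runtime_suspend with its clk_enabled flag is a hypothetical driver callback, not the behaviour of the real i.MX I2C driver.

```c
#include <stddef.h>

struct device;

/* Reduced stand-ins for the kernel structures. */
struct dev_pm_ops    { int (*runtime_suspend)(struct device *dev); };
struct device_driver { const struct dev_pm_ops *pm; };
struct device {
    const struct device_driver *driver;
    int clk_enabled;      /* hypothetical piece of device state */
};

/* Same logic as the generic helper: forward to the bound driver's
 * callback if there is one, otherwise report success. */
int demo_generic_runtime_suspend(struct device *dev)
{
    const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;

    return pm && pm->runtime_suspend ? pm->runtime_suspend(dev) : 0;
}

/* Hypothetical driver callback. */
int demo_i2c_runtime_suspend(struct device *dev)
{
    dev->clk_enabled = 0;
    return 0;
}

const struct dev_pm_ops demo_i2c_pm_ops = {
    .runtime_suspend = demo_i2c_runtime_suspend,
};

const struct device_driver demo_i2c_driver = {
    .pm = &demo_i2c_pm_ops,
};
```

  A device with no driver bound passes through the helper untouched and the call still returns 0, which is why the bus can install the generic helper unconditionally.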

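3.1.4. Device Runtime PM initialization

driver register: (figure not available)
device register, analogously: (figure not available)

  • pm_runtime_init is called to initialize runtime PM for the device;
  • When the device model calls driver interfaces such as probe, remove and shutdown, it can ensure that runtime PM is disabled or in another agreed state;
  • dpm_sysfs_add is called to add the sysfs power interface. For example, when the mt6516_tpd platform device is registered, the following directories and files appear in sysfs: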
  /sys/devices/platform/mt6516-tpd
  #cd mt6516-tpd
  #ls -l
  -rw-r--r-- root     root         4096 2010-01-02 06:35 uevent
  -r--r--r-- root     root         4096 2010-01-02 06:39 modalias
  lrwxrwxrwx root     root              2010-01-02 06:39 subsystem -> ../../../bus/platform
  drwxr-xr-x root     root              2010-01-02 06:35 power
  lrwxrwxrwx root     root              2010-01-02 06:39 driver -> ../../../bus/platform/drivers/mt6516-tpd
  #cd power
  #ls -l
  -rw-r--r-- root     root         4096 2010-01-02 06:39 wakeup
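  • device_pm_add is called to insert the device into dpm_list, the core power management list, so that it is managed centrally.

Note: for the creation of the dpm sysfs entries, see drivers/base/power/sysfs.c

  dev_pm_info is initialized when the device is registered, as shown below: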

platform_device_register
  -> device_initialize
    -> device_pm_init
      -> device_pm_init_common
      -> device_pm_sleep_init
      -> pm_runtime_init
        -> INIT_WORK(&dev->power.work, pm_runtime_work);
          pm_runtime_work (runs later, when the work item is executed)
            -> rpm_idle
            -> rpm_suspend
            -> rpm_resume

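  • struct dev_pm_info power is initialized;
  • When a work item is submitted to the pm_wq workqueue, the pm_runtime_work function is executed.

  Before analyzing pm_runtime_work, consider the states supported by device runtime PM (state diagram not available).

Related types: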

  enum rpm_status {
      RPM_ACTIVE = 0,  /* runtime_resume() has completed successfully */
      RPM_RESUMING,    /* runtime_resume() is being executed */
      RPM_SUSPENDED,   /* runtime_suspend() has completed successfully */
      RPM_SUSPENDING,  /* runtime_suspend() is being executed */
  };

  enum rpm_request {
      RPM_REQ_NONE = 0,
      RPM_REQ_IDLE,
      RPM_REQ_SUSPEND,
      RPM_REQ_AUTOSUSPEND,
      RPM_REQ_RESUME,
  };
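  • enum rpm_status represents the state of a device in runtime PM;
  • enum rpm_request is the type of a power request: when the workqueue runs a work item, it checks the request type and calls the matching function (rpm_idle, rpm_suspend or rpm_resume).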
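  The dispatch on the request type can be sketched as follows. This is a simplified userspace model, not the kernel's pm_runtime_work: struct demo_dev, enum demo_action and demo_runtime_work are made-up names, the function only reports which operation would be chosen, and mapping RPM_REQ_AUTOSUSPEND to the suspend path is stated here as an assumption about how the real code behaves.

```c
/* Same request values as enum rpm_request above. */
enum rpm_request {
    RPM_REQ_NONE = 0,
    RPM_REQ_IDLE,
    RPM_REQ_SUSPEND,
    RPM_REQ_AUTOSUSPEND,
    RPM_REQ_RESUME,
};

/* Hypothetical result type: which operation the work item would run. */
enum demo_action { DEMO_NOTHING, DEMO_IDLE, DEMO_SUSPEND, DEMO_RESUME };

struct demo_dev {
    enum rpm_request request;   /* pending request, if any */
    int request_pending;
};

/* Consume the pending request and report the operation it maps to. */
enum demo_action demo_runtime_work(struct demo_dev *dev)
{
    enum rpm_request req = dev->request;

    dev->request = RPM_REQ_NONE;
    dev->request_pending = 0;

    switch (req) {
    case RPM_REQ_IDLE:
        return DEMO_IDLE;        /* would call rpm_idle() */
    case RPM_REQ_SUSPEND:
    case RPM_REQ_AUTOSUSPEND:
        return DEMO_SUSPEND;     /* would call rpm_suspend() */
    case RPM_REQ_RESUME:
        return DEMO_RESUME;      /* would call rpm_resume() */
    case RPM_REQ_NONE:
        break;
    }
    return DEMO_NOTHING;
}
```

  Clearing the request before dispatching means that running the work item a second time without a new request does nothing.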

3.1.5.PM core initialization

  struct workqueue_struct *pm_wq;
  EXPORT_SYMBOL_GPL(pm_wq);

  static int __init pm_start_workqueue(void)
  {
      pm_wq = alloc_workqueue("pm", WQ_FREEZABLE, 0);

      return pm_wq ? 0 : -ENOMEM;
  }

  static int __init pm_init(void)
  {
      int error = pm_start_workqueue();
      ...
      pm_states_init();
      power_kobj = kobject_create_and_add("power", NULL);
      ...
      error = sysfs_create_group(power_kobj, &attr_group);
      ...
      pm_print_times_init();
      return pm_autosleep_init();
  }
  • The pm_wq workqueue is created to carry out the actual power management work.
  • The /sys/power/xxx attribute nodes are created.

3.2.Call Sequence Guarantees

  To ensure that bridges and similar links needing to talk to a device are available when the device is suspended or resumed, the device hierarchy is walked in a bottom-up order to suspend devices. A top-down order is used to resume those devices.

  The ordering of the device hierarchy is defined by the order in which devices get registered: a child can never be registered, probed or resumed before its parent; and can’t be removed or suspended after that parent.
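  The ordering guarantee can be illustrated with a list kept in registration order: walking it backwards suspends children before parents, and walking it forwards resumes parents before children. The sketch below is a userspace model with made-up names (demo_register, demo_suspend_order, demo_resume_order) and a fixed-size array in place of the kernel's list.

```c
#include <stddef.h>

#define DEMO_MAX_DEVS 8

/* Device names in registration order (parents are registered first). */
static const char *demo_list[DEMO_MAX_DEVS];
static size_t demo_count;

/* Append a device; returns 0 on success, -1 if the table is full. */
int demo_register(const char *name)
{
    if (demo_count == DEMO_MAX_DEVS)
        return -1;
    demo_list[demo_count++] = name;
    return 0;
}

/* Suspend order: reverse of registration, so children come first. */
size_t demo_suspend_order(const char **out)
{
    for (size_t i = 0; i < demo_count; i++)
        out[i] = demo_list[demo_count - 1 - i];
    return demo_count;
}

/* Resume order: same as registration, so parents come first. */
size_t demo_resume_order(const char **out)
{
    for (size_t i = 0; i < demo_count; i++)
        out[i] = demo_list[i];
    return demo_count;
}
```

  Registering a bridge, then a host controller, then a disk gives the suspend order disk, host controller, bridge, and the opposite order on resume.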

3.3.System Power Management Phases

  Suspending or resuming the system is done in several phases. Different phases are used for suspend-to-idle, shallow (standby), and deep (“suspend-to-RAM”) sleep states and the hibernation state (“suspend-to-disk”). Each phase involves executing callbacks for every device before the next phase begins. Not all buses or classes support all these callbacks and not all drivers use all the callbacks. The various phases always run after tasks have been frozen and before they are unfrozen. Furthermore, the *_noirq phases run at a time when IRQ handlers have been disabled (except for those marked with the IRQF_NO_SUSPEND flag).
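  The statement that each phase runs for every device before the next phase begins amounts to a phase-major loop. The following userspace sketch records the resulting call order for the suspend side (prepare, suspend, suspend_late, suspend_noirq). The function name and the log format are invented for illustration, and the order of devices within a phase is simply the order of the input array, which is a simplification.

```c
#include <stdio.h>
#include <string.h>

/* Suspend-side phases, in the order they are entered. */
static const char *const demo_suspend_phases[] = {
    "prepare", "suspend", "suspend_late", "suspend_noirq",
};

/* Append "phase:device " entries to log in phase-major order: the outer
 * loop is over phases, the inner loop over devices. */
void demo_run_suspend_phases(const char **devs, size_t ndevs,
                             char *log, size_t log_size)
{
    size_t nphases = sizeof(demo_suspend_phases) /
                     sizeof(demo_suspend_phases[0]);

    if (log_size == 0)
        return;
    log[0] = '\0';

    for (size_t p = 0; p < nphases; p++) {
        for (size_t d = 0; d < ndevs; d++) {
            size_t used = strlen(log);

            /* snprintf truncates safely if the log buffer fills up */
            snprintf(log + used, log_size - used, "%s:%s ",
                     demo_suspend_phases[p], devs[d]);
        }
    }
}
```

  For two devices a and b, the log starts with prepare:a prepare:b before any suspend entry appears, which is the property the text describes.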

  All phases use PM domain, bus, type, class or driver callbacks (that is, methods defined in dev->pm_domain->ops, dev->bus->pm, dev->type->pm, dev->class->pm or dev->driver->pm). These callbacks are regarded by the PM core as mutually exclusive. Moreover, PM domain callbacks always take precedence over all of the other callbacks and, for example, type callbacks take precedence over bus, class and driver callbacks. To be precise, the following rules are used to determine which callback to execute in the given phase:

  1. If dev->pm_domain is present, the PM core will choose the callback provided by dev->pm_domain->ops for execution.

  2. Otherwise, if both dev->type and dev->type->pm are present, the callback provided by dev->type->pm will be chosen for execution.

  3. Otherwise, if both dev->class and dev->class->pm are present, the callback provided by dev->class->pm will be chosen for execution.

  4. Otherwise, if both dev->bus and dev->bus->pm are present, the callback provided by dev->bus->pm will be chosen for execution.

  This allows PM domains and device types to override callbacks provided by bus types or device classes if necessary.

  The PM domain, type, class and bus callbacks may in turn invoke device- or driver-specific methods stored in dev->driver->pm, but they don’t have to do that.

  If the subsystem callback chosen for execution is not present, the PM core will execute the corresponding method from the dev->driver->pm set instead if there is one.

Reference: Documentation/driver-api/pm/devices.rst

3.4.Helper functions for the dev_pm_ops callbacks

  Along with the dev_pm_ops callback structure, the kernel defines a large number of helper APIs, which fall into two categories:

3.4.1.Generic helper APIs

  These directly call the corresponding callback through the pm pointer of the driver bound to the given device:

  include/linux/pm.h:
  extern int pm_generic_prepare(struct device *dev);
  extern int pm_generic_suspend_late(struct device *dev);
  extern int pm_generic_suspend_noirq(struct device *dev);
  extern int pm_generic_suspend(struct device *dev);
  extern int pm_generic_resume_early(struct device *dev);
  extern int pm_generic_resume_noirq(struct device *dev);
  extern int pm_generic_resume(struct device *dev);
  extern int pm_generic_freeze_noirq(struct device *dev);
  extern int pm_generic_freeze_late(struct device *dev);
  extern int pm_generic_freeze(struct device *dev);
  extern int pm_generic_thaw_noirq(struct device *dev);
  extern int pm_generic_thaw_early(struct device *dev);
  extern int pm_generic_thaw(struct device *dev);
  extern int pm_generic_restore_noirq(struct device *dev);
  extern int pm_generic_restore_early(struct device *dev);
  extern int pm_generic_restore(struct device *dev);
  extern int pm_generic_poweroff_noirq(struct device *dev);
  extern int pm_generic_poweroff_late(struct device *dev);
  extern int pm_generic_poweroff(struct device *dev);
  extern void pm_generic_complete(struct device *dev);

  include/linux/pm_runtime.h:
  extern int pm_generic_runtime_suspend(struct device *dev);
  extern int pm_generic_runtime_resume(struct device *dev);

  Taking pm_generic_prepare as an example, it checks whether dev->driver->pm->prepare exists and, if so, calls it and returns the result.

  drivers/base/power/generic_ops.c:
  int pm_generic_prepare(struct device *dev)
  {
      struct device_driver *drv = dev->driver;
      int ret = 0;

      if (drv && drv->pm && drv->pm->prepare)
          ret = drv->pm->prepare(dev);

      return ret;
  }

3.4.2.APIs for system-wide power management

  These combine the individual power management operations into simpler, higher-level functions.

  #ifdef CONFIG_PM_SLEEP
  extern void device_pm_lock(void);
  extern void dpm_resume_start(pm_message_t state);
  extern void dpm_resume_end(pm_message_t state);
  extern void dpm_resume(pm_message_t state);
  extern void dpm_complete(pm_message_t state);

  extern void device_pm_unlock(void);
  extern int dpm_suspend_end(pm_message_t state);
  extern int dpm_suspend_start(pm_message_t state);
  extern int dpm_suspend(pm_message_t state);
  extern int dpm_prepare(pm_message_t state);

  extern void __suspend_report_result(const char *function, void *fn, int ret);

  #define suspend_report_result(fn, ret)                                  \
          do {                                                            \
                  __suspend_report_result(__func__, fn, ret);             \
          } while (0)

  extern int device_pm_wait_for_dev(struct device *sub, struct device *dev);
  extern void dpm_for_each_dev(void *data, void (*fn)(struct device *, void *));
  ...

References:

  • Documentation/power/runtime_pm.txt
  • Documentation/driver-api/pm/devices.rst