ACPI (Advanced Configuration and Power Interface): Suspend and Resume

http://www.advogato.org/article/913.html

 

Back in the APM days, everything was easy. You called an ioctl on /dev/apm, and the kernel made a BIOS call. After that, it was all up to the hardware. Sure, it never really worked properly, and it was basically impossible to debug what the hardware actually did. And then ACPI came along, and nothing worked at all. Several years later, we're almost back to where we were with APM. But what's actually happening when you hit that sleep key?

 

Without the ability to suspend and resume, laptop users are doomed to spend several hours of their lives waiting for machines to boot and shut down. This is, clearly, suboptimal. APM made it fairly easy to implement this, because almost everything was handled by the BIOS. And that, in a nutshell, is one of the primary reasons why ACPI ended up in charge.

 

The biggest problem with APM is that it left policy in hardware. Don't want to suspend on lid closure? The OS doesn't get any say in the matter, though if you're lucky there might be a BIOS option to control it. Would prefer it if the BIOS didn't scribble all over the contents of your video registers while it tries to reprogram them (probably back to the defaults of the Windows drivers...)? Sucks to be you. Want the sleep button to trigger suspend to disk, not suspend to RAM? A-ha ha ha.

 

ACPI deals with that problem by moving almost all the useful functionality out of hardware. The downside of this is that the functionality needs to be reimplemented in the OS. Which, given that the ACPI spec is around 600 pages long, has taken a little time.

 

(Of course, it turns out that most of the ACPI spec is entirely uninteresting for suspend and resume purposes, but that's not really the point right now.)

 

So, firstly, let's have some ACPI jargon. ACPI itself stands for "Advanced Configuration and Power Interface". It's not just a power management spec - it provides the OS with a description of all the built-in hardware in your system, along with a certain degree of abstraction. It gives you information about interrupt routing, tells you if someone's just removed a hot-pluggable DVD drive from a laptop and may even let you control which video output is being used.

 

This information is provided in a table called the DSDT (Differentiated System Description Table). The DSDT is in a bytecode called AML (ACPI Machine Language), compiled from a simple language called ASL (ACPI Source Language, shockingly enough). At boot time, the system reads the DSDT, parses it and executes various methods. These can do pretty much anything, but on the bright side they're being executed in kernel context and (in principle) you can filter out anything that you really don't want to do (such as scribbling all over CMOS or something).
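
As a rough illustration of what "reading the DSDT" starts with, here is a minimal Python sketch that decodes the 36-byte header every ACPI description table begins with and checks its checksum. It does not interpret any AML. The names `SdtHeader`, `parse_sdt_header` and `build_demo_table` are made up for this example, and the demo table is synthetic.

```python
import struct
from typing import NamedTuple

# Common 36-byte header: signature, length, revision, checksum, OEM ID,
# OEM table ID, OEM revision, creator ID, creator revision (little-endian).
_HEADER = struct.Struct("<4sIBB6s8sI4sI")


class SdtHeader(NamedTuple):
    signature: str
    length: int
    revision: int
    checksum_ok: bool
    oem_id: str
    oem_table_id: str


def parse_sdt_header(table: bytes) -> SdtHeader:
    """Decode the common header at the start of an ACPI table image."""
    if len(table) < _HEADER.size:
        raise ValueError("table is shorter than the 36-byte ACPI header")
    fields = _HEADER.unpack_from(table)
    sig, length, rev, _csum, oem_id, oem_table_id = fields[:6]
    # A table is valid when all of its bytes sum to zero modulo 256.
    checksum_ok = (
        _HEADER.size <= length <= len(table) and sum(table[:length]) % 256 == 0
    )
    return SdtHeader(
        sig.decode("ascii", "replace"),
        length,
        rev,
        checksum_ok,
        oem_id.decode("ascii", "replace").strip("\x00 "),
        oem_table_id.decode("ascii", "replace").strip("\x00 "),
    )


def build_demo_table(signature: bytes = b"DSDT", body: bytes = b"\x10\x00") -> bytes:
    """Hypothetical helper: build a tiny fake table with a correct checksum."""
    length = _HEADER.size + len(body)
    raw = bytearray(
        _HEADER.pack(signature, length, 2, 0, b"DEMO  ", b"DEMOTBL ", 1, b"PYTH", 1)
        + body
    )
    raw[9] = (-sum(raw)) % 256  # the checksum byte lives at offset 9
    return bytes(raw)
```

On a Linux machine the real table is usually readable (as root) from /sys/firmware/acpi/tables/DSDT, so `parse_sdt_header(open("/sys/firmware/acpi/tables/DSDT", "rb").read())` should report a signature of "DSDT".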

 

The final relevant piece of ACPI information is something called the FADT, or Fixed ACPI Description Table. This gives the OS information about various register addresses. It's a static structure, and doesn't contain any executable code.
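
To make "information about various register addresses" a little more concrete, the sketch below pulls four of the legacy 32-bit fields out of a FADT image. The offsets are the ones I understand the spec to define for those fields, so treat them as an assumption to check against the spec rather than as gospel. `FadtFields` and `parse_fadt` are names invented for this example, and newer firmware may prefer the 64-bit extended fields that are ignored here.

```python
import struct
from typing import NamedTuple


class FadtFields(NamedTuple):
    firmware_ctrl: int  # physical address of the FACS
    dsdt: int           # physical address of the DSDT
    pm1a_cnt_blk: int   # port address of the PM1a control register block
    pm1b_cnt_blk: int   # port address of the PM1b control block (0 if absent)


def parse_fadt(table: bytes) -> FadtFields:
    """Extract a few legacy fields from a FADT image (signature 'FACP')."""
    if table[:4] != b"FACP":
        raise ValueError("not a FADT: expected signature 'FACP'")
    if len(table) < 72:
        raise ValueError("FADT image is too short for the fields read here")
    # Assumed offsets: FIRMWARE_CTRL at 36, DSDT at 40,
    # PM1a_CNT_BLK at 64, PM1b_CNT_BLK at 68.
    firmware_ctrl, dsdt = struct.unpack_from("<II", table, 36)
    pm1a, pm1b = struct.unpack_from("<II", table, 64)
    return FadtFields(firmware_ctrl, dsdt, pm1a, pm1b)
```

Nothing here is executable firmware code. The table is just a bag of addresses, which is why a few `struct` reads are enough.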

 

So, how does all of this stuff actually work?

 

First of all, the user hits the sleep key. This triggers a hardware interrupt, which is caught by the embedded controller. That pokes a register in the southbridge, which flags that a general purpose event has just occurred. The OS notices this, and checks the DSDT for what's supposed to happen next. Generally, this just calls a notification event. This is bounced back out to userspace via /proc/acpi/event (currently, though it's going to be moved to the input layer in future) and userspace gets to choose what happens next.
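
The "userspace gets to choose" step can be sketched as a tiny event parser plus a policy table. This assumes the four-field text format that acpid-style events typically use (device class, bus ID, and two hex numbers). The sample line, the `AcpiEvent` type and the policy dictionary are all illustrative, not taken from any particular daemon.

```python
from typing import Dict, NamedTuple, Optional


class AcpiEvent(NamedTuple):
    device_class: str  # e.g. "button/sleep"
    bus_id: str        # name of the device in the ACPI namespace
    event_type: int
    data: int


def parse_event_line(line: str) -> Optional[AcpiEvent]:
    """Parse one event line; return None if it is not in the assumed format."""
    parts = line.split()
    if len(parts) != 4:
        return None
    try:
        return AcpiEvent(parts[0], parts[1], int(parts[2], 16), int(parts[3], 16))
    except ValueError:
        return None


def choose_action(event: AcpiEvent, policy: Dict[str, str]) -> str:
    """Policy lives in userspace: look the device class up in a plain dict."""
    return policy.get(event.device_class, "ignore")


# Hypothetical policy: the sleep key suspends to RAM, the lid does nothing.
EXAMPLE_POLICY = {"button/sleep": "suspend-to-ram", "button/lid": "ignore"}
```

With this, `choose_action(parse_event_line("button/sleep SBTN 00000080 00000001"), EXAMPLE_POLICY)` picks the suspend-to-RAM action, and changing the policy is a one-line edit rather than a BIOS setting.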

 

Let's concentrate on the common scenario, which is that someone hitting the sleep button wants to suspend to RAM. Via some abstraction (acpid, gnome-power-manager, kpowersave or something similar), userspace makes that decision and initiates the suspend to RAM process by either calling a suspend script directly or bouncing via HAL.

 

Depending on distribution, this ends up running a shell script or binary which attempts to prepare the system for suspend. Right now, this tends to involve a bunch of bandaids around various broken drivers - unloading modules and reloading them is one of the easiest workarounds for breakage. Finally, the string "mem" is written to /sys/power/state.
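
The final step of such a script is small enough to sketch in a few lines of Python. The function names are invented for this example, and the `state_file` parameter exists only so the code can be pointed at an ordinary file when experimenting. Writing to the real file needs root and really will suspend the machine.

```python
from pathlib import Path
from typing import List

DEFAULT_STATE_FILE = "/sys/power/state"


def supported_states(state_file: str = DEFAULT_STATE_FILE) -> List[str]:
    """Return the sleep states the kernel advertises, e.g. ['mem', 'disk']."""
    return Path(state_file).read_text().split()


def request_suspend(state: str = "mem", state_file: str = DEFAULT_STATE_FILE) -> None:
    """Ask the kernel to enter the given sleep state by writing its name."""
    available = supported_states(state_file)
    if state not in available:
        raise ValueError(f"state {state!r} not offered by kernel: {available}")
    # On the real sysfs file this write only returns after the machine
    # has resumed (or the suspend attempt has failed).
    Path(state_file).write_text(state)
```

All the module unloading and other bandaids would go before the call to `request_suspend()`, and the matching cleanup after it returns.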

 

This jumps back into the kernel. First, userspace is stopped. This stops it getting horribly confused when a load of hardware mysteriously stops working. Then the kernel goes through the device tree and calls suspend methods on each bound driver. Individual drivers have responsibility for storing enough state in order to be able to reprogram the device on resume - ACPI doesn't make guarantees about what the hardware state is going to be when we come back. Once the kernel-side suspend code has been run, we execute a couple of ACPI methods - _PTS (Prepare To Sleep) and _GTS (Going To Sleep). These tend to poke various things that the kernel knows nothing about, and so a certain amount of magic may be involved.
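
The driver's side of that bargain can be modelled with a toy. This is a simulation in Python, not kernel code: `ToyDriver` and its dictionary of "registers" are invented, and the ordering (suspend in reverse registration order, resume in forward order, so children go down before their parents) is my assumption about how the walk is done rather than something spelled out above.

```python
from typing import Dict, List, Optional


class ToyDriver:
    """Stand-in for a driver bound to one device."""

    def __init__(self, name: str, registers: Dict[str, int]) -> None:
        self.name = name
        self.registers = dict(registers)  # pretend this is live hardware state
        self._saved: Optional[Dict[str, int]] = None

    def suspend(self) -> None:
        # Save everything needed to reprogram the device later.
        self._saved = dict(self.registers)

    def power_loss(self) -> None:
        # The hardware comes back in an unknown state; model it as zeros.
        self.registers = {key: 0 for key in self.registers}

    def resume(self) -> None:
        if self._saved is None:
            raise RuntimeError(f"{self.name}: resume without a prior suspend")
        self.registers = dict(self._saved)
        self._saved = None


def suspend_all(drivers: List[ToyDriver], log: List[str]) -> None:
    for drv in reversed(drivers):  # last registered goes down first
        drv.suspend()
        log.append(f"suspend {drv.name}")


def resume_all(drivers: List[ToyDriver], log: List[str]) -> None:
    for drv in drivers:  # parents come back before their children
        drv.resume()
        log.append(f"resume {drv.name}")
```

A driver whose `suspend()` forgets one register is exactly the kind of bug that shows up later as hardware that half works after resume.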

 

At this point, the system should be fairly quiescent. Only two things to do now. Firstly, the address of the kernel wakeup code is written to the firmware waking vector, a field in the FACS (Firmware ACPI Control Structure), whose address is contained in the FADT. Secondly, two magic values from the DSDT (the sleep type values for the target state) are written to the PM1 control registers described in the FADT. This usually causes some sort of system management trap, which makes sure that the memory is put in self-refresh mode and actually sequences the machine into suspend. For the S3 power state, this basically involves shutting the machine (other than the RAM) down completely.
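
The register write itself is just bit twiddling. The sketch below composes the value for a PM1 control register, assuming the usual layout with the sleep type in bits 10 to 12 and the sleep enable bit at bit 13. It only computes a number and touches no hardware. The sleep type value 5 used in the usage note is purely an example, since the real value is whatever the firmware's tables say for the target state.

```python
SLP_TYP_SHIFT = 10
SLP_TYP_MASK = 0x7 << SLP_TYP_SHIFT  # three-bit sleep type field
SLP_EN = 1 << 13                     # setting this triggers the transition


def pm1_sleep_value(current: int, slp_typ: int, enable: bool = True) -> int:
    """Return the 16-bit value to write to a PM1 control register.

    current -- the register's present contents (other bits are preserved)
    slp_typ -- firmware-supplied sleep type for the target state (0..7)
    enable  -- whether to set SLP_EN as well
    """
    if not 0 <= slp_typ <= 7:
        raise ValueError("sleep type must fit in three bits")
    value = current & ~(SLP_TYP_MASK | SLP_EN)
    value |= slp_typ << SLP_TYP_SHIFT
    if enable:
        value |= SLP_EN
    return value & 0xFFFF
```

For instance, `hex(pm1_sleep_value(0x0001, 5))` gives `0x3401`: the low bit is kept, the sleep type lands in bits 10 to 12, and bit 13 is set.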

 

Time passes.

 

The user presses the power button. The system switches on, jumps to the BIOS start address, does a certain amount of setup (programming the memory controller and so on) and then looks at the ACPI status register. This tells it that the machine was previously suspended to RAM, so it then jumps to the wakeup address programmed earlier. This leads it to a bunch of real-mode x86 code provided by the kernel, which programs the CPU back into protected mode and restores register state. Suddenly we're running kernel code again.

 

 

From this point onwards, it's much the reverse of the suspend process. We call the ACPI _WAK method, resume all the drivers and restart userspace. The shell script suddenly starts running again and cleans up after itself, reloading any drivers that were unloaded before suspend. As far as userspace is concerned, the only thing that's happened is that the clock has jumped forward.
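
That clock jump is about the only trace userspace can see, and it can be measured. The sketch below relies on Linux behaviour: CLOCK_MONOTONIC stops while the machine is suspended and CLOCK_BOOTTIME keeps counting, so their difference approximates the total time spent asleep since boot. The function name is invented, and on platforms without CLOCK_BOOTTIME it simply returns None.

```python
import time
from typing import Optional


def seconds_spent_suspended() -> Optional[float]:
    """Approximate total time suspended since boot, or None if unknown."""
    boottime_id = getattr(time, "CLOCK_BOOTTIME", None)
    if boottime_id is None or not hasattr(time, "clock_gettime"):
        return None
    try:
        monotonic = time.clock_gettime(time.CLOCK_MONOTONIC)
        boottime = time.clock_gettime(boottime_id)
    except OSError:
        return None
    # Clamp at zero to absorb any small skew between the two readings.
    return max(0.0, boottime - monotonic)
```

Sampling this before and after a suspend cycle and subtracting the two results gives the length of that particular nap.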

 

So why is this difficult?

 

In a lot of cases, it's just down to bugs in the drivers. Restoring hardware state can be hard, especially if you don't actually have all the documentation for the hardware to start with - traditionally, many Linux drivers have ended up depending on the BIOS to have programmed the hardware into a semi-sane state, and there's no guarantee that that will happen with ACPI. Other cases can just be oversights - for instance, the bug in the APIC (not to be confused with ACPI) code that meant a single register wasn't restored, resulting in some machines resuming without any interrupts being delivered.

 

The single biggest problem is video hardware. The spec doesn't require the BIOS to reprogram the video hardware at all, and so often it'll come back in an entirely unprogrammed state. This is an issue, since we (in general) have absolutely no idea how to bring a video card up from scratch. One of the easiest workarounds is to execute code from the video BIOS in the same way that the system BIOS does on machine startup. vbetool lets you do this from userspace, and it works surprisingly often. However, there's no guarantee that it'll be successful. Vendors often unmap that section of BIOS after the system has been brought up, since they've got far more BIOS code than will fit in the BIOS region of the legacy address space. In the long run, the only solution is drivers that know how to program an entirely uninitialised chip. The new modesetting branch of the Intel driver aims to do this, as do the developers of nouveau.

 

Despite all this misery, ACPI support is generally improving. Most machines can now suspend and resume once more. The next big challenge is improving run-time power management in order to get battery life to at least the level it is under Windows, and ideally beyond that.
