Since more than one person asked me to document this somewhere, here is my part of the presentation that Len and I gave at LPC2011. (Yes, as requested, I will also submit this as a patch to Documentation at some point soon.)
This past year I was tasked with enabling runtime power management for most of the drivers for the Moorestown architecture for MeeGo. I was also the MeeGo kernel maintainer for the Moorestown architecture for a few months, and got to review many drivers during that time. Below are the top 10 mistakes I saw driver writers make (over and over and over again) while attempting to implement runtime power management.
- Not understanding the general concept of runtime power management. Runtime power management is initiated by the kernel and is transparent to user space. YOU determine when you are not busy and sleep then; you don’t wait for someone to tell you to.
- Not understanding what your subsystem does for runtime PM. I saw many driver writers duplicating work that is done in their subsystem core.
- Make sure you understand all the possible entry points to your driver. Sysfs is an entry point.
- Make sure you understand when you are actually touching hardware.
- Make sure you understand the context in which you are being called during these entry points (i.e. are you called in interrupt context?).
- For drivers that may have entry points called in interrupt context, you must ensure that the hardware is available without sleeping; otherwise, use a work queue.
- Ref counting during the irq handler does cost cycles. This may be OK, depending on your device. If you have to do ref counting during the irq handler, use a threaded irq handler if possible (see the previous point).
- Have the correct granularity for your ref counting. Depending on your hardware, there is likely some kind of performance penalty for entering a suspended state. You may even lose device state. Restoring can take time, so you want to make sure you are appropriately idle before allowing a suspend.
- Unbalanced ref counting occurs when you do not pay attention to error paths.
- Unbalanced ref counting can occur when you try to do something tricky with the ref counting to work around overly complex code paths (like setting your pm usage count directly to zero). This will cost you hours and hours of debug time. If it isn’t obvious where to increment/decrement your counter, you are doing something wrong.
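To make the last two points concrete, here is a minimal userspace sketch of balanced ref counting across an error path. It is not kernel code: `struct model_dev`, `model_pm_get()`, `model_pm_put()`, `model_hw_transfer()` and `model_do_io()` are all hypothetical names invented for illustration. In a real driver the get/put pair would roughly correspond to calls such as `pm_runtime_get_sync()` and `pm_runtime_put()`, but the model below only assumes a simple usage counter that powers the device on the 0 -> 1 transition and off on the 1 -> 0 transition.

```c
#include <stdbool.h>

/* Hypothetical userspace model of a device's runtime PM state. */
struct model_dev {
	int usage_count;   /* outstanding "get" references */
	bool powered;      /* true while the modeled hardware is on */
	int resume_calls;  /* how many times we paid the resume cost */
};

/* Take a reference; power the device up on the 0 -> 1 transition. */
void model_pm_get(struct model_dev *dev)
{
	if (dev->usage_count++ == 0) {
		dev->powered = true;
		dev->resume_calls++;
	}
}

/* Drop a reference; power the device down on the 1 -> 0 transition. */
void model_pm_put(struct model_dev *dev)
{
	if (--dev->usage_count == 0)
		dev->powered = false;
}

/*
 * Hypothetical hardware access. Returns -1 if the device is touched
 * while powered off, -2 on an injected failure, 0 on success.
 */
int model_hw_transfer(struct model_dev *dev, bool inject_error)
{
	if (!dev->powered)
		return -1;
	return inject_error ? -2 : 0;
}

/*
 * Driver entry point. Exactly one get at the top and exactly one put
 * at the single exit label, so the error path cannot leak a reference.
 */
int model_do_io(struct model_dev *dev, bool inject_error)
{
	int ret;

	model_pm_get(dev);

	ret = model_hw_transfer(dev, inject_error);
	if (ret)
		goto out;   /* error path still reaches the put below */

	ret = model_hw_transfer(dev, false);
out:
	model_pm_put(dev);
	return ret;
}
```

Calling `model_do_io()` once successfully and once with an injected error leaves `usage_count` at zero both times. If you instead returned early from the error branch, the counter would stay at one and the modeled device would never power down again, which is the kind of imbalance that tempts people into resetting the count by hand.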