diff --git a/Documentation/driver-api/pm/devices.rst b/Documentation/driver-api/pm/devices.rst index f66c7b9126ea..946ad0b94e31 100644 --- a/Documentation/driver-api/pm/devices.rst +++ b/Documentation/driver-api/pm/devices.rst @@ -349,7 +349,7 @@ the phases are: ``prepare``, ``suspend``, ``suspend_late``, ``suspend_noirq``. PM core will skip the ``suspend``, ``suspend_late`` and ``suspend_noirq`` phases as well as all of the corresponding phases of the subsequent device resume for all of these devices. In that case, - the ``->complete`` callback will be invoked directly after the + the ``->complete`` callback will be the next one invoked after the ``->prepare`` callback and is entirely responsible for putting the device into a consistent state as appropriate. @@ -361,9 +361,9 @@ the phases are: ``prepare``, ``suspend``, ``suspend_late``, ``suspend_noirq``. runtime PM disabled. This feature also can be controlled by device drivers by using the - ``DPM_FLAG_NEVER_SKIP`` and ``DPM_FLAG_SMART_PREPARE`` driver power - management flags. [Typically, they are set at the time the driver is - probed against the device in question by passing them to the + ``DPM_FLAG_NO_DIRECT_COMPLETE`` and ``DPM_FLAG_SMART_PREPARE`` driver + power management flags. [Typically, they are set at the time the driver + is probed against the device in question by passing them to the :c:func:`dev_pm_set_driver_flags` helper function.] If the first of these flags is set, the PM core will not apply the direct-complete procedure described above to the given device and, consequenty, to any @@ -383,11 +383,15 @@ the phases are: ``prepare``, ``suspend``, ``suspend_late``, ``suspend_noirq``. ``->suspend`` methods provided by subsystems (bus types and PM domains in particular) must follow an additional rule regarding what can be done to the devices before their drivers' ``->suspend`` methods are called. - Namely, they can only resume the devices from runtime suspend by - calling :c:func:`pm_runtime_resume` for them, if that is necessary, and + Namely, they may resume the devices from runtime suspend by + calling :c:func:`pm_runtime_resume` for them, if that is necessary, but they must not update the state of the devices in any other way at that time (in case the drivers need to resume the devices from runtime - suspend in their ``->suspend`` methods). + suspend in their ``->suspend`` methods). In fact, the PM core prevents + subsystems or drivers from putting devices into runtime suspend at + these times by calling :c:func:`pm_runtime_get_noresume` before issuing + the ``->prepare`` callback (and calling :c:func:`pm_runtime_put` after + issuing the ``->complete`` callback). 3. For a number of devices it is convenient to split suspend into the "quiesce device" and "save device state" phases, in which cases @@ -459,22 +463,22 @@ When resuming from freeze, standby or memory sleep, the phases are: Note, however, that new children may be registered below the device as soon as the ``->resume`` callbacks occur; it's not necessary to wait - until the ``complete`` phase with that. + until the ``complete`` phase runs. Moreover, if the preceding ``->prepare`` callback returned a positive number, the device may have been left in runtime suspend throughout the - whole system suspend and resume (the ``suspend``, ``suspend_late``, - ``suspend_noirq`` phases of system suspend and the ``resume_noirq``, - ``resume_early``, ``resume`` phases of system resume may have been - skipped for it). In that case, the ``->complete`` callback is entirely + whole system suspend and resume (its ``->suspend``, ``->suspend_late``, + ``->suspend_noirq``, ``->resume_noirq``, + ``->resume_early``, and ``->resume`` callbacks may have been + skipped). In that case, the ``->complete`` callback is entirely responsible for putting the device into a consistent state after system suspend if necessary. [For example, it may need to queue up a runtime resume request for the device for this purpose.] To check if that is the case, the ``->complete`` callback can consult the device's - ``power.direct_complete`` flag. Namely, if that flag is set when the - ``->complete`` callback is being run, it has been called directly after - the preceding ``->prepare`` and special actions may be required - to make the device work correctly afterward. + ``power.direct_complete`` flag. If that flag is set when the + ``->complete`` callback is being run then the direct-complete mechanism + was used, and special actions may be required to make the device work + correctly afterward. At the end of these phases, drivers should be as functional as they were before suspending: I/O can be performed using DMA and IRQs, and the relevant clocks are @@ -575,10 +579,12 @@ and the phases are similar. The ``->poweroff``, ``->poweroff_late`` and ``->poweroff_noirq`` callbacks should do essentially the same things as the ``->suspend``, ``->suspend_late`` -and ``->suspend_noirq`` callbacks, respectively. The only notable difference is +and ``->suspend_noirq`` callbacks, respectively. A notable difference is that they need not store the device register values, because the registers should already have been stored during the ``freeze``, ``freeze_late`` or -``freeze_noirq`` phases. +``freeze_noirq`` phases. Also, on many machines the firmware will power-down +the entire system, so it is not necessary for the callback to put the device in +a low-power state. Leaving Hibernation @@ -764,70 +770,119 @@ device driver in question. If it is necessary to resume a device from runtime suspend during a system-wide transition into a sleep state, that can be done by calling -:c:func:`pm_runtime_resume` for it from the ``->suspend`` callback (or its -couterpart for transitions related to hibernation) of either the device's driver -or a subsystem responsible for it (for example, a bus type or a PM domain). -That is guaranteed to work by the requirement that subsystems must not change -the state of devices (possibly except for resuming them from runtime suspend) +:c:func:`pm_runtime_resume` from the ``->suspend`` callback (or the ``->freeze`` +or ``->poweroff`` callback for transitions related to hibernation) of either the +device's driver or its subsystem (for example, a bus type or a PM domain). +However, subsystems must not otherwise change the runtime status of devices from their ``->prepare`` and ``->suspend`` callbacks (or equivalent) *before* invoking device drivers' ``->suspend`` callbacks (or equivalent). +.. _smart_suspend_flag: + +The ``DPM_FLAG_SMART_SUSPEND`` Driver Flag +------------------------------------------ + Some bus types and PM domains have a policy to resume all devices from runtime suspend upfront in their ``->suspend`` callbacks, but that may not be really -necessary if the driver of the device can cope with runtime-suspended devices. -The driver can indicate that by setting ``DPM_FLAG_SMART_SUSPEND`` in -:c:member:`power.driver_flags` at the probe time, by passing it to the -:c:func:`dev_pm_set_driver_flags` helper. That also may cause middle-layer code +necessary if the device's driver can cope with runtime-suspended devices. +The driver can indicate this by setting ``DPM_FLAG_SMART_SUSPEND`` in +:c:member:`power.driver_flags` at probe time, with the assistance of the +:c:func:`dev_pm_set_driver_flags` helper routine. + +Setting that flag causes the PM core and middle-layer code (bus types, PM domains etc.) to skip the ``->suspend_late`` and ``->suspend_noirq`` callbacks provided by the driver if the device remains in -runtime suspend at the beginning of the ``suspend_late`` phase of system-wide -suspend (or in the ``poweroff_late`` phase of hibernation), when runtime PM -has been disabled for it, under the assumption that its state should not change -after that point until the system-wide transition is over (the PM core itself -does that for devices whose "noirq", "late" and "early" system-wide PM callbacks -are executed directly by it). If that happens, the driver's system-wide resume -callbacks, if present, may still be invoked during the subsequent system-wide -resume transition and the device's runtime power management status may be set -to "active" before enabling runtime PM for it, so the driver must be prepared to -cope with the invocation of its system-wide resume callbacks back-to-back with -its ``->runtime_suspend`` one (without the intervening ``->runtime_resume`` and -so on) and the final state of the device must reflect the "active" runtime PM -status in that case. +runtime suspend throughout those phases of the system-wide suspend (and +similarly for the "freeze" and "poweroff" parts of system hibernation). +[Otherwise the same driver +callback might be executed twice in a row for the same device, which would not +be valid in general.] If the middle-layer system-wide PM callbacks are present +for the device then they are responsible for skipping these driver callbacks; +if not then the PM core skips them. The subsystem callback routines can +determine whether they need to skip the driver callbacks by testing the return +value from the :c:func:`dev_pm_skip_suspend` helper function. + +In addition, with ``DPM_FLAG_SMART_SUSPEND`` set, the driver's ``->thaw_noirq`` +and ``->thaw_early`` callbacks are skipped in hibernation if the device remained +in runtime suspend throughout the preceding "freeze" transition. Again, if the +middle-layer callbacks are present for the device, they are responsible for +doing this, otherwise the PM core takes care of it. + + +The ``DPM_FLAG_MAY_SKIP_RESUME`` Driver Flag +-------------------------------------------- During system-wide resume from a sleep state it's easiest to put devices into the full-power state, as explained in :file:`Documentation/power/runtime_pm.rst`. [Refer to that document for more information regarding this particular issue as well as for information on the device runtime power management framework in -general.] - -However, it often is desirable to leave devices in suspend after system -transitions to the working state, especially if those devices had been in +general.] However, it often is desirable to leave devices in suspend after +system transitions to the working state, especially if those devices had been in runtime suspend before the preceding system-wide suspend (or analogous) -transition. Device drivers can use the ``DPM_FLAG_LEAVE_SUSPENDED`` flag to -indicate to the PM core (and middle-layer code) that they prefer the specific -devices handled by them to be left suspended and they have no problems with -skipping their system-wide resume callbacks for this reason. Whether or not the -devices will actually be left in suspend may depend on their state before the -given system suspend-resume cycle and on the type of the system transition under -way. In particular, devices are not left suspended if that transition is a -restore from hibernation, as device states are not guaranteed to be reflected -by the information stored in the hibernation image in that case. +transition. -The middle-layer code involved in the handling of the device is expected to -indicate to the PM core if the device may be left in suspend by setting its -:c:member:`power.may_skip_resume` status bit which is checked by the PM core -during the "noirq" phase of the preceding system-wide suspend (or analogous) -transition. The middle layer is then responsible for handling the device as -appropriate in its "noirq" resume callback, which is executed regardless of -whether or not the device is left suspended, but the other resume callbacks -(except for ``->complete``) will be skipped automatically by the PM core if the -device really can be left in suspend. +To that end, device drivers can use the ``DPM_FLAG_MAY_SKIP_RESUME`` flag to +indicate to the PM core and middle-layer code that they allow their "noirq" and +"early" resume callbacks to be skipped if the device can be left in suspend +after system-wide PM transitions to the working state. Whether or not that is +the case generally depends on the state of the device before the given system +suspend-resume cycle and on the type of the system transition under way. +In particular, the "thaw" and "restore" transitions related to hibernation are +not affected by ``DPM_FLAG_MAY_SKIP_RESUME`` at all. [All callbacks are +issued during the "restore" transition regardless of the flag settings, +and whether or not any driver callbacks +are skipped during the "thaw" transition depends whether or not the +``DPM_FLAG_SMART_SUSPEND`` flag is set (see `above `_). +In addition, a device is not allowed to remain in runtime suspend if any of its +children will be returned to full power.] -For devices whose "noirq", "late" and "early" driver callbacks are invoked -directly by the PM core, all of the system-wide resume callbacks are skipped if -``DPM_FLAG_LEAVE_SUSPENDED`` is set and the device is in runtime suspend during -the ``suspend_noirq`` (or analogous) phase or the transition under way is a -proper system suspend (rather than anything related to hibernation) and the -device's wakeup settings are suitable for runtime PM (that is, it cannot -generate wakeup signals at all or it is allowed to wake up the system from -sleep). +The ``DPM_FLAG_MAY_SKIP_RESUME`` flag is taken into account in combination with +the :c:member:`power.may_skip_resume` status bit set by the PM core during the +"suspend" phase of suspend-type transitions. If the driver or the middle layer +has a reason to prevent the driver's "noirq" and "early" resume callbacks from +being skipped during the subsequent system resume transition, it should +clear :c:member:`power.may_skip_resume` in its ``->suspend``, ``->suspend_late`` +or ``->suspend_noirq`` callback. [Note that the drivers setting +``DPM_FLAG_SMART_SUSPEND`` need to clear :c:member:`power.may_skip_resume` in +their ``->suspend`` callback in case the other two are skipped.] + +Setting the :c:member:`power.may_skip_resume` status bit along with the +``DPM_FLAG_MAY_SKIP_RESUME`` flag is necessary, but generally not sufficient, +for the driver's "noirq" and "early" resume callbacks to be skipped. Whether or +not they should be skipped can be determined by evaluating the +:c:func:`dev_pm_skip_resume` helper function. + +If that function returns ``true``, the driver's "noirq" and "early" resume +callbacks should be skipped and the device's runtime PM status will be set to +"suspended" by the PM core. Otherwise, if the device was runtime-suspended +during the preceding system-wide suspend transition and its +``DPM_FLAG_SMART_SUSPEND`` is set, its runtime PM status will be set to +"active" by the PM core. [Hence, the drivers that do not set +``DPM_FLAG_SMART_SUSPEND`` should not expect the runtime PM status of their +devices to be changed from "suspended" to "active" by the PM core during +system-wide resume-type transitions.] + +If the ``DPM_FLAG_MAY_SKIP_RESUME`` flag is not set for a device, but +``DPM_FLAG_SMART_SUSPEND`` is set and the driver's "late" and "noirq" suspend +callbacks are skipped, its system-wide "noirq" and "early" resume callbacks, if +present, are invoked as usual and the device's runtime PM status is set to +"active" by the PM core before enabling runtime PM for it. In that case, the +driver must be prepared to cope with the invocation of its system-wide resume +callbacks back-to-back with its ``->runtime_suspend`` one (without the +intervening ``->runtime_resume`` and system-wide suspend callbacks) and the +final state of the device must reflect the "active" runtime PM status in that +case. [Note that this is not a problem at all if the driver's +``->suspend_late`` callback pointer points to the same function as its +``->runtime_suspend`` one and its ``->resume_early`` callback pointer points to +the same function as the ``->runtime_resume`` one, while none of the other +system-wide suspend-resume callbacks of the driver are present, for example.] + +Likewise, if ``DPM_FLAG_MAY_SKIP_RESUME`` is set for a device, its driver's +system-wide "noirq" and "early" resume callbacks may be skipped while its "late" +and "noirq" suspend callbacks may have been executed (in principle, regardless +of whether or not ``DPM_FLAG_SMART_SUSPEND`` is set). In that case, the driver +needs to be able to cope with the invocation of its ``->runtime_resume`` +callback back-to-back with its "late" and "noirq" suspend ones. [For instance, +that is not a concern if the driver sets both ``DPM_FLAG_SMART_SUSPEND`` and +``DPM_FLAG_MAY_SKIP_RESUME`` and uses the same pair of suspend/resume callback +functions for runtime PM and system-wide suspend/resume.] diff --git a/Documentation/power/pci.rst b/Documentation/power/pci.rst index 0924d29636ad..1831e431f725 100644 --- a/Documentation/power/pci.rst +++ b/Documentation/power/pci.rst @@ -1004,41 +1004,39 @@ including the PCI bus type. The flags should be set once at the driver probe time with the help of the dev_pm_set_driver_flags() function and they should not be updated directly afterwards. -The DPM_FLAG_NEVER_SKIP flag prevents the PM core from using the direct-complete -mechanism allowing device suspend/resume callbacks to be skipped if the device -is in runtime suspend when the system suspend starts. That also affects all of -the ancestors of the device, so this flag should only be used if absolutely -necessary. +The DPM_FLAG_NO_DIRECT_COMPLETE flag prevents the PM core from using the +direct-complete mechanism allowing device suspend/resume callbacks to be skipped +if the device is in runtime suspend when the system suspend starts. That also +affects all of the ancestors of the device, so this flag should only be used if +absolutely necessary. -The DPM_FLAG_SMART_PREPARE flag instructs the PCI bus type to only return a -positive value from pci_pm_prepare() if the ->prepare callback provided by the +The DPM_FLAG_SMART_PREPARE flag causes the PCI bus type to return a positive +value from pci_pm_prepare() only if the ->prepare callback provided by the driver of the device returns a positive value. That allows the driver to opt -out from using the direct-complete mechanism dynamically. +out from using the direct-complete mechanism dynamically (whereas setting +DPM_FLAG_NO_DIRECT_COMPLETE means permanent opt-out). The DPM_FLAG_SMART_SUSPEND flag tells the PCI bus type that from the driver's perspective the device can be safely left in runtime suspend during system suspend. That causes pci_pm_suspend(), pci_pm_freeze() and pci_pm_poweroff() -to skip resuming the device from runtime suspend unless there are PCI-specific -reasons for doing that. Also, it causes pci_pm_suspend_late/noirq(), -pci_pm_freeze_late/noirq() and pci_pm_poweroff_late/noirq() to return early -if the device remains in runtime suspend in the beginning of the "late" phase -of the system-wide transition under way. Moreover, if the device is in -runtime suspend in pci_pm_resume_noirq() or pci_pm_restore_noirq(), its runtime -power management status will be changed to "active" (as it is going to be put -into D0 going forward), but if it is in runtime suspend in pci_pm_thaw_noirq(), -the function will set the power.direct_complete flag for it (to make the PM core -skip the subsequent "thaw" callbacks for it) and return. +to avoid resuming the device from runtime suspend unless there are PCI-specific +reasons for doing that. Also, it causes pci_pm_suspend_late/noirq() and +pci_pm_poweroff_late/noirq() to return early if the device remains in runtime +suspend during the "late" phase of the system-wide transition under way. +Moreover, if the device is in runtime suspend in pci_pm_resume_noirq() or +pci_pm_restore_noirq(), its runtime PM status will be changed to "active" (as it +is going to be put into D0 going forward). -Setting the DPM_FLAG_LEAVE_SUSPENDED flag means that the driver prefers the -device to be left in suspend after system-wide transitions to the working state. -This flag is checked by the PM core, but the PCI bus type informs the PM core -which devices may be left in suspend from its perspective (that happens during -the "noirq" phase of system-wide suspend and analogous transitions) and next it -uses the dev_pm_may_skip_resume() helper to decide whether or not to return from -pci_pm_resume_noirq() early, as the PM core will skip the remaining resume -callbacks for the device during the transition under way and will set its -runtime PM status to "suspended" if dev_pm_may_skip_resume() returns "true" for -it. +Setting the DPM_FLAG_MAY_SKIP_RESUME flag means that the driver allows its +"noirq" and "early" resume callbacks to be skipped if the device can be left +in suspend after a system-wide transition into the working state. This flag is +taken into consideration by the PM core along with the power.may_skip_resume +status bit of the device which is set by pci_pm_suspend_noirq() in certain +situations. If the PM core determines that the driver's "noirq" and "early" +resume callbacks should be skipped, the dev_pm_skip_resume() helper function +will return "true" and that will cause pci_pm_resume_noirq() and +pci_pm_resume_early() to return upfront without touching the device and +executing the driver callbacks. 3.2. Device Runtime Power Management ------------------------------------ diff --git a/drivers/acpi/acpi_lpss.c b/drivers/acpi/acpi_lpss.c index dee999938213..5e2bfbcf526f 100644 --- a/drivers/acpi/acpi_lpss.c +++ b/drivers/acpi/acpi_lpss.c @@ -1041,7 +1041,7 @@ static int acpi_lpss_do_suspend_late(struct device *dev) { int ret; - if (dev_pm_smart_suspend_and_suspended(dev)) + if (dev_pm_skip_suspend(dev)) return 0; ret = pm_generic_suspend_late(dev); @@ -1093,6 +1093,9 @@ static int acpi_lpss_resume_early(struct device *dev) if (pdata->dev_desc->resume_from_noirq) return 0; + if (dev_pm_skip_resume(dev)) + return 0; + return acpi_lpss_do_resume_early(dev); } @@ -1102,12 +1105,9 @@ static int acpi_lpss_resume_noirq(struct device *dev) int ret; /* Follow acpi_subsys_resume_noirq(). */ - if (dev_pm_may_skip_resume(dev)) + if (dev_pm_skip_resume(dev)) return 0; - if (dev_pm_smart_suspend_and_suspended(dev)) - pm_runtime_set_active(dev); - ret = pm_generic_resume_noirq(dev); if (ret) return ret; @@ -1169,7 +1169,7 @@ static int acpi_lpss_poweroff_late(struct device *dev) { struct lpss_private_data *pdata = acpi_driver_data(ACPI_COMPANION(dev)); - if (dev_pm_smart_suspend_and_suspended(dev)) + if (dev_pm_skip_suspend(dev)) return 0; if (pdata->dev_desc->resume_from_noirq) @@ -1182,7 +1182,7 @@ static int acpi_lpss_poweroff_noirq(struct device *dev) { struct lpss_private_data *pdata = acpi_driver_data(ACPI_COMPANION(dev)); - if (dev_pm_smart_suspend_and_suspended(dev)) + if (dev_pm_skip_suspend(dev)) return 0; if (pdata->dev_desc->resume_from_noirq) { diff --git a/drivers/acpi/acpi_tad.c b/drivers/acpi/acpi_tad.c index 33a4bcdaa4d7..7d45cce0c3c1 100644 --- a/drivers/acpi/acpi_tad.c +++ b/drivers/acpi/acpi_tad.c @@ -624,7 +624,7 @@ static int acpi_tad_probe(struct platform_device *pdev) */ device_init_wakeup(dev, true); dev_pm_set_driver_flags(dev, DPM_FLAG_SMART_SUSPEND | - DPM_FLAG_LEAVE_SUSPENDED); + DPM_FLAG_MAY_SKIP_RESUME); /* * The platform bus type layer tells the ACPI PM domain powers up the * device, so set the runtime PM status of it to "active". diff --git a/drivers/acpi/device_pm.c b/drivers/acpi/device_pm.c index 5832bc10aca8..b44b12a931e7 100644 --- a/drivers/acpi/device_pm.c +++ b/drivers/acpi/device_pm.c @@ -1084,7 +1084,7 @@ int acpi_subsys_suspend_late(struct device *dev) { int ret; - if (dev_pm_smart_suspend_and_suspended(dev)) + if (dev_pm_skip_suspend(dev)) return 0; ret = pm_generic_suspend_late(dev); @@ -1100,10 +1100,8 @@ int acpi_subsys_suspend_noirq(struct device *dev) { int ret; - if (dev_pm_smart_suspend_and_suspended(dev)) { - dev->power.may_skip_resume = true; + if (dev_pm_skip_suspend(dev)) return 0; - } ret = pm_generic_suspend_noirq(dev); if (ret) @@ -1116,8 +1114,8 @@ int acpi_subsys_suspend_noirq(struct device *dev) * acpi_subsys_complete() to take care of fixing up the device's state * anyway, if need be. */ - dev->power.may_skip_resume = device_may_wakeup(dev) || - !device_can_wakeup(dev); + if (device_can_wakeup(dev) && !device_may_wakeup(dev)) + dev->power.may_skip_resume = false; return 0; } @@ -1129,17 +1127,9 @@ EXPORT_SYMBOL_GPL(acpi_subsys_suspend_noirq); */ static int acpi_subsys_resume_noirq(struct device *dev) { - if (dev_pm_may_skip_resume(dev)) + if (dev_pm_skip_resume(dev)) return 0; - /* - * Devices with DPM_FLAG_SMART_SUSPEND may be left in runtime suspend - * during system suspend, so update their runtime PM status to "active" - * as they will be put into D0 going forward. - */ - if (dev_pm_smart_suspend_and_suspended(dev)) - pm_runtime_set_active(dev); - return pm_generic_resume_noirq(dev); } @@ -1153,7 +1143,12 @@ static int acpi_subsys_resume_noirq(struct device *dev) */ static int acpi_subsys_resume_early(struct device *dev) { - int ret = acpi_dev_resume(dev); + int ret; + + if (dev_pm_skip_resume(dev)) + return 0; + + ret = acpi_dev_resume(dev); return ret ? ret : pm_generic_resume_early(dev); } @@ -1218,7 +1213,7 @@ static int acpi_subsys_poweroff_late(struct device *dev) { int ret; - if (dev_pm_smart_suspend_and_suspended(dev)) + if (dev_pm_skip_suspend(dev)) return 0; ret = pm_generic_poweroff_late(dev); @@ -1234,7 +1229,7 @@ static int acpi_subsys_poweroff_late(struct device *dev) */ static int acpi_subsys_poweroff_noirq(struct device *dev) { - if (dev_pm_smart_suspend_and_suspended(dev)) + if (dev_pm_skip_suspend(dev)) return 0; return pm_generic_poweroff_noirq(dev); diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c index 0e07e17c2def..bb98b813554f 100644 --- a/drivers/base/power/main.c +++ b/drivers/base/power/main.c @@ -562,72 +562,26 @@ static void dpm_watchdog_clear(struct dpm_watchdog *wd) /*------------------------- Resume routines -------------------------*/ /** - * suspend_event - Return a "suspend" message for given "resume" one. - * @resume_msg: PM message representing a system-wide resume transition. - */ -static pm_message_t suspend_event(pm_message_t resume_msg) -{ - switch (resume_msg.event) { - case PM_EVENT_RESUME: - return PMSG_SUSPEND; - case PM_EVENT_THAW: - case PM_EVENT_RESTORE: - return PMSG_FREEZE; - case PM_EVENT_RECOVER: - return PMSG_HIBERNATE; - } - return PMSG_ON; -} - -/** - * dev_pm_may_skip_resume - System-wide device resume optimization check. + * dev_pm_skip_resume - System-wide device resume optimization check. * @dev: Target device. * - * Checks whether or not the device may be left in suspend after a system-wide - * transition to the working state. + * Return: + * - %false if the transition under way is RESTORE. + * - Return value of dev_pm_skip_suspend() if the transition under way is THAW. + * - The logical negation of %power.must_resume otherwise (that is, when the + * transition under way is RESUME). */ -bool dev_pm_may_skip_resume(struct device *dev) +bool dev_pm_skip_resume(struct device *dev) { - return !dev->power.must_resume && pm_transition.event != PM_EVENT_RESTORE; + if (pm_transition.event == PM_EVENT_RESTORE) + return false; + + if (pm_transition.event == PM_EVENT_THAW) + return dev_pm_skip_suspend(dev); + + return !dev->power.must_resume; } -static pm_callback_t dpm_subsys_resume_noirq_cb(struct device *dev, - pm_message_t state, - const char **info_p) -{ - pm_callback_t callback; - const char *info; - - if (dev->pm_domain) { - info = "noirq power domain "; - callback = pm_noirq_op(&dev->pm_domain->ops, state); - } else if (dev->type && dev->type->pm) { - info = "noirq type "; - callback = pm_noirq_op(dev->type->pm, state); - } else if (dev->class && dev->class->pm) { - info = "noirq class "; - callback = pm_noirq_op(dev->class->pm, state); - } else if (dev->bus && dev->bus->pm) { - info = "noirq bus "; - callback = pm_noirq_op(dev->bus->pm, state); - } else { - return NULL; - } - - if (info_p) - *info_p = info; - - return callback; -} - -static pm_callback_t dpm_subsys_suspend_noirq_cb(struct device *dev, - pm_message_t state, - const char **info_p); - -static pm_callback_t dpm_subsys_suspend_late_cb(struct device *dev, - pm_message_t state, - const char **info_p); - /** * device_resume_noirq - Execute a "noirq resume" callback for given device. * @dev: Device to handle. @@ -639,8 +593,8 @@ static pm_callback_t dpm_subsys_suspend_late_cb(struct device *dev, */ static int device_resume_noirq(struct device *dev, pm_message_t state, bool async) { - pm_callback_t callback; - const char *info; + pm_callback_t callback = NULL; + const char *info = NULL; bool skip_resume; int error = 0; @@ -656,37 +610,41 @@ static int device_resume_noirq(struct device *dev, pm_message_t state, bool asyn if (!dpm_wait_for_superior(dev, async)) goto Out; - skip_resume = dev_pm_may_skip_resume(dev); + skip_resume = dev_pm_skip_resume(dev); + /* + * If the driver callback is skipped below or by the middle layer + * callback and device_resume_early() also skips the driver callback for + * this device later, it needs to appear as "suspended" to PM-runtime, + * so change its status accordingly. + * + * Otherwise, the device is going to be resumed, so set its PM-runtime + * status to "active", but do that only if DPM_FLAG_SMART_SUSPEND is set + * to avoid confusing drivers that don't use it. + */ + if (skip_resume) + pm_runtime_set_suspended(dev); + else if (dev_pm_skip_suspend(dev)) + pm_runtime_set_active(dev); - callback = dpm_subsys_resume_noirq_cb(dev, state, &info); + if (dev->pm_domain) { + info = "noirq power domain "; + callback = pm_noirq_op(&dev->pm_domain->ops, state); + } else if (dev->type && dev->type->pm) { + info = "noirq type "; + callback = pm_noirq_op(dev->type->pm, state); + } else if (dev->class && dev->class->pm) { + info = "noirq class "; + callback = pm_noirq_op(dev->class->pm, state); + } else if (dev->bus && dev->bus->pm) { + info = "noirq bus "; + callback = pm_noirq_op(dev->bus->pm, state); + } if (callback) goto Run; if (skip_resume) goto Skip; - if (dev_pm_smart_suspend_and_suspended(dev)) { - pm_message_t suspend_msg = suspend_event(state); - - /* - * If "freeze" callbacks have been skipped during a transition - * related to hibernation, the subsequent "thaw" callbacks must - * be skipped too or bad things may happen. Otherwise, resume - * callbacks are going to be run for the device, so its runtime - * PM status must be changed to reflect the new state after the - * transition under way. - */ - if (!dpm_subsys_suspend_late_cb(dev, suspend_msg, NULL) && - !dpm_subsys_suspend_noirq_cb(dev, suspend_msg, NULL)) { - if (state.event == PM_EVENT_THAW) { - skip_resume = true; - goto Skip; - } else { - pm_runtime_set_active(dev); - } - } - } - if (dev->driver && dev->driver->pm) { info = "noirq driver "; callback = pm_noirq_op(dev->driver->pm, state); @@ -698,20 +656,6 @@ Run: Skip: dev->power.is_noirq_suspended = false; - if (skip_resume) { - /* Make the next phases of resume skip the device. */ - dev->power.is_late_suspended = false; - dev->power.is_suspended = false; - /* - * The device is going to be left in suspend, but it might not - * have been in runtime suspend before the system suspended, so - * its runtime PM status needs to be updated to avoid confusing - * the runtime PM framework when runtime PM is enabled for the - * device again. - */ - pm_runtime_set_suspended(dev); - } - Out: complete_all(&dev->power.completion); TRACE_RESUME(error); @@ -810,35 +754,6 @@ void dpm_resume_noirq(pm_message_t state) cpuidle_resume(); } -static pm_callback_t dpm_subsys_resume_early_cb(struct device *dev, - pm_message_t state, - const char **info_p) -{ - pm_callback_t callback; - const char *info; - - if (dev->pm_domain) { - info = "early power domain "; - callback = pm_late_early_op(&dev->pm_domain->ops, state); - } else if (dev->type && dev->type->pm) { - info = "early type "; - callback = pm_late_early_op(dev->type->pm, state); - } else if (dev->class && dev->class->pm) { - info = "early class "; - callback = pm_late_early_op(dev->class->pm, state); - } else if (dev->bus && dev->bus->pm) { - info = "early bus "; - callback = pm_late_early_op(dev->bus->pm, state); - } else { - return NULL; - } - - if (info_p) - *info_p = info; - - return callback; -} - /** * device_resume_early - Execute an "early resume" callback for given device. * @dev: Device to handle. @@ -849,8 +764,8 @@ static pm_callback_t dpm_subsys_resume_early_cb(struct device *dev, */ static int device_resume_early(struct device *dev, pm_message_t state, bool async) { - pm_callback_t callback; - const char *info; + pm_callback_t callback = NULL; + const char *info = NULL; int error = 0; TRACE_DEVICE(dev); @@ -865,17 +780,37 @@ static int device_resume_early(struct device *dev, pm_message_t state, bool asyn if (!dpm_wait_for_superior(dev, async)) goto Out; - callback = dpm_subsys_resume_early_cb(dev, state, &info); + if (dev->pm_domain) { + info = "early power domain "; + callback = pm_late_early_op(&dev->pm_domain->ops, state); + } else if (dev->type && dev->type->pm) { + info = "early type "; + callback = pm_late_early_op(dev->type->pm, state); + } else if (dev->class && dev->class->pm) { + info = "early class "; + callback = pm_late_early_op(dev->class->pm, state); + } else if (dev->bus && dev->bus->pm) { + info = "early bus "; + callback = pm_late_early_op(dev->bus->pm, state); + } + if (callback) + goto Run; - if (!callback && dev->driver && dev->driver->pm) { + if (dev_pm_skip_resume(dev)) + goto Skip; + + if (dev->driver && dev->driver->pm) { info = "early driver "; callback = pm_late_early_op(dev->driver->pm, state); } +Run: error = dpm_run_callback(callback, dev, state, info); + +Skip: dev->power.is_late_suspended = false; - Out: +Out: TRACE_RESUME(error); pm_runtime_enable(dev); @@ -1245,61 +1180,6 @@ static void dpm_superior_set_must_resume(struct device *dev) device_links_read_unlock(idx); } -static pm_callback_t dpm_subsys_suspend_noirq_cb(struct device *dev, - pm_message_t state, - const char **info_p) -{ - pm_callback_t callback; - const char *info; - - if (dev->pm_domain) { - info = "noirq power domain "; - callback = pm_noirq_op(&dev->pm_domain->ops, state); - } else if (dev->type && dev->type->pm) { - info = "noirq type "; - callback = pm_noirq_op(dev->type->pm, state); - } else if (dev->class && dev->class->pm) { - info = "noirq class "; - callback = pm_noirq_op(dev->class->pm, state); - } else if (dev->bus && dev->bus->pm) { - info = "noirq bus "; - callback = pm_noirq_op(dev->bus->pm, state); - } else { - return NULL; - } - - if (info_p) - *info_p = info; - - return callback; -} - -static bool device_must_resume(struct device *dev, pm_message_t state, - bool no_subsys_suspend_noirq) -{ - pm_message_t resume_msg = resume_event(state); - - /* - * If all of the device driver's "noirq", "late" and "early" callbacks - * are invoked directly by the core, the decision to allow the device to - * stay in suspend can be based on its current runtime PM status and its - * wakeup settings. - */ - if (no_subsys_suspend_noirq && - !dpm_subsys_suspend_late_cb(dev, state, NULL) && - !dpm_subsys_resume_early_cb(dev, resume_msg, NULL) && - !dpm_subsys_resume_noirq_cb(dev, resume_msg, NULL)) - return !pm_runtime_status_suspended(dev) && - (resume_msg.event != PM_EVENT_RESUME || - (device_can_wakeup(dev) && !device_may_wakeup(dev))); - - /* - * The only safe strategy here is to require that if the device may not - * be left in suspend, resume callbacks must be invoked for it. - */ - return !dev->power.may_skip_resume; -} - /** * __device_suspend_noirq - Execute a "noirq suspend" callback for given device. * @dev: Device to handle. @@ -1311,9 +1191,8 @@ static bool device_must_resume(struct device *dev, pm_message_t state, */ static int __device_suspend_noirq(struct device *dev, pm_message_t state, bool async) { - pm_callback_t callback; - const char *info; - bool no_subsys_cb = false; + pm_callback_t callback = NULL; + const char *info = NULL; int error = 0; TRACE_DEVICE(dev); @@ -1327,13 +1206,23 @@ static int __device_suspend_noirq(struct device *dev, pm_message_t state, bool a if (dev->power.syscore || dev->power.direct_complete) goto Complete; - callback = dpm_subsys_suspend_noirq_cb(dev, state, &info); + if (dev->pm_domain) { + info = "noirq power domain "; + callback = pm_noirq_op(&dev->pm_domain->ops, state); + } else if (dev->type && dev->type->pm) { + info = "noirq type "; + callback = pm_noirq_op(dev->type->pm, state); + } else if (dev->class && dev->class->pm) { + info = "noirq class "; + callback = pm_noirq_op(dev->class->pm, state); + } else if (dev->bus && dev->bus->pm) { + info = "noirq bus "; + callback = pm_noirq_op(dev->bus->pm, state); + } if (callback) goto Run; - no_subsys_cb = !dpm_subsys_suspend_late_cb(dev, state, NULL); - - if (dev_pm_smart_suspend_and_suspended(dev) && no_subsys_cb) + if (dev_pm_skip_suspend(dev)) goto Skip; if (dev->driver && dev->driver->pm) { @@ -1351,13 +1240,16 @@ Run: Skip: dev->power.is_noirq_suspended = true; - if (dev_pm_test_driver_flags(dev, DPM_FLAG_LEAVE_SUSPENDED)) { - dev->power.must_resume = dev->power.must_resume || - atomic_read(&dev->power.usage_count) > 1 || - device_must_resume(dev, state, no_subsys_cb); - } else { + /* + * Skipping the resume of devices that were in use right before the + * system suspend (as indicated by their PM-runtime usage counters) + * would be suboptimal. Also resume them if doing that is not allowed + * to be skipped. + */ + if (atomic_read(&dev->power.usage_count) > 1 || + !(dev_pm_test_driver_flags(dev, DPM_FLAG_MAY_SKIP_RESUME) && + dev->power.may_skip_resume)) dev->power.must_resume = true; - } if (dev->power.must_resume) dpm_superior_set_must_resume(dev); @@ -1474,35 +1366,6 @@ static void dpm_propagate_wakeup_to_parent(struct device *dev) spin_unlock_irq(&parent->power.lock); } -static pm_callback_t dpm_subsys_suspend_late_cb(struct device *dev, - pm_message_t state, - const char **info_p) -{ - pm_callback_t callback; - const char *info; - - if (dev->pm_domain) { - info = "late power domain "; - callback = pm_late_early_op(&dev->pm_domain->ops, state); - } else if (dev->type && dev->type->pm) { - info = "late type "; - callback = pm_late_early_op(dev->type->pm, state); - } else if (dev->class && dev->class->pm) { - info = "late class "; - callback = pm_late_early_op(dev->class->pm, state); - } else if (dev->bus && dev->bus->pm) { - info = "late bus "; - callback = pm_late_early_op(dev->bus->pm, state); - } else { - return NULL; - } - - if (info_p) - *info_p = info; - - return callback; -} - /** * __device_suspend_late - Execute a "late suspend" callback for given device. * @dev: Device to handle. @@ -1513,8 +1376,8 @@ static pm_callback_t dpm_subsys_suspend_late_cb(struct device *dev, */ static int __device_suspend_late(struct device *dev, pm_message_t state, bool async) { - pm_callback_t callback; - const char *info; + pm_callback_t callback = NULL; + const char *info = NULL; int error = 0; TRACE_DEVICE(dev); @@ -1535,12 +1398,23 @@ static int __device_suspend_late(struct device *dev, pm_message_t state, bool as if (dev->power.syscore || dev->power.direct_complete) goto Complete; - callback = dpm_subsys_suspend_late_cb(dev, state, &info); + if (dev->pm_domain) { + info = "late power domain "; + callback = pm_late_early_op(&dev->pm_domain->ops, state); + } else if (dev->type && dev->type->pm) { + info = "late type "; + callback = pm_late_early_op(dev->type->pm, state); + } else if (dev->class && dev->class->pm) { + info = "late class "; + callback = pm_late_early_op(dev->class->pm, state); + } else if (dev->bus && dev->bus->pm) { + info = "late bus "; + callback = pm_late_early_op(dev->bus->pm, state); + } if (callback) goto Run; - if (dev_pm_smart_suspend_and_suspended(dev) && - !dpm_subsys_suspend_noirq_cb(dev, state, NULL)) + if (dev_pm_skip_suspend(dev)) goto Skip; if (dev->driver && dev->driver->pm) { @@ -1766,7 +1640,7 @@ static int __device_suspend(struct device *dev, pm_message_t state, bool async) dev->power.direct_complete = false; } - dev->power.may_skip_resume = false; + dev->power.may_skip_resume = true; dev->power.must_resume = false; dpm_watchdog_set(&wd, dev); @@ -1970,7 +1844,7 @@ unlock: spin_lock_irq(&dev->power.lock); dev->power.direct_complete = state.event == PM_EVENT_SUSPEND && (ret > 0 || dev->power.no_pm_callbacks) && - !dev_pm_test_driver_flags(dev, DPM_FLAG_NEVER_SKIP); + !dev_pm_test_driver_flags(dev, DPM_FLAG_NO_DIRECT_COMPLETE); spin_unlock_irq(&dev->power.lock); return 0; } @@ -2128,7 +2002,7 @@ void device_pm_check_callbacks(struct device *dev) spin_unlock_irq(&dev->power.lock); } -bool dev_pm_smart_suspend_and_suspended(struct device *dev) +bool dev_pm_skip_suspend(struct device *dev) { return dev_pm_test_driver_flags(dev, DPM_FLAG_SMART_SUSPEND) && pm_runtime_status_suspended(dev); diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c index 99c7da112c95..9f62790f644c 100644 --- a/drivers/base/power/runtime.c +++ b/drivers/base/power/runtime.c @@ -523,13 +523,11 @@ static int rpm_suspend(struct device *dev, int rpmflags) repeat: retval = rpm_check_suspend_allowed(dev); - if (retval < 0) - ; /* Conditions are wrong. */ + goto out; /* Conditions are wrong. */ /* Synchronous suspends are not allowed in the RPM_RESUMING state. */ - else if (dev->power.runtime_status == RPM_RESUMING && - !(rpmflags & RPM_ASYNC)) + if (dev->power.runtime_status == RPM_RESUMING && !(rpmflags & RPM_ASYNC)) retval = -EAGAIN; if (retval) goto out; diff --git a/drivers/base/power/sysfs.c b/drivers/base/power/sysfs.c index 2b99fe1eb207..24d25cf8ab14 100644 --- a/drivers/base/power/sysfs.c +++ b/drivers/base/power/sysfs.c @@ -666,7 +666,7 @@ int dpm_sysfs_add(struct device *dev) if (rc) return rc; - if (pm_runtime_callbacks_present(dev)) { + if (!pm_runtime_has_no_callbacks(dev)) { rc = sysfs_merge_group(&dev->kobj, &pm_runtime_attr_group); if (rc) goto err_out; @@ -709,7 +709,7 @@ int dpm_sysfs_change_owner(struct device *dev, kuid_t kuid, kgid_t kgid) if (rc) return rc; - if (pm_runtime_callbacks_present(dev)) { + if (!pm_runtime_has_no_callbacks(dev)) { rc = sysfs_group_change_owner( &dev->kobj, &pm_runtime_attr_group, kuid, kgid); if (rc) diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c index 2dfb30b963c4..407f6919604c 100644 --- a/drivers/clk/clk.c +++ b/drivers/clk/clk.c @@ -114,7 +114,11 @@ static int clk_pm_runtime_get(struct clk_core *core) return 0; ret = pm_runtime_get_sync(core->dev); - return ret < 0 ? ret : 0; + if (ret < 0) { + pm_runtime_put_noidle(core->dev); + return ret; + } + return 0; } static void clk_pm_runtime_put(struct clk_core *core) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c index fd1dc3236eca..a9086ea1ab60 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c @@ -191,7 +191,7 @@ int amdgpu_driver_load_kms(struct drm_device *dev, unsigned long flags) } if (adev->runpm) { - dev_pm_set_driver_flags(dev->dev, DPM_FLAG_NEVER_SKIP); + dev_pm_set_driver_flags(dev->dev, DPM_FLAG_NO_DIRECT_COMPLETE); pm_runtime_use_autosuspend(dev->dev); pm_runtime_set_autosuspend_delay(dev->dev, 5000); pm_runtime_set_active(dev->dev); diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c b/drivers/gpu/drm/i915/intel_runtime_pm.c index ad719c9602af..9cb2d7548daa 100644 --- a/drivers/gpu/drm/i915/intel_runtime_pm.c +++ b/drivers/gpu/drm/i915/intel_runtime_pm.c @@ -549,7 +549,7 @@ void intel_runtime_pm_enable(struct intel_runtime_pm *rpm) * becaue the HDA driver may require us to enable the audio power * domain during system suspend. */ - dev_pm_set_driver_flags(kdev, DPM_FLAG_NEVER_SKIP); + dev_pm_set_driver_flags(kdev, DPM_FLAG_NO_DIRECT_COMPLETE); pm_runtime_set_autosuspend_delay(kdev, 10000); /* 10s */ pm_runtime_mark_last_busy(kdev); diff --git a/drivers/gpu/drm/radeon/radeon_kms.c b/drivers/gpu/drm/radeon/radeon_kms.c index 58176db85952..372962358a18 100644 --- a/drivers/gpu/drm/radeon/radeon_kms.c +++ b/drivers/gpu/drm/radeon/radeon_kms.c @@ -158,7 +158,7 @@ int radeon_driver_load_kms(struct drm_device *dev, unsigned long flags) } if (radeon_is_px(dev)) { - dev_pm_set_driver_flags(dev->dev, DPM_FLAG_NEVER_SKIP); + dev_pm_set_driver_flags(dev->dev, DPM_FLAG_NO_DIRECT_COMPLETE); pm_runtime_use_autosuspend(dev->dev); pm_runtime_set_autosuspend_delay(dev->dev, 5000); pm_runtime_set_active(dev->dev); diff --git a/drivers/i2c/busses/i2c-designware-platdrv.c b/drivers/i2c/busses/i2c-designware-platdrv.c index 5536673060cc..c429d664f655 100644 --- a/drivers/i2c/busses/i2c-designware-platdrv.c +++ b/drivers/i2c/busses/i2c-designware-platdrv.c @@ -357,12 +357,12 @@ static int dw_i2c_plat_probe(struct platform_device *pdev) if (dev->flags & ACCESS_NO_IRQ_SUSPEND) { dev_pm_set_driver_flags(&pdev->dev, DPM_FLAG_SMART_PREPARE | - DPM_FLAG_LEAVE_SUSPENDED); + DPM_FLAG_MAY_SKIP_RESUME); } else { dev_pm_set_driver_flags(&pdev->dev, DPM_FLAG_SMART_PREPARE | DPM_FLAG_SMART_SUSPEND | - DPM_FLAG_LEAVE_SUSPENDED); + DPM_FLAG_MAY_SKIP_RESUME); } /* The code below assumes runtime PM to be disabled. */ diff --git a/drivers/misc/mei/pci-me.c b/drivers/misc/mei/pci-me.c index a1ed375fed37..71f795b510ce 100644 --- a/drivers/misc/mei/pci-me.c +++ b/drivers/misc/mei/pci-me.c @@ -241,7 +241,7 @@ static int mei_me_probe(struct pci_dev *pdev, const struct pci_device_id *ent) * MEI requires to resume from runtime suspend mode * in order to perform link reset flow upon system suspend. */ - dev_pm_set_driver_flags(&pdev->dev, DPM_FLAG_NEVER_SKIP); + dev_pm_set_driver_flags(&pdev->dev, DPM_FLAG_NO_DIRECT_COMPLETE); /* * ME maps runtime suspend/resume to D0i states, diff --git a/drivers/misc/mei/pci-txe.c b/drivers/misc/mei/pci-txe.c index beacf2a2f2b5..4bf26ce61044 100644 --- a/drivers/misc/mei/pci-txe.c +++ b/drivers/misc/mei/pci-txe.c @@ -128,7 +128,7 @@ static int mei_txe_probe(struct pci_dev *pdev, const struct pci_device_id *ent) * MEI requires to resume from runtime suspend mode * in order to perform link reset flow upon system suspend. */ - dev_pm_set_driver_flags(&pdev->dev, DPM_FLAG_NEVER_SKIP); + dev_pm_set_driver_flags(&pdev->dev, DPM_FLAG_NO_DIRECT_COMPLETE); /* * TXE maps runtime suspend/resume to own power gating states, diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c index 177c6da80c57..2730b1c7dddb 100644 --- a/drivers/net/ethernet/intel/e1000e/netdev.c +++ b/drivers/net/ethernet/intel/e1000e/netdev.c @@ -7549,7 +7549,7 @@ static int e1000_probe(struct pci_dev *pdev, const struct pci_device_id *ent) e1000_print_device_info(adapter); - dev_pm_set_driver_flags(&pdev->dev, DPM_FLAG_NEVER_SKIP); + dev_pm_set_driver_flags(&pdev->dev, DPM_FLAG_NO_DIRECT_COMPLETE); if (pci_dev_run_wake(pdev) && hw->mac.type < e1000_pch_cnp) pm_runtime_put_noidle(&pdev->dev); diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c index b46bff8fe056..8bb3db2cbd41 100644 --- a/drivers/net/ethernet/intel/igb/igb_main.c +++ b/drivers/net/ethernet/intel/igb/igb_main.c @@ -3445,7 +3445,7 @@ static int igb_probe(struct pci_dev *pdev, const struct pci_device_id *ent) } } - dev_pm_set_driver_flags(&pdev->dev, DPM_FLAG_NEVER_SKIP); + dev_pm_set_driver_flags(&pdev->dev, DPM_FLAG_NO_DIRECT_COMPLETE); pm_runtime_put_noidle(&pdev->dev); return 0; diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c index 69fa1ce1f927..59fc0097438f 100644 --- a/drivers/net/ethernet/intel/igc/igc_main.c +++ b/drivers/net/ethernet/intel/igc/igc_main.c @@ -4825,7 +4825,7 @@ static int igc_probe(struct pci_dev *pdev, pcie_print_link_status(pdev); netdev_info(netdev, "MAC: %pM\n", netdev->dev_addr); - dev_pm_set_driver_flags(&pdev->dev, DPM_FLAG_NEVER_SKIP); + dev_pm_set_driver_flags(&pdev->dev, DPM_FLAG_NO_DIRECT_COMPLETE); pm_runtime_put_noidle(&pdev->dev); diff --git a/drivers/pci/hotplug/pciehp_core.c b/drivers/pci/hotplug/pciehp_core.c index 312cc45c44c7..bf779f291f15 100644 --- a/drivers/pci/hotplug/pciehp_core.c +++ b/drivers/pci/hotplug/pciehp_core.c @@ -275,7 +275,7 @@ static int pciehp_suspend(struct pcie_device *dev) * If the port is already runtime suspended we can keep it that * way. */ - if (dev_pm_smart_suspend_and_suspended(&dev->port->dev)) + if (dev_pm_skip_suspend(&dev->port->dev)) return 0; pciehp_disable_interrupt(dev); diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c index 0454ca0e4e3f..da6510af1221 100644 --- a/drivers/pci/pci-driver.c +++ b/drivers/pci/pci-driver.c @@ -776,7 +776,7 @@ static int pci_pm_suspend(struct device *dev) static int pci_pm_suspend_late(struct device *dev) { - if (dev_pm_smart_suspend_and_suspended(dev)) + if (dev_pm_skip_suspend(dev)) return 0; pci_fixup_device(pci_fixup_suspend, to_pci_dev(dev)); @@ -789,10 +789,8 @@ static int pci_pm_suspend_noirq(struct device *dev) struct pci_dev *pci_dev = to_pci_dev(dev); const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL; - if (dev_pm_smart_suspend_and_suspended(dev)) { - dev->power.may_skip_resume = true; + if (dev_pm_skip_suspend(dev)) return 0; - } if (pci_has_legacy_pm_support(pci_dev)) return pci_legacy_suspend_late(dev, PMSG_SUSPEND); @@ -880,8 +878,8 @@ Fixup: * pci_pm_complete() to take care of fixing up the device's state * anyway, if need be. */ - dev->power.may_skip_resume = device_may_wakeup(dev) || - !device_can_wakeup(dev); + if (device_can_wakeup(dev) && !device_may_wakeup(dev)) + dev->power.may_skip_resume = false; return 0; } @@ -893,17 +891,9 @@ static int pci_pm_resume_noirq(struct device *dev) pci_power_t prev_state = pci_dev->current_state; bool skip_bus_pm = pci_dev->skip_bus_pm; - if (dev_pm_may_skip_resume(dev)) + if (dev_pm_skip_resume(dev)) return 0; - /* - * Devices with DPM_FLAG_SMART_SUSPEND may be left in runtime suspend - * during system suspend, so update their runtime PM status to "active" - * as they are going to be put into D0 shortly. - */ - if (dev_pm_smart_suspend_and_suspended(dev)) - pm_runtime_set_active(dev); - /* * In the suspend-to-idle case, devices left in D0 during suspend will * stay in D0, so it is not necessary to restore or update their @@ -928,6 +918,14 @@ static int pci_pm_resume_noirq(struct device *dev) return 0; } +static int pci_pm_resume_early(struct device *dev) +{ + if (dev_pm_skip_resume(dev)) + return 0; + + return pm_generic_resume_early(dev); +} + static int pci_pm_resume(struct device *dev) { struct pci_dev *pci_dev = to_pci_dev(dev); @@ -961,6 +959,7 @@ static int pci_pm_resume(struct device *dev) #define pci_pm_suspend_late NULL #define pci_pm_suspend_noirq NULL #define pci_pm_resume NULL +#define pci_pm_resume_early NULL #define pci_pm_resume_noirq NULL #endif /* !CONFIG_SUSPEND */ @@ -1127,7 +1126,7 @@ static int pci_pm_poweroff(struct device *dev) static int pci_pm_poweroff_late(struct device *dev) { - if (dev_pm_smart_suspend_and_suspended(dev)) + if (dev_pm_skip_suspend(dev)) return 0; pci_fixup_device(pci_fixup_suspend, to_pci_dev(dev)); @@ -1140,7 +1139,7 @@ static int pci_pm_poweroff_noirq(struct device *dev) struct pci_dev *pci_dev = to_pci_dev(dev); const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL; - if (dev_pm_smart_suspend_and_suspended(dev)) + if (dev_pm_skip_suspend(dev)) return 0; if (pci_has_legacy_pm_support(pci_dev)) @@ -1358,6 +1357,7 @@ static const struct dev_pm_ops pci_dev_pm_ops = { .suspend = pci_pm_suspend, .suspend_late = pci_pm_suspend_late, .resume = pci_pm_resume, + .resume_early = pci_pm_resume_early, .freeze = pci_pm_freeze, .thaw = pci_pm_thaw, .poweroff = pci_pm_poweroff, diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c index 160d67c59310..3acf151ae015 100644 --- a/drivers/pci/pcie/portdrv_pci.c +++ b/drivers/pci/pcie/portdrv_pci.c @@ -115,7 +115,7 @@ static int pcie_portdrv_probe(struct pci_dev *dev, pci_save_state(dev); - dev_pm_set_driver_flags(&dev->dev, DPM_FLAG_NEVER_SKIP | + dev_pm_set_driver_flags(&dev->dev, DPM_FLAG_NO_DIRECT_COMPLETE | DPM_FLAG_SMART_SUSPEND); if (pci_bridge_d3_possible(dev)) { diff --git a/fs/block_dev.c b/fs/block_dev.c index 93672c3f1c78..608a7b9173f4 100644 --- a/fs/block_dev.c +++ b/fs/block_dev.c @@ -2023,8 +2023,7 @@ ssize_t blkdev_write_iter(struct kiocb *iocb, struct iov_iter *from) if (bdev_read_only(I_BDEV(bd_inode))) return -EPERM; - /* uswsusp needs write permission to the swap */ - if (IS_SWAPFILE(bd_inode) && !hibernation_available()) + if (IS_SWAPFILE(bd_inode) && !is_hibernate_resume_dev(bd_inode)) return -ETXTBSY; if (!iov_iter_count(from)) diff --git a/include/linux/pm.h b/include/linux/pm.h index e057d1fa2469..121c104a4090 100644 --- a/include/linux/pm.h +++ b/include/linux/pm.h @@ -544,31 +544,17 @@ struct pm_subsys_data { * These flags can be set by device drivers at the probe time. They need not be * cleared by the drivers as the driver core will take care of that. * - * NEVER_SKIP: Do not skip all system suspend/resume callbacks for the device. - * SMART_PREPARE: Check the return value of the driver's ->prepare callback. - * SMART_SUSPEND: No need to resume the device from runtime suspend. - * LEAVE_SUSPENDED: Avoid resuming the device during system resume if possible. + * NO_DIRECT_COMPLETE: Do not apply direct-complete optimization to the device. + * SMART_PREPARE: Take the driver ->prepare callback return value into account. + * SMART_SUSPEND: Avoid resuming the device from runtime suspend. + * MAY_SKIP_RESUME: Allow driver "noirq" and "early" callbacks to be skipped. * - * Setting SMART_PREPARE instructs bus types and PM domains which may want - * system suspend/resume callbacks to be skipped for the device to return 0 from - * their ->prepare callbacks if the driver's ->prepare callback returns 0 (in - * other words, the system suspend/resume callbacks can only be skipped for the - * device if its driver doesn't object against that). This flag has no effect - * if NEVER_SKIP is set. - * - * Setting SMART_SUSPEND instructs bus types and PM domains which may want to - * runtime resume the device upfront during system suspend that doing so is not - * necessary from the driver's perspective. It also may cause them to skip - * invocations of the ->suspend_late and ->suspend_noirq callbacks provided by - * the driver if they decide to leave the device in runtime suspend. - * - * Setting LEAVE_SUSPENDED informs the PM core and middle-layer code that the - * driver prefers the device to be left in suspend after system resume. + * See Documentation/driver-api/pm/devices.rst for details. */ -#define DPM_FLAG_NEVER_SKIP BIT(0) +#define DPM_FLAG_NO_DIRECT_COMPLETE BIT(0) #define DPM_FLAG_SMART_PREPARE BIT(1) #define DPM_FLAG_SMART_SUSPEND BIT(2) -#define DPM_FLAG_LEAVE_SUSPENDED BIT(3) +#define DPM_FLAG_MAY_SKIP_RESUME BIT(3) struct dev_pm_info { pm_message_t power_state; @@ -758,8 +744,8 @@ extern int pm_generic_poweroff_late(struct device *dev); extern int pm_generic_poweroff(struct device *dev); extern void pm_generic_complete(struct device *dev); -extern bool dev_pm_may_skip_resume(struct device *dev); -extern bool dev_pm_smart_suspend_and_suspended(struct device *dev); +extern bool dev_pm_skip_resume(struct device *dev); +extern bool dev_pm_skip_suspend(struct device *dev); #else /* !CONFIG_PM_SLEEP */ diff --git a/include/linux/pm_runtime.h b/include/linux/pm_runtime.h index 3bdcbce8141a..3dbc207bff53 100644 --- a/include/linux/pm_runtime.h +++ b/include/linux/pm_runtime.h @@ -102,9 +102,9 @@ static inline bool pm_runtime_enabled(struct device *dev) return !dev->power.disable_depth; } -static inline bool pm_runtime_callbacks_present(struct device *dev) +static inline bool pm_runtime_has_no_callbacks(struct device *dev) { - return !dev->power.no_callbacks; + return dev->power.no_callbacks; } static inline void pm_runtime_mark_last_busy(struct device *dev) diff --git a/include/linux/suspend.h b/include/linux/suspend.h index 4fcc6fd0cbd6..b960098acfb0 100644 --- a/include/linux/suspend.h +++ b/include/linux/suspend.h @@ -466,6 +466,12 @@ static inline bool system_entering_hibernation(void) { return false; } static inline bool hibernation_available(void) { return false; } #endif /* CONFIG_HIBERNATION */ +#ifdef CONFIG_HIBERNATION_SNAPSHOT_DEV +int is_hibernate_resume_dev(const struct inode *); +#else +static inline int is_hibernate_resume_dev(const struct inode *i) { return 0; } +#endif + /* Hibernation and suspend events */ #define PM_HIBERNATION_PREPARE 0x0001 /* Going to hibernate */ #define PM_POST_HIBERNATION 0x0002 /* Hibernation finished */ diff --git a/kernel/power/Kconfig b/kernel/power/Kconfig index c208566c844b..4d0e6e815a2b 100644 --- a/kernel/power/Kconfig +++ b/kernel/power/Kconfig @@ -80,6 +80,18 @@ config HIBERNATION For more information take a look at . +config HIBERNATION_SNAPSHOT_DEV + bool "Userspace snapshot device" + depends on HIBERNATION + default y + ---help--- + Device used by the uswsusp tools. + + Say N if no snapshotting from userspace is needed, this also + reduces the attack surface of the kernel. + + If in doubt, say Y. + config PM_STD_PARTITION string "Default resume partition" depends on HIBERNATION diff --git a/kernel/power/Makefile b/kernel/power/Makefile index e7e47d9be1e5..5899260a8bef 100644 --- a/kernel/power/Makefile +++ b/kernel/power/Makefile @@ -10,7 +10,8 @@ obj-$(CONFIG_VT_CONSOLE_SLEEP) += console.o obj-$(CONFIG_FREEZER) += process.o obj-$(CONFIG_SUSPEND) += suspend.o obj-$(CONFIG_PM_TEST_SUSPEND) += suspend_test.o -obj-$(CONFIG_HIBERNATION) += hibernate.o snapshot.o swap.o user.o +obj-$(CONFIG_HIBERNATION) += hibernate.o snapshot.o swap.o +obj-$(CONFIG_HIBERNATION_SNAPSHOT_DEV) += user.o obj-$(CONFIG_PM_AUTOSLEEP) += autosleep.o obj-$(CONFIG_PM_WAKELOCKS) += wakelock.o diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c index 30bd28d1d418..02ec716a4927 100644 --- a/kernel/power/hibernate.c +++ b/kernel/power/hibernate.c @@ -67,6 +67,18 @@ bool freezer_test_done; static const struct platform_hibernation_ops *hibernation_ops; +static atomic_t hibernate_atomic = ATOMIC_INIT(1); + +bool hibernate_acquire(void) +{ + return atomic_add_unless(&hibernate_atomic, -1, 0); +} + +void hibernate_release(void) +{ + atomic_inc(&hibernate_atomic); +} + bool hibernation_available(void) { return nohibernate == 0 && !security_locked_down(LOCKDOWN_HIBERNATION); @@ -704,7 +716,7 @@ int hibernate(void) lock_system_sleep(); /* The snapshot device should not be opened while we're running */ - if (!atomic_add_unless(&snapshot_device_available, -1, 0)) { + if (!hibernate_acquire()) { error = -EBUSY; goto Unlock; } @@ -775,7 +787,7 @@ int hibernate(void) Exit: __pm_notifier_call_chain(PM_POST_HIBERNATION, nr_calls, NULL); pm_restore_console(); - atomic_inc(&snapshot_device_available); + hibernate_release(); Unlock: unlock_system_sleep(); pr_info("hibernation exit\n"); @@ -880,7 +892,7 @@ static int software_resume(void) goto Unlock; /* The snapshot device should not be opened while we're running */ - if (!atomic_add_unless(&snapshot_device_available, -1, 0)) { + if (!hibernate_acquire()) { error = -EBUSY; swsusp_close(FMODE_READ); goto Unlock; @@ -911,7 +923,7 @@ static int software_resume(void) __pm_notifier_call_chain(PM_POST_RESTORE, nr_calls, NULL); pm_restore_console(); pr_info("resume failed (%d)\n", error); - atomic_inc(&snapshot_device_available); + hibernate_release(); /* For success case, the suspend path will release the lock */ Unlock: mutex_unlock(&system_transition_mutex); diff --git a/kernel/power/power.h b/kernel/power/power.h index 7cdc64dc2373..ba2094db6294 100644 --- a/kernel/power/power.h +++ b/kernel/power/power.h @@ -154,8 +154,8 @@ extern int snapshot_write_next(struct snapshot_handle *handle); extern void snapshot_write_finalize(struct snapshot_handle *handle); extern int snapshot_image_loaded(struct snapshot_handle *handle); -/* If unset, the snapshot device cannot be open. */ -extern atomic_t snapshot_device_available; +extern bool hibernate_acquire(void); +extern void hibernate_release(void); extern sector_t alloc_swapdev_block(int swap); extern void free_all_swap_pages(int swap); diff --git a/kernel/power/user.c b/kernel/power/user.c index 7959449765d9..d5eedc2baa2a 100644 --- a/kernel/power/user.c +++ b/kernel/power/user.c @@ -35,9 +35,13 @@ static struct snapshot_data { bool ready; bool platform_support; bool free_bitmaps; + struct inode *bd_inode; } snapshot_state; -atomic_t snapshot_device_available = ATOMIC_INIT(1); +int is_hibernate_resume_dev(const struct inode *bd_inode) +{ + return hibernation_available() && snapshot_state.bd_inode == bd_inode; +} static int snapshot_open(struct inode *inode, struct file *filp) { @@ -49,13 +53,13 @@ static int snapshot_open(struct inode *inode, struct file *filp) lock_system_sleep(); - if (!atomic_add_unless(&snapshot_device_available, -1, 0)) { + if (!hibernate_acquire()) { error = -EBUSY; goto Unlock; } if ((filp->f_flags & O_ACCMODE) == O_RDWR) { - atomic_inc(&snapshot_device_available); + hibernate_release(); error = -ENOSYS; goto Unlock; } @@ -92,11 +96,12 @@ static int snapshot_open(struct inode *inode, struct file *filp) __pm_notifier_call_chain(PM_POST_RESTORE, nr_calls, NULL); } if (error) - atomic_inc(&snapshot_device_available); + hibernate_release(); data->frozen = false; data->ready = false; data->platform_support = false; + data->bd_inode = NULL; Unlock: unlock_system_sleep(); @@ -112,6 +117,7 @@ static int snapshot_release(struct inode *inode, struct file *filp) swsusp_free(); data = filp->private_data; + data->bd_inode = NULL; free_all_swap_pages(data->swap); if (data->frozen) { pm_restore_gfp_mask(); @@ -122,7 +128,7 @@ static int snapshot_release(struct inode *inode, struct file *filp) } pm_notifier_call_chain(data->mode == O_RDONLY ? PM_POST_HIBERNATION : PM_POST_RESTORE); - atomic_inc(&snapshot_device_available); + hibernate_release(); unlock_system_sleep(); @@ -204,6 +210,7 @@ struct compat_resume_swap_area { static int snapshot_set_swap_area(struct snapshot_data *data, void __user *argp) { + struct block_device *bdev; sector_t offset; dev_t swdev; @@ -234,9 +241,12 @@ static int snapshot_set_swap_area(struct snapshot_data *data, data->swap = -1; return -EINVAL; } - data->swap = swap_type_of(swdev, offset, NULL); + data->swap = swap_type_of(swdev, offset, &bdev); if (data->swap < 0) return -ENODEV; + + data->bd_inode = bdev->bd_inode; + bdput(bdev); return 0; }