This wraps single ioctl() commands into block requests using
the custom block layer request types REQ_OP_DRV_IN and
REQ_OP_DRV_OUT.
By doing this we are loosening the grip on the big host lock,
since two calls to mmc_get_card()/mmc_put_card() are removed.
We are storing the ioctl() in/out argument as a pointer in
the per-request struct mmc_blk_request container. Since we
now let the block layer allocate this data, blk_get_request()
will allocate it for us and we can immediately dereference
it and use it to pass the argument into the block layer.
We refactor the if/else/if/else ladder in mmc_blk_issue_rq()
as part of the job, keeping some extra attention to the
case when a NULL req is passed into this function and
making that pipeline flush more explicit.
Tested on the ux500 with the userspace:
mmc extcsd read /dev/mmcblk3
resulting in a successful EXTCSD info dump back to the
console.
This commit fixes a starvation issue in the MMC/SD stack
that can be easily provoked in the following way by
issueing the following commands in sequence:
> dd if=/dev/mmcblk3 of=/dev/null bs=1M &
> mmc extcs read /dev/mmcblk3
Before this patch, the extcsd read command would hang
(starve) while waiting for the dd command to finish since
the block layer was holding the card/host lock.
After this patch, the extcsd ioctl() command is nicely
interpersed with the rest of the block commands and we
can issue a bunch of ioctl()s from userspace while there
is some busy block IO going on without any problems.
Conversely userspace ioctl()s can no longer starve
the block layer by holding the card/host lock.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Tested-by: Avri Altman <Avri.Altman@sandisk.com>
The variable is_rpmb is clearly a bool and even assigned true
and false, yet declared as an int.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
The mmc_queue_req is a per-request state container the MMC core uses
to carry bounce buffers, pointers to asynchronous requests and so on.
Currently allocated as a static array of objects, then as a request
comes in, a mmc_queue_req is assigned to it, and used during the
lifetime of the request.
This is backwards compared to how other block layer drivers work:
they usally let the block core provide a per-request struct that get
allocated right beind the struct request, and which can be obtained
using the blk_mq_rq_to_pdu() helper. (The _mq_ infix in this function
name is misleading: it is used by both the old and the MQ block
layer.)
The per-request struct gets allocated to the size stored in the queue
variable .cmd_size initialized using the .init_rq_fn() and
cleaned up using .exit_rq_fn().
The block layer code makes the MMC core rely on this mechanism to
allocate the per-request mmc_queue_req state container.
Doing this make a lot of complicated queue handling go away. We only
need to keep the .qnct that keeps count of how many request are
currently being processed by the MMC layer. The MQ block layer will
replace also this once we transition to it.
Doing this refactoring is necessary to move the ioctl() operations
into custom block layer requests tagged with REQ_OP_DRV_[IN|OUT]
instead of the custom code using the BigMMCHostLock that we have
today: those require that per-request data be obtainable easily from
a request after creating a custom request with e.g.:
struct request *rq = blk_get_request(q, REQ_OP_DRV_IN, __GFP_RECLAIM);
struct mmc_queue_req *mq_rq = req_to_mq_rq(rq);
And this is not possible with the current construction, as the request
is not immediately assigned the per-request state container, but
instead it gets assigned when the request finally enters the MMC
queue, which is way too late for custom requests.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
[Ulf: Folded in the fix to drop a call to blk_cleanup_queue()]
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Tested-by: Heiner Kallweit <hkallweit1@gmail.com>
This option is activated by all multiplatform configs and what
not so we almost always have it turned on, and the memory it
saves is negligible, even more so moving forward. The actual
bounce buffer only gets allocated only when used, the only
thing the ifdefs are saving is a little bit of code.
It is highly improper to have this as a Kconfig option that
get turned on by Kconfig, make this a pure runtime-thing and
let the host decide whether we use bounce buffers. We add a
new property "disable_bounce" to the host struct.
Notice that mmc_queue_calc_bouncesz() already disables the
bounce buffers if host->max_segs != 1, so any arch that has a
maximum number of segments higher than 1 will have bounce
buffers disabled.
The option CONFIG_MMC_BLOCK_BOUNCE is default y so the
majority of platforms in the kernel already have it on, and
it then gets turned off at runtime since most of these have
a host->max_segs > 1. The few exceptions that have
host->max_segs == 1 and still turn off the bounce buffering
are those that disable it in their defconfig.
Those are the following:
arch/arm/configs/colibri_pxa300_defconfig
arch/arm/configs/zeus_defconfig
- Uses MMC_PXA, drivers/mmc/host/pxamci.c
- Sets host->max_segs = NR_SG, which is 1
- This needs its bounce buffer deactivated so we set
host->disable_bounce to true in the host driver
arch/arm/configs/davinci_all_defconfig
- Uses MMC_DAVINCI, drivers/mmc/host/davinci_mmc.c
- This driver sets host->max_segs to MAX_NR_SG, which is 16
- That means this driver anyways disabled bounce buffers
- No special action needed for this platform
arch/arm/configs/lpc32xx_defconfig
arch/arm/configs/nhk8815_defconfig
arch/arm/configs/u300_defconfig
- Uses MMC_ARMMMCI, drivers/mmc/host/mmci.[c|h]
- This driver by default sets host->max_segs to NR_SG,
which is 128, unless a DMA engine is used, and in that case
the number of segments are also > 1
- That means this driver already disables bounce buffers
- No special action needed for these platforms
arch/arm/configs/sama5_defconfig
- Uses MMC_SDHCI, MMC_SDHCI_PLTFM, MMC_SDHCI_OF_AT91, MMC_ATMELMCI
- Uses drivers/mmc/host/sdhci.c
- Normally sets host->max_segs to SDHCI_MAX_SEGS which is 128 and
thus disables bounce buffers
- Sets host->max_segs to 1 if SDHCI_USE_SDMA is set
- SDHCI_USE_SDMA is only set by SDHCI on PCI adapers
- That means that for this platform bounce buffers are already
disabled at runtime
- No special action needed for this platform
arch/blackfin/configs/CM-BF533_defconfig
arch/blackfin/configs/CM-BF537E_defconfig
- Uses MMC_SPI (a simple MMC card connected on SPI pins)
- Uses drivers/mmc/host/mmc_spi.c
- Sets host->max_segs to MMC_SPI_BLOCKSATONCE which is 128
- That means this platform already disables bounce buffers at
runtime
- No special action needed for these platforms
arch/mips/configs/cavium_octeon_defconfig
- Uses MMC_CAVIUM_OCTEON, drivers/mmc/host/cavium.c
- Sets host->max_segs to 16 or 1
- Setting host->disable_bounce to be sure for the 1 case
arch/mips/configs/qi_lb60_defconfig
- Uses MMC_JZ4740, drivers/mmc/host/jz4740_mmc.c
- This sets host->max_segs to 128 so bounce buffers are
already runtime disabled
- No action needed for this platform
It would be interesting to come up with a list of the platforms
that actually end up using bounce buffers. I have not been
able to infer such a list, but it occurs when
host->max_segs == 1 and the bounce buffering is not explicitly
disabled.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Make renesas_sdhi_sys_dmac.c a top-level module file that makes use of
library code supplied by renesas_sdhi_core.c
This is in order to facilitate adding other variants of SDHI;
in particular SDHI using different DMA controllers.
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
[Arnd: Fixed module build error]
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Rename the source file SDHI. A follow-up patch will make it a library
file used by a different top-level module file.
The name "renesas" is chosen as the SDHI driver is applicable to a wider
range of SoCs than SH-Mobile it seems to be a more appropriate name.
However, the SDHI driver source itself, is left as sh_mobile_sdhi to
avoid unnecessary churn.
the name "core" was chosen to reflect the desired role of this file,
to provide core functionality to the sdhi driver. A follow-up patch will
move the file into that role.
Internal symbols have also been renamed to reflect the filename change.
The .name member of struct platform_driver and parameter to
MODULE_ALIAS() have not been changed in order to avoid the complication
of potentially breaking SH SoCs which still use platform drivers.
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Rename the source file for DMA for SDHI as a follow-up to attaching
DMA code to the SDHI driver rather than the tmio_core driver.
The name "renesas" is chosen as the SDHI driver is applicable to a wider
range of SoCs than SH-Mobile it seems to be a more appropriate name.
However, the SDHI driver source itself, is left as sh_mobile_sdhi to
avoid unnecessary churn.
The name sys_dmac was chosen to reflect the type of DMA used.
Internal symbols have also been renamed to reflect the filename change.
A follow-up patch will re-organise the SDHI driver removing
the need for renesas_sdhi_get_dma_ops().
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Rename tmio_mmc_pio.c to tmio_mmc_core.c to more accurately reflect its
function: to provide core code for the tmio-mmc and sh-mobole-sdhi drivers.
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Refactor DMA support to allow it to be provided by a set of call-backs
that are provided by a host driver. The motivation is to allow multiple
DMA implementations to be provided and instantiated at run-time.
Instantiate the existing DMA implementation from the sh_mobile_sdhi driver
which appears to match the current use-case. This has the side effect
of moving the DMA code from the tmio_core to the sh_mobile_sdhi driver.
A follow-up patch will change the source file for the SDHI DMA
implementation accordingly. Another follow-up patch will re-organise the
SDHI driver removing the need for tmio_mmc_get_dma_ops().
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Reshuffle the comment at the top of the source
dropping filenames and moving up human readable strings.
This seems to be somewhat more useful information to start the
source file with. It is also less fragile, f.e. to file renames.
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
This reverts commit a6db2c8603 ("mmc: dw_mmc: Don't allow Runtime PM for
SDIO cards")'
As dw_mmc now is capable of preventing runtime PM suspend while SDIO IRQs
are enabled, let's drop the less fine-grained method, which is preventing
runtime PM suspend for all SDIO cards - no matter of whether SDIO IRQs are
being enabled or not.
In this way we don't keep the host runtime PM resumed, unless it's really
needed, thus avoiding to waste power.
Especially when SDIO IRQs is supported via a separate out-of-band IRQ line,
which isn't defined by the SDIO standard, typically the SDIO func driver
doesn't enable SDIO IRQs via sdio_claim_irq(). So, for these cases we can
now allow the dwmmc device to be runtime PM suspended in-between requests.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Tested-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
To be able to handle SDIO IRQs the dw_mmc device needs to be powered and
providing clock to the SDIO card. Therefore, we must not allow the device
to be runtime PM suspended while SDIO IRQs are enabled.
To fix this, let's increase the runtime PM usage count while the mmc core
enables SDIO IRQs. Later when the mmc core tells dw_mmc to disable SDIO
IRQs, we drop the usage count to again allow runtime PM suspend.
This now becomes the default behaviour for dw_mmc. In cases where SDIO IRQs
can be re-routed as GPIO wake-ups during runtime PM suspend, one could
potentially allow runtime PM suspend. However, that will have to be
addressed as a separate change on top of this one.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Tested-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Convert to use the more lightweight method for processing SDIO IRQs, which
involves the following changes:
- Enable MMC_CAP2_SDIO_IRQ_NOTHREAD when SDIO IRQ is supported and use
sdio_signal_irq() instead of mmc_signal_sdio_irq().
- Mask the SDIO IRQ before signaling a new one to be processed.
- Implement the ->ack_sdio_irq() callback to unmask the SDIO IRQ.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Tested-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
For hosts not supporting MMC_CAP2_SDIO_IRQ_NOTHREAD but MMC_CAP_SDIO_IRQ,
the SDIO IRQs are processed from a dedicated kernel thread. For these
cases, the host calls mmc_signal_sdio_irq() from its ISR to signal a new
SDIO IRQ.
Signaling an SDIO IRQ makes the host's ->enable_sdio_irq() callback to be
invoked to temporary disable the IRQs, before the kernel thread is woken up
to process it. When processing of the IRQs are completed, they are
re-enabled by the kernel thread, again via invoking the host's
->enable_sdio_irq().
The observation from this, is that the execution path is being unnecessary
complex, as the host driver already knows that it needs to temporary
disable the IRQs before signaling a new one. Moreover, replacing the kernel
thread with a work/workqueue would not only greatly simplify the code, but
also make it more robust.
To address the above problems, let's continue to build upon the support for
MMC_CAP2_SDIO_IRQ_NOTHREAD, as it already implements SDIO IRQs to be
processed without using the clumsy kernel thread and without the ping-pong
calls of the host's ->enable_sdio_irq() callback for each processed IRQ.
Therefore, let's add new API sdio_signal_irq(), which enables hosts to
signal/process SDIO IRQs by using a work/workqueue, rather than using the
kernel thread.
Add also a new host callback ->ack_sdio_irq(), which the work invokes when
the SDIO IRQs have been processed. This informs the host about when it
shall re-enable the SDIO IRQs. Potentially, we could re-use the existing
->enable_sdio_irq() callback instead of adding a new one, however it has
turned out that it's more convenient for hosts to get this information via
a separate callback.
Hosts that wants to use this new method to signal/process SDIO IRQs, must
enable MMC_CAP2_SDIO_IRQ_NOTHREAD and implement the ->ack_sdio_irq()
callback.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Tested-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
In cases when MMC_CAP2_SDIO_IRQ_NOTHREAD is set, there is a minor window
for when the mmc host could call sdio_run_irqs(), while in fact an SDIO
func driver could have decided to released the SDIO IRQ via a call to
sdio_release_irq(). In this scenario, processing of the SDIO IRQs are done
even if there is none IRQ claimed, which is not what we want.
To prevent this from happen, close the window by validating that at least
one SDIO IRQs is claimed, before deciding to process them.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Tested-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
In case if a pwrseq-emmc has been bound to the host, a call to
mmc_power_up() triggers an eMMC HW reset via the pwrseq_emmc's
->post_power_on() callback. This isn't really what we want, as
mmc_power_up() is called each time when resuming the card.
As a matter of fact, the current approach may also violate the eMMC spec,
as the involved delays managed in pwrseq_emmc assumes both VCC and VCCQ has
been turned on, which isn't the case for VCCQ, unless the regulator is
always on.
Fix this behaviour by aligning to the same procedure used when the mmc host
implements the ->hw_reset() callback and has the MMC_CAP_HW_RESET flag set.
In this way the eMMC HW reset is issued at card detection scan, to cope
with bogus bootloaders and in the error recovery path via the mmc specific
bus_ops->reset() callback.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
The ->reset() callback is needed to implement a better support for eMMC HW
reset. The following changes will take advantage of the new callback.
Suggested-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Add the missing endianness conversions when printing the USB
device-descriptor idVendor and idProduct fields during probe.
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
AVR32 is gone. Now it's time to clean up the driver by removing
leftovers that was used by AVR32 related code.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Ludovic Desroches <ludovic.desroches@microchip.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
At the end of either of the read or write loops len is always zero
and hence the non-zero check on len and return of -EIO is redundant
and can be removed.
Detected by CoverityScan, CID#114293 ("Logically dead code")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
ret is signed however is printed as unsigned fix the same.
If printed as a negative number the result is easier to read.
No functional change.
Signed-off-by: Shubhrajyoti Datta <shubhrajyoti.datta@xilinx.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Stack guard page is a useful feature to reduce a risk of stack smashing
into a different mapping. We have been using a single page gap which
is sufficient to prevent having stack adjacent to a different mapping.
But this seems to be insufficient in the light of the stack usage in
userspace. E.g. glibc uses as large as 64kB alloca() in many commonly
used functions. Others use constructs liks gid_t buffer[NGROUPS_MAX]
which is 256kB or stack strings with MAX_ARG_STRLEN.
This will become especially dangerous for suid binaries and the default
no limit for the stack size limit because those applications can be
tricked to consume a large portion of the stack and a single glibc call
could jump over the guard page. These attacks are not theoretical,
unfortunatelly.
Make those attacks less probable by increasing the stack guard gap
to 1MB (on systems with 4k pages; but make it depend on the page size
because systems with larger base pages might cap stack allocations in
the PAGE_SIZE units) which should cover larger alloca() and VLA stack
allocations. It is obviously not a full fix because the problem is
somehow inherent, but it should reduce attack space a lot.
One could argue that the gap size should be configurable from userspace,
but that can be done later when somebody finds that the new 1MB is wrong
for some special case applications. For now, add a kernel command line
option (stack_guard_gap) to specify the stack gap size (in page units).
Implementation wise, first delete all the old code for stack guard page:
because although we could get away with accounting one extra page in a
stack vma, accounting a larger gap can break userspace - case in point,
a program run with "ulimit -S -v 20000" failed when the 1MB gap was
counted for RLIMIT_AS; similar problems could come with RLIMIT_MLOCK
and strict non-overcommit mode.
Instead of keeping gap inside the stack vma, maintain the stack guard
gap as a gap between vmas: using vm_start_gap() in place of vm_start
(or vm_end_gap() in place of vm_end if VM_GROWSUP) in just those few
places which need to respect the gap - mainly arch_get_unmapped_area(),
and and the vma tree's subtree_gap support for that.
Original-patch-by: Oleg Nesterov <oleg@redhat.com>
Original-patch-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Tested-by: Helge Deller <deller@gmx.de> # parisc
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Stream of fixes has slowed down, only a few this week:
- Some DT fixes for Allwinner platforms, and addition of a clock to
the R_CCU clock controller that had been missed.
- A couple of small DT fixes for am335x-sl50.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJZR3FwAAoJEIwa5zzehBx3MmsQAJ+VV9hfqtXihWZFTqM+RqLO
0qn4BU+zpxS/4TK/cVXy69/PNXRJJqq0hPfmyfBPj3Mm5revejFaFg7w620mBvUy
01w1Wu2bLf6HG+9PBjmwBl9CIG4qjSHQXKvkT3A/ZVvV1zw+V/Yvs48Y7e7CDYMc
or+URw9JS5R8UZdJ03oklnkNdSRLfXCfjfwKz6Hn1WmZ30Gsg74DYBuGzvL2wFRx
qyBaNwTaItipiIIPSzrns4yexpujYwzMxypIF6q9cHXfmnA669NwHCUwhZawdvQi
ibEoGxTpisjus07/y+zcar73f+NFN3QVtKdTi+XxYTKBPH3OxU4d4DbbE4EBpazk
G/I8ZVZ87tpuskkLegTuXDjsgfsVJTdBt+Rck4+MGiP/4DccOXXauEsGhbryk5Jg
TB6r45tf9pDpoYiCF0JIkkl9TLEv4hUXgIYZBYtH1lFXbSVkGpk1y+ZM3SrgSoP1
U2wAY6vxAB6taGHI/99i3/8VI5Fd7Q06XpaGVyk9ET7pRc5Lvpbz9255jpLOasf/
8ZkaVk3yM9mzcSEezHohzQd2en1sIvA6gZbLFMBL9UoLBgrtbSJPQCIalnRelwJf
SZoO/mDmgYAr3Tq3NuYUI4dp1U49q5nGme6ujm98Hg5VdH/50ZDwidaFS/N+Ba71
gIc2TLD0OMC/zhuOOBaE
=pi+d
-----END PGP SIGNATURE-----
Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
"Stream of fixes has slowed down, only a few this week:
- Some DT fixes for Allwinner platforms, and addition of a clock to
the R_CCU clock controller that had been missed.
- A couple of small DT fixes for am335x-sl50"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
arm64: allwinner: a64: Add PLL_PERIPH0 clock to the R_CCU
ARM: sunxi: h3-h5: Add PLL_PERIPH0 clock to the R_CCU
ARM: dts: am335x-sl50: Fix cannot claim requested pins for spi0
ARM: dts: am335x-sl50: Fix card detect pin for mmc1
arm64: allwinner: h5: Remove syslink to shared DTSI
ARM: sunxi: h3/h5: fix the compatible of R_CCU
A few fixes around the PRCM support that got in 4.12 with a wrong
compatible, and a missing clock in the binding.
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJZQZkZAAoJEBx+YmzsjxAg4nAQAJJqzy8//Ur0o6Ppc6eufIAY
gYGS80x5n/a6/X7PPMj/cMBVc1/HoOoF5YVKry2edRi4jKwpBjCE6THNJ/EdI0LM
6PrN4y9yJAzxbwWfD9rfVNLg54665TGW8etBp5C3Sqdi+qmU9BTL068UYGcA46I8
XGZ53wGLnCfRH5VGpVzxORbdQMStKsZ2D0PTmZ7aJU2nrPugbf4DiGg2Uhgdx+bI
Mz3Zl6cZQraWdl6gSVTjG1Z5LQOKo5tXGIaC4zxbXe/Ss2lspxM3WKtJDhdtoTH9
ZiLGDf6Q3XUeMN5WkQNT6ZnT8+/8NQujhcktEfxhfiA5pGeHLzuOCaOLgEWVy8sc
Z6jNLHUht3W/XGOGY0szKfqmsOnDdnsnv3YbCUoWJ/0ER8kwQdJ8k4iI1EVMi23Q
UcXDiZivCgjj7mYOzfhG2YYZ03rxhadPqsnoDro/a+mI6splPhQplJC/we8TUYHt
eJmXF3rvOgXGYJAdnF8FJiftzUKUd0h8S8qsxBB3knP4mPY73vtppsvJwPOqlil2
EPcqHmcd3EvHRZrmHNsP7qpQXMiaWqImf6Ioq7hz4mPPJ5uiIPrEIdRmACCAdn9H
eeOQyI0rdg3bTCKi/dztaX4zVCMjEy9HG0xZ/Y6dvPwKjuWTpJApsTC+xEaiql2L
gQ+OFMX4mvgr79OUNp7L
=WD1J
-----END PGP SIGNATURE-----
Merge tag 'sunxi-fixes-for-4.12' of https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux into fixes
Allwinner fixes for 4.12
A few fixes around the PRCM support that got in 4.12 with a wrong
compatible, and a missing clock in the binding.
* tag 'sunxi-fixes-for-4.12' of https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux:
arm64: allwinner: a64: Add PLL_PERIPH0 clock to the R_CCU
ARM: sunxi: h3-h5: Add PLL_PERIPH0 clock to the R_CCU
arm64: allwinner: h5: Remove syslink to shared DTSI
ARM: sunxi: h3/h5: fix the compatible of R_CCU
Signed-off-by: Olof Johansson <olof@lixom.net>
for claiming SPI pins, and to fix a SDIO card detect
pin for production version of the device.
-----BEGIN PGP SIGNATURE-----
iQJFBAABCAAvFiEEkgNvrZJU/QSQYIcQG9Q+yVyrpXMFAlk3q9cRHHRvbnlAYXRv
bWlkZS5jb20ACgkQG9Q+yVyrpXMFbRAAx9XBHtsHPI0AcWEagprXFoeKClEmlV33
Az06mEwgWKM0IMKELdstFLgpT80zp71v44k61vxWHudJi3a3rEbcDiujmttKg4II
fL3CfjpYS3son449VZWc/X0ZsvU/5vhusDkx8QwGWK1FgfOLVNvwNtgxWr3roUFC
qUQZImkTvgKYuhoaoV5Sx0VtUEJ9ukLNAqjy0HTfTU15jkX4/nqmlB+8ov2oamYv
r502+LEzNVoCmlYW1SBH+yn7zebIad2UtRZqz+TV8FnJl3p4yFI1Odvo8hbODTzT
vo+YsSmhd23eKnw+yXpqdPYng3yXH5Co3vLFumpBcyNkZlCtSokLrJ1/VOjtTNVn
oM+gsR77I6zGknFdiBzITLJRFABHlziSfeaDyyhfQX9CAfrCv9t5hiqn6KwRuiGj
H1hD2JkFWeJLLPGO4YauJk5PLcAdMILfjHRHW7lBeKaT76nH4Ha5PgpCj22hNSCo
m1EDdR30QzkfL28iYp8LM9yNhkm05Y8VG517AmIrzJUl6/RhTRH2f5d6+ChTQ7AA
cp89ITqEeccCso6xULCPjqaGIuYCEopDFjQN/WP8Pt5bXweiiQ6HpAgMDp2E5PfD
3yKwYvXDyl5HIa+ZbEDPLuna1OjDu7BrCe+nuDwsVfeM0fOHViABPl6mAvH7VXvb
nQhOVeQSP8I=
=e4F/
-----END PGP SIGNATURE-----
Merge tag 'omap-for-v4.12/fixes-sl50' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes
Two fixes for am335x-sl50 to fix a boot time error
for claiming SPI pins, and to fix a SDIO card detect
pin for production version of the device.
* tag 'omap-for-v4.12/fixes-sl50' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: dts: am335x-sl50: Fix cannot claim requested pins for spi0
ARM: dts: am335x-sl50: Fix card detect pin for mmc1
Signed-off-by: Olof Johansson <olof@lixom.net>
It turns out balloon does not handle IOMMUs correctly.
We should fix that at some point, for now let's just
disable this configuration.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJZRt8uAAoJECgfDbjSjVRph00H/2YkmqY3o5J769R1x/v4Q1Jl
7ZI++QjLnMSJYfw/oVSSQ7YvmG0MNnjGZlwOGblvmiRq1USD85H23gov5kuKzlgf
xKJCzlYG+TgaM0ZR43J4kk0E13QrRkYgPoC0rpm6X7mdYScLo5Hvcw8OfKR5akpA
xhxpkX0+42ftbvDVeG7oI6Yg+HZUTS6Vp5aqrW4bykaAst3dEXKHhBczyx05zqmZ
np6aymSz13Stl5IqIhoJaOGprN+iRhxm+iT+b+J2JH0W4sA/rVxkOZ2FXAtPP0JJ
tY69SYZCBf216dV8UKGSr8/1WiBVlxjmgbzXvrVKGv9BiPza/1jRGm44Em8y6Cc=
=BQi2
-----END PGP SIGNATURE-----
Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull virtio bugfix from Michael Tsirkin:
"It turns out balloon does not handle IOMMUs correctly. We should fix
that at some point, for now let's just disable this configuration"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
virtio_balloon: disable VIOMMU support
Pull i2c fixes from Wolfram Sang:
"Two driver bugfixes"
* 'i2c/for-current-fixed' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: ismt: fix wrong device address when unmap the data buffer
i2c: rcar: use correct length when unmapping DMA
Pull MIPS fixes from Ralf Baechle:
- Three highmem fixes:
+ Fixed mapping initialization
+ Adjust the pkmap location
+ Ensure we use at most one page for PTEs
- Fix makefile dependencies for .its targets to depend on vmlinux
- Fix reversed condition in BNEZC and JIALC software branch emulation
- Only flush initialized flush_insn_slot to avoid NULL pointer
dereference
- perf: Remove incorrect odd/even counter handling for I6400
- ftrace: Fix init functions tracing
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
MIPS: .its targets depend on vmlinux
MIPS: Fix bnezc/jialc return address calculation
MIPS: kprobes: flush_insn_slot should flush only if probe initialised
MIPS: ftrace: fix init functions tracing
MIPS: mm: adjust PKMAP location
MIPS: highmem: ensure that we don't use more than one page for PTEs
MIPS: mm: fixed mappings: correct initialisation
MIPS: perf: Remove incorrect odd/even counter handling for I6400
virtio balloon bypasses the DMA API entirely so does not support the
VIOMMU right now. It's not clear we need that support, for now let's
just make sure we don't pretend to support it.
Cc: stable@vger.kernel.org
Cc: Wei Wang <wei.w.wang@intel.com>
Fixes: 1a93769399 ("virtio: new feature to detect IOMMU device quirk")
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Pull x86 fixes from Thomas Gleixner:
"Two fixlets for x86:
- Handle WARN_ONs proper with the new UD based WARN implementation
- Disable 1G mappings when 2M mappings are disabled by kmemleak or
debug_pagealloc. Otherwise 1G mappings might still be used,
confusing the debug mechanisms"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mm: Disable 1GB direct mappings when disabling 2MB mappings
x86/debug: Handle early WARN_ONs proper
Pull timer fixes from Thomas Gleixner:
"Three fixlets for timers:
- Two hot-fixes for the alarmtimer based posix timers, which prevent
a nasty DOS by self rescheduling timers. The proper cleanup of that
mess is queued for 4.13
- Make a function static"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
tick/broadcast: Make tick_broadcast_setup_oneshot() static
alarmtimer: Rate limit periodic intervals
alarmtimer: Prevent overflow of relative timers
Pull scheduler fixes from Thomas Gleixner:
"Two small fixes for the schedulre core:
- Use the proper switch_mm() variant in idle_task_exit() because that
code is not called with interrupts disabled.
- Fix a confusing typo in a printk"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/core: Idle_task_exit() shouldn't use switch_mm_irqs_off()
sched/fair: Fix typo in printk message
Pull perf fixes from Thomas Gleixner:
"Three fixes for the perf user space side:
- Fix the probing of precise_ip level, which got broken recently for
x86.
- Unbreak the ARCH=x86_64 build
- Report module before trying to unwind into the module code, which
avoids broken stack frames displayed"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf unwind: Report module before querying isactivation in dwfl unwind
perf tools: Fix build with ARCH=x86_64
perf evsel: Fix probing of precise_ip level for default cycles event
Pull irq fix from Thomas Gleixner:
"Add a missing resource release to an error path"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq: Release resources in __setup_irq() error path
Pull objtool fix from Thomas Gleixner:
"A single fix which adds fortify_panic to the list of no return
functions"
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
objtool: Add fortify_panic as __noreturn function
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJZRQjNAAoJEL1qUBy3i3wmgt4P+wfkMgfBnpiCtp7aoRtvRo8o
Ar+SR0iwcG5T8jM/Q74H2K//lKdjXtsGVrtXHeUTmhzLluw+cfnycgPo0oK9iHtR
CowneQJhj3FvPos63wzPnea0+Sg8ZBbegR6Bp+Au+IrH+svX4BO48BuSdwXHBzsa
jFU6TgvR1YGltpsn4IKzTyY1uRlyMYV6CahadqqibpaWxw301Hm2RxzPPwV50h0T
2PXLYR28L/2G5f6bvI/qp+2m1haKE2c/4kSRXX0BetkE/zifAVCuOBxo505ztDZd
KZ/SZ0uZ5xR8O0/I28udoNv37wXePoXeOs+ycpHcgG2NUDLbVmWLVVUWQVYPY6Hu
71q0piW9LulSZxhDivEIiJk7lhQ/khq6M1WeABxYHI6X4BNFNnHykgnesEULL17I
B4+Hb4rALIFjh0KrJfCGL/Y4Pne0MGP/YFGPBFAYGBqC/yzZ32xs3lgcu9H0sSnV
mWedN+CbfX74AjAl16KGC/m1fxGRTHWPSN6QCrjzcPGXDN0Zx1NtkTk6HwhjIiRB
QeFxQB13ucQZaGiAG1N0pj2Qe1+Tjla03rJ4HL+IuNTxtaVX01/XaOahUWrZHWdg
H6FkrinOKuUJRQ/tsQRSPxpaOlGkdGEqGyTR3/S/8yoxU695lKV9HmVLBOVyqHwz
faKVeNd3MAA09o/RpN9N
=GOju
-----END PGP SIGNATURE-----
Merge tag 'led_fixes_for_4.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds
Pull LED fixes from Jacek Anaszewski:
"Two LED fixes:
- fix signal source assignment for leds-bcm6328
- revert patch that intended to fix LED behavior on suspend but it
had a side effect preventing suspend at all due to uevent being
sent on trigger removal"
* tag 'led_fixes_for_4.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds:
Revert "leds: handle suspend/resume in heartbeat trigger"
leds: bcm6328: fix signal source assignment for leds 4 to 7
Here are some small gadget and xhci USB fixes for 4.12-rc6.
Nothing major, but one of the gadget patches does fix a reported oops,
and the xhci ones resolve reported problems. All have been in
linux-next with no reported issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCWUUFyA8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ylQ4gCfYyKEsNf9NDKTEw9vNmCNRpsHMa4AoMmdVqmb
lyQAz7Uw2liD+XeBj/qJ
=eAqv
-----END PGP SIGNATURE-----
Merge tag 'usb-4.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are some small gadget and xhci USB fixes for 4.12-rc6.
Nothing major, but one of the gadget patches does fix a reported oops,
and the xhci ones resolve reported problems. All have been in
linux-next with no reported issues"
* tag 'usb-4.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
USB: gadgetfs, dummy-hcd, net2280: fix locking for callbacks
usb: xhci: ASMedia ASM1042A chipset need shorts TX quirk
usb: xhci: Fix USB 3.1 supported protocol parsing
USB: gadget: fix GPF in gadgetfs
usb: gadget: composite: make sure to reactivate function on unbind
Here are some small Staging and IIO driver fixes for 4.12-rc6.
Nothing huge, just a few small driver fixes for reported issues. All
have been in linux-next with no reported issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCWUUGfg8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+yk/DwCeJUyPGQ4tFdiUvxG08bJyRT87B/IAn1jZKka3
n2+JfDEiNBlq+w2eneXG
=wXYh
-----END PGP SIGNATURE-----
Merge tag 'staging-4.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging and IIO fixes from Greg KH:
"Here are some small staging and IIO driver fixes for 4.12-rc6.
Nothing huge, just a few small driver fixes for reported issues. All
have been in linux-next with no reported issues"
* tag 'staging-4.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
Staging: rtl8723bs: fix an error code in isFileReadable()
iio: buffer-dmaengine: Add missing header buffer_impl.h
iio: buffer-dma: Add missing header buffer_impl.h
iio: adc: meson-saradc: fix potential crash in meson_sar_adc_clear_fifo
iio: adc: mxs-lradc: Fix return value check in mxs_lradc_adc_probe()
iio: imu: inv_mpu6050: add accel lpf setting for chip >= MPU6500
staging: iio: ad7152: Fix deadlock in ad7152_write_raw_samp_freq()
fixups from Zheng, prompted by the ongoing y2038 work.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQEcBAABCAAGBQJZRTAEAAoJEEp/3jgCEfOLNwcH/jfzGqhOS262fU5FCCowfNVJ
9ANzXiRpWykaHR7iPXOTRmRUqCJOCzhogmwzjnl7bQUX7cJPsFN+R7l9KR+dCAUX
300dplDWZ5oQCX2c7A7vzRCIgv4wjQjtS0mo+dY/EBCNYcynoAUVmbr/87Ezrroi
qfmA6pnI6hI527RLBkwIObljoAiy11MjQ1xFj0zS2bckWxfCSauO1v1qSpMhawkn
v4fAWEKz3y8oUG3MtT7/Ukx4/GJAOcksxKZf93AW0sNwQozCxvB40D/Dda3NcT4s
xYVVymUTYGTg1I/CmZHZxSqJwtKUOZJLwMFTXEFyo6bQH0Vj2pw/HaRf8Q5ksOU=
=ClP/
-----END PGP SIGNATURE-----
Merge tag 'ceph-for-4.12-rc6' of git://github.com/ceph/ceph-client
Pull ceph fixes from Ilya Dryomov:
"A fix for an old ceph ->fh_to_* bug from Luis and two timestamp fixups
from Zheng, prompted by the ongoing y2038 work"
* tag 'ceph-for-4.12-rc6' of git://github.com/ceph/ceph-client:
ceph: unify inode i_ctime update
ceph: use current_kernel_time() to get request time stamp
ceph: check i_nlink while converting a file handle to dentry
- Fix some bogus ASSERT failures on CONFIG_SMP=n and CONFIG_XFS_DEBUG=y.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABCgAGBQJZQhB9AAoJEPh/dxk0SrTrMWsP/A1gSHmaS2UI3WvXdqwAxldy
GiDZ3oSAhn4SF7BY+AhIu+Ot9kwCyOVzd/RZ9zVYmRjTTFxYX7P099uua2e/mf70
Z1o9oSk9H76srqo7L76h7lI2ONsgd28aqh082hmDrRUhhwy7LZThVV15ZPZFfgU4
8Q17h0uRUVgSn8M8INsuiMpw1sCLJsXw/9Rb9iFMgi3tSJaGZ1Mm//nA1aeS8HFH
xCKHYw7YUtVKtIVyVV1NGdtXhXZbNznJXelkUZLMnMgOOmAqWiUa8FOGPNbQEezX
1VidjzGhxRF7uoUJNnnX5mM+26c8/Ip4cLtqvnFQo30bx+HXO3OR8jywQO+6DD9Q
NxFHY5D/Peud96xsnq5mRzNMN5fqumyVAyUCXSebT3Oa42HMdva+66s9JoFfDehi
Z7VmRdoryxexDRbhiQVev4xe20beLtOHAP2I5JrnJddrXb0aFWowLk776wa0uxD8
7cbKgovikqSHk4/mqpyZ5iQeaufg/kOi2cNaTcAFeCbvbXYieeRXVlisMBh1DKec
lSX5e4kNNS20VVbCYakAttK69lpZzuraYPDDsnb4HRlmt0VX12lYyqQwklY/YZ9S
jDagtKWKDm/L/jq2j5Nd3uSycM+lMaq97mIMjzPRrnPjOriME1ZGLwEQbKfBLXnW
3Flzt5C2Hk1Fb/VlNx4S
=U99d
-----END PGP SIGNATURE-----
Merge tag 'xfs-4.12-fixes-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull xfs fix from Darrick Wong:
"One more bugfix for you for 4.12-rc6 to fix something that came up in
an earlier rc:
- Fix some bogus ASSERT failures on CONFIG_SMP=n and CONFIG_XFS_DEBUG=y"
* tag 'xfs-4.12-fixes-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: fix spurious spin_is_locked() assert failures on non-smp kernels
Pull ufs fixes from Al Viro:
"Fix assorted ufs bugs: a couple of deadlocks, fs corruption in
truncate(), oopsen on tail unpacking and truncate when racing with
vmscan, mild fs corruption (free blocks stats summary buggered, *BSD
fsck would complain and fix), several instances of broken logics
around reserved blocks (starting with "check almost never triggers
when it should" and then there are issues with sufficiently large
UFS2)"
[ Note: ufs hasn't gotten any loving in a long time, because nobody
really seems to use it. These ufs fixes are triggered by people
actually caring now, not some sudden influx of new bugs. - Linus ]
* 'ufs-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
ufs_truncate_blocks(): fix the case when size is in the last direct block
ufs: more deadlock prevention on tail unpacking
ufs: avoid grabbing ->truncate_mutex if possible
ufs_get_locked_page(): make sure we have buffer_heads
ufs: fix s_size/s_dsize users
ufs: fix reserved blocks check
ufs: make ufs_freespace() return signed
ufs: fix logics in "ufs: make fsck -f happy"
Pull vfs fixes from Al Viro:
"A couple of fixes; a leak in mntns_install() caught by Andrei (this
cycle regression) + d_invalidate() softlockup fix - that had been
reported by a bunch of people lately, but the problem is pretty old"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fs: don't forget to put old mntns in mntns_install
Hang/soft lockup in d_invalidate with simultaneous calls
Merge misc fixes from Andrew Morton:
"5 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
mm: correct the comment when reclaimed pages exceed the scanned pages
userfaultfd: shmem: handle coredumping in handle_userfault()
mm: numa: avoid waiting on freed migrated pages
swap: cond_resched in swap_cgroup_prepare()
mm/memory-failure.c: use compound_head() flags for huge pages
Commit e1587a4945 ("mm: vmpressure: fix sending wrong events on
underflow") declared that reclaimed pages exceed the scanned pages due
to the thp reclaim.
That is incorrect because THP will be spilt to normal page and loop
again, which will result in the scanned pages increment.
[akpm@linux-foundation.org: tweak comment text]
Link: http://lkml.kernel.org/r/1496824266-25235-1-git-send-email-zhongjiang@huawei.com
Signed-off-by: zhongjiang <zhongjiang@huawei.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Anon and hugetlbfs handle FOLL_DUMP set by get_dump_page() internally to
__get_user_pages().
shmem as opposed has no special FOLL_DUMP handling there so
handle_mm_fault() is invoked without mmap_sem and ends up calling
handle_userfault() that isn't expecting to be invoked without mmap_sem
held.
This makes handle_userfault() fail immediately if invoked through
shmem_vm_ops->fault during coredumping and solves the problem.
The side effect is a BUG_ON with no lock held triggered by the
coredumping process which exits. Only 4.11 is affected, pre-4.11 anon
memory holes are skipped in __get_user_pages by checking FOLL_DUMP
explicitly against empty pagetables (mm/gup.c:no_page_table()).
It's zero cost as we already had a check for current->flags to prevent
futex to trigger userfaults during exit (PF_EXITING).
Link: http://lkml.kernel.org/r/20170615214838.27429-1-aarcange@redhat.com
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Reported-by: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: <stable@vger.kernel.org> [4.11+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In do_huge_pmd_numa_page(), we attempt to handle a migrating thp pmd by
waiting until the pmd is unlocked before we return and retry. However,
we can race with migrate_misplaced_transhuge_page():
// do_huge_pmd_numa_page // migrate_misplaced_transhuge_page()
// Holds 0 refs on page // Holds 2 refs on page
vmf->ptl = pmd_lock(vma->vm_mm, vmf->pmd);
/* ... */
if (pmd_trans_migrating(*vmf->pmd)) {
page = pmd_page(*vmf->pmd);
spin_unlock(vmf->ptl);
ptl = pmd_lock(mm, pmd);
if (page_count(page) != 2)) {
/* roll back */
}
/* ... */
mlock_migrate_page(new_page, page);
/* ... */
spin_unlock(ptl);
put_page(page);
put_page(page); // page freed here
wait_on_page_locked(page);
goto out;
}
This can result in the freed page having its waiters flag set
unexpectedly, which trips the PAGE_FLAGS_CHECK_AT_PREP checks in the
page alloc/free functions. This has been observed on arm64 KVM guests.
We can avoid this by having do_huge_pmd_numa_page() take a reference on
the page before dropping the pmd lock, mirroring what we do in
__migration_entry_wait().
When we hit the race, migrate_misplaced_transhuge_page() will see the
reference and abort the migration, as it may do today in other cases.
Fixes: b8916634b7 ("mm: Prevent parallel splits during THP migration")
Link: http://lkml.kernel.org/r/1497349722-6731-2-git-send-email-will.deacon@arm.com
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Acked-by: Steve Capper <steve.capper@arm.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Mel Gorman <mgorman@suse.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
I saw need_resched() warnings when swapping on large swapfile (TBs)
because continuously allocating many pages in swap_cgroup_prepare() took
too long.
We already cond_resched when freeing page in swap_cgroup_swapoff(). Do
the same for the page allocation.
Link: http://lkml.kernel.org/r/20170604200109.17606-1-yuzhao@google.com
Signed-off-by: Yu Zhao <yuzhao@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>