Commit Graph

336166 Commits

Author SHA1 Message Date
Al Viro
fae2ae2a90 sparc64: not any error from do_sigaltstack() should fail rt_sigreturn()
If a signal handler is executed on altstack and another signal comes,
we will end up with rt_sigreturn() on return from the second handler
getting -EPERM from do_sigaltstack().  It's perfectly OK, since we
are not asking to change the settings; in fact, they couldn't have been
changed during the second handler execution exactly because we'd been
on altstack all along.  64bit sigreturn on sparc treats any error from
do_sigaltstack() as "SIGSEGV now"; we need to switch to the same semantics
we are using on other architectures.

Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-18 22:27:03 -05:00
Francois Romieu
8495c0da20 sis900: fix sis900_set_mode call parameters.
Leftover of 57d6d456cf ("sis900: stop
using net_device.{base_addr, irq} and convert to __iomem.").

It is needed for suspend / resume to work.

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Tested-by: Jan Janssen <medhefgo@web.de>
Cc: Daniele Venzano <venza@brownhat.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-18 18:28:15 -05:00
Marcin Slusarz
3bb076af2a drm/nouveau/bios: fix DCB v1.5 parsing
memcmp->nv_strncmp conversion, in addition to name change, should have
inverted the return value.

But nv_strncmp does not act like strncmp - it does not check for string
terminator, returns true/false instead of -1/0/1 and has different
parameters order.

Let's rename it to nv_memcmp and let it act like memcmp.

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2012-11-19 08:54:20 +10:00
Maarten Lankhorst
d9c390561d drm/nouveau: add missing pll_calc calls
Fixes a null pointer dereference when reclocking on my fermi.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2012-11-19 08:52:30 +10:00
Marcin Slusarz
bf7e438bca drm/nouveau: fix crash with noaccel=1
Reported-by: Ortwin Glück <odi@odi.ch>
Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2012-11-19 08:52:20 +10:00
Marcin Slusarz
1f150b3e7a drm/nv40: allocate ctxprog with kmalloc
Some archs defconfigs have CONFIG_FRAME_WARN set to 1024, which lead to this
warning:
drivers/gpu/drm/nouveau/core/engine/graph/ctxnv40.c: warning: the frame size
of 1184 bytes is larger than 1024 bytes

Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2012-11-19 08:52:07 +10:00
Kelly Doran
4113014f2d drm/nvc0/disp: fix thinko in vblank regression fix..
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2012-11-19 08:52:03 +10:00
Al Viro
3587b1b097 fanotify: fix FAN_Q_OVERFLOW case of fanotify_read()
If the FAN_Q_OVERFLOW bit set in event->mask, the fanotify event
metadata will not contain a valid file descriptor, but
copy_event_to_user() didn't check for that, and unconditionally does a
fd_install() on the file descriptor.

Which in turn will cause a BUG_ON() in __fd_install().

Introduced by commit 352e3b2492 ("fanotify: sanitize failure exits in
copy_event_to_user()")

Mea culpa - missed that path ;-/

Reported-by: Alex Shi <lkml.alex@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-11-18 09:30:00 -10:00
Linus Torvalds
8d938105e4 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull misc VFS fixes from Al Viro:
 "Remove a bogus BUG_ON() that can trigger spuriously + alpha bits of
  do_mount() constification I'd missed during the merge window."

This pull request came in a week ago, I missed it for some reason.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  kill bogus BUG_ON() in do_close_on_exec()
  missing const in alpha callers of do_mount()
2012-11-18 09:13:48 -10:00
Linus Torvalds
aa7202c251 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k
Pull m68k fix from Geert Uytterhoeven:
 "This is a bug fix for asm constraints that affect sending RT signals,
  also destined for -stable."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
  m68k: fix sigset_t accessor functions
2012-11-18 08:36:24 -10:00
Linus Torvalds
5ad27d6ca5 Last minute GPIO fixes for the v3.7 series:
- Disable blinking on the Orion GPIO driver
 - Two Kconfig-style fixes to avoid broken builds
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABAgAGBQJQqAG0AAoJEEEQszewGV1zOioP/iDD6lmV9SLbRXuFtIwzuXiZ
 c+2OHvfrprHxEc4DiLajxA9vFhqyh3HjNj1f9U0c17P9G4c6TCkFRxWtUEdfI0Ls
 z/0oFWQBG3E0wK9xAKup9ochz4h8I99HDCbjw8rrgaVlYDsrnd5mERzlzgV9OzB8
 4GGw9JD8USwo2qZ0ThISxzsI9PDz+FFgKYumRBOhwAfSVbHkVIxLqkMYge8vI7kf
 ck/MVrpS6sBRPX9uyzbDaoPxodQb4cYVqmUY9v2GmmK+FY3OQUisK0d0k5h2pWQP
 j7TWRwMgaeYXgsv1wG53/Ay6B48k1Wv1N4vwwJ/4skBZEWi7TwcuGQydWeUo96hb
 nDQWc0AuEXZ1HIewRxqHeTaxJHi/OBB6W5hxzc+TuOTw3Xi491DWmiuf4h6WvpRS
 s0wwMJ8xN8sekRgmhE321FmyGoviSoqIaT2VAXXOGWf+NSaKuBcq22adydgLYoFM
 j9h66aEoPESvtNlOifHs6cfwsq9jBEaQu7fG1JQITU3MjpjWgCFQB2JvdVIwqDuR
 yJNprr6/RNRIEFXZEtZQgYXb1G515RUCzjjAcUQtfY7IRInCDK8BBvv9wcdZw6+G
 3oLjF7TQpCWK/Ed+2nkeN1A5lZqVASdNx9gjnMLeS3u9IpxfjgBDQwyPELSNst3I
 CyYQdyLwOpNd26Nk/HJq
 =UBlv
 -----END PGP SIGNATURE-----

Merge tag 'gpio-fixes-for-v3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio

Pull last minute GPIO fixes from Linus Walleij:

 - Disable blinking on the Orion GPIO driver

 - Two Kconfig-style fixes to avoid broken builds

* tag 'gpio-fixes-for-v3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
  gpio-mcp23s08: Build I2C support even when CONFIG_I2C=m
  gpio: adnp: Depend on OF_GPIO instead of OF
  mvebu-gpio: Disable blinking when enabling a GPIO for output
2012-11-18 08:32:59 -10:00
Linus Torvalds
d28d3730fd xfs: bugfixes for 3.7-rc7
- fix attr tree double split corruption
 - fix broken error handling in xfs_vm_writepage
 - drop buffer io reference when a bad bio is built
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.10 (GNU/Linux)
 
 iQIcBAABAgAGBQJQp7sfAAoJENaLyazVq6ZOWHwP/2WTlenvM74i8HDa/nYW8KTC
 EubCZ6X1C7LPTV9tm9YUpKZ1VtI1O+OmuGcSmWdBKSMMoBVNyKvWXvrJeVKBVtXV
 sQ/jh1zCiPYzt9DfxGuarkw8Uy5qKNOYrbEAK1WwPMeOsDODYncfmTm+A/VYMeTt
 bWOjaxFd5QQOMuf0x9NO/keZc84R5l51ezYxA7HyYa5XvV/MDmLLVL0IhuSTFKyw
 oOiQMp0hby4zsJg6nqu/eINmdlgBIw+32m8aMSB2jreUQm4yvt0CY7M3Zq6sPmsM
 2tC6cFonPw31FBBu9jvv9h5wNz7McyzxtZBS0+zDV+7K0UrIyxWm1BhzZIXoXzLz
 vHwc4gnZV8nOP/g34aftHLYYRD3ZJhG8mX5AdBRzlWWqDSFvYVEq+1evHrv8kk4l
 coTapzimNnR3aJ16qdP1M0gExKO9nrGVqrRi8ndLNbxLpxC9mFG7CfJBQPMumukX
 G8pTV1wQvqONHDNlN4mxqMBHN0d9dGp5xjYQ0Q92/siIA1C5szjCwTHekKNrP6Ol
 7xd+nO7Xcgj7Uwaakv31paqOSAGhla6H5jvxPF2A54hZWQqlp88QpChLt3LFPxwh
 tEYTEf1zRoaoCS4TD3zMYTLY+9cXvUybSIf3hbgns+JMYHJtuZdzbvcaXE6Wl4Jr
 6esA5fsBFP1J2/EzpLof
 =depY
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-v3.7-rc7' of git://oss.sgi.com/xfs/xfs

Pull xfs bugfixes from Ben Myers:

 - fix attr tree double split corruption

 - fix broken error handling in xfs_vm_writepage

 - drop buffer io reference when a bad bio is built

* tag 'for-linus-v3.7-rc7' of git://oss.sgi.com/xfs/xfs:
  xfs: drop buffer io reference when a bad bio is built
  xfs: fix broken error handling in xfs_vm_writepage
  xfs: fix attr tree double split corruption
2012-11-18 08:29:34 -10:00
Linus Torvalds
5e30c089e5 If you were going to shoot me for not sending these earlier, you would be
right.  -rc6 beat me by ~2 hours it seems, and they really should have
 gone out long before that.
 
 These have been in libata-dev.git for a day or so (unfortunately
 linux-next is on vacation).  The main one is #1, with the others being
 minor bits.  #1 has multiple tested-by, and can be considered a
 regression fix IMO.
 
 1) Fix ACPI oops, https://bugzilla.kernel.org/show_bug.cgi?id=48211
 
 2) Temporary WARN_ONCE() debugging patch for further ACPI debugging.
    The code already oopses here, and so this merely gives slightly
    better info.  Related to https://bugzilla.kernel.org/show_bug.cgi?id=49151
    which has been bisected down to a patch that _exposes_ a latest bug,
    but said bisection target does not actually appear to be the root cause
    itself.
 
 3) sata_svw: fix longstanding error recovery bug, which was
    preventing kdump, by adding missing DMA-start bit check.  Core
    code was already checking DMA-start, but ancillary, less-used
    routines were not.  Fixed.
 
 4) sata_highbank: fix minor __init/__devinit warning
 
 5) Fix minor warning, if CONFIG_PM is set, but CONFIG_PM_SLEEP is not set
 
 6) pata_arasan: proper functioning requires clock setting
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIVAwUAUKcUVSWzCDIBeCsvAQIbhQ/+OUl+AcgvrF4+90Gv1TfuSSWlzEHjOQ/N
 CQdx5da2OQhOE3rGEyaqBRt5L+bM20QLimH67QvQirhx4+cm2DtE2VcUS2ZpMh5P
 BHnk2nuZb77m0XKbRc2u1DxxLq4HIYljKmJXnaCYp6d+o8E2ETdGIt9Zj9yxVscC
 v63M28L0ab31gisAIXAbNXx+/iS3mtfIl5G8u0b0XNVixO0f/uRc/rl1feB3UvJF
 GGRPZGExrCwy9RXMvqi+nRVu5HAHQ4iLp6aoMknR/URGvymmGE0h0tpiy05UfP64
 7O2hb9LJMVwSKuSkFVlRcKHKPrDisQs+bFT7a208AqVyAI3kv1ZHKDdcSRlGXnVd
 uy6jHL/+LKONefC3xjYoyctxZTCLIJCXg6lydwLG3R7HlrId3HVfnrukAvNo+nCR
 D47gV8llj9LIrQg8nyZvOwQR8CVXD8oUV2mPU/P6Br8otZ5dlfrCb5jqhzYfel+Z
 XPbBT/OIfO1JPVRGuyKlR3SCujT1x9VqETD0yj/XkxkDiKqOWbfMzu2beRQtYYsT
 ZDKd+niVxOjoLbDEIOEgRkoAdVgQ8EkH0+c12gmTho18UIvXj2NeZMyNw9/UX5p2
 VqT25aCG5vcffI+sSz1LTjWc6OAHNTW7dEkFlsAxyTIHe347yQwrATZ/iqhBj5Ci
 rpv517rYfl0=
 =V0y8
 -----END PGP SIGNATURE-----

Merge tag 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev

Pull libata fixes from Jeff Garzik:
 "If you were going to shoot me for not sending these earlier, you would
  be right.  -rc6 beat me by ~2 hours it seems, and they really should
  have gone out long before that.

  These have been in libata-dev.git for a day or so (unfortunately
  linux-next is on vacation).  The main one is #1, with the others being
  minor bits.  #1 has multiple tested-by, and can be considered a
  regression fix IMO.

   1) Fix ACPI oops:

        https://bugzilla.kernel.org/show_bug.cgi?id=48211

   2) Temporary WARN_ONCE() debugging patch for further ACPI debugging.

      The code already oopses here, and so this merely gives slightly
      better info.  Related to

        https://bugzilla.kernel.org/show_bug.cgi?id=49151

      which has been bisected down to a patch that _exposes_ a latest
      bug, but said bisection target does not actually appear to be the
      root cause itself.

   3) sata_svw: fix longstanding error recovery bug, which was
      preventing kdump, by adding missing DMA-start bit check.  Core
      code was already checking DMA-start, but ancillary, less-used
      routines were not.  Fixed.

   4) sata_highbank: fix minor __init/__devinit warning

   5) Fix minor warning, if CONFIG_PM is set, but CONFIG_PM_SLEEP is not
      set

   6) pata_arasan: proper functioning requires clock setting"

* tag 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
  [libata] PM callbacks should be conditionally compiled on CONFIG_PM_SLEEP
  sata_svw: check DMA start bit before reset
  libata debugging: Warn when unable to find timing descriptor based on xfer_mode
  sata_highbank: mark ahci_highbank_probe as __devinit
  pata_arasan: Initialize cf clock to 166MHz
  libata-acpi: Fix NULL ptr derference in ata_acpi_dev_handle
2012-11-18 08:26:35 -10:00
Clemens Ladisch
e99ddfde6a ALSA: ua101, usx2y: fix broken MIDI output
Commit 88a8516a21 (ALSA: usbaudio: implement USB autosuspend) added
autosuspend code to all files making up the snd-usb-audio driver.
However, midi.c is part of snd-usb-lib and is also used by other
drivers, not all of which support autosuspend.  Thus, calls to
usb_autopm_get_interface() could fail, and this unexpected error would
result in the MIDI output being completely unusable.

Make it work by ignoring the error that is expected with drivers that do
not support autosuspend.

Reported-by: Colin Fletcher <colin.m.fletcher@googlemail.com>
Reported-by: Devin Venable <venable.devin@gmail.com>
Reported-by: Dr Nick Bailey <nicholas.bailey@glasgow.ac.uk>
Reported-by: Jannis Achstetter <jannis_achstetter@web.de>
Reported-by: Rui Nuno Capela <rncbc@rncbc.org>
Cc: Oliver Neukum <oliver@neukum.org>
Cc: 2.6.39+ <stable@vger.kernel.org>
Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
2012-11-18 17:15:24 +01:00
Emmanuel Grumbach
e1b69fdf33 iwlwifi: don't WARN when a non empty queue is disabled
This can happen when we shut down suddenly an interface.

Cc: stable@vger.kernel.org
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2012-11-18 11:51:09 +01:00
Andreas Schwab
34fa78b59c m68k: fix sigset_t accessor functions
The sigaddset/sigdelset/sigismember functions that are implemented with
bitfield insn cannot allow the sigset argument to be placed in a data
register since the sigset is wider than 32 bits.  Remove the "d"
constraint from the asm statements.

The effect of the bug is that sending RT signals does not work, the signal
number is truncated modulo 32.

Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: stable@vger.kernel.org
2012-11-18 10:32:16 +01:00
Daniel M. Weeks
cbf24fad8e gpio-mcp23s08: Build I2C support even when CONFIG_I2C=m
The driver has both SPI and I2C pieces. The appropriate pieces are built based
on whether SPI and/or I2C is/are enabled. However, it was only checking if I2C
was built-in, never if it was built as a module. This patch checks for either
since building both this driver and I2C as modules is possible.

Signed-off-by: Daniel M. Weeks <dan@danweeks.net>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2012-11-17 22:22:24 +01:00
Thierry Reding
cb144fe8e0 gpio: adnp: Depend on OF_GPIO instead of OF
The driver accesses the of_node field of struct gpio_chip, which is only
available if OF_GPIO is selected. This solves a build issue on SPARC
which conflicts with OF_GPIO and therefore does not provide this field.

Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2012-11-17 22:22:23 +01:00
Jamie Lentin
e91337609a mvebu-gpio: Disable blinking when enabling a GPIO for output
The plat-orion GPIO driver would disable any pin blinking whenever
using a pin for output. Do the same here, as a blinking LED will
continue to blink regardless of what the GPIO pin level is.

Signed-off-by: Jamie Lentin <jm@lentin.co.uk>
Acked-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2012-11-17 22:22:23 +01:00
Dave Chinner
d69043c42d xfs: drop buffer io reference when a bad bio is built
Error handling in xfs_buf_ioapply_map() does not handle IO reference
counts correctly. We increment the b_io_remaining count before
building the bio, but then fail to decrement it in the failure case.
This leads to the buffer never running IO completion and releasing
the reference that the IO holds, so at unmount we can leak the
buffer. This leak is captured by this assert failure during unmount:

XFS: Assertion failed: atomic_read(&pag->pag_ref) == 0, file: fs/xfs/xfs_mount.c, line: 273

This is not a new bug - the b_io_remaining accounting has had this
problem for a long, long time - it's just very hard to get a
zero length bio being built by this code...

Further, the buffer IO error can be overwritten on a multi-segment
buffer by subsequent bio completions for partial sections of the
buffer. Hence we should only set the buffer error status if the
buffer is not already carrying an error status. This ensures that a
partial IO error on a multi-segment buffer will not be lost. This
part of the problem is a regression, however.

cc: <stable@vger.kernel.org>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2012-11-17 09:36:57 -06:00
Dave Chinner
3daed8bc3e xfs: fix broken error handling in xfs_vm_writepage
When we shut down the filesystem, it might first be detected in
writeback when we are allocating a inode size transaction. This
happens after we have moved all the pages into the writeback state
and unlocked them. Unfortunately, if we fail to set up the
transaction we then abort writeback and try to invalidate the
current page. This then triggers are BUG() in block_invalidatepage()
because we are trying to invalidate an unlocked page.

Fixing this is a bit of a chicken and egg problem - we can't
allocate the transaction until we've clustered all the pages into
the IO and we know the size of it (i.e. whether the last block of
the IO is beyond the current EOF or not). However, we don't want to
hold pages locked for long periods of time, especially while we lock
other pages to cluster them into the write.

To fix this, we need to make a clear delineation in writeback where
errors can only be handled by IO completion processing. That is,
once we have marked a page for writeback and unlocked it, we have to
report errors via IO completion because we've already started the
IO. We may not have submitted any IO, but we've changed the page
state to indicate that it is under IO so we must now use the IO
completion path to report errors.

To do this, add an error field to xfs_submit_ioend() to pass it the
error that occurred during the building on the ioend chain. When
this is non-zero, mark each ioend with the error and call
xfs_finish_ioend() directly rather than building bios. This will
immediately push the ioends through completion processing with the
error that has occurred.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2012-11-17 09:35:42 -06:00
Dave Chinner
42e2976f13 xfs: fix attr tree double split corruption
In certain circumstances, a double split of an attribute tree is
needed to insert or replace an attribute. In rare situations, this
can go wrong, leaving the attribute tree corrupted. In this case,
the attr being replaced is the last attr in a leaf node, and the
replacement is larger so doesn't fit in the same leaf node.
When we have the initial condition of a node format attribute
btree with two leaves at index 1 and 2. Call them L1 and L2.  The
leaf L1 is completely full, there is not a single byte of free space
in it. L2 is mostly empty.  The attribute being replaced - call it X
- is the last attribute in L1.

The way an attribute replace is executed is that the replacement
attribute - call it Y - is first inserted into the tree, but has an
INCOMPLETE flag set on it so that list traversals ignore it. Once
this transaction is committed, a second transaction it run to
atomically mark Y as COMPLETE and X as INCOMPLETE, so that a
traversal will now find Y and skip X. Once that transaction is
committed, attribute X is then removed.

So, the initial condition is:

     +--------+     +--------+
     |   L1   |     |   L2   |
     | fwd: 2 |---->| fwd: 0 |
     | bwd: 0 |<----| bwd: 1 |
     | fsp: 0 |     | fsp: N |
     |--------|     |--------|
     | attr A |     | attr 1 |
     |--------|     |--------|
     | attr B |     | attr 2 |
     |--------|     |--------|
     ..........     ..........
     |--------|     |--------|
     | attr X |     | attr n |
     +--------+     +--------+

So now we go to replace X, and see that L1:fsp = 0 - it is full so
we can't insert Y in the same leaf. So we record the the location of
attribute X so we can track it for later use, then we split L1 into
L1 and L3 and reblance across the two leafs. We end with:

     +--------+     +--------+     +--------+
     |   L1   |     |   L3   |     |   L2   |
     | fwd: 3 |---->| fwd: 2 |---->| fwd: 0 |
     | bwd: 0 |<----| bwd: 1 |<----| bwd: 3 |
     | fsp: M |     | fsp: J |     | fsp: N |
     |--------|     |--------|     |--------|
     | attr A |     | attr X |     | attr 1 |
     |--------|     +--------+     |--------|
     | attr B |                    | attr 2 |
     |--------|                    |--------|
     ..........                    ..........
     |--------|                    |--------|
     | attr W |                    | attr n |
     +--------+                    +--------+

And we track that the original attribute is now at L3:0.

We then try to insert Y into L1 again, and find that there isn't
enough room because the new attribute is larger than the old one.
Hence we have to split again to make room for Y. We end up with
this:

     +--------+     +--------+     +--------+     +--------+
     |   L1   |     |   L4   |     |   L3   |     |   L2   |
     | fwd: 4 |---->| fwd: 3 |---->| fwd: 2 |---->| fwd: 0 |
     | bwd: 0 |<----| bwd: 1 |<----| bwd: 4 |<----| bwd: 3 |
     | fsp: M |     | fsp: J |     | fsp: J |     | fsp: N |
     |--------|     |--------|     |--------|     |--------|
     | attr A |     | attr Y |     | attr X |     | attr 1 |
     |--------|     + INCOMP +     +--------+     |--------|
     | attr B |     +--------+                    | attr 2 |
     |--------|                                   |--------|
     ..........                                   ..........
     |--------|                                   |--------|
     | attr W |                                   | attr n |
     +--------+                                   +--------+

And now we have the new (incomplete) attribute @ L4:0, and the
original attribute at L3:0. At this point, the first transaction is
committed, and we move to the flipping of the flags.

This is where we are supposed to end up with this:

     +--------+     +--------+     +--------+     +--------+
     |   L1   |     |   L4   |     |   L3   |     |   L2   |
     | fwd: 4 |---->| fwd: 3 |---->| fwd: 2 |---->| fwd: 0 |
     | bwd: 0 |<----| bwd: 1 |<----| bwd: 4 |<----| bwd: 3 |
     | fsp: M |     | fsp: J |     | fsp: J |     | fsp: N |
     |--------|     |--------|     |--------|     |--------|
     | attr A |     | attr Y |     | attr X |     | attr 1 |
     |--------|     +--------+     + INCOMP +     |--------|
     | attr B |                    +--------+     | attr 2 |
     |--------|                                   |--------|
     ..........                                   ..........
     |--------|                                   |--------|
     | attr W |                                   | attr n |
     +--------+                                   +--------+

But that doesn't happen properly - the attribute tracking indexes
are not pointing to the right locations. What we end up with is both
the old attribute to be removed pointing at L4:0 and the new
attribute at L4:1.  On a debug kernel, this assert fails like so:

XFS: Assertion failed: args->index2 < be16_to_cpu(leaf2->hdr.count), file: fs/xfs/xfs_attr_leaf.c, line: 2725

because the new attribute location does not exist. On a production
kernel, this goes unnoticed and the code proceeds ahead merrily and
removes L4 because it thinks that is the block that is no longer
needed. This leaves the hash index node pointing to entries
L1, L4 and L2, but only blocks L1, L3 and L2 to exist. Further, the
leaf level sibling list is L1 <-> L4 <-> L2, but L4 is now free
space, and so everything is busted. This corruption is caused by the
removal of the old attribute triggering a join - it joins everything
correctly but then frees the wrong block.

xfs_repair will report something like:

bad sibling back pointer for block 4 in attribute fork for inode 131
problem with attribute contents in inode 131
would clear attr fork
bad nblocks 8 for inode 131, would reset to 3
bad anextents 4 for inode 131, would reset to 0

The problem lies in the assignment of the old/new blocks for
tracking purposes when the double leaf split occurs. The first split
tries to place the new attribute inside the current leaf (i.e.
"inleaf == true") and moves the old attribute (X) to the new block.
This sets up the old block/index to L1:X, and newly allocated
block to L3:0. It then moves attr X to the new block and tries to
insert attr Y at the old index. That fails, so it splits again.

With the second split, the rebalance ends up placing the new attr in
the second new block - L4:0 - and this is where the code goes wrong.
What is does is it sets both the new and old block index to the
second new block. Hence it inserts attr Y at the right place (L4:0)
but overwrites the current location of the attr to replace that is
held in the new block index (currently L3:0). It over writes it with
L4:1 - the index we later assert fail on.

Hopefully this table will show this in a foramt that is a bit easier
to understand:

Split		old attr index		new attr index
		vanilla	patched		vanilla	patched
before 1st	L1:26	L1:26		N/A	N/A
after 1st	L3:0	L3:0		L1:26	L1:26
after 2nd	L4:0	L3:0		L4:1	L4:0
                ^^^^			^^^^
		wrong			wrong

The fix is surprisingly simple, for all this analysis - just stop
the rebalance on the out-of leaf case from overwriting the new attr
index - it's already correct for the double split case.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2012-11-17 09:34:13 -06:00
Alex Williamson
3da4af0aff intel-iommu: Fix lookup in add device
We can't assume this device exists, fall back to the bridge itself.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Tested-by: Matthew Thode <prometheanfire@gentoo.org>
Cc: stable@vger.kernel.org
Signed-off-by: Joerg Roedel <joro@8bytes.org>
2012-11-17 13:27:15 +01:00
Cyril Roelandt
b334b64862 iommu/tegra-smmu.c: fix dentry reference leak in smmu_debugfs_stats_show().
Call to d_find_alias() needs a corresponding dput().

Signed-off-by: Cyril Roelandt <tipecaml@gmail.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
2012-11-17 13:25:40 +01:00
Joerg Roedel
e41105687b iommu/amd: Update MAINTAINERS entry
I have no access to my AMD email address anymore. Update
entry in MAINTAINERS to the new address.

Signed-off-by: Joerg Roedel <joro@8bytes.org>
2012-11-17 12:56:45 +01:00
Al Viro
1e93f66831 Merge branch 'arch-unicore32' into no-rebases 2012-11-16 22:29:03 -05:00
Al Viro
2bf81c8af9 Merge branch 'arch-microblaze' into no-rebases 2012-11-16 22:28:43 -05:00
Al Viro
9526d9bc23 Merge branch 'arch-arm64' into no-rebases 2012-11-16 22:28:34 -05:00
Al Viro
d042f3135e Merge branch 'arch-sparc' into no-rebases 2012-11-16 22:28:30 -05:00
Al Viro
48944cc84d Merge branch 'arch-xtensa' into no-rebases 2012-11-16 22:28:06 -05:00
Al Viro
d05f06e60d Merge branch 'arch-frv' into no-rebases 2012-11-16 22:27:58 -05:00
Al Viro
0af1c5300d Merge branch 'arch-s390' into no-rebases 2012-11-16 22:27:45 -05:00
Al Viro
42472a742b Merge branch 'arch-hexagon' into no-rebases 2012-11-16 22:27:42 -05:00
Al Viro
94b237b6b9 Merge branch 'arch-tile' into no-rebases 2012-11-16 22:27:38 -05:00
Al Viro
2482f8449d Merge branch 'arch-parisc' into no-rebases 2012-11-16 22:27:35 -05:00
Al Viro
94b28de47f Merge branch 'arch-ia64' into no-rebases 2012-11-16 22:27:31 -05:00
Al Viro
c09c5ad459 Merge branch 'arch-openrisc' into no-rebases 2012-11-16 22:27:13 -05:00
Al Viro
6929039761 Merge commit '6ba1bc826d160fe4f32bcb188687dcca4bdfaf3d' into arch-arm64
Backmerge from mainline commit that introduced a trivial conflict in
arch/arm64/kernel/process.c - a bunch of functions removed next to the
place where kernel_thread() used to be.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-11-16 20:53:36 -05:00
Al Viro
85910c202b Merge commit '517ffce4e1a03aea979fe3a18a3dd1761a24fafb' into arch-sparc
Backmerge from the point in mainline where a trivial conflict had been
introduced (arch/sparc/kernel/sys_sparc_64.c had grown sys_kern_features()
right after where kernel_execve() used to be)

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-11-16 20:49:06 -05:00
Linus Torvalds
f4a75d2eb7 Linux 3.7-rc6 2012-11-16 17:42:40 -08:00
Linus Torvalds
51844b0f04 Merge git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fix from Marcelo Tosatti:
 "A correction for oops on module init with older Intel hosts."

* git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86: Fix invalid secondary exec controls in vmx_cpuid_update()
2012-11-16 16:49:10 -08:00
Linus Torvalds
0cad3ff404 Merge branch 'akpm' (Fixes from Andrew)
Merge misc fixes from Andrew Morton.

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (12 patches)
  revert "mm: fix-up zone present pages"
  tmpfs: change final i_blocks BUG to WARNING
  tmpfs: fix shmem_getpage_gfp() VM_BUG_ON
  mm: highmem: don't treat PKMAP_ADDR(LAST_PKMAP) as a highmem address
  mm: revert "mm: vmscan: scale number of pages reclaimed by reclaim/compaction based on failures"
  rapidio: fix kernel-doc warnings
  swapfile: fix name leak in swapoff
  memcg: fix hotplugged memory zone oops
  mips, arc: fix build failure
  memcg: oom: fix totalpages calculation for memory.swappiness==0
  mm: fix build warning for uninitialized value
  mm: add anon_vma_lock to validate_mm()
2012-11-16 15:26:38 -08:00
Andrew Morton
5576646f3c revert "mm: fix-up zone present pages"
Revert commit 7f1290f2f2 ("mm: fix-up zone present pages")

That patch tried to fix a issue when calculating zone->present_pages,
but it caused a regression on 32bit systems with HIGHMEM.  With that
change, reset_zone_present_pages() resets all zone->present_pages to
zero, and fixup_zone_present_pages() is called to recalculate
zone->present_pages when the boot allocator frees core memory pages into
buddy allocator.  Because highmem pages are not freed by bootmem
allocator, all highmem zones' present_pages becomes zero.

Various options for improving the situation are being discussed but for
now, let's return to the 3.6 code.

Cc: Jianguo Wu <wujianguo@huawei.com>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: Petr Tesarik <ptesarik@suse.cz>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: David Rientjes <rientjes@google.com>
Tested-by: Chris Clayton <chris2553@googlemail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-11-16 14:33:04 -08:00
Hugh Dickins
0f3c42f522 tmpfs: change final i_blocks BUG to WARNING
Under a particular load on one machine, I have hit shmem_evict_inode()'s
BUG_ON(inode->i_blocks), enough times to narrow it down to a particular
race between swapout and eviction.

It comes from the "if (freed > 0)" asymmetry in shmem_recalc_inode(),
and the lack of coherent locking between mapping's nrpages and shmem's
swapped count.  There's a window in shmem_writepage(), between lowering
nrpages in shmem_delete_from_page_cache() and then raising swapped
count, when the freed count appears to be +1 when it should be 0, and
then the asymmetry stops it from being corrected with -1 before hitting
the BUG.

One answer is coherent locking: using tree_lock throughout, without
info->lock; reasonable, but the raw_spin_lock in percpu_counter_add() on
used_blocks makes that messier than expected.  Another answer may be a
further effort to eliminate the weird shmem_recalc_inode() altogether,
but previous attempts at that failed.

So far undecided, but for now change the BUG_ON to WARN_ON: in usual
circumstances it remains a useful consistency check.

Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-11-16 14:33:04 -08:00
Hugh Dickins
215c02bc33 tmpfs: fix shmem_getpage_gfp() VM_BUG_ON
Fuzzing with trinity hit the "impossible" VM_BUG_ON(error) (which Fedora
has converted to WARNING) in shmem_getpage_gfp():

  WARNING: at mm/shmem.c:1151 shmem_getpage_gfp+0xa5c/0xa70()
  Pid: 29795, comm: trinity-child4 Not tainted 3.7.0-rc2+ #49
  Call Trace:
    warn_slowpath_common+0x7f/0xc0
    warn_slowpath_null+0x1a/0x20
    shmem_getpage_gfp+0xa5c/0xa70
    shmem_fault+0x4f/0xa0
    __do_fault+0x71/0x5c0
    handle_pte_fault+0x97/0xae0
    handle_mm_fault+0x289/0x350
    __do_page_fault+0x18e/0x530
    do_page_fault+0x2b/0x50
    page_fault+0x28/0x30
    tracesys+0xe1/0xe6

Thanks to Johannes for pointing to truncation: free_swap_and_cache()
only does a trylock on the page, so the page lock we've held since
before confirming swap is not enough to protect against truncation.

What cleanup is needed in this case? Just delete_from_swap_cache(),
which takes care of the memcg uncharge.

Signed-off-by: Hugh Dickins <hughd@google.com>
Reported-by: Dave Jones <davej@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-11-16 14:33:04 -08:00
Will Deacon
498c228021 mm: highmem: don't treat PKMAP_ADDR(LAST_PKMAP) as a highmem address
kmap_to_page returns the corresponding struct page for a virtual address
of an arbitrary mapping.  This works by checking whether the address
falls in the pkmap region and using the pkmap page tables instead of the
linear mapping if appropriate.

Unfortunately, the bounds checking means that PKMAP_ADDR(LAST_PKMAP) is
incorrectly treated as a highmem address and we can end up walking off
the end of pkmap_page_table and subsequently passing junk to pte_page.

This patch fixes the bound check to stay within the pkmap tables.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Cc: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-11-16 14:33:04 -08:00
Mel Gorman
96710098ee mm: revert "mm: vmscan: scale number of pages reclaimed by reclaim/compaction based on failures"
Jiri Slaby reported the following:

	(It's an effective revert of "mm: vmscan: scale number of pages
	reclaimed by reclaim/compaction based on failures".) Given kswapd
	had hours of runtime in ps/top output yesterday in the morning
	and after the revert it's now 2 minutes in sum for the last 24h,
	I would say, it's gone.

The intention of the patch in question was to compensate for the loss of
lumpy reclaim.  Part of the reason lumpy reclaim worked is because it
aggressively reclaimed pages and this patch was meant to be a sane
compromise.

When compaction fails, it gets deferred and both compaction and
reclaim/compaction is deferred avoid excessive reclaim.  However, since
commit c654345924 ("mm: remove __GFP_NO_KSWAPD"), kswapd is woken up
each time and continues reclaiming which was not taken into account when
the patch was developed.

Attempts to address the problem ended up just changing the shape of the
problem instead of fixing it.  The release window gets closer and while
a THP allocation failing is not a major problem, kswapd chewing up a lot
of CPU is.

This patch reverts commit 83fde0f228 ("mm: vmscan: scale number of
pages reclaimed by reclaim/compaction based on failures") and will be
revisited in the future.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Cc: Zdenek Kabelac <zkabelac@redhat.com>
Tested-by: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
Cc: Jiri Slaby <jirislaby@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Johannes Hirte <johannes.hirte@fem.tu-ilmenau.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-11-16 14:33:04 -08:00
Randy Dunlap
2ca3cb50ed rapidio: fix kernel-doc warnings
Fix rapidio kernel-doc warnings:

  Warning(drivers/rapidio/rio.c:415): No description found for parameter 'local'
  Warning(drivers/rapidio/rio.c:415): Excess function parameter 'lstart' description in 'rio_map_inb_region'
  Warning(include/linux/rio.h:290): No description found for parameter 'switches'
  Warning(include/linux/rio.h:290): No description found for parameter 'destid_table'

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Matt Porter <mporter@kernel.crashing.org>
Acked-by: Alexandre Bounine <alexandre.bounine@idt.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-11-16 14:33:04 -08:00
Xiaotian Feng
f58b59c1df swapfile: fix name leak in swapoff
There's a name leak introduced by commit 91a27b2a75 ("vfs: define
struct filename and have getname() return it").  Add the missing
putname.

[akpm@linux-foundation.org: cleanup]
Signed-off-by: Xiaotian Feng <dannyfeng@tencent.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-11-16 14:33:04 -08:00
Hugh Dickins
bea8c150a7 memcg: fix hotplugged memory zone oops
When MEMCG is configured on (even when it's disabled by boot option),
when adding or removing a page to/from its lru list, the zone pointer
used for stats updates is nowadays taken from the struct lruvec.  (On
many configurations, calculating zone from page is slower.)

But we have no code to update all the lruvecs (per zone, per memcg) when
a memory node is hotadded.  Here's an extract from the oops which
results when running numactl to bind a program to a newly onlined node:

  BUG: unable to handle kernel NULL pointer dereference at 0000000000000f60
  IP:  __mod_zone_page_state+0x9/0x60
  Pid: 1219, comm: numactl Not tainted 3.6.0-rc5+ #180 Bochs Bochs
  Process numactl (pid: 1219, threadinfo ffff880039abc000, task ffff8800383c4ce0)
  Call Trace:
    __pagevec_lru_add_fn+0xdf/0x140
    pagevec_lru_move_fn+0xb1/0x100
    __pagevec_lru_add+0x1c/0x30
    lru_add_drain_cpu+0xa3/0x130
    lru_add_drain+0x2f/0x40
   ...

The natural solution might be to use a memcg callback whenever memory is
hotadded; but that solution has not been scoped out, and it happens that
we do have an easy location at which to update lruvec->zone.  The lruvec
pointer is discovered either by mem_cgroup_zone_lruvec() or by
mem_cgroup_page_lruvec(), and both of those do know the right zone.

So check and set lruvec->zone in those; and remove the inadequate
attempt to set lruvec->zone from lruvec_init(), which is called before
NODE_DATA(node) has been allocated in such cases.

Ah, there was one exceptionr.  For no particularly good reason,
mem_cgroup_force_empty_list() has its own code for deciding lruvec.
Change it to use the standard mem_cgroup_zone_lruvec() and
mem_cgroup_get_lru_size() too.  In fact it was already safe against such
an oops (the lru lists in danger could only be empty), but we're better
proofed against future changes this way.

I've marked this for stable (3.6) since we introduced the problem in 3.5
(now closed to stable); but I have no idea if this is the only fix
needed to get memory hotadd working with memcg in 3.6, and received no
answer when I enquired twice before.

Reported-by: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-11-16 14:33:04 -08:00