As the fs recovery is asynchronous, there is a small chance that another
node can mount (and thus recover) the slot before the recovery thread
gets to it.
If this happens, the recovery thread will block indefinitely on the
journal/slot lock as that lock will be held for the duration of the mount
(by design) by the node assigned to that slot.
The solution implemented is to keep track of the journal replays using
a recovery generation in the journal inode, which will be incremented by the
thread replaying that journal. The recovery thread, before attempting the
blocking lock on the journal/slot lock, will compare the generation on disk
with what it has cached and skip recovery if it does not match.
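The generation check described above can be sketched in userspace C. This is only an illustration under stated assumptions: struct journal_inode, struct recovery_request and both function names are hypothetical stand-ins, not the ocfs2 structures or code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the on-disk journal inode. */
struct journal_inode {
    uint32_t recovery_generation; /* bumped by whoever replays the journal */
};

/* What the recovery thread remembered when the slot was queued. */
struct recovery_request {
    uint32_t cached_generation;
};

/*
 * Returns true if this thread should still replay the journal.
 * A mismatch means another node already replayed it (and bumped the
 * generation), so taking the blocking slot lock would only hang
 * behind that node's mount.
 */
bool recovery_still_needed(const struct recovery_request *req,
                           const struct journal_inode *on_disk)
{
    assert(req != NULL && on_disk != NULL);
    return req->cached_generation == on_disk->recovery_generation;
}

/* Replaying the journal increments the generation. */
void replay_journal(struct journal_inode *on_disk)
{
    assert(on_disk != NULL);
    on_disk->recovery_generation++;
}
```

A caller would test recovery_still_needed() first and only then attempt the blocking lock.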
This bug appears to have been inadvertently introduced during the mount/umount
vote removal by mainline commit 34d024f843. In the
mount voting scheme, the messaging would indirectly indicate that the slot
was being recovered.
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
This patch renames the ij_pad to ij_recovery_generation in struct ocfs2_dinode.
This will be used to keep count of journal replays after an unclean shutdown.
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Sysfs has the __ATTR() and __ATTR_RO() macros to make defining extended
form attributes easier. configfs should have something similar.
- __CONFIGFS_ATTR() and __CONFIGFS_ATTR_RO() are the counterparts to the
sysfs macros.
- CONFIGFS_ATTR_STRUCT() creates the extended form attribute structure.
- CONFIGFS_ATTR_OPS() defines the show_attribute()/store_attribute()
operations that call the show()/store() operations of the extended
form configfs_attributes.
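To illustrate the extended-form pattern these macros support, here is a minimal userspace sketch. Every name in it (struct base_attr, struct item_attr, ITEM_ATTR_RO, item_show_attribute) is hypothetical and only mimics the idea; it is not the configfs API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Userspace stand-ins; all names here are hypothetical. */
struct base_attr { const char *name; };
struct item { int value; };

/* Extended form: the base attribute plus a typed show() operation. */
struct item_attr {
    struct base_attr attr;
    int (*show)(struct item *it, char *buf, size_t len);
};

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Read-only initializer, in the spirit of the sysfs read-only macro. */
#define ITEM_ATTR_RO(_name, _show) \
    { .attr = { .name = #_name }, .show = _show }

int value_show(struct item *it, char *buf, size_t len)
{
    return snprintf(buf, len, "%d\n", it->value);
}

struct item_attr item_attr_value = ITEM_ATTR_RO(value, value_show);

/* Generic show_attribute(): recover the extended form and dispatch. */
int item_show_attribute(struct item *it, struct base_attr *attr,
                        char *buf, size_t len)
{
    struct item_attr *ia = container_of(attr, struct item_attr, attr);

    assert(it != NULL && buf != NULL);
    return ia->show ? ia->show(it, buf, len) : -1;
}
```

The generic operation only ever sees the base attribute; container_of() is what lets it reach the typed show() without a per-attribute switch.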
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
configfs_mkdir() creates a new item by calling its parent's
->make_item/group() functions. Once that object is created,
configfs_mkdir() calls try_module_get() on the new item's module. If it
succeeds, the module owning the new item cannot be unloaded, and
configfs is safe to reference the item.
If the item and the subsystem it belongs to are part of the same module,
the subsystem is also pinned. This is the common case.
However, if the subsystem is made up of multiple modules, this may not
pin the subsystem. Thus, it would be possible to unload the toplevel
subsystem module while there is still a child item. To prevent this, we now
also try_module_get() the subsystem's module. This only really affects
children of the toplevel subsystem group. Deeper children already have
their parents pinned.
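The double pinning can be modeled with a toy reference count. This is a sketch under assumptions: struct module_model and the helper names are hypothetical, and the order in which the two owners are pinned here is chosen for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of a module reference count. */
struct module_model {
    int refcount;
    bool unloading;
};

bool model_try_get(struct module_model *m)
{
    if (m == NULL)
        return true;   /* nothing to pin */
    if (m->unloading)
        return false;  /* too late, the module is going away */
    m->refcount++;
    return true;
}

void model_put(struct module_model *m)
{
    if (m != NULL) {
        assert(m->refcount > 0);
        m->refcount--;
    }
}

bool model_can_unload(const struct module_model *m)
{
    return m->refcount == 0;
}

/* Pin both owners for a new item; undo the first pin if the second fails. */
bool pin_for_new_item(struct module_model *subsys_owner,
                      struct module_model *item_owner)
{
    if (!model_try_get(subsys_owner))
        return false;
    if (!model_try_get(item_owner)) {
        model_put(subsys_owner);
        return false;
    }
    return true;
}
```

With both pins held, neither the subsystem's module nor the item's module can be unloaded while the child item exists.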
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
When checking for user-created elements under an item to be removed by rmdir(),
configfs_detach_prep() counts fake configfs_dirents created by dir_open() as
user-created and fails when finding one. It is however perfectly valid to remove
a directory that is open.
Simply make configfs_detach_prep() skip fake configfs_dirent, like it already
does for attributes, and like detach_groups() does.
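A rough model of the resulting check, assuming a simplified dirent with hypothetical flag names (the real configfs_dirent and its flags differ, and the real code recurses into default groups):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flags, loosely modeled on configfs dirent types. */
enum {
    DIRENT_ATTR          = 1 << 0, /* attribute file */
    DIRENT_DEFAULT_GROUP = 1 << 1  /* created by the subsystem itself */
};

struct dirent_model {
    unsigned int type;
    const void *element; /* NULL for the fake entry of an open directory */
};

/* Returns 0 if the directory may be removed, -1 if a user-created child remains. */
int detach_prep_model(const struct dirent_model *children, size_t n)
{
    assert(children != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        const struct dirent_model *sd = &children[i];

        /* Skip fake dirents and attributes: neither is user-created. */
        if (sd->element == NULL || (sd->type & DIRENT_ATTR))
            continue;
        /* Default groups are fine here; real code would descend into them. */
        if (sd->type & DIRENT_DEFAULT_GROUP)
            continue;
        return -1; /* anything else was created by userspace */
    }
    return 0;
}
```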
Signed-off-by: Louis Rilling <louis.rilling@kerlabs.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Once a new configfs directory is created by configfs_attach_item() or
configfs_attach_group(), a failure in the remaining initialization steps leads
to removing a directory whose inode the VFS may have already accessed.
This commit adds the necessary inode locking to safely remove configfs
directories while cleaning up after a failure. As an advantage, the locking
rules of populate_groups() and detach_groups() become the same: the caller must
have the group's inode mutex locked.
Signed-off-by: Louis Rilling <louis.rilling@kerlabs.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
process 1:                              process 2:
configfs_mkdir("A")
  attach_group("A")
    attach_item("A")
      d_instantiate("A")
    populate_groups("A")
      mutex_lock("A")
      attach_group("A/B")
        attach_item("A/B")
          d_instantiate("A/B")
                                        mkdir("A/B/C")
                                          do_path_lookup("A/B/C", LOOKUP_PARENT)
                                            ok
                                          lookup_create("A/B/C")
                                            mutex_lock("A/B")
                                            ok
                                          configfs_mkdir("A/B/C")
                                            ok
      attach_group("A/C")
        attach_item("A/C")
          d_instantiate("A/C")
        populate_groups("A/C")
          mutex_lock("A/C")
          attach_group("A/C/D")
            attach_item("A/C/D")
              failure
          mutex_unlock("A/C")
          detach_groups("A/C")
            nothing to do
                                        mkdir("A/C/E")
                                          do_path_lookup("A/C/E", LOOKUP_PARENT)
                                            ok
                                          lookup_create("A/C/E")
                                            mutex_lock("A/C")
                                            ok
                                          configfs_mkdir("A/C/E")
                                            ok
        detach_item("A/C")
        d_delete("A/C")
      mutex_unlock("A")
      detach_groups("A")
        mutex_lock("A/B")
        detach_group("A/B")
          detach_groups("A/B")
            nothing since no _default_ group
          detach_item("A/B")
        mutex_unlock("A/B")
        d_delete("A/B")
    detach_item("A")
    d_delete("A")
Two bugs:
1/ "A/B/C" and "A/C/E" are created, but never removed while their parents are
removed in the end. The same could happen with symlink() instead of mkdir().
2/ "A" and "A/C" inodes are not locked while detach_item() is called on them,
which may confuse the VFS.
This commit fixes 1/, tagging new directories with CONFIGFS_USET_CREATING before
building the inode and instantiating the dentry, and validating the whole
group+default groups hierarchy in a second pass by clearing
CONFIGFS_USET_CREATING.
mkdir(), symlink(), lookup(), and dir_open() simply return -ENOENT if
called in (or linking to) a directory tagged with CONFIGFS_USET_CREATING. This
does not prevent userspace from calling stat() successfully on such directories,
but this prevents userspace from adding (children to | symlinking from/to |
read/write attributes of | listing the contents of) not validated items. In
other words, userspace will not interact with the subsystem on a new item until
the new item creation completes correctly.
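The two-pass scheme can be sketched as follows. This is a toy model: DIR_CREATING stands in for CONFIGFS_USET_CREATING, and struct dir_model with its helpers is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define DIR_CREATING 0x1u /* stand-in for CONFIGFS_USET_CREATING */

/* Hypothetical directory node with a list of default groups. */
struct dir_model {
    unsigned int flags;
    struct dir_model *child;   /* first default group, or NULL */
    struct dir_model *sibling; /* next default group, or NULL */
};

/* First pass: tag the directory before it becomes reachable. */
void dir_tag_creating(struct dir_model *d)
{
    assert(d != NULL);
    d->flags |= DIR_CREATING;
}

/* Second pass: validate the whole hierarchy once every step succeeded. */
void dir_validate(struct dir_model *d)
{
    assert(d != NULL);
    d->flags &= ~DIR_CREATING;
    for (struct dir_model *c = d->child; c != NULL; c = c->sibling)
        dir_validate(c);
}

/* The check mkdir(), symlink(), lookup() and dir_open() would start with. */
int dir_check_usable(const struct dir_model *parent)
{
    assert(parent != NULL);
    return (parent->flags & DIR_CREATING) ? -ENOENT : 0;
}
```

If any attach step fails, dir_validate() is never called, so userspace never gets to add entries under a directory that is about to be torn down.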
It was first proposed to re-use CONFIGFS_USET_IN_MKDIR instead of a new
flag CONFIGFS_USET_CREATING, but this generated conflicts when checking the
target of a new symlink: a valid target directory in the middle of attaching
a new user-created child item could be wrongly detected as being attached.
2/ is fixed by the next commit.
Signed-off-by: Louis Rilling <louis.rilling@kerlabs.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
On a similar pattern as mkdir() vs rmdir(), a failing symlink() may make rmdir()
fail for the symlink's parent and the symlink's target as well.
failing symlink() making target's rmdir() fail:
process 1:                              process 2:
symlink("A/S" -> "B")
  allow_link()
  create_link()
    attach to "B" links list
                                        rmdir("B")
                                          detach_prep("B")
                                            error because of new link
    configfs_create_link("A", "S")
      error (eg -ENOMEM)
failing symlink() making parent's rmdir() fail:
process 1:                              process 2:
symlink("A/D/S" -> "B")
  allow_link()
  create_link()
    attach to "B" links list
    configfs_create_link("A/D", "S")
      make_dirent("A/D", "S")
                                        rmdir("A")
                                          detach_prep("A")
                                            detach_prep("A/D")
                                              error because of "S"
      create("S")
        error (eg -ENOMEM)
We cannot use the same solution as for mkdir() vs rmdir(), since rmdir() on the
target cannot wait on the i_mutex of the new symlink's parent without risking a
deadlock (with other symlink() or sys_rename()). Instead we define a global
mutex protecting all configfs symlinks attachment, so that rmdir() can avoid the
races above.
Signed-off-by: Louis Rilling <louis.rilling@kerlabs.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
The rule for configfs symlinks is that symlinks always point to valid
config_items, and prevent the target from being removed. However,
configfs_symlink() only checks that it can grab a reference on the target item,
without ensuring that it remains alive until the symlink is correctly attached.
This patch makes configfs_symlink() fail whenever the target is being removed,
using the CONFIGFS_USET_DROPPING flag set by configfs_detach_prep() and
protected by configfs_dirent_lock.
This patch introduces a similar (weird?) behavior as with mkdir failures making
rmdir fail: if symlink() races with rmdir() of the parent directory (or its
youngest user-created ancestor if parent is a default group) or rmdir() of the
target directory, and then fails in configfs_create(), this can make the racing
rmdir() fail despite the concerned directory having no user-created entry (resp.
no symlink pointing to it or one of its default groups) in the end.
This behavior is fixed in later patches.
Signed-off-by: Louis Rilling <louis.rilling@kerlabs.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
We now use PTR_ERR() in the ->make_item() and ->make_group() operations.
Folks including configfs.h need err.h.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
It would have saved both a bug submitter and me a few hours if
scripts/ver_linux had picked the same gcc as the build.
Since I can't see any reason why it fiddles with PATH at all, this patch
removes the PATH setting.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Change the "If unsure" message to match the default value.
Signed-off-by: John Kacur <jkacur at gmail dot com>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
The extern flag currently is not included in type dump files
(genksyms --dump-types). Include that flag there for completeness.
Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
We are having two kinds of problems with genksyms today: fake checksum
changes without actual ABI changes, and changes which we would rather like
to ignore (such as an additional field at the end of a structure that
modules are not supposed to touch, for example).
I have thought about ways to improve genksyms and compute checksums
differently to avoid those problems, but in the end I don't see a
fundamentally better way. So here are some genksyms patches for at least
making the checksums more easily manageable, if we cannot fully fix them.
In addition to the bugfixes (the first two patches), this allows genksyms
to track checksum changes and report why a checksum changed (third patch),
and to selectively ignore changes (fourth patch).
This patch:
Gcc __attribute__ definitions may occur repeatedly, e.g.,
static int foo __attribute__((__used__))
__attribute__((aligned (16)));
The genksyms parser does not understand this, and generates a syntax error.
Fix this case.
Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
The z10 system supports large pages, kvm-s390 doesn't.
Make sure that we don't advertise large pages to avoid the guest crashing as
soon as the guest kernel activates DAT.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
Deleting a timer with del_timer() doesn't guarantee that the
timer function is not running at the moment of deletion. Thus
in the xt_hashlimit case we can get into a ticklish situation
when htable_gc rearms the timer and we actually
delete an entry with a pending timer.
Fix it by using del_timer_sync().
AFAIK del_timer_sync() checks whether the timer is pending by
itself, so I remove the check.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
recent_mt_destroy first flushes the entries
from the table with recent_table_flush and only *after* this
removes the proc file corresponding to that table.
Thus, if we manage to write a '+XXX' command to this file, we
will leak some entries. If we manage to write a 'clean'
command there, two recent_table_flush flows will race, since
recent_mt_destroy calls it outside the recent_lock.
The proper solution as I see it is to remove the proc file first
and then go on with flushing the table. This flushing becomes
safe w/o the lock, since the table is already inaccessible from
the outside.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
In order to time out dead connections quicker, keep track of outstanding data
and cap the timeout.
Suggested by Herbert Xu.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The device id for Am29DL800BB in jedec_probe.c is wrong.
Reference: http://www.spansion.com/datasheets/21519c4.pdf
I discovered this while working with u-boot.
The u-boot folks mentioned Linux as an upstream reference, thought I'd
post a heads-up here too.
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
- Add support for the RDC 1010 variant
- Rework the core library to have a read_id method. This allows the hacky
bits of it821x to go and prepares us for pata_hd
- Switch from WARN to BUG in ata_id_string as it will reboot if you get
it wrong so WARN won't be seen
- Allow the issue of command 0xFC on the 821x. This is needed to query
rebuild status.
- Tidy up printk formatting
- Do more ident rewriting on RAID volumes to handle firmware provided
ident data which is rather wonky
- Report the firmware revision and device layout in RAID mode
- Don't try and disable raid on the 8211 or RDC - they don't have the
relevant bits
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Subsys 106b:00a3 also is the weird apple ich8m which chokes when the
latter two ports are accessed, add it. Reported by Felipe Sere.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Felipe Sere <dodofxp@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Fix a potential memory leak when ata_init() encounters an error.
Signed-off-by: Elias Oltmanns <eo@nebensachen.de>
Cc: Tejun Heo <tj@kernel.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Global and per-LLD ATAPI disable checks were done in the command issue
path probably because it was left out during EH conversion. On
affected machines, this can cause lots of warning messages. Move them
to where they belong - the probing path.
Reported by Chunbo Luo.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Chunbo Luo <chunbo.luo@windriver.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Add the VIA_SATA_PATA flag for VX800. VX800 uses the same
chipset (0x0581/0x5324) as CX700, which has 1 PATA channel (Master/Slave)
and 1 SATA channel (Master/Slave).
Add the function via_ata_tf_load(). This works around an internal bug of
VIA chipsets, which reset the device register after the IEN bit in the
CTL register is changed.
Signed-off-by: Joseph Chan <josephchan@via.com.tw>
Cc: Tejun Heo <tj@kernel.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
The ali_init_one() function does a search for an isa_bridge,
but then fails to release it if the revision information was
not correctly found.
The problem comes from:
    isa_bridge = pci_get_device(...);
    if (isa_bridge && ...) {
        pci_dev_put(isa_bridge);
    }
where the pci_dev_put() is never called if isa_bridge
was valid but the extra checks on the chip-revision
fail to match.
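One way to avoid this kind of leak is to drop the reference unconditionally once the checks are done. The sketch below models that in userspace; struct pci_dev_model, model_get_device() and model_dev_put() are hypothetical stand-ins for the PCI core calls, and this is not necessarily how the patch itself restructures ali_init_one().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a reference-counted PCI device. */
struct pci_dev_model {
    int refcount;
    unsigned char revision;
};

/* Models a lookup that returns the device with an extra reference. */
struct pci_dev_model *model_get_device(struct pci_dev_model *dev)
{
    if (dev != NULL)
        dev->refcount++;
    return dev;
}

/* Models dropping a reference; NULL is accepted and ignored. */
void model_dev_put(struct pci_dev_model *dev)
{
    if (dev != NULL) {
        assert(dev->refcount > 0);
        dev->refcount--;
    }
}

/* Returns 1 if the bridge exists and has the wanted revision, else 0. */
int bridge_revision_matches(struct pci_dev_model *bus_dev, unsigned char wanted)
{
    struct pci_dev_model *isa_bridge = model_get_device(bus_dev);
    int match = 0;

    if (isa_bridge != NULL && isa_bridge->revision == wanted)
        match = 1;

    /* Put on every path, not only when the revision check succeeds. */
    model_dev_put(isa_bridge);
    return match;
}
```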
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
__FUNCTION__ is gcc-specific, use __func__
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is necessary to set the dongle type on the nsc driver in order to get
it to work correctly. Thinkpads all appear to use dongle type 9. This
patch defaults nsc devices with an IBM PnP descriptor to use type 9.
Signed-off-by: Matthew Garrett <mjg59@srcf.ucam.org>
Signed-off-by: Ben Collins <ben.collins@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Preface: The "Broadcom" device is on unreleased hardware, so I can't
disclose the actual model.
When the Dell 370 and 410 BT adapters are put into BT radio mode, they
need to be prepared like many other Broadcom adapters.
Also, add the HCI_RESET quirk for Broadcom 2046 devices. Reference for this
bug: https://launchpad.net/bugs/249448
Signed-off-by: Michael Frey <michael.frey@canonical.com>
Signed-off-by: Mario Limonciello <Mario_Limonciello@Dell.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Ben Collins <ben.collins@canonical.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove the packed attribute from PofTimStamp_tag in the hysdn driver as the
thing being packed is just an array of chars and so is unpackable.
This deals with a compiler warning:
In file included from drivers/isdn/hysdn/hysdn_boot.c:19:
drivers/isdn/hysdn/hysdn_pof.h:63: warning: 'packed' attribute ignored for field of type 'unsigned char[40]'
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Acked-by: Karsten Keil <kkeil@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adapt the tg3 driver to use the reworked PCI PM and make it use the
exported PCI PM core functions instead of accessing the PCI PM registers
directly by itself.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix direct casts of pointers to u32 in the InterPhase ATM driver. These are
all arguments being passed to printk() calls. So drop the cast and change the
%x to a %p.
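The same change in a userspace setting, with snprintf() standing in for printk(). The helper name format_address() is hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Format an address for a log message. %p takes the pointer itself,
 * so no cast to a 32-bit integer is needed (such a cast would
 * truncate the value on 64-bit targets).
 */
int format_address(char *buf, size_t len, void *addr)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "addr=%p", addr);
}
```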
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Chas Williams <chas@cmf.nrl.navy.mil>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix const assignment/discard warnings in the ATM networking driver.
The lane2_assoc_ind() function needed its arguments changing to match changes
in the lane2_ops struct (patch 61c33e0129
"atm: use const where reasonable").
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Chas Williams <chas@cmf.nrl.navy.mil>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The rationale is:
* use u32 consistently
* no need to do LCG on values from (better) get_random_bytes
* use more data from get_random_bytes for secondary seeding
* don't reduce state space on srandom32()
* enforce state variable initialization restrictions
Note: the second paper has a version of random32() with even longer period
and a version of random64() if needed.
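For illustration, a three-component Tausworthe generator with seed restrictions can be sketched as below. The shift and mask constants are those of L'Ecuyer's taus88 as I recall them, and the minimum seed values 2, 8 and 16 are an assumption that this message does not state; the names rnd_seed() and rnd_next() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct rnd_state {
    uint32_t s1, s2, s3;
};

/* Raise a seed to the minimum its component needs (assumed minimums). */
static uint32_t seed_min(uint32_t x, uint32_t m)
{
    return (x < m) ? x + m : x;
}

/* Seed each component from its own value so no state space is lost. */
void rnd_seed(struct rnd_state *st, uint32_t a, uint32_t b, uint32_t c)
{
    assert(st != NULL);
    st->s1 = seed_min(a, 2);
    st->s2 = seed_min(b, 8);
    st->s3 = seed_min(c, 16);
}

/* One step of a single Tausworthe component. */
static uint32_t taus_step(uint32_t s, int a, int b, uint32_t c, int d)
{
    return ((s & c) << d) ^ (((s << a) ^ s) >> b);
}

uint32_t rnd_next(struct rnd_state *st)
{
    st->s1 = taus_step(st->s1, 13, 19, 4294967294u, 12);
    st->s2 = taus_step(st->s2, 2, 25, 4294967288u, 4);
    st->s3 = taus_step(st->s3, 3, 11, 4294967280u, 17);
    return st->s1 ^ st->s2 ^ st->s3;
}
```

The masks discard the low 1, 3 and 4 bits of the respective components, which is why a seed with only those bits set must be rejected or adjusted.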
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
When bridging interfaces with different MTUs, the bridge correctly chooses
the minimum of the MTUs of the physical devices as the bridge's MTU. But
when a frame is passed which fits through the incoming, but not through
the outgoing interface, a "Fragmentation Needed" packet is generated.
However, the propagated MTU is hardcoded to 1500, which is wrong in this
situation. The sender will repeat the packet again with the same frame
size, and the same problem will occur again.
Instead of sending 1500, the (correct) MTU value of the bridge is now sent
via PMTU. To achieve this, the corresponding rtable structure is stored
in its net_bridge structure.
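The value that should be advertised is the bridge MTU, i.e. the minimum over its ports. A minimal sketch, assuming a plain array of port MTUs and a fallback of 1500 for a bridge without ports (the function name and the fallback are illustrative assumptions):

```c
#include <assert.h>
#include <stddef.h>

#define DEFAULT_MTU_MODEL 1500u /* assumed fallback when there are no ports */

/* Returns the smallest port MTU, or the fallback for an empty bridge. */
unsigned int bridge_min_mtu(const unsigned int *port_mtus, size_t n)
{
    unsigned int mtu = DEFAULT_MTU_MODEL;

    assert(port_mtus != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (i == 0 || port_mtus[i] < mtu)
            mtu = port_mtus[i];
    }
    return mtu;
}
```

With ports at 1500, 1400 and 9000 the "Fragmentation Needed" reply should carry 1400, not the hardcoded 1500.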
Modified to get rid of fake_net_device as well.
Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The iov_iter_advance() function would look at the iov->iov_len entry
even though it might have iterated over the whole array, and iov was
pointing past the end. This would cause DEBUG_PAGEALLOC to trigger a
kernel page fault if the allocation was at the end of a page, and the
next page was unallocated.
The quick fix is to just change the order of the tests: check that there
is any iovec data left before we check the iov entry itself.
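The reordered test can be sketched in userspace as follows. struct iter_model is a hypothetical, trimmed-down iterator; only the loop condition mirrors the idea described above.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>

/* Hypothetical, trimmed-down iterator over an iovec array. */
struct iter_model {
    const struct iovec *iov;
    size_t iov_offset; /* offset into the current segment */
    size_t count;      /* bytes left in the whole iterator */
};

void iter_advance(struct iter_model *i, size_t bytes)
{
    const struct iovec *iov = i->iov;
    size_t base = i->iov_offset;

    assert(bytes <= i->count);
    /*
     * Check i->count before iov->iov_len: once count is zero, iov may
     * point one past the last segment and must not be dereferenced.
     * The iov_len test only exists to skip zero-length segments.
     */
    while (bytes || (i->count && !iov->iov_len)) {
        size_t left = iov->iov_len - base;
        size_t copy = bytes < left ? bytes : left;

        i->count -= copy;
        bytes -= copy;
        base += copy;
        if (iov->iov_len == base) {
            iov++;
            base = 0;
        }
    }
    i->iov = iov;
    i->iov_offset = base;
}
```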
Thanks to Alexey Dobriyan for finding this case, and testing the fix.
Reported-and-tested-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: <stable@kernel.org> [2.6.25.x, 2.6.26.x]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We zero-fill them like we are supposed to, and that's all fine. It's
only an error if the 'romfs_copyfrom()' routine isn't able to fill the
data that is supposed to be there.
Most of the patch is really just re-organizing the code a bit, and using
separate variables for the error value and for how much of the page we
actually filled from the filesystem.
Reported-and-tested-by: Chris Fester <cfester@wms.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Matt Waddel <matt.waddel@freescale.com>
Cc: Greg Ungerer <gerg@snapgear.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The mutex is released on a successful return, so it would seem that it
should be released on an error return as well.
The semantic patch that finds this problem is as follows:
(http://www.emn.fr/x-info/coccinelle/)
// <smpl>
@@
expression l;
@@
mutex_lock(l);
... when != mutex_unlock(l)
when any
when strict
(
if (...) { ... when != mutex_unlock(l)
+ mutex_unlock(l);
return ...;
}
|
mutex_unlock(l);
)
// </smpl>
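In plain C, one common way to keep the lock balanced is a single unlock label shared by the error and success paths. The sketch below uses a hypothetical single-threaded lock model so the balance can be checked; the semantic patch itself inserts mutex_unlock() before the return instead.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical single-threaded stand-in for a mutex. */
struct lock_model {
    int held;
};

void lock_model_lock(struct lock_model *l)
{
    assert(l != NULL && !l->held);
    l->held = 1;
}

void lock_model_unlock(struct lock_model *l)
{
    assert(l != NULL && l->held);
    l->held = 0;
}

/* Stores new_value under the lock; rejects negative values with -EINVAL. */
int update_under_lock(struct lock_model *l, int *value, int new_value)
{
    int ret = 0;

    lock_model_lock(l);
    if (new_value < 0) {
        ret = -EINVAL;
        goto out_unlock; /* the error path shares the unlock */
    }
    *value = new_value;
out_unlock:
    lock_model_unlock(l);
    return ret;
}
```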
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
SH7763 has the same Ethernet core as SH7710/SH7712.
The positions of some registers differ, but the basic part is the same.
This adds SH7763 Ethernet support to sh_eth.
Signed-off-by: Nobuhiro Iwamatsu <iwamatsu.nobuhiro@renesas.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Trying to build with CONFIG_NE2000=m fails with:
scripts/mod/modpost -o /tmp/tmp/linux-2.6.27-rc1/Module.symvers -S -s
ERROR: "NS8390_init" [drivers/net/ne.ko] undefined!
This is because the split of 8390 into pausing and non-pausing
versions was incompletely propagated to ne.c. This fixes it.
Signed-off-by: Mikael Pettersson <mikpe@it.uu.se>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
The new kgdb architecture specific handler registers and unregisters
dynamically for exceptions depending on when you configure a kgdb I/O
driver.
Aside from initializing the exceptions earlier in the boot process,
kgdb should have no impact on a device when it is compiled in so long
as an I/O module is not configured for use.
There have been quite a number of contributors during the existence of
this patch (see arch/mips/kernel/kgdb.c). Most recently Jason
re-wrote the mips kgdb logic to use the die notification handlers.
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
This patch explicitly removes the kgdb implementation for MIPS. It
is intended to be followed by a patch that adds a kgdb implementation
for MIPS that makes use of the kgdb core in the kernel.
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>