Fix a performance degradation introduced in 2.6.17. (30% degradation
running dbench with 16 threads)
Commit 21730eed11de42f22afcbd43f450a1872a0b5ea1, which claims to make
EXT2_DEBUG work again, moves the taking of the kernel lock out of
debug-only code in ext2_count_free_inodes and ext2_count_free_blocks and
into ext2_statfs.
The same problem was fixed in ext3 by removing the lock completely (commit
5b11687924e40790deb0d5f959247ade82196665)
Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Remove definitions of PAGE_* from the user view
Delete unnecessary comments referring to the size of pages
Only include <asm-generic> if we're in __KERNEL__
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
while porting the -rt tree to 2.6.18-rc7 i noticed the following
screaming-IRQ scenario on an SMP system:
2274 0Dn.:1 0.001ms: do_IRQ+0xc/0x103 <= (ret_from_intr+0x0/0xf)
2274 0Dn.:1 0.010ms: do_IRQ+0xc/0x103 <= (ret_from_intr+0x0/0xf)
2274 0Dn.:1 0.020ms: do_IRQ+0xc/0x103 <= (ret_from_intr+0x0/0xf)
2274 0Dn.:1 0.029ms: do_IRQ+0xc/0x103 <= (ret_from_intr+0x0/0xf)
2274 0Dn.:1 0.039ms: do_IRQ+0xc/0x103 <= (ret_from_intr+0x0/0xf)
2274 0Dn.:1 0.048ms: do_IRQ+0xc/0x103 <= (ret_from_intr+0x0/0xf)
2274 0Dn.:1 0.058ms: do_IRQ+0xc/0x103 <= (ret_from_intr+0x0/0xf)
2274 0Dn.:1 0.068ms: do_IRQ+0xc/0x103 <= (ret_from_intr+0x0/0xf)
2274 0Dn.:1 0.077ms: do_IRQ+0xc/0x103 <= (ret_from_intr+0x0/0xf)
2274 0Dn.:1 0.087ms: do_IRQ+0xc/0x103 <= (ret_from_intr+0x0/0xf)
2274 0Dn.:1 0.097ms: do_IRQ+0xc/0x103 <= (ret_from_intr+0x0/0xf)
as it turns out, the bug is caused by handle_level_irq(), which if it
races with another CPU already handling this IRQ, it _unmasks_ the IRQ
line on the way out. This is not how 2.6.17 works, and we introduced
this bug in one of the early genirq cleanups right before it went into
-mm. (the bug was not in the genirq patchset for a long time, and we
didnt notice the bug due to the lack of -rt rebase to the new genirq
code. -rt, and hardirq-preemption in particular opens up such races much
wider than anything else.)
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
(And reset it on new thread creation)
It turns out that eflags is important to save and restore not just
because of iopl, but due to the magic bits like the NT bit, which we
don't want leaking between different threads.
Tested-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
[ATM] CLIP: Do not refer freed skbuff in clip_mkip().
[NET]: Drop tx lock in dev_watchdog_up
[PACKET]: Don't truncate non-linear skbs with mmaped IO
[NET]: Mark frame diverter for future removal.
[NETFILTER]: Add secmark headers to header-y
[ATM]: linux-atm-general mailing list is subscribers only
[ATM]: [he] when transmit fails, unmap the dma regions
[TCP] tcp-lp: update information to MAINTAINERS
[TCP] tcp-lp: bug fix for oops in 2.6.18-rc6
[BRIDGE]: random extra bytes on STP TCN packet
[IPV6]: Accept -1 for IPV6_TCLASS
[IPV6]: Fix tclass setting for raw sockets.
[IPVS]: remove the debug option go ip_vs_ftp
[IPVS]: Make sure ip_vs_ftp ports are valid
[IPVS]: auto-help for ip_vs_ftp
[IPVS]: Document the ports option to ip_vs_ftp in kernel-parameters.txt
[TCP]: Turn ABC off.
[NEIGH]: neigh_table_clear() doesn't free stats
* master.kernel.org:/home/rmk/linux-2.6-arm:
[ARM] 3815/1: headers_install support for ARM
[ARM] 3794/1: S3C24XX: do not defined set_irq_wake when no CONFIG_PM
[ARM] 3793/1: S3C2412: fix wrong serial info struct
[ARM] 3780/1: Fix iop321 cpuid
[ARM] 3786/1: pnx4008: update defconfig
[ARM] 3785/1: S3C2412: Fix idle code as default uses wrong clocks
[ARM] 3784/1: S3C2413: fix config for MACH_S3C2413/MACH_SMDK2413
Move kernel-only #includes into #ifdef __KERNEL__, so that
headers_install target can be used on ARM.
Signed-off-by: Ralph Siemsen <ralphs@netwinder.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This patch corrects the buffer length checking in the
sys_getdomainname() implementation for sparc/sparc64.
Signed-off-by: Andy Walker <andy@puszczka.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In clip_mkip(), skb->dev is dereferenced after clip_push(),
which frees up skb.
Advisory: AD_LAB-06009 (<adlab@venustech.com.cn>).
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patch from Ben Dooks
Do not define set_irq_wake as a real function if
the CONFIG_PM option is not set.
Fixes bug reported by Thomas Gleixner.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Ben Dooks
The S3C2440 serial info struct is being passed
through the S3C2412 serial info struct probe
routine.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Thomas Glexiner <tglx@linutronix.de>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
If the user tries to traverse to the next node of the
last node, we get NULL in current_node and a zero phandle
returned. That's fine, but if the user tries to obtain
properties in that state, we try to dereference a NULL
pointer in the downcall to the of_*() routines.
So protect against that.
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix lockdep warning with GRE, iptables and Speedtouch ADSL, PPP over ATM.
On Sat, Sep 02, 2006 at 08:39:28PM +0000, Krzysztof Halasa wrote:
>
> =======================================================
> [ INFO: possible circular locking dependency detected ]
> -------------------------------------------------------
> swapper/0 is trying to acquire lock:
> (&dev->queue_lock){-+..}, at: [<c02c8c46>] dev_queue_xmit+0x56/0x290
>
> but task is already holding lock:
> (&dev->_xmit_lock){-+..}, at: [<c02c8e14>] dev_queue_xmit+0x224/0x290
>
> which lock already depends on the new lock.
This turns out to be a genuine bug. The queue lock and xmit lock are
intentionally taken out of order. Two things are supposed to prevent
dead-locks from occuring:
1) When we hold the queue_lock we're supposed to only do try_lock on the
tx_lock.
2) We always drop the queue_lock after taking the tx_lock and before doing
anything else.
>
> the existing dependency chain (in reverse order) is:
>
> -> #1 (&dev->_xmit_lock){-+..}:
> [<c012e7b6>] lock_acquire+0x76/0xa0
> [<c0336241>] _spin_lock_bh+0x31/0x40
> [<c02d25a9>] dev_activate+0x69/0x120
This path obviously breaks assumption 1) and therefore can lead to ABBA
dead-locks.
I've looked at the history and there seems to be no reason for the lock
to be held at all in dev_watchdog_up. The lock appeared in day one and
even there it was unnecessary. In fact, people added __dev_watchdog_up
precisely in order to get around the tx lock there.
The function dev_watchdog_up is already serialised by rtnl_lock since
its only caller dev_activate is always called under it.
So here is a simple patch to remove the tx lock from dev_watchdog_up.
In 2.6.19 we can eliminate the unnecessary __dev_watchdog_up and
replace it with dev_watchdog_up.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Non-linear skbs are truncated to their linear part with mmaped IO.
Fix by using skb_copy_bits instead of memcpy.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
The code for frame diverter is unmaintained and has bitrotted.
The number of users is very small and the code has lots of problems.
If anyone is using it, they maybe exposing themselves to bad packet attacks.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch includes xt_SECMARK.h and xt_CONNSECMARK.h to the kernel
headers which are exported via 'make headers_install'. This is needed to
allow userland code to be built correctly with these features.
Please apply, and consider for inclusion with 2.6.18 as a bugfix.
Signed-off-by: James Morris <jmorris@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As the automated reply I got to my last ATM patch shows, the
linux-atm-general mailing list is subscribers-only.
Signed-off-by: Roland Dreier <roland@digitalvampire.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sorry that the patch submited yesterday still contain a small bug.
This version have already been test for hours with BT connections. The
oops is now difficult to reproduce.
Signed-off-by: Wong Hoi Sing Edison <hswong3i@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We seem to send 3 extra bytes in a TCN, which will be whatever happens
to be on the stack. Thanks to Aji_Srinivas@emc.com for seeing.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch should add support for -1 as "default" IPv6 traffic class,
as specified in IETF RFC3542 §6.5. Within the kernel, it seems tclass
< 0 is already handled, but setsockopt, getsockopt and recvmsg calls
won't accept it from userland.
Signed-off-by: Remi Denis-Courmont <rdenis@simphalempin.com>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
np->cork.tclass is used only in cork'ed context.
Otherwise, np->tclass should be used.
Bug#7096 reported by Remi Denis-Courmont <rdenis@simphalempin.com>.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch makes the debuging behaviour of this code more consistent
with the rest of IPVS.
Signed-Off-By: Simon Horman <horms@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
I'm not entirely sure what happens in the case of a valid port,
at best it'll be silently ignored. This patch ignores them a little
more verbosely.
Signed-Off-By: Simon Horman <horms@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fill in a help message for the ports option to ip_vs_ftp
Signed-Off-By: Simon Horman <horms@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
I'm not sure if documenting this here is appropriate, but
if it is, here is some text to put there.
Signed-Off-By: Simon Horman <horms@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Turn Appropriate Byte Count off by default because it unfairly
penalizes applications that do small writes. Add better documentation
to describe what it is so users will understand why they might want to
turn it on.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
neigh_table_clear() doesn't free tbl->stats.
Found by Alexey Kuznetsov. Though Alexey considers this
leak minor for mainstream, I still believe that cleanup
code should not forget to free some of the resources :)
At least, this is critical for OpenVZ with virtualized
neighbour tables.
Signed-Off-By: Kirill Korotaev <dev@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
[PATCH 9/9] s390: qeth driver fixes [6/6]
From: Frank Pavlic <fpavlic@de.ibm.com>
- Hipersockets has no IPV6 support, thus prevent issueing
SETRTG_IPV6 control commands on Hipersockets devices.
- fixed error handling in qeth_sysfs_(un)register
Signed-off-by: Frank Pavlic <fpavlic@de.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
[PATCH 8/9] s390: qeth driver fixes [5/6]
From: Frank Pavlic <fpavlic@de.ibm.com>
fix kernel panic in qdio queue handling.
qeth_qdio_clear_card() could be invoked by 2 CPUs
simultaneously (for example reboot event and recovery).
Signed-off-by: Frank Pavlic <fpavlic@de.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
[PATCH 7/9] s390: qeth driver fixes [4/6]
From: Frank Pavlic <fpavlic@de.ibm.com>
- fix kernel crash due to race,
set card->state to SOFTSETUP after
card and card->dev are initialized properly.
- remove CONFIG_QETH_PERF_STATS, use sysfs attribute instead,
as we want to have the ability to turn on/off the
statistics at runtime.
Signed-off-by: Frank Pavlic <fpavlic@de.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
[PATCH 6/9] s390: qeth driver fixes [3/6]
From: Frank Pavlic <fpavlic@de.ibm.com>
fixed kernel panic caused by qeth driver:
Using a bonding device qeth driver will realloc
headroom for every skb coming from the bond device.
Once this happens qeth frees the original skb and
set the skb pointer to the new realloced skb.
Under heavy transmit workload (e.g.UDP streams) through bond
network device the qdio output queue might get full.
In this case we return with EBUSY from qeth_send_packet.
Returning to qeth_hard_start_xmit routine
the skb address on the stack still points to the old address,
which has been freed before.
Returning from qeth_hard_start_xmit with EBUSY results in
requeuing the skb. In this case it corrupts the qdisc queue
and results in kernel panic.
Signed-off-by: Frank Pavlic <fpavlic@de.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
[PATCH 5/9] s390: qeth driver fixes [2/6]
From: Frank Pavlic <fpavlic@de.ibm.com>
- fixed error handling in create_device_attributes
- fixed some minor bugs in IPv4
and IPv6 address checking
Signed-off-by: Frank Pavlic <fpavlic@de.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
[PATCH 4/9] s390: qeth driver fixes [1/6]
From: Frank Pavlic <fpavlic@de.ibm.com>
- Drop incoming packets with vlan_tag set
if card->vlangrp is not set.
- use always vlan_hwaccel_rx to pass
vlan frames to the stack.
- fix recovery problem. Device was recovered
properly but still not working.
netif_carrier_on call right before
recovery start fixes it.
Signed-off-by: Frank Pavlic <fpavlic@de.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
[PATCH 3/9] s390: Makefile cleanup
From: Frank Pavlic <fpavlic@de.ibm.com>
remove CONFIG_MPC from Makefile which was
introduced accidently in the past.
Signed-off-by: Frank Pavlic <fpavlic@de.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
[PATCH 2/9] s390: netiucv driver fixes
From: Frank Pavlic <fpavlic@de.ibm.com>
- missing lock initialization added
- avoid duplicate iucv-interfaces to the same peer
- rw-lock added for manipulating the list of
defined iucv connections
Signed-off-by: Frank Pavlic <fpavlic@de.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Hi Jeff,
this is a RESEND of the nine s390 network driver patches.
I finally found that my kmail corrupted almost every patch
I sent the last time. Please apply these 9 patches and forget
about my first attempt! Sorry for the delay, I had some fights
with sendmail, IMAP and mutt configuration.
Frank
[RESEND PATCH 1/9] s390: minor s390 network driver fixes
From: Frank Pavlic <fpavlic@de.ibm.com>
- iucv driver:
use do { } while (0) constructs
instead of empty defines to avoid compile bugs.
- ctc driver:
missing lock initialization added
- lcs driver:
BUG_ON usage was removed accidently
with the last lcs patch.
Put them back in place.
Signed-off-by: Frank Pavlic <fpavlic@de.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
* 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband:
RDMA/cma: Increase the IB CM retry count in CMA
IPoIB: Retry failed send-only multicast group joins
IB/srp: Don't schedule reconnect from srp
In some special case (padding because of sync or umount) it can be possible
that summary information is not fit to the end of the erase block. In
these cases the collecting of summary is disabled for this erase block.
The problem was that this was not respected by jffs2_sum_add_kvec(). This
patch fix this bug.
Signed-off-by: Ferenc Havasi <havasi@inf.u-szeged.hu>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
In the case of data-pad-ecc-pad-data... layout the oob start position has
to be sizeof(data) in nand_write_oob_syndrom().
In nand_fill_oob() we need to copy to buf + buffer offset instead of buf +
write offset.
Signed-off-by: Vitaly Wool <vwool@ru.mvista.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
ext3-get-blocks support caused ~20% degrade in Sequential read
performance (tiobench). Problem is with marking the buffer boundary
so IO can be submitted right away. Here is the patch to fix it.
2.6.18-rc6:
-----------
# ./iotest
1048576+0 records in
1048576+0 records out
4294967296 bytes (4.3 GB) copied, 75.2726 seconds, 57.1 MB/s
real 1m15.285s
user 0m0.276s
sys 0m3.884s
2.6.18-rc6 + fix:
-----------------
[root@elm3a241 ~]# ./iotest
1048576+0 records in
1048576+0 records out
4294967296 bytes (4.3 GB) copied, 62.9356 seconds, 68.2 MB/s
The boundary block check in ext3_get_blocks_handle needs to be adjusted
against the count of blocks mapped in this call, now that it can map
more than one block.
Signed-off-by: Suparna Bhattacharya <suparna@in.ibm.com>
Tested-by: Badari Pulavarty <pbadari@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
I think there is a bug in kmod.c: In __call_usermodehelper(), when
kernel_thread(wait_for_helper, ...) return success, since wait_for_helper()
might call complete() at any time, the sub_info should not be used any
more.
Normally wait_for_helper() take a long time to finish, you may not get
problem for most of the case. But if you remove /sbin/modprobe, it may
become easier for you to get a oop in khelper.
Cc: Matt Helsley <matthltc@us.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>