Convert s_blocks_count to s_blocks_count_lo
This helps in finding BUGs due to direct partial access of
these split 64 bit values
Also fix direct partial access in ext4 code
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Convert bg_inode_bitmap and bg_inode_table to bg_inode_bitmap_lo
and bg_inode_table_lo. This helps in finding BUGs due to
direct partial access of these split 64 bit values
Also fix one direct partial access
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Convert bg_block_bitmap to bg_block_bitmap_lo
This helps in catching some BUGS due to direct
partial access of these split fields.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
This feature relaxes check restrictions on where each block groups meta
data is located within the storage media. This allows for the allocation
of bitmaps or inode tables outside the block group boundaries in cases
where bad blocks forces us to look for new blocks which the owning block
group can not satisfy. This will also allow for new meta-data allocation
schemes to improve performance and scalability.
Signed-off-by: Jose R. Santos <jrs@us.ibm.com>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
In pass1 of e2fsck, every inode table in the fileystem is scanned and checked,
regardless of whether it is in use. This is this the most time consuming part
of the filesystem check. The unintialized block group feature can greatly
reduce e2fsck time by eliminating checking of uninitialized inodes.
With this feature, there is a a high water mark of used inodes for each block
group. Block and inode bitmaps can be uninitialized on disk via a flag in the
group descriptor to avoid reading or scanning them at e2fsck time. A checksum
of each group descriptor is used to ensure that corruption in the group
descriptor's bit flags does not cause incorrect operation.
The feature is enabled through a mkfs option
mke2fs /dev/ -O uninit_groups
A patch adding support for uninitialized block groups to e2fsprogs tools has
been posted to the linux-ext4 mailing list.
The patches have been stress tested with fsstress and fsx. In performance
tests testing e2fsck time, we have seen that e2fsck time on ext3 grows
linearly with the total number of inodes in the filesytem. In ext4 with the
uninitialized block groups feature, the e2fsck time is constant, based
solely on the number of used inodes rather than the total inode count.
Since typical ext4 filesystems only use 1-10% of their inodes, this feature can
greatly reduce e2fsck time for users. With performance improvement of 2-20
times, depending on how full the filesystem is.
The attached graph shows the major improvements in e2fsck times in filesystems
with a large total inode count, but few inodes in use.
In each group descriptor if we have
EXT4_BG_INODE_UNINIT set in bg_flags:
Inode table is not initialized/used in this group. So we can skip
the consistency check during fsck.
EXT4_BG_BLOCK_UNINIT set in bg_flags:
No block in the group is used. So we can skip the block bitmap
verification for this group.
We also add two new fields to group descriptor as a part of
uninitialized group patch.
__le16 bg_itable_unused; /* Unused inodes count */
__le16 bg_checksum; /* crc16(sb_uuid+group+desc) */
bg_itable_unused:
If we have EXT4_BG_INODE_UNINIT not set in bg_flags
then bg_itable_unused will give the offset within
the inode table till the inodes are used. This can be
used by fsck to skip list of inodes that are marked unused.
bg_checksum:
Now that we depend on bg_flags and bg_itable_unused to determine
the block and inode usage, we need to make sure group descriptor
is not corrupt. We add checksum to group descriptor to
detect corruption. If the descriptor is found to be corrupt, we
mark all the blocks and inodes in the group used.
Signed-off-by: Avantika Mathur <mathur@us.ibm.com>
Signed-off-by: Andreas Dilger <adilger@clusterfs.com>
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
CONFIG_EXT4_INDEX is not an exposed config option in the kernel, and it is
unconditionally defined in ext4_fs.h. tune2fs is already able to turn off
dir indexing, so at this point it's just cluttering up the code. Remove
it.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Fragment support in ext2/3/4 was never implemented, and it probably will
never be implemented. So remove it from ext4.
Signed-off-by: Coly Li <coyli@suse.de>
Acked-by: Andreas Dilger <adilger@clusterfs.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
JBD2: Replace slab allocations with page allocations
JBD2 allocate memory for committed_data and frozen_data from slab. However
JBD2 should not pass slab pages down to the block layer. Use page allocator
pages instead. This will also prepare JBD for the large blocksize patchset.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
JBD: Replace slab allocations with page allocations
JBD allocate memory for committed_data and frozen_data from slab. However
JBD should not pass slab pages down to the block layer. Use page allocator pages instead. This will also prepare JBD for the large blocksize patchset.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc:
net: libertas sdio driver
mmc: at91_mci: cleanup: use MCI_ERRORS
mmc: possible leak in mmc_read_ext_csd
A sysctl method was added to enable and disable debugging levels. After
further review, it was decided that there are better approaches to doing this
and the sysctl methodology isn't really desirable. This patch removes the
sysctl code from 9p.
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
This patch moves transport dynamic registration and matching to the net
module to prevent a bad Kconfig dependency between the net and fs 9p modules.
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
The 9P2000 protocol requires the authentication and permission checks to be
done in the file server. For that reason every user that accesses the file
server tree has to authenticate and attach to the server separately.
Multiple users can share the same connection to the server.
Currently v9fs does a single attach and executes all I/O operations as a
single user. This makes using v9fs in multiuser environment unsafe as it
depends on the client doing the permission checking.
This patch improves the 9P2000 support by allowing every user to attach
separately. The patch defines three modes of access (new mount option
'access'):
- attach-per-user (access=user) (default mode for 9P2000.u)
If a user tries to access a file served by v9fs for the first time, v9fs
sends an attach command to the server (Tattach) specifying the user. If
the attach succeeds, the user can access the v9fs tree.
As there is no uname->uid (string->integer) mapping yet, this mode works
only with the 9P2000.u dialect.
- allow only one user to access the tree (access=<uid>)
Only the user with uid can access the v9fs tree. Other users that attempt
to access it will get EPERM error.
- do all operations as a single user (access=any) (default for 9P2000)
V9fs does a single attach and all operations are done as a single user.
If this mode is selected, the v9fs behavior is identical with the current
one.
Signed-off-by: Latchesar Ionkov <lucho@ionkov.net>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
This patch abstracts out the interfaces to underlying transports so that
new transports can be added as modules. This should also allow kernel
configuration of transports without ifdef-hell.
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Almost identical except for the extra DR_LEN_8 and the different
DR_CONTROL_RESERVED defines.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Conflicts:
include/asm-x86/Kbuild
Same file, except for whitespace, comment formatting and the
.long/.quad delta which can be solved by a define.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Same file, except for whitespace, comment formatting and:
32-bit: if((unsigned int) addr >= (unsigned int) high_memory)
64-bit: if((unsigned long) addr >= (unsigned long) high_memory)
where the latter can be used safely for both.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Conflicts:
include/asm-x86/floppy_32.h
include/asm-x86/floppy_64.h
commit 554d284ba9 added _GPF_NORETRY
to floppy_64.h to prevent OOM killer on floppy DMA allocations.
Apply the same to the 32 bit variant.
Found during the attempt to unify the _32/_64 variants. Seperate commit
to document the resulting code change.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Same file, except for whitespace, comment formatting and the two variants
of fb_is_primary_device()
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Same file, except for whitespace, comment formatting and:
32-bit: unsigned long *virt_addr = va;
64-bit: unsigned int *virt_addr = va;
Both can be safely replaced by:
u32 i, *virt_addr = va;
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Same file, except for whitespace, comment formatting and the extra
function prototype usc_tsc_delay() in _32.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Same file, except for whitespace, comment formatting and the extra
defines in _64, which are conditional on VSMP anyway.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Same file, except for whitespace, comment formatting and the extra
DEBUG_PAGE_ALLOC function in _32.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Same file, except for whitespace, comment formatting and the
usage of wbinvd() instead of asm volatile("wbinvd":::"memory"), which is
the same.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Merge errno.h, resource.h, rtc.h, sections.h, serial.h and sockios.h,
where i386 and x86_64 have no or only trivial comment/include guard
differences.
Build tested on both 32-bit and 64-bit, and booted on 64-bit.
[tglx: fixup Kbuild as well]
Signed-off-by: Roland Dreier <roland@digitalvampire.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Merge 32/64-bit headers that simply redirect to asm-generic
[tglx: fixup Kbuild as well]
Signed-off-by: Brian Gerst <bgerst@didntduck.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
convert mm_context_t semaphore to a mutex.
Signed-off-by: Luiz Fernando N. Capitulino <lcapitulino@mandriva.com.br>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Add support for and use the multi-byte NOPs recently documented to be
available on all PentiumPro and later processors.
This patch only applies cleanly on top of the "x86: misc.
constifications" patch sent earlier.
[ tglx: arch/x86 adaptation ]
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
arch/x86/kernel/alternative.c | 23 ++++++++++++++++++++++-
include/asm-x86/processor_32.h | 22 ++++++++++++++++++++++
include/asm-x86/processor_64.h | 22 ++++++++++++++++++++++
3 files changed, 66 insertions(+), 1 deletion(-)
Add missing IRQs and IRQ descriptions to /proc/interrupts.
/proc/interrupts is most useful when it displays every IRQ vector in use by
the system, not just those somebody thought would be interesting.
This patch inserts the following vector displays to the i386 and x86_64
platforms, as appropriate:
rescheduling interrupts
TLB flush interrupts
function call interrupts
thermal event interrupts
threshold interrupts
spurious interrupts
A threshold interrupt occurs when ECC memory correction is occuring at too
high a frequency. Thresholds are used by the ECC hardware as occasional
ECC failures are part of normal operation, but long sequences of ECC
failures usually indicate a memory chip that is about to fail.
Thermal event interrupts occur when a temperature threshold has been
exceeded for some CPU chip. IIRC, a thermal interrupt is also generated
when the temperature drops back to a normal level.
A spurious interrupt is an interrupt that was raised then lowered by the
device before it could be fully processed by the APIC. Hence the apic sees
the interrupt but does not know what device it came from. For this case
the APIC hardware will assume a vector of 0xff.
Rescheduling, call, and TLB flush interrupts are sent from one CPU to
another per the needs of the OS. Typically, their statistics would be used
to discover if an interrupt flood of the given type has been occuring.
AK: merged v2 and v4 which had some more tweaks
AK: replace Local interrupts with Local timer interrupts
AK: Fixed description of interrupt types.
[ tglx: arch/x86 adaptation ]
[ mingo: small cleanup ]
Signed-off-by: Joe Korty <joe.korty@ccur.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Tim Hockin <thockin@hockin.org>
Cc: Andi Kleen <ak@suse.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The additional struct member of user_desc can be made conditional for
64 bit compiles.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Aside of the register defines the content can be shared.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>