ext4_iget() returns -ESTALE if invoked on a deleted inode, in order to
report errors to NFS properly. However, in ext4_lookup(), this
-ESTALE can be propagated to userspace if the filesystem is corrupted
such that a directory entry references a deleted inode. This leads to
a misleading error message - "Stale NFS file handle" - and confusion
on the part of the admin.
The bug can be easily reproduced by creating a new filesystem, making
a link to an unused inode using debugfs, then mounting and attempting
to ls -l said link.
This patch thus changes ext4_lookup to return -EIO if it receives
-ESTALE from ext4_iget(), as ext4 does for other filesystem metadata
corruption; and also invokes the appropriate ext*_error functions when
this case is detected.
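
The error translation described above can be sketched in userspace C as
follows. This is only an illustration, not the patch itself: mock_iget()
and mock_lookup() are hypothetical stand-ins for ext4_iget() and
ext4_lookup(), and the error reporting step is reduced to a comment.

```c
#include <errno.h>

/* Hypothetical stand-in for ext4_iget(): 0 on success, negative errno on failure. */
static int mock_iget(int inode_is_deleted)
{
	return inode_is_deleted ? -ESTALE : 0;
}

/*
 * Hypothetical stand-in for the lookup path. A directory entry that
 * points at a deleted inode means the metadata is corrupted, so the
 * -ESTALE from the inode read is translated to -EIO before it can
 * reach userspace.
 */
static int mock_lookup(int inode_is_deleted)
{
	int err = mock_iget(inode_is_deleted);

	if (err == -ESTALE) {
		/* the real code would also report the corruption here */
		return -EIO;
	}
	return err;
}
```

With this shape, mock_lookup(1) yields -EIO rather than -ESTALE, while a
successful lookup is passed through unchanged.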
Signed-off-by: Bryan Donlan <bdonlan@gmail.com>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The find_group_flex() inode allocator is now only used if the
filesystem is mounted using the "oldalloc" mount option. It is
replaced with the original Orlov allocator that has been updated for
flex_bg filesystems (it should behave the same way if flex_bg is
disabled). The inode allocator now functions by taking into account
each flex_bg group, instead of each block group, when deciding whether
or not it's time to allocate a new directory into a fresh flex_bg.
The block allocator has also been changed so that the first block
group in each flex_bg is preferred for use for storing directory
blocks. This keeps directory blocks close together, which is good for
speeding up e2fsck since large directories are more likely to look
like this:
debugfs: stat /home/tytso/Maildir/cur
Inode: 1844562 Type: directory Mode: 0700 Flags: 0x81000
Generation: 1132745781 Version: 0x00000000:0000ad71
User: 15806 Group: 15806 Size: 1060864
File ACL: 0 Directory ACL: 0
Links: 2 Blockcount: 2072
Fragment: Address: 0 Number: 0 Size: 0
ctime: 0x499c0ff4:164961f4 -- Wed Feb 18 08:41:08 2009
atime: 0x499c0ff4:00000000 -- Wed Feb 18 08:41:08 2009
mtime: 0x49957f51:00000000 -- Fri Feb 13 09:10:25 2009
crtime: 0x499c0f57:00d51440 -- Wed Feb 18 08:38:31 2009
Size of extra inode fields: 28
BLOCKS:
(0):7348651, (1-258):7348654-7348911
TOTAL: 259
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
At the moment there are few restrictions on which flags may be set on
which inodes. Specifically, DIRSYNC may only be set on directories, and
IMMUTABLE and APPEND may not be set on links. Tighten that to disallow
TOPDIR being set on non-directories, and to allow only NODUMP and NOATIME
to be set on inodes that are neither regular files nor directories.
Introduce a flags masking function which masks flags based on mode, and
use it during inode creation and when flags are set via the ioctl to
facilitate future consistency.
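
A minimal sketch of such a masking function, assuming the rules stated
above. The names (mask_flags, enum file_kind, the FL_* macros) and the
bit values are illustrative assumptions for this example; the kernel
decides from the inode mode rather than from an enum.

```c
#include <stdint.h>

/* Illustrative flag bits; the values are assumptions for this sketch. */
#define FL_IMMUTABLE 0x00000010u
#define FL_NODUMP    0x00000040u
#define FL_NOATIME   0x00000080u
#define FL_DIRSYNC   0x00010000u
#define FL_TOPDIR    0x00020000u

/* Regular files may carry anything except the directory-only flags. */
#define REG_FLMASK   (~(FL_DIRSYNC | FL_TOPDIR))
/* Everything else (links, devices, ...) may carry only these two. */
#define OTHER_FLMASK (FL_NODUMP | FL_NOATIME)

/* Hypothetical stand-in for the file type encoded in the inode mode. */
enum file_kind { KIND_DIR, KIND_REG, KIND_OTHER };

/* Return only the flags that are valid for the given kind of inode. */
static uint32_t mask_flags(enum file_kind kind, uint32_t flags)
{
	if (kind == KIND_DIR)
		return flags;
	if (kind == KIND_REG)
		return flags & REG_FLMASK;
	return flags & OTHER_FLMASK;
}
```

Routing both inode creation and the ioctl path through one function like
this keeps the two from drifting apart when new flags are added.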
Signed-off-by: Duane Griffin <duaneg@dghda.com>
Acked-by: Andreas Dilger <adilger@sun.com>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
At present INDEX and EXTENTS are the only flags that new ext4 inodes do
NOT inherit from their parent. In addition, prevent the flags DIRTY,
ECOMPR, IMAGIC, TOPDIR, HUGE_FILE and EXT_MIGRATE from being inherited.
List inheritable flags explicitly to prevent future flags from
accidentally being inherited.
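
The difference between excluding known flags and listing inheritable
flags explicitly can be sketched as below. The FL_* names, their bit
values, and the particular choice of inheritable flags are assumptions
made for this example only.

```c
#include <stdint.h>

/* Illustrative flag bits; the values are assumptions for this sketch. */
#define FL_SYNC    0x00000008u
#define FL_NODUMP  0x00000040u
#define FL_NOATIME 0x00000080u
#define FL_INDEX   0x00001000u
#define FL_TOPDIR  0x00020000u
#define FL_EXTENTS 0x00080000u

/* Explicit allow-list: any bit not named here is never inherited. */
#define FL_INHERITED (FL_SYNC | FL_NODUMP | FL_NOATIME)

/* Flags a new inode takes over from its parent directory. */
static uint32_t inherit_flags(uint32_t parent_flags)
{
	return parent_flags & FL_INHERITED;
}
```

A flag bit defined in the future is absent from FL_INHERITED, so it is
dropped by default instead of leaking into new inodes.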
This fixes the TOPDIR flag inheritance bug reported at
http://bugzilla.kernel.org/show_bug.cgi?id=9866.
Signed-off-by: Duane Griffin <duaneg@dghda.com>
Acked-by: Andreas Dilger <adilger@sun.com>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
As spotted by kmemtrace, struct ext4_sb_info is 17664 bytes on 64-bit
which makes it a very bad fit for SLAB allocators. The culprit of the
wasted memory is ->s_blockgroup_lock which can be as big as 16 KB when
NR_CPUS >= 32.
To fix that, allocate ->s_blockgroup_lock, which fits nicely in an order 2
page in the worst case, separately. This shrinks down struct ext4_sb_info
enough to fit a 2 KB slab cache so now we allocate 16 KB + 2 KB instead of
32 KB, saving 14 KB of memory.
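
The arithmetic can be checked with a small sketch, assuming the
allocator rounds each request up to a power-of-two size class. The
helper names are hypothetical, and the size of the shrunken struct is
not given above; any value between 1025 and 2048 bytes gives the same
result.

```c
#include <stddef.h>

/* Round a request up to the next power of two (assumed size-class model). */
static size_t size_class(size_t n)
{
	size_t c = 1;

	while (c < n)
		c <<= 1;
	return c;
}

/*
 * Bytes saved by splitting one large allocation into a separately
 * allocated lock array plus the remaining, smaller struct.
 */
static size_t bytes_saved(size_t old_struct, size_t lock, size_t new_struct)
{
	return size_class(old_struct) -
	       (size_class(lock) + size_class(new_struct));
}
```

For example, bytes_saved(17664, 16384, 1500) models the 17664-byte
struct landing in a 32 KB class before the change, and a 16 KB lock
array plus a 2 KB struct after it.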
Acked-by: Andreas Dilger <adilger@sun.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The rec_len field in the directory entry is 16 bits, so encoding
block sizes larger than 64k becomes problematic. This patch allows us
to support block sizes up to 256k, by using the low 2 bits to extend
the range of rec_len to 2**18-1 (since valid rec_len sizes must be a
multiple of 4). We use the convention that a rec_len of 0 or 65535
means the filesystem block size, for compatibility with older kernels.
It's unlikely we'll see VM pages of up to 256k, but at some point we
might find that the Linux VM has been enhanced to support filesystem
block sizes larger than the VM page size, at which point it might be useful
for some applications to allow very large filesystem block sizes.
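
The encoding described above can be sketched in userspace C roughly as
follows. This is an illustration under stated assumptions, not the
kernel code: the function names are made up for the example, on-disk
byte order is ignored, and assert() stands in for the kernel's own
sanity checking.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_REC_LEN 65535u	/* largest value the 16-bit field can hold */

/* Encode an in-memory record length into the 16-bit on-disk field. */
static uint16_t rec_len_to_disk(unsigned len, unsigned blocksize)
{
	/* lengths are multiples of 4, so the low 2 bits are free */
	assert(len <= blocksize && blocksize <= (1u << 18) && (len & 3) == 0);

	if (len < 65536)
		return (uint16_t)len;
	if (len == blocksize)
		return blocksize == 65536 ? MAX_REC_LEN : 0;
	/* move bits 16-17 of the length into the low 2 bits */
	return (uint16_t)((len & 65532) | ((len >> 16) & 3));
}

/* Decode the 16-bit on-disk field back into a record length. */
static unsigned rec_len_from_disk(uint16_t dlen, unsigned blocksize)
{
	/* 0 and 65535 both mean "the whole block" */
	if (dlen == MAX_REC_LEN || dlen == 0)
		return blocksize;
	return (dlen & 65532u) | ((unsigned)(dlen & 3) << 16);
}
```

Lengths below 64k are stored unchanged, which is why existing
filesystems with small block sizes are unaffected.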
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The static function ext4_group_used_meta_blocks() only has one caller,
who already has access to the block group's group descriptor. So it's
better to have ext4_init_block_bitmap() pass the group descriptor to
ext4_group_used_meta_blocks(), so it doesn't need to call
ext4_group_desc(). Previously this function did not check if
ext4_group_desc() returned NULL due to an error, potentially causing a
kernel OOPS report. This avoids the issue entirely.
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@holoscopio.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Remove some leftovers from when the old block allocator was removed
(c2ea3fde). ext4_sb_info is now a bit lighter. Also remove a dangling
read_block_bitmap() prototype.
Signed-off-by: Mike Snitzer <snitzer@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* 'bzip2-lzma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-tip:
bzip2/lzma: don't ask for compression mode for the default initramfs
bzip2/lzma: consistently capitalize LZMA in Kconfig
bzip2/lzma: clarify the meaning of the CONFIG_RD_ options
bzip2/lzma: move CONFIG_RD_* options under CONFIG_EMBEDDED
<linux/irq.h> relies on <linux/gfp.h> and <linux/topology.h> having been
included previously. If not, errors like those below will result.
CC arch/mips/mti-malta/malta-int.o
In file included from arch/mips/mti-malta/malta-int.c:25:
include/linux/irq.h: In function ‘init_alloc_desc_masks’:
include/linux/irq.h:444: error: implicit declaration of function ‘cpu_to_node’
include/linux/irq.h:446: error: ‘GFP_ATOMIC’ undeclared (first use in this function)
include/linux/irq.h:446: error: (Each undeclared identifier is reported only once
include/linux/irq.h:446: error: for each function it appears in.)
make[3]: *** [arch/mips/mti-malta/malta-int.o] Error 1
make[2]: *** [arch/mips/mti-malta] Error 2
make[1]: *** [sub-make] Error 2
Fixed by including the two missing headers.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix kernel-doc errors in sched.c: the structs don't have
kernel-doc notation and the short function description needs to
be one line only.
Error(kernel/sched.c:3197): cannot understand prototype: 'struct sd_lb_stats '
Error(kernel/sched.c:3228): cannot understand prototype: 'struct sg_lb_stats '
Error(kernel/sched.c:3375): duplicate section name 'Description'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'futexes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
futex: remove the pointer math from double_unlock_hb, fix
futex: remove the pointer math from double_unlock_hb
futex: clean up fault logic
futex: unlock before returning -EFAULT
futex: use current->time_slack_ns for rt tasks too
futex: add double_unlock_hb()
futex: additional (get|put)_futex_key() fixes
futex: update futex commentary
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6:
smack: Add a new '-CIPSO' option to the network address label configuration
netlabel: Cleanup the Smack/NetLabel code to fix incoming TCP connections
lsm: Remove the socket_post_accept() hook
selinux: Remove the "compat_net" compatibility code
netlabel: Label incoming TCP connections correctly in SELinux
lsm: Relocate the IPv4 security_inet_conn_request() hooks
TOMOYO: Fix a typo.
smack: convert smack to standard linux lists
Annotate struct fs_struct's usage count to indicate the restrictions upon it.
It may not be incremented, except by clone(CLONE_FS), as this affects the
check in check_unsafe_exec() in fs/exec.c.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
check_unsafe_exec() also notes whether the fs_struct is being
shared by more threads than will get killed by the exec, and if so
sets LSM_UNSAFE_SHARE to make bprm_set_creds() careful about euid.
But /proc/<pid>/cwd and /proc/<pid>/root lookups make transient
use of get_fs_struct(), which also raises that sharing count.
This might occasionally cause a setuid program not to change euid,
in the same way as happened with files->count (check_unsafe_exec
also looks at sighand->count, but /proc doesn't raise that one).
We'd prefer exec not to unshare fs_struct: so fix this in procfs,
replacing get_fs_struct() by get_fs_path(), which does path_get
while still holding task_lock, instead of raising fs->count.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Malicki reports that setuid sometimes doesn't: very rarely,
a setuid root program does not get root euid; and, by the way,
they have a health check running lsof every few minutes.
Right, check_unsafe_exec() notes whether the files_struct is being
shared by more threads than will get killed by the exec, and if so
sets LSM_UNSAFE_SHARE to make bprm_set_creds() careful about euid.
But /proc/<pid>/fd and /proc/<pid>/fdinfo lookups make transient
use of get_files_struct(), which also raises that sharing count.
There's a rather simple fix for this: exec's check on files->count
has been redundant ever since 2.6.1 made it unshare_files() (except
while compat_do_execve() omitted to do so) - just remove that check.
[Note to -stable: this patch will not apply before 2.6.29: earlier
releases should just remove the files->count line from unsafe_exec().]
Reported-by: Joe Malicki <jmalicki@metacarta.com>
Narrowed-down-by: Michael Itz <mitz@metacarta.com>
Tested-by: Joe Malicki <jmalicki@metacarta.com>
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2.6.26's commit fd8328be87
"sanitize handling of shared descriptor tables in failing execve()"
moved the unshare_files() from flush_old_exec() and several binfmts
to the head of do_execve(); but forgot to make the same change to
compat_do_execve(), leaving a CLONE_FILES files_struct shared across
exec from a 32-bit process on a 64-bit kernel.
It's arguable whether the files_struct really ought to be unshared
across exec; but 2.6.1 made that so to stop the loading binary's fd
leaking into other threads, and a 32-bit process on a 64-bit kernel
ought to behave in the same way as 32 on 32 and 64 on 64.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Impact: Kconfig noise reduction, documentation
The default initramfs is so small that it makes no sense to worry
about the additional memory taken by not double-compressing it.
Therefore, don't bug the user with it.
Also, improve the description of the option, which was downright
incorrect.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Impact: message formatting
Consistently spell LZMA in all capitals, since it (unlike gzip or
bzip2) is an acronym.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Impact: Kconfig clarification
Make it clear that the CONFIG_RD_* options are about what formats are
supported, not about what formats are actually being used.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Impact: reduce Kconfig noise
Move the options that control possible initramfs/initrd compressions
underneath CONFIG_EMBEDDED. The only impact of leaving these options
set to y is additional code in the init section of the kernel; there
is no reason to burden non-embedded users with these options.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
* 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6:
i2c-core: Some style cleanups
i2c-piix4: Add support for the Broadcom HT1100 chipset
i2c-piix4: Add support to SB800 SMBus changes
i2c-pca-platform: Use defaults if no platform_data given
i2c-algo-pca: Use timeout for checking the state machine
i2c-algo-pca: Rework waiting for a free bus
i2c-algo-pca: Add PCA9665 support
i2c: Adapt debug macros for KERN_* constants
i2c-davinci: Fix timeout handling
i2c: Adapter timeout is in jiffies
i2c: Set a default timeout value for all adapters
i2c: Add missing KERN_* constants to printks
i2c-algo-pcf: Handle timeout correctly
i2c-algo-pcf: Style cleanups
eeprom/at24: Remove EXPERIMENTAL
i2c-nforce2: Add support for MCP67, MCP73, MCP78S and MCP79
i2c: Clarify which clients are auto-removed
i2c: Let checkpatch shout on users of the legacy model
i2c: Document the different ways to instantiate i2c devices
* 'devel' of master.kernel.org:/home/rmk/linux-2.6-arm: (422 commits)
[ARM] 5435/1: fix compile warning in sanity_check_meminfo()
[ARM] 5434/1: ARM: OMAP: Fix mailbox compile for 24xx
[ARM] pxa: fix the bad assumption that PCMCIA sockets always start with 0
[ARM] pxa: fix Colibri PXA300 and PXA320 LCD backlight pins
imxfb: Fix TFT mode
i.MX21/27: remove ifdef CONFIG_FB_IMX
imxfb: add clock support
mxc: add arch_reset() function
clkdev: add possibility to get a clock based on the device name
i.MX1: remove fb support from mach-imx
[ARM] pxa: build arch/arm/plat-pxa/mfp.c only when PXA3xx or ARCH_MMP defined
Gemini: Add support for Teltonika RUT100
Gemini: gpiolib based GPIO support v2
MAINTAINERS: add myself as Gemini architecture maintainer
ARM: Add Gemini architecture v3
[ARM] OMAP: Fix compile for omap2_init_common_hw()
MAINTAINERS: Add myself as Faraday ARM core variant maintainer
ARM: Add support for FA526 v2
[ARM] acorn,ebsa110,footbridge,integrator,sa1100: Convert asm/io.h to linux/io.h
[ARM] collie: fix two minor formatting nits
...
* git://git.kernel.org/pub/scm/linux/kernel/git/arjan/linux-2.6-async-for-30:
fastboot: remove duplicate unpack_to_rootfs()
ide/net: flip the order of SATA and network init
async: remove the temporary (2.6.29) "async is off by default" code
Fix up conflicts in init/initramfs.c manually
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/blackfin-2.6:
Blackfin arch: be less noisy when gets a gpio conflict after kernel has booted
Blackfin arch: add RSI's definitions to bf514 and bf516
Blackfin arch: add link-time asserts to make sure on-chip regions dont overflow
Blackfin arch: sport spi needs 6 gpio pins
Blackfin arch: add sport-spi related resource stuff to board file
Blackfin arch: Blacklist Hibernate (PM_SUSPEND_MEM) on BF561 as well
Blackfin arch: Privide BF537-STAMP platform data of ADP5520 Multifunction driver
Blackfin arch: enable the platfrom PATA driver with CF Cards
Blackfin arch: clean up sports header file
Blackfin arch: convert BF5{18,27,48}_FAMILY to CONFIG_BF{51,52,54}x
Blackfin arch: bf51x processors also have 8 timers
Blackfin arch: add a check to make sure only Blackfin GPIOs may generate IRQs
Blackfin arch: update default kernel configuration
Blackfin arch: include linux headers that this one uses definitions from fro sport drivers
* 'percpu-cpumask-x86-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (682 commits)
percpu: fix spurious alignment WARN in legacy SMP percpu allocator
percpu: generalize embedding first chunk setup helper
percpu: more flexibility for @dyn_size of pcpu_setup_first_chunk()
percpu: make x86 addr <-> pcpu ptr conversion macros generic
linker script: define __per_cpu_load on all SMP capable archs
x86: UV: remove uv_flush_tlb_others() WARN_ON
percpu: finer grained locking to break deadlock and allow atomic free
percpu: move fully free chunk reclamation into a work
percpu: move chunk area map extension out of area allocation
percpu: replace pcpu_realloc() with pcpu_mem_alloc() and pcpu_mem_free()
x86, percpu: setup reserved percpu area for x86_64
percpu, module: implement reserved allocation and use it for module percpu variables
percpu: add an indirection ptr for chunk page map access
x86: make embedding percpu allocator return excessive free space
percpu: use negative for auto for pcpu_setup_first_chunk() arguments
percpu: improve first chunk initial area map handling
percpu: cosmetic renames in pcpu_setup_first_chunk()
percpu: clean up percpu constants
x86: un-__init fill_pud/pmd/pte
x86: remove vestigial fix_ioremap prototypes
...
Manually merge conflicts in arch/ia64/kernel/irq_ia64.c
Some lines were over 80 columns.
The printk(KERN_ERR ...) calls should be dev_err.
And some blank space should be deleted.
Signed-off-by: Zhenwen Xu <helight.xu@gmail.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Add support for the Broadcom HT1100 LD chipset (SMBus function.)
Signed-off-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Add support for the AMD SB800 Family series of products.
The major change is that the SMBus registers are addressed at a different
location than in the previous compatible AMD parts such as
SB400/SB600/SB700. For SB800, the main features and register definitions of
SMBus and other interfaces are still compatible with the previous products;
the only change is in how the internal registers for these blocks are accessed.
Signed-off-by: Shane Huang <shane.huang@amd.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
We now also time out if the state machine does not change within the
given time. For that, the driver-specific completion-functions are
extended to return true or false depending on the timeout. This then
gets checked in the algorithm.
Signed-off-by: Wolfram Sang <w.sang@pengutronix.de>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Waiting for a free bus now accepts the timeout value in jiffies and does
proper checking using time_before.
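
A userspace sketch of the idea, assuming a simulated tick counter in
place of jiffies. tick_before() imitates the wraparound-safe comparison
style of time_before(); wait_for_free_bus() and its parameters are
hypothetical and do not reflect the driver's actual interface.

```c
/* True if tick a is earlier than tick b, even across counter wraparound. */
static int tick_before(unsigned long a, unsigned long b)
{
	return (long)(a - b) < 0;
}

/*
 * Simulated wait: the bus becomes free busy_ticks after start, and one
 * tick passes per poll. Returns 0 if the bus became free before the
 * deadline, -1 on timeout.
 */
static int wait_for_free_bus(unsigned long start, unsigned long timeout,
			     unsigned long busy_ticks)
{
	unsigned long now = start;
	unsigned long deadline = start + timeout;	/* may wrap; that is fine */

	while (tick_before(now, deadline)) {
		if (now - start >= busy_ticks)
			return 0;	/* bus is free */
		now++;			/* one simulated tick passes */
	}
	return -1;			/* timed out */
}
```

Comparing the signed difference rather than the raw values is what
keeps the check correct when the counter wraps between the start of the
wait and the deadline.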
Signed-off-by: Wolfram Sang <w.sang@pengutronix.de>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
According to the kerneljanitors todo list, all printk calls (beginning
a new line) should have a corresponding KERN_* constant.
These are the changes to the debug macros in the i2c subsystem
to meet this requirement. Also change non-debug statements
to raw printks again.
Signed-off-by: Frank Seidel <frank@f-seidel.de>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Tested-by: Wolfram Sang <w.sang@pengutronix.de>
Properly set the adapter timeout value in jiffies, and then use that
value in the driver, rather than a hard-coded constant.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Tested-by: Troy Kisky <troy.kisky@boundarydevices.com>
Cc: Kevin Hilman <khilman@mvista.com>
i2c_adapter.timeout is in jiffies. Fix all drivers which thought
otherwise. It didn't really matter as long as the value was only used
inside the driver, but soon i2c-core will use it too so it must have
the proper unit.
Note: for the i2c-mpc driver, this fixes a bug in polling mode.
Timeout would trigger after 1 jiffy, which is most probably not what
the author wanted.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Clifford Wolf <clifford@clifford.at>
Acked-by: Sean MacLennan <smaclennan@pikatech.com>
Cc: Stefan Roese <sr@denx.de>
Acked-by: Lennert Buytenhek <kernel@wantstofly.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Mark A. Greer <mgreer@mvista.com>
Setting a default timeout value on a per-algo basis doesn't make any
sense. Move the default value setting to i2c-core. Individual adapter
drivers can specify a different (non-zero) value if they wish.
Also express the timeout value in a way which results in the same
duration regardless of the value of HZ.
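
To illustrate why the unit matters, here is a small sketch that treats
the tick rate as a parameter. The helper names and the one-second
default are assumptions chosen for the example, not values taken from
the patch.

```c
/* Ticks needed to cover the given number of seconds at tick rate hz. */
static unsigned long timeout_ticks(unsigned long seconds, unsigned long hz)
{
	return seconds * hz;
}

/* Wall-clock duration, in milliseconds, of a tick count at tick rate hz. */
static unsigned long ticks_to_msecs(unsigned long ticks, unsigned long hz)
{
	return ticks * 1000 / hz;
}
```

A timeout derived from the tick rate lasts 1000 ms at 100, 250 or 1000
ticks per second, whereas a hard-coded 100 ticks lasts 1000 ms at 100
ticks per second but only 100 ms at 1000.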
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Wolfram Sang <w.sang@pengutronix.de>
According to the kerneljanitors todo list, all printk calls (beginning
a new line) should have a corresponding KERN_* constant.
These are the missing pieces for the i2c subsystem.
Signed-off-by: Frank Seidel <frank@f-seidel.de>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
With a postfix decrement these timeouts reach -1 rather than 0, but after the
loop it is tested whether they have become 0.
As pointed out by Jean Delvare, the msg_num should be tested before the timeout.
With the current order, you could exit with a timeout error while all the
messages were successfully transferred.
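
Both points can be reproduced in a few lines of plain C. The functions
below are hypothetical reductions of the pattern, not the driver code:
the loops model a wait whose condition never becomes true, and
xfer_status() shows the suggested order of the two checks.

```c
/* Postfix form: the counter is decremented once more after reaching 0. */
static int spin_postfix(int timeout)
{
	while (timeout--)
		;		/* condition never becomes ready */
	return timeout;		/* ends at -1, so "== 0" misses the timeout */
}

/* Prefix form: the loop exits with the counter at exactly 0. */
static int spin_prefix(int timeout)
{
	while (--timeout)
		;		/* requires timeout >= 1 on entry */
	return timeout;
}

/*
 * Check for completion before checking for timeout, so a transfer that
 * finished on the very last iteration is still reported as a success.
 * Returns 0 on success, -1 on timeout, 1 if still in progress.
 */
static int xfer_status(int msgs_left, int timeout)
{
	if (msgs_left == 0)
		return 0;
	if (timeout == 0)
		return -1;
	return 1;
}
```

With the checks in the opposite order, the case where both msgs_left
and timeout are 0 would be reported as a timeout.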
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Eric Brower <ebrower@gmail.com>
This driver has been widely used since inclusion and no problems have
been reported.
Signed-off-by: Wolfram Sang <w.sang@pengutronix.de>
Cc: David Brownell <david-b@pacbell.net>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
The MCP78S and MCP79 appear to be compatible with the previous nForce
chips as far as the SMBus controller is concerned. The MCP67 and MCP73
were not tested yet but I'd be very surprised if they weren't
compatible too.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Oleg Ryjkov <olegr@olegr.ca>
Cc: Malcolm Lalkaka <mlalkaka@gmail.com>
Cc: Zbigniew Luszpinski <zbiggy@o2.pl>
The automatic removal of i2c clients only affects the clients which
were created automatically in the first place. Add a comment saying
that to avoid any confusion.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
By popular demand, here comes some documentation about how to
instantiate i2c devices in the new (standard) i2c device driver
binding model.
I have also clarified how the class bitfield lets driver authors
control which buses are probed in the auto-detect case, and warned
more loudly against the abuse of this method.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Michael Lawnick <nospam_lawnick@gmx.de>
Acked-by: Hans Verkuil <hverkuil@xs4all.nl>