linux

mirror of https://github.com/FEX-Emu/linux.git synced 2025-01-25 20:15:08 +00:00

Author	SHA1	Message	Date
Jan Kara	5052b069ac	jbd2: fix dbench4 performance regression for 'nobarrier' mounts Commit b685d3d65ac7 "block: treat REQ_FUA and REQ_PREFLUSH as synchronous" removed REQ_SYNC flag from WRITE_FUA implementation. Since JBD2 strips REQ_FUA and REQ_FLUSH flags from submitted IO when the filesystem is mounted with nobarrier mount option, journal superblock writes ended up being async writes after this patch and that caused heavy performance regression for dbench4 benchmark with high number of processes. In my test setup with HP RAID array with non-volatile write cache and 32 GB ram, dbench4 runs with 8 processes regressed by ~25%. Fix the problem by making sure journal superblock writes are always treated as synchronous since they generally block progress of the journalling machinery and thus the whole filesystem. Fixes: b685d3d65ac791406e0dfd8779cc9b3707fea5a3 CC: stable@vger.kernel.org Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2017-04-29 21:07:30 -04:00
Jan Kara	c52c47e4b4	jbd2: Fix lockdep splat with generic/270 test I've hit a lockdep splat with generic/270 test complaining that: 3216.fsstress.b/3533 is trying to acquire lock: (jbd2_handle){++++..}, at: [<ffffffff813152e0>] jbd2_log_wait_commit+0x0/0x150 but task is already holding lock: (jbd2_handle){++++..}, at: [<ffffffff8130bd3b>] start_this_handle+0x35b/0x850 The underlying problem is that jbd2_journal_force_commit_nested() (called from ext4_should_retry_alloc()) may get called while a transaction handle is started. In such case it takes care to not wait for commit of the running transaction (which would deadlock) but only for a commit of a transaction that is already committing (which is safe as that doesn't wait for any filesystem locks). In fact there are also other callers of jbd2_log_wait_commit() that take care to pass tid of a transaction that is already committing and for those cases, the lockdep instrumentation is too restrictive and leading to false positive reports. Fix the problem by calling jbd2_might_wait_for_commit() from jbd2_log_wait_commit() only if the transaction isn't already committing. Fixes: 1eaa566d368b214d99cbb973647c1b0b8102a9ae Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2017-04-29 20:12:16 -04:00
Mark Charlebois	9280cdd6fe	fs: compat: Remove warning from COMPATIBLE_IOCTL cmd in COMPATIBLE_IOCTL is always a u32, so cast it so there isn't a warning about an overflow in XFORM. From: Mark Charlebois <charlebm@gmail.com> Signed-off-by: Mark Charlebois <charlebm@gmail.com> Signed-off-by: Behan Webster <behanw@converseincode.com> Signed-off-by: Matthias Kaehlcke <mka@chromium.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2017-04-29 17:47:19 -04:00
Al Viro	0b33540f9d	remove pointless extern of atime_need_update_rcu() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2017-04-29 17:42:25 -04:00
Trond Myklebust	1f18b82c34	pNFS: Ensure we commit the layout if it has been invalidated If the layout is being invalidated on the server, then we must invoke nfs_commit_inode() to ensure any commits to the DS get cleared out. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2017-04-29 11:29:30 -04:00
Trond Myklebust	722f0b8911	pNFS: Don't send COMMITs to the DSes if the server invalidated our layout If the layout was invalidated, then assume we should requeue all the pending writes for the DS in question. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2017-04-29 11:29:24 -04:00
Trond Myklebust	37f8aa16da	pNFS/flexfiles: Fix up the ff_layout_write_pagelist failure path If the attempt to write through pNFS fails, we need to use the same failure semantics as for the read path: If the FF_FLAGS_NO_IO_THRU_MDS flag is set or we have sufficient valid DSes, then we must retry through pNFS Fixes: d67ae825a59d ("pnfs/flexfiles: Add the FlexFile Layout Driver") Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2017-04-29 00:02:37 -04:00
Takashi Iwai	d66bb1607e	proc: Fix unbalanced hard link numbers proc_create_mount_point() forgot to increase the parent's nlink, and it resulted in unbalanced hard link numbers, e.g. /proc/fs shows one less than expected. Fixes: eb6d38d5427b ("proc: Allow creating permanently empty directories...") Cc: stable@vger.kernel.org Reported-by: Tristan Ye <tristan.ye@suse.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2017-04-28 21:05:26 -05:00
Linus Torvalds	28b2013587	Merge branch 'for-linus-4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull btrfs fix from Chris Mason: "We have one more fix for btrfs. This gets rid of a new WARN_ON from rc1 that ended up making more noise than we really want. The larger fix for the underflow got delayed a bit and it's better for now to put it under CONFIG_BTRFS_DEBUG" * 'for-linus-4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: btrfs: qgroup: move noisy underflow warning to debugging build	2017-04-28 10:13:17 -07:00
Trond Myklebust	bdebfccd0e	pNFS: Ensure we check layout validity before marking it for return pnfs_error_mark_layout_for_return needs to check that the layout is valid before calling pnfs_set_plh_return_info(). Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2017-04-28 13:07:01 -04:00
Olga Kornievskaia	88bd4f8629	NFS4.1 handle interrupted slot reuse from ERR_DELAY If the RPC slot was interrupted and server replied to the next operation on the "reused" slot with ERR_DELAY, don't clear out the "interrupted" flag until we properly recover. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2017-04-28 13:07:00 -04:00
Pan Bian	4edabfd7d0	NFSv4: check return value of xdr_inline_decode Function xdr_inline_decode() will return a NULL pointer if the input buffer does not have long enough buffer to decode nbytes of data. However, in function decode_op_map(), the return value of xdr_inline_decode() is not validated before it is used. This patch adds a check to the return value of xdr_inline_decode(). Signed-off-by: Pan Bian <bianpan2016@163.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2017-04-28 13:06:59 -04:00
Artem Savkov	209aa23083	nfs/filelayout: fix NULL pointer dereference in fl_pnfs_update_layout() Calling pnfs_put_lset on an IS_ERR pointer results in a NULL pointer dereference like the one below. At the same time the check of retvalue of filelayout_check_deviceid() sets lseg to error, but does not free it before that. [ 3000.636161] BUG: unable to handle kernel NULL pointer dereference at 000000000000003c [ 3000.636970] IP: pnfs_put_lseg+0x29/0x100 [nfsv4] [ 3000.637420] PGD 4f23b067 [ 3000.637421] PUD 4a0f4067 [ 3000.637679] PMD 0 [ 3000.637937] [ 3000.638287] Oops: 0000 [#1] SMP [ 3000.638591] Modules linked in: nfs_layout_nfsv41_files nfsv3 nfnetlink_queue nfnetlink_log nfnetlink bluetooth rfkill rpcsec_gss_krb5 nfsv4 nfs fscache binfmt_misc arc4 md4 nls_utf8 cifs ccm dns_resolver rpcrdma ib_isert iscsi_target_mod ib_iser rdma_cm iw_cm libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp ib_ipoib ib_ucm ib_uverbs ib_umad ib_cm ib_core nls_koi8_u nls_cp932 ts_kmp nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcspkr virtio_balloon ppdev virtio_rng parport_pc i2c_piix4 parport acpi_cpufreq nfsd auth_rpcgss nfs_acl lockd grace sunrpc xfs libcrc32c ata_generic pata_acpi virtio_blk virtio_net cirrus drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops crc32c_intel ata_piix ttm libata drm serio_raw [ 3000.645245] i2c_core virtio_pci virtio_ring virtio floppy dm_mirror dm_region_hash dm_log dm_mod [last unloaded: xt_u32] [ 3000.646360] CPU: 1 PID: 26402 Comm: date Not tainted 4.11.0-rc7.1.el7.test.x86_64 #1 [ 3000.647092] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 [ 3000.647638] task: ffff8800415ada00 task.stack: ffffc90000ff0000 [ 3000.648207] RIP: 0010:pnfs_put_lseg+0x29/0x100 [nfsv4] [ 3000.648696] RSP: 0018:ffffc90000ff39b8 EFLAGS: 00010246 [ 3000.649193] RAX: 0000000000000000 RBX: fffffffffffffff4 RCX: 00000000000d43be [ 3000.649859] RDX: 00000000000d43bd RSI: 0000000000000000 RDI: fffffffffffffff4 [ 3000.650530] RBP: ffffc90000ff39d8 R08: 000000000001e320 R09: ffffffffa05c35ce [ 3000.651203] R10: ffff88007fd1e320 R11: ffffea0001283d80 R12: 0000000001400040 [ 3000.651875] R13: ffff88004f77d9f0 R14: ffffc90000ff3cd8 R15: ffff8800417ade00 [ 3000.652546] FS: 00007fac4d5cd740(0000) GS:ffff88007fd00000(0000) knlGS:0000000000000000 [ 3000.653304] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 3000.653849] CR2: 000000000000003c CR3: 000000004f080000 CR4: 00000000000406e0 [ 3000.654527] Call Trace: [ 3000.654771] fl_pnfs_update_layout.constprop.20+0x10c/0x150 [nfs_layout_nfsv41_files] [ 3000.655505] filelayout_pg_init_write+0x21d/0x270 [nfs_layout_nfsv41_files] [ 3000.656195] __nfs_pageio_add_request+0x11c/0x490 [nfs] [ 3000.656698] nfs_pageio_add_request+0xac/0x260 [nfs] [ 3000.657180] nfs_do_writepage+0x109/0x2e0 [nfs] [ 3000.657616] nfs_writepages_callback+0x16/0x30 [nfs] [ 3000.658096] write_cache_pages+0x26f/0x510 [ 3000.658495] ? nfs_do_writepage+0x2e0/0x2e0 [nfs] [ 3000.658946] ? _raw_spin_unlock_bh+0x1e/0x20 [ 3000.659357] ? wb_wakeup_delayed+0x5f/0x70 [ 3000.659748] ? __mark_inode_dirty+0x2eb/0x360 [ 3000.660170] nfs_writepages+0x84/0xd0 [nfs] [ 3000.660575] ? nfs_updatepage+0x571/0xb70 [nfs] [ 3000.661012] do_writepages+0x1e/0x30 [ 3000.661358] __filemap_fdatawrite_range+0xc6/0x100 [ 3000.661819] filemap_write_and_wait_range+0x41/0x90 [ 3000.662292] nfs_file_fsync+0x34/0x1f0 [nfs] [ 3000.662704] vfs_fsync_range+0x3d/0xb0 [ 3000.663065] vfs_fsync+0x1c/0x20 [ 3000.663385] nfs4_file_flush+0x57/0x80 [nfsv4] [ 3000.663813] filp_close+0x2f/0x70 [ 3000.664132] __close_fd+0x9a/0xc0 [ 3000.664453] SyS_close+0x23/0x50 [ 3000.664785] do_syscall_64+0x67/0x180 [ 3000.665162] entry_SYSCALL64_slow_path+0x25/0x25 [ 3000.665600] RIP: 0033:0x7fac4d0e1e90 [ 3000.665946] RSP: 002b:00007ffd54e90c88 EFLAGS: 00000246 ORIG_RAX: 0000000000000003 [ 3000.666679] RAX: ffffffffffffffda RBX: 00007fac4d3b5400 RCX: 00007fac4d0e1e90 [ 3000.667349] RDX: 0000000000000000 RSI: 00007fac4d5d9000 RDI: 0000000000000001 [ 3000.668031] RBP: 0000000000000000 R08: 00007fac4d3b6a00 R09: 00007fac4d5cd740 [ 3000.668709] R10: 00007ffd54e909e0 R11: 0000000000000246 R12: 0000000000000000 [ 3000.669385] R13: 00007fac4d3b5e80 R14: 0000000000000000 R15: 0000000000000000 [ 3000.670061] Code: 00 00 66 66 66 66 90 55 48 85 ff 48 89 e5 41 56 41 55 41 54 53 48 89 fb 0f 84 97 00 00 00 f6 05 16 8f bc ff 10 0f 85 a6 00 00 00 <4c> 8b 63 48 48 8d 7b 38 49 8b 84 24 90 00 00 00 4c 8d a8 88 00 [ 3000.671831] RIP: pnfs_put_lseg+0x29/0x100 [nfsv4] RSP: ffffc90000ff39b8 [ 3000.672462] CR2: 000000000000003c Signed-off-by: Artem Savkov <asavkov@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2017-04-28 13:06:59 -04:00
Brian Foster	e20c8a517f	xfs: wait on new inodes during quotaoff dquot release The quotaoff operation has a race with inode allocation that results in a livelock. An inode allocation that occurs before the quota status flags are updated acquires the appropriate dquots for the inode via xfs_qm_vop_dqalloc(). It then inserts the XFS_INEW inode into the perag radix tree, sometime later attaches the dquots to the inode and finally clears the XFS_INEW flag. Quotaoff expects to release the dquots from all inodes in the filesystem via xfs_qm_dqrele_all_inodes(). This invokes the AG inode iterator, which skips inodes in the XFS_INEW state because they are not fully constructed. If the scan occurs after dquots have been attached to an inode, but before XFS_INEW is cleared, the newly allocated inode will continue to hold a reference to the applicable dquots. When quotaoff invokes xfs_qm_dqpurge_all(), the reference count of those dquot(s) remain elevated and the dqpurge scan spins indefinitely. To address this problem, update the xfs_qm_dqrele_all_inodes() scan to wait on inodes marked on the XFS_INEW state. We wait on the inodes explicitly rather than skip and retry to avoid continuous retry loops due to a parallel inode allocation workload. Since quotaoff updates the quota state flags and uses a synchronous transaction before the dqrele scan, and dquots are attached to inodes after radix tree insertion iff quota is enabled, one INEW waiting pass through the AG guarantees that the scan has processed all inodes that could possibly hold dquot references. Reported-by: Eryu Guan <eguan@redhat.com> Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2017-04-28 08:11:08 -07:00
Brian Foster	ae2c4ac2dd	xfs: update ag iterator to support wait on new inodes The AG inode iterator currently skips new inodes as such inodes are inserted into the inode radix tree before they are fully constructed. Certain contexts require the ability to wait on the construction of new inodes, however. The fs-wide dquot release from the quotaoff sequence is an example of this. Update the AG inode iterator to support the ability to wait on inodes flagged with XFS_INEW upon request. Create a new xfs_inode_ag_iterator_flags() interface and support a set of iteration flags to modify the iteration behavior. When the XFS_AGITER_INEW_WAIT flag is set, include XFS_INEW flags in the radix tree inode lookup and wait on them before the callback is executed. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2017-04-28 08:11:08 -07:00
Brian Foster	756baca27f	xfs: support ability to wait on new inodes Inodes that are inserted into the perag tree but still under construction are flagged with the XFS_INEW bit. Most contexts either skip such inodes when they are encountered or have the ability to handle them. The runtime quotaoff sequence introduces a context that must wait for construction of such inodes to correctly ensure that all dquots in the fs are released. In anticipation of this, support the ability to wait on new inodes. Wake the appropriate bit when XFS_INEW is cleared. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2017-04-28 08:11:08 -07:00
Amir Goldstein	8f720d9f89	xfs: publish UUID in struct super_block Copy the uuid of the filesystem to struct super_block s_uuid field, as several other filesystems already do. Copy regardless of the nouuid mount option, because other filesystems also do not guaranty uniqueness of the s_uuid field in super_block struct. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2017-04-28 08:10:53 -07:00
NeilBrown	a6f74e80f2	cifs: don't check for failure from mempool_alloc() mempool_alloc() cannot fail if the gfp flags allow it to sleep, and both GFP_FS allows for sleeping. So these tests of the return value from mempool_alloc() cannot be needed. Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Steve French <smfrench@gmail.com>	2017-04-28 07:56:33 -05:00
Sachin Prabhu	7d0c234fd2	Do not return number of bytes written for ioctl CIFS_IOC_COPYCHUNK_FILE commit 620d8745b35d ("Introduce cifs_copy_file_range()") changes the behaviour of the cifs ioctl call CIFS_IOC_COPYCHUNK_FILE. In case of successful writes, it now returns the number of bytes written. This return value is treated as an error by the xfstest cifs/001. Depending on the errno set at that time, this may or may not result in the test failing. The patch fixes this by setting the return value to 0 in case of successful writes. Fixes: commit 620d8745b35d ("Introduce cifs_copy_file_range()") Reported-by: Eryu Guan <eguan@redhat.com> Signed-off-by: Sachin Prabhu <sprabhu@redhat.com> Acked-by: Pavel Shilovsky <pshilov@microsoft.com> Cc: stable@vger.kernel.org Signed-off-by: Steve French <smfrench@gmail.com>	2017-04-28 07:56:33 -05:00
Sachin Prabhu	cd8c42968e	Fix match_prepath() Incorrect return value for shares not using the prefix path means that we will never match superblocks for these shares. Fixes: commit c1d8b24d1819 ("Compare prepaths when comparing superblocks") Cc: stable@vger.kernel.org Signed-off-by: Sachin Prabhu <sprabhu@redhat.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <smfrench@gmail.com>	2017-04-28 07:54:54 -05:00
Kees Cook	3a7d2fd16c	pstore: Solve lockdep warning by moving inode locks Lockdep complains about a possible deadlock between mount and unlink (which is technically impossible), but fixing this improves possible future multiple-backend support, and keeps locking in the right order. The lockdep warning could be triggered by unlinking a file in the pstore filesystem: -> #1 (&sb->s_type->i_mutex_key#14){++++++}: lock_acquire+0xc9/0x220 down_write+0x3f/0x70 pstore_mkfile+0x1f4/0x460 pstore_get_records+0x17a/0x320 pstore_fill_super+0xa4/0xc0 mount_single+0x89/0xb0 pstore_mount+0x13/0x20 mount_fs+0xf/0x90 vfs_kern_mount+0x66/0x170 do_mount+0x190/0xd50 SyS_mount+0x90/0xd0 entry_SYSCALL_64_fastpath+0x1c/0xb1 -> #0 (&psinfo->read_mutex){+.+.+.}: __lock_acquire+0x1ac0/0x1bb0 lock_acquire+0xc9/0x220 __mutex_lock+0x6e/0x990 mutex_lock_nested+0x16/0x20 pstore_unlink+0x3f/0xa0 vfs_unlink+0xb5/0x190 do_unlinkat+0x24c/0x2a0 SyS_unlinkat+0x16/0x30 entry_SYSCALL_64_fastpath+0x1c/0xb1 Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&sb->s_type->i_mutex_key#14); lock(&psinfo->read_mutex); lock(&sb->s_type->i_mutex_key#14); lock(&psinfo->read_mutex); Reported-by: Marta Lofstedt <marta.lofstedt@intel.com> Reported-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Namhyung Kim <namhyung@kernel.org>	2017-04-27 20:35:34 -07:00
Trond Myklebust	ed6473ddc7	NFSv4: Fix callback server shutdown We want to use kthread_stop() in order to ensure the threads are shut down before we tear down the nfs_callback_info in nfs_callback_down. Tested-and-reviewed-by: Kinglong Mee <kinglongmee@gmail.com> Reported-by: Kinglong Mee <kinglongmee@gmail.com> Fixes: bb6aeba736ba9 ("NFSv4.x: Switch to using svc_set_num_threads()...") Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2017-04-27 18:00:16 -04:00
Kinglong Mee	df807fffaa	NFSv4.x/callback: Create the callback service through svc_create_pooled As the comments for svc_set_num_threads() said, " Destroying threads relies on the service threads filling in rqstp->rq_task, which only the nfs ones do. Assumes the serv has been created using svc_create_pooled()." If creating service through svc_create(), the svc_pool_map_put() will be called in svc_destroy(), but the pool map isn't used. So that, the reference of pool map will be drop, the next using of pool map will get a zero npools. [ 137.992130] divide error: 0000 [#1] SMP [ 137.992148] Modules linked in: nfsd(E) nfsv4 nfs fscache fuse tun bridge stp llc ip_set nfnetlink vmw_vsock_vmci_transport vsock snd_seq_midi snd_seq_midi_event vmw_balloon coretemp crct10dif_pclmul crc32_pclmul ppdev ghash_clmulni_intel intel_rapl_perf joydev snd_ens1371 gameport snd_ac97_codec ac97_bus snd_seq snd_pcm snd_rawmidi snd_timer snd_seq_device snd soundcore parport_pc parport nfit acpi_cpufreq tpm_tis tpm_tis_core tpm vmw_vmci i2c_piix4 shpchp auth_rpcgss nfs_acl lockd(E) grace sunrpc(E) xfs libcrc32c vmwgfx drm_kms_helper ttm crc32c_intel drm e1000 mptspi scsi_transport_spi serio_raw mptscsih mptbase ata_generic pata_acpi [last unloaded: nfsd] [ 137.992336] CPU: 0 PID: 4514 Comm: rpc.nfsd Tainted: G E 4.11.0-rc8+ #536 [ 137.992777] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015 [ 137.993757] task: ffff955984101d00 task.stack: ffff9873c2604000 [ 137.994231] RIP: 0010:svc_pool_for_cpu+0x2b/0x80 [sunrpc] [ 137.994768] RSP: 0018:ffff9873c2607c18 EFLAGS: 00010246 [ 137.995227] RAX: 0000000000000000 RBX: ffff95598376f000 RCX: 0000000000000002 [ 137.995673] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9559944aec00 [ 137.996156] RBP: ffff9873c2607c18 R08: ffff9559944aec28 R09: 0000000000000000 [ 137.996609] R10: 0000000001080002 R11: 0000000000000000 R12: ffff95598376f010 [ 137.997063] R13: ffff95598376f018 R14: ffff9559944aec28 R15: ffff9559944aec00 [ 137.997584] FS: 00007f755529eb40(0000) GS:ffff9559bb600000(0000) knlGS:0000000000000000 [ 137.998048] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 137.998548] CR2: 000055f3aecd9660 CR3: 0000000084290000 CR4: 00000000001406f0 [ 137.999052] Call Trace: [ 137.999517] svc_xprt_do_enqueue+0xef/0x260 [sunrpc] [ 138.000028] svc_xprt_received+0x47/0x90 [sunrpc] [ 138.000487] svc_add_new_perm_xprt+0x76/0x90 [sunrpc] [ 138.000981] svc_addsock+0x14b/0x200 [sunrpc] [ 138.001424] ? recalc_sigpending+0x1b/0x50 [ 138.001860] ? __getnstimeofday64+0x41/0xd0 [ 138.002346] ? do_gettimeofday+0x29/0x90 [ 138.002779] write_ports+0x255/0x2c0 [nfsd] [ 138.003202] ? _copy_from_user+0x4e/0x80 [ 138.003676] ? write_recoverydir+0x100/0x100 [nfsd] [ 138.004098] nfsctl_transaction_write+0x48/0x80 [nfsd] [ 138.004544] __vfs_write+0x37/0x160 [ 138.004982] ? selinux_file_permission+0xd7/0x110 [ 138.005401] ? security_file_permission+0x3b/0xc0 [ 138.005865] vfs_write+0xb5/0x1a0 [ 138.006267] SyS_write+0x55/0xc0 [ 138.006654] entry_SYSCALL_64_fastpath+0x1a/0xa9 [ 138.007071] RIP: 0033:0x7f7554b9dc30 [ 138.007437] RSP: 002b:00007ffc9f92c788 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 [ 138.007807] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f7554b9dc30 [ 138.008168] RDX: 0000000000000002 RSI: 00005640cd536640 RDI: 0000000000000003 [ 138.008573] RBP: 00007ffc9f92c780 R08: 0000000000000001 R09: 0000000000000002 [ 138.008918] R10: 0000000000000064 R11: 0000000000000246 R12: 0000000000000004 [ 138.009254] R13: 00005640cdbf77a0 R14: 00005640cdbf7720 R15: 00007ffc9f92c238 [ 138.009610] Code: 0f 1f 44 00 00 48 8b 87 98 00 00 00 55 48 89 e5 48 83 78 08 00 74 10 8b 05 07 42 02 00 83 f8 01 74 40 83 f8 02 74 19 31 c0 31 d2 <f7> b7 88 00 00 00 5d 89 d0 48 c1 e0 07 48 03 87 90 00 00 00 c3 [ 138.010664] RIP: svc_pool_for_cpu+0x2b/0x80 [sunrpc] RSP: ffff9873c2607c18 [ 138.011061] ---[ end trace b3468224cafa7d11 ]--- Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2017-04-27 17:59:00 -04:00
Geliang Tang	3509d048c8	pstore: Remove unused vmalloc.h in pmsg Since the vmalloc code has been removed from write_pmsg() in the commit "5bf6d1b pstore/pmsg: drop bounce buffer", remove the unused header vmalloc.h. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org>	2017-04-27 14:48:59 -07:00
Chris Mason	bce19f9d23	Merge branch 'for-chris-4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/fdmanana/linux into for-linus-4.12	2017-04-27 14:13:09 -07:00
Linus Torvalds	8b5d11e4b0	Thanks to Ari Kauppi and Tuomas Haanpää at Synopsis for spotting bugs in our NFSv2/v3 xdr code that could crash the server or leak memory. -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJZAlLrAAoJECebzXlCjuG+lb8P/idTu9rLGaU1VYPInrdoXru0 iPY+p5inmGSYW2MfCGlS7disaCACgzPBKVqKjeNB1hfHn2JZCrfeBd0XvBYlc7TH JYmlHKSBjN/ZfnYMl0WrlVUCZstt4JVmxWjO3sZTCL3nImEbGA7d13yBDagxISWh gM9wOOiJwR5lzT/W3MezNCYj4n27/vVhODMP+Qhy0rpK08dBORIH0Bi7hwtVgdD9 cMVIXMbujmJrCz0Uhbo/DhoItcePRBrCLTXdY6WEeMmHxnXlyaA2XuA0EjjZHuMs +BsbAOsNy1BxY1c8Z2ignYgRymUvUBHeiJeIGZLbyKKM2OJMtE4BoqlSWjx08P3Y hTwTuaw8u7uxfAsvxepukZoonWj/uVY5tP2Hyq+K6CIKKJpB7vj+c3QGYD3dNnUu zDl/LG3ayZgpPXqfjKRnSzZE+St1/IwDnvaM2WN2B1mkuerVr8qDGq6xd4kB0QKX BcEKiwcb2ewfPaLlVnSXz6Wbuh2pB42BObJjC3qbOgMvQ7SBUM0UcZmFpJJ20uCR BX20aFzB/GHcd6fTRHpDrAxB4XGdVG/8Da5Ki2WhRnmmaeSXushhiKWImujY9B6i s4mZHu4gGGJdLzOT5u2HZl93STevriL70SA9nZPhPLQycnJMUAO0buU9UcNI7wXR GVu2F3IHd+anxDtDwOv4 =Glhl -----END PGP SIGNATURE----- Merge tag 'nfsd-4.11-3' of git://linux-nfs.org/~bfields/linux Pull nfsd fixes from Bruce Fields: "Thanks to Ari Kauppi and Tuomas Haanpää at Synopsis for spotting bugs in our NFSv2/v3 xdr code that could crash the server or leak memory" * tag 'nfsd-4.11-3' of git://linux-nfs.org/~bfields/linux: nfsd: stricter decoding of write-like NFSv2/v3 ops nfsd4: minor NFSv2/v3 write decoding cleanup nfsd: check for oversized NFSv2/v3 arguments	2017-04-27 13:39:19 -07:00
Linus Torvalds	19ac447420	A fix for a kernel stack overflow bug in ceph setattr code, marked for stable. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQEcBAABCAAGBQJZAhJLAAoJEEp/3jgCEfOLw4gH/ia+bMzmsnkYtjMfxQfCh0ia MHi7JS/YcAej/o71c/tvWlTU7mRbmvUCVSAcishRNytEBNGL8YzkP12vMOp/5Vdx kKk6yDWn9z0mR5/YdBKaE8ziM5Umdy+zLqeL4yuxyhtbxKFGUPG4txJKS5WD80yU Ld/toF2fL3y/JEs+s1pd5G+DPhEhEm2hFf56/VI6N7y08CHJgTqHB3GJ3ZnuUbnU UhSvNR9skdVirObI8jt3oWIix8uAGq5+6MjVeTqXo75Qng5sdBGZ8S2agxXbM3j7 Hu8h/1bhKyPCUzAXnOyGcZeR+5DQolKmlKLhogbT4I9X4YC2ie4Djg0bmFHscWI= =8aUa -----END PGP SIGNATURE----- Merge tag 'ceph-for-4.11-rc9' of git://github.com/ceph/ceph-client Pull ceph fix from Ilya Dryomov: "A fix for a kernel stack overflow bug in ceph setattr code, marked for stable" * tag 'ceph-for-4.11-rc9' of git://github.com/ceph/ceph-client: ceph: fix recursion between ceph_set_acl() and __ceph_setattr()	2017-04-27 11:38:05 -07:00
Linus Torvalds	f56fc7bdaa	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs fixes from Al Viro: - fix orangefs handling of faults on write() - I'd missed that one back when orangefs was going through review. - readdir counterpart of "9p: cope with bogus responses from server in p9_client_{read,write}" - server might be lying or broken, and we'd better not overrun the kmalloc'ed buffer we are copying the results into. - NFS O_DIRECT read/write can leave iov_iter advanced by too much; that's what had been causing iov_iter_pipe() warnings davej had been seeing. - statx_timestamp.tv_nsec type fix (s32 -> u32). That one really should go in before 4.11. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: uapi: change the type of struct statx_timestamp.tv_nsec to unsigned fix nfs O_DIRECT advancing iov_iter too much p9_client_readdir() fix orangefs_bufmap_copy_from_iovec(): fix EFAULT handling	2017-04-27 11:09:37 -07:00
Lukas Czerner	3c3781951c	xfs: Allow user to kill fstrim process fstrim can take really long time on big, slow device or on file system with a lots of allocation groups. Currently there is no way for the user to cancell the operation. This patch makes it possible for the user to kill fstrim pocess by adding the check for fatal_signal_pending() in xfs_trim_extents(). Signed-off-by: Lukas Czerner <lczerner@redhat.com> Reported-by: Zdenek Kabelac <zkabelac@redhat.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2017-04-27 10:45:34 -07:00
Michael Kerrisk (man-pages)	59372bbf3a	statx: correct error handling of NULL pathname The change in commit 1e2f82d1e9d1 ("statx: Kill fd-with-NULL-path support in favour of AT_EMPTY_PATH") to error on a NULL pathname to statx() is inconsistent. It results in the error EINVAL for a NULL pathname. Other system calls with similar APIs (fchownat(), fstatat(), linkat()), return EFAULT. The solution is simply to remove the EINVAL check. As I already pointed out in [1], user_path_at*() and filename_lookup() will handle the NULL pathname as per the other APIs, to correctly produce the error EFAULT. [1] https://lkml.org/lkml/2017/4/26/561 Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Eric Sandeen <sandeen@sandeen.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-04-27 10:45:09 -07:00
Christoph Hellwig	629e014bb8	fs: completely ignore unknown open flags Currently we just stash anything we got into file->f_flags, and the report it in fcntl(F_GETFD). This patch just clears out all unknown flags so that we don't pass them to the fs or report them. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2017-04-27 05:13:04 -04:00
Christoph Hellwig	80f18379a7	fs: add a VALID_OPEN_FLAGS Add a central define for all valid open flags, and use it in the uniqueness check. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2017-04-27 05:13:04 -04:00
Eric Biggers	020c2833db	fs: remove _submit_bh() _submit_bh() allowed submitting a buffer_head for I/O using custom bio_flags. It used to be used by jbd to set BIO_SNAP_STABLE, introduced by commit 713685111774 ("mm: make snapshotting pages for stable writes a per-bio operation"). However, the code and flag has since been removed and no _submit_bh() users remain. These days, bio_flags are mostly used internally by the block layer to track the state of bio's. As such, it doesn't really make sense for filesystems to use them instead of op_flags when wanting special behavior for block requests. Therefore, remove _submit_bh() and trim the bio_flags argument from submit_bh_wbc(). Cc: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2017-04-26 23:54:06 -04:00
Eric Biggers	cda37124f4	fs: constify tree_descr arrays passed to simple_fill_super() simple_fill_super() is passed an array of tree_descr structures which describe the files to create in the filesystem's root directory. Since these arrays are never modified intentionally, they should be 'const' so that they are placed in .rodata and benefit from memory protection. This patch updates the function signature and all users, and also constifies tree_descr.name. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2017-04-26 23:54:06 -04:00
Fabian Frederick	a80f2d2224	fs/affs: bugfix: Write files greater than page size on OFS Previous AFFS patch fixed OFS write operations but unveiled another bug: files greater than 4KB are being created with a wrong size resulting in errors like the following: dd if=/dev/zero of=file bs=4097 count=1 cp file /mnt/affs/ cp: error writing '/mnt/affs/file': Bad address Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2017-04-26 23:54:06 -04:00
Fabian Frederick	077e073e8f	fs/affs: bugfix: enable writes on OFS disks We called unconditionally affs_bread_ino() with create 0 resulting in "error (device ...): get_block(): strange block request 0" when trying to write on AFFS OFS format. This patch adds create parameter to that function. 0 for affs_readpage_ofs() 1 for affs_write_begin_ofs() Bug was found here: https://bugzilla.kernel.org/show_bug.cgi?id=114961 Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2017-04-26 23:54:06 -04:00
Fabian Frederick	a9d6cfb70f	fs/affs: remove node generation check node generation has to be stored on disk. AFAICS we won't be able to manage it on AFFS. This patch removes relevant check in affs_nfs_get_inode() Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2017-04-26 23:54:05 -04:00
Fabian Frederick	d2d58e0e0d	fs/affs: import amigaffs.h Have that file in global include/linux is not needed. Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2017-04-26 23:54:05 -04:00
Fabian Frederick	f1bf90724d	fs/affs: bugfix: make symbolic links work again AFFS symbolic links were broken since kernel 2.6.29 Problem was bisected to the following commit ebd09abbd969 ("vfs: ensure page symlinks are NUL-terminated") commit 035146851cfa ("vfs: introduce helper function to safely NUL-terminate symlinks") AFFS wasn't setting inode size when reading symbolic link from disk or creating a new one. Result was zero allocation in pagecache. ln -s file symlink ls -lrt file symlink -> This patch adds inode isize information on inode get and symbolic link addition. Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2017-04-26 23:54:05 -04:00
David S. Miller	b1513c3531	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-26 22:39:08 -04:00
David Howells	1e2f82d1e9	statx: Kill fd-with-NULL-path support in favour of AT_EMPTY_PATH With the new statx() syscall, the following both allow the attributes of the file attached to a file descriptor to be retrieved: statx(dfd, NULL, 0, ...); and: statx(dfd, "", AT_EMPTY_PATH, ...); Change the code to reject the first option, though this means copying the path and engaging pathwalk for the fstat() equivalent. dfd can be a non-directory provided path is "". [ The timing of this isn't wonderful, but applying this now before we have statx() in any released kernel, before anybody starts using the NULL special case. - Linus ] Fixes: a528d35e8bfc ("statx: Add a system call to make enhanced file info available") Reported-by: Michael Kerrisk <mtk.manpages@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com> cc: Eric Sandeen <sandeen@sandeen.net> cc: fstests@vger.kernel.org cc: linux-api@vger.kernel.org cc: linux-man@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-04-26 15:05:47 -07:00
Dan Carpenter	907bfcd8d8	orangefs: handle zero size write in debugfs If we write zero bytes to this debugfs file, then it will cause an underflow when we do copy_from_user(buf, ubuf, count - 1). Debugfs can normally only be written to by root so the impact of this is low. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Mike Marshall <hubcap@omnibond.com>	2017-04-26 14:33:01 -04:00
Martin Brandenburg	b5a9d61eeb	orangefs: do not wait for timeout if umounting When the computer is turned off, all the processes are killed and then all the filesystems are umounted. OrangeFS should not wait for the userspace daemon to come back in that case. This only works for plain umount(2). To actually take advantage of this interactively, `umount -f' is needed; otherwise umount will issue a statfs first, which will wait for the userspace daemon to come back. Signed-off-by: Martin Brandenburg <martin@omnibond.com> Signed-off-by: Mike Marshall <hubcap@omnibond.com>	2017-04-26 14:33:01 -04:00
Martin Brandenburg	b7a57ccab8	orangefs: return from orangefs_devreq_read quickly if possible It is not necessary to take the lock and search through the request list if the list is empty. Signed-off-by: Martin Brandenburg <martin@omnibond.com> Signed-off-by: Mike Marshall <hubcap@omnibond.com>	2017-04-26 14:33:00 -04:00
Martin Brandenburg	9d286b0d82	orangefs: ensure the userspace component is unmounted if mount fails If the mount is aborted after userspace has been asked to mount, userspace must be told to unmount. Ordinarily orangefs_kill_sb does the unmount. However it cannot be called if the superblock has not been set up. This is a very narrow window. The NULL fs_id is not unmounted. Signed-off-by: Martin Brandenburg <martin@omnibond.com> Signed-off-by: Mike Marshall <hubcap@omnibond.com>	2017-04-26 14:33:00 -04:00
Martin Brandenburg	53950ef541	orangefs: do not check possibly stale size on truncate Let the server figure this out because our size might be out of date or not present. The bug was that xfs_io -f -t -c "pread -v 0 100" /mnt/foo echo "Test" > /mnt/foo xfs_io -f -t -c "pread -v 0 100" /mnt/foo fails because the second truncate did not happen if nothing had requested the size after the write in echo. Thus i_size was zero (not present) and the orangefs_setattr though i_size was zero and there was nothing to do. Signed-off-by: Martin Brandenburg <martin@omnibond.com> Cc: stable@vger.kernel.org Signed-off-by: Mike Marshall <hubcap@omnibond.com>	2017-04-26 14:33:00 -04:00
Martin Brandenburg	68a24a6cc4	orangefs: implement statx Fortunately OrangeFS has had a getattr request mask for a long time. The server basically has two difficulty levels for attributes. Fetching any attribute except size requires communicating with the metadata server for that handle. Since all the attributes are right there, it makes sense to return them all. Fetching the size requires communicating with every I/O server (that the file is distributed across). Therefore if asked for anything except size, get everything except size, and if asked for size, get everything. Signed-off-by: Martin Brandenburg <martin@omnibond.com> Signed-off-by: Mike Marshall <hubcap@omnibond.com>	2017-04-26 14:33:00 -04:00
Martin Brandenburg	7b796ae370	orangefs: remove ORANGEFS_READDIR macros They are clones of the ORANGEFS_ITERATE macros in use elsewhere. Delete ORANGEFS_ITERATE_NEXT which is a hack previously used by readdir. Signed-off-by: Martin Brandenburg <martin@omnibond.com> Signed-off-by: Mike Marshall <hubcap@omnibond.com>	2017-04-26 14:33:00 -04:00
Martin Brandenburg	480e3e532e	orangefs: support very large directories This works by maintaining a linked list of pages which the directory has been read into rather than one giant fixed-size buffer. This replaces code which limits the total directory size to the total amount that could be returned in one server request. Since filenames are usually considerably shorter than the maximum, the old code could usually handle several server requests before running out of space. Signed-off-by: Martin Brandenburg <martin@omnibond.com> Signed-off-by: Mike Marshall <hubcap@omnibond.com>	2017-04-26 14:33:00 -04:00
Martin Brandenburg	72f66b8329	orangefs: support llseek on directories This and the previous commit fix xfstests generic/257. Signed-off-by: Martin Brandenburg <martin@omnibond.com> Signed-off-by: Mike Marshall <hubcap@omnibond.com>	2017-04-26 14:33:00 -04:00

... 3 4 5 6 7 ...

49154 Commits