Here is the "big" set of driver core patches for 5.2-rc1
There are a number of ACPI patches in here as well, as Rafael said they
should go through this tree due to the driver core changes they
required. They have all been acked by the ACPI developers.
There are also a number of small subsystem-specific changes in here, due
to some changes to the kobject core code. Those too have all been acked
by the various subsystem maintainers.
As for content, it's pretty boring outside of the ACPI changes:
- spdx cleanups
- kobject documentation updates
- default attribute groups for kobjects
- other minor kobject/driver core fixes
All have been in linux-next for a while with no reported issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXNHDbw8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ynDAgCfbb4LBR6I50wFXb8JM/R6cAS7qrsAn1unshKV
8XCYcif2RxjtdJWXbjdm
=/rLh
-----END PGP SIGNATURE-----
Merge tag 'driver-core-5.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core/kobject updates from Greg KH:
"Here is the "big" set of driver core patches for 5.2-rc1
There are a number of ACPI patches in here as well, as Rafael said
they should go through this tree due to the driver core changes they
required. They have all been acked by the ACPI developers.
There are also a number of small subsystem-specific changes in here,
due to some changes to the kobject core code. Those too have all been
acked by the various subsystem maintainers.
As for content, it's pretty boring outside of the ACPI changes:
- spdx cleanups
- kobject documentation updates
- default attribute groups for kobjects
- other minor kobject/driver core fixes
All have been in linux-next for a while with no reported issues"
* tag 'driver-core-5.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (47 commits)
kobject: clean up the kobject add documentation a bit more
kobject: Fix kernel-doc comment first line
kobject: Remove docstring reference to kset
firmware_loader: Fix a typo ("syfs" -> "sysfs")
kobject: fix dereference before null check on kobj
Revert "driver core: platform: Fix the usage of platform device name(pdev->name)"
init/config: Do not select BUILD_BIN2C for IKCONFIG
Provide in-kernel headers to make extending kernel easier
kobject: Improve doc clarity kobject_init_and_add()
kobject: Improve docs for kobject_add/del
driver core: platform: Fix the usage of platform device name(pdev->name)
livepatch: Replace klp_ktype_patch's default_attrs with groups
cpufreq: schedutil: Replace default_attrs field with groups
padata: Replace padata_attr_type default_attrs field with groups
irqdesc: Replace irq_kobj_type's default_attrs field with groups
net-sysfs: Replace ktype default_attrs field with groups
block: Replace all ktype default_attrs with groups
samples/kobject: Replace foo_ktype's default_attrs field with groups
kobject: Add support for default attribute groups to kobj_type
driver core: Postpone DMA tear-down until after devres release for probe failure
...
in dbg_walk_index ubifs_load_znode is used to load the znode behind
a zbranch. ubifs_load_znode links the new child znode to the zbranch,
so doing it again is unnecessary.
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Richard Weinberger <richard@nod.at>
ifdefs reduce readability and compile coverage. This removes the ifdefs
around CONFIG_UBIFS_ATIME_SUPPORT by replacing them with IS_ENABLED()
where applicable. The fs layer would fall back to generic_update_time()
when .update_time doesn't exist. We do this fallback explicitly now.
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Richard Weinberger <richard@nod.at>
ifdefs reduce readablity and compile coverage. This removes the ifdefs
around CONFIG_FS_ENCRYPTION by using IS_ENABLED and relying on static
inline wrappers. A new static inline wrapper for setting sb->s_cop is
introduced to allow filesystems to unconditionally compile in their
s_cop operations.
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Richard Weinberger <richard@nod.at>
Since we have to write one deletion inode per xattr
into the journal, limit the max number of xattrs.
In theory UBIFS supported up to 65535 xattrs per inode.
But this never worked correctly, expect no powercuts happened.
Now we support only as many xattrs as we can store in 50% of a
LEB.
Even for tiny flashes this allows dozens of xattrs per inode,
which is for an embedded filesystem still fine.
In case someone has existing inodes with much more xattrs, it is
still possible to delete them.
UBIFS will fall back to an non-atomic deletion mode.
Reported-by: Stefan Agner <stefan@agner.ch>
Fixes: 1e51764a3c2ac ("UBIFS: add new flash file system")
Signed-off-by: Richard Weinberger <richard@nod.at>
Like for the journal case, make sure that we track all xattr
inodes.
Otherwise UBIFS might not be able to locate stale xattr inodes
upon recovery.
Reported-by: Stefan Agner <stefan@agner.ch>
Fixes: 1e51764a3c2ac ("UBIFS: add new flash file system")
Signed-off-by: Richard Weinberger <richard@nod.at>
If an inode hosts xattrs, create deletion entries for each
inode. That way we can make sure that upon journal replay UBIFS
can find find all xattr inodes.
Otherwise it can happen that GC consumed already a LEB which contained
parts of the TNC that pointed to the xattrs and we no longer
find all xattr inodes, which will confuse the LPT and cause
space allocation issues.
Reported-by: Stefan Agner <stefan@agner.ch>
Fixes: 1e51764a3c2ac ("UBIFS: add new flash file system")
Signed-off-by: Richard Weinberger <richard@nod.at>
Replace swap_dirty_idx function with built-in one,
because swap_dirty_idx does only a simple byte to byte swap.
Since Spectre mitigations have made indirect function calls more
expensive, and the default simple byte copies swap is implemented
without them, an "optimized" custom swap function is now
a waste of time as well as code.
Signed-off-by: Andrey Abramov <st5pub@yandex.ru>
Reviewed by: George Spelvin <lkml@sdf.org>
Signed-off-by: Richard Weinberger <richard@nod.at>
UBIFS bails out early from try_read_node() when it doesn't have to check
the CRC. Still the node hash has to be checked, otherwise wrong data
could be sneaked into the FS. Fix this by not bailing out early and
always checking the node hash.
Fixes: 16a26b20d2af ("ubifs: authentication: Add hashes to index nodes")
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Richard Weinberger <richard@nod.at>
Hi Linus,
This is my very first pull-request. I've been working full-time as
a kernel developer for more than two years now. During this time I've
been fixing bugs reported by Coverity all over the tree and, as part
of my work, I'm also contributing to the KSPP. My work in the kernel
community has been supervised by Greg KH and Kees Cook.
OK. So, after the quick introduction above, please, pull the following
patches that mark switch cases where we are expecting to fall through.
These patches are part of the ongoing efforts to enable -Wimplicit-fallthrough.
They have been ignored for a long time (most of them more than 3 months,
even after pinging multiple times), which is the reason why I've created
this tree. Most of them have been baking in linux-next for a whole development
cycle. And with Stephen Rothwell's help, we've had linux-next nag-emails
going out for newly introduced code that triggers -Wimplicit-fallthrough
to avoid gaining more of these cases while we work to remove the ones
that are already present.
I'm happy to let you know that we are getting close to completing this
work. Currently, there are only 32 of 2311 of these cases left to be
addressed in linux-next. I'm auditing every case; I take a look into
the code and analyze it in order to determine if I'm dealing with an
actual bug or a false positive, as explained here:
https://lore.kernel.org/lkml/c2fad584-1705-a5f2-d63c-824e9b96cf50@embeddedor.com/
While working on this, I've found and fixed the following missing
break/return bugs, some of them introduced more than 5 years ago:
84242b82d81c54e009a2aaa74d3d9eff70babf56
7850b51b6c21812be96d0200b74cff1f40587d98
5e420fe635813e5746b296cfc8fff4853ae205a2
09186e503486da4a17f16f2f7c679e6e3e2a32f4
b5be853181a8d4a6e20f2073ccd273d6280cad88
7264235ee74f51d26fbdf97bf98c6102a460484f
cc5034a5d293dd620484d1d836aa16c6764a1c8c
479826cc86118e0d87e5cefb3df5b748e0480924
5340f23df8fe27a270af3fa1a93cd07293d23dd9
df997abeebadaa4824271009e2d2b526a70a11cb
2f10d823739680d2477ce34437e8a08a53117f40
307b00c5e695857ca92fc6a4b8ab6c48f988a1b1
5d25ff7a544889bc4b749fda31778d6a18dddbcb
a7ed5b3e7dca197de4da6273940a7ca6d1d756a1
c24bfa8f21b59283580043dada19a6e943b6e426
ad0eaee6195db1db1749dd46b9e6f4466793d178
9ba8376ce1e2cbf4ce44f7e4bee1d0648e10d594
dc586a60a11d0260308db1bebe788ad8973e2729
a8e9b186f153a44690ad0363a56716e7077ad28c
4e57562b4846e42cd1c2e556f0ece18c1154e116
60747828eac28836b49bed214399b0c972f19df3
c5b974bee9d2ceae4c441ae5a01e498c2674e100
cc44ba91166beb78f9cb29d5e3d41c0a2d0a7329
2c930e3d0aed1505e86e0928d323df5027817740
Once this work is finish, we'll be able to universally enable
"-Wimplicit-fallthrough" to avoid any of these kinds of bugs from
entering the kernel again.
Thanks
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEkmRahXBSurMIg1YvRwW0y0cG2zEFAlzQR2IACgkQRwW0y0cG
2zEJbQ//X930OcBtT/9DRW4XL1Jeq0Mjssz/GLX2Vpup5CwwcTROG65no80Zezf/
yQRWnUjGX0OBv/fmUK32/nTxI/7k7NkmIXJHe0HiEF069GEENB7FT6tfDzIPjU8M
qQkB8NsSUWJs3IH6BVynb/9MGE1VpGBDbYk7CBZRtRJT1RMM+3kQPucgiZMgUBPo
Yd9zKwn4i/8tcOCli++EUdQ29ukMoY2R3qpK4LftdX9sXLKZBWNwQbiCwSkjnvJK
I6FDiA7RaWH2wWGlL7BpN5RrvAXp3z8QN/JZnivIGt4ijtAyxFUL/9KOEgQpBQN2
6TBRhfTQFM73NCyzLgGLNzvd8awem1rKGSBBUvevaPbgesgM+Of65wmmTQRhFNCt
A7+e286X1GiK3aNcjUKrByKWm7x590EWmDzmpmICxNPdt5DHQ6EkmvBdNjnxCMrO
aGA24l78tBN09qN45LR7wtHYuuyR0Jt9bCmeQZmz7+x3ICDHi/+Gw7XPN/eM9+T6
lZbbINiYUyZVxOqwzkYDCsdv9+kUvu3e4rPs20NERWRpV8FEvBIyMjXAg6NAMTue
K+ikkyMBxCvyw+NMimHJwtD7ho4FkLPcoeXb2ZGJTRHixiZAEtF1RaQ7dA05Q/SL
gbSc0DgLZeHlLBT+BSWC2Z8SDnoIhQFXW49OmuACwCUC68NHKps=
=k30z
-----END PGP SIGNATURE-----
Merge tag 'Wimplicit-fallthrough-5.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux
Pull Wimplicit-fallthrough updates from Gustavo A. R. Silva:
"Mark switch cases where we are expecting to fall through.
This is part of the ongoing efforts to enable -Wimplicit-fallthrough.
Most of them have been baking in linux-next for a whole development
cycle. And with Stephen Rothwell's help, we've had linux-next
nag-emails going out for newly introduced code that triggers
-Wimplicit-fallthrough to avoid gaining more of these cases while we
work to remove the ones that are already present.
We are getting close to completing this work. Currently, there are
only 32 of 2311 of these cases left to be addressed in linux-next. I'm
auditing every case; I take a look into the code and analyze it in
order to determine if I'm dealing with an actual bug or a false
positive, as explained here:
https://lore.kernel.org/lkml/c2fad584-1705-a5f2-d63c-824e9b96cf50@embeddedor.com/
While working on this, I've found and fixed the several missing
break/return bugs, some of them introduced more than 5 years ago.
Once this work is finished, we'll be able to universally enable
"-Wimplicit-fallthrough" to avoid any of these kinds of bugs from
entering the kernel again"
* tag 'Wimplicit-fallthrough-5.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux: (27 commits)
memstick: mark expected switch fall-throughs
drm/nouveau/nvkm: mark expected switch fall-throughs
NFC: st21nfca: Fix fall-through warnings
NFC: pn533: mark expected switch fall-throughs
block: Mark expected switch fall-throughs
ASN.1: mark expected switch fall-through
lib/cmdline.c: mark expected switch fall-throughs
lib: zstd: Mark expected switch fall-throughs
scsi: sym53c8xx_2: sym_nvram: Mark expected switch fall-through
scsi: sym53c8xx_2: sym_hipd: mark expected switch fall-throughs
scsi: ppa: mark expected switch fall-through
scsi: osst: mark expected switch fall-throughs
scsi: lpfc: lpfc_scsi: Mark expected switch fall-throughs
scsi: lpfc: lpfc_nvme: Mark expected switch fall-through
scsi: lpfc: lpfc_nportdisc: Mark expected switch fall-through
scsi: lpfc: lpfc_hbadisc: Mark expected switch fall-throughs
scsi: lpfc: lpfc_els: Mark expected switch fall-throughs
scsi: lpfc: lpfc_ct: Mark expected switch fall-throughs
scsi: imm: mark expected switch fall-throughs
scsi: csiostor: csio_wr: mark expected switch fall-through
...
Building this file with clang can result in large stack usage as seen from
this warning:
fs/ubifs/auth.c:78:5: error: stack frame size of 1152 bytes in function 'ubifs_prepare_auth_node'
The problem is that inlining ubifs_hash_calc_hmac() leads to
two SHASH_DESC_ON_STACK() blocks in the same function, and clang
for some reason does not reuse the stack space as it should.
Putting the first declaration into a separate basic block avoids
this problem and reduces the stack allocation to 640 bytes.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
There is no callers in tree, and can be removed.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Mukesh Ojha <mojha@codeaurora.org>
Signed-off-by: Richard Weinberger <richard@nod.at>
When !CONFIG_FS_ENCRYPTION, fscrypt_ioctl_get_policy() is already
stubbed out to return -EOPNOTSUPP, so the extra #ifdef is not needed.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
In ubifs_unlink() and ubifs_rmdir(), remove the call to
fscrypt_get_encryption_info() that precedes fscrypt_setup_filename().
This call was unnecessary, because fscrypt_setup_filename() already
tries to set up the directory's encryption key.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE7btrcuORLb1XUhEwjrBW1T7ssS0FAlzReuoACgkQjrBW1T7s
sS1uvBAA16pgnhRNxNTrp3LYft6lUWmF4n0baOTVtQNLhPjpwaOxHIrCBugkQCJB
QcQ9IQSOvIkaEW0XAQoPBaeLviiKhHOFw1Fv89OtW6xUidSfSV15lcI9f1F2pCm2
4yCL/8XvL6M0NhxiwftJAkWOXeDNLfjFnLwyLxBfgg3EeyqMgUB8raeosEID0ORR
gm2/g8DYS2r+KNqM/F4xvMSgabfi2bGk+8BtAaVnftJfstpRNrqKwWnSK3Wspj1l
5gkb8gSsiY6ns3V6RgNHrFlhevFg8V+VjcJt7FR+aUEjOkcoiXas/PhvamMzdsn/
FM1F/A0pM8FSybIUClhnnnxNPc+p8ZN/71YQAPs+Mnh3xvbtKea2lkhC+Xv4OpK3
edutSZWFaiIery82Rk00H3vqiSF1+kRIXSpZSS4mElk4FsVljkyH+nSP7rbmE2MR
EQe+kKnZl8QzWrVbnODC+EVvvVpA2bXDvENJmvKqus+t2G0OdV7Iku3F5E3KjF8k
S5RRV1zuBF3ugqnjmYrVmJtpEA8mxClmqvg6okru+qW6ngO5oOgVpPLjWn1CXcdj
wcuQ6Pe1QwAHS54e9WSWgCHVssLvm9nCdCqypdNaoyGWmbTWntwlrY7Y0JUQnAbB
6/G/DQQiCWY9y8bMZlTEydhIpgcsdROuPYv+oHF5+eQQthsWwHc=
=LH11
-----END PGP SIGNATURE-----
Merge tag 'pidfd-v5.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux
Pull pidfd updates from Christian Brauner:
"This patchset makes it possible to retrieve pidfds at process creation
time by introducing the new flag CLONE_PIDFD to the clone() system
call. Linus originally suggested to implement this as a new flag to
clone() instead of making it a separate system call.
After a thorough review from Oleg CLONE_PIDFD returns pidfds in the
parent_tidptr argument. This means we can give back the associated pid
and the pidfd at the same time. Access to process metadata information
thus becomes rather trivial.
As has been agreed, CLONE_PIDFD creates file descriptors based on
anonymous inodes similar to the new mount api. They are made
unconditional by this patchset as they are now needed by core kernel
code (vfs, pidfd) even more than they already were before (timerfd,
signalfd, io_uring, epoll etc.). The core patchset is rather small.
The bulky looking changelist is caused by David's very simple changes
to Kconfig to make anon inodes unconditional.
A pidfd comes with additional information in fdinfo if the kernel
supports procfs. The fdinfo file contains the pid of the process in
the callers pid namespace in the same format as the procfs status
file, i.e. "Pid:\t%d".
To remove worries about missing metadata access this patchset comes
with a sample/test program that illustrates how a combination of
CLONE_PIDFD and pidfd_send_signal() can be used to gain race-free
access to process metadata through /proc/<pid>.
Further work based on this patchset has been done by Joel. His work
makes pidfds pollable. It finished too late for this merge window. I
would prefer to have it sitting in linux-next for a while and send it
for inclusion during the 5.3 merge window"
* tag 'pidfd-v5.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
samples: show race-free pidfd metadata access
signal: support CLONE_PIDFD with pidfd_send_signal
clone: add CLONE_PIDFD
Make anon_inodes unconditional
https://lore.kernel.org/linux-fsdevel/CAHk-=wg1tFzcaX2v9Z91vPJiBR486ddW5MtgDL02-fOen2F0Aw@mail.gmail.com/T/#m5b2d9ad3aeacea4bd6aa1964468ac074bf3aa5bf
-----BEGIN PGP SIGNATURE-----
iQJEBAABCgAuFiEECVWwJCUO7/z+QjZbZsp4hBP2dUkFAlzR1UgQHGtpcnJAbmV4
ZWRpLmNvbQAKCRBmyniEE/Z1SZBiEACGw1LzUmjV9eBYFjqaUkgX/Zfcu42D4Ek2
8MuWnNdRabtpGQq0LccYlfoL3yH5xECp14IkCgJvkjqoZ3CcqWcv6uDxf0WtnUqZ
wPx1RYZykb4RZj2A6/ndhInReP4AlXICyTVulKb+BquVkemMvmXX8k+bkr/msKfT
9jdKWFIn+ANNABt3y2D7ywZvs9mkxIx+Fti+tVV4BFBeGfUuj4ArZBOHnngRnIk/
XYlQ7FVzENSPSB+3GvL34jTGEzo8suPHKhHQlIhtcd5hwzVRZKE2sdVXsCc6/WbY
YnT32gmT1/+cUuDl1mZSiQY5R4Xkb07k6/jNrdmjQpwmWbZu90cuRhb+JBXwnmjZ
2Wgy3sfwYISDxtePukg1iYePlHlVlGTYqMo3AQrTBs/gEwCKWrsKQb98mRxlf1YK
e2mdtmq6upYoorLFQesfRgrCg4GTBiPkrR3amXsFgJ2O5fhV6R98ZdGSv4kip19f
ZNoc/t1EtKGwyAJwjINduv36E3RSHODWwSPtSnmSS1ieCGToY1SI3bVUkFM4C0tO
5GMdSugHgXRGGVbTd/VftndJm6Wtj8b1j8c/1Vh04Q8qbKKJDRTDzAbK1v8oLaDh
UXAKMIc8uY4caZy3/bTAB2Ou9dibrSi8Oc+LwZqJlwIcbkwn/IGNvmwtWv4ehorE
N7EhCFZsFQ==
=Mavg
-----END PGP SIGNATURE-----
Merge tag 'stream_open-5.2' of https://lab.nexedi.com/kirr/linux
Pull stream_open conversion from Kirill Smelkov:
- remove unnecessary double nonseekable_open from drivers/char/dtlk.c
as noticed by Pavel Machek while reviewing nonseekable_open ->
stream_open mass conversion.
- the mass conversion patch promised in commit 10dce8af3422 ("fs:
stream_open - opener for stream-like files so that read and write can
run simultaneously without deadlock") and is automatically generated
by running
$ make coccicheck MODE=patch COCCI=scripts/coccinelle/api/stream_open.cocci
I've verified each generated change manually - that it is correct to
convert - and each other nonseekable_open instance left - that it is
either not correct to convert there, or that it is not converted due
to current stream_open.cocci limitations. More details on this in the
patch.
- finally, change VFS to pass ppos=NULL into .read/.write for files
that declare themselves streams. It was suggested by Rasmus Villemoes
and makes sure that if ppos starts to be erroneously used in a stream
file, such bug won't go unnoticed and will produce an oops instead of
creating illusion of position change being taken into account.
Note: this patch does not conflict with "fuse: Add FOPEN_STREAM to
use stream_open()" that will be hopefully coming via FUSE tree,
because fs/fuse/ uses new-style .read_iter/.write_iter, and for these
accessors position is still passed as non-pointer kiocb.ki_pos .
* tag 'stream_open-5.2' of https://lab.nexedi.com/kirr/linux:
vfs: pass ppos=NULL to .read()/.write() of FMODE_STREAM files
*: convert stream-like files from nonseekable_open -> stream_open
dtlk: remove double call to nonseekable_open
- Fix some more buffer deadlocks when performing an unmount after a hard
shutdown.
- Fix some minor space accounting issues.
- Fix some use after free problems.
- Make the (undocumented) FITRIM behavior consistent with other filesystems.
- Embiggen the xfs geometry ioctl's data structure.
- Introduce a new AG geometry ioctl.
- Introduce a new online health reporting infrastructure and ioctl for
userspace to query a filesystem's health status.
- Enhance online scrub and repair to update the health reports.
- Reduce thundering herd problems when writeback io completes.
- Fix some transaction reservation type errors.
- Fix integer overflow problems with delayed alloc reservation counters.
- Fix some problems where we would exit to userspace without unlocking.
- Fix inconsistent behavior when finishing deferred ops fails.
- Strengthen scrub to check incore data against ondisk metadata.
- Remove long-broken mntpt mount option.
- Add an online scrub function for the filesystem summary counters,
which should make online metadata scrub more or less feature complete
for now.
- Various cleanups.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEUzaAxoMeQq6m2jMV+H93GTRKtOsFAlzMUEwACgkQ+H93GTRK
tOvvHw//bou5YL/gMrsxCbg2b7rpsNG3TIOz5Kq52V3JMtyqFWzArpCBEskVZXD0
J4ZXZMv/VSvI2DgV22w8/vlsjPJiODPu5mIqmcyQmZK8eDg4sL7EKVa601F57kyj
QPrdT1AnxUl+n1gM4XrV57xmsNwLYMVHKQC9e5MS6LSu7+1sarw7HFxSY1AMG9Ys
skEvwe762LCbMsnBBLCs/ZeqHWlqDok9HiKZNCj35aRrLV9dA97mjznlBFUJbhbw
kAG2jeAaG0LQnlnCzPRd3HJqQXlGL4044gx73RRY3+/POVYupKiC9KSlImq/cbLd
n7UWHVDieWoOuLKverUICw4UuqtkAXurUCW7w91ipEmZUlYKNrMNNXiEm7pfgJU3
A2KK1R14UYKJ3zX6xPz4mdlYhh0KB/xlN01Rdzhrhk9XKfL92/YyjpyjcTIeUZm8
RNKLAoWRpJZPou3RPpfZLFTSmtYIcTB92kYV6XpQ3DRrJLjlaHbu9VJaipbZGhxY
rdF+Rtk8EjKMFP0bixDHePWCu7317vMy1lbpO5UipxyC9eTwry54EaCxP03CI7YO
OAsqCdf8HYlGqEjWKprkCczMYkDRDT0p4bS27Rdzc1D5lUj6/g5hF7RD0MYc1/eA
ZDQUqVgBTmAQp+tKPTHhuSTWyZ8IIt0kdg5Z8IRVWxd+SmwwGoo=
=d2sO
-----END PGP SIGNATURE-----
Merge tag 'xfs-5.2-merge-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull xfs updates from Darrick Wong:
"Here's a big pile of new stuff for XFS for 5.2. XFS has grown the
ability to report metadata health status to userspace after online
fsck checks the filesystem. The online metadata checking code is (I
really hope) feature complete with the addition of checks for the
global fs counters, though it'll remain EXPERIMENTAL for now.
There are also fixes for thundering herds of writeback completions and
some other deadlocks, fixes for theoretical integer overflow attacks
on space accounting, and removal of the long-defunct 'mntpt' option
which was deprecated in the mid-2000s and (it turns out) totally
broken since 2011 (and nobody complained...).
Summary:
- Fix some more buffer deadlocks when performing an unmount after a
hard shutdown.
- Fix some minor space accounting issues.
- Fix some use after free problems.
- Make the (undocumented) FITRIM behavior consistent with other
filesystems.
- Embiggen the xfs geometry ioctl's data structure.
- Introduce a new AG geometry ioctl.
- Introduce a new online health reporting infrastructure and ioctl
for userspace to query a filesystem's health status.
- Enhance online scrub and repair to update the health reports.
- Reduce thundering herd problems when writeback io completes.
- Fix some transaction reservation type errors.
- Fix integer overflow problems with delayed alloc reservation
counters.
- Fix some problems where we would exit to userspace without
unlocking.
- Fix inconsistent behavior when finishing deferred ops fails.
- Strengthen scrub to check incore data against ondisk metadata.
- Remove long-broken mntpt mount option.
- Add an online scrub function for the filesystem summary counters,
which should make online metadata scrub more or less feature
complete for now.
- Various cleanups"
* tag 'xfs-5.2-merge-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (38 commits)
xfs: change some error-less functions to void types
xfs: add online scrub for superblock counters
xfs: don't parse the mtpt mount option
xfs: always rejoin held resources during defer roll
xfs: add missing error check in xfs_prepare_shift()
xfs: scrub should check incore counters against ondisk headers
xfs: allow scrubbers to pause background reclaim
xfs: rename the speculative block allocation reclaim toggle functions
xfs: track delayed allocation reservations across the filesystem
xfs: fix broken bhold behavior in xrep_roll_ag_trans
xfs: unlock inode when xfs_ioctl_setattr_get_trans can't get transaction
xfs: kill the xfs_dqtrx_t typedef
xfs: widen inode delalloc block counter to 64-bits
xfs: widen quota block counters to 64-bit integers
xfs: abort unaligned nowait directio early
xfs: assert that we don't enter agfl freeing with a non-permanent transaction
xfs: make tr_growdata a permanent transaction
xfs: merge adjacent io completions of the same type
xfs: remove unused m_data_workqueue
xfs: implement per-inode writeback completion queues
...
- Add some extra hooks to the iomap buffered write path to enable gfs2
journalled writes.
- SPDX conversion
- Various refactoring.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEUzaAxoMeQq6m2jMV+H93GTRKtOsFAlzMUJsACgkQ+H93GTRK
tOsyHxAAjnAAO2ABOt2x9fdsZbuc/3Ox1C0388J21uUOm6lgtKCFm/snVmvC7BMa
t9bFOS8Y7RLgHclCkEHy0irsHVVuQl+6XyYrjFaPzkoRnVgViZM5aZGSNkBRiBEM
xVAog5IFLTx59NT41B4pn9y361BFwfHiFRsDgtSVNlv8UsbKdpAMBMX9ezjNLgWI
H5qJZXfzk5LyNG/jsOe+srwVXsboILvPAiDNP95g2KzrXZMvnf8MsMvAe9cSO9SD
ERHn9nX5b4hiwiL12lCl10QOsROmElzP82GHJBctFDzdfOfSRuRZw69lFSzf/2CT
xVypJBm7xVBJ7K50x8KlF1aSLqnxHi/wszS6BaowoMtkPbJRx+FC7M8FCnNr5WtF
DxJduFBUchbNKt1o2x98Evoqjx6eVp92XsCdjsJ05LQo7cxlwECfYhjruwyg/h16
qdE+6KmUwOOiqMQ6Z8kvrejpuIq2rcHlJydDojN+lIbbzmtge8ob/q/A1J8FT6k9
pzVW3y1h7yvgi0ClaQu2DCfx2is2Bd4y0w1b/Y/0jkV9aVbtPqt0akqcaLtAUgc9
25CkJL0sc7QB88Kd3sP9k4aGlQ2TAx52+3TWDo+CBbfPHBTMDWnGB1nE742WLO4v
neH+wSzLP/6U9JkxyRpiYHD+6zLAzq2xTZeiRSXYuzrRqVxaurY=
=Qumg
-----END PGP SIGNATURE-----
Merge tag 'iomap-5.2-merge-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull iomap updates from Darrick Wong:
"Nothing particularly exciting here, just adding some callouts for gfs2
and cleaning a few things.
Summary:
- Add some extra hooks to the iomap buffered write path to enable
gfs2 journalled writes
- SPDX conversion
- Various refactoring"
* tag 'iomap-5.2-merge-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
iomap: move iomap_read_inline_data around
iomap: Add a page_prepare callback
iomap: Fix use-after-free error in page_done callback
fs: Turn __generic_write_end into a void function
iomap: Clean up __generic_write_end calling
iomap: convert to SPDX identifier
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEIodevzQLVs53l6BhNqiEXrVAjGQFAlzQUCsACgkQNqiEXrVA
jGTXtA/9GJHwg22NvyVxGWF4GLLamFPqgiyaj/XhNw2+2BkK4I60uIPI9QucKG8C
RWMhMI3ZZH6dxiGitPQ4hLncEVcTTcHNEwhGcgT85M4tOs+g7mVf03+X0xrveXVB
GMFdf2ETWE80KkIUHaITAHBm/WU7FZG81RQg8IYr8aIqxg3Ey5sowU0vVg54sn61
jNV2h/UFa4VnPX4o+5GbKZ8gZylBoYDLV9WPlD38BRa10eZDjVcDeASBefvTqQO0
n/jBzqkWGMPqj2juKXL1MX2Zr+LnUL9An63Ak7EX95slVjiMmncffVJfyY/Sewoa
hTGIck19NABJsyO83BJqrF0C/c6QbgeRkZS8yZjIVZYF7RbyjQD18TqCZFJMoVpV
xyZ2dUg5DE2nFSLzIXE1JprPuOgNkY87vwO/PAO1jka7LPk2qQZUKGL3+Xw4At6v
XNohTbQKhfqYgeMjr2H9aAw53SnAXiHwCaje+mK26WZsT3KqgN4BbtWKEBcYrSYz
J1ivB7mpNhYwbizUWFZT5znA/ItqwN8YbCazGlxYFOhXgFWUbUXFds250w8F6Iy7
jp3RkEB8HXuFoFtyVjLTYnoM87l37fTQSTrkl5k5CsQ3kges7v6nxGSaZUHE1Iho
MQZr151zKQrcmKwjKGxVFMKFTU94x8TYALqYO6iDbmu8rra7nWw=
=M5Gm
-----END PGP SIGNATURE-----
Merge tag 'jfs-5.2' of git://github.com/kleikamp/linux-shaggy
Pull jfs updates from Dave Kleikamp:
"Several minor jfs fixes"
* tag 'jfs-5.2' of git://github.com/kleikamp/linux-shaggy:
jfs: fix bogus variable self-initialization
fs/jfs: Switch to use new generic UUID API
jfs: compare old and new mode before setting update_mode flag
jfs: remove incorrect comment in jfs_superblock
jfs: fix spelling mistake, EACCESS -> EACCES
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAlzQM7MACgkQxWXV+ddt
WDvrVw/+K0AElSuEfDFWd9HBqRAPlGaEP71xCGGle1tkzuY0DJVIBRZ72q8UR0YP
7yke7DU0oqXekGype83eTJUjDSLoOXrlVoQ+VqBdFteDk0W4BCG6Nw+N+wYBF7An
gXRXlGFaYzb2CqqjG92FbtkfxBzISR0XBCQBUN9CBqHNDu1EUQSbnTBkmTMN8MYh
PCoo37S6e5fR36uB/rOKbGNBJjsZEEg/2G6DprP52+eiQWV2h0avEUJrvv6xC4so
97QNgUNuuiUmyurqcYHdlaflZwIhuf5nQeNeu/UvMZmmRnBHPhSP7YPM7f7FftwA
y0d0p+AiEAO0he8nGFb5C6Avs4vuv1u65o1NbF5fqnmAyt+KXWem3LeG6etsXgU8
+eITgprJD3sNBMDLbLoA+wlhTps+w9tukVF5Zp2a8KgQLMMEyAYqUDWmSHvnO2Me
RCNPZLzeGXETgKun0WuMtl/CX2iBDnc0Kq5O6ks2ORl2TH6bg5lgEIwr6HP/Ewoy
w8twsmCOltrxiIptqyQHYD+kvNwqMVV9LSOQ8+EjbYd6BHsfjHjKObOBkhmJ7iqz
4MAIcZU++F9DLRv92H1kUYVNhAMCdXkEIWyxhZPwN1lUi5k9AhknY3FbheNc7ldl
LNPIgRxamWCq9oBmzfOcJ3eFOBtNN02fgA1GTXGd1/AgAilEep8=
=fEkD
-----END PGP SIGNATURE-----
Merge tag 'for-5.2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs updates from David Sterba:
"This time the majority of changes are cleanups, though there's still a
number of changes of user interest.
User visible changes:
- better read time and write checks to catch errors early and before
writing data to disk (to catch potential memory corruption on data
that get checksummed)
- qgroups + metadata relocation: last speed up patch int the series
to address the slowness, there should be no overhead comparing
balance with and without qgroups
- FIEMAP ioctl does not start a transaction unnecessarily, this can
result in a speed up and less blocking due to IO
- LOGICAL_INO (v1, v2) does not start transaction unnecessarily, this
can speed up the mentioned ioctl and scrub as well
- fsync on files with many (but not too many) hardlinks is faster,
finer decision if the links should be fsynced individually or
completely
- send tries harder to find ranges to clone
- trim/discard will skip unallocated chunks that haven't been touched
since the last mount
Fixes:
- send flushes delayed allocation before start, otherwise it could
miss some changes in case of a very recent rw->ro switch of a
subvolume
- fix fallocate with qgroups that could lead to space accounting
underflow, reported as a warning
- trim/discard ioctl honours the requested range
- starting send and dedupe on a subvolume at the same time will let
only one of them succeed, this is to prevent changes that send
could miss due to dedupe; both operations are restartable
Core changes:
- more tree-checker validations, errors reported by fuzzing tools:
- device item
- inode item
- block group profiles
- tracepoints for extent buffer locking
- async cow preallocates memory to avoid errors happening too deep in
the call chain
- metadata reservations for delalloc reworked to better adapt in
many-writers/low-space scenarios
- improved space flushing logic for intense DIO vs buffered workloads
- lots of cleanups
- removed unused struct members
- redundant argument removal
- properties and xattrs
- extent buffer locking
- selftests
- use common file type conversions
- many-argument functions reduction"
* tag 'for-5.2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (227 commits)
btrfs: Use kvmalloc for allocating compressed path context
btrfs: Factor out common extent locking code in submit_compressed_extents
btrfs: Set io_tree only once in submit_compressed_extents
btrfs: Replace clear_extent_bit with unlock_extent
btrfs: Make compress_file_range take only struct async_chunk
btrfs: Remove fs_info from struct async_chunk
btrfs: Rename async_cow to async_chunk
btrfs: Preallocate chunks in cow_file_range_async
btrfs: reserve delalloc metadata differently
btrfs: track DIO bytes in flight
btrfs: merge calls of btrfs_setxattr and btrfs_setxattr_trans in btrfs_set_prop
btrfs: delete unused function btrfs_set_prop_trans
btrfs: start transaction in xattr_handler_set_prop
btrfs: drop local copy of inode i_mode
btrfs: drop old_fsflags in btrfs_ioctl_setflags
btrfs: modify local copy of btrfs_inode flags
btrfs: drop useless inode i_flags copy and restore
btrfs: start transaction in btrfs_ioctl_setflags()
btrfs: export btrfs_set_prop
btrfs: refactor btrfs_set_props to validate externally
...
Pull vfs stable fodder fixes from Al Viro:
- acct_on() fix for deadlock caught by overlayfs folks
- autofs RCU use-after-free SNAFU (->d_manage() can be called
locklessly, so we need to RCU-delay freeing the objects it looks at)
- (hopefully) the end of "do we need freeing this dentry RCU-delayed"
whack-a-mole.
* 'stable-fodder' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
autofs: fix use-after-free in lockless ->d_manage()
dcache: sort the freeing-without-RCU-delay mess for good.
acct_on(): don't mess with freeze protection
Pull vfs inode freeing updates from Al Viro:
"Introduction of separate method for RCU-delayed part of
->destroy_inode() (if any).
Pretty much as posted, except that destroy_inode() stashes
->free_inode into the victim (anon-unioned with ->i_fops) before
scheduling i_callback() and the last two patches (sockfs conversion
and folding struct socket_wq into struct socket) are excluded - that
pair should go through netdev once davem reopens his tree"
* 'work.icache' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (58 commits)
orangefs: make use of ->free_inode()
shmem: make use of ->free_inode()
hugetlb: make use of ->free_inode()
overlayfs: make use of ->free_inode()
jfs: switch to ->free_inode()
fuse: switch to ->free_inode()
ext4: make use of ->free_inode()
ecryptfs: make use of ->free_inode()
ceph: use ->free_inode()
btrfs: use ->free_inode()
afs: switch to use of ->free_inode()
dax: make use of ->free_inode()
ntfs: switch to ->free_inode()
securityfs: switch to ->free_inode()
apparmor: switch to ->free_inode()
rpcpipe: switch to ->free_inode()
bpf: switch to ->free_inode()
mqueue: switch to ->free_inode()
ufs: switch to ->free_inode()
coda: switch to ->free_inode()
...
xfstest generic/452 was triggering a "Busy inodes after umount" warning.
ceph was allowing the mount to go read-only without first flushing out
dirty inodes in the cache. Ensure we sync out the filesystem before
allowing a remount to proceed.
Cc: stable@vger.kernel.org
Link: http://tracker.ceph.com/issues/39571
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
GCC9 is throwing a lot of warnings about unaligned accesses by
callers of ceph_pr_addr. All of the current callers are passing a
pointer to the sockaddr inside struct ceph_entity_addr.
Fix it to take a pointer to a struct ceph_entity_addr instead,
and then have the function make a copy of the sockaddr before
printing it.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
To make it easier to correlate with MDS logs.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
I originally thought there was a potential race here, but the fact
that this is called with the mdsc->mutex held, ensures that the
last reference to the session can't be put here.
Still, it's clearer to just return the value from get_session here,
and may prevent a bug later if we ever rework this code to be less
reliant on mutexes.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
The return of this function is rather complex. It can return 0 or 1,
and in the case of a 1 return, the "err" pointer will be filled out.
This necessitates a lot of copying of values.
We can achieve the same effect by just returning 0, 1 or a negative
error code, and drop the "err" argument from this function.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
It's not clear what AUTH_RDCACHE means in this context, and we're
clearly just dropping LINK caps here.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Nothing calls ceph_mdsc_submit_request today, but in later patches we'll
need to be able to call this separately.
Have the helper return an int so we can check the r_err under the mutex,
and have the caller just check the error code from the submit. Also move
the acquisition of CEPH_CAP_PIN references into the same function.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
No MDS requests use r_callback today, but that will change in the
future. The OSD client always does r_callback and then completes
r_completion. Let's have the MDS client do the same.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
We make copies of the dentry name in set_request_path_attr, but then
create_request_message re-fetches the lengths out of the dentry. While
we don't currently set the *_drop fields unless the parents are locked,
it's still better not to rely on that sort of implicit assumption.
Use the pathlen values that set_request_path_attr returned instead, as
they will always be correct for the returned paths themselves.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Al suggested we get rid of the kmalloc here and just use __getname
and __putname to get a full PATH_MAX pathname buffer.
Since we build the path in reverse, we continue to return a pointer
to the beginning of the string and the length, and add a new helper
to free the thing at the end.
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
While it may be slightly more efficient, it's probably not worthwhile to
optimize for the case that clone_dentry_name handles. We can get the
same result by just calling ceph_mdsc_build_path when the parent isn't
locked, with less code duplication.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
temp is not defined outside of the RCU critical section here. Ensure
we grab that value before we drop the rcu_read_lock.
Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
We have a "caps" file already that gives statistics on the caps
cache as a whole. Add another section to that output and dump a
line for each individual cap record.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
cephfs can benefit from statx. We can have the client just request caps
sufficient for the needed attributes and leave off the rest.
Also, recognize when AT_STATX_DONT_SYNC is set, and just scrape the
inode without doing any call in that case. Force a call to the MDS in
the event that AT_STATX_FORCE_SYNC is set.
Link: http://tracker.ceph.com/issues/39258
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Reviewed-by: David Howells <dhowells@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Originally, filemap_write_and_wait took the i_mutex internally, but
commit 02c24a82187d pushed the mutex acquisition into the individual
fsync routines, leaving it up to the subsystem maintainers to remove
it if it wasn't needed.
For ceph, I see no reason to take the inode_lock here. All of the
operations inside that lock are protected by their own locking.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
To support snapshot nfs re-export, we need a way to lookup snapped
inode by file handle. For directory inode, snapped metadata are always
stored together with head inode. Client just need to pass vinodeno_t
to MDS. For non-directory inode, there can be multiple version of
snapped inodes and they can be stored in different dirfrags. Besides
vinodeno_t, client also need to tell mds from which dirfrag it got the
snapped inode.
Another problem of supporting snapshot nfs re-export is that there
can be multiple paths to access a snapped inode. For example:
mkdir -p d1/d2/d3
mkdir d1/.snap/s1
Paths 'd1/.snap/s1/d2/d3', 'd1/d2/.snap/_s1_<inode number of d1>/d3'
and 'd1/d2/d3/.snap/_s1_<inode number of d1>' are all reference to the
same snapped inode. For a given snapped inode, There is no convenient
way to get the first form and the second form paths. For simplicity,
ceph_get_parent() return snapdir for snapped directory inode.
Furthermore, client may access snapshot of deleted directory. For
example:
mkdir -p d1/d2
mkdir d1/.snap/s1
open d1/.snap/s1/d2
rm -rf d1/d2
<nfs server restart>
The path constucted by ceph_get_parent() and ceph_get_name() is
'<inode of d2>/.snap/_s1_<inode number of d1>'. Futher lookup parent
of <inode of d2> will fail. To workaround this case, this patch uses
d_obtain_root() to get dentry for snapdir of deleted directory.
snapdir dentry has no DCACHE_DISCONNECTED flag set, reconnect_path()
stops when it reaches snapdir dentry.
Link: http://tracker.ceph.com/issues/22105
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
The CephFS kernel client does not enforce quotas set in a directory that
isn't visible from the mount point. For example, given the path
'/dir1/dir2', if quotas are set in 'dir1' and the filesystem is mounted with
mount -t ceph <server>:<port>:/dir1/ /mnt
then the client won't be able to access 'dir1' inode, even if 'dir2' belongs
to a quota realm that points to it.
This patch fixes this issue by simply doing an MDS LOOKUPINO operation for
unknown inodes. Any inode reference obtained this way will be added to a
list in ceph_mds_client, and will only be released when the filesystem is
umounted.
Link: https://tracker.ceph.com/issues/38482
Reported-by: Hendrik Peyerl <hpeyerl@plusline.net>
Signed-off-by: Luis Henriques <lhenriques@suse.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
This function will be used by __fh_to_dentry and by the quotas code, to
find quota realm inodes that are not visible in the mountpoint.
Signed-off-by: Luis Henriques <lhenriques@suse.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Inode i_filelock_ref is increased in ceph_lock or ceph_flock, but it is
increased again in ceph_lock_message. This results in this ref won't
become zero. If CEPH_I_ERROR_FILELOCK flag is set in
remove_session_caps once, this flag can't be cleared even if client is
back to normal. So further file lock will return EIO.
Signed-off-by: Zhi Zhang <zhang.david2011@gmail.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEESH4wyp42V4tXvYsjUqAMR0iAlPIFAlzP8nQACgkQUqAMR0iA
lPK79A/+NkRouqA9ihAZhUbgW0DHzOAFvUJSBgX11HQAZbGjngakuoyYFvwUx0T0
m80SUTCysxQrWl+xLdccPZ9ZrhP2KFQrEBEdeYHZ6ymcYcl83+3bOIBS7VwdZAbO
EzB8u/58uU/sI6ABL4lF7ZF/+R+U4CXveEUoVUF04bxdPOxZkRX4PT8u3DzCc+RK
r4yhwQUXGcKrHa2GrRL3GXKsDxcnRdFef/nzq4RFSZsi0bpskzEj34WrvctV6j+k
FH/R3kEcZrtKIMPOCoDMMWq07yNqK/QKj0MJlGoAlwfK4INgcrSXLOx+pAmr6BNq
uMKpkxCFhnkZVKgA/GbKEGzFf+ZGz9+2trSFka9LD2Ig6DIstwXqpAgiUK8JFQYj
lq1mTaJZD3DfF2vnGHGeAfBFG3XETv+mIT/ow6BcZi3NyNSVIaqa5GAR+lMc6xkR
waNkcMDkzLFuP1r0p7ZizXOksk9dFkMP3M6KqJomRtApwbSNmtt+O2jvyLPvB3+w
wRyN9WT7IJZYo4v0rrD5Bl6BjV15ZeCPRSFZRYofX+vhcqJQsFX1M9DeoNqokh55
Cri8f6MxGzBVjE1G70y2/cAFFvKEKJud0NUIMEuIbcy+xNrEAWPF8JhiwpKKnU10
c0u674iqHJ2HeVsYWZF0zqzqQ6E1Idhg/PrXfuVuhAaL5jIOnYY=
=WZfC
-----END PGP SIGNATURE-----
Merge tag 'printk-for-5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk
Pull printk updates from Petr Mladek:
- Allow state reset of printk_once() calls.
- Prevent crashes when dereferencing invalid pointers in vsprintf().
Only the first byte is checked for simplicity.
- Make vsprintf warnings consistent and inlined.
- Treewide conversion of obsolete %pf, %pF to %ps, %pF printf
modifiers.
- Some clean up of vsprintf and test_printf code.
* tag 'printk-for-5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk:
lib/vsprintf: Make function pointer_string static
vsprintf: Limit the length of inlined error messages
vsprintf: Avoid confusion between invalid address and value
vsprintf: Prevent crash when dereferencing invalid pointers
vsprintf: Consolidate handling of unknown pointer specifiers
vsprintf: Factor out %pO handler as kobject_string()
vsprintf: Factor out %pV handler as va_format()
vsprintf: Factor out %p[iI] handler as ip_addr_string()
vsprintf: Do not check address of well-known strings
vsprintf: Consistent %pK handling for kptr_restrict == 0
vsprintf: Shuffle restricted_pointer()
printk: Tie printk_once / printk_deferred_once into .data.once for reset
treewide: Switch printk users from %pf and %pF to %ps and %pS, respectively
lib/test_printf: Switch to bitmap_zalloc()
Implement the setting of YFS ACLs in AFS through the interface of setting
the afs.yfs.acl extended attribute on the file.
Signed-off-by: David Howells <dhowells@redhat.com>
The YFS/AuriStor variant of AFS provides more capable ACLs and provides
per-volume ACLs and per-file ACLs as well as per-directory ACLs. It also
provides some extra information that can be retrieved through four ACLs:
(1) afs.yfs.acl
The YFS file ACL (not the same format as afs.acl).
(2) afs.yfs.vol_acl
The YFS volume ACL.
(3) afs.yfs.acl_inherited
"1" if a file's ACL is inherited from its parent directory, "0"
otherwise.
(4) afs.yfs.acl_num_cleaned
The number of of ACEs removed from the ACL by the server because the
PT entries were removed from the PTS database (ie. the subject is no
longer known).
Signed-off-by: David Howells <dhowells@redhat.com>
Implements the setting of ACLs in AFS by means of setting the
afs.acl extended attribute on the file.
Signed-off-by: Joe Gorse <jhgorse@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Implement an xattr on AFS files called "afs.acl" that retrieves a file's
ACL. It returns the raw AFS3 ACL from the result of calling FS.FetchACL,
leaving any interpretation to userspace.
Note that whilst YFS servers will respond to FS.FetchACL, this will render
a more-advanced YFS ACL down. Use "afs.yfs.acl" instead for that.
Signed-off-by: David Howells <dhowells@redhat.com>