We support coordinated discarding of RAM using the RamDiscardManager for
the VFIO_TYPE1 iommus. Let's unlock support for coordinated discards,
keeping uncoordinated discards (e.g., via virtio-balloon) disabled if
possible.
This unlocks virtio-mem + vfio on x86-64. Note that vfio used via "nvme://"
by the block layer has to be implemented/unlocked separately. For now,
virtio-mem only supports x86-64; we don't restrict RamDiscardManager to
x86-64, though: arm64 and s390x are supposed to work as well, and we'll
test once unlocking virtio-mem support. The spapr IOMMUs will need special
care, to be tackled later, e.g.., once supporting virtio-mem.
Note: The block size of a virtio-mem device has to be set to sane sizes,
depending on the maximum hotplug size - to not run out of vfio mappings.
The default virtio-mem block size is usually in the range of a couple of
MBs. The maximum number of mapping is 64k, shared with other users.
Assume you want to hotplug 256GB using virtio-mem - the block size would
have to be set to at least 8 MiB (resulting in 32768 separate mappings).
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Auger Eric <eric.auger@redhat.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: teawater <teawaterz@linux.alibaba.com>
Cc: Marek Kedzierski <mkedzier@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210413095531.25603-14-david@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
We implement the RamDiscardManager interface and only require coordinated
discarding of RAM to work.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta@cloud.ionos.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Auger Eric <eric.auger@redhat.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: teawater <teawaterz@linux.alibaba.com>
Cc: Marek Kedzierski <mkedzier@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210413095531.25603-13-david@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
vIOMMU support works already with RamDiscardManager as long as guests only
map populated memory. Both, populated and discarded memory is mapped
into &address_space_memory, where vfio_get_xlat_addr() will find that
memory, to create the vfio mapping.
Sane guests will never map discarded memory (e.g., unplugged memory
blocks in virtio-mem) into an IOMMU - or keep it mapped into an IOMMU while
memory is getting discarded. However, there are two cases where a malicious
guests could trigger pinning of more memory than intended.
One case is easy to handle: the guest trying to map discarded memory
into an IOMMU.
The other case is harder to handle: the guest keeping memory mapped in
the IOMMU while it is getting discarded. We would have to walk over all
mappings when discarding memory and identify if any mapping would be a
violation. Let's keep it simple for now and print a warning, indicating
that setting RLIMIT_MEMLOCK can mitigate such attacks.
We have to take care of incoming migration: at the point the
IOMMUs get restored and start creating mappings in vfio, RamDiscardManager
implementations might not be back up and running yet: let's add runstate
priorities to enforce the order when restoring.
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Auger Eric <eric.auger@redhat.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: teawater <teawaterz@linux.alibaba.com>
Cc: Marek Kedzierski <mkedzier@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210413095531.25603-10-david@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Although RamDiscardManager can handle running into the maximum number of
DMA mappings by propagating errors when creating a DMA mapping, we want
to sanity check and warn the user early that there is a theoretical setup
issue and that virtio-mem might not be able to provide as much memory
towards a VM as desired.
As suggested by Alex, let's use the number of KVM memory slots to guess
how many other mappings we might see over time.
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Auger Eric <eric.auger@redhat.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: teawater <teawaterz@linux.alibaba.com>
Cc: Marek Kedzierski <mkedzier@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210413095531.25603-9-david@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Let's query the maximum number of possible DMA mappings by querying the
available mappings when creating the container (before any mappings are
created). We'll use this informaton soon to perform some sanity checks
and warn the user.
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Auger Eric <eric.auger@redhat.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: teawater <teawaterz@linux.alibaba.com>
Cc: Marek Kedzierski <mkedzier@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210413095531.25603-8-david@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Implement support for RamDiscardManager, to prepare for virtio-mem
support. Instead of mapping the whole memory section, we only map
"populated" parts and update the mapping when notified about
discarding/population of memory via the RamDiscardListener. Similarly, when
syncing the dirty bitmaps, sync only the actually mapped (populated) parts
by replaying via the notifier.
Using virtio-mem with vfio is still blocked via
ram_block_discard_disable()/ram_block_discard_require() after this patch.
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Auger Eric <eric.auger@redhat.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: teawater <teawaterz@linux.alibaba.com>
Cc: Marek Kedzierski <mkedzier@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210413095531.25603-7-david@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Let's properly notify when (un)plugging blocks, after discarding memory
and before allowing the guest to consume memory. Handle errors from
notifiers gracefully (e.g., no remaining VFIO mappings) when plugging,
rolling back the change and telling the guest that the VM is busy.
One special case to take care of is replaying all notifications after
restoring the vmstate. The device starts out with all memory discarded,
so after loading the vmstate, we have to notify about all plugged
blocks.
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Auger Eric <eric.auger@redhat.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: teawater <teawaterz@linux.alibaba.com>
Cc: Marek Kedzierski <mkedzier@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210413095531.25603-6-david@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Any errors are unexpected and ram_block_discard_range() already properly
prints errors. Let's stop manually reporting errors.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Auger Eric <eric.auger@redhat.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: teawater <teawaterz@linux.alibaba.com>
Cc: Marek Kedzierski <mkedzier@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210413095531.25603-5-david@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Let's factor out the core logic, no need to replicate.
Reviewed-by: Pankaj Gupta <pankaj.gupta@cloud.ionos.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Auger Eric <eric.auger@redhat.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: teawater <teawaterz@linux.alibaba.com>
Cc: Marek Kedzierski <mkedzier@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210413095531.25603-4-david@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Having properties registered conditionally makes QOM type
introspection difficult. Instead of skipping registration of the
"instanceid" property, always register the property but validate
its value against the instance id required by the class.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Acked-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Message-Id: <20201009200701.1830060-1-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
backend_defaults property allow users to control if default block
properties should be decided with backend information.
If it is off, any backend information will be discarded, which is
suitable if you plan to perform live migration to a different disk backend.
If it is on, a block device may utilize backend information more
aggressively.
By default, it is auto, which uses backend information for block
sizes and ignores the others, which is consistent with the older
versions.
Signed-off-by: Akihiko Odaki <akihiko.odaki@gmail.com>
Message-id: 20210705130458.97642-2-akihiko.odaki@gmail.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* Generalize XSAVE area offset so that it matches AMD processors on KVM
* Improvements for -display and deprecation of -no-quit
* Enable SMP configuration as a compound machine property ("-M smp.cpus=...")
* Haiku compilation fix
* Add icon on Darwin
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmDkB7sUHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroOISgf+Nn5BiXQRY52DK/2PoG330F6UeOcp
kWFAE4k4qEktDiCcd5xKekiUd7h+TiRS8bLeycmRtiSXvbzXioE2eCelui0SZDQl
zpIb8wV2WaxrD/zUYPV7r5n+VFAaTCm9lUEzzqnwaThBG/Oat45gnossZEIWv85g
KtQMsSh3pc+KpTjWbIA8V01ohzwFE2q7cA9CB/pDgR3h8M5p4K0ZdaPoAO2auhvu
2sbu9oBl1JwqpIhPme9JR6Je5fMCILBRlXTvPgJ/0iaGdxcNmZxoflO/TZVFB1pl
tUiCu0GB0yEasMO1E6+cP7ezhm15Lz3vKqjr/boV5Y9osfU36k9xkLTvAg==
=itIm
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/bonzini-gitlab/tags/for-upstream' into staging
* More Meson test conversions and configure cleanups
* Generalize XSAVE area offset so that it matches AMD processors on KVM
* Improvements for -display and deprecation of -no-quit
* Enable SMP configuration as a compound machine property ("-M smp.cpus=...")
* Haiku compilation fix
* Add icon on Darwin
# gpg: Signature made Tue 06 Jul 2021 08:35:23 BST
# gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* remotes/bonzini-gitlab/tags/for-upstream: (40 commits)
config-host.mak: remove unused compiler-related lines
Set icon for QEMU binary on Mac OS
qemu-option: remove now-dead code
machine: add smp compound property
vl: switch -M parsing to keyval
keyval: introduce keyval_parse_into
keyval: introduce keyval_merge
qom: export more functions for use with non-UserCreatable objects
configure: convert compiler tests to meson, part 6
configure: convert compiler tests to meson, part 5
configure: convert compiler tests to meson, part 4
configure: convert compiler tests to meson, part 3
configure: convert compiler tests to meson, part 2
configure: convert compiler tests to meson, part 1
configure: convert HAVE_BROKEN_SIZE_MAX to meson
configure, meson: move CONFIG_IVSHMEM to meson
meson: store dependency('threads') in a variable
meson: sort existing compiler tests
configure, meson: convert libxml2 detection to meson
configure, meson: convert liburing detection to meson
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Make -smp syntactic sugar for a compound property "-machine
smp.{cores,threads,cpu,...}". machine_smp_parse is replaced by the
setter for the property.
numa-test will now cover the new syntax, while other tests
still use -smp.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
As with previous performance optimization on Treaddir handling;
reduce the overall latency, i.e. overall time spent on processing
a Twalk request by reducing the amount of thread hops between the
9p server's main thread and fs worker thread(s).
In fact this patch even reduces the thread hops for Twalk handling
to its theoritical minimum of exactly 2 thread hops:
main thread -> fs worker thread -> main thread
This is achieved by doing all the required fs driver tasks altogether
in a single v9fs_co_run_in_worker({ ... }); code block.
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <1a6701674afc4f08d40396e3aa2631e18a4dbb33.1622821729.git.qemu_oss@crudebyte.com>
There is no longer a user of root_qid, so drop it.
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <6896dd161d3257db6b0513842a14f87ca191fdf6.1622821729.git.qemu_oss@crudebyte.com>
As we are actually only comparing the filesystem ID (i.e. device number
and inode number pair) let's use the POSIX stat buffer instead of QIDs,
because resolving QIDs requires to be done on 9p server's main thread
only as it might mutate the server state if inode remapping is enabled.
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <26aa465ff9cc9c07e053331554a02fdae3994417.1622821729.git.qemu_oss@crudebyte.com>
There is only one user of fid_to_qid() which is v9fs_walk(). Let's
open-code fid_to_qid() directly within v9fs_walk(), because
fid_to_qid() hides the POSIX stat buffer which we are going to need
in the subsequent patch.
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <e9a4c9c7a0792ed4db6578d105a0823ea05bc324.1622821729.git.qemu_oss@crudebyte.com>
We already capture the QID of the exported 9p root path, i.e. to
prevent client access outside the defined, exported filesystem's tree.
This is currently checked by comparing the root QID with another FID's
QID.
The problem with the latter is that resolving a QID of any given 9p path
can only be done on 9p server's main thread, that's because it might
mutate the server's state if inode remapping is enabled.
For that reason also capture the POSIX stat info of the root path for
being able to identify on any (e.g. worker) thread whether an
arbitrary given path is identical to the export root.
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <eb07d6c2e9925788454cfe33d3802e4ffb23ea9a.1622821729.git.qemu_oss@crudebyte.com>
There is only one user of not_same_qid() which is v9fs_walk() and the
latter is using it for comparing a client supplied path with the 9p
export root path, for the sole purpose to prevent a Twalk request
from escaping from the exported 9p tree via "..".
However for that specific purpose the implementation of not_same_qid()
is wrong; if mtime of the 9p export root path changed between Tattach
and Twalk then not_same_qid() returns true when actually comparing
against the export root path.
To fix for the actual semantic being used, only compare QID path
members, but do not compare version or type members.
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <ca0abae4a899d81c6e87f683732d6c1f56915232.1622821729.git.qemu_oss@crudebyte.com>
There is only one comparison between nwnames and P9_MAXWELEM required.
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <E1liKiz-0006BC-Ja@lizzy.crudebyte.com>
To lower the entry level for new developers, add a link to the 9p
developer docs (i.e. qemu wiki) to MAINTAINERS and to the beginning of
9p source files, that is to: https://wiki.qemu.org/Documentation/9p
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Acked-by: Greg Kurz <groug@kaod.org>
Message-Id: <E1leeDf-0008GZ-9q@lizzy.crudebyte.com>
Several CVE fixes for the PVRDMA device.
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJg4hJVAAoJEDbUwPDPL+RtGEkIAJs7yas9LdaP80scaEQ3UiOJ
Mr9M2+RwoUm19YhrM2MqjIByluB5Jk2bO1ZmpuJmeg03V15I8AV8MozKGAzZaThx
8LVwduqlmdnrprBYKVRRn/SLSqbMkRSXGli9BjX2PoF97VeqvhNOxH1hL6z8Ytn6
zzpsOSq7NmC87hLdvL2UIkpYy1Z4bUjAWr1wn7K6jIhM163pek61rmO9eWwc7W3y
M1lXtGtrLgE0AuLm0oEJwz0xqlOuN8ld5kwIEu33RnL8phLwT0hDjHoLPqIX79bS
UdIL1zNjwXKI0DkR5EoVLRMdF87b3LMfVEvij25PNLPf6gyyh5FHmdqQi0r3sZU=
=MrNO
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/marcel/tags/pvrdma-04-07-2021-v2' into staging
PVRDMA queue
Several CVE fixes for the PVRDMA device.
# gpg: Signature made Sun 04 Jul 2021 20:56:05 BST
# gpg: using RSA key 36D4C0F0CF2FE46D
# gpg: Good signature from "Marcel Apfelbaum <marcel.apfelbaum@zoho.com>" [marginal]
# gpg: aka "Marcel Apfelbaum <marcel@redhat.com>" [marginal]
# gpg: aka "Marcel Apfelbaum <marcel.apfelbaum@gmail.com>" [marginal]
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: B1C6 3A57 F92E 08F2 640F 31F5 36D4 C0F0 CF2F E46D
* remotes/marcel/tags/pvrdma-04-07-2021-v2:
pvrdma: Fix the ring init error flow (CVE-2021-3608)
pvrdma: Ensure correct input on ring init (CVE-2021-3607)
hw/rdma: Fix possible mremap overflow in the pvrdma device (CVE-2021-3582)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
- Extract nanoMIPS, microMIPS, Code Compaction from translate.c
- Allow PCI config accesses smaller than 32-bit on Bonito64 device
- Fix migration of g364fb device on Jazz Magnum
- Fix dp8393x PROM checksum on Jazz Magnum and Quadra 800
- Map the UART devices unconditionally on Jazz Magnum
- Add functional test booting Linux on the Fuloong 2E
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmDfMnMACgkQ4+MsLN6t
wN71nhAArpyoJ5mTkt54wAxZwxvqjWAyesvogV4pLIvOyNJmQcExY/Ly8Qb5dbDg
2PEhCpDU7GlT7oCfgh7O5KrEjnqVfmHQzzbvQ0Ygq9kL5hdjaSSlHB/yeirU7CR1
cMQXfj9kvRVa5Oayt3L+kiKgTA0f1vbGmnveiFxJKJupyVDtursstD3nSZCThSVL
FWdFJbqLzvTY3cQqLBVq7jEnN7LzSYeYnq8Tvri0nuwoBwLJY382IljqsQZrGHzr
Bbya3KFInMrQK5VAM0pOkfvPYXZmtJ8i8VuR6S+IdICiZ+61sknKRUq5z09/4NXA
HaxarWO/fv07qd7q6Z2+i5Q6fiDrV4p2qfHeddM4Xwqvu8O98EPhaBE3veLOiNgr
VxgkJFslI1gstje31qqvNjFxB+cOIBYjWTlIVu1xOuOKGWMjMT+9XcVLseweA2rT
H/nTKnWTAiJ/mxT4KIv59SS0ZQa4QJ3CjYr26AcQ9YrJ9vCMay8rLPEn0iVlhB2H
ZyW4Be84ynEvuAvvuWt1AXvFjT41Zqj4Px1M6Pa15e9eV6guiW1KV13Thb45gJqt
LTQtoME83r3gUhQOcoZaowYh6ffCbjtW3eAcTZP3Zu7fOeBjSye+CvS9NJMtK3Vj
0/YjtqGPUvjNy4sR9VqkRRPMZpHftMTRILxaFm8Y5tlThkAydw0=
=rR+B
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/philmd/tags/mips-20210702' into staging
MIPS patches queue
- Extract nanoMIPS, microMIPS, Code Compaction from translate.c
- Allow PCI config accesses smaller than 32-bit on Bonito64 device
- Fix migration of g364fb device on Jazz Magnum
- Fix dp8393x PROM checksum on Jazz Magnum and Quadra 800
- Map the UART devices unconditionally on Jazz Magnum
- Add functional test booting Linux on the Fuloong 2E
# gpg: Signature made Fri 02 Jul 2021 16:36:19 BST
# gpg: using RSA key FAABE75E12917221DCFD6BB2E3E32C2CDEADC0DE
# gpg: Good signature from "Philippe Mathieu-Daudé (F4BUG) <f4bug@amsat.org>" [full]
# Primary key fingerprint: FAAB E75E 1291 7221 DCFD 6BB2 E3E3 2C2C DEAD C0DE
* remotes/philmd/tags/mips-20210702:
hw/mips/jazz: Map the UART devices unconditionally
hw/mips/jazz: specify correct endian for dp8393x device
hw/m68k/q800: fix PROM checksum and MAC address storage
qemu/bitops.h: add bitrev8 implementation
dp8393x: remove onboard PROM containing MAC address and checksum
hw/m68k/q800: move PROM and checksum calculation from dp8393x device to board
hw/mips/jazz: move PROM and checksum calculation from dp8393x device to board
dp8393x: convert to trace-events
dp8393x: checkpatch fixes
g364fb: add VMStateDescription for G364SysBusState
g364fb: use RAM memory region for framebuffer
tests/acceptance: Test Linux on the Fuloong 2E machine
hw/pci-host/bonito: Allow PCI config accesses smaller than 32-bit
hw/pci-host/bonito: Trace PCI config accesses smaller than 32-bit
target/mips: Extract nanoMIPS ISA translation routines
target/mips: Extract the microMIPS ISA translation routines
target/mips: Extract Code Compaction ASE translation routines
target/mips: Add declarations for generic TCG helpers
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reset requests should use SHUTDOWN_CAUSE_GUEST_RESET not
SHUTDOWN_CAUSE_GUEST_SHUTDOWN.
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-Id: <20210624110057.2398779-1-kraxel@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Commit [1] moved _SUN variable from only hot-pluggable to
all devices. This made linux kernel enumerate extra slots
that weren't present before. If extra slot happens to be
be enumerated first and there is a device in th same slot
but on other bridge, linux kernel will add -N suffix to
slot name of the later, thus changing NIC name compared to
QEMU 5.2. This in some case confuses systemd, if it is
using SLOT NIC naming scheme and interface name becomes
not the same as it was under QEMU-5.2.
Reproducer QEMU CLI:
-M pc-i440fx-5.2 -nodefaults \
-device pci-bridge,chassis_nr=1,id=pci.1,bus=pci.0,addr=0x3 \
-device virtio-net-pci,id=nic1,bus=pci.1,addr=0x1 \
-device virtio-net-pci,id=nic2,bus=pci.1,addr=0x2 \
-device virtio-net-pci,id=nic3,bus=pci.1,addr=0x3
with RHEL8 guest produces following results:
v5.2:
kernel: virtio_net virtio0 ens1: renamed from eth0
kernel: virtio_net virtio2 ens3: renamed from eth2
kernel: virtio_net virtio1 enp1s2: renamed from eth1
(slot 2 is assigned to empty bus 0 slot and virtio1
is assigned to 2-2 slot, and renaming falls back,
for some reason, to path based naming scheme)
v6.0:
kernel: virtio_net virtio0 ens1: renamed from eth0
kernel: virtio_net virtio2 ens3: renamed from eth2
systemd-udevd[299]: Error changing net interface name 'eth1' to 'ens3': File exists
systemd-udevd[299]: could not rename interface '3' from 'eth1' to 'ens3': File exists
(with commit [1] kernel assigns virtio2 to 3-2 slot
since bridge advertises _SUN=0x3 and kernel assigns
slot 3 to bridge. Still it manages to rename virtio2
correctly to ens3, however systemd gets confused with virtio1
where slot allocation exactly the same (2-2) as in 5.2 case
and tries to rename it to ens3 which is rightfully taken by
virtio2)
I'm not sure what breaks in systemd interface renaming (it probably
should be investigated), but on QEMU side we can safely revert
_SUN to 5.2 behavior (i.e. avoid cold-plugged bridges and non
hot-pluggable device classes), without breaking acpi-index, which uses
slot numbers but it doesn't have to use _SUN, it could use an arbitrary
variable name that has the same slot value).
It will help existing VMs to keep networking with non trivial
configs in working order since systemd will do its interface
renaming magic as it used to do.
1)
Fixes: b7f23f62e40 (pci: acpi: add _DSM method to PCI devices)
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20210624204229.998824-3-imammedo@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: John Sucaet <john.sucaet@ekinops.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
When the card is plugged back, reset the partially_hotplugged flag to false
Bug: https://bugzilla.redhat.com/show_bug.cgi?id=1787194
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20210629152937.619193-1-lvivier@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
At some point, after unplugging virtio-pci the virtio device may be unrealised,
but the memory regions may be present in flatview. So, it's a possible situation
when memory region's callbacks are called for "unplugged" device.
Previous two patches made sure this case does not cause QEMU to crash.
This patch adds check for "notify" memory region. Now reads will return "-1" if a virtio
device is not present on a virtio bus.
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1938042
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1743098
Signed-off-by: Andrew Melnychenko <andrew@daynix.com>
Message-Id: <20210609095843.141378-4-andrew@daynix.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Now, if virtio device is not present on virtio-bus - pci config callbacks
will not lead to possible crush. The read will return "-1" which should be
interpreted by a driver that pci device may be unplugged.
Signed-off-by: Andrew Melnychenko <andrew@daynix.com>
Message-Id: <20210609095843.141378-3-andrew@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
During unplug the virtio device is unplugged from virtio-bus on pci. In some cases,
requests to virtio-pci mm may acquire during/after unplug. Added check that virtio
device is on the bus, for "common" memory region.
Signed-off-by: Andrew Melnychenko <andrew@daynix.com>
Message-Id: <20210609095843.141378-2-andrew@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
libFuzzer triggered the following assertion:
cat << EOF | qemu-system-i386 -M pc-q35-5.0 \
-nographic -monitor none -serial none \
-qtest stdio -d guest_errors -trace pci\*
outl 0xcf8 0xf2000060
outl 0xcfc 0x8400056e
EOF
pci_cfg_write mch 00:0 @0x60 <- 0x8400056e
Aborted (core dumped)
This is because guest wrote MCH_HOST_BRIDGE_PCIEXBAR_LENGTH_RVD
(reserved value) to the PCIE XBAR register.
There is no indication on the datasheet about what occurs when
this value is written. Simply ignore it on QEMU (and report an
guest error):
pci_cfg_write mch 00:0 @0x60 <- 0x8400056e
Q35: Reserved PCIEXBAR LENGTH
pci_cfg_read mch 00:0 @0x0 -> 0x8086
pci_cfg_read mch 00:0 @0x0 -> 0x29c08086
...
Cc: qemu-stable@nongnu.org
Reported-by: Alexander Bulekov <alxndr@bu.edu>
BugLink: https://bugs.launchpad.net/qemu/+bug/1878641
Fixes: df2d8b3ed4 ("q35: Introduce q35 pc based chipset emulator")
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210526142438.281477-1-f4bug@amsat.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
When using the Magnum ARC firmware we can see accesses to the
UART1 being rejected, because the device is not mapped:
$ qemu-system-mips64el -M magnum -d guest_errors,unimp -bios NTPROM.RAW
Invalid access at addr 0x80007004, size 1, region '(null)', reason: rejected
Invalid access at addr 0x80007001, size 1, region '(null)', reason: rejected
Invalid access at addr 0x80007002, size 1, region '(null)', reason: rejected
Invalid access at addr 0x80007003, size 1, region '(null)', reason: rejected
Invalid access at addr 0x80007004, size 1, region '(null)', reason: rejected
Since both UARTs are present (soldered on the board) regardless
of whether there are character devices connected, map them
unconditionally.
(This code pre-dated commit 12051d82f004 which made it safe to pass
NULL in as a chardev to serial devices.)
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20210629053704.2584504-1-f4bug@amsat.org>
The MIPS magnum machines are available in both big endian (mips64) and little
endian (mips64el) configurations. Ensure that the dp893x big_endian property
is set accordingly using logic similar to that used for the MIPS malta
machines.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Tested-by: Finn Thain <fthain@linux-m68k.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210625065401.30170-11-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
The checksum used by MacOS to validate the PROM content is an exclusive-OR
rather than a sum over the corresponding bytes. In addition the MAC address
must be stored in bit-reversed format as indicated in comments in Linux's
macsonic.c.
With the PROM contents fixed MacOS starts to probe the device registers
when AppleTalk is enabled in the Control Panel.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Tested-by: Finn Thain <fthain@linux-m68k.org>
Message-Id: <20210625065401.30170-8-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
According to the datasheet the dp8393x chipset does not contain any NVRAM capable
of storing a MAC address or checksum. Now that both the MIPS jazz and m68k q800
boards generate the PROM region and checksum themselves, remove the generated
PROM from the dp8393x device itself.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Tested-by: Finn Thain <fthain@linux-m68k.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210625065401.30170-6-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
This is in preparation for each board to have its own separate bit storage
format and checksum for storing the MAC address.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Tested-by: Finn Thain <fthain@linux-m68k.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210625065401.30170-5-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
This is in preparation for each board to have its own separate bit storage
format and checksum for storing the MAC address.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Tested-by: Finn Thain <fthain@linux-m68k.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210625065401.30170-4-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Finn Thain <fthain@linux-m68k.org>
Message-Id: <20210625065401.30170-3-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Also fix a simple comment typo of "constrainst" to "constraints".
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Finn Thain <fthain@linux-m68k.org>
Message-Id: <20210625065401.30170-2-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Currently when QEMU attempts to migrate the MIPS magnum machine it crashes due
to a mistake in the g364fb VMStateDescription configuration which expects a
G364SysBusState and not a G364State.
Resolve the issue by adding a new VMStateDescription for G364SysBusState and
embedding the existing vmstate_g364fb VMStateDescription inside it using
VMSTATE_STRUCT.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Fixes: 97a3f6ffbba ("g364fb: convert to qdev")
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210625163554.14879-3-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Since the migration stream is already broken, we can use this opportunity to
change the framebuffer so that it is migrated as a RAM memory region rather
than as an array of bytes.
In particular this helps the output of the analyze-migration.py tool which
no longer contains a huge array representing the framebuffer contents.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210625163554.14879-2-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
When running the official PMON firmware for the Fuloong 2E, we see
8-bit and 16-bit accesses to PCI config space:
$ qemu-system-mips64el -M fuloong2e -bios pmon_2e.bin \
-trace -trace bonito\* -trace pci_cfg\*
pci_cfg_write vt82c686b-pm 05:4 @0x90 <- 0xeee1
bonito_spciconf_small_access PCI config address is smaller then 32-bit, addr: 0x4d2, size: 2
pci_cfg_write vt82c686b-pm 05:4 @0xd2 <- 0x1
pci_cfg_write vt82c686b-pm 05:4 @0x4 <- 0x1
pci_cfg_write vt82c686b-isa 05:0 @0x4 <- 0x7
bonito_spciconf_small_access PCI config address is smaller then 32-bit, addr: 0x81, size: 1
pci_cfg_read vt82c686b-isa 05:0 @0x81 -> 0x0
bonito_spciconf_small_access PCI config address is smaller then 32-bit, addr: 0x81, size: 1
pci_cfg_write vt82c686b-isa 05:0 @0x81 <- 0x80
bonito_spciconf_small_access PCI config address is smaller then 32-bit, addr: 0x83, size: 1
pci_cfg_write vt82c686b-isa 05:0 @0x83 <- 0x89
bonito_spciconf_small_access PCI config address is smaller then 32-bit, addr: 0x85, size: 1
pci_cfg_write vt82c686b-isa 05:0 @0x85 <- 0x3
bonito_spciconf_small_access PCI config address is smaller then 32-bit, addr: 0x5a, size: 1
pci_cfg_write vt82c686b-isa 05:0 @0x5a <- 0x7
bonito_spciconf_small_access PCI config address is smaller then 32-bit, addr: 0x85, size: 1
pci_cfg_write vt82c686b-isa 05:0 @0x85 <- 0x1
Also this is what the Linux kernel does since it supports the Bonito
north bridge:
https://elixir.bootlin.com/linux/v2.6.15/source/arch/mips/pci/ops-bonito64.c#L85
So it seems safe to assume the datasheet is incomplete or outdated
regarding the address constraints.
This problem was exposed by commit 911629e6d3773a8adeab48b
("vt82c686: Fix SMBus IO base and configuration registers").
Reported-by: BALATON Zoltan <balaton@eik.bme.hu>
Suggested-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210624202747.1433023-4-f4bug@amsat.org>
Tested-by: BALATON Zoltan <balaton@eik.bme.hu>