* qemu-kvm/uq/master:
kvm: Activate in-kernel irqchip support
kvm: x86: Add user space part for in-kernel IOAPIC
kvm: x86: Add user space part for in-kernel i8259
kvm: x86: Add user space part for in-kernel APIC
kvm: x86: Establish IRQ0 override control
kvm: Introduce core services for in-kernel irqchip support
memory: Introduce memory_region_init_reservation
ioapic: Factor out base class for KVM reuse
ioapic: Drop post-load irr initialization
i8259: Factor out base class for KVM reuse
i8259: Completely privatize PicState
apic: Open-code timer save/restore
apic: Factor out base class for KVM reuse
apic: Introduce apic_report_irq_delivered
apic: Inject external NMI events via LINT1
apic: Stop timer on reset
kvm: Move kvmclock into hw/kvm folder
msi: Generalize msix_supported to msi_supported
hyper-v: initialize Hyper-V CPUID leaves.
hyper-v: introduce Hyper-V support infrastructure.
Conflicts:
Makefile.target
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Improve VGA selection logic, push check for device availabilty to vl.c.
Create the devices at board level unconditionally.
Remove now unused pci_try_create*() functions.
Make PCI VGA devices optional.
Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Rename SysBus device from 'grackle' to 'grackle-pcihost' to resolve a
name conflict.
Also mark both devices as no_user.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Cc: Alexander Graf <agraf@suse.de>
Cc: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
We call pci_host_config_{read,write}_common() which perform PCI config
accesses. However they don't do all limit checking the way we expect
it to.
So let's introduce a small wrapper around them, making them behave the
way we would without touching generic code.
This patch is based on a patch by David Gibson which put this logic into
the generic code.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
Currently on the pseries machine the SLOF firmware is used normally,
but we bypass it when -kernel is specified. Having these two
different boot paths can cause some confusion.
In particular at present we need to "probe" the (emulated) PCI bus and
produce device tree nodes for the PCI devices in qemu, for the -kernel
case. In the SLOF case, it takes the device tree from qemu adds some
stuff to it then passes it on to the kernel.
It's been decided that a better approach is to always boot through
SLOF, even when using -kernel. WIth this approach we can leave PCI
probing and device node creation to SLOF in all cases which removes a
bunch of code in qemu, and avoids iterating the PCI devices from the
machine specific init code which we're not supposed to do.
This patch changes qemu to always boot through SLOF, and not to create
PCI nodes. Simultaneously it updates the included version of SLOF
(submodule and binary image) to one which supports (and requires) the
new approach.
The new SLOF version also includes a number of unrelated enhancements:
support for booting from virtio-pci devices and e1000, greatly
improved FCode support and many bugfixes. It also makes SLOF ready to
be used even when specifying a kernel on the qemu command line.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
The pseries machine expects a para-virtualized guest and so supplies RTAS
functions (via a hypercall) for performing PCI config space access.
Currently the implementation of these calls into
pci_default_{read,write}_config(). However this would be incorrect for
any PCI device which overrides the default config read/write functions.
AFAICT there's only one such device today, but we should still get it
right. In addition the pci_host_config_{read,write}_common() functions
which do correctly do this dispatch, perform bounds checking on the config
space address, lack of which currently leads to an exploitable bug.
This patch corrects the problem.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
On the pseries machine (which expexts a paravirtualized guest), guest
access to PCI config space is via host-provided RTAS functions. This
patch extends these RTAS functions to permit access to PCI extended
config space, as specified in PAPR.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
Back when I made patches introducing dma_addr_t and various PCI DMA
wrapper functions, I made a mistake. The bmdma_addr_{read,write} functions
need to take target_phys_addr_t not dma_addr_t, since they are assigned
to MemoryRegionOps callbacks.
This patch corrects my error.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
load_image_targphys() gets passed a max size for the file, but doesn't
enforce it at all. Add a check and return -1 (error) if the file is
too big, without loading it. Fix the bracing style in the function
while we're at it.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
When accessing the device specific virtio config space, we memcpy
the data into a variable in QEMU. At that point we're basically
pulling host endianness into the game which is a really bad idea.
So instead, let's use the target specific load/store helpers for
memory pointers which fetch things in target endianness. The whole
array is already populated in target endianness anyways
(see virtio-blk).
Signed-off-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
The virtio config area in PIO space is a bit special. The initial
header is little endian but the rest (device specific) is guest
native endian.
The PIO accessors for PCI on machines that don't have native IO ports
assume that all PIO is little endian, which works fine for everything
except the above.
A complicated way to fix it would be to split the BAR into two memory
regions with different endianess settings, but this isn't practical
to do, besides, the PIO code doesn't honor region endianness anyway
(I have a patch for that too but it isn't necessary at this stage).
So I decided to go for the quick fix instead which consists of
reverting the swap in virtio-pci in selected places, hoping that when
we eventually do a "v2" of the virtio protocols, we sort that out once
and for all using a fixed endian setting for everything.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
[agraf: keep virtio in libhw and determine endianness through a
helper function in exec.c]
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Now that we have the SoC init function in the same file, let's integrate
it with the board initialization.
While at it, also make use of the newly qdev'ified PCI host controller.
Signed-off-by: Alexander Graf <agraf@suse.de>
The separation of ppc440 and ppc440_bamboo makes some sense, since ppc440
is the SoC while ppc440_bamboo is the actual board. But the separation
makes things harder for us for no good reason, so let's just fold them
in together with each other.
Signed-off-by: Alexander Graf <agraf@suse.de>
Due to popular demand, this qdevifies the PCI host controller of 4xx SoCs
the same way as e500.
We have to introduce a small stub function for pci init that will be
removed in a later patch, once we qdev'ified the board, to keep the build
working.
Signed-off-by: Alexander Graf <agraf@suse.de>
Today we're exposing a Virtex 440 CPU to the guest despite the fact
that we're telling the guest that we're running on a 440EP one in the
device tree.
So let's better default to a real 440EP to make things synced again.
Signed-off-by: Alexander Graf <agraf@suse.de>
When running a 440 target, we currently get invalid irq_num values (-1)
which completely confuse the IRQ setting code.
This is most likely due to the missing qdev conversion.
While this shouldn't happen in the first place and should really rather
be fixed by converting the target, I dislike segfaults. So for now, let's
just print a warning and ignore invalid irq_num values.
Signed-off-by: Alexander Graf <agraf@suse.de>
Back in the day when the bamboo target got introduced, the initial TLB was
dictated by KVM. TCG has been missing initial TLB values ever since, rendering
the target unusable for TCG usage.
This patch adds linear TLB maps the way Linux expects them, making the target
work.
Signed-off-by: Alexander Graf <agraf@suse.de>
To be able to support CPU reset, we need to put all register initialization
and initial state into a CPU reset hook instead of a function that is only
called once on bootup.
This is a preparation step for the initial TLB setting code and brings bamboo
more in line with what e500 and virtex already do.
Signed-off-by: Alexander Graf <agraf@suse.de>
When using TCG with a BookE PowerPC core, we need to explicitly initialize
the BookE timers with the correct frequencies.
This was missing for 440EP, since that code came from KVM and was never used
with TCG.
Signed-off-by: Alexander Graf <agraf@suse.de>
Speaker I/O, ISA bus, i8259 PIC, RTC and DMA are no longer set up
individually by the machine. Effectively, no-op speaker I/O is replaced
by pcspk; PIT and i82374 DMA are introduced.
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Remove related dead, alternative code.
Wire up PCI host bridge IRQs via GPIO-in IRQs of PCI->ISA bridge.
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
Cc: Alexander Graf <agraf@suse.de>
Cc: Jan Kiszka <jan.kiszka@siemens.com>
Prepare Intel 82378 emulation for use by PReP platforms.
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Create ISA bus in this device (suggested by Markus).
Rebase onto Memory API, mark memory ops as Little Endian.
Add VMState. Provide access to i8259 IRQs via qdev GPIOs.
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Alexander Graf <agraf@suse.de>
Cc: Jan Kiszka <jan.kiszka@siemens.com>
Prepare Intel 82374 emulation for use by Intel 82378 PCI->ISA bridge.
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Confine to CONFIG_I82374. Add VMState.
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
Reviewed-by: Alexander Graf <agraf@suse.de>
Drop pci_prep_init() in favor of extended device state. Inspired by
patches from Hervé and Alex.
Assign the 4 IRQs from the board after device instantiation. This moves
the knowledge out of prep_pci and allows for future machines with
different IRQ wiring (IBM 40P). Suggested by Alex.
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
Reviewed-by: Alexander Graf <agraf@suse.de>
Cc: Hervé Poussineau <hpoussin@reactos.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Anthony Liguori <aliguori@us.ibm.com>
Convert to new-style read/write callbacks.
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
Cc: Alexander Graf <agraf@suse.de>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Benoît Canet <benoit.canet@gmail.com>
The prep PowerPC CPU is Big Endian. An explicit byte swap therefore
effectively becomes Little Endian.
Remove explicit byte swaps and mark as Little Endian.
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
Reviewed-by: Alexander Graf <agraf@suse.de>
Cc: Michael S. Tsirkin <mst@redhat.com>
Move initialization of vendor ID, etc. to PCIDeviceInfo.
Introduce VMState.
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
Reviewed-by: Alexander Graf <agraf@suse.de>
Cc: Hervé Poussineau <hpoussin@reactos.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Anthony Liguori <aliguori@us.ibm.com>
This simplifies the code later when the i8259 moves to the i82378
PCI->ISA bridge and happens to fix a SysBus m48t59 io_base issue
introduced by commit 0fb56ffc5e (m48t59:
drop obsolete address base arithmetic). Suggested by Hervé and Jan.
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
Cc: Hervé Poussineau <hpoussin@reactos.org>
Cc: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Blue Swirl <blauwirbel@gmail.com>
Since 0c90c52fab (ppc_prep: convert to memory
API) OHW was "Trying to execute code outside RAM or ROM at 0xfff00700".
The BIOS MemoryRegion is created with a fixed size of 1 MiB.
Ensure that the full size can be accessed since the exception
vectors are located at 0xfff00000 and the BIOS may want to use them.
It thereby no longer depends on the actual BIOS binary size.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Cc: Avi Kivity <avi@redhat.com>
Cc: Alexander Graf <agraf@suse.de>
* stefanha/trivial-patches:
Makefile: Remove generated headers on clean
Makefile: Exclude tests/Makefile in unconfigured tree
lm32: Fix mixup of uint32 and uint32_t
tests: Silence gtester in Makefile
qemu-tool: Fix mixup of int64 and int64_t
* pmaydell/arm-devs.for-upstream:
arm: make the number of GIC interrupts configurable
hw/lan9118: Add save/load support
arm: Remove incorrect comment in arm_timer
vexpress, realview: Add (dummy) L2 cache controller
* kraxel/usb.37:
usb-redir: Improve some debugging messages
usb-redir: Try to keep our buffer size near the target size
usb-redir: Pre-fill our isoc input buffer before sending pkts to the host
usb-redir: Dynamically adjust iso buffering size based on ep interval
usb-redir: Clear iso / irq error when stopping the stream
usb: link packets to endpoints not devices
usb: add max_packet_size to USBEndpoint
usb/debug: add usb_ep_dump
usb-desc: USBEndpoint support
usb: add ifnum to USBEndpoint
usb: add USBEndpoint
xhci: Initial xHCI implementation
usb: add audio device model
usb-desc: audio endpoint support
usb: track altsetting in USBDevice
usb: track configuration and interface count in USBDevice.
usb-host: rip out legacy procfs support
This introduces the KVM-accelerated IOAPIC model 'kvm-ioapic' and
extends the IRQ routing setup by the 0->2 redirection when needed.
The kvm-ioapic model has a property that allows to define its GSI base
for injecting interrupts into the kernel model. This will allow to
disentangle PIC and IOAPIC pins for chipsets that support more
sophisticated IRQ routes than the PIIX3. So far the base is kept at 0,
i.e. PIC and IOAPIC share pins 0..15.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Introduce the alternative 'kvm-i8259' device model that exploits KVM
in-kernel acceleration.
The PIIX3 initialization code is furthermore extended by KVM specific
IRQ route setup. GSI injection differs in KVM mode from the user space
model. As we can dispatch ISA-range IRQs to both IOAPIC and PIC inside
the kernel, we do not need to inject them separately. This is reflected
by a KVM-specific GSI handler.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
This introduces the alternative APIC device which makes use of KVM's
in-kernel device model. External NMI injection via LINT1 is emulated by
checking the current state of the in-kernel APIC, only injecting a NMI
into the VCPU if LINT1 is unmasked and configured to DM_NMI.
MSI is not yet supported, so we disable this when the in-kernel model is
in use.
CC: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
KVM is forced to disable the IRQ0 override when we run with in-kernel
irqchip but without IRQ routing support of the kernel. Set the fwcfg
value correspondingly. This aligns us with qemu-kvm.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Split up the IOAPIC analogously to APIC and i8259. KVM will share the
IOAPICCommonState, the vmstate, reset logic and certain init parts with
the user space model.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
As all devices undergo a reset prior to vmloa, and the reset value of
irr is 0, we do not need to do this clearing for older vmstates
explicitly. Dropping this redundant code will also make KVM integration
a bit simpler.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Analogously to the APIC, we will reuse some parts of the user space
i8259 model for KVM. The base class provides a common device state, the
vmstate, the property list, a reset core and some shared init bits.
This also introduces a common helper to instantiate a single i8259 chip
from the cascade-creating i8259_init function.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Use DeviceState instead of PicState in the public i8259 API. This is
cleaner and allows to reorganize the PIC data structures for KVM reuse.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
To enable migration between accelerated and non-accelerated APIC models,
we will need to handle the timer saving and restoring specially and can
no longer rely on the automatics of VMSTATE_TIMER. Specifically,
accelerated model will not start any QEMUTimer.
This patch therefore factors out the generic bits into apic_next_timer
and use a post-load callback to implemented model-specific logic.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
The KVM in-kernel APIC model will reuse parts of the user space model
while providing the same frontend view to guest and most management
interfaces.
Factor out an APIC base class to encapsulate those parts that will be
shared by user space and KVM model. This class offers callback hooks for
init, base/tpr setting, and the external NMI delivery that will be
set via APICCommonInfo structure and implemented specifically in the
subclasses.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
On real hardware, NMI button events are injected via the LINT1 line of
the APICs. E.g. kdump expect this wiring and gets upset if the per-APIC
LINT1 mask is not respected, i.e. if NMIs are injected to VCPUs that
should not receive them. Change the APIC emulation code to reflect this.
Based on qemu-kvm patch by Lai Jiangshan.
CC: Lai Jiangshan <laijs@cn.fujitsu.com>
Reported-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>