Our HYP init code suffers from two major design issues:
- it cannot support CPU hotplug, as we tear down the idmap very early
- it cannot perform a TLB invalidation when switching from init to
runtime mappings, as pages are manipulated from PL1 exclusively
The hotplug problem mandates that we keep two sets of page tables
(boot and runtime). The TLB problem mandates that we're able to
transition from one PGD to another while in HYP, invalidating the TLBs
in the process.
To be able to do this, we need to share a page between the two page
tables. A page that will have the same VA in both configurations. All we
need is a VA that has the following properties:
- This VA can't be used to represent a kernel mapping.
- This VA will not conflict with the physical address of the kernel text
The vectors page seems to satisfy this requirement:
- The kernel never maps anything else there
- The kernel text being copied at the beginning of the physical memory,
it is unlikely to use the last 64kB (I doubt we'll ever support KVM
on a system with something like 4MB of RAM, but patches are very
welcome).
Let's call this VA the trampoline VA.
Now, we map our init page at 3 locations:
- idmap in the boot pgd
- trampoline VA in the boot pgd
- trampoline VA in the runtime pgd
The init scenario is now the following:
- We jump in HYP with four parameters: boot HYP pgd, runtime HYP pgd,
runtime stack, runtime vectors
- Enable the MMU with the boot pgd
- Jump to a target into the trampoline page (remember, this is the same
physical page!)
- Now switch to the runtime pgd (same VA, and still the same physical
page!)
- Invalidate TLBs
- Set stack and vectors
- Profit! (or eret, if you only care about the code).
Note that we keep the boot mapping permanently (it is not strictly an
idmap anymore) to allow for CPU hotplug in later patches.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
There is no point in freeing HYP page tables differently from Stage-2.
They now have the same requirements, and should be dealt with the same way.
Promote unmap_stage2_range to be The One True Way, and get rid of a number
of nasty bugs in the process (good thing we never actually called free_hyp_pmds
before...).
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
After the HYP page table rework, it is pretty easy to let the KVM
code provide its own idmap, rather than expecting the kernel to
provide it. It takes actually less code to do so.
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
The current code for creating HYP mapping doesn't like to wrap
around zero, which prevents from mapping anything into the last
page of the virtual address space.
It doesn't take much effort to remove this limitation, making
the code more consistent with the rest of the kernel in the process.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
The way we populate HYP mappings is a bit convoluted, to say the least.
Passing a pointer around to keep track of the current PFN is quite
odd, and we end-up having two different PTE accessors for no good
reason.
Simplify the whole thing by unifying the two PTE accessors, passing
a pgprot_t around, and moving the various validity checks to the
upper layers.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
In clocksource/arm_arch_timer.h we define useful symbolic constants.
Let's use them to make the KVM arch_timer code clearer.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Christoffer Dall <cdall@cs.columbia.edu>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
In order to be able to correctly profile what is happening on the
host, we need to be able to identify when we're running on the guest,
and log these events differently.
Perf offers a simple way to register callbacks into KVM. Mimic what
x86 does and enjoy being able to profile your KVM host.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
Fix typo in printk and comments within various drivers.
Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
In the very unlikely event where a guest would be foolish enough to
*read* from a write-only cache maintainance register, we end up
with preemption disabled, due to a misplaced get_cpu().
Just move the "is_write" test outside of the critical section.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 3401d54696f9 (KVM: ARM: Introduce KVM_ARM_SET_DEVICE_ADDR
ioctl) added support for the KVM_CAP_ARM_SET_DEVICE_ADDR capability,
but failed to add a break in the relevant case statement, returning
the number of CPUs instead.
Luckilly enough, the CONFIG_NR_CPUS=0 patch hasn't been merged yet
(https://lkml.org/lkml/diff/2012/3/31/131/1), so the bug wasn't
noticed.
Just give it a break!
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
Commit aa2fbe6d broke the ARM KVM target by introducing a new parameter
to irq handling functions.
Fix the function prototype to get things compiling again and ignore the
parameter just like we did before
Signed-off-by: Alexander Graf <agraf@suse.de>
Acked-by: Christoffer Dall <cdall@cs.columbia.edu>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Instead of hardcoding the maximum MMIO access to be 4 bytes,
compare it to sizeof(unsigned long), which will do the
right thing on both 32 and 64bit systems.
Same thing for sign extention.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
Instead of trying to free everything from PAGE_OFFSET to the
top of memory, use the virt_addr_valid macro to check the
upper limit.
Also do the same for the vmalloc region where the IO mappings
are allocated.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
v8 is capable of invalidating Stage-2 by IPA, but v7 is not.
Change kvm_tlb_flush_vmid() to take an IPA parameter, which is
then ignored by the invalidation code (and nuke the whole TLB
as it always did).
This allows v8 to implement a more optimized strategy.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
The virtual GIC is supposed to be 4kB aligned. On a 64kB page
system, comparing the alignment to PAGE_SIZE is wrong.
Use SZ_4K instead.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
The ARM ARM says that HPFAR reports bits [39:12] of the faulting
IPA, and we need to complement it with the bottom 12 bits of the
faulting VA.
This is always 12 bits, irrespective of the page size. Makes it
clearer in the code.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
guest.c already contains some target-specific checks. Let's move
kvm_target_cpu() over there so arm.c is mostly target agnostic.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
__create_hyp_mappings() performs some kind of address validation before
creating the mapping, by verifying that the start address is above
PAGE_OFFSET.
This check is not completely correct for kernel memory (the upper
boundary has to be checked as well so we do not end up with highmem
pages), and wrong for IO mappings (the mapping must exist in the vmalloc
region).
Fix this by using the proper predicates (virt_addr_valid and
is_vmalloc_addr), which also work correctly on ARM64 (where the vmalloc
region is below PAGE_OFFSET).
Also change the BUG_ON() into a less agressive error return.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
arm64 cannot represent the kernel VAs in HYP mode, because of the lack
of TTBR1 at EL2. A way to cope with this situation is to have HYP VAs
to be an offset from the kernel VAs.
Introduce macros to convert a kernel VA to a HYP VA, make the HYP
mapping functions use these conversion macros. Also change the
documentation to reflect the existence of the offset.
On ARM, where we can have an identity mapping between kernel and HYP,
the macros are without any effect.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
In order to keep the VFP allocation code common, use an abstract type
for the VFP containers. Maps onto struct vfp_hard_struct on ARM.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Make the split of the pgd_ptr an implementation specific thing
by moving the init call to an inline function.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Move low level MMU-related operations to kvm_mmu.h. This makes
the MMU code reusable by the arm64 port.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
This one got lost in the move to handle_exit, so let's reintroduce it
using an accessor to the immediate value field like the other ones.
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
The exit handler selection code cannot be shared with arm64
(two different modes, more exception classes...).
Move it to a separate file (handle_exit.c).
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
Bit 8 is cache maintenance, bit 9 is external abort.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
Instead of directly accessing the fault registers, use proper accessors
so the core code can be shared.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
On 32bit ARM, unsigned long is guaranteed to be a 32bit quantity.
On 64bit ARM, it is a 64bit quantity.
In order to be able to share code between the two architectures,
convert the registers to be unsigned long, so the core code can
be oblivious of the change.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
hyp_hvc vector offset is 0x14 and hyp_svc vector offset is 0x8.
Signed-off-by: Jonghwan Choi <jhbird.choi@samsung.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
This was replaced with prepare/commit long before:
commit f7784b8ec9b6a041fa828cfbe9012fe51933f5ac
KVM: split kvm_arch_set_memory_region into prepare and commit
Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
This patch makes the parameter old a const pointer to the old memory
slot and adds a new parameter named change to know the change being
requested: the former is for removing extra copying and the latter is
for cleaning up the code.
Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
This patch drops the parameter old, a copy of the old memory slot, and
adds a new parameter named change to know the change being requested.
This not only cleans up the code but also removes extra copying of the
memory slot structure.
Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
X86 does not use this any more. The remaining user, s390's !user_alloc
check, can be simply removed since KVM_SET_MEMORY_REGION ioctl is no
longer supported.
Note: fixed powerpc's indentations with spaces to suppress checkpatch
errors.
Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Commit 7a905b1 (KVM: Remove user_alloc from struct kvm_memory_slot)
broke KVM/ARM by removing the user_alloc field from a public structure.
As we only used this field to alert the user that we didn't support
this operation mode, there is no harm in discarding this bit of code
without any remorse.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Commit f82a8cfe9 (KVM: struct kvm_memory_slot.user_alloc -> bool)
broke the ARM KVM port by changing the prototype of two global
functions.
Apply the same change to fix the compilation breakage.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Now that the maintenance interrupt handling is actually out of the
handler itself, the code becomes quite racy as we can get preempted
while we process the state.
Wrapping this code around the distributor lock ensures that we're not
preempted and relatively race-free.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>