Commit Graph

674 Commits

Author SHA1 Message Date
Juan Quintela
220c3ebddb memory: split cpu_physical_memory_* functions to its own include
All the functions that use ram_addr_t should be here.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
2014-01-13 14:04:54 +01:00
Juan Quintela
981fdf2353 memory: cpu_physical_memory_set_dirty_tracking() should return void
Result was always 0, and not used anywhere.  Once there, use bool type
for the parameter.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
2014-01-13 14:04:54 +01:00
Juan Quintela
a2f4d5bef2 memory: make cpu_physical_memory_reset_dirty() take a length parameter
We have an end parameter in all the callers, and this make it coherent
with the rest of cpu_physical_memory_* functions, that also take a
length parameter.

Once here, move the start/end calculation to
tlb_reset_dirty_range_all() as we don't need it here anymore.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
2014-01-13 14:04:54 +01:00
Juan Quintela
a2cd8c852d memory: s/dirty/clean/ in cpu_physical_memory_is_dirty()
All uses except one really want the other meaning.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
2014-01-13 14:04:54 +01:00
Juan Quintela
ace694cccc memory: s/mask/clear/ cpu_physical_memory_mask_dirty_range
Now all functions use the same wording that bitops/bitmap operations

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
2014-01-13 14:04:54 +01:00
Juan Quintela
1ab4c8ceaa memory: split dirty bitmap into three
After all the previous patches, spliting the bitmap gets direct.

Note: For some reason, I have to move DIRTY_MEMORY_* definitions to
the beginning of memory.h to make compilation work.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
2014-01-13 14:04:54 +01:00
Juan Quintela
2152f5ca78 memory: only resize dirty bitmap when memory size increases
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
2014-01-13 14:04:54 +01:00
Juan Quintela
5215919291 memory: cpu_physical_memory_mask_dirty_range() always clears a single flag
Document it

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
2014-01-13 14:04:54 +01:00
Juan Quintela
75218e7f2b memory: cpu_physical_memory_set_dirty_range() always dirty all flags
So remove the flag argument and do it directly.  After this change,
there is nothing else using cpu_physical_memory_set_dirty_flags() so
remove it.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
2014-01-13 14:04:53 +01:00
Juan Quintela
63995cebfa memory: set single dirty flags when possible
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
2014-01-13 14:04:53 +01:00
Juan Quintela
4f08cabe9e memory: make cpu_physical_memory_is_dirty return bool
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
2014-01-13 14:04:53 +01:00
Juan Quintela
7e5609a85e exec: create function to get a single dirty bit
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
2014-01-13 14:04:53 +01:00
Juan Quintela
06567942e5 exec: use accessor function to know if memory is dirty
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2014-01-13 14:04:53 +01:00
Alexander Graf
582b55a96a roms: Flush icache when writing roms to guest memory
We use the rom infrastructure to write firmware and/or initial kernel
blobs into guest address space. So we're basically emulating the cache
off phase on very early system bootup.

That phase is usually responsible for clearing the instruction cache for
anything it writes into cachable memory, to ensure that after reboot we
don't happen to execute stale bits from the instruction cache.

So we need to invalidate the icache every time we write a rom into guest
address space. We do not need to do this for every DMA since the guest
expects it has to flush the icache manually in that case.

This fixes random reboot issues on e5500 (booke ppc) for me.

Signed-off-by: Alexander Graf <agraf@suse.de>
2013-12-20 01:58:03 +01:00
Marcel Apfelbaum
53cb28cbfe exec: separate sections and nodes per address space
Every address space has its own nodes and sections, but
it uses the same global arrays of nodes/section.

This limits the number of devices that can be attached
to the guest to 20-30 devices. It happens because:
 - The sections array is limited to 2^12 entries.
 - The main memory has at least 100 sections.
 - Each device address space is actually an alias to
   main memory, multiplying its number of nodes/sections.

Remove the limitation by using separate arrays of
nodes and sections for each address space.

Signed-off-by: Marcel Apfelbaum <marcel.a@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-12-11 20:11:09 +02:00
Michael S. Tsirkin
026736cebf exec: reduce L2_PAGE_SIZE
With the single exception of ppc with 16M pages,
we get the same number of levels
with L2_PAGE_SIZE = 10 as with L2_PAGE_SIZE = 9.

by doing this we reduce memory footprint of a single level
in the node memory map by 2x without runtime overhead.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-12-10 12:29:56 +02:00
Paolo Bonzini
57271d63c4 exec: make address spaces 64-bit wide
As an alternative to commit 818f86b (exec: limit system memory
size, 2013-11-04) let's just make all address spaces 64-bit wide.
This eliminates problems with phys_page_find ignoring bits above
TARGET_PHYS_ADDR_SPACE_BITS and address_space_translate_internal
consequently messing up the computations.

In Luiz's reported crash, at startup gdb attempts to read from address
0xffffffffffffffe6 to 0xffffffffffffffff inclusive.  The region it gets
is the newly introduced master abort region, which is as big as the PCI
address space (see pci_bus_init).  Due to a typo that's only 2^63-1,
not 2^64.  But we get it anyway because phys_page_find ignores the upper
bits of the physical address.  In address_space_translate_internal then

    diff = int128_sub(section->mr->size, int128_make64(addr));
    *plen = int128_get64(int128_min(diff, int128_make64(*plen)));

diff becomes negative, and int128_get64 booms.

The size of the PCI address space region should be fixed anyway.

Reported-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-12-10 12:29:56 +02:00
Michael S. Tsirkin
b35ba30f8f exec: memory radix tree page level compression
At the moment, memory radix tree is already variable width, but it can
only skip the low bits of address.

This is efficient if we have huge memory regions but inefficient if we
are only using a tiny portion of the address space.

After we have built up the map, detect
configurations where a single L2 entry is valid.

We then speed up the lookup by skipping one or more levels.
In case any levels were skipped, we might end up in a valid section
instead of erroring out. We handle this by checking that
the address is in range of the resulting section.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-12-10 12:29:56 +02:00
Michael S. Tsirkin
97115a8d45 exec: pass hw address to phys_page_find
callers always shift by target page bits so let's just do this
internally.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-12-10 12:29:56 +02:00
Michael S. Tsirkin
8b795765db exec: extend skip field to 6 bit, page entry to 32 bit
Extend skip to 6 bit. As page entry doesn't fit in 16 bit
any longer anyway, extend it to 32 bit.
This doubles node map memory requirements, but follow-up
patches will save this memory.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-12-10 12:29:56 +02:00
Michael S. Tsirkin
9736e55b78 exec: replace leaf with skip
In preparation for dynamic radix tree depth support, rename is_leaf
field to skip, telling us how many bits to skip to next level.
Set to 0 for leaf.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-12-10 12:29:56 +02:00
Paolo Bonzini
03f4995781 split definitions for exec.c and translate-all.c radix trees
The exec.c and translate-all.c radix trees are quite different, and
the exec.c one in particular is not limited to the CPU---it can be
used also by devices that do DMA, and in that case the address space
is not limited to TARGET_PHYS_ADDR_SPACE_BITS bits.

We want to make exec.c's radix trees 64-bit wide.  As a first step,
stop sharing the constants between exec.c and translate-all.c.
exec.c gets P_L2_* constants, translate-all.c gets V_L2_*, for
consistency with the existing V_L1_* symbols.  Though actually
in the softmmu case translate-all.c is also indexed by physical
addresses...

This patch has no semantic change.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-12-10 12:29:56 +02:00
Marcelo Tosatti
ef36fa1492 qemu: mempath: prefault pages manually (v4)
v4: s/fail/failed/  (Peter Maydell)

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2013-11-25 11:28:56 +01:00
Anthony Liguori
29c5b77d3d pci, pc, virtio bug fixes
This reverts PCI master abort support - we'll want it
 eventually but it exposes too many core bugs to be safe for 1.7.
 This also reverts a recent exec.c change that was an
 attempt to work-around some of these core bugs.
 
 Also included are small fixes in pc and virtio,
 and a core loader fix for PPC bamboo.
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.15 (GNU/Linux)
 
 iQEcBAABAgAGBQJSf4ZyAAoJECgfDbjSjVRp9DIIAK7yEMa9ie5n3sInKH+xHT3R
 Sf4uErqx55WfT/54dnLJPrs7DTfXblW+Qjnq/7RuaoJ32Dfshgxz64mPF+Lm2s3+
 ghjdQrKo2YkdSbbxy+AnBNO4eHMSeUs/rM2yIfi7FZU0nwC7wNe1QpAN3UjM4yAF
 5vE18xZE0Rxz/prXgofLtPHa1czvGPFk1qbS7Vag6HCSkfEI4N1Jxf9otDRV6KZP
 9hX0kTvZyOKdbhccN05G4VCWwx5YUrpBsNSoph4Jx1aokEBoucr4sgE1FPDp0H9H
 bJqDaAM2G5HNrDtIiDov5WOzRNT/ly011Q4mcaQh3va0pqUXttKCHgE1KRgn76I=
 =iMNW
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'mst/tags/for_anthony' into staging

pci, pc, virtio bug fixes

This reverts PCI master abort support - we'll want it
eventually but it exposes too many core bugs to be safe for 1.7.
This also reverts a recent exec.c change that was an
attempt to work-around some of these core bugs.

Also included are small fixes in pc and virtio,
and a core loader fix for PPC bamboo.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Sun 10 Nov 2013 05:13:22 AM PST using RSA key ID D28D5469
# gpg: Can't check signature: public key not found

# By Michael S. Tsirkin (3) and others
# Via Michael S. Tsirkin
* mst/tags/for_anthony:
  Revert "exec: limit system memory size"
  Revert "hw/pci: partially handle pci master abort"
  loader: drop return value for rom_add_blob_fixed
  acpi-build: disable with -no-acpi
  virtio-net: only delete bh that existed
  Fix pc migration from qemu <= 1.5

Message-id: 1384159176-31662-1-git-send-email-mst@redhat.com
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2013-11-13 11:48:35 -08:00
Michael S. Tsirkin
ef9e455d64 Revert "exec: limit system memory size"
This reverts commit 818f86b883.

This was a work-around for bugs elsewhere in the system,
exposed by commit a53ae8e934:
    "hw/pci: partially handle pci master abort"
since that's reverted now, the work-around is not required for 1.7
anymore.
The proper fix is supporting full 64 bit addresses in the radix tree.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Marcel Apfelbaum <marcel.a@redhat.com>
2013-11-10 15:11:01 +02:00
Max Filippov
e8262a1b5b exec: fix breakpoint_invalidate when pc may not be translated
This fixes qemu abort with the following message:

    include/qemu/int128.h:22: int128_get64: Assertion `!a.hi' failed.

which happens due to attempt to invalidate breakpoint by virtual address
for which get_phys_page_debug couldn't find mapping.

For more details see
http://lists.nongnu.org/archive/html/qemu-devel/2013-09/msg04582.html

Cc: qemu-stable@nongnu.org
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2013-11-08 09:25:22 +04:00
Michael S. Tsirkin
818f86b883 exec: limit system memory size
The page table logic in exec.c assumes
that memory addresses are at most TARGET_PHYS_ADDR_SPACE_BITS.

But pci addresses are full 64 bit so if we try to render them ignoring
the extra bits, we get strange effects with sections overlapping each
other.

To fix, simply limit the system memory size to
 1 << TARGET_PHYS_ADDR_SPACE_BITS,
pci addresses will be rendered within that.

Cc: qemu-stable@nongnu.org
Reported-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-11-04 15:38:49 +02:00
Kevin Wolf
e85d9db5f6 exec: Fix bounce buffer allocation in address_space_map()
This fixes a regression introduced by commit e3127ae0c, which kept the
allocation size of the bounce buffer limited to one page in order to
avoid unbounded allocations (as explained in the commit message of
6d16c2f88), but broke the reporting of the shortened bounce buffer to
the caller. The caller therefore assumes that the full requested size
was provided and causes memory corruption when writing beyond the end of
the actually allocated buffer.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-28 17:34:42 +01:00
Paolo Bonzini
041603fe5d exec: remove qemu_safe_ram_ptr
This is not needed since the RAM list is not modified anymore by
qemu_get_ram_ptr.  Replace it with qemu_get_ram_block.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-10-17 17:31:00 +02:00
Stefan Weil
575ddeb459 exec: Fix prototype of phys_mem_set_alloc and related functions
phys_mem_alloc and its assigned values qemu_anon_ram_alloc and
legacy_s390_alloc must have identical argument lists.

legacy_s390_alloc uses the size parameter to call mmap, so size_t is
good enough for all of them.

This patch fixes compiler errors on i686 Linux hosts:

  CC    alpha-softmmu/exec.o
exec.c:752:51: error:
 initialization from incompatible pointer type [-Werror]
exec.c: In function 'qemu_ram_alloc_from_ptr':
exec.c:1139:32: error:
 comparison of distinct pointer types lacks a cast [-Werror]
exec.c: In function 'qemu_ram_remap':
exec.c:1283:21: error:
 comparison of distinct pointer types lacks a cast [-Werror]

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1380481005-32399-1-git-send-email-sw@weilnetz.de
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2013-10-14 08:50:34 -07:00
Anthony Liguori
39c153b80f QOM CPUState refactorings / X86CPU
* Fix for X86CPU model field of qemu32/qemu64 CPU models
 * Bug fix for longjmp on FreeBSD
 * Removal of unused function
 * Confinement of clone syscall infrastructure to linux-user
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iQIcBAABAgAGBQJSVTKzAAoJEPou0S0+fgE/7tYP/i5dgm6q7jSnhJcwzgHlCHDE
 c0BTwnvFjdBdkuAARYb/soo0m9QWfsW/dgC4bG3rO5j3o84PLstMjiZSQch0pqM1
 YhA0hYSiFjHrMcRk9FOwIECPIe+QcHZ79iNML+9G4K13D7qg36aJWISbVOWy24Dp
 kj5D0wBBDNw032Oh/3z3EAK4U+vLc/+i4s8XjfwtbuBCCn7GMCE3mRnEqnf8ZX3o
 H3Il3h/o+I3XQSzIJKXXyJZ5ZVXTtlj0z/0ShQXe8o8u1hINXE2Nf9lB6WG/6sh0
 Y43d0uU/e9fWDer25j9yis9KfDNErgYyxlBMUA2X1+Rny5P0twjnnBr5GTAeKgSq
 Kcux8Ov7W8cbVoM/px03rnynF9rbFbgmGlx82L+QsNMKWhjnEsfs6unpccpGhHR5
 UuZX3ZPrmeHfjv0AZD/U2ya3jfrp0v+9gsTqy3QV1rCPbqPDcJ6jg8jzbPZYjEfa
 /Zy0e/0O3sytSyiaAfBg3MzVPBxdzPcn0JjExJQV9BHsUlkZIVCZVMfePw1oIaf+
 coyV4cT3hCe8LrSCzPZlRYP+1hIg41W4NicLbDxtS8lqgfRbcglvqw6NFdAM+NcB
 z3heQ7IFstQ+pEINXQNy6bS8orv8F1VVvCtZaV+2pzB4TZzjPYuGsrqygre4QkLU
 mtpN9BTfmSIjzyo6iYBv
 =hQfy
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'afaerber/tags/qom-cpu-for-anthony' into staging

QOM CPUState refactorings / X86CPU

* Fix for X86CPU model field of qemu32/qemu64 CPU models
* Bug fix for longjmp on FreeBSD
* Removal of unused function
* Confinement of clone syscall infrastructure to linux-user

# gpg: Signature made Wed 09 Oct 2013 03:40:51 AM PDT using RSA key ID 3E7E013F
# gpg: Can't check signature: public key not found

# By Andreas Färber (2) and others
# Via Andreas Färber
* afaerber/tags/qom-cpu-for-anthony:
  cpu: Drop cpu_model_str from CPU_COMMON
  cpu: Move cpu_copy() into linux-user
  cputlb: Remove dead function tlb_update_dirty()
  cpu-exec: Also reload CPUClass *cc after longjmp return in cpu_exec()
  target-i386: Set model=6 on qemu64 & qemu32 CPU models
2013-10-10 13:16:25 -07:00
Andreas Färber
30ba0ee52d cpu: Move cpu_copy() into linux-user
It is only used there and is deemed very fragile if not incorrect in its
current memcpy() form. Moving it into linux-user will allow to move
parts into target_cpu.h headers and only copy what the ABI mandates.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-10-07 11:48:39 +02:00
Amos Kong
016e9d62fe exec: cleanup DEBUG_SUBPAGE
Touched some error after enabling DEBUG_SUBPAGE.

Signed-off-by: Amos Kong <akong@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2013-10-02 22:55:28 +04:00
Anthony Liguori
2e6ae666c8 Merge remote-tracking branch 'mjt/trivial-patches' into staging
# By Stefan Weil (8) and others
# Via Michael Tokarev
* mjt/trivial-patches:
  tests/.gitignore: ignore test-throttle
  exec: Fix broken build for MinGW (regression)
  kvm: Fix compiler warning (clang)
  tcg-sparc: Fix parenthesis warning
  Makefile: Remove some more files when cleaning
  target-i386: Fix segment cache dump
  iov: avoid "orig_len may be used unitialized" warning
  vscclient: remove unnecessary use of uninitialized variable
  trace-events: Clean up with scripts/cleanup-trace-events.pl again
  tci: Fix qemu-alpha on 32 bit hosts (wrong assertions)
  *-user: Improve documentation for lock_user function
  MAINTAINERS: Add missing entry to filelist for TCI target
  translate-all: Fix formatting of dump output
  *-user: Fix typo in comment (ulocking -> unlocking)
  docs: Fix IO port number for CPU present bitmap.
  q35: Fix typo in constant DEFUALT -> DEFAULT.
  configure: Undefine _FORTIFY_SOURCE prior using it

Message-id: 1379696296-32105-1-git-send-email-mjt@msgid.tls.msk.ru
2013-09-23 11:52:55 -05:00
Anthony Liguori
3e4be9c297 Merge remote-tracking branch 'qemu-kvm/uq/master' into staging
# By Alexey Kardashevskiy (3) and others
# Via Paolo Bonzini
* qemu-kvm/uq/master:
  target-i386: add feature kvm_pv_unhalt
  linux-headers: update to 3.12-rc1
  target-i386: forward CPUID cache leaves when -cpu host is used
  linux-headers: update to 3.11
  kvm: fix traces to use %x instead of %d
  kvmvapic: Clear also physical ROM address when entering INACTIVE state
  kvmvapic: Enter inactive state on hardware reset
  kvmvapic: Catch invalid ROM size
  kvm irqfd: support direct msimessage to irq translation
  fix steal time MSR vmsd callback to proper opaque type
  kvm: warn if num cpus is greater than num recommended
  cpu: Move cpu state syncs up into cpu_dump_state()
  exec: always use MADV_DONTFORK

Message-id: 1379694292-1601-1-git-send-email-pbonzini@redhat.com
2013-09-23 11:52:49 -05:00
Stefan Weil
089f3f761e exec: Fix broken build for MinGW (regression)
Commit 3435f39513 reduced the ifdeffery with
this result for MinGW:

exec.c: In function ‘qemu_ram_free’:
exec.c:1239:17: warning:
 implicit declaration of function ‘munmap’ [-Wimplicit-function-declaration]
exec.c:1239:17: warning:
 nested extern declaration of ‘munmap’ [-Wnested-externs]
exec.c:1239: undefined reference to `munmap'

Add some ifdeffery again to fix this.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2013-09-20 20:13:09 +04:00
Andrea Arcangeli
3e469dbfe4 exec: always use MADV_DONTFORK
MADV_DONTFORK prevents fork to fail with -ENOMEM if the default
overcommit heuristics decides there's too much anonymous virtual
memory allocated. If the KVM secondary MMU is synchronized with MMU
notifiers or not, doesn't make a difference in that regard.

Secondly it's always more efficient to avoid copying the guest
physical address space in the fork child (so we avoid to mark all the
guest memory readonly in the parent and so we skip the establishment
and teardown of lots of pagetables in the child).

In the common case we can ignore the error if MADV_DONTFORK is not
available. Leave a second invocation that errors out in the KVM path
if MMU notifiers are missing and KVM is enabled, to abort in such
case.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Tested-By: Benoit Canet <benoit@irqsave.net>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-09-20 12:37:52 +02:00
Markus Armbruster
39228250ce exec: Don't abort when we can't allocate guest memory
We abort() on memory allocation failure.  abort() is appropriate for
programming errors.  Maybe most memory allocation failures are
programming errors, maybe not.  But guest memory allocation failure
isn't, and aborting when the user asks for more memory than we can
provide is not nice.  exit(1) instead, and do it in just one place, so
the error message is consistent.

Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-id: 1375276272-15988-8-git-send-email-armbru@redhat.com
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-09-12 11:45:32 -05:00
Markus Armbruster
e1e84ba050 exec: Clean up unnecessary S390 ifdeffery
Another issue missed in commit fdec991 is -mem-path: it needs to be
rejected only for old S390 KVM, not for any S390.  Not that I
personally care, but the ifdeffery in qemu_ram_alloc_from_ptr() annoys
me.

Note that this doesn't actually make -mem-path work, as the kernel
doesn't (yet?)  support large pages in the host for KVM guests.  Clean
it up anyway.

Thanks to Christian Borntraeger for pointing out the S390 kernel
limitations.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-id: 1375276272-15988-7-git-send-email-armbru@redhat.com
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-09-12 11:45:32 -05:00
Markus Armbruster
2eb9fbaab5 exec: Drop incorrect & dead S390 code in qemu_ram_remap()
Old S390 KVM wants guest RAM mapped in a peculiar way.  Commit 6b02494
implemented that.

When qemu_ram_remap() got added in commit cd19cfa, its code carefully
mimicked the allocation code: peculiar way if defined(TARGET_S390X) &&
defined(CONFIG_KVM), else normal way.

For new S390 KVM, we actually want the normal way.  Commit fdec991
changed qemu_ram_alloc_from_ptr() accordingly, but forgot to update
qemu_ram_remap().  If qemu_ram_alloc_from_ptr() maps RAM the normal
way, but qemu_ram_remap() remaps it the peculiar way, remapping
changes protection and flags, which it shouldn't.

Fortunately, this can't happen, as we never remap on S390.

Replace the incorrect code with an assertion.

Thanks to Christian Borntraeger for help with assessing the bug's
(non-)impact.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Message-id: 1375276272-15988-6-git-send-email-armbru@redhat.com
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-09-12 11:45:31 -05:00
Markus Armbruster
91138037cb exec: Simplify the guest physical memory allocation hook
Make it a generic hook rather than a KVM hook.  Less code and
ifdeffery.

Since the only user of the hook is old S390 KVM, there's hope we can
get rid of it some day.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Message-id: 1375276272-15988-5-git-send-email-armbru@redhat.com
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-09-12 11:45:31 -05:00
Markus Armbruster
3435f39513 exec: Reduce ifdeffery around -mem-path
Instead of spreading its ifdeffery everywhere, confine it to
qemu_ram_alloc_from_ptr().  Everywhere else, simply test block->fd,
which is non-negative exactly when block uses -mem-path.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Message-id: 1375276272-15988-4-git-send-email-armbru@redhat.com
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-09-12 11:45:31 -05:00
Markus Armbruster
0628c18267 exec: Clean up fall back when -mem-path allocation fails
With -mem-path, qemu_ram_alloc_from_ptr() first tries to allocate
accordingly, but when it fails, it falls back to normal allocation.

The fall back allocation code used to be effectively identical to the
"-mem-path not given" code, until it started to diverge in commit
432d268.  I believe the code still works, but clean it up anyway: drop
the special fall back allocation code, and fall back to the ordinary
"-mem-path not given" code instead.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Message-id: 1375276272-15988-3-git-send-email-armbru@redhat.com
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-09-12 11:45:31 -05:00
Markus Armbruster
dfeaf2abc7 exec: Fix Xen RAM allocation with unusual options
Issues:

* We try to obey -mem-path even though it can't work with Xen.

* To implement -machine mem-merge, we call
  memory_try_enable_merging(new_block->host, size).  But with Xen,
  new_block->host remains null.  Oops.

Fix by separating Xen allocation from normal allocation.

Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Message-id: 1375276272-15988-2-git-send-email-armbru@redhat.com
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-09-12 11:45:31 -05:00
liguang
2641689a37 exec: do tcg_commit only when tcg_enabled
Signed-off-by: liguang <lig.fnst@cn.fujitsu.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-05 18:11:52 +02:00
Jan Kiszka
3bb28b7208 memory: Provide separate handling of unassigned io ports accesses
Accesses to unassigned io ports shall return -1 on read and be ignored
on write. Ensure these properties via dedicated ops, decoupling us from
the memory core's handling of unassigned accesses.

Cc: qemu-stable@nongnu.org
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-05 18:11:43 +02:00
Hu Tao
8826624970 exec: check offset_within_address_space for register subpage
If offset_within_address_space falls in a page, then we register a
subpage. So check offset_within_address_space rather than
offset_within_region.

Cc: qemu-stable@nongnu.org
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: "Andreas Färber" <afaerber@suse.de>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-05 18:11:37 +02:00
Paolo Bonzini
098178f274 exec: fix writing to MMIO area with non-power-of-two length
The problem is introduced by commit 2332616 (exec: Support 64-bit
operations in address_space_rw, 2013-07-08).  Before that commit,
memory_access_size would only return 1/2/4.

Since alignment is already handled above, reduce l to the largest
power of two that is smaller than l.

Cc: qemu-stable@nongnu.org
Reported-by: Oleksii Shevchuk <alxchk@gmail.com>
Tested-by: Oleksii Shevchuk <alxchk@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-05 18:11:28 +02:00
Andreas Färber
38fcbd3f08 cpu: Replace qemu_for_each_cpu()
It was introduced to loop over CPUs from target-independent code, but
since commit 182735efaf target-independent
CPUState is used.

A loop can be considered more efficient than function calls in a loop,
and CPU_FOREACH() hides implementation details just as well, so use that
instead.

Suggested-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-09-03 12:25:55 +02:00
Andreas Färber
bdc44640cb cpu: Use QTAILQ for CPU list
Introduce CPU_FOREACH(), CPU_FOREACH_SAFE() and CPU_NEXT() shorthand
macros.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-09-03 12:25:55 +02:00