Instead of checking the limit before calling numa_add(), check the limit
only when we already know we're going to add a new node.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Abort in case an invalid -numa option is provided, instead of silently
ignoring it.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
The numa_add() code was unconditionally adding 1 to the get_opt_name()
return value, making it point after the end of the string if no ','
separator is present.
Example of weird behavior caused by the bug:
$ qemu-img create -f qcow2 this-file-image-has,cpus=5,mem=1000,in-its-name.qcow2 5G
Formatting 'this-file-image-has,cpus=5,mem=1000,in-its-name.qcow2', fmt=qcow2 size=5368709120 encryption=off cluster_size=65536
$ ./x86_64-softmmu/qemu-system-x86_64 -S -monitor stdio -numa node 'this-file-image-has,cpus=5,mem=1000,in-its-name.qcow2'
QEMU 1.3.50 monitor - type 'help' for more information
(qemu) info numa
1 nodes
node 0 cpus: 0
node 0 size: 1000 MB
(qemu)
This changes the code to nove the pointer only if ',' is found.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
There are lots of duplicate parsing code using strto*() in QEMU, and
most of that code is broken in one way or another. Even the visitors
code have duplicate integer parsing code[1]. This introduces functions
to help parsing unsigned int values: parse_uint() and parse_uint_full().
Parsing functions for signed ints and floats will be submitted later.
parse_uint_full() has all the checks made by opts_type_uint64() at
opts-visitor.c:
- Check for NULL (returns -EINVAL)
- Check for negative numbers (returns -EINVAL)
- Check for empty string (returns -EINVAL)
- Check for overflow or other errno values set by strtoll() (returns
-errno)
- Check for end of string (reject invalid characters after number)
(returns -EINVAL)
parse_uint() does everything above except checking for the end of the
string, so callers can continue parsing the remainder of string after
the number.
Unit tests included.
[1] string-input-visitor.c:parse_int() could use the same parsing code
used by opts-visitor.c:opts_type_int(), instead of duplicating that
logic.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Around r3361 (81fdc5f8d2) env->debug1 used
to contain the address of an MMU fault. This is now written into
env->pregs[PR_EDA] instead.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
We had two copies of a ffs function for longs with subtly different
semantics and, for the one in bitops.h, a confusing name: the result
was off-by-one compared to the library function ffsl.
Unify the functions into one, and solve the name problem by calling
the 0-based functions "bitops_ctzl" and "bitops_ctol" respectively.
This also fixes the build on platforms with ffsl, including Mac OS X
and Windows.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Tested-by: Andreas Färber <afaerber@suse.de>
Tested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
MinGW has no strtok_r, so we need a declaration in sysemu/os-win32.h.
We must also fix the include statements in util/envlist.c to include
that file.
We currently don't need an implementation of strtok_r because the
code is compiled but not linked for MinGW.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
The multiqueue patch series broke -netdev tap,fd=X which manifests
as libvirt not being able to start a guest. This was because it
passed NULL for the netdev name which results in an anonymous netdev
device regardless of what the user specified.
Cc: Jason Wang <jasowang@redhat.com>
Cc: Bruce Rogers <brogers@suse.com>
Reported-by: Bruce Rogers <brogers@suse.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This is now unused. Document the initial reference count of an object
and when it will be freed/finalized.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
CPUs are never added to the composition tree, so delete is achieved
simply by removing the last references to them.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
qdev_free and qbus_free have to do unparent+unref, because nobody else
drops the initial reference (the one included by object_initialize)
before them.
For device_init_func and do_device_add, this is trivially correct,
since the DeviceState goes out of scope.
For qdev_create, qdev_try_create and qbus_init, it is a bit more tricky.
What we are doing here is just assuming that the caller knows what it's
doing, and won't call qdev_free/qbus_free while the device is still there.
This is a pretty reasonable assumption and (behind the scenes) is also
what GObject/GTK does. GTK actually has a "floating reference" that
goes away as soon as the caller does gtk_container_add or something
like that, but in the end qbus_init and qdev_try_create are already
adding the new object to its qdev parent! So in the end the two solutions
are the same.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
We want object_delete to disappear, and we will do this one class at a
time. Inline it for the qdev case, which we will tackle first.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Now that the unparent callbacks are complete, we can correctly account
more missing references.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Similarly, a bus holds a reference back to the device, and this will
prevent the device from going away as soon as this reference is counted
properly. To avoid this, move the unrealization of devices to the
unparent callback. This includes recursively unparenting all the buses
and (after the previous patch) the devices on those buses, which ensures
that the web of references completely disappears for all devices that
reside (in the qdev tree) below the one being unplugged.
After this patch, the qdev tree and the bus<->child relationship is
defined as "A is above B, iff unplugging A will automatically unplug B".
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
A device will never be finalized as long as it has a reference from
other devices that sit on its buses. To ensure that the references
go away, deassociate a bus from its children in the unparent callback
for the bus.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Each device has a reference through the BusChild. This reference
was not accounted for, add it now.
Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Avoid that the object disappears after it's deleted from the QOM
composition tree, in case that was the only reference to it.
Acked-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Remove knowledge of QOM innards. The common part of pci_bus_new and
pci_bus_new_inplace is moved to a new function pci_bus_init.
Acked-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Make it clear that no BUS() macro is needed in the callers (in fact it
wouldn't work because the object has not been initialized yet with the
right class).
Suggested-by: Andreas Faerber <afaerber@suse.de>
Acked-by: Andreas F=E4rber <afaerber@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Move the common part to qbus_realize.
Acked-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
command:
qemu-system-x86_64 -hda disk.img -smp 32 --enable-kvm
error:
Number of SMP cpus requested (32) exceeds max cpus supported by KVM (16)
failed to initialize KVM: Invalid argument
No accelerator found!
well, it did find kvm, but failed to init,
so message "No accelerator found!" is confusing,
this commit remove the confusing error message.
Signed-off-by: liguang <lig.fnst@cn.fujitsu.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
We've seen this repeatedly in buildbot but I can now reliably
reproduce it myself too. With a few hundred runs of 'make check',
qemu-system-sparc will hang consuming 100% CPU. I've attached GDB
to the hung process and unfortunately, I can't get anything useful
out of GDB (RIP is not a valid simple and there is nothing else on
the stack).
At any rate, since this only manifests in qemu-system-sparc and it
doesn't appear to be a qtest specific problem, I think we should
disable it until the problem is resolved.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
# By Kevin Wolf (7) and others
# Via Stefan Hajnoczi
* stefanha/block:
block/raw-posix: Build fix for O_ASYNC
vmdk: Allow space in file name
parallels: Fix bdrv_open() error handling
dmg: Use g_free instead of free
dmg: Fix bdrv_open() error handling
vpc: Fix bdrv_open() error handling
cloop: Fix bdrv_open() error handling
bochs: Fix bdrv_open() error handling
sheepdog: pass vdi_id to sheep daemon for sd_close()
vmdk: Allow selecting SCSI adapter in image creation
block: Adds mirroring tests for resized images
block: Fix is_allocated_above with resized files
qemu-iotests: Add regression test for b7ab0fea
This patch add migration support for multiqueue virtio-net. Instead of bumping
the version, we conditionally send the info of multiqueue only when the device
support more than one queue to maintain the backward compatibility.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This patch implements both userspace and vhost support for multiple queue
virtio-net (VIRTIO_NET_F_MQ). This is done by introducing an array of
VirtIONetQueue to VirtIONet.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
To support multiqueue virtio-net, the first step is to separate the virtqueue
related fields from VirtIONet to a new structure VirtIONetQueue. The following
patches will add an array of VirtIONetQueue to VirtIONet based on this patch.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Add a queue_index to VirtQueue and a helper to fetch it, this could be used by
multiqueue supported device.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Some device (such as virtio-net) needs the ability to destroy or re-order the
virtqueues, this patch adds a helper to do this.
Signed-off-by: Jason Wang <jasowang>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This patch lets vhost support multiqueue. The idea is simple, just launching
multiple threads of vhost and let each of vhost thread processing a subset of
the virtqueues of the device. After this change each emulated device can have
multiple vhost threads as its backend.
To do this, a virtqueue index were introduced to record to first virtqueue that
will be handled by this vhost_net device. Based on this and nvqs, vhost could
calculate its relative index to setup vhost_net device.
Since we may have many vhost/net devices for a virtio-net device. The setting of
guest notifiers were moved out of the starting/stopping of a specific vhost
thread. The vhost_net_{start|stop}() were renamed to
vhost_net_{start|stop}_one(), and a new vhost_net_{start|stop}() were introduced
to configure the guest notifiers and start/stop all vhost/vhost_net devices.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Recently, linux support multiqueue tap which could let userspace call TUNSETIFF
for a signle device many times to create multiple file descriptors as
independent queues. User could also enable/disabe a specific queue through
TUNSETQUEUE.
The patch adds the generic infrastructure to create multiqueue taps. To achieve
this a new parameter "queues" were introduced to specify how many queues were
expected to be created for tap by qemu itself. Alternatively, management could
also pass multiple pre-created tap file descriptors separated with ':' through a
new parameter fds like -netdev tap,id=hn0,fds="X:Y:..:Z". Multiple vhost file
descriptors could also be passed in this way.
Each TAPState were still associated to a tap fd, which mean multiple TAPStates
were created when user needs multiqueue taps. Since each TAPState contains one
NetClientState, with the multiqueue nic support, an N peers of NetClientState
were built up.
A new parameter, mq_required were introduce in tap_open() to create multiqueue
tap fds.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This patch introduces a helper tap_get_ifname() to get the device name of tap
device. This is needed when ifname is unspecified in the command line and qemu
were asked to create tap device by itself. In this situation, the name were
allocated by kernel, so if multiqueue is asked, we need to fetch its name after
creating the first queue.
Only linux has this support since it's the only platform that supports
multiqueue tap.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This patch introduce a new bit - enabled in TAPState which tracks whether a
specific queue/fd is enabled. The tap/fd is enabled during initialization and
could be enabled/disabled by tap_enalbe() and tap_disable() which calls platform
specific helpers to do the real work. Polling of a tap fd can only done when
the tap was enabled.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This patch add basic multiqueue support for Linux. When multiqueue is needed, we
will first check whether kernel support multiqueue tap before creating more
queues. Two new functions tap_fd_enable() and tap_fd_disable() were introduced
to enable and disable a specific queue. Since the multiqueue is only supported
in Linux, return error on other platforms.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This patch factors out the common initialization of tap into a new helper
net_init_tap_one(). This will be used by multiqueue tap patches.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Import multiqueue constants from if_tun.h from 3.8-rc3. A new ifr flag
IFF_MULTI_QUEUE were introduced to create a multiqueue backend by calling
TUNSETIFF with the this flag and with the same interface name many times.
A new ioctl TUNSETQUEUE were introduced. When doing this ioctl with
IFF_DETACH_QUEUE, the queue were disabled in the linux kernel. When doing this
ioctl with IFF_ATTACH_QUEUE, the queue were enabled in the linux kernel.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This patch adds basic multiqueue support for qemu. The idea is simple, an array
of NetClientStates were introduced in NICState, parse_netdev() were extended to
find and match all NetClientStates belongs to the backend and place their
pointers in NICConf. Then qemu_new_nic can setup a N:N mapping between NICStates
that belongs to a nic and NICStates belongs to the netdev. And a queue_index
were introduced in NetClientState to track its index. After this, each peers of
a NICState were abstracted as a queue.
After this change, all NetClientState that belongs to the same backend/nic has
the same id. When use want to change the link status, all NetClientStates that
belongs to the same backend/nic will be also changed. When user want to delete
a device or netdev, all NetClientStates that belongs to the same backend/nic
will be deleted also. Changing or deleting an specific queue is not allowed.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
To allow allocating an array of NetClientState and free it once, this patch
introduces destructor of NetClientState. Which could do type specific free,
which could be used by multiqueue to free the array once.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This patch separates the setup of NetClientState from its allocation, this will
allow allocating an arrays of NetClientState and does the initialization one by
one which is what multiqueue needs.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
In multiqueue, all NetClientState that belongs to the same netdev or nic has the
same id. So this patches introduces an helper qemu_find_net_clients_except()
which finds all NetClientState with the same id. This will be used by multiqueue
networking.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
To support multiqueue nic, this patch separate the nic destructor from
qemu_del_net_client() to a new helper qemu_del_nic() since the mapping bettween
NiCState and NetClientState were not 1:1 in multiqueue. The following patches
would refactor this function to support multiqueue nic.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
To support multiqueue, this patch introduces a helper qemu_get_nic() to get
NICState from a NetClientState. The following patches would refactor this helper
to support multiqueue.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
To support multiqueue, the patch introduce a helper qemu_get_queue()
which is used to get the NetClientState of a device. The following patches would
refactor this helper to support multiqueue.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>