Unless a virtual server address was explicitly defined (which is
impossible with the legacy -net channel format), guestfwd did not
properly forwarded host->guest packets. This patch fixes it.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This allows a program to initialize a host networking device using a
file descriptor passed over a unix monitor socket.
The program must first pass the file descriptor using SCM_RIGHTS
ancillary data with the getfd monitor command. It then may do
"host_net_add tap fd=name" to use the named file descriptor.
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This got broken between a13a4126c8 and c92ef6a22d: old slirp code used
255.255.255.0.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
If no tap,sndbuf= arg is supplied, we use a default value. If
TUNSETSNDBUF fails in this case, we should not abort.
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
On reflection, perhaps it does make sense to set a default value for
the sndbuf= tap parameter.
For best effect, sndbuf= should be set to just below the capacity of
the physical NIC.
Setting it higher will cause packets to be dropped before the limit
is hit. Setting it much lower will not cause any problems unless
you set it low enough such that the guest cannot queue up new packets
before the NIC has emptied its queue.
In Linux, txqueuelen=1000 by default for ethernet NICs. Given a 1500
byte MTU, 1Mb is a good choice for sndbuf.
If it turns out that txqueuelen is actually much lower than this, then
sndbuf is essentially disabled. In the event that txqueuelen is much
higher, it's unlikely that the NIC will be able to empty a 1Mb queue.
Thanks to Herbert Xu for this logic.
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Cc: Herbert Xu <herbert.xu@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Extend the syntax of hostfwd_add/remove to optionally take a tuple of
VLAN ID and slirp stack name. If those are omitted, the commands will
continue to work on the first registered slirp stack.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Introduce qemu_find_vlan_client_by_name for VLANClientState lookup based
on VLAN ID and client name. This is useful for monitor commands.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Push the smb state, smb_dir, into SlirpState and construct it in a way
that allows multiple smb instances (one per slirp stack). Remove the smb
directory on slirp cleanup instead of qemu termination. As VLAN clients
are also cleaned up on process termination, no feature is lost.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Make sure for invocations from the monitor that slirp_smb properly
reports errors and doesn't terminate qemu.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Instead of open-coding this, we can use the power of the shell to remove
the smb_dir on exit.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Once again this was a long journey to reach the destination: Allow to
instantiate slirp multiple times. But as in the past, the journey was
worthwhile, cleaning up, fixing and enhancing various parts of the user
space network stack along the way.
What is this particular change good for? Multiple slirps instances
allow separated user space networks for guests with multiple NICs. This
is already possible, but without any slirp support for the second
network, ie. without a chance to talk to that network from the host via
IP. We have a legacy guest system here that benefits from this slirp
enhancement, allowing us to run both of its NICs purely over
unprivileged user space IP stacks.
Another benefit of this patch is that it simply removes an artificial
restriction of the configuration space qemu is providing, avoiding
another source of surprises that users may face when playing with
possible setups.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Allocate the internal slirp state dynamically and provide and call
slirp_cleanup to properly release it after use. This patch finally
unbreaks slirp release and re-instantiation via host_net_* monitor
commands.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This now also exports the internal state to the slirp users in qemu,
returning it from slirp_init and expecting it along with service
invocations. Additionally provide an opaque value interface for the
callbacks from slirp into the qemu core.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Slirp doesn't invoke slirp[_can]_output before it is initialized. The
motivation for these checks (3b7f5d479c) no longer applies. So drop
them.
Note: slirp_vc will become invalid if the slirp stack is removed during
runtime. But this is no new bug and will be fixed later.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Avoid the need for slirp_is_inited by refactoring the protected
slirp_select_* functions. This also avoids the clearing of all fd sets
on select errors.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
As agreed on the mailing list, there is no interest in keeping the
usually disabled slirp statistics in the tree. So this patch removes
them.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Break out sockstats from the slirp statistics and present them under the
new info category "usernet". This patch also improves the current output
/wrt proper reporting connection source and destination.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Prevent that the users accidentally shoots down dynamic sockets. This
allows to remove looping for removals as there can now only be one
match.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Extend the hostfwd rule format so that the user can specify on which
host interface qemu should listen for incoming connections. If omitted,
binding will takes place against all interfaces.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Improve the monitor interface for adding and removing host forwarding
rules by splitting it up in two commands and rename them to hostfwd_add
and hostfwd_remove. Also split up the paths taken for legacy -redir
support and the monitor add command as the latter will be extended later
on.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
With the internal IP configuration made more flexible, we can now
enhance the user interface. This patch adds a number of new options to
"-net user": net (address and mask), host, dhcpstart, dns and smbserver.
It also renames "redir" to "hostfwd" and "channel" to "guestfwd" in
order to (hopefully) clarify their meanings. The format of guestfwd is
extended so that the user can define not only the port but also the
virtual server's IP address the forwarding starts from.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
So far a couple of slirp-related parameters were expressed via
stand-alone command line options. This it inconsistent and unintuitive.
Moreover, it prevents both dynamically reconfigured (host_net_add/
delete) and multi-instance slirp.
This patch refactors the configuration by turning -smb, -redir, -tftp
and -bootp as well as -net channel into options of "-net user". The old
stand-alone command line options are still processed, but no longer
advertised. This allows smooth migration of management applications to
to the new syntax and also the extension of that syntax later in this
series.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This reverts commit 1c6ed9f337.
It's redundant to slirp statistics, which are going to be split up /
reworked later on.
Conflicts:
monitor.c
net.c
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Add an option to specify the number of MSI-X vectors for PCI NIC cards. This
can also be used to disable MSI-X, for compatibility with old qemu. This
option currently only affects virtio cards.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
virtio-net needs this - for the same purpose that it currently uses the
return value from qemu_sendv_packet().
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2.6.30 adds a new TUNSETSNDBUF ioctl() which allows a send buffer limit
for the tap device to be specified. When this limit is reached, a tap
write() will return EAGAIN and poll() will indicate the fd isn't
writable.
This allows people to tune their setups so as to avoid e.g. UDP packet
loss when the sending application in the guest out-runs the NIC in the
host.
There is no obviously sensible default setting - a suitable value
depends mostly on the capabilities of the physical NIC through which the
packets are being sent.
Also, note that when using a bridge with netfilter enabled, we currently
never get EAGAIN because netfilter causes the packet to be immediately
orphaned. Set /proc/sys/net/bridge/bridge nf-call-iptables to zero to
disable this behaviour.
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
net_tap_fd_init() already returns TAPState, so this is a sensible
cleanup in its own right.
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
If a write() on tapfd returns EAGAIN, return zero so that the packet
gets queued (in the case of async send) and enable polling tapfd for
writing.
When tapfd becomes writable, disable write polling and flush any queued
packets.
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Add a helper to enable/disable the read polling on tapfd.
We need this, because we want to start write polling on the tapfd too
and enable/disable both types of polling independently.
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
If tap has any packets queued at host_net_remove time, it needs to purge
them in order to prevent a sent callback being invoked for it.
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
If net client sends packets asynchronously, it needs to purge its queued
packets in cleanup() so as to prevent sent callbacks being invoked with
a freed client.
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Make net_client_init() accept addr=, put the value into struct
NICinfo. Use it in pci_nic_init(), and remove arguments bus and
devfn.
Don't support addr= in third argument of monitor command pci_add,
because that clashes with its first argument. Admittedly unelegant.
Machines "malta" and "r2d" have a default NIC with a well-known PCI
address. Deal with that the same way as the NIC model: make
pci_nic_init() take an optional default to be used when the user
doesn't specify one.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Fix the pcap dumps on Win32 and other systems where O_BINARY is required.
Signed-off-by: Filip Navara <filip.navara@gmail.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
The code how it is today, is totally painful to read and keep.
To begin with, the code is duplicated with the option rom loading
code that linux_boot and vga are already using.
This patch introduces a "bootable" state in NICInfo structure,
that we can use to keep track of whether or not a given nic should
be bootable, avoiding the introduction of yet another global state.
With that in hands, we move the code in vl.c to hw/pc.c, and use
the already existing infra structure to load those option roms.
Error checking code suggested by Mark McLoughlin
Signed-off-by: Glauber Costa <glommer@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Type casts removing the const attribute are bad because
they hide the fact that the argument remains const.
They also result in a compiler warning (at least with MS-C).
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Work around buffer and ioctlsocket argument type signedness problems
Suppress a prototype which is unused on mingw32
Expand a macro to avoid warnings from some GCC versions
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
If a packet is queued by qemu_send_packet(), remove I/O
handler for the tap fd until we get notification that the
packet has been sent.
A not insignificant side effect of this is we can now
drain the tap send queue in one go without fear of packets
being dropped.
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Add a qemu_send_packet() variant which will queue up the packet
if it cannot be sent when all client queues are full. It later
invokes the supplied callback when the packet has been sent.
If qemu_send_packet_async() returns zero, the caller is expected
to not send any more packets until the queued packet has been
sent.
Packets are queued iff a receive() handler returns zero (indicating
queue full) and the caller has provided a sent notification callback
(indicating it will stop and start its own queue).
We need the packet sending API to support queueing because:
- a sending client should process all available packets in one go
(e.g. virtio-net emptying its tx ring)
- a receiving client may not be able to handle the packet
(e.g. -EAGAIN from write() to tapfd)
- the sending client could detect this condition in advance
(e.g. by select() for writable on tapfd)
- that's too much overhead (e.g. a select() call per packet)
- therefore the sending client must handle the condition by
dropping the packet or queueing it
- dropping packets is poor form; we should queue.
However, we don't want queueing to be completely transparent. We
want the sending client to stop sending packets as soon as a
packet is queued. This allows the sending client to be throttled
by the receiver.
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
VLANClientState's fd_read() handler doesn't read from file
descriptors, it adds a buffer to the client's receive queue.
Re-name the handlers to make things a little less confusing.
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
This, apparently, is the style we prefer - all VLANClientState
should be an argument to qemu_new_vlan_client().
Signed-off-by: Mark McLoughlin <markmc@redhat.com>