Original Xbox Emulator for Windows, macOS, and Linux (Active Development)
Vladimir Sementsov-Ogievskiy 4ddb5d2fde block/nbd: drop connection_co
OK, that's a big rewrite of the logic.

Pre-patch we have an always running coroutine - connection_co. It does
reply receiving and reconnecting. And it leads to a lot of difficult
and unobvious code around drained sections and context switch. We also
abuse bs->in_flight counter which is increased for connection_co and
temporarily decreased at points where we want to allow a drained section to
begin. One of these places is in another file: in nbd_read_eof() in
nbd/client.c.

We also cancel reconnect and requests waiting for reconnect on drained
begin which is not correct. And this patch fixes that.

Let's finally drop this always running coroutine and go another way:
do both reconnect and receiving in request coroutines.

The detailed list of changes below (in the sequence of diff hunks).

1. receiving coroutines are woken directly from nbd_channel_error, when
   we change s->state

2. nbd_co_establish_connection_cancel(): we don't have drain_begin now,
   and in nbd_teardown_connection() all requests should already be
   finished (and reconnect is done from request). So
   nbd_co_establish_connection_cancel() is called from
   nbd_cancel_in_flight() (to cancel the request that is doing
   nbd_co_establish_connection()) and from reconnect_delay_timer_cb()
   (previously we didn't need it, as reconnect delay only should cancel
   active requests not the reconnection itself). But now reconnection
   itself is done in the separate thread (we now call
   nbd_client_connection_enable_retry() in nbd_open()), and we need to
   cancel the requests that wait in nbd_co_establish_connection()
   now).

2A. We do receive headers in request coroutine. But we also should
   dispatch replies for other pending requests. So,
   nbd_connection_entry() is turned into nbd_receive_replies(), which
   does reply dispatching while it receives other request headers, and
   returns when it receives the requested header.

3. All old stuff around drained sections and context switch is dropped.
   In detail:
   - we don't need to move connection_co to new aio context, as we
     don't have connection_co anymore
   - we don't have a fake "request" of connection_co (extra increasing
     in_flight), so don't care with it in drain_begin/end
   - we don't stop reconnection during drained section anymore. This
     means that drain_begin may wait for a long time (up to
     reconnect_delay). But that's an improvement and more correct
     behavior, see below[*]

4. In nbd_teardown_connection() we don't have to wait for
   connection_co, as it is dropped. And cleanup for s->ioc and nbd_yank
   is moved here from removed connection_co.

5. In nbd_co_do_establish_connection() we now should handle
   NBD_CLIENT_CONNECTING_NOWAIT: if new request comes when we are in
   NBD_CLIENT_CONNECTING_NOWAIT, it still should call
   nbd_co_establish_connection() (who knows, maybe the connection was
   already established by another thread in the background). But we
   shouldn't wait: if nbd_co_establish_connection() can't return new
   channel immediately the request should fail (we are in
   NBD_CLIENT_CONNECTING_NOWAIT state).

6. nbd_reconnect_attempt() is simplified: it's now easier to wait for
   other requests in the caller, so here we just assert that fact.
   Also delay time is now initialized here: we can easily detect first
   attempt and start a timer.

7. nbd_co_reconnect_loop() is dropped, we don't need it. Reconnect
   retries are fully handled by the thread (nbd/client-connection.c), the
   delay timer is initialized in nbd_reconnect_attempt(), and we don't
   have to bother with s->drained and friends. nbd_reconnect_attempt()
   is now called from nbd_co_send_request().

8. nbd_connection_entry is dropped: reconnect is now handled by
   nbd_co_send_request(), receiving reply is now handled by
   nbd_receive_replies(): all handled from request coroutines.

9. So, welcome new nbd_receive_replies() called from request coroutine,
   that receives reply header instead of nbd_connection_entry().
   Like with sending requests, only one coroutine may receive at a
   time. So we introduce receive_mutex, which is locked around
   nbd_receive_reply(). It also protects some related fields. Still,
   a full audit of thread-safety in the nbd driver is a separate task.
   The new function waits for a reply with the specified handle to be
   received and works rather simply:

   Under mutex:
     - if current handle is 0, do receive by hand. If another handle
       received - switch to other request coroutine, release mutex and
       yield. Otherwise return success
     - if current handle == requested handle, we are done
     - otherwise, release mutex and yield

10. In nbd_co_send_request() we now do nbd_reconnect_attempt() if
    needed. Also waiting in free_sema queue we now wait for one of two
    conditions:
    - connectED, in_flight < MAX_NBD_REQUESTS (so we can start new one)
    - connectING, in_flight == 0, so we can call
      nbd_reconnect_attempt()
    And this logic is protected by s->send_mutex

    Also, on failure we don't have to care of removed s->connection_co

11. nbd_co_do_receive_one_chunk(): now instead of yield() and wait for
    s->connection_co we just call new nbd_receive_replies().

12. nbd_co_receive_one_chunk(): place where s->reply.handle becomes 0,
    which means that handling of the whole reply is finished. Here we
    need to wake one of coroutines sleeping in nbd_receive_replies().
    If none are sleeping - do nothing. That's another behavior change: we
    don't have endless recv() in the idle time. It may be considered as
    a drawback. If so, it may be fixed later.

13. nbd_reply_chunk_iter_receive(): don't care about removed
    connection_co, just ping in_flight waiters.

14. Don't create connection_co, enable retry in the connection thread
    (we don't have own reconnect loop anymore)

15. We now need to add a nbd_co_establish_connection_cancel() call in
    nbd_cancel_in_flight(), to cancel the request that is doing a
    connection attempt.

[*], ok, now we don't cancel reconnect on drain begin. That's correct:
    reconnect feature leads to possibility of long-running requests (up
    to reconnect delay). Still, drain begin is not a reason to kill
    long requests. We should wait for them.

    This also means, that we can again reproduce a dead-lock, described
    in 8c517de24a.
    Why we are OK with it:
    1. Now this is not absolutely-dead dead-lock: the vm is unfrozen
       after reconnect delay. Actually 8c517de24a fixed a bug in
       NBD logic, that was not described in 8c517de24a and led to
       forever dead-lock. The problem was that nobody woke the free_sema
       queue, but drain_begin can't finish until there is a request in
       free_sema queue. Now we have a reconnect delay timer that works
       well.
    2. It's not a problem of the NBD driver, but of the ide code,
       because it does drain_begin under the global mutex; the problem
       doesn't reproduce when using scsi instead of ide.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210902103805.25686-5-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: grammar and comment tweaks]
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-09-29 13:46:33 -05:00
.github Update documentation to refer to new location for issues 2021-06-21 05:43:11 +02:00
.gitlab/issue_templates GitLab: Add "Feature Request" issue template. 2021-06-25 10:08:37 +01:00
.gitlab-ci.d gitlab-ci: Mark manual-only jobs as allow_failure 2021-09-15 16:43:16 +01:00
accel accel/tcg: Restrict cpu_handle_halt() to sysemu 2021-09-21 19:36:44 -07:00
audio audio: Never send migration section 2021-08-10 10:55:57 +02:00
authz configure, meson: convert pam detection to meson 2021-06-25 10:54:10 +02:00
backends qapi: Convert simple union TpmTypeOptions to flat one 2021-09-27 08:22:25 +02:00
block block/nbd: drop connection_co 2021-09-29 13:46:33 -05:00
bsd-user user: Remove cpu_get_pic_interrupt() stubs 2021-09-14 12:00:21 -07:00
capstone@f8b1b83301 capstone: Update to upstream "next" branch 2020-10-03 04:23:14 -05:00
chardev qapi: Convert simple union SocketAddressLegacy to flat one 2021-09-27 08:23:25 +02:00
configs hw/acpi: refactor acpi hp modules so that targets can just use what they need 2021-09-04 09:07:46 -04:00
contrib elf2dmp: Fail cleanly if PDB file specifies zero block_size 2021-09-20 09:54:32 +01:00
crypto crypto: add gnutls pbkdf provider 2021-07-14 14:15:52 +01:00
disas Hexagon (disas/hexagon.c) fix memory leak for early exit cases 2021-08-12 09:06:05 -05:00
docs qemu-nbd: Change default cache mode to writeback 2021-09-29 13:46:31 -05:00
dtc@85e5d83984 Makefile: dtc: update, build the libfdt target 2020-06-16 14:49:05 +01:00
dump Do not include cpu.h if it's not really necessary 2021-05-02 17:24:51 +02:00
ebpf ebpf: only include in system emulators 2021-09-17 16:07:52 +08:00
fpu softfloat: Remove assertion preventing silencing of NaN in default-NaN mode 2021-09-01 11:08:17 +01:00
fsdev meson: Declare have_virtfs_proxy_helper in main meson.build 2021-01-23 15:55:04 -05:00
gdb-xml target/riscv: Remove built-in GDB XML files for CSRs 2021-01-16 10:57:21 -08:00
hw hw/loader: Restrict PC_ROM_* definitions to hw/i386/pc 2021-09-27 10:57:21 +02:00
include block: use int64_t instead of int in driver discard handlers 2021-09-29 13:46:32 -05:00
io io: use GDateTime for formatting timestamp for websock headers 2021-07-14 14:15:52 +01:00
libdecnumber qemu/: fix some comment spelling errors 2020-09-17 20:35:43 +02:00
linux-headers linux-headers: Update 2021-07-09 11:01:06 +10:00
linux-user linux-user/aarch64: Use force_sig_fault() 2021-09-23 14:43:58 +02:00
meson@776acd2a80 submodules: bump meson to 0.55.3 2020-10-17 10:45:42 -04:00
migration migration: Handle migration_incoming_setup() errors consistently 2021-08-26 17:15:28 +02:00
monitor QAPI patches patches for 2021-09-25 2021-09-27 15:03:42 +01:00
nbd block/nbd: drop connection_co 2021-09-29 13:46:33 -05:00
net vhost-vdpa: remove the unncessary queue_index assignment 2021-09-04 17:34:05 -04:00
pc-bios meson: look up cp and dtrace with find_program() 2021-09-13 13:56:26 +02:00
plugins plugins/api: added a boolean parsing plugin api 2021-09-02 11:29:34 +01:00
po configure: move gettext detection to meson.build 2021-01-02 21:03:09 +01:00
python python/aqmp-tui: Add syntax highlighting 2021-09-27 12:10:29 -04:00
qapi qapi: Convert simple union TransactionAction to flat one 2021-09-27 08:23:25 +02:00
qga Remove superfluous ERRP_GUARD() 2021-08-26 17:15:28 +02:00
qobject qobject: braces {} are necessary for all arms of this statement 2021-02-04 13:20:29 +01:00
qom qom: use correct field name when getting/setting alias properties 2021-07-23 18:17:17 +02:00
replay replay: notify CPU on event 2021-04-01 10:37:20 +02:00
roms Update OpenBIOS images to d657b653 built from submodule. 2021-09-08 10:30:10 +01:00
scripts qapi: Drop simple unions 2021-09-27 08:23:25 +02:00
scsi error: Use error_fatal to simplify obvious fatal errors (again) 2021-08-26 17:15:28 +02:00
semihosting linux-user: Don't include gdbstub.h in qemu.h 2021-09-13 20:35:45 +02:00
slirp@a88d9ace23 Update libslirp to v4.6.1 2021-08-03 16:07:22 +04:00
softmmu qdev: Support marking individual buses as 'full' 2021-09-13 21:01:08 +01:00
storage-daemon storage-daemon: Add missing build dependency to the vhost-user-blk-test 2021-08-11 13:39:50 +02:00
stubs hw/pci: remove all references to find_i440fx function 2021-09-04 17:34:05 -04:00
subprojects/libvhost-user libvhost-user: fix -Werror=format= warnings with __u64 fields 2021-07-29 10:15:52 +02:00
target hw/core: Make do_unaligned_access noreturn 2021-09-21 19:36:44 -07:00
tcg tcg/riscv: Remove add with zero on user-only memory access 2021-09-21 19:36:44 -07:00
tests block: use int64_t instead of int in driver discard handlers 2021-09-29 13:46:32 -05:00
tools virtiofsd pull 2021-08-16 2021-09-19 18:53:29 +01:00
trace meson: look up cp and dtrace with find_program() 2021-09-13 13:56:26 +02:00
ui ui/gtk-egl: Wait for the draw signal for dmabuf blobs 2021-09-15 08:41:59 +02:00
util qapi: Convert simple union SocketAddressLegacy to flat one 2021-09-27 08:23:25 +02:00
.cirrus.yml cirrus: delete FreeBSD and macOS jobs 2021-07-14 14:33:53 +01:00
.dir-locals.el Add .dir-locals.el file to configure emacs coding style 2015-10-08 19:46:01 +03:00
.editorconfig .editorconfig: update the automatic mode setting for Emacs 2021-03-10 15:34:11 +00:00
.exrc qemu: add .exrc 2012-09-07 09:02:44 +03:00
.gdbinit .gdbinit: load QEMU sub-commands when gdb starts 2017-06-07 14:38:45 +01:00
.gitattributes maint: Tell git that *.py files should use python diff hunks 2021-02-15 22:13:34 -05:00
.gitignore gitignore: Update with some filetypes 2021-07-23 17:22:15 +01:00
.gitlab-ci.yml docs: Document GitLab custom CI/CD variables 2021-07-29 07:56:01 +02:00
.gitmodules gitmodules: use GitLab repos instead of qemu.org 2021-02-09 20:53:56 +00:00
.gitpublish Add a git-publish configuration file 2018-03-05 09:03:17 +00:00
.mailmap MAINTAINERS: Name and email address change 2021-08-10 16:42:16 +01:00
.patchew.yml scripts/checkpatch: roll diff tweaking into checkpatch itself 2021-06-25 10:08:33 +01:00
.readthedocs.yml readthedocs: build with Python 3.6 2020-10-05 16:30:45 +01:00
.travis.yml hw/usb/ccid: remove references to NSS 2021-07-14 14:33:53 +01:00
block.c block: bdrv_inactivate_recurse(): check for permissions and fix crash 2021-09-15 15:54:07 +02:00
blockdev-nbd.c block/nbd: Use qcrypto_tls_creds_check_endpoint() 2021-06-29 18:29:47 +01:00
blockdev.c arch_init.h: Don't include arch_init.h unnecessarily 2021-08-26 17:02:00 +01:00
blockjob.c progressmeter: protect with a mutex 2021-06-25 14:24:24 +03:00
configure configure: add missing pc-bios/qemu_vga.ndrv symlink in build tree 2021-09-15 15:54:02 +02:00
COPYING COPYING: update from FSF 2008-10-12 17:54:42 +00:00
COPYING.LIB COPYING.LIB: Synchronize the LGPL 2.1 with the version from gnu.org 2019-01-30 11:01:22 +01:00
cpu.c accel/tcg: Record singlestep_enabled in tb->cflags 2021-07-21 07:47:05 -10:00
cpus-common.c overall/alpha tcg cpus|hppa: Fix Lesser GPL version number 2020-11-15 16:43:54 +01:00
disas.c Do not include cpu.h if it's not really necessary 2021-05-02 17:24:51 +02:00
gdbstub.c linux-user: Don't include gdbstub.h in qemu.h 2021-09-13 20:35:45 +02:00
gitdm.config contrib/gitdm: add a new interns group-map for GSoC/Outreachy work 2021-07-23 17:22:16 +01:00
hmp-commands-info.hx monitor/tcg: move tcg hmp commands to accel/tcg, register them dynamically 2021-07-09 18:21:33 +02:00
hmp-commands.hx hmp: Drop a bogus sentence from set_password's documentation 2021-09-27 10:57:21 +02:00
iothread.c iothread: add aio-max-batch parameter 2021-07-21 13:47:50 +01:00
job-qmp.c progressmeter: protect with a mutex 2021-06-25 14:24:24 +03:00
job.c progressmeter: protect with a mutex 2021-06-25 14:24:24 +03:00
Kconfig meson: Introduce target-specific Kconfig 2021-07-09 18:21:34 +02:00
Kconfig.host multi-process: Add config option for multi-process QEMU 2021-02-09 20:53:56 +00:00
LICENSE tcg/LICENSE: Remove out of date claim about TCG subdirectory licensing 2019-11-11 15:11:21 +01:00
MAINTAINERS qemu: Split machine_ppc.py acceptance tests 2021-09-27 19:06:47 +02:00
Makefile Makefile: ignore long options 2021-07-29 10:15:51 +02:00
memory_ldst.c.inc exec/memory_ldst: Use correct type sizes 2021-05-26 08:35:51 -07:00
meson_options.txt configure, meson: convert libxml2 detection to meson 2021-07-06 08:33:51 +02:00
meson.build arm: Add Hypervisor.framework build target 2021-09-21 16:28:26 +01:00
module-common.c all: Clean up includes 2016-02-04 17:41:30 +00:00
os-posix.c remove qemu-options* from root directory 2021-05-26 14:49:46 +02:00
os-win32.c remove qemu-options* from root directory 2021-05-26 14:49:46 +02:00
page-vary-common.c exec: Build page-vary-common.c with -fno-lto 2021-03-23 19:36:47 -06:00
page-vary.c exec: Build page-vary-common.c with -fno-lto 2021-03-23 19:36:47 -06:00
qemu-bridge-helper.c qemu-bridge-helper: relocate path to default ACL 2020-09-30 19:11:36 +02:00
qemu-edid.c qemu-edid: use qemu_edid_size() 2021-05-10 11:41:02 +02:00
qemu-img-cmds.hx qemu-img: Add -F shorthand to convert 2021-09-15 18:42:38 +02:00
qemu-img.c qemu-img: Add -F shorthand to convert 2021-09-15 18:42:38 +02:00
qemu-io-cmds.c block: Acquire AioContexts during bdrv_reopen_multiple() 2021-07-09 13:19:11 +02:00
qemu-io.c error: Use error_fatal to simplify obvious fatal errors (again) 2021-08-26 17:15:28 +02:00
qemu-keymap.c Include qemu-common.h exactly where needed 2019-06-12 13:20:20 +02:00
qemu-nbd.c qemu-nbd: Change default cache mode to writeback 2021-09-29 13:46:31 -05:00
qemu-options.hx softmmu/vl: Deprecate the old grab options 2021-09-06 10:00:14 +02:00
qemu.nsi nsis: adjust for new MinGW paths 2021-01-23 15:55:05 -05:00
qemu.sasl sasl: remove comment about obsolete kerberos versions 2021-06-14 13:28:50 +01:00
README.rst Update documentation to refer to new location for issues 2021-06-21 05:43:11 +02:00
replication.c replication: move include out of root directory 2021-05-26 14:49:46 +02:00
thunk.c linux-user: Drop unneeded includes from qemu.h 2021-09-13 20:35:45 +02:00
trace-events cpu: Add breakpoint tracepoints 2021-07-09 21:31:11 -07:00
VERSION Open 6.2 development tree 2021-08-25 10:25:12 +01:00
version.rc configure: remove CONFIG_FILEVERSION and CONFIG_PRODUCTVERSION 2021-01-02 21:03:37 +01:00

===========
QEMU README
===========

QEMU is a generic and open source machine & userspace emulator and
virtualizer.

QEMU is capable of emulating a complete machine in software without any
need for hardware virtualization support. By using dynamic translation,
it achieves very good performance. QEMU can also integrate with the Xen
and KVM hypervisors to provide emulated hardware while allowing the
hypervisor to manage the CPU. With hypervisor support, QEMU can achieve
near native performance for CPUs. When QEMU emulates CPUs directly it is
capable of running operating systems made for one machine (e.g. an ARMv7
board) on a different machine (e.g. an x86_64 PC board).

QEMU is also capable of providing userspace API virtualization for Linux
and BSD kernel interfaces. This allows binaries compiled against one
architecture ABI (e.g. the Linux PPC64 ABI) to be run on a host using a
different architecture ABI (e.g. the Linux x86_64 ABI). This does not
involve any hardware emulation, simply CPU and syscall emulation.

QEMU aims to fit into a variety of use cases. It can be invoked directly
by users wishing to have full control over its behaviour and settings.
It also aims to facilitate integration into higher level management
layers, by providing a stable command line interface and monitor API.
It is commonly invoked indirectly via the libvirt library when using
open source applications such as oVirt, OpenStack and virt-manager.

QEMU as a whole is released under the GNU General Public License,
version 2. For full licensing details, consult the LICENSE file.


Documentation
=============

Documentation can be found hosted online at
`<https://www.qemu.org/documentation/>`_. The documentation for the
current development version that is available at
`<https://www.qemu.org/docs/master/>`_ is generated from the ``docs/``
folder in the source tree, and is built by `Sphinx
<https://www.sphinx-doc.org/en/master/>`_.


Building
========

QEMU is multi-platform software intended to be buildable on all modern
Linux platforms, macOS, Win32 (via the Mingw64 toolchain) and a variety
of other UNIX targets. The simple steps to build QEMU are:


.. code-block:: shell

  mkdir build
  cd build
  ../configure
  make

Additional information can also be found online via the QEMU website:

* `<https://qemu.org/Hosts/Linux>`_
* `<https://qemu.org/Hosts/Mac>`_
* `<https://qemu.org/Hosts/W32>`_


Submitting patches
==================

The QEMU source code is maintained under the Git version control system.

.. code-block:: shell

   git clone https://gitlab.com/qemu-project/qemu.git

When submitting patches, one common approach is to use 'git
format-patch' and/or 'git send-email' to format & send the mail to the
qemu-devel@nongnu.org mailing list. All patches submitted must contain
a 'Signed-off-by' line from the author. Patches should follow the
guidelines set out in the `style section
<https://www.qemu.org/docs/master/devel/style.html>`_ of
the Developers Guide.
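
The manual workflow described above can be sketched as two small shell
functions. This is only an illustration, not something shipped with QEMU:
the helper names ``check_signoff`` and ``send_series``, the ``outgoing/``
directory and the assumption that your work sits on top of
``origin/master`` are all hypothetical choices made for the example.

.. code-block:: shell

  # Return non-zero if any of the given patch files lacks a
  # 'Signed-off-by:' line (hypothetical helper, not part of QEMU).
  check_signoff() {
      rc=0
      for patch in "$@"; do
          if ! grep -q '^Signed-off-by: ' "$patch"; then
              echo "missing Signed-off-by: $patch" >&2
              rc=1
          fi
      done
      return "$rc"
  }

  # Format every commit on top of origin/master into outgoing/ (adding
  # your own sign-off), verify the result, then mail it to the list.
  # Assumes a working 'git send-email' setup.
  send_series() {
      git format-patch --signoff -o outgoing/ origin/master &&
      check_signoff outgoing/*.patch &&
      git send-email --to=qemu-devel@nongnu.org outgoing/*.patch
  }

Running ``send_series`` from a topic branch stops before any mail is sent
if a patch is missing its sign-off.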

Additional information on submitting patches can be found online via
the QEMU website:

* `<https://qemu.org/Contribute/SubmitAPatch>`_
* `<https://qemu.org/Contribute/TrivialPatches>`_

The QEMU website is also maintained under source control.

.. code-block:: shell

  git clone https://gitlab.com/qemu-project/qemu-web.git

* `<https://www.qemu.org/2017/02/04/the-new-qemu-website-is-up/>`_

A 'git-publish' utility was created to make the above process less
cumbersome, and is highly recommended for making regular contributions,
or even just for sending consecutive patch series revisions. It also
requires a working 'git send-email' setup, and by default doesn't
automate everything, so you may want to go through the above steps
manually once.

For installation instructions, please go to:

*  `<https://github.com/stefanha/git-publish>`_

The workflow with 'git-publish' is:

.. code-block:: shell

  $ git checkout master -b my-feature
  $ # work on new commits, add your 'Signed-off-by' lines to each
  $ git publish

Your patch series will be sent and tagged as my-feature-v1 if you need to refer
back to it in the future.

Sending v2:

.. code-block:: shell

  $ git checkout my-feature # same topic branch
  $ # making changes to the commits (using 'git rebase', for example)
  $ git publish

Your patch series will be sent with a 'v2' tag in the subject and the git tip
will be tagged as my-feature-v2.
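
Because each revision is tagged, earlier versions of a series stay easy to
inspect. The sketch below uses plain git commands on those tags. The
function names are hypothetical, and it assumes the tags follow the
``my-feature-vN`` pattern described above and that the series is based on
``master``.

.. code-block:: shell

  # List the revision tags left behind for a topic branch.
  # Argument: topic branch name, e.g. my-feature.
  list_revisions() {
      git tag --list "$1-v*"
  }

  # Show how the series changed between two published revisions.
  # Arguments: base branch, topic name, old revision, new revision.
  compare_revisions() {
      git range-diff "$1" "$2-v$3" "$2-v$4"
  }

For example, ``compare_revisions master my-feature 1 2`` shows what changed
between v1 and v2, which can help when writing the changelog for a new
revision.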

Bug reporting
=============

The QEMU project uses GitLab issues to track bugs. Bugs
found when running code built from QEMU git or upstream released sources
should be reported via:

* `<https://gitlab.com/qemu-project/qemu/-/issues>`_

If using QEMU via an operating system vendor pre-built binary package, it
is preferable to report bugs to the vendor's own bug tracker first. If
the bug is also known to affect latest upstream code, it can also be
reported via GitLab.

For additional information on bug reporting consult:

* `<https://qemu.org/Contribute/ReportABug>`_


ChangeLog
=========

For version history and release notes, please visit
`<https://wiki.qemu.org/ChangeLog/>`_ or look at the git history for
more detailed information.


Contact
=======

The QEMU community can be contacted in a number of ways, with the two
main methods being email and IRC:

* `<mailto:qemu-devel@nongnu.org>`_
* `<https://lists.nongnu.org/mailman/listinfo/qemu-devel>`_
* #qemu on irc.oftc.net

Information on additional methods of contacting the community can be
found online via the QEMU website:

* `<https://qemu.org/Contribute/StartHere>`_