xemu-project/xemu - xemu - Gitea: Git with a cup of tea

mirror of https://github.com/xemu-project/xemu.git synced 2024-11-24 12:09:58 +00:00

Author	SHA1	Message	Date
Li Feng	2916405a11	block/nvme: optimize the performance of nvme driver based on vfio-pci When the IO size is larger than 2 pages, we move the the pointer one by one in the pagelist, this is inefficient. This is a simple benchmark result: Before: $ qemu-io -c 'write 0 1G' nvme://0000:00:04.0/1 wrote 1073741824/1073741824 bytes at offset 0 1 GiB, 1 ops; 0:00:02.41 (424.504 MiB/sec and 0.4146 ops/sec) $ qemu-io -c 'read 0 1G' nvme://0000:00:04.0/1 read 1073741824/1073741824 bytes at offset 0 1 GiB, 1 ops; 0:00:02.03 (503.055 MiB/sec and 0.4913 ops/sec) After: $ qemu-io -c 'write 0 1G' nvme://0000:00:04.0/1 wrote 1073741824/1073741824 bytes at offset 0 1 GiB, 1 ops; 0:00:02.17 (471.517 MiB/sec and 0.4605 ops/sec) $ qemu-io -c 'read 0 1G' nvme://0000:00:04.0/1 read 1073741824/1073741824 bytes at offset 0 1 GiB, 1 ops; 0:00:01.94 (526.770 MiB/sec and 0.5144 ops/sec) Signed-off-by: Li Feng <lifeng1519@gmail.com> Message-Id: <20181101103807.25862-1-lifeng1519@gmail.com> Signed-off-by: Fam Zheng <famz@redhat.com>	2019-01-09 09:38:34 +08:00
Peter Maydell	a29644590f	nbd patches for 2019-01-05 Error and trace improvements in NBD code, such as less noise for common disconnect scenarios. - Vladimir Sementsov-Ogievskiy: 0/3 nbd-client: drop extra error noise - Eric Blake: portions of 0/22 nbd: add qemu-nbd --list -----BEGIN PGP SIGNATURE----- iQEcBAABCAAGBQJcMLgeAAoJEKeha0olJ0NqzXgH/RVRrCtmz0dOrrAkoJUfFX3t AcnOSb8nTWXTLGvGpU6x5l9yGL77VaN83j3z/4tXAraGA3sZsFx4cNNepT7P3zBB GlyT/b/Aie4EQAJPjuZdIZBe5pR6seSLoMFoGiga2oS2ik/wx4Wxxn9Gf9ZHUwVO muCtsJQypNrfbv7GY2Qf5NKyB2IQxt4ljSCqOmi0/T2//mQrdmRhcHbgTTDUA5HS 3Rpr9gPV0u6YzA9foCu+UqUpyhF3RBQY9sSMLS+Y9L6W72WgynrG7y0A637o4Q61 W+2UDDjuvyyCDR8DjD+LnWyGn4QuxIglfta2kFRcFGoPr1GqcmP4gDLeqqdJKFs= =R1PV -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/ericb/tags/pull-nbd-2019-01-05' into staging nbd patches for 2019-01-05 Error and trace improvements in NBD code, such as less noise for common disconnect scenarios. - Vladimir Sementsov-Ogievskiy: 0/3 nbd-client: drop extra error noise - Eric Blake: portions of 0/22 nbd: add qemu-nbd --list # gpg: Signature made Sat 05 Jan 2019 13:58:54 GMT # gpg: using RSA key A7A16B4A2527436A # gpg: Good signature from "Eric Blake <eblake@redhat.com>" # gpg: aka "Eric Blake (Free Software Programmer) <ebb9@byu.net>" # gpg: aka "[jpeg image of size 6874]" # Primary key fingerprint: 71C2 CC22 B1C4 6029 27D2 F3AA A7A1 6B4A 2527 436A * remotes/ericb/tags/pull-nbd-2019-01-05: nbd/client: Drop pointless buf variable qemu-nbd: Fail earlier for -c/-d on non-linux nbd/client: More consistent error messages nbd: Document timeline of various features qemu-nbd: Use program name in error messages block/nbd-client: use traces instead of noisy error_report_err nbd/client: Trace all server option error messages nbd: publish _lookup functions Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2019-01-07 11:55:52 +00:00
Vladimir Sementsov-Ogievskiy	d8b4bad846	block/nbd-client: use traces instead of noisy error_report_err Reduce extra noise of nbd-client, change 083 correspondingly. In various commits (`be41c100` in 2.10, `f140e300` in 2.11, `78a33ab` in 2.12), we added spots where qemu as an NBD client would report problems communicating with the server to stderr, because there was no where else to send the error to. However, this is racy, particularly since the most common source of these errors is when either the client or the server abruptly hangs up, leaving one coroutine to report the error only if it wins (or loses) the race in attempting the read from the server before another thread completes its cleanup of a protocol error that caused the disconnect in the first place. The race is also apparent in the fact that differences in the flush behavior of the server can alter the frequency of encountering the race in the client (see commit `6d39db96`). Rather than polluting stderr, it's better to just trace these situations, for use by developers debugging a flaky connection, particularly since the real error that either triggers the abrupt disconnection in the first place, or that results from the EIO when a request can't receive a reply, DOES make it back to the user in the normal Error propagation channels. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20181102151152.288399-4-vsementsov@virtuozzo.com> [eblake: drop depedence on error hint, enhance commit message] Signed-off-by: Eric Blake <eblake@redhat.com>	2019-01-04 17:34:58 -06:00
yuchenlin	39a0408e76	dmg: don't skip zero chunk The dmg file has many tables which describe: "start from sector XXX to sector XXX, the compression method is XXX and where the compressed data resides on". Each sector in the expanded file should be covered by a table. The table will describe the offset of compressed data (or raw depends on the type) in the dmg. For example: [-----------The expanded file------------] [---bzip table ---]/* zeros /[---zlib---] ^ \| if we want to read this sector. we will find bzip table which contains this sector, and get the compressed data offset, read it from dmg, uncompress it, finally write to expanded file. If we skip zero chunk (table), some sector cannot find the table which will cause search_chunk() return s->n_chunks, dmg_read_chunk() return -1 and finally causing dmg_co_preadv() return EIO. See: [-----------The expanded file------------] [---bzip table ---]/ zeros */[---zlib---] ^ \| if we want to read this sector. Oops, we cannot find the table contains it... In the original implementation, we don't have zero table. When we try to read sector inside the zero chunk. We will get EIO, and skip reading. After this patch, we treat zero chunk the same as ignore chunk, it will directly write zero and avoid some sector may not find the table. After this patch: [-----------The expanded file------------] [---bzip table ---][--zeros--][---zlib---] Signed-off-by: yuchenlin <npes87184@gmail.com> Reviewed-by: Julio Faracco <jcfaracco@gmail.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20190103114700.9686-4-npes87184@gmail.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2019-01-04 11:15:09 +00:00
yuchenlin	e80fcf590e	dmg: use enumeration type instead of hard coding number Signed-off-by: yuchenlin <npes87184@gmail.com> Reviewed-by: Julio Faracco <jcfaracco@gmail.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20190103114700.9686-3-npes87184@gmail.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2019-01-04 11:15:09 +00:00
yuchenlin	5ef40828ba	dmg: fix binary search There is a possible hang in original binary search implementation. That is if chunk1 = 4, chunk2 = 5, chunk3 = 4, and we go else case. The chunk1 will be still 4, and so on. Signed-off-by: yuchenlin <npes87184@gmail.com> Message-id: 20190103114700.9686-2-npes87184@gmail.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2019-01-04 11:15:09 +00:00
Julio Faracco	b47c7d5325	dmg: Fixing wrong dmg block type value for block terminator. This is a trivial patch to fix a wrong value for block terminator. The old value was 0x7fffffff which is wrong. It was not affecting the code because QEMU dmg block is not handling block terminator right now. Neverthless, it should be fixed. Signed-off-by: Julio Faracco <jcfaracco@gmail.com> Reviewed-by: yuchenlin <yuchenlin@synology.com> Message-id: 20181228145055.18039-1-jcfaracco@gmail.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2019-01-04 11:06:02 +00:00
Stefan Hajnoczi	537c3d4f64	block/mirror: add missing coroutine_fn annotations Marking a function coroutine_fn currently has no effect on the compiler, but it documents that this function must be called from coroutine context and it may yield. This is important information for the programmer. Also, if we ever transition to a stackless coroutine implementation, then it's likely that the annotation will become mandatory so the compiler can use the correct calling convention for coroutine functions. Cc: Max Reitz <mreitz@redhat.com> Cc: John Snow <jsnow@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-12-14 11:55:02 +01:00
Alberto Garcia	2e891722c5	block: Remove flags parameter from bdrv_reopen_queue() Now that all callers are passing all flag changes as QDict options, the flags parameter is no longer necessary, so we can get rid of it. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-12-14 11:55:02 +01:00
Alberto Garcia	3c4e964762	block: Clean up reopen_backing_file() in block/replication.c This function is used to put the hidden and secondary disks in read-write mode before launching the backup job, and back in read-only mode afterwards. This patch does the following changes: - Use an options QDict with the "read-only" option instead of passing the changes as flags only. - Simplify the code (it was unnecessarily complicated and verbose). - Fix a bug due to which the secondary disk was not being put back in read-only mode when writable=false (because in this case orig_secondary_flags always had the BDRV_O_RDWR flag set). - Stop clearing the BDRV_O_INACTIVE flag. The flags parameter to bdrv_reopen_queue() becomes redundant and we'll be able to get rid of it in a subsequent patch. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-12-14 11:55:02 +01:00
Alberto Garcia	1ba7938895	block: Use bdrv_reopen_set_read_only() in the mirror driver The 'block-commit' QMP command is implemented internally using two different drivers. If the source image is the active layer then the mirror driver is used (commit_active_start()), otherwise the commit driver is used (commit_start()). In both cases the destination image must be put temporarily in read-write mode. This is done correctly in the latter case, but what commit_active_start() does is copy all flags instead. This patch replaces the bdrv_reopen() calls in that function with bdrv_reopen_set_read_only() so that only the read-only status is changed. A similar change is made in mirror_exit(), which is also used by the 'drive-mirror' and 'blockdev-mirror' commands. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-12-14 11:55:02 +01:00
Alberto Garcia	e7d22f8bc6	block: Use bdrv_reopen_set_read_only() in stream_start/complete() This patch replaces the bdrv_reopen() calls that set and remove the BDRV_O_RDWR flag with the new bdrv_reopen_set_read_only() function. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-12-14 11:55:02 +01:00
Alberto Garcia	c742a3643f	block: Use bdrv_reopen_set_read_only() in bdrv_commit() This patch replaces the bdrv_reopen() calls that set and remove the BDRV_O_RDWR flag with the new bdrv_reopen_set_read_only() function. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-12-14 11:55:01 +01:00
Alberto Garcia	e70cdc57da	block: Use bdrv_reopen_set_read_only() in commit_start/complete() This patch replaces the bdrv_reopen() calls that set and remove the BDRV_O_RDWR flag with the new bdrv_reopen_set_read_only() function. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-12-14 11:55:01 +01:00
Kevin Wolf	0342567115	file-posix: Avoid aio_worker() for QEMU_AIO_IOCTL aio_worker() doesn't add anything interesting, it's only a useless indirection. Call the handler function directly instead. As we know that this handler function is only called from coroutine context and the coroutine stays around until the worker thread finishes, we can keep RawPosixAIOData on the stack. This was the last user of aio_worker(), so the function goes away now. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-12-14 11:54:49 +01:00
Kevin Wolf	2f3a7ab39b	file-posix: Switch to .bdrv_co_ioctl No real reason to keep using the callback based mechanism here when the rest of the file-posix driver is coroutine based. Changing it brings ioctls more in line with how other request types work. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-12-14 11:52:41 +01:00
Kevin Wolf	c9db2b6489	file-posix: Remove paio_submit_co() The function is not used any more, remove it. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-12-14 11:52:41 +01:00
Kevin Wolf	999e6b69ce	file-posix: Avoid aio_worker() for QEMU_AIO_READ/WRITE aio_worker() doesn't add anything interesting, it's only a useless indirection. Call the handler function directly instead. As we know that this handler function is only called from coroutine context and the coroutine stays around until the worker thread finishes, we can keep RawPosixAIOData on the stack. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-12-14 11:52:41 +01:00
Kevin Wolf	54c7ca1b81	file-posix: Move read/write operation logic out of aio_worker() aio_worker() for reads and writes isn't boring enough yet. It still does some postprocessing for handling short reads and turning the result into the right return value. However, there is no reason why handle_aiocb_rw() couldn't do the same, and even without duplicating code between the read and write path. So move the code there. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-12-14 11:52:41 +01:00
Kevin Wolf	06dc9bd571	file-posix: Avoid aio_worker() for QEMU_AIO_FLUSH aio_worker() doesn't add anything interesting, it's only a useless indirection. Call the handler function directly instead. As we know that this handler function is only called from coroutine context and the coroutine stays around until the worker thread finishes, we can keep RawPosixAIOData on the stack. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-12-14 11:52:41 +01:00
Kevin Wolf	46ee0f462b	file-posix: Avoid aio_worker() for QEMU_AIO_DISCARD aio_worker() doesn't add anything interesting, it's only a useless indirection. Call the handler function directly instead. As we know that this handler function is only called from coroutine context and the coroutine stays around until the worker thread finishes, we can keep RawPosixAIOData on the stack. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-12-14 11:52:41 +01:00
Kevin Wolf	7154d8ae66	file-posix: Avoid aio_worker() for QEMU_AIO_WRITE_ZEROES aio_worker() doesn't add anything interesting, it's only a useless indirection. Call the handler function directly instead. As we know that this handler function is only called from coroutine context and the coroutine stays around until the worker thread finishes, we can keep RawPosixAIOData on the stack. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-12-14 11:52:41 +01:00
Kevin Wolf	58a209c437	file-posix: Avoid aio_worker() for QEMU_AIO_COPY_RANGE aio_worker() doesn't add anything interesting, it's only a useless indirection. Call the handler function directly instead. As we know that this handler function is only called from coroutine context and the coroutine stays around until the worker thread finishes, we can keep RawPosixAIOData on the stack. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-12-14 11:52:41 +01:00
Kevin Wolf	29cb4c01e7	file-posix: Avoid aio_worker() for QEMU_AIO_TRUNCATE aio_worker() doesn't add anything interesting, it's only a useless indirection. Call the handler function directly instead. As we know that this handler function is only called from coroutine context and the coroutine stays around until the worker thread finishes, we can keep RawPosixAIOData on the stack. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-12-14 11:52:41 +01:00
Kevin Wolf	5d5de25005	file-posix: Factor out raw_thread_pool_submit() Getting the thread pool of the AioContext of a block node and scheduling some work in it is an operation that is already done twice, and we'll get more instances. Factor it out into a separate function. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-12-14 11:52:41 +01:00
Kevin Wolf	d57c44d00f	file-posix: Reorganise RawPosixAIOData RawPosixAIOData contains a lot of fields for several separate operations that are to be processed in a worker thread and that need different parameters. The struct is currently rather unorganised, with unions that cover some, but not all operations, and even one #define for field names instead of a union. Clean this up to have some common fields and a single union. As a side effect, on x86_64 the struct shrinks from 72 to 48 bytes. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-12-14 11:52:41 +01:00
Vladimir Sementsov-Ogievskiy	e23c9d7a1c	qcow2: do decompression in threads Do decompression in threads, like it is already done for compression. This improves asynchronous compressed reads performance. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-12-14 11:52:41 +01:00
Vladimir Sementsov-Ogievskiy	c3c10f7295	qcow2: aio support for compressed cluster read Allocate buffers locally and release qcow2 lock. Than, reads inside qcow2_co_preadv_compressed may be done in parallel, however all decompression is still done synchronously. Let's improve it in the following commit. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-12-14 11:52:41 +01:00
Vladimir Sementsov-Ogievskiy	c068a1cd52	qcow2: use byte-based read in qcow2_decompress_cluster We are gradually moving away from sector-based interfaces, towards byte-based. Get rid of it here too. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-12-14 11:52:40 +01:00
Vladimir Sementsov-Ogievskiy	341926ab83	qcow2: refactor decompress_buffer - make it look more like a pair of qcow2_compress - rename the function and its parameters - drop extra out_len variable, check filling of output buffer by strm structure itself - fix code style - add some documentation Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-12-14 11:52:40 +01:00
Vladimir Sementsov-Ogievskiy	f4b3e2a960	qcow2: move decompression from qcow2-cluster.c to qcow2.c Compression is done in threads in qcow2.c. We want to do decompression in the same way, so, firstly, move it to the same file. The only change is braces around if-body in decompress_buffer, to satisfy checkpatch. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-12-14 11:52:40 +01:00
Vladimir Sementsov-Ogievskiy	6994fd78b9	qcow2: make more generic interface for qcow2_compress Give explicit size both for source and destination buffers, to make it similar with decompression path and than cleanly reuse parameter structure for decompression threads. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-12-14 11:52:40 +01:00
Vladimir Sementsov-Ogievskiy	19a4448853	qcow2: use Z_OK instead of 0 for deflateInit2 return code check Use appropriate macro, corresponding to deflateInit2 spec. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-12-14 11:52:40 +01:00
Vladimir Sementsov-Ogievskiy	3a75187fd8	block/backup: drop unused synchronization interface Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-12-14 11:52:40 +01:00
Vladimir Sementsov-Ogievskiy	e4f9752c4a	block/replication: drop extra synchronization After commit `f8d59dfb40` "block/backup: fix fleecing scheme: use serialized writes" fleecing (specifically reading from backup target, when backup source is in backing chain of backup target) is safe, because all backup-job writes to target are serialized. Therefore we don't need additional synchronization for these reads. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-12-14 11:52:40 +01:00
Julio Faracco	95a156f689	dmg: exchanging hardcoded dmg UDIF block types to enum. This change is better to understand what kind of block type is being handled by the code. Using a syntax similar to the DMG documentation is easier than tracking all hex values assigned to a block type. Signed-off-by: Julio Faracco <jcfaracco@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-12-14 11:52:40 +01:00
Julio Faracco	7a40b418ec	dmg: including dmg-lzfse module inside dmg block driver. This commit includes the support to new module dmg-lzfse into dmg block driver. It includes the support for block type ULFO (0x80000007). Signed-off-by: Julio Faracco <jcfaracco@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-12-14 11:52:40 +01:00
Julio Faracco	83bc1f9768	configure: adding support to lzfse library. This commit includes the support to lzfse opensource library. With this library dmg block driver can decompress images with this type of compression inside. Signed-off-by: Julio Faracco <jcfaracco@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-12-14 11:52:40 +01:00
Julio Faracco	c13e80d792	block: adding lzfse decompressing support as a module. QEMU dmg support includes zlib and bzip2, but it does not contains lzfse support. This commit adds the source file to extend compression support for new DMGs. Signed-off-by: Julio Faracco <jcfaracco@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-12-14 11:52:40 +01:00
Vladimir Sementsov-Ogievskiy	c972fa123c	crypto: support multiple threads accessing one QCryptoBlock The two thing that should be handled are cipher and ivgen. For ivgen the solution is just mutex, as iv calculations should not be long in comparison with encryption/decryption. And for cipher let's just keep per-thread ciphers. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>	2018-12-12 11:16:49 +00:00
Peter Maydell	9225cd127d	nbd patches for 2018-12-03 Improve x-dirty-bitmap handling for experimenting with pull mode incremental backups. - Eric Blake: 0/3 NBD dirty bitmap cleanups -----BEGIN PGP SIGNATURE----- Comment: Public key at http://people.redhat.com/eblake/eblake.gpg iQEcBAABCAAGBQJcBVInAAoJEKeha0olJ0Nq+68H/RoRtXAycWm1bNidVNC9J8pO hkXsZwe39V/6Hl5eXch5J6K9V+443A4+WJSpvfISnB7jKEIHQ2GVMiuXPjmeyap+ BIbAlHt+frQYz+x1E6fKrLE67mxQViRCQrfx8ijDRUJTUlDxq+yPDjwmGk25DRDt zlFqpkCr1NhEREm6JDNmM/WJqBaueSAXY5Fi7MqeHTyRKlZ5RclZaZ5bw4YpB+ip kwsWPHsz9uoe9caMxcdU0Y8PqB7KwvwJ+4RjtGpFgp0Auqe2FvaRyiqYE4n5EU2S 5kvbvKgGK1XUe4RyBLVg1cqcB/LoHLbNohJUJ03pdU1anxcjKIDEv/Oh801OPmE= =w1HF -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/ericb/tags/pull-nbd-2018-12-03' into staging nbd patches for 2018-12-03 Improve x-dirty-bitmap handling for experimenting with pull mode incremental backups. - Eric Blake: 0/3 NBD dirty bitmap cleanups # gpg: Signature made Mon 03 Dec 2018 15:56:23 GMT # gpg: using RSA key A7A16B4A2527436A # gpg: Good signature from "Eric Blake <eblake@redhat.com>" # gpg: aka "Eric Blake (Free Software Programmer) <ebb9@byu.net>" # gpg: aka "[jpeg image of size 6874]" # Primary key fingerprint: 71C2 CC22 B1C4 6029 27D2 F3AA A7A1 6B4A 2527 436A * remotes/ericb/tags/pull-nbd-2018-12-03: nbd/client: Send NBD_CMD_DISC if open fails after connect nbd/client: Make x-dirty-bitmap more reliable nbd/server: Advertise all contexts in response to bare LIST Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2018-12-03 17:43:20 +00:00
Vladimir Sementsov-Ogievskiy	d12ade5732	mirror: fix dead-lock Let start from the beginning: Commit `b9e413dd37` (in 2.9) "block: explicitly acquire aiocontext in aio callbacks that need it" added pairs of aio_context_acquire/release to mirror_write_complete and mirror_read_complete, when they were aio callbacks for blk_aio_* calls. Then, commit `2e1990b26e` (in 3.0) "block/mirror: Convert to coroutines" dropped these blk_aio_* calls, than mirror_write_complete and mirror_read_complete are not callbacks more, and don't need additional aiocontext acquiring. Furthermore, mirror_read_complete calls blk_co_pwritev inside these pair of aio_context_acquire/release, which leads to the following dead-lock with mirror: (gdb) info thr Id Target Id Frame 3 Thread (LWP 145412) "qemu-system-x86" syscall () 2 Thread (LWP 145416) "qemu-system-x86" __lll_lock_wait () * 1 Thread (LWP 145411) "qemu-system-x86" __lll_lock_wait () (gdb) bt #0 __lll_lock_wait () #1 _L_lock_812 () #2 __GI___pthread_mutex_lock #3 qemu_mutex_lock_impl (mutex=0x561032dce420 <qemu_global_mutex>, file=0x5610327d8654 "util/main-loop.c", line=236) at util/qemu-thread-posix.c:66 #4 qemu_mutex_lock_iothread_impl #5 os_host_main_loop_wait (timeout=480116000) at util/main-loop.c:236 #6 main_loop_wait (nonblocking=0) at util/main-loop.c:497 #7 main_loop () at vl.c:1892 #8 main Printing contents of qemu_global_mutex, I see that "__owner = 145416", so, thr1 is main loop, and now it wants BQL, which is owned by thr2. (gdb) thr 2 (gdb) bt #0 __lll_lock_wait () #1 _L_lock_870 () #2 __GI___pthread_mutex_lock #3 qemu_mutex_lock_impl (mutex=0x561034d25dc0, ... #4 aio_context_acquire (ctx=0x561034d25d60) #5 dma_blk_cb #6 dma_blk_io #7 dma_blk_read #8 ide_dma_cb #9 bmdma_cmd_writeb #10 bmdma_write #11 memory_region_write_accessor #12 access_with_adjusted_size #15 flatview_write #16 address_space_write #17 address_space_rw #18 kvm_handle_io #19 kvm_cpu_exec #20 qemu_kvm_cpu_thread_fn #21 qemu_thread_start #22 start_thread #23 clone () Printing mutex in fr 2, I see "__owner = 145411", so thr2 wants aio context mutex, which is owned by thr1. Classic dead-lock. Then, let's check that aio context is hold by mirror coroutine: just print coroutine stack of first tracked request in mirror job target: (gdb) [...] (gdb) qemu coroutine 0x561035dd0860 #0 qemu_coroutine_switch #1 qemu_coroutine_yield #2 qemu_co_mutex_lock_slowpath #3 qemu_co_mutex_lock #4 qcow2_co_pwritev #5 bdrv_driver_pwritev #6 bdrv_aligned_pwritev #7 bdrv_co_pwritev #8 blk_co_pwritev #9 mirror_read_complete () at block/mirror.c:232 #10 mirror_co_read () at block/mirror.c:370 #11 coroutine_trampoline #12 __start_context Yes it is mirror_read_complete calling blk_co_pwritev after acquiring aio context. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-12-03 16:50:58 +01:00
Eric Blake	c688e6ca7b	nbd/client: Send NBD_CMD_DISC if open fails after connect If nbd_client_init() fails after we are already connected, then the server will spam logs with: Disconnect client, due to: Unexpected end-of-file before all bytes were read unless we gracefully disconnect before closing the connection. Ways to trigger this: $ opts=driver=nbd,export=foo,server.type=inet,server.host=localhost,server.port=10809 $ qemu-img map --output=json --image-opts $opts,read-only=off $ qemu-img map --output=json --image-opts $opts,x-dirty-bitmap=nosuch: Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20181130023232.3079982-4-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>	2018-11-30 14:13:37 -06:00
Eric Blake	47829c4079	nbd/client: Make x-dirty-bitmap more reliable The implementation of x-dirty-bitmap in qemu 3.0 (commit `216ee365`) silently falls back to treating the server as not supporting NBD_CMD_BLOCK_STATUS if a requested meta_context name was not negotiated, which in turn means treating the _entire_ image as data. Since our hack relied on using 'qemu-img map' to view which portions of the image were dirty by seeing what the redirected bdrv_block_status() treats as holes, this means that our fallback treats the entire image as clean. Better would have been to treat the entire image as dirty, or to fail to connect because the user's request for a specific context could not be honored. This patch goes with the latter. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20181130023232.3079982-3-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>	2018-11-30 14:11:20 -06:00
Max Reitz	577a133988	file-posix: Fix shared locks on reopen commit s->locked_shared_perm is the set of bits locked in the file, which is the inverse of the permissions actually shared. So we need to pass them as they are to raw_apply_lock_bytes() instead of inverting them again. Reported-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-11-19 14:32:01 +01:00
Eric Blake	77d6a21558	qcow2: Don't allow overflow during cluster allocation Our code was already checking that we did not attempt to allocate more clusters than what would fit in an INT64 (the physical maximimum if we can access a full off_t's worth of data). But this does not catch smaller limits enforced by various spots in the qcow2 image description: L1 and normal clusters of L2 are documented as having bits 63-56 reserved for other purposes, capping our maximum offset at 64PB (bit 55 is the maximum bit set). And for compressed images with 2M clusters, the cap drops the maximum offset to bit 48, or a maximum offset of 512TB. If we overflow that offset, we would write compressed data into one place, but try to decompress from another, which won't work. It's actually possible to prove that overflow can cause image corruption without this patch; I'll add the iotests separately in the next commit. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2018-11-19 12:51:40 +01:00
Kevin Wolf	443ba6befa	vvfat: Fix memory leak Don't leak 'cluster' in the mapping == NULL case. Found by Coverity (CID 1055918). Fixes: `8d9401c279` Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Liam Merwick <liam.merwick@oracle.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>	2018-11-19 12:51:40 +01:00
Liam Merwick	7cb6d3c9be	qcow2: Read outside array bounds in qcow2_pre_write_overlap_check() The commit for `0e4e4318ea` increments QCOW2_OL_MAX_BITNR but does not add an array entry for QCOW2_OL_BITMAP_DIRECTORY_BITNR to metadata_ol_names[]. As a result, an array dereference of metadata_ol_names[8] in qcow2_pre_write_overlap_check() could result in a read outside of the array bounds. Fixes: `0e4e4318ea` ('qcow2: add overlap check for bitmap directory') Cc: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Liam Merwick <Liam.Merwick@oracle.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 1541453919-25973-6-git-send-email-Liam.Merwick@oracle.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2018-11-12 17:49:21 +01:00
Liam Merwick	8d9401c279	block: Fix potential Null pointer dereferences in vvfat.c The calls to find_mapping_for_cluster() may return NULL but it isn't always checked for before dereferencing the value returned. Additionally, add some asserts to cover cases where NULL can't be returned but which might not be obvious at first glance. Signed-off-by: Liam Merwick <Liam.Merwick@oracle.com> Message-id: 1541453919-25973-5-git-send-email-Liam.Merwick@oracle.com [mreitz: Dropped superfluous check of "mapping" following an assertion that it is not NULL, and fixed some indentation] Signed-off-by: Max Reitz <mreitz@redhat.com>	2018-11-12 17:49:21 +01:00
Liam Merwick	602414d123	block: Null pointer dereference in blk_root_get_parent_desc() The dev_id returned by the call to blk_get_attached_dev_id() in blk_root_get_parent_desc() can be NULL (an internal call to object_get_canonical_path may have returned NULL). Instead of just checking this case before before dereferencing, adjust blk_get_attached_dev_id() to return the empty string if no object path can be found (similar to the case when blk->dev is NULL and an empty string is returned). Signed-off-by: Liam Merwick <Liam.Merwick@oracle.com> Message-id: 1541453919-25973-3-git-send-email-Liam.Merwick@oracle.com Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2018-11-12 17:49:21 +01:00

1 2 3 4 5 ...

4080 Commits