There are some block drivers which are essential to QEMU and may not be
removed: These are raw, file and qcow2 (as the default non-raw format).
Make their BlockDriver objects public so they can be directly referenced
throughout the block layer without needing to call bdrv_find_format()
and having to deal with an error at runtime, while the real problem
occurred during linking (where raw, file or qcow2 were not linked into
qemu).
Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
If the user neglects to specify the image format, QEMU probes the
image to guess it automatically, for convenience.
Relying on format probing is insecure for raw images (CVE-2008-2004).
If the guest writes a suitable header to the device, the next probe
will recognize a format chosen by the guest. A malicious guest can
abuse this to gain access to host files, e.g. by crafting a QCOW2
header with backing file /etc/shadow.
Commit 1e72d3b (April 2008) provided -drive parameter format to let
users disable probing. Commit f965509 (March 2009) extended QCOW2 to
optionally store the backing file format, to let users disable backing
file probing. QED has had a flag to suppress probing since the
beginning (2010), set whenever a raw backing file is assigned.
All of these additions that allow to avoid format probing have to be
specified explicitly. The default still allows the attack.
In order to fix this, commit 79368c8 (July 2010) put probed raw images
in a restricted mode, in which they wouldn't be able to overwrite the
first few bytes of the image so that they would identify as a different
image. If a write to the first sector would write one of the signatures
of another driver, qemu would instead zero out the first four bytes.
This patch was later reverted in commit 8b33d9e (September 2010) because
it didn't get the handling of unaligned qiov members right.
Today's block layer that is based on coroutines and has qiov utility
functions makes it much easier to get this functionality right, so this
patch implements it.
The other differences of this patch to the old one are that it doesn't
silently write something different than the guest requested by zeroing
out some bytes (it fails the request instead) and that it doesn't
maintain a list of signatures in the raw driver (it calls the usual
probe function instead).
Note that this change doesn't introduce new breakage for false positive
cases where the guest legitimately writes data into the first sector
that matches the signatures of an image format (e.g. for nested virt):
These cases were broken before, only the failure mode changes from
corruption after the next restart (when the wrong format is probed) to
failing the problematic write request.
Also note that like in the original patch, the restrictions only apply
if the image format has been guessed by probing. Explicitly specifying a
format allows guests to write anything they like.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1416497234-29880-8-git-send-email-kwolf@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
I'll use it with block backends shortly, and the name is going to fit
badly there. It's a block layer thing anyway, not just a block driver
thing.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
I'll use BlockDriverAIOCB with block backends shortly, and the name is
going to fit badly there. It's a block layer thing anyway, not just a
block driver thing.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Now that all backend drivers are using QemuOpts, remove all
QEMUOptionParameter related codes.
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Change block layer to support both QemuOpts and QEMUOptionParameter.
After this patch, it will change backend drivers one by one. At the end,
QEMUOptionParameter will be removed and only QemuOpts is kept.
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
error_is_set(&var) is the same as var != NULL, but it takes
whole-program analysis to figure that out. Unnecessarily hard for
optimizers, static checkers, and human readers. Dumb it down to
obvious.
Gets rid of several dozen Coverity false positives.
Note that the obvious form is already used in many places.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
raw copies over the BlockLimits of bs->file during bdrv_open().
However, since commit d34682cd it is immediately overwritten during
bdrv_refresh_limits(). This caused all fields except for
opt_transfer_length and opt_mem_alignment (which happen to be correctly
inherited in generic code) to be zeroed.
Move the BlockLimit assignment to a .bdrv_refresh_limits() callback to
make it work again for all fields.
Reported-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
The block layer generally keeps the size of an image cached in
bs->total_sectors so that it doesn't have to perform expensive
operations to get the size whenever it needs it.
This doesn't work however when using a backend that can change its size
without qemu being aware of it, i.e. passthrough of removable media like
CD-ROMs or floppy disks. For this reason, the caching is disabled when a
removable device is used.
It is obvious that checking whether the _guest_ device has removable
media isn't the right thing to do when we want to know whether the size
of the host backend can change. To make things worse, non-top-level
BlockDriverStates never have any device attached, which makes qemu
assume they are removable, so drv->bdrv_getlength() is always called on
the protocol layer. In the case of raw-posix, this causes unnecessary
lseek() system calls, which turned out to be rather expensive.
This patch completely changes the logic and disables bs->total_sectors
caching only for certain block driver types, for which a size change is
expected: host_cdrom and host_floppy on POSIX, host_device on win32; also
the raw format in case it sits on top of one of these protocols, but in
the common case the nested bdrv_getlength() call on the protocol driver
will use the cache again and avoid an expensive drv->bdrv_getlength()
call.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Propagate errors in raw_create rather than directly reporting and
afterwards discarding them.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
if a raw device like an iscsi target or host device is used
the current implementation makes a second call out to get
the block status of bs->file.
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Add an Error ** parameter to bdrv_create and its associated functions to
allow more specific error messages.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Add an Error ** parameter to BlockDriver.bdrv_open and
BlockDriver.bdrv_file_open to allow more specific error messages.
Signed-off-by: Max Reitz <mreitz@redhat.com>
For now, bdrv_get_block_status is just another name for bdrv_is_allocated.
The next patches will add more flags.
This also touches all block drivers with a mostly mechanical rename. The
sole exception is cow; because it calls cow_co_is_allocated from the read
code, we keep that function and make cow_co_get_block_status a wrapper.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
bdrv_is_allocated can detect coroutine context and go through a fast
path, similar to other block layer functions.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
"Incoming" function prototypes and "outgoing" function calls must match
reality. Implemented using the "struct BlockDriver" definition in
"include/block/block_int.h", and gcc errors & warnings.
v1->v2:
On 08/20/13 09:51, Kevin Wolf wrote:
> Am 18.08.2013 um 16:29 hat Paolo Bonzini geschrieben:
>> Il 16/08/2013 16:15, Laszlo Ersek ha scritto:
>>> +static int raw_reopen_prepare(BDRVReopenState *reopen_state,
>>> + BlockReopenQueue *queue, Error **errp)
>>> {
>>> - return bdrv_reopen_prepare(bs->file);
>>> + BDRVReopenState tmp = *reopen_state;
>>> +
>>> + tmp.bs = tmp.bs->file;
>>> + return bdrv_reopen_prepare(&tmp, queue, errp);
>>> }
>>
>> This should just return zero, my fault.
>
> Which is because bdrv_reopen_queue() already queues bs->file for reopen.
> The simple return 0; implementation is shared by all other format drivers
> that support reopening images.
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
On 08/05/13 15:03, Paolo Bonzini wrote:
>
> [...]
>
> 5) Formats are registered with bdrv_register (takes a BlockDriver*). You
> also need to pass the caller of bdrv_register to block_init.
Fill in the BlockDriver structure with the raw_*() functions that have
been added to "block/raw_bsd.c", in the order the fields are defined in
"include/block/block_int.h".
I needed more explanation / naming examples for registering the driver
than what Paolo gave me, so I copied / adapted from "block/qcow2.c". The
parts I took as basis for modification are blamed on
commit 5efa9d5a8b
Author: Anthony Liguori <aliguori@us.ibm.com>
Date: Sat May 9 17:03:42 2009 -0500
Convert block infrastructure to use new module init functionality
commit 20d97356c9
Author: Blue Swirl <blauwirbel@gmail.com>
Date: Fri Apr 23 20:19:47 2010 +0000
Fix OpenBSD build
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
On 08/05/13 15:03, Paolo Bonzini wrote:
>
> [...]
>
> 4) There is another member, .create_options, which is an array of
> QEMUOptionParameter structs, terminated by an all-zero item. The only
> option you need is for the virtual disk size. You will find something
> to copy from in other block drivers, for example block/qcow2.c.
Code taken and adapted from "block/qcow2.c", as suggested. The code being
copied/modified is blamed on
commit 20d97356c9
Author: Blue Swirl <blauwirbel@gmail.com>
Date: Fri Apr 23 20:19:47 2010 +0000
Fix OpenBSD build
and
commit 7c80ab3f21
Author: Jes Sorensen <Jes.Sorensen@redhat.com>
Date: Fri Dec 17 16:02:39 2010 +0100
block/qcow2.c: rename qcow_ functions to qcow2_
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
On 08/05/13 15:03, Paolo Bonzini wrote:
>
> [...]
>
> 3) These members are special
>
> .format_name is the string "raw"
> .bdrv_open raw_open should set bs->sg to bs->file->sg and return 0
> .bdrv_close raw_close should do nothing
> .bdrv_probe raw_probe should just return 1.
v1->v2:
On 08/20/13 10:11, Kevin Wolf wrote:
> Am 16.08.2013 um 16:15 hat Laszlo Ersek geschrieben:
>> +static int raw_probe(void)
>> +{
>> + return 1;
>> +}
>
> Maybe add a comment here like "smallest possible positive score so that
> raw is used if and only if no other block driver works".
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
On 08/05/13 15:03, Paolo Bonzini wrote:
>
> [...]
>
> 2) This is also a simple forwarder function:
>
> .bdrv_create
>
> but there is no BlockDriverState argument so the forwarded-to function
> does not have a bs->file argument either. The forwarded-to function is
> bdrv_create_file.
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
On 08/05/13 15:03, Paolo Bonzini wrote:
>
> [...]
>
> 1) BlockDriver is a struct in which these function members are
> interesting:
>
> .bdrv_reopen_prepare
> .bdrv_co_readv
> .bdrv_co_writev
> .bdrv_co_is_allocated
> .bdrv_co_write_zeroes
> .bdrv_co_discard
> .bdrv_getlength
> .bdrv_get_info
> .bdrv_truncate
> .bdrv_is_inserted
> .bdrv_media_changed
> .bdrv_eject
> .bdrv_lock_medium
> .bdrv_ioctl
> .bdrv_aio_ioctl
> .bdrv_has_zero_init
>
> They should be implemented as simple forwarders (see above). There are
> 16 functions listed here, you can easily see how this already accounts
> for 100+ SLOC roughly...
>
> The implementations of bdrv_co_readv and bdrv_co_writev should also call
> BLKDBG_EVENT on bs->file too, before forwarding to bs->file. The events
> to be generated are BLKDBG_READ_AIO and BLKDBG_WRITE_AIO.
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
On 08/05/13 15:03, Paolo Bonzini wrote:
>
>
> ----- Original Message -----
>> From: "Laszlo Ersek" <lersek@redhat.com>
>> To: "Paolo Bonzini" <pbonzini@redhat.com>
>> Sent: Monday, August 5, 2013 2:43:46 PM
>> Subject: Re: [PATCH 1/2] raw: add license header
>>
>> On 08/02/13 00:27, Paolo Bonzini wrote:
>>> On 08/01/2013 10:13 AM, Christoph Hellwig wrote:
>>>> On Wed, Jul 31, 2013 at 08:19:51AM +0200, Paolo Bonzini wrote:
>>>>> Most of the block layer is under the BSD license, thus it is
>>>>> reasonable to license block/raw.c the same way. CCed people should
>>>>> ACK by replying with a Signed-off-by line.
>>>>
>>>> The coded was intended to be GPLv2.
>>>
>>> Laszlo, would you be willing to do clean-room reverse engineering?
>>>
>>> (No rants, please. :))
>>
>> What's the scope exactly?
>
> It's quite small, it's a file full of forwarders like
>
> static void raw_foo(BlockDriverState *bs)
> {
> return bdrv_foo(bs->file);
> }
>
> It's 170 lines of code, all as boring as this. I only picked you
> because I'm quite certain you have never seen the file (and the answer
> confirmed it).
>
> Basically:
>
> 1) BlockDriver is a struct in which these function members are
> interesting:
>
> .bdrv_reopen_prepare
> .bdrv_co_readv
> .bdrv_co_writev
> .bdrv_co_is_allocated
> .bdrv_co_write_zeroes
> .bdrv_co_discard
> .bdrv_getlength
> .bdrv_get_info
> .bdrv_truncate
> .bdrv_is_inserted
> .bdrv_media_changed
> .bdrv_eject
> .bdrv_lock_medium
> .bdrv_ioctl
> .bdrv_aio_ioctl
> .bdrv_has_zero_init
>
> They should be implemented as simple forwarders (see above).
> There are 16 functions listed here, you can easily see how this
> already accounts for 100+ SLOC roughly...
>
> The implementations of bdrv_co_readv and bdrv_co_writev should also
> call BLKDBG_EVENT on bs->file too, before forwarding to bs->file. The
> events to be generated are BLKDBG_READ_AIO and BLKDBG_WRITE_AIO.
>
> 2) This is also a simple forwarder function:
>
> .bdrv_create
>
> but there is no BlockDriverState argument so the forwarded-to function
> does not have a bs->file argument either. The forwarded-to function
> is bdrv_create_file.
>
> 3) These members are special
>
> .format_name is the string "raw"
> .bdrv_open raw_open should set bs->sg to bs->file->sg and return 0
> .bdrv_close raw_close should do nothing
> .bdrv_probe raw_probe should just return 1.
>
> 4) There is another member, .create_options, which is an array of
> QEMUOptionParameter structs, terminated by an all-zero item. The only
> option you need is for the virtual disk size. You will find something
> to copy from in other block drivers, for example block/qcow2.c.
>
> 5) Formats are registered with bdrv_register (takes a BlockDriver*).
> You also need to pass the caller of bdrv_register to block_init.
>
> 6) I'm not sure how to organize the patch series, so I'll leave this to
> your creativity. I guess in this case move/copy detection of git should
> be disabled. I would definitely include this spec in the commit
> message as a proof of clean-room reverse engineering.
>
> 7) Remember a BSD header like the one in block.c.
>
> Paolo
This patch implements the email up to the paragraph ending with "100+ SLOC
roughly". The skeleton is generated from the list there, with a simple
shell loop using "sed" and the raw_foo() template.
The BSD license block is copied (and reflowed) from
"util/qemu-progress.c".
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>