Without TC, we'd get draw->count==0.. which is obviously not correct.
With TC we get random garbage for draw->count. Which turns into
exciting things like trying to allocate multi-gigabyte buffers for
tess param/factor buffers.
But we can just tell the CP to split up large tess draws, and put an
upper bound on the tess param/factor buffer sizes.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9864>
when building with Meson. It requires libexpat that is not available
on Android and we want to avoid it.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9848>
nir_to_tgsi lowers nir_op_isub to UADD and the negate source mod
on the second operator. Since r600 doesn't support source mods
for integers but support SUB_INT we switch the opcode and clear the
negate flag.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9579>
Since the last round of enhanced layout packing, it seems like this code
is no longer needed. This allows us to use the HANDLE_EMIT_BUILTIN()
macro, which simplifies things further.
It would have been incorrect anyway, because it's the only code that
uses the location directly as an index. All other locations multiplies
it by four first.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9860>
Fixes the following building error:
In file included from external/mesa/src/amd/addrlib/src/gfx9/gfx9addrlib.cpp:36:
external/mesa/src/amd/addrlib/src/chip/gfx9/gfx9_gb_reg.h:43:2: error: "BIGENDIAN_CPU or LITTLEENDIAN_CPU must be defined"
^
NOTE: Android ABI specifies Little Endian for arm and x86 targets
Fixes: 3616e02ef3 ("amd/addrlib: define endianess differently")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9869>
If we've added the array, then we should update the info. This is the
value that gallium drivers setting !PIPE_CAP_CLIP_PLANES have to use in
place of rasterizer->clip_planes_enabled.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9815>
Since image stores can now compress and we can't track image stores
this also stops using predication for DCC decompression.
In GFX10 this was benchmarked to be faster. For GFX10.3 the microbenchmarks
are not as possible though I haven't tested any games, so this is not enabled
there yet.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6796>
For 16x16 we get 4 16x4 waves, which is bad for DCC image stores.
The workgroup size doesn't really matter for speed, the important
part is the number of waves, which should stay constant here.
(Though some optimization would be nice, but out of scope for this
patch)
The compute DCC compress shader still uses 16x16 due to functional
requirements (and we're sure it won't write with DCC compression ...)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6796>
We are providing a BO with the default attribute values for the
GL_SHADER_STATE_RECORD, that contains 16 vec4. Such default value for
each vec4 is (0, 0, 0, 1). As the attribute format could be int or
float, the "1" value needs to take into account the attribute format.
But in the practice, the most common case is all floats. So we create
one default attribute values BO assuming that all attributes will be
floats, and we store it at v3dv_device and only create a new one if a
int format type is defined. That allows to reduce the amount of BOs
needed.
Note that we could still try to reduce the amount of BOs used by the
pipelines if we create a bigger BO, and we just play with the
offsets. But as mentioned, that's not the usual, and would add an
extra complexity,so it is not a priority right now.
This makes the following test passing when disabling the pipeline
cache support:
dEQP-VK.api.object_management.max_concurrent.graphics_pipeline
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9845>
The upper 32 bits are truncated before converting, which still produces
correct results since they never meaningfully contribute to the result.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9597>
The previous code produced incorrect results for inputs outside the
range [INT16_MIN, INT16_MAX].
A problematic case is e.g. i2f16 32768, which previously would be
converted to -32768.0 instead of returning the exactly representable
floating point result.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9597>
This function has evolved to be a generic helper function used throughout
the file, so having those assumptions written down explicitly and document
unsupported edge cases should help prevent incorrect use.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9597>
This doesn't change semantics but allows us to reject this potentially
ambiguous configuration in convert_int in a later change.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9597>
Previously, inputs such as 0x100000000 would have their upper 32-bits
ignored despite being representable by 32-bit floats.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9597>
We might only need to emit a read or write cap for a given image. This
could provide the Vulkan driver with the chance to optimize things
slightly.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9855>
I don't think the hw culls these primitives and NGG culling isn't
yet a thing. This also matches PAL.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9691>
This reverts commit 5743a36b2b.
Now that _eglAddDevice is always called with the correct software
hint, no need to bail out if the device doesn't have a render node.
On split render/display SoCs, the DRM device won't have a render
node, yet rendering is hardware-accelerated (via kmsro).
Signed-off-by: Simon Ser <contact@emersion.fr>
Fixes: 5743a36b2b ("egl: Don't add hardware device if there is no render node v2.")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4178
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9697>
We don't want to expose an EGL device for a display-only DRM devices
(like VKMS). For these DRM devices we have a separate software-rendering
device (the first in the list, always present).
There is a similar check in _eglAddDRMDevice, however it will be
removed in a future commit to allow split render/display devices
to be properly added. We can't figure out whether we're on a split
render/display system before loading the driver.
Signed-off-by: Simon Ser <contact@emersion.fr>
Fixes: 5743a36b2b ("egl: Don't add hardware device if there is no render node v2.")
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9697>
On the EGL DRM platform, call _eglAddDevice with the software flag
set if GBM has loaded a software driver. This allows _eglAddDevice
to make the difference between llvmpipe and kmsro.
This is important on split render/display SoCs: we don't want to
advertise EGL_MESA_device_software on these systems.
Completely drop disp->Options.ForceSoftware, because GBM is
responsible for choosing software rendering and doesn't take this
hint into account.
Signed-off-by: Simon Ser <contact@emersion.fr>
Fixes: 5743a36b2b ("egl: Don't add hardware device if there is no render node v2.")
References: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4178
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9697>
"plane[0].i32" is the plane being lowered, it's not the sampler we're looking
for.
It worked when there's a single sampler because, eg for NV12, plane[0].i32 for
the UV plane would be 1 and the added ":uv" sampler would also land at binding
point 1.
Fixes: 079e5f73d7 ("mesa/st: rewrite src var when lowering tex_src_plane")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9812>
Now that we have surface descriptors defined in midgard.xml we can use
pan_pack() to emit them.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9827>
Right now the code manipulates mali_ptr, but having surface descriptors
properly defined will allow us to use the descriptors allocator when
allocating a midgard texture.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9827>
Got the alignment wrong a few times, so I figured it would be good to
have helpers that take care of the alignment requirements based on
the midgard.xml definitions.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9827>
Once we have that in place, we can provide macros to simplify descriptor
allocation.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9827>