tcs_out_lds_offsets is not sure to be 16 byte aligned, it's
calculated like this:
num_patches * patch_vertices * lshs_vertex_stride
num_patches and patch_vertices are not sure to be any value aligned,
lshs_vertex_stride is added one extra dword, so it's only 4 byte
aligned.
This may cause problem even before we switch to nir tess output
lower when write tess factor before read tail of input. But it's
more likely to cause problem after we switch to nir tess output
lower because the main body won't eliminate the low 4bit offset
but epilog will, so they use different offset to read/write tess
factor.
Fixes: 7598bfd768 ("radeonsi: replace llvm tcs output with nir lower pass")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7083
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18174>
(cherry picked from commit ff7c59672fd59c94792b26f7131eb86e57d4b8f4)
For AMD we kinda have some modifiers with a max size ... (Which is
really a compositor/kms issue, but getting them to try kinda falls
into the unsolved "how to allocate/what pitch to use" bucket, so
we solve it on the allocating side)
Cc: mesa-stable
Tested-by: Michel Dänzer <mdaenzer@redhat.com>
Reviewed-by: Joshua Ashton <joshua@froggi.es>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18139>
(cherry picked from commit bb2a44432400f6c7613905eceb14c6544687ed1f)
In a99e85db9e, we added a dup into iris_screen_create(). Apparently
some android code paths must be hitting iris_screen_create() without
calling iris_drm_screen_create(). After a99e85db9e, the code paths
that do hit iris_drm_screen_create() will now dup the fd twice, but
iris_screen_destroy() will only close 1 of these fds.
Fixes: a99e85db9e ("iris:Duplicate DRM fd internally instead of reuse.")
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18020>
(cherry picked from commit e9f40e42de6b47e036d603296ecb5facb384eb0c)
We did not handle the operation with data size < 4. It works fine on
all other messages (global/shared). The initial commit was just too
restrictive.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 1e242785c3 ("intel/fs: Implement load/store_scratch on XeHP")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16964>
(cherry picked from commit 3c78e94ff345fda6314e7644873d960c0ee97dc5)
The selection of the internal opcode to deal with load_scratch is
incorrect.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: c643979228 ("intel/fs: Choose memory message type based on bit size")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16964>
(cherry picked from commit 46a13404c07acdb0412121a6ff55fdbcd5bfea5c)
Even with dynamic rendering, you have to bind both aspects of the image
if the image contains both depth and stencil. One day, we may see this
restriction lifted but that will require deeper driver surgery into the
way we handle depth/stencil layouts.
Fixes: 42db590006 ("radv: convert the meta blit 2d path to dynamic rendering")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18084>
(cherry picked from commit 76b8b854a514bc515ddba47f4fbbf6ea80bcf0f2)
When setting -Dshader-cache=disabled the build fails due
no member named 'disk_cache' in 'struct anv_physical_device'
Signed-off-by: sjfricke <spencerfricke@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Fixes: 7f1e8230 ("anv: Switch to the new common pipeline cache")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18181>
(cherry picked from commit c49e328e4f41d4a2503045a0bb8bfaf65d4cde61)
We use/delete this fence unconditionally, but it was initialized only
when screen->precompile is set. Move the util_queue_fence_init call
to the iris_create_uncompiled_shader to initialize it always.
Fixes: 42c34e1a ("iris: Enable threaded shader compilation")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7074
Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Tested-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18182>
(cherry picked from commit 0a0aa24b33c4a3d18426f09674ff67c23e71cfd2)
First we should only support the extension if we can support reporting
on all the heaps.
Second we should not run any query code if the extension is not
supported.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: fae88d8791 ("anv: make use of the new smallbar uAPI")
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18153>
(cherry picked from commit b8c472c111ee1e5a7feb45a9da81e5f73145b6fb)
With VK_FORMAT_B10G11R11_UFLOAT_PACK32 in particular, we're seeing
applications create image views with swizzle = R,G,B,0
But since the format has no alpha channel, the swizzle value for it
does not matter for the equivalence we're trying to verify.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: a9edc268b9 ("anv: validate image view lowered storage formats for storage")
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18081>
(cherry picked from commit 4ab38112f36ec25afe98a1636909bb8bde4ad7a1)
this doesn't need fixing
Fixes: 3a47576687 ("zink: add a compiler pass to match up tex op dest types")
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18196>
(cherry picked from commit 585fa6bf406e064877baf4b2b106e116bee21c6b)
If this happens then "after" is NULL, so we can't use it to get the
block, and the instruction is never moved at the end so we have to
create the split instructions before creating the collect to make sure
they are in the right order.
This happens when reloading a complex vector value that has been
coalesced at the end of a basic block, which apparently hasn't happened
until a gfxbench5 shader on zink hit this case. This fixes it.
Closes: #7054
Fixes: 613eaac7b5 ("ir3: Initial support for spilling non-shared registers")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18082>
(cherry picked from commit dcfbb60392fce468eb2bc2513c3b1ba5dfc1e62e)
Fixes: bbdf7e45b1 ("wsi/x11: Hook up KHR_incremental_present")
Signed-off-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18177>
(cherry picked from commit 20fba14f2cb34060ecae42634bfe8c266f56952f)
When unpacking the texture sample result ensure it is moved to the
proper expected dest register.
This fixes incorrect texturing in Chromium using PixiJS framework.
CC: mesa-stable
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18122>
(cherry picked from commit 4ba21c3e8cef567f0c39460dd3f1d20f754c3c6f)
Clause scheduler edition of db2bdc1dc3 ("pan/bi: Require ATEST coverage mask
input in R60"). ATEST wants to read r60, which can't work if its input isn't
even in a register.
When per-sample shading isn't in use, prevents regressions in:
KHR-GLES31.core.sample_variables.mask.*
These tests previously passed because per-sample shading was forced. It's
not clear whether the bug addressed in this patch is possible to hit "in the
wild", i.e. without the optimizations in this series that allow us to use
per-pixel shading in more cases.
No shader-db changes.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17841>
(cherry picked from commit 394e1f5862a5cd537f60c01ed75dc698e112da58)
Fixes flaking in
dEQP-GLES31.functional.image_load_store.cube.qualifiers.volatile_r32i due to
image reads being moved past a BARRIER.
To make this more robust/optimal, we probably need scheduling information
(coherent/volatile/etc) added to instructions like ACO does. That's left for a
future extension, for now I just want the test to stop flaking.
Fixes: 569e5dc745 ("pan/bi: Schedule for pressure pre-RA")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17841>
(cherry picked from commit e12a9ce8d691a311cd37eecbdeadb30400adeb95)
without glsl array lowering, this intrinsic can creep in for tg4 ops,
which complicates everything. instead, rewrite these ops as residency+iand,
and then rewrite the existing residency ops to match
v2 (idr): Add missing size parameter to nir_is_sparse_texels_resident
calls.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16547>
I think I got all the drivers that need updating. This is only
necessary in drivers that support GLSL 4.00 / GL_ARB_gpu_shader5 and
have PIPE_CAP_TEXTURE_GATHER_OFFSETS = 0.
v2: Don't (accidentally) condition tg4 offsets lowering on tex rect
lowering. Noticed by Qiang.
v3: Add missing bool() cast.
v4: don't use designated initializers
Fixes: 640f909862 ("glsl: add _texture related sparse texture builtin functions")
Closes: #6365
Tested-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16547>
This intrinsic returns a Boolean. Both 1-bit and 32-bit versions must
be allowed. Otherwise, size mismatches will occur after lowering
1-bit Booleans to 32-bit.
Fixes: 4cbdf9ec4d ("nir,spirv: implement SpvOpImageSparseTexelsResident")
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16547>
If we don't recognize the model, dev->model will be NULL. In that case, we can't
dereference dev->model to get the tilebuffer size. If we do, we'll segfault,
instead of gracefully refusing to probe and loading the swrast instead.
Fixes: 96d65b47c7 ("panfrost: Use implementation-specific tile size")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18115>
(cherry picked from commit 93f69e0452e0cdb2cf1faecb613f54220b798f79)
Additionally add support for PIPE_CAP_CLEAR_SCISSORED since we are already
touching the scissor state.
Fixes various gnome-shell rendering artifacts.
v2: Remove NEW_SCISSOR as clear now updates scissor registers explicitly
v3: Reset scissor_off state since its not off now after clear
Cc: mesa-stable
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18137>
(cherry picked from commit 7908cb895e2dea8cff91cecb53457a099dc96b07)
Alpha-to-coverage acts like discard and happens after FS ends,
so like with discard LRZ write should be disabled.
With discard we don't know at the moment of binning whether
fragment would be not discarded, so we cannot write its depth to LRZ.
Cc: mesa-stable
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18102>
(cherry picked from commit 09676b5817339cb7653fb23c9c5341b7f95392bb)
SPIR-V shift ops unlike NIR have undefined behavior if shift count
larger than or equalt to bitwidth.
This means that true translation of NIR ishl/ishr/ushr to SPIR-V requires
masking like that done in gallivm.
This was seen in the case of soft fp64 in cts case
KHR-GL46.gpu_shader_fp64.builtin.ceil_double.
Cc: mesa-stable
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18074>
(cherry picked from commit b386df918fd3274b3814c639ad7297f5b07cc48a)
Loading indirectly from a register that was just written to
doesn't work on R600 class hardware, so add a NOP group with
the address register load being emitted in the t-slot. to make
sure that the register write was finished.
Fixes: 33765aa92a
r600/sfn: Enable NIR for pre RG hardware
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18130>
(cherry picked from commit 404d95ca49af48e06d786e765091dc8d3a531b3c)
On R600 and R700 class hardware the input declaration order maps
directly to the register the hardware writes the inputs to, so
make all interpolated inputs come first, and only then emit the
system values like POS or FACE.
Related: #7035
Fixes: 33765aa92a
r600/sfn: Enable NIR for pre RG hardware
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18130>
(cherry picked from commit f6582027dc839f9f886854b3a6e581d4c41ac5a7)
The old code does the same for R600.
Fixes: 33765aa92a
r600/sfn: Enable NIR for pre RG hardware
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18130>
(cherry picked from commit 34b9e3e44c1939e9773bb26a2cac2871b2579f03)
This reverts commit 23e5b910c5.
Reason for revert:
It's causing the regression for H264 transcoding. We will Enable EFC
once we verify all corner cases and as of now disabling
Signed-off-by: Ikshwaku Chauhan <ikshwaku.chauhan@amd.com>
Reviewed-by: Thong Thai <thong.thai@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17869>
(cherry picked from commit ddc8ab9e4322bed0ce7e47c986d54c852fb56852)
textureGatherOffsets always takes a highp array of constants. As
per the discussion in [1] trying to lower the precision results in segfault
later on in the compiler as textureGatherOffsets will end up being passed
a temp when its expecting a constant as required by the spec.
[1] https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16547#note_1393704
Fixes: b83f4b9fa2 ("glsl: Add an IR lowering pass to convert mediump operations to 16-bit")
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18101>
(cherry picked from commit 87940c31939f45f94c7af02c6c280773f917ff25)
Android CDD has additional requirements that must be met in order to
enable 1.1+:
- samplerYcbcrConversion
- VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_SYNC_FD_BIT
- VK_EXTERNAL_FENCE_HANDLE_TYPE_SYNC_FD_BIT
- VK_ANDROID_external_memory_android_hardware_buffer >= v2
Requirements are checked by:
android.graphics.cts.VulkanFeaturesTest#testVulkan1_1Requirements CTS
Fixes: 2686c5419d ("v3dv: add Android support")
Signed-off-by: Roman Stratiienko <r.stratiienko@gmail.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18047>
(cherry picked from commit b17ea48f91d8fa09668ac61a034013f4a79dc6be)
vk_common_AcquireImageANDROID and vk_common_QueueSignalReleaseImageANDROID
expect sync_fd import/export to be enabled, otherwise they crashes
while trying to ImportSemaphoreFdKHR() / GetSemaphoreFdKHR().
Features was disabled on Linux to skip sync_fd CTS tests, which is using
late vkEvent signalling which causes deadlock / dEQP timeout on v3dv.
One of the options was implementing blocking v3dv-specific
AcquireImageANDROID / QueueSignalReleaseImageANDROID to avoid importing
/ exporting sync_fd, but since these features are also required by CDD
for Vulkan 1.1 and above, it was decided to enable the extensions for
Android in exchange of a few failed dEQP tests (which should not cause
any issues in non-dEQP scenarious).
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6977
Fixes: 316728a55b ("v3dv: Switch to the common submit framework")
Signed-off-by: Roman Stratiienko <r.stratiienko@gmail.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18047>
(cherry picked from commit 5e32e8c962da237772c78df7654897acfd4787f7)
Right now even the simplest mesh test (func.mesh.basic.mesh from crucible) fails like this:
ASSERT: Scalar MESH validation failed!
load_payload(16) vgrf11+0.0:F, vgrf8:D
../../src/intel/compiler/brw_fs_validate.cpp:61: inst->dst.offset / REG_SIZE + regs_written(inst) <= alloc.sizes[inst->dst.nr]
Because we try to load 8 regs with LOAD_PAYLOAD in SIMD16 mode.
Fixes: 349a040f68 ("intel/fs: Make logical URB write instructions more like other logical instructions")
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18075>
(cherry picked from commit 446eeccb1c95f65511aa5ffdfd7f6ce23b1ea83f)
If we have VRAM we're not exactly a unified memory architecture, are we?
Cc: mesa-stable
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18123>
(cherry picked from commit 1ef43ea3c4ec8b29e45ec6e052e37ce3991b77b6)
get_ssa_temp's and NIR's bit size can differ for scalar sources.
This causes broken packing of the MIMG operands with A16/G16.
Fixes: f5f73db846 ("aco: Support 16bit sources for texture ops.")
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18008>
(cherry picked from commit 2dd641119fa545528eefdcd50e0d690c7f592bcb)
The lp_rasterizer is shared across contexts, and lp_rast_fence called
without holding rast_mutex could race with rast_mutex being replaced and
unref'd on a different thread.
Fixes: a680fd078c ("llvmpipe: make last_fence a screen/rast object not a context one.")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18116>
(cherry picked from commit f228c26520dcfdc2d600ba78c51796ef2a4a31fd)
Since PIPE_CAP_TGSI_TEXCOORD is now enabled, texcoord is now declared
as TGSI_SEMANTIC_TEXCOORD instead of TGSI_SEMANTIC_GENERIC.
Fixes assert running REDTurbineDEMO with MTL Renderer when the guest needs to
fallback to swtnl for line stipple.
Fixes: e73443b7a5 ("svga: enable PIPE_CAP_TGSI_TEXCOORD for vgpu10 and up")
Reviewed-by: Martin Krastev <krastevm@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18113>
(cherry picked from commit 854e8797ac3b010c980810fa5d0638a872df3aa0)
Alpha-to-coverage acts like discard and happens after FS ends,
so like with discard LRZ write should be disabled.
With discard we don't know at the moment of binning whether
fragment would be not discarded, so we cannot write its depth to LRZ.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6876
Cc: mesa-stable
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18083>
(cherry picked from commit c45fded26b8c858d1974a8b4017200b1cbe92d31)