When more than 1 idr frames have been added to dpb list, there might be 2
frames with same poc in the dpb list. In this case, driver needs to search
the entire dpb list in order to get newly added reference frame with given poc
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9881>
When we encounter a bindless image here, lower_deref returns a
NULL-pointer, and calling record_images_used will try to dereference
that NULL-pointer.
So let's dig out the var from the source instruction instead of the
result of the lowering.
Fixes: 5910c938a2 ("nir/glsl: gather bitmask of images used by program")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9895>
Do not assign to a variable that won't be used.
Fixes CID#1468098 "Unused value (UNUSED_VALUE)".
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9910>
Do not assign to a variable that won't be used.
Fixes CID#1451708 and CID#1451710 "Unused value (UNUSED_VALUE)".
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9910>
With compressed DCC writes supported, the image should still be
compressed after resolving using the compute path.
Fixes various dEQP-VK.api.copy_and_blit.core.resolve_image.*
failures with RADV_DEBUG=forcecompress on GFX10.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9908>
There is margin in the time budget to run the full GLES3 and GLES31 CTS
instead of only 50%.
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9899>
Enabling this fast path, while making CPU side simpler
(and thus reducing driver's CPU overhead), forces us to
generate multiple VERTEX_BUFFER_STATE packets pointing to
the same buffer, but with slightly shifted BufferStartingAddress.
This is fine on GFX version >= 11 and < 8, but on version
8 and 9, thanks to internal details of the VF cache, vertex
data from each VERTEX_BUFFER_STATE is cached separately
and this results in a lot of cache misses.
Disabling this fast path restores previous performance levels.
On my SKL GT2 machine this improves performance in:
- GLB27 Egypt offscreen by 37%
- GLB27 TRex offscreen by 22%
- gfxbench5 Manhattan offscreen by 1.75%
- gfxbench5 Manhattan31 offscreen by 1.9%
- gfxbench5 Aztec Ruins high by 2.3%
In Intel performance CI on GFX version 9 it improves performance in:
- gfxbench5 Manhattan offscreen by 1.65%
- gfxbench5 Aztec Ruins normal offscreen by 1.33%
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3277
Fixes: 42842306d3 ("mesa,st/mesa: add a fast path for non-static VAOs")
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9857>
Fixes: 42842306d3 ("mesa,st/mesa: add a fast path for non-static VAOs")
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9857>
Avoids some copypaste and makes it easier to see how the different types
relate.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8847>
according to spec, unused clearvalues aren't accessed
> Only elements corresponding to cleared attachments are used. Other elements of pClearValues are ignored.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9900>
MAX2(count * struct size, 1) results in 1 for count=0, not the size of a struct.
Since this MAX only seems to exist so we can keep using NULL for error reporting,
just refactor to return a VkResult.
Fixes: ad241b15a9 ("vk: consolidate dynamic descriptor binding sorting")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4522
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9880>
Include drm_helper.h to define the driver descriptor again, but with a
new define GALLIUM_KMSRO_ONLY to disable defining descriptors for the
drivers that kmsro uses.
Fixes clinfo on Panfrost.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4002
Fixes: 9ec28b8d22 ("gallium/drm: Deduplicate screen creation for the dynamic (clover) pipe loader.")
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9380>
In some cases we will change the type of the destination register of
an instruction. This is the type we should use to verify that we're
allow to do the replacement.
Otherwise we can hit restrictions on CHV and upcoming Xe-Hp for
instance where the copy propagation transforms this :
send(16) (mlen: 2) vgrf10:UD, 0u, 0u, vgrf35:D, null:UD
mov(16) vgrf11:UW, vgrf10<2>:UW
mov(16) vgrf12:UW, vgrf10+0.2<2>:UW
mov(16) vgrf15:HF, |vgrf11|:HF
mov(16) vgrf16:HF, |vgrf12|:HF
mov(8) vgrf41<2>:UW, vgrf15+0.0:UW group0
mov(8) vgrf42<2>:UW, vgrf15+0.16:UW group8
mov(8) vgrf45<2>:UW, vgrf16+0.0:UW group0
mov(8) vgrf46<2>:UW, vgrf16+0.16:UW group8
into this :
send(16) (mlen: 2) vgrf10:UD, 0u, 0u, vgrf35:D, null:UD
mov(8) vgrf41<2>:HF, |vgrf10+0.0|<2>:HF group0
mov(8) vgrf42<2>:HF, |vgrf10+1.0|<2>:HF group8
mov(8) vgrf45<2>:HF, |vgrf10+0.2|<2>:HF group0
mov(8) vgrf46<2>:HF, |vgrf10+1.2|<2>:HF group8
Because of the floating point use, stride and offets should be the
same.
v2: Fix final destination type selection (Curro)
v3: constify (Curro)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9832>
Otherwise pass->tile_align_w will be 0, leading to a divide by zero and
undefined behavior. In practice, I saw this lead to an infinite loop in
tests like
dEQP-VK.draw.instanced.draw_indexed_indirect_vk_primitive_topology_line_list_attrib_divisor_0_multiview
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9606>
A little low-stakes RE effort as I unwind from fighting CI all day. Comes
from diffing dEQP-VK.clipping.user_defined.clip_distance.vert.* on the
blob and comparing to a6xx behavior. (My blob doesn't do tess, so if
there are equivalent tess fields for some of these, I didn't find them)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9870>