Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103513
Fixes: de88979413 ("radv: Implement VK_AMD_shader_info")
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Otherwise we will leak them, load duplicates from disk rather
than memory and never write items loaded from disk to the apps
pipeline cache.
Fixes: fd24be134f 'radv: make use of on-disk cache'
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Patch uses mem_ctx for allocation to ensure param array gets freed
later.
==6164== 48 bytes in 1 blocks are definitely lost in loss record 61 of 193
==6164== at 0x4C2EB6B: malloc (vg_replace_malloc.c:299)
==6164== by 0x12E31C6C: ralloc_size (ralloc.c:121)
==6164== by 0x130189F1: fs_visitor::assign_constant_locations() (brw_fs.cpp:2095)
==6164== by 0x13022D32: fs_visitor::optimize() (brw_fs.cpp:5715)
==6164== by 0x13024D5A: fs_visitor::run_fs(bool, bool) (brw_fs.cpp:6229)
==6164== by 0x1302549A: brw_compile_fs (brw_fs.cpp:6570)
==6164== by 0x130C4B07: blorp_compile_fs (blorp.c:194)
==6164== by 0x130D384B: blorp_params_get_clear_kernel (blorp_clear.c:79)
==6164== by 0x130D3C56: blorp_fast_clear (blorp_clear.c:332)
==6164== by 0x12EFA439: do_single_blorp_clear (brw_blorp.c:1261)
==6164== by 0x12EFC4AF: brw_blorp_clear_color (brw_blorp.c:1326)
==6164== by 0x12EFF72B: brw_clear (brw_clear.c:297)
Fixes: 8d90e28839 ("intel/compiler: Allocate pull_param in assign_constant_locations")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable@lists.freedesktop.org
We were dividing by 4 twice. This also papered over a bug where we
were neglecting to clamp the sampler count to the [0, 16] range.
This should have no functional impact, it only affects prefetching.
v2 [Kenneth Graunke]:
- Clamp sampler_count to [0, 16] to avoid overflowing the valid values
for this field. Write a commit message.
Signed-off-by: Kevin Rogovin <kevin.rogovin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This avoids recompiles for shaders that don't use explicit derivatives
when ctx->Hint.FragmentShaderDerivative == GL_NICEST.
For example, GFXBench 5 Aztec Ruins sets the GL_NICEST hint before
compiling any shaders, but none of them use dFdx() or dFdy() - only
implicit derivatives. This doesn't eliminate any recompiles, but
does eliminate one of the reasons for doing so.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
i965 turns fddx/fddy into their coarse/fine variants based on the
ctx->Hint.FragmentShaderDerivative setting. It needs to know whether
this can impact a shader in order to better guess NOS settings.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Also, reorder them to match the structure's field order, to make it
easier to check that they're all present.
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
This allows an app to query shader statistics and get a disassembly of
a shader. RenderDoc git has support for it, so this allows you to view
shader disassembly from a capture.
When this extension is enabled on a device (or when tracing), we now
disable pipeline caching, since we don't get the shader debug info when
we retrieve cached shaders.
v2: Improvements to resource usage reporting
v3: Disassembly string must be null terminated (string_buffer's length
does not include the terminator)
v4: Fixed LDS reporting. (Bas)
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Following piglits are passing:
- glean@texture_srgb
- spec@ext_texture_srgb@fbo-srgb
- spec@ext_texture_srgb@tex-srgb
- spec@ext_texture_srgb@texwrap formats
- spec@ext_texture_srgb@texwrap formats-s3tc
Btw. this enables GL 2.1 :-)
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
Fixes intermittent GPU hangs on Broxton with an Intel internal
test case.
There are plenty of similar fragment shaders in piglit that do
not use any varyings and any uniforms. According to the
documentation special timing is needed between pipeline stages.
Apparently we just don't hit that with piglit. Even with the
failing test case one doesn't always get the hang.
Moreover, according to the error states the hang happens
significantly later than the execution of the problematic shader.
There are multiple render cycles (primitive submissions) in between.
I've also seen error states where the ACTHD points outside the
batch. Almost as if the hardware writes somewhere that gets used
later on. That would also explain why piglit doesn't suffer from
this - most tests kick off one render cycle and any corruption
is left unseen.
v2 (Ken): Instead of enabling push constants, enable one of the
inputs (PSIZ).
v3 (Ken, Jason): Use LAYER instead making vulkan emit_3dstate_sbe()
happy.
Cc: "17.3 17.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Culling tris with zero area seems like a great idea, but apparently with
fill mode line (and point) we're supposed to draw them, at least some tests
for some other state tracker complained otherwise.
Such tris also always seem to be back facing (not sure if this can be
inferred from anything, since in a mathematical sense it cannot really be
determined), so make sure to account for this when filling in the face
information.
(For solid tris, this is of course unnecessary, drivers will throw the tris
away later in any case.)
Reviewed-by: Brian Paul <brianp@vmware.com>
This has been tested with the osdemo from mesa-demos
v2: - Add SELinux dependency
- fix typo GALLIUM_LLVM -> GALLIUM_LLVMPIPE
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
This builds the classic (non-gallium) osmesa with meson. This has been
tested with the osdemo application from mesa-demos.
v2: - Remove unrelated change
- Add SELinux dependency to osmesa
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
This has been tested wtih make dist-check and with meson.
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
This makes things much easier to ensure correctness with meson. Tested
with make dist-check and with meson.
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
These are used by non-gallium osmesa, so they need to be defined outside
of the gallium subdirectory.
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
There was a typo that causes the generated file to be called gl_procs.h
instead.
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
According to the ARB_ES3_1_compatibility specification,
glGetFramebufferAttachmentParameteriv is supposed to accept BACK,
and it behaves exactly like BACK_LEFT.
Fixes a GL error in GFXBench 5 Aztec Ruins.
Cc: "17.3 17.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
From the spec:
"IMAGE_FORMAT_COMPATIBILITY_TYPE: The matching criteria use for the
resource when used as an image textures is returned in
<params>. This is equivalent to calling GetTexParameter"
So we would need to return None for any target not supported by
GetTexParameter. By mistake, we were using the target check for
GetTexLevelParameter.
v2: fix typo (GetTextParameter vs GetTexParemeter) on comment (Illia Mirkin)
Reviewed-by: Antia Puentes <apuentes@igalia.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Meson's vcs_tag() uses the output of `git describe`, eg.
17.3-branchpoint-5-gfbf29c3cd15ae831e249+
Whereas the other build systems used a script that outputs only the sha1
of the HEAD commit, eg.
fbf29c3cd1
Given that this information is used by printing it next to the version
number, there's some redundancy here, and inconsistency between build
systems.
Bring Meson in line by making it use the same script, with the added
advantage of now supporting the MESA_GIT_SHA1_OVERRIDE env var.
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Serious Sam Fusion 2017 uses a huge number of occlusion queries,
and the allocated query pool buffer is greater than 4096 bytes.
This slightly improves performance (tested in Ultra) from
117.2 FPS to 119.7 FPS (~+2%) on my RX480.
This also improves Talos, from 69 FPS to 72/73 FPS (~+5%).
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Only needed when the CS path is used.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Fixes: 80bfff5c4f "wayland-egl: adds CFLAGS for wayland.egl.h include"
Suggested-by: Daniel Stone <daniel@fooishbar.org>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Mesa's DEBUG and assert's NDEBUG are not tied to each other, so we need
to explicitly compile this code out.
Fixes: 3df7892878 "vc4: Drop reloc_count tracking for debug
asserts on non-debug builds."
Cc: Eric Anholt <eric@anholt.net>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Valgrind shows that leak is caused by gen6_upload_push_constant, add
unref push_const_bo per stage to destructor to fix this (like done for
scratch_bo).
==10952== 144 bytes in 1 blocks are definitely lost in loss record 44 of 66
==10952== at 0x4C30A1E: calloc (vg_replace_malloc.c:711)
==10952== by 0x8C02847: bo_alloc_internal.constprop.10 (brw_bufmgr.c:344)
==10952== by 0x8C425C4: intel_upload_space (intel_upload.c:101)
==10952== by 0x8C22ED0: gen6_upload_push_constants (gen6_constant_state.c:154)
v2: remove if conditions, brw_bo_unreference handles NULL (Ken, Emil)
Fixes: 24891d7c05 ("i965: Store per-stage push constant BO pointers.")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
Asserting slot >= 2 made sense when the URB read offset was always 1
(pair of slots). Commit 566a0c43f0 made
it possible to read from the VUE header in slot 0, by adjusting the
offset to be 0. So, this assert is now bogus. Use the one from GL.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Commit 566a0c43f0 started setting the
3DSTATE_SBE bit to override these values with the one calculated there.
So, they're dead. Stop setting them.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
It appears that flushing the DB metadata is actually not sufficient
since the driver uses the new VS blit shaders. This looks quite
strange though, but it seems like we need to flush DB for fixing
the corruption.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102955
Fixes: 69ccb9dae7 (radeonsi: use new VS blit shaders (VS inputs in SGPRs)
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
This uses the new kernel interfaces for reduced cs overhead,
We only set the local flag for memory allocations that don't have
a dedicated allocation and ones that aren't imports.
v2: add to all the internal buffer creation paths.
v3: missed some command submission paths, handle 0/empty bo lists.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Not all rendering matches the miptree format. We allow rendering to
texture views so there are cases where it may not match. In those
cases, our current scheme of just passing the value of ctx->sRGBEnabled
isn't viable. Instead, just do what we do for texturing and pass the
view format in directly.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: mesa-stable@lists.freedesktop.org
It's rather surprising that we've never actually hit this before.
Aparently, Ian's SPIR-V generator currently claims the Simple when you
don't do anything complex. We really shouldn't assert-fail on it.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: mesa-stable@lists.freedesktop.org
system/window.h is no longer available by default and is part of
libnativewindow, so add it to the shared libraries. It has to be conditional
because the library is only present in O and later.
Really, we should only be depending on vndk/window.h now, but that's only
in O and changing would be pretty invasive.
Signed-off-by: Rob Herring <robh@kernel.org>