for drivers where separate cull distance variables are required, this
lets them avoid having to write yet another pass to undo gallium's mangling
of shader info
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14878>
Softpipe was the only driver still using this feature. I had enabled it
in ba22f014f9 ("softpipe: Enable PIPE_CAP_TGSI_ANY_REG_AS_ADDRESS;") for
an instr count win, but it's really not important to that driver and it's
not worth keeping the knob around just for that.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14360>
There exists hardware intel gen4 specifically that has only 6 clip planes
but supports GLSL 1.30. This enhances the CAP so that the current values
of 0,1 remain the same, but giving it a larger number will override the
max.
This allows the gen4 intel to set this to 6 and leave everyone else alone.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14344>
This was useful for emulating GL 3.2 in virgl on a GLES3 host renderer,
before GL_EXT_depth_clamp introduced the ability for hardware drivers to
expose the feature on GLES. Now that we have that, the desktop-GL-capable
HW that virgl cares about can expose desktop GL even on its GLES renderer
on the host without this emulation. I don't think anyone particularly
cares about hitting higher GL versions on actually-core-GLES hosts with
virgl.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13729>
This capability is enabled for drivers supporting formatless image
writing in shader.
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13409>
this reworks PIPE_CAP_PREFER_BLIT_BASED_TEXTURE_TRANSFER into an
enum as PIPE_CAP_TEXTURE_TRANSFER_MODES, enabling drivers to choose
a (sometimes) faster, compute-based download mechanism based on a new
pipe_screen hook
compute pbo download is implemented using shaders with a prolog to convert
the input format to generic rgb float values, then an epilog to convert
to the output value. the prolog and epilog are determined based on a vec4
of packed ubo data which is dynamically updated based on the API usage
currently, the only known limitations are:
* GL_ARB_texture_cube_map_array is broken somehow (and disabled)
* AMD hardware somehow can't do depth readback?
otherwise it should work for every possible case
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11984>
Driver should enable this cap if it prefers varyings to be aligned
to power of two in a slot, i.e. vec4 in .xyzw, vec3 in .xyz, vec2 in .xy
or .zw
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13151>
vulkan requires that vertex attribute access be aligned to the size of
a component for the attribute, but GL has no such requirements
the existing alignment caps are unnecessarily restrictive for applying
this limitation, so this cap now pre-calculates the masks for elements
and vertex buffers in vbuf to enable rewriting misaligned buffers
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13556>
The main motivation is to improve the score of viewperf13/snx.
This new interface is designed to be optimal for display lists as implemented
by the vbo module. It has much lower CPU overhead in the frontend, threaded
context, and the driver.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13050>
This is required for d3d10+, which has depth_clamp always enabled
regardless of depth_clip (in contrast to OpenGL, where enabling
depth_clamp disables depth_clip). There doesn't seem to be a GL
extension for it, but it will be used for lavapipe to implement
VK_EXT_depth_clip_enable.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12260>
We would like draw-only display lists to have immutable draw info and
this is the only GL non-draw state in pipe_draw_info (not counting
view_mask).
It also allows removing some code from draw_vbo for tessellation.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12351>
this is another prim type bitmask which will trigger automatic draw rewriting
to a direct draw any time a prim-restart draw occurs with a prim type that is
not supported by the driver for prim restart, even if that prim type is supported
for normal drawing
the default is set to all prim types to preserve existing functionality, and PrimitiveRestartForPatches
is now explicitly set to false because no driver supports it
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10973>
drivers can now export a bitmask of the primitive types they support,
and all others will be automatically be rewritten
the default value is set to all primitive types supported to preserve
existing behavior
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10973>
for drivers that set it, this now automatically handles restart index rewriting
by running draws through primconvert when necessary
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10973>
generally speaking, if I'm tracing an app, I want to see what's happening to
my driver, not what's happening to tc, as tc does rewriting of command streams
which can affect the operation of the driver
use GALLIUM_TRACE_TC for previous behavior
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10362>
The prog_to_nir->NIR-to-TGSI change ended up causing regressions on r300,
and svga against r300-class hardware, because nir_lower_uniforms_to_ubo()
introduced shifts that nir_lower_ubo_vec4() tried to reverse, but that NIR
couldn't prove are no-ops (since shifting up and back down may drop bits),
and the hardware can't do the integer ops.
Instead, make it so that nir_lower_uniforms_to_ubo can generate
nir_intrinsic_load_ubo_vec4 directly for !INTEGER hardware.
Fixes: cf3fc79cd0 ("st/mesa: Replace mesa_to_tgsi() with prog_to_nir() and nir_to_tgsi().")
Closes: #4602
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10194>
this enables detection for the EXT vs the ARB extension, which have
different specifications regarding which formats must be supported
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10030>
similar to amd/radv driver debug modes for sqtt, this specifies a filename
which is checked on every flush(PIPE_FLUSH_END_OF_FRAME); when it exists,
the next frame (and only that frame) is captured into the trace
to use, specify a file with the env var, run your app, and 'touch /path/to/file'
when you want to capture a trace
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10058>
We need a single empty line between the code-block state and the text
in the block, otherwise the rST is invalid and the entire block will be
dropped, as is currently the case on the website.
While we're at it, remove some needless colons from these code-blocks as
well. They're not needed, and we usually don't have these in the docs.
Fixes: a2a8c6a36c ("docs: Add some documentation of game GL buffer object mapping behavior.")
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9710>
This is to allow for
VK_EXT_sampler_filter_minmax
GL_EXT_texture_filter_minmax
support
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9487>
There are a variety of paths that apps take (this is by no means a
complete enumeration, I tried to keep going until I saw repeats but
eventually ran out of steam), and it should be useful to driver developers
writing their pipe_transfer_map() and invalidate_resource() calls to see a
bunch of the patterns without having to do performance debug on each app.
Acked-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9231>
when this is not set, this triggers shader and sampler state updates any time a sampler
starts or stops using GL_CLAMP, applying bitmasks needed to run nir_lower_tex
and setting CLAMP_TO_BORDER/CLAMP_TO_EDGE as necessary to mimic the behavior
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8756>
We should be exposing it in every driver, since it's required eventually
to reduce jank. Make drivers have to explicitly opt out instead of opt
in.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9088>
We often do this:
pipe->set_constant_buffer(pipe, shader, slot, &cb);
pipe_resource_reference(&cb->buffer, NULL);
That results in atomic increment in set_constant_buffer followed by
atomic decrement after set_constant_buffer. This new interface
eliminates those atomics.
For the case above, this should be used instead:
pipe->set_constant_buffer(pipe, shader, slot, true, &cb);
cb->buffer = NULL; // if cb is not a local variable, else do nothing
AMD Zen benefits from this. The perf improvement is ~3% for Viewperf13/Catia.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8298>
The vulkan cond rendering hook is quite different than the
traditional gallium one so add a new interface for it.
This just conditionalises rendering on the memory location.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8182>
This indicates whether a driver wants samplers for buffer textures as
well as normal textures.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8300>
Drivers aren't allowed to ignore start with user index buffers anymore.
This is required by the new fast path where mesa/main is using pipe_draw_info.
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7679>
To remove PIPE_CAP checking in the common code.
It's better if drivers lower multi draws even if the hardware doesn't
support it beause the multi draw loop can be moved deeper into the driver
to remove more overhead.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7679>
This is needed to implement the vulkan transform feedback pause
resume functionality
Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7981>
Rather than hard-code a list of all the format
modifiers supported by any gallium driver and the
number of aux planes they require in the dri state
tracker, add a screen proc that queries the number
of planes required for a given modifier+format
pair.
Since the only format modifiers that require
auxiliary planes currently are the iris driver's
I915_FORMAT_MOD_Y_TILED_CCS,
I915_FORMAT_MOD_Y_TILED_GEN12_RC_CCS, and
I915_FORMAT_MOD_Y_TILED_GEN12_MC_CCS, the absence
of the screen proc implies zero aux planes for all
of the screen's supported modifiers. Hence, when
a driver does not expose the proc, derive the
number of planes directly from the format.
Signed-off-by: James Jones <jajones@nvidia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3723>