Commit Graph

137114 Commits

Author SHA1 Message Date
Tony Wasserka
1e03796fa4 aco/isel: Don't request sign extension when truncating signed integers
This doesn't change semantics but allows us to reject this potentially
ambiguous configuration in convert_int in a later change.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9597>
2021-03-26 14:39:23 +00:00
Tony Wasserka
3a2b055726 aco/isel: Fix i64/u64->float32 conversion for large inputs
Previously, inputs such as 0x100000000 would have their upper 32-bits
ignored despite being representable by 32-bit floats.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9597>
2021-03-26 14:39:23 +00:00
Tony Wasserka
436922c84a aco/isel: Don't emit unsupported i16<->f16 conversion opcodes on GFX6/7
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: b86305bb57 ("nir/algebraic: collapse conversion opcodes (many patterns)")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4357
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9597>
2021-03-26 14:39:23 +00:00
Erik Faye-Lund
3463b8bf41 zink: tighten emitted image spir-v caps
We might only need to emit a read or write cap for a given image. This
could provide the Vulkan driver with the chance to optimize things
slightly.

Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9855>
2021-03-26 14:30:46 +00:00
Samuel Pitoiset
a3ee818b79 radv: report that degenerated triangles are not culled
I don't think the hw culls these primitives and NGG culling isn't
yet a thing. This also matches PAL.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9691>
2021-03-26 12:52:40 +01:00
Simon Ser
1d349a6484 Revert "egl: Don't add hardware device if there is no render node v2."
This reverts commit 5743a36b2b.

Now that _eglAddDevice is always called with the correct software
hint, no need to bail out if the device doesn't have a render node.
On split render/display SoCs, the DRM device won't have a render
node, yet rendering is hardware-accelerated (via kmsro).

Signed-off-by: Simon Ser <contact@emersion.fr>
Fixes: 5743a36b2b ("egl: Don't add hardware device if there is no render node v2.")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4178
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9697>
2021-03-26 10:32:31 +00:00
Simon Ser
e39d72aec2 egl: only take render nodes into account when listing DRM devices
We don't want to expose an EGL device for a display-only DRM devices
(like VKMS). For these DRM devices we have a separate software-rendering
device (the first in the list, always present).

There is a similar check in _eglAddDRMDevice, however it will be
removed in a future commit to allow split render/display devices
to be properly added. We can't figure out whether we're on a split
render/display system before loading the driver.

Signed-off-by: Simon Ser <contact@emersion.fr>
Fixes: 5743a36b2b ("egl: Don't add hardware device if there is no render node v2.")
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9697>
2021-03-26 10:32:31 +00:00
Simon Ser
08a51770bd egl: fix software flag in _eglAddDevice call on DRM
On the EGL DRM platform, call _eglAddDevice with the software flag
set if GBM has loaded a software driver. This allows _eglAddDevice
to make the difference between llvmpipe and kmsro.

This is important on split render/display SoCs: we don't want to
advertise EGL_MESA_device_software on these systems.

Completely drop disp->Options.ForceSoftware, because GBM is
responsible for choosing software rendering and doesn't take this
hint into account.

Signed-off-by: Simon Ser <contact@emersion.fr>
Fixes: 5743a36b2b ("egl: Don't add hardware device if there is no render node v2.")
References: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4178
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9697>
2021-03-26 10:32:31 +00:00
Pierre-Eric Pelloux-Prayer
6298347ec7 mesa/st: fix lower_tex_src_plane in multiple samplers scenario
"plane[0].i32" is the plane being lowered, it's not the sampler we're looking
for.

It worked when there's a single sampler because, eg for NV12, plane[0].i32 for
the UV plane would be 1 and the added ":uv" sampler would also land at binding
point 1.

Fixes: 079e5f73d7 ("mesa/st: rewrite src var when lowering tex_src_plane")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9812>
2021-03-26 10:21:35 +01:00
Boris Brezillon
3fe3aebd63 panfrost: Get rid of panfrost_pool_alloc()
This one is no longer used.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9827>
2021-03-26 08:46:15 +01:00
Boris Brezillon
f23a023a7e panfrost: Use the descriptor allocators where appropriate
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9827>
2021-03-26 08:46:10 +01:00
Boris Brezillon
58747cd2f5 panfrost: Emit surface descriptors with pan_pack()
Now that we have surface descriptors defined in midgard.xml we can use
pan_pack() to emit them.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9827>
2021-03-26 08:46:06 +01:00
Boris Brezillon
7c08bb5ad0 panfrost: Define the Surface and Surface-with-stride descriptors
Right now the code manipulates mali_ptr, but having surface descriptors
properly defined will allow us to use the descriptors allocator when
allocating a midgard texture.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9827>
2021-03-26 08:46:01 +01:00
Boris Brezillon
a1c0cc3fdd panfrost: Provide various helpers to simplify descriptor allocation
Got the alignment wrong a few times, so I figured it would be good to
have helpers that take care of the alignment requirements based on
the midgard.xml definitions.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9827>
2021-03-26 08:45:57 +01:00
Boris Brezillon
724045aff2 panfrost: Specify descriptor alignment requirements
Once we have that in place, we can provide macros to simplify descriptor
allocation.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9827>
2021-03-26 08:45:51 +01:00
Boris Brezillon
65931ffa27 pan/gen_pack: Parse alignment requirements
This will help us simplify descriptor allocation.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9827>
2021-03-26 08:45:32 +01:00
Iago Toral Quiroga
e790c20403 broadcom/compiler: try to fill up delay slots after a thrsw
The way we handle thrsw instructions is that we try to merge them
back into previously scheduled instructions to fill up its delay
slots. This is generally safe, because the thrsw won't happen until
after the delay slots, so we are not really changing the execution
order of the instructions and we just need to make sure we don't
violate a few specific restrictions.

If we have not managed to fill up all delay slots after doing this,
then we emit as many NOPs as needed to fill them. This is to ensure
that we don't schedule an instruction that needs to execute after the
thread switch before the thread switch happens. However, doing this
can lead to inefficient code, since some times the instructions we
schedule after a thrsw are indepdent of the thrsw and could be safely
executed in its delay slots.

This change removes the fixed NOP emission after a thrsw to fill
delay slots and instead adds code to ensure that our instruction
scheduling is aware of when it is scheduling instructions in the
delay slots of a previous thrsw to avoid selecting conflicting
instructions.

The only case were we still emit fixed NOPs is for the thread end that
we emit to terminate the program after scheduling all instructions
because we can't end the instruction stream before the thread end
is properly executed.

total instructions in shared programs: 13691004 -> 13648140 (-0.31%)
instructions in affected programs: 4345951 -> 4303087 (-0.99%)
helped: 19645
HURT: 652
Instructions are helped.

total max-temps in shared programs: 2319317 -> 2318687 (-0.03%)
max-temps in affected programs: 10510 -> 9880 (-5.99%)
helped: 532
HURT: 9
Max-temps are helped.

total sfu-stalls in shared programs: 31752 -> 32354 (1.90%)
sfu-stalls in affected programs: 840 -> 1442 (71.67%)
helped: 7
HURT: 467
Sfu-stalls are HURT.

total inst-and-stalls in shared programs: 13722756 -> 13680494 (-0.31%)
inst-and-stalls in affected programs: 4335590 -> 4293328 (-0.97%)
helped: 19453
HURT: 758
Inst-and-stalls are helped.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9825>
2021-03-26 07:13:07 +00:00
Iago Toral Quiroga
f68f209e39 broadcom/compiler: add a v3d_qpu_writes_accum helper
We have helpers to check if an instruction writes to specific
accumulators. This one will check if it writes any of the general
purpose accumulators, which will come in handy in a follow-up
patch.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9825>
2021-03-26 07:13:07 +00:00
Kenneth Graunke
640e8cb3dc Half-revert "gallium/dri2: Pass the resource that corresponds to the plane"
This reverts the resource_get_param changes from Tomeu's commit
ab7744b280.  It breaks every single
program run on iris (such as wflinfo -a gl -p glx).

iris's I915_FORMAT_MOD_Y_TILED_CCS modifier reports two planes - one for
the main surface, and a second one for the CCS surface.  However, there
is only one underlying pipe_resource.  We only use pipe_resource::next
for modifier information when importing buffers from elsewhere; there is
no res->next for internally allocated resources created with modifiers.

resource_get_param() is not supposed to chase res->next pointers.  The
hook is intended to take the base image, and the plane, as parameters.
The driver can then walk its own data structures however it sees fit
in order to find the appropriate plane, rather than enforcing a linked
list of resources.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9847>
2021-03-25 18:24:53 -07:00
Marek Olšák
f903a4be9f amd: update addrlib
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9668>
2021-03-25 19:23:40 -04:00
Marek Olšák
3616e02ef3 amd/addrlib: define endianess differently
This removes a Mesa-specific change.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9668>
2021-03-25 19:23:38 -04:00
Marek Olšák
1d69c0419b amd/addrlib: prevent defining regparm differently
Define it in meson, so addrlib won't define it.
This is adding back the addrlib original code.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9668>
2021-03-25 19:23:36 -04:00
Marek Olšák
59912cd4cf amd/addrlib: add back the incorrect original DCC checking
This reduces Mesa-specific changes.

is_dcc_supported_by_CB() should protect against getting there.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9668>
2021-03-25 19:23:34 -04:00
Iván Briano
042d24971e anv: use helper function to get the buffer size
This ensures we get a properly aligned size for the buffer so we don't
trip over HW limits for push constants.

Closes #3703
Fixes dEQP-VK.robustness.image_robustness.push.* on HSW

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9699>
2021-03-25 15:22:20 -07:00
Iván Briano
54b848b245 anv: move buffer size alignment into helper function
And use ANV_UBO_ALIGNMENT for it instead of a magic number.
This increases the alignment to 64B, but that ought to be good for
everyone.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9699>
2021-03-25 15:20:53 -07:00
Vasily Khoruzhick
bff7fa3fe3 lima: compute nir_sha1 for shader key even if disk cache is disabled
We're using it for in-memory cache as well, so it needs to be computed
unconditionally.

Fixes: bf09ba5385 ("lima: implement shader disk cache")
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9838>
2021-03-25 21:58:57 +00:00
Lionel Landwerlin
8b586d9ed6 intel: Add null hw layer
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2832>
2021-03-25 19:10:03 +00:00
Lionel Landwerlin
54fe5b0482 meson: switch vulkan layer to list of choices
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2832>
2021-03-25 19:10:03 +00:00
Jason Ekstrand
82b25a1d75 anv: Align inline uniform data to ANV_UBO_ALIGNMENT
If we're going to have a #define for UBO alignments, it's probably a
good idea to make sure everything is aligned to that.  This increases
the alignment from 32B to 64B but that shouldn't hurt anyone.

Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9837>
2021-03-25 19:01:24 +00:00
Kenneth Graunke
5ae276f7e0 intel: Fix release build breakage
We missed changing one instance of debug_flag to debug_enabled in a
release-only ifdef branch.

Fixes: 758eb18c6f ("intel/compiler: Make vec4 generator take debug_enabled as a parameter")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9839>
2021-03-25 11:36:58 -07:00
Danylo Piliaiev
243724031b turnip: clamp to zero negative upper left corner of viewport
We cannot send negative viewport coordinates to the hardware,
so clamp them since negative min.x/y is valid per spec.
The negative origin still counts in calculations of guardband.

Fixes crash in 3DMark's "Sling Shot Extreme" test.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9629>
2021-03-25 17:57:59 +00:00
Matt Turner
efd1d5cc59 ci: Use CI_PROJECT_ROOT_NAMESPACE
Currently, the template in e.g. .gitlab-ci/deqp-runner.sh containing

    See https://$CI_PROJECT_NAMESPACE.pages.freedesktop.org/-/...

will print an invalid URL. Copied from .gitlab-ci/piglit/run.sh which
already uses CI_PROJECT_ROOT_NAMESPACE in its template.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9821>
2021-03-25 17:31:25 +00:00
Erik Faye-Lund
422ad76cca docs: Add 21.0.0 hashes
This was missed from the normal 21.0.0 release MR, so here's the hash
added retroactively.

Acked-by: Dylan Baker <dylan.c.baker@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9826>
2021-03-25 17:26:54 +00:00
Eric Anholt
b1e64a7771 ci/freedreno: Mark a630 as flaky on arb_draw_indirect-transform-feedback
This looks like a new one.  Unclear cause -- appeared in a large branch
with potential impact, but I had also been doing TF work around that time.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9819>
2021-03-25 16:25:24 +00:00
Daniel Schürmann
8e43abcd2c aco/ra: remove exec handling for phis
These are not temporaries anymore.

Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9763>
2021-03-25 15:12:19 +00:00
Daniel Schürmann
3284f755a3 aco/ra: allow m0 in get_reg_specified()
Totals from 1 (0.00% of 136546) affected shaders (Navi10):
CodeSize: 12788 -> 12776 (-0.09%)
Instrs: 2441 -> 2438 (-0.12%)
Latency: 29713 -> 29731 (+0.06%)
InvThroughput: 14857 -> 14866 (+0.06%)
Copies: 354 -> 353 (-0.28%)
Branches: 66 -> 65 (-1.52%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9763>
2021-03-25 15:12:19 +00:00
Daniel Schürmann
4bfbd4de84 aco/ra: iterate backwards when coalescing phis
Aligning the phi definition with the operand from
the else- block can reduce the number of branches
if the else- block is otherwise empty.

Totals from 16 (0.01% of 136546) affected shaders (Navi10:
CodeSize: 707848 -> 707312 (-0.08%); split: -0.09%, +0.01%
Instrs: 126534 -> 126400 (-0.11%); split: -0.13%, +0.02%
Latency: 6399306 -> 6395082 (-0.07%)
InvThroughput: 6134374 -> 6132119 (-0.04%); split: -0.04%, +0.00%
SClause: 1879 -> 1871 (-0.43%)
Copies: 36316 -> 36219 (-0.27%); split: -0.37%, +0.10%
Branches: 4154 -> 4127 (-0.65%); split: -0.67%, +0.02%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9763>
2021-03-25 15:12:19 +00:00
Daniel Schürmann
7c64623e94 aco/ra: refactor SSA repairing during register allocation
The previous approach attempted to construct phi nodes
on-demand and on-the-fly. Due to several bugs, it became
necessary to always create incomplete phis for all live-in
variables on loop headers, which is highly inefficient.

The new approach assumes that live-in variables on loop-
headers don't get renamed, and afterwards does one renaming
pass per loop nest. This greatly simplifies the code and
reduces the memory footprint.

Totals from 37 (0.03% of 136546) affected shaders (Navi10):
CodeSize: 588148 -> 588020 (-0.02%); split: -0.03%, +0.01%
Instrs: 111793 -> 111761 (-0.03%); split: -0.04%, +0.01%
Latency: 4546013 -> 4545611 (-0.01%); split: -0.02%, +0.01%
InvThroughput: 2806217 -> 2805730 (-0.02%); split: -0.03%, +0.01%
VClause: 2044 -> 2046 (+0.10%)
SClause: 3889 -> 3884 (-0.13%)
Copies: 17730 -> 17700 (-0.17%); split: -0.23%, +0.06%
Branches: 3282 -> 3280 (-0.06%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9763>
2021-03-25 15:12:19 +00:00
Daniel Schürmann
3ea2c05b32 aco/ra: split register_file initialization into separate function
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9763>
2021-03-25 15:12:19 +00:00
Daniel Schürmann
e4902d4574 aco/ra: split affinity creation into separate function
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9763>
2021-03-25 15:12:19 +00:00
Erik Faye-Lund
cf4ed05417 zink: follow spir-v 1.0 spec
OpEntryPoint is only supposed to include the Input and Output storage
classes prior to SPIR-V 1.4.

Since we're always emitting SPIR-V 1.0, we should simply omit these from
OpEntryPoint for now. If we start emitting SPIR-V 1.4 or later, we
should add these back in.

Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9828>
2021-03-25 14:29:48 +00:00
Erico Nunes
ab71c0ba44 lima/ppir: rework liveness data structures to bitset
The liveness code in ppir can be a bottleneck in complicated shaders
with multiple spills, and the use of the mesa set data structure is one
of the main reasons it is expensive to run.
With some changes, it can be adapted to using bitsets which makes it run
substantially faster.
ppir liveness can't run with a regular bitset for registers since we
need to track inviditual component masks for non-ssa registers, but we
can switch to using a separate packed bit array just for the masks,
rather than a full blown hash set. This also makes operations such as
liveness propagation much more straightforward.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9745>
2021-03-25 14:17:26 +00:00
Erico Nunes
2b83d99538 lima/ppir: remove use of live_out
The live_out set was serving mostly as a temporary storage of data
propagated from the next instructions.
If we do an early copy to preserve the instruction live_in set and
propagate the liveness information directly to the live_in set, we can
skip having a live_out set entirely.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9745>
2021-03-25 14:17:26 +00:00
Erico Nunes
55b1dbb1c1 lima/ppir: remove liveness info from blocks
liveness information was stored in blocks so that the algorithm doesn't
have to find a valid instruction to inherit liveness information from.
It turns out this part of the code can be a performance bottleneck in
applications loading with complicated shaders, so skip the liveness
information in blocks to reduce the number of times we have to copy
liveness information.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9745>
2021-03-25 14:17:26 +00:00
Danylo Piliaiev
56909868cd turnip: implement VK_KHR_pipeline_executable_properties
Loosely based on ANV implementation.

For executable's internal representation we output:
- Initial NIR after spirv_to_nir
- Final optimized NIR
- IR3 disassembly

Note, that vkGetPipelineExecutablePropertiesKHR is required to
return executable properties even if pipeline was not created with
CAPTURE_STATISTICS or CAPTURE_INTERNAL_REPRESENTATIONS bits set.
So the executables array is unconditionally populated, however
NIR and IR3 disassemlies are filled only when
CAPTURE_INTERNAL_REPRESENTATIONS is set.

Passes dEQP-VK.pipeline.executable_properties.*
Works with RenderDoc.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8877>
2021-03-25 13:53:33 +00:00
Danylo Piliaiev
2bff8fd53b nir: add nir_shader_as_str function
It would be later used by Turnip in implementation of
VK_KHR_pipeline_executable_properties.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8877>
2021-03-25 13:53:33 +00:00
Icecream95
616394cf31 pan/mdg: Use appropriate sizes for global loads/stores
When always using 128-bit operations, an access at the end of an SSBO
could spill over to the next page and cause a GPU fault. To avoid
this, pick the smallest instruction that fits.

The if ladder might need extending for sizes of less than 32 bits, but
those currently fail in other parts of the compiler so are untested.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9747>
2021-03-25 13:23:47 +00:00
Icecream95
2f78ed8f0a pan/mdg: Rename load/store operations
Use the actual bit sizes the instructions deal with.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9747>
2021-03-25 13:23:47 +00:00
Rhys Perry
b7fb5450ef ac: invalidate metadata after hs_emit_write_tess_factors()
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9804>
2021-03-25 10:46:21 +00:00
Iago Toral Quiroga
22a979be65 broadcom/compiler: convert add to mul when possible to allow merge
Integer add/sub can be implemented as either an add or a mul instruction
but we always emit them as add instructions at VIR level. We can use this
flexibility to improve our QPU scheduling so we can be more effective
at instruction merging by converting these to mul instructions when we
are attempting to merge them with another add instruction.

total instructions in shared programs: 13721549 -> 13691004 (-0.22%)
instructions in affected programs: 3340493 -> 3309948 (-0.91%)
helped: 12805
HURT: 1656
Instructions are helped.

total max-temps in shared programs: 2319528 -> 2319317 (<.01%)
max-temps in affected programs: 5285 -> 5074 (-3.99%)
helped: 195
HURT: 3
Max-temps are helped.

total sfu-stalls in shared programs: 31616 -> 31752 (0.43%)
sfu-stalls in affected programs: 469 -> 605 (29.00%)
helped: 52
HURT: 161
Sfu-stalls are HURT.

total inst-and-stalls in shared programs: 13753165 -> 13722756 (-0.22%)
inst-and-stalls in affected programs: 3340383 -> 3309974 (-0.91%)
helped: 12782
HURT: 1666
Inst-and-stalls are helped.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9769>
2021-03-25 09:51:42 +00:00