Commit Graph

109435 Commits

Author SHA1 Message Date
Benjamin Gordon
b30aad552c configure.ac/meson.build: Add options for library suffixes
When building the Chrome OS Android container, we need to build copies
of mesa that don't conflict with the Android system-supplied libraries.
This adds options to create suffixed versions of EGL and GLES libraries:

libEGL.so -> libEGL${egl-lib-suffix}.so
libGLESv1_CM.so -> libGLESv1_CM${gles-lib-suffix}.so
libGLESv2.so -> libGLES${gles-lib-suffix}.so

This is similar to what happens when --enable-libglvnd is specified, but
without the side effects of linking against libglvnd.  To avoid
unexpected clashes with the suffixed appended by libglvnd, make it an
error to specify both --enable-libglvnd and --with-egl-lib-suffix.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-03-21 10:18:31 -07:00
Kenneth Graunke
e426c3a6cb nir: Record non-vector/scalar varyings as unmovable when compacting
In some cases, we can end up with varying structs that aren't split to
their member variables.  nir_compact_varyings attempted to record these
as unmovable, so it would leave them be.  Unfortunately, it didn't do
it right for non-vector/scalar types.  It set the mask to:

   ((1 << (elements * dmul)) - 1) << var->data.location_frac

where elements is the number of vector elements.  For structures and
other non-vector/scalars, elements is 0...so the whole mask became 0.

This caused nir_compact_varyings to assign other varyings on top of
the structure varying's location (as it appeared to take up no space).

To combat this, we just set elements to 4 for non-vector/scalar types,
so that the entire slot gets marked as unmovable.

Fixes KHR-GL45.tessellation_shader.tessellation_control_to_tessellation_evaluation.gl_in on iris.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-03-21 16:03:58 +00:00
Rob Clark
6e781a01b9 freedreno/ir3: dynamic UBO indexing vs 64b pointers
Fixes dEQP-GLES31.functional.shaders.opaque_type_indexing.ubo.uniform_fragment
and similar things with multiple UBOs

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-03-21 09:13:05 -04:00
Rob Clark
2e01c534f4 freedreno/ir3: fix bit_count
Seems like it can only work 16b at a time.  Fixes
dEQP-GLES31.functional.shaders.builtin_functions.integer.bitcount.*

TODO need to check if this limitation applies to a3xx as well.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-03-21 09:13:05 -04:00
Rob Clark
3d8349048b freedreno/ir3: additional lowering
For some things that show up when we expose higher glsl

TODO check blob traces to see if we have instructions for some of this?
I guess we don't but worth a check..

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-03-21 09:13:05 -04:00
Rob Clark
bcd81d2387 freedreno/ir3: optimize sam.s2en to sam
Detect when sampler/texture idx are immediate and switch to non s2en
encoding.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-03-21 09:13:05 -04:00
Rob Clark
1443694ee5 freedreno/ir3: enable indirect tex/samp (sam.s2en)
For now it uses indirect for everything.  The next step is for the
ir3_cp pass to detect the case that tex and samp idx are immediate
and convert the sam instruction back to the non .s2en variant.  But
doing that in a following patch so we can shake out the bugs with
.s2en more easily.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-03-21 09:13:05 -04:00
Rob Clark
1088b788d8 freedreno/ir3: find # of samplers from uniform vars
When we have indirect samplers, we cannot tell the max sampler
referenced.  Instead just refer to the number of sampler uniforms.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-03-21 09:13:05 -04:00
Rob Clark
d4cbc94685 nir: move gls_type_get_{sampler,image}_count()
I need at least the sampler variant in ir3..

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-03-21 09:13:05 -04:00
Rob Clark
8eb16ae8bf freedreno/ir3: fix regmask for merged regs
On a6xx+ with half-regs conflicting with full-regs, the legalize pass
needs to set appropriate sync bits, such as (sy), on writes to full regs
that conflict with half regs, and visa-versa.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-03-21 09:13:05 -04:00
Rob Clark
1dffb089f9 freedreno/ir3: fix sam.s2en encoding
Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-03-21 09:13:05 -04:00
Rob Clark
45b7a581b4 freedreno/ir3: fix sam.s2en decoding
Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-03-21 09:13:05 -04:00
Rob Clark
2d31cf9d3b freedreno/ir3/ra: fix half-class conflicts
On a6xx, half-regs conflict with full-regs.  But we were only setting up
conflicts for the first class (ie. scalar, but not hvec2/hvec3/hvec4),
resulting in higher half-reg classes getting assigned to regs that
overwrite full-regs.

Noticed while trying to enable indirect-sampler (sam.s2en) which uses an
hvec2 argument to pass the sampler/tex index.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-03-21 09:13:05 -04:00
Rob Clark
cc5ca9391c freedreno/ir3 better cat6 encoding detection
These two bits seem to be a better way to detect which encoding we are
looking at.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-03-21 09:13:05 -04:00
Samuel Pitoiset
00327f827f ac: fix incorrect argument type for tbuffer.{load,store} with LLVM 7
GLC/SLC are boolean.

This fixes the following LLVM error when checkir is set:
Intrinsic has incorrect argument type!
void (i32, <4 x i32>, i32, i32, i32, i32, i32, i32, i32, i32)* @llvm.amdgcn.tbuffer.store.i32

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl
2019-03-21 14:02:00 +01:00
Samuel Pitoiset
20cac1f498 ac: fix 16-bit shifts
This fixes the following LLVM error when ckeckir is set:
Type too small for ZExt

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl
2019-03-21 14:01:58 +01:00
Samuel Pitoiset
2ac5c5c1b5 ac: add 16-bit support to fract
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-21 12:13:09 +01:00
Samuel Pitoiset
0eb1478ac2 ac: add 16-bit support fo fsign
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-21 12:13:07 +01:00
Samuel Pitoiset
ff11c9dcc7 ac: add f16_0 and f16_1 constants
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-21 12:13:05 +01:00
Timothy Arceri
427a6fee43 nir: only override previous alu during loop analysis if supported
Users of this function expect alu to be a supported comparision
if the induction variable is not NULL. Since we attempt to
override the return values if the first limit is not a const, we
must make sure we are dealing with a valid comparision before
overriding the alu instruction.

Fixes an unreachable in inverse_comparison() with the game
Assasins Creed Odyssey.

Fixes: 3235a942c1 ("nir: find induction/limit vars in iand instructions")

Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110216
2019-03-21 21:51:21 +11:00
Michel Dänzer
6d0a7f798c gitlab-ci: Use 8 CPU cores in autotools job
This cuts down the job runtime from ~9.5 to ~7 minutes with my personal
runner on an 8-core Ryzen 7 1700.

While this might result in slightly higher load on shared runners, it
should be OK, since libtool doesn't use the CPU cores as effectively as
e.g. ninja does; a significant part of the CPU load tends to be in bash
processes at any time, which should be relatively light on memory.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-03-21 09:58:31 +01:00
Michel Dänzer
a2cce701e6 gitlab-ci: List some longer-running jobs before others of the same stage
This increases the chance of them running earlier, which can have an
impact on the total duration of the pipeline.

v2:
* Minor style fix-up to moved comment (Eric Anholt)

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Eric Anholt <eric@anholt.net>
2019-03-21 09:55:08 +01:00
Samuel Pitoiset
db07f0554a radv: add missing initializations since VK_EXT_pipeline_creation_feedback
This fixes the world.

Fixes: 5f5ac19f13 ("radv: Implement VK_EXT_pipeline_creation_feedback.")"
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-21 09:42:31 +01:00
Rhys Perry
037f11d42e radv: enable VK_KHR_8bit_storage
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-21 09:02:27 +01:00
Rhys Perry
3cc72a88d8 ac/nir: implement 8-bit conversions
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-21 09:02:25 +01:00
Rhys Perry
c73f8b6576 ac/nir: add 8-bit types to glsl_base_to_llvm_type
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-21 09:02:22 +01:00
Rhys Perry
9c5067acf1 ac/nir: implement 8-bit ssbo stores
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-21 09:02:20 +01:00
Samuel Pitoiset
b235d77e18 ac: add ac_build_tbuffer_store_byte() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-21 09:02:18 +01:00
Rhys Perry
b12e074b89 ac/nir: implement 8-bit push constant, ssbo and ubo loads
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-21 09:02:16 +01:00
Samuel Pitoiset
104dbc64a5 ac: add ac_build_tbuffer_load_byte() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-21 09:02:14 +01:00
Samuel Pitoiset
6e632eb24b ac: add various int8 definitions
Original patch by Rhys Perry.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-21 09:02:10 +01:00
Tapani Pälli
4e1bbb000c anv/radv: release memory allocated by glsl types during spirv_to_nir
Fixes leaks for each glsl_type generated:

   ==32470== 384 bytes in 3 blocks are possibly lost in loss record 18 of 18
   ==32470==    at 0x483880B: malloc (vg_replace_malloc.c:309)
   ==32470==    by 0x4C43F4A: ralloc_size (ralloc.c:119)
   ==32470==    by 0x4C44014: rzalloc_size (ralloc.c:151)
   ==32470==    by 0x4C44258: rzalloc_array_size (ralloc.c:215)
   ==32470==    by 0x4D38957: glsl_type::glsl_type(glsl_struct_field const*, unsigned int, char const*) (glsl_types.cpp:114)
   ==32470==    by 0x4D3BEED: glsl_type::get_struct_instance(glsl_struct_field const*, unsigned int, char const*) (glsl_types.cpp:1146)
   ==32470==    by 0x4D42ECC: glsl_struct_type (nir_types.cpp:501)
   ==32470==    by 0x4CDB5A1: vtn_handle_type (spirv_to_nir.c:1269)
   ==32470==    by 0x4CE53DD: vtn_handle_variable_or_type_instruction (spirv_to_nir.c:4018)
   ==32470==    by 0x4CD8CFF: vtn_foreach_instruction (spirv_to_nir.c:365)
   ==32470==    by 0x4CE5E6B: spirv_to_nir (spirv_to_nir.c:4490)
   ==32470==    by 0x497AF10: anv_shader_compile_to_nir (anv_pipeline.c:173)

v2: move release call to vkDestroyInstance
v3: apply fix also to radv driver

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-03-21 08:30:22 +02:00
Jason Ekstrand
6e19348ad1 spirv: Drop inline tg4 lowering
Reviewed-by: Karol Herbst <kherbst@redhat.com>
2019-03-21 02:58:41 +00:00
Jason Ekstrand
08f804ec0c anv,radv,turnip: Lower TG4 offsets with nir_lower_tex
v2: turn on for turnip as well (Karol Herbst)

Reviewed-by: Karol Herbst <kherbst@redhat.com>
2019-03-21 02:58:41 +00:00
Karol Herbst
d8a0658d8b nir/lower_tex: Add support for tg4 offsets lowering
Signed-off-by: Karol Herbst <kherbst@redhat.com>
2019-03-21 02:58:41 +00:00
Karol Herbst
99f202432b nv50/ir/nir: support gather offsets
v2: only emit offsets if those are !0

Signed-off-by: Karol Herbst <kherbst@redhat.com>
2019-03-21 02:58:41 +00:00
Karol Herbst
71c66c254b nir: add support for gather offsets
Values inside the offsets parameter of textureGatherOffsets are required to be
constants in the range of [GL_MIN_PROGRAM_TEXTURE_GATHER_OFFSET,
GL_MAX_PROGRAM_TEXTURE_GATHER_OFFSET].

As this range is never outside [-32, 31] for all existing drivers inside mesa,
we can simply store the offsets as a int8_t[4][2] array inside nir_tex_instr.

Right now only Nvidia hardware supports this in hardware, so we can turn this
on inside Nouveau for the NIR path as it is already enabled with the TGSI one.

v2: use memcpy instead of for loops
    add missing bits to nir_instr_set
    don't show offsets if they are all 0
v3: default offsets aren't all 0
v4: rename offsets -> tg4_offsets
    rename nir_tex_instr_has_explicit_offsets -> nir_tex_instr_has_explicit_tg4_offsets

Signed-off-by: Karol Herbst <kherbst@redhat.com>
2019-03-21 02:58:41 +00:00
Dave Airlie
b95b33a5c7 nir/deref: remove casts of casts which are likely redundant (v3)
Not sure how ptr_stride should be taken into account if at all here

v2: reorder check to avoid src walking (Jason)
v3: remove is_cast_cast checks, keep going afterwards (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-03-21 10:58:06 +10:00
Dave Airlie
3b3653c4cf nir/spirv: don't use bare types, remove assert in split vars for testing
For OpenCL we never want to strip the info from the types, and it makes
type comparisons easier in later stages. We might later need a nir pass to
strip this for GLSL, but so far the only regression is the assert and Jason
said removing that is fine.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2019-03-21 10:25:40 +10:00
Rafael Antognolli
e7c8402163 iris: Let blorp update the clear color for us.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-20 16:46:26 -07:00
Rafael Antognolli
93123417dd iris: Track fast clear color.
v2: Update tracked clear color when we update the surface state.
v3: Update all aux surface states when updating the clear color.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-20 16:46:26 -07:00
Rafael Antognolli
5658c661de iris: Stall on the CPU and resolve predication during fast clears.
Only if the clear color/depth is changing. In those cases, it's hard to
keep track of the current clear color, and aux state of some layers,
when predication is enabled. So simplify everything by stalling on the
few cases where we would have a fast clear color change with
predication.

v2:
 - fix comment (Ken)
 - explicitly check for predicate state after resolving it (Ken)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-20 16:46:26 -07:00
Rafael Antognolli
ce830a364e iris: Add iris_resolve_conditional_render().
This function can be used to stall on the CPU and resolve the predicate
for the conditional render. It will convert ice->state.predicate from
IRIS_PREDICATE_STATE_USE_BIT to either IRIS_PREDICATE_STATE_RENDER or
IRIS_PREDICATE_STATE_DONT_RENDER, depending on the result of the query.

v2:
 - return void (Ken)
 - update the stored condition (Ken)
 - simplify the code leading to resolve the predicate (Ken)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-20 16:46:25 -07:00
Rafael Antognolli
131b42f0aa iris: Implement fast clear color.
If all the restrictions are satisfied, do a fast clear instead of
regular clear.

v2:
 - add perf_debug() when we can't fast clear (Ken)
 - improve comment: s/miptree/resource/ (Ken)
 - use swizzle_color_value from blorp (Ken)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-20 16:46:25 -07:00
Rafael Antognolli
bd6f51ec21 intel/blorp: Make swizzle_color_value public.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-20 16:46:25 -07:00
Rafael Antognolli
d97eddff25 intel/isl: Add isl_format_has_color_component() function.
v2: Get luminance bits from luminance component (Ken).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-20 16:46:25 -07:00
Rafael Antognolli
7f6344a726 iris: Bring back check for srgb and fast clear color.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-20 16:46:25 -07:00
Rafael Antognolli
a8b5ea8ef0 iris: Add function to update clear color in surface state.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-20 16:46:25 -07:00
Rafael Antognolli
32c8fa6411 iris: Add helper to convert fast clear color.
It needs to be converted to a value that can be used by ISL (and our
hardware SURFACE_STATE structure).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-20 16:46:25 -07:00
Rafael Antognolli
51638cf18a iris: Fast clear depth buffers.
Check and do a fast clear instead of a regular clear on depth buffers.

v3:
 - remove swith with some cases that we shouldn't wory about (Ken)
 - more parens into the has_hiz check (Ken)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-20 16:46:25 -07:00