third_party_mesa3d

mirror of https://gitee.com/openharmony/third_party_mesa3d synced 2025-02-16 16:10:58 +00:00

Author	SHA1	Message	Date
Brian Paul	55417140cd	svga: initialize a variable to silence a gcc warning Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-08-17 12:53:20 -06:00
Ian Romanick	607ab6d3bf	glsl: Pull enum ir_expression_operation out to its own file No change except to the copyright symbol. The next patch will generate this file with Python, and Unicode + Python = pure rage. v2: Massive rebase... I guess a lot can change in a year. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-17 13:48:25 +01:00
Ian Romanick	de71bc9eb6	glsl: Make the generated sources build rules more like NIR Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-17 13:48:25 +01:00
Francesco Ansanelli	120c9c6380	mesa/st: use llabs instead of abs for long args (v2) v2: long has 32bit on Windows (Marek) Signed-off-by: Francesco Ansanelli <francians@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-08-17 14:16:29 +02:00
Marek Olšák	57a8991020	radeonsi: fix up buffer descriptor upper-bound checking st/mesa does this too, so we're safe. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-17 14:15:33 +02:00
Marek Olšák	325379096f	gallium: change pipe_image_view::first_element/last_element -> offset/size This is required by OpenGL. Our hardware supports this. Example: Bind RGBA32F with offset = 4 bytes. Acked-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-17 14:15:33 +02:00
Marek Olšák	7cd256ce7e	gallium: change pipe_sampler_view::first_element/last_element -> offset/size This is required by OpenGL. Our hardware supports this. Example: Bind RGBA32F with offset = 4 bytes. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97305 Acked-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-17 14:15:33 +02:00
Marek Olšák	1ac23a9359	gallium/radeon: assign the highest priority to scratch; make rings second just FYI, the kernel receives priority/4 Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-17 14:15:29 +02:00
Marek Olšák	9009516501	gallium/winsys: re-number winsys priority flags free 60..63, move CP_DMA up Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-17 12:24:35 +02:00
Marek Olšák	95020c6dfd	gallium/radeon: mark shader rings as highest-priority buffers and rename the enum Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-17 12:24:35 +02:00
Marek Olšák	e2bb24f213	gallium/radeon: set SHADER_RW_BUFFER priority for streamout buffers Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-17 12:24:35 +02:00
Marek Olšák	a6b5845a0d	radeonsi: use current context for DCC feedback-loop decompress, fixes Elemental This is just a workaround. The problem is described in the code. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96541 v2: say that it's only between the current context and aux_context Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)	2016-08-17 12:24:35 +02:00
Marek Olšák	9812a50ae6	radeonsi: simplify CB_TARGET_MASK logic we can now rely on CB_COLORn_INFO to disable empty slots. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-17 12:24:35 +02:00
Marek Olšák	2d2b384066	radeonsi: don't set CB_COLOR1_INFO for dual src blending Vulkan doesn't do this. The reason may be that CB_COLOR1_INFO.SOURCE_FORMAT from NI was moved to SPI_SHADER_COL_FORMAT for SI. I asked CB guys about this 2 days ago and they still haven't replied. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-17 12:24:35 +02:00
Marek Olšák	e722b90bc9	radeonsi: eliminate PS OUT[1] if dual src blending is off and CB1 is not bound All VP DX9 ports benefit from this. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-17 12:24:35 +02:00
Marek Olšák	3de8ffe836	gallium/radeon: use unflushed fences for PIPE_QUERY_GPU_FINISHED Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-17 12:24:35 +02:00
Nicolai Hähnle	c5798d6314	gallium/radeon: use lp_build_alloca_undef Avoid building all those store 0 / store undef instruction pairs that end up getting removed anyway. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-17 12:11:25 +02:00
Nicolai Hähnle	41001ca4bd	gallivm: add lp_build_alloca_undef Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-17 12:11:24 +02:00
Nicolai Hähnle	17e88e276c	gallivm: add create_builder_at_entry helper function Reduces code duplication. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-17 12:11:24 +02:00
Nicolai Hähnle	f4204ba53d	gallium/radeon: protect against out of bounds temporary array accesses They can lead to VM faults and worse, which goes against the GL robustness promises. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-17 12:11:24 +02:00
Nicolai Hähnle	ea283779be	gallium/radeon: add radeon_llvm_bound_index for bounds checking Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-17 12:11:24 +02:00
Nicolai Hähnle	8916d1e2fa	gallium/radeon: reduce alloca of temporaries based on usagemask v2: take actual writemasks into account Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-17 12:11:24 +02:00
Nicolai Hähnle	6bba956073	gallium/radeon: use tgsi_scan_arrays for temp arrays Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-17 12:11:23 +02:00
Nicolai Hähnle	7c2295d7ef	gallium/radeon: allocate temps array info in radeon_llvm_context_init Also, prepare for using tgsi_array_info. This also opens the door for properly handling allocation failures, but I'm leaving that for a separate change. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-17 12:11:23 +02:00
Nicolai Hähnle	850c8dcc9c	gallium/radeon: always do the full store in store_value_to_array Doing the write-back of the temporary vector in radeon_llvm_emit_store makes no sense. This also allows us to get rid of get_alloca_for_array. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-17 12:11:23 +02:00
Nicolai Hähnle	4b150931c9	gallium/radeon: extract common getelementptr logic into get_pointer_into_array Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-17 12:11:23 +02:00
Nicolai Hähnle	dfbb8ea284	gallium/radeon: pass indirect register info into get_alloca_for_array To have the same signature as get_array_range. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-17 12:11:23 +02:00
Nicolai Hähnle	b76aabffa2	gallium/radeon: extract common lookup code into get_temp_array function Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-17 12:11:23 +02:00
Nicolai Hähnle	fa84296a5a	gallium/radeon: clarify the comment on the array alloca heuristic Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-17 12:11:22 +02:00
Nicolai Hähnle	92b66b38c9	gallium/radeon: more descriptive names for LLVM temporaries in debug builds Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-17 12:11:22 +02:00
Nicolai Hähnle	eacfc86d83	gallium/radeon: simplify radeon_llvm_emit_store for direct array addressing We can use the pointer stored in the temps array directly. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-17 12:11:22 +02:00
Nicolai Hähnle	87fa7cea23	gallium/radeon: simplify radeon_llvm_emit_fetch for direct array addressing We can use the pointer stored in the temps array directly. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-17 12:11:22 +02:00
Nicolai Hähnle	eb50cbf3bd	gallium/radeon: clean up emit_declaration for temporaries In the alloca'd array case, no longer create redundant and unused allocas for the individual elements; create getelementptrs instead. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-17 12:11:22 +02:00
Nicolai Hähnle	cb9ed66cc5	st_glsl_to_tgsi: use calloc the way it's meant to be used Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-17 12:11:22 +02:00
Nicolai Hähnle	67c0f077a2	tgsi/scan: add tgsi_scan_arrays Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-17 12:11:21 +02:00
Ian Romanick	2ec3a3e151	glsl: Add missing ir_quadop_vector constant evaluation for Boolean types Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-17 10:52:39 +01:00
Ian Romanick	cf58e3f522	glsl: Fix typo in ir_unop_f2u implementation This won't affect the output, but it was, technically, wrong. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-17 10:52:39 +01:00
Ian Romanick	8b123b08cb	glsl: Fix typo in ir_unop_b2i implementation This won't affect the output, but it was, technically, wrong. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-17 10:52:39 +01:00
Ian Romanick	cd8764737e	glsl: Don't support integer types for operations that can't handle them ir_unop_fract already forbade integer types in ir_validate. ir_unop_rcp, ir_unop_rsq, and ir_unop_sqrt should also forbid them in ir_validate. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-17 10:52:39 +01:00
Ian Romanick	437e612bd7	glsl: Don't support ir_unop_abs or ir_unop_sign for unsigned integers Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-08-17 10:52:39 +01:00
Ian Romanick	cceb50e14e	nir/algebraic: Optimize common array indexing sequence Some shaders include code that looks like: uniform int i; uniform vec4 bones[...]; foo(bones[i * 3], bones[i * 3 + 1], bones[i * 3 + 2]); CSE would do some work on this: x = i * 3 foo(bones[x], bones[x + 1], bones[x + 2]); The compiler may then add '<< 4 + base' to the index calculations. This results in expressions like x = i * 3 foo(bones[x << 4], bones[(x + 1) << 4], bones[(x + 2) << 4]); Just rearranging the math to produce (i * 48) + 16 saves an instruction, and it allows CSE to do more work. x = i * 48; foo(bones[x], bones[x + 16], bones[x + 32]); So, ~6 instructions becomes ~3. Some individual shader-db results look pretty bad. However, I have a really, really hard time believing the change in estimated cycles in, for example, 3dmmes-taiji/51.shader_test after looking that change in the generated code. G45 total instructions in shared programs: 4020840 -> 4010070 (-0.27%) instructions in affected programs: 177460 -> 166690 (-6.07%) helped: 894 HURT: 0 total cycles in shared programs: 98829000 -> 98784990 (-0.04%) cycles in affected programs: 3936648 -> 3892638 (-1.12%) helped: 894 HURT: 0 Ironlake total instructions in shared programs: 6418887 -> 6408117 (-0.17%) instructions in affected programs: 177460 -> 166690 (-6.07%) helped: 894 HURT: 0 total cycles in shared programs: 143504542 -> 143460532 (-0.03%) cycles in affected programs: 3936648 -> 3892638 (-1.12%) helped: 894 HURT: 0 Sandy Bridge total instructions in shared programs: 8357887 -> 8339251 (-0.22%) instructions in affected programs: 432715 -> 414079 (-4.31%) helped: 2795 HURT: 0 total cycles in shared programs: 118284184 -> 118207412 (-0.06%) cycles in affected programs: 6114626 -> 6037854 (-1.26%) helped: 2478 HURT: 317 Ivy Bridge total instructions in shared programs: 7669390 -> 7653822 (-0.20%) instructions in affected programs: 388234 -> 372666 (-4.01%) helped: 2795 HURT: 0 total cycles in shared programs: 68381982 -> 68263684 (-0.17%) cycles in affected programs: 1972658 -> 1854360 (-6.00%) helped: 2458 HURT: 307 Haswell total instructions in shared programs: 7082636 -> 7067068 (-0.22%) instructions in affected programs: 388234 -> 372666 (-4.01%) helped: 2795 HURT: 0 total cycles in shared programs: 68282020 -> 68164158 (-0.17%) cycles in affected programs: 1891820 -> 1773958 (-6.23%) helped: 2459 HURT: 261 Broadwell total instructions in shared programs: 9002466 -> 8985875 (-0.18%) instructions in affected programs: 658784 -> 642193 (-2.52%) helped: 2795 HURT: 5 total cycles in shared programs: 78503092 -> 78450404 (-0.07%) cycles in affected programs: 2873304 -> 2820616 (-1.83%) helped: 2275 HURT: 415 Skylake total instructions in shared programs: 9156978 -> 9140387 (-0.18%) instructions in affected programs: 682625 -> 666034 (-2.43%) helped: 2795 HURT: 5 total cycles in shared programs: 75591392 -> 75550574 (-0.05%) cycles in affected programs: 3192120 -> 3151302 (-1.28%) helped: 2271 HURT: 425 Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-08-17 10:52:38 +01:00
Michel Dänzer	4ac640e3d2	glx: Don't use current context in __glXSendError There's no guarantee that there is one, and we don't need one anyway. Fixes piglit tests: glx@glx-fbconfig-bad glx@glx_ext_import_context@import context, multi process glx@glx_ext_import_context@import context, single process Fixes: 2e3f067458e4 ("glx: fix error code when there is no context bound") Cc: "11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2016-08-17 17:16:34 +09:00
Ilia Mirkin	e988999791	nv50/ir: fix bb positions after exit instructions It's fairly rare that the BB layout puts BBs after the exit block, which is likely the reason these issues lingered for so long. This fixes a fraction of issues with the giant pixmark piano shader. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: <mesa-stable@lists.freedesktop.org>	2016-08-16 21:56:16 -04:00
Ilia Mirkin	0b5f40b881	nv50/ir: properly clear upper bits of a bitset fill Found by inspection. In practice, val is always == 0, so this never got triggered. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-08-16 21:56:16 -04:00
Francisco Jerez	4d436c011f	i965/fs: Estimate maximum sampler message execution size more accurately. The current logic used to determine the execution size of sampler messages was based on special-casing several argument and opcode combinations, which unsurprisingly missed the possibility that some messages could exceed the payload size limit or not depending on the number of coordinate components present. In particular: - The TXL, TXB and TEX messages (the latter on non-FS stages only) would attempt to use SIMD16 on Gen7+ hardware even if a shadow reference was present and the texture was a cubemap array, causing it to overflow the maximum supported sampler payload size and crash. - The TG4_OFFSET message with shadow comparison was falling back to SIMD8 regardless of the number of coordinate components, which is unnecessary when two coordinates or less are present. Both cases have been handled incorrectly ever since cubemap arrays and texture gather were respectively enabled (the current logic used by the SIMD lowering pass is almost unchanged from the previous no16 fall-back logic used pre-SIMD lowering times). Fixes the following GL4.5 conformance test on Gen7-8 (the bug also affects Gen9+ in principle, but SKL passes the test by luck because it manages to use the TXL_LZ message instead of TXL): GL45-CTS.texture_cube_map_array.sampling Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97267 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-16 16:31:59 -07:00
Francisco Jerez	61a02fb74c	i965/fs: Return zero from fs_inst::components_read for non-present sources. This makes it easier for the caller to find out how many scalar components are actually read by the instruction. As a bonus we no longer need to special-case BAD_FILE in the implementation of fs_inst::regs_read. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-16 16:31:59 -07:00
Francisco Jerez	0c754d1c42	i965/fs: Lower TEX to TXL during NIR translation. This simplifies the code slightly and will allow the SIMD lowering pass to find out easily what the actual texturing opcode is in order to determine the maximum execution size of texturing instructions. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-08-16 16:31:59 -07:00
Rob Clark	5def00875d	freedreno/a3xx: fix generic clear path Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-08-16 19:26:03 -04:00
Brian Paul	df2dcf6200	st/mesa: use pipe var instead of st->pipe in st_create_context_priv() As is done in most other places in the function. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-16 08:28:33 -06:00
Brian Paul	038b1b11fe	gallium: remove unused u_clear.h file Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-16 08:28:33 -06:00
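
Note on cceb50e14e (nir/algebraic: Optimize common array indexing sequence): the rewrite described in that message rests on the identity (i * 3 + k) << 4 == i * 48 + k * 16. The following is a minimal sketch that only checks this arithmetic. It is not the rule added to nir_opt_algebraic, and the names index_before, index_after and identity_holds are hypothetical, chosen for illustration. The 32-bit mask is an assumption made here to model wrapping integer arithmetic.

```python
MASK32 = 0xFFFFFFFF  # assumption: model 32-bit wrapping integer math


def index_before(i, k):
    """Offset as left after CSE: x = i * 3, then (x + k) << 4."""
    x = i * 3
    return ((x + k) << 4) & MASK32


def index_after(i, k):
    """Rearranged offset: x = i * 48, then x + k * 16."""
    x = i * 48
    return (x + k * 16) & MASK32


def identity_holds(limit=1000):
    """Compare both forms for k = 0, 1, 2 over a range of i values."""
    return all(
        index_before(i, k) == index_after(i, k)
        for i in range(limit)
        for k in range(3)
    )


if __name__ == "__main__":
    print(identity_holds())
```

In the rearranged form the multiply by 48 is shared by all three accesses and the per-element part is a constant add, which is consistent with the message's estimate of ~6 instructions becoming ~3.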


84144 Commits