When adding planes to the plane splitter, we supply a world clip
rect to the polygon clipper. Generally this is used to help with
float accuracy issues, but it also clips polygons to the visible
region.
The previous code supplied the visible world rect, but this is
not always correct. When drawing picture cache tiles, we may
be rendering to a tile that is partially off-screen. In this case
we need to pass the combined world dirty rect, which is inflated
to include the off-screen tile parts that are being drawn. This
ensures that preserve-3d items are correctly clipped to the tile
boundaries rather than the currently visible screen rect.
Differential Revision: https://phabricator.services.mozilla.com/D41111
--HG--
extra : moz-landing-system : lando
Have the texture cache manage potentially multiple array textures of each type.
Differential Revision: https://phabricator.services.mozilla.com/D39912
--HG--
extra : moz-landing-system : lando
Those unwrap_or calls are mostly seen during batching, where we should assume that
the primitives are not clipped out and just unwrap() accordingly.
Differential Revision: https://phabricator.services.mozilla.com/D39940
--HG--
extra : moz-landing-system : lando
Refactors get_clip_result_complex to cover ClipOut cases for rectangles as well
as Clip for non-repeated images.
Differential Revision: https://phabricator.services.mozilla.com/D40094
--HG--
extra : moz-landing-system : lando
Allow the swizzle to be configurable with a texture binding. This is still experimental and needs to be tested well on all platforms.
Basic approach is the following:
- WR device figures out how it can use BGRA and makes the texture cache format configurable at run-time. It tries to ensure that uploads to the shared texture cache pages are done without any driver conversions and without extra memory being allocated.
- it also reports the preferred input format for the images, which may be different from the texture cache format
- if WR texture cache is asked to allocate a shared texture with a different (swizzled) format from the preferred, it associates the cache entry with a swizzle
- the swizzle becomes a part of the `SourceTexture`, which affects batch splitting
- when a texture reaches binding by GL device, it checks whether the current swizzle on this texture matches the given one and, if not, configures the texture sampling accordingly
- we can't use blits with swizzling, so when that needs to happen we use the `cs_copy` path, which is now mostly rewritten
The idea is that Gecko would ask WR for the preferred format and configure its image decoding to produce image data that doesn't require any swizzling.
The PR changes existing texture upload (and batching) paths. On Linux, if texture storage is available, we now use it and provide the data as RGBA, assuming no conversion by the driver. The swizzling kicks in when we sample this data in the shader. On Windows/Angle we use BGRA as the internal format for the texture cache and expect Gecko to provide BGRA data; this should be unchanged.
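The swizzle selection and the check at bind time can be sketched roughly as follows. This is a minimal illustration, not the actual WR code: `ImageFormat`, `Swizzle`, `swizzle_for`, `Texture` and `sampler_updates` are names assumed for this example only.

```rust
/// Pixel layouts considered in this sketch (assumed names).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ImageFormat {
    Rgba8,
    Bgra8,
}

/// Channel order applied when sampling. `Rgba` is the identity.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Swizzle {
    Rgba,
    Bgra,
}

/// Picks the swizzle for a cache entry: identity when the incoming data
/// already matches the texture cache format, a red/blue swap otherwise.
fn swizzle_for(cache_format: ImageFormat, data_format: ImageFormat) -> Swizzle {
    if cache_format == data_format {
        Swizzle::Rgba
    } else {
        Swizzle::Bgra
    }
}

/// Hypothetical texture that remembers the swizzle it was last bound with.
struct Texture {
    current_swizzle: Swizzle,
    /// Counts how often the sampling state had to be reconfigured.
    sampler_updates: u32,
}

impl Texture {
    /// Reconfigures sampling only when the requested swizzle differs from
    /// the one currently set on the texture.
    fn bind(&mut self, requested: Swizzle) {
        if self.current_swizzle != requested {
            self.current_swizzle = requested;
            self.sampler_updates += 1;
        }
    }
}
```

Binding the same texture twice with the same swizzle touches the sampling state at most once, which is why the swizzle can live on the texture rather than being set on every draw.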
Differential Revision: https://phabricator.services.mozilla.com/D21965
--HG--
extra : moz-landing-system : lando
The glReadPixels call for capturing profiler screenshots is very slow
on Adreno devices. Similarly to bug 1498732, this is because the
stride of the data being transferred is not a multiple of 256, so the
driver is taking the synchronous path instead of reading into a PBO
asynchronously.
This patch solves the problem by increasing the width of the area we read
so that we hit the fast path. To do this we must ensure that the PBO
and the final scale-down texture are large enough to include the extra
pixels in each row. As the required size of the PBO or texture may now
change, for example after a screen rotation, we now handle deleting
and recreating them when necessary.
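As a rough illustration of the width padding, assuming 4 bytes per pixel (RGBA8) and the 256-byte stride requirement mentioned above. The function and constant names are made up for this sketch and are not taken from the patch.

```rust
/// Bytes per pixel for the readback (RGBA8 is an assumption of this sketch).
const BYTES_PER_PIXEL: usize = 4;
/// Row stride alignment, in bytes, needed to hit the asynchronous path.
const STRIDE_ALIGNMENT: usize = 256;

/// Returns the smallest width >= `width` whose row stride in bytes is a
/// multiple of `STRIDE_ALIGNMENT`.
fn padded_read_width(width: usize) -> usize {
    let pixels_per_alignment = STRIDE_ALIGNMENT / BYTES_PER_PIXEL;
    // Round up to the next multiple of `pixels_per_alignment`.
    (width + pixels_per_alignment - 1) / pixels_per_alignment * pixels_per_alignment
}

/// Returns true when an existing PBO or texture must be deleted and
/// recreated because the required size changed (e.g. after a rotation),
/// or when nothing has been allocated yet.
fn needs_recreate(current: Option<(usize, usize)>, required: (usize, usize)) -> bool {
    current != Some(required)
}
```

For example, a 1080 pixel wide read would be widened to 1088 pixels, while a 1024 pixel wide read is already aligned and stays as it is.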
Differential Revision: https://phabricator.services.mozilla.com/D39189
--HG--
extra : moz-landing-system : lando
Looks like a typo that went unnoticed? I wonder how our alpha saved tasks work today :)
Differential Revision: https://phabricator.services.mozilla.com/D39021
--HG--
extra : moz-landing-system : lando
This patch fixes a couple of picture caching issues that could
cause more invalidations than required. Specifically:
* Ensure the viewport rect is included in child surfaces, so
that redundant clips are filtered out correctly.
* Use epsilon comparisons where appropriate for tile descriptor
comparisons, to avoid invalidations due to float inaccuracies.
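A minimal sketch of what an epsilon comparison for tile descriptor rects could look like. The `EPSILON` value, the `Rect` type and the helper names are assumptions for illustration, not the ones used in the patch.

```rust
/// Tolerance for float comparisons (value chosen for this sketch only).
const EPSILON: f32 = 0.001;

/// Returns true when two floats differ by less than `EPSILON`.
fn approx_eq(a: f32, b: f32) -> bool {
    (a - b).abs() < EPSILON
}

/// Simplified rect, standing in for whatever a tile descriptor stores.
#[derive(Debug, Clone, Copy)]
struct Rect {
    x: f32,
    y: f32,
    w: f32,
    h: f32,
}

/// Compares two rects component-wise with a tolerance, so that tiny float
/// differences between frames do not count as a change.
fn rects_approx_eq(a: &Rect, b: &Rect) -> bool {
    approx_eq(a.x, b.x) && approx_eq(a.y, b.y) && approx_eq(a.w, b.w) && approx_eq(a.h, b.h)
}
```

With an exact `==` comparison, a rect that moved by a rounding error would invalidate the tile; with the tolerant comparison it is treated as unchanged.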
Differential Revision: https://phabricator.services.mozilla.com/D38455
--HG--
extra : moz-landing-system : lando
The code to batch preserve-3d elements was incorrectly using the
bounds and visibility mask from the parent element. This could
result in batching bugs in some cases, which were showing up as
draw order issues.
Differential Revision: https://phabricator.services.mozilla.com/D38834
--HG--
extra : moz-landing-system : lando
This patch reverts the previous attempted fix for snapping issues
with picture caching, and implements a better solution.
This fixes the main visual issue by ensuring that any fractional
offset in the root transform is accounted for by:
* Offsetting the tile rects by this amount, so that the content
origin is a whole device pixel.
* Invalidating all tiles if the fractional part of the root
transform changes. This is required since it can affect the
snapping logic that WR applies. Fortunately, this occurs
very rarely - Gecko typically has a constant fractional part
for each page.
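The two steps above can be sketched as follows. This is only an illustration under assumed names (`split_offset`, `FractTracker`); the real change lives in the picture cache code and tracks more state than this.

```rust
/// Splits an offset into its whole-pixel part and a fractional part in [0, 1).
fn split_offset(offset: f32) -> (f32, f32) {
    let whole = offset.floor();
    (whole, offset - whole)
}

/// Hypothetical state remembering the fractional root offset of the last frame.
struct FractTracker {
    last_fract: Option<(f32, f32)>,
}

impl FractTracker {
    fn new() -> Self {
        FractTracker { last_fract: None }
    }

    /// Records the fractional part of the root offset for this frame and
    /// returns true when it differs from the previous frame, i.e. when all
    /// tiles must be invalidated. The first frame always reports a change.
    /// An exact comparison is used here for simplicity.
    fn update(&mut self, root_offset: (f32, f32)) -> bool {
        let fract = (split_offset(root_offset.0).1, split_offset(root_offset.1).1);
        let changed = self.last_fract != Some(fract);
        self.last_fract = Some(fract);
        changed
    }
}
```

Scrolling by whole pixels keeps the fractional part constant, so `update` keeps returning false and the cached tiles stay valid.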
Differential Revision: https://phabricator.services.mozilla.com/D38267
--HG--
extra : moz-landing-system : lando
When rendering text in webrender we rasterize glyphs either in local
space at a chosen raster scale, or in device space taking into
account the text's transform.
We are not able to rasterize glyphs larger than a certain size,
however. So if the device-space font size exceeds this limit, then
currently we force the glyph to be rasterized in local space, at the
untransformed font size. This must then be scaled by the shader when
rendering text, and at high zoom levels this will result in blurry
text.
This change makes it so that rather than rasterizing at the
untransformed font size we rasterize at the font size limit. This will
mean the glyphs are rasterized at a larger size and will therefore
require less scaling, meaning they will appear less blurry.
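To make the difference concrete, here is a small sketch comparing the old and new choice of raster size. The limit value and the function names are assumptions for this example, and the transform is reduced to a single uniform scale.

```rust
/// Assumed maximum font size that can be rasterized (illustrative value).
const FONT_SIZE_LIMIT: f32 = 320.0;

/// New behaviour: returns (raster font size, scale left for the shader).
/// Above the limit we rasterize at the limit itself.
fn glyph_raster_params(font_size: f32, transform_scale: f32) -> (f32, f32) {
    let device_size = font_size * transform_scale;
    if device_size <= FONT_SIZE_LIMIT {
        (device_size, 1.0)
    } else {
        (FONT_SIZE_LIMIT, device_size / FONT_SIZE_LIMIT)
    }
}

/// Old behaviour: above the limit we fell back to the untransformed font
/// size and left the whole transform scale to the shader.
fn old_glyph_raster_params(font_size: f32, transform_scale: f32) -> (f32, f32) {
    let device_size = font_size * transform_scale;
    if device_size <= FONT_SIZE_LIMIT {
        (device_size, 1.0)
    } else {
        (font_size, transform_scale)
    }
}
```

Under these assumptions, a 16px font at 40x zoom was previously rasterized at 16px and scaled 40x in the shader; with the change it is rasterized at 320px and scaled only 2x.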
Differential Revision: https://phabricator.services.mozilla.com/D37644
--HG--
extra : moz-landing-system : lando
I suspect we may change things more in the future, as we make the blob's size
just be the visible rect, but this is an incremental step in the right
direction.
It also includes some changes to make sure that we always update our
tiles appropriately.
Differential Revision: https://phabricator.services.mozilla.com/D37079
--HG--
extra : moz-landing-system : lando
Replace `serde`-derived `bincode` with custom binary
serialization/deserialization that generates more efficient code at rustc
`opt-level = 2`.
Differential Revision: https://phabricator.services.mozilla.com/D32782
--HG--
extra : moz-landing-system : lando
On startup some program binaries are loaded from disk into an
in-memory cache. When we call create_program() we check if the
required program is present in this cache, and if so we call
glProgramBinary(). This is done early on so that the driver can
perform any necessary work in the background.
There may however be binaries in the disk cache that have not yet been
loaded into memory, in order not to slow down startup. This change
makes it so that we attempt to load missing binaries from disk during
link_program(). The reason we do not do this in create_program() is
because that would result in loading all shaders from disk during
startup, which we want to avoid. Loading these shaders may therefore
take slightly longer than if they'd been loaded at startup, but will
still be much faster than recompiling them from scratch, and startup
will remain quick.
If loading the shaders on startup had previously timed out, then we do
not attempt to load shaders on demand as the disk is probably too slow
for that to be useful.
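The lookup order described above could be sketched like this. `ProgramCache`, its fields and its methods are hypothetical names for this example, and the disk cache is modelled as a second map rather than real file I/O.

```rust
use std::collections::HashMap;

/// Hypothetical program binary cache, keyed by a shader digest string.
struct ProgramCache {
    /// Binaries already loaded into memory at startup.
    in_memory: HashMap<String, Vec<u8>>,
    /// Stand-in for binaries that exist on disk but were not loaded yet.
    on_disk: HashMap<String, Vec<u8>>,
    /// Set when the startup load timed out, i.e. the disk is too slow.
    startup_load_timed_out: bool,
}

impl ProgramCache {
    /// create_program() time: only consult memory, never touch the disk.
    fn get_in_memory(&self, digest: &str) -> Option<&Vec<u8>> {
        self.in_memory.get(digest)
    }

    /// link_program() time: fall back to loading the binary from disk,
    /// unless the startup load previously timed out.
    fn get_or_load(&mut self, digest: &str) -> Option<&Vec<u8>> {
        if !self.in_memory.contains_key(digest) && !self.startup_load_timed_out {
            if let Some(binary) = self.on_disk.get(digest) {
                self.in_memory.insert(digest.to_string(), binary.clone());
            }
        }
        self.in_memory.get(digest)
    }
}
```

When `get_or_load` returns `None`, the caller would compile the shader from source as before.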
Depends on D33954
Differential Revision: https://phabricator.services.mozilla.com/D33955
--HG--
extra : moz-landing-system : lando
It's very helpful to see the list of clips and the way they affect a chased primitive.
Example:
```
building clip chain instance with local rect TypedRect(1561.0×1968.0 at (-300.0,-300.0))
clip Rectangle(3840.0×1874.0, Clip) at (0.0,0.0) in space SpatialNodeIndex(1)
flags (empty), resulted in Partial
clip Rectangle(3840.0×1874.0, Clip) at (0.0,0.0) in space SpatialNodeIndex(2)
flags (empty), resulted in Partial
```
Differential Revision: https://phabricator.services.mozilla.com/D37137
--HG--
extra : moz-landing-system : lando
A follow-up to D36603 that switches the base space from the surface node to the raster node.
Differential Revision: https://phabricator.services.mozilla.com/D36828
--HG--
extra : moz-landing-system : lando
* Add a script for running wrench under various software rasterizers.
* Add support to wrench for non-blocking event loop.
* Add support to wrench for selecting GL/ES rendering API.
* Update x11 bindings for wrench, to fix a release-only crash.
Differential Revision: https://phabricator.services.mozilla.com/D36551
--HG--
extra : moz-landing-system : lando
We now save the native fonts by their full path. On macOS, there is no
such thing as a full filesystem path for a CGFont (or at least we don't track it),
so loading a capture falls back to the old logic of using the dummy font.
Differential Revision: https://phabricator.services.mozilla.com/D36604
--HG--
extra : moz-landing-system : lando
This is the first piece of the blob-recoord series. Adding
some checks to ensure things stay sane.
Differential Revision: https://phabricator.services.mozilla.com/D35806
--HG--
extra : moz-landing-system : lando
Gecko layouts typically produce a picture cache where the origin
of the picture rect is an integer. However, it can occasionally
be a fractional origin.
In these cases, we need to ensure the content origin is floored,
to maintain consistent snapping. When this case occurs, the UV
rect for the tile also needs adjusting, to ensure the exact
1:1 texel:pixel mapping when drawing the tile.
Differential Revision: https://phabricator.services.mozilla.com/D35761
--HG--
extra : moz-landing-system : lando
Some pages from Gecko produce a display list for the main content
tile cache that has a transparent background. Detect these cases
and disable subpixel text rendering to ensure correct blending.
Differential Revision: https://phabricator.services.mozilla.com/D35627
--HG--
extra : moz-landing-system : lando
In future, picture cache tiles will support different sizes, depending
on the size of the content slice being cached, and how frequently
parts of the slice are changing.
Although currently unused, this patch adds support for specifying
multiple different tile sizes for the picture cache texture array.
Differential Revision: https://phabricator.services.mozilla.com/D34989
--HG--
extra : moz-landing-system : lando
Fixes an edge case where splitting the top level primitive list
for picture caching may result in a mismatched push/pop clip
pair.
This is not a particularly efficient fix, but it's a rare enough
edge case for now that this fix will be good enough until we work
out the long term solution for the push/pop clip chain instances.
Differential Revision: https://phabricator.services.mozilla.com/D35139
--HG--
extra : moz-landing-system : lando
This patch fixes two issues with blob images + new picture caching.
1) The logic that determines a conservative set of visible tiles
for tiled / blob images was no longer correct. It was relying
on the bounds of a single tile to build the conservative rect.
Instead, take the overall primitive world bounds and derive a
conservative set of visible tiles from this.
2) The logic to detect if an image was dirty was incorrect, and
somewhat error prone. It now maintains a set of dirty images
that have been requested. The image key dependencies are then
checked during the tile cache post_update step.
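Both fixes can be illustrated with a small sketch. The types and function names here are assumptions for the example: tiles are laid out from the primitive origin, and image keys are plain integers.

```rust
use std::collections::HashSet;

/// Axis-aligned rect given by its min and max corners.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Rect {
    x0: f32,
    y0: f32,
    x1: f32,
    y1: f32,
}

/// Returns the overlap of two rects, or None if they do not overlap.
fn intersect(a: &Rect, b: &Rect) -> Option<Rect> {
    let r = Rect {
        x0: a.x0.max(b.x0),
        y0: a.y0.max(b.y0),
        x1: a.x1.min(b.x1),
        y1: a.y1.min(b.y1),
    };
    if r.x0 < r.x1 && r.y0 < r.y1 { Some(r) } else { None }
}

/// 1) Derives a conservative tile range (tx0, ty0, tx1, ty1), with the max
/// indices exclusive, from the overall primitive bounds. The range is
/// rounded outwards so that no partially visible tile is missed.
fn visible_tile_range(prim: &Rect, visible: &Rect, tile_size: f32) -> Option<(i32, i32, i32, i32)> {
    let r = intersect(prim, visible)?;
    Some((
        ((r.x0 - prim.x0) / tile_size).floor() as i32,
        ((r.y0 - prim.y0) / tile_size).floor() as i32,
        ((r.x1 - prim.x0) / tile_size).ceil() as i32,
        ((r.y1 - prim.y0) / tile_size).ceil() as i32,
    ))
}

/// 2) A tile is dirty if any image key it depends on is in the set of
/// dirty images that were requested this frame.
fn tile_is_dirty(image_deps: &[u64], dirty_images: &HashSet<u64>) -> bool {
    image_deps.iter().any(|key| dirty_images.contains(key))
}
```

For a 1000x1000 primitive with 256px tiles and a visible rect spanning x from 300 to 700 and y from 0 to 256, this yields tile columns 1 and 2 in row 0.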
Differential Revision: https://phabricator.services.mozilla.com/D35126
--HG--
extra : moz-landing-system : lando
These now work on actual devices, but must remain disabled on the
emulator until bug 1555002 is fixed.
Differential Revision: https://phabricator.services.mozilla.com/D34619
--HG--
extra : moz-landing-system : lando
There appears to be a driver bug on Android 8 and older where it does
not render correctly.
Differential Revision: https://phabricator.services.mozilla.com/D34618
--HG--
extra : moz-landing-system : lando
When using an advanced blend equation, fragment shader output must be
marked with a matching layout qualifier. Not doing so was causing
subsequent glDraw* operations to fail.
This patch adds a new shader feature, WR_FEATURE_ADVANCED_BLEND, which
requires the necessary extension and adds the qualifier. Variants of
the brush_image shaders are created with this feature, and are used
whenever a brush_image shader is requested for BlendMode::Advanced.
Differential Revision: https://phabricator.services.mozilla.com/D34617
--HG--
extra : moz-landing-system : lando
This patch implements the majority of the planned picture caching
improvements. It supports most of the functionality required to
(as a follow up) support OS compositor integration. It also improves
on the robustness and functionality of the previous picture caching
implementation.
There are some expected temporary performance regressions in
some cases (such as content that is constantly invalidating) and
during initial page render when many render targets must be drawn
to. These performance regressions will be resolved in follow up
commits by supporting multi-resolution tiles.
The scene is split into a number of slices, determined by the scroll
root of each primitive, which can be found by the primitive's
spatial node indices. If a scene contains too many slices, then
picture caching is disabled on the page, to avoid excessive texture
memory usage, and rendering falls back to rasterizing each frame.
The specific changes in this patch are:
* Support tile caches for multiple scroll roots, allowing the
entire page (including fixed divs and the main UI bar) to be
cached in most cases, in addition to the main content.
* Remove requirement to read tiles back from the framebuffer.
Instead, they are drawn into the picture cache target tiles,
and blitted to the screen. This is slightly slower than the
existing picture caching when content is constantly changing,
however this cost will disappear / become irrelevant when
the OS compositor integration work is complete.
* Switch picture cache render targets to be nearest sampled (they
are always rendered 1:1) and support depth buffer targets.
* Make use of the external scroll offset support to allow removal
of the primitive correlation hacks in the previous picture
caching implementation. Also allows storing of primitive
dependencies in picture space rather than world space, which
reduces floating point inaccuracies.
* Determine if each tile and picture cache can be considered
opaque. This is used to determine whether subpixel AA text
rendering is available on a slice, and for rendering optimizations
related to disabling blending and/or tile clears.
* Use the clip chain instance results from the recent visibility pass
work to determine clip chain dependencies. This results in fewer
clip item dependencies in tiles, which is faster to check validity
and reduces redundant invalidations.
* Remove extra overhead during batching related to batch lists,
and region iteration, as they are no longer required.
* Support PrimitiveVisibilityMask during batching. This allows a
single traversal of a picture (surface) root during batching to
efficiently construct multiple alpha batcher objects (typically
one per invalid tile).
* Picture caching is now handled implicitly by WR, depending on
the content of the scene. There is no requirement for client
code to manually select which stacking context should be cached.
* Simplify how clip chain / transform dependencies are tracked by
picture cache tiles.
* Support pushing / popping enclosing clip chain roots without
the need for a stacking context / picture in some cases. This
simplifies the logic to split the scene into multiple slices.
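As an illustration of the visibility mask item in the list above, here is a minimal sketch of a single traversal feeding several batchers. `VisibilityMask`, `Primitive`, `Batcher` and `build_batches` are simplified stand-ins invented for this example, not the real WR types.

```rust
/// Bit mask with one bit per batcher (simplified stand-in).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct VisibilityMask(u8);

/// A primitive and the set of batchers (tiles) it is visible in.
struct Primitive {
    id: u32,
    mask: VisibilityMask,
}

/// One batcher per invalid tile, identified by a single bit.
struct Batcher {
    mask: VisibilityMask,
    prims: Vec<u32>,
}

/// Walks the primitive list once and adds each primitive to every batcher
/// whose bit is set in the primitive's visibility mask.
fn build_batches(prims: &[Primitive], batchers: &mut [Batcher]) {
    for prim in prims {
        for batcher in batchers.iter_mut() {
            if prim.mask.0 & batcher.mask.0 != 0 {
                batcher.prims.push(prim.id);
            }
        }
    }
}
```

The point of the design is that the picture tree is traversed once regardless of how many tiles are invalid; the per-tile work is reduced to a bit test.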
The main remaining work in this area is (a) extend the code to
optionally provide each slice as an input to the OS compositor
rather than drawing the tiles in WR, and (b) support multi-resolution
tiles so that we reduce the draw call, batching and render target
overhead in cases where much of the page content is changing.
Differential Revision: https://phabricator.services.mozilla.com/D34319
--HG--
extra : moz-landing-system : lando
The tile cache is 352 bytes large and in the majority of cases picture primitives don't have one, so this saves a few KB of RAM in typical pages and reduces the likelihood of hitting OOM crashes while growing the primitives vector.
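The saving can be illustrated with a sketch that assumes the optional tile cache is stored behind a `Box`. The struct layouts below are invented for the example; only the 352 byte figure comes from the description above.

```rust
use std::mem::size_of;

/// Stand-in for the tile cache: 352 bytes of payload.
struct TileCache {
    _data: [u8; 352],
}

/// Picture storing the optional tile cache inline: every picture pays for
/// the full tile cache size, even when it has none.
struct InlinePicture {
    _tile_cache: Option<TileCache>,
    _other: u64,
}

/// Picture storing the optional tile cache behind a pointer: pictures
/// without a tile cache only pay for one (null) pointer.
struct BoxedPicture {
    _tile_cache: Option<Box<TileCache>>,
    _other: u64,
}

/// Bytes saved per picture primitive by boxing the tile cache.
fn bytes_saved_per_picture() -> usize {
    size_of::<InlinePicture>() - size_of::<BoxedPicture>()
}
```

`Option<Box<T>>` is the same size as a raw pointer, so the boxed variant shrinks every element of the primitives vector, which also makes each reallocation of that vector smaller.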
Differential Revision: https://phabricator.services.mozilla.com/D34346
--HG--
extra : moz-landing-system : lando
The presence or absence of the DEVICE_SERIAL environment variable
is sufficient to control this.
Differential Revision: https://phabricator.services.mozilla.com/D33407
--HG--
extra : moz-landing-system : lando
This is in preparation for having the same script be used for emulator
and device runs. No functional change in this patch; it just renames
the file and class.
Differential Revision: https://phabricator.services.mozilla.com/D33406
--HG--
rename : testing/mozharness/scripts/android_emulator_wrench.py => testing/mozharness/scripts/android_wrench.py
extra : moz-landing-system : lando
Force perspective interpolation of UV coordinates in clip shaders.
In addition to fixing the interpolation curve, this also adds checks for homogeneous coordinates that fall outside of the meaningful hemisphere, forcing the clip shaders to output zeroes in those areas.
Differential Revision: https://phabricator.services.mozilla.com/D34017
--HG--
extra : moz-landing-system : lando