Change the external scroll offset to be a vector, rather than a
point. This can also be updated gecko-side in future, but for
now is converted to a vector at the API boundary.
Also plumb through the external scroll offset so that it is stored
inside the ScrollFrameInfo in a spatial node. This will allow
modifying the transforms that the clip-scroll tree creates to
take into account the external scroll offset in future.
Differential Revision: https://phabricator.services.mozilla.com/D21319
--HG--
extra : moz-landing-system : lando
This doesn't introduce any functional changes. However, it refactors
the way that stacking context coords are converted into reference
frame relative coordinates.
Having a single method to retrieve the current offset will make it
easier to take advantage of the newly added API that allows Gecko
to supply initial scroll offsets for scroll nodes. In turn, this
will allow WR to normalize the local coordinates of primitives, which
will allow future improvements and simplifications to the picture
caching implementation.
Differential Revision: https://phabricator.services.mozilla.com/D21090
--HG--
extra : moz-landing-system : lando
A number of small tweaks to enable the picture caching invalidation
tests. With this in place, we can start adding more test coverage
for various invalidation scenarios.
- Make the reference image render after the test images, so that dirty
region tracking is more intuitive.
- Instead of replaying the same frame in wrench to ensure frames are
cached, try to cache tiles every frame when testing mode is enabled.
- Add a basic invalidation test for a rectangle with color that changes
each frame.
- Make the dirty region index implicit and rename dirty_region to dirty
in reftest format.
- Fix an underflow error when moving to next frame in wrench.
Differential Revision: https://phabricator.services.mozilla.com/D20963
--HG--
extra : moz-landing-system : lando
This is a new implementation of mix-blend compositing that is meant to be more idiomatic to WR and more efficient.
Previously, mix-blend mode was composed in the following way:
1. parent stacking context was forced to isolate
2. source picture is also isolated
3. when rendering the isolated context, the framebuffer is read upon reaching the source. Both the readback and the source are placed in the RT cache.
4. a mix-blend draw call is issued to read from those cache segments and blend on top of the backdrop
The new implementation works by using the picture cutting (introduced for preserve-3D contexts earlier) and some bits of magic:
1. backdrop stacking context is isolated with a special composition mode that prevents it from actually rendering unless the source stacking context is invisible.
2. source stacking context is isolated with mix-blend composition mode that has a pointer to the backdrop picture
3. the instance of the backdrop picture is placed as a peer of the source picture (not a child)
4. if the backdrop is invisible, the source is drawn as a simple blit
5. otherwise, it's a draw call that reads from the isolated backdrop and source textures
Note the differences:
- parent stacking context is not isolated, but backdrop is
- no framebuffer readback is involved
- the source and backdrop pictures are rendered in parallel in a pass, improving the batching
- we don't blend onto the backdrop while reading from the backdrop copy at the same time
- the depth of the render pass tree is reduced: previously the parent and the source were isolated, now the source and the backdrop, which are siblings
Differential Revision: https://phabricator.services.mozilla.com/D20608
--HG--
rename : gfx/wr/wrench/reftests/blend/multiply-2-ref.yaml => gfx/wr/wrench/reftests/blend/multiply-3-ref.yaml
rename : gfx/wr/wrench/reftests/blend/multiply-3.yaml => gfx/wr/wrench/reftests/blend/multiply-4.yaml
extra : moz-landing-system : lando
Document splitting is crashing with early initialization of the debug
renderer. Not sure why, and this is just a temporary workaround, but
one that I think we want anyway, as we don't want to be unnecessarily
lazy-initting the debug renderer.
Depends on D20698
Differential Revision: https://phabricator.services.mozilla.com/D20700
--HG--
extra : moz-landing-system : lando
Without this patch any enclosing scale transform between a blurred
picture and the nearest raster root was being ignored entirely for the
purposes of blur.
Also includes a couple of reftests to exercise this code.
Differential Revision: https://phabricator.services.mozilla.com/D20908
--HG--
extra : moz-landing-system : lando
Currently on Android we upload texture data to the webrender texture
cache using a PBO. On Adreno GPUs, however, this upload is still being
done synchronously, and profiles show a lot of time spent waiting in
glTexSubImage3D.
The problem is that the stride of the data in the PBO is not a
multiple of 256 bytes, so the driver is not able to DMA the upload.
This patch ensures that data is laid out optimally in the PBO, using
glMapBufferRange then copying the data line-by-line if required. This
allows the driver to perform the upload asynchronously as intended.
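The row-by-row copy described above can be sketched as follows. This is a minimal illustration, not the code from the patch: `align_stride` and `copy_rows_padded` are hypothetical names, and the `dst` slice stands in for the memory returned by glMapBufferRange. Only the 256-byte figure comes from the text.

```rust
/// Stride alignment that lets the driver DMA the upload (256 bytes, per the text above).
const OPTIMAL_STRIDE_ALIGNMENT: usize = 256;

/// Rounds `stride` up to the next multiple of `alignment` (`alignment` must be non-zero).
fn align_stride(stride: usize, alignment: usize) -> usize {
    (stride + alignment - 1) / alignment * alignment
}

/// Copies `height` rows of `src_stride` bytes each into `dst` (a stand-in for
/// the mapped PBO), placing each row at a multiple of `dst_stride` bytes.
/// Padding bytes between rows are left untouched.
fn copy_rows_padded(
    src: &[u8],
    src_stride: usize,
    height: usize,
    dst: &mut [u8],
    dst_stride: usize,
) {
    assert!(dst_stride >= src_stride);
    assert!(src.len() >= src_stride * height);
    assert!(dst.len() >= dst_stride * height);

    if src_stride == dst_stride {
        // The data is already laid out optimally: one bulk copy is enough.
        dst[..src_stride * height].copy_from_slice(&src[..src_stride * height]);
        return;
    }

    // Otherwise copy line by line, leaving padding at the end of each row.
    for row in 0..height {
        let s = row * src_stride;
        let d = row * dst_stride;
        dst[d..d + src_stride].copy_from_slice(&src[s..s + src_stride]);
    }
}
```

For example, a 100-pixel-wide RGBA row has a 400-byte stride, which `align_stride` rounds up to 512 bytes.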
Differential Revision: https://phabricator.services.mozilla.com/D20492
--HG--
extra : moz-landing-system : lando
The tiling origin is computed within image::tiles instead of being provided to the function.
In addition, the image rect in device space is exposed as a function parameter.
In a followup, callers will have to determine the correct image rect using the blob image's visible area.
Differential Revision: https://phabricator.services.mozilla.com/D20175
--HG--
extra : moz-landing-system : lando
angle_shader_validation.rs just checks that the number of "switch" occurrences equals the number of "default:" occurrences in the source file, even if they occur in comments.
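To illustrate why comments matter for such a check, here is a hedged sketch of a plain substring count. The function names are hypothetical and this is not the actual code in angle_shader_validation.rs; it only reproduces the behaviour described above.

```rust
/// Counts non-overlapping occurrences of `needle` in `source`, comments included.
fn count_occurrences(source: &str, needle: &str) -> usize {
    source.matches(needle).count()
}

/// Returns true when "switch" and "default:" appear the same number of times.
/// Because this is a plain text search, a "switch" inside a comment is counted too.
fn switch_default_balanced(source: &str) -> bool {
    count_occurrences(source, "switch") == count_occurrences(source, "default:")
}
```

Under this kind of check, a shader whose only mismatch is the word "switch" in a comment would still be rejected.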
On Windows the vFuncs array is always 0 in the fragment shader. If we move the computation of the vFuncs array outside of the switch (so that it is computed for every type of shader, even when it is not needed) then it works.
For table/discrete we just create a lookup table for all 256 possible input values. We should probably switch to just computing the value in the shader, unless the list of values is really long.
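A lookup table of this kind could be built roughly as below. This is a sketch that assumes the transfer functions follow the SVG feComponentTransfer definitions of "table" and "discrete"; `build_lut` is a hypothetical name, not a function from the patch.

```rust
/// Builds a 256-entry lookup table for a "table" or "discrete" transfer
/// function, assuming the SVG feComponentTransfer definitions.
/// An empty `values` list is treated as the identity function.
fn build_lut(values: &[f32], discrete: bool) -> [f32; 256] {
    let mut lut = [0.0f32; 256];
    for (i, out) in lut.iter_mut().enumerate() {
        // Normalized input value in [0, 1].
        let c = i as f32 / 255.0;
        *out = if values.is_empty() {
            c
        } else if discrete {
            // Step function: n values split [0, 1] into n equal buckets.
            let n = values.len();
            let k = ((c * n as f32) as usize).min(n - 1);
            values[k]
        } else if values.len() == 1 {
            // A single table value yields a constant output.
            values[0]
        } else {
            // Piecewise linear interpolation between n + 1 values.
            let n = values.len() - 1;
            let k = ((c * n as f32) as usize).min(n - 1);
            let t = c * n as f32 - k as f32;
            values[k] + t * (values[k + 1] - values[k])
        };
    }
    lut
}
```

Each 8-bit channel value then indexes the table directly, at the cost of uploading 256 entries per channel.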
The format for stacking contexts in the built display list goes from
  PushStackingContext item
  push_iter of Vec<FilterOp>
to
  SetFilterOps item
  push_iter of Vec<FilterOp>
  1st SetFilterData item
    push_iter of array of func types
    push_iter funcR values
    push_iter funcG values
    push_iter funcB values
    push_iter funcA values
  .
  .
  .
  nth SetFilterData item
    push_iter of array of func types
    push_iter funcR values
    push_iter funcG values
    push_iter funcB values
    push_iter funcA values
  PushStackingContext item
We need a separate SetFilterData item for each filter because we can't push_iter a variable sized thing.
When we iterate over the built display list to flatten it, we work similarly to how gradients work with a SetGradientStops item before the actual gradient item. So when we see SetFilterOps or SetFilterData, we use them to fill out values on the built display list iterator but don't return those items to the iterator user; instead we continue iterating until we hit the PushStackingContext item, at which point it appears to the iterator consumer as though the FilterOps and FilterDatas were on the PushStackingContext item. (This part is trickier too since we need a TempFilterData type that just holds ItemRanges until we get the actual bytes later.)
Do we need to clear cur_filters and cur_filter_data at some point to prevent them from getting read by items to which they do not apply?
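The iteration pattern can be sketched like this. The types `RawItem`, `Visible` and `DisplayListIter` are hypothetical simplifications (filter ops are plain `u32`s, filter data plain `Vec<f32>`s); only the names cur_filters and cur_filter_data come from the text. Taking the accumulated state when the PushStackingContext item is returned is one possible way to keep later items from reading stale filters, and is an assumption rather than what the patch does.

```rust
/// Items as they appear in the built display list (simplified).
#[derive(Debug, Clone, PartialEq)]
enum RawItem {
    SetFilterOps(Vec<u32>),
    SetFilterData(Vec<f32>),
    PushStackingContext,
    Rect,
}

/// Items as seen by the iterator consumer: Set* items never show up here.
#[derive(Debug, PartialEq)]
enum Visible {
    PushStackingContext { filters: Vec<u32>, filter_datas: Vec<Vec<f32>> },
    Rect,
}

struct DisplayListIter<I: Iterator<Item = RawItem>> {
    inner: I,
    cur_filters: Vec<u32>,
    cur_filter_data: Vec<Vec<f32>>,
}

impl<I: Iterator<Item = RawItem>> DisplayListIter<I> {
    fn new(inner: I) -> Self {
        DisplayListIter { inner, cur_filters: Vec::new(), cur_filter_data: Vec::new() }
    }
}

impl<I: Iterator<Item = RawItem>> Iterator for DisplayListIter<I> {
    type Item = Visible;

    fn next(&mut self) -> Option<Visible> {
        loop {
            match self.inner.next()? {
                // Stash the data and keep iterating without yielding.
                RawItem::SetFilterOps(ops) => self.cur_filters = ops,
                RawItem::SetFilterData(data) => self.cur_filter_data.push(data),
                RawItem::PushStackingContext => {
                    // `mem::take` hands the stashed data to the consumer and
                    // leaves empty vectors behind for subsequent items.
                    return Some(Visible::PushStackingContext {
                        filters: std::mem::take(&mut self.cur_filters),
                        filter_datas: std::mem::take(&mut self.cur_filter_data),
                    });
                }
                RawItem::Rect => return Some(Visible::Rect),
            }
        }
    }
}
```

With this shape, a second PushStackingContext that is not preceded by any Set* items sees empty filter lists.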
This bug occurs under the following conditions:
- The clip chain instance has multiple clip items.
- The first item in the clip chain is a clip rectangle, with:
- ClipMode::Clip
- Is in the same coordinate system as the primitive.
In this case, the code would skip adding the clip rect to the
mask (due to the same coord system). However, the logic that
determines whether to render subsequent masks with blend disabled
or multiplicative blend was only considering the index of the
clip item in the clip chain. In this case, these masks would
get added to the blend enabled batches, but the first clip mask
which would have written the initial mask values was skipped.
The end result was that the subsequent clip masks would be
blending with uninitialized render target contents from a previous
frame.
This patch changes the logic to track when the first clip mask
has actually been added to the batch, rather than relying on
the index. In this case, it means that the rounded rect mask
will get drawn in the blend disabled path, writing the correct
mask values without blending with the existing render target contents.
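A simplified model of the fix is shown below. `ClipKind`, `ClipBatches` and `batch_clips` are hypothetical names used only for illustration; the real batching code is more involved.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum ClipKind {
    /// A ClipMode::Clip rectangle in the same coordinate system as the
    /// primitive: it is skipped and never drawn into the mask.
    SameCoordSystemRect,
    /// Any clip that must be drawn into the mask (e.g. a rounded rect).
    Mask,
}

#[derive(Debug, Default, PartialEq)]
struct ClipBatches {
    /// Clip indices drawn with blending disabled (they write the initial mask values).
    blend_disabled: Vec<usize>,
    /// Clip indices drawn with multiplicative blending.
    blend_enabled: Vec<usize>,
}

fn batch_clips(clips: &[ClipKind]) -> ClipBatches {
    let mut batches = ClipBatches::default();
    // Track whether a mask has actually been added, instead of testing `index == 0`.
    let mut is_first_mask = true;
    for (index, kind) in clips.iter().enumerate() {
        if *kind == ClipKind::SameCoordSystemRect {
            continue; // skipped clips must not count as "the first mask"
        }
        if is_first_mask {
            batches.blend_disabled.push(index);
            is_first_mask = false;
        } else {
            batches.blend_enabled.push(index);
        }
    }
    batches
}
```

With an index-based test, the chain [rect, rounded rect] would put the rounded rect (index 1) in the blend-enabled batch with nothing written before it; here it lands in the blend-disabled batch.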
Differential Revision: https://phabricator.services.mozilla.com/D20590
--HG--
extra : moz-landing-system : lando
Without this patch, if we got a display item with the root clip id, we
would always clip that display item with the root clip of the enclosing
pipeline. However, this violates the documented semantics on
ClipId::root() which states that it effectively does no clipping.
Specifically, it could end up doing clipping if the display item was
part of a scrollframe that was scrolled such that the display item
extended beyond the enclosing pipeline.
This patch adds an extra argument to some of the flattening functions -
the flag is true when recursing the DL between a pipeline item and the
first stacking context that has a clip. For these items, the pipeline
clip is applied. Once inside the stacking context, the pipeline clip is
not applied.
Differential Revision: https://phabricator.services.mozilla.com/D20483
--HG--
extra : moz-landing-system : lando
On integrated GPUs, we are typically completely bound by memory
bandwidth and the number of pixels that get written / blended.
On real world pages, it's often the case that we end up with
clip tasks that are long in one dimension but not the other, due
to box-shadow edges, clip mask segments etc. When this occurs,
the logic that tries to compute a small 'used_rect' for clearing
targets fails, since the union of those tasks ends up being a very large
rect that covers (most of) the surface. This can cost a lot of
GPU time on some integrated chipsets.
Instead, it appears to be much faster to issue multiple clears,
one for each clip mask region, which is typically < 10% of the
surface we were clearing previously.
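The difference between clearing the union and clearing each region is easy to see with two thin strips. The `Rect` type and helper functions below are hypothetical and exist only to show the arithmetic.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
struct Rect {
    x0: i32,
    y0: i32,
    x1: i32,
    y1: i32,
}

impl Rect {
    fn area(&self) -> i64 {
        (self.x1 - self.x0) as i64 * (self.y1 - self.y0) as i64
    }

    /// Smallest rect containing both `self` and `other`.
    fn union(&self, other: &Rect) -> Rect {
        Rect {
            x0: self.x0.min(other.x0),
            y0: self.y0.min(other.y0),
            x1: self.x1.max(other.x1),
            y1: self.y1.max(other.y1),
        }
    }
}

/// Pixels cleared when a single clear covers the union of all rects.
fn union_clear_area(rects: &[Rect]) -> i64 {
    let mut iter = rects.iter();
    let first = match iter.next() {
        Some(r) => *r,
        None => return 0,
    };
    iter.fold(first, |acc, r| acc.union(r)).area()
}

/// Pixels cleared when each rect gets its own clear.
fn per_rect_clear_area(rects: &[Rect]) -> i64 {
    rects.iter().map(Rect::area).sum()
}
```

For a 2048x16 horizontal strip and a 16x2048 vertical strip, the union clear touches 4,194,304 pixels while the two separate clears touch 65,536.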
However, we can also restore an old optimization we used to have
which means we can skip clears altogether in the common case. The
first mask in a clip task will write to all the pixels in the mask,
so we can draw that with blending disabled (also a significant win
on integrated GPUs) and skip the clear in these cases. With this
functionality in place, the multiplicative blend mode is only
enabled for any clips other than the first in a mask (this is
quite a rare case - most clip tasks end up with a single mask).
On low end GPUs driving a 4k screen, I've measured GPU wins of up
to 5 ms/frame on some real world pages with this change.
Differential Revision: https://phabricator.services.mozilla.com/D19893
--HG--
extra : moz-landing-system : lando
By using WebRenderTextureHostWrapper for canvas, we can avoid triggering a frame build on the WebRender backend if the WebRenderTextureHostWrapper is the only change.
Differential Revision: https://phabricator.services.mozilla.com/D19896
--HG--
extra : moz-landing-system : lando
For some reason running these via cross-compiled wrench in Mozilla
automation produces a few pixels difference.
Differential Revision: https://phabricator.services.mozilla.com/D19368
--HG--
extra : source : f06721b552a819e4d2456f1c31d62c782d9a42cb
This introduces some env vars to allow running cross-compiled
binaries instead of running things on the builder. Additionally
the `cargo test --features "ipc"` check is modified to be `check`
instead since there are no actual tests being run by that command.
The only thing we lose is a rustdoc example check but we are
checking that on Linux anyway so doing it for Mac is redundant.
Differential Revision: https://phabricator.services.mozilla.com/D19367
--HG--
extra : source : ee403c79877e028c58fa9091dd360fe50a80af37
Manage the texture space for picture tiles separately inside the texture cache.
Differential Revision: https://phabricator.services.mozilla.com/D19708
--HG--
extra : moz-landing-system : lando
This is a preparatory change that can be useful by itself:
- use match on EntryKind to allow safe expansion
- avoid code duplication in get()
- fix some comments
Differential Revision: https://phabricator.services.mozilla.com/D19674
--HG--
extra : moz-landing-system : lando
By retaining a global GPU cache handle for a dummy image block, we
can reduce the per-frame GPU cache uploads quite a bit, which
helps with compositor time.
Differential Revision: https://phabricator.services.mozilla.com/D19326
--HG--
extra : moz-landing-system : lando
The WR double style border shader has a condition to check if the
widths of the edges are too small to apply the style, in which case
it draws the border segment as solid. However, the check was
incorrectly skipping when the width of the inner / outer edge
was exactly one pixel.
Differential Revision: https://phabricator.services.mozilla.com/D19440
--HG--
extra : moz-landing-system : lando
We are currently drawing tiles as separate primitives. This doesn't work well for
masking out edge AA between tiles, since they aren't aware of each other.
The change switches image tiles to be drawn as segments sharing the same header.
Differential Revision: https://phabricator.services.mozilla.com/D19458
--HG--
extra : moz-landing-system : lando
We used to hard-code the raster spatial node of plane splits to the root.
Now we are using the actual root during batching.
Differential Revision: https://phabricator.services.mozilla.com/D19384
--HG--
extra : moz-landing-system : lando
Add support for supplying a keyframe file to animate a wrench
yaml file with.
For now, the keyframe animation file must supply a keyframe for
every frame. In future, we may expand this to allow specifying
interpolation ranges.
For now, this is for development purposes only - however we can
easily extend this to support animated reftests in the future.
Differential Revision: https://phabricator.services.mozilla.com/D18884
--HG--
extra : moz-landing-system : lando
The existing picture caching code in WR assumes that the tiles are being
drawn into the main framebuffer. This is true of the main content frame,
however it's not the case for all popup windows. In the case of popup
windows on mac, they have a rounded rect clip, which results in a surface
being used. This breaks some assumptions in the picture caching code.
The long term fix involves supporting picture caching on surfaces. However,
we don't want picture caching on for non-content windows anyway (due to
wasting texture memory), so for now we will simply disable picture cache
composite modes if they are being drawn on a non-root surface.
Differential Revision: https://phabricator.services.mozilla.com/D18917
--HG--
extra : moz-landing-system : lando
This change reworks get_relative_transform and associated pieces of logic,
so that we flatten the transforms at preserve-3d context boundaries.
It addresses a problem found by 1524797 but doesn't resolve the bug yet (!).
There is another issue likely contributing here, and we can treat this PR
as WIP and not merge until the case is completely resolved.
Differential Revision: https://phabricator.services.mozilla.com/D19254
--HG--
extra : moz-landing-system : lando
There's some new limited const fn support in stable, and this is the recommended
way to initialize atomics now.
If this for some reason doesn't compile in all platforms / versions we support
I'll just sprinkle some #[allow(deprecated)] instead.
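As a small illustration of the const fn initialization mentioned above (the `NEXT_ID` static and `next_id` function are hypothetical, not taken from the patch):

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// `AtomicUsize::new` is a `const fn`, so it can initialize a static directly.
// This replaces the deprecated `ATOMIC_USIZE_INIT` constant.
static NEXT_ID: AtomicUsize = AtomicUsize::new(0);

/// Returns a fresh id; each call observes the previous value and increments it.
fn next_id() -> usize {
    NEXT_ID.fetch_add(1, Ordering::Relaxed)
}
```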
Also, cargo changes the output of Cargo.lock, see
https://github.com/rust-lang/cargo/issues/6180. So also update those comments.
Differential Revision: https://phabricator.services.mozilla.com/D18495
--HG--
extra : moz-landing-system : lando
This is a follow-up to https://phabricator.services.mozilla.com/D16560
Previously, we had a conservative estimation of the local size based on the footprint
of the screen onto the potential raster root. This was too conservative in general,
and in some cases it wasn't conservative enough, since with filters we can have areas
needed in local space that don't necessarily project on the screen.
This change does an exact check of the surface size after we compute it, and
falls back to the parent raster root accordingly.
Differential Revision: https://phabricator.services.mozilla.com/D18258
--HG--
extra : moz-landing-system : lando
When some of a border's corners have a border-radius, and that radius
is larger than the sum of the border width and element size, then it
results in the corners of the border overlapping. Webrender draws
borders by rasterizing each segment individually into the cache, then
compositing them together. In this overlapping case, this has 2
problems:
a) we composite overlapping segments on top of each other
b) corner segments are not correctly clipped to the curve of the
overlapping adjacent corners
This patch allows corner segments to be clipped by their adjacent
corners. We provide the outer corner position and radii of the
adjacent corners to the border shader, which then applies those clips,
if required, along with the segment's own corner clip when rasterizing
the segment.
As the adjacent corners now affect the result of the cached segment,
they are added to the cache key.
We continue to rasterize the entire segment into the cache as before,
but now modify the local rect and texel rect of the BrushSegment so
that it only composites the subportion of the corner segment which
does not overlap with the opposite edges of the border.
Differential Revision: https://phabricator.services.mozilla.com/D16872
--HG--
extra : moz-landing-system : lando
For screen-space rasterized images, we provide the shader with the
UV corners of an image. The shaders then interpolate between the corners
as an intermediate step of finding their UV to assign to a vertex.
When the transformation is perspective, the corners stop being
representative in real screen space, and the old code didn't handle the
case of a corner being out of the positive hemisphere. This change
doesn't do perspective division on Rust side and defers this to the
shader, which can do division *after* interpolation between corners.
This change makes us handle the near plane better and resolves clipping
problems with perspective-interpolated images that occurred due to
precision issues of perspective divided corners.
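The effect of dividing before or after interpolation can be shown in one dimension. This is a sketch with hypothetical names (`Homogeneous`, `divide_then_interpolate`, `interpolate_then_divide`); it is not the shader code.

```rust
/// One coordinate of a corner in homogeneous form: the real position is x / w.
#[derive(Debug, Clone, Copy)]
struct Homogeneous {
    x: f32,
    w: f32,
}

fn lerp(a: f32, b: f32, t: f32) -> f32 {
    a + (b - a) * t
}

/// Old approach: perspective-divide each corner first, then interpolate.
fn divide_then_interpolate(a: Homogeneous, b: Homogeneous, t: f32) -> f32 {
    lerp(a.x / a.w, b.x / b.w, t)
}

/// New approach: interpolate x and w separately, then divide once.
fn interpolate_then_divide(a: Homogeneous, b: Homogeneous, t: f32) -> f32 {
    lerp(a.x, b.x, t) / lerp(a.w, b.w, t)
}
```

With corners (x=0, w=1) and (x=2, w=-1), the second corner lies outside the positive hemisphere. At t=0.25 the first function returns -0.5, because the early division flips the corner's sign, while the second returns 1.0.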
Differential Revision: https://phabricator.services.mozilla.com/D18123
--HG--
extra : moz-landing-system : lando
Implement scaling of borders using the same scale extraction and clamping to
nearest power of two that gecko uses in FrameLayerBuilder::ChooseScale.
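A rough sketch of snapping a scale to a power of two follows. It rounds to the nearest power of two in log2 space, which is an assumption based on the wording above; the exact rounding rule in FrameLayerBuilder::ChooseScale may differ, and `nearest_power_of_two_scale` is a hypothetical name.

```rust
/// Snaps a positive scale factor to the nearest power of two, measured in
/// log2 space. Non-finite or non-positive inputs fall back to 1.0.
fn nearest_power_of_two_scale(scale: f32) -> f32 {
    if !scale.is_finite() || scale <= 0.0 {
        return 1.0;
    }
    let exponent = scale.log2().round() as i32;
    2.0f32.powi(exponent)
}
```

Snapping to a small set of scales means a cached border rasterized at one scale can be reused while an animation changes the scale slightly.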
Differential Revision: https://phabricator.services.mozilla.com/D17456
--HG--
extra : moz-landing-system : lando
Note that the dirty rect assertions don't seem to quite work yet, but
Glenn is going to take over that last piece.
Depends on D17995
Differential Revision: https://phabricator.services.mozilla.com/D17996
--HG--
extra : moz-landing-system : lando
The current code panics with an out-of-bounds access here if picture
caching is used outside an iframe.
Depends on D17994
Differential Revision: https://phabricator.services.mozilla.com/D17995
--HG--
extra : moz-landing-system : lando
Per discussion with gw, the current behavior is an oversight. We also
want to expose this to wrench.
Depends on D17993
Differential Revision: https://phabricator.services.mozilla.com/D17994
--HG--
extra : moz-landing-system : lando
There are various testing-only things we want to do here, specifically
copying around dirty regions, and shrinking the tile size. We could make
each of these specific options and thread them all through to the right
places, but that adds complexity without a use-case. So we just add a
simple testing mode for wrench.
Differential Revision: https://phabricator.services.mozilla.com/D17991
--HG--
extra : moz-landing-system : lando
Now that we no longer guarantee that a picture with perspective transform is rasterized in local space, we need to ensure that the shaders don't apply perspective correction to the texture coordinates twice.
For that to be the case, we pass an extra flag to the plane splitting shader, and un-do the perspective correction if it's not enabled.
Differential Revision: https://phabricator.services.mozilla.com/D17854
--HG--
extra : moz-landing-system : lando