As it turns out, the difference between the snapped local rect and the
unsnapped local rect was not just that the former contained snapped
primitives and the latter contained unsnapped primitives, but also that
the former took into account surface inflation for primitives, the
entire clip chain instead of just the primitive's local clip, and
removal of culled primitives. As such, the picture's rects can be wildly
different, even if snapping has been taken care of earlier, and parts of
WebRender have come to rely upon this more accurate representation of a
picture.
Differential Revision: https://phabricator.services.mozilla.com/D46605
--HG--
extra : moz-landing-system : lando
As it turns out, the difference between the snapped local rect and the
unsnapped local rect was not just that the former contained snapped
primitives and the latter contained unsnapped primitives, but also that
the former took into account surface inflation for primitives, the
entire clip chain instead of just the primitive's local clip, and
removal of culled primitives. As such, the picture's rects can be wildly
different, even if snapping has been taken care of earlier, and parts of
WebRender have come to rely upon this more accurate representation of a
picture.
Differential Revision: https://phabricator.services.mozilla.com/D46605
--HG--
extra : moz-landing-system : lando
Adds a module documentation in segment.rs giving an overview of the role of primitive segments and how they interact with clipping. Also reformatted the clip.rs documentation to play well with rustdoc.
Differential Revision: https://phabricator.services.mozilla.com/D46472
--HG--
extra : moz-landing-system : lando
During metrics initialization we load a few uncached glyph widths which can occasionally
show up in a profile. This should reduce the overhead of that somewhat.
Differential Revision: https://phabricator.services.mozilla.com/D46786
--HG--
extra : moz-landing-system : lando
Fix scissor rect being incorrect during pinch zoom due to floating
point inaccuracies.
Differential Revision: https://phabricator.services.mozilla.com/D46743
--HG--
extra : moz-landing-system : lando
Cairo would normally query both the advance and other metrics at the same time,
then store them in a glyph cache sitting on each cairo_scaled_font_t any time
any of the extents were queried. Each cached scaled glyph metrics would require
about 150 bytes of space and could thus use a horribly large amount of memory
when a lot of glyphs were being used within a scaled font.
This tries to duplicate the behavior of querying and storing both advance and
bounds at the same time to effectively cut the number of glyph loads in half
for most cases. This should only add another 8 bytes per hash entry to store
the cached bounds, thus putting us way ahead on memory usage compared to what
Cairo did under the hood.
Further, Cairo would keep around cairo_scaled_font_t's in a holdover cache
even after there are no existing references to them and the owning gfxFonts
have long since died. This gives an artificial boost in successive runs of the
benchmark, while not aiding in the performance of the first run. I don't
believe the extra memory use would be justified to reproduce that particular
behavior, especially since our expectations are that the glyph cache for
a gfxFont dies when the gfxFont itself dies from the gfxFontCache.
In any case, this should at least significantly boost our glyph metrics
performance on a cold start, with the caveat about the warm start case.
Differential Revision: https://phabricator.services.mozilla.com/D46726
--HG--
extra : moz-landing-system : lando
Cairo would normally query both the advance and other metrics at the same time,
then store them in a glyph cache sitting on each cairo_scaled_font_t any time
any of the extents were queried. Each cached scaled glyph metrics would require
about 150 bytes of space and could thus use a horribly large amount of memory
when a lot of glyphs were being used within a scaled font.
This tries to duplicate the behavior of querying and storing both advance and
bounds at the same time to effectively cut the number of glyph loads in half
for most cases. This should only add another 8 bytes per hash entry to store
the cached bounds, thus putting us way ahead on memory usage compared to what
Cairo did under the hood.
Further, Cairo would keep around cairo_scaled_font_t's in a holdover cache
even after there are no existing references to them and the owning gfxFonts
have long since died. This gives an artificial boost in successive runs of the
benchmark, while not aiding in the performance of the first run. I don't
believe the extra memory use would be justified to reproduce that particular
behavior, especially since our expectations are that the glyph cache for
a gfxFont dies when the gfxFont itself dies from the gfxFontCache.
In any case, this should at least significantly boost our glyph metrics
performance on a cold start, with the caveat about the warm start case.
Differential Revision: https://phabricator.services.mozilla.com/D46726
--HG--
extra : moz-landing-system : lando
These variants perform significantly faster than the C implementations
according to local testing and that in treeherder. Image decoding is as
much as 40% faster in the most simple cases (solid green PNG image).
Differential Revision: https://phabricator.services.mozilla.com/D46446
--HG--
extra : moz-landing-system : lando
Some image decoders (e.g. PNG) may have a native representation of the
data as RGB, and do not have accelerated methods to transform from RGB
to RGBX/BGRX. Exposing this as part of the swizzle/premultiply methods
allows us to write accelerated versions ourselves in a later patch in
this series.
Differential Revision: https://phabricator.services.mozilla.com/D46445
--HG--
extra : moz-landing-system : lando
The image decoders produce surfaces row by row, so a variant to get a
function pointer to perform swizzle/premultiply operations makes more
ergonomic sense.
Differential Revision: https://phabricator.services.mozilla.com/D46444
--HG--
extra : moz-landing-system : lando
We don't need this anymore now that we're always
using the visible rect.
This requires bug 1582528 to properly set the visible area.
Differential Revision: https://phabricator.services.mozilla.com/D46628
--HG--
extra : moz-landing-system : lando
We don't need this anymore now that we're always
using the visible rect.
This requires bug 1582528 to properly set the visible area.
Differential Revision: https://phabricator.services.mozilla.com/D46628
--HG--
extra : moz-landing-system : lando
This update brings in several bugfixes and compatibility with newer
libstdc++ versions.
Differential Revision: https://phabricator.services.mozilla.com/D45860
--HG--
extra : moz-landing-system : lando
This also changes CompositorBridgeChild to create the CanvasChild lazily.
Depends on D44154
Differential Revision: https://phabricator.services.mozilla.com/D44873
--HG--
extra : moz-landing-system : lando
If VR process haven't launched yet, we couldn't get available VR displays and its states, so we need to make enumationCompleted to be false, and ask it do the enumeration again.
Differential Revision: https://phabricator.services.mozilla.com/D46238
--HG--
extra : moz-landing-system : lando
Preparation work for adding DirectComposition support. It does not add a new functionality.
Differential Revision: https://phabricator.services.mozilla.com/D46429
--HG--
extra : moz-landing-system : lando
We only use the png codec so let's save some link time by not including
the other codecs.
Differential Revision: https://phabricator.services.mozilla.com/D46317
--HG--
extra : moz-landing-system : lando
Previously, the setup_picture_caching function was hard coded
to support only a very specific shape of display list. With this
change, flags are added to PrimitiveCluster that can specify
if a picture cache slice should be created before / after this
cluster when picture caching is set up.
The usage of these flags in this patch matches the old behaviour,
so should not have any functional effect.
However, in future we will make use of this functionality to
create picture slices for a number of different use cases, such as:
* Creating cache tiles for the UI.
* Slicing the scene where there are video elements, in order to
allow these to be composited directly by the OS. This may also
apply to WebGL and/or canvas elements.
* Slicing the scene when there is a very large fixed position
background image or other element, to avoid invalidating the
entire tile cache each frame.
Differential Revision: https://phabricator.services.mozilla.com/D46125
--HG--
extra : moz-landing-system : lando
IDCompositionDevice::Commit needs to be called when there is an update to DirectComposition. RenderCompositorANGLE updates DirectComposition only during initialization. Then it is not necessary to call the function in EndFrame().
Differential Revision: https://phabricator.services.mozilla.com/D46249
--HG--
extra : moz-landing-system : lando
This fixes a case where the is_same_content field was no longer
being reset to true before dependency calculation could occur.
In some cases, this could result in a tile being updated, but
the dirty rect being empty, which meant all primitives in that
tile would fail the visibility culling test.
Differential Revision: https://phabricator.services.mozilla.com/D46247
--HG--
extra : moz-landing-system : lando
This patch replaces the is_backface_visible bool in the common
per-primitive data in the display list with a PrimitiveFlags
enumeration. This will allow Gecko to specify extra information
about certain primitive, such as tagging scroll bars.
Differential Revision: https://phabricator.services.mozilla.com/D45970
--HG--
extra : moz-landing-system : lando
This provides the internal device-space dirty rects calculated
during picture cache updates to the external render() method.
This allows clients to provide these to OS partial present APIs,
to reduce power usage and improve performance.
In this initial implementation, if a scroll or scale of the main
picture cache has occurred, the dirty rect will be the entire
screen. This should ensure correctness. In future, we can handle
this case by supplying the picture cache transforms to the OS
compositor integration.
However, the dirty rects will be valid for any non-scroll cases,
such as animations or video playback. This should result in some
significant power savings and performance improvements for these
use cases.
Differential Revision: https://phabricator.services.mozilla.com/D46082
--HG--
extra : moz-landing-system : lando
This patch replaces the is_backface_visible bool in the common
per-primitive data in the display list with a PrimitiveFlags
enumeration. This will allow Gecko to specify extra information
about certain primitive, such as tagging scroll bars.
Differential Revision: https://phabricator.services.mozilla.com/D45970
--HG--
extra : moz-landing-system : lando
In places where profiler_is_active() was used around a profiler_add_marker() (or
similar) call, replace it with profiler_can_accept_markers().
Differential Revision: https://phabricator.services.mozilla.com/D44435
--HG--
extra : moz-landing-system : lando