mirror of
https://github.com/mozilla/gecko-dev.git
synced 2024-10-21 01:05:45 +00:00
702c53489e
On integrated GPUs, we are typically completely bound by memory bandwidth and the number of pixels that get written / blended. On real-world pages, we often end up with clip tasks that are long in one dimension but not the other, due to box-shadow edges, clip mask segments, etc. When this occurs, the logic that tries to compute a small 'used_rect' for clearing targets fails, since the union of those rects ends up being a very large rect that covers (most of) the surface. This can cost a lot of GPU time on some integrated chipsets. Instead, it appears to be much faster to issue multiple clears, one for each clip mask region, which together typically cover < 10% of the surface we were clearing previously. However, we can also restore an old optimization that lets us skip clears altogether in the common case. The first mask in a clip task writes to all the pixels in the mask, so we can draw it with blending disabled (also a significant win on integrated GPUs) and skip the clear in these cases. With this functionality in place, the multiplicative blend mode is enabled only for clips other than the first in a mask (a rare case, since most clip tasks end up with a single mask). On low-end GPUs driving a 4K screen, I've measured GPU wins of up to 5 ms/frame on some real-world pages with this change. Differential Revision: https://phabricator.services.mozilla.com/D19893 |
||
---|---|---|
.. | ||
aligned-gradient.yaml | ||
benchmarks.list | ||
box-shadow-large.yaml | ||
clip-clear.yaml | ||
large-blur-radius.yaml | ||
large-boxshadow-ellipse-2.yaml | ||
large-boxshadow-ellipse.yaml | ||
large-clip-rect.yaml | ||
many-box-shadows.yaml | ||
many-images.yaml | ||
overlapping-text-shadows.yaml | ||
radial-gradient.yaml | ||
simple-batching.yaml | ||
text-rendering.yaml | ||
transforms-simple.yaml | ||
unaligned-gradient.yaml |
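
The reasoning in the commit message above can be illustrated with a small, self-contained sketch. It is not WebRender's actual code: `Rect`, `BlendMode`, `union_clear_area`, `per_region_clear_area`, and `blend_modes` are hypothetical names introduced here for illustration. The sketch compares the number of pixels touched when clearing the union of all clip regions against clearing each region separately, and shows the blend mode choice described above (first mask drawn without blending, later masks with multiplicative blending).

```rust
/// Axis-aligned rectangle in device pixels (hypothetical type for this sketch).
#[derive(Clone, Copy, Debug, PartialEq)]
struct Rect {
    x0: i64,
    y0: i64,
    x1: i64,
    y1: i64,
}

impl Rect {
    /// Number of pixels covered; degenerate rects count as zero.
    fn area(&self) -> i64 {
        (self.x1 - self.x0).max(0) * (self.y1 - self.y0).max(0)
    }

    /// Smallest rect containing both `self` and `other`.
    fn union(&self, other: &Rect) -> Rect {
        Rect {
            x0: self.x0.min(other.x0),
            y0: self.y0.min(other.y0),
            x1: self.x1.max(other.x1),
            y1: self.y1.max(other.y1),
        }
    }
}

/// Blend state used when drawing a mask (hypothetical enum for this sketch).
#[derive(Clone, Copy, Debug, PartialEq)]
enum BlendMode {
    None,
    Multiply,
}

/// Pixels written when a single clear covers the union of all regions.
fn union_clear_area(rects: &[Rect]) -> i64 {
    rects
        .iter()
        .copied()
        .reduce(|a, b| a.union(&b))
        .map_or(0, |r| r.area())
}

/// Pixels written when each region gets its own clear.
/// Overlapping regions are counted once per clear.
fn per_region_clear_area(rects: &[Rect]) -> i64 {
    rects.iter().map(Rect::area).sum()
}

/// The first mask writes every pixel in its region, so it needs neither a
/// clear nor blending; only later masks multiply into the existing contents.
fn blend_modes(mask_count: usize) -> Vec<BlendMode> {
    (0..mask_count)
        .map(|i| if i == 0 { BlendMode::None } else { BlendMode::Multiply })
        .collect()
}
```

For example, two 10-pixel-wide strips along the top and left edges of a 1000x1000 target have a union covering the whole target (1,000,000 pixels), while the per-region clears touch only 20,000 pixels in total. This is the "long in one dimension" case the commit message describes.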