Previously, we submitted polygons as a list of triangles, duplicating
some ancillary data with each vertex. As we move some of this ancillary
data out of constant buffers, it would bloat the size of each vertex
further. To avoid this, we now instance over a unit triangle instead:
each instance carries the three triangle coordinates, and the ancillary
data is shared between them. The target vertex is computed similarly to
how we handle rects in the unit quad shaders.
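As a rough illustration, here is a CPU-side C++ sketch of the idea; the
struct layout and names are hypothetical, not the actual shader code or
instance format:

  #include <cstdint>

  struct Vec2 { float x, y; };

  // Hypothetical instance layout: the three triangle coordinates plus
  // ancillary data stored once per instance rather than per vertex.
  struct TriangleInstance {
    Vec2 p0, p1, p2;       // the triangle's three coordinates
    uint32_t shared_data;  // ancillary data shared by all three vertices
  };

  // The unit triangle the hardware instances over.
  static const Vec2 kUnitVerts[3] = {{0, 0}, {1, 0}, {0, 1}};

  // Mirror of the vertex shader's job: use the unit-triangle vertex as
  // barycentric-style weights on the instance's corners, much like
  // origin + unit_pos * size in the unit quad shaders.
  Vec2 TargetVertex(const TriangleInstance& tri, int vertex_index) {
    const Vec2 u = kUnitVerts[vertex_index];
    return {tri.p0.x + u.x * (tri.p1.x - tri.p0.x) + u.y * (tri.p2.x - tri.p0.x),
            tri.p0.y + u.x * (tri.p1.y - tri.p0.y) + u.y * (tri.p2.y - tri.p0.y)};
  }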
Since bug 1378258 removed malloc_print_stats, a bunch of allocator
stats are now unused; removing them reduces the memory footprint of the
allocator metadata.
When initializing a new chunk for use as an arena, we started by
zeroing out the chunk (if it wasn't already zeroed) and then
initializing a new arena chunk in it. It turns out this can have a
noticeable overhead, especially when e.g. the new arena chunk is used
for a large allocation that is immediately filled by a realloc()
copying data over it.
On the other hand, the chunk recycle code only ever keeps zeroed or
arena chunks around (there is a "recycled" type too, but in practice,
at the moment, it means they were arena chunks before). Arena chunks
that were recycled were completely emptied, so any runs they contain
hold zeroed-out or poisoned data. They also contain a header, which is
overwritten by the new arena chunk initialization.
This means we can get away with reusing non-zeroed recycled chunks
without zeroing them, as long as the arena chunk header marks the runs
as madvised instead of zeroed.
Code-wise, this would benefit from getting a ChunkType out of
chunk_alloc, but this would require more refactoring than I'm willing to
do at the moment.
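A minimal sketch of the resulting initialization, with hypothetical
names and counts standing in for the real mozjemalloc structures:

  #include <cstddef>
  #include <cstdint>

  enum ChunkType { ZEROED_CHUNK, ARENA_CHUNK, RECYCLED_CHUNK };

  constexpr uint8_t CHUNK_MAP_ZEROED = 0x1;
  constexpr uint8_t CHUNK_MAP_MADVISED = 0x2;
  constexpr size_t kRunsPerChunk = 256;  // made-up count

  struct ArenaChunkHeader {
    uint8_t map[kRunsPerChunk];  // per-run state bits
  };

  // Initialize a chunk for arena use. The header is overwritten either
  // way; the only difference is how the runs are tagged.
  void InitArenaChunk(ArenaChunkHeader* chunk, ChunkType source) {
    for (size_t i = 0; i < kRunsPerChunk; i++) {
      // A freshly mapped chunk is zeroed; a recycled one only contains
      // zeroed or poisoned data, so tag its runs as madvised and defer
      // zeroing until a run that needs zeroed memory is handed out.
      chunk->map[i] = (source == ZEROED_CHUNK) ? CHUNK_MAP_ZEROED
                                               : CHUNK_MAP_MADVISED;
    }
  }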
Before returning a chunk, chunk_recycle calls pages_commit (when
MALLOC_DECOMMIT is enabled), which is guaranteed to zero the chunk.
The code that further zeroes the chunk afterwards, which is now moved
out to chunk_alloc callers, never took advantage of that fact,
duplicating the effort of zeroing the chunk on Windows.
By indicating to the callers that the chunk has already been zeroed, we
allow callers to skip zeroing on their own.
The current code only allows chunk_alloc() callers to say whether they
want zeroed memory or not, but some might be okay either way, provided
they act accordingly afterwards. So move the zeroing out of chunk_alloc.
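A rough sketch of the resulting division of labor, covering both
changes above (simplified; the real chunk_alloc/chunk_recycle plumbing
and signatures differ):

  #include <cstddef>
  #include <cstdlib>
  #include <cstring>

  // chunk_alloc reports whether the memory it returns is already
  // zeroed (e.g. fresh from mmap, or recommitted by pages_commit),
  // instead of zeroing unconditionally.
  void* chunk_alloc(size_t size, bool* zeroed) {
    *zeroed = false;      // stand-in: this path does not zero
    return malloc(size);  // stand-in for chunk_recycle()/mmap()
  }

  // Callers that need zeroed memory only pay for the memset() when the
  // chunk was not already zeroed; callers that don't care skip it.
  void* chunk_alloc_zeroed(size_t size) {
    bool zeroed;
    void* chunk = chunk_alloc(size, &zeroed);
    if (chunk && !zeroed) {
      memset(chunk, 0, size);
    }
    return chunk;
  }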
Many functions in the mozjemalloc codebase like to return the opposite
of the boolean one would expect. pages_purge is one of them; this
reverses its logic to match expectations.
Also make it static.
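A sketch of the new shape. The body is illustrative only (the real
function does more, and the exact meaning of its result may differ);
the point is that the result now answers its question directly, and
that the function is static:

  #include <cstddef>
  #include <sys/mman.h>

  static bool pages_purge(void* addr, size_t length) {
  #ifdef MADV_FREE
    // MADV_FREE leaves the old contents in place until the kernel
    // actually reclaims the pages, so no zeroing guarantee.
    madvise(addr, length, MADV_FREE);
    return false;
  #else
    // MADV_DONTNEED on Linux zero-fills the pages on next access.
    madvise(addr, length, MADV_DONTNEED);
    return true;
  #endif
  }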
It turns out that not recycling some kinds of chunks can lead to the
recycle queue being starved in some scenarios. When that happens, we
end up mmap()ing new memory, which turns out to be significantly
slower. So instead of not recycling huge chunks, we force-clean them
before madvising, so that the pages can still be reclaimed in case of
memory pressure.
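A sketch of the force-clean (names are illustrative, not the actual
code). After the memset() the chunk reads back as zeroes whether or not
the kernel reclaims it, so it can be recycled as a zeroed chunk:

  #include <cstddef>
  #include <cstring>
  #include <sys/mman.h>

  static void force_clean(void* chunk, size_t size) {
    memset(chunk, 0, size);  // force-clean so the chunk is safe to reuse
  #ifdef MADV_FREE
    // MADV_FREE keeps the zeroed contents until the kernel reclaims
    // the pages; reclaimed pages come back zero-filled anyway.
    madvise(chunk, size, MADV_FREE);
  #else
    madvise(chunk, size, MADV_DONTNEED);
  #endif
    // The chunk can now go back on the recycle queue instead of being
    // thrown away, avoiding the slow mmap() path described above.
  }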
~AzureState is expensive, especially in GlyphBufferAzure::Flush, which is a high
fan-in function.
MozReview-Commit-ID: 4JfjMje0Kgs
Also, switch the hover quirk to the same mechanism.
Bug: 1379696
Reviewed-By: bholley
MozReview-Commit-ID: KrmNqNyASf6
Source-Repo: https://github.com/servo/servo
Source-Revision: 8fa2a262dc8f2dcab884aead38439ba8756518dc