This patch re-organizes control flow in the browser-chrome harness so that
extra memory cleanup only happens just before the browser is closed, even
when running with --repeat.
Each protocol in IPDL has a bunch of autogenerated functions that
instantiate IPC::Message with various parameters. Each of these
functions, then:
1) Pays the cost of calling malloc()
2) Setting up various parameters
3) Calling IPC::Message()
There's no reason that we should be duplicating 1) across all of these
autogenerated functions. In step 2), several of the parameters we're
setting up are common across all or nearly all calls: the message
segment size is almost always zero, and we're always indicating that
IPDL-generated messages should be recorded in telemetry.
Instead of duplicating that code several thousand times, we can add a
small helper function that takes the only interesting parameters for an
IPDL message. This helper function can then deal with calling malloc in
a single place and setting up the common parameters. For messages that
require a custom segment size, we'll have to use the old scheme, but
such messages are uncommon.
The previous changes are not required for this scheme to work, but they
do help significantly, as the helper function (Message::IPDLMessage) can
now take four parameters, which ensures that its arguments are passed
solely in registers on Win64 and ARM. The wins from this change are
also larger than they would be without the previous parts: ~100K on
x86-64 Linux (!) and ~80K on ARM Android.
The current IPC::Message constructor takes a large number of arguments,
three of which--the nesting level, the priority, and the
compression--are almost always constant by virtue of the vast majority
of Message construction being done by auto-generated IPDL code. But
then we take these constant values into the Message constructor, we
check them for various values, and then based on those values, we
perform a bunch of bitfield operations to store flags based on those
values. This is wasted work.
Furthermore, for replies to IPDL messages, we'll construct a Message
object, and then call mutating setters on the Message object that will
perform even more bitfield manipulations. Again, these operations are
performing tasks at runtime that are the same every single time, and use
information we already have at compile time.
The impact of these extra operations is not large, maybe 15-30K of extra
code, depending on platform. Nonetheless, we can easily make them go
away, and make everything cleaner to boot.
This patch adds a HeaderFlags class that encapsulates all the knowledge
about the various kinds of flags Message needs to know about. We can
construct HeaderFlags objects with strongly-typed enum arguments for the
various kinds of flags, and the compiler can take care of folding all of
those flags together into a constant when possible (and it is possible
for all the IPDL-generated code that instantiates Messages). The upshot
is that we do no unnecessary work in the Message constructor itself. We
can also remove various mutating operations on Message, as those
operations were only there to support post-constructor flag twiddling,
which is no longer necessary.
There's no need to be repeating 'IPC::Message::' prefixes or spreading
around more ExprVar calls than we need here. Let's try to improve the
signal-to-noise ratio of this code by introducing a helper function to
inject some of the boilerplate for us.
_generateMessageConstructor takes a lot of `md.FOO`-style parameters,
which could be derived inside the function by simply passing `md`.
Especially with the upcoming changes to calculate things like reply-ness
of messages, sync-ness, etc, we'd be wanting to pass even more
parameters like `md.FOO`. So let's just pass `md` in, and then we can
make all the necessary future changes in a single place.
In bug 1361258, we unified the initialization sequence on mac, and
chose to make the zone registration happen after jemalloc
initialization.
The order between jemalloc init and zone registration shouldn't actually
matter, because jemalloc initializes the first time the allocator is
actually used.
On the other hand, in some build setups (e.g. with light optimization),
the initialization of the thread_arena thread local variable can happen
after the forced jemalloc initialization because of the order the
corresponding static initializers run. In some levels of optimization,
the thread_arena initializer resets the value the jemalloc
initialization has set, which subsequently makes choose_arena() return
a bogus value (or hit an assertion in ThreadLocal.h on debug builds).
So instead of initializing jemalloc from a static initializer, which
then registers the zone, we instead register the zone and let jemalloc
initialize itself when used, which increases the chances of the
thread_arena initializer running first.
--HG--
extra : rebase_source : 4d9a5340d097ac8528dc4aaaf0c05bbef40b59bb
We're getting relatively frequent intermittent shutdown crashes when stream
filters are added during ts_paint tests. Skipping favicon.ico stream filters
should prevent that until we have a more reliable fix.