Commit Graph

1087 Commits

Author SHA1 Message Date
Mike Hommey
398c8e8356 Bug 1417309 - Add the necessary bits to support a --enable-project=memory option. r=nalexander
The option allows iterating on the allocator code without requiring a
complete setup to build Firefox.
2017-11-16 08:37:36 +09:00
Nathan Froyd
f51359bd19 Bug 1325632 - part 5 - ensure that we compile with -fno-sized-deallocation when possible; r=chmanchester
We currently turn off the C++14 sized-deallocation facility on MSVC, and
we'd like to ensure we do the same thing for clang and gcc.  To do so,
we add new functionality to moz.configure for checking and adding
compilation flags, similar to the facility for checking and adding
warning flags.  The newly added facility is then used to add
-fno-sized-deallocation to the compilation flags, when the option is
supported.

Once we do this, we can't define the sized deallocation functions in
mozalloc.h; the compiler will complain that we are using
-fno-sized-deallocation, yet defining these special functions that we'll
never use.  These functions were added for MinGW, where we needed to
compile with C++14 ahead of other platforms to be compatible with MSVC
headers.  But they're no longer necessary, though they would be if we
removed -fno-sized-deallocation; the compiler will complain if we do
that and we'll add them back at that point.
2017-11-15 14:53:16 -04:00
Mike Hommey
1f75d81c1c Bug 1417234 - Use SRWLock as Mutex for mozjemalloc on Windows. r=njn
SRWLock is more lightweight than CriticalSection, but is only available
on Windows Vista and later. So until we actually dropped support for
Windows XP, we had to use CriticalSection.

Now that all supported Windows versions do have SRWLock, this is a
switch we can make, and not only because SRWLock is more lightweight,
but because it can be statically initialized like on other platforms.
That allows using the same initialization code as on other platforms
and removes the requirement for a DllMain, which in turn can allow
statically linking mozjemalloc in some cases, instead of requiring a
shared library (DllMain only works on shared libraries) or manually
calling the initialization function soon enough.

There is a downside, though: SRWLock, as opposed to CriticalSection, is
not fair, meaning it can have thread scheduling implications, and can
theoretically increase latency on some threads. However, it is the
default used by Rust Mutex, meaning it's at least good enough there.
Let's see how things go with this.

--HG--
extra : rebase_source : 337dc4e245e461fd0ea23a2b6b53981346a545c6
2017-11-14 12:58:33 +09:00
Ryan VanderMeulen
6c84e6c823 Bug 1402283 - Fix typo. r+a=RyanVM 2017-11-10 17:07:27 -05:00
Mike Hommey
1e661e8113 Bug 1402283 - Fix non-Nightly build bustage. r+a=RyanVM 2017-11-10 16:55:37 -05:00
Mike Hommey
41ede31cd3 Bug 1402283 - Enforce arena matching on moz_arena_{realloc,free}. r=njn
This enforces the API contract as described in memory/build/malloc_decls.h.
2017-11-10 16:05:24 +09:00
Mike Hommey
1dbfb30e4e Bug 1402283 - Associate an arena to each huge allocation. r=njn
Currently, huge allocations are completely independent from arenas. But
in order to ensure that e.g. moz_arena_realloc can't reallocate huge
allocations from another arena, we need to track which arena was
responsible for the huge allocation. We do that in the corresponding
extent_node_t.
2017-11-10 16:05:22 +09:00
Mike Hommey
60238a717e Bug 1402283 - Replace isalloc/isalloc_validate with static methods of a new AllocInfo class. r=njn
Both functions do essentially the same thing, one having more validation
than the other. We can use a template with a boolean parameter to avoid
the duplication.

Furthermore, we're soon going to require, in some cases, more
information than just the size of the allocation, so we wrap their
result in a helper class that gives information about an active
allocation.
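
The "template with a boolean parameter" idea can be illustrated with a toy sketch. Everything below (ToyHeap, AllocInfoSketch, the single tracked allocation) is hypothetical and far simpler than the real allocator bookkeeping; it only shows how one function body can serve both the validating and the non-validating lookup.

```cpp
#include <cstddef>

// Toy stand-in for the allocator's bookkeeping: it knows about exactly one
// live allocation. Purely hypothetical, for illustration only.
struct ToyHeap {
  const void* mKnownPtr;
  size_t mKnownSize;
};

class AllocInfoSketch {
public:
  // Validate == true: check that the pointer is known before trusting it.
  // Validate == false: assume the caller passed a valid pointer.
  // The condition on Validate is a compile-time constant, so the compiler
  // can drop the check entirely in the non-validating instantiation.
  template <bool Validate>
  static AllocInfoSketch Get(const ToyHeap& aHeap, const void* aPtr) {
    if (Validate && aPtr != aHeap.mKnownPtr) {
      return AllocInfoSketch(0);  // unknown pointer: report size 0
    }
    return AllocInfoSketch(aHeap.mKnownSize);
  }

  size_t Size() const { return mSize; }

private:
  explicit AllocInfoSketch(size_t aSize) : mSize(aSize) {}
  size_t mSize;
};
```

Wrapping the result in a class rather than returning a bare size leaves room to add more fields later without changing the callers.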
2017-11-10 16:05:21 +09:00
Mike Hommey
a6f4d02c4c Bug 1402283 - Make arena_ralloc use the same arena as the original pointer when none is provided. r=njn
When using plain realloc() on a pointer that was allocated with
moz_arena_malloc, we want the resulting pointer to still belong to the
same arena.
2017-11-10 16:05:19 +09:00
Mike Hommey
e9f49d03b4 Bug 1402283 - Rename extent_node_t fields. r=njn 2017-11-10 16:05:17 +09:00
Mike Hommey
e68ddaa1c2 Bug 1415454 - Replace log2 lookup table with FloorLog2. r=njn
FloorLog2 expands to, essentially, a compiler builtin/intrinsic, that,
in turn, expands to a single machine instruction on tier 1 and other
platforms. On platforms where that's not the case, we can expect the
compiler to generate fast code anyways. So overall, this is all better
than manually using a log2 lookup table.

Also replace a manual power-of-two check with mozilla::IsPowerOfTwo,
which does the same test.
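
For readers unfamiliar with the two helpers, here is a portable sketch of what they compute. These are stand-ins with hypothetical names, not the Mozilla implementations, which per the description above expand to a compiler builtin rather than a loop.

```cpp
#include <cstddef>

// Portable stand-in for a floor(log2(x)) helper. The argument must be
// non-zero for the result to be meaningful.
inline unsigned FloorLog2Sketch(size_t aValue) {
  unsigned result = 0;
  while (aValue >>= 1) {
    ++result;
  }
  return result;
}

// A value is a power of two when exactly one bit is set.
inline bool IsPowerOfTwoSketch(size_t aValue) {
  return aValue != 0 && (aValue & (aValue - 1)) == 0;
}
```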

--HG--
extra : rebase_source : e8164c254723c74ef83e798073327ec6afa6f1fb
2017-11-08 16:20:40 +09:00
Mike Hommey
3a2a081a07 Bug 1415454 - Remove the unused arena_bin_t* argument to arena_t::AllocRun. r=njn
--HG--
extra : rebase_source : 6c01dad41bc348237f1ee099bc4bd8138738dfde
2017-11-03 15:54:20 +09:00
Mike Hommey
2b7e3e93bd Bug 1415454 - Inline MallocBinEasy and MallocBinHard. r=njn
They are both only used once, are trivial wrappers, and even repeat the
same assertions.

--HG--
extra : rebase_source : b40b26e303cb69a451e63937efd8a666053954e5
2017-11-03 15:48:40 +09:00
Mike Hommey
80c65dc086 Bug 1414168 - Rename arena_run_t fields. r=njn
--HG--
extra : rebase_source : 2d80b0a7e3634a84f8b7b6dd229d6cd42d59d290
2017-11-03 15:23:44 +09:00
Mike Hommey
b6fb21eb81 Bug 1414168 - Move bin initialization to a method of the arena_bin_t class. r=njn
--HG--
extra : rebase_source : f21ff1f44bf8b47185b08652f10d9b77e29dcd64
2017-11-08 15:53:24 +09:00
Mike Hommey
2489970e52 Bug 1414168 - Change how run sizes are calculated. r=njn
There are multiple flaws in the current code:
- The loop calculating the right parameters for a given run size is
  repeated.
- The loop trying different run sizes doesn't actually work to fulfil
  the overhead constraint: while it stops when the constraint is
  fulfilled, the values that are kept are those from the previous
  iteration, which may be well over the constraint.

In practice, the latter resulted in a few surprising results:
- most size classes had an overhead slightly over the constraint
  (1.562%), which, while not terribly bad, doesn't match the set
  expectations.
- some size classes ended up with relatively good overheads only because
  of the additional constraint that run sizes had to be larger than the
  run size of smaller size classes. Without this constraint, some size
  classes would end up with overheads well over 2% just because that
  happens to be the last overhead value before reaching below the 1.5%
  constraint.

Furthermore, for higher-level fragmentation concerns, smaller run sizes
are better than larger run sizes, and in many cases, smaller run sizes
can yield the same (or even sometimes, better) overhead as larger run
sizes. For example, the current code chooses 8KiB for runs of size 112,
but using 4KiB runs would actually yield the same number of regions, and
the same overhead.

We thus change the calculation to:
- not force runs to be larger than those of smaller classes.
- avoid the code repetition.
- actually enforce its overhead constraint, but make it 1.6%.
- for especially small size classes, relax the overhead constraint to
  2.4%.

This leads to an uneven set of run sizes:
 size class    before   after
   4            4 KiB    4 KiB
   8            4 KiB    4 KiB
   16           4 KiB    4 KiB
   32           4 KiB    4 KiB
   48           4 KiB    4 KiB
   64           4 KiB    4 KiB
   80           4 KiB    4 KiB
   96           4 KiB    4 KiB
   112          8 KiB    4 KiB
   128          8 KiB    8 KiB
   144          8 KiB    4 KiB
   160          8 KiB    8 KiB
   176          8 KiB    4 KiB
   192         12 KiB    4 KiB
   208         12 KiB    8 KiB
   224         12 KiB    4 KiB
   240         12 KiB    4 KiB
   256         16 KiB   16 KiB
   272         16 KiB    4 KiB
   288         16 KiB    4 KiB
   304         16 KiB   12 KiB
   320         20 KiB   12 KiB
   336         20 KiB    4 KiB
   352         20 KiB    8 KiB
   368         20 KiB    4 KiB
   384         24 KiB    8 KiB
   400         24 KiB   20 KiB
   416         24 KiB   16 KiB
   432         24 KiB   12 KiB
   448         28 KiB    4 KiB
   464         28 KiB   16 KiB
   480         28 KiB    8 KiB
   496         28 KiB   20 KiB
   512         32 KiB   32 KiB
   1024        64 KiB   64 KiB
   2048       132 KiB  128 KiB

* Note: before is before this change only, not before the set of changes
  from this bug; before that, the run size for 96 could be 8 KiB in some
  configurations.

In most cases, the overhead hasn't changed, with a few exceptions:
- Improvements:
 size class   before   after
   208         1.823%   0.977%
   304         1.660%   1.042%
   320         1.562%   1.042%
   400         0.716%   0.391%
   464         1.283%   0.879%
   480         1.228%   0.391%
   496         1.395%   0.703%
- Regressions:
 size class   before   after
   352         0.312%   1.172%
   416         0.130%   0.977%
   2048        1.515%   1.562%

For the regressions, the values are either still well within the
constraint or very close to the previous value, so I don't feel it's
worth trying to avoid them, at the risk of making things worse for
other size classes.
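
As a sanity check on the 2048 row, the overhead figures can be reproduced with a small sketch. It assumes a fixed 48-byte run header, which is a hypothetical value chosen only for illustration (the real header size depends on the number of regions); the function name is also an assumption.

```cpp
#include <cstddef>

// Percentage of a run that does not hold regions, for a given run size,
// region size, and (hypothetical, fixed) header size.
inline double OverheadPercent(size_t aRunSize, size_t aRegSize,
                              size_t aHeaderSize) {
  // Number of whole regions that fit after reserving room for the header.
  size_t nregs = (aRunSize - aHeaderSize) / aRegSize;
  // Everything not covered by regions counts as overhead.
  size_t wasted = aRunSize - nregs * aRegSize;
  return 100.0 * static_cast<double>(wasted) / static_cast<double>(aRunSize);
}
```

With these assumptions, a 132 KiB run holds 65 regions of 2048 bytes and a 128 KiB run holds 63, which gives roughly 1.515% and 1.562% respectively, matching the table.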

--HG--
extra : rebase_source : fdff18df8a0a35c24162313d4adb1a1c24fb6e82
2017-11-08 14:04:10 +09:00
Mike Hommey
16459608dd Bug 1414168 - Base run offset calculations on the fixed header size, excluding regs_mask. r=njn
On 64-bit platforms, sizeof(arena_run_t) includes padding at the end
of the struct to align it to 64 bits, since the last field, regs_mask, is
32 bits wide and its offset can be a multiple of 64 bits depending on the
configuration. But we're doing size calculations for a dynamically-sized
regs_mask based on sizeof(arena_run_t), completely ignoring that
padding.

Instead, we use the offset of regs_mask as a base for the calculation.

Practically speaking, this doesn't change much with the current set of
values, but could affect the overheads when we squeeze run sizes more.
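
The difference between the two bases can be sketched as follows. The struct layout and the two helper functions are hypothetical; in particular, the sizeof-based formula is an assumption about the shape of the old calculation, not a quote from the code.

```cpp
#include <cstddef>

// Hypothetical run header whose last field is a dynamically sized mask.
// On a typical 64-bit platform the mask starts at a multiple of 8 bytes,
// so the struct ends with 4 bytes of tail padding.
struct RunHeaderSketch {
  void* mBin;
  unsigned mRegionsMinElement;
  unsigned mNumFree;
  unsigned mRegsMask[1];  // really aMaskElems entries long
};

// Header size computed from sizeof, which silently counts tail padding.
inline size_t HeaderSizeFromSizeof(size_t aMaskElems) {
  return sizeof(RunHeaderSketch) + sizeof(unsigned) * (aMaskElems - 1);
}

// Header size computed from the offset of the mask, which does not.
inline size_t HeaderSizeFromOffset(size_t aMaskElems) {
  return offsetof(RunHeaderSketch, mRegsMask) + sizeof(unsigned) * aMaskElems;
}
```

The two results differ by exactly the tail padding of the struct, which is zero on platforms where the layout happens to need none.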

--HG--
extra : rebase_source : a3bdf10a507b81aa0b2b437031b884e18499dc8f
2017-11-08 10:08:37 +09:00
Mike Hommey
f3daece337 Bug 1414168 - Avoid padding near the beginning of arena_run_t. r=njn
The padding makes the run header larger than necessary, which happens to make
the current arena_bin_run_calc_size pick 8KiB runs for size class 96
when MOZ_DIAGNOSTIC_ASSERT_ENABLED is set. This change makes it pick
4KiB runs, making MOZ_DIAGNOSTIC_ASSERT_ENABLED builds use the same set
of run sizes as non-MOZ_DIAGNOSTIC_ASSERT_ENABLED builds.

--HG--
extra : rebase_source : fd7ef2d58ec601186647799e9dcf8146e723241c
2017-11-08 09:56:08 +09:00
Mike Hommey
87faa92489 Bug 1414168 - Change and move the relaxed calculation rule for small size classes. r=njn
First and foremost, the code and corresponding comment weren't in
agreement on what's going on.

The code checks:
  RUN_MAX_OVRHD * (bin->mSizeClass << 3) <= RUN_MAX_OVRHD_RELAX
which is equivalent to:
  (bin->mSizeClass << 3) <= RUN_MAX_OVRHD_RELAX / RUN_MAX_OVRHD
replacing constants:
  (bin->mSizeClass << 3) <= 0x1800 / 0x3d

The left hand side is just bin->mSizeClass * 8, and the right hand side
is about 100, so this can be roughly summarized as:
  bin->mSizeClass <= 12
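
The arithmetic above can be checked with a short sketch. The constant values are the ones quoted in this message; the function name and the use of a plain integer for the size class are assumptions made for illustration.

```cpp
// Constants as quoted above.
constexpr unsigned RUN_MAX_OVRHD = 0x3d;          // 61
constexpr unsigned RUN_MAX_OVRHD_RELAX = 0x1800;  // 6144

// The check as the old code wrote it, applied to a bare size class value.
constexpr bool OldRelaxedCheck(unsigned aSizeClass) {
  return RUN_MAX_OVRHD * (aSizeClass << 3) <= RUN_MAX_OVRHD_RELAX;
}

// 61 * 96 = 5856 fits under 6144, while 61 * 104 = 6344 does not.
static_assert(OldRelaxedCheck(12), "size class 12 passes the old check");
static_assert(!OldRelaxedCheck(13), "size class 13 fails the old check");
```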

The comment says the overhead constraint is relaxed for runs with a
per-region overhead greater than RUN_MAX_OVRHD / (mSizeClass << (3+RUN_BFP)).
On its own, that doesn't make sense, because it translates to
61 / (mSizeClass * 32768), which, even for a size class of 1, would mean
less than 0.2%, and this value would be even smaller for bigger classes.
The comment would make more sense with RUN_MAX_OVRHD_RELAX, but would
still not match what the code was doing.

So we change how the relaxed rule works, as per the comment in the new
code, and make it happen after the standard run overhead constraint has
been checked.

--HG--
extra : rebase_source : cec35b5bfec416761fbfbcffdc2b39f0098af849
2017-11-07 14:36:07 +09:00
Mike Hommey
1117bc02f4 Bug 1414168 - Demystify the last test in the main arena_bin_run_size_calc loop. r=njn
The description above the RUN_* constant definitions talks about binary
fixed point math, which is one way to look at the problem, but a clearer
one is to look at it as comparing ratios in a way that doesn't use
divisions.

So, starting from the current expression:
  (try_reg0_offset << RUN_BFP) <= RUN_MAX_OVRHD * try_run_size

This can be rewritten as
  try_reg0_offset * (1 << RUN_BFP) <= RUN_MAX_OVRHD * try_run_size

Dividing both sides by ((1 << RUN_BFP) * try_run_size), and
simplifying, gives us:
  try_reg0_offset / try_run_size <= RUN_MAX_OVRHD / (1 << RUN_BFP)

Replacing the constants:
  try_reg0_offset / try_run_size <= 0x3d / (1 << 12)
or
  try_reg0_offset / try_run_size <= 61 / 4096

61 / 4096 is roughly 1.5%.

So what the check really intends to do is check that the overhead is
below 1.5%.

So we introduce a helper class and a user-defined literal that makes the
test more self-descriptive, while producing identical machine code.

This is a lot of code to add, but I think it's one of those cases where
abstraction can help make the code clearer.
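
A minimal sketch of the ratio-comparison idea follows. The Fraction type and the function names are hypothetical and much simpler than a helper class with a user-defined literal; the point is only that comparing two ratios by cross-multiplication gives the same answer as the original shift-based expression.

```cpp
#include <cstddef>

// Hypothetical helper: a ratio of two non-negative integers.
struct Fraction {
  size_t mNumerator;
  size_t mDenominator;
};

// a.n / a.d <= b.n / b.d is equivalent to a.n * b.d <= b.n * a.d when both
// denominators are positive, so no division is needed.
constexpr bool operator<=(const Fraction& aLhs, const Fraction& aRhs) {
  return aLhs.mNumerator * aRhs.mDenominator <=
         aRhs.mNumerator * aLhs.mDenominator;
}

constexpr size_t RUN_BFP = 12;
constexpr size_t RUN_MAX_OVRHD = 0x3d;

// The check written as a comparison of ratios: overhead <= 61 / 4096.
constexpr bool OverheadWithinLimit(size_t aReg0Offset, size_t aRunSize) {
  return Fraction{aReg0Offset, aRunSize} <=
         Fraction{RUN_MAX_OVRHD, size_t(1) << RUN_BFP};
}

// The check as originally written, for comparison.
constexpr bool OriginalCheck(size_t aReg0Offset, size_t aRunSize) {
  return (aReg0Offset << RUN_BFP) <= RUN_MAX_OVRHD * aRunSize;
}
```

Both functions perform the same multiplication and comparison, which is consistent with the statement above that the rewritten test can compile to identical machine code.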

--HG--
extra : rebase_source : 3d4a94f524a60e40ba75859c4f761f59d689e81a
2017-11-07 08:55:37 +09:00
Mike Hommey
1329eac959 Bug 1414168 - Split the condition for the main arena_bin_run_size_calc loop into pieces. r=njn
This is, practically speaking, a no-op, and will hopefully help make the
following changes clearer.

--HG--
extra : rebase_source : b704bdf2ae46c2408e0061363822b9744ef449cb
2017-11-07 07:42:21 +09:00
Mike Hommey
5a60492a53 Bug 1414155 - Define AddressRadixTree node size as a size rather than a power of 2. r=njn
--HG--
extra : rebase_source : 03799ccb0d5ba7c627cd3652777b56b7ae26b942
2017-11-03 13:50:44 +09:00
Mike Hommey
020ff16947 Bug 1414155 - Replace constants describing size class numbers. r=njn
--HG--
extra : rebase_source : 11b479c2928a236ecea94b1bc76497dee717a0b3
2017-11-03 12:21:53 +09:00
Mike Hommey
a079c13bb9 Bug 1414155 - Rename chunk related constants. r=njn
--HG--
extra : rebase_source : f458cecab13bc2c9c78685eee670f76e2908d3dc
2017-11-03 12:16:11 +09:00
Mike Hommey
0cd74597a7 Bug 1414155 - Rename page size related constants. r=njn
--HG--
extra : rebase_source : db6b040ca046e350284f6a2aece2c1d1fa3c4ee4
2017-11-03 12:13:17 +09:00
Mike Hommey
89805eb175 Bug 1414155 - Replace chunk size related macros and constants. r=njn
--HG--
extra : rebase_source : 4e445e489148873f141de71d3a6ffd701e14f340
2017-11-03 12:07:16 +09:00
Mike Hommey
91ec9d43c0 Bug 1414155 - Replace size class related macros and constants. r=njn
Hopefully, this makes things a little clearer.

--HG--
extra : rebase_source : cc36c52bfb00bf1b46488e496eb524a9dc46a3e5
2017-11-03 10:10:50 +09:00
Mike Hommey
18e7756799 Bug 1414155 - Remove SIZE_INV values for QUANTUM_2POW_MIN < 4. r=njn
QUANTUM_2POW_MIN is exactly 4, and we are unlikely to ever make it
smaller. Also turn a MOZ_ASSERT into a static_assert, because it only
uses constants, and will fail if QUANTUM_2POW_MIN is lowered without
touching size_invs.

--HG--
extra : rebase_source : 7c8ee3c0ea30a88bddba816c41c6f63914f7a03c
2017-11-03 11:41:30 +09:00
Mike Hommey
b4b9a5993f Bug 1414155 - Replace the cacheline-related macros with a constant. r=njn
--HG--
extra : rebase_source : 571145f154478b1703be44b73b8562de7973d788
2017-11-01 19:34:41 +09:00
Mike Hommey
26674ebcb7 Bug 1414155 - Consolidate "constant/globals". r=njn
There is a set of "constants" that are actually globals that depend on
the page size that we get at runtime, when compiling without
MALLOC_STATIC_PAGESIZE, but that are actual constants when compiling
with it. Their value was unfortunately duplicated.

We set up a set of macros that allow each declaration to be written only once.

--HG--
extra : rebase_source : 56557b7ba01ee60fe85f2cd3c2a0aa910c4c93c6
2017-11-01 18:33:24 +09:00
Mike Hommey
6b7f782847 Bug 1414155 - Define pagesize_2pow in terms of pagesize, not the opposite. r=njn
At the same time, add user-defined literals to make those constants more
legible.
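
A rough sketch of both ideas, with hypothetical names (the _KiB literal, Log2Sketch, kPageSize and kPageSize2Pow are not taken from the actual code) and a 4 KiB page size assumed for illustration:

```cpp
#include <cstddef>

// User-defined literal so that sizes read as 4_KiB instead of 4096.
constexpr size_t operator""_KiB(unsigned long long aNum) {
  return static_cast<size_t>(aNum) * 1024;
}

// Compile-time floor(log2(x)), written recursively so it is usable as a
// constant expression even under C++11 rules.
constexpr size_t Log2Sketch(size_t aValue) {
  return aValue <= 1 ? 0 : 1 + Log2Sketch(aValue / 2);
}

// The size is the primary definition; the power of two is derived from it.
constexpr size_t kPageSize = 4_KiB;
constexpr size_t kPageSize2Pow = Log2Sketch(kPageSize);

static_assert((size_t(1) << kPageSize2Pow) == kPageSize,
              "page size must be a power of two");
```

Deriving the exponent from the size means only one number has to be edited when the page size changes, and the static_assert rejects values that are not powers of two.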

--HG--
extra : rebase_source : ce143ad9d8a6603179042d8cf432f00c815156c5
2017-11-01 18:07:06 +09:00
Mike Hommey
273538b95c Bug 1414155 - Move arena_chunk_map_t and arena_chunk_t around. r=njn
At the moment, while they are used before their declaration, it's from a
macro. It is desirable to replace the macros with C++ constants, which
will require the structures to be defined first.

--HG--
extra : rebase_source : 7a351dafea04a7d75b6eec50fa52fb49c135e569
2017-11-01 17:54:31 +09:00
Mike Hommey
1c4a4f48fa Bug 1414155 - Factor out size classes logic for tiny, quantum and sub-page classes. r=njn
We create a new helper class that rounds up allocation sizes and
categorizes them. Compilers are smart enough to elide what they don't
need: in malloc_good_size, for example, they elide the code related to
the class type enum.

--HG--
extra : rebase_source : 61381e600587b045e720a85a7b46673edeb691b9
2017-11-03 08:53:34 +09:00
Mike Hommey
161c7d7841 Bug 1414155 - Move a few things around. r=njn
--HG--
extra : rebase_source : 11049459c8318e1a9f1cf535dfbce115bf57f918
2017-11-01 19:29:36 +09:00
Mike Hommey
66867e7f34 Bug 1414073 - Rename arena_bin_t fields. r=njn
At the same time, fold malloc_bin_stats_t into it.

--HG--
extra : rebase_source : 38c6a194d100783ecd0f769952de7bb4f71f17b0
2017-11-03 09:26:07 +09:00
Mike Hommey
8f8239d7bf Bug 1413570 - Disable SSE2 when building mozjemalloc on x86 during PGO profile gen phase. r=froydnj
Because of alignment issues due to the system glibc when running the
SSE2 gcov code generated during the PGO profile gen phase, Firefox
crashes when running the PGO profile. We work around the issue by
disabling SSE2 when building mozjemalloc during that phase. That
shouldn't affect the coverage data anyways, which is bound to the
original C++ code, and the profile-use code generation will still emit
SSE2 based on the coverage data if it needs to.

--HG--
extra : rebase_source : 3596fdc795cdef0789f3a2dd8f10b42cde00430f
2017-11-02 10:26:49 +09:00
Mike Hommey
eab43e4a6c Bug 1413475 - Run clang-format on all files in memory/build/. r=njn
--HG--
extra : rebase_source : a0a7ebff22c2387389d2f1dc75f8a5084f76ebb7
2017-11-01 17:20:54 +09:00
Mike Hommey
af14262e54 Bug 1413475 - Change comments to use C++ style // instead of /* */ in memory/build/. r=njn
--HG--
extra : rebase_source : 8d8b85e8123f414cb1e0e1eb067e0d198b3ebb8f
2017-11-01 17:15:12 +09:00
Mike Hommey
1de1ed32d0 Bug 1413475 - Normalize license boilerplates in memory/build/. r=njn
--HG--
extra : rebase_source : 9689f766211fbe1476c5e6d4774f1e95bb8e0208
2017-11-01 16:56:27 +09:00
Mike Hommey
f081d2458b Bug 1413475 - Replace SIZEOF_INT_2POW with LOG2(sizeof(int)). r=njn
--HG--
extra : rebase_source : f1add73810649c8b12e2ee139528fad9186fc20b
2017-11-01 16:47:59 +09:00
Mike Hommey
d07b1b083f Bug 1413475 - Inline STRERROR_BUF in mozjemalloc.cpp. r=njn
It is only used once.

--HG--
extra : rebase_source : 044e7d8ac3e6db834702ab8aedae9b44112d2932
2017-11-01 16:46:44 +09:00
Mike Hommey
7b9d775493 Bug 1413475 - Remove unused MAP_NOSYNC definition in mozjemalloc.cpp. r=njn
--HG--
extra : rebase_source : 95c760647b1c97accfb044961aae6ae1f2113f8e
2017-11-01 16:45:40 +09:00
Mike Hommey
a71adf8fd9 Bug 1413475 - Move MALLOC_DECOMMIT definition closer to that of MALLOC_DOUBLE_PURGE. r=njn
--HG--
extra : rebase_source : 41e4dc84269dfa47c9b7daf4ab587dd41c7eb290
2017-11-01 16:45:24 +09:00
Mike Hommey
485f23d7b1 Bug 1413475 - Reorganize #includes in mozjemalloc.cpp. r=njn
--HG--
extra : rebase_source : 8b03bba78544e1980802c5c7e92ecead738e0408
2017-11-01 16:10:24 +09:00
Mike Hommey
66579f53a7 Bug 1413475 - Inline _CRT_SPINCOUNT in mozjemalloc.cpp. r=njn
It is only used once.

--HG--
extra : rebase_source : 589c078fac563f7e7f057e3a2904970126b6a4b9
2017-11-01 16:06:28 +09:00
Mike Hommey
36809628ea Bug 1413475 - Remove ssize_t definition from mozjemalloc.cpp. r=njn
The only use of ssize_t in mozjemalloc was removed in bug 1403444.

--HG--
extra : rebase_source : 98dde18a08a9a64b2b61698c43cbf3fc7eb74b5d
2017-11-01 15:51:06 +09:00
Mike Hommey
673c2f9373 Bug 1402284 - Separate arenas created from moz_arena_* functions from others. r=njn
We introduce the notion of private arenas, separate from other arenas
(main and thread-local). They are kept in a separate arena tree, and
arena lookups from moz_arena_* functions only access the tree of
private arenas. Iteration still goes through all arenas, private and
non-private.

--HG--
extra : rebase_source : 86c43c7c920b01eb6fa1fa214d612fd9220eac3e
2017-10-31 07:13:39 +09:00
Mike Hommey
4092f53e8a Bug 1402284 - Move arena tree related globals to a static singleton of a new class. r=njn
We create the ArenaCollection class to handle operations on the
arena tree. Ideally, iter() would trigger locking, but the
prefork/postfork code complicates things, so we leave this for later.

--HG--
extra : rebase_source : bd7021098baf0ec01c14063294098edea4473d36
2017-10-28 07:13:58 +09:00
Mike Hommey
e23a5782c2 Bug 1402284 - Initialize arena_t objects via a constructor instead of manually with an Init method. r=njn
Note that we use a local variable for the fallible allocator because
using plain `new (fallible)` would require some figuring out for
non-Firefox builds (e.g. standalone js).

--HG--
extra : rebase_source : 2132f98ebc7e37a139b673f80631e672bcf8ed15
2017-10-28 08:42:59 +09:00
Mike Hommey
b4a43c8f3a Bug 1402284 - Make RedBlackTree::{Insert,Remove} work when the type has a constructor. r=njn
RedBlackTree::{Insert,Remove} allocate an object on the stack for its
RedBlackTreeNode, and that shouldn't have side effects if the type
happens to have a constructor. This will allow adding constructors to
some of the mozjemalloc types.

--HG--
extra : rebase_source : 14dbb7d73c86921701d83156186df5d645530dda
2017-10-30 09:55:18 +09:00