As explained in bug 1111355, having avx enabled appears to change the
alignment behavior of alloca (apparently adding an extra 16 bytes) of
padding/alignment (and using 32-byte alignment instead of 16-byte). The
suggestion of using __bultin_alloca_with_align in bug 1111355 didn't fix
the problem, so this seems to be the best available workaround, given
that this code, which should perhaps better be written in assembly, is
written in C++.
Interestingly, this is NOT fixed by #pragma GCC target ("arch=x86-64").
(I determined the (undocumented) name for the default -march value on
x86_64 from the gcc source code (gcc/config/i386/i386.c, function
ix86_option_override_internal, code that sets opts->x_ix86_arch_string .)
I confirmed that this sets the same macros based on the empty diff
between the output of 'gcc -E -dM -x c++ /dev/null' and 'gcc -E -dM -x
c++ -march=x86-64 /dev/null', which was not an empty diff for other
-march values (e.g., k8).)
I confirmed that the push_options and pop_options actually work by
putting the push/pop pair around a different (earlier) function, and
testing that this did not fix the bug (with the pop_options before
NS_InvokeByIndex).
See the gcc documentation at:
https://gcc.gnu.org/onlinedocs/gcc-4.9.0/gcc/Function-Specific-Option-Pragmas.htmlhttps://gcc.gnu.org/onlinedocs/gcc-4.9.0/gcc/Function-Attributes.htmlhttps://gcc.gnu.org/onlinedocs/gcc-4.9.0/gcc/i386-and-x86-64-Options.html
--HG--
extra : transplant_source : %DA%7CJ4H%DE%80%15%84%0D%116%85Q%9A%F9%2C%D1v%16
I kept all the existing PL_DHashTableAdd() calls fallible, in order to be
conservative, except for the ones in nsAtomTable.cpp which already were
followed immediately by an abort on failure.
--HG--
extra : rebase_source : eeba14d732077ef2e412f4caca852de6b6b85f55
Because they are now just equivalent to |new PLDHashTable()| +
PL_DHashTableInit() and PL_DHashTableFinish(t) + |delete t|, respectively.
They're only used in a handful of places and obscure things more than they
clarify -- I only recently worked out exactly how they different from Init()
and Finish().
--HG--
extra : rebase_source : c958491447523becff3e01de45a5d2d227d1ecd3
Because it's no longer needed now that entry storage isn't allocated there.
(The other possible causes of failures are much less interesting and simply
crashing is a reasonable thing to do for them.)
This also makes PL_DNewHashTable() infallible.
--HG--
extra : rebase_source : 848cc9bbdfe434525857183b8370d309f3acbf49
This makes zero-element hash tables, which are common, smaller, and also avoids
unnecessary malloc/free pairs.
I did some measurements during some basic browsing of a few sites. I found that
35% of all live tables were empty with a few tabs open. And cumulatively, for
the whole session, 45% of tables never had an element added to them.
There is more to be done w.r.t. simplifying initialization, which will occur in
the next patch.
--HG--
extra : rebase_source : b9bfdcd680f39f3c947a49ae8462c04bc5e38805
PL_DHashTableLookup() had the same return protocol as SearchTable(), which
meant that compilers could generate a tail call from the former to the latter.
Bug 1124973 replaced PL_DHashTableLookup() with PL_DHashTableSearch(), which
has a slightly different return protocol, and so the tail call was lost. This
appears to be the cause of the Fennec performance regression.
This patch splits SearchTable() in two (using templates to avoid explicit code
duplication). SearchTable<false>() now has the same return protocol as
PL_DHashTableSearch(), and so the tail call can now be used again.
As well as renaming and privatizing |keyHash|, this patch also:
- renames GetKeyHash() to ComputeKeyHash(), which better indicates it's not
some kind of getter function; and
- makes PLDHashEntryStub inherit from PLDHashEntryHdr, for consistency with
everywhere else.
It feels safer to use a function with a new name, rather than just changing the
behaviour of the existing function.
For most of these cases the PL_DHashTableLookup() result was checked with
PL_DHASH_ENTRY_IS_{FREE,BUSY} so the conversion was easy. A few of them
preceded that check with a useless null check, but the intent of these was
still easy to determine.
I'll do the trickier ones in subsequent patches.
--HG--
extra : rebase_source : ab37a7a30be563861ded8631771181aacf054fd4
Because (a) this is how it's usually done, (b) it allows static_cast<> instead
of reinterpret_cast<>, and (c) it will make subsequent patches easier.
--HG--
extra : rebase_source : 76e67d4b6ec0e5dc898a8214b6a6b562f9e08380