This adds two AUTO_PROFILER_LABEL_DYNAMIC_... macros and updates select
usages of the old macros to use the new ones. These new macros cause
the dynamic string of the label to be included in BHR stacks.
We don't want to do this all of the time, as in many cases we may not
be interested enough in the dynamic string or it may be sensitive
information, but it is rather important information for certain cases.
This uses the same buffer that we use for the strings for JS frames,
and if we fail to fit into that buffer we just append the raw label.
If the string is too long for our static buffer (128 bytes), we just
leave it truncated, as it should be stable and we may be able to infer
from the truncated form what the full form would be.
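A minimal sketch of the truncation behaviour described above, not the code
from this patch. The helper name FormatLabelFrame, the constant
kLabelBufferSize and the "label, space, dynamic string" layout are all
assumptions made for illustration; only the 128-byte size comes from the text.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical constant: the 128-byte static buffer mentioned above.
constexpr std::size_t kLabelBufferSize = 128;

// Hypothetical helper: joins the label and its dynamic string in a fixed-size
// buffer. Anything that does not fit is silently cut off.
std::string FormatLabelFrame(const char* label, const char* dynamicString) {
  assert(label && dynamicString);
  char buffer[kLabelBufferSize];
  std::size_t pos = 0;
  auto append = [&](const char* s) {
    std::size_t room = kLabelBufferSize - 1 - pos;  // keep one byte for '\0'
    std::size_t n = std::strlen(s);
    if (n > room) {
      n = room;  // truncate rather than fail
    }
    std::memcpy(buffer + pos, s, n);
    pos += n;
  };
  append(label);
  append(" ");
  append(dynamicString);
  buffer[pos] = '\0';
  return std::string(buffer);
}
```

Because the truncated form is a stable prefix of the full string, identical
inputs still group into identical frames.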
Differential Revision: https://phabricator.services.mozilla.com/D51665
--HG--
extra : moz-landing-system : lando
This macro makes any forward declarations unnecessarily verbose, and the
build system uses Clang by default anyway, except in the hazard analysis
which already specified -Wno-attributes.
Depends on D49097
Differential Revision: https://phabricator.services.mozilla.com/D49098
--HG--
extra : moz-landing-system : lando
This is similar to AUTO_PROFILER_LABEL, but with only one argument: a category pair.
This reduces duplication for label frames that want just the subcategory name as
their label: Instead of AUTO_PROFILER_LABEL("Layer building", GRAPHICS_LayerBuilding),
you can now just write AUTO_PROFILER_LABEL_CATEGORY_PAIR(GRAPHICS_LayerBuilding) and
the string will automatically be taken from the subcategory.
Differential Revision: https://phabricator.services.mozilla.com/D11339
--HG--
extra : moz-landing-system : lando
The actual subcategories will be added in later patches, so that there are no
unused categories.
Differential Revision: https://phabricator.services.mozilla.com/D11334
--HG--
extra : moz-landing-system : lando
- modify line wrap up to 80 chars; (tw=80)
- modify size of tab to 2 chars everywhere; (sts=2, sw=2)
--HG--
extra : rebase_source : 7eedce0311b340c9a5a1265dc42d3121cc0f32a0
extra : amend_source : 9cb4ffdd5005f5c4c14172390dd00b04b2066cd7
This change reduces the binary size on macOS x64 by around 50KB.
Here's a diff of the impact on the code generated for Attr_Binding::get_specified
in the Mac build. It's a bit hard to read because %r12 and %rbx swap their
function, but what happens in this method is that "movq %r12, %rcx" goes
away, and the two instructions "leal 0x1(%r12), %eax" and
"movl %eax, 0x10(%rbx)" turn into an "incl 0x10(%r12)".
So the old code was preserving the original value of profilingStack->stackPointer
in a register, and then using it later to compute the incremented stackPointer.
The new code uses an "incl" instruction for the stackPointer increment and
doesn't worry that the stackPointer value might have changed since the stack
size check at the start of the function. (It can't have changed.)
before: %rbx has the ProfilingStack*, %r12 has profilingStack->stackPointer
after: %r12 has the ProfilingStack*, %rbx has profilingStack->stackPointer
@@ -3,37 +3,35 @@
movq %rsp, %rbp
pushq %r15
pushq %r14
pushq %r12
pushq %rbx
subq $0x10, %rsp
movq %rcx, %r14
movq %rdx, %r15
- movq 0x80(%rdi), %rbx
- movq %rbx, -40(%rbp)
- testq %rbx, %rbx
+ movq 0x80(%rdi), %r12
+ movq %r12, -40(%rbp)
+ testq %r12, %r12
je loc_xxxxx
- movl 0x10(%rbx), %r12d
- cmpl (%rbx), %r12d
+ movl 0x10(%r12), %ebx
+ cmpl (%r12), %ebx
jae loc_xxxxx
- movq 0x8(%rbx), %rax
- movq %r12, %rcx
- shlq $0x5, %rcx
- leaq aAttr, %rdx ; "Attr"
- movq %rdx, (%rax,%rcx)
- leaq aSpecified, %rdx ; "specified"
- movq %rdx, 0x8(%rax,%rcx)
- leaq -40(%rbp), %rdx
- movq %rdx, 0x10(%rax,%rcx)
- movl $0x3a1, 0x1c(%rax,%rcx)
- leal 0x1(%r12), %eax
- movl %eax, 0x10(%rbx)
+ movq 0x8(%r12), %rax
+ shlq $0x5, %rbx
+ leaq aAttr, %rcx ; "Attr"
+ movq %rcx, (%rax,%rbx)
+ leaq aSpecified, %rcx ; "specified"
+ movq %rcx, 0x8(%rax,%rbx)
+ leaq -40(%rbp), %rcx
+ movq %rcx, 0x10(%rax,%rbx)
+ movl $0x3a1, 0x1c(%rax,%rbx)
+ incl 0x10(%r12)
movq %r15, %rdi
call __ZNK7mozilla3dom4Attr9SpecifiedEv ; mozilla::dom::Attr::Specified() const
movzxl %al, %eax
movabsq $0xfff9000000000000, %rcx
orq %rax, %rcx
movq %rcx, (%r14)
movq -40(%rbp), %rax
@@ -47,11 +45,11 @@
popq %rbx
popq %r12
popq %r14
popq %r15
popq %rbp
ret
; endp
- movq %rbx, %rdi
+ movq %r12, %rdi
call __ZN14ProfilingStack18ensureCapacitySlowEv ; ProfilingStack::ensureCapacitySlow()
jmp loc_xxxxx
Depends on D9205
Differential Revision: https://phabricator.services.mozilla.com/D9206
--HG--
extra : moz-landing-system : lando
These flags will be used by WebIDL APIs in an upcoming patch.
Depends on D9199
Differential Revision: https://phabricator.services.mozilla.com/D9203
--HG--
extra : moz-landing-system : lando
They were not displayed in the UI, and the instructions to initialize the line
field of a stack frame increased code size unnecessarily.
This change reduces the binary size on Linux x64 by around 100KB.
Here's a diff of the impact on the code generated for Attr_Binding::get_specified
in the Mac build:
@@ -20,17 +20,16 @@
movq 0x8(%rbx), %rax
movq %r12, %rcx
shlq $0x5, %rcx
leaq aGetAttrspecifi, %rdx ; "get Attr.specified"
movq %rdx, (%rax,%rcx)
movq $0x0, 0x8(%rax,%rcx)
leaq -40(%rbp), %rdx
movq %rdx, 0x10(%rax,%rcx)
- movl $0x106, 0x18(%rax,%rcx)
movl $0x1c, 0x1c(%rax,%rcx)
leal 0x1(%r12), %eax
movl %eax, 0x10(%rbx)
movq %r15, %rdi
call __ZNK7mozilla3dom4Attr9SpecifiedEv ; mozilla::dom::Attr::Specified() const
movzxl %al, %eax
movabsq $0xfff9000000000000, %rcx
Depends on D9193
Differential Revision: https://phabricator.services.mozilla.com/D9195
--HG--
extra : moz-landing-system : lando
This eliminates a few instructions from each inlined instance of
AutoProfilerLabel because we no longer need to handle allocation failure in the
inlined code.
I think this allocation should be fine to make infallible: The allocation size
is limited by the thread's stack depth, and we only hit this code path when the
stack is the deepest it's ever been during the thread's lifetime.
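A rough sketch of why an infallible slow path shrinks the inlined code. The
class below is a simplified stand-in, not the real ProfilingStack: the Frame
layout, the growth policy (doubling from 16) and the abort-on-OOM behaviour are
assumptions made for illustration.

```cpp
#include <cstdint>
#include <cstdlib>

// Hypothetical, simplified frame.
struct Frame {
  const char* label;
  void* sp;
};

class SketchProfilingStack {
 public:
  SketchProfilingStack() = default;
  SketchProfilingStack(const SketchProfilingStack&) = delete;
  SketchProfilingStack& operator=(const SketchProfilingStack&) = delete;
  ~SketchProfilingStack() { std::free(frames_); }

  // Hot path, meant to be inlined at every label site.
  void push(const char* label, void* sp) {
    if (stackPointer_ >= capacity_) {
      // Returns void, so the caller has no result to test and no failure
      // branch to emit.
      ensureCapacitySlow();
    }
    frames_[stackPointer_] = Frame{label, sp};
    ++stackPointer_;
  }

  void pop() { --stackPointer_; }
  uint32_t depth() const { return stackPointer_; }
  uint32_t capacity() const { return capacity_; }

 private:
  // Cold path: only runs when the stack is deeper than ever before.
  void ensureCapacitySlow() {
    uint32_t newCapacity = capacity_ ? capacity_ * 2 : 16;
    void* p = std::realloc(frames_, newCapacity * sizeof(Frame));
    if (!p) {
      std::abort();  // crash on OOM instead of reporting failure
    }
    frames_ = static_cast<Frame*>(p);
    capacity_ = newCapacity;
  }

  uint32_t capacity_ = 0;
  Frame* frames_ = nullptr;
  uint32_t stackPointer_ = 0;
};
```

Since the capacity never shrinks, the slow path is taken at most a handful of
times per thread, so moving the failure handling out of line costs nothing on
the hot path.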
This change reduces the binary size on Linux x64 by around 100KB.
Here's a diff of the impact on the code generated for Attr_Binding::get_specified
in the Mac build. It really just eliminates one test and one jump at the very end
of the method:
@@ -9,30 +9,29 @@
movq %rcx, %r14
movq %rdx, %r15
movq 0x80(%rdi), %rbx
movq %rbx, -40(%rbp)
testq %rbx, %rbx
je loc_xxxxx
movl 0x10(%rbx), %r12d
- cmpl %r12d, (%rbx)
- jbe loc_xxxxx
+ cmpl (%rbx), %r12d
+ jae loc_xxxxx
movq 0x8(%rbx), %rax
movq %r12, %rcx
shlq $0x5, %rcx
leaq aGetAttrspecifi, %rdx ; "get Attr.specified"
movq %rdx, (%rax,%rcx)
movq $0x0, 0x8(%rax,%rcx)
leaq -40(%rbp), %rdx
movq %rdx, 0x10(%rax,%rcx)
movl $0x106, 0x18(%rax,%rcx)
movl $0x1c, 0x1c(%rax,%rcx)
-
leal 0x1(%r12), %eax
movl %eax, 0x10(%rbx)
movq %r15, %rdi
call __ZNK7mozilla3dom4Attr9SpecifiedEv ; mozilla::dom::Attr::Specified() const
movzxl %al, %eax
movabsq $0xfff9000000000000, %rcx
orq %rax, %rcx
@@ -50,12 +49,9 @@
popq %r14
popq %r15
popq %rbp
ret
; endp
movq %rbx, %rdi
call __ZN14ProfilingStack18ensureCapacitySlowEv ; ProfilingStack::ensureCapacitySlow()
- testb %al, %al
- jne loc_xxxxx
-
jmp loc_xxxxx
Depends on D9192
Differential Revision: https://phabricator.services.mozilla.com/D9193
--HG--
extra : moz-landing-system : lando
This eliminates a few instructions from every profiler label and saves code size.
We have around 9000 WebIDL constructors + methods + getters + setters which all
have an inlined instance of this code.
This change reduces the binary size on Linux x64 by around 160KB.
Here's a diff of the impact on the code generated for Attr_Binding::get_specified
in the Mac build:
movq %rsp, %rbp
pushq %r15
pushq %r14
pushq %r12
pushq %rbx
subq $0x10, %rsp
movq %rcx, %r14
movq %rdx, %r15
- movq __ZN7mozilla8profiler6detail12RacyFeatures18sActiveAndFeaturesE@GOT, %rax ; __ZN7mozilla8profiler6detail12RacyFeatures18sActiveAndFeaturesE@GOT
- movl (%rax), %eax
- testl %eax, %eax
- js loc_xxxxx
-
- movq $0x0, -40(%rbp)
- jmp loc_xxxxx
-
- movq 0x78(%rdi), %rbx
+ movq 0x80(%rdi), %rbx
movq %rbx, -40(%rbp)
testq %rbx, %rbx
je loc_xxxxx
movl 0x10(%rbx), %r12d
cmpl %r12d, (%rbx)
jbe loc_xxxxx
Differential Revision: https://phabricator.services.mozilla.com/D9192
--HG--
extra : moz-landing-system : lando
This was used to label IndexedDB work and work in storage/mozStorage*.
I don't think this deserves its own category; categories are most useful for
the main thread, and most of the time-consuming database-related work happens
on helper threads. The main thread pieces are mostly for asynchronicity
coordination and don't usually take up time.
This patch labels IndexedDB work as DOM instead (which is maybe debatable) and
the rest as OTHER.
MozReview-Commit-ID: 3UYhFFbi3Ry
--HG--
extra : rebase_source : 5c88dfd67274103de01fe44191f49776017738f9
The next changeset is going to move more annotations that Gecko developers
would count as "layout", and which are currently marked as GRAPHICS, into the
LAYOUT category.
We can add a subcategory for style resolution once we have subcategories, but
for now I think it makes more sense to put style resolution into the same bucket
as reflow and display list building.
MozReview-Commit-ID: 7r9eICVBA1Z
--HG--
extra : rebase_source : ce2df7a07522e99b0ccb59e40a8eae590ebfe834
Categories are useful for indicating what percentage of time was spent in each category.
The EVENTS category isn't a very good match for this. This category is currently
only set on labels of functions that handle the processing of an event. But
those functions are usually closer to the base of the stack, and the actual CPU
work during the processing of an event is usually in another category closer to
the top of the stack, e.g. in JS if we're running an event handler, or in LAYOUT
if we're hit testing the position of the event.
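To illustrate the argument, here is a small sketch of per-category accounting.
It assumes that each sample is attributed to the category of the label frame
closest to the top of the stack; that rule, the Stack alias and the function
name CategoryPercentages are assumptions made for this example, not taken from
the profiler's code.

```cpp
#include <map>
#include <string>
#include <vector>

// One sample: the categories of its label frames, base first, top last.
using Stack = std::vector<std::string>;

// Returns, for each category, the percentage of samples attributed to it.
std::map<std::string, double> CategoryPercentages(
    const std::vector<Stack>& samples) {
  std::map<std::string, int> counts;
  int total = 0;
  for (const Stack& stack : samples) {
    if (stack.empty()) {
      continue;  // nothing to attribute
    }
    ++counts[stack.back()];  // top-most category wins
    ++total;
  }
  std::map<std::string, double> percentages;
  for (const auto& entry : counts) {
    percentages[entry.first] = 100.0 * entry.second / total;
  }
  return percentages;
}
```

Under this rule, a category that only appears near the base of the stack is
credited only for samples with nothing more specific above it, which is why it
contributes little to the breakdown.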
This changeset removes the EVENTS category and replaces all uses of it with the
OTHER category.
MozReview-Commit-ID: JPm5hQiBkvp
--HG--
extra : rebase_source : 66f8ee003d2f70111f4cff16d6e2d906ef4bf10b
They're very similar as far as most users of the profiler are concerned, I'd
say, and I don't believe it's worth giving them two different colors in the
activity graphs.
MozReview-Commit-ID: HTqjp56naL3
--HG--
extra : rebase_source : cf8d64bc3e76ed9bb07100081aebfc404845b8bc