Sample profiles - Update text profile documentation.

There have been some changes to the text encoding for sample profiles. This
updates the documentation and an example.

llvm-svn: 250310
Diego Novillo 2015-10-14 18:37:39 +00:00
parent bb5605ca3a
commit 33452761bb


@ -1355,15 +1355,18 @@ read by the backend. LLVM supports three different sample profile formats:
1. ASCII text. This is the easiest one to generate. The file is divided into
sections, which correspond to each of the functions with profile
information. The format is described below.
information. The format is described below. It can also be generated from
the binary or gcov formats using the ``llvm-profdata`` tool.
2. Binary encoding. This uses a more efficient encoding that yields smaller
profile files, which may be useful when generating large profiles. It can be
generated from the text format using the ``llvm-profdata`` tool.
profile files. This is the format generated by the ``create_llvm_prof`` tool
in http://github.com/google/autofdo.
3. GCC encoding. This is based on the gcov format, which is accepted by GCC. It
is only interesting in environments where GCC and Clang co-exist. Similarly
to the binary encoding, it can be generated using the ``llvm-profdata`` tool.
is only interesting in environments where GCC and Clang co-exist. This
encoding is only generated by the ``create_gcov`` tool in
http://github.com/google/autofdo. It can be read by LLVM and
``llvm-profdata``, but it cannot be generated by either.
If you are using Linux Perf to generate sampling profiles, you can use the
conversion tool ``create_llvm_prof`` described in the previous section.
@ -1382,14 +1385,27 @@ of the other two, consult the ``ProfileData`` library in in LLVM's source tree
.. code-block:: console
function1:total_samples:total_head_samples
offset1[.discriminator]: number_of_samples [fn1:num fn2:num ... ]
offset2[.discriminator]: number_of_samples [fn3:num fn4:num ... ]
...
offsetN[.discriminator]: number_of_samples [fn5:num fn6:num ... ]
offset1[.discriminator]: number_of_samples [fn1:num fn2:num ... ]
offset2[.discriminator]: number_of_samples [fn3:num fn4:num ... ]
...
offsetN[.discriminator]: number_of_samples [fn5:num fn6:num ... ]
offsetA[.discriminator]: fnA:num_of_total_samples
offsetA1[.discriminator]: number_of_samples [fn7:num fn8:num ... ]
offsetA2[.discriminator]: number_of_samples [fn9:num fn10:num ... ]
offsetB[.discriminator]: fnB:num_of_total_samples
offsetB1[.discriminator]: number_of_samples [fn11:num fn12:num ... ]
The file may contain blank lines between sections and within a
section. However, the spacing within a single line is fixed. Additional
spaces will result in an error while reading the file.
This is a nested tree in which the indentation represents the nesting level
of the inline stack. There are no blank lines in the file, and the spacing
within a single line is fixed. Additional spaces will result in an error
while reading the file.
Any line starting with the '#' character is completely ignored.
Inlined calls are represented with indentation. The inline stack is a
stack of source locations in which the top of the stack represents the
leaf function, and the bottom of the stack represents the actual
symbol to which the instruction belongs.
Function names must be mangled in order for the profile loader to
match them in the current translation unit. The two numbers in the
@ -1398,6 +1414,14 @@ function (first number), and the total number of samples accumulated
in the prologue of the function (second number). This head sample
count provides an indicator of how frequently the function is invoked.
There are two types of lines in the function body.
- A sampled line represents the profile information of a source location.
``offsetN[.discriminator]: number_of_samples [fn5:num fn6:num ... ]``
- A callsite line represents the profile information of an inlined callsite.
``offsetA[.discriminator]: fnA:num_of_total_samples``
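The two line shapes can be told apart mechanically. The following Python sketch is illustrative only: the regular expressions and the ``classify`` helper are hypothetical, inferred from the two templates above, and are not taken from LLVM's ``ProfileData`` reader, which may accept or reject inputs differently.

```python
import re

# Hypothetical patterns inferred from the two templates above (assumption:
# offsets, discriminators and counts are plain decimal integers).
SAMPLED = re.compile(r"^(\d+)(?:\.(\d+))?: (\d+)((?: \S+:\d+)*)$")
CALLSITE = re.compile(r"^(\d+)(?:\.(\d+))?: ([^\s:]+):(\d+)$")


def classify(line):
    """Describe one function-body line, ignoring its leading indentation."""
    text = line.strip()
    m = CALLSITE.match(text)
    if m:
        return {
            "kind": "callsite",
            "offset": int(m.group(1)),
            "discriminator": int(m.group(2) or 0),
            "callee": m.group(3),
            "total_samples": int(m.group(4)),
        }
    m = SAMPLED.match(text)
    if m:
        # Optional call targets follow the sample count as name:count pairs.
        targets = {}
        for item in m.group(4).split():
            name, count = item.rsplit(":", 1)
            targets[name] = int(count)
        return {
            "kind": "sampled",
            "offset": int(m.group(1)),
            "discriminator": int(m.group(2) or 0),
            "samples": int(m.group(3)),
            "targets": targets,
        }
    raise ValueError(f"unrecognized profile line: {line!r}")
```

The distinguishing feature used here is the first token after the colon: a bare number marks a sampled line, while a ``name:count`` pair marks a callsite line.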
Each sampled line may contain several items. Some are optional (marked
below):
@ -1451,6 +1475,24 @@ d. [OPTIONAL] Potential call targets and samples. If present, this
instruction that calls one of ``foo()``, ``bar()`` and ``baz()``,
with ``baz()`` being the relatively more frequently called target.
As an example, consider a program with the call chain ``main -> foo -> bar``.
When built with optimizations enabled, the compiler may inline the
calls to ``bar`` and ``foo`` inside ``main``. The generated profile
could then be something like this:
.. code-block:: console
main:35504:0
1: _Z3foov:35504
2: _Z32bari:31977
1.1: 31977
2: 0
This profile indicates that there were a total of 35,504 samples
collected in ``main``. All of those were at line 1 (the call to ``foo``).
Of those, 31,977 were spent inside the body of ``bar``. The last line
of the profile (``2: 0``) corresponds to line 2 inside ``main``. No
samples were collected there.
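To make the nesting concrete, here is a minimal Python sketch that reads such a profile into a tree. It is an illustration under stated assumptions, not the loader LLVM uses: ``parse_profile`` and the dictionary layout are hypothetical, and the sketch assumes one leading space per inline nesting level. The embedded sample repeats the profile text from the example verbatim, with that indentation added.

```python
def new_node(name, total, head=None):
    # Hypothetical in-memory layout chosen for this sketch.
    return {"name": name, "total": total, "head": head,
            "samples": {}, "callsites": {}}


def parse_profile(text):
    """Parse a text sample profile into a list of nested function nodes.

    Assumption: each inline nesting level adds exactly one leading space.
    """
    functions = []
    stack = []  # stack[d] owns the body lines found at indentation d + 1
    for raw in text.splitlines():
        # '#' lines are ignored by the format; this sketch also tolerates
        # empty lines for convenience, although the format forbids them.
        if not raw.strip() or raw.startswith("#"):
            continue
        depth = len(raw) - len(raw.lstrip(" "))
        line = raw.strip()
        if depth == 0:
            # Function header: name:total_samples:total_head_samples
            name, total, head = line.rsplit(":", 2)
            node = new_node(name, int(total), int(head))
            functions.append(node)
            stack = [node]
            continue
        if depth > len(stack):
            raise ValueError(f"unexpected indentation: {raw!r}")
        del stack[depth:]  # leave any inlined bodies that have ended
        parent = stack[depth - 1]
        location, rest = line.split(": ", 1)
        first = rest.split()[0]
        if first.isdigit():
            # Sampled line: record the sample count for this location.
            parent["samples"][location] = int(first)
        else:
            # Callsite line: the following deeper lines belong to the callee.
            name, total = first.rsplit(":", 1)
            node = new_node(name, int(total))
            parent["callsites"][location] = node
            stack.append(node)
    return functions


EXAMPLE = (
    "main:35504:0\n"
    " 1: _Z3foov:35504\n"
    "  2: _Z32bari:31977\n"
    "   1.1: 31977\n"
    " 2: 0\n"
)

main = parse_profile(EXAMPLE)[0]
foo = main["callsites"]["1"]
bar = foo["callsites"]["2"]
print(main["total"], foo["total"], bar["total"])
```

Walking the resulting tree from ``main`` through the callsite at line 1 and then the callsite at line 2 reproduces the inline stack described above, with the sampled line ``1.1`` attached to the innermost callee. Call targets on sampled lines are ignored by this sketch.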
Profiling with Instrumentation
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^